SE Radio 666: Eran Yahav on the Tabnine AI Coding Assistant
Eran Yahav, Professor of Computer Science at Technion, Israel, and CTO of Tabnine, speaks with host Gregory M. Kapfhammer about the Tabnine AI coding assistant. They discuss how the design and implementation of Tabnine let software engineers use code completion and perform tasks such as automated code review while still maintaining developer privacy. Eran and Gregory also explore how research in the field of natural language processing (NLP) and large language models (LLMs) has informed the features in Tabnine.
Brought to you by IEEE Computer Society and IEEE Software magazine.
Show Notes
Related Episodes
Other References
Transcript
Transcript brought to you by IEEE Software magazine and IEEE Computer Society. This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number.
Gregory Kapfhammer 00:00:18 Welcome to Software Engineering Radio. I’m your host, Gregory Kapfhammer. Today’s guest is Eran Yahav. He’s the CTO of Tabnine and a faculty member in the computer science department at the Israel Institute of Technology. Eran, welcome to Software Engineering Radio.
Eran Yahav 00:00:36 Hey, great to be here. Thank you for having me.
Gregory Kapfhammer 00:00:39 Today we’re going to be talking about the Tabnine AI coding assistant. It uses large language models to help software engineers complete tasks like code explanation and test case generation. Eran, are you ready to dive into the episode?
Eran Yahav 00:00:53 I’m definitely ready, yeah.
Gregory Kapfhammer 00:00:54 Alright, so one of the things I noticed on the Tabnine website is that it has a million monthly users. First of all, congratulations. That’s really cool. What I want to do now is to talk a little bit about what those Tabnine users are doing. Can you give a few concrete examples of how software engineers use Tabnine?
Eran Yahav 00:01:15 Yeah, sure. So Tabnine provides assistance across the entire SDLC. It helps you write code with code completions that are very advanced. It has a chat interface that allows you to create new code, review code, refactor code, translate between languages, generate tests as you mentioned initially, document code, and explain it. So it basically helps you do anything that you have to do as a software engineer, and the vision is really to provide agents that help you do anything and everything a software engineer does, just faster and with higher quality.
Gregory Kapfhammer 00:01:52 So you mentioned that it helps software engineers both move more rapidly and do things with better quality. What is it about Tabnine that makes that happen?
Eran Yahav 00:02:01 I guess there’s so much boilerplate and so much boring work in code generation, in programming. We know that, and for years people have obviously been copying snippets from past projects, from Stack Overflow, from the internet at large, just to get over these boring things and be done with them. And then you basically get whatever is out there. With the rise of LLMs and with the rise of assistants like Tabnine that can contextualize deeply on your context, on your organization’s code, etc., you have the opportunity to generate these pieces of code much more efficiently and, I guess, more true to what they should be in your context, right? So part of the quality of the code that you’re generating is being suitable for the environment in which you’re operating. And I think this is one particular aspect where Tabnine excels.
Gregory Kapfhammer 00:03:07 So later in our episode we’ll dive into the specifics of Tabnine’s implementation and how software engineers can use it. Before we do that, at a high level, can you explain a little bit about how Tabnine actually automatically performs these tasks?
Eran Yahav 00:03:22 Yeah, sure. So at the bottom of the whole thing are obviously large language models, right, LLMs, and they are capable of doing amazing things these days. They’re very powerful and they’re becoming more powerful on an almost weekly basis. But from the LLM to actual use by a software engineer, there is a whole lot that has to happen in the middle, right? Part of that is the context that I mentioned earlier. So Tabnine has what we call the enterprise context engine, which is something that basically knows how to draw relevant context from all code sources of information and non-code sources of information in the organization, to make sure that, again, Tabnine operates like an onboarded employee of the organization and not as a foreign engineer in the org. So Tabnine knows everything about the org, which informs the code that it generates, how it reviews code, etc.
Eran Yahav 00:04:22 The second thing that sits between the LLM and the agents is a component that provides trust. One of the major questions that we ask ourselves as engineers when we delegate a task to AI, or even to a junior engineer, is how can we trust the results that we get back, right? And Tabnine really focuses on that with something that we call Tabnine coaching, which allows Tabnine to make sure that the code being generated and the code being pushed into your code base follow organization-specific rules of what it means to be high-quality code, what it means to follow organizational best practices, etc. So again, we can and will talk about the technical details of how these things are done later, but if you’re looking at the 30,000-foot view here, it’s an LLM, organizational context, trust, and agents that are built on top of that.
Gregory Kapfhammer 00:05:27 Okay, that was really helpful. We’re going to dive more into the issues about trust and the specific LLMs that are used by Tabnine. Before we do that, I wanted to just get a high-level perspective of the various features. Maybe if I could read off a feature and then you could briefly describe it, we might be able to give the listeners a full-featured picture of some of the things that Tabnine provides. So you talked about this already, let’s revisit code generation. How does code generation work in Tabnine?
[...]