rss-bridge 2025-10-22T00:10:00+00:00

SE Radio 691: Kacper Łukawski on Qdrant Vector Database

Kacper Łukawski, a Senior Developer Advocate at Qdrant, speaks with host Gregory M. Kapfhammer about the Qdrant vector database and similarity search engine. After introducing vector databases and the foundational concepts undergirding similarity search, they dive deep into the Rust-based implementation of Qdrant. Along with comparing and contrasting different vector databases, they also explore the best practices for the performance evaluation of systems like Qdrant. Kacper and Gregory also discuss topics such as the steps for using Python to build an AI-powered application that uses Qdrant.

Brought to you by IEEE Computer Society and IEEE Software magazine.

Show Notes

Related Episodes

SE Radio 676: Samuel Colvin on the Pydantic Ecosystem

SE Radio 673: Abhinav Kimothi on Retrieval-Augmented Generation

SE Radio 666: Eran Yahav on the Tabnine AI Coding Assistant

SE Radio 493: Ram Sriharsha on Vectors in Machine Learning

SE Radio 490: Tim McNamara on Rust 2021 Edition

Other References

Kacper Łukawski

Qdrant

Home – Qdrant

Cloud Quickstart – Qdrant

Vector Search Basics – Qdrant

Advanced Retrieval – Qdrant

Using the Database – Qdrant

Transcript

Transcript brought to you by IEEE Software magazine.

This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.

Gregory Kapfhammer 00:00:18 Welcome to Software Engineering Radio. I’m your host Gregory Kapfhammer. Today’s guest is Kacper Lukawski. He’s a senior developer advocate at Qdrant. Qdrant is an open-source vector database and vector similarity search engine. Kacper, welcome to the show.

Kacper Lukawski 00:00:35 Hello Greg. Thanks for the invitation.

Gregory Kapfhammer 00:00:37 Hey, I’m really glad today that we get a chance to talk about Qdrant, it’s a vector database and we’re going to learn more about how it helps us to solve a number of key problems. So are you ready to dive in?

Kacper Lukawski 00:00:48 Definitely.

Gregory Kapfhammer 00:00:49 Okay. So we’re going to start with an introduction to vector databases and we’re going to cover a couple high level concepts and then later dive into some additional details. So let’s start with the simple question of what is a vector database? Can you tell us more?

Kacper Lukawski 00:01:03 Yes, of course. First of all, I think vector search engine is a more appropriate term. Search is the main functionality that these kinds of tools provide. Nevertheless, it’s a service that can efficiently store and handle high-dimensional vectors for the purposes of similarity search, and the similarity of these vectors is defined by the closeness of the vectors in that space. So vector databases are built to make that process efficient.
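
To make "closeness of the vectors" concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the closeness measure (Euclidean distance is shown as an alternative) and uses made-up 3-dimensional vectors; the names `cosine_similarity`, `euclidean_distance`, and `nearest` are illustrative and do not come from the episode or from Qdrant. The search is a brute-force scan over every stored vector, which is exactly the cost that a vector database is built to avoid.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, vectors, k=2):
    """Brute-force search: score every stored vector, return the top-k names."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], vectors))  # prints ['cat', 'kitten']
```

The scan touches every vector on every query, so its cost grows linearly with the size of the collection.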

Gregory Kapfhammer 00:01:29 Okay, so a vector database helps us to achieve vector search or vector similarity search. Is that the right way to think about it? Exactly. Okay. Now one of the things you mentioned was the word vector and then you said high dimensional. Can you briefly explain what high dimensional data is?

Kacper Lukawski 00:01:46 Yes. In the case of vector embeddings, we describe them as high dimensional because they usually have at least a few hundred dimensions, and typically no more than eight or nine thousand. It’s definitely not high dimensional data if you are a seasoned data expert, but it’s relatively high because it’s hard to imagine and hard to interpret for a regular human. So this is the range that we are usually operating in.

Gregory Kapfhammer 00:02:11 Okay, that’s helpful. Now you mentioned the term embedding a moment ago. Can you talk briefly about the concept of a vector embedding?

Kacper Lukawski 00:02:19 Sure. So vector embeddings are just numerical representations of the input data, and the main idea is that they keep the semantic meaning of the input data that was used to generate them. If we have two different vectors which are similar in some way, then we assume that the objects that were used to generate them are also similar in their nature. Vector embeddings actually enabled semantic search that can understand not only the presence of particular keywords but also user intent, and more importantly they enabled search on unstructured data that was impossible to process in the past.

Gregory Kapfhammer 00:02:59 So let me see if I’m understanding the workflow correctly. Is the idea that I take something like source code or images or documents and then I convert those to embeddings and then I store those in the vector database? Am I thinking about this the right way?

Kacper Lukawski 00:03:14 Yes, that’s the correct way. And the main idea is that these vector embeddings are generated by neural networks which were trained solely for that purpose. So that’s also why we quite often describe vector search as neural search, because it requires some sort of neural network to encode the data into these numerical representations.
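
The workflow discussed here (convert the data to embeddings, store them, then search by similarity) can be sketched as follows. Everything in this block is an assumption made for illustration: `toy_embed` is a hashed bag-of-words stand-in for a trained neural embedding model, so it captures word overlap only and not semantic meaning, and `ToyVectorStore` is a hypothetical in-memory class, not the Qdrant API.

```python
import math
import zlib

DIM = 64  # toy size; real embedding models produce hundreds of dimensions or more


def toy_embed(text):
    """Stand-in for a neural embedding model (assumption, not a real model).

    Hashes each word into one of DIM buckets and normalizes to unit length.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyVectorStore:
    """Hypothetical in-memory store: maps a point id to (vector, payload)."""

    def __init__(self):
        self.points = {}

    def upsert(self, point_id, text):
        # Step 1 and 2 of the workflow: embed the input, then store it.
        self.points[point_id] = (toy_embed(text), {"text": text})

    def search(self, query, limit=1):
        # Step 3: embed the query and rank stored vectors by dot product,
        # which equals cosine similarity because all vectors are unit length.
        q = toy_embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), payload["text"])
            for vec, payload in self.points.values()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:limit]


store = ToyVectorStore()
store.upsert(1, "rust ownership and borrowing")
store.upsert(2, "python list comprehension syntax")

for score, text in store.search("python syntax", limit=2):
    print(f"{score:.3f}  {text}")
```

In a real application, a trained embedding model and a vector database such as Qdrant would take the place of `toy_embed` and `ToyVectorStore`; the three steps stay the same.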

Gregory Kapfhammer 00:03:34 Some of our listeners may not have previously used a vector database or done some type of vector similarity search. Can you tell us a little bit more about how you know when your project actually needs a vector database?

Kacper Lukawski 00:03:47 There are no strict criteria here of course, but generally if you build any kind of search mechanism and you want to add semantic search capabilities to it, then you should have a look at vector databases, because they just make the deployment and the maintenance of these kinds of projects easier. Also, when you want to implement search over some data modality that can’t be processed with traditional search means, such as images or audio, then you definitely need to use semantic search because that’s probably the only approach to search on unstructured data like this. And of course if you have just a few examples of documents that never change, then a vector database might be just an additional overhead in your project. So then maybe implementing semantic search directly in your application and embedding those documents directly into the source code makes sense. But in general, if you have data that changes frequently, you should be using a vector database to implement semantic search. Especially nowadays, vector databases go along well with large language models, because in both cases we expect natural-language-like interactions and we are not necessarily looking only at the presence of keywords. So if you build a system that exposes a conversational-like interface, then vector databases might be really important to achieve that quickly.

Gregory Kapfhammer 00:05:15 So you mentioned the idea of a keyword search engine and we’ve already talked about the concept of a similarity search engine. How are those two types of search engines similar to and different from each other?

[...]
