Feature | Description |
---|---|
Vector Search | Learn the fundamentals of vector search, including how to perform similarity searches, use different distance metrics, and optimize performance. |
Hybrid Search | Combine keyword-based search with vector search to improve retrieval accuracy and relevance. |
Full-Text Search | Perform full-text search on your text data, and combine it with vector search for a powerful hybrid search experience. |
Reranking | Refine your search results using reranking models to improve the relevance of the top-k results. |
Multi-vector Search | Use multiple vector embeddings per document to perform more nuanced and accurate searches. |
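
To make the "similarity search" and "distance metrics" rows above concrete, here is a minimal sketch of an exhaustive nearest-neighbor search in plain Python. It is not LanceDB's API or implementation: the helper names (`l2_distance`, `cosine_distance`, `top_k`) and the toy 2-D vectors are assumptions made up for illustration. It shows that the choice of metric can change which item ranks first. L2 distance favors vectors that are close in absolute position, while cosine distance ignores magnitude and compares direction only.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: smaller means more similar; ignores magnitude."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


def top_k(query, vectors, k, distance=l2_distance):
    """Brute-force search: score every vector, return the k closest ids."""
    scored = sorted((distance(query, vec), doc_id) for doc_id, vec in vectors.items())
    return [(doc_id, dist) for dist, doc_id in scored[:k]]


# Toy, made-up embeddings (hypothetical data, for illustration only).
docs = {
    "a": [0.9, 0.5],  # close to the query in position
    "b": [0.0, 1.0],  # orthogonal to the query
    "c": [4.0, 0.0],  # same direction as the query, but far away
}
query = [1.0, 0.0]

nearest_l2 = top_k(query, docs, k=1, distance=l2_distance)
nearest_cos = top_k(query, docs, k=1, distance=cosine_distance)

print("L2 nearest:", nearest_l2[0][0])       # "a": smallest Euclidean distance
print("Cosine nearest:", nearest_cos[0][0])  # "c": identical direction
```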

Examples

This section provides handpicked examples of applications built with LanceDB, showcasing its versatility and power.

Example | Description |
---|---|
Hybrid Search & Reranking on BEIR | This example demonstrates how to use LanceDB’s built-in hybrid search feature, which combines the strengths of both semantic and full-text search. By using the BEIR dataset, it shows how to achieve more relevant results by searching for both the meaning of a query and the specific keywords it contains. |
Semantic Search Across Videos | Learn how to build a video search application using V-JEPA (Video Joint Embedding Predictive Architecture) and LanceDB. This example shows how to generate vector embeddings for videos and then use LanceDB to perform similarity searches, allowing you to find videos that are visually similar to a given query. |
Semantic Result Merging | Explore the concept of vector arithmetic with LanceDB. This notebook demonstrates how you can manipulate vector embeddings to capture more complex relationships in your data. For instance, you can modify a search query by adding or subtracting vector representations of different concepts, enabling more nuanced and powerful semantic search. |
Reddit Concept Summarizer | This project showcases a complete pipeline for acquiring text data from Reddit, transforming it into meaningful vector representations using embeddings, and then storing and managing those vectors in LanceDB. It demonstrates how to build applications on top of this data, such as summarization and powerful semantic search. |
NER-Powered Vector Search | This example demonstrates how to use Named Entity Recognition (NER) to power vector search. By extracting entities (like people, places, and organizations) from text and creating vector embeddings of them, you can significantly improve the accuracy of your search results. |
Multivector Search with XTR | This notebook dives into LanceDB’s advanced multivector search capabilities, enhanced by the XTR (ConteXtualized Token Retriever) technique. It shows how to represent complex data with multiple vectors for more nuanced meaning and how XTR speeds up retrieval by prioritizing the most important tokens. |
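
The multivector idea in the last row can be illustrated with a small late-interaction scoring sketch. This is the generic "MaxSim" scoring used by ColBERT-style retrievers, written in plain Python. It is not LanceDB's implementation and it does not include XTR's retrieval optimizations. The function names (`maxsim_score`, `rank_documents`) and the toy token vectors are hypothetical. Each query vector is matched to its best-scoring document vector, and those per-vector maxima are summed to give the document's score.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """For each query vector, take its best match in the document, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(query_vecs, corpus):
    """Score every document and return (doc_id, score) pairs, best first."""
    scored = [(maxsim_score(query_vecs, vecs), doc_id) for doc_id, vecs in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored]


# Toy, made-up per-token embeddings: two vectors per document (assumption).
corpus = {
    "doc1": [[1.0, 0.0], [0.0, 1.0]],  # one vector per distinct concept
    "doc2": [[0.7, 0.7], [0.6, 0.8]],  # both vectors blend the two concepts
}
query_vecs = [[1.0, 0.0], [0.0, 1.0]]

for doc_id, score in rank_documents(query_vecs, corpus):
    print(doc_id, round(score, 3))
```

In this toy data, `doc1` ranks first because each query vector finds an exact match among its vectors, whereas averaging each document into a single vector would blur that distinction.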