Vector Search Example

This example demonstrates how to build a semantic search application using TiDB and local embedding models. It leverages vector search to find similar items based on meaning, not just keywords. The app uses Streamlit for the web UI and Ollama for local embedding generation.
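To make "similar based on meaning" concrete, here is a minimal sketch of ranking items by cosine similarity between embedding vectors. It is not taken from the example app: the function names and the 3-dimensional toy vectors are invented for illustration, and cosine similarity is assumed as the measure (it is a common choice, but the app's configured distance metric may differ). In the real app, the vectors come from the embedding model and the search runs inside TiDB.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Sort (label, vector) pairs from most to least similar to query_vec."""
    return sorted(
        items,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )


# Toy 3-dimensional "embeddings", made up for this sketch only.
toy_items = [
    ("spreadsheet", [0.0, 0.1, 0.9]),
    ("alpaca", [0.8, 0.2, 0.1]),
    ("llama", [0.9, 0.1, 0.0]),
]
toy_query = [1.0, 0.0, 0.0]

ranked_labels = [label for label, _ in rank_by_similarity(toy_query, toy_items)]
print(ranked_labels)
```

Items whose vectors point in nearly the same direction as the query rank first, regardless of whether they share any keywords with it.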

Semantic search with vector embeddings

Prerequisites

Python 3.10+
A TiDB Cloud Serverless cluster: create a free cluster at tidbcloud.com
Ollama: install it from the Ollama website

How to run

Step 1: Start the embedding service with Ollama

Pull the embedding model:

ollama pull mxbai-embed-large

Test the embedding service to make sure it is running:

curl http://localhost:11434/api/embed -d '{
  "model": "mxbai-embed-large",
  "input": "Llamas are members of the camelid family"
}'
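The same check can be made from Python. The following is a sketch that uses only the standard library and assumes Ollama is listening on its default port, as in the curl command above. The helper names `build_embed_request` and `embed` are hypothetical and are not part of the example app.

```python
import json
import urllib.request

# Same endpoint as the curl command; assumes a default local Ollama install.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"


def build_embed_request(text, model="mxbai-embed-large", url=OLLAMA_EMBED_URL):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, model="mxbai-embed-large"):
    """Send the request to a running Ollama server and return one vector."""
    request = build_embed_request(text, model)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # The endpoint returns a list of vectors, one per input string.
    return body["embeddings"][0]


# Usage (requires the Ollama service to be running):
# vector = embed("Llamas are members of the camelid family")
# print(len(vector))
```

If `embed` raises a connection error, the Ollama service is probably not running or is listening on a different port.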

Step 2: Clone the repository locally

git clone https://github.com/pingcap/pytidb.git
cd pytidb/examples/vector_search/

Step 3: Create a virtual environment and install the required packages

python -m venv .venv
source .venv/bin/activate
pip install -r reqs.txt

Step 4: Set the environment variables for connecting to TiDB

Go to the TiDB Cloud console, get the connection parameters for your cluster, and save them as environment variables in a .env file:

cat > .env <<EOF
TIDB_HOST={gateway-region}.prod.aws.tidbcloud.com
TIDB_PORT=4000
TIDB_USERNAME={prefix}.root
TIDB_PASSWORD={password}
TIDB_DATABASE=pytidb_vector_search
EOF
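A common mistake at this step is leaving a `{...}` placeholder in the file. The sketch below is an optional sanity check, not part of the example app: `parse_env` and `find_problems` are hypothetical helpers, and the parser handles only simple `KEY=value` lines like the ones above. It also assumes that a value containing both `{` and `}` is an unreplaced placeholder, which would give a false warning for a password that really contains braces.

```python
REQUIRED_KEYS = (
    "TIDB_HOST",
    "TIDB_PORT",
    "TIDB_USERNAME",
    "TIDB_PASSWORD",
    "TIDB_DATABASE",
)


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_problems(values):
    """Return a list of human-readable problems; empty means it looks fine."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif "{" in value and "}" in value:
            # Assumption: braces mean the template text was not replaced.
            problems.append(f"{key} still contains a placeholder")
    port = values.get("TIDB_PORT", "")
    if port and not port.isdigit():
        problems.append("TIDB_PORT is not a number")
    return problems


# Usage, from the directory that contains the .env file:
# with open(".env") as f:
#     for problem in find_problems(parse_env(f.read())):
#         print(problem)
```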

Step 5: Run the Streamlit app

streamlit run app.py

Step 6: Open your browser and visit http://localhost:8501

Source Code: View on GitHub
Category: Search
Description: Implement semantic search using vector embeddings to find similar content.
