
Integrate TiDB Vector Search with NVIDIA NIM Embeddings

This tutorial demonstrates how to use NVIDIA NIM models to generate text embeddings, store them in TiDB vector storage, and perform semantic search.

Info

Currently, only the following product and regions support native SQL functions for integrating the NVIDIA NIM Embeddings API:

NVIDIA NIM Embeddings

NVIDIA NIM™ (NVIDIA Inference Microservices) provides containers to self-host GPU-accelerated inference microservices for pretrained and customized AI models across clouds, data centers, and RTX™ AI PCs and workstations. NIM microservices expose industry-standard APIs for simple integration into AI applications, development frameworks, and workflows.

You can integrate NVIDIA NIM embedding models with TiDB using the AI SDK, which enables automatic embedding generation from various transformer-based models.
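The examples in this tutorial use the pytidb Python SDK. As a setup sketch (version pinning and virtual-environment management are left to you), install it from PyPI:

```shell
pip install pytidb
```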

Supported Models

NVIDIA NIM supports a range of embedding models optimized for different use cases. Here are some popular examples:

| Model Name | Dimensions | Max Input Tokens | Description |
|------------|------------|------------------|-------------|
| nvidia/nv-embed-v1 | 4096 | 32k | High-quality general-purpose embeddings based on Mistral-7B-v0.1 with Latent-Attention pooling |
| nvidia/llama-3_2-nemoretriever-300m-embed-v1 | 2048 | 8192 | Multilingual embeddings using Llama 3.2 architecture, supporting 20+ languages and long-context reasoning |

For a complete list of supported models and detailed specifications, see the NVIDIA Build Platform.
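Each model's output dimension must match the `VECTOR(N)` column that stores its embeddings. As an illustrative sketch (the mapping below is transcribed from the table above; verify the dimensions on the NVIDIA Build Platform before relying on them), you can derive the column type from the model name:

```python
# Transcribed from the table above; not an exhaustive or authoritative list.
MODEL_DIMENSIONS = {
    "nvidia/nv-embed-v1": 4096,
    "nvidia/llama-3_2-nemoretriever-300m-embed-v1": 2048,
}

def vector_column_type(model_name: str) -> str:
    """Return the VECTOR(N) column type matching a model's output dimension."""
    return f"VECTOR({MODEL_DIMENSIONS[model_name]})"

print(vector_column_type("nvidia/nv-embed-v1"))  # VECTOR(4096)
```

Using a mismatched dimension causes the generated embedding to be rejected by the column, so it is worth checking this before creating the table.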

Usage Example

This example demonstrates creating a vector table, inserting documents, and performing similarity search using NVIDIA NIM embedding models.

Step 1: Connect to the database

Connect to your TiDB database using the Python client:

from pytidb import TiDBClient

tidb_client = TiDBClient.connect(
    host="{gateway-region}.prod.aws.tidbcloud.com",
    port=4000,
    username="{prefix}.root",
    password="{password}",
    database="{database}",
    ensure_db=True,
)

Alternatively, connect using the MySQL command-line client:

mysql -h {gateway-region}.prod.aws.tidbcloud.com \
    -P 4000 \
    -u {prefix}.root \
    -p{password} \
    -D {database}

Step 2: Configure the API key

If the NVIDIA NIM models you use require authentication, configure your API key. You can get free access to NIM API endpoints through the NVIDIA Developer Program, or create an API key on the NVIDIA Build Platform.

Configure the API key for NVIDIA NIM models using the TiDB Client:

tidb_client.configure_embedding_provider(
    provider="nvidia_nim",
    api_key="{your-nvidia-api-key}",
)

Set the API key for NVIDIA NIM models using SQL:

SET @@GLOBAL.TIDB_EXP_EMBED_NVIDIA_NIM_API_KEY = "{your-nvidia-api-key}";

Step 3: Create a vector table

Create a table with a vector field that uses an NVIDIA NIM model to generate embeddings:

from pytidb.schema import TableModel, Field
from pytidb.embeddings import EmbeddingFunction
from pytidb.datatype import TEXT

class Document(TableModel):
    __tablename__ = "sample_documents"
    id: int = Field(primary_key=True)
    content: str = Field(sa_type=TEXT)
    embedding: list[float] = EmbeddingFunction(
        model_name="nvidia/nv-embed-v1"
    ).VectorField(source_field="content")

table = tidb_client.create_table(schema=Document, if_exists="overwrite")

The corresponding SQL statement:

CREATE TABLE sample_documents (
    `id`        INT PRIMARY KEY,
    `content`   TEXT,
    `embedding` VECTOR(4096) GENERATED ALWAYS AS (EMBED_TEXT(
        "nvidia/nv-embed-v1",
        `content`
    )) STORED
);

Step 4: Insert data into the table

Use the table.insert() or table.bulk_insert() API to add data:

documents = [
    Document(id=1, content="Machine learning algorithms can identify patterns in data."),
    Document(id=2, content="Deep learning uses neural networks with multiple layers."),
    Document(id=3, content="Natural language processing helps computers understand text."),
    Document(id=4, content="Computer vision enables machines to interpret images."),
    Document(id=5, content="Reinforcement learning learns through trial and error."),
]
table.bulk_insert(documents)

Insert data using the INSERT INTO statement:

INSERT INTO sample_documents (id, content)
VALUES
    (1, "Machine learning algorithms can identify patterns in data."),
    (2, "Deep learning uses neural networks with multiple layers."),
    (3, "Natural language processing helps computers understand text."),
    (4, "Computer vision enables machines to interpret images."),
    (5, "Reinforcement learning learns through trial and error.");
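In both variants above, the embedding column is populated automatically from `content` at insert time. Conceptually, that behaves like the pure-Python sketch below, where `embed` is a toy stand-in for the server-side `EMBED_TEXT` model call (not a real API), so the sketch runs without a GPU or API key:

```python
def embed(text: str) -> list[float]:
    # Toy stand-in for the server-side model call: a deterministic
    # two-number "embedding" derived from the text itself.
    return [float(len(text)), float(text.count(" "))]

def insert_document(table: list[dict], doc_id: int, content: str) -> None:
    # The embedding is derived from `content` on insert, mirroring the
    # GENERATED ALWAYS AS (EMBED_TEXT(...)) STORED column.
    table.append({"id": doc_id, "content": content, "embedding": embed(content)})

docs: list[dict] = []
insert_document(docs, 1, "Machine learning algorithms can identify patterns in data.")
print(len(docs[0]["embedding"]))  # 2
```

The point of the `STORED` generated column is exactly this: you never write the embedding yourself; the database computes and persists it whenever `content` changes.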

Step 5: Search for similar documents

Use the table.search() API to perform vector search:

results = table.search("How do neural networks work?") \
    .limit(3) \
    .to_list()

for doc in results:
    print(f"ID: {doc.id}, Content: {doc.content}")

Use the VEC_EMBED_COSINE_DISTANCE function to perform vector search with cosine distance:

SELECT
    `id`,
    `content`,
    VEC_EMBED_COSINE_DISTANCE(embedding, "How do neural networks work?") AS _distance
FROM sample_documents
ORDER BY _distance ASC
LIMIT 3;
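The ordering above is by cosine distance, i.e. 1 minus cosine similarity: 0 means the vectors point the same way, 1 means they are orthogonal. As a minimal illustration of the metric itself (the real `VEC_EMBED_COSINE_DISTANCE` also embeds the query text server-side, which this sketch does not do):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0 (identical direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```

Smaller distances mean more semantically similar documents, which is why the query sorts ascending.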

Advanced Configuration

Custom API Base URL

If you're using a custom NVIDIA NIM deployment or need to specify a different endpoint, you can configure the API base URL:

tidb_client.configure_embedding_provider(
    provider="nvidia_nim",
    api_key="{your-nvidia-api-key}",
    api_base="https://your-custom-nim-endpoint.com/v1",
)

Or set the base URL using SQL:

SET @@GLOBAL.TIDB_EXP_EMBED_NVIDIA_NIM_API_BASE = "https://your-custom-nim-endpoint.com/v1";

Model-Specific Parameters

NVIDIA NIM models support various parameters that can be passed through the embedding function:

embedding: list[float] = EmbeddingFunction(
    model_name="nvidia/nv-embed-v1",
    extra_params={
        "truncate": "END",
        "max_length": 512
    }
).VectorField(source_field="content")

The corresponding SQL statement:

CREATE TABLE sample_documents (
    `id`        INT PRIMARY KEY,
    `content`   TEXT,
    `embedding` VECTOR(4096) GENERATED ALWAYS AS (EMBED_TEXT(
        "nvidia/nv-embed-v1",
        `content`,
        '{"truncate": "END", "max_length": 512}'
    )) STORED
);
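The effect of truncate="END" is that inputs longer than the limit keep only their leading tokens. The sketch below illustrates the idea using whitespace-separated words (a simplification: NIM counts model tokens, not words, and performs the truncation server-side):

```python
def truncate_end(tokens: list[str], max_length: int) -> list[str]:
    # truncate="END": drop everything past the first max_length tokens.
    return tokens[:max_length]

words = "computer vision enables machines to interpret images".split()
print(truncate_end(words, 3))  # ['computer', 'vision', 'enables']
```

Without a truncation policy, inputs exceeding the model's token limit are typically rejected, so setting it explicitly makes long-document pipelines more robust.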