redis-vl-dotnet

ONNX Vectorizer

RedisVL.Vectorizers.Onnx provides OnnxTextVectorizer, an IBatchTextVectorizer implementation that generates sentence embeddings locally with ONNX Runtime. It lets applications embed text entirely offline, without an API key or network call, mirroring the SentenceTransformers workflow.

Package contents

OnnxTextVectorizer for local sentence embedding generation
OnnxVectorizerOptions for model asset and runtime configuration
OnnxPoolingStrategy for selecting mean or [CLS] pooling
OnnxRuntimeSessionOptions for ONNX Runtime session tuning
OnnxVectorizerPackage as the package marker type

Local model assets

This package does not bundle model files. Applications must provide:

a local model.onnx file
a local Hugging Face tokenizer.json file

A typical source is a SentenceTransformers model such as sentence-transformers/all-MiniLM-L6-v2 exported to ONNX (for example with Hugging Face Optimum), which produces both a model.onnx and a tokenizer.json.

The reference workflow assumes a BERT-style tokenizer and embedding shape:

input text wrapped as [CLS] text [SEP]
an input_ids input
an attention_mask input
a token_type_ids input when the model expects one
a last_hidden_state (or token_embeddings) output of shape [batch, sequence, hidden]
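
The input layout above can be illustrated with a small, language-neutral sketch in Python. It is not the package's tokenizer: the helper names encode and build_batch are hypothetical, the token ids are arbitrary placeholders, and the special-token ids (101, 102, 0) simply follow the common BERT convention. Real ids come from your tokenizer.json.

```python
# Assumed special-token ids (common BERT convention); a real tokenizer.json defines its own.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def encode(token_ids, max_sequence_length):
    """Wrap already-tokenized ids as [CLS] ... [SEP], truncating the body to fit."""
    if max_sequence_length < 2:
        raise ValueError("max_sequence_length must leave room for [CLS] and [SEP]")
    body = token_ids[: max_sequence_length - 2]
    return [CLS_ID] + body + [SEP_ID]


def build_batch(sequences, max_sequence_length):
    """Pad encoded sequences to a common width and build the three model inputs."""
    encoded = [encode(s, max_sequence_length) for s in sequences]
    width = max(len(e) for e in encoded)
    input_ids, attention_mask, token_type_ids = [], [], []
    for e in encoded:
        pad = width - len(e)
        input_ids.append(e + [PAD_ID] * pad)
        attention_mask.append([1] * len(e) + [0] * pad)  # 0 marks padding
        token_type_ids.append([0] * width)  # single-segment input: all zeros
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "token_type_ids": token_type_ids,
    }


# Placeholder token ids; the second sequence is truncated to fit five positions.
batch = build_batch([[7592, 2088], [2023, 2003, 1037, 3231]], max_sequence_length=5)
for name, rows in batch.items():
    print(name, rows)
```

Each input is a [batch, sequence] grid of integers; the attention mask is what later lets pooling ignore the padded positions.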

Request options

OnnxVectorizerOptions supports:

ModelPath for the local ONNX model file
TokenizerPath for the local tokenizer definition
MaxSequenceLength with a default of 512
Pooling with a default of OnnxPoolingStrategy.Mean; choose OnnxPoolingStrategy.Cls for models that expose a [CLS]-pooled representation
Normalize with a default of true, which L2-normalizes each embedding so cosine similarity equals the dot product
SessionOptions for local ONNX Runtime tuning such as graph optimization, execution mode, and thread counts

Mean pooling averages the per-token outputs over the positions where the attention mask is 1, so padding tokens do not affect the result. This matches the most common SentenceTransformers configuration.
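
As a rough illustration of what the Pooling and Normalize options compute, here is a minimal sketch in plain Python. It is not the package's C# implementation: the function names mean_pool, cls_pool, and l2_normalize are hypothetical, and the sketch assumes one sequence at a time, with token embeddings as a list of per-token vectors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    hidden = len(token_embeddings[0])
    total = [0.0] * hidden
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-zero mask
    return [t / count for t in total]


def cls_pool(token_embeddings):
    """Use the first token's vector, which is [CLS] for BERT-style inputs."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale the vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# One sequence with three positions and hidden size 2; the last position is padding.
tokens = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # the padded row does not contribute
first = cls_pool(tokens)
unit = l2_normalize([3.0, 4.0])
print(pooled, first, unit)
```

Once embeddings are unit length, the dot product of two embeddings is their cosine similarity, which is why normalization is a convenient default for vector search.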

Example workflow

examples/OnnxVectorizerExample shows the expected integration pattern:

embed a batch of prompts in a single call
print the embedding count and dimensionality
compare cosine similarity between a paraphrase and an unrelated prompt
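
The comparison step can be sketched independently of the model. The snippet below is a plain Python illustration, not the example project's code: the three 2-dimensional vectors are made-up stand-ins for real embeddings, and dot and cosine_similarity are hypothetical helper names.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; assumes neither is the zero vector."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up, already L2-normalized vectors standing in for model output.
query = [0.6, 0.8]
paraphrase = [0.8, 0.6]
unrelated = [-0.8, 0.6]

embeddings = [query, paraphrase, unrelated]
print(f"count={len(embeddings)} dimensions={len(embeddings[0])}")
print(f"paraphrase similarity: {cosine_similarity(query, paraphrase):.2f}")
print(f"unrelated similarity:  {cosine_similarity(query, unrelated):.2f}")
```

With a real model the paraphrase should score noticeably higher than the unrelated prompt; the exact values depend on the model and on the pooling choice.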

Run it from the repository root after setting local asset paths:

export ONNX_VECTORIZER_MODEL_PATH=/path/to/model.onnx
export ONNX_VECTORIZER_TOKENIZER_PATH=/path/to/tokenizer.json
dotnet run --project examples/OnnxVectorizerExample/OnnxVectorizerExample.csproj

Validation references

tests/RedisVL.Tests/Vectorizers/OnnxTextVectorizerTests.cs covers validation, pooling, normalization, and batch behavior with stubbed inference
tests/RedisVL.Tests/Vectorizers/OnnxTextTokenizerTests.cs covers tokenizer [CLS]/[SEP] wrapping and truncation
tests/RedisVL.Tests/Vectorizers/OnnxTextVectorizerSmokeTests.cs is the smoke test, gated on local model assets