redis-vl-dotnet
ONNX Vectorizer
RedisVL.Vectorizers.Onnx provides OnnxTextVectorizer, an IBatchTextVectorizer implementation that generates sentence embeddings locally with ONNX Runtime. It lets applications embed text entirely offline, without an API key or network call, mirroring the SentenceTransformers workflow.
Package contents
- OnnxTextVectorizer for local sentence embedding generation
- OnnxVectorizerOptions for model asset and runtime configuration
- OnnxPoolingStrategy for selecting mean or [CLS] pooling
- OnnxRuntimeSessionOptions for ONNX Runtime session tuning
- OnnxVectorizerPackage as the package marker type
Local model assets
This package does not bundle model files. Applications must provide:
- a local model.onnx file
- a local Hugging Face tokenizer.json file
A typical source is a SentenceTransformers model such as sentence-transformers/all-MiniLM-L6-v2 exported to ONNX (for example with Hugging Face Optimum), which produces both a model.onnx and a tokenizer.json.
The reference workflow assumes a BERT-style tokenizer and embedding shape:
- [CLS] text [SEP]
- input_ids
- attention_mask
- token_type_ids when the model expects them
- a last_hidden_state (or token_embeddings) output of shape [batch, sequence, hidden]
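The input layout described above can be pictured with a small sketch. This is not the package's tokenizer: BuildInputs is a hypothetical helper, the word-piece ids are made up, and the special-token ids (101, 102, 0) are assumed to be those of the standard BERT uncased vocabulary.

```csharp
using System;

// Hypothetical helper: wraps word-piece ids as [CLS] text [SEP], truncating the
// text so the wrapped sequence fits maxLength, then pads to paddedLength.
static (long[] InputIds, long[] AttentionMask, long[] TokenTypeIds) BuildInputs(
    long[] textIds, int maxLength, int paddedLength)
{
    // Assumed special-token ids (standard BERT uncased vocabulary).
    const long ClsId = 101, SepId = 102, PadId = 0;

    int keep = Math.Min(textIds.Length, maxLength - 2); // reserve room for [CLS] and [SEP]
    if (keep < 0 || paddedLength < keep + 2)
        throw new ArgumentException("Lengths are too small for the wrapped sequence.");

    var inputIds = new long[paddedLength];
    var attentionMask = new long[paddedLength];
    var tokenTypeIds = new long[paddedLength]; // all zeros for single-sentence input

    Array.Fill(inputIds, PadId);
    inputIds[0] = ClsId;
    Array.Copy(textIds, 0, inputIds, 1, keep);
    inputIds[keep + 1] = SepId;

    // 1 marks real tokens (including [CLS] and [SEP]); 0 marks padding.
    for (int i = 0; i < keep + 2; i++) attentionMask[i] = 1;

    return (inputIds, attentionMask, tokenTypeIds);
}

// Made-up word-piece ids standing in for a tokenized two-word sentence.
var (ids, mask, types) = BuildInputs(new long[] { 7592, 2088 }, maxLength: 512, paddedLength: 6);
Console.WriteLine(string.Join(",", ids));   // 101,7592,2088,102,0,0
Console.WriteLine(string.Join(",", mask));  // 1,1,1,1,0,0
Console.WriteLine(string.Join(",", types)); // 0,0,0,0,0,0
```

Each array becomes one row of a [batch, sequence] tensor; the model then returns one hidden vector per position, which is why the output shape is [batch, sequence, hidden].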
Request options
OnnxVectorizerOptions supports:
- ModelPath for the local ONNX model file
- TokenizerPath for the local tokenizer definition
- MaxSequenceLength with a default of 512
- Pooling with a default of OnnxPoolingStrategy.Mean; choose OnnxPoolingStrategy.Cls for models that expose a [CLS]-pooled representation
- Normalize with a default of true, which L2-normalizes each embedding so cosine similarity equals the dot product
- SessionOptions for local ONNX Runtime tuning such as graph optimization, execution mode, and thread counts
Mean pooling averages the per-token outputs over the positions the attention mask marks as real tokens, so padding does not dilute the embedding. This matches the most common SentenceTransformers configuration.
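As a rough illustration of what mean pooling and L2 normalization compute, here is a minimal standalone sketch. MeanPool and L2Normalize are hypothetical helpers written for this page, not the package's internal implementation, and they operate on a single sequence rather than a batch.

```csharp
using System;

// Masked mean pooling over one sequence: tokenEmbeddings is [sequence][hidden],
// attentionMask is [sequence] with 1 for real tokens and 0 for padding.
static float[] MeanPool(float[][] tokenEmbeddings, long[] attentionMask)
{
    int hidden = tokenEmbeddings[0].Length;
    var pooled = new float[hidden];
    int count = 0;
    for (int t = 0; t < tokenEmbeddings.Length; t++)
    {
        if (attentionMask[t] == 0) continue; // skip padding positions
        count++;
        for (int h = 0; h < hidden; h++) pooled[h] += tokenEmbeddings[t][h];
    }
    // Guard against an all-zero mask to avoid dividing by zero.
    for (int h = 0; h < hidden; h++) pooled[h] /= Math.Max(count, 1);
    return pooled;
}

// Scales a vector to unit length; a zero vector is returned unchanged.
static float[] L2Normalize(float[] v)
{
    double sumSq = 0;
    foreach (float x in v) sumSq += (double)x * x;
    double norm = Math.Sqrt(sumSq);
    if (norm == 0) return (float[])v.Clone();
    var result = new float[v.Length];
    for (int i = 0; i < v.Length; i++) result[i] = (float)(v[i] / norm);
    return result;
}

var tokens = new[]
{
    new float[] { 1f, 2f },
    new float[] { 5f, 6f },
    new float[] { 100f, 100f }, // padding row, ignored because its mask value is 0
};
var mask = new long[] { 1, 1, 0 };

float[] pooled = MeanPool(tokens, mask); // (3, 4)
float[] unit = L2Normalize(pooled);      // (0.6, 0.8), length 1
Console.WriteLine(string.Join(" ", unit));
```

With [CLS] pooling, the embedding would instead be the hidden vector at position 0, with no averaging.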
Example workflow
/examples/OnnxVectorizerExample shows the expected integration pattern:
- embed a batch of prompts in a single call
- print the embedding count and dimensionality
- compare cosine similarity between a paraphrase and an unrelated prompt
Run it from the repository root after setting local asset paths:
export ONNX_VECTORIZER_MODEL_PATH=/path/to/model.onnx
export ONNX_VECTORIZER_TOKENIZER_PATH=/path/to/tokenizer.json
dotnet run --project examples/OnnxVectorizerExample/OnnxVectorizerExample.csproj
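The similarity comparison in that workflow can be sketched without the package. The vectors below are made-up unit-length stand-ins for real embeddings, and CosineSimilarity and Dot are hypothetical helpers, not part of the example project. The sketch also shows why Normalize = true lets a plain dot product stand in for cosine similarity.

```csharp
using System;

// Cosine similarity for vectors of any length.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Plain dot product; equals cosine similarity when both inputs are unit length.
static double Dot(float[] a, float[] b)
{
    double dot = 0;
    for (int i = 0; i < a.Length; i++) dot += (double)a[i] * b[i];
    return dot;
}

// Made-up unit-length vectors standing in for normalized embeddings.
var query = new float[] { 0.6f, 0.8f };
var paraphrase = new float[] { 0.8f, 0.6f };
var unrelated = new float[] { -0.8f, 0.6f };

double close = CosineSimilarity(query, paraphrase); // about 0.96
double far = CosineSimilarity(query, unrelated);    // about 0
Console.WriteLine(close > far);                     // True

// For unit-length inputs the two measures agree up to rounding error.
Console.WriteLine(Math.Abs(Dot(query, paraphrase) - close) < 1e-6); // True
```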
Validation references
- tests/RedisVL.Tests/Vectorizers/OnnxTextVectorizerTests.cs covers validation, pooling, normalization, and batch behavior with stubbed inference
- tests/RedisVL.Tests/Vectorizers/OnnxTextTokenizerTests.cs covers tokenizer [CLS]/[SEP] wrapping and truncation
- tests/RedisVL.Tests/Vectorizers/OnnxTextVectorizerSmokeTests.cs is the local-asset smoke test gate