redis-vl-dotnet

SemanticCache

SemanticCache stores prompt/response pairs in a HASH-backed RediSearch index and retrieves the nearest cached response when a new prompt embedding falls within a configured distance threshold.

When to use it

Use SemanticCache when:

  • you want semantic reuse rather than exact-string reuse

  • cached responses need optional JSON metadata

  • one cache needs tenant, model, or prompt-template filters

Schema shape and lifecycle

Each cache instance owns a HASH-backed index with:

  • prompt, response, and metadata text fields

  • one vector field for the embedding

  • optional TAG, TEXT, or NUMERIC filter fields
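As background on the vector field: RediSearch reads a HASH vector field as a raw byte blob, so a FLOAT32 embedding of dimension d occupies 4 × d bytes. The sketch below shows that encoding using only the .NET base class library. The EmbeddingBytes helper is hypothetical and not part of redis-vl-dotnet; the library is expected to do this conversion itself, so treat this as an illustration of the storage format rather than code you need to write.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not part of redis-vl-dotnet.
public static class EmbeddingBytes
{
    // Reinterprets the float32 values as raw bytes in native byte order
    // (little-endian on common platforms).
    public static byte[] ToBlob(ReadOnlySpan<float> embedding) =>
        MemoryMarshal.AsBytes(embedding).ToArray();

    // Inverse of ToBlob; the blob length must be a multiple of 4 bytes.
    public static float[] FromBlob(ReadOnlySpan<byte> blob)
    {
        if (blob.Length % sizeof(float) != 0)
            throw new ArgumentException("Blob length must be a multiple of 4 bytes.", nameof(blob));
        return MemoryMarshal.Cast<byte, float>(blob).ToArray();
    }
}
```

A three-dimensional embedding therefore becomes a 12-byte blob, and decoding that blob yields the original three values.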

The main lifecycle methods are the familiar CreateAsync, ExistsAsync, and DropAsync(deleteDocuments: true).

If you call CreateAsync(new CreateIndexOptions(skipIfExists: true)), the library validates that the existing index schema still matches the configured cache options and throws when it does not.

Thresholds, filters, and vectorizers

SemanticCacheOptions is configured with:

  • EmbeddingFieldAttributes

  • DistanceThreshold

  • optional TimeToLive

  • optional FilterableFields

Important constraints:

  • DistanceThreshold must be greater than zero

  • vector fields must use FLOAT32

  • filterable fields may only be TAG, TEXT, or NUMERIC fields

  • filterable fields cannot use aliases

  • query filters and stored filter values are only allowed when FilterableFields are configured
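The constraints above can be read as a checklist. The sketch below restates them as a small validation routine. Everything in it is hypothetical: FieldKind, FilterField, and CacheOptionChecks are illustrative names rather than library types, and the library's own validation and error messages may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are not the library's types.
public enum FieldKind { Tag, Text, Numeric, Vector }

public sealed record FilterField(string Name, FieldKind Kind, string? Alias = null);

public static class CacheOptionChecks
{
    // Returns one message per violated constraint; an empty list means all checks pass.
    public static List<string> Validate(
        double distanceThreshold,
        string vectorDataType,
        IReadOnlyList<FilterField> filterableFields,
        bool queryUsesFilter)
    {
        var errors = new List<string>();

        // Written as !(x > 0) so that NaN is rejected as well.
        if (!(distanceThreshold > 0))
            errors.Add("DistanceThreshold must be greater than zero.");

        if (!string.Equals(vectorDataType, "FLOAT32", StringComparison.OrdinalIgnoreCase))
            errors.Add("Vector fields must use FLOAT32.");

        foreach (var field in filterableFields)
        {
            if (field.Kind == FieldKind.Vector)
                errors.Add($"Filterable field '{field.Name}' must be TAG, TEXT, or NUMERIC.");
            if (field.Alias is not null)
                errors.Add($"Filterable field '{field.Name}' cannot use an alias.");
        }

        if (queryUsesFilter && filterableFields.Count == 0)
            errors.Add("Filters require FilterableFields to be configured.");

        return errors;
    }
}
```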

The cache supports both raw embeddings and ITextVectorizer overloads. Prefer RedisVL.Vectorizers.ITextVectorizer; the legacy RedisVL.Caches.ITextEmbeddingGenerator interface remains only as an obsolete compatibility shim.

Store and check behavior

| API | Behavior |
|---|---|
| StoreAsync(prompt, response, embedding, metadata, filterValues) | Stores a HASH document, optional metadata JSON, optional filter fields, and optional TTL. |
| CheckAsync(prompt, embedding, filter) | Runs a VectorRangeQuery and returns the nearest hit inside the configured distance threshold, otherwise null. |
| StoreAsync(…, vectorizer, …) | Vectorizes the prompt before storing the cache entry. |
| CheckAsync(…, vectorizer, …) | Vectorizes the incoming prompt before lookup. |
| CheckTopKAsync(prompt, embedding, topK, filter) | Returns up to topK cached entries within the threshold, ordered nearest-first. An empty list means a miss. |
| StoreManyAsync(requests) / StoreManyAsync(requests, vectorizer) | Stores a batch of SemanticCacheStoreRequest entries; the returned key list is aligned to input order. The vectorizer overload embeds all prompts in a single batch (using IBatchTextVectorizer when available). |
| CheckManyAsync(requests) / CheckManyAsync(requests, vectorizer) | Runs a batch of SemanticCacheCheckRequest lookups; the result list is aligned to input order, with null for each miss. |
| UpdateAsync(key, response, metadata) | Updates the response and/or metadata of an existing entry (by the key returned from Store), refreshing the TTL when one is configured. Returns false if the key does not exist; the embedding and filter values are left unchanged. |

Filter values are part of the cache identity. That means the same prompt can produce separate cached entries for different tenants, models, or numeric settings.

CheckAsync does not do text matching on the prompt field. The prompt text is useful for traceability and response inspection, but cache hits are driven by vector distance plus any optional filter expression.
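To make the hit rule concrete, here is a minimal in-memory model of a lookup: filter first, then keep only entries within the distance threshold, then take the nearest. It is a sketch under stated assumptions, not the library's implementation. It assumes cosine distance, an inclusive threshold comparison, and a single exact-match tenant filter; CacheEntry and ThresholdLookup are hypothetical names, and the real cache performs this work as a Redis query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative in-memory model only; these types are not part of redis-vl-dotnet.
public sealed record CacheEntry(string Prompt, string Response, float[] Embedding, string Tenant);

public static class ThresholdLookup
{
    // Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way.
    public static double CosineDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimension mismatch.");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the nearest entry for the tenant within the threshold, or null on a miss.
    // The prompt text is never compared; only the embedding and the filter decide.
    public static CacheEntry? Check(
        IEnumerable<CacheEntry> entries, float[] query, string tenant, double threshold) =>
        entries
            .Where(e => e.Tenant == tenant)
            .Select(e => (Entry: e, Distance: CosineDistance(e.Embedding, query)))
            .Where(x => x.Distance <= threshold)
            .OrderBy(x => x.Distance)
            .Select(x => x.Entry)
            .FirstOrDefault();
}
```

In this model, two entries with the same prompt and embedding but different Tenant values never shadow each other: a lookup for one tenant can only return that tenant's entry, and a lookup for a tenant with no stored entries is a miss even when the embedding matches exactly.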

Hit/miss statistics

Set trackStatistics: true on SemanticCacheOptions to record lookup outcomes. The cache then exposes:

  • HitCount and MissCount: counts of lookups that found / did not find a match (each CheckAsync / CheckTopKAsync call counts once)

  • HitRate: hits / (hits + misses), or 0 when nothing has been tracked

  • ResetStatistics(): clears both counters

Statistics are thread-safe. When trackStatistics is left at its default (false), HitCount, MissCount, and HitRate all stay at zero.
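The counter semantics described above can be sketched with System.Threading.Interlocked, one common way to keep counters thread-safe in .NET. LookupStatistics is a hypothetical type for illustration; whether the library uses Interlocked internally is an assumption.

```csharp
using System.Threading;

// Hypothetical counter type for illustration; not the library's implementation.
public sealed class LookupStatistics
{
    private long _hits;
    private long _misses;

    public long HitCount => Interlocked.Read(ref _hits);
    public long MissCount => Interlocked.Read(ref _misses);

    // hits / (hits + misses), or 0 when nothing has been tracked.
    public double HitRate
    {
        get
        {
            long hits = HitCount;
            long total = hits + MissCount;
            return total == 0 ? 0.0 : (double)hits / total;
        }
    }

    // Call once per lookup, regardless of how many entries the lookup returned.
    public void Record(bool hit)
    {
        if (hit) Interlocked.Increment(ref _hits);
        else Interlocked.Increment(ref _misses);
    }

    // Clears both counters.
    public void Reset()
    {
        Interlocked.Exchange(ref _hits, 0);
        Interlocked.Exchange(ref _misses, 0);
    }
}
```

With three hits and one miss recorded, HitRate is 3 / 4 = 0.75; after Reset it returns to 0.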

Example

The runnable example for this API is /examples/SemanticCacheExample.

It demonstrates:

  • creating a cache with TAG and NUMERIC filter fields

  • storing two prompt variants under different tenant filters

  • attaching metadata payloads

  • retrieving a hit through a composed filter expression

Provider-backed vectorizer examples also use this API:

  • /examples/OpenAiVectorizerExample

  • /examples/HuggingFaceVectorizerExample

The detailed edge cases live in tests/RedisVL.Tests/Caches/SemanticCacheTests.cs and tests/RedisVL.Tests/Caches/SemanticCacheIntegrationTests.cs.