5 posts tagged with "search"

Search functionality and implementations

Spice v1.8.2 (Oct 21, 2025)

October 21, 2025 · 5 min read

Token Plumber at Spice AI

Announcing the release of Spice v1.8.2! 🔍

Spice v1.8.2 is a patch release focused on reliability, validation, performance, and bug fixes, with improvements across DuckDB acceleration, S3 Vectors, document tables, and HTTP search.

What's New in v1.8.2

Support Table Relations in `/v1/search` HTTP Endpoint

Spice now supports table relations for the additional_columns and where parameters in the /v1/search endpoint. This improves search for multi-dataset use cases, because filters and columns can be scoped to specific datasets.

Example:

curl 'http://localhost:8090/v1/search' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' -d '{
        "text": "hello world",
        "additional_columns": ["tbl1.foo", "tbl2.bar", "baz"],
        "where": "tbl1.foo > 100000",
        "limit": 5
    }'

In this example, search results from the tbl1 dataset will include columns foo and baz, where foo > 100000. For tbl2, columns bar and baz will be returned.
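
The column resolution described above can be sketched in Python. This is only an illustration of the semantics stated in the example, not Spice's implementation: `columns_for` is a hypothetical helper, and it assumes that a qualified name such as `tbl1.foo` applies only to that dataset while an unqualified name such as `baz` applies to every searched dataset.

```python
def columns_for(table, additional_columns):
    """Return the additional columns that apply to `table`.

    Assumption: "tbl.col" applies only to dataset `tbl`;
    a bare "col" applies to all datasets in the search.
    """
    selected = []
    for ref in additional_columns:
        relation, sep, column = ref.rpartition(".")
        if not sep:
            selected.append(ref)      # unqualified: applies everywhere
        elif relation == table:
            selected.append(column)   # qualified: applies to this table only
    return selected


requested = ["tbl1.foo", "tbl2.bar", "baz"]
print(columns_for("tbl1", requested))  # ['foo', 'baz']
print(columns_for("tbl2", requested))  # ['bar', 'baz']
```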

DuckDB Data Accelerator Table Partitioning & Indexing

Configurable DuckDB Index Scan: DuckDB acceleration now supports the configurable duckdb_index_scan_percentage and duckdb_index_scan_max_count parameters, which allow fine-tuning of index scan behavior for improved query performance.

Example:

datasets:
  - from: postgres:my_table
    name: my_table
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        # When combined, DuckDB will use an index scan when the number of qualifying rows is less than the maximum of these two thresholds
        duckdb_index_scan_percentage: '0.10' # 10% as decimal
        duckdb_index_scan_max_count: '1000'
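
To make the threshold rule in the comment above concrete, here is a minimal Python sketch. It assumes the rule is exactly "qualifying rows < max(max_count, percentage * total rows)", as the comment describes; `uses_index_scan` is a hypothetical helper for illustration, and DuckDB's planner may handle boundary cases differently.

```python
def uses_index_scan(qualifying_rows, total_rows, percentage=0.10, max_count=1000):
    """Return True if an index scan would be chosen under the assumed rule."""
    # The effective threshold is the larger of the absolute and relative limits.
    threshold = max(max_count, percentage * total_rows)
    return qualifying_rows < threshold


# Small table: the absolute limit (1000) dominates.
print(uses_index_scan(500, 2_000))            # True
# Large table: the relative limit (10% of 1,000,000 = 100,000) dominates.
print(uses_index_scan(50_000, 1_000_000))     # True
print(uses_index_scan(150_000, 1_000_000))    # False
```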

Hive-Style Partitioning: In file-partitioned mode, the DuckDB data accelerator uses Hive-style partitioning for more efficient file management.
Table-Based Partitioning: Spice now supports partitioning DuckDB accelerations within a single file. This approach maintains ACID guarantees for full and append mode refreshes, while optimizing resource usage and improving query performance. Configure via the partition_mode parameter:

datasets:
  - from: file:test_data.parquet
    name: test_data
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      params:
        partition_mode: tables
      partition_by:
        - bucket(100, Field1)
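
As a rough illustration of what a `bucket(100, Field1)` expression does, the sketch below assigns each row to one of 100 partitions by hashing the column value. The hash function here (`zlib.crc32`) is a stand-in chosen for determinism and is an assumption; the hash Spice actually uses is not stated in these notes.

```python
import zlib


def bucket(num_buckets, value):
    """Map a value to a bucket number in [0, num_buckets).

    Assumption: CRC32 stands in for whatever hash the runtime uses.
    """
    digest = zlib.crc32(str(value).encode("utf-8"))
    return digest % num_buckets


rows = [{"Field1": "a"}, {"Field1": "b"}, {"Field1": "a"}]

# Group rows by bucket; rows with the same Field1 always share a partition.
partitions = {}
for row in rows:
    partitions.setdefault(bucket(100, row["Field1"]), []).append(row)

for bucket_id, members in sorted(partitions.items()):
    print(bucket_id, len(members))
```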

S3 Vectors Reliability

Race Condition Fix: Resolved a race condition in S3 Vectors index and bucket creation. The runtime also now checks if an index or bucket exists after a ConflictException, ensuring robust error handling during index creation and improving reliability for large-scale multi-index vector search.
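
The "check for existence after a conflict" pattern described above can be sketched as follows. Everything here is hypothetical: `FakeVectorClient`, `create_index`, `index_exists`, and this `ConflictException` class are in-memory stand-ins for illustration, not the AWS SDK or the Spice runtime's code.

```python
class ConflictException(Exception):
    """Stand-in for the service error raised when a resource already exists."""


class FakeVectorClient:
    """In-memory stand-in for a vector store client (hypothetical API)."""

    def __init__(self):
        self.indexes = set()

    def create_index(self, name):
        if name in self.indexes:
            raise ConflictException(name)
        self.indexes.add(name)

    def index_exists(self, name):
        return name in self.indexes


def ensure_index(client, name):
    """Create `name`; treat a conflict as success only if the index now exists."""
    try:
        client.create_index(name)
        return "created"
    except ConflictException:
        # Another worker may have won the race; confirm before continuing.
        if client.index_exists(name):
            return "already_exists"
        raise


client = FakeVectorClient()
print(ensure_index(client, "reviews"))  # created
print(ensure_index(client, "reviews"))  # already_exists
```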

Document Table Improvements

Primary Key Update: Document tables now use the location column as the primary key, improving performance, consistency, and query reliability.

Additional Improvements & Bugfixes

Reliability: Improved error handling and resource checks for S3 Vectors and DuckDB acceleration.
Validation: Expanded validation for partitioning and index creation.
Performance: Optimized partition refresh and index scan logic.
Bugfix: Don't nullify DuckDB release callbacks for schemas.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 81 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.8.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.8.2 image:

docker pull spiceai/spiceai:1.8.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Update mongo config for benchmarks by @krinart in #7546
Configurable DuckDB duckdb_index_scan_percentage & duckdb_index_scan_max_count by @lukekim in #7551
Fix race condition in S3 Vectors index and bucket creation by @kczimm in #7577
Use 'location' as primary key for document tables by @Jeadie in #7567
Update official Docker builds to use release binaries by @phillipleblanc in #7597
Hive-style partitioning for DuckDB file mode by @kczimm in #7563
New Generate Changelog workflow by @krinart in #7562
Add support for DuckDB table-based partitioning by @sgrebnov in #7581
DuckDB table partitioning: delete partitions that no longer exist after full refresh by @sgrebnov in #7614
Rename duckdb_partition_mode to partition_mode param by @sgrebnov in #7622
Fix license issue in table-providers by @phillipleblanc in #7620
Make DuckDB table partition data write threshold configurable by @sgrebnov in #7626
fix: Don't nullify DuckDB release callbacks for schemas by @peasee in #7628
Fix integration tests by reverting the use of batch inserts w/ prepared statements by @phillipleblanc in #7630
Return TableProvider from CandidateGeneration::search by @Jeadie in #7559
Handle table relations in HTTP v1/search by @Jeadie in #7615

Spice v1.7.1 (Sep 29, 2025)

September 30, 2025 · 6 min read

Kevin Zimmerman

Principal Software Engineer at Spice AI

Announcing the release of Spice v1.7.1! 🔍

Spice v1.7.1 is a patch release focused on search improvements, bug fixes, and performance enhancements. This release introduces the Reciprocal Rank Fusion (RRF) user-defined table function (UDTF) for hybrid search, improves vector and text search reliability, and resolves several issues across the runtime, connectors, and query engine.

What's New in v1.7.1

Reciprocal Rank Fusion (RRF) UDTF: Spice now supports Reciprocal Rank Fusion (RRF) as a user-defined table function, enabling advanced hybrid search scenarios that combine results from multiple search methods (e.g., vector and text search) for improved relevance ranking.

Features:

Multi-search fusion: Combine results from vector_search, text_search, and other search UDTFs in a single query.
Advanced tuning: Per-query ranking weights, recency boosting, and configurable decay functions.
Performance: Optional user-specified join key for optimal performance.
Automatic joining: Falls back to on-the-fly JOIN key computation when no explicit key is provided.

Example usage:

SELECT id, title, content, fused_score
FROM rrf(
  vector_search(documents, 'machine learning algorithms', rank_weight => 1.5),
  text_search(documents, 'neural networks deep learning', rank_weight => 1.2),
  join_key => 'id',    -- optional join key for optimal performance
  k => 60.0            -- optional smoothing factor
)
WHERE fused_score > 0.01
ORDER BY fused_score DESC;

Learn more in the RRF documentation.
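
For intuition, the standard Reciprocal Rank Fusion formula scores each result as the sum of 1 / (k + rank) over the result lists it appears in. The Python sketch below applies that formula with a per-list weight. Treat it as an assumption-laden illustration: how Spice applies rank_weight, recency boosting, and decay internally is not described here, and `rrf` below is a hypothetical function, not the UDTF.

```python
from collections import defaultdict


def rrf(rankings, weights=None, k=60.0):
    """Fuse ranked id lists (best first) into [(id, score)], highest score first.

    Assumption: each list contributes weight / (k + rank) per result.
    """
    weights = weights or [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranked_ids, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["a", "b", "c"]   # e.g. ids from a vector search
text_hits = ["b", "d", "a"]     # e.g. ids from a text search

for doc_id, score in rrf([vector_hits, text_hits], weights=[1.5, 1.2]):
    print(doc_id, round(score, 4))
```

Results that rank well in both lists ("b" and "a" here) rise above results that appear in only one, which is the behavior hybrid search relies on.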

Acceleration Refresh Metrics: Spice now exposes additional Prometheus metrics that provide detailed observability into dataset acceleration refreshes. These metrics help monitor data freshness and ingestion lag for accelerated datasets with a time column.

Reported metrics:

| Metric Name | Description |
| --- | --- |
| `dataset_acceleration_max_timestamp_before_refresh_ms` | Maximum value of the dataset's time column before refresh (milliseconds). |
| `dataset_acceleration_max_timestamp_after_refresh_ms` | Maximum value of the dataset's time column after refresh (milliseconds). |
| `dataset_acceleration_refresh_lag_ms` | Difference between max timestamp after and before refresh (milliseconds). |
| `dataset_acceleration_ingestion_lag_ms` | Lag between current wall-clock time and max timestamp after refresh (milliseconds). |

These metrics are emitted during each acceleration refresh and can be scraped by Prometheus for monitoring and alerting. For more details, see the Observability documentation.
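
The relationship between the four metrics can be expressed in a few lines of Python. This sketch follows the descriptions in the table; `refresh_metrics` is a hypothetical helper, not part of the runtime.

```python
import time


def refresh_metrics(max_ts_before_ms, max_ts_after_ms, now_ms=None):
    """Derive the refresh metrics from the time column's max value (milliseconds)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current wall-clock time
    return {
        "dataset_acceleration_max_timestamp_before_refresh_ms": max_ts_before_ms,
        "dataset_acceleration_max_timestamp_after_refresh_ms": max_ts_after_ms,
        # How far the refresh advanced the newest timestamp.
        "dataset_acceleration_refresh_lag_ms": max_ts_after_ms - max_ts_before_ms,
        # How far the newest row trails the current time.
        "dataset_acceleration_ingestion_lag_ms": now_ms - max_ts_after_ms,
    }


metrics = refresh_metrics(1_000, 4_000, now_ms=4_500)
print(metrics["dataset_acceleration_refresh_lag_ms"])    # 3000
print(metrics["dataset_acceleration_ingestion_lag_ms"])  # 500
```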

Bug Fixes & Improvements

This release resolves several issues and improves reliability across search, connectors, and query planning:

Full-Text Search (FTS): Ensures FTS metadata columns can be used in projections, fixes JOIN-level filters not having columns in the schema, and adds support for persistent file-based FTS indexes. A default limit of 1000 results now applies when no limit is specified.
Vector Search: A default limit of 1000 results now applies when no limit is specified, and an issue with removing the embedding column is fixed.
Databricks SQL Warehouse: Improved error handling and support for async queries.
Other: Fixes for Anthropic model regex validation, tweaked AI-model health checks, and improved error messages.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Added Hybrid-Search using RRF - Combine results from multiple search methods (vector and text search) using Reciprocal Rank Fusion for improved relevance ranking.

The Spice Cookbook includes 78 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.7.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.7.1 image:

docker pull spiceai/spiceai:1.7.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

ensure FTS metadata columns can be used in projection (#7282) by @Jeadie in #7282
Fix JOIN level filters not having columns in schema (#7287) by @Jeadie in #7287
Use file-based fts index (#7024) by @Jeadie in #7024
Remove 'PostApplyCandidateGeneration' (#7288) by @Jeadie in #7288
RRF: Rank and recency boosting (#7294) by @mach-kernel in #7294
RRF: Preserve base ranking when results differ -> FULL OUTER JOIN does not produce time column (#7300) by @mach-kernel in #7300
fix removing embedding column (#7302) by @Jeadie in #7302
RRF: Fix decay for disjoint result sets (#7305) by @mach-kernel in #7305
RRF: Project top scores, do not yield duplicate results (#7306) by @mach-kernel in #7306
RRF: Case sensitive column/ident handling (#7309) by @mach-kernel in #7309
For vector_search, use a default limit of 1000 if no limit specified (#7311) by @lukekim in #7311
Fix Anthropic model regex and add validation tests (#7319) by @ewgenius in #7319
Enhancement: Implement before/after/lag metrics for acceleration refresh (#7310) by @krinart in #7310
Refactor chat model health check to lower tokens usage for reasoning models (#7317) by @ewgenius in #7317
Enable chunking in SearchIndex (#7143) by @Jeadie in #7143
Use logical plan in SearchQueryProvider. (#7314) by @Jeadie in #7314
FTS max search results 100 -> 1000 (#7331) by @Jeadie in #7331
Improve Databricks SQL Warehouse Error Handling (#7332) by @sgrebnov in #7332
use spicepod embedding model name for 'model_name' (#7333) by @Jeadie in #7333
Handle async queries for Databricks SQL Warehouse API (#7335) by @phillipleblanc in #7335
RRF: Fix ident resolution for struct fields, autohashed join key for varying types (#7339) by @mach-kernel in #7339

Spice v1.7.0 (Sep 23, 2025)

September 23, 2025 · 21 min read

Sergei Grebnov

Senior Software Engineer at Spice AI

Announcing the release of Spice v1.7.0! ⚡

Spice v1.7.0 upgrades to DataFusion v49 for improved performance and query optimization, introduces real-time full-text search indexing for CDC streams, EmbeddingGemma support for high-quality embeddings, new search table functions powering the /v1/search API, embedding request caching for faster and cost-efficient search and indexing, and OpenAI Responses API tool calls with streaming. This release also includes numerous bug fixes across CDC streams, vector search, the Kafka Data Connector, and error reporting.

What's New in v1.7.0

DataFusion v49 Highlights

Figure: DataFusion ClickBench performance graph. Source: DataFusion 49.0.0 Release Blog.

Performance Improvements 🚀

Equivalence System Upgrade: Faster planning for queries with many columns, enabling more sophisticated sort-based optimizations.
Dynamic Filters & TopK Pushdown: Queries with ORDER BY and LIMIT now use dynamic filters and physical filter pushdown, skipping unnecessary data reads for much faster top-k queries.
Compressed Spill Files: Intermediate files written during sort/group spill to disk are now compressed, reducing disk usage and improving performance.
WITHIN GROUP for Ordered-Set Aggregates: Support for ordered-set aggregate functions (e.g., percentile_disc) with WITHIN GROUP.
REGEXP_INSTR Function: Find regex match positions in strings.
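
The idea behind dynamic filters for ORDER BY ... LIMIT queries can be illustrated with a small Python sketch: once k rows have been collected, the current k-th best value becomes a threshold, and any row that cannot beat it is skipped. This is a conceptual model only, not DataFusion's implementation; in DataFusion the filter is pushed down into the scan so the data is never read, whereas this hypothetical `top_k_with_dynamic_filter` merely counts the rows it would have skipped.

```python
import heapq


def top_k_with_dynamic_filter(values, k):
    """Return the k largest values and how many rows the running threshold skipped."""
    heap = []     # min-heap holding the current top k; heap[0] is the threshold
    skipped = 0
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            heapq.heapreplace(heap, value)  # evict the old k-th best
        else:
            skipped += 1                    # filtered out: cannot enter the top k
    return sorted(heap, reverse=True), skipped


top, skipped = top_k_with_dynamic_filter([5, 1, 9, 3, 7, 2, 8], k=3)
print(top)      # [9, 8, 7]
print(skipped)  # 1
```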

Spice Runtime Highlights

EmbeddingGemma Support: Spice now supports EmbeddingGemma, Google's state-of-the-art embedding model for text and documents. EmbeddingGemma provides high-quality, efficient embeddings for semantic search, retrieval, and recommendation tasks. You can use EmbeddingGemma via HuggingFace in your Spicepod configuration:

Example spicepod.yml snippet:

embeddings:
  - from: huggingface:huggingface.co/google/embeddinggemma-300m
    name: embeddinggemma
    params:
      hf_token: ${secrets:HUGGINGFACE_TOKEN}

Learn more about EmbeddingGemma in the official documentation.

POST /v1/search API Uses Search Table Functions: The /v1/search API now uses the new text_search and vector_search table functions for improved performance.

Embedding Request Caching: The runtime now supports caching embedding requests, reducing latency and cost for repeated content and search requests.

Example spicepod.yml snippet:

runtime:
  caching:
    embeddings:
      enabled: true
      max_size: 128mb
      item_ttl: 5s

See the Caching documentation for details.
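
Conceptually, an embedding request cache maps the input text (per model) to its embedding and expires entries after item_ttl. The Python sketch below shows that idea with a time-to-live only; it is an assumption-based illustration, `EmbeddingCache` is a hypothetical class, and it deliberately omits the max_size eviction that the real configuration provides.

```python
import time


class EmbeddingCache:
    """Minimal TTL cache keyed by (model, text). Size-based eviction is omitted."""

    def __init__(self, item_ttl_seconds=5.0, clock=time.monotonic):
        self.item_ttl_seconds = item_ttl_seconds
        self.clock = clock
        self._entries = {}  # (model, text) -> (expires_at, embedding)

    def get_or_compute(self, model, text, embed):
        key = (model, text)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                 # cache hit: no model call
        embedding = embed(text)             # cache miss or expired entry
        self._entries[key] = (now + self.item_ttl_seconds, embedding)
        return embedding


calls = []


def fake_embed(text):
    """Stand-in for a real embedding model call."""
    calls.append(text)
    return [float(len(text))]


cache = EmbeddingCache(item_ttl_seconds=5.0)
cache.get_or_compute("embeddinggemma", "hello world", fake_embed)
cache.get_or_compute("embeddinggemma", "hello world", fake_embed)
print(len(calls))  # 1
```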

Real-Time Indexing for Full-Text Search: Full-text search indexing is now supported for connectors that enable real-time changes, such as Debezium CDC streams. Adding a full-text index on a column with refresh_mode: changes works as it does for full/append-mode refreshes, enabling instant search on new data.

Example spicepod.yml snippet:

datasets:
  - from: debezium:cdc.public.question
    name: questions
    acceleration:
      enabled: true
      engine: duckdb
      primary_key: id
      refresh_mode: changes # Use 'changes'
    params: *kafka_params
    columns:
      - name: title
        full_text_search:
          enabled: true # Enable full-text-search indexing
          row_id:
            - id

OpenAI Responses API Tool Calls with Streaming: The OpenAI Responses API now supports tool calls with streaming, enabling advanced model interactions such as web_search and code_interpreter with real-time response streaming. This allows you to invoke OpenAI-hosted tools and receive results as they are generated.

Learn more in the OpenAI Model Provider documentation.

Runtime Output Level Configuration: You can now set the output_level parameter in the Spicepod runtime configuration to control logging verbosity in addition to the existing CLI and environment variable support. Supported values are info, verbose, and very_verbose. The value is applied in the following priority: CLI, environment variables, then YAML configuration.

Example spicepod.yml snippet:

runtime:
  output_level: info # or verbose, very_verbose

For more details on configuring output level, see the Troubleshooting documentation.
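
The priority order described above (CLI, then environment variables, then YAML) amounts to taking the first value that is set. A minimal Python sketch follows; `resolve_output_level` is a hypothetical helper, and falling back to `info` when nothing is set is an assumption rather than documented behavior.

```python
VALID_LEVELS = ("info", "verbose", "very_verbose")


def resolve_output_level(cli=None, env=None, yaml_config=None, default="info"):
    """Pick the output level from the highest-priority source that is set."""
    for candidate in (cli, env, yaml_config):  # priority: CLI > env > YAML
        if candidate is not None:
            if candidate not in VALID_LEVELS:
                raise ValueError(f"unsupported output_level: {candidate!r}")
            return candidate
    return default  # assumption: default when no source sets a level


print(resolve_output_level(cli="verbose", yaml_config="info"))       # verbose
print(resolve_output_level(env="very_verbose", yaml_config="info"))  # very_verbose
print(resolve_output_level(yaml_config="verbose"))                   # verbose
```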

Bug Fixes

Several bugs and issues have been resolved in this release, including:

CDC Streams: Fixed issues where refresh_mode: changes could prevent the Spice runtime from becoming Ready, and improved support for full-text indexing on CDC streams.
Vector Search: Fixed bugs where vector search HTTP pipeline could not find more than one IndexedTableProvider, and resolved errors with field mismatches in vector_search UDTF.
Kafka Integration: Improved Kafka schema inference with configurable sample size, improved consumer group persistence for SQLite and Postgres accelerations, and added cooperative mode support.
Perplexity Web Search: Fixed bug where Perplexity web search sometimes used incorrect query schema (limit).
Databricks: Fixed issue with unparsing embedded columns.
Error Reporting: ThrottlingException is now reported correctly instead of as InternalError.
Iceberg Data Connector: Added support for LIMIT pushdown.
Amazon S3 Vectors: Fixed ingestion issues with zero-vectors and improved handling when vector index is full.
Tracing: Fixed vector search tracing to correctly report SQL status.

Contributors

New Contributors

@ChrisTomAlxHitachi made their first contribution in github.com/spiceai/spiceai/pull/6932 🎉

Breaking Changes

No breaking changes.

Cookbook Updates

New Spice with Dotnet SDK Recipe - The recipe shows how to query Spice using the Dotnet SDK.

The Spice Cookbook includes 78 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.7.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.7.0 image:

docker pull spiceai/spiceai:1.7.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Dependencies

Rust: Upgraded from 1.88.0 to 1.89.0
DataFusion: Upgraded from 48.0.1 to 49.0.0
text-embeddings-inference: Upgraded from 1.7.3 to 1.8.2
twox-hash: Upgraded from 1.6.3 to 2.1.0

Changelog

Fix parameterised query planning in DataFusion by @Jeadie in #6942
fix: Update benchmark snapshots by @app/github-actions in #6944
refactor: Decouple full text search candidate from UDTF by @peasee in #6940
fix: Re-enable search integration tests by @peasee in #6930
Update acknowledgements and spicepod.schema.json by @sgrebnov in #6948
Add enabling the responses API by @lukekim in #6949
Post-release housekeeping by @sgrebnov in #6951
Add missing param in release notes by @Advayp in #6959
Create comprehensive S3vectors test by @Jeadie in #6903
Update ROADMAP after v1.6 release by @sgrebnov in #6955
Update openapi.json by @app/github-actions in #6961
Add build step for new spiced images in end game template by @Jeadie in #6960
refactor: Use text search UDTF in v1/search by @peasee in #6962
Bump Jimver/cuda-toolkit from 0.2.26 to 0.2.27 by @app/dependabot in #6922
Bump notify from 8.0.0 to 8.2.0 by @app/dependabot in #6924
Use model2vec for search integration tests for speed by @Jeadie in #6971
feat: Add initial DuckDB regexp pushdown support by @peasee in #6966
Bump rustyline from 16.0.0 to 17.0.1 by @app/dependabot in #6976
Upgrade delta_kernel to 0.14 by @phillipleblanc in #6977
Consistent snapshots for mongodb by @krinart in #6974
Bump indexmap from 2.10.0 to 2.11.0 by @app/dependabot in #6921
Fix mongo tests: ignore container_registry() when building image name by @krinart in #6983
Implement support for s3 tables for glue DataConnector by @krinart in #6981
Bump serde_json from 1.0.142 to 1.0.143 by @app/dependabot in #6925
Update build_and_release macOS pipeline to skip updating cmake if installed by @phillipleblanc in #6998
Mark Kafka Data Connector Alpha quality by @sgrebnov in #6991
Add v1.6.1 release notes by @lukekim in #7000
Spice CLI trace: make error friendlier when task_history is disabled by @sgrebnov in #6996
Warn when runtime or management is added in spicepod dependency by @Jeadie in #6953
Enable .datasets[].vectors.params.s3_vectors_distance_metric for S3 Vectors by @Jeadie in #6982
Add s3_vectors index support for CDC and Append streams by @sgrebnov in #6986
Find all vector indexes in v1/search by @Jeadie in #7004
Fix RRF; reorder by score by @Jeadie in #7007
Fix for nested VectorScanTableProvider by @krinart in #7017
Add --sql flag to output SQL query for spice trace by @Jeadie in #7002
Make web search params engine-specific by @Advayp in #7022
Add more MTEB benchmark spicepods by @peasee in #7026
Improve error messaging in tools by @Jeadie in #6895
Add retry for exporting task history records by @sgrebnov in #7049
Increase DoPut write timeout for the next batch from 30 to 120 seconds by @sgrebnov in #7054
Avoid redundant search embedding by @peasee in #7053
Truncate text_embed task_history trace by @sgrebnov in #7050
Use the UTC offset for the start_time and end_time fields in the task history by @ewgenius in #7056
Update supported versions in SECURITY.md by @Jeadie in #7060
Add integration test for Kafka S3 Vectors by @sgrebnov in #6988
Enable parameters to enforce the value is one of several options by @Jeadie in #6984
feat(iceberg): lakekeeper catalog - add warehouse param to spicepod by @ChrisTomAlxHitachi in #6932
feat: Add HTTP query concurrency support to testoperator by @peasee in #7025
Ensure no data does not throw error in v1/search by @Jeadie in #7033
Bump github.com/spf13/cobra from 1.9.1 to 1.10.1 by @app/dependabot in #7013
Add QA analytics for 1.6.x releases by @sgrebnov in #7082
Use env variable for HF cache in model2vec by @Jeadie in #7076
chore: upgrade to Rust 1.88 by @kczimm in #7077
Kafka/Debezium: make common errors user-friendlier by @sgrebnov in #7084
Create Apache Datafusion upgrade issue template by @kczimm in #6800
No join predicate pushdown on empty results by @Jeadie in #7075
Bump tract-onnx from 0.21.10 to 0.22.0 by @app/dependabot in #7071
Bump mongodb from 3.2.4 to 3.3.0 by @app/dependabot in #7073
Bump indicatif from 0.17.11 to 0.18.0 by @app/dependabot in #7070
Bump actions/github-script from 7 to 8 by @app/dependabot in #7069
Bump actions/setup-go from 5 to 6 by @app/dependabot in #7068
Bump actions/download-artifact from 4 to 5 by @app/dependabot in #7066
Bedrock: Tool use without inputs must empty Document by @Jeadie in #7036
Bump github.com/stretchr/testify from 1.10.0 to 1.11.1 by @app/dependabot in #7015
Bump actions/setup-python from 5 to 6 by @app/dependabot in #7067
Upgrade dependabot dependencies by @phillipleblanc in #7061
Bump tempfile from 3.20.0 to 3.21.0 by @app/dependabot in #7018
Only call 'list_datasets' once, after initial system/user messages by @Jeadie in #7039
Bump github.com/spf13/pflag from 1.0.7 to 1.0.10 by @app/dependabot in #7062
Bump actions/checkout from 4 to 5 by @app/dependabot in #7065
Bump golang.org/x/mod from 0.27.0 to 0.28.0 by @app/dependabot in #7064
Bump github.com/AzureAD/microsoft-authentication-library-for-go from 1.4.1 to 1.5.0 by @app/dependabot in #7063
Add friendly message for Kafka operation timeout error, improve code by @sgrebnov in #7088
embed UDF by @mach-kernel in #6967
fix: Update benchmark snapshots by @app/github-actions in #7097
Fix SF100 benchmark tests dispatch by @sgrebnov in #7098
chore(logging): add log when iceberg rest catalog fails with ssl cert error by @ChrisTomAlxHitachi in #6909
Add xxhash support for search/sql results by @krinart in #6978
Use proper federation in max_timestamp_df during acceleration refresh by @krinart in #7055
Fix spiced_docker workflows for new actions/download-artifact@v5 behavior by @phillipleblanc in #7108
Fix spiced_docker workflow by @phillipleblanc in #7111
Add filter for zero vectors before writing to S3 Vectors by @phillipleblanc in #7110
Ensure we find vector index when it also has text search by @Jeadie in #7120
Enable unified traceparent override support for HTTP API by @sgrebnov in #7122
Fix ORDER BY: (BytesProcessedExec to avoid pruning ordered execs during physical optimization) by @mach-kernel in #7105
Fix spiced_docker_nightly workflow by @sgrebnov in #7125
Add output_level to runtime config by @krinart in #7119
Add tests for xxhash hashers by @krinart in #7124
Add input option to update snapshots in Integration tests by @Jeadie in #7127
Fix formatting to improve merges by @lukekim in #7128
Add tests to nulling logic by @Jeadie in #7113
Bump chrono from 0.4.41 to 0.4.42 by @app/dependabot in #7131
Bump ctrlc from 3.4.7 to 3.5.0 by @app/dependabot in #7132
Search: RRF UDTF by @mach-kernel in #7090
Update openapi.json by @app/github-actions in #7141
Bump packages to DF49; resolve incompatibilities by @Jeadie in #7101
fix: Don't error for chunked columns when vectors are disabled by @peasee in #7150
Allow bzip2-1.0.6 license in deny.toml by @Jeadie in #7148
Tune retry settings for Kafka/Debezium connectors by @sgrebnov in #7142
Update TEI by @Jeadie in #7152
Use twox-hash version 2.1.2 by @krinart in #7165
Revert "Use proper federation in max_timestamp_df during acceleration refresh (#7055)" by @phillipleblanc in #7156
Bump octocrab from 0.44.1 to 0.45.0 by @app/dependabot in #7158
Bump github.com/spf13/viper from 1.19.0 to 1.21.0 by @app/dependabot in #7130
Bump keyring from 3.6.2 to 3.6.3 by @app/dependabot in #7157
fix: Remove keywords from AI document search by @peasee in #7052
Bump tract-core from 0.21.10 to 0.22.0 by @app/dependabot in #7134
Update TEI by @Jeadie in #7171
Update openapi.json by @app/github-actions in #7172
fix: Ensure vector search UDTF respects the supplied projection by @peasee in #7155
Bump clap from 4.5.45 to 4.5.47 by @app/dependabot in #7135
Bump golang.org/x/sys from 0.35.0 to 0.36.0 by @app/dependabot in #7129
Include 'catalog_id' in Glue catalog parameters by @Jeadie in #7151
fix: Use head ref from merge group event in pulls-with-spice concurrency group by @peasee in #7175
Fix lint for xxhash feature by @phillipleblanc in #7176
Add Kafka-specific metrics for consumer lag and consumed records by @sgrebnov in #7146
Kafka: persist consumer between restarts with SQLite and PG acceleration by @sgrebnov in #7177
Kafka: support specifying a target consumer group ID by @sgrebnov in #7178
Fix timestamp parsing for spice trace by @krinart in #7173
Support full-text indexing on CDC/append streams by @phillipleblanc in #7180
Bump iceberg-rust version to include limit push down by @krinart in #7191
Make full text stream connector more robust by @phillipleblanc in #7193
fix: Update benchmark snapshots by @app/github-actions in #7179
Initial changes for SearchIndex by @Jeadie in #7103
Robustly handle indexing FTS for CDC streams by @phillipleblanc in #7197
Proper handling/mapping for ThrottlingException during embedding calls by @krinart in #7170
Add spicepod.yml by @lukekim in #7202
Delta Lake: Support read pruning on timestamp columns using maxValues stats by @sgrebnov in #7203
feat: Add initial embeddings cache by @peasee in #7194
Make S3vector a FixedSizeListArray by @Jeadie in #7201
Fix projection mismatch issues with RRF calling vector search / text search by @mach-kernel in #7200
feat: Add embeddings cache to all embeddings by @peasee in #7204
Revert "Make S3vector a FixedSizeListArray (#7201)" by @kczimm in #7210
Update duckdb version to make ICU statically linked by default by @krinart in #7215
Change DataType list nullability from true to false by @Jeadie in #7216
Use Instant + saturating_sub to handle time drift by @krinart in #7212
Flatten 'IndexedTableProvider' when adding full-text support by @Jeadie in #7219
Include comments in pulls by @lukekim in #7224
Add github_max_concurrent_connections = 5 by @lukekim in #7225
RRF: Fix scoring by @mach-kernel in #7226
Update RRF search integration snapshots after scoring change by @mach-kernel in #7227
Make S3vector a FixedSizeListArray by @Jeadie in #7230
Proper federation during acceleration refresh + datafusion version bump + integration tests by @krinart in #7228
Use DuckDBDialect for DuckDB non-federated queries by @krinart in #7232
Move chunking out of llms and into new crate chunking by @Jeadie in #7229
Remove duplicate pg_port configuration in test by @lukekim in #7233
Upgrade to Rust 1.89 by @phillipleblanc in #7235
Catalog connection error: fix connector name from 'iceberg' to 'spice.ai' by @sgrebnov in #7240
Create PutVectorsSink by @kczimm in #7199
Benchmark tests: fix API key reference in spicecloud catalog by @sgrebnov in #7239
Add Dotnet SDK sample to end game template by @sgrebnov in #7238
Update spicepod.schema.json by @app/github-actions in #7254
Postgres: Improve Decimals read performance and add Name type support by @sgrebnov in #7255
Add tests for hybrid search on a vector engine by @Jeadie in #7220

Amazon S3 Vectors with Spice

July 31, 2025 · 26 min read

Jack Eadie

Token Plumber at Spice AI

The latest Spice.ai Open Source release (v1.5.0) brings major improvements to search, including native support for Amazon S3 Vectors. Announced in public preview at AWS Summit New York 2025, Amazon S3 Vectors is a new S3 bucket type purpose-built for vector embeddings, with dedicated APIs for similarity search.

Spice AI was a day 1 launch partner for S3 Vectors, integrating it as a scalable vector index backend. In this post, we explore how S3 Vectors integrates into Spice.ai’s data, search, and AI-inference engine, how Spice manages indexing and lifecycle of embeddings for production vector search, and how this unlocks a powerful hybrid search experience. We’ll also put this in context with industry trends and compare Spice’s approach to other vector database solutions like Qdrant, Weaviate, Pinecone, and Turbopuffer.

Amazon S3 Vectors Overview

Amazon S3 Vectors extends S3 object storage with native support for storing and querying vectors at scale. As AWS describes, it is “designed to provide the same elasticity, scale, and durability as Amazon S3,” providing storage of billions of vectors and sub-second similarity queries. Crucially, S3 Vectors dramatically lowers the cost of vector search infrastructure – reducing upload, storage, and query costs by up to 90% compared to traditional solutions. It achieves this by separating storage from compute: vectors reside durably in S3, and queries execute on transient, on-demand resources, avoiding the need for always-on, memory-intensive vector database servers. In practice, S3 Vectors exposes two core operations:

Upsert vectors – assign a vector (an array of floats) to a given key (identifier) and optionally store metadata alongside it.
Vector similarity query – given a new query vector, efficiently find the stored vectors that are closest (e.g. minimal distance) to it, returning their keys (and scores).

This transforms S3 into a massively scalable vector index service. You can store embeddings at petabyte scale and perform similarity search with metrics like cosine or Euclidean distance via a simple API. It’s ideal for AI use cases like semantic search, recommendations, or Retrieval-Augmented Generation (RAG) where large volumes of embeddings need to be queried semantically. By leveraging S3’s pay-for-use storage and ephemeral compute, S3 Vectors can handle infrequent or large-scale queries much more cost-effectively than memory-bound databases, yet still deliver sub-second results.
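
To make the two core operations concrete, here is a toy in-memory model of them in Python. It is a conceptual sketch only: `InMemoryVectorIndex` and its methods are hypothetical names, it does a brute-force scan with cosine distance, and it says nothing about how S3 Vectors stores or searches vectors internally.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


class InMemoryVectorIndex:
    """Toy stand-in for a vector index: upsert by key, query by similarity."""

    def __init__(self):
        self._vectors = {}  # key -> (vector, metadata)

    def upsert(self, key, vector, metadata=None):
        # Insert a new vector or overwrite the existing one for this key.
        self._vectors[key] = (list(vector), metadata or {})

    def query(self, query_vector, top_k=3):
        # Brute-force scan: score every stored vector, closest first.
        scored = [
            (key, cosine_distance(query_vector, vector))
            for key, (vector, _) in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:top_k]


index = InMemoryVectorIndex()
index.upsert("r1", [1.0, 0.0], {"source": "reviews"})
index.upsert("r2", [0.0, 1.0])
index.upsert("r3", [0.9, 0.1])

for key, distance in index.query([1.0, 0.0], top_k=2):
    print(key, round(distance, 4))
```

The query returns keys and scores only, which mirrors the description above: the caller uses the keys to look up the full records elsewhere.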

Vector Search with Embeddings

Vector similarity search retrieves data by comparing items in a high-dimensional embedding space rather than by exact keywords. In a typical pipeline:

Data to vectors: We first convert each data item (text, image, etc.) into a numeric vector representation (embedding) using an ML model. For example, a customer review text might be turned into a 768-dimensional embedding that encodes its semantic content. Models like Amazon Titan Embeddings, OpenAI, or Hugging Face sentence transformers handle this step.
Index storage: These vectors are stored in a specialized index or database optimized for similarity search. This could be a dedicated vector database or, in our case, Amazon S3 Vectors acting as the index. Each vector is stored with an identifier (e.g. the primary key of the source record) and possibly metadata.
Query by vector: A search query (e.g. a phrase or image) is also converted into an embedding vector. The vector index is then queried to find the closest stored vectors by distance metric (cosine, Euclidean, dot product, etc.). The result is a set of IDs of the most similar items, often with a similarity score.

This process enables semantic search – results are returned based on meaning and similarity rather than exact text matches. It powers features like finding relevant documents by topic even if exact terms differ, recommendation systems (finding similar user behavior or content), and providing knowledge context to LLMs in RAG. With the Spice.ai Open Source integration, this whole lifecycle (embedding data, indexing vectors, querying) is managed by the Spice runtime and exposed via a familiar SQL or HTTP interface.

Amazon S3 Vectors in Spice.ai

Spice integration with Amazon S3 Vectors

Spice.ai is an open-source data, search and AI compute engine that supports vector search end-to-end. By integrating S3 Vectors as an index, Spice can embed data, store embeddings in S3, and perform similarity queries – all orchestrated through simple configuration and SQL queries. Let’s walk through how you enable and use this in Spice.

Configuring a Dataset with Embeddings

To use vector search, annotate your dataset schema to specify which column(s) to embed and with which model. Spice supports various embedding models (both local and hosted) via the embeddings section in the configuration. For example, suppose we have a customer reviews table and we want to enable semantic search over the review text (body column):

datasets:
  - from: oracle:"CUSTOMER_REVIEWS"
    name: reviews
    columns:
      - name: body
        embeddings:
          from: bedrock_titan # use an embedding model defined below

embeddings:
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: bedrock_titan
    params:
      aws_region: us-east-2
      dimensions: '256'

In this spicepod.yaml, we defined an embedding model bedrock_titan (in this case AWS's Titan text embedding model) and attached it to the body column. When the Spice runtime ingests the dataset, it will automatically generate a vector embedding for each row’s body text using that model. By default, Spice can either store these vectors in its acceleration layer or compute them on the fly. However, with S3 Vectors, we can offload them to an S3 Vectors index for scalable storage.

To use S3 Vectors, we simply enable the vector engine in the dataset config:

datasets:
  - from: oracle:"CUSTOMER_REVIEWS"
    name: reviews
    vectors:
      enabled: true
      engine: s3_vectors
      params:
        s3_vectors_bucket: my-s3-vector-bucket
    # ... (rest of dataset definition as above)

This tells Spice to create or use an S3 Vectors index (in the specified S3 bucket) for storing the body embeddings. Spice manages the entire index lifecycle: it creates the vector index, handles inserting each vector with its primary key into S3, and knows how to query it. The embedding model and data source are as before – the only change is where the vectors are stored and queried. The benefit is that now our vectors reside in S3’s highly scalable storage, and we can leverage S3 Vectors’ efficient similarity search API.

Performing a Vector Search Query

Once configured, performing a semantic search is straightforward. Spice exposes both an HTTP endpoint and a SQL table-valued function for vector search. For example, using the HTTP API:

curl -X POST http://localhost:8090/v1/search \
 -H "Content-Type: application/json" \
 -d '{
 "datasets": ["reviews"],
 "text": "issues with same day shipping",
 "additional_columns": ["rating", "customer_id"],
 "where": "created_at >= now() - INTERVAL '\''7 days'\''",
 "limit": 2
 }'

This JSON query says: search the reviews dataset for items similar to the text "issues with same day shipping", and return the top 2 results, including their rating and customer id, filtered to reviews from the last 7 days. The Spice engine will embed the query text (using the same model as the index), perform a similarity lookup in the S3 Vectors index, filter by the WHERE clause, and return the results. A sample response might look like:

{
  "results": [
    {
      "matches": {
        "body": "Everything on the site made it seem like I’d get it the same day. Still waiting the next morning was a letdown."
      },
      "data": { "rating": 3, "customer_id": 6482 },
      "primary_key": { "review_id": 123 },
      "score": 0.82,
      "dataset": "reviews"
    },
    {
      "matches": {
        "body": "It was marked as arriving 'today' when I paid, but the delivery was pushed back without any explanation. Timing was kind of important for me."
      },
      "data": { "rating": 2, "customer_id": 3310 },
      "primary_key": { "review_id": 24 },
      "score": 0.76,
      "dataset": "reviews"
    }
  ],
  "duration_ms": 86
}

Each result includes the matching column snippet (body), the additional requested fields, the primary key, and a relevance score. In this case, the two reviews shown are indeed complaints about “same day” delivery issues, which the vector search found based on semantic similarity to the query (note that the second result never mentions "same day" delivery, but describes an issue similar to the first).
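For application code, the same request and response can be handled programmatically. The sketch below assumes only the request fields and response shape shown in the example above; the helper names `build_search_request` and `flatten_results` are hypothetical, and no network call is made.

```python
import json

def build_search_request(text, datasets, additional_columns=None, where=None, limit=10):
    """Build the JSON body for POST /v1/search from the fields used in the curl example."""
    body = {"datasets": datasets, "text": text, "limit": limit}
    if additional_columns:
        body["additional_columns"] = additional_columns
    if where:
        body["where"] = where
    return json.dumps(body)

def flatten_results(response: dict) -> list[dict]:
    """Merge each result's primary key, extra columns, and matches into one flat row."""
    rows = []
    for result in response.get("results", []):
        row = {"dataset": result["dataset"], "score": result["score"]}
        row.update(result.get("primary_key", {}))
        row.update(result.get("data", {}))
        row.update(result.get("matches", {}))
        rows.append(row)
    return rows

payload = build_search_request(
    text="issues with same day shipping",
    datasets=["reviews"],
    additional_columns=["rating", "customer_id"],
    where="created_at >= now() - INTERVAL '7 days'",
    limit=2,
)

# Trimmed copy of the sample response shown above.
sample_response = {
    "results": [
        {
            "matches": {"body": "Everything on the site made it seem like..."},
            "data": {"rating": 3, "customer_id": 6482},
            "primary_key": {"review_id": 123},
            "score": 0.82,
            "dataset": "reviews",
        }
    ],
    "duration_ms": 86,
}
rows = flatten_results(sample_response)
```

Building the body with `json.dumps` also sidesteps shell quoting of the single quotes inside the `where` filter.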

Developers can also use SQL for the same operation. Spice provides a table function vector_search(dataset, query) that can be used in the FROM clause of a SQL query. For example, the above search could be expressed as:

SELECT review_id, rating, customer_id, body, score
FROM vector_search(reviews, 'issues with same day shipping')
WHERE created_at >= to_unixtime(now() - INTERVAL '7 days')
ORDER BY score DESC
LIMIT 2;

This would yield a result set (with columns like review_id, score, etc.) similar to the JSON above, which you can join or filter just like any other SQL table. This ability to treat vector search results as a subquery/table and combine them with standard SQL filtering is a powerful feature of Spice.ai’s integration – few other solutions let you natively mix vector similarity and relational queries so seamlessly.


Managing Embeddings Storage in Spice.ai

An important design question for any vector search system is where and how to store the embedding vectors. Before introducing S3 Vectors, Spice offered two approaches for managing vectors:

Accelerator storage: Embed the data in advance and store the vectors alongside other cached data in a Data Accelerator (Spice’s high-performance materialization layer). This keeps vectors readily accessible in memory or fast storage.
Just-in-time computation: Compute the necessary vectors on the fly during a query, rather than storing them persistently. For example, at query time, embed only the subset of rows that satisfy recent filters (e.g. all reviews in the last 7 days) and compare those to the query vector.

Both approaches have trade-offs. Pre-storing in an accelerator provides fast query responses but may not be feasible for very large datasets (which might not fit entirely, or affordably, in fast storage), and accelerators like DuckDB or SQLite aren’t optimized for similarity search algorithms on billion-scale vectors. Just-in-time embedding avoids extra storage but becomes prohibitively slow when computing embeddings over large data scans (and for each query), and it provides no efficient algorithm for finding nearest neighbors.

Amazon S3 Vectors offers a compelling third option: the scalability of S3 with the efficient retrieval of vector index data structures. By configuring the dataset with engine: s3_vectors as shown earlier, Spice will offload the vector storage and similarity computations to S3 Vectors. This means you can handle very large embedding sets (millions or billions of items) without worrying about Spice’s memory or local disk limits, and still get fast similarity operations via S3’s API. In practice, when Spice ingests data, it will embed each row’s body and PUT it into the S3 Vector index (with the review_id as the key, and possibly some metadata). At query time, Spice calls S3 Vectors’ query API to retrieve the nearest neighbors for the embedded query. All of this is abstracted away; you simply query Spice and it orchestrates these steps.

The Spice runtime manages index creation, updates, and deletion. For instance, if new data comes in or old data is removed, Spice will synchronize those changes to the S3 vector index. Developers don’t need to directly interact with S3 – it’s configured once in YAML. This tight integration accelerates application development: your app can treat Spice like any other database, while behind the scenes Spice leverages S3’s elasticity for the heavy lifting.

Vector Index Usage in Query Execution

How does a vector index actually get used in Spice’s SQL query planner? To illustrate, consider the simplified SQL we used:

SELECT *
FROM vector_search(reviews, 'issues with same day shipping')
ORDER BY score DESC
LIMIT 5;

Logically, without a vector index, Spice would have to do the following at query time:

Embed the query text 'issues with same day shipping' into a vector v.
Retrieve or compute all candidate vectors for the searchable column (here every body embedding in the dataset). This could mean scanning every row, or at least every row matching the other filter predicates.
Calculate distances between the query vector v and each candidate vector, and compute a similarity score (e.g. score = 1 - distance).
Sort all candidates by the score and take the top 5.

For large datasets, steps 2–4 would be extremely expensive (a brute-force scan through potentially millions of vectors for each search, then a full sort operation). A vector index avoids unnecessary recomputation of embeddings, reduces the number of distance calculations required, and returns candidate neighbors already in order.

With S3 Vectors, steps 2 and 3 are pushed down to the S3 service. The vector index can directly return the top K closest matches to v. Conceptually, S3 Vectors gives back an ordered list of primary keys with their similarity scores. For example, it might return something like: {(review_id=123, score=0.82), (review_id=24, score=0.76), ...} up to K results.

Spice then uses these results, logically as a temporary table (let’s call it vector_query_results), joined with the main reviews table to get the full records. In SQL pseudocode, Spice does something akin to:

-- The vector index returns the closest matches for a given query.
CREATE TEMP TABLE vector_query_results (
 review_id BIGINT,
 score FLOAT
);

-- Imagine this temp table is populated by an efficient vector retrieval operation in S3 Vectors for the query.

-- Now we join to retrieve full details
SELECT r.review_id, r.rating, r.customer_id, r.body, v.score
FROM vector_query_results v
JOIN reviews r ON r.review_id = v.review_id
ORDER BY v.score DESC
LIMIT 5;

This way, only the top few results (say 50 or 100 candidates) are processed in the database, rather than the entire dataset. The heavy work of narrowing down candidates occurs inside the vector index. Spice essentially treats vector_search(dataset, query) as a table-valued function that produces (id, score) pairs which are then joinable.
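The same join can be pictured outside SQL. In this sketch, `query_vector_index` is a hypothetical stub standing in for the S3 Vectors call, and `reviews` is an in-memory stand-in for the source table; neither reflects Spice internals.

```python
# Hypothetical in-memory stand-in for the reviews table, keyed by primary key.
reviews = {
    123: {"rating": 3, "customer_id": 6482},
    24: {"rating": 2, "customer_id": 3310},
    7: {"rating": 5, "customer_id": 1001},
}

def query_vector_index(k: int) -> list[tuple[int, float]]:
    """Stub for the index call: returns up to k (review_id, score) pairs, best first."""
    ranked = [(123, 0.82), (24, 0.76), (7, 0.41)]
    return ranked[:k]

def join_with_table(matches, table, limit):
    """Look up full records only for the ids the index returned, then order and trim."""
    rows = []
    for review_id, score in matches:
        record = table.get(review_id)
        if record is None:
            continue  # the row no longer exists in the source table
        rows.append({"review_id": review_id, **record, "score": score})
    rows.sort(key=lambda row: row["score"], reverse=True)
    return rows[:limit]

top_rows = join_with_table(query_vector_index(k=3), reviews, limit=2)
```

The table is touched once per returned id, so the lookup cost depends on K rather than on the size of the dataset.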

Handling Filters Efficiently

One consideration when using an external vector index is how to handle additional filter conditions (the WHERE clause). In our example, we had a filter created_at >= now() - 7 days. If we simply retrieve the top K results from the vector search and then apply the time filter, we might run into an issue: those top K might not include any recent items, even if there are relevant recent items slightly further down the similarity ranking. This is because S3 Vectors (like most ANN indexes) will return the top K most similar vectors globally, unaware of our date constraint.

If only a small fraction of the data meets the filter, a naive approach could drop most of the top results, leaving fewer than the desired number of final results. For example, imagine the vector index returns 100 nearest reviews overall, but only 5% of all reviews are from the last week – we’d expect only ~5 of those 100 to be recent, possibly fewer than the LIMIT. The query could end up with too few results not because they don’t exist, but because the index wasn’t filter-aware and we truncated the candidate list.

To solve this, S3 Vectors supports metadata filtering at query time. We can store certain fields as metadata with each vector and have the similarity search constrained to vectors where the metadata meets criteria. Spice.ai leverages this by allowing you to mark some dataset columns as “vector filterable”. In our YAML, we could do:

columns:
  - name: created_at
    metadata:
      vectors: filterable

By doing this, Spice will include the created_at value with each vector it upserts to S3, and its query planner will push down the time filter into the S3 Vectors query. Under the hood, the S3 vector query will then return only nearest neighbors that also satisfy created_at >= now()-7d. This greatly improves both efficiency and result relevance. The query execution would conceptually become:

-- Vector query with filter returns a temp table including the metadata
CREATE TEMP TABLE vector_query_results (
 review_id BIGINT,
 score FLOAT,
 created_at TIMESTAMP
);
-- vector_query_results is already filtered to last 7 days

SELECT r.review_id, r.rating, r.customer_id, r.body, v.score
FROM vector_query_results v
JOIN reviews r ON r.review_id = v.review_id
-- (no need for additional created_at filter here, it’s pre-filtered)
ORDER BY v.score DESC
LIMIT 5;

Because the index now returns only reviews from the last week, the query returns a full result set (i.e. respecting LIMIT 5) whenever at least five matching reviews from the last week exist.
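A small simulation makes the difference concrete. The data here is synthetic and the function names are hypothetical: 1,000 vectors with strictly decreasing scores, of which 5% are flagged as recent, mirroring the 5% example above.

```python
# Synthetic data: item i has score 1 - i/1000, and every 20th item is "recent" (5%).
vectors = [{"id": i, "score": 1 - i / 1000, "recent": i % 20 == 0} for i in range(1000)]

def post_filter(items, k, limit):
    """Naive approach: take the global top K first, then apply the filter."""
    top_k = sorted(items, key=lambda v: v["score"], reverse=True)[:k]
    return [v for v in top_k if v["recent"]][:limit]

def pushdown_filter(items, limit):
    """Filter-aware approach: only eligible vectors compete for the top slots."""
    eligible = [v for v in items if v["recent"]]
    return sorted(eligible, key=lambda v: v["score"], reverse=True)[:limit]

naive = post_filter(vectors, k=100, limit=10)
pushed = pushdown_filter(vectors, limit=10)
```

With K = 100 and a limit of 10, the naive path yields only 5 rows even though 50 recent rows exist, while the pushed-down filter fills all 10 slots.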

Including Data to Avoid Joins

Another optimization Spice supports is storing additional, non-filterable columns in the vector index to entirely avoid the expensive table join back to the main table for certain queries. For example, we might mark rating, customer_id, or even the text body as non-filterable vector metadata. This means these fields are stored with the vector in S3, but not used for filtering (just for retrieval). In the Spice config, it would look like:

columns:
  - name: rating
    metadata:
      vectors: non-filterable
  - name: customer_id
    metadata:
      vectors: non-filterable
  - name: body
    metadata:
      vectors: non-filterable

With this setup, when Spice queries S3 Vectors, the vector index will return not only each match’s review_id and score, but also the stored rating, customer_id, and body values. Thus, the temporary vector_query_results table already has all the information needed to satisfy the query. We don’t even need to join against the reviews table unless we want some column that wasn’t stored. The query can be answered entirely from the index data:

SELECT review_id, rating, customer_id, body, score
FROM vector_query_results
ORDER BY score DESC
LIMIT 5;

This is particularly useful for read-heavy query workloads where hitting the main database adds latency. By storing the most commonly needed fields along with the vector, Spice’s vector search behaves like an index-only query (similar to covering indexes in relational databases). You trade a bit of extra storage in S3 (duplicating some fields, but still managed by Spice) for faster queries that bypass the heavier join.

This extends to WHERE conditions on non-filterable columns, or filter predicates unsupported by S3 Vectors. Spice's execution engine can apply these filters itself, still avoiding any expensive JOIN on the underlying table.

SELECT review_id, rating, customer_id, body, score
FROM vector_query_results
WHERE rating > 3  -- Filter applied in Spice, using non-filterable data returned by the vector index
ORDER BY score DESC
LIMIT 5;

It’s worth noting that you should choose carefully which fields to mark as metadata – too many or very large fields could increase index storage and query payload sizes. Spice gives you the flexibility to include just what you need for filtering and projection to optimize each use case.
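The covering-index decision can be sketched as a simple set check. This is an illustration of the idea rather than Spice's planner: `plan_query` and `INDEX_COLUMNS` are hypothetical names, and the column set assumes the metadata configuration shown above.

```python
# Columns assumed to be available from the index: key, score, and stored metadata.
INDEX_COLUMNS = {"review_id", "score", "created_at", "rating", "customer_id", "body"}

def plan_query(requested_columns, index_columns=INDEX_COLUMNS):
    """Decide whether the index alone can answer the query or a join is required."""
    missing = [col for col in requested_columns if col not in index_columns]
    return {"needs_join": bool(missing), "missing_columns": missing}

def filter_in_engine(index_rows, predicate):
    """Apply a predicate the index cannot evaluate, using metadata it returned."""
    return [row for row in index_rows if predicate(row)]

covered = plan_query(["review_id", "rating", "body", "score"])
not_covered = plan_query(["review_id", "title", "score"])

index_rows = [
    {"review_id": 123, "rating": 3, "score": 0.82},
    {"review_id": 7, "rating": 5, "score": 0.41},
]
high_rated = filter_in_engine(index_rows, lambda row: row["rating"] > 3)
```

Only the second plan needs to touch the source table, because `title` was never stored as vector metadata.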

Beyond Basic Vector Search in Spice

Many real-world search applications go beyond a single-vector similarity lookup. Spice.ai’s strength is that it’s a full database engine. You can compose more complex search workflows, including hybrid search (combining keyword/text search with vector search), multi-vector queries, re-ranking strategies, and more. Spice provides both an out-of-the-box hybrid search API and the ability to write custom SQL to implement advanced retrieval logic.

Multiple vector fields or multi-modal search: You might have vectors for different aspects of your data that you want to search across and combine. For example, an e-commerce product could have embeddings for both its description and its image, or a document could have a title and a body that should be searchable individually and together. Spice lets you run vector search on multiple columns, and you can weight the importance of each. For instance, you might boost matches in the title higher than matches in the body.
Vector and full-text search: Similar to vector search, columns can have text indexes defined that enable full-text BM25 search. Text search can then be performed in SQL with the text_search(dataset, query) UDTF.
Hybrid vector + keyword search: Sometimes you want to ensure certain keywords are present while also using semantic similarity. Spice supports hybrid search natively: its default /v1/search HTTP API performs both full-text BM25 search and vector search, then merges results using Reciprocal Rank Fusion (RRF). This means you get a balanced result set that accounts for direct keyword matches as well as semantic similarity. In SQL, you can combine text_search and vector_search results yourself. The example below demonstrates how RRF can be implemented in SQL by combining ranks.
Two-phase retrieval (re-ranking): A common pattern is to use a fast first-pass retrieval (e.g. a keyword search) to get a larger candidate set, then apply a more expensive or precise ranking (e.g. vector search) to this subset to produce a better-ordered final result. With Spice, you can orchestrate this in SQL or in application code. For example, you could query a BM25 index for 100 candidates, then perform a vector search within this candidate set (i.e. restricted to those IDs) as a second phase. Since Spice supports standard SQL constructs, you can express these multi-step plans with common table expressions (CTEs) and joins.
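As a rough illustration of two-phase retrieval in application code, the sketch below uses term overlap as a stand-in for BM25 and hand-written two-dimensional vectors as stand-ins for embeddings. All data and function names are hypothetical.

```python
import math

# Hypothetical corpus: id -> (text, precomputed toy embedding).
docs = {
    1: ("same day shipping was late", [0.9, 0.1]),
    2: ("shipping box damaged", [0.2, 0.8]),
    3: ("delivery pushed back a day", [0.8, 0.3]),
    4: ("love the color", [0.1, 0.9]),
}

def keyword_candidates(query: str, limit: int) -> list[int]:
    """Phase 1: cheap first pass, scored by the number of shared terms."""
    terms = set(query.lower().split())
    hits = []
    for doc_id, (text, _) in docs.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap > 0:
            hits.append((doc_id, overlap))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in hits[:limit]]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rerank(candidate_ids, query_vec, k):
    """Phase 2: more precise scoring, restricted to the phase 1 candidates."""
    scored = [(doc_id, cosine(query_vec, docs[doc_id][1])) for doc_id in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

candidates = keyword_candidates("same day shipping", limit=3)
final_ids = rerank(candidates, query_vec=[1.0, 0.0], k=2)
```

Document 3 shares only one keyword with the query but ranks second after re-ranking, because its vector is close to the query vector.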

To illustrate hybrid search, here’s a SQL snippet that uses the Reciprocal Rank Fusion (RRF) technique to merge vector and text search results for the same query (RRF is used, when needed, in the /v1/search HTTP API):

WITH
vector_results AS (
 SELECT review_id, RANK() OVER (ORDER BY score DESC) AS vector_rank
 FROM vector_search(reviews, 'issues with same day shipping')
),
text_results AS (
 SELECT review_id, RANK() OVER (ORDER BY score DESC) AS text_rank
 FROM text_search(reviews, 'issues with same day shipping')
)
SELECT
 COALESCE(v.review_id, t.review_id) AS review_id,
 -- RRF scoring: 1/(60+rank) from each source
 (1.0 / (60 + COALESCE(v.vector_rank, 1000)) +
 1.0 / (60 + COALESCE(t.text_rank, 1000))) AS fused_score
FROM vector_results v
FULL OUTER JOIN text_results t ON v.review_id = t.review_id
ORDER BY fused_score DESC
LIMIT 50;

This takes the vector similarity results and the text (BM25) results, assigns each candidate a rank based on its relative position rather than its raw score, and combines these ranks into an overall order. Spice’s primary-key semantics in SQL make this join on document ID straightforward.
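The same fusion can be written in a few lines of Python, which may help when checking the arithmetic. This sketch mirrors the SQL above (constant 60, and a rank of 1000 for documents missing from one list); the function name and sample ids are hypothetical. It uses list position as the rank, so unlike RANK() it does not give tied scores the same rank.

```python
def rrf_fuse(vector_ids, text_ids, k=60, missing_rank=1000):
    """Reciprocal Rank Fusion over two ranked id lists (best first)."""
    vector_rank = {doc_id: pos + 1 for pos, doc_id in enumerate(vector_ids)}
    text_rank = {doc_id: pos + 1 for pos, doc_id in enumerate(text_ids)}
    fused = {}
    for doc_id in set(vector_rank) | set(text_rank):
        fused[doc_id] = (
            1.0 / (k + vector_rank.get(doc_id, missing_rank))
            + 1.0 / (k + text_rank.get(doc_id, missing_rank))
        )
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))

fused = rrf_fuse(vector_ids=[123, 24, 7], text_ids=[24, 99, 123])
fused_ids = [doc_id for doc_id, _ in fused]
```

Review 24 comes out on top because it ranks highly in both lists, while reviews that appear in only one list fall to the bottom.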

For a multi-column vector search example, suppose our reviews dataset has both a title and body with embeddings, and we want to prioritize title matches higher. We could create a combined_score where the title is weighted twice as high as the body:

WITH
body_results AS (
 SELECT review_id, score AS body_score
 FROM vector_search(reviews, 'issues with same day shipping', col => 'body')
),
title_results AS (
 SELECT review_id, score AS title_score
 FROM vector_search(reviews, 'issues with same day shipping', col => 'title')
)
SELECT
 COALESCE(b.review_id, t.review_id) AS review_id,
 COALESCE(b.body_score, 0) + 2.0 * COALESCE(t.title_score, 0) AS combined_score
FROM body_results b
FULL OUTER JOIN title_results t ON b.review_id = t.review_id
ORDER BY combined_score DESC
LIMIT 5;
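The weighting in this query is easy to verify by hand. A minimal Python equivalent, with hypothetical scores and a hypothetical `combine_scores` helper, treats a missing match as a score of 0, just as the COALESCE calls do:

```python
def combine_scores(body_scores, title_scores, title_weight=2.0):
    """Weighted sum of per-column scores; a missing match contributes 0."""
    combined = {}
    for review_id in set(body_scores) | set(title_scores):
        combined[review_id] = (
            body_scores.get(review_id, 0.0)
            + title_weight * title_scores.get(review_id, 0.0)
        )
    # Highest combined score first; ties broken by id for a stable order.
    return sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))

ranked = combine_scores(
    body_scores={1: 0.8, 2: 0.6},
    title_scores={2: 0.5, 3: 0.45},
)
ranked_ids = [review_id for review_id, _ in ranked]
```

Review 3 matches only on title with a modest score, yet it outranks review 1, a strong body-only match, because title matches count double.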

These examples scratch the surface of what you can do by leveraging Spice’s SQL-based composition. The key point is that Spice isn’t just a vector database – it’s a hybrid engine that lets you combine vector search with other query logic (text search, filters, joins, aggregations, etc.) all in one place. This can significantly simplify building complex search and AI-driven applications.

(Note: Like most vector search systems, S3 Vectors uses an approximate nearest neighbor (ANN) algorithm under the hood for performance. This yields fast results that are probabilistically the closest, which is usually an acceptable trade-off in practice. Additionally, in our examples we focused on one embedding per row; production systems may use techniques like chunking text into multiple embeddings or adding external context, but the principles above remain the same.)

Industry Context and Comparisons

The rise of vector databases over the past few years (Pinecone, Qdrant, Weaviate, etc.) has been driven by the need to serve AI applications with semantic search at scale. Each solution takes a slightly different approach in architecture and trade-offs. Spice.ai’s integration with Amazon S3 Vectors represents a newer trend in this space: decoupling storage from compute for vector search, analogous to how data warehouses separated compute and storage in the past. Let’s compare this approach with some existing solutions:

Traditional Vector Databases (Qdrant, Weaviate, Pinecone): These systems typically run as dedicated services or clusters that handle both the storage of vectors (on disk or in-memory) and the computation of similarity search. For example, Qdrant (an open-source engine in Rust) allows either in-memory storage or on-disk storage (using RocksDB) for vectors and payloads. It’s optimized for high performance and offers features like filtering, quantization, and distributed clustering, but you generally need to provision servers/instances that will host all your data and indexes. Weaviate, another popular open-source vector DB, uses a Log-Structured Merge (LSM) tree based storage engine that persists data to disk and keeps indexes in memory. Weaviate supports hybrid search (it can combine keyword and vector queries) and offers a GraphQL API, with a managed cloud option priced mainly by data volume. Pinecone, a fully managed SaaS, also requires you to select a service tier or pod which has certain memory/CPU allocated for your index – essentially your data lives in Pinecone’s infrastructure, not in your AWS account. These solutions excel at low-latency search for high query throughput scenarios (since data is readily available in RAM or local SSD), but the cost can be high for large datasets. You pay for a lot of infrastructure to be running, even during idle times. In fact, prior to S3 Vectors, vector search engines often stored data in memory at ~$2/GB and needed multiple replicas on SSD, which is “the most expensive way to store data”, as Simon Eskildsen (Turbopuffer’s founder) noted. Some databases mitigate cost by compressing or offloading to disk, but still, maintaining say 100 million embeddings might require a sizable cluster of VMs or a costly cloud plan.
Spice.ai with Amazon S3 Vectors: This approach flips the script by storing vectors in cheap, durable object storage (S3) and loading/indexing them on demand. As discussed, S3 Vectors keeps the entire vector dataset in S3 at ~$0.02/GB storage, and only spins up transient compute (managed by AWS) to serve queries, meaning you aren’t paying for idle GPU or RAM time. AWS states this design can cut total costs by up to 90% while still giving sub-second performance on billions of vectors. It’s essentially a serverless vector search model – you don’t manage servers or even dedicated indices; you just use the API. Spice.ai’s integration means developers get this cost-efficiency without having to rebuild their application: they can use standard SQL and Spice will push down operations to S3 Vectors as appropriate. This decoupled storage/compute model is ideal for use cases where the data is huge but query volumes are moderate or bursty (e.g., an enterprise semantic search that is used a few times an hour, or a nightly ML batch job). It avoids the “monolithic database” scenario of having a large cluster running 24/7. However, one should note that if you need extremely high QPS (thousands of queries per second at ultra-low latency), a purely object-storage-based solution might not outperform a tuned in-memory vector DB – AWS positions S3 Vectors as complementary to higher-QPS solutions like OpenSearch for real-time needs.
Turbopuffer: Turbopuffer is a startup that, much like Spice with S3 Vectors, is built from first principles on object storage. It provides “serverless vector and full-text search… fast, 10× cheaper, and extremely scalable,” by leveraging S3 or similar object stores with smart caching. The philosophy is the same: use the durability and low cost of object storage for the bulk of data, and layer a cache (memory/SSD) in front for performance-critical portions. According to Turbopuffer’s founder, moving from memory/SSD-centric architectures to an object storage core can yield 100× cost savings for cold data and 6–20× for warm data, without sacrificing too much performance. Turbopuffer’s engine indexes data incrementally on S3 and uses caching to achieve similar latency to conventional search engines on hot data. The key difference is that Turbopuffer is a standalone search service (with its own API), whereas Spice uses AWS’s S3 Vectors service as the backend. Both approaches validate the industry trend toward disaggregated storage for search. Essentially, they are bringing the cloud data warehouse economics to vector search: store everything cheaply, compute on demand.
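To put the per-GB figures quoted above in perspective, here is a back-of-the-envelope calculation. It assumes 100 million embeddings of 256 dimensions stored as 4-byte floats, and it counts raw vector bytes only: index overhead, replicas, metadata, and query charges are ignored, and the rates are simply the approximate ~$2/GB and ~$0.02/GB numbers from the text.

```python
NUM_VECTORS = 100_000_000   # "100 million embeddings" from the comparison above
DIMENSIONS = 256            # assumption: matches the Titan example configuration
BYTES_PER_FLOAT = 4         # assumption: float32 components

raw_gb = NUM_VECTORS * DIMENSIONS * BYTES_PER_FLOAT / 1e9  # decimal gigabytes

MEMORY_RATE = 2.00          # approximate $/GB quoted for in-memory storage
OBJECT_RATE = 0.02          # approximate $/GB quoted for S3 Vectors storage

memory_cost = raw_gb * MEMORY_RATE
object_cost = raw_gb * OBJECT_RATE
ratio = memory_cost / object_cost

print(f"raw size: {raw_gb:.1f} GB")
print(f"in-memory: ${memory_cost:.2f}, object storage: ${object_cost:.2f}")
```

Under these assumptions the raw vectors occupy about 102.4 GB, and the storage line item differs by a factor of roughly 100 between the two rates.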

In summary, Spice.ai’s integration with S3 Vectors and similar efforts indicate a shift in vector search towards cost-efficient, scalable architectures that separate the concerns of storing massive vector sets and serving queries. Developers now have options: if you need blazing fast, realtime vector search with constant high traffic, dedicated compute infrastructure might be justified. But for many applications – enterprise search, AI assistants with a lot of knowledge but lower QPS, periodic analytics over embeddings – offloading to something like S3 Vectors can save enormously on cost while still delivering sub-second performance at huge scale. And with Spice.ai, you get the best of both worlds: the ease of a unified SQL engine that can do keyword + vector hybrid search on structured data, combined with the power of a cloud-native vector store. It simplifies your stack (no separate vector DB service to manage) and accelerates development since you can join and filter vector search results with your data immediately in one query.

References:

Spice.ai announcement: “Spice.ai Now Supports Amazon S3 Vectors For Vector Search at Petabyte Scale!”
Spice.ai Amazon S3 Vectors documentation
Spice.ai Amazon S3 Vectors Cookbook Recipe Sample
Amazon S3 Vectors official page
Pinecone Database Architecture (managed vector database)
Qdrant documentation (storage modes and features)
Turbopuffer blog by Simon Eskildsen (cost of search on object storage)
Weaviate storage architecture discussion

Spice v0.18.3-beta (Sep 30, 2024)

September 30, 2024 · 5 min read

Jack Eadie

Token Plumber at Spice AI

Announcing the release of Spice v0.18.3-beta 🛠️

The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.

Highlights in v0.18.3-beta

GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector, which uses the GitHub Search API to enable faster and more efficient query of issues and pull requests when using filters.

Example spicepod.yml:

- from: github:github.com/spiceai/spiceai/issues/trunk
  name: spiceai.issues
  params:
    github_query_mode: search # Use GitHub Search API
    github_token: ${secrets:GITHUB_TOKEN}

Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.

Example command line:

spice -v
spice --very-verbose

spiced -vv
spiced --verbose

Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision for larger pieces of content.

Example spicepod.yml:

- name: support_tickets
  embeddings:
    - column: conversation_history
      use: openai_embeddings
      chunking:
        enabled: true
        target_chunk_size: 128
        overlap_size: 16
        trim_whitespace: true

For details, see the Search Documentation.
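To show how target_chunk_size and overlap_size interact, here is a minimal sliding-window sketch. It assumes sizes are counted in whitespace-separated tokens; Spice's actual tokenization and chunk boundaries may differ, and `chunk_tokens` is a hypothetical helper, not part of Spice.

```python
def chunk_tokens(tokens, target_chunk_size=128, overlap_size=16):
    """Split tokens into windows of target_chunk_size that share overlap_size tokens."""
    if overlap_size >= target_chunk_size:
        raise ValueError("overlap_size must be smaller than target_chunk_size")
    step = target_chunk_size - overlap_size
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target_chunk_size])
        if start + target_chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks

# A synthetic 300-token document; trimming whitespace before splitting is assumed.
text = "  " + " ".join(f"t{i}" for i in range(300)) + "  "
tokens = text.strip().split()
chunks = chunk_tokens(tokens, target_chunk_size=128, overlap_size=16)
```

With these settings each window starts 112 tokens after the previous one, so a 300-token input yields three chunks, and each chunk begins with the last 16 tokens of the one before it.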

Dependencies

DataFusion Table Providers: Upgraded to rev b0af91992699ecbf5adf2036a07122578f06150e.

Contributors

@Sevenannn
@peasee
@Jeadie
@sgrebnov
@phillipleblanc
@ewgenius
@slyons

What's Changed

- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
- refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
- Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
- Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
- Add verbosity flags for spiced, spice: `-v`, `-vv`, `--verbose`, `--very-verbose`. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
- Rename `spiceai` data connector to `spice.ai` by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
- Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
- Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
- Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
- chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
- fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
- fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
- Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
- refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
- docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
- Change `BytesProcessedRule` to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
- Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
- Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
- feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
- Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
- Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
- Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
- Revert "Rename `spiceai` data connector to `spice.ai`" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
- Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
- Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
- Fix `no process-level CryptoProvider available` when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
- Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
- Add `log/slog` to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
- feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
- Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
- fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
- Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898

**Full Changelog**: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.

Twitter: @spice_ai
Slack: spiceai.org/slack
Telegram: Spice AI Discussion
Reddit: https://www.reddit.com/r/spiceai
Email: hey@spice.ai

