semlix 3.2 release notes

semlix 3.2.0

A modernization, performance and tooling release. Backward compatible for Python 3 users; Python 2 support was removed (it had been unmaintained for years). The canonical change list is in CHANGELOG.md.

Highlights

Engine guide (Choosing an engine: semlix core vs bm25s): a measured comparison of the two lexical engines. On 20k documents the bm25s engine indexes ~40x faster and queries ~57x faster than the pure-Python core. Use bm25s for raw lexical speed and as the lexical leg of hybrid search; use the core when you need its full feature set.
Optional native acceleration (see the guide of the same name): setting SEMLIX_COMPILE=1 compiles a curated set of hot modules with mypyc; pure Python remains the default and the automatic fallback.
Faster hybrid search: the lexical and semantic searches now run concurrently, and the unused side is skipped when alpha is 0 or 1; a bounded query-embedding cache avoids re-embedding repeated queries.
Incremental BM25s indexing: only new documents are tokenized on add (instead of re-tokenizing the whole corpus on each commit).
StandardAnalyzer indexing fast path: a fused tokenize+lowercase+stop+count loop, asserted bit-for-bit identical to the generic pipeline.
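
To make the hybrid-search change concrete, here is a minimal sketch of the pattern described above: skip the unused leg at the alpha extremes, run both legs concurrently otherwise, and cache query embeddings in a bounded cache. It is not semlix's implementation. The names lexical_search, semantic_search, embed_query and hybrid_search are hypothetical stand-ins, and the convention that alpha weights the semantic side is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


# Hypothetical stand-ins for the two search legs; these are not semlix APIs.
def lexical_search(query):
    return {"doc1": 2.0, "doc2": 1.0}


def semantic_search(query_vector):
    return {"doc2": 0.9, "doc3": 0.4}


@lru_cache(maxsize=256)  # bounded cache: a repeated query is embedded only once
def embed_query(query):
    # Placeholder embedding; a real model call would go here.
    return tuple(float(ord(ch)) for ch in query[:8])


def hybrid_search(query, alpha):
    """Blend scores as alpha * semantic + (1 - alpha) * lexical (assumed convention)."""
    if alpha == 0:
        # Semantic weight is zero, so the semantic leg (and embedding) is skipped.
        return dict(lexical_search(query))
    if alpha == 1:
        # Lexical weight is zero, so the lexical leg is skipped.
        return dict(semantic_search(embed_query(query)))
    # Both legs contribute: run them concurrently and wait for both results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        lex_future = pool.submit(lexical_search, query)
        sem_future = pool.submit(lambda: semantic_search(embed_query(query)))
        lex, sem = lex_future.result(), sem_future.result()
    doc_ids = set(lex) | set(sem)
    return {
        doc_id: alpha * sem.get(doc_id, 0.0) + (1 - alpha) * lex.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
```

Threads help here only when the legs spend their time in I/O or in native code that releases the GIL, which is typically the case for embedding models and vector math.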

Breaking / compatibility

Requires Python >= 3.9. Python 2 support and the cached-property dependency were removed; the core install now has no third-party runtime dependencies.
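
These notes do not say what replaced the cached-property package. For downstream code that depended on it, one natural option on Python >= 3.9 is the standard library's functools.cached_property (available since Python 3.8). The sketch below assumes that substitution; the Index class is hypothetical.

```python
from functools import cached_property


class Index:
    """Hypothetical class used only to illustrate the stdlib decorator."""

    def __init__(self, docs):
        self.docs = docs
        self.builds = 0  # counts how often the vocabulary is actually computed

    @cached_property
    def vocabulary(self):
        # Computed on first access, then stored on the instance.
        self.builds += 1
        return frozenset(word for doc in self.docs for word in doc.split())
```

Accessing idx.vocabulary twice computes the set once; the second access reads the stored value.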

Security

NumpyVectorStore persistence is now pickle-free (numpy .npz + JSON, loaded with allow_pickle=False), removing the arbitrary-code-execution risk of unpickling a shared/untrusted index. Legacy pickle stores still load with a deprecation warning and migrate on the next save.
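
The general shape of pickle-free persistence can be sketched as follows: vectors go into a .npz archive, metadata goes into JSON, and loading passes allow_pickle=False so that object arrays (which would require unpickling) are rejected. This is an illustration, not NumpyVectorStore's actual on-disk layout. The function names and the file names vectors.npz and meta.json are assumptions.

```python
import json
import os
import tempfile

import numpy as np


def save_store(directory, vectors, doc_ids):
    # Numeric arrays are written in the .npz format; no pickling is involved.
    np.savez(os.path.join(directory, "vectors.npz"), vectors=vectors)
    # Plain metadata is written as JSON.
    with open(os.path.join(directory, "meta.json"), "w", encoding="utf-8") as f:
        json.dump({"doc_ids": doc_ids}, f)


def load_store(directory):
    # allow_pickle=False makes numpy refuse pickled object arrays on load.
    with np.load(os.path.join(directory, "vectors.npz"), allow_pickle=False) as data:
        vectors = data["vectors"]
    with open(os.path.join(directory, "meta.json"), encoding="utf-8") as f:
        doc_ids = json.load(f)["doc_ids"]
    return vectors, doc_ids


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_store(tmp, np.zeros((2, 3), dtype=np.float32), ["a", "b"])
        loaded_vectors, loaded_ids = load_store(tmp)
        print(loaded_vectors.shape, loaded_ids)
```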

Fixes

varint_to_int no longer crashes on Python 3. It called ord() on an element of a bytes object, which is already an int on Python 3.
BM25sStore.search: dropped the update_vocab kwarg removed in bm25s 0.3.9, corrected query tokenization, and clamped k to the corpus size.
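codec/whoosh3.py: replaced the identity test elif fixedsize is 0 with the equality test == 0 (is compares object identity, and using it with a literal emits a SyntaxWarning on Python 3.8+).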
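
The varint crash comes from a Python 2 idiom: indexing a str there returned a one-character string that needed ord(), whereas indexing or iterating bytes on Python 3 already yields ints. The sketch below shows a Python 3 decoder written without ord(). It assumes a little-endian base-128 layout (7 payload bits per byte, high bit set on every byte except the last). decode_varint is a hypothetical name, not the library's function.

```python
def decode_varint(data: bytes) -> int:
    """Decode one little-endian base-128 varint from the start of data."""
    value = 0
    shift = 0
    for byte in data:  # iterating bytes yields ints on Python 3, so no ord()
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the final byte
            return value
        shift += 7
    raise ValueError("truncated varint")
```

For example, decode_varint(b"\xac\x02") combines the low seven bits 0x2C (44) with 2 << 7 (256) to give 300.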
