semlix 3.2 release notes
semlix 3.2.0
A modernization, performance and tooling release. Backward compatible for
Python 3 users; Python 2 support was removed (it had been unmaintained for
years). The canonical change list is in CHANGELOG.md.
Highlights
Engine guide (Choosing an engine: semlix core vs bm25s): a measured comparison of the two lexical engines. On 20k documents the bm25s engine indexes ~40x faster and queries ~57x faster than the pure-Python core. Use bm25s for raw lexical speed and for the lexical leg of hybrid search; use the core for its full feature set.
Optional native acceleration (Optional native acceleration): SEMLIX_COMPILE=1 compiles a curated set of hot modules with mypyc; pure-Python stays the default and the automatic fallback.
Faster hybrid search: lexical and semantic searches now run concurrently and the unused side is skipped when alpha is 0/1; a bounded query-embedding cache avoids re-embedding repeated queries.
Incremental BM25s indexing: only new documents are tokenized on add (instead of re-tokenizing the whole corpus each commit).
StandardAnalyzer indexing fast path: a fused tokenize+lowercase+stop+count loop, asserted bit-for-bit identical to the generic pipeline.
Breaking / compatibility
Requires Python >= 3.9. Python 2 support and the
cached-property dependency were removed; the core install now has no third-party runtime dependencies.
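Downstream code that imported the third-party cached-property package only because it was installed alongside semlix may need its own replacement. These notes do not say what semlix uses internally; the sketch below only shows the standard-library equivalent, functools.cached_property (available since Python 3.8), on a hypothetical Document class.

```python
from functools import cached_property  # standard library since Python 3.8


class Document:
    """Hypothetical class, used only to illustrate the decorator."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.tokenize_calls = 0

    @cached_property
    def tokens(self) -> list:
        # Runs once per instance; the result is then stored on the instance.
        self.tokenize_calls += 1
        return self.text.lower().split()
```

Accessing doc.tokens a second time returns the stored list without re-running the method.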
Security
NumpyVectorStore persistence is now pickle-free (numpy .npz + JSON, loaded with allow_pickle=False), removing the arbitrary-code-execution risk of unpickling a shared/untrusted index. Legacy pickle stores still load with a deprecation warning and migrate on the next save.
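As an illustration of the general technique, here is a minimal sketch of pickle-free persistence using numpy.savez, numpy.load with allow_pickle=False, and a JSON sidecar. The file names vectors.npz and meta.json, the function names, and the on-disk layout are assumptions for this example and may not match what NumpyVectorStore actually writes.

```python
import json
import os

import numpy as np


def save_store(directory, ids, vectors):
    """Write vectors to an .npz archive and ids to a JSON sidecar file."""
    # Writing through an open file handle keeps the file name exactly as given.
    with open(os.path.join(directory, "vectors.npz"), "wb") as f:
        np.savez(f, vectors=np.asarray(vectors, dtype=np.float32))
    with open(os.path.join(directory, "meta.json"), "w", encoding="utf-8") as f:
        json.dump({"ids": list(ids)}, f)


def load_store(directory):
    """Load the store; allow_pickle=False makes numpy refuse pickled object arrays."""
    path = os.path.join(directory, "vectors.npz")
    with np.load(path, allow_pickle=False) as data:
        vectors = data["vectors"]  # read fully into memory before the file closes
    with open(os.path.join(directory, "meta.json"), encoding="utf-8") as f:
        ids = json.load(f)["ids"]
    return ids, vectors
```

Because the numeric data is a plain float array and the metadata is JSON, loading a file from an untrusted source cannot trigger code execution through unpickling.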
Fixes
varint_to_int crashed on Python 3 (ord() on a bytes element).
BM25sStore.search: dropped the update_vocab kwarg removed in bm25s 0.3.9, corrected query tokenization, and clamped k to the corpus size.
codec/whoosh3.py: elif fixedsize is 0 → == 0.
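The varint crash follows from a Python 3 behavior: indexing or iterating a bytes object yields ints, so calling ord() on an element raises TypeError. The sketch below shows a generic base-128 (7 bits per byte, low group first) decoder written for Python 3. The name decode_varint and the byte layout are assumptions for illustration, not semlix's actual varint_to_int.

```python
def decode_varint(data: bytes) -> int:
    """Decode a base-128 varint (low 7-bit group first) from the start of data."""
    result = 0
    shift = 0
    for byte in data:  # on Python 3 each element is already an int, no ord() needed
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:  # a clear high bit marks the final byte
            return result
        shift += 7
    raise ValueError("truncated varint")
```

The same comparison style applies to the fixedsize fix: == compares values, whereas is compares object identity and only happens to work for small ints in CPython.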
See Also
Choosing an engine: semlix core vs bm25s – choosing an engine
Optional native acceleration – optional native build
CHANGELOG.md – full change list