Optional native acceleration¶
semlix is pure-Python by default — it installs and runs with no compiler.
For extra speed you can compile a curated set of hot, type-clean modules to C
extensions with mypyc. If the compiled
extensions aren’t built, the .py modules are imported as usual, so the
library always works.
Building¶
Needs a C compiler and mypy (which ships mypyc):
pip install "semlix[fast]" # pulls mypy (build-time dep)
SEMLIX_COMPILE=1 pip install . # compile the native modules
# local dev against a checkout:
SEMLIX_COMPILE=1 python setup.py build_ext --inplace
With SEMLIX_COMPILE unset (the default), no extensions are built and you get
the normal pure-Python package. Check whether the native build is active:
import semlix.util.varints as v
print(v.__file__.endswith(".so")) # True if compiled
What gets compiled¶
See NATIVE_MODULES in setup.py. The current set is a group of leaf
modules (varints, numeric, text, idsets, the Levenshtein automaton and edit
distance). Gains there are modest; they are the foundation.
The hot indexing/query cluster (the analysis pipeline, codec and matchers) is
not compiled: those classes are subclassed by interpreted code (every
analyzer/tokenizer/filter in semlix.lang, plus user-defined analyzers), and
mypyc forbids interpreted classes inheriting from compiled ones. For raw lexical
speed today, prefer the bm25s engine (see Choosing an engine: semlix core vs bm25s) rather than compiling
the core.
Note
mypyc is stricter than CPython: a module can compile cleanly yet behave
differently when compiled, so every module added to NATIVE_MODULES must
keep the full test suite green with the compiled ``.so`` present.