Oren Digital — Offline-first software tools
Open Source · Zero Dependencies

Software tools that work offline.
On any hardware.
Without the cloud.

Oren Stack builds retrieval and AI tools that run on Python's standard library. No GPU. No API keys. No dependencies. Just fast, accurate, private computation.

100% Accuracy · 15ms Avg Query · 0 Dependencies · 904K+ Indexed Pairs

Sift — Deterministic Retrieval

What embedding-based search gets wrong

Embeddings compress meaning into vectors, and that compression loses information. On a benchmark of 175 real-world queries, embedding-based retrieval answered 25% correctly. Hybrid retrieval (embeddings + BM25) answered 33%. Neither is production-ready.

Sift takes a different approach: deterministic text signatures instead of vector similarity. It matches structural patterns, not approximate meanings. The result is either the right answer or nothing. No "sort of related" noise.
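
To make the idea concrete, here is a minimal sketch of signature-based lookup using only the standard library. It is not Sift's actual algorithm: the stopword list, the `signature` function, and the `SignatureIndex` class are all hypothetical names chosen for illustration. It only shows the general principle that a deterministic signature either matches exactly or returns nothing.

```python
import hashlib
import re

# Hypothetical stopword list, for illustration only (not taken from Sift).
STOPWORDS = frozenset({"a", "an", "the", "to", "how", "do", "i", "my", "is", "of"})


def signature(text: str) -> str:
    """Return a deterministic signature: a hash of the sorted set of content tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    content = sorted(set(tokens) - STOPWORDS)
    return hashlib.sha256(" ".join(content).encode("utf-8")).hexdigest()


class SignatureIndex:
    """Exact-match index: a query either hits a stored signature or returns None."""

    def __init__(self):
        self._by_signature = {}

    def add(self, question: str, answer: str) -> None:
        self._by_signature[signature(question)] = answer

    def query(self, text: str):
        # No similarity scoring: an unknown signature yields None, never a near miss.
        return self._by_signature.get(signature(text))


index = SignatureIndex()
index.add("How do I reset my password?", "Reset Password Procedure")

print(index.query("how to reset password"))  # Reset Password Procedure
print(index.query("how to bake bread"))      # None
```

Because the signature depends only on the normalized token set, the same query always produces the same result, and an unrelated query produces no result at all.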

Metric                   Sift     Embeddings   Hybrid
Accuracy (175 queries)   100%     25%          33%
Avg response time        15ms     ~200ms       ~350ms
Dependencies             0        5–20+        10–30+
GPU required             No       Usually      Usually
Works offline            Yes      Rarely       Rarely
RAM (904K pairs)         ~272MB   2–8GB        3–10GB

$ python -m sift query "how to reset password"
Found: Reset Password Procedure (confidence: 1.0, time: 12ms)
 
$ python -m sift benchmark --queries 175
Results: 175/175 correct · avg 15ms · peak RAM 272MB
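
For readers who want to run a comparable measurement on their own retriever, the sketch below shows one way accuracy and average latency could be computed. It is an assumption about how such a harness might look, not the code behind `python -m sift benchmark`. The `run_benchmark` function, the `retrieve` callable, and the toy test cases are hypothetical.

```python
import time


def run_benchmark(retrieve, cases):
    """Return (correct, total, avg_ms) for a retrieval callable.

    `retrieve` maps a query string to an answer or None.
    `cases` is a list of (query, expected_answer) pairs.
    Both are hypothetical stand-ins for a real engine and query set.
    """
    correct = 0
    elapsed = 0.0
    for query, expected in cases:
        start = time.perf_counter()
        answer = retrieve(query)
        elapsed += time.perf_counter() - start
        if answer == expected:
            correct += 1
    total = len(cases)
    avg_ms = (elapsed / total) * 1000 if total else 0.0
    return correct, total, avg_ms


# Toy retriever: a plain dict lookup standing in for a real engine.
toy_answers = {"how to reset password": "Reset Password Procedure"}
cases = [
    ("how to reset password", "Reset Password Procedure"),
    ("how to bake bread", None),  # the expected outcome is "no answer"
]

correct, total, avg_ms = run_benchmark(toy_answers.get, cases)
print(f"Results: {correct}/{total} correct · avg {avg_ms:.3f}ms")
```

Counting a correct "no answer" as a hit matters for an engine that is designed to return nothing when it has no exact match.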

Integration & Consulting

Sift for Your Data

You have the data — support tickets, medical records, legal documents, knowledge bases. I build custom domain signatures, tune the engine, and deliver a working Sift instance for your specific use case.

Embedded Integration

Sift running inside your product. I handle architecture, integration, and ongoing maintenance on a monthly retainer. Your users get instant, accurate retrieval. You get zero infrastructure headaches.

Offline AI Architecture

Building a system that works without internet? I design local-first AI pipelines for edge devices, field applications, and privacy-sensitive environments. Sift is the proof that I know how.

Built by someone who tried embeddings first

I tried the standard approach — sentence transformers, vector databases, cosine similarity. It scored 25% on real queries. So I went the other way: no ML, no models, no dependencies. Just deterministic text analysis on Python's standard library.

Sift is the result of that work. Now I help companies integrate fast, private, offline retrieval into their products.

Focus: Offline-first AI
Stack: Python stdlib
Model: Open Source + Consulting

Need retrieval that actually works?

Whether you're building offline applications, privacy-sensitive systems, or just tired of vector databases that return wrong answers — let's talk.

Get in Touch →