Fine-Tuned Security Reasoning Model That Runs on a 4GB Laptop. No GPU. No Cloud.
Most local security models either demand expensive hardware or don't actually reason — they pattern-match and hallucinate CVE numbers. I fine-tuned a small language model that does neither. It runs on a 4GB laptop, requires no GPU, no cloud, and no internet connection. Here's how, and why it matters.

The Problem: Security AI Needs to Stay On Your Machine
Every time you paste a suspicious log, a CVE description, or an internal config into a cloud LLM, that data leaves your machine.
But local security models have been terrible. For years, practitioners faced the same three dead ends:
Prohibitive hardware requirements. Most capable models demand an A100 or 80GB of VRAM — hardware that doesn't exist on a field laptop or an air-gapped workstation.
No real reasoning. Smaller local models don't reason — they pattern-match. They hallucinate CVE numbers, misattribute techniques, and fail to chain observations into conclusions.
Stale threat knowledge. Most models have no training signal for AI-native threats, agentic attack surfaces, or the adversarial techniques that actually matter in 2025–2026.
So I built one that doesn't have those problems.
The Solution: A Reasoning-Focused Security SLM
The goal was clear: a model that could reason through security problems — not just recall them — while running on commodity hardware with no internet dependency.
What "Reasoning" Actually Means Here
Most small models retrieve. They surface what they've seen before. This model was fine-tuned on structured reasoning chains: observe → hypothesise → rule in/out → conclude. That's the mental model a senior analyst uses, and it's what separates useful output from confident nonsense.
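To make that chain concrete, here is a hypothetical reasoning record for a brute-force scenario, rendered the way such chains read step by step. The field names and the `render` helper are illustrative assumptions, not the model's actual schema:

```python
# Hypothetical record illustrating the observe -> hypothesise -> rule in/out ->
# conclude structure; field names are assumptions, not the real training schema.
chain = {
    "observe": "4625 (failed logon) events from one source IP against 30 distinct accounts in 5 minutes",
    "hypothesise": ["password spray", "misconfigured service account"],
    "rule_out": "a misconfigured service account would fail against one principal, not 30",
    "conclude": "likely password spray (MITRE ATT&CK T1110.003)",
}

def render(chain):
    """Flatten the chain into the step-by-step text a reasoning model emits."""
    lines = [f"Observe: {chain['observe']}"]
    lines += [f"Hypothesis: {h}" for h in chain["hypothesise"]]
    lines.append(f"Rule out: {chain['rule_out']}")
    lines.append(f"Conclude: {chain['conclude']}")
    return "\n".join(lines)

print(render(chain))
```

The point of the structure is that each conclusion is forced to cite the observations and the discarded hypotheses that led to it, rather than jumping straight to a technique ID.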
How It Was Built
Base Model Selection
Starting with a capable open-source base was non-negotiable. The model needed to be small enough to quantise to 4-bit and run on CPU, but expressive enough to follow multi-step reasoning prompts without collapsing into repetition.
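The arithmetic behind the 4GB target is worth spelling out. A rough sketch, where the parameter counts and overhead figures are ballpark assumptions rather than measurements of the actual model:

```python
def q4_footprint_gb(n_params_billion, bits=4.5, overhead_gb=0.5):
    """Rough resident-memory estimate for a quantised model on CPU.
    bits=4.5 approximates a 4-bit scheme that stores per-block scales
    alongside the weights; overhead_gb loosely covers the KV cache and
    runtime buffers at modest context sizes. All figures are assumptions."""
    weights_gb = n_params_billion * 1e9 * bits / 8 / 1e9
    return weights_gb + overhead_gb

# A ~3B-parameter base fits comfortably in 4GB; 7B is already borderline.
print(f"3B at ~4.5 bits/weight: {q4_footprint_gb(3):.1f} GB")
print(f"7B at ~4.5 bits/weight: {q4_footprint_gb(7):.1f} GB")
```

This is why the base-model choice is constrained from both sides: small enough that 4-bit weights plus runtime overhead stay under 4GB, large enough to sustain multi-step reasoning.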
Training Data: Security-Specific Reasoning
Generic instruction tuning doesn't produce security reasoning — it produces security-flavoured autocomplete. Raw CVE dumps, wiki-style security articles, pattern-matching Q&A pairs, and cloud-dependent fine-tune setups all lead back to that same dead end. Instead, the training set was built around:
- Structured reasoning chains
- Incident response walkthroughs
- AI-native & agentic threat scenarios
- Log analysis with chain-of-thought
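Each item above ultimately becomes rows of fine-tuning data. A sketch of what one such row might look like, assuming a standard chat-template JSONL layout; the keys and the sample content are illustrative, since the actual dataset schema isn't published here:

```python
import json

# Hypothetical shape of one fine-tuning example (chat-template convention);
# not the real dataset schema.
sample = {
    "messages": [
        {"role": "system", "content": "You are a security analyst. Reason step by step."},
        {"role": "user", "content": "sshd: 200 failed logins across 40 users from 203.0.113.7 in 3 min"},
        {"role": "assistant", "content": (
            "Observe: many users, one source IP, short window.\n"
            "Hypothesise: password spray vs. credential stuffing.\n"
            "Rule out: stuffing replays known credential pairs; uniform failures across users fit spraying.\n"
            "Conclude: password spray (T1110.003); block the source and enforce MFA."
        )},
    ]
}
line = json.dumps(sample)  # one JSONL row per training example
print(line[:60])
```

The assistant turn carries the full chain, so the loss signal rewards the intermediate steps, not just the final label.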
Quantisation & Edge Deployment
The model is distributed in GGUF format, quantised to 4-bit. It runs entirely via llama.cpp — no Python environment required, no CUDA, no internet. A basic invoke looks like:
```
llama-cli \
  -m security-reasoning-slm-q4.gguf \
  -p "Analyse this log entry and reason through potential TTPs: [LOG]" \
  --ctx-size 4096 \
  -t 8   # CPU threads only; no GPU flag needed
```
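For batch work, the same invocation can be wrapped from a script. A minimal sketch, assuming the `llama-cli` binary from a current llama.cpp build is on `PATH` (older builds shipped the equivalent binary as `main`) and the GGUF file sits in the working directory:

```python
import subprocess

def build_cmd(entry, model="security-reasoning-slm-q4.gguf", threads=8, ctx=4096):
    """Assemble the llama.cpp CLI arguments for one log entry."""
    prompt = f"Analyse this log entry and reason through potential TTPs: {entry}"
    return ["llama-cli", "-m", model, "-p", prompt,
            "--ctx-size", str(ctx), "-t", str(threads)]

def analyse_log(entry, **kwargs):
    """Run one log entry through the model and return the generated text.
    Requires llama-cli on PATH and the model file present locally."""
    result = subprocess.run(build_cmd(entry, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Because everything goes through a local subprocess, the log data never leaves the machine, which is the whole point of the deployment model.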
Who Is This For?
The 200+ downloads this month tell me the audience is broader than I expected. But the primary use cases are clear: