
Artificial Intelligence

Fine-Tuned Security Reasoning Model That Runs on a 4GB Laptop. No GPU. No Cloud.

Most local security models either demand expensive hardware or don't actually reason — they pattern-match and hallucinate CVE numbers. I fine-tuned a small language model that does neither. It runs on a 4GB laptop, requires no GPU, no cloud, and no internet connection. Here's how, and why it matters.

tyoka
6 April 2026 · 3 min read

The Problem: Security AI Needs to Stay On Your Machine

Every time you paste a suspicious log, a CVE description, or an internal config into a cloud LLM, that data leaves your machine.

⚠ For security work — red team engagements, incident response, air-gapped environments — that's a real problem. You cannot send client data to an API. You cannot pipe internal logs to OpenAI.

But local security models have been terrible. For years, practitioners faced the same three dead ends:

💸 Prohibitive hardware requirements. Most capable models demand an A100 or 80GB of VRAM — hardware that doesn't exist on a field laptop or an air-gapped workstation.

🧠 No real reasoning. Smaller local models don't reason — they pattern-match. They hallucinate CVE numbers, misattribute techniques, and fail to chain observations into conclusions.

📅 Stale threat knowledge. Most models have no training signal for AI-native threats, agentic attack surfaces, or the adversarial techniques that actually matter in 2025–2026.

So I built one that doesn't have those problems.


The Solution: A Reasoning-Focused Security SLM

The goal was clear: a model that could reason through security problems — not just recall them — while running on commodity hardware with no internet dependency.

  • 4GB RAM required
  • 0 GPU needed
  • 200+ downloads this month

What "Reasoning" Actually Means Here

Most small models retrieve. They surface what they've seen before. This model was fine-tuned on structured reasoning chains: observe → hypothesise → rule in/out → conclude. That's the mental model a senior analyst uses, and it's what separates useful output from confident nonsense.

✓ When you feed it a log fragment, it doesn't just label it — it walks through what's anomalous, what technique it maps to, and what the likely next step in the attack chain would be.

How It Was Built

Base Model Selection

Starting with a capable open-source base was non-negotiable. The model needed to be small enough to quantise to 4-bit and run on CPU, but expressive enough to follow multi-step reasoning prompts without collapsing into repetition.
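The 4GB ceiling falls out of simple arithmetic. As a rough sketch — assuming a ~3B-parameter base (the post doesn't name the model) and ~4.5 bits per weight for a typical 4-bit quantisation scheme — the weights alone fit comfortably under the budget:

```python
# Back-of-envelope RAM estimate for a 4-bit quantised model on CPU.
# The 3B parameter count and 4.5 bits/weight are assumptions for
# illustration, not published details of this model.

def estimate_ram_gb(n_params: float, bits_per_weight: float = 4.5,
                    overhead_gb: float = 0.7) -> float:
    """Approximate resident memory: quantised weights plus a fixed
    allowance for KV cache and runtime overhead."""
    weights_gb = n_params * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb

print(f"{estimate_ram_gb(3e9):.1f} GB")  # → about 2.4 GB, within a 4GB budget
```

Even with generous overhead for the context window, a model in this size class leaves headroom on a 4GB machine, which is why CPU-only inference is viable at all.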

Training Data: Security-Specific Reasoning

Generic instruction tuning doesn't produce security reasoning — it produces security-flavoured autocomplete. The training set was built around:

❌ What I Avoided
  • Raw CVE dumps
  • Wiki-style security articles
  • Pattern-matching Q&A pairs
  • Cloud-dependent fine-tune setups
✓ What I Used Instead
  • Structured reasoning chains
  • Incident response walkthroughs
  • AI-native & agentic threat scenarios
  • Log analysis with chain-of-thought
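To make the contrast concrete, a training record built around a reasoning chain might look like the sketch below. The field names, log line, and reasoning text are all hypothetical — the actual dataset schema isn't published — but the observe → hypothesise → rule in/out → conclude structure is the point:

```python
import json

# Hypothetical shape of one instruction-tuning record. Everything here
# is illustrative: the real dataset's schema and contents aren't public.
record = {
    "instruction": "Analyse this log entry and reason through potential TTPs.",
    "input": "powershell.exe -enc SQBFAFgA... spawned by winword.exe",
    "output": (
        "Observe: PowerShell launched with an encoded command, parented by "
        "winword.exe.\n"
        "Hypothesise: macro-based initial access leading to encoded "
        "PowerShell execution.\n"
        "Rule out: legitimate admin scripting rarely runs under an Office "
        "parent process.\n"
        "Conclude: likely phishing payload; pivot to the child process tree "
        "and outbound connections next."
    ),
}
print(json.dumps(record)[:40])
```

A pattern-matching Q&A pair would stop at the label; the chained record forces the model to produce the intermediate steps, which is what the fine-tune rewards.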

Quantisation & Edge Deployment

The model is distributed in GGUF format, quantised to 4-bit. It runs entirely via llama.cpp — no Python environment required, no CUDA, no internet. A basic invoke looks like:

Shell
./llama-cli \
  -m security-reasoning-slm-q4.gguf \
  -p "Analyse this log entry and reason through potential TTPs: [LOG]" \
  --ctx-size 4096 \
  -t 8 # CPU threads only — no GPU flag needed
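For reference, producing a 4-bit GGUF from a full-precision export is typically a one-off step with llama.cpp's quantise tool. The filenames below are placeholders, not the actual release files:

```shell
# One-off conversion of a full-precision GGUF to 4-bit (Q4_K_M).
# Input/output names are illustrative.
./llama-quantize \
  security-reasoning-slm-f16.gguf \
  security-reasoning-slm-q4.gguf \
  Q4_K_M
```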

Who Is This For?

The 200+ downloads this month tell me the audience is broader than I expected. But the primary use cases are clear:
