
AirgapAI Local Model Downloads

Download pre-configured models for AirgapAI offline and local deployments

Large Language Models

General-purpose chat and instruct models for on-device text generation.

Q-3.5

Qwen 3.5

Latest instruct line with thinking, in 4B & 9B

Chat
Intelligence: Advanced
Speed: Fast
Size: Small–Medium

Qwen 3.5 instruct family with built-in step-by-step thinking for stronger reasoning. Use 4B for quick, smooth performance on more machines, or 9B when you want richer answers and have a stronger device.

Thinking, Vision, Function calling, Structured output
Downloads
  • 4B is the easy default; 9B when answer quality matters most
  • Choose a size, then the download row under it
  • A second 9B Windows row offers uncensored weights (fewer refusals); use only where that trade-off is appropriate.
  • Learn more (4B)
  • Learn more (9B)
GEMMA

Gemma 4

Google's efficient open models, E4B to 31B

Chat
Intelligence: Advanced
Speed: Fast
Size: Small–Large

Google's Gemma 4 instruct family, packaged for AirgapAI. Start with the E4B build for fast, smooth responses on modest hardware, or step up to 12B / 26B / 31B when you want richer answers and have a stronger GPU or Apple Silicon machine.

Thinking, Vision, Function calling, Structured output
Downloads
  • 4-bit quantisation across all variants
  • E4B runs on most modern GPUs / iGPUs; larger builds want a 2022+ discrete GPU or Apple Silicon
  • Choose a size, then the download row under it
  • Model details
L-3.2

Llama 3.2

Meta's compact instruct models (1B & 3B)

Chat
Intelligence: Strong
Speed: Very Fast
Size: Small

Meta's Llama 3.2 instruct family in 1B and 3B sizes. Available as universal MLC builds or as OpenVINO builds optimised for Intel GPU / NPU. Includes a 32K-context 3B build for long-document workflows.

Downloads
L-8B

Llama 3.1 8B

Meta's 8B instruct, Intel-optimised

Chat
Intelligence: Advanced
Speed: Fast
Size: Medium

Llama 3.1 8B Instruct compiled for Intel hardware via OpenVINO. Pick GPU for Arc / iGPU setups or NPU on Core Ultra laptops.

Downloads
Windows 3 downloads
macOS 1 download
  • INT4 quantisation
  • Lunar Lake or later recommended for best results
  • Model details
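
As a rough illustration of what INT4 quantisation means for download size and memory, the weight footprint can be approximated as parameter count times bits per weight, divided by 8. The sketch below is back-of-envelope arithmetic only: the function name is hypothetical, the 8-billion parameter count is a rounded figure, and the estimate ignores the KV cache, activations, runtime overhead, and any tensors a given build keeps at higher precision.

```python
def approx_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes.

    Hypothetical helper for illustration; real builds add metadata and
    may store some tensors at higher precision.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# An 8B-parameter model at 4 bits per weight versus 16 bits per weight.
int4_gb = approx_weight_gb(8, 4)
fp16_gb = approx_weight_gb(8, 16)
print(f"INT4: ~{int4_gb:.1f} GB, FP16: ~{fp16_gb:.1f} GB")
```

Under these assumptions the 4-bit weights come to roughly a quarter of the 16-bit size, which is why quantised builds fit on laptop-class hardware.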
SAUL

Saul 7B Instruct

Legal-domain expert, Mistral 7B base

Chat
Intelligence: Advanced
Speed: Fast
Size: Medium

Saul 7B Instruct is a legal-domain specialist fine-tuned from Mistral 7B on a large corpus of legal texts. Ideal for contract review, case summarisation, and legal Q&A where domain accuracy matters.

Downloads
Windows 1 download
macOS 1 download
  • 5 GB+ vRAM
  • Runs via llama.cpp runtime (GPU/CPU hybrid supported)
  • Specialised for legal reasoning and drafting
  • Model details
AFM

Arcee AFM 4.5B

Intel-optimised small reasoning model

Chat
Intelligence: Flagship
Speed: Fast
Size: Small

Arcee's AFM 4.5B optimised for Intel hardware via OpenVINO runtime.

Downloads
Windows: 1 download (4.5B)

Document Processing

Specialist models that parse and prepare documents for retrieval-augmented generation.

BLK
By Iternal Technologies

Blockify

Local document → IdeaBlocks ingestion

Document Processing
Speed: Very Fast
Size: Small
Modality: Text → IdeaBlocks

Iternal's proprietary fine-tuned ingestion model. Transforms unstructured documents into clean, searchable IdeaBlocks so retrieval-augmented search runs on signal instead of noise. Pick the size that fits your hardware: accuracy scales with parameter count. Because of local hardware limitations, some information may be lost, so human review of the output is required.

Downloads
  • Processes documents into optimised IdeaBlocks
  • 1B runs on ≥ 2 GB vRAM; 3B on ≥ 3 GB; 8B on ≥ 5 GB
  • Universal (MLC) builds run on any GPU; OpenVINO builds are optimised for Intel
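
The vRAM minimums listed above can be turned into a simple size picker. The following is a minimal sketch, not part of AirgapAI: the function and dictionary names are hypothetical, and the thresholds are copied from the list (1B at 2 GB, 3B at 3 GB, 8B at 5 GB). It assumes you already know how much vRAM is free on the target machine.

```python
from typing import Optional

# Minimum vRAM in GB for each Blockify size, as stated in the list above.
BLOCKIFY_MIN_VRAM_GB = {"1B": 2, "3B": 3, "8B": 5}


def largest_blockify_size(vram_gb: float) -> Optional[str]:
    """Return the largest Blockify size whose minimum fits in vram_gb.

    Returns None when even the 1B build does not fit.
    """
    fitting = [
        (min_gb, size)
        for size, min_gb in BLOCKIFY_MIN_VRAM_GB.items()
        if vram_gb >= min_gb
    ]
    # Tuples compare by minimum vRAM first, so max() picks the largest build.
    return max(fitting)[1] if fitting else None


for available in (1.5, 2, 4, 8):
    print(available, "GB ->", largest_blockify_size(available))
```

Choosing the largest build that fits follows the guidance that accuracy scales with parameter count; a smaller build may still be preferable when speed matters more.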

Embeddings & Vector Search

Encoder models that turn text into vectors for semantic search and RAG retrieval.

JE

Jina Embeddings V2

Vector-search embeddings

Vector Search
Speed: Instant
Memory: 2 GB+

Lightweight model for high-quality text embeddings.

Downloads
Windows: 1 download (CPU)
macOS: 1 download (CPU)
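
To make "text into vectors for semantic search" concrete, the sketch below ranks documents by cosine similarity to a query vector. It is an illustration under stated assumptions: the three-dimensional vectors are made-up stand-ins rather than output from Jina Embeddings V2, the document names are hypothetical, and no AirgapAI API is involved.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the query."""
    scores = {
        doc_id: cosine_similarity(query_vec, vec)
        for doc_id, vec in doc_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "contract_clause": [0.9, 0.1, 0.0],
    "travel_policy": [0.1, 0.9, 0.2],
    "lunch_menu": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

ranking = rank_by_similarity(query, docs)
print(ranking)
```

In a real retrieval pipeline the embedding model would produce the vectors and a vector index would replace the linear scan; the ranking principle is the same.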

Translation

Sequence-to-sequence models for converting text between languages.

NLLB

NLLB Text Translation

Multi-language text translation

Translation
Speed: Very Fast
Size: Medium
Modality: Text → Text

No Language Left Behind (NLLB) model for high-quality text translation across 200+ languages.

Downloads
Windows: 1 download (CPU)