Model Management

Load, unload, and manage GGUF models — inference runs entirely on your machine

Llama 3.1 8B (Active)

Llama-3.1-8B-Instruct-Q4_K_M.gguf

Quantization: Q4_K_M
File size: 4.80 GB
Context length: 8,192
Parameters: 8B
Architecture: LlamaForCausalLM
Loaded in 2.3 s
Last used: 2 min ago

System Memory

RAM: 9.8 / 32 GB (31%)
VRAM (GPU): 3.6 / 8 GB (45%)
Swap: 4.9 / 16 GB (31%)

Model footprint
Llama-3.1-8B Q4_K_M: 4.92 GB
KV cache (8192 ctx): 0.84 GB
Total model: 5.76 GB
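
The arithmetic behind the panel above can be checked with a short sketch. It assumes only the figures shown on this page: the total model footprint is the weights plus the KV cache, and each percentage is used memory divided by capacity, rounded to a whole number. The names `usage_percent` and `pools` are illustrative and not part of the application.

```python
# Figures copied from the System Memory panel (all values in GB).
WEIGHTS_GB = 4.92    # Llama-3.1-8B Q4_K_M
KV_CACHE_GB = 0.84   # KV cache at 8192 context


def usage_percent(used_gb: float, total_gb: float) -> int:
    """Return used/total as a whole-number percentage."""
    return round(used_gb / total_gb * 100)


# Total footprint = weights + KV cache.
total_model_gb = round(WEIGHTS_GB + KV_CACHE_GB, 2)

# (used, capacity) pairs as displayed on the page.
pools = {"RAM": (9.8, 32), "VRAM": (3.6, 8), "Swap": (4.9, 16)}
percentages = {
    name: usage_percent(used, total) for name, (used, total) in pools.items()
}

print(total_model_gb)  # 5.76
print(percentages)     # {'RAM': 31, 'VRAM': 45, 'Swap': 31}
```

The results agree with the displayed values: 5.76 GB total, and 31%, 45%, and 31% for RAM, VRAM, and swap.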

Load from File System

Drop a .gguf file here, or click to browse the file system (.gguf only).

Recent paths

Paths are read from your local file system — no files are uploaded anywhere
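
A ".gguf only" restriction like the one above could be enforced locally with a check along these lines. This is a minimal sketch, not the application's actual validation: the function name `looks_like_gguf` is hypothetical. It relies on the GGUF format starting with the four ASCII bytes `GGUF` followed by a 32-bit version number, and it assumes the common little-endian layout.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def looks_like_gguf(path: str) -> bool:
    """Cheap local check: .gguf extension plus a plausible GGUF header.

    Hypothetical helper; reads only the first 8 bytes of the file.
    """
    p = Path(path)
    if p.suffix.lower() != ".gguf" or not p.is_file():
        return False
    with p.open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return False
    # Assumption: little-endian file; the version follows the magic as uint32.
    (version,) = struct.unpack("<I", header[4:8])
    return version >= 1
```

Only the header is read, so the check stays fast even for multi-gigabyte files, and nothing leaves the local file system.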

Model Library

8 models in library

Model | File | Size | Params | Context | Quantization | Architecture | Last used | Status
Qwen2.5 7B | Qwen2.5-7B-Instruct-Q6_K.gguf | 5.96 GB | 7B | 131K | Q6_K | Qwen2ForCausalLM | 1 week ago | Error (Load error)
CodeLlama 13B | CodeLlama-13b-Instruct-Q4_K_S.gguf | 7.62 GB | 13B | 16K | Q4_K_S | LlamaForCausalLM | 2 days ago | Available
Llama 3.1 8B | Llama-3.1-8B-Instruct-Q4_K_M.gguf | 4.80 GB | 8B | 8K | Q4_K_M | LlamaForCausalLM | 2 min ago | Loaded
DeepSeek Coder 6.7B | deepseek-coder-6.7b-instruct-Q4_K_M.gguf | 4.00 GB | 6.7B | 16K | Q4_K_M | LlamaForCausalLM | 2 weeks ago | Available
Mistral 7B v0.3 | Mistral-7B-Instruct-v0.3-Q8_0.gguf | 7.50 GB | 7B | 33K | Q8_0 | MistralForCausalLM | 3 hours ago | Available
Llama 3.2 1B | Llama-3.2-1B-Instruct-Q8_0.gguf | 1.31 GB | 1B | 131K | Q8_0 | LlamaForCausalLM | 3 weeks ago | Available
Gemma 2 9B | gemma-2-9b-it-Q4_K_M.gguf | 5.47 GB | 9B | 8K | Q4_K_M | Gemma2ForCausalLM | 5 days ago | Available
Phi-3 Mini 3.8B | Phi-3-mini-4k-instruct-Q4_K_M.gguf | 2.29 GB | 3.8B | 4K | Q4_K_M | Phi3ForCausalLM | Yesterday | Available
8 of 8 models
Total library: 38.9 GB
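
The library listing can be represented and summarized with a small sketch. The `LibraryEntry` type and `quant_from_filename` helper are hypothetical names chosen for illustration. Reading the quantization from the file name relies on the naming convention seen in this list (a `-Q…` suffix before `.gguf`), which is a convention rather than a guarantee. The sizes are the ones shown on this page.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class LibraryEntry:
    name: str
    filename: str
    size_gb: float
    status: str


# Rows copied from the Model Library listing.
LIBRARY = [
    LibraryEntry("Qwen2.5 7B", "Qwen2.5-7B-Instruct-Q6_K.gguf", 5.96, "Error"),
    LibraryEntry("CodeLlama 13B", "CodeLlama-13b-Instruct-Q4_K_S.gguf", 7.62, "Available"),
    LibraryEntry("Llama 3.1 8B", "Llama-3.1-8B-Instruct-Q4_K_M.gguf", 4.80, "Loaded"),
    LibraryEntry("DeepSeek Coder 6.7B", "deepseek-coder-6.7b-instruct-Q4_K_M.gguf", 4.00, "Available"),
    LibraryEntry("Mistral 7B v0.3", "Mistral-7B-Instruct-v0.3-Q8_0.gguf", 7.50, "Available"),
    LibraryEntry("Llama 3.2 1B", "Llama-3.2-1B-Instruct-Q8_0.gguf", 1.31, "Available"),
    LibraryEntry("Gemma 2 9B", "gemma-2-9b-it-Q4_K_M.gguf", 5.47, "Available"),
    LibraryEntry("Phi-3 Mini 3.8B", "Phi-3-mini-4k-instruct-Q4_K_M.gguf", 2.29, "Available"),
]


def quant_from_filename(filename: str) -> Optional[str]:
    """Extract a quantization tag such as Q4_K_M from a file name, if present."""
    match = re.search(r"-(Q\d+_[A-Za-z0-9_]+)\.gguf$", filename)
    return match.group(1) if match else None


# Sum of the per-model sizes, rounded to two decimals.
total_gb = round(sum(entry.size_gb for entry in LIBRARY), 2)

# Models that are neither loaded nor in an error state.
loadable = [entry.name for entry in LIBRARY if entry.status == "Available"]

print(len(LIBRARY), total_gb)
print(quant_from_filename(LIBRARY[0].filename))
```

Summing the listed sizes gives 38.95 GB, in line with the 38.9 GB total shown on the page once it is reduced to one decimal place.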