Model Management
Load, unload, and manage GGUF models — inference runs entirely on your machine
Llama 3.1 8B
Active
Llama-3.1-8B-Instruct-Q4_K_M.gguf
Quantization: Q4_K_M
File size: 4.80 GB
Context length: 8,192
Parameters: 8B
Architecture: LlamaForCausalLM
Loaded in 2.3s
Last used: 2 min ago
System Memory
RAM: 9.8 / 32 GB (31%)
VRAM (GPU): 3.6 / 8 GB (45%)
Swap: 4.9 / 16 GB (31%)
Model footprint
Llama-3.1-8B Q4_K_M: 4.92 GB
KV cache (8192 ctx): 0.84 GB
Total model: 5.76 GB
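
The footprint and usage figures above follow from simple arithmetic. A minimal Python sketch that reproduces them, assuming the total is just weights plus KV cache and the percentages are rounded to the nearest integer; the helper names `total_footprint_gb` and `usage_percent` are hypothetical and not part of the application:

```python
def total_footprint_gb(weights_gb: float, kv_cache_gb: float) -> float:
    """Weights plus KV cache, rounded to two decimals (assumed formula)."""
    return round(weights_gb + kv_cache_gb, 2)


def usage_percent(used_gb: float, capacity_gb: float) -> int:
    """Used share of a memory pool as a whole-number percentage."""
    return round(100 * used_gb / capacity_gb)


# Values copied from the panels above.
total = total_footprint_gb(4.92, 0.84)
ram = usage_percent(9.8, 32)
vram = usage_percent(3.6, 8)
swap = usage_percent(4.9, 16)
print(total, ram, vram, swap)
```

With the numbers shown on this page, the sketch yields a 5.76 GB total and 31%, 45%, and 31% usage for RAM, VRAM, and swap.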
Load from File System
Drop .gguf file here
or click to browse file system
.gguf only
Recent paths
Paths are read from your local file system — no files are uploaded anywhere
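
A local ".gguf only" check can go beyond the file extension. The sketch below is one possible way to do it, not the application's actual logic: it assumes the common little-endian GGUF layout, where a file starts with the 4-byte magic `GGUF` followed by a 32-bit version number. The function name `looks_like_gguf` is hypothetical.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def looks_like_gguf(path: str) -> bool:
    """Return True if the path has a .gguf suffix and a plausible GGUF header.

    Only the first 8 bytes are read locally; nothing is uploaded.
    """
    p = Path(path)
    if p.suffix.lower() != ".gguf" or not p.is_file():
        return False
    with p.open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return False
    # Assumption: little-endian uint32 version directly after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version >= 1
```

A check like this rejects renamed or truncated files before any attempt to load them, which is cheaper than discovering the problem as a load error.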
Model Library
8 models in library
| Model | Size | Params | Context | Quantization | Architecture | Last used | Status | Actions |
|---|---|---|---|---|---|---|---|---|
| Qwen2.5 7B (Qwen2.5-7B-Instruct-Q6_K.gguf) | 5.96 GB | 7B | 131K | Q6_K | Qwen2ForCausalLM | 1 week ago | Error | Load error |
| CodeLlama 13B (CodeLlama-13b-Instruct-Q4_K_S.gguf) | 7.62 GB | 13B | 16K | Q4_K_S | LlamaForCausalLM | 2 days ago | Available | |
| Llama 3.1 8B (Llama-3.1-8B-Instruct-Q4_K_M.gguf) | 4.80 GB | 8B | 8K | Q4_K_M | LlamaForCausalLM | 2 min ago | Loaded | |
| DeepSeek Coder 6.7B (deepseek-coder-6.7b-instruct-Q4_K_M.gguf) | 4.00 GB | 6.7B | 16K | Q4_K_M | LlamaForCausalLM | 2 weeks ago | Available | |
| Mistral 7B v0.3 (Mistral-7B-Instruct-v0.3-Q8_0.gguf) | 7.50 GB | 7B | 33K | Q8_0 | MistralForCausalLM | 3 hours ago | Available | |
| Llama 3.2 1B (Llama-3.2-1B-Instruct-Q8_0.gguf) | 1.31 GB | 1B | 131K | Q8_0 | LlamaForCausalLM | 3 weeks ago | Available | |
| Gemma 2 9B (gemma-2-9b-it-Q4_K_M.gguf) | 5.47 GB | 9B | 8K | Q4_K_M | Gemma2ForCausalLM | 5 days ago | Available | |
| Phi-3 Mini 3.8B (Phi-3-mini-4k-instruct-Q4_K_M.gguf) | 2.29 GB | 3.8B | 4K | Q4_K_M | Phi3ForCausalLM | Yesterday | Available | |
8 of 8 models
Total library: 38.9 GB
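
The library total and the Quantization column can both be derived from the file list. The following sketch is illustrative only: it assumes the quantization tag is the final `-Q...` segment of the file name, which is a naming convention rather than a guarantee, and the names `quantization_from_name` and `library_total_gb` are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# (file name, size in GB) pairs copied from the library table.
LIBRARY: List[Tuple[str, float]] = [
    ("Qwen2.5-7B-Instruct-Q6_K.gguf", 5.96),
    ("CodeLlama-13b-Instruct-Q4_K_S.gguf", 7.62),
    ("Llama-3.1-8B-Instruct-Q4_K_M.gguf", 4.80),
    ("deepseek-coder-6.7b-instruct-Q4_K_M.gguf", 4.00),
    ("Mistral-7B-Instruct-v0.3-Q8_0.gguf", 7.50),
    ("Llama-3.2-1B-Instruct-Q8_0.gguf", 1.31),
    ("gemma-2-9b-it-Q4_K_M.gguf", 5.47),
    ("Phi-3-mini-4k-instruct-Q4_K_M.gguf", 2.29),
]

# Assumed convention: the tag is the last "-Q<digits>_..." part before ".gguf".
QUANT_RE = re.compile(r"-(Q\d+_[A-Z0-9_]+)\.gguf$")


def quantization_from_name(file_name: str) -> Optional[str]:
    """Extract a tag such as 'Q4_K_M' from a file name, or None if absent."""
    match = QUANT_RE.search(file_name)
    return match.group(1) if match else None


def library_total_gb(entries: List[Tuple[str, float]]) -> float:
    """Sum the listed file sizes, rounded to two decimals."""
    return round(sum(size for _, size in entries), 2)


print(library_total_gb(LIBRARY))
print([quantization_from_name(name) for name, _ in LIBRARY])
```

Summing the eight listed sizes gives 38.95 GB, which agrees with the displayed 38.9 GB total to within the rounding of the per-file sizes.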