Model Management

Load, unload, and manage GGUF models — inference runs entirely on your machine

Active

Llama-3.1-8B-Instruct-Q4_K_M.gguf

Quantization

Q4_K_M

File size

4.80 GB

Context length

8,192

Architecture

LlamaForCausalLM

Loaded in 2.3s

Last used: 2 min ago

RAM: 9.8 / 32 GB (31%)

VRAM (GPU): 3.6 / 8 GB (45%)

Swap: 4.9 / 16 GB (31%)

Model footprint

Llama-3.1-8B Q4_K_M: 4.92 GB

KV cache (8192 ctx): 0.84 GB

Total model: 5.76 GB
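The footprint and usage figures above are simple arithmetic: the total is the weights plus the KV cache, and each percentage is used memory over pool size. A minimal sketch follows, assuming the displayed values are decimal GB rounded to two places. The helper names `footprint_gb` and `usage_percent` are hypothetical and not part of this application.

```python
def footprint_gb(weights_gb: float, kv_cache_gb: float) -> float:
    """Total model footprint: weights plus KV cache, rounded to 2 decimals."""
    return round(weights_gb + kv_cache_gb, 2)


def usage_percent(used_gb: float, total_gb: float) -> int:
    """Used share of a memory pool as a whole-number percentage."""
    return round(100 * used_gb / total_gb)


# Values copied from the panel above.
total = footprint_gb(4.92, 0.84)   # weights + KV cache at 8192 ctx
ram = usage_percent(9.8, 32)
vram = usage_percent(3.6, 8)
swap = usage_percent(4.9, 16)
print(total, ram, vram, swap)
```

With the values shown in the panel, this reproduces the 5.76 GB total and the 31%, 45%, and 31% usage figures.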

Drop .gguf file here

or click to browse file system

.gguf only

Recent paths

Paths are read from your local file system — no files are uploaded anywhere
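A ".gguf only" restriction can be enforced on content as well as on the file extension, because a GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian 32-bit version number. The sketch below shows one way to check a local path without reading the whole file. The function names `read_gguf_version` and `is_gguf` are hypothetical, and this is not necessarily how this application validates files.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path):
    """Return the GGUF version from the header, or None if the magic is missing."""
    with open(path, "rb") as f:
        header = f.read(8)  # 4-byte magic + uint32 version
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])  # little-endian uint32
    return version


def is_gguf(path):
    """Accept a path only if both the extension and the header look like GGUF."""
    has_ext = Path(path).suffix.lower() == ".gguf"
    return has_ext and read_gguf_version(path) is not None
```

Only the first 8 bytes are read, so the check stays cheap even for multi-gigabyte model files.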

8 models in library

Model	File	Size	Params	Context	Quantization	Architecture	Last used	Status	Actions
Qwen2.5 7B	Qwen2.5-7B-Instruct-Q6_K.gguf	5.96 GB	7B	131K	Q6_K	Qwen2ForCausalLM	1 week ago	Error	Load error
CodeLlama 13B	CodeLlama-13b-Instruct-Q4_K_S.gguf	7.62 GB	13B	16K	Q4_K_S	LlamaForCausalLM	2 days ago	Available
Llama 3.1 8B	Llama-3.1-8B-Instruct-Q4_K_M.gguf	4.80 GB	8B	8K	Q4_K_M	LlamaForCausalLM	2 min ago	Loaded
DeepSeek Coder 6.7B	deepseek-coder-6.7b-instruct-Q4_K_M.gguf	4.00 GB	6.7B	16K	Q4_K_M	LlamaForCausalLM	2 weeks ago	Available
Mistral 7B v0.3	Mistral-7B-Instruct-v0.3-Q8_0.gguf	7.50 GB	7B	33K	Q8_0	MistralForCausalLM	3 hours ago	Available
Llama 3.2 1B	Llama-3.2-1B-Instruct-Q8_0.gguf	1.31 GB	1B	131K	Q8_0	LlamaForCausalLM	3 weeks ago	Available
Gemma 2 9B	gemma-2-9b-it-Q4_K_M.gguf	5.47 GB	9B	8K	Q4_K_M	Gemma2ForCausalLM	5 days ago	Available
Phi-3 Mini 3.8B	Phi-3-mini-4k-instruct-Q4_K_M.gguf	2.29 GB	3.8B	4K	Q4_K_M	Phi3ForCausalLM	Yesterday	Available

8 of 8 models

Total library: 38.9 GB