LLM Finder
Match local AI models to your GPU. The results below assume 16 GB of VRAM; configure your hardware to update the list.
Each entry shows the model name, parameter count and family, suggested quantization, estimated VRAM use, throughput, context window, license, suggested uses, and the download command.
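The VRAM figures in the list follow roughly from parameter count times bits per weight, plus runtime overhead. A minimal sketch of that estimate — the bits-per-weight table and the 10% overhead factor are assumptions of this sketch, not values taken from the list, and actual usage also depends on context length and runtime:

```python
# Rough VRAM estimator for a dense model at a given quantization.
# Bits-per-weight values are approximations for common GGUF quant
# levels (assumption, not taken from the list above).
BPW = {"q3_K_M": 3.91, "q4_K_M": 4.85, "q5_K_M": 5.69, "q8_0": 8.5, "fp16": 16.0}

def est_vram_gb(params_b: float, quant: str, overhead: float = 1.10) -> float:
    """Approximate GB of VRAM: weights plus ~10% for KV cache and buffers."""
    return params_b * BPW[quant] / 8 * overhead

# A 27B model at q3_K_M lands near the 14.6 GB shown for Qwen3.5-27B below.
print(round(est_vram_gb(27, "q3_K_M"), 1))
```

Entries in the list that budget a long context window report higher figures than this weights-only estimate, so treat it as a lower bound when picking a quantization.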
Qwen3.5-27B
  27B · Qwen · q3_K_M
  14.6 / 16 GB VRAM (91%) · 19-29 tok/s · 1.0M context · Apache-2.0
  Uses: chat, coding, research, math, agentic
  Pull: ollama pull qwen3.5:27b

Phi-4-14B-Instruct
  14B · Phi · q5_K_M
  12.9 / 16 GB VRAM (81%) · 25-35 tok/s · 128k context · MIT
  Uses: coding, math, research, chat
  Pull: ollama pull phi4:14b

Phi-3-medium-128k-instruct
  14B · Phi · q5_K_M
  12.9 / 16 GB VRAM (81%) · 25-35 tok/s · 131k context · MIT
  Uses: coding, chat, research, math
  Pull: ollama pull phi3:medium

Mistral-Small-24B-Instruct
  24B · Mistral · q4_K_M
  16 / 16 GB VRAM (100%) · 20-30 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Mistral-Small-24B-Instruct

InternLM2.5-20B-Chat
  20B · InternLM · q4_K_M
  16 / 16 GB VRAM (100%) · 20-30 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull InternLM2.5-20B-Chat

Mistral-Nemo-12B-Instruct
  12B · Mistral · q8_0
  16 / 16 GB VRAM (100%) · 21-31 tok/s · 128k context · Apache-2.0
  Uses: chat, coding, creative
  Pull: ollama pull mistral-nemo:12b

Qwen3.5-9B
  9B · Qwen · q8_0
  9.9 / 16 GB VRAM (62%) · 33-43 tok/s · 262k context · Apache-2.0
  Uses: chat, coding, research, math, agentic, reasoning, vision
  Pull: hf download Qwen/Qwen3.5-9B

DeepSeek-R1-Distill-Llama-8B
  8B · DeepSeek · q8_0
  11.2 / 16 GB VRAM (70%) · 30-40 tok/s · 33k context · MIT
  Uses: math, research, chat
  Pull: ollama pull deepseek-r1:8b

Gemma-4-26B-A4B
  25.2B · Gemma · MoE (3.8B active) · q4_K_M
  13.9 / 16 GB VRAM (87%) · 24-34 tok/s · 262k context · Apache-2.0
  Uses: chat, coding, research, math, agentic, reasoning, vision
  Pull: hf download google/gemma-4-26B-A4B-it

Qwen2.5-Coder-14B-Instruct
  14B · Qwen · q5_K_M
  12.9 / 16 GB VRAM (81%) · 27-37 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, math
  Pull: ollama pull qwen2.5-coder:14b

Qwen2.5-14B-Instruct
  14B · Qwen · q5_K_M
  12.9 / 16 GB VRAM (81%) · 27-37 tok/s · 33k context · Apache-2.0
  Uses: chat, coding, research, math
  Pull: ollama pull qwen2.5:14b

Llama-3.2-11B-Vision-Instruct
  11B · Llama · q8_0
  13 / 16 GB VRAM (81%) · 27-37 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math, vision
  Pull: ollama pull Llama-3.2-11B-Vision-Instruct

Gemma-2-27B-Instruct
  27B · Gemma · q3_K_M
  15 / 16 GB VRAM (94%) · 22-32 tok/s · 8k context · Gemma Terms
  Uses: chat, coding, research, creative, math
  Pull: ollama pull gemma2:27b

Gemma-3-12B
  12B · Gemma · q8_0
  16 / 16 GB VRAM (100%) · 22-32 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Gemma-3-12B

Qwen3.5-4B
  4B · Qwen · fp16
  8.8 / 16 GB VRAM (55%) · 38-48 tok/s · 262k context · Apache-2.0
  Uses: chat, coding, research, math, agentic, reasoning, vision
  Pull: hf download Qwen/Qwen3.5-4B

DeepSeek-R1-Distill-Qwen-14B
  14B · DeepSeek · q5_K_M
  12.9 / 16 GB VRAM (81%) · 28-38 tok/s · 33k context · MIT
  Uses: math, research, coding
  Pull: ollama pull deepseek-r1:14b

Solar-10.7B-Instruct
  10.7B · Solar · q8_0
  13 / 16 GB VRAM (81%) · 28-38 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Solar-10.7B-Instruct

Ministral-8B-Instruct
  8B · Mistral · q8_0
  11.2 / 16 GB VRAM (70%) · 33-43 tok/s · 128k context · Mistral Research
  Uses: chat, research
  Pull: ollama pull ministral:8b

Command-R-7B
  7B · Cohere · q8_0
  10.1 / 16 GB VRAM (63%) · 36-46 tok/s · 128k context · CC-BY-NC
  Uses: research, chat, creative
  Pull: ollama pull command-r:7b

Qwen3-8B
  8B · Qwen · fp16
  16 / 16 GB VRAM (100%) · 24-34 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Qwen3-8B

Mistral-7B-Instruct-v0.3
  7B · Mistral · q8_0
  10.1 / 16 GB VRAM (63%) · 36-46 tok/s · 33k context · Apache-2.0
  Uses: chat, research
  Pull: ollama pull mistral:7b

Dolphin-2.9.2-Qwen2-7B
  7B · Dolphin · q8_0
  10.1 / 16 GB VRAM (63%) · 36-46 tok/s · 33k context · Apache-2.0
  Uses: chat, creative, coding
  Pull: ollama pull dolphin3:8b

Granite-3.1-8B-Instruct
  8B · Granite · fp16
  16 / 16 GB VRAM (100%) · 24-34 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Granite-3.1-8B-Instruct

Qwen2.5-Coder-7B-Instruct
  7B · Qwen · q8_0
  10.1 / 16 GB VRAM (63%) · 37-47 tok/s · 33k context · Apache-2.0
  Uses: coding, chat
  Pull: ollama pull qwen2.5-coder:7b

Qwen2.5-7B-Instruct
  7B · Qwen · q8_0
  10.1 / 16 GB VRAM (63%) · 37-47 tok/s · 33k context · Apache-2.0
  Uses: chat, coding, research, math
  Pull: ollama pull qwen2.5:7b

OpenHermes-2.5-Mistral-7B
  7B · Nous · q8_0
  10.1 / 16 GB VRAM (63%) · 37-47 tok/s · 33k context · Apache-2.0
  Uses: chat, creative
  Pull: ollama pull openhermes

GLM-4.7-9B-Chat
  9B · GLM · q8_0
  10.5 / 16 GB VRAM (66%) · 36-46 tok/s · 33k context · Apache-2.0
  Uses: chat, coding
  Pull: ollama pull glm4:9b

Llama-3.1-8B-Instruct
  8B · Llama · q8_0
  11.2 / 16 GB VRAM (70%) · 35-45 tok/s · 131k context · Llama 3.1 Community
  Uses: chat, research, creative
  Pull: ollama pull llama3.1:8b

Zephyr-7B-beta
  7B · Zephyr · q8_0
  10.1 / 16 GB VRAM (63%) · 38-48 tok/s · 33k context · MIT
  Uses: chat, creative
  Pull: ollama pull zephyr:7b

Gemma-2-9B-Instruct
  9B · Gemma · q8_0
  12.4 / 16 GB VRAM (78%) · 32-42 tok/s · 8k context · Gemma Terms
  Uses: chat, research, creative
  Pull: ollama pull gemma2:9b

CodeLlama-7B-Instruct
  7B · CodeLlama · q8_0
  10.1 / 16 GB VRAM (63%) · 38-48 tok/s · 16k context · Llama 2 Community
  Uses: coding
  Pull: ollama pull codellama:7b

Neural-Chat-7B-v3.3
  7B · Intel · q8_0
  10.1 / 16 GB VRAM (63%) · 38-48 tok/s · 33k context · Apache-2.0
  Uses: chat, research
  Pull: ollama pull neural-chat

Vicuna-13B
  13B · Vicuna · q8_0
  16 / 16 GB VRAM (100%) · 26-36 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Vicuna-13B

Orca-2-13B
  13B · Orca · q8_0
  16 / 16 GB VRAM (100%) · 26-36 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Orca-2-13B

Gemma-4-E4B
  8B · Gemma · q8_0
  8.8 / 16 GB VRAM (55%) · 42-52 tok/s · 131k context · Apache-2.0
  Uses: chat, coding, research, agentic, reasoning, vision
  Pull: hf download google/gemma-4-E4B-it

Phi-4-mini-instruct
  3.8B · Phi · fp16
  10.9 / 16 GB VRAM (68%) · 37-47 tok/s · 128k context · MIT
  Uses: chat, coding, research
  Pull: ollama pull phi4-mini

DeepSeek-R1-Distill-Qwen-7B
  7B · DeepSeek · q8_0
  10.1 / 16 GB VRAM (63%) · 40-50 tok/s · 33k context · MIT
  Uses: math, research, coding
  Pull: ollama pull deepseek-r1:7b

OpenChat-3.6-8B
  8B · OpenChat · q8_0
  11.2 / 16 GB VRAM (70%) · 37-47 tok/s · 8k context · Apache-2.0
  Uses: chat, creative
  Pull: ollama pull openchat:8b

Phi-3-mini-4k-instruct
  3.8B · Phi · fp16
  10.9 / 16 GB VRAM (68%) · 38-48 tok/s · 4k context · MIT
  Uses: chat, coding
  Pull: ollama pull phi3:mini

Gemma-3-4B
  4B · Gemma · fp16
  16 / 16 GB VRAM (100%) · 28-38 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Gemma-3-4B

DeepSeek-Coder-6.7B-Instruct
  6.7B · DeepSeek · q8_0
  9.7 / 16 GB VRAM (61%) · 42-52 tok/s · 16k context · DeepSeek License
  Uses: coding, chat
  Pull: ollama pull deepseek-coder:6.7b

InternLM2.5-7B-Chat
  7B · InternLM · fp16
  16 / 16 GB VRAM (100%) · 30-40 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull InternLM2.5-7B-Chat

WizardLM-2-7B
  7B · WizardLM · q8_0
  10.1 / 16 GB VRAM (63%) · 42-52 tok/s · 33k context · Llama 2 Community
  Uses: chat, coding
  Pull: ollama pull wizardlm2:7b

StarCoder2-15B-Instruct
  15B · StarCoder · q5_K_M
  13.7 / 16 GB VRAM (86%) · 34-44 tok/s · 16k context · OpenRAIL-M
  Uses: coding, math
  Pull: ollama pull starcoder2:15b

Vicuna-7B
  7B · Vicuna · fp16
  16 / 16 GB VRAM (100%) · 30-40 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Vicuna-7B

Orca-2-7B
  7B · Orca · fp16
  16 / 16 GB VRAM (100%) · 30-40 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Orca-2-7B

Qwen2-VL-7B-Instruct
  7B · Qwen · fp16
  16 / 16 GB VRAM (100%) · 30-40 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Qwen2-VL-7B-Instruct

CodeGemma-7B-Instruct
  7B · Gemma · q8_0
  10.1 / 16 GB VRAM (63%) · 43-53 tok/s · 8k context · Gemma Terms
  Uses: coding, chat
  Pull: ollama pull codegemma:7b

Falcon-7B-Instruct
  7B · Falcon · fp16
  16 / 16 GB VRAM (100%) · 31-41 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Falcon-7B-Instruct

Qwen2.5-3B-Instruct
  3B · Qwen · fp16
  9.1 / 16 GB VRAM (57%) · 46-56 tok/s · 33k context · Apache-2.0
  Uses: chat, coding, research
  Pull: ollama pull qwen2.5:3b

Llama-3.2-3B-Instruct
  3B · Llama · fp16
  9.1 / 16 GB VRAM (57%) · 47-57 tok/s · 131k context · Llama 3.2 Community
  Uses: chat, creative
  Pull: ollama pull llama3.2:3b

CodeLlama-13B-Instruct
  13B · CodeLlama · q5_K_M
  12.1 / 16 GB VRAM (76%) · 39-49 tok/s · 16k context · Llama 2 Community
  Uses: coding, math
  Pull: ollama pull codellama:13b

Llama-3.2-1B-Instruct
  1B · Llama · fp16
  2 / 16 GB VRAM (13%) · 66-76 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull Llama-3.2-1B-Instruct

StarCoder2-7B-Instruct
  7B · StarCoder · q8_0
  10.1 / 16 GB VRAM (63%) · 46-56 tok/s · 16k context · OpenRAIL-M
  Uses: coding
  Pull: ollama pull starcoder2:7b

SmolLM2-1.7B-Instruct
  1.7B · SmolLM · fp16
  4 / 16 GB VRAM (25%) · 63-73 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull SmolLM2-1.7B-Instruct

Gemma-2-2B-Instruct
  2B · Gemma · fp16
  6.9 / 16 GB VRAM (43%) · 56-66 tok/s · 8k context · Gemma Terms
  Uses: chat, creative
  Pull: ollama pull gemma2:2b

StarCoder2-3B-Instruct
  3B · StarCoder · fp16
  9.1 / 16 GB VRAM (57%) · 53-63 tok/s · 16k context · OpenRAIL-M
  Uses: coding
  Pull: ollama pull starcoder2:3b

TinyLlama-1.1B
  1.1B · TinyLlama · fp16
  2 / 16 GB VRAM (13%) · 72-82 tok/s · 33k context · Apache-2.0
  Uses: coding, chat, research, creative, math
  Pull: ollama pull TinyLlama-1.1B

Yi-1.5-9B-Chat
  9B · Yi · q8_0
  12.4 / 16 GB VRAM (78%) · 51-61 tok/s · 33k context · Apache-2.0
  Uses: chat, research, creative
  Pull: ollama pull yi:9b

Llama-3-8B-Instruct
  8B · Llama · q8_0
  11.2 / 16 GB VRAM (70%) · 58-68 tok/s · 8k context · Llama 3 Community
  Uses: chat, creative
  Pull: ollama pull llama3:8b
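To use any entry above, run its listed pull command and then start a session. A minimal Ollama workflow, using one model taken from the list (the commands are standard Ollama CLI verbs; the choice of llama3.1:8b is illustrative):

```shell
# Download the quantized model from the Ollama registry
ollama pull llama3.1:8b

# Verify the download and check its size on disk
ollama list

# Start an interactive chat (Ctrl-D to exit)
ollama run llama3.1:8b
```

Entries whose command starts with `hf download` come from the Hugging Face Hub instead of the Ollama registry, and need a separate runtime (for example llama.cpp or vLLM) to serve them.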