LLM Finder 2026: Match Local AI Models to Your GPU & RAM

MiniMax-M2.5

230B

MiniMax · MoE · 10B active

🔀 q4_K_M

98 GB / 112 GB (GPU+RAM) · 88%

⚡ 11-17 tok/s · ↔ 1.0M · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · agentic

llama-server -hf MiniMax-M2.5-GGUF:q4_K_M \
  --jinja -ngl 999 --ctx-size 16384 --fit on
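
The percentage on each card appears to be the model's memory footprint divided by the available memory budget, rounded to a whole number. The page does not state the formula, so the sketch below is an assumption that happens to reproduce the displayed values (98 GB of 112 GB gives 88%). `utilization_pct` is a hypothetical helper name, not part of any tool named here.

```python
from decimal import Decimal, ROUND_HALF_UP


def utilization_pct(used_gb: str, budget_gb: str) -> int:
    """Return used/budget as a whole percentage, rounding halves up.

    Assumption: the page rounds half up (39.6 / 48 = 82.5 is shown as 83%).
    Inputs are strings so Decimal parses them exactly, avoiding float noise.
    """
    ratio = Decimal(used_gb) / Decimal(budget_gb) * 100
    return int(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(utilization_pct("98", "112"))   # MiniMax-M2.5 against the GPU+RAM budget
    print(utilization_pct("39.6", "48"))  # a card measured against 48 GB of VRAM
```

A value near 100% leaves little headroom for the KV cache, so longer contexts may not fit even when the weights do.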

Qwen3.5-122B-A10B

122B

Qwen · MoE · 10B active

q3_K_M

40 GB / 48 GB VRAM · 83%

⚡ 33-43 tok/s · ↔ 1.0M · 📄 Apache-2.0

↑ Hybrid upgrade: q8_0 · 68 GB · 11-17 tok/s

💬 chat · 💻 coding · 🔬 research · 🔢 math · agentic

# Needs 40GB+ VRAM/RAM — single H100 or dual 3090
hf download Qwen/Qwen3.5-122B-A10B-Instruct-GGUF --include "Q3_K_M/*"
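
A "Hybrid upgrade" line can be read as: the higher-precision quant no longer fits in VRAM alone but still fits once system RAM is added. The sketch below shows that decision under an inferred assumption, namely that the 112 GB GPU+RAM budget shown on this page is 48 GB of VRAM plus 64 GB of RAM. The split is not stated on the page, and `placement` is a hypothetical helper.

```python
def placement(size_gb: float, vram_gb: float = 48.0, ram_gb: float = 64.0) -> str:
    """Classify where a model of size_gb could live.

    Defaults are assumptions inferred from the page (48 GB VRAM, 112 GB GPU+RAM).
    """
    if size_gb <= vram_gb:
        return "gpu"        # fits entirely in VRAM
    if size_gb <= vram_gb + ram_gb:
        return "hybrid"     # spills into system RAM, expect lower tok/s
    return "too large"      # exceeds the combined budget


if __name__ == "__main__":
    print(placement(40))  # the q3_K_M footprint on this card
    print(placement(68))  # the q8_0 hybrid upgrade on this card
```

The listed hybrid speeds (11-17 tok/s against 33-43 tok/s here) suggest how much throughput is given up once layers leave the GPU.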

Qwen3-235B-A22B

235B

Qwen · MoE · 22B active

🔀 q4_K_M

100 GB / 112 GB (GPU+RAM) · 89%

⚡ 11-17 tok/s · ↔ 131k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · reasoning

llama-server -hf Qwen3-235B-A22B-GGUF:q4_K_M \
  --jinja -ngl 999 --ctx-size 16384 --fit on

Llama-3.3-70B-Instruct

70B

Llama

q4_K_M

47.4 GB / 48 GB VRAM · 99%

⚡ 17-27 tok/s · ↔ 131k · 📄 Llama 3.3 Community

↑ Hybrid upgrade: q8_0 · 84.4 GB · 12-18 tok/s

💬 chat · 💻 coding · 🔬 research · 🎨 creative · 🔢 math

ollama pull llama3.3:70b

Qwen2.5-72B-Instruct

72B

Qwen

q3_K_M

38.4 GB / 48 GB VRAM · 80%

⚡ 38-48 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: q8_0 · 86.8 GB · 12-18 tok/s

💬 chat · 💻 coding · 🔬 research · 🔢 math · 🎨 creative

ollama pull qwen2.5:72b

Qwen2.5-Coder-32B-Instruct

32B

Qwen

q8_0

39.6 GB / 48 GB VRAM · 83%

⚡ 35-45 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 72.9 GB · 14-20 tok/s

💻 coding · 💬 chat · 🔢 math · 🔬 research

ollama pull qwen2.5-coder:32b

Nous-Hermes-2-Mixtral-8x7B-DPO

46.7B MoE

Nous

q5_K_M

39.9 GB / 48 GB VRAM · 83%

⚡ 35-45 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 105.9 GB · 10-16 tok/s

💬 chat · 🎨 creative · 💻 coding

ollama pull nous-hermes2-mixtral

Llama-3.1-70B-Instruct

70B

Llama

q4_K_M

47.4 GB / 48 GB VRAM · 99%

⚡ 18-28 tok/s · ↔ 131k · 📄 Llama 3.1 Community

↑ Hybrid upgrade: q8_0 · 84.4 GB · 12-18 tok/s

💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull llama3.1:70b

Qwen3.5-27B

27B

Qwen

q8_0

32.4 GB / 48 GB VRAM · 68%

⚡ 53-63 tok/s · ↔ 1.0M · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 58.5 GB · 17-23 tok/s

💬 chat · 💻 coding · 🔬 research · 🔢 math · agentic

ollama pull qwen3.5:27b

Qwen3-32B

32B

Qwen

q8_0

38 GB / 48 GB VRAM · 79%

⚡ 40-50 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 64 GB · 16-22 tok/s

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull qwen3:32b

DeepSeek-R1-Distill-Llama-70B

70B

DeepSeek

q4_K_M

47.4 GB / 48 GB VRAM · 99%

⚡ 18-28 tok/s · ↔ 33k · 📄 MIT

↑ Hybrid upgrade: q8_0 · 84.4 GB · 12-18 tok/s

🔢 math · 🔬 research · 💻 coding · 💬 chat

ollama pull deepseek-r1:70b

DeepSeek-R1-Distill-Qwen-32B

32B

DeepSeek

q8_0

38 GB / 48 GB VRAM · 79%

⚡ 40-50 tok/s · ↔ 33k · 📄 MIT

↑ Hybrid upgrade: fp16 · 64 GB · 16-22 tok/s

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull deepseek-r1:32b

Qwen3.5-35B-A3B

35B

Qwen · MoE · 3B active

q8_0

43.1 GB / 48 GB VRAM · 90%

⚡ 28-38 tok/s · ↔ 1.0M · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 79.5 GB · 11-17 tok/s

💬 chat · 💻 coding · agentic · 🔬 research · 🔢 math

ollama pull qwen3.5:35b-a3b

Gemma-3-27B

27B

Gemma

q8_0

38 GB / 48 GB VRAM · 79%

⚡ 41-51 tok/s · ↔ 33k · 📄 Gemma Terms

↑ Hybrid upgrade: fp16 · 64 GB · 16-22 tok/s

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull gemma3:27b

Qwen2.5-32B-Instruct

32B

Qwen

q8_0

39.6 GB / 48 GB VRAM · 83%

⚡ 37-47 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 72.9 GB · 14-20 tok/s

💬 chat · 💻 coding · 🔬 research · 🔢 math · 🎨 creative

ollama pull qwen2.5:32b

Phi-4-14B-Instruct

14B

Phi

fp16

33.3 GB / 48 GB VRAM · 69%

⚡ 52-62 tok/s · ↔ 128k · 📄 MIT

💻 coding · 🔢 math · 🔬 research · 💬 chat

ollama pull phi4:14b

Phi-3-medium-128k-instruct

14B

Phi

fp16

33.3 GB / 48 GB VRAM · 69%

⚡ 53-63 tok/s · ↔ 131k · 📄 MIT

💻 coding · 💬 chat · 🔬 research · 🔢 math

ollama pull phi3:medium

Mistral-Small-24B-Instruct

24B

Mistral

fp16

48 GB / 48 GB VRAM · 100%

⚡ 20-30 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull mistral-small:24b

InternLM2.5-20B-Chat

20B

InternLM

fp16

48 GB / 48 GB VRAM · 100%

⚡ 20-30 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull internlm2:20b

Mistral-Nemo-12B-Instruct

12B

Mistral

fp16

28.9 GB / 48 GB VRAM · 60%

⚡ 64-74 tok/s · ↔ 128k · 📄 Apache-2.0

💬 chat · 💻 coding · 🎨 creative

ollama pull mistral-nemo:12b

Qwen3.5-9B

9B

Qwen

fp16

19.8 GB / 48 GB VRAM · 41%

⚡ 86-96 tok/s · ↔ 262k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · 🔢 math · agentic · reasoning · vision

hf download Qwen/Qwen3.5-9B

DeepSeek-R1-Distill-Llama-8B

8B

DeepSeek

fp16

20.1 GB / 48 GB VRAM · 42%

⚡ 86-96 tok/s · ↔ 33k · 📄 MIT

🔢 math · 🔬 research · 💬 chat

ollama pull deepseek-r1:8b

Gemma-4-26B-A4B

25.2B

Gemma · MoE · 3.8B active

q8_0

27.7 GB / 48 GB VRAM · 58%

⚡ 68-78 tok/s · ↔ 262k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 55.4 GB · 11-17 tok/s

💬 chat · 💻 coding · 🔬 research · 🔢 math · agentic · reasoning · vision

hf download google/gemma-4-26B-A4B-it

Qwen3-30B-A3B

30B

Qwen

q8_0

38 GB / 48 GB VRAM · 79%

⚡ 43-53 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 64 GB · 16-22 tok/s

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull qwen3:30b-a3b

Qwen2.5-Coder-14B-Instruct

14B

Qwen

fp16

33.3 GB / 48 GB VRAM · 69%

⚡ 54-64 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔢 math

ollama pull qwen2.5-coder:14b

Llama-4-Scout-17B

109B

Llama · MoE · 17B active

q4_K_M

45 GB / 48 GB VRAM · 94%

⚡ 26-36 tok/s · ↔ 524k · 📄 Llama 4 Community

↑ Hybrid upgrade: q8_0 · 85 GB · 11-17 tok/s

💬 chat · 💻 coding · vision

ollama pull llama4:scout

Mixtral-8x22B-Instruct

141B MoE

Mistral

q3_K_M

47.8 GB / 48 GB VRAM · 100%

⚡ 22-32 tok/s · ↔ 66k · 📄 Apache-2.0

↑ Hybrid upgrade: q8_0 · 108 GB · 10-16 tok/s

💬 chat · 💻 coding · 🔬 research · 🎨 creative · 🔢 math

ollama pull mixtral:8x22b

WizardLM-2-8x22B

141B MoE

WizardLM

q3_K_M

47.8 GB / 48 GB VRAM · 100%

⚡ 22-32 tok/s · ↔ 66k · 📄 Llama 2 Community

↑ Hybrid upgrade: q8_0 · 108 GB · 10-16 tok/s

💬 chat · 💻 coding · 🔬 research · 🎨 creative

ollama pull wizardlm2:8x22b

Llama-3.2-11B-Vision-Instruct

11B

Llama

fp16

22 GB / 48 GB VRAM · 46%

⚡ 82-92 tok/s · ↔ 33k · 📄 Llama 3.2 Community

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math · vision

ollama pull llama3.2-vision:11b

Gemma-3-12B

12B

Gemma

fp16

26 GB / 48 GB VRAM · 54%

⚡ 73-83 tok/s · ↔ 33k · 📄 Gemma Terms

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull gemma3:12b

Qwen2.5-14B-Instruct

14B

Qwen

fp16

33.3 GB / 48 GB VRAM · 69%

⚡ 55-65 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · 🔢 math

ollama pull qwen2.5:14b

Gemma-2-27B-Instruct

27B

Gemma

q8_0

33.7 GB / 48 GB VRAM · 70%

⚡ 54-64 tok/s · ↔ 8k · 📄 Gemma Terms

↑ Hybrid upgrade: fp16 · 61.9 GB · 17-23 tok/s

💬 chat · 💻 coding · 🔬 research · 🎨 creative · 🔢 math

ollama pull gemma2:27b

Qwen3.5-4B

4B

Qwen

fp16

8.8 GB / 48 GB VRAM · 18%

⚡ 114-124 tok/s · ↔ 262k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · 🔢 math · agentic · reasoning · vision

hf download Qwen/Qwen3.5-4B

Solar-10.7B-Instruct

10.7B

Solar

fp16

22 GB / 48 GB VRAM · 46%

⚡ 83-93 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull solar:10.7b

DeepSeek-R1-Distill-Qwen-14B

14B

DeepSeek

fp16

33.3 GB / 48 GB VRAM · 69%

⚡ 56-66 tok/s · ↔ 33k · 📄 MIT

🔢 math · 🔬 research · 💻 coding

ollama pull deepseek-r1:14b

Ministral-8B-Instruct

8B

Mistral

fp16

20.1 GB / 48 GB VRAM · 42%

⚡ 88-98 tok/s · ↔ 128k · 📄 Mistral Research

💬 chat · 🔬 research

ollama pull ministral:8b

DBRX-Instruct

132B MoE

Databricks

q3_K_M

44.7 GB / 48 GB VRAM · 93%

⚡ 29-39 tok/s · ↔ 33k · 📄 Databricks Open

↑ Hybrid upgrade: q8_0 · 100.9 GB · 11-17 tok/s

💬 chat · 💻 coding · 🔬 research

ollama pull dbrx

Qwen3-8B

8B

Qwen

fp16

16 GB / 48 GB VRAM · 33%

⚡ 98-108 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull qwen3:8b

Command-R-7B

7B

Cohere

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 94-104 tok/s · ↔ 128k · 📄 CC-BY-NC

🔬 research · 💬 chat · 🎨 creative

ollama pull command-r7b:7b

Mixtral-8x7B-Instruct

46.7B MoE

Mistral

q5_K_M

39.9 GB / 48 GB VRAM · 83%

⚡ 41-51 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 105.9 GB · 10-16 tok/s

💬 chat · 💻 coding · 🔬 research · 🎨 creative

ollama pull mixtral:8x7b

Granite-3.1-8B-Instruct

8B

Granite

fp16

16 GB / 48 GB VRAM · 33%

⚡ 99-109 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull granite3.1-dense:8b

Mistral-7B-Instruct-v0.3

7B

Mistral

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 94-104 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 🔬 research

ollama pull mistral:7b

Dolphin-2.9.2-Qwen2-7B

7B

Dolphin

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 94-104 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 🎨 creative · 💻 coding

ollama pull dolphin3:8b

Qwen2.5-Coder-7B-Instruct

7B

Qwen

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 95-105 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat

ollama pull qwen2.5-coder:7b

DeepSeek-Coder-33B-Instruct

33B

DeepSeek

q8_0

40.7 GB / 48 GB VRAM · 85%

⚡ 40-50 tok/s · ↔ 16k · 📄 DeepSeek License

↑ Hybrid upgrade: fp16 · 75.1 GB · 14-20 tok/s

💻 coding · 💬 chat · 🔢 math

ollama pull deepseek-coder:33b

Qwen2.5-7B-Instruct

7B

Qwen

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 95-105 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · 🔢 math

ollama pull qwen2.5:7b

OpenHermes-2.5-Mistral-7B

7B

Nous

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 95-105 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 🎨 creative

ollama pull openhermes

GLM-4.7-9B-Chat

9B

GLM

fp16

18.8 GB / 48 GB VRAM · 39%

⚡ 93-103 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 💻 coding

ollama pull glm4:9b

Llama-3.1-8B-Instruct

8B

Llama

fp16

20.1 GB / 48 GB VRAM · 42%

⚡ 90-100 tok/s · ↔ 131k · 📄 Llama 3.1 Community

💬 chat · 🔬 research · 🎨 creative

ollama pull llama3.1:8b

Falcon-40B-Instruct

40B

Falcon

q8_0

48 GB / 48 GB VRAM · 100%

⚡ 26-36 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 80 GB · 13-19 tok/s

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull falcon:40b

Zephyr-7B-beta

7B

Zephyr

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 96-106 tok/s · ↔ 33k · 📄 MIT

💬 chat · 🎨 creative

ollama pull zephyr:7b

Gemma-2-9B-Instruct

9B

Gemma

fp16

22.3 GB / 48 GB VRAM · 46%

⚡ 85-95 tok/s · ↔ 8k · 📄 Gemma Terms

💬 chat · 🔬 research · 🎨 creative

ollama pull gemma2:9b

CodeLlama-7B-Instruct

7B

CodeLlama

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 96-106 tok/s · ↔ 16k · 📄 Llama 2 Community

💻 coding

ollama pull codellama:7b

Neural-Chat-7B-v3.3

7B

Intel

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 96-106 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 🔬 research

ollama pull neural-chat

Vicuna-13B

13B

Vicuna

fp16

26 GB / 48 GB VRAM · 54%

⚡ 77-87 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull vicuna:13b

Orca-2-13B

13B

Orca

fp16

26 GB / 48 GB VRAM · 54%

⚡ 77-87 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull orca2:13b

Phi-4-mini-instruct

3.8B

Phi

fp16

10.9 GB / 48 GB VRAM · 23%

⚡ 113-123 tok/s · ↔ 128k · 📄 MIT

💬 chat · 💻 coding · 🔬 research

ollama pull phi4-mini

Gemma-4-E4B

8B

Gemma

fp16

17.6 GB / 48 GB VRAM · 37%

⚡ 97-107 tok/s · ↔ 131k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research · agentic · reasoning · vision

hf download google/gemma-4-E4B-it

Command-R-35B

35B

Cohere

q8_0

43.1 GB / 48 GB VRAM · 90%

⚡ 37-47 tok/s · ↔ 128k · 📄 CC-BY-NC

↑ Hybrid upgrade: fp16 · 79.5 GB · 13-19 tok/s

🔬 research · 💬 chat · 💻 coding

ollama pull command-r:35b

CodeLlama-34B-Instruct

34B

CodeLlama

q8_0

41.9 GB / 48 GB VRAM · 87%

⚡ 40-50 tok/s · ↔ 16k · 📄 Llama 2 Community

↑ Hybrid upgrade: fp16 · 77.3 GB · 13-19 tok/s

💻 coding · 🔢 math

ollama pull codellama:34b

DeepSeek-R1-Distill-Qwen-7B

7B

DeepSeek

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 98-108 tok/s · ↔ 33k · 📄 MIT

🔢 math · 🔬 research · 💻 coding

ollama pull deepseek-r1:7b

OpenChat-3.6-8B

8B

OpenChat

fp16

20.1 GB / 48 GB VRAM · 42%

⚡ 93-103 tok/s · ↔ 8k · 📄 Apache-2.0

💬 chat · 🎨 creative

ollama pull openchat:8b

Phi-3-mini-4k-instruct

3.8B

Phi

fp16

10.9 GB / 48 GB VRAM · 23%

⚡ 115-125 tok/s · ↔ 4k · 📄 MIT

💬 chat · 💻 coding

ollama pull phi3:mini

Gemma-3-4B

4B

Gemma

fp16

16 GB / 48 GB VRAM · 33%

⚡ 103-113 tok/s · ↔ 33k · 📄 Gemma Terms

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull gemma3:4b

InternLM2.5-7B-Chat

7B

InternLM

fp16

16 GB / 48 GB VRAM · 33%

⚡ 104-114 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull internlm2:7b

DeepSeek-Coder-6.7B-Instruct

6.7B

DeepSeek

fp16

17.2 GB / 48 GB VRAM · 36%

⚡ 101-111 tok/s · ↔ 16k · 📄 DeepSeek License

💻 coding · 💬 chat

ollama pull deepseek-coder:6.7b

Vicuna-7B

7B

Vicuna

fp16

16 GB / 48 GB VRAM · 33%

⚡ 105-115 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull vicuna:7b

Orca-2-7B

7B

Orca

fp16

16 GB / 48 GB VRAM · 33%

⚡ 105-115 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull orca2:7b

Qwen2-VL-7B-Instruct

7B

Qwen

fp16

16 GB / 48 GB VRAM · 33%

⚡ 105-115 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull Qwen2-VL-7B-Instruct

WizardLM-2-7B

7B

WizardLM

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 100-110 tok/s · ↔ 33k · 📄 Llama 2 Community

💬 chat · 💻 coding

ollama pull wizardlm2:7b

StarCoder2-15B-Instruct

15B

StarCoder

fp16

35.5 GB / 48 GB VRAM · 74%

⚡ 58-68 tok/s · ↔ 16k · 📄 OpenRAIL-M

💻 coding · 🔢 math

ollama pull starcoder2:15b

Falcon-7B-Instruct

7B

Falcon

fp16

16 GB / 48 GB VRAM · 33%

⚡ 106-116 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull falcon:7b

CodeGemma-7B-Instruct

7B

Gemma

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 101-111 tok/s · ↔ 8k · 📄 Gemma Terms

💻 coding · 💬 chat

ollama pull codegemma:7b

Qwen2.5-3B-Instruct

3B

Qwen

fp16

9.1 GB / 48 GB VRAM · 19%

⚡ 123-133 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 💻 coding · 🔬 research

ollama pull qwen2.5:3b

Llama-3.2-3B-Instruct

3B

Llama

fp16

9.1 GB / 48 GB VRAM · 19%

⚡ 123-133 tok/s · ↔ 131k · 📄 Llama 3.2 Community

💬 chat · 🎨 creative

ollama pull llama3.2:3b

CodeLlama-13B-Instruct

13B

CodeLlama

fp16

31.1 GB / 48 GB VRAM · 65%

⚡ 71-81 tok/s · ↔ 16k · 📄 Llama 2 Community

💻 coding · 🔢 math

ollama pull codellama:13b

Llama-3.2-1B-Instruct

1B

Llama

fp16

2 GB / 48 GB VRAM · 4%

⚡ 142-152 tok/s · ↔ 33k · 📄 Llama 3.2 Community

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull llama3.2:1b

StarCoder2-7B-Instruct

7B

StarCoder

fp16

17.9 GB / 48 GB VRAM · 37%

⚡ 104-114 tok/s · ↔ 16k · 📄 OpenRAIL-M

💻 coding

ollama pull starcoder2:7b

SmolLM2-1.7B-Instruct

1.7B

SmolLM

fp16

4 GB / 48 GB VRAM · 8%

⚡ 140-150 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull smollm2:1.7b

Gemma-2-2B-Instruct

2B

Gemma

fp16

6.9 GB / 48 GB VRAM · 14%

⚡ 133-143 tok/s · ↔ 8k · 📄 Gemma Terms

💬 chat · 🎨 creative

ollama pull gemma2:2b

StarCoder2-3B-Instruct

3B

StarCoder

fp16

9.1 GB / 48 GB VRAM · 19%

⚡ 129-139 tok/s · ↔ 16k · 📄 OpenRAIL-M

💻 coding

ollama pull starcoder2:3b

TinyLlama-1.1B

1.1B

TinyLlama

fp16

2 GB / 48 GB VRAM · 4%

⚡ 148-158 tok/s · ↔ 33k · 📄 Apache-2.0

💻 coding · 💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull tinyllama:1.1b

Yi-1.5-34B-Chat

34B

Yi

q8_0

41.9 GB / 48 GB VRAM · 87%

⚡ 55-65 tok/s · ↔ 33k · 📄 Apache-2.0

↑ Hybrid upgrade: fp16 · 77.3 GB · 13-19 tok/s

💬 chat · 🔬 research · 🎨 creative · 🔢 math

ollama pull yi:34b

Yi-1.5-9B-Chat

9B

Yi

fp16

22.3 GB / 48 GB VRAM · 46%

⚡ 104-114 tok/s · ↔ 33k · 📄 Apache-2.0

💬 chat · 🔬 research · 🎨 creative

ollama pull yi:9b

Llama-3-8B-Instruct

8B

Llama

fp16

20.1 GB / 48 GB VRAM · 42%

⚡ 113-123 tok/s · ↔ 8k · 📄 Llama 3 Community

💬 chat · 🎨 creative

ollama pull llama3:8b
