OpenJarvis Brings Local-First Personal AI Agents to Ollama

Ollama announced built-in support for OpenJarvis, a local-first personal AI framework from Stanford's Hazy Research and Scaling Intelligence labs. Here is what v1.0 ships, how local-cloud routing works, and the caveats to know.

May 29, 2026 · 6 min read · 1,236 words

On May 28, 2026, Ollama announced that OpenJarvis, an open-source framework for personal AI agents, now runs with Ollama out of the box. OpenJarvis is built by Stanford's Hazy Research and Scaling Intelligence labs as part of their "Intelligence Per Watt" research into efficient local AI.

The short version: OpenJarvis is not another chatbot wrapper. It is a framework for personal agents that run on your own hardware by default, escalate to cloud models only when needed, and treat energy, cost, and latency as first-class metrics alongside accuracy. Here is what shipped, who it is for, and the caveats worth knowing before you install it.

What OpenJarvis is

OpenJarvis describes itself on GitHub as "Personal AI, On Personal Devices." It is a Python project, licensed under Apache-2.0, and as of May 29, 2026 it had roughly 5.1k stars and 1.1k forks. The repository was created on February 15, 2026, so the public history is short — this is a young project moving quickly.

The v1.0.0 release landed on May 16, 2026. Its design is organized around five primitives: Intelligence, Engine, Agents, Tools & Memory, and Learning. In practice that means OpenJarvis separates the model layer (Engine) from the reasoning layer (Intelligence) and from the agents that actually do tasks, so you can swap a local model for a cloud one without rewriting your agent logic.
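The separation between the model layer and the agents can be pictured with a minimal sketch. Every name here (`Engine`, `EchoEngine`, `SummaryAgent`) is hypothetical and chosen for illustration; this is not the OpenJarvis API, only one way such a split might look.

```python
from dataclasses import dataclass
from typing import Protocol


class Engine(Protocol):
    """Hypothetical model-layer interface: turns a prompt into text."""

    def generate(self, prompt: str) -> str: ...


@dataclass
class EchoEngine:
    """Stand-in engine that makes no real model call; it only tags the prompt."""

    name: str

    def generate(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class SummaryAgent:
    """Hypothetical agent: its task logic depends only on the Engine interface."""

    engine: Engine

    def run(self, text: str) -> str:
        return self.engine.generate(f"Summarize: {text}")


# Swapping the engine does not touch the agent's logic.
local_agent = SummaryAgent(engine=EchoEngine("local"))
cloud_agent = SummaryAgent(engine=EchoEngine("cloud"))
```

Because `SummaryAgent` only knows about `generate`, replacing a local engine with a cloud one is a one-line change at construction time.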

Why Ollama support matters

If you already run models with Ollama, the install path is short. The official flow is:


curl -fsSL https://open-jarvis.github.io/OpenJarvis/install.sh | bash
jarvis

That gives you a local agent runtime backed by whatever models you have pulled in Ollama. For people who have spent time setting up a local stack — see our guide to running LLMs locally with Ollama — this means your existing model library becomes the engine for a personal agent without a separate inference service. If you are still choosing models, our roundup of the best Ollama models is a reasonable starting point.
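To see which models an agent would have available, you can ask the Ollama server directly. The sketch below uses Ollama's own REST endpoint `/api/tags` on its default local address; it is not an OpenJarvis command, and the helper names are illustrative.

```python
import json
from urllib.request import urlopen

# Ollama's default local address; adjust if your server listens elsewhere.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(payload: dict) -> list[str]:
    """Extract sorted model names from an /api/tags response body."""
    return sorted(model["name"] for model in payload.get("models", []))


def fetch_local_models(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> list[str]:
    """Query a running Ollama server and return the names of pulled models."""
    with urlopen(url, timeout=timeout) as response:
        return model_names(json.load(response))
```

Calling `fetch_local_models()` requires the Ollama server to be running; an empty list means no models have been pulled yet.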

As always, pipe-to-bash installers run arbitrary code. Read the script before you run it if you care about what touches your machine.
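One way to make that review a little easier is to save the script to a file first instead of piping it to bash, then skim it with a small helper. The sketch below is a rough aid under stated assumptions: the keyword list is arbitrary, the function name is hypothetical, and a keyword scan is no substitute for actually reading the script.

```python
import hashlib
import re

# Arbitrary, illustrative list of commands worth a second look.
RISKY = re.compile(r"\b(sudo|rm\s+-rf|curl|wget|eval)\b")


def review_script(text: str) -> dict:
    """Return a checksum, a line count, and the lines matching RISKY."""
    lines = text.splitlines()
    flagged = [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if RISKY.search(line)
    ]
    return {
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "line_count": len(lines),
        "flagged": flagged,
    }
```

The checksum lets you confirm that the file you reviewed is the same one you later execute.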

What v1.0 ships: agents, presets, and engines

According to the v1.0.0 release notes, OpenJarvis includes:

Eight built-in agents for common personal-AI tasks.
Seven starter presets, including a morning briefing, a deep-research agent that works across your files, the web, and local documents, and a code assistant.
Four local engines: Ollama, vLLM, SGLang, and llama.cpp.
Five cloud engines: OpenAI, Anthropic, Google Gemini, OpenRouter, and MiniMax.

The presets are the most concrete reason to try it. A morning-briefing agent that reads your local docs, a research agent that searches files plus the web, and a code assistant cover a lot of day-to-day personal-AI ground without your having to wire up tools yourself. The catalog of eight agents and seven presets also signals that OpenJarvis aims to be a platform for agents, not a single assistant.

The bigger shift: local-cloud routing, not cloud-only assistants

The interesting design choice is where work runs. Most consumer AI assistants send every request to a cloud model. OpenJarvis defaults to local execution and escalates to a cloud engine only when a task needs it. That is the "Intelligence Per Watt" idea in product form: measure energy, compute, cost, and latency next to accuracy, and route accordingly.
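A local-by-default router can be sketched in a few lines. The criteria below (an estimated token count and a difficulty score) and all names are assumptions made for illustration; the source does not describe how OpenJarvis actually decides when to escalate.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description used only for this sketch."""

    prompt: str
    est_tokens: int = 0       # rough size of the context the task needs
    difficulty: float = 0.0   # 0.0 (trivial) to 1.0 (hard), however you estimate it


def choose_engine(
    task: Task,
    local_context_limit: int = 8192,
    difficulty_threshold: float = 0.8,
    allow_cloud: bool = True,
) -> str:
    """Return "local" unless the task exceeds local limits and cloud is allowed."""
    exceeds_local = (
        task.est_tokens > local_context_limit
        or task.difficulty > difficulty_threshold
    )
    if exceeds_local and allow_cloud:
        return "cloud"
    return "local"
```

Setting `allow_cloud=False` keeps every task local, which is the behavior you would want for a setup that must never send data off the machine.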

The release notes claim this routing pattern can reduce energy, compute, and cost by 60–80% compared with a batched cloud baseline. Treat that as a self-reported figure tied to a specific baseline, not a universal result: your savings depend on your workload, your hardware, and which tasks you let escalate to the cloud. Local-first routing *can* cut cost substantially for the right mix of tasks, and OpenJarvis is built to make that routing a default rather than a manual decision.
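The dependence on workload mix is simple arithmetic. The sketch below computes a blended per-task cost from the share of tasks handled locally; the function names and every number in the usage note are made up for illustration and are not measurements from OpenJarvis or anyone else.

```python
def blended_cost(local_share: float, local_cost: float, cloud_cost: float) -> float:
    """Average per-task cost when `local_share` of tasks run locally."""
    if not 0.0 <= local_share <= 1.0:
        raise ValueError("local_share must be between 0 and 1")
    return local_share * local_cost + (1.0 - local_share) * cloud_cost


def savings_vs_cloud_only(local_share: float, local_cost: float, cloud_cost: float) -> float:
    """Fractional saving relative to sending every task to the cloud."""
    return 1.0 - blended_cost(local_share, local_cost, cloud_cost) / cloud_cost
```

With a hypothetical local cost of 0.1 units and a cloud cost of 1.0 unit per task, keeping 80% of tasks local saves about 72%, while keeping only 50% local saves about 45%. The same framework can therefore produce very different savings depending on how often tasks escalate.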

This is also a useful pattern to understand on its own. If you are designing agent systems, the local-cloud split is a concrete example of the routing and agent memory patterns that decide whether an agent is cheap and fast or slow and expensive.

Who should try it now

OpenJarvis is a good fit if you:

Already use Ollama and want personal agents on top of it.
Care about keeping data on your own hardware by default.
Want a framework to build on, not a finished consumer app, and are comfortable with Python and the command line.

It is less of a fit if you want a polished, support-backed product today. This is a v1.0 from research labs, weeks old in public form, and you should expect rough edges.

Caveats worth knowing

A few things to keep in mind before you rely on it:

"Local-first" does not mean "cloud-free." v1.0 ships five cloud engines and supports hybrid local-cloud routing. If you need a fully air-gapped setup, configure it to use only local engines and verify nothing escalates.
Personal-agent presets touch personal data. Morning briefings, research over your files, and similar agents need access to documents and potentially calendar or email connectors. Review what each preset reads and where any cloud-escalated request sends that data before pointing it at sensitive accounts.
Hardware sets the ceiling. Local agents are only as capable as the models your machine can run. A morning briefing on a small quantized model is fine; deep research with a large model wants real VRAM. If you need more headroom than your machine has, you can rent a GPU by the hour from Vast.ai, or shop GPUs such as the RTX 4090 and RTX 5090 on Amazon. For tuning the model layer, NVIDIA's CUDA developer docs are the canonical reference.
Benchmark and efficiency claims are self-reported. The 60–80% figure comes from OpenJarvis and Intelligence Per Watt (IPW) materials against a stated baseline. The launch materials include no independent third-party benchmarks, so verify on your own workloads.
Stanford labs, not Stanford the product owner. OpenJarvis is built by Stanford's Hazy Research and Scaling Intelligence labs as research. Do not read that as Stanford University officially shipping or supporting a consumer product.
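For the air-gapped case in the first caveat, one quick sanity check is to confirm that every engine endpoint you have configured points at your own machine. The sketch below assumes you can collect those endpoints into a plain name-to-URL mapping; that mapping is hypothetical and does not reflect OpenJarvis's actual configuration format.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_endpoint(url: str) -> bool:
    """True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is some other hostname.
        return False


def non_local_endpoints(endpoints: dict[str, str]) -> list[str]:
    """Names of configured engines whose endpoint is not on this machine."""
    return sorted(
        name for name, url in endpoints.items() if not is_loopback_endpoint(url)
    )
```

A static check like this only covers the endpoints you list. To verify that nothing escalates at runtime, you would still need to watch outbound network traffic or block it at the firewall.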

*Disclosure: This article contains affiliate links. ToolHalla may earn a commission at no extra cost to you. We only link to hardware and services that are genuinely useful for the topic.*

Frequently asked questions

What is OpenJarvis?

OpenJarvis is an open-source framework for personal AI agents that run on your own hardware. It is built by Stanford's Hazy Research and Scaling Intelligence labs as part of their Intelligence Per Watt research into efficient local AI, and v1.0.0 was released on May 16, 2026.

Does OpenJarvis run with Ollama?

Yes. Ollama announced built-in OpenJarvis support on May 28, 2026, and Ollama is one of the four local engines listed in the v1.0.0 release. You install OpenJarvis with the official install.sh script and run it with the jarvis command.

Is OpenJarvis open source?

Yes. The GitHub repository is licensed under Apache-2.0 and is written in Python.

Which local engines does OpenJarvis support?

The v1.0.0 release lists four local engines: Ollama, vLLM, SGLang, and llama.cpp.

Is OpenJarvis only local, or can it use cloud models?

It is local-first, not local-only. v1.0 also supports five cloud engines — OpenAI, Anthropic, Google Gemini, OpenRouter, and MiniMax — and can route between local and cloud models depending on the task.

Sources

Ollama blog: OpenJarvis: a local-first personal AI is now available to run with Ollama (May 28, 2026)
GitHub: open-jarvis/OpenJarvis
GitHub release: OpenJarvis v1.0.0 (May 16, 2026)
Ollama announcement post on X
