9 Best Local/Offline LLMs You Can Try Right Now

With quantum LLMs now available on HuggingFace and AI ecosystems like H20, Text Gen, and GPT4All allowing you to load LLM weights on your computer, you now have an option for free, flexible, and secure AI. Here are the 9 best local/offline LLMs you can try right now!

Table of Contents

1. Hermes 2 Pro GPTQ

9 Best Local/Offline LLMs You Can Try Right Now

Hermes 2 Pro is a state-of-the-art language model fine-tuned by Nous Research. It uses an updated and compact version of the OpenHermes 2.5 dataset, along with the newly introduced Function Calling and JSON datasets developed by the company. The model is based on the Mistral 7B architecture and has been trained on 1,000,000 instructions/conversations of GPT-4 quality or better, mostly synthetic data.

Model

Hermes 2 Pro GPTQ

Model size

7.26 GB

Parameters

7 billion

Quantization

4-bit

Type

Mistral

License

Apache 2.0

The Hermes 2 Pro on the Mistral 7B is the new flagship Hermes 7B model, offering improved performance across a variety of benchmarks, including AGIEval, BigBench Reasoning, GPT4All, and TruthfulQA. Its enhanced capabilities make it suitable for a wide range of natural language processing (NLP) tasks, such as code generation, content creation, and conversational AI applications.

2. Zephyr 7B Beta

Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-Beta is the second model in the series, fine-tuned from Mistral-7B-v0.1 using Direct Preference Optimization (DPO) on a mix of publicly available synthetic datasets.

Model

Zephyr 7B Beta

Model size

7.26 GB

Parameters

7 billion

Quantization

4-bit

Type

Mistral

License

Apache 2.0

By removing the built-in alignment of the training datasets, Zephyr-7B-Beta demonstrates improved performance on benchmarks like MT-Bench, increasing its usefulness for a variety of tasks. However, this adjustment can lead to problematic text generation when prompted in certain ways.

3. Falcon Instruct GPTQ

9 Best Local/Offline LLMs You Can Try Right Now

This quantized version of Falcon is based on a decoder-only architecture fine-tuned on TII's raw Falcon-7b model. The base Falcon model is trained using 1.5 trillion outstanding tokens sourced from the public Internet. As an Apache 2-licensed, command-based decoder-only model, Falcon Instruct is perfect for small businesses looking for a model to use for language translation and data ingestion.

Model

Falcon-7B-Instruct

Model size

7.58 GB

Parameters

7 billion

Quantization

4-bit

Type

Falcon

License

Apache 2.0

However, this version of Falcon is not ideal for fine-tuning and is only intended for inference. If you want to fine-tune Falcon, you will need to use the raw model, which may require access to enterprise-grade training hardware like NVIDIA DGX or AMD Instinct AI Accelerators.

4. GPT4ALL-J Groovy

GPT4All-J Groovy is a decoder-only model tuned by Nomic AI and licensed under Apache 2.0. GPT4ALL-J Groovy is based on the original GPT-J model, which is known to be great at generating text from prompts. GPT4ALL-J Groovy has been tuned into a conversational model, which is great for fast and creative text generation applications. This makes GPT4All-J Groovy ideal for content creators in assisting them with their writing and composition, whether it is poetry, music, or stories.

Model

GPT4ALL-J Groovy

Model size

3.53 GB

Parameters

7 billion

Quantization

4-bit

Type

GPT-J

License

Apache 2.0

Unfortunately, the baseline GPT-J model was trained on an English-only dataset, which means that even this fine-tuned GPT4ALL-J model can only converse and perform text generation applications in English.

5. DeepSeek Coder V2 Instruct

9 Best Local/Offline LLMs You Can Try Right Now

DeepSeek Coder V2 is an advanced language model that enhances programming and mathematical reasoning. DeepSeek Coder V2 supports multiple programming languages ​​and provides extended context length, making it a versatile tool for developers.

Model

DeepSeek Coder V2 Instruct

Model size

13 GB

Parameters

33 billion

Quantization

4-bit

Type

DeepSeek

License

Apache 2.0

Compared to its predecessor, DeepSeek Coder V2 shows significant improvements in coding, reasoning, and general performance. It expands support for programming languages ​​from 86 to 338 and extends the context length from 16K to 128K tokens. In benchmarks, it outperforms models such as GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro in cryptographic and mathematical benchmarks.

6. Mixtral-8x7B

9 Best Local/Offline LLMs You Can Try Right Now

Mixtral-8x7B is a mixture of expert (MoE) model developed by Mistral AI. It has 8 experts per MLP, totaling 45 billion parameters. However, only two experts are activated per token during inference, making it computationally efficient, with speed and cost comparable to a 12 billion parameter model.

Model

Mixtral-8x7B

Model size

12 GB

Parameters

45 billion (8 experts)

Quantization

4-bit

Type

Mistral MoE

License

Apache 2.0

Mixtral supports context lengths of 32k tokens and outperforms Llama 2 by 70B on most benchmarks, matching or exceeding GPT-3.5 performance. It is fluent in multiple languages, including English, French, German, Spanish, and Italian, making it a versatile choice for a variety of NLP tasks.

7. Wizard Vicuna Uncensored-GPTQ

Wizard-Vicuna GPTQ is the quantum version of Wizard Vicuna based on the LlaMA model. Unlike most LLMs released to the public, Wizard-Vicuna is an uncensored model with de-linking. This means that the model does not have the same safety and ethical standards as most other models.

Model

Wizard-Vicuna-30B-Uncensored-GPTQ

Model size

16.94 GB

Parameters

30 billion

Quantization

4-bit

Type

LlaMA

License

GPL 3

While it can pose a problem to control AI alignment, having an uncensored LLM also brings out the best in the model by allowing it to respond without any constraints. This also allows users to add their own custom alignment to how the AI ​​should act or respond based on a given prompt.

8. Orca Mini-GPTQ

9 Best Local/Offline LLMs You Can Try Right Now

Looking to test a model trained using a unique learning approach? Orca Mini is an informal implementation of Microsoft’s Orca research papers. The model is trained using a teacher-student learning approach, where the dataset is filled with explanations rather than just prompts and feedback. This should theoretically make the student smarter, as the model can understand the problem rather than just look for input and output pairs as a typical LLM would.

9. Llama 2 13B Chat GPTQ

9 Best Local/Offline LLMs You Can Try Right Now

Llama 2 is the successor to the original Llama LLM, offering improved performance and flexibility. The 13B Chat GPTQ variant is tuned for conversational AI applications optimized for English dialogue.

Some of the models listed above come in multiple spec versions. Generally, higher spec versions will produce better results but require more powerful hardware, while lower spec versions will produce lower quality results but can run on lower-end hardware. If you’re not sure whether your PC can run a model, try the lower spec version first, then move on until you feel the performance drop is no longer acceptable.

Leave a Comment

Solving Microsoft Teams Shortcut Error Not Opening

Solving Microsoft Teams Shortcut Error Not Opening

Tired of Microsoft Teams shortcut error preventing you from opening the app? Follow our expert, step-by-step guide with the latest fixes for instant resolution. Works on Windows, Mac & web – no tech skills needed!

Solving Microsoft Teams Task Management Sync Error

Solving Microsoft Teams Task Management Sync Error

Tired of Microsoft Teams Task Management Sync Error halting your workflow? Follow our proven, step-by-step fixes to resolve sync issues fast and restore seamless task collaboration. No tech expertise needed!

Troubleshooting Microsoft Teams Wiki Error Formatting

Troubleshooting Microsoft Teams Wiki Error Formatting

Struggling with Microsoft Teams Wiki Error Formatting? This step-by-step guide reveals proven fixes for common wiki tab issues, ensuring smooth editing and collaboration in Teams. Get back to productive wikis fast!

How to Fix Microsoft Teams Installation Error for Linux

How to Fix Microsoft Teams Installation Error for Linux

Struggling with Microsoft Teams installation error on Linux? Discover step-by-step fixes for Ubuntu, Fedora & more. Resolve dependency issues, crashes, and errors quickly with our ultimate guide. Get Teams running smoothly today!

Solving Microsoft Teams Error Page Not Loading

Solving Microsoft Teams Error Page Not Loading

Struggling with Microsoft Teams "Error Page" not loading? Get step-by-step fixes for desktop, web, and mobile. Solve Microsoft Teams Error Page issues quickly and resume seamless teamwork today.

Solving Microsoft Teams Error Screenshot Issues

Solving Microsoft Teams Error Screenshot Issues

Tired of Microsoft Teams "Error Screenshot" blocking your workflow? Get proven, step-by-step solutions to resolve screenshot errors in Teams instantly and boost productivity. No tech skills needed!

How to Fix Microsoft Teams Error U User

How to Fix Microsoft Teams Error U User

Tired of Microsoft Teams "Error U" User blocking your chats? Get proven, step-by-step fixes to clear cache, reset, and restore seamless collaboration instantly.

Where are Microsoft Teams Registry Keys Located on Windows 11?

Where are Microsoft Teams Registry Keys Located on Windows 11?

Unlock the precise locations of Microsoft Teams registry keys on Windows 11. Step-by-step guide to find, access, and safely tweak them for optimal performance and troubleshooting. Essential for IT pros and Teams enthusiasts.

How to Fix Microsoft Teams Training Error Video Lag

How to Fix Microsoft Teams Training Error Video Lag

Tired of Microsoft Teams "Training Error" Video Lag ruining your meetings? Follow our step-by-step guide with the latest fixes for smooth video calls—no more frustration!

How to Fix Microsoft Teams Error occurred During Sign-In

How to Fix Microsoft Teams Error occurred During Sign-In

Frustrated by Microsoft Teams "Error occurred" during sign-in? Discover proven, step-by-step fixes to get back online fast. Clear cache, reset app, and more for seamless login. Works on Windows, Mac & web.

Troubleshooting Microsoft Teams Call Error Disconnecting

Troubleshooting Microsoft Teams Call Error Disconnecting

Struggling with Microsoft Teams "Call Error" Disconnecting? Discover proven, step-by-step fixes for call drops, network issues, and more. Get back to seamless meetings fast with our expert troubleshooting guide.

How to Fix Microsoft Teams Error 2603: Login Loop Fix

How to Fix Microsoft Teams Error 2603: Login Loop Fix

Stuck in Microsoft Teams Error 2603 login loop? Get proven step-by-step fixes to resolve the issue fast and restore seamless teamwork. Clear cache, reset, reinstall – all covered!

How to Fix Microsoft Teams Voice Call Error

How to Fix Microsoft Teams Voice Call Error

Tired of Microsoft Teams voice call errors? Follow our step-by-step guide to troubleshoot no audio, echo, or call drops. Proven fixes for crystal-clear calls every time.

How to Fix Microsoft Teams Workflows Error Power Automate

How to Fix Microsoft Teams Workflows Error Power Automate

Struggling with Microsoft Teams "Workflows Error" in Power Automate? Discover proven, step-by-step fixes to resolve connection issues, permissions, and more. Get your workflows running smoothly today!

How to Fix Microsoft Teams Reddit Popular Error Fixes

How to Fix Microsoft Teams Reddit Popular Error Fixes

Tired of Microsoft Teams crashes, black screens, and sign-in loops? Unlock Reddit's most popular, proven fixes to resolve common Microsoft Teams errors quickly and get back to productive meetings. Step-by-step guides inside!