Run your own LLM locally, like ChatGPT or Claude

I'm looking to get a Mac mini Pro M5 with 24GB or 48GB of RAM and another Mac mini M5 with 16GB, and join them together as a cluster using the exo software to power a decent-sized local LLM. They will probably come out in the summer. I'm running agentic software called OpenClaw, and if I end up running autonomous agents throughout the night, API calls to frontier LLM models through OpenRouter will end up costing me a lot of money. By running the models locally, I cut the marginal cost of intelligence down to zero. Just a fixed one-off cost.
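The fixed-cost argument has a simple break-even point you can sketch before buying anything. The hardware price and per-token API price below are made-up illustration numbers, not quotes:

```python
def break_even_mtok(hardware_cost_usd: float, api_price_per_mtok_usd: float) -> float:
    """Millions of tokens you'd need to generate via the API before the
    local hardware's one-off cost pays for itself (ignoring electricity)."""
    return hardware_cost_usd / api_price_per_mtok_usd

# Hypothetical numbers: $1,600 for two Mac minis, $3 per million output tokens.
mtok = break_even_mtok(1600, 3.0)
print(f"Break-even at ~{mtok:.0f}M tokens")  # ~533M tokens
```

Whether overnight agents actually burn through hundreds of millions of tokens is the real question; the formula just makes the trade-off concrete.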
Hi Jon! :) Here are my 2 cents! :-)

Not really... it depends on your use. First, check whether a local model can even handle your use case (test it on OpenRouter first). Second, your volume: if usage is light, OpenRouter plus cheap or free models like Step3.5 is really awesome (far better than any local model for general/coding usage). If your usage is heavy, the best and cheapest way is Claude, and if you need to integrate it, just use the Claude Code CLI with the subscription auth token. That is much cheaper than, for example, Step3.5 if you use it a lot (and much better too).
Local is awesome for many things (translating, TTS, STT, coding as a helper, audio generation in general, vision, especially with models fine-tuned for your needs, etc.), but it really can't compare with any BIG model on heavy tasks ("ecosystem" I would say, more than "model").
Some models don't work too badly locally for coding; I like Nemotron, for example. But still... you can't let it write 5 lines without reading them, and if you want it more "intelligent" you are going to waste a ton of time on fine-tuning (not just compute time... your time!), implementing RAG, ChromaDB, prompt engineering until you die, etc. If you'll accept some advice... this is a bad moment to invest in that sort of thing. If you still want a much better local LLM approach, try claudish with LM Studio or ollama. That will cut by 10x the time you need to set up things like RAG or ChromaDB, and you will get much closer to a "free Claude Code" experience (still nothing comparable in terms of accuracy or speed). LM Studio alone is like a toy, but it's really good for developing other things, testing while fine-tuning, etc.
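One thing that makes LM Studio and ollama easy to slot in: both expose an OpenAI-compatible HTTP endpoint, so the request body is identical to what you'd send OpenRouter and only the URL and model name change. A minimal sketch (the ports shown are the documented defaults; the model name is an assumption, use whatever you have loaded):

```python
import json

def chat_payload(model: str, prompt: str) -> str:
    """Build an OpenAI-style chat-completion request body. The same JSON
    works against OpenRouter, LM Studio, or ollama's /v1 endpoint."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

# Only the URL changes between hosted and local:
HOSTED_URL = "https://openrouter.ai/api/v1/chat/completions"
LOCAL_LMSTUDIO = "http://localhost:1234/v1/chat/completions"   # LM Studio default port
LOCAL_OLLAMA = "http://localhost:11434/v1/chat/completions"    # ollama default port

# "qwen2.5-coder" is just an example model name.
body = chat_payload("qwen2.5-coder", "Explain RAG in one sentence.")
```

POST that body with any HTTP client; local servers generally ignore the API key, so agent frameworks that speak the OpenAI protocol can be pointed at them with a base-URL change.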
 
