✍️ UberChatMaster Story (with narrator colour)

Uncle Gizmo (Nifty Access Guy, Staff member)


I have been constructing an interface I call the Uber Chat Master ("UberChatMaster"). You can see it in this video I posted on the forum a month ago:


(Narrator’s aside: Picture a slightly mad scientist wiring up a dozen AI brains to a single giant switch. That’s basically what Tony’s built here — a polite monster that answers your questions with whatever brain it thinks best.)

I've been adding more and more large language models to its compendium/stable/list.
Here is the complete list, grouped by vendor:

OpenAI (2 models)
  • gpt-4
  • gpt-4o
Ollama (5 models)
  • mistral:latest
  • phi3:mini
  • llama3:8b
  • llama3.1:8b
  • qwen2.5-coder:1.5b-base
MistralAI (8 models)
  • ministral3B
  • Mistral7b
  • MistralNemo
  • MistralSmall
  • MistralMedium
  • MistralL2
  • Pixstral
  • Codestral
xAI (2 models)
  • Grok 3
  • SuperGrok
Google AI (2 models)
  • Gemini 2.5 Pro
  • Gemini 2.5 Flash
Anthropic (3 models)
  • claude-3-opus-20240229
  • claude-3-5-sonnet-20240620
  • claude-3-haiku-20240307
(Author’s note: If that list looks excessive… that’s because it absolutely is. Who needs one brain when you can rent twenty?)
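For the curious, here is a minimal sketch of one way a stable like that can be kept track of. It's Python purely for illustration (the real UberChatMaster front end may well be built quite differently); the table layout, column names and the `uberchatmaster.db` filename are my own assumptions, and only the model names come from the list above.

```python
import sqlite3

# Hypothetical registry: vendor -> model identifiers, copied from the list above.
MODEL_STABLE = {
    "OpenAI":    ["gpt-4", "gpt-4o"],
    "Ollama":    ["mistral:latest", "phi3:mini", "llama3:8b",
                  "llama3.1:8b", "qwen2.5-coder:1.5b-base"],
    "MistralAI": ["ministral3B", "Mistral7b", "MistralNemo", "MistralSmall",
                  "MistralMedium", "MistralL2", "Pixstral", "Codestral"],
    "xAI":       ["Grok 3", "SuperGrok"],
    "Google AI": ["Gemini 2.5 Pro", "Gemini 2.5 Flash"],
    "Anthropic": ["claude-3-opus-20240229", "claude-3-5-sonnet-20240620",
                  "claude-3-haiku-20240307"],
}

def seed_registry(db_path="uberchatmaster.db"):
    """Create a simple models table and load the whole stable into it."""
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS models (
                       vendor   TEXT,
                       model    TEXT,
                       is_local INTEGER)""")
    con.execute("DELETE FROM models")  # keep re-seeding idempotent
    for vendor, models in MODEL_STABLE.items():
        for name in models:
            con.execute("INSERT INTO models VALUES (?, ?, ?)",
                        (vendor, name, 1 if vendor == "Ollama" else 0))
    con.commit()
    con.close()
```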



Most of them are accessed through APIs over the internet, except for the five Ollama models, which reside on my PC's hard drive.

Why? I can hear you thinking! I ask myself the same question!!!!
(Narrator’s grin: In reality, local models save on API bills, plus it just feels more cyberpunk to have a few AI brains lurking right on your desktop, spinning fans and all.)
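To make that local/remote split concrete, here is a small sketch of how one of those local Ollama models can be queried over Ollama's standard local HTTP endpoint (it listens on port 11434 by default). The helper name and the example prompt are mine, not lifted from the UberChatMaster code.

```python
import json
import urllib.request

def ask_local_ollama(prompt, model="mistral:latest",
                     url="http://localhost:11434/api/generate"):
    """Send a prompt to a locally running Ollama model and return its reply.

    No internet connection or API key needed - the model runs on this PC.
    """
    payload = json.dumps({"model": model,
                          "prompt": prompt,
                          "stream": False}).encode("utf-8")
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example use (assumes Ollama is running and mistral:latest has been pulled):
# print(ask_local_ollama("Summarise what a SQLite table is in one sentence."))
```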



The next thing I want to do is use the local Mistral model (mistral:latest) to read the user's question and pick which large language model should answer it, based on each model's abilities stored in a SQLite table. The main reason for using a local model for this step is to reduce costs.

The problem with a local model on a low-power machine like mine is that it takes time. Hitting a model through an API is often faster, but it costs money!
(Footnote from the peanut gallery: Nothing like choosing between spending cash or spending seconds of your life waiting for a CPU that’s older than your last DIY project to crank out an answer.)
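Here is a rough sketch of what that routing step could look like: the local model is shown a menu of the available models and their abilities (read from a SQLite table) and asked to name the best one. The `model_abilities` table, its columns and the fallback choice are assumptions of mine; the `ask` parameter stands in for any function that sends a prompt to the local model, such as the `ask_local_ollama` helper sketched earlier.

```python
import sqlite3

def pick_model(question, ask, db_path="uberchatmaster.db"):
    """Ask the local model which registered model should answer the question.

    `ask` is any function that takes a prompt string and returns the local
    model's text reply. The table name and columns below are assumptions.
    """
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT model, abilities FROM model_abilities").fetchall()
    con.close()

    menu = "\n".join(f"- {model}: {abilities}" for model, abilities in rows)
    prompt = (
        "You are a router. Given the question below, reply with the single "
        "model name from this list that is best suited to answer it, "
        "and nothing else.\n\n"
        f"Models:\n{menu}\n\nQuestion: {question}\nBest model:"
    )
    choice = ask(prompt).strip()

    # Fall back to a sensible default if the local model replies with
    # something that is not actually in the list.
    valid = {model for model, _ in rows}
    return choice if choice in valid else "gpt-4o"
```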



Once I’ve got that idea sorted and working (or scrapped), the next thing I want to do is ask two models the same question and then analyse the answers for inconsistencies.

(Sidebar: That’s called “catching AI hallucinations.” Or in less polite company: finding out who’s full of it.)
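A bare-bones version of that idea might look like the sketch below: put the same question to two models, then hand both answers to a third call and ask it to flag disagreements. The function names are placeholders; any two models from the stable could fill the roles.

```python
def cross_check(question, ask_a, ask_b, ask_judge):
    """Ask two models the same question, then ask a judge to flag inconsistencies.

    ask_a, ask_b and ask_judge are functions that take a prompt string and
    return a text reply (placeholders for any models in the stable).
    """
    answer_a = ask_a(question)
    answer_b = ask_b(question)

    judge_prompt = (
        "Two assistants answered the same question. List any factual points "
        "where they disagree. If they fully agree, reply 'CONSISTENT'.\n\n"
        f"Question: {question}\n\n"
        f"Answer A:\n{answer_a}\n\n"
        f"Answer B:\n{answer_b}"
    )
    return answer_a, answer_b, ask_judge(judge_prompt)
```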



All of my conversations with the large language models are stored in a SQLite database, so it is gradually accumulating a historical record of everything I've asked and been told. I also have possibly 10 years' worth of assorted snippets of information stored on my computer, which I'm going to try to add to this database.
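For a sense of what that history might look like on disk, here is a guess at a minimal logging routine. The actual UberChatMaster schema isn't shown in the post, so the table name, columns and the `source` field (meant to separate live chat from imported snippets) are all assumptions.

```python
import sqlite3
from datetime import datetime, timezone

def log_exchange(db_path, model, question, answer, source="chat"):
    """Record one question/answer pair in the conversation history.

    `source` could distinguish live chat from imported material such as
    those ten years of accumulated snippets.
    """
    con = sqlite3.connect(db_path)
    con.execute("""CREATE TABLE IF NOT EXISTS conversations (
                       id       INTEGER PRIMARY KEY AUTOINCREMENT,
                       when_utc TEXT,
                       model    TEXT,
                       source   TEXT,
                       question TEXT,
                       answer   TEXT)""")
    con.execute(
        "INSERT INTO conversations (when_utc, model, source, question, answer) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), model, source, question, answer))
    con.commit()
    con.close()
```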

I’m thinking of something like an avatar that responds in a way similar to myself — a digital clone of myself which, hopefully, I shall be able to set loose on the world to do my bidding!!!
(Narrator winks ominously: Yes, dear reader, that’s how Skynet starts. But also how we get excellent personal assistants. Let’s hope it’s the latter.)



Finally, I'm looking for testers, probably in about a month, maybe sooner. The new Gemini command-line interface (Gemini CLI) is just genius level at coding. I mean, I've added most of those large language models in the last few days; most of the rest of the month was spent scratching my head trying to work out why it wouldn't work. Along came the Gemini CLI, and it put it all right.
(Stage whisper: If you’ve ever seen an AI out-code a tired programmer at 3am, it’s both a miracle and slightly unsettling. But hey, fewer bugs for all of us.)
 
