

Looks interesting. Will give it a whirl on my home server.
This article walks through standing up a local RAG system so people can query an LLM against a large document corpus: https://en.andros.dev/blog/aa31d744/from-zero-to-a-rag-system-successes-and-failures/
Wonder if this, connected to something like that and wrapped in an easy, end-user-friendly script or UI, could be a good combination for a local, domain-specific, grounded knowledge base?

The problem with CAG is not just that it hogs memory; to keep it fresh you also have to keep re-indexing. If the corpus is large and dynamic, it can easily fall out of date and, at runtime, blow out the context window.
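To make the context-window point concrete, here's a rough sketch of the failure mode: CAG-style preloading only works while the whole corpus fits in the prompt. The token heuristic, window size, and reserved output budget below are illustrative assumptions, not measurements of any particular model.

```python
# Sketch of the CAG failure mode: preloading the whole corpus into the
# prompt only works while it fits the model's context window.
# All numbers here are illustrative assumptions.

def approx_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English prose.
    return len(text) // 4

def fits_in_context(corpus: list[str], context_window: int = 128_000,
                    reserved_for_output: int = 4_096) -> bool:
    # Leave room for the model's answer; the rest is the corpus budget.
    budget = context_window - reserved_for_output
    return sum(approx_tokens(doc) for doc in corpus) <= budget

small = ["short doc " * 100] * 10          # ~2.5k tokens total: fits
large = ["long document " * 10_000] * 50   # ~1.75M tokens: blows the window

print(fits_in_context(small))  # True
print(fits_in_context(large))  # False
```

A dynamic corpus makes this worse: every refresh changes the totals, so the check has to be redone on each update.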
GraphRAG has some promise. NVIDIA has a playbook for converting text into a knowledge graph: https://build.nvidia.com/spark/txt2kg
It'll probably have the same issues with re-indexing, but that will be a common problem until someone comes up with better incremental training/indexing.
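One simple form of incremental indexing is change detection by content hash: only re-embed documents whose text actually changed, instead of rebuilding the whole index. A minimal sketch, where `embed()` is a hypothetical stand-in for whatever embedding model the pipeline uses:

```python
# Hedged sketch of incremental re-indexing: hash each document and only
# re-embed the ones whose content changed since the last pass.
import hashlib

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def embed(text: str) -> list[float]:
    # Placeholder for a real embedding call (e.g. a sentence-transformer).
    return [float(len(text))]

def incremental_reindex(corpus: dict[str, str],
                        index: dict[str, tuple[str, list[float]]]) -> int:
    """Update the index in place; return how many docs were (re)embedded."""
    updated = 0
    for doc_id, text in corpus.items():
        h = content_hash(text)
        if doc_id not in index or index[doc_id][0] != h:
            index[doc_id] = (h, embed(text))
            updated += 1
    # Drop documents that no longer exist in the corpus.
    for doc_id in list(index):
        if doc_id not in corpus:
            del index[doc_id]
    return updated

index: dict[str, tuple[str, list[float]]] = {}
print(incremental_reindex({"a": "hello", "b": "world"}, index))   # 2
print(incremental_reindex({"a": "hello", "b": "world!"}, index))  # 1
```

This only covers plain vector indexes; for a knowledge graph the equivalent step (merging changed triples without a full rebuild) is harder, which is presumably why re-indexing stays a common problem.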