Intro to retrieval-augmented generation (RAG)

The appearance of a new name is an interesting signal, and I am here to introduce you to the term #RetrievalAugmentedGeneration (#RAG).

This is notably something I have discussed before without a name - you might remember either a widely circulated story on a Morgan Stanley internal #LargeLanguageModel application or my video on it - and the arrival of a new name is often a harbinger of an emerging consensus best practice. RAG gives you functionality as if you were using an #LLM for expert knowledge, but it also gives you control over the hallucination problem and the provenance of information in general. It is a natural fit for the nearly universal big-business problem that documentation of how to run the corporation quickly accumulates beyond what readers can usefully sift through.
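For the hands-on readers, here is a minimal sketch of the RAG pattern in Python. To be clear, everything in it is illustrative - the toy document store, the embedding and chat model names, and the prompt wording are my stand-ins, not any particular vendor's architecture - but the shape is the point: retrieve text from trusted sources first, then have the model write a synthesis grounded in what was retrieved, citing where it came from.

```python
# Minimal RAG sketch: retrieve from trusted sources, then synthesize.
# The corpus, model names, and prompt are illustrative stand-ins.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A toy "trusted source" store: (document id, text) pairs.
corpus = [
    ("policy-001", "Expense reports must be filed within 30 days of purchase."),
    ("policy-002", "Remote work arrangements require written manager approval."),
]

def embed(texts):
    """Turn strings into vectors so we can rank documents by similarity."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed([text for _, text in corpus])

def answer(question: str) -> str:
    # Retrieve: cosine similarity between the question and each document.
    q = embed([question])[0]
    sims = doc_vectors @ q / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q)
    )
    doc_id, text = corpus[int(np.argmax(sims))]

    # Generate: the model synthesizes from the retrieved text only, which
    # is exactly what gives you a handle on hallucination and provenance.
    prompt = (
        f"Answer using only the source below, and cite its id.\n"
        f"Source [{doc_id}]: {text}\n\nQuestion: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(answer("How long do I have to file an expense report?"))
```

In a real deployment the corpus would live in a vector database and retrieval would return several passages rather than one, but the division of labor is the same: the trusted documents supply the knowledge, and the #LLM supplies the reading and writing.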

Teaser for later this week: one can almost imagine these technologies opening new frontiers in how a large and decentralized conglomerate could be administered as a sort of #ArtificialIntelligence powered #RhizoCorp.

Analysis: AI and the quest to duct-tape our toys together

#OpenAI recently added some interesting new features, and this video is about reading these tea leaves closely for useful information about your own internal AI initiatives as well as the evolution of the broader marketplace for these tools. Specifically, they opened beta testing on a new #ChatGPT feature for uploading and ingesting PDF files and announced a roadmap for broadening the input and output options in general. In this video, I pursue three lines of analysis...

1) #ArtificialIntelligence is going to need a lot of help with duct-taping itself into other things before it can take your job. The success or failure of these (otherwise very boring) duct-taping operations is what you need to watch.

2) #GenerativeAI is about the only area in which startups have been able to raise money recently, and many of them were working on similar add-on features. They may be in trouble, and some may have difficulty finding niches that keep them clear of this kind of trouble.

3) We already have an aging tradition of stock dialog about how to handle AI inside your corporate organization and I think recent progress has rendered some of it awkwardly half out-of-date. There is a promise that AI can now help resolve data quality issues that previously held you back from doing AI.

Let's turn sorrow to joy. Send me your complaints!

Are things not as they should be? Feel like no one listens? I want to listen!

It's at best a lost opportunity how much we trumpet our real or fabricated successes while concealing our stumbles and failures. A story about a way things can go wrong is actually a pretty useful thing, and we're probably destroying each other's mental health by endlessly shouting at one another about our flawless success. That being said, I know you all have lots of good reasons you can't complain too hard about your work on social media.

So... complain to me! I won't tell a soul unless I hear similar complaints repeatedly, and if I do I will make a video about the pattern I see and not anyone's individual story. We can share useful anti-knowledge, give some hints at our real selves (at least in large, consistent groups), and you won't have to look like a malcontent in front of your boss.

Do you like this idea? Please send me a message or an email. I want to hear from you!

Vague BS is the biggest danger your AI program faces

I've got the envelope here, and the academy says the biggest danger to your AI program is [drumroll]... vague bullshit!

What is AI? That might actually be a profound question, and perhaps it is like the House speaker race in that the difficulty is certainly not lack of candidates but lack of consensus and clarity. If you are blessed with success, it is going to be fueled by specificity on business value, infrastructure, algorithms, talent [coach’s clipboard folds out and the list drops to the floor]...

Emerging technology in the private sector tends to suffer from a code-switching problem - the language you need to excite and persuade is deeply unlike the language you need to build. If you don't see that both codes are there, and don't know how to speak in each, you may be in trouble.

In this video, I elaborate on these ideas and give a couple examples. One is about the old tension regarding what is fancy enough to call AI. The other is new: some of our older commentary on AI is very appropriate to #GenerativeAI in some applications and very inappropriate in others.

#ArtificialIntelligence

?? LLM = Large Libel Model ??

Does #LLM stand for #LargeLanguageModel or Large Libel Model? In any event, you should take note that law bloggers are telling these jokes.

I discuss a specific lawsuit in #Georgia, Walters v. #OpenAI, to provide an example of a #ChatGPT hallucination triggering a credible lawsuit. The specific dynamics of this particular lawsuit also give us an opportunity to drill down on why it matters that #OpenAI is quite unusually incorporated as a for-profit subsidiary of a not-for-profit corporation.

#ArtificialIntelligence

AWS invests in Anthropic vs. Microsoft invests in OpenAI

I cover the basic facts of Amazon/AWS's recently announced $1.25-$4 billion investment in generative AI firm Anthropic, creator of the well-known large language model Claude. More fun: I point out that this is kind of a weird deal that not only mirrors the weird deal between Microsoft and OpenAI but has involved many of the same people over time.

Morgan Stanley seems to have the right idea on generative AI

There is a lot of media circulating about a new internal artificial intelligence tool at Morgan Stanley and I've really liked many of the things I have read. Specifically, what I could glean about the architecture fits well into some of my ongoing narratives about how you maybe don't want to directly use your large language model as an expert, but rather you should have it read information from trusted sources and write a synthesis for you to read in turn. In this video, I go into some detail about why this is the case.

Evaluating the hype on new large language models

Here I discuss new or upcoming large language models, including Technology Innovation Institute's latest Falcon model, Meta's (f.k.a. Facebook's) upcoming iteration on its influential LLaMA models, and Alphabet Inc.'s (f.k.a. Google's) highly but vaguely anticipated new model Gemini. My real aim is to discuss how you might evaluate these different models, how the outcomes track the goals of their makers, the potential value in embracing open source models, and what sort of marketing language prevails in this area when test drives are not available.

In the heat of the moment, I also accidentally provide some arguments about naming and renaming things...

Music, technology, and AI looking backwards and forwards

It is not terribly hard to see that generative AI is likely to put a lot of pressure on the music business. Music, as a business and cultural institution, is actually a great case study looking backwards as it has been disrupted again and again by new technology with consequences that are an accepted part of your everyday life.

Is AI smarter than people? It certainly is when you can't hold people's attention

Are recent, state-of-the-art artificial intelligence algorithms smarter than people? I would say AI is still pretty shabby relative to motivated humans who are paying attention, yet motivating humans to pay attention is a hard problem in itself, one we all spend a lot of time discussing in frustration.

Intellectual property is a freakish modern invention

It's not hard to see that artificial intelligence is going to put a lot of pressure on our existing system of intellectual property law but easy to forget that that system is hardly eternal. I weave together diverse strands like Homer, James Joyce, cryptocurrency, Immanuel Kant, and the human author as a large language model to make a point about how the status quo we are afraid to leave is actually a pretty fleeting and weird moment in history.

Breaking down ChatGPT Enterprise

Let me give you the high-level story on OpenAI’s new enterprise tier for ChatGPT, including...

...the data privacy and cybersecurity concerns they are trying to assuage...

...the legal concerns that are mostly still at large...

...some very nice although not radically different new features...

...and the competitive landscape, including the recent Microsoft Azure offering and the zoo of open source models like Llama 2.

Don't let ChatGPT trade cryptocurrency for you!

I'd like to explain why letting ChatGPT trade cryptocurrency for you is a truly terrible idea, especially because it provides a great prototype for recognizing other terrible ideas...

I was just setting up a Capnion TikTok account, so you can watch there if it is your preferred platform, and while the advice I encountered there was terrible, it provided great ideas for videos. In this particular case, the story is that the cryptocurrency universe is full of propaganda for pump-and-dump scams, and all ChatGPT can realistically be doing is uncritically passing through bad information it scraped off the internet. You are much better off closer to the source, because there you can take a guess at what manner of game the source may be playing.

Porn is the elephant in the room of the AI regulation conversation

Porn is the elephant in the room of the "social consequences and regulation of generative AI" conversation. I feel a little awkward bringing it up myself, but I think it needs doing. There will be huge economic incentive to turn these new tools to these purposes, and then incentive to always go a little farther down one unspeakable road or another. More than in any other area, regulatory inaction here appears to me unacceptable, period, yet this is an area polite conversation likes to avoid.

Why Etsy and artisanal mayonnaise are hints AI will not end the world

If you are anxious about the artificial intelligence apocalypse, this meditation on Etsy and artisanal mayonnaise might help you see some silver lining. In many ways, you already live in someone else's dystopia and carry their fond memories of another time... This is to say that the world changes, and change can be painful, but the world is not going to end.

Zoom and the risk in using sensitive data to train AI

I had to weigh in on the Zoom fracas... If you didn't hear, Zoom updated their policies with a clause permitting them to use your data to train artificial intelligence models. In this video, I summarize the basic facts and give an analysis of what sort of marginal data privacy and cybersecurity risk you pick up when a company like Zoom uses your data for these purposes.

Amazon travel guides and generative AI supply gluts

I've been trying for a minute to warn you about generative-AI-driven supply gluts, and you are going to see more of them. The example du jour is the widely discussed plague of inferior travel guides on Amazon, their legions of bogus positive reviews, and the likely possibility that the sudden volume is only achievable because they are written with the aid of artificial intelligence.

Do angels have faces? A prompt engineering pitfall

Do angels have faces? It depends on whom you ask, and this is a good window into some practical challenges in prompt engineering...

In this video, I walk through an example involving OpenAI's text-to-image generative AI model DALL-E 2 that went badly. I am guessing that I got a blend of the modern folkloric angel, which has a (usually attractive) human face, and the Old Testament angels that more resemble gyroscopes. The point here is that there can be subtlety in surprising places, like whether you would like your angel to have a human face or not, and getting these things right can require knowledge from weird corners.
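If you want to poke at this yourself, here is a small sketch of the remedy: say the contested detail out loud in the prompt. The prompts and model choice below are illustrative assumptions on my part, not the exact ones from the video.

```python
# Sketch: the fix for ambiguous imagery is to state the detail you care
# about explicitly. Prompts and model choice here are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Underspecified: "angel" leaves the model free to blend the folkloric
# human-faced angel with the gyroscope-like Old Testament variety.
vague = "an angel delivering a message"

# Specific: the contested detail (a human face) is stated outright.
specific = (
    "an angel delivering a message, depicted as a winged human figure "
    "with a clearly visible, friendly human face, Renaissance painting style"
)

for prompt in (vague, specific):
    resp = client.images.generate(
        model="dall-e-2", prompt=prompt, n=1, size="1024x1024"
    )
    print(prompt, "->", resp.data[0].url)
```

The lesson generalizes past angels: when a word carries more than one visual tradition, the model will pick for you unless you pick first.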