Two authors have filed suit against OpenAI for allegedly using their novels as training data without permission, deciding to do so after noticing that ChatGPT could summarize those novels in great detail. This lawsuit notably involves copyright protection rather than the other areas of intellectual property law already being litigated. I cover some of the details of the lawsuit, what merits it might or might not have, and how these issues intersect with the realities of trying to train a large language model. As someone who has posted a few videos about OpenAI's legal adventures, though, I think the better takeaway is this: the law around important practices here is unsettled, some of the best and most accessible LLMs are presently more than a little sketchy legally, and this is something to watch and manage if you want to exploit these tools in your business.
OpenAI targeted in $3 billion class-action lawsuit
OpenAI, maker of ChatGPT, and its benefactor Microsoft are now the targets of a $3 billion class-action lawsuit alleging that they broke the law in scraping the web for training data. The merits of the lawsuit are not fully clear to me, but I can say this much: massive scraping is probably necessary for a large language model with ChatGPT's flexibility, various fights about this scraping (this one and others) are already heating up, and web scraping is in fact a legally murky activity.
Update: Ghost PII in iOS
My colleague Jack Phillips and I have an update on our demonstration app built for iOS in Swift. The point, of course, is to show what Ghost PII is about, how it can be used to keep data encrypted everywhere but the user's phone (including while in use in computations on the server), and how easily it layers on top of other technologies. I would say we have already had unique success making this kind of technology accessible and easy to use, and the ultimate goal of this project is a library that delivers the value to iOS developers without requiring them to even know Ghost PII is there.
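To make the "encrypted everywhere but the phone" idea concrete, here is a minimal sketch of the on-device half of the pattern. It uses Apple's CryptoKit as a stand-in and invented names; it is not Ghost PII's API, and unlike Ghost PII it does not support computation on the ciphertext server-side.

```swift
import Foundation
import CryptoKit

// Minimal sketch (hypothetical names, CryptoKit stand-in, not the Ghost PII API):
// encrypt a value on the device so only ciphertext ever reaches the server.
struct DeviceSideEncryptor {
    // In a real app the key would live in the Keychain / Secure Enclave.
    private let key = SymmetricKey(size: .bits256)

    func encrypt(_ plaintext: String) throws -> Data {
        let sealed = try AES.GCM.seal(Data(plaintext.utf8), using: key)
        // `combined` packages nonce + ciphertext + auth tag for upload or storage.
        guard let combined = sealed.combined else {
            throw CryptoKitError.incorrectParameterSize
        }
        return combined
    }

    func decrypt(_ ciphertext: Data) throws -> String {
        let box = try AES.GCM.SealedBox(combined: ciphertext)
        let plaintext = try AES.GCM.open(box, using: key)
        return String(decoding: plaintext, as: UTF8.self)
    }
}

// Usage: the plaintext never leaves the phone, only `ciphertext` does.
let encryptor = DeviceSideEncryptor()
let ciphertext = try! encryptor.encrypt("data that stays private")
```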
Mega Breach at MOVEit Managed File Transfer
It felt irresponsible to wait too long to let you know about the HUGE data breach over the weekend involving the MOVEit managed file transfer tool. Short story: it can be hard to send people big datasets, so maybe you pay someone for a tool that does it, and if there is a cybersecurity bug in that tool everybody is really screwed; multiple government agencies are now giving up the juiciest of PII on basically the whole population of Louisiana.
An unusual unionization drive at Alphabet with AI intersections
I couldn't let the recent unionization drive among Alphabet contract workers sneak into the weekend unremarked - it might be a great place to read tea leaves for clues on the future of generative AI and large language models specifically.
These models, thus far, have required significant amounts of skilled human labor to train and refine. For this and other reasons it is a pretty expensive business to be in, and you can guess some things about how many, and how big, the companies selling you these things in the foreseeable future will be.
DeSantis deepfakes Trump! An unfortunate milestone...
It appears the DeSantis campaign has circulated a deepfake (a fake image generated using artificial intelligence techniques) of Trump warmly embracing Fauci (who is very unpopular with GOP base voters) in what looks like a smear for political advantage. This is a notable "first" that I think many of us knew was coming, and we may soon begin to learn a lot about the kinetics of deepfake propaganda from events in the wild. This video discusses the basic facts and what to watch for... and I recommend you watch.
Interest rates as a cultural force
You're about to get tired of Marvel because the Federal Reserve is no longer maintaining its post-2008-crash ZIRP (zero interest rate policy)...
This is (probably) crazy talk. It is true, though, that interest rates influence culture in subtle but real ways, and the recent shift in the macroeconomic climate is likely to create a shift in culture as well. In this video, I talk through some basics of recent monetary policy and the everyday "meaning" of interest rates, and theorize more seriously that people like Elon Musk who are pining for the office hustle culture of a past era may be pining a long time for something that can only grow in a very particular sort of soil.
An AI nightmare at the National Eating Disorders Association
This video covers what one can only call an artificial intelligence chat bot adoption horror story at the National Eating Disorders Association. Spoiler: the help line is a robot and it's giving optimally terrible advice. Hype is determined by the cost-benefit calculus of posting on social media, but the costs and benefits of any given application inside a business may be in a very different balance, and sometimes it's hard to say just what these new models are likely to do in the wild.
Do we lie to ourselves about where innovation comes from?
Historically, innovation looks like a product of chaos and misery ("necessity is the mother of invention"), but when we try to cultivate it today we seem to presume the opposite...
Gossip on the biometric beer stand at Coors Field
You may have seen articles on LinkedIn News last week about Amazon piloting a system by which stadium-goers can pay for beer via a palm print biometric. As it happens, my colleague Jack Phillips was at Coors Field to see the Colorado Rockies over the weekend and here he kindly shares the word on the street about this new payment technology.
In the background here: biometric data like palm prints raises unusual data privacy concerns and is tightly regulated in many states.
Updates on our upcoming iOS tools and future-proofing data sovereignty
We've been working on some powerful iOS developer tools powered by Ghost PII. I open by talking through why you might want to do this: I posted a video yesterday on Meta's headaches moving data from the European Union to the United States, and this sort of technology is a way to avoid that problem by making sure that data is never in plaintext anywhere but on the end user's device.
My colleague Jack Phillips will help you peek a bit under the hood at how all of this works, including showing just what (securely encrypted data) will actually be stored in Firestore (the database we happen to be using in this case) and teasing a bit about the next phase, in which we build some analytics that work on this data directly without need of decryption.
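As a rough illustration of what "only ciphertext in Firestore" looks like, here is a minimal sketch using the Firebase Firestore Swift SDK. The collection and field names, and the idea that the ciphertext arrives as raw bytes from a prior on-device encryption step, are assumptions for illustration and not the demo app's actual code.

```swift
import FirebaseFirestore

// Minimal sketch (hypothetical collection/field names): the stored record
// contains only opaque ciphertext bytes produced on the device beforehand.
func storeEncryptedRecord(emailCiphertext: Data, ageCiphertext: Data) {
    let db = Firestore.firestore()
    db.collection("demo_users").addDocument(data: [
        "email_ct": emailCiphertext,              // opaque bytes to the server
        "age_ct": ageCiphertext,                  // still usable by later encrypted analytics
        "createdAt": FieldValue.serverTimestamp()
    ]) { error in
        if let error = error {
            print("Firestore write failed: \(error)")
        }
    }
}
```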
A record $1.3 billion fine on Meta for EU-US data transfers
The European Data Protection Board announced a record $1.3 billion fine on Meta, parent of Facebook and Instagram, for illegally transferring data on European Union nationals to the United States. In addition to the basic facts, I review the abundant and important backstory for these events, going all the way back to Edward Snowden's surveillance revelations, the history of data privacy frameworks going down in flames following lawsuits by Max Schrems, and the complicated political intersections that drag this all out forever, including the unique role played by regulators in Ireland.
Juicy tidbits from the history and philosophy of AI
The history and philosophy of artificial intelligence has some surprisingly useful nuggets in it that orbit the question: Is human intelligence "real"? What is it?
I talk about the two previous cycles of AI hype and collapse into "AI winter", first in the 1960s and then in the 1980s. In the former, academic philosophers like Hubert Dreyfus brought interesting critiques orbiting issues like our misunderstanding of human consciousness (in a sense still our only available prototype for AI) and how difficult it is to separate what we understand as intelligence in ourselves from the visceral desires that come from our bodies. I would say these criticisms anticipated the use of reinforcement learning, among other things, and they might have more practical value in them than you would think.
Fear marketing in artificial intelligence
Let's peer behind the veil at the marketing aspects of artificial intelligence (notably LLMs like ChatGPT of late) and observe how much of the psychology of fear is in play. Is this something we should snap out of a bit? Imo, yes.
I discuss how the concerns of Geoffrey Hinton, the reputed "Godfather of AI" who left Google to talk about the dangers of AI, actually differ very little from the public statements of top executives at companies like Microsoft and his former employer. What gives? It appears fear is great marketing for AI, and I go into further detail on why the widely discussed potential pause on AI research is probably mostly conscious marketing by the public figures discussing it.
Samsung !bans! generative AI tools like ChatGPT
It was reported yesterday that Samsung has banned internal use of generative AI tools like ChatGPT, and for exactly the kinds of reasons I have been discussing in these videos. They have information they would like to control, and this is difficult if employees send that information to a third party like OpenAI that has yet to prove an ability to protect it from criminals. Privacy-enhancing technologies, perhaps homomorphic encryption and federated learning especially in this example, are the hand-in-glove solution to this problem, and they clean up some of the ethical issues as well.
ChatGPT Data Breach!
OpenAI confirmed reports of a data breach involving ChatGPT today. Apparently criminals were able to access other users' chat histories (and maybe thus glean interesting intel on their employers) as well as the cardholder names, last four digits, and expiration dates of payment cards (more obvious ouch!) for the small minority of users presently paying. If you were excited about using these kinds of tools at work, this is a risk you need a plan to manage. It is going to happen again, to OpenAI or a similar firm with a similar offering, and it's going to be an important part of the story of this technology. We will be putting out more content about events of this nature and best practices for keeping yourself out of related trouble.
An example of where ChatGPT is a good, SAFE idea and why
Sometimes all the theory in the world can't open your mind like one example, and so we've set out to provide tangible examples of how to use, and how not to use, generative AI tools like ChatGPT. As before, my main points of emphasis are:
you need confidence that you have a cheap method of quality control
you need to be sure you aren't telling OpenAI secrets you shouldn't
As I announced yesterday, we are creating a basic iOS app, really to showcase our privacy- (and cybersecurity-) enhancing technology suite Ghost PII, but along the way we want to offer other useful conversation. Here we get started writing basic code for the app in Swift using ChatGPT, and I outline why this is an appropriate application for us relative to my two criteria above; a sketch of the kind of code involved follows below.
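For flavor, here is a minimal sketch (hypothetical names, not the actual demo app's code) of the sort of generic SwiftUI scaffolding it makes sense to let ChatGPT draft: it contains no proprietary logic or secret data, and its quality is cheap to check by reading and running it.

```swift
import SwiftUI

// Generic, easily verifiable boilerplate: a simple form with no business
// secrets, so it satisfies both criteria above (hypothetical example).
struct SignUpView: View {
    @State private var email = ""

    var body: some View {
        Form {
            TextField("Email", text: $email)
                .keyboardType(.emailAddress)
                .textInputAutocapitalization(.never)
            Button("Continue") {
                // Hand off to the encryption/upload logic elsewhere in the app.
                print("Submitted: \(email)")
            }
        }
    }
}
```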
Private iOS apps with Ghost PII, plus generative AI best practices
This quarter we will be releasing a series of posts and videos on building an iOS application using Capnion's tool Ghost PII to add a very high level of privacy (and also cybersecurity) protection throughout the entire stack, so not even the application server will see users' data in plaintext. Along the way, we will be utilizing generative AI tools like ChatGPT where appropriate and also expounding best practices about where their use is appropriate vs. inappropriate and why. While I have you, I couldn't help but comment on some stories about the "Godfather of AI" Geoffrey Hinton leaving Google to free himself to comment on the dangers of artificial intelligence. As usual, my angle here is that we are actually just noticing a problem we have had for a while.
Your boss might not be ready for ChatGPT even if you are...
We are all talking about how useful ChatGPT will be at work, but there is a conflict brewing over whether you should decide to use it at work all on your lonesome.
It might be that there are privacy concerns about the data involved, your questions might suggest things about your employer's intellectual property, or it's just that your sudden dependence on OpenAI's cybersecurity makes someone nervous. You could be an employee using a chat bot now when you really shouldn't be, or you could be an executive sitting on top of a newly opened abyss of cyber risk you haven't yet mapped out well.
In addition to pointing out this rather prosaic issue in detail, I note...
how we have de facto grandfathered older practices around using Google or similar,
the probably surging value of our chat bot queries as a data asset,
and why these concerns actually relate a lot to privacy-enhancing technologies like homomorphic encryption in particular.
Software engineering was already prompt engineering
I have some optimistic thoughts, for the software engineering crowd especially, about what the putative artificial intelligence workplace future created by ChatGPT, or more specifically Copilot, might look like. The story is really just that a new kind of programming language is emerging, not really that qualitatively different from what has come before, and if you have even a little bit of polyglot in you then you are already playing the right game.
I also point out a couple of pitfalls suggested by this analysis - I have met people who make a lot of money as the last person working in an ancient language, and I have had some unpleasant adventures with programming languages that wanted to do too much and were too eager to accept sloppy instructions.
If you are a Python developer interested in new things, I also have an advertisement for you.