Where might AGI come from?
It has been 18 months since ChatGPT blew everyone away! Now it is used for everything from AI girlfriends to churning out industrial quantities of research papers.
Sam Altman has had his own arc during this time: from basking in the launch glory to getting knifed by his “ethically concerned” colleagues, who were themselves later shafted and let go.
Ilya Sutskever, in his parting tweet, now says he is confident that Sam will lead OpenAI to achieve AGI.
The pace of the early days leading up to GPT-4 was so frenetic that we expected GPT-5 to be in beta by now and GPT-6 to be under development.
But what we have seen instead is that competitors have caught up and the game is almost commoditized. We have the “open-weights” Llama 3, the well-regarded Claude 3, and Gemini 1.5 with its million-plus-token context window, which seems behind in version numbers even though it comes from the very folks who invented the transformer (the T in GPT)!
If you want to make sense of LLMs, you need go no further than Andrej Karpathy’s one-hour explainer.
But what many keen watchers have observed is that LLM performance is plateauing. There is no “take-off”! So Ilya’s words seem a bit empty now.
There was a time when adding parameters led to a better model: going from a few million to a few billion parameters could really improve results. But now, data is the bottleneck. The data used to train the LLMs, which is basically all of the open internet plus whatever proprietary data each maker has access to, forms the limit.
The simple explanation for the plateauing is that a model trained on data generated by humans will, at best, perform like an average human. Yes, AI is a bit more indefatigable, but it is also ultimately a word-predicting machine. All such models will average out what they are trained on.
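To make “word-predicting machine” and “averaging out” concrete, here is a minimal sketch, assuming a made-up toy corpus and a crude bigram counter (nothing like a real transformer): after training on human-written text, the model’s best guess is simply the most common continuation it has seen.

```python
from collections import Counter, defaultdict

# Toy "word-predicting machine": count which word follows which in the
# training text, then always predict the most common continuation.
# (Corpus and function names are invented for illustration.)
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat sat on the rug ."
).split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation seen during training."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))   # -> 'cat': the majority ("average") continuation
print(predict_next("sat"))   # -> 'on'
```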
This “average human”-level text output is itself going to lead to awesome stuff and greater productivity and trillions of dollars in global GDP increases. But does it lead to AGI?
LLMs are basically prediction models and prediction models don’t tend to do well beyond the datasets they are trained on.
Yes, you could spend millions labeling data so that cleaner, better-sanitized data is used to train the models, but that still keeps you well within human-level performance.
This was not the case with AlphaGo Zero, which, after a few days of learning by playing against itself, could beat the best human players of Go.
What is the difference? Why can’t LLMs be like this?
AlphaGo Zero had a clear reward function and very clear rules. You could code it up to follow the rules of Go, and the reward is simply the game’s outcome. So reinforcement learning could be employed: the program takes an action and learns how the rewards flowing from that action compare with those of the many other actions it could have taken.
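To make that loop concrete, here is a minimal sketch under invented assumptions: a made-up five-cell “walk to the goal” game, solved with plain tabular Q-learning rather than anything resembling AlphaGo Zero’s actual self-play-plus-search setup. The only feedback is the game’s outcome, and that alone is enough for the program to learn which action beats the alternatives.

```python
import random

# A tiny game with a clear reward: stand on cell 0, reach cell 4 to win.
# The only reward is the game's outcome (1.0 for winning, 0.0 otherwise).
N_STATES = 5
ACTIONS = [-1, +1]                     # step left or step right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Mostly take the best-known action, occasionally explore a random one.
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda a: q[(s, a)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == N_STATES - 1 else 0.0   # reward = game outcome
        # Update the value of this action by comparing against the best alternative.
        best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
        q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
        s = s_next

print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)})
# -> every position prefers +1 (step toward the goal), learned purely from outcomes
```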
There is no easy way to define a reward function for LLMs. What constitutes better performance is very hard to pin down.
For example, in the context of learning, we want fewer hallucinations (made-up model outputs) from LLMs, but in the context of literature we don’t care about that!
This is not to say they are completely useless at learning from rewards. Sometimes scolding them or praising them can nudge them toward a better output (hence a completely new skill, prompt engineering, which amounts to LLM parenting).
But what do “super-human” levels of next-word prediction even mean? If you can’t define it in simple, clear words, you cannot program for it.
You could potentially use an LLM as a brain that makes plans, decides on actions, studies the outcomes, and decides the next steps. This brain will not be capable of full human-like intelligence, as there is still a lot of data that humans process, and knowledge they keep in their heads, that isn’t part of the datasets LLMs are trained on.
As you add more such data it could get better, but a sudden take-off looks less likely.
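A rough sketch of that planning-brain loop, with both the model call and the environment faked for illustration (in practice the planner would be a real LLM behind an API and the environment would be the real world or a simulator):

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; here it just follows a canned plan."""
    if "Next action" in prompt:
        return "check inventory" if "inventory" not in prompt else "restock shelf"
    return "yes" if "restock shelf" in prompt else "no"

class ToyEnvironment:
    """Pretend world in which actions are carried out."""
    def execute(self, action: str) -> str:
        return f"done: {action}"

def run_agent(goal: str, env, max_steps: int = 10):
    history = []
    for _ in range(max_steps):
        # 1. Plan the next action from the goal and the outcomes seen so far.
        action = fake_llm(f"Goal: {goal}\nHistory: {history}\nNext action:")
        # 2. Act in the (here, pretend) world and observe what happened.
        outcome = env.execute(action)
        history.append((action, outcome))
        # 3. Study the outcome and decide whether the goal is met.
        if fake_llm(f"Goal: {goal}\nHistory: {history}\nGoal met?") == "yes":
            break
    return history

print(run_agent("keep the shelf stocked", ToyEnvironment()))
```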
The G in AGI stands for “General”. I think they use this word because it is genuinely hard to define what it means!
For now, let’s just say it is the opposite of “specific”: if you get a decent collection of “specific” intelligences together, you could say you have AGI!
For example, AlphaGo Zero is specific to playing the game of Go.
LLMs may have lots of “general” applications, but they are specifically about next-word prediction.
I think what we mean when we say AGI is something that is more-or-less human in a non-specific type of way.
People tried to use the Turing test as a way to identify the emergence of AGI, but that has been gamed and done already. There are models that pass the Turing test, but no one calls them AGI yet.
When we think of how humans learn, we have to think of babies. Some of the first abilities babies learn are identifying faces and reading facial expressions and emotions.
Toddlers watch people’s faces for reactions to their own actions, to decide whether an action was right or not.
They will drop a bowl of food and then stare at you to see your reaction! Your child learns by looking at your reactions to what they do.
Your reaction is the reward function.
What babies have that AI models today don’t have is agency in the real world.
Babies can do all kinds of stuff. And then they look at you for cues to know whether what they did was good or bad, funny or irritating, useful or messy, and so on.
Toddlers have the full freedom to pick up a knife and stab at stuff! All that stops them is a concerned parent, and then they learn from that parent’s reaction.
They can learn from a few thousand actions and the corresponding reactions rather than petabyte-sized datasets.
One space where AI has some agency in the real world is self-driving cars. Some cities now have Waymo taxis, and Tesla has FSD.
But the reward functions that can help them scale are not really there yet. There are lots of corner cases that have to be specifically managed.
While AI drivers do seem safer than human drivers, other technology like Automated Emergency Braking achieves similar results.
Non-AI machines can brake faster than humans, see in infra-red, crawl into tiny spaces etc. So AGI will involve the ability to use all these “super-human” powers.
But I think AGI will develop when programs have the agency to move around in the real world, take actions in many different fields, and have a collection of reward functions to learn from the outcomes and improve themselves.
Without being in the real world, you cannot have reward functions that will work realistically.
I think the road toward AGI will be paved by all the negotiations for how much agency we can give to these models and what reward functions will be acceptable.
These negotiations aren’t easy. Google had LLMs but wasn’t ready to give them the agency of “chatting with humans”. OpenAI went ahead and launched “Chat”GPT without worrying about the safety concerns. And so the Overton window on agency widened.
How much “agency” to give AI models is going to be a significant question in the coming years.
Right now they can talk to humans. They can drive on some roads. Can they be allowed to cook food? Can they take inventory? Can they fix bugs? Can they pick apples? A million such negotiations await us.
What about the reward functions?
Reward functions can come from labeling (humans providing feedback like “Good job, AI!” or “That sucks!”).
They can also come from defining and measuring outcomes (“What is the wastage when picking apples?”). Even better if the AI can measure the outcome by itself.
Defining reward functions is limited only by our creativity and the current state of observational or measuring technology.
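As a rough illustration of those two sources, here is a sketch of a reward for the apple-picking example, with made-up weights and field names rather than any established formula:

```python
def reward(human_label: str, apples_picked: int, apples_damaged: int) -> float:
    """Blend a human label with a measured outcome into one reward signal."""
    # Labeling: map coarse human feedback to a score.
    label_score = {"good job": 1.0, "that sucks": -1.0}.get(human_label.lower(), 0.0)
    # Measured outcome: less wastage is better.
    wastage = apples_damaged / max(apples_picked, 1)
    outcome_score = 1.0 - 2.0 * wastage      # 1.0 at no waste, negative above 50% waste
    # Blend the two signals (weights are arbitrary here).
    return 0.5 * label_score + 0.5 * outcome_score

print(reward("good job", apples_picked=100, apples_damaged=5))   # -> 0.95
```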
A collection of specific reward functions might help us train a robot farmhand, for example. A farmer using a lot of automation today looks more and more like a supervisor who checks if all the machines are working fine. If something breaks, they have to figure out how to fix it.
An AI farmhand might be like a tool or machine in some respects and an entry-level supervisor in others.
One important aspect of AGI that people bring up is reasoning skills. I personally think this is overrated. Many experiments show that humans are “rationalizers” rather than rational actors. We do things because, culturally, they are the things we do. We just invent rational-sounding reasons after the fact.
We humans often know what we need to do to do better in life (think Atomic Habits or The Psychology of Money). But we don’t do it because we just run out of “juice”, whatever that means. It could be fatigue, lack of motivation, the need for instant gratification, etc.
Even after figuring out all the wisdom, we can’t really execute close to our full theoretical potential consistently. Our performance is very prone to peaks and troughs.
AI programs will not suffer from that.
There will likely come a time when AGI is here, walking among us, tirelessly doing all the stuff that Twitter influencers goad us to do every day.
But this level of agency has to be backed by a lot of safety technology. For example, can we ensure AGI agents don’t commit murder on behalf of their prompters?
No one knows for sure if we will succeed. We have already normalized killing humans with drones, the operators sitting miles away, never having to look their victims in the eye.
So when Ilya says OpenAI can get to AGI, he probably means they have the skills and funding to go in that direction, not necessarily that LLMs will lead to AGI.