EPISODE #3

Elon Musk’s xAI Raising $6B, Worldcoin and OpenAI, Synthesia Expressive Avatars

April 27, 2024
Apple Podcasts | Spotify | YouTube

Show Notes

Elon’s raising $6 billion for his AI lab. I dig into how much of this is Elon hype versus banking on real potential.

Worldcoin and OpenAI. What this eyeball-scanning, crypto fever dream has to do with the leading AI company.

Synthesia’s new expressive avatars. The new upgrades are soon to dominate work. But will they break into our personal lives?

Subscribe to the best newsletter on AI: https://theneurondaily.com

Watch The Neuron on YouTube: https://youtube.com/@theneuronai

Transcript

Welcome all you cool cats to The Neuron! I’m Pete Huang.

Today,

  • Elon’s raising $6 billion for his AI lab. I dig into how much of this is Elon hype versus banking on real potential.
  • Next, Worldcoin and OpenAI. What this eyeball-scanning, crypto fever dream has to do with the leading AI company.
  • Finally, Synthesia’s new expressive avatars. The new upgrades are soon to dominate work. But will they break into our personal lives?

It’s Saturday, April 27th. Let’s dive in!

Our first story is xAI raising $6 billion.

xAI is 6th in Elon Musk’s current rotation of corporate brainchildren, after SpaceX, Tesla, Neuralink, The Boring Company and yes, Twitter. Or X, sorry.

At the surface, xAI is just another AI company, starting in the wake of ChatGPT’s launch.

That launch was in November of 2022, and as we all know, ChatGPT proceeded to take over the world in the 6 months following.

So reports of xAI first start floating around in March 2023, with Igor Babuschkin, a researcher from DeepMind, Google’s AI research center, being named chief engineer for xAI.

By July of 2023, xAI formally launches with 12 total stacked resumes. Igor is joined by researchers and engineers from the who’s who of AI companies and research centers.

The stated mission is lofty and vague: to Understand the Universe.

But the real way to understand xAI is that Elon wants to fight OpenAI.

It turns out that OpenAI is also one of Elon’s corporate brainchildren. Elon helped start OpenAI as a nonprofit back in December 2015 with a mission to democratize AI systems, to prevent big corporations from holding all the control.

By 2018, Elon had left the board because Tesla was starting to do more work in AI and he didn’t want to pose a conflict of interest.

But early 2023 rolls around, and OpenAI has completely changed.

It’s no longer a nonprofit. It’s taken 10 billion dollars in investment from Microsoft. And it released a product that’s captured the world’s attention.

None of this was in the plan!

So Elon starts xAI in March and begins the campaign against OpenAI.

It involves staffing up xAI and building out a competing product. The company now has around 20 or 30 researchers, all from pristine backgrounds.

Its competing product is called Grok, a chatbot like ChatGPT. We’ll get back to Grok in a second.

The campaign against OpenAI also includes a lawsuit in March 2024, which accuses OpenAI CEO Sam Altman and President Greg Brockman of a breach of contract for taking that 10 billion dollars from Microsoft.

And in true Elon Musk style, the lawsuit is accompanied by sharp jabs on, you guessed it, Twitter. Or X, gosh darn it. After the lawsuit goes public, Elon posts: “Change your name to ClosedAI and I will drop the lawsuit”

The news of xAI looking for more cash is not surprising. It’s really expensive to build AI. Case in point: Sam Altman said GPT-4 cost about $100 million to train, enough to buy 50 Burger King locations.

The notable part about xAI’s fundraising is how much money investors wanted to literally throw at Elon and xAI.

The Information’s report said that initially, Elon was looking for about $3 billion of cash earlier in 2024. Which is already a huge, huge number to raise.

I mean just imagine seeing three commas in your bank account.

But investors were so excited about xAI that they’re giving him $6 billion, double the amount he was looking for initially.

You might think this is yet another story of Silicon Valley falling in love with Elon Musk. 

I definitely thought so, too. After all, SpaceX and Tesla both have completely changed their industries. Maybe the rationale is, well, it’s Elon, let’s just give him money!

But then I looked at xAI’s progress.

Normally, we don’t really pay much attention to Grok because of how far behind it’s been. From a technical perspective, it doesn’t score nearly as high as the leading models like GPT-4 and Claude and Gemini, the ones that we talk about a lot.

And the chatbot that you can access via X isn’t really comparable. There’s no clear reason to use Grok over ChatGPT or Claude.

But last month, xAI released Grok 1.5, a huge improvement to their last model. It’s still one notch below the likes of GPT-4 and Claude, the real leading pack, but it gets surprisingly close to Google’s most recent release of Google Gemini 1.5.

And keep in mind: xAI is just 9 months old. 9 months, and it’s already striking at Google.

Elon fanboying aside, the team is making serious progress and seriously fast.

So with real, tangible progress in developing AI and a once-in-a-generation track record like Elon, no wonder investors have literally been throwing money at him.

Your big takeaway on xAI and its $6 billion fundraise:

Even though Elon wants to return to the original OpenAI mission of democratization, one thing he can’t argue with is that building leading AI models really has ended up controlled by a few.

Whoever has access to the cash can afford to build these models.

It’s a battle of the billions. Literally.

xAI’s $6 billion fundraise would make it the second most funded AI startup. That leaderboard is led of course by OpenAI, which has about $11 billion, most of that being from Microsoft. Then, in third would be Anthropic with $4-5 billion and Mistral, out of Paris, with $1 billion.

That doesn’t even include Google and Meta, who of course make billions in profit every month, so they’re obviously throwing billions at AI.

To be fair, democratizing AI systems is more about who gets to use and control AI, not necessarily build it.

In that regard, xAI, Meta and Mistral are the ones making the mission happen. They’ve all chosen to open source their models, to make them available for anyone to download and modify and use.

Now we’ll have to see if their AI models can keep up with the companies who haven’t.

Our second story today is OpenAI and Worldcoin.

Worldcoin, the San Francisco and Berlin-based company that started in 2019, is one of the most controversial, sci-fi-like companies that has come out of this whole AI boom.

When it first launched, Worldcoin sounded like a byproduct of a wild weekend of mushrooms, overly confident debates about economics and a few too many questions that started with the words “what if.”

So in other words, another weekend with the tech bros of San Francisco.

In 2019, the short pitch for Worldcoin was that it was going to distribute universal basic income. But to make sure you were the right person getting that income, they’d *scan your eyeballs* to biometrically verify you.

And of course, somewhere somehow it would use the blockchain. To prove it, they launched a token, which, since then, has been subject to rampant speculation.

Life moved on for a few years after their launch, until 2023, when Worldcoin actually did start scanning eyeballs. They deployed 11 of these metallic orbs around the world and people literally lined up to get scanned into the system.

In total, they scanned over 5 million people before they got slapped by regulators. Kenya, India and Spain have all but kicked Worldcoin out.

So how does OpenAI end up partnering with this kind of company?

Eyeball scanning sounds like a crazy thing to propose with this generation’s understanding of data privacy.

And universal basic income and crypto are both living in the same general neighborhood of politically and culturally untenable.

The answer: OpenAI CEO Sam Altman. He was actually a cofounder of Worldcoin.

Here’s the argument behind it.

AI is going to flood the internet with purely synthetic content. It’s already happening with blogs - these days, I can just tell that half of my Google search results are filled with websites completely generated by AI.

I also know people who built that same technology. So it’s happening.

We know it’s happening in speech with tools like ElevenLabs and in music with tools like Udio. And video really isn’t that far away.

There is not a single Silicon Valley insider who wouldn’t agree that this is coming.

Those same insiders would also agree with the consequence of this: that we’re going to need some way of verifying who is a real human on the internet and what is a real piece of content made by a real human.

The disagreement comes with how that should actually be done.

Finding a way to prove that you are a real person, let’s call that “proof of personhood”, for 8 billion people, with ranging access to internet and technology, in a way that gets adopted by the entire internet, is a big challenge.

It just so happens that Worldcoin wants to scan your eyeball to do it.

Every proposed way to prove your personhood is like Worldcoin’s: pretty theoretical and untested at scale.

For example, you could group random people up and require that they show up to a randomly chosen spot to verify that the others are real people. I’ll let you think about all the ways that can go wrong.

You can have people prove their identity once, then just reuse that over and over. Worldcoin’s eyeball scan is a version of that. Scan once, use forever.

Even Apple’s Face ID could be interpreted as a version of that.

Or you can do an online test, sort of like a really advanced version of those captcha tests where you have to pick which image is a fire hydrant.

But all of these have tradeoffs that we’re not going to go into for sake of time.

If you have smart friends who like to think about puzzles, or if you’re one of those people, I’m sure you’d have a field day trying to figure out how to make this kind of thing happen.

As for the OpenAI and Worldcoin partnership, it’s nothing tangible yet. It’s more like Sam Altman uniting his two projects and figuring something out.

But you have to give him some credit: this was not an obvious thing in 2019. That was the year GPT-2 was released, which could barely generate any really comprehensible language, let alone something really humanlike.

Not many people would predict then, in that moment, that the world would be flooded with AI and that we would need to start scanning eyeballs.

Your big takeaway for OpenAI and Worldcoin:

The speed of AI is layering on one complex problem after another.

For one, we’re trying to make sense of what this thing even is.

Even researchers don’t know how far current methods can go. And people of all shapes and sizes are trying to figure out what it means for them.

What new things we can create. How to be more effective with it. How to run businesses differently using it.

We then have to deal with what happens as soon as we get to a point where that’s figured out.

That’s because the Internet is so effective at distributing new information that there is a global force of forward-thinking, fast-moving people who will take advantage of everything that’s on the market.

As soon as AI writing tools landed, people started to automate their websites. As soon as people figured out how to use AI to write video scripts, video production skyrocketed.

So as soon as someone figures out how to make AI-generated content that is truly indistinguishable from human content in every way, we will have to reckon with these crazy complex questions about how to prove that someone is real.

Worldcoin may be a sci-fi adventurer’s dream right now, but that might not be the case forever. And if it grows beyond that, it’ll grow beyond that faster than you could possibly imagine.

Our final story is Synthesia and their new expressive avatars.

Synthesia is the London-based, $1 billion AI startup making digital avatars.

They’ve been in the game for a long time. The company was founded in 2017. And I’m sorry to do this to you, but that’s SEVEN years ago.

In spring of 2024, we’re now waiting for GPT-5 from OpenAI. In 2017, not only did GPT-4 not exist; neither did GPT-3, nor 2, nor even 1.

So this company has been at it for a while.

What exactly is a digital avatar? Well, that definition has really changed throughout these seven years as the technology has gotten a lot better.

Synthesia’s first release in 2020 is, by today’s standards, quite clunky. All they did was film someone talking, then loop the video and basically hope that the mouth movement is good enough to match the audio.

If you’re listening to me on audio right now, you won’t see this, but I’m putting up a video of what the 2020 version looked like.

It’s definitely a human, but there’s almost no pretending that the person is supposed to actually be saying what’s being played over audio.

So that brings us to today. Four years later, Synthesia is releasing something called expressive avatars.

The claim is that their avatars can now infer the right emotional expressions from the script that you give it, and the facial expressions and voice inflections would match.

So theoretically, when you want it to say, “Welcome to this week’s update! We have very good news to share!” it should be bright and cheery.

And when it says, “I do not have good news to report today.” it should be more serious.

Does Synthesia pass the test? Again if you’re listening to me on audio, I can only give you the voice, not the facial expressions.

So here’s the first one.

Here’s the second one.

So not perfectly human-like, but it’s a big improvement from their last versions!

To set some expectations, I know a lot of people who would only be impressed by this if it was a perfect recreation of humans.

1, I don’t think that’s fair, but 2, I also think you’d get a lot more people believing this was real human footage than you think.

In fact, even the 2020 version, the one that barely gets the faces to line up, that was enough to sell corporate clients.

I’ve spoken to some salespeople who work at Synthesia. Corporate customers are loving these updates. They’re super impressed. And the salespeople are absolutely killing their sales quotas.

And you might be asking, what on earth would these big corporations be using these avatars for? There’s no way they’re actually convincing people these are real, right?

You are right. They’re not. The top use case for Synthesia and these digital avatars is corporate learning, those video lessons you have to take when you start at a company or once in a while for ongoing training.

The baseline for those training lessons is not just low, it’s buried in the ground. Those things are either really expensive if you have the budget to film a real actor presenting the information or extremely boring if you don’t.

So Synthesia sits in the middle. You get to have a face presenting the information for way cheaper, and even if it’s not 100% perfect, it doesn’t need to be to make this type of content engaging.

People love having a face to look at, and that’s the most important bit. There’s plenty of research showing that educational videos displaying the instructor’s face retain student attention more strongly, and that students also believe they’re more informative, even if the faces themselves don’t actually improve how much a student learns.

Synthesia’s expressive avatars are possible due to a new model that they built called Express-1, which is billed as an AI that can predict facial movements and expressions.

That’s paired with Synthesia partnering with another startup called Hume AI, who’s building technology that decodes the emotions behind your facial expressions and tone of voice and can respond with a certain tone in return.

Synthesia hopes that this combination will create digital avatar experiences that look even more human as the technology improves. And by the looks of it, they’re on the right track.

Your big takeaway on Synthesia’s expressive avatars:

Digital avatars make perfect sense for business, but consumers might be a different question.

While expressive AI voice has largely been solved by multiple startups, including leading products like ElevenLabs and Play.ht, AI video is much more difficult to make convincing.

Video is a big, big world. Think about it: there’s cinematic film, there’s animation, talking head videos, dashcams, and more.

Synthesia and other digital avatar startups like HeyGen, Tavus and more are looking to make AI-generated, humanlike avatars for work.

The killer use case today is corporate training videos. But they’ll move towards sales and marketing-related content, things that will reach their actual customers, as the technology improves and the avatars look even more humanlike.

The open question is whether digital avatars can also reach consumers, think the type of content hitting TikTok and Instagram Reels. Business videos are meant to be, and even expected to be, calm. They’re boring by TikTok standards.

Will these startups’ solutions transcend? And will people engage with them, or will they get turned off?

Some quick hitters to leave you with:

  • Microsoft and Google both reported strong financial results this week, with Microsoft leading the way. Some highlights for the Seattle-based giant: it now has 1.8 million developers using GitHub Copilot, up from 1.3 million a quarter ago. Microsoft cloud grew 31% year over year, with AI directly making up 7% of that. Meanwhile, Google talked a lot about AI but didn’t share specific numbers for Google Gemini or its AI search experiments. That’s not a good sign.
  • The White House announced its final roster for its AI safety board. On the board are the CEOs of OpenAI, Anthropic, Nvidia, AMD, Amazon Web Services, Adobe, Alphabet and more. Not on the board are Elon Musk and Mark Zuckerberg.
  • Big consulting firms continue to win huge in AI. Accenture said it booked over $600 million of AI work in Q2. Boston Consulting Group said AI would represent 20% of its revenue this year.

This is Pete wrapping up The Neuron for April 27th. I’ll see you next week!