There is a pretty big moat for Google: extreme amounts of video data on their existing services and absolutely no dependence on Nvidia and its 90% margin.
Google has several enviable, if not moats, then at least redoubts: TPUs, massive infrastructure, their own cloud services, and delivery mechanisms on mobile (Android) and on every device (Chrome). And Google and YouTube are still the #1 and #2 most visited websites in the world.
Not to mention security. I'd trust Google more not to have a data breach than OpenAI / whoever. Email accounts are hugely valuable, but I haven't seen a Google data breach in the 20+ years I've been using them. This matters because I don't want my chats out there in public.
Also integration with other services. I just had Gemini summarize the contents of a Google Drive folder and it was effortless and effective.
While I don’t disagree with you, for historical purposes I think it’s important to highlight why Google started its push for 100% wire encryption everywhere all the time:
The NSA and GCHQ and basically every TLA with the ability to tap a fibre cable had figured out the gap in Google’s armour: Google’s datacenter backhaul links were unencrypted. Tap into them, and you get _everything_.
I’ve no idea whether Snowden’s leaks were a revelation or a confirmation for Google themselves; either way, it’s arguably a total breach.
When I worked at PayPal back in 2003/4, one of the things we did (and I think we were the first) was encrypt the datacenter backhaul connections. This was on top of encrypting all the traffic between machines. It added a lot of expense and overhead, but security was important enough to justify it.
Not that I disagree with your assessment, but in the spirit of HN pedantry: Google had a very significant breach where Gmail was a primary target, and that was “only” 16 years ago, in mid-2009. So bad that it has its own Wikipedia page: https://en.wikipedia.org/wiki/Operation_Aurora
While their competitors have to deal with actively hostile attempts to stop scraping training data, in Google's case almost everyone bends over backwards to give them easy access.
The biggest moat is the amount of money. Google has infinite amounts of money that they print out of thin air (ads). They don't need complex entangled schemes with circular debts to prop up their operations.
They don't abandon their money makers. That's the thing people don't get about the Google graveyard meme: they only cut things that obviously aren't working to make them more money.
Half of the things they build don't even have a chance to make money. But then people end up depending on their products, and then they shut them down or sell them.
I have yet to be convinced the broader population has an appetite for AI-produced cinematography or videos. Dependence on Nvidia is no more of a liability than dependence on electricity rates; it's not as if it's in Nvidia's interest to see one of its large customers fail. And pretty much any of the other Mag7 companies is capable of developing in-house TPUs and is already independently profitable, so Google isn't alone here.
The value of YouTube for AI isn't making AI videos, it's that it's an incredibly rich source for humanity's current knowledge in one place. All of the tutorials, lectures, news reports, etc. are great for training models.
Is that actually a moat? Seems like all model providers managed to scrape the entire textual internet just fine. If video is the next big thing I don’t see why they won’t scrape that too.
Scraping text across the entire internet is orders of magnitude easier than scraping YouTube. Even ignoring the sheer volume of data (exabytes), you simply will get blocked at an IP and account level before you make a reasonable dent. Even if you controlled the entire IPv4 space I’m not sure you could scrape all of YouTube without getting every single address banned. IPv6 makes address bans harder, true, but then you’re still left with the problem of actually transferring and then storing that much data.
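To put a rough number on the transfer problem, here is a back-of-envelope sketch. The corpus size (1 exabyte) and the link speed (a sustained 100 Gbit/s) are illustrative assumptions, not measured YouTube figures, and `transfer_days` is just a helper name made up for this example.

```python
# Back-of-envelope: how long does it take to move a very large video corpus?
# All inputs are illustrative assumptions, not measured YouTube figures.

SECONDS_PER_DAY = 86_400


def transfer_days(corpus_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move corpus_bytes over a fully saturated link."""
    seconds = corpus_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


ONE_EXABYTE = 1e18      # bytes (decimal exabyte), assumed corpus size
LINK_100_GBPS = 100e9   # bits per second, hypothetical sustained rate

days = transfer_days(ONE_EXABYTE, LINK_100_GBPS)
print(f"1 EB at 100 Gbit/s: about {days:.0f} days ({days / 365:.1f} years)")
```

Under those assumptions a single exabyte takes roughly two and a half years of uninterrupted transfer, before any blocking, rate limiting, or storage cost enters the picture.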
And we're probably already starting to see that, given the semi-recent escalations in the game of cat and also cat between YouTube and the likes of youtube-dl.
Reminds me of Reddit cracking down on API access after realizing that their data was useful. But I'd expect YouTube both to be quicker on the draw, knowing about AI data collection, and to have more time, because of the orders of magnitude greater bandwidth required to scrape video.
Not really. Lots of companies have valuable data they sell and have been in business for decades just fine. It's even better for Reddit because it's user generated, so they don't even have to do anything. The users who left during the API debacle are not the vast majority of users, who are generally casual and do not give a single shit about what happened, much as tech people like to think otherwise.
Again, this is a techie take. Lots of people, for example, use ChatGPT for personal therapy, and guess which subs that training data comes from: r/relationships etc. Those trying to use them for other means are comparatively less frequent.
> Seems like all model providers managed to scrape the entire textual internet just fine
Google, though, has been doing it for literal decades. That could mean that they have something nobody else (except archive.org) has: a history of how the internet and its knowledge have evolved.
If you think they are going to catch up with Google's software and hardware ecosystem on their first chip, you may be underestimating how hard this is. Google is on TPU v7. Meta has already tried with MTIA v1 and v2. Those haven't been deployed at scale for inference.
I don't think many of them will want to, though. I think as long as Nvidia/AMD/other hardware providers offer inference hardware at prices decent enough to not justify building a chip in-house, most companies won't. Some of them will probably experiment, although that will look more like a small team of researchers plus a moderate budget than a burn-the-ships, "we're going to use only our own hardware" approach.
Well, Anthropic just purchased a million TPUs from Google because, even with a healthy margin for Google, it's far more cost effective given Nvidia's insane markup. That speaks for itself. Nvidia will not drop their margin because it would tank their stock price. That's half of the reason for all this circular financing: lowering their effective margin without lowering it on paper.
It's in Nvidia's interest to charge the absolute maximum they can without their customers failing. Every dollar of Nvidia's margin is your own lost margin. Utilities don't do that. Nvidia is objectively a way bigger liability than electricity rates.
I think it will be accepted by the broader population. But if generation is easy and cheap, I wonder if there is demand, and I mean total demand in the segment. Will there be enough impressions to go around to actually profit from the content? Especially if storage is also considered.
Given the fact that Apple and Coke both rushed to produce AI slop, and the agreements with Disney, we are going to see a metric fuck-ton of AI-generated cinema in the next decade. The broader population's tastes are absolute garbage when it comes to cinema, so I don't see why you need convincing. 40+ superhero films should be enough.
On paper, Google should never have allowed the ChatGPT moment to happen; how did a then non-profit create what was basically a better search engine than Google?
Google suffers from the classic Innovator's Dilemma and needs competition to refocus on what ought to be basic survival instincts. What is worse is that the search users are not the customers. The customers of Google Search are the advertisers, and Google will always prioritise the needs of those customers and squander its moats as soon as the threat is gone.
Google allowed this to happen because they listened to their compliance department and were afraid of a backlash if an LLM said something that could anger people.
Exactly, Google's business isn't search, it's ads. Is ChatGPT a more profitable system for delivering ads? It doesn't appear so, which means there's really no reason for Google to have created it first.
There’s a very negative immune response to the idea of Netflix running ads.
And yet they’re there, in the form of prominent product placement in all of their original series along with strategic placement in the frame to make sure they appear in cropped clips posted to social media and made into gifs.
Stranger Things alone has had 100-200 brands show up under the warm guise of nostalgia, with Coke alone putting up millions for all the less-than-subtle screen time their products get.
I’m certain AI providers will figure out how to slyly put the highest bidder into a certain proportion of output without necessarily acting out that scene in Wayne’s World.
I suspect Google can last much longer with an AI chat model that competes with OpenAI and other companies, without needing a profit from that particular product in a timely manner. I can't say the same for the others. Google is using its own money to fund this without much pressure for immediate profit on a deadline. They can rely on their other services for revenue and profit in the meantime.
Google had an in-house chatbot that was never allowed to launch. I used to think that they were wrong but now I'm pretty sure they were right to not launch it. Users are very forgiving with a newcomer but not with an established company.
Think about it in terms of the research they put out into the ether though. The research grows into something viable, they sit back and watch the response and move when it makes sense.
It's like that old concept of saying something wrong in a forum on purpose to have everyone flame you for being wrong and needing to prove themselves better by each writing more elaborate answers.
And yes, all their competitors are making custom chips. Google is on TPU v7. Absolutely nobody among their competitors is going to get this right on the first try; Google didn't.
The bigger problem for late starters now is that it will be hard to match the performance and cost of Google/Nvidia. It's an investment that had to have started years ago to be competitive now.
That live data for X is mostly gonna consist of brainrot, egotistical founders, and rising vibe coders; that data has to be worth so much less than something like Reddit.