There is a pretty big moat for Google: extreme amounts of video data on their ex...

simonsarris · 2025-12-31T02:41:10 1767148870

Google has several enviable redoubts, if not moats: TPUs, massive infrastructure, and their own cloud services. They own delivery mechanisms on mobile (Android) and on every device (Chrome). And Google and YouTube are still the #1 and #2 most visited websites in the world.

xivzgrev · 2025-12-31T03:08:00 1767150480

Not to mention security. I'd trust Google more not to have a data breach than OpenAI / whoever. Email accounts are hugely valuable but I haven't seen a Google data breach in the 20+ years I've been using them. This matters because I don't want my chats out there in public.

Also integration with other services. I just had Gemini summarize the contents of a Google Drive folder and it was effortless & effective

mootothemax · 2025-12-31T04:17:46 1767154666

While I don’t disagree with you, for historical purposes I think it’s important to highlight why Google started its push for 100% wire encryption everywhere all the time:

The NSA and GCHQ and basically every TLA with the ability to tap a fibre cable had figured out the gap in Google’s armour: Google’s datacenter backhaul links were unencrypted. Tap into them, and you get _everything_.

I’ve no idea whether Snowden’s leaks were a revelation or a confirmation for Google themselves; either way, it’s arguably a total breach.

jedberg · 2025-12-31T07:23:39 1767165819

When I worked at PayPal back in 2003/4, one of the things we did (and I think we were the first) was encrypt the datacenter backhaul connections. This was on top of encrypting all the traffic between machines. It added a lot of expense and overhead, but security was important enough to justify it.

guelo · 2025-12-31T09:47:32 1767174452

And yet Venmo, a PayPal company, publishes transaction data publicly by default, no need to decrypt anything ¯\_(ツ)_/¯

hrimfaxi · 2025-12-31T12:13:41 1767183221

Venmo publishes raw unencrypted transaction data? Or are you referring to their social network features?

matkoniecz · 2025-12-31T15:54:05 1767196445

where?

dilyevsky · 2025-12-31T06:41:01 1767163261

Not that I disagree with your assessment, but in the spirit of HN pedantry: Google had a very significant breach where Gmail was a primary target, and that was “only” 16 years ago, in mid-2009. So bad that it has its own Wikipedia page: https://en.wikipedia.org/wiki/Operation_Aurora

charcircuit · 2025-12-31T08:59:40 1767171580

>very significant breach

That page says it was only 2 accounts and none of the messages within the mail was accessed. I wouldn't call that very significant.

why-o-why · 2025-12-31T05:01:55 1767157315

Is Google even required to inform you of a data breach?

bjt · 2025-12-31T05:29:32 1767158972

They're subject to California law, so yeah.

https://oag.ca.gov/privacy/databreach/reporting

devsda · 2025-12-31T03:00:40 1767150040

Don't forget the other moat.

While their competitors have to deal with actively hostile attempts to stop scraping training data, in Google's case almost everyone bends over backwards to give them easy access.

catoc · 2025-12-31T06:13:26 1767161606

‘Actively hostile’ as in objecting to your content getting ripped off without permission?

satvikpendem · 2025-12-31T07:11:53 1767165113

It's a matter of perspective. In this scenario both sides see the other as hostile, just as one would look at a war happening as an outside observer.

troupo · 2025-12-31T10:08:42 1767175722

The biggest moat is the amount of money. Google has infinite amounts of money they print out of thin air (ads). They don't need complex entangled schemes with circular debts to prop up their operations.

DoesntMatter22 · 2025-12-31T04:59:10 1767157150

They also have one of the biggest negatives in that they abandon almost everything they build, so it’s hard to get invested in their products.

I agree with the rest though

satvikpendem · 2025-12-31T07:12:52 1767165172

They don't abandon their money makers. That's the thing people don't get about the Google graveyard meme, they only cut things that obviously aren't working to make them more money.

DoesntMatter22 · 2026-01-01T00:28:03 1767227283

Half of the things they build don't even have a chance to make money. But then people end up depending on their products and then they shut them down or sell them.

nateb2022 · 2025-12-31T01:11:36 1767143496

I have yet to be convinced the broader population has an appetite for AI produced cinematography or videos. Dependence on Nvidia is no more of a liability than dependence on electricity rates; it's not as if it's in Nvidia's interest to see one of its large customers fail. And pretty much any of the other Mag7 companies are capable of developing in-house TPUs + are already independently profitable, so Google isn't alone here.

ralph84 · 2025-12-31T01:55:52 1767146152

The value of YouTube for AI isn't making AI videos, it's that it's an incredibly rich source for humanity's current knowledge in one place. All of the tutorials, lectures, news reports, etc. are great for training models.

Nextgrid · 2025-12-31T02:04:31 1767146671

Is that actually a moat? Seems like all model providers managed to scrape the entire textual internet just fine. If video is the next big thing I don’t see why they won’t scrape that too.

jmb99 · 2025-12-31T05:57:04 1767160624

Scraping text across the entire internet is orders of magnitude easier than scraping YouTube. Even ignoring the sheer volume of data (exabytes), you simply will get blocked at an IP and account level before you make a reasonable dent. Even if you controlled the entire IPv4 space I’m not sure you could scrape all of YouTube without getting every single address banned. IPv6 makes address bans harder, true, but then you’re still left with the problem of actually transferring and then storing that much data.
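
For a rough sense of the transfer problem alone, here is a back-of-envelope sketch. The 1 EB corpus size and the link speeds are illustrative assumptions, not YouTube figures; the comment above only says "exabytes".

```python
# Back-of-envelope: time to move a large corpus over a sustained link.
# CORPUS_BYTES and the link speeds below are assumed values for illustration.

SECONDS_PER_YEAR = 365 * 24 * 3600
CORPUS_BYTES = 10**18  # 1 exabyte (assumption)


def transfer_years(corpus_bytes: int, link_gbit_per_s: float) -> float:
    """Years needed to transfer corpus_bytes at a sustained link rate."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8  # bits -> bytes
    return corpus_bytes / bytes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for gbit in (1, 10, 100):
        years = transfer_years(CORPUS_BYTES, gbit)
        print(f"{gbit:>3} Gbit/s: {years:.1f} years per exabyte")
```

Under these assumptions a single sustained 10 Gbit/s link needs roughly 25 years per exabyte, before any blocking or storage costs enter the picture.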

earthnail · 2025-12-31T07:35:34 1767166534

For now, you actually get pretty far with Tor. Just reset your connection when you hit an IP ban by sending SIGHUP to the Tor daemon.

I did that when I was retraining Stable Audio for fun and it really turned out to be trivial enough to pull off as a little evening side project.

tucnak · 2025-12-31T09:54:44 1767174884

IPv6 doesn't make it "harder," as they would typically ban whole /48 prefixes.
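
To illustrate what banning a whole prefix looks like, a minimal sketch using Python's standard `ipaddress` module. The /48 granularity is taken from the comment above; `prefix_48` and `PrefixBanList` are hypothetical names, and this is not a claim about how any particular site implements its bans.

```python
import ipaddress


def prefix_48(addr: str) -> ipaddress.IPv6Network:
    """Return the /48 network that contains the given IPv6 address."""
    # strict=False lets us pass a host address and masks off the host bits.
    return ipaddress.ip_network(f"{addr}/48", strict=False)


class PrefixBanList:
    """Hypothetical ban list keyed on /48 prefixes rather than single addresses."""

    def __init__(self) -> None:
        self._banned: set[ipaddress.IPv6Network] = set()

    def ban(self, addr: str) -> None:
        # Banning one address bans every address in its /48.
        self._banned.add(prefix_48(addr))

    def is_banned(self, addr: str) -> bool:
        return prefix_48(addr) in self._banned


bans = PrefixBanList()
bans.ban("2001:db8:1234:5678::1")
print(bans.is_banned("2001:db8:1234:ffff::42"))  # same /48, different /64
print(bans.is_banned("2001:db8:1235::1"))        # different /48
```

The point of the design is that rotating through addresses inside one allocation buys nothing: a single offending address takes the whole /48 with it.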

monocasa · 2025-12-31T02:09:18 1767146958

And we're probably already starting to see that, given the semi-recent escalations in the game of cat and also cat between YouTube and the likes of youtube-dl.

Reminds me of Reddit cracking down on API access after realizing that their data was useful. But I'd expect YouTube both to be quicker on the draw, knowing about AI data collection, and to have more time because of the orders of magnitude greater bandwidth required to scrape video.

jakeydus · 2025-12-31T04:05:14 1767153914

And Reddit turned around and sold it all for a mess of pottage…

satvikpendem · 2025-12-31T07:13:54 1767165234

Sold being the operative word, rather than giving it away for free.

monocasa · 2025-12-31T18:19:41 1767205181

Well, it is available for free either way. They pissed off their user base all for a horse that had already left the stable.

https://academictorrents.com/details/2d056b22743718ac81915f2...

satvikpendem · 2025-12-31T18:27:09 1767205629

Look at their stock price. They are doing very well since IPO, and much of it was revenue from selling their data.

monocasa · 2025-12-31T18:35:06 1767206106

Google's $60m/yr is the only thing keeping them profitable.

Mozilla's business model isn't really something to emulate, even if the stock market doesn't really see it that way.

satvikpendem · 2025-12-31T18:41:04 1767206464

Not really. Lots of companies have valuable data they sell and have been in business for decades just fine. It's even better for Reddit because it's user generated so they don't even have to do anything. The users who left during the API debacle are not the vast majority of users, who are generally casual and do not give a single shit about what happened, much as tech people like to think otherwise.

monocasa · 2025-12-31T22:40:10 1767220810

The casual users (to say nothing of the massive uptick in bot traffic) are some of the more useless data from an AI training perspective.

satvikpendem · 2025-12-31T22:43:23 1767221003

Again, this is a techie take. Lots of people for example use ChatGPT for personal therapy and guess which subs their training data comes from, r/relationships etc. Those trying to use them for other means are comparatively less frequent.

awesome_dude · 2025-12-31T03:16:22 1767150982

> Seems like all model providers managed to scrape the entire textual internet just fine

Google, though, has been doing it for literal decades. That could mean that they have something nobody else (except archive.org) has: a history of how the internet/knowledge has evolved.

fooblaster · 2025-12-31T01:13:57 1767143637

If you think they are going to catch up with Google's software and hardware ecosystem on their first chip, you may be underestimating how hard this is. Google is on TPU v7. Meta has already tried with MTIA v1 and v2. Those haven't been deployed at scale for inference.

nateb2022 · 2025-12-31T01:20:44 1767144044

I don't think many of them will want to, though. I think as long as Nvidia/AMD/other hardware providers offer inference hardware at prices decent enough to not justify building a chip in-house, most companies won't. Some of them will probably experiment, although that will look more like a small team of researchers + a moderate budget rather than a burn-the-ships we're going to use only our own hardware approach.

fooblaster · 2025-12-31T01:26:44 1767144404

Well, Anthropic just purchased a million TPUs from Google because, even with a healthy margin for Google, it's far more cost effective given Nvidia's insane markup. That speaks for itself. Nvidia will not drop their margin because it would tank their stock price. It's half of the reason for all this circular financing: lowering their effective margin without lowering it on paper.

fragmede · 2025-12-31T03:46:49 1767152809

And, don't forget everyone's buying from TSMC in every case!

margalabargala · 2025-12-31T01:33:07 1767144787

It's in Nvidia's interest to charge the absolute maximum they can without their customers failing. Every dollar of Nvidia's margin is your own lost margin. Utilities don't do that. Nvidia is objectively a way bigger liability than electricity rates.

bdangubic · 2025-12-31T01:57:44 1767146264

it is in every business’s best interest to charge the maximum…

wrs · 2025-12-31T03:39:16 1767152356

Utilities and insurance companies are two examples of business regulated to not charge the maximum, for public policy reasons.

bdangubic · 2025-12-31T04:15:26 1767154526

Are we suggesting that Nvidia/Google/.. be regulated like utilities?

wrs · 2025-12-31T16:13:19 1767197599

Just expanding on the above sentence “utilities don’t do that”. Which is why depending on Nvidia isn’t like depending on electricity.

margalabargala · 2025-12-31T04:48:02 1767156482

[flagged]

AnonHP · 2025-12-31T08:14:16 1767168856

Not GP and haven’t participated in this thread. I’m clueless on what the point in your earlier comment is. Can you elaborate, please?

margalabargala · 2025-12-31T16:04:12 1767197052

I can try. Which part is confusing to you, the "nvidia will charge as much as it can" part or the "utilities won't" part?

Ekaros · 2025-12-31T05:15:46 1767158146

I think it will be accepted by the broader population. But if generation is easy and cheap, I wonder whether there is enough demand. And I mean total demand in the segment. Will there be enough impressions to go around to actually profit from the content? Especially if storage is also considered.

Seattle3503 · 2025-12-31T01:20:56 1767144056

The video data is probably good for training models, including text models.

why-o-why · 2025-12-31T05:06:26 1767157586

Given the fact that Apple and Coke both rushed to produce AI slop, and the agreements with Disney, we are going to see a metric fuck-ton of AI-generated cinema in the next decade. The broader population's tastes are absolute garbage when it comes to cinema, so I don't see why you need convincing. 40+ superhero films should be enough.

cdf · 2025-12-31T04:19:17 1767154757

On paper, Google should never have allowed the ChatGPT moment to happen; how did a then non-profit create what was basically a better search engine than Google?

Google suffers from the classic Innovator's Dilemma and needs competition to refocus on what ought to be basic survival instincts. What is worse is that the search users are not the customers. The customers of Google Search are the advertisers, and Google will always prioritise the needs of those customers and squander its moats as soon as the threat is gone.

miohtama · 2025-12-31T07:48:41 1767167321

Google allowed this to happen because they listened to their compliance department and were afraid of a backlash if an LLM said something that could anger people.

Sergey Brin interview: https://x.com/slow_developer/status/1999876970562166968?s=20

This attitude also partially explains the black vikings incident.

hattmall · 2025-12-31T04:29:13 1767155353

Exactly, Google's business isn't search, it's ads. Is ChatGPT a more profitable system for delivering ads? That doesn't appear so, which means there's really no reason for Google to have created it first.

razodactyl · 2025-12-31T04:37:41 1767155861

There was a very negative "immune" response from the users when they perceived suggestions from ChatGPT as ads.

This will be hard for them to integrate in a way that won't annoy users / will be better implemented than any other competitor in the same space.

Or perhaps we just deal with all AI across the board serving us ads.... this makes more sense unfortunately.

transcriptase · 2025-12-31T05:35:06 1767159306

There’s a very negative immune response to the idea of Netflix running ads.

And yet they’re there, in the form of prominent product placement in all of their original series along with strategic placement in the frame to make sure they appear in cropped clips posted to social media and made into gifs.

Stranger Things alone has had 100-200 brands show up under the warm guise of nostalgia, with Coke alone putting up millions for all the less-than-subtle screen time their products get.

I’m certain AI providers will figure out how to slyly put the highest bidder into a certain proportion of output without necessarily acting out that scene in Wayne’s World.

mahirsaid · 2025-12-31T05:37:46 1767159466

I suspect Google can last much longer with an AI chat product that competes with OpenAI and other companies, without needing a timely profit from that particular product. I can't say the same for the others. Google is using its own money to fund this without much pressure for immediate profit by a deadline. They can rely on their other services for revenue and profit in the meantime.

gniv · 2025-12-31T13:59:26 1767189566

Google had an in-house chatbot that was never allowed to launch. I used to think that they were wrong but now I'm pretty sure they were right to not launch it. Users are very forgiving with a newcomer but not with an established company.

razodactyl · 2025-12-31T04:35:45 1767155745

Think about it in terms of the research they put out into the ether though. The research grows into something viable, they sit back and watch the response and move when it makes sense.

It's like that old concept of saying something wrong in a forum on purpose to have everyone flame you for being wrong and needing to prove themselves better by each writing more elaborate answers.

You catch more fish with bait.

fooblaster · 2025-12-31T01:09:09 1767143349

And yes, all their competitors are making custom chips. Google is on TPU v7. Absolutely nobody among their competitors is going to get this right on the first try - Google didn't.

CharlieDigital · 2025-12-31T01:27:29 1767144449

The bigger problem for late starters now is that it will be hard to match the performance and cost of Google/Nvidia. It's an investment that had to have started years ago to be competitive now.

loloquwowndueo · 2025-12-31T02:17:03 1767147423

In this case the difference between its and it’s does alter the meaning of the sentence.

stevenjgarner · 2025-12-31T03:41:36 1767152496

Agreed. Even xAI's (Grok's) access to live data on x.com and millions of live video inputs from Tesla is a moat not enjoyed by OpenAI.

chroma205 · 2025-12-31T10:18:22 1767176302

>Agreed. Even xAI's (Grok's) access to live data on x.com and millions of live video inputs from Tesla is a moat not enjoyed by OpenAI.

Tesla does not have live video feed from (every) Tesla car.

johnnyfived · 2025-12-31T18:20:57 1767205257

That live data for X is mostly gonna consist of brainrot, egotistical founders, and rising vibe coders. That data has to be worth so much less than something like Reddit.

choudharism · 2025-12-31T05:16:41 1767158201

The TAM for video generation isn't as big as the other use cases.

xnx · 2025-12-31T05:52:58 1767160378

I agree, but isn't the TAM for video generation all of movies, TV, and possibly video games, or all entertainment? That's a pretty big market.

dilyevsky · 2025-12-31T06:48:29 1767163709

What you’re competing for is people’s attention, and the TAM for that is the biggest there is.

lokar · 2025-12-31T05:18:08 1767158288

YT is also a giant corpus of English via the transcription

IncreasePosts · 2025-12-31T06:25:00 1767162300

Hasn't it all been scraped by other ai companies already?