Hacker News | AshleysBrain's comments

Is this not a good example of how generative AI does copyright laundering? Suppose the image was AI-generated and is a bad copy of a source image in the training data, which seems likely with such a widely disseminated image. When using generative AI to produce anything else, how do you know it's not just doing a bad-quality copy-paste of someone else's work? Are you going to scour the internet for the source? Will the AI tell you? What if code generation is copy-pasting GPL-licensed code into your proprietary codebase? The likelihood of this, the lack of any easy way to know it's happening, and the risks it creates all seem to me to be overlooked amid the AI hype. And generative AI is a lot less impressive if it often works as a bad-quality copy-paste tool rather than the galaxy-brain intelligence some like to portray it as.

There are countless examples. I often think about the fact that the Google search AI is just rewording news articles from the search results; when you look at the source articles, they make exactly the same points as the AI answers.

So these services depend on journalists continuously feeding them articles, while stealing all of their readers by automatically copying every article.


I actually often have the opposite problem. The AI overview will assert something and give me dozens of links, and then I'm forced to check them one by one to try to figure out where the assertion came from, and, in some cases, none of the articles even say what the AI overview claimed they said.

I honestly don't get it. All I want is for it to quote verbatim and link to the source. This isn't hard, and there is no way the engineers at Google don't know how to write a thesis with citations. How did things end up this way?


I have to say, I suffer from both problems, just not simultaneously.

Depending on what I am searching for, and how important it is to me to verify the accuracy and provenance of the result, I might stop at the AI, or might find, as you have, that there is no there there.

But, no matter what, the AI is essentially reducing the ability of primary sources to monetize their work. In the case where the search stops at the AI, obviously no traffic (except for incessant LLM polling) goes to the primary source.

And in the case you describe, identical traffic (your search) is routed to multiple sources, so if one of them actually was the origin of what you were interested in, it effectively ends up sharing revenue with the other sources, because the value of each of your clicks is diluted across however many links you click.
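A toy numeric sketch of that dilution (purely hypothetical numbers and function names, just to make the revenue-sharing point concrete):

```python
# Hypothetical illustration: if a search fans out to several candidate
# sources and the searcher clicks k of them before finding the real
# origin, each clicked source captures only a 1/k share of the visit's
# value rather than all of it.
def per_source_value(total_value: float, clicks: int) -> float:
    return total_value / clicks

# One click straight to the true source: it keeps the whole value.
print(per_source_value(1.0, 1))  # 1.0
# The same attention spread across four clicked links.
print(per_source_value(1.0, 4))  # 0.25
```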


ChatGPT was a research prototype thrown at end users as a "product".

It is not a carefully designed product; ask yourself "What is it FOR?".

But identifying reliable sources isn't as easy as you may think, either. A chat-based interaction really only makes sense if you can rely on every answer; otherwise the user is misled and the conversation can head in the wrong direction. The previous search paradigm ("ten snippets + links") never projected the kind of confidence, so often ungrounded in truth, that the chat paradigm does.


Yes, and it's slowly killing those websites. Mine is among them and the loss in traffic is around 60%.

Snippets were already getting Google into legal hot water (with Yelp in the US and news agencies in Australia in particular, IIRC) long before LLMs and AI scraping. It's a debatable gray area of Fair Use growing out of early rulings in DMCA-related cases, and also Google's win over the Authors Guild at SCOTUS.

Of course Google has a history of copying articles in whole (cf. Google Cache, eventually abandoned).

> What if code generation is copy-pasting GPL-licensed code into your proprietary codebase?

This is obviously a big, unanswered issue. It's pretty clear to me that we are collectively incentivised to pollute the well, and that this only has to go on long enough for everything to become "compromised". That essentially abandons open source and IP licensing at large, taking us into an uncharted era where intellectual works become the protected property of nobody.

I see chatbots having less of an impact on our societies than the above, and interestingly, that impact has little to do with technology.


> we are collectively incentivised to pollute the well

Honestly, there are two diametrically opposed incentives at work right now. The one you describe may not even be paramount: how hard is it to prove infringement, shepherd a case through court, and win a token amount? Is it worthwhile just to enrich a few lawyers, while more AI-regurgitated slop keeps appearing?

The second incentive is to not publish source code that might be vacuumed up by a completely amoral automaton. We may be seeing the second golden age of proprietary software.


It sounds like they were testing with iOS 12? In practice that has fallen out of use and doesn't need to be supported any more. Yes, a bunch of problems are to do with Safari specifically, but if you target relatively modern versions only (iOS 16+ is pretty reasonable IMO) it'll save a lot of pain.


I have to support iOS 16. In terms of browser-specific bugs that I have to deal with, I'd say about 80-90% of what I encounter is Safari-specific. Of that, another 80% only affects iOS, and of that, about 2/3 are fixed in more current versions.


Yeah, supporting iOS 12 in 2025 is odd. I was investigating browser support levels just recently for a library and also settled on iOS 16 as a reasonable level.

For reference, iOS 12.x and below account for 0.33% of global usage https://browsersl.ist/#q=safari+%3C+13+or+ios+%3C+13. Selecting iOS 16 would still exclude less than 1% globally https://browsersl.ist/#q=safari+%3C+16+or+ios+%3C+16. In both cases the vast majority would be older iOS, which is unfortunate because I assume they're on older devices with no upgrade path, but you have to decide on the transpile/polyfill cutoff at some point, and browser support has an extremely long tail.
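As a sketch, assuming the standard Browserslist config format, a cutoff like that could be expressed in a project's `.browserslistrc` (the query names below simply mirror the linked browsersl.ist queries; treat the exact file as hypothetical):

```
# .browserslistrc - hypothetical cutoff matching the queries above
defaults
not safari < 16
not ios < 16
```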


I read through this guide a while back and it is great. However, it's fairly old now, being from 2011. Are there any updated guides or explanations of what's changed since then?


I don't think all that much has changed, except for the addition of the meshlet and raytracing stages.

The 'descriptors are hard' article is a nice modern companion, IMHO:

https://www.gfxstrand.net/faith/blog/2022/08/descriptors-are...

Also the D3D12 functional spec might be useful too:

https://microsoft.github.io/DirectX-Specs/


The mesh shader pipeline isn't just an addition; it replaces several pieces of the old pipeline, like the vertex, tessellation, and geometry shaders. It's a pretty big departure, though most engines don't use it yet, I believe.


Does switching a UE5 game to Lumen/Nanite switch from essentially the "old pipeline" to the new mesh shader path, or do even the non-Lumen/Nanite modes use mesh shaders?


Maybe optionally, if the hardware supports it, but new UE5 features also work on older cards that support compute shaders (a specialized pipeline introduced somewhat earlier) but not mesh shaders.


That was from an AI hallucinating HN 10 years from now: https://news.ycombinator.com/item?id=46205632


Perhaps someone with good reverse engineering skills could figure out what went wrong here - it might be amusing...


I think the article is slightly misleading: it says "Google has resumed work on JPEG XL", but I don't think they have - their announcement only says they "would welcome contributions" to implement JPEG XL support. In other words, Google won't do it themselves, but their new position is they're now willing to allow someone else to do the work.


Describing it as 'Google' is misleading, because different arms of the company might as well be completely different companies. The Chrome org seems to have had the same stance as Firefox with regards to JPEG XL: "we don't want to add 100,000 lines of multithreaded C++ because it's a giant gaping security risk", and the JPEG XL team (in a completely separate org) is addressing those concerns by implementing a Rust version. I'd guess that needing the "commitment to long-term maintenance" is Chrome fighting with Google Research or whatever about long-term headcount allocation towards support: Chrome doesn't want the JPEG XL team to launch JPEG XL in Chrome, abandon it, and leave Chrome engineers to deal with the fallout.


It's technically correct. Googlers (at Google Research Zurich) have been working on jxl-rs, a Rust implementation of JPEG XL. Google Research has been involved in JPEG XL from the beginning, both in the design of the codec and in the implementation of libjxl and now jxl-rs.

But until now, the position of other Googlers (in the Chrome team) was that they didn't want to have JPEG XL support in Chrome. And that changed now. Which is a big deal.


Yes, and they will also only accept it if the library is written in Rust. The patch to add support that is in the thread, and referenced in the article, uses libjxl, which is C++ and therefore cannot be used.


It's easy to say "XYZ is dead, time to replace it with something better". Another example: the Win32 APIs are hideous (look up everything SetWindowPos does) and need replacing.

In the real world though, backwards compatibility reigns supreme. Even if you do go and make a better thing, nobody will use it until it can do the vast majority of what the old thing did. Even then, switching is costly, so a huge chunk of people just won't. Now you have two systems to maintain and arguably an even bigger mess. See Win32 vs. WinRT vs. Windows App SDK or however many else there are now.

So if you're serious about improving big mature platforms, you need a very good plan for how you will handle the transition. Perhaps a new API with a compatibility layer on top is a good approach, but the compatibility layer has to have exactly 100% fidelity, and you can never get rid of it. At the scale of these platforms, that is extremely hard. And at the end of the day, with a huge compatibility layer like that, have you really made a better and less bloated system? This is why we tend to just muddle along - as much as we all like to dream, it's probably actually the best approach.


>Perhaps a new API with a compatibility layer on top is a good approach

The opposite would make more sense: have a transpiler or something to the 'old' API (very natural, given it can grow with the new API and you don't have to implement the entire 'old' API), while new apps slowly transition to the 'new' API.


This reminds me of the quote by Robert C. Martin[1]: "the ratio of time spent reading [code] versus writing is well over 10 to 1".

If programmers spend 90%+ of their time reading code rather than writing it, then LLM-generated code is optimizing only a small fraction of the total work of programming. That seems to be similar to the point this blog is making.

[1] https://www.goodreads.com/quotes/835238-indeed-the-ratio-of-...
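That ratio can be turned into a rough Amdahl's-law-style estimate. A hypothetical back-of-the-envelope sketch, taking the 10:1 figure at face value (function names are my own):

```python
# If reading:writing is 10:1, writing is only 1/11 of total effort.
# Even a large speedup on writing barely moves the overall total.
def total_after_speedup(read_frac: float, write_frac: float, speedup: float) -> float:
    return read_frac + write_frac / speedup

read_frac, write_frac = 10 / 11, 1 / 11
# Halving writing time shrinks total work by only ~4.5%.
print(round(total_after_speedup(read_frac, write_frac, 2), 3))  # 0.955
```

And that is before accounting for any extra reading the generated code itself demands.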


Even worse, in some cases it may be decreasing the writing time and increasing the reading time without reducing the total work.


But the reason we read code is to be able to change or extend it. If we don't need to change or extend it, the need to read it disappears too.


Unfortunately, the micro-methods his clean coding style produces end up doing the exact opposite.

Context is never close at hand; it is scattered all over the place, defeating the purpose.


No one is hiring for reading code, though. I have read roughly 10 million lines of code; I have written not one line.

Now I have produced a lot of programs, just by reading them.

People should also learn how to read programs. Most open source code is atrocious, corporate code is usually even worse, but not always.

As Donald Knuth once said, code is meant to be read. The time of literate programming is gonna come at some point, either in 100 years or in 3 years.


That ratio no longer holds if people don't look at the code; they just feed it back into a new LLM.

People used to resist reading machine-generated output: look at the code generator / source code / compiler, not at the machine code / tables / XML it produces.

That resistance hasn't gone anywhere. No one wants to read 20k lines of generated C++ nonsense that gcc begrudgingly accepted, so they won't read it. Excitingly, the code generator is no longer deterministic, and the 'source code prompt' isn't written down, so really what we've got is rapidly increasing piles of ASCII-encoded binaries accumulating in source control. Until we give up on git, anyway.

It's a decently exciting time to be in software.


IIRC JPEG2000 was never supported by any browser other than Safari, and even Safari recently gave up and removed support (around the same time they added support for JPEG XL). As to why other browsers never supported it, I'm not sure.


WebP seems pretty widely supported to me - on Windows, at least: Explorer shows thumbnails for them, Paint can open them, and other editors like Paint.NET have built-in support. I haven't come across software that doesn't support WebP for a while.


Google Docs, of all things, does not support WebP. Preview on Mac can open it but not edit it. Those are my two most common use cases.


I celebrated the anniversary of the (internal) bug asking for SVG support in Google slides. I think it's up to 15 years now?

So, uh, don't get your hopes up.


Well, SVGs I understand being harder to support; those aren't really images, and various anti-injection security rules treat them as untrusted HTML code.


There is a workaround for using SVG in Google Slides by using Google Drive to convert to EMF (a format I’ve never heard of anywhere else). It’s a pain, though.

https://graphicdesign.stackexchange.com/questions/115814/how...


Huh, first I've heard of EMF too.


It seems so strange to me that this is so hard to add: I wrote a userscript recently to extract titles from slides, and if I recall, Google Slides already renders in an SVG format. I wonder what's going on here.


Right, so on Linux/KDE.

Is missing WebP support a meme?


Yep, on GNOME we have both eog and GIMP, which support WebP completely and have for many years. I don't think I've even tried with other apps, but I haven't needed to. I didn't even realize this was a problem for some platforms.

