Hacker News | dgreensp's comments

A curly brace is multiple tokens? Even in models trained to read and write code? Even if true, I’m not sure how much that matters, but if it does, it can be fixed.

Imagine saying existing human languages like English are “inefficient” for LLMs so we need to invent a new language. The whole thing LLMs are good at is producing output that resembles their training data, right?


I ran into this, and there was a bizarre fix—I think having Adobe apps open in the background caused it, or something.

I saw some responses like this. I have zero Adobe apps on my Mac.

Upvoted because educational, despite the AI-ness and clickbait.

I’ve worked at orgs that used Postgres in production, but I’ve never been the one responsible for tuning/maintenance. I never knew that Postgres doesn’t merge pages or have a minimum page occupancy. I would have thought it’s not technically a B-tree if it doesn’t.


This is some of the best writing I've read in a while, and truly fascinating.


Radix sort is not a comparison-based sort and is not O(n log n).


No, because radix sort is not a comparison-based sort and is not O(n log n).
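
To make that concrete, here is a minimal sketch of an LSD radix sort, assuming non-negative integers (the function name and the `base` parameter are just illustrative choices). It never compares two elements against each other; it only drops values into buckets by digit, so the comparison-sort lower bound doesn't apply. The running time is roughly O(d * (n + base)) for d digits.

```python
def radix_sort(values, base=256):
    """LSD radix sort for non-negative integers (illustrative sketch)."""
    out = list(values)
    if not out:
        return out
    if min(out) < 0:
        raise ValueError("this sketch only handles non-negative integers")

    max_value = max(out)
    place = 1  # value of the digit position currently being processed
    while max_value // place > 0:
        # Distribute by the current digit. Appending keeps each pass stable,
        # which is what makes least-significant-digit-first ordering work.
        buckets = [[] for _ in range(base)]
        for v in out:
            buckets[(v // place) % base].append(v)
        # Concatenate buckets in digit order; no element-to-element comparison.
        out = [v for bucket in buckets for v in bucket]
        place *= base
    return out
```

The only ordering operations are the `min`/`max` scans used to validate input and count digits; the sorting itself is pure bucketing.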


This semi-explains why I have started to notice (sadly) serious bugs in TextEdit, not just scrolling but editing/corruption.


> Never place rich UI elements within a table, list, or other markdown element.

> Place rich UI elements within tables, lists, or other markdown elements when appropriate.


I think people are missing the fact that Wired has been about “vibes” since the beginning.

Wired vs. tired is literally about what’s “cool.” That’s it. It has never been rigorous about anything.


Yeah, and video games were just a way to distract kids for a few hours so parents could watch tv or read a book without being bothered.

It's become something else. Wired has a brand name and a reputation, so when they pooh-pooh something, that carries more weight than if you or I do.


Googlebot respects robots.txt. And Google doesn't use the fetched data from users of Chrome to supplement their search index (as a2128 is speculating that Perplexity might do when they fetch pages on the user's behalf).


Yes, but there's no way to say "allow indexing for search, but not for AI use", right?


But there is: https://developers.google.com/search/docs/crawling-indexing/...

There is a user agent for search that you can control in robots.txt.

    user-agent: Googlebot

There is another user agent for AI training.

    user-agent: Google-Extended
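
As a rough sketch of how those two tokens could be combined, the snippet below builds a robots.txt that allows `Googlebot` and disallows `Google-Extended`, then checks it with Python's standard-library parser. The allow/disallow policy and the `example.com` URL are assumptions made up for illustration, and `urllib.robotparser` is only a stand-in here; it is not how Google itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow search crawling, opt out of AI training.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(user_agent, url="https://example.com/some/page"):
    """Return True if the sample robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, "allowed" if is_allowed(agent) else "blocked")
```

Like everything in robots.txt, this is a request that well-behaved crawlers honor, not an enforcement mechanism.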


Wow, I had no idea this page existed, thanks for the reference!

