Hacker News
yread
17 hours ago
on:
Show HN: 22 GB of Hacker News in SQLite
I wonder how much smaller it could get with some compression. You could probably encode "This website hijacks the scrollbar and I don't like it" comments into just a few bits.
Rendello
16 hours ago
The hard-coded dictionary wouldn't be much stranger than Brotli's:
https://news.ycombinator.com/item?id=27160590
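
For illustration, here is a minimal sketch of the preset-dictionary idea using Python's standard `zlib` module rather than Brotli itself. The `PRESET_DICT` contents and the `compress`/`decompress` helper names are assumptions made up for this example; a real dictionary would be built from phrases that actually recur in the corpus.

```python
import zlib

# Hypothetical preset dictionary: one stock phrase borrowed from the thread.
PRESET_DICT = b"This website hijacks the scrollbar and I don't like it"


def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate `data`, optionally priming the compressor with `zdict`."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate `blob`; the same `zdict` used for compression must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


comment = b"This website hijacks the scrollbar and I don't like it."
plain = compress(comment)
primed = compress(comment, PRESET_DICT)

# Round-trip check, then compare sizes: raw, deflate, deflate + dictionary.
assert decompress(primed, PRESET_DICT) == comment
print(len(comment), len(plain), len(primed))
```

The catch is the same as with any hard-coded dictionary: the reader has to ship the exact same bytes, so the dictionary effectively becomes part of the format.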
maxbond
8 hours ago
You can use a BPE variant like SentencePiece to identify these patterns rather than hard-coding them.
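
A rough sketch of what that looks like, as a toy character-level byte-pair-encoding loop in plain Python. This is not SentencePiece's actual implementation or API; `learn_bpe_merges` and the three-comment `corpus` are hypothetical and exist only to show how recurring phrases get folded into ever longer tokens.

```python
from collections import Counter


def learn_bpe_merges(corpus, num_merges):
    """Greedily merge the most frequent adjacent symbol pair, up to num_merges times."""
    # Every text starts out as a list of single characters.
    seqs = [list(text) for text in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so another merge gains nothing
        merges.append((a, b))
        merged = a + b
        new_seqs = []
        for seq in seqs:
            out, i = [], 0
            while i < len(seq):
                # Replace each non-overlapping occurrence of (a, b) with one symbol.
                if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                    out.append(merged)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_seqs.append(out)
        seqs = new_seqs
    return merges, seqs


# Made-up comments that share a stock phrase.
corpus = [
    "this website hijacks the scrollbar",
    "this website hijacks the scrollbar and I don't like it",
    "great post, but this website hijacks the scrollbar",
]
merges, encoded = learn_bpe_merges(corpus, num_merges=200)

# Symbol counts before and after: the shared phrase collapses into few tokens.
for text, seq in zip(corpus, encoded):
    print(len(text), "->", len(seq))
```

The learned merge list plays the role of the dictionary, except that it is derived from the data instead of being written by hand.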
jacquesm
16 hours ago
That's at least 45%, then you can leave out all of my comments and you're left with only 5!
hamburglar
11 hours ago
It might be a neat experiment to use AI to produce canonicalized paraphrasings of HN arguments, so they could be compared directly and would compress well.
rossant
7 hours ago
Guilty.