More

firesteelrain · 2026-01-02T07:54:14 1767340454

I didn’t know GitHub had discussions. This is great

firesteelrain · 2026-01-01T23:55:37 1767311737

This is what I came to say. We pre cache dependencies into an approved baseline image. And we cache approved and scanned dependencies locally with Nexus and Lifecycle.

firesteelrain · 2026-01-01T20:23:26 1767299006

HSM or TPM?

wolvoleo · 2026-01-01T21:53:08 1767304388

A TPM is a form of HSM (Hardware Security Module).

HSMs come in all sizes, from a chip in your phone (secure element) or even a dedicated part of a SoC chip, to a big box in a datacenter that can handle tons of requests per second.

The idea is having dedicated hardware to protect the private key material. This hardware can execute signing operations, so it can use the key but it can't share the key material itself. It is usually also physically hardened with techniques to extract said keys, like sidechannel attacks based on power draw, X-ray inspection, decapping etc.

firesteelrain · 2026-01-01T23:26:38 1767309998

Thanks - I know the difference

This also sounds very AI-like

wolvoleo · 2026-01-02T00:01:22 1767312082

I'm not AI and I didn't use it for that, I just thought it was a genuine question and tried to explain it clearly :)

I don't really get why anyone would let an AI put random comments on discussions anyway but that's another story.

EPWN3D · 2026-01-02T01:22:42 1767316962

The story implies that these are signing keys, so there is no reason for the private halves to be present in the product's silicon in any form. If these were encryption keys stored in a TPM, they'd have been extracted not leaked.

tosti · 2026-01-01T21:24:44 1767302684

Hypothetically Secure Memory

(I guess)

firesteelrain · 2026-01-01T00:59:36 1767229176

The capsule (?) for humans riding to the ISS looks reminiscent of the 1960s too

JumpCrisscross · 2026-01-01T01:07:46 1767229666

> capsule (?) for humans riding to the ISS looks reminiscent of the 1960s too

Dragon 2 [1] looks like the Apollo and Gemini craft for the same reason it resembles Soyuz 3 [2]. Crewed disposable atmospheric reëntry vehicles launched in cylinders and soft landed or splashed down under parachutes are going to look similar.

[1] https://upload.wikimedia.org/wikipedia/commons/4/4e/Iss071e0...

[2] http://www.spacepatches.nl/soyuz/soyuz3.html

firesteelrain · 2026-01-01T00:57:42 1767229062

I believe that might be Law 16:

“The previous people who did a similar analysis did not have a direct pipeline to the wisdom of the ages. There is therefore no reason to believe their analysis over yours. There is especially no reason to present their analysis as yours.”

firesteelrain · 2026-01-01T00:52:21 1767228741

Sounds an awful like The Right to be Forgotten under GPDR Article 17

scsh · 2026-01-01T00:56:28 1767228988

Absolutely. What sound pretty cool, and different, here is CalPrivacy would be required to build a request mechanism that's one request sent to every data broker.

mikestorrent · 2026-01-01T00:59:40 1767229180

Dare I ask, what happens to data brokers that don't care about Californian laws? Must be many such instances operating from outside the USA?

scsh · 2026-01-01T01:04:33 1767229473

They open themselves up to a lot of risk, but more likely they only comply when CA residents are concerned or stop collecting for CA residents. Good question about outside the USA. Makes me wonder if there may end up being some sort of data broker safe havens setup, like we've seen with banking.

ofalkaed · 2026-01-01T01:09:58 1767229798

California will take them to court and/or block them from doing business in the state, have various ways to penalize them, etc. California is big enough that many will want to play game with them and having a state as powerful as California on board will get other states to jump on board and pass their own legislation and take up the same tactics with non-complying companies. Once it gets enough traction at the state level, the fed will step in because this will affect interstate commerce and that is federal jurisdiction. This is how state sovereignty works, it is not that states can do as they please, they can only do it up until the point it affects other states or crosses the line with federal law.

JumpCrisscross · 2026-01-01T01:43:46 1767231826

> Sounds an awful like The Right to be Forgotten under GPDR Article 17

Does DROP let you censor search records?

I’d encourage anyone in Europe to compare California’s CCPA to the EU’s GDPR. It was inspired by the latter, and fixes a lot of its problem. (The Swiss referendum system was based on learning from and improving on California’s.)

userbinator · 2026-01-01T01:13:33 1767230013

More like The Right to Rewrite History

firesteelrain · 2026-01-01T00:08:21 1767226101

Funding could help, but it still requires PyPI/Warehouse to ship and operate a new public search interface that is safe at internet scale.

coldtea · 2026-01-01T00:51:25 1767228685

They operate a public package hosting interface, how is a search one any harder?

miketheman · 2026-01-01T01:36:10 1767231370

PyPI responses are cached at 99% or higher, with less infrastructure to run.

Search is an unbounded context and does not lend itself to caching very well, as every search can contain anything

bastawhiz · 2026-01-01T02:02:42 1767232962

Pypi has fewer than one million projects. The searchable content for each package is what? 300 bytes? That's a 200mb index. You don't even need fancy full text search, you could literally split the query by word and do a grep over a text file. No need for elasticsearch or anything fancy.

And anyway, hit rates are going to be pretty good. You're not taking arbitrary queries, the domain is pretty narrow. Half the queries are going to be for requests, pytorch, numpy, httpx, and the other usual suspects.

froh · 2026-01-01T04:03:12 1767240192

I wonder how a PyPi search index could be statically served and locally evaluated on `pip search`?

firesteelrain · 2026-01-01T04:25:20 1767241520

PyPI servers would have to be constantly rebuilding a central index and making it available for download. Seems inefficient

ptx · 2026-01-01T16:15:35 1767284135

Debian is somehow able to manage it for apt.

firesteelrain · 2026-01-01T17:17:53 1767287873

1. Debian is local first via client side cache

2. apt repositories are cryptographically signed, centrally controlled, and legally accountable.

3. apt search is understood to be approximate, distro-scoped, and slow-moving. Results change slowly and rarely break scripts. PyPI search rankings change frequently by necessity

4. Turning PyPI search into an apt-like experience would require distributing a signed, periodically refreshed global metadata corpus to every client. At PyPI’s scale, that is nontrivial in bandwidth, storage, and governance terms

5. apt search works because the repository is curated, finite, and opinionated

froh · 2026-01-01T18:52:09 1767293529

isn't this an incrementally updatable tree that is managed with a Merkle tree? git-like, essentially?

firesteelrain · 2026-01-01T19:21:22 1767295282

The install side is basically Merkle-friendly (immutable artifacts, append-only metadata, hashes, mirrors). Search isn’t. Search results are derived, subjective, and frequently rewritten (ranking tweaks, spam/malware takedowns, popularity signals). That’s more like constantly rebasing than appending commits.

You can Merklize “what files exist”; you can’t realistically Merklize “what should rank for this query today” without freezing semantics and turning CLI search into a hard API contract.

froh · 2026-01-02T08:01:48 1767340908

are you saying PyPi search is spammed o-O ?

froh · 2026-01-01T18:48:05 1767293285

that depends on how it can be downloaded incrementally.

woodruffw · 2026-01-01T04:55:07 1767243307

The searchable context for a distribution on PyPI is unbounded in the general case, assuming the goal is to allow search over READMEs, distribution metadata, etc.

(Which isn’t to say I disagree with you about scale not being the main issue, just to offer some nuance. Another piece of nuance is the fact that distributions are the source of metadata but users think in terms of projects/releases.)

bastawhiz · 2026-01-01T15:34:52 1767281692

> assuming the goal is to allow search over READMEs, distribution metadata, etc.

Why would you build a dedicated tool for this instead of just using a search engine? If I'm looking for a specific keyword in some project's very long README I'm searching kagi, not npm.

I'd expect that the most you should be indexing is the data in the project metadata (setup.py). That could be unbounded but I can't think of a compelling reason not to truncate it beyond a reasonable length.

woodruffw · 2026-01-01T15:43:15 1767282195

You would definitely use a search engine. I was just responding to a specific design constraint.

(Note PyPI can’t index metadata from a `setup.py` however, since that would involve running arbitrary code. PyPI needs to be given structured metadata, and not all distributions provide that.)

coldtea · 2026-01-01T17:42:48 1767289368

>The searchable context for a distribution on PyPI is unbounded in the general case, assuming the goal is to allow search over READMEs, distribution metadata, etc.

Even including those, it's what? Sub-20-30GB.

Kwpolska · 2026-01-01T10:42:38 1767264158

How does the big white search box at https://pypi.org/ work? Why couldn’t the same technology be used to power the CLI? If there’s an issue with abuse, I don’t think many people would mind rate limiting or mandatory authentication before search can be used.

firesteelrain · 2026-01-01T13:42:35 1767274955

The PyPI website search is implemented using a real search backend (historically Elasticsearch/OpenSearch–style infrastructure) layered behind application logic on Python Package Index. Queries are tokenized, ranked, filtered, logged, and throttled. That works fine for humans interacting through a browser.

The moment you expose that same service to a ubiquitous CLI like pip, the workload changes qualitatively.

PyPI has the /simple endpoint that the CDN can handle.

It’s PyPI philosophy that search happens on the website and pip has aligned to that. Pip doesn’t want to make a web scraper understandably so the function of searching remains disabled

bastawhiz · 2026-01-01T02:03:12 1767232992

Pypi has a search interface on their public website, though?

BiteCode_dev · 2026-01-01T08:49:48 1767257388

If you really need it, they publish a dump regularly and you can query that.

For simple use cases, you have the web search, and you can curl it.

firesteelrain · 2025-12-31T22:37:00 1767220620

Care to share links to your books?

firesteelrain · 2025-12-31T20:16:52 1767212212

I came away with the same impression. I was less blaming the publisher and more about life getting in the way with the author

firesteelrain · 2025-12-31T05:29:22 1767158962

Opal isn’t just peeking at Drive, it’s using it as its backend. Outputs get saved as Drive files so they persist, can be shared, and open in Docs, etc

The gotcha is the permission scope can be pretty broad (read/write or metadata across Drive), so it’s worth checking what you actually granted.