PyPI Was Subpoenaed


@snapcaster 6d
Really weird, anyone have some inside gossip on what this is about?
@wongarsu 6d
> We have waited for the string of subpoenas to subside, though we were committed from the beginning to write and publish this post as a matter of transparency, and as allowed by the lack of a non-disclosure order associated with the subpoenas received in March and April 2023.

That's suspiciously specific. Sounds to me like they had some other subpoenas they aren't allowed to talk about.

@LordShredda 6d
I'm guessing some poor typosquatter managed to hit a gov agency and is about to get alphabet soup all over him.
@CarbonCycles 6d
What an odd article and release statement. It’s almost as if they’re signaling w-out literally signaling the parties of interest.

Surprised the DOJ didn’t issue any gag orders.

@jacquesm 6d
> We will not be releasing the usernames involved publicly or to the users themselves.

Why not to the users themselves? Have they been prohibited from doing so? (TFA does not say afaict)

@NelsonMinar 6d
Total speculation on my part, but PyPI hosts yt-dlp, the unauthorized video downloader.
@aa_is_op 6d
By the number of malicious packages that site has hosted over the past few months, this was only a matter of time.

I've lost track of the number of "white hats" that contact us with extortion requests after they used some dependency confusion attack.

@misterpigs 6d
I love this level of transparency.
@Zetice 6d
Dumb legal question; what's the difference, if any, between "We've been subpoenaed" and "Someone had a warrant for data"?
@avgcorrection 6d
> The privacy of PyPI users is of utmost concern to PSF and the PyPI Administrators, and we are committed to protecting user data from disclosure whenever possible.

Don’t lead with this.

> In this case, however, PSF determined with the advice of counsel that our only course of action was to provide the requested data.

If you’re going to say this.

I’m not judging their decision. Maybe not going to prison is a greater concern to them. It’s fine to just say that you thought it was best to comply because [lawyer reasons that you don’t have to disclose to anyone]/counsel.

EDIT: Or say “there are bad people out there and we trust the DOJ”. Whatever.

@kjkjadksj 6d
I don’t understand how the information requested is relevant at all for any purpose. Most users of PyPI merely download through pip; they aren't registering anything. Furthermore, I would think a bad actor who would register would spoof their IP and use burner accounts anyhow.
@junon 6d
Good on the PyPI folks. This is an incredibly well done disclosure, an example to be sure.
@whimsicalism 6d
> as allowed by the lack of a non-disclosure order associated with the subpoenas received in March and April 2023.

Yeah no way they haven't had other subpoenas then.

@tgbugs 6d
One theory that I don't see mentioned yet is that someone used an upload to pypi to exfiltrate data or simply as a way to upload arbitrary data somewhere. In a sense pypi is just a file hosting service, so it could have nothing to do with any actual python projects at all.
@dvt 6d
Most likely caused by phishing, ransomware, or (unlikely) crypto mining. I'd bet someone from some agency had credentials leaked due to a malicious package. Honestly, PyPI is stuck between a rock and a hard place, but having something like a "verified" badge (where someone's real identity is tied to it) for certain packages would go a long way to ensure some level of security.

The problem gets a bit hairier when dealing with dependency chains, however.

@zerealshadowban 6d
They log too much data about their users.

So they should promptly update their policies to a) stop logging so much, b) delete all past logs, and c) sharply limit the span of time until deletion of whatever logs they decide they really need to track for internal needs.

They should avoid logging, and rapidly rotate logs, to thwart future subpoenas from the total surveillance state.
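A rough sketch of what short-lived logs could look like, using Python's standard `logging` module. Nothing here reflects how PyPI actually logs: the logger name, the 7-day window, and the function name are all made up for illustration.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_short_retention_logger(path, keep_days=7):
    """Return a logger that rotates daily and keeps at most `keep_days` old files.

    Hypothetical helper: the name and the default retention are assumptions.
    """
    handler = TimedRotatingFileHandler(
        path,
        when="midnight",        # start a new file once per day
        backupCount=keep_days,  # older rotated files are deleted at rollover
        encoding="utf-8",
    )
    # Record only what operations need: no IP address or user agent fields.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger("short_retention_access")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid stacking handlers on repeated calls
    logger.addHandler(handler)
    logger.propagate = False  # keep entries out of the root logger
    return logger
```

Note that `backupCount` only prunes files when a rollover happens, so a process that rarely runs would need a separate cleanup job.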

@jehb 6d
Suggestion: Start slipping unique URLs into the "hidden" backend fields of systems where you'd like to know if your data was breached, improperly used, or handed over to a three letter agency.

Suddenly getting hits at [uuid]? At least you know somebody has looked at the data, or at the very least fed it through some processing tool that is extracting and visiting the URLs.
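For anyone who wants to try the unique-URL idea, here is a minimal sketch. The base URL is a placeholder on a reserved example domain, and the helper names and the in-memory registry are assumptions; a real setup would need a web server you control and persistent storage.

```python
import uuid

# Hypothetical endpoint you control; example.com is a reserved placeholder domain.
CANARY_BASE = "https://canary.example.com/t/"


def new_canary(label, registry):
    """Create a unique URL and remember which field or system it was planted in."""
    token = uuid.uuid4().hex  # 32 hex characters, unguessable in practice
    registry[token] = label
    return CANARY_BASE + token


def identify_hit(request_path, registry):
    """Map the path of an incoming request back to the place the URL was planted.

    Returns None when the last path segment is not a known token.
    """
    token = request_path.rstrip("/").rsplit("/", 1)[-1]
    return registry.get(token)
```

Plant a different URL per service (for example, one in a profile's homepage field on each site), and a hit on a given token tells you which copy of your data was read.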

@gjmacd 5d
I would point to Jim Jordan and all the other Republicans after January 6th who didn't honor a subpoena and toss them in the trash. Nobody in our government honors them, why should we in the private sector? What's going to happen, are they going to raid offices and get a bunch of PCs and books?
@casey2 5d
How come when PyPI hosts unwanted malware they get subpoenaed but when Apple or Microsoft or anyone else with a big team of lawyers distributes auto-installing "updates" designed to harm/scam users the DOJ is silent?
@asne11 5d
I keep seeing people trying to assure other readers that the recipients of these subpoenas have some recourse to appeal.

This is not the case if the subpoena is issued by the FISA court, otherwise known as "the court of no rejection."

@dpifke 5d
Being reminded that PyPI is a target for law enforcement makes me even more irked that they've removed end-to-end package signing before providing a replacement[0].

PGP signatures—even though rarely used—would allow someone to verify that a signed package was not modified by PyPI after being uploaded by its original author.

Without any sort of signing mechanism, we have to trust the U.S. Government to never demand that PyPI insert a backdoor, via a National Security Letter, FISA court order, or other kangaroo court process. Good luck with that.

The existing PGP signing mechanism had usability issues and security footguns, but was better than nothing. It's a shame they didn't roll out a more usable and secure alternative before removing the existing functionality.
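To be clear, the following is not PGP signing. It is a weaker stand-in, sketched under the assumption that you obtained a SHA-256 digest from the author through some channel other than the index itself. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """Check a downloaded artifact against a digest pinned out of band.

    Assumption: `pinned_hex` came from the author directly, not from the
    index that served the file; otherwise this proves nothing about tampering.
    """
    return hmac.compare_digest(sha256_of_file(path), pinned_hex.strip().lower())
```

pip's hash-checking mode (`--require-hashes`) does a similar comparison against digests listed in a requirements file, but it only protects you if those digests were recorded from a copy you already trust.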


@fijiaarone 5d
It’s ok, this government is perfect.
@krick 5d
I'd say nothing but the nickname and the list of uploaded packages (which should be public in the first place) should have been stored anyway.

It immediately reminded me that PyPI content is really trash as it is, because of all the squatting and pointless unfinished toy projects, and whatever they are logging clearly doesn't help. But I think the big problem for PyPI is one seemingly minor detail: lack of namespaces (as in of ...). It is not a solution for all sorts of malicious behaviour, of course, but it really makes things much easier. It doesn't solve typosquatting and such, but, honestly, neither does the current system, obviously. And at least it allows keeping the actual package names semantic. And which one of the countless "*/time" libraries you want, you just have to decide separately, using the number of stars on GitHub as a reference and carefully copy-pasting the id into your requirements.txt.

The same issue I have with Cargo. I mean, really, isn't it obvious that making users compete for better project names just makes everything shit?

@Aeolun 5d
Kind of makes me feel like a lot of these services should just _not_ be hosted in the US? If we had this hosted in Germany or Sweden, would the government be so casually requesting data from these registries?
@bogwog 5d
It’s nice that they’re committed to user privacy, and this post really gives me confidence that my privacy will be reasonably protected.

…but why is that a goal for PyPI? As a publisher of packages, it’s a nice-to-have, but as an end user it’s kind of scary. I don’t want to use software packages published by anonymous and potentially unaccountable people. That’s probably why they have so many malicious packages.

Maybe you live under an oppressive regime that will imprison/murder you for publishing some code; ok, but that’s an outlier, and there are a lot of ways to get around that situation.

I just don’t see the benefit of privacy in this situation? Is it just to reduce the administrative overhead of collecting/verifying identity info? I’m genuinely curious to learn about a realistic use case that justifies the risks to all users.

I know you can self host your own package index, but very few users have the resources to do that.