Could Your Builds Be Exposing You to a PyPI Supply Chain Attack via elementary-data?

April 29, 2026
by
Arjun Bhatnagar

If your pipeline auto-pulls Python deps or Docker images without strict pinning, this incident is the kind that slips in quietly. The PyPI package elementary-data (~1.1M monthly downloads) shipped a malicious release (v0.23.3) after attackers abused a GitHub Actions script-injection trick via a pull request comment, grabbed the workflow GITHUB_TOKEN, forged a signed tag, and let the project’s real release pipeline do the dirty work. The impact wasn’t just PyPI—Docker tags like :latest were also affected. If you use elementary-data in CI, treat this like a “check your builds, not your laptop” moment.

What actually happened (and why it’s scarier than a maintainer account takeover)

If you read “elementary-data PyPI supply chain attack” and assumed someone stole a maintainer’s password, you’d be missing the worst part.

This wasn’t a classic account takeover. The attacker didn’t need to log in as a maintainer at all. They abused the project’s GitHub Actions automation.

Here’s the chain, in plain English:

  1. A pull request comment kicked off the problem.
    The attacker posted a malicious comment on a PR that triggered a GitHub Actions workflow in a way that allowed script injection: in effect, the workflow ended up executing attacker-controlled shell code (see the sketch after this list).
  2. That code execution exposed the workflow’s GITHUB_TOKEN.
    In GitHub Actions, GITHUB_TOKEN is the built-in token a workflow uses to interact with the repo (creating tags, pushing commits, triggering releases, etc.). Once the attacker could run code inside the workflow context, they could grab that token.
  3. The attacker used the token to forge “legit-looking” release artifacts.
    With GITHUB_TOKEN, they forged a signed commit and tag for v0.23.3. That’s the sneaky bit: it can look normal in the repo history if your process trusts tags and signatures at face value.
  4. The real release pipeline published the malware for them.
    After the forged tag, the project’s own release workflow built and shipped the backdoored package to PyPI, and also pushed a matching container image to GitHub Container Registry (GHCR). Your CI/CD doesn’t “see” an intruder here; it sees a routine release.
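To make the first step concrete, here is a minimal sketch of the general GitHub Actions script-injection pattern this class of attack abuses. It is illustrative only, not the project’s actual workflow: interpolating untrusted input (like a PR or issue comment) directly into a run: script lets the commenter execute shell code inside the job.

    # Illustrative pattern only -- not elementary-data's real workflow.
    on:
      issue_comment:
        types: [created]

    jobs:
      respond:
        runs-on: ubuntu-latest
        steps:
          - name: Vulnerable step
            # ${{ ... }} is expanded into the script before the shell runs, so a
            # comment crafted to break out of the quotes becomes executable code.
            run: echo "${{ github.event.comment.body }}"

          - name: Safer equivalent
            # Passing the comment through an environment variable keeps it as data.
            env:
              COMMENT: ${{ github.event.comment.body }}
            run: echo "$COMMENT"

Once attacker-controlled code is running inside the job, the workflow’s GITHUB_TOKEN and anything else the runner can read are within reach.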

Why this is scarier than a maintainer account takeover

A compromised maintainer account is bad, but at least it’s a familiar failure mode: reset credentials, review access, move on.

This incident is uglier because it attacks the thing teams trust most: automation.

  • No stolen maintainer login required—the workflow did the work
  • “Signed” doesn’t always mean “safe” when the signing event was triggered by an exposed automation token
  • Build systems happily ingest the malicious release if you’re using loose pins (>= ranges) or pulling images by floating tags (we’ll get specific in a minute)

If your builds ever run GitHub Actions with permissions that can publish releases, this is the takeaway: your security boundary isn’t just humans and MFA anymore. It’s every workflow trigger, every untrusted input (like PR comments), and every token your CI runner can touch.

Where the backdoor landed: PyPI and Docker (the exact versions/tags to worry about)

Once the attacker got a “real” release published, the backdoor didn’t stay neatly inside a GitHub repo. It landed where builds actually pull from: PyPI and container registries.

Known-bad artifacts (treat these as compromised)

If you have any of the following in your build logs, lockfiles, cache layers, or runner disks, assume exposure:

  • PyPI
    • elementary-data==0.23.3 (malicious release)
  • GHCR (GitHub Container Registry)
    • ghcr.io/elementary-data/elementary:0.23.3
    • ghcr.io/elementary-data/elementary:latest (during the exposure window)

This dual-distribution piece matters. The same release workflow that uploaded to PyPI also built and pushed the Docker image, so the malicious payload reached both channels.

Why “floating” installs made this blow up in CI/CD

Most teams didn’t type pip install elementary-data==0.23.3 on purpose. They got it because automation loves defaults:

PyPI: version ranges quietly drift

If your dependencies allow movement, a short compromise window can still hit a lot of builds.

Common patterns that pull “whatever is newest”:

  • elementary-data>=0.23.0
  • elementary-data~=0.23
  • elementary-data (no pin at all)

StepSecurity’s analysis called out the predictable outcome: systems that didn’t use pinned versions pulled the backdoored build automatically.
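Concretely, the difference looks like this (a minimal sketch; 0.23.3 and 0.23.4 are the only versions named in the advisory):

    # Loose specifiers resolve to "whatever is newest" at install time.
    pip install "elementary-data>=0.23.0"   # during the window, this resolved to 0.23.3
    pip install "elementary-data~=0.23"     # drifts within the 0.x series the same way
    pip install elementary-data             # no pin at all: newest release wins

    # An exact pin only moves when a human changes it.
    pip install "elementary-data==0.23.4"   # the clean replacement release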

Docker: :latest is a trap door

Container tags work the same way. If your workflow says:

  • ghcr.io/elementary-data/elementary:latest

…you’re not choosing a version. You’re choosing a moving target. During the incident window, :latest could point at the compromised image.

Quick self-check (fast ways to confirm exposure)

  • Search CI logs for:
    • pip install elementary-data
    • elementary-data==0.23.3
    • ghcr.io/elementary-data/elementary
  • Check build caches and lockfiles for 0.23.3
  • Check container pull history for :latest and :0.23.3

If any of those turn up hits, don’t stop at “we upgraded.” The payload was built to steal secrets, and CI environments are packed with them.
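For the record, those checks boil down to a few searches (paths are placeholders for wherever your CI logs, lockfiles, and build context actually live):

    # Placeholders: point these at your real CI log export and repo checkouts.
    grep -rn "pip install elementary-data" ./ci-logs/
    grep -rn "elementary-data==0.23.3" ./ci-logs/ requirements*.txt *.lock
    grep -rn "ghcr.io/elementary-data/elementary" ./ci-logs/ .github/workflows/

    # On a build host or self-hosted runner, see what images are actually present.
    docker images --digests | grep -i elementary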

What the malware hunted for (your CI secrets are the prize)

If your build pulled the malicious release, the risk isn’t “my laptop got weird.” It’s “my CI runner just handed over the keys.”

StepSecurity’s write-up noted that the malicious release added an elementary.pth file that Python executes automatically at interpreter startup, loading a secrets stealer. That detail matters because it means you don’t need to “import elementary” in your code for the payload to run. Installation + Python startup can be enough in common setups.
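A quick way to check whether a build environment still carries that hook is to look for the file directly. This is a sketch: it assumes a standard site-packages layout and that the runner or image is still around to inspect.

    # List the site-packages directories the interpreter actually uses...
    python -c "import site; print('\n'.join(site.getsitepackages()))"

    # ...and look for the injected startup hook named in the advisory.
    find / -name "elementary.pth" -path "*site-packages*" 2>/dev/null

    # .pth files execute at interpreter startup, so a hit means the payload could
    # have run without your code ever importing elementary.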

What it actively went looking for

The target list reads like a cheat sheet of how modern teams ship software:

  • SSH keys (think: deploy keys, private repo access)
  • Git credentials (tokens, saved auth)
  • Cloud credentials for AWS / GCP / Azure
  • Kubernetes, Docker, and CI secrets (service account tokens, registry creds, pipeline secrets)
  • .env files and developer tokens
  • Crypto wallet files (multiple wallet types were listed)
  • System data like /etc/passwd, logs, and shell history

What this turns into in the real world

Upgrading the package stops future runs, but it doesn’t rewind what already leaked.

The practical outcomes usually look like this:

  • Compromised deploy keys that can still clone/push to repos after you “fixed the dependency”
  • Cloud API keys that let attackers mint new credentials, spin up infra, or pull data quietly
  • CI tokens that can tamper with builds, artifacts, or releases from inside your normal pipeline
  • Kubernetes access paths that jump from “build environment” to “cluster access” fast

If this ran in CI, assume it saw more than source code. CI tends to be the one place where secrets, signing, registries, and deploy permissions all sit in the same room.

Remediation checklist (do this in order, with zero guessing)

Once a secrets stealer runs in CI, your goal isn’t “clean up the package.” Your goal is to cut off the access paths the attacker may have copied out.

StepSecurity’s guidance for exposed users was blunt: rotate all secrets and restore environments from a known safe point if you pulled the malicious release or affected images.

1) Confirm exposure (don’t speculate)

Make a short, written list of every place this could’ve run:

  • CI systems (GitHub Actions, GitLab CI, Jenkins, Buildkite)
  • Build runners (hosted + self-hosted)
  • Docker build hosts
  • Any environment that installs Python deps during builds

What to look for:

  • elementary-data==0.23.3 in requirements.txt, lockfiles, pip logs, build output
  • container pulls/builds that referenced the project’s image tags during the window

2) Stop the bleeding

  • Freeze deployments from affected pipelines until you’ve rotated credentials.
  • Disable/revoke tokens that CI can use to publish, push, or deploy (even temporarily).

3) Upgrade and remove compromised artifacts

  • Upgrade to the clean replacement elementary-data==0.23.4
  • Purge build caches:
    • pip cache (runner-level and any shared cache)
    • Docker layer cache (builders, remote cache exporters)
  • Delete local copies of the affected container image(s) and rebuild images fresh from clean inputs
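On a runner or build host, the purge steps above look roughly like this (the rebuild target name is a placeholder):

    # Drop the pip cache so the bad wheel/sdist can't be reused.
    pip cache purge

    # Clear Docker build cache and remove the known-bad images locally.
    docker builder prune --all --force
    docker image rm ghcr.io/elementary-data/elementary:0.23.3 || true
    docker image rm ghcr.io/elementary-data/elementary:latest || true

    # Rebuild from clean inputs without reusing cached layers.
    docker build --no-cache -t your-image:rebuilt .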

4) Rotate secrets like the attacker already has them

This is the part teams under-do. Rotate anything a CI job could read, including:

  • GitHub tokens (PATs), app credentials, repo deploy keys
  • Cloud keys (AWS access keys, GCP service account keys, Azure creds)
  • Kubernetes kubeconfigs/service account tokens, registry creds
  • CI secrets stored in your platform (and any “shared” org secrets)
  • Signing keys used for releases or artifacts (if present on runners)

StepSecurity explicitly recommends rotating secrets for anyone who downloaded the malicious release or images.
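What rotation looks like varies by provider. As one concrete sketch, here is rotating a static AWS access key held by a CI user (the user name and key ID are hypothetical):

    # Hypothetical IAM user; substitute whatever identity your CI actually uses.
    aws iam list-access-keys --user-name ci-deploy-bot

    # Create the replacement, update it in your CI secret store, then retire the old key.
    aws iam create-access-key --user-name ci-deploy-bot
    aws iam update-access-key --user-name ci-deploy-bot \
        --access-key-id AKIAOLDKEYEXAMPLE --status Inactive
    aws iam delete-access-key --user-name ci-deploy-bot \
        --access-key-id AKIAOLDKEYEXAMPLE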

5) Rebuild from a known-good state (no “patch and pray”)

If a compromised job ran on a runner, treat that runner as untrusted:

  • Recreate ephemeral runners from scratch
  • For self-hosted runners: re-image or rebuild from a clean snapshot
  • Rebuild artifacts (wheels, containers) from clean sources and clean runners

This matches StepSecurity’s “restore from a known safe point” guidance.

6) Put guardrails in place so this doesn’t repeat

Keep it boring. Boring is safe.

Pin what you ship

  • Pin Python dependencies to exact versions in lockfiles
  • Prefer hash-checked installs for prod builds (so a “same version, different file” event is harder to slip in)
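One common way to get there is with pip-tools (an assumption here, not something the advisory prescribes; any hash-locking workflow works):

    # Compile an exact, hash-locked requirements file from your top-level deps.
    pip install pip-tools
    pip-compile --generate-hashes requirements.in -o requirements.txt

    # In CI, refuse anything whose hash doesn't match the lockfile.
    pip install --require-hashes -r requirements.txt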

Stop pulling :latest

  • Pin container images by immutable digest (image@sha256:...) or at least by version tag you control
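A sketch of what that looks like in practice (the tag and digest are placeholders, not vetted values):

    # Inspect the digest behind a tag you have reviewed.
    docker buildx imagetools inspect ghcr.io/elementary-data/elementary:<vetted-tag>

    # Then reference the image by digest everywhere (docker pull, FROM, compose),
    # so the reference can never silently move to a new build.
    docker pull ghcr.io/elementary-data/elementary@sha256:<vetted-digest>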

Tighten GitHub Actions permissions

  • Set the default GITHUB_TOKEN permissions to read-only unless a job truly needs write
  • Avoid workflows that can be triggered by untrusted inputs (like PR comments) with powerful tokens attached
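In workflow terms, that combination is a few lines of YAML (job and script names are placeholders):

    # Default the token to read-only for the whole workflow...
    permissions:
      contents: read

    jobs:
      release:
        runs-on: ubuntu-latest
        # ...and grant write scopes only to the single job that truly needs them.
        permissions:
          contents: write
        steps:
          - uses: actions/checkout@v4
          - run: ./scripts/release.sh   # placeholder for the real release step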

Build-time secret hygiene

  • Keep secrets out of build steps that don’t need them
  • Split identities: one for build, a separate one for deploy
  • Use short-lived credentials where you can, so a one-time leak doesn’t become a long-term foothold
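A minimal sketch of step-scoped secrets and split identities in a workflow (the secret name, build command, and deploy script are placeholders):

    jobs:
      build-and-deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          # Build and test run with no deploy credentials in their environment.
          - name: Build
            run: make build

          # The deploy credential is injected only into the one step that needs it.
          - name: Deploy
            env:
              DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}
            run: ./scripts/deploy.sh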

Reducing the blast radius next time: limit what leaked secrets can do

The uncomfortable truth from the elementary-data incident is that an attacker didn’t need to “break into prod.” They just needed one run in a privileged build environment to collect credentials at scale (SSH keys, cloud creds, Kubernetes/CI secrets, .env files, tokens).

So the win condition isn’t “we’ll never get hit again.” It’s: when something runs in CI that shouldn’t, it can’t do much damage.

1) Default to short-lived credentials

Long-lived keys are great… for attackers.

  • Prefer OIDC-based, short-lived cloud creds (GitHub Actions → cloud provider role/token) over static access keys.
  • Keep time-to-live (TTL) tight for anything that can deploy, sign, or publish.
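For GitHub Actions plus AWS, one concrete shape of this is OIDC federation: the job trades its identity token for short-lived credentials, and no static key lives in CI at all. The role ARN and region below are placeholders, and the matching IAM trust policy has to exist on the AWS side.

    permissions:
      id-token: write   # lets the job request an OIDC token
      contents: read

    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # Exchanges the job's OIDC token for temporary AWS credentials.
          - uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: arn:aws:iam::123456789012:role/ci-deploy   # placeholder
              aws-region: us-east-1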

2) Make “least privilege” real (especially in CI)

Treat every job like it might run attacker-controlled code one day—because sometimes it will.

  • Split identities:
    • Build identity: can fetch deps, run tests, build artifacts
    • Deploy identity: can push to prod, modify infra, write to registries
  • Give deploy rights only to:
    • protected branches
    • protected environments
    • manual approvals (when appropriate)

3) Stop leaving secrets lying around on runners

The StepSecurity list included .env files and developer tokens as explicit targets. That’s a hint.

Practical rules that help:

  • Don’t store long-lived secrets in .env on CI runners.
  • Don’t echo secrets into logs (even accidentally).
  • Don’t mount your whole home directory into build containers.
  • Keep secrets scoped to the one step that needs them, then wipe.

4) Assume dependency code can execute at install/startup

This incident used a mechanism that executes automatically and then hunts for secrets. Your defenses should assume “dependency == executable,” not “dependency == library.”

What to change:

  • Run builds in ephemeral environments.
  • Separate “dependency download” from “dependency execution” wherever you can.
  • Don’t let build jobs access production credentials.

5) Where Cloaked fits (secondary fallout, not core CI security)

After incidents like this, attackers don’t just grab API keys. They also scoop up account recovery surface area: the email/phone tied to dev tools, SaaS dashboards, and non-prod services.

For that specific problem, Cloaked can help in a practical way: use Cloaked identities (masked emails/phone numbers) for non-prod signups and developer tooling accounts so a leak doesn’t automatically hand over your real contact info and recovery channels. Keep it boring and consistent: real identity for core admin accounts, masked identity for everything that doesn’t need to know who you are.
