[SIP] Proposal for Migrating from npm + Lerna to pnpm
Motivation
Apache Superset’s frontend currently relies on npm for dependency management and Lerna for monorepo publishing and workspace organization. While Lerna has served us well, it has experienced periods of limited maintenance and slower updates. Additionally, we still face challenges with large `node_modules` directories, long install times, and complex multi-package workflows.
pnpm offers a compelling alternative that consolidates both package management and monorepo workspace features in a single tool. Some key benefits include:
- Smaller `node_modules` footprint: pnpm’s content-addressable store greatly reduces disk usage.
- Faster installations: pnpm’s symlink-based architecture speeds up the dependency installation process.
- Actively maintained: pnpm is developed by its own active open-source community (it is a separate project from npm), with ongoing development and timely fixes.
- Built-in workspace management: pnpm can replace Lerna’s monorepo features without additional overhead.
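To illustrate why a content-addressable store saves disk space, here is a toy sketch in TypeScript. It is not pnpm's implementation: the `ContentStore` class and the simple djb2-style hash are assumptions made for illustration only. The idea it shows is that files are keyed by a hash of their contents, so a file shared by several projects is kept once and merely referenced from each of them.

```typescript
// Dependency-free stand-in hash (djb2-style); NOT the hash pnpm uses.
function toyHash(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical store: one blob per distinct content, however often it is added.
class ContentStore {
  private blobs: { [key: string]: string } = {};
  private count = 0;

  // Returns the content address; stores the blob only if it is new.
  add(content: string): string {
    const key = toyHash(content);
    if (!Object.prototype.hasOwnProperty.call(this.blobs, key)) {
      this.blobs[key] = content;
      this.count += 1;
    }
    return key;
  }

  // Number of distinct blobs currently held.
  size(): number {
    return this.count;
  }
}

// Two hypothetical projects that both contain the same helper file.
const store = new ContentStore();
const projectA = ["export const a = 1;", "shared helper v1"].map((f) => store.add(f));
const projectB = ["export const b = 2;", "shared helper v1"].map((f) => store.add(f));

// Identical contents map to the same key, so the shared file is stored once.
console.log(projectA[1] === projectB[1]); // true
console.log(store.size());
```

In a real install, each project's `node_modules` would link to the stored files instead of holding its own copies.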
Description of the problem to be solved
- Large `node_modules` directories: Both local development environments and CI pipelines suffer from bloated install footprints.
- Long install times: As dependencies grow, `npm install` becomes slower and more resource-intensive.
- Maintenance overhead: Using two tools (npm and Lerna) for monorepo management can lead to redundant configuration and potential version mismatches.
- Future-proofing: Lerna’s maintenance status has fluctuated, creating uncertainty for the project’s long-term needs.
Proposed Change
- Adopt pnpm Workspaces
  - Migrate from Lerna’s monorepo setup (`lerna.json` and associated scripts) to a `pnpm-workspace.yaml` configuration.
  - Remove or deprecate Lerna-specific commands in favor of pnpm’s built-in workspace features.
- Replace npm with pnpm
  - Update all `npm install` and `npm run` scripts to use pnpm.
  - Validate and adjust any scripts or hooks to ensure they function under pnpm.
- Integrate Changesets for Versioning & Publishing (optional but recommended)
  - If we still want automated changelog generation and version bumping, incorporate Changesets.
  - Configure pnpm to run Changesets during CI to publish packages.
- Update CI/CD
  - Switch CI steps from `npm install` to `pnpm install`.
  - Evaluate caching strategies (e.g., caching the pnpm store) to maximize build performance.
- Documentation
  - Provide clear migration steps for developers (e.g., uninstall Lerna globally if used, install pnpm, and switch to `pnpm` commands).
  - Update any references to Lerna or npm in project documentation, READMEs, and onboarding guides.
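As a rough sketch of the first step, the package globs in `lerna.json` could be carried over into a `pnpm-workspace.yaml` body along these lines. The `toPnpmWorkspaceYaml` helper and the example globs are hypothetical, and the fallback to `packages/*` assumes Lerna's traditional default; Superset's actual configuration may differ.

```typescript
// Shape of the only lerna.json field this sketch reads.
interface LernaConfig {
  packages?: string[];
}

// Render a minimal pnpm-workspace.yaml body from Lerna's package globs.
// Assumes the globs contain no double quotes.
function toPnpmWorkspaceYaml(lerna: LernaConfig): string {
  // Assumption: fall back to Lerna's traditional default glob when none are set.
  const globs =
    lerna.packages && lerna.packages.length > 0 ? lerna.packages : ["packages/*"];
  const lines = ["packages:"];
  for (const glob of globs) {
    // Quote each glob so YAML does not misread "*" or "!" characters.
    lines.push(`  - "${glob}"`);
  }
  return lines.join("\n") + "\n";
}

// Hypothetical lerna.json contents, for illustration only.
const exampleLerna: LernaConfig = { packages: ["packages/*", "plugins/*"] };
console.log(toPnpmWorkspaceYaml(exampleLerna));
// packages:
//   - "packages/*"
//   - "plugins/*"
```

Other `lerna.json` settings (versioning mode, publish options) have no direct counterpart in the workspace file and would need to be handled separately, for example through Changesets.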
New or Changed Public Interfaces
- No direct impact on Superset’s REST endpoints, dashboards, or CLI is anticipated.
- Developer-facing scripts (e.g., `lerna publish` or `npm run build`) will be replaced by `pnpm run publish` or `pnpm run build`, necessitating documentation updates.
New Dependencies
- pnpm
  - License: MIT
  - Actively maintained by its own open-source community (a separate project from npm).
- (Optional) Changesets
  - License: MIT
  - Actively maintained and widely adopted for multi-package versioning and changelog generation.
Migration Plan and Compatibility
- Local Environment
  - Developers will install pnpm globally (e.g., via `corepack enable pnpm` or another installation method).
  - Replace Lerna commands (e.g., `lerna bootstrap`, `lerna publish`) with pnpm equivalents (`pnpm install`, `pnpm publish`).
- CI Environments
  - Update pipelines to install pnpm and run `pnpm install` instead of `npm install` or `lerna bootstrap`.
  - Validate build scripts, tests, and publishing flows under pnpm.
- Backward Compatibility
  - Removing Lerna does not affect final build artifacts or runtime usage.
  - The main changes are internal to developer workflows and CI processes.
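The command replacements listed above could be applied to `package.json` scripts with a small helper like the following. This is only a sketch: `toPnpmCommand`, `migrateScripts`, and the sample scripts are hypothetical, the mapping table contains just the pairs named in this plan, and anything unrecognized is left untouched for manual review.

```typescript
// Command mappings taken from the migration plan; nothing else is rewritten.
const EXACT_REPLACEMENTS: { [legacy: string]: string } = {
  "lerna bootstrap": "pnpm install",
  "lerna publish": "pnpm publish",
  "npm install": "pnpm install",
};

// Translate a single legacy command, or return it unchanged if unknown.
function toPnpmCommand(command: string): string {
  const trimmed = command.trim();
  if (Object.prototype.hasOwnProperty.call(EXACT_REPLACEMENTS, trimmed)) {
    return EXACT_REPLACEMENTS[trimmed];
  }
  // "npm run <script> ..." keeps its script name and arguments.
  if (trimmed.indexOf("npm run ") === 0) {
    return "pnpm run " + trimmed.slice("npm run ".length);
  }
  return trimmed;
}

// Rewrite every entry of a package.json "scripts" object.
function migrateScripts(scripts: { [name: string]: string }): { [name: string]: string } {
  const migrated: { [name: string]: string } = {};
  for (const name of Object.keys(scripts)) {
    // Handle chained commands such as "npm run a && npm run b".
    migrated[name] = scripts[name].split(" && ").map(toPnpmCommand).join(" && ");
  }
  return migrated;
}

// Hypothetical scripts, for illustration only.
console.log(
  migrateScripts({
    bootstrap: "lerna bootstrap",
    build: "npm run clean && npm run compile",
    test: "jest",
  })
);
// { bootstrap: 'pnpm install', build: 'pnpm run clean && pnpm run compile', test: 'jest' }
```

Scripts that forward extra arguments (for example with `--`) or use other Lerna subcommands should be checked by hand, since their behavior under pnpm may not be a one-to-one match.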
Rejected Alternatives
- Continue using npm + Lerna
  - Maintains the status quo but does not address large `node_modules`, slower install times, or the uncertain maintenance status of Lerna.
- Nx or Turborepo
  - Both offer monorepo solutions, but they either introduce additional layers of complexity or lack out-of-the-box publishing features.
By migrating from npm + Lerna to pnpm, Apache Superset can simplify its monorepo workflows, reduce disk usage, speed up installation times, and rely on a single, well-maintained solution for both dependency management and workspace organization. This proposal aims to streamline developer workflows and position Superset for future growth and maintenance.
32692
Comment From: rusackas
Thank you for kicking this off! I definitely support simplification and ridding ourselves of `lerna`!
One thing I've been wondering about (which we should consider as an alternative) would be using `yarn` (on v4 at the moment). It's also pretty fast/clean/easy, and works with `changesets` as well, and supports workspaces and publishing. We might want to add it to the "alternatives" section for good measure.
One thing I've been enjoying lately about `yarn` is the `overrides` capabilities it offers over `npm`, to help keep subdependencies free from CVE exposure. It sounds like `pnpm` has nearly identical capabilities, so that's cool. It even cleans up after itself when the top-level dependency no longer needs the override.
I was GPT-ing again, and weighing pros and cons:
I did a little digging on the "Cross-Project Linking" aspects, and it does sound like `pnpm` will be an excellent fit for working with our monorepo packages, as well as developing (and migrating) monorepo pieces around as we work toward our new extensions architecture.
I believe I'll be a +1 on this... let me know if you want help moving it forward with discussion/voting.
Comment From: rusackas
I think this is a bit stuck in the SIP process here, and needs to be brought up for a [DISCUSS] thread on the dev@superset.apache.org mailing list. Let me know here (or on Slack) if you want any help with that. Thanks!
Comment From: rusackas
Opening the discussion... I think this fits well with the Extensions Architecture and Documentation Portal SIPs, as we can (seemingly/hopefully) use this new pnpm setup to automate the versioning of various sections of documentation along with the packages themselves.
Comment From: rusackas
This only has one vote so far. Anyone for or against this? Someone at Town Hall brought up the idea of `bun` as an alternative, which might be faster, but it's also the newer/shiny thing. I'm open to it, but like tools that have a large community of support, when we're picking something that needs to last for years.
Comment From: rusackas
Then again... just looked at `bun`'s repo, and it is quite popular/active/supported. I'll update my comparison grid and take that into consideration. The vote remains open in the meantime. I'll roll it back to [DISCUSS] if it changes my mind ;)
Comment From: rusackas
Been meaning to circle back to this for AGES now. Threw my friend Claude at the problem, and here's what it has to say about bun vs pnpm:
pnpm vs Bun Comparison for Apache Superset
Aspect | pnpm | Bun |
---|---|---|
Monorepo Support | ✅ Mature workspace protocol, workspace:* for internal deps | ✅ Built-in workspaces (since v1.0), less battle-tested |
Installation Speed | 2-3x faster than npm | 10-100x faster than npm |
Disk Usage | 50-70% space savings via hard links | Standard node_modules, but faster writes |
Publishing Workflow | ✅ Excellent - changesets integration, recursive publish | ❌ Limited - no changeset equivalent, basic npm publish |
Dependency Strictness | ✅ Prevents phantom deps, strict by default | ⚠️ More permissive, follows npm model |
Package Overrides | ✅ .pnpmfile.cjs for patches/overrides | ⚠️ Basic overrides in package.json |
Filtering/Scoping | ✅ Powerful --filter flag for selective ops | ✅ Workspace filtering available |
Node.js Compatibility | 100% (is a package manager for Node) | ~90% Node API coverage |
Production Maturity | ✅ Used by Vue, Microsoft, proven at scale | ⚠️ Newer, less proven in large monorepos |
CI/CD Integration | ✅ Established patterns, wide support | ⚠️ Emerging patterns, less documentation |
Peer Deps Handling | ✅ Auto-install with configurable strictness | ✅ Auto-installs peer dependencies |
Windows Support | ✅ Full support (with some symlink caveats) | ⚠️ Experimental Windows support |
Native Addons | ✅ Full support for Node native addons | ⚠️ Incomplete N-API coverage |
Runtime Included | ❌ Package manager only, needs Node.js | ✅ All-in-one runtime + package manager |
TypeScript | Via Node + ts-node/tsx | ✅ Native TypeScript execution |
Debugging Tools | ✅ Mature ecosystem | ⚠️ Limited debugging tools |
Migration Effort | Medium - import command, adjust for symlinks | Medium-High - runtime changes, compat fixes |
Risk Level | Low - community governed, stable | Medium - single company, rapid changes |
Best For | Production monorepos needing publishing | Greenfield projects, speed-critical builds |
Bottom Line for Apache Superset
Go with pnpm because:
- Publishing workflows are critical for Superset's multi-package releases, and pnpm has mature tooling here
- Proven stability with large monorepos (Vue.js uses it successfully)
- Dependency strictness helps manage Superset's complex dependency tree
- .pnpmfile.cjs enables patching problematic dependencies

Consider Bun later (12-18 months) when:
- Publishing workflow tools mature
- More monorepos prove it in production
- Node.js compatibility reaches 95%+
- You need dramatic CI speedups (could be 10x faster builds)
The main tradeoff: pnpm gives you production-ready monorepo management today, while Bun offers revolutionary speed but with ecosystem gaps that could block Superset's publishing needs.