Transitioning from Polyrepos to a Monorepo

TL;DR

You will trade the inconveniences of polyrepos for extra tooling work around your monorepo CI/CD pipeline.

We recently made the transition to a monorepo after months of repeatedly asking ourselves “why not?”; had we been able to kill the idea, we would have saved ourselves the unnecessary work. The motivation behind this post is to share our experience and hopefully help others who are either contemplating the change or already in the trenches, so to speak, while steering clear of the war being waged on the interwebs over whether “to monorepo or not to monorepo” 🤣.

Our initial setup consisted of our web and mobile repos plus an auxiliary repo for our universal component library, affectionately dubbed knack-pack. Eventually we centralized our utility functions into one repo and then consolidated our cross-platform chat functionality into yet another. These repos appeared separate on the surface, yet underneath they were tightly interconnected, which compelled us to entertain an alternative approach.

The following were pain points we faced with our old polyrepo setup and lend context to our decision to ultimately jump ship:

Multiple PRs were required to add new features and/or bugfixes (e.g. changes to our “Universal Components” UI package required a new publication before those changes could be used in the web/mobile app). Not surprisingly, this slowed down the overall acceptance of our PRs while also introducing the annoying requirement of using yarn link during development. It also allowed for unforeseen issues, where even the smallest change in a UI component could have cascading effects across our codebase. To put it bluntly, we had no real confidence in assessing the impact of a change holistically.
Inconsistent tooling across our internally published packages created unnecessary work duplicating configs with only slight variations (e.g. Storybook and Rollup).
It became difficult to trace which revisions remedied a particular bug; we were forced to traverse the various package updates in the commits to pinpoint the exact patch. Time spent in each rabbit hole was time not spent being productive. Moreover, git blame annotations showing a full changeset of all related modifications were not possible.
It was not clear if breaking changes were introduced on publication of internal packages since we did not follow semantic versioning internally (not that the JS ecosystem is well known for its strict adherence).
Code exploration between packages was less than ideal, since the code lived in separate repositories.

Our Solution

To be clear upfront, we did not create a monorepo for the entire company, only for the packages where it made sense. With that out of the way, we chose lerna as it is the most prominent tool for “managing JavaScript projects with multiple packages”; it is essentially a task runner that lets commands be run across, or scoped to, your project packages. We do not use lerna to create versioned “releases”. Yarn Workspaces handles npm dependency management across all our packages. Below is the folder structure we decided on:

├── apps
│   ├── mobile
│   └── web
└── packages
    ├── chat
    ├── graphql
    ├── logic
    └── ui

To facilitate running scripts across all our packages, our root package.json looks like this:

  ...
  "husky": {
    "hooks": {
      "pre-commit": "lerna run --concurrency 1 --stream precommit"
    }
  },
  "scripts": {
    ...
    "lint": "lerna run lint --parallel --stream",
    "start:web": "lerna run start --scope=@knack/web --stream",
    "start:mobile": "lerna run start --scope=@knack/mobile --stream",
    "test": "lerna run test --parallel --stream",
    "test:ci": "lerna run test:ci --parallel"
  },
  ...
  "workspaces": [
    "apps/*",
    "packages/*"
  ]

Each package follows the naming convention @knack/pkg_name, which is what lerna uses to label its output streams and tell packages apart.

Challenges Encountered

No new adventure is without its mishaps. A good chunk of our time was spent bringing in the React Native codebase. At the time, we were running an older version of RN (v0.55.4) which required Babel 6, while the rest of our packages used Babel 7. Furthermore, the RN Metro Bundler would not follow the symlinks to our packages that lerna created (a well-known and long-standing issue in the community). Lastly, Gradle and Xcode packages could not handle having multiple node_modules folders to resolve within. We solved these complications by explicitly telling Yarn Workspaces not to hoist any packages in the mobile app's package.json:

{
  "name": "@knack/mobile",
  ...
  "workspaces": {
    "nohoist": [
      "**"
    ]
  },
  ...
}

The most irritating part of the entire process, bar none, was our usage of the babel-plugin-module-resolver plugin. We had to specify in each of our babel.config.js files that it should not traverse up the tree and use the last config file it found, but rather use the closest one, per the docs. Then, our root .eslintrc.js needed to resolve correctly, which is unfortunately something that both the Atom and Visual Studio Code plugins did not respect. Yup… JS tooling for the win!
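For illustration, a per-package babel.config.js along the following lines is one way to express that. Treat it as a sketch under assumptions: it takes the setting in question to be the plugin's cwd option, and the root and alias values are hypothetical placeholders, not our real ones. Presets are omitted for brevity.

```javascript
// babel.config.js inside a single package, e.g. packages/ui (sketch only).
module.exports = {
  plugins: [
    [
      'module-resolver',
      {
        // Assumption: this is the option the paragraph above refers to.
        // 'babelrc' resolves paths relative to the closest Babel config for
        // the file being compiled rather than the process working directory.
        // 'packagejson' is the documented alternative that anchors on the
        // closest package.json instead.
        cwd: 'babelrc',
        // Hypothetical values: adjust to the package's own layout.
        root: ['./src'],
        alias: {
          '@components': './src/components',
        },
      },
    ],
  ],
};
```

Whether 'babelrc' or 'packagejson' is the better anchor likely depends on the plugin version and on which config files each package actually ships, so it is worth checking both against the plugin's docs.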

Testing

The hardest part about testing a monorepo is the Directed Acyclic Graph (DAG) problem: given the following changes in a PR, what are the fewest tests required to run across all the packages in the monorepo?

Jest has a --findRelatedTests flag, but it does not seem well suited to solving this problem. We are still working through this, so if you have a solution, please reach out to us! Presently, our CI pipeline runs all tests on all packages, which is far from ideal and makes our CI builds slow.
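To make the problem concrete, here is a minimal sketch of the graph side of it: given which packages a PR touched, walk the dependency graph to find every package whose tests could be affected. It is an illustration only, not code from our pipeline. The function name affectedPackages is made up, and the dependency edges below are hypothetical; in practice they would be read from each package.json.

```javascript
// Hypothetical internal dependency map: package -> internal packages it uses.
// The edges are invented for illustration and do not describe our real graph.
const dependencies = {
  '@knack/web': ['@knack/ui', '@knack/chat', '@knack/graphql', '@knack/logic'],
  '@knack/mobile': ['@knack/ui', '@knack/chat', '@knack/graphql', '@knack/logic'],
  '@knack/chat': ['@knack/ui', '@knack/graphql'],
  '@knack/ui': ['@knack/logic'],
  '@knack/graphql': [],
  '@knack/logic': [],
};

// Returns the changed packages plus everything that depends on them,
// directly or transitively, sorted by name.
function affectedPackages(deps, changed) {
  // Invert the graph: package -> packages that depend on it directly.
  const dependents = new Map();
  for (const [pkg, pkgDeps] of Object.entries(deps)) {
    for (const dep of pkgDeps) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(pkg);
    }
  }

  // Breadth-first walk outward from each changed package.
  const affected = new Set(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of dependents.get(current) || []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return [...affected].sort();
}

console.log(affectedPackages(dependencies, ['@knack/ui']));
// → [ '@knack/chat', '@knack/mobile', '@knack/ui', '@knack/web' ]
```

With these made-up edges, a change to @knack/ui skips the @knack/graphql and @knack/logic test suites, while a change to @knack/web only requires its own tests. This narrows the run to whole packages; picking the fewest individual test files inside each package is the harder part that remains open.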

Linting

We removed the linting stage from our CI pipeline and opted for husky pre-commit hooks that lint only staged code, since it does not make sense to lint untouched code across all packages. Configuration in a monorepo is slightly different, so here is the relevant part in the docs.
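The core idea of linting only what is staged can be sketched as follows: take the staged file paths (for example, the output of git diff --cached --name-only) and group them by the workspace package that owns them, so that only those packages need their linter run. This is a rough illustration rather than our actual hook; the function name groupStagedFilesByPackage and the sample paths are hypothetical.

```javascript
// Top-level workspace folders, matching the "workspaces" globs shown earlier.
const WORKSPACE_ROOTS = ['apps', 'packages'];

// Groups repo-relative file paths by owning package directory,
// e.g. 'packages/ui/src/Button.js' -> 'packages/ui'.
// Files outside any workspace package (such as the root package.json) are skipped.
function groupStagedFilesByPackage(stagedFiles) {
  const groups = {};
  for (const file of stagedFiles) {
    const parts = file.split('/');
    // Need at least <root>/<package>/<file> to belong to a package.
    if (parts.length < 3 || !WORKSPACE_ROOTS.includes(parts[0])) continue;
    const key = `${parts[0]}/${parts[1]}`;
    if (!groups[key]) groups[key] = [];
    groups[key].push(file);
  }
  return groups;
}

// Hypothetical staged paths, in the form git prints them.
const staged = [
  'apps/web/src/App.js',
  'packages/ui/src/Button.js',
  'packages/ui/src/Card.js',
  'package.json',
];
console.log(groupStagedFilesByPackage(staged));
```

Each key in the result is a package directory whose own lint configuration would then be applied to just the files listed under it.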

Closing thoughts

As we have pointed out, it is not all rainbows and butterflies using a monorepo. However, overall our team seems happy with the decision so monorepo for the win! Well, mostly…