A little under two years ago we decided to merge all our separate internal product and library git repositories into one big shared monorepository. We hoped to achieve the following benefits:
Reduced dependency hell due to in-tree dependencies.
While the repo-per-library approach allowed for more granular versioning of our libraries, in practice it often led to release cascades: a small bug fix in a utility library would need to be released (via maven release), which in turn would mean a release for all its direct and indirect dependents. This often created a lot of churn. With the monorepo this has been greatly reduced, since there is only a single revision to manage.
Simplified documentation and traceability.
Following on from the version hell mentioned above, documenting and tracing which versions of applications, libraries and product releases belong together or are compatible added significant overhead. We developed some tooling to ease the burden, but at the end of the day it was far better not to have this issue in the first place.
Simpler forking and branch development.
Having everything in a single repository also makes it easier to branch the entire code base to develop new features, to spike out interesting ideas or to make project-specific changes. This allows for better development workflows for both experimentation and feature development. In addition, merges are easier to handle when changes occur in a single repository rather than being spread out over several.
Streamlined development and debugging experience.
Since most of our code is already checked out, it is easier to debug and edit library code. This greatly lowers the barrier to improving the quality of the code base. While the initial checkout may take a while, it is far easier to manage and keep up to date with a single repo than a baker’s dozen.
Tighter integration loop.
Our whole monorepo is built automatically courtesy of Jenkins. This means that integration issues are far more likely to pop up sooner rather than later. That in turn means these issues can be addressed and fixed as quickly as possible.
To preserve our commit history, we ended up using some git subtree magic to create our monorepo. Each single parent repository was moved to a subdirectory in the merged monorepo. Thus, the history is available for the inevitable git blame/annotate command.
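A minimal sketch of how such a merge could be scripted, assuming the standard `git subtree add --prefix=<dir> <repository> <ref>` command is used once per source repository. The repository URLs, target directories and the `subtree_add_commands` helper are hypothetical and only illustrate the idea; this is not our actual migration script.

```python
from typing import Dict, List


def subtree_add_commands(repos: Dict[str, str], branch: str = "master") -> List[List[str]]:
    """Build one `git subtree add` command per source repository.

    `repos` maps a target subdirectory in the monorepo to the URL (or
    local path) of the repository that should be imported there.
    """
    commands = []
    for prefix, url in sorted(repos.items()):
        # Without --squash, `git subtree add` brings in the source
        # repository's commits and places its files under `prefix/`.
        commands.append(["git", "subtree", "add", f"--prefix={prefix}", url, branch])
    return commands


if __name__ == "__main__":
    # Hypothetical repositories, for illustration only.
    example = {
        "libs/utils": "git@example.com:inno/utils.git",
        "products/app": "git@example.com:inno/app.git",
    }
    for cmd in subtree_add_commands(example):
        print(" ".join(cmd))
```

The generated commands are meant to be run from a clean working tree inside the new monorepo. Building them as argument lists keeps the sketch easy to review before anything touches a real repository.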
While the change has been positive overall, there are some drawbacks that should be mentioned.
Everything and the kitchen sink.
The monorepo contains a whole lot of stuff, some of which is rarely used or updated. But it still gets pulled and built every time and eats up screen real estate in the IDE. To mitigate this, we’re slowly trying to trim down some excess fat. Other mitigations include setting up IDE and build processes to focus only on a subset of the whole project tree.
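One way to focus a build on a subset of the tree, sketched here under the assumption that the monorepo is set up as a Maven multi-module project: Maven's `-pl` (projects) option selects modules and `-am` (also make) adds the modules they depend on. The module paths and the `partial_build_command` helper are hypothetical.

```python
from typing import List


def partial_build_command(modules: List[str], goal: str = "install") -> List[str]:
    """Build a Maven command line that only builds the given modules.

    -pl selects the listed modules, -am also builds the modules they
    depend on, so the result is still a consistent partial build.
    """
    if not modules:
        raise ValueError("at least one module is required")
    return ["mvn", "-pl", ",".join(modules), "-am", goal]


if __name__ == "__main__":
    # Hypothetical module paths, for illustration only.
    print(" ".join(partial_build_command(["libs/utils", "products/app"])))
```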
Unrelated changes interleave.
Because everything happens in the same repo, the commit history is sometimes a medley of unrelated changes. To combat this, we’re looking at a more feature-branch oriented approach in order to make our commit log more coherent. Another neat trick is to constrain history operations to certain subdirectories in order to get a more focused commit history.
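Constraining history to a subdirectory can be as simple as passing a path to `git log` after `--`. The following is a small sketch of that idea; the `scoped_log_command` helper and the example path are hypothetical.

```python
from typing import List


def scoped_log_command(path: str, max_count: int = 20) -> List[str]:
    """Build a `git log` command limited to commits touching `path`."""
    # Everything after `--` is treated as a path, so only commits that
    # changed files below `path` are listed.
    return ["git", "log", "--oneline", f"--max-count={max_count}", "--", path]


if __name__ == "__main__":
    # Hypothetical subdirectory, for illustration only.
    print(" ".join(scoped_log_command("libs/utils")))
```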
No semantic or finely granular versioning.
Because the repo is versioned as a whole, we cannot version our libraries using semantic versioning or assign specific versions to subprojects or internal libraries. While this is a drawback, it is a trade-off I’m willing to make to avoid repeated visits to version hell.
In hindsight, I think combining our code base into a single shared repository was the correct move to optimize and simplify our development workflows. If you can live with the drawbacks, it is an approach I can highly recommend.
Have fun. Enjoy coding.
Your INNO coding team.