
I've been reading about the advantages of monorepos, but haven't yet found a mitigation for the problem of sharing parts of a repo:

Let's say an organization has a monorepo for a client/server web application. They hire a contractor to work on the design of some part of the client. How can they give the contractor access to only the relevant client code? Even sparse checkouts are not trivial.


3 Answers


Consider using git subtree.

With git subtree you will be able to:

  • create a monorepo composed of subtrees, each of which can be linked to separate remote repos.

    Given your example use case, the contractor would be given access to only the remote repo tied to a single subtree of the monorepo.

  • have a single aggregate/unified history (the point of a monorepo)

  • pull changes from subtree remotes into the monorepo

  • push changes made in any subtree of the monorepo to its separate remote

  • keep your simple/easy workflows.

    git subtree does not require users of your repository to learn anything new. They can ignore the fact that you are using git subtree to manage dependencies.
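The workflow in the list above can be sketched end to end with throwaway local repositories. This is a minimal sketch, not a recommended production setup: the names `client-remote.git`, `contractor`, `monorepo` and the `main` branch are hypothetical, and it assumes `git subtree` is installed (it ships as a contrib command, so some distributions package it separately).

```shell
#!/bin/sh
# Sketch only: every repository name and path here is hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The separate remote: the only repository the contractor can access.
git init -q --bare client-remote.git
git clone -q client-remote.git contractor 2>/dev/null
(
  cd contractor
  echo "body {}" > style.css
  git add style.css
  git commit -q -m "client: initial styles"
  git push -q origin HEAD:refs/heads/main
)

# The monorepo: server code plus the client imported under client/.
git init -q monorepo
cd monorepo
mkdir server
echo "server code" > server/app.txt
git add server
git commit -q -m "server: initial"
git subtree add -q --prefix=client ../client-remote.git main

# One monorepo commit may touch both the server and the client.
echo "h1 {}" >> client/style.css
echo "new endpoint" >> server/app.txt
git commit -q -am "api change plus matching client styles"

# Push only the client/ part of that history to the contractor's remote.
git subtree push -q --prefix=client ../client-remote.git main

# The contractor works in their own clone and never sees server/.
(
  cd ../contractor
  git pull -q --ff-only origin main
  echo "main {}" > layout.css
  git add layout.css
  git commit -q -m "client: layout styles"
  git push -q origin HEAD:refs/heads/main
)

# Bring the contractor's work back into the monorepo.
git subtree pull -q --prefix=client \
  -m "Merge contractor work into client/" ../client-remote.git main
```

In this sketch the contractor's clone only ever contains `style.css` and `layout.css`, while the monorepo history contains both the mixed commit and the merged contractor commit.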

For a list of pros and cons, check out Atlassian's Git subtree: the alternative to Git submodule, though I think the example steps in that article are rather limited, if not outdated.

For step by step demonstrations with git log details at each step:

  • The example and steps in Merging multiple repositories into a monorepo, while preserving history, using git subtree are cleaner and more logical than the Atlassian article.

  • git subtrees: a tutorial also includes step by step actions and results for making changes in the monorepo and pushing to the subtree repo, and vice versa, and gives some good tips. It mentions one caveat: rebases that include subtree pulls don't work. Another post explains:

    Do not be tempted to rebase this. Push it as is. If you rebase, git subtree won’t be able to reconcile the commits when you do your next subtree pull.

    If you must do a rebase, the follow up Atlassian article I link below provides a workaround.

  • I usually hate watching videos, but Introduction to Git Subtrees looks worth it and has lots of detail. It is also far more recent (2019) than the other articles. It's comforting to see in advance what you'll be dealing with.

If you want an under the covers understanding:

  • This excellent SO answer explains the difference between git subtree and the git subtree merge strategy (git merge -s subtree). In essence, the former uses the latter under the covers. In other words, it is Git's usual distinction between porcelain and plumbing.
  • GitHub article about Git subtree merges uses the merge strategy if you prefer that approach.
  • A followup to the Atlassian article above gets more "Under the hood of git subtree".
  • Mastering Git subtrees is also good and mentions a couple of other details that you may or may not find acceptable, and has the most detailed step-by-step actions and results of all the links I've provided.
  • For some history on how git subtree came about, and how it works internally, as well as how subtrees are better than submodules, see Git: submodules vs. subtrees.
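For comparison, the manual subtree merge approach mentioned above (the one without the `git subtree` command) can be sketched roughly as follows. This is an illustration under assumptions: the repositories `upstream` and `monorepo` and the remote name `client-remote` are hypothetical, and the steps follow the commonly documented `read-tree` plus `merge -s subtree` pattern rather than any specific article's exact commands.

```shell
#!/bin/sh
# Sketch only: repository and remote names here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A standalone client repository.
git init -q upstream
(
  cd upstream
  echo "body {}" > style.css
  git add style.css
  git commit -q -m "client: initial styles"
)
# Whatever the default branch is called on this machine.
branch=$(git -C upstream symbolic-ref --short HEAD)

# The monorepo that will host the client under client/.
git init -q monorepo
cd monorepo
echo "server code" > app.txt
git add app.txt
git commit -q -m "server: initial"

# Record the upstream history as a parent without touching our files,
# then read the upstream tree into the client/ prefix and commit.
git remote add client-remote ../upstream
git fetch -q client-remote
git merge -q -s ours --no-commit --allow-unrelated-histories \
  "client-remote/$branch"
git read-tree --prefix=client/ -u "client-remote/$branch"
git commit -q -m "Add client as a subtree under client/"

# Later upstream changes are merged with the subtree strategy,
# which shifts them into client/ automatically.
(
  cd ../upstream
  echo "h1 {}" >> style.css
  git commit -q -am "client: heading styles"
)
git fetch -q client-remote
git merge -q -s subtree --no-edit "client-remote/$branch"
```

After the last merge, the upstream change appears in `client/style.css` and nothing is written to the monorepo root.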

monorepo-operator is a tool that may make managing your subtree-based monorepo easier. I haven't used it and cannot vouch for it, but it might be worth checking out.

  • Interesting! So in my situation, I'd have a subtree with the web app, and I'd pull it in the larger repo that contains the backend and the other programs? Why does the Atlassian link say that a drawback is "The responsibility of not mixing super and sub-project code in commits lies with you."? I would actually want to mix code, for example when an API change in the backend needs to be reflected in the web app. That's the point of having one monorepo. Commented Apr 21, 2020 at 8:46
  • @DanDascalescu Yes. I updated my answer with more articles (I may eventually have to use this myself, so it is worth my time to understand it). I'm not sure why Atlassian says that. You certainly can commit changes across subtree boundaries, as the step-by-step articles I link show. One of them explains how those changes are intelligently filtered to the subtree's directory when you push to the subtree's remote. I'm guessing they're saying it's easy to introduce a dependency in a direction that you don't want?
    – Inigo
    Commented Apr 22, 2020 at 3:36

How can they give the contractor access to only the relevant client code?

They don't. Confidentiality issues with a full monorepo are simply too important to be mitigated.
And Git itself has no authorization (or authentication, for that matter).
Meaning: no native Git feature (submodule or subtree) would be enough on its own.

I usually see an intermediate gate repository, containing only the parts the contractor needs to work on, with a synchronization process to import and export changes.
And if that contractor is working remotely, then that extract would be hosted on a separate server, itself managed in a DMZ, and replicated to an external server on the internet, accessed through a VPN.
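As a rough illustration of the export half of such a synchronization process, the sketch below copies a snapshot of one directory into a separate gate repository with no monorepo history attached. The names `monorepo`, `gate`, `client` and `server` are hypothetical, and a real process would add review, hosting and access control, none of which is shown here.

```shell
#!/bin/sh
# Sketch only: repository and directory names here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The full monorepo, including code the contractor must not see.
git init -q monorepo
(
  cd monorepo
  mkdir client server
  echo "body {}" > client/style.css
  echo "confidential" > server/app.txt
  git add client server
  git commit -q -m "initial"
)

# The gate repository starts empty and shares no history with the monorepo.
git init -q gate

# Export: archive only the tree at client/ and unpack it into the gate.
git -C monorepo archive HEAD:client | tar -x -C gate
git -C gate add -A
git -C gate commit -q -m "Import client snapshot from monorepo"
```

Because only the `client` tree is archived, neither the `server` directory nor any monorepo commit reaches the gate repository. Importing contractor changes back would be the reverse copy, followed by a normal reviewed commit in the monorepo.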

  • How is an "intermediate gate repository" any different from a subtree where the contractor only has access to the subtree's isolated remote? The push/pull to/from the subtree remote can be managed with the same permissions and review as the "synchronization process" you mention.
    – Inigo
    Commented Apr 23, 2020 at 19:40
  • @Inigo in my experience, that would not fly in any security audit: you need complete separation in a different DMZ, with only the code you want to share. The fact that code is then referenced in your own repository as a subtree is an implementation detail on your side.
    – VonC
    Commented Apr 23, 2020 at 19:47
  • Yes, and one can achieve such complete separation with a monorepo and subtrees. "The fact that code is then referenced in your own repository as a subtree is an implementation detail on your side." Yes, an implementation detail that achieves the goal. Given that, your absolute answer, specifically the "They don't" and the following sentence, are wrong.
    – Inigo
    Commented Apr 23, 2020 at 19:50

I am not sure about a monorepo, and I know this sidesteps the monorepo question, but one approach I can think of is to structure your project (if possible) into modules and use git submodules: https://git-scm.com/book/en/v2/Git-Tools-Submodules

With the access controls of Git providers (e.g. Gitlab, Bitbucket), you can give contractors access to only a specific submodule, whether that is read, write or admin access.

In your case, for example, you can place the design layer (the part to share with the contractor) in another repo and add it as a submodule of your main repo. If you want tighter security, as @VonC mentioned, you can set up a VPN-accessed repo for your submodule. It might take time to set up, but I think it could be worth it once implemented properly, considering the risks.
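A minimal local sketch of that layout is shown below. The repository names `design` and `main-repo` and the path `client/design` are hypothetical. The `protocol.file.allow=always` override is needed here only because the demo clones the submodule from a local path, which recent Git versions refuse by default; a submodule hosted on a Git provider would not need it.

```shell
#!/bin/sh
# Sketch only: repository names and paths here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The design layer lives in its own repository; contractors get access
# to this repository only.
git init -q design
(
  cd design
  echo "body {}" > style.css
  git add style.css
  git commit -q -m "design: initial styles"
)

# The main repository references the design layer as a submodule.
git init -q main-repo
cd main-repo
echo "server code" > app.txt
git add app.txt
git commit -q -m "server: initial"

# Local-path submodule clones are blocked by default in recent Git,
# hence the one-off override for this demo.
git -c protocol.file.allow=always submodule --quiet add \
  "$work/design" client/design
git commit -q -m "Add design layer as a submodule"
```

The main repository then records only the submodule URL and a pinned commit in `.gitmodules` and its index, while the design files themselves stay in the separate repository.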
