Skip to main content

Preventing overgrown pull requests

Why do pull requests matter to me?

In my opinion, peer review of pull requests (or merge requests, depending on your version control system and hosting) is one of the most important parts of software development. Not only it prevents bugs from slipping in the production environment (well, most of the time) and allows for spotting architectural flaws but also it distributes the knowledge about the built system (both domain and technological) among the team. From my experience, it is also a great tool for mentoring other developers and learning opportunities for both author and reviewer.

However, the quality of peer review decreases significantly when the pull requests get too large (in terms of the number of changes introduced at once, the number of modified files, and sometimes the number of features added to the application). Too large pull requests are harder to read, navigate and comprehend by the reviewer. As a result, developers might choose to only skim the changes and mechanically click Approve, rendering the whole process a waste of time, leaving the flaws unnoticed and the knowledge unshared. To be frank, I have lots of such reviews on my account.

Some tools can assist the reviewer, making large pull request easier to understand by visualizing relationships between changed files and prioritizing them based on some heuristic rules. Such aid can truly enhance the reviewer experience. In my opinion they are just curing the symptoms of issues that lay deeper in the project. From my experience, large pull requests are not a problem on their own. Additional tools that enhance the review process with changes analysis might be an only short-term solutions, as they could not address real issues. To be clear: I’m not against such tools. If there is software that can help you or your team be more productive: go for it without hesitation. Just think for a moment if the problem you are trying to solve is the real one.

From my experience, there are three main reasons for the overgrown pull requests in a project:

  • lack of experience,
  • too big tasks,
  • system architecture friction.

I’ve spotted all of the above during my team lead/architect career. Hopefully, all of them can be overcome by the team if spotted.

Lack of experience

During many projects, I observed that less experienced developers tend to create wide-scoped pull requests. However, this is perfectly normal: we all were rookies once who needed to gain experience somewhere.

The reasons for larger pull requests from junior developers vary. Some of the developers I worked with had an urge to provide the entire solution (i.e. new feature) in a single run. It was perfectly understandable. After all, the functionality must be delivered entirely. Furthermore, such an approach was pretty common at the university I was studying (and I bet that it is still the dominating model): task assigned by the lecturer and the solution presentation one or two weeks later. Thus, graduates are used to such a work model.

Secondly, less experienced developers may not be certain of their code. They might be afraid to expose “incomplete” or partial solutions. As a result, the submitted pull request may cover lots of border cases. While thinking about border cases is desirable in developers, overthinking them may lead to solutions that will not be relevant to the problem. Sometimes such an approach might make developers refactor code not related to the feature implemented at hand. I’m a great fan of continuous refactoring, but I’m also aware that every work we put in the project must bring added value. Thus, I consider refactoring parts that will not yield any return of investment in the foreseeable future as waste. An example of such refactoring is cleaning up a class not planned to be changed: it just works and is not related to any new feature in the backlog. And yes, I’ve done a lot of such wasteful refactorings during my career.

Finally, some less experienced developers openly admit that they had no idea how to divide the massive pull request into smaller chunks. From my experience, this is most common among new developers entering the project. The cause of this is the lack of thorough understanding of the system they are working on that shall come with time.

The above reasons are not actually “problems” at all. They are just temporary hiccups every developer experiences during their career. Also, the remedy for these issues is simple and very effective: education and guidance from more experienced colleagues. For instance, the reviewer can add some “meta” comments during review to emphasize the possible consequences of large pull. These include, but are not limited to: the higher possibility of conflicts between the branches, less thorough review, loose feedback loop, more effective bug detection. One argument is very effective, especially with junior developers. Smaller pull requests usually take less time to send to review. As a result, the more experienced reviewer can share the knowledge about possible improvements earlier. It makes the learning process code much faster, making the pull request author learn quicker. From my experience, the comments that point out how the big pull request be cut into smaller chunks are also very valuable. They convey knowledge about the system design, its boundaries, and dependencies between components.

Too big tasks

Tasks that have wide scope will most likely result in pull requests with wide scope as well. If the task is: Implement feature X, the assignee will, well, implement feature X before considering the task as done and ready for review. If the feature is massive or spans over several components, the resulting changeset will also span a large portion of the system.

The most obvious solution to this problem would be to create smaller tasks by the architect and distribute them to the team. It will work flawlessly as long as the architect can optimally divide the tasks. It is achievable if the architect has proper knowledge about the existing system and the requirements to meet. However, this solution is awful on several levels. First of all, the architect must think about all low-level detail. In large and complex systems, the number of things to cover in tasks will be too overwhelming for one person. Secondly, it reduces the development team to code monkeys producing code based on the set of tasks prepared by someone else. It is how ivory towers are built. Such a situation could lead to less satisfaction in work among the developers, which may lead to frustration and personal conflicts.

However, there is a way better approach. I practiced it in several projects with excellent results. One can summarize such an approach with a simple sentence: hand the system design to the team. As a result, the team members will be able to split too large tasks into smaller ones. It requires developers to have extensive knowledge about the system, its design principles, restrictions, and goals to be achieved. It is achievable by introducing regular design workshops with the team members. During such workshops, the team members discuss the possible solutions. This “democratizes” development. The knowledge is shared very quickly among the team. Furthermore, when several peaple tackle each problem, it is easier to craft a better design. Everyone can bring different solutions to the table. Also, the engagement of the team members rises compared to the model where the architect/team lead assigns tasks to the developers. The architect’s role is to coordinate such workshop sessions and make sure that the team is progressing in the right direction. Workshop sessions can be performed in various forms: from simple whiteboard sessions to more sophisticated like event stormings or CRC cards. Usually, I conduct a single session per feature to be implemented and guide the team to break down the main problem posed by the discussed functionality into a series of smaller, manageble subproblems. As a result, the team members gain the required knowledge to implement the agreed solution in simple steps. In other words: how to deliver the solution as a series of simpler and smaller pull requests. Sometimes we break down the initial Implement feature X task to a series of subtasks. Sometimes this is not even required as understanding of the problem and its solution is very high.

This approach might seem to be time-consuming: every new feature is covered by a careful discussion and a workshop. Indeed, it might be, at least at the beginning of the project or with larger and more significant features. However, when the team learns to work together, the used patterns become more and more understood. As the design principles become part of the team’s way of thinking, the workshops could be reduced to discussions in comments bound to the main task. On the other hand, such design sessions are all in all cheaper than writing actual code, which might turn up to be wrong. Workshops allow the team to discuss solutions before actually implementing them and challenging them without writing the code. I’ve noticed, that the code created after a thorough design workshop is delivered faster, is approved earlier, and contains fewer bugs than code created “on the spot.”

System architecture friction

The last category of reasons for large pull requests has simple characteristic: the current architecture of the system does not match the changes the system undergoes. I can distinguish two cases here:

  • all parts of the application are tightly coupled with each other,
  • boundaries of the modules inside the application do not fit new requirements.

In the first case, the system probably is the big ball of mud. In other words: the application lacks modularity. However, the modularity does not refer to the placement of the code in separate units, for instance: packages in Java or Python, namespaces in C# or Clojure, etc. By modularity, I mean the division of the codebase into autonomous parts. Each part encapsulates some functionality and provides a public API for other modules. Other modules can only use this public API and must not access any of the internals of the current part. API may consist of a set of exported functions, objects, events, etc.

This lack of modularity is a feature of classic monolithic applications. One thing should be clear: a big ball of mud in the case of a working system is not bad per se. The system works, is used, and brings value to the business. However, the maintenance and development of new features in such a system is a nightmare. Even simple changes might render expensive due to the tight coupling of components. Digging in one part of the system will probably result in an unexpected location. As a result, the pull request must include a set of fixes for coupled places. Consequently, the reviewed changeset becomes bloated. This situation is not comfortable for the team at all. If the system is about to undergo heavy development, it might be necessary for the team to refactor the structure too loose coupling in the codebase. In this case, the size of pull requests might be a good metric, indicating if the team is going in the right direction. The smaller the PRs get with time, the more the codebase fits the new changes. However, it is important to remember, that this process might not be instant. In large, legacy systems it may take months or even years. Sometimes, it might not be possible to clean up the entire code base: the budget and deadlines may only be enough to fix the parts of the system that are critical to the new features.

Large pull requests occurring in the modularized application might indicate that the boundaries of the modules are “wrong” in terms of new requirements. The new features might need to be distributed across several different modules, effectively increasing coupling between them. It does not mean, that the initial design was wrong. New requirements sometimes bring new knowledge to the team. As a result, the team might realize that some parts of the application could be organized differently. From my experience, the team can analyze existing modules by noting the responsibilities and collaborators required to fulfill them. Then, the team should try to place the new responsibilities (features) in the described landscape. Some new modules might emerge from this analysis. These modules “grasp” some of the responsibilities of already existing modules apart from covering new functionalities. As a result, the whole application design will better fit the business requirements and prove very useful in future development by reducing time to implement the features that are not yet known. It is important for the architect or technical team lead to properly interpret the fact that the new features make the team deliver large changesets.

Finding proper boundaries in a complex system is hard. It requires an accurate understanding of the business the system will support. This knowledge should be “extracted” from domain experts who know how the business operates and propagated among the team members. When overgrown PRs start to occur into the project, it might be worth analyzing them to spot the misplaced boundaries. For instance, if many pull requests related to a certain new feature span across two (or more) modules, it might indicate a spot where responsibilities are not properly distributed. This place could be a good start to look into during the next design workshop with the team.

Furthermore, applying and enforcing proper module-level architecture style helps a lot to structure both tasks and the pull requests. Most commonly used architecture styles introduce the concept of layers. The high-level layer depends on the lower layers. For instance, the application or use case layer can depend on the persistence layer. As a result, a feature that crosses all layers vertically is implemented can be split into small tasks that touch each layer in the module. The assignee can then implement it from the lowest layer to the highest. For each layer separate pull request can be implemented. This would reduce the size of the changeset as well as reduce the cognitive load of the reviewer. They will operate only on a single layer, with a precise set of responsibilities.

Conclusion

The overgrown pull requests that start to invade the project are like a fever during sickness. They are usually a symptom of some issues that are not visible at first sight. However, when these issues are found and eradicated, both the system and the team will probably benefit. Of course, single large PRs submitted occasionally are something that happens and might as well not be harmful. However, an architect should be wary. It is important to pay attention to the size of changesets prepared by the team members. However, it should not be a burden as the size of PR is a pretty easy metric to track.

Please note that I did not specify how large is too large. This distinction depends on the project and technology used. Some languages are less or more verbose. Some provide low-level abstractions, while others operate on high-level concepts. The desired size of PR can vary from team to team. For instance, in some of my recent projects, PRs which changed up to 10 files and span across 200-300 line changes were acceptable. However, most of the pull requests were smaller than this.