
Right architectural decisions

Architecture and decisions

Architecture is a pretty fuzzy term in software development. Different developers may have a slightly different understanding of this subject. For me, architecture is the set of decisions regarding the system that are hard to change. Please note that they are hard, not impossible, to change. Sometimes the architect makes a mistake and chooses poorly. Reverting such a decision might be costly, but it is usually possible.

All architectural decisions have one thing in common: they exchange one set of problems for another. In other words, when we make a decision to solve one problem, we pay the price of the solution we choose, and that price may lead to other problems.

The big question is: how can we know that the decision we made is the right one? The answer is pretty simple. We must make sure that the problems the solution will cause are more acceptable or easier to handle than the problems we are trying to solve. This is the responsibility of an architect.

Example: messaging solution

Note: this example is for illustrative purposes only. Please do not treat it as a thorough discussion regarding the messaging solution choice. If you face a similar issue in your project, you should perform analysis on your own and consider the context of your system.

The team works on a system that consists of two microservices. These microservices need to collaborate to achieve business goals. The team decided to use messaging to increase the availability of the system. The problem is: how to achieve asynchronous communication?

The most “obvious” solution would be to use a self-hosted message broker. Such software implements many integration patterns that can be utilized by the team out of the box. However, it would require another component to be present in the system. This component will require configuration. Since the broker will connect all the services in the system, its availability should be higher than the desired system availability. This would require the team to have knowledge about broker maintenance, debugging, and disaster recovery. Finally, a new component in the system will incur additional costs related to the consumed computing resources.

One variant of the above approach would be to use a managed broker from the cloud provider. It would relieve the team of maintaining the broker. On the other hand, vendor lock-in will occur. If the need to migrate to another cloud provider arises in the future, adapting the code to work with the new vendor will require significantly more time. Furthermore, the team will have limited capability to tune the behavior of the broker. Any debugging will need to be performed via support tickets, so resolving problems will take significantly more time compared to a self-hosted solution managed by a skilled team.

Finally, the team could decide to drop the idea of an external message broker and pass messages between microservices using the familiar HTTP protocol. Please note that this does not mean synchronous communication. The microservices will provide endpoints that receive messages and store them for further processing. As a result, no extra component will be added to the system. Additionally, the team will have full control over the messaging solution. On the other hand, the team will need to manually implement all integration mechanisms. Furthermore, the infrastructure layer in the microservices will grow, as additional endpoints and worker threads performing message delivery will be created. This solution may also not scale well when new microservices are added. Every new service that wants to communicate in this manner must include the messaging code. It will be especially painful if the new services are written in different languages: the code will need to be ported.
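To make the HTTP variant more tangible, here is a minimal sketch in Python using only the standard library. It is an illustration under several assumptions, not a recommended implementation: the names `accept_message`, `run_worker`, `InboxHandler` and `start_server` are hypothetical, the inbox is an in-memory queue (a real service would most likely persist messages so that they survive a restart), and retries, deduplication and authentication are left out entirely. The endpoint only stores the message and answers 202 Accepted; the actual processing happens later in a worker thread.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: an in-memory inbox. A real service would likely use durable storage.
inbox: "queue.Queue[dict]" = queue.Queue()


def accept_message(raw_body: bytes) -> int:
    """Store an incoming message and return the HTTP status code to send back."""
    try:
        message = json.loads(raw_body)
    except ValueError:  # malformed JSON or invalid encoding
        return 400
    inbox.put(message)
    return 202  # accepted for later processing, not processed yet


class InboxHandler(BaseHTTPRequestHandler):
    """HTTP endpoint that only receives and stores messages."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = accept_message(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def run_worker(handle, stop: threading.Event) -> None:
    """Take stored messages from the inbox and process them one by one."""
    while not stop.is_set():
        try:
            message = inbox.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            handle(message)
        finally:
            inbox.task_done()


def start_server(port: int = 8080) -> ThreadingHTTPServer:
    """Serve the inbox endpoint in a background thread (port is an assumption)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), InboxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Even this toy version hints at the cost described above: the receiving endpoint, the storage and the worker all become part of the service's infrastructure layer, and every new service would need its own copy of them.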

So, which of the above solutions is the best? It depends. Since every context is different, it is impossible to say “always use X instead of Y”. The team must carefully analyze the consequences of each possibility and adopt the most frugal one. Basing such decisions on the hype for a new or “popular” technology might lead to a disaster.

Also, the context may change with time. For instance, at the beginning of the project, the team might not have expertise in managing message brokers and might choose to go with the HTTP approach. However, after a couple of months, engineers who are experienced in managing such solutions may join the team. It can then make sense to adopt the previously discarded idea: the problems of the declined solution may now have a lesser impact, and the drawbacks of the current approach will be eliminated.

The above example is pretty high-level. However, such decisions happen all the time during a software project. It could be as low-level as choosing a library or a design pattern to apply in a specific situation. On a higher level, it might be the selection of responsibilities of a module or a class. Each of these situations requires analysis of the costs of each considered alternative. The current context of the project must always be taken into account.

Things that help

As you can see, the most important part of making any architectural decision is recognizing its consequences. It requires the team to be aware of the current context of the project.

In the case of large-scale decisions, team workshops are very efficient. During such workshops, the team members can synchronize their knowledge about the problem to be solved and the current situation in the project. The team can draw on the experience of each person. As a result, the possible consequences of the proposed solutions can be quickly highlighted. Furthermore, first-hand experience is always better than any documentation or article on the Internet.

If no team member has applied any of the solutions before, it might be worth investigating them empirically. One option is to split the team into subgroups and assign each subgroup one possible solution for swift research. For instance, each subgroup can develop a quick and dirty prototype to assess the viability of its solution. The conclusions should be gathered during the next workshop. They should form a solid base for the final decision.

For lower-level decisions, making prototypes might be superfluous. In this case, I would recommend writing an Architecture Decision Record and spending some time filling in each part of the document thoroughly. This will make you consider the context, the consequences, and the alternatives for the solution you are going to propose. The Architecture Decision Record you create can then be reviewed and discussed by other team members. The discussion might enhance the proposed solution or point out a downside you might have missed. Furthermore, when the decision is accepted, you will have a ready-made artifact to put into the system documentation.
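As a rough illustration of what such a record could capture, here is a small Python sketch. The class name `DecisionRecord`, its fields and the rendered layout are my assumptions for the sake of the example; your team may prefer a different template or a plain text file. The point is only that the context, the consequences and the alternatives each get their own explicit place.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """Hypothetical structure of an Architecture Decision Record."""

    title: str
    context: str
    decision: str
    consequences: list[str] = field(default_factory=list)
    alternatives: list[str] = field(default_factory=list)
    status: str = "proposed"  # assumption: a record starts as a proposal

    def render(self) -> str:
        """Render the record as plain text for review and documentation."""
        lines = [
            self.title,
            f"Status: {self.status}",
            "",
            "Context",
            self.context,
            "",
            "Decision",
            self.decision,
            "",
            "Consequences",
            *[f"- {item}" for item in self.consequences],
            "",
            "Alternatives considered",
            *[f"- {item}" for item in self.alternatives],
        ]
        return "\n".join(lines)


# Example record based on the messaging scenario from this article.
record = DecisionRecord(
    title="Pass messages between microservices over HTTP",
    context="Two microservices must collaborate; no broker expertise in the team.",
    decision="Each service exposes an endpoint that stores incoming messages.",
    consequences=[
        "No extra component in the system",
        "Integration mechanisms must be implemented manually",
    ],
    alternatives=["Self-hosted message broker", "Managed broker from the cloud provider"],
)
print(record.render())
```

Filling in the consequences and alternatives lists is the part that forces the analysis; an empty list there is a hint that the decision has not been thought through yet.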

In the long run, a proper set of metrics is very helpful. It allows the team to assess whether the decisions it makes are leading the project in the right direction. Both technical and business metrics can be used. The proper set of metrics is specific to each project. The team should choose the metrics to watch based on the business requirements to make them relevant. This will ensure that the development team’s goals are aligned with the business goals. As a result, decisions made by the development team will be beneficial to the business, even though the customer may not be aware of them.

Conclusion

Making decisions regarding the created system is not only the responsibility of an architect. It is an obligation of anyone who contributes code to the project. As a result, the decisions should be made consciously. Their consequences should be well understood and accepted by the team.

Beware of decisions influenced by the community hype for particular solutions or technologies. They often lead to disasters and lots of “Don’t use technology X” articles. I’m not against using new or popular technologies. However, brand new technologies hardly ever advertise the problems they may bring into the project; they emphasize only the problems they solve. In the case of bleeding-edge technologies, the implications might not even be known yet. Thus, be careful. In large-scale, long-running projects, hype-driven decisions can be a reason for a catastrophic failure.