Making software projects replicable and maintainable

As a DIYer, I've worked on dozens of projects over the past decade. The most common problem in many software projects is dependency management. Be it a Python project, web development or an embedded system, the tools and libraries keep getting updated and, unfortunately, they are often not kept backward compatible. This means that if you update libraries, tools or other dependencies, your project may no longer work. If you change machines or work in a team, the code becomes difficult to build and maintain, and you end up spending more time figuring out these errors than doing something productive, giving rise to the clichéd developer joke "It works on my system". Also, when you open a project after a few years for a minor change, it no longer compiles and you'll most likely end up rewriting the whole project.

The Evolution

Thankfully, over the years, multiple systems have evolved to take care of this issue. The evolution so far can be broken into four stages and, depending on the complexity of the project, we don't need to go all the way to the last stage.

  1. Package Management
  2. Virtual Toolchains
  3. Containers, Stacks and Clusters
  4. Dev Containers

Package Management

The first problem to solve was how to list the dependencies to be installed, along with their version numbers. This gave rise to package management solutions, from the requirements.txt file for Python to package.json for Node.js. Now you could at least have your packages and dependencies listed and easily configured. List the libraries and their version numbers in these files, and a collaborator, or you years later, will still know how to set up the same project.
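
To make the idea concrete, here is a minimal sketch of checking an environment against a pinned list. It assumes every dependency is pinned exactly in the `name==version` form; the function names and the two example packages are only illustrative, and real requirements files support more syntax than this handles.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Hypothetical pinned list, in the style of a requirements.txt file.
EXAMPLE_REQUIREMENTS = """
# web stack
requests==2.31.0
flask==3.0.0
"""

if __name__ == "__main__":
    pins = parse_pins(EXAMPLE_REQUIREMENTS)
    for name, (pinned, installed) in find_mismatches(pins).items():
        print(f"{name}: pinned {pinned}, installed {installed}")
```

Running this inside a project's environment lists every package whose installed version has drifted from the pinned one, which is exactly the drift that breaks a project a few years later.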

Virtual Toolchains

What package management doesn't solve is the version of the toolchain itself: which version of Python or Node.js was used for the project? Also, if I have multiple projects on the same PC using different versions of the toolchain, they conflict and raise errors. This gave rise to the next step of the evolution, virtual environments. From venv or Anaconda for Python to nvm (Node Version Manager) for Node.js, as long as you specified the tool version to be used in the README file, it was easy for others to replicate a similar environment.
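
As a rough sketch of this step for Python, the snippet below refuses to create a virtual environment unless the running interpreter matches the version the project expects. The helper names and the idea of passing the required version as a `(major, minor)` tuple are assumptions for illustration; only `sys.version_info` and `venv.create` come from the standard library.

```python
import sys
import venv


def interpreter_matches(required, actual=None):
    """True if the interpreter's (major, minor) version equals `required`."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(required)


def create_project_env(path, required):
    """Create a virtual environment at `path`, refusing on a version mismatch."""
    if not interpreter_matches(required):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in required)
        raise RuntimeError(f"project expects Python {wanted}, found {found}")
    # with_pip=True bootstraps pip so pinned packages can be installed next.
    venv.create(path, with_pip=True)
```

A call such as `create_project_env(".venv", (3, 11))` then either gives you an isolated environment or tells you up front that you are on the wrong toolchain version, instead of failing later with a confusing error.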

Containers, Stacks and Clusters

Next comes an even tougher problem. What about projects where you need other software installed alongside the toolchain for the project to work? Or projects that need OS-level changes like opening a network port? For example, what if a project uses Python alongside compiled C libraries? Or a Node project that needs a database installed and running alongside it, plus a couple of ports opened on the OS?

Here comes the next step, containerization. All the required toolchains are installed on a virtual layer on top of your operating system so that, when you want to share the project, you share a configuration file, and all the required tools, packages and OS configuration are set up automatically. Almost like magic. A step that used to take weeks and tons of documentation can now be done in minutes.

Docker makes it easy to package these virtual OS layers along with their configuration so that they run on any machine in minutes. Furthermore, a tool like Docker Compose lets you break these down into multiple containers that run together as a stack on a single machine, whereas a tool like Kubernetes makes it easier to deploy them in clusters across multiple servers and manage the workload across those machines simultaneously.
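
To show what such a configuration file can contain, here is a small sketch that generates Dockerfile text for the "Python alongside compiled C libraries" case mentioned earlier. Everything specific in it is an assumption for illustration: the `render_dockerfile` helper, the `python:3.11-slim` base image, the `build-essential` package, port 8000 and the `app.py` entry point. It also assumes a Debian-based image, since it installs OS packages with `apt-get`.

```python
def render_dockerfile(base_image, system_packages, port):
    """Return Dockerfile text for a Python app with OS-level dependencies."""
    lines = [f"FROM {base_image}", "WORKDIR /app"]
    if system_packages:
        packages = " ".join(system_packages)
        # Install OS packages (for example a C compiler) in a single layer.
        lines.append(
            "RUN apt-get update"
            f" && apt-get install -y --no-install-recommends {packages}"
            " && rm -rf /var/lib/apt/lists/*"
        )
    lines += [
        # Copy the pinned dependency list first so this layer can be cached.
        "COPY requirements.txt .",
        "RUN pip install --no-cache-dir -r requirements.txt",
        "COPY . .",
        # EXPOSE only documents the port; it is published at run time.
        f"EXPOSE {port}",
        'CMD ["python", "app.py"]',  # hypothetical entry point
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile("python:3.11-slim", ["build-essential"], 8000))
```

Saved as a file named Dockerfile, the output captures the toolchain, the OS packages and the port in one place that can be committed next to the code.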

Dev Containers

Now comes the last piece of the puzzle. How do I version control my development environment, so that my IDE extensions are versioned and managed too, without the overhead of running a whole separate operating system for each project? And how do I avoid cluttering the IDE with extensions when each project needs a different set? This is where dev containers come in. Not only are all your toolchains now inside a Docker container, but so are the extensions for your IDE. You can customize your host IDE (VS Code in this instance) with themes and UI settings, but all compilation happens inside the container.
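
A dev container is described by a devcontainer.json file, usually kept in a .devcontainer folder in the repository. The sketch below builds a minimal one as a Python dict and prints it as JSON. The `build_devcontainer` helper, the project name, the `python:3.11-slim` image and the forwarded port are assumptions for illustration; `ms-python.python` is the ID of the Python extension for VS Code.

```python
import json


def build_devcontainer(name, image, extensions, ports):
    """Return a dict shaped like a minimal devcontainer.json."""
    return {
        "name": name,
        "image": image,
        # Ports inside the container that should be reachable from the host.
        "forwardPorts": list(ports),
        # Extensions installed inside the container rather than on the host.
        "customizations": {"vscode": {"extensions": list(extensions)}},
    }


if __name__ == "__main__":
    config = build_devcontainer(
        name="example-project",  # hypothetical project name
        image="python:3.11-slim",
        extensions=["ms-python.python"],
        ports=[8000],
    )
    print(json.dumps(config, indent=2))
```

Because the extension list lives in the repository, everyone who opens the project gets the same editor tooling, and none of it clutters the host IDE.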

Container Architecture

Conclusion

These solutions are not perfect. They need attention to detail, such as specifying proper version numbers, scripting and testing the configuration itself, and some form of communication on how to run and configure them, but they are a damn good place to start for making your projects replicable.

This only solves making the build replicable. You still need proper tools like git to version the code itself. And then you need process and people on top of everything to build and maintain software quality: processes like formal reviews, unit testing, integration testing, documentation and a release lifecycle. After all, getting things to work is easier than delivering maintainable, quality software.