Releases
Introduction
The eventual goal of any software project is to release the software to users. A software release is the process of distributing the newest version of the software. This new version may contain new features, bug fixes, or other changes. The release process is a critical part of the software development lifecycle, and can vary wildly even across a single organization. At minimum, the process involves planning, testing, and deploying the software to users. A successful release should be time and budget efficient, minimize risk, and provide value to users.
At many organizations, releases are a stressful and error-prone process. They often involve long hours, manual steps, and coordination between multiple teams. This can lead to delays, errors, and outages. To mitigate these risks, many organizations have adopted automated release processes. These processes use continuous integration and continuous deployment (CI/CD) pipelines to automate the steps involved in releasing software. Automation reduces the time and effort each release requires and lowers the risk of human error.
Switching to frequent automated releases can have a profound impact on developer satisfaction and work-life balance. By reducing the time and effort required to release software, developers can spend more time on development and less time on manual release tasks. This can lead to happier, more productive developers.
When my team switched to automated releases, the difference was night and day. We went from spending hours on manual release tasks to releasing software with the push of a button. Developer attrition decreased, and the software we released was more reliable and higher quality.
Types of Releases
Releases are categorized in two ways: by the type of changes they contain and by their frequency. By type, a release is major, minor, or an emergency fix. By frequency, organizations follow a release cadence, such as weekly, bi-weekly, or monthly releases.
- Major Releases: Major releases contain significant new features or changes. They may require extensive testing and coordination between teams and might significantly impact users. If a release contains changes that break backward compatibility, it should be considered a major release.
- Minor Releases: Minor releases contain smaller changes, such as bug fixes or minor feature enhancements. They generally do not include major functionality changes, just improvements to existing features.
- Emergency Fixes: Emergency fixes, or hot fixes, are releases that are made to address critical issues, such as security vulnerabilities or outages. They are released as soon as possible and may bypass the normal release process.
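As a rough illustration, the three categories above can be expressed as a small decision function. This is only a sketch: the names `ReleaseType` and `classify_release`, the boolean inputs, and the choice to let a critical fix take precedence over everything else are assumptions made for the example, not rules from any standard.

```python
from enum import Enum


class ReleaseType(Enum):
    MAJOR = "major"
    MINOR = "minor"
    EMERGENCY = "emergency"


def classify_release(
    breaks_compatibility: bool,
    significant_features: bool,
    critical_fix: bool,
) -> ReleaseType:
    """Map a few yes/no facts about a release onto the categories above."""
    # Assumption: a critical fix (security issue, outage) is treated as an
    # emergency regardless of what else the release contains.
    if critical_fix:
        return ReleaseType.EMERGENCY
    # Breaking backward compatibility or shipping significant new features
    # makes the release major.
    if breaks_compatibility or significant_features:
        return ReleaseType.MAJOR
    # Everything else (bug fixes, small enhancements) is minor.
    return ReleaseType.MINOR
```

In practice the inputs would come from the team's own judgment or from change metadata. The function only makes the decision order explicit.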
Release Cadence
Some organizations have a regular release cadence, such as weekly, bi-weekly, monthly, or even longer. Regular releases can help teams plan and coordinate their work, as well as provide a predictable schedule for users, but they also come with a number of potential pitfalls. Longer cycles tend to be riskier than shorter cycles, as they accumulate more changes and are more likely to have conflicts or issues. Identifying the root cause of issues in a long release cycle can be extremely difficult simply due to the number of changes that have been made.
Shorter release cycles can help reduce risk by allowing teams to release smaller changes more frequently. This can help identify issues earlier and reduce the impact of changes. Shorter release cycles can also help teams respond more quickly to user feedback and changing requirements. They also tend to reduce the pressure from stakeholders. If some feature doesn't make it into the current release, it can be included in the next one. Waiting a week or two is much easier than waiting a month or more.
"For every 50% reduction in time-to-release, you reduce your likelihood of a bug in production overall by 50%."
-- Accelerate: The Science of Lean Software and DevOps
Increasing the frequency of releases tends to decrease the number of bugs, shorten the time to fix them, and increase the quality of features.
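To make the point about accumulated changes concrete, here is a small arithmetic sketch. The merge rate and the working-day counts are made-up numbers, and the function names are hypothetical. The second function counts how many pairs of changes ship together for the first time, which is one rough proxy for how many unexpected interactions are possible.

```python
import math


def changes_per_release(merges_per_day: int, working_days: int) -> int:
    """Number of merged changes that accumulate between two releases."""
    return merges_per_day * working_days


def possible_pairwise_interactions(changes: int) -> int:
    """Number of distinct pairs of changes shipping together in one release."""
    return math.comb(changes, 2)


# Assumed numbers for illustration: 8 merges per working day,
# 5 working days in a weekly cycle, 20 in a monthly cycle.
weekly = changes_per_release(8, 5)
monthly = changes_per_release(8, 20)

weekly_pairs = possible_pairwise_interactions(weekly)
monthly_pairs = possible_pairwise_interactions(monthly)
```

Under these assumptions the monthly release carries four times as many changes as the weekly one (160 versus 40), but roughly sixteen times as many pairs of changes (12,720 versus 780) that have never run together in production.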
CI/CD Pipelines
Automated release processes have become increasingly popular in response to the need for faster, more reliable releases. Continuous integration and continuous deployment (CI/CD) pipelines are a common way to automate the release process. CI/CD pipelines automate the steps involved in releasing software, such as building, testing, and deploying the software.
The pipeline automatically executes the moment code is merged into the main branch. It runs tests, builds the software, and deploys it to a staging environment. If the tests pass, the software is deployed to production. If the tests fail, the pipeline stops and alerts the team.
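The stop-on-failure behavior described above can be sketched in a few lines. This is not how any particular CI/CD product works internally. The `run_pipeline` function, the step names, and the `alert` callback are hypothetical stand-ins for whatever your tooling provides.

```python
from typing import Callable, List, Tuple

# A step is a name plus a function that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: List[Step], alert: Callable[[str], None]) -> List[str]:
    """Run steps in order; stop and alert the team at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            alert(f"pipeline stopped: step '{name}' failed")
            break
        completed.append(name)
    return completed


# Example wiring with stubbed steps (all names are illustrative).
alerts: List[str] = []
steps: List[Step] = [
    ("test", lambda: True),
    ("build", lambda: True),
    ("deploy_staging", lambda: False),  # simulate a failed staging deploy
    ("deploy_production", lambda: True),
]
completed = run_pipeline(steps, alerts.append)
```

Because the loop breaks at the first failing step, `deploy_production` never runs in this example, which is the property that matters: nothing reaches users unless every earlier step succeeded.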
Your code changes are in the hands of the users within minutes of merging. This is a powerful concept. It allows you to get feedback quickly and make changes based on that feedback. If there is an issue, identifying and fixing it is much easier when you have only a few changes to look at.
While having a CI/CD pipeline is a great goal, not every system can be released this way. A system might have mandatory audits or other requirements that prevent fully automated releases. In these cases, the pipeline can still be used to automate as much of the release process as possible.
Native mobile apps are a good example of systems whose releases can't be fully automated. They require review and approval by the Apple App Store or Google Play before they can be released.
Release Process
The exact process to get a feature from development to production will vary depending on the organization and the software being released. Even in fully automated processes, the software will go through several stages before it is released to users.
Environments
Most organizations have multiple environments that software passes through before it is released to users. These environments are used to validate changes and ensure that the software is working as expected. The most common environments are:
- Dev: The development environment is where developers write and test code. It is usually a local environment on the developer's machine.
- QA or Feature: The QA or feature environment is where changes are tested before they are merged into the main branch. This environment is more closely aligned with production and will interact with other services and databases. It gives developers a chance to test their changes in a more realistic environment. Because developers are constantly deploying untested code to this environment, it is important not to rely on it being stable.
- Staging: The staging environment is a scaled-down replica of the production environment. The expectation is that systems deployed here are stable and production-ready. This environment is used to validate your system as it interacts with other real-world systems. It is the last stop before production.
- Production: The production environment is where the software is released to users. It is the live environment that users interact with. These are typically the largest and most expensive environments to maintain.
Some organizations also maintain additional environments for specific purposes. Some examples include:
- Load Testing: An environment used to test the performance of the system under load. This environment is used to simulate real-world traffic and ensure that the system can handle the expected load.
- Pre-Production: An extra validation step before releasing to production. This pool will often receive a small percentage of the real production traffic, letting you test against real data. Some organizations also use their staging environment to support active teams, so they need an additional environment to validate the release before it goes to production.
- Sandboxes: Environments used for testing specific features or integrations. These environments are often used by third-party developers to test their integrations with your system. Think of this as a playground for external developers to experiment with new features before they are released to production.
Even organizations that have every one of these environments will not use all of them for every system or every release. The goal is to have the right environment for the right stage of the release process. The more environments you have, the more time, effort, and money you will spend maintaining them. The key is to find the right balance between having enough environments to validate changes and not having so many that they become a burden.
Releasing to Production
Once software is deemed ready for release, it is deployed to the production environment. The exact process for doing this will vary depending on the organization and the software being released. Some common approaches are:
- All or Nothing: The entire system is released at once. This is the simplest approach but also the riskiest. If the release causes issues, the entire system is impacted.
- Blue-Green Deployment: Blue-green deployment is a release strategy that reduces downtime and risk by running two identical production environments. At any given time, one of the environments is live, while the other is idle. When a new release is ready, it is deployed to the idle environment and traffic is switched over to it. This allows you to release new features with little or no downtime, and if something goes wrong you can switch traffic back to the previous environment.
- Canary Release: Canary release is a release strategy that reduces risk by gradually rolling out changes to a small subset of users. This allows you to test the changes in production before releasing them to all users. If the changes cause issues, you can roll them back without impacting all users.
- Rolling Deployment: Rolling deployment is a release strategy that reduces downtime by deploying changes to a subset of servers at a time. The exact steps will vary, but one approach is to deploy to a single machine. If the stats look good, deploy to 1/3 of the data center. If the stats still look good, deploy to the rest of the data center. If at any point the stats look bad, stop the deployment and roll back.
- A/B Testing: A/B testing is a release strategy that allows you to test changes by comparing two versions of the software. This allows you to measure the impact of the changes on user behavior and make data-driven decisions about which version to release.
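Two of the strategies above lend themselves to a short sketch: choosing which users see a canary, and the staged waves of a rolling deployment. Everything here is an assumption made for illustration. Hash-based bucketing is just one common way to pick a stable subset of users, and `deploy`, `healthy`, and `rollback` are hypothetical callbacks standing in for real deployment tooling and monitoring.

```python
import hashlib
from typing import Callable, List, Sequence


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a user in the canary group.

    Hashing the ID keeps each user in the same group across requests.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99
    return bucket < percent


def rolling_waves(servers: Sequence[str]) -> List[List[str]]:
    """Split servers into waves: one machine, up to a third in total, the rest."""
    if not servers:
        return []
    third = max(1, len(servers) // 3)
    waves = [list(servers[:1]), list(servers[1:third]), list(servers[third:])]
    return [wave for wave in waves if wave]


def rolling_deploy(
    servers: Sequence[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Deploy wave by wave; on any unhealthy server, roll back and stop."""
    updated: List[str] = []
    for wave in rolling_waves(servers):
        for server in wave:
            deploy(server)
            updated.append(server)
        # "If the stats look bad": check every server updated so far.
        if not all(healthy(server) for server in updated):
            for server in updated:
                rollback(server)
            return False
    return True
```

A real rollout would wait between waves long enough for metrics to settle, and `healthy` would typically compare error rates or latency against the servers still running the old version.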
Example Process
A manual release process might look something like this:
- Develop locally in your development environment and test your changes.
- Deploy to the QA environment and run tests.
- Deploy to the staging environment and run integration tests.
- Deploy to pre-production and test against real data.
- Deploy to production.
An automated release process must rely heavily on automated tests in order to be successful. The process might look something like this:
- Develop locally in your development environment and test your changes.
- Raise a PR to have your changes reviewed. This will trigger any automated tests you have.
- On approval, merge your changes into the main branch. This will trigger the CI/CD pipeline.
- The pipeline will run tests, build the software, and deploy it to the QA environment.
- If the tests pass, the software will be deployed to the staging environment.
- If the tests pass there, the software will be deployed to the production environment.
Handling Failed Releases
Releases might fail for a variety of reasons. The software might contain bugs that weren't caught in testing, there might be issues with dependencies that weren't anticipated, the deployment process itself might be flaky, or there might be performance bottlenecks that are only apparent at scale.
When a release fails, it is important to respond quickly and decisively. The first step is to identify the root cause of the failure and determine its severity. Minor issues might be resolved in the next release, while major issues might cause the release to be halted and rolled back.
A rollback is the process of reverting to a previous version of the software. This is done when a release fails or causes issues. The goal of a rollback is to restore service to a known good state as quickly as possible.
With fast release cadences, rollbacks are usually trivial because the difference between versions is small. With slower cadences, rollbacks might be a complex ordeal. This is especially true when a release involves changes to the database schema or other irreversible changes.
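The idea of returning to a known good state can be sketched as a small bookkeeping class. The `ReleaseHistory` name and its methods are hypothetical, and the sketch deliberately ignores the hard cases mentioned above: it tracks application versions only and does nothing about database schema changes or other irreversible steps.

```python
from typing import List, Optional


class ReleaseHistory:
    """Track which version is live and which versions were known to be good."""

    def __init__(self) -> None:
        self._known_good: List[str] = []
        self.live: Optional[str] = None

    def deploy(self, version: str) -> None:
        """Make a version live; it is not yet considered good."""
        self.live = version

    def mark_good(self) -> None:
        """Record the live version as known good, e.g. after it has soaked."""
        if self.live is not None and self.live not in self._known_good:
            self._known_good.append(self.live)

    def rollback(self) -> str:
        """Revert to the most recent known good version other than the live one."""
        candidates = [v for v in self._known_good if v != self.live]
        if not candidates:
            raise RuntimeError("no known good version to roll back to")
        self.live = candidates[-1]
        return self.live
```

For example, after deploying and marking `1.0` as good, then deploying a faulty `1.1`, calling `rollback()` makes `1.0` live again. If no earlier good version exists, the sketch raises an error rather than guessing.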
When a system fails for any reason, it is important to have a post-mortem. This is a meeting where the team discusses what went wrong, why it went wrong, and how to prevent it from happening again. The goal is to learn from the failure and improve the process for future releases. See the Blameless Retrospective section for more information.
Conclusion
Releases are the final step needed to get software into the hands of users. They are a critical part of the software development lifecycle and require careful planning and coordination. Automated release processes can help reduce the time and effort required to release software, as well as reduce the risk of errors. By adopting automated release processes, organizations can release software faster, more reliably, and with less risk.
Faster release cycles are generally better than slower ones. They tend to reduce the number of bugs, shorten the time to fix them, and increase the quality of features. They also allow teams to respond more quickly to user feedback and changing requirements.