Site Reliability Engineering is an engineering discipline devoted to helping an organisation sustainably achieve the appropriate level of reliability².
Key findings
SRE practices resonate deeply with Kabisa's core values, which is why we have been investing in SRE since 2021. In this blog post we cover some key findings from the survey conducted by the DevOps Institute in 2022¹.
As the findings show, more and more companies are adopting SRE. Reportedly, 62% of respondents are in various stages of implementation, with very low levels of failure. A no-blame philosophy and the implementation of observability tools are the practices most often applied company-wide, while toil reduction scores very high on being adopted in only a few teams. In contrast, chaos engineering still has a low adoption rate. We think this is because there are many steps between doing no chaos engineering at all and doing it at Netflix's level, which can make it seem too difficult to get started with.
Top 3 reasons for implementing SRE
As stated in the report, the top 3 reasons for implementing SRE are:
- Reducing service failures and unplanned downtime.
- Improving the organisation’s ability to provide a competitive edge through more reliable services and offerings.
- Improving satisfaction among business teams by reducing the frequency and severity of incidents.
All of these reasons contribute to maintaining a high-quality customer experience and a high rate of innovation. Whereas DevOps helps us accelerate innovation while focusing on the product life cycle, SRE brings the focus back to delivering reliable products to our customers.
Implementation of SRE
Simply put, there is no standard way to implement SRE. Team setups vary and include ‘embedded’, ‘consulting’ and ‘tools’, as they are called in current SRE resources³. The ‘embedded’ approach places a site reliability engineer directly within a development team so that SRE expertise can be directed at specific problems or teams. In contrast, the ‘consulting’ approach keeps the SRE team as a centre of expertise whose main focus is teaching others. The ‘tools’ approach has site reliability engineers focus more on support and automation code.
Need help?
One of the main challenges mentioned in the report is the lack of staff with the necessary skill set. Another concern is that the value of SRE is not widely understood.
Are you wondering how your organisation can adopt SRE, or what it can add to a team or company already running DevOps? Let us know. We have an SRE team ready for your questions and challenges. If you are interested in getting started with SRE or accelerating your SRE journey, contact us for a discovery session to see where your team or company is right now.
1: https://www.devopsinstitute.com/blog-global-sre-pulse-insights/
2: https://sre.google/resources/practices-and-processes/anatomy-of-an-incident/
3: https://cloud.google.com/blog/products/devops-sre/how-sre-teams-are-organized-and-how-to-get-started