Five Standard Models to Work on Incidents Effectively
Every incident is different, so the best way to make sure you’re working effectively is to follow a standard model.
May 28, 2023

Site Reliability Engineering: How to Manage Incidents
Incident management is a formal process, and not every alert will trigger it.
Mar 26, 2023

How to Setup Multi-burn rate Windows Alert on Service Level Objectives
The burn rate is a calculation of how fast an issue is burning through the error budget.
Jan 30, 2023

Site Reliability Engineering: Which metrics help to measure SLI?
SRE recommends a baseline set of metrics to monitor called the four golden signals.
Nov 27, 2022

Site Reliability Engineering: SLI Implementation Example
The Service Level Indicator is the ongoing measurement of your system that tells you whether you’re meeting your objective.
Oct 23, 2022

How do you keep track of the actual Service Level Objectives
Service health is defined in terms of multiple service level objectives, SLOs, which are user-focused rather than operations-focused.
Sep 27, 2022

How to effectively Identify and Measure Toil as Site Reliability Engineer
The outcome must be worth the investment.
Aug 24, 2022

Site Reliability Engineering: What is a Toil?
Toil has a negative impact on people and systems.
Jul 18, 2022

Site Reliability Engineering: Setting up the right Monitoring System
You need to know if something is going on with your application that affects the end-user experience as soon as possible.
Jun 22, 2022

How to Choose the Right Continuous Integration (CI) System
Published in Geek Culture
Choosing the right CI system is important to the success of every product.
Mar 31, 2021