What are Error Budgets

Site reliability engineering (SRE) is a discipline that lets teams design and operate scalable, resilient systems using a software engineering approach. Gartner defines SRE as a collection of systems and software engineering principles used to build and operate resilient distributed systems at scale. SRE complements DevOps practices: it manages the risks of rapid change by promoting resilience, accountability and innovation.

Error Budgets help a team decide whether it is focusing on the right things. They show whether the time spent on feature work is taking a toll in production.

When the error budget runs out, the team needs to change direction: drop feature work and huddle until the systems are stable again.
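To make the idea concrete, here is a minimal sketch of how an error budget can be derived from a service level objective (SLO). It assumes a request-based SLO, where the budget is the share of requests that may fail within a window. The class and field names (ErrorBudget, slo_target, total_requests, failed_requests) are hypothetical and chosen for illustration only; they are not part of the Agile Analytics product or its API.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Illustrative request-based error budget (hypothetical names)."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int    # requests observed in the SLO window
    failed_requests: int   # requests that violated the SLO in the same window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO allows to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 = budget untouched, 0.0 = budget spent, negative = overspent.
        if self.allowed_failures == 0:
            # A 100% target (or an empty window) leaves no budget to spend.
            return 0.0 if self.failed_requests == 0 else float("-inf")
        return 1.0 - self.failed_requests / self.allowed_failures

    @property
    def exhausted(self) -> bool:
        # True once failures have used up everything the SLO allows.
        return self.failed_requests >= self.allowed_failures


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000,
                         failed_requests=250)
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"budget remaining: {budget.remaining_fraction:.0%}")
    print(f"exhausted: {budget.exhausted}")
```

Under these assumptions, a 99.9% target over one million requests allows roughly 1,000 failures, so 250 failures leave about three quarters of the budget. Once exhausted becomes true, the team is in the situation described above and should switch from feature work to stabilizing the service.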

Setting up Error Budgets

Step 1. Connect Agile Analytics to your backend

Connect to Google Cloud Monitoring: [Google Cloud Monitoring] Connect Agile Analytics to Google Cloud Monitoring
Connect to AWS CloudWatch: [AWS Cloud Watch] Connect Agile Analytics to AWS Cloud Watch
Connect to Prometheus: (coming soon)
Connect to Datadog: (coming soon)
Connect to Dynatrace: (coming soon)
Connect to Elasticsearch: (coming soon)

Step 2. Create API Service

  1. Go to the Error Budgets page and select Add service from the dropdown.

  2. Fill in the service information and click Add.


Step 3. Set up Feature
