• SRE-iously! Reliability!

  • The Site Reliability Engineering (SRE) is a software engineering practice that combines DevOps and traditional IT operations to solve customer problems, automate Production operations tasks, accelerate operation and software delivery by minimize service risk.
  • Read More

Since SRE is fundamentally connected to system reliability, let's explore Reliability Engineering in more depth here Reliability Engineering
image_2025-01-08_235042717

Google’s Approach of SRE

Site reliability engineering (SRE) was born at Google in 2003, prior to the DevOps movement.

  • "SRE is what happens when you ask a software engineer to design an operations team.” - 
  • Ben Traynor, VP of engineering at Google and founder of Google SRE

  1. Hire software engineers to run products and to create systems to accomplish the work.
  2. Without constant engineering, operations load increases and teams will need more people just to keep pace with the workload
  3. 50% cap on the aggregate “ops” work for all SREs—tickets, on-call, manual tasks, etc.
  4. Want systems that are automatic, not just automated
Screenshot 2025-01-13 at 21.20.57