Site Reliability Engineering

Site Reliability Engineering (SRE) is a practice, now common at large technology companies, in which an organization's operations problems are treated as software engineering problems. In other words, it is what happens when a developer is assigned to solve operations problems. SREs are software engineers who build software to make systems more reliable. The question that naturally arises is: isn't that DevOps? And which is better, SRE or DevOps?

History :
The term was coined by Ben Treynor, a software engineer at Google, in 2003, so the practice began well before the DevOps movement. After establishing SRE at Google, Treynor's team later published an SRE book to make the industry aware of the practice.

Responsibilities Of Site Reliability Engineers (SREs) :

  • SREs are accountable for, and take on-call duty for, the systems running in production.
  • SREs are responsible for developing software that improves the reliability of systems.
  • SREs are responsible for performing post-incident reviews of systems that fail.

SRE vs DevOps : Which is better?
There is a useful analogy for understanding the two terms. Think of DevOps as an interface, that is, something like an abstract class whose methods are declared but not defined, and think of SRE as a concrete class that implements DevOps.

interface DevOps {
    void reduceOrganizationalSilos();
    void acceptFailures();
    void implementGradualChanges();
    void leverageAutomation();
    void measureEverything();
}

SRE, as the concrete class, implements DevOps and defines each of these methods as follows:

  • Reducing organizational silos: ownership is shared among software engineers, the product team and SREs, all of whom use the same set of tools.
  • Accepting failures: no system is 100% reliable, so faults will occur. SREs conduct blameless postmortems of failed systems and document the findings.
  • Implementing small changes: the smaller a change is, the easier it is to identify a problem and the faster it is to fix or roll back, which reduces the cost of failure.
  • Leveraging automation: manual tasks on production systems, such as user creation, package installation, alerting and logging, are automated wherever possible.
  • Measuring everything: monitor the right things once they are implemented, so that at the end of the day you have numbers or clear metrics that demonstrate success.
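
The analogy can also be written out as compilable Java. This is only a minimal sketch, not an official definition of either practice: the method names, the Sre and SreDemo classes, and the practices list are all assumptions made for illustration, and each method simply records a one-line summary of the corresponding point above.

```java
import java.util.ArrayList;
import java.util.List;

// The DevOps "interface" from the analogy: it says what to do, not how.
interface DevOps {
    void reduceOrganizationalSilos();
    void acceptFailures();
    void implementGradualChanges();
    void leverageAutomation();
    void measureEverything();
}

// Hypothetical concrete class: each method records how SRE fulfils a principle.
class Sre implements DevOps {
    // Summaries of the practices applied so far, in call order.
    final List<String> practices = new ArrayList<>();

    @Override
    public void reduceOrganizationalSilos() {
        practices.add("Share ownership and tools across engineers, product and SREs");
    }

    @Override
    public void acceptFailures() {
        practices.add("Run blameless postmortems and document the findings");
    }

    @Override
    public void implementGradualChanges() {
        practices.add("Ship small changes that are easy to fix or roll back");
    }

    @Override
    public void leverageAutomation() {
        practices.add("Automate manual production tasks wherever possible");
    }

    @Override
    public void measureEverything() {
        practices.add("Monitor the right things and report clear metrics");
    }
}

class SreDemo {
    public static void main(String[] args) {
        Sre sre = new Sre();
        // An Sre instance can be used wherever a DevOps reference is expected.
        DevOps principles = sre;
        principles.reduceOrganizationalSilos();
        principles.acceptFailures();
        principles.implementGradualChanges();
        principles.leverageAutomation();
        principles.measureEverything();
        sre.practices.forEach(System.out::println);
    }
}
```

Because Sre declares that it implements DevOps, the compiler rejects the class if any of the five methods is left undefined, which mirrors the point of the analogy: DevOps names the principles, and SRE has to supply a concrete practice for every one of them.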

    So, SRE and DevOps are not competing standards; they go hand in hand. It is SRE with DevOps, not SRE versus DevOps.
