No Sysadmin Required


Reach back into your memory (or a Wikipedia article) to Thursday, January 1st, 1970. The world of computing consisted of only a few large, expensive computers that varied widely in capability and implementation. Each had its own microcode architecture, operating system, libraries, and software. Each type of computer was so specialized that it required its own Systems Operator staff, with skills specific to that computer, to keep the machines running.

As you might imagine, trying to get these systems to work together was difficult and frustrating. Two guys, Ken Thompson and Dennis Ritchie, thought so too. They decided to write their own operating system that combined the best concepts of the computers of the day into a single operating system that would be able to run on any of these machines. This new system became UNIX, and its core design philosophy branched and flourished into many different flavors, including BSD, IRIX, HP-UX, SunOS, NeXTSTEP, NetBSD, FreeBSD, OpenBSD, AIX, and eventually Linux and Mac OS X. This open field of UNIX-ness created the Systems Administrator.

The Sysadmin was responsible for the security, stability, and performance of the UNIX-like operating systems. These systems were big and expensive and the organizations purchasing these systems expected them to be safe and reliable. As “Personal Computing” came along, the Sysadmin role would expand to include the Microsoft Windows and Apple MacOS platforms. But the goal remained the same: stability, predictability, security, and productivity.

A new upstart named Linus Torvalds came along and, like Thompson and Ritchie, found the UNIX landscape too complicated and tied up by too few major manufacturers. He decided to write his own operating system too, and the Linux kernel was born. Linux (along with GNU libraries and tools) joined the fledgling Open Source community and helped accelerate that movement into the mainstream. The pace of innovation was starting to increase and the traditional Sysadmin role struggled to keep up.

Then the Internet became a thing. Suddenly, the pace of innovation exploded and the role of the Sysadmin, who stood for stability and predictability, became a roadblock to progress. The idea of DevOps emerged and turned the time-tested ideals of the Sysadmin into quaint ideas of the past. “Fail fast!” was the mantra of the day.

And it was glorious! You could do your banking online. You could order custom formulated dog food and days later it would be sitting on your doorstep. You could connect and communicate with long-lost friends in an instant. You could call a car to take you anywhere or to bring you a burrito from the other side of town.

But it wasn’t so shiny behind the scenes. There were massive data breaches, incredible cost overruns, and venture capital money flowing into very cool projects and products that went nowhere. There were constant changes in projects that led to undesired outcomes. Perhaps the old Sysadmin ideas of stability and predictability weren’t such bad ideas.

Today, software and infrastructure implementors take a more measured approach. DevOps is still a thing, but the “Fail Fast” mantra has moved aside and concepts around automation and micro-services have arisen. These ideas turn traditional software concepts on their ear by focusing on the data within the application and by implementing a solution as a system of small, purpose-built services with a well-defined data schema. These micro-services can be built and improved in isolation, reducing the scope of changes and therefore the risk associated with the changes.

With this shift, new roles have emerged. The Site Reliability Engineer is one of these roles. This job title first appeared at Google and has since moved out into the larger community. The idea of the SRE is to go somewhat back to the old Sysadmin concepts of stability and predictability, but not at the cost of slowing down innovation. The SRE has many of the same responsibilities as the Sysadmin, but usually goes about them in completely different ways. Automation and orchestration tools are the tools of the day and the SRE knows how to leverage them to their maximum potential. The SRE also works farther up the application stack than the traditional Sysadmin. The SRE will have a fuller understanding of the application running on the systems and how that application interacts with other systems on the network. The SRE is the “Sysadmin Developer.”

Another role has emerged with the advancement of automation, and that is the Automation Engineer. This role is responsible for maintaining and improving the “glue” that pulls together all the micro-services and cloud infrastructures into a cohesive system that will then perform its intended task. The AE builds the code that deploys the application, or updates the application, or adds infrastructure so the application can process more data, among many other things. The Automation Engineer is the “Developer Sysadmin.”

A third major role has emerged alongside the other two: the Observability Engineer. The objective of this role is to implement tools that measure the performance of applications and systems. Performance can mean a lot of things, from availability to throughput to correctness. The Observability Engineer creates and delivers tools to measure, track, trend, and analyze all this information.

As you can see, the Sysadmin role hasn’t really gone away. The role has changed, diversified, and specialized. So, saying “No Sysadmin Required” is technically true, but disingenuous. The work of the Sysadmin still needs to be done, but it’s no longer appropriate for a single, general-purpose role. The Site Reliability Engineer, the Automation Engineer, and other specialized roles have taken its place and the industry is better off for it.

Ben Vaughan
Maker of Things That Make Things

My interests include DevSecOps, CI/CD, observability, and incident response.