Shifting Stress to Progress: Understanding DevOps to Do DevOps Better
Demystifying DevOps: Exploring Its Principles and Practices
Understanding the core principles of DevOps often clarifies the practices built around them. This article is based on a tech talk by Sonal Pater, a Principal Program Manager at Microsoft. We’ll walk through these principles and connect each one to common DevOps practices.
Understanding the Origins of DevOps
DevOps, a concept that originated around 2007-2008, grew out of applying the principles of software development to IT infrastructure management, bridging the gap between developers and operations. In 2009, the Director of Engineering and the VP of Ops from Flickr gave a talk at an O'Reilly conference called “10+ Deploys per Day”, introducing practices we now consider crucial in DevOps.
Defining DevOps
The Phoenix Project characterizes DevOps as the outcome of applying lean manufacturing principles to the IT value stream to accelerate the flow of work. The DevOps Handbook then describes three ways to apply these principles. To make DevOps easier to understand, Sonal suggests examining how its principles relate to practices that teams already follow.
The Power of Flow: A Closer Look at DevOps Practices
The principle of flow is about establishing a value stream that delivers value to the customer; any work that delivers value to the customer is part of that stream. Sonal explains why making work visible is vital and how limiting work in progress and reducing batch sizes help streamline delivery.
The Value of Making Work Visible
- Why do we have a sprint board? Its purpose is to make work visible so that the value stream can be established or optimized. This is especially critical in tech work where the outcome isn't physically tangible.
- Why do we work on one task at a time? This is essentially done to limit work in progress and avoid multitasking, which often extends the time to complete tasks.
- Why do we commit small changes in pull requests? The objective here is to reduce batch sizes, chunking the work into smaller parts and completing each one before moving on to the next.
- Why do we test and operate our own code? This practice reduces the number of handoffs; every handoff loses context and can become a bottleneck.
- Why do we use CI/CD pipelines? The concept behind this is to remove constraints from the flow, automating tasks and processes to streamline delivery.
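To make the work-in-progress point concrete, here is a minimal sketch in Python. It is a toy model, not something presented in the talk: the function names, the one-unit time slices and the `switch_cost` parameter are all assumptions made for illustration. It compares finishing tasks one at a time with rotating between all of them.

```python
from typing import List, Optional


def sequential_finish_times(task_sizes: List[int]) -> List[int]:
    """Finish time of each task when tasks are completed one at a time, in order."""
    finish_times = []
    clock = 0
    for size in task_sizes:
        clock += size
        finish_times.append(clock)
    return finish_times


def multitasking_finish_times(task_sizes: List[int], switch_cost: int = 0) -> List[int]:
    """Finish time of each task when one unit of work is done per task in rotation.

    switch_cost is a hypothetical penalty (in time units) paid every time
    attention moves from one task to a different one.
    """
    remaining = list(task_sizes)
    finish_times = [0] * len(task_sizes)
    clock = 0
    last_task: Optional[int] = None
    while any(r > 0 for r in remaining):
        for i in range(len(remaining)):
            if remaining[i] == 0:
                continue  # this task is already done
            if last_task is not None and last_task != i:
                clock += switch_cost  # pay for the context switch
            clock += 1  # do one unit of work on task i
            remaining[i] -= 1
            last_task = i
            if remaining[i] == 0:
                finish_times[i] = clock
    return finish_times


def average(values: List[int]) -> float:
    """Mean of a non-empty list of finish times."""
    return sum(values) / len(values)


if __name__ == "__main__":
    tasks = [3, 3, 3]
    for label, times in [
        ("one at a time", sequential_finish_times(tasks)),
        ("multitasking", multitasking_finish_times(tasks, switch_cost=1)),
    ]:
        print(f"{label}: finish times {times}, average {average(times):.1f}")
```

With three tasks of three units each, working one at a time finishes them at times 3, 6 and 9. Rotating between them finishes them at 7, 8 and 9 even with no switching penalty, and a penalty of one unit per switch pushes that to 13, 15 and 17. The total amount of work is the same, but nothing is delivered early.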
Paving the Way for Continuous Feedback and Learning
Sonal then covers the second and third principles of DevOps: fast, constant feedback, and continual learning and experimentation. She links peer review of code in pull requests to pushing quality closer to the source, and automated testing to seeing problems as they occur. Finally, she ties retrospectives and sprint reviews to the improvement of daily work, stressing the importance of a culture that encourages learning from both successes and failures.
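The link between automated testing and seeing problems as they occur can be sketched as a tiny pipeline runner. This is an illustrative toy under stated assumptions, not the API of any real CI/CD product: `run_pipeline`, the stage names and the `add` function under test are hypothetical. Stages run in order on every change, and the run stops at the first failure so the team can swarm on it rather than pile more work on top.

```python
from typing import Callable, List, Tuple

# A stage is a (name, check) pair; check returns True when the stage passes.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (succeeded, log), where log records each stage that ran.
    """
    log: List[str] = []
    for name, check in stages:
        passed = check()
        log.append(f"{name}: {'ok' if passed else 'FAILED'}")
        if not passed:
            # Later stages are skipped: the flow is stopped until this is fixed.
            log.append("pipeline stopped: fix this before merging more work")
            return False, log
    return True, log


def add(a: int, b: int) -> int:
    """Hypothetical code under test."""
    return a + b


if __name__ == "__main__":
    stages: List[Stage] = [
        ("unit tests", lambda: add(2, 3) == 5),
        ("lint", lambda: True),
        ("deploy to staging", lambda: True),
    ]
    succeeded, log = run_pipeline(stages)
    print("\n".join(log))
    print("succeeded:", succeeded)
```

Stopping at the first failing stage keeps the feedback loop short: the person who just committed learns about the problem while the change is still small and fresh in mind.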
Final Thoughts
Exploring the principles behind DevOps practices may help drive better adoption of these practices in your teams. Being familiar with the “why” behind every action aids in better understanding and implementation of this methodology.
Whether you are a seasoned DevOps professional or a novice breaking into the field, understanding the underlying principles of DevOps can provide clarity and drive in your DevOps journey. If you need more help or want to share your experiences, feel free to get in touch!
Video Transcription
So, good morning, good afternoon and good evening everyone, and thank you so much for joining this talk. My name is Sonal Pater. I'm a principal program manager at Microsoft in a group called Commercial Software Engineering. And today I'm here to talk about DevOps and how understanding DevOps can help us implement DevOps better. So when I started my DevOps journey, the first question that came to my mind was: what is DevOps? I asked around a few of my colleagues who are fairly new in their adoption of DevOps, and they told me it's like CI/CD pipelines, automated testing, test-driven development, et cetera, et cetera.
But is that what DevOps really is? You know, how do you define it? So I did a web search. There were a lot of results. Some of them said DevOps is Dev and Ops together. But how does automated testing, for example, fit into putting Dev and Ops together? So then I decided to research the origins of DevOps. Where did this name come from? I found out it kicked off somewhere in 2007, 2008, and the concept of DevOps came from applying the principles of software development to IT infrastructure management. In 2009, there was an O'Reilly conference where the Director of Engineering and the VP of Ops from Flickr did this talk called 10+ Deploys per Day. It's quite an entertaining watch, and the talk was on YouTube, and I'd really recommend you go and watch it, because they did like a role-play on how bad things were at that time. But also, all of these practices that we now take for granted were being introduced for the first time. So now I knew how it started, but what is it exactly? Can it be defined?
And to find out more, I referred to a couple of books. The first book, The Phoenix Project, defines DevOps as the outcome of applying lean manufacturing principles to the IT value stream to accelerate the flow of work. The DevOps Handbook then takes that definition and defines ways, or paths, to apply that to your software development.
It defines three ways to apply DevOps and talks about the principles behind each way. This probably is still too high level, and there are probably some engineers in the audience thinking, "This is so typical of what a program manager would say." In this talk, what I attempt to do is try to make the connections between the principles and the practices of DevOps. I've structured this talk to introduce each way from The DevOps Handbook and then examine how certain DevOps practices that we already do contribute towards those principles.
So, let's start with the first one: the principle of flow. The principle of flow is about establishing a value stream that delivers value to the customer. Any work that delivers value to the customer is part of the value stream. Do you write automated tests? Yes. Is it gonna deliver value to the customer? Yes, because they're gonna see less failures when the product is in production. Then that's part of your value stream. Do you write infrastructure as code? Yes. Does it deliver value to the customer? Yes, because they can easily set up new environments. So that's part of your value stream. Anything that delivers value to the customer is part of the value stream. The value stream goes from left to right, or from Dev to Ops, or from business to the customer. So with this definition in mind, the first question I'd like to ask is: why do we have a sprint board? The first way tells us it is to make work visible. Unless your work is visible, you cannot establish or optimize your value stream. This is the most basic starting point of any DevOps implementation, especially for technology work, right? Because it's more or less invisible, unlike the car produced at the end of the assembly line. Often there's friction between engineers and the managers who think, "Why is something taking so long? It's just a few lines of code." Or the frustration you feel when you've spent two or three sprints writing a beautiful API but you don't have anything nice to demo.
Putting your work on a sprint board helps us visualize that work and see where it's flowing well and where it's stalled. Also, remember, work is only considered done when it's running in production, delivering value to the customer. Now, the next question I would like to ask is: why do we work on one task at a time? The answer to that is to limit work in progress. Time taken to complete even simple tasks significantly degrades when multitasking. Picture this scenario, right? You've got a project deadline and you've got so many things to do in this one sprint. You start working on all your tasks at the same time. You've got six tasks open, you're switching back and forth between bugs and new functionality. You lose your flow constantly switching back and forth. You wanted to finish stuff faster, but you end up taking more time, or quality suffers. To prevent this from happening, you want to limit your work in progress. The next question to ask is: why do we commit small changes in pull requests? DevOps principles tell us it is to reduce batch sizes. What that essentially means is you want to chunk your work into smaller parts and finish that part before beginning another. To demonstrate that, I'd like to show you two flows, right?
The first flow, you can see all your changes are grouped together. They're committed, tested and grouped together in a single code review. The code review probably takes longer because there are more files to review and maybe some merge conflicts. The deployment probably takes longer as well because there's more configuration and, again, possible [...]. [In the second flow] you can see that problems are quickly identified and value is delivered to the customer a lot sooner, which is why we want to reduce batch sizes. So the next question to ask is: why do we dev, test and operate our own code? Now, this appears to be a pet peeve among many engineers I've spoken to. You know, "Why do I have to test my own code? I used to have a QA team before." Or, "Why don't we have an operations team? Why do you have to carry a pager, you know, for production?" The answer for that is to reduce the number of handoffs. Every time work passes from team to team, there's a set of communications and context setting that's required: you have specifications, prioritization, ticketing systems, scheduling, documentation, meetings, emails and whatnot.
Yet every time a handoff happens, context is lost. Every handoff is a potential queue, and each queue is a potential bottleneck. And when you have a bottleneck, the flow of work is stopped or interrupted and it's not going to the customer. So the next question: why do we have CI/CD pipelines? And the answer to that is to remove constraints from the flow. To demonstrate this, I've put up this picture. This is the waterfall model. And if you're about my age, you've probably lived the dream, or should I say the nightmare. In the waterfall model, each step must happen in a specific order, so each step is a constraint for the next one. For example, you cannot start requirement specification until you finish analysis, which means analysis is a constraint for requirement specification. When we start applying DevOps, we start removing constraints. So I'm gonna remove the environment creation constraint by writing infrastructure as code.
Now the development is not constrained by an IT infrastructure team. Next, I'm gonna remove testing and integration and deployment by adding CI/CD pipelines. What happens now is the delivery to the customer is not constrained by sign-off by a testing team. Neither is it constrained by deployment by an infrastructure team. All of that is done through CI/CD pipelines automatically: as soon as work is committed, it is tested and deployed automatically. Let's remove some more constraints. What about design and architecture? You can remove that by having a loosely coupled architecture. The development team is not constrained anymore by a team of architects handing them architecture. Design stays in the realm of the development team. What's interesting now is, if you look at what is left, it's analysis, requirement specification and development, all activities that can be done by the DevOps team, which is the product owner or the developer.
And when you have a DevOps team that does not have any external constraints, that's when they're best placed to give value to the customer. So we looked at the first way and all the principles in the first way. Now let's look at the second way, the principle of feedback. We've established the value stream from Dev to Ops, and now we need to make it a safer, more resilient value stream. The way we do this is by fast and constant feedback from left to right. The feedback comes not just from the customer to the business, but from all stages of the value stream. Let's look at some of the DevOps practices that help us with this feedback and how they relate to the principles. So first question here: why do we peer review code in pull requests? The second way tells us it's to push quality closer to the source. PR reviews give us a quicker closing of the feedback loop. They allow us to detect and remediate problems when they are actually cheaper to fix than when they get amplified in production. A stitch in time saves nine, right? Peer reviews also make quality everyone's responsibility and not just the responsibility of a separate quality assurance team. Having developers share responsibility for quality not only improves outcomes but also accelerates learning. The next question: why do we do automated testing?
The answer for that is to see problems as they occur. If we wish to build a safe, resilient system of work, we have to constantly test not just the code but also the design and operating assumptions. Also, automated testing again provides you with a shorter, quicker feedback loop. So why do we stop to fix a failing CI/CD pipeline? And like I said, I receive a lot of pushback to this. Like, "Do you literally need to stop working? The whole team should just stop working to fix an automated pipeline?" The second way says yes, you should, because you want to swarm and solve problems. It's not sufficient just to detect problems. When the unexpected happens, we have to mobilize, put our heads together and fix those problems to get the flow of work going again. A very good example of this is the Andon cord at Toyota. In all Toyota manufacturing plants, at every work center, all workers and managers are trained to pull a cord if anything goes wrong at the work center. So, for example, if a part is missing or defective, or work is taking longer than it should. Once the cord is pulled, the team leader and the worker try to resolve it for 60 seconds or near about.
But if they're not able to resolve it in that time, the entire production line is halted and the entire organization can be mobilized until either the problem is fixed or an acceptable countermeasure is found. Instead of working around the problem to fix it later, they swarm to fix it immediately. The same principle can be applied to software development as a virtual Andon cord. If a build or deployment pipeline is broken, then the flow of work to production is essentially stopped and the value stream has stalled. Instead of postponing the resolution of a problem, we need to swarm together to either fix it right away or put a countermeasure in place for as long as that. So now we come to the third way of DevOps. This is the principle of continual learning and experimentation. We've built a value stream. We've made it safer and more resilient. Now we need to make it a better and more improved value stream. And we do that by continued learning and experimentation. We need an organizational culture of experimentation and learning where all of us are empowered to learn from our successes and failures. And so the final question I'd like to ask is: why do we have retrospectives and sprint reviews?
The third way says it is for the improvement of daily work. Learning is not just about learning new technologies or frameworks, it's also about improving how we do work. It's a well-known saying that in the absence of improvement, processes don't remain the same but degrade due to chaos and entropy. By applying the scientific method to both product development and process improvement, we learn from our successes and failures, identifying ideas that don't work and reinforcing ones that do. Remember that even more important than daily work is the improvement of daily work. So thank you so much for listening. I hope that looking at the DevOps practices from the perspective of the principles behind them will help you drive the adoption of these practices even better with your teams. I would love to hear your views on why you think we should do DevOps and some of your experiences in implementing them. Thank you very much. I'd be happy to take any questions at this point in time. Looking at the chat: "DevOps works outside of DevOps." Yep, it's all about the three ways. Absolutely. And swarming is a fabulous method, Simone. I would really recommend that. It really encourages learning. It's a very good team-building method as well.
And it also encourages helping your teammates. Basically, it's not a DevOps engineer's problem if things don't work right, it's everyone's problem. So I definitely encourage people to, you know, next time you have a failed pipeline, just get together, put your heads together and solve it. I'm gonna check Q&A if you have any questions there. OK. Got some comments about the waterfall method, which can be very long and wet. Absolutely. As I mentioned, I've referred to these two books that I mentioned earlier. The Phoenix Project, it's a kind of a novella. It's a very interesting read as well. And it tells the story of an IT manager and how they move from having kind of a dysfunctional team to how they apply DevOps to make it a better functioning, more effective team. And The DevOps Handbook gives you really specific examples, really something that you can actually implement, on how you apply these principles as well. So, Jessica: "What is the hardest part of DevOps implementation with customers?" I think the hardest part is just the change in the mindset.
And it sounds like, you know, something that people say often, but it's literally that. Like, developers think that they're developers, testers think that they're testers, you need a DevOps engineer, you need IT engineers. And the change in mindset is that this should be a team of people through the verticals, right?
And it's OK to invest your time not writing code but fixing a pipeline. That's your work as well. It's OK to do that. It's OK to spend your time writing tests and not fixing a bug. So that change in mindset is very, very important, and that is the hardest part, I believe. Thank you, Simone, for that wonderful link as well. Thank you. Don't see any more questions coming in. I have Shivani. You've got my LinkedIn ID there. I would love for you to connect on LinkedIn as well. Let me know what you think. I've got an article on LinkedIn as well that talks about these DevOps principles. We've also got another article that talks about multitasking and context switching, mainly, and why I think that's bad for you. So I would really love to see your comments on those as well. Just checking on how long this session is meant to be. Thank you, Jessica, for sharing that. Does anyone have any stories about DevOps implementation, or anything that you'd like to share about how things were difficult for you? And I would love to have any feedback on putting these practices in the perspective of the principles.
I thought I just flipped it a little bit. So you usually talk about principles and, OK, this is how you implement the principles. But did you enjoy having that flip of looking at the practices and then seeing how they align with the principles? I've had success selling DevOps this way, right? I've had success selling those practices this way, because usually when you tell people, "Hey, do this, do that," that doesn't work, right? We're all intelligent people here. We need to know why we're doing this, and this is why I kind of put this together. I've used this method of talking with a lot of my engineers, people in my team, and they really appreciate having that perspective behind why we do certain things. OK. All right. So if there are no more questions, it was a real pleasure to talk at the women's tech network. And you've got my LinkedIn, you've got my contact details there. Get in touch, let me know your experiences with DevOps and, you know, we can learn from each other. Thank you so much.