The SRE book

I gave a Lightning Talk at SREcon16 and I was lucky enough to win the SRE book from Google while I was there.

Here are some notes on what I was thinking while reading it.

First, this is a phenomenal piece of work that really marks a special point in time: the dawn of the possibility of wide adoption of SRE principles. I say “possibility” because after getting exposed to the deepest details of what makes SRE work, I think that there are lots of organizations that won’t be willing or able to make it work.

Even though I’ve been in IT for decades at this point, I’d fallen into the trap, as an outside observer of Google, of imagining that there was some magic bullet they possessed that made it possible to deliver such enormous services with such high reliability. If someone had asked me to explain that viewpoint before reading this book, I would have bashfully admitted that it was ridiculous to imagine that there are silver bullets in IT operations. Fred Brooks already taught us that.

Now that I’ve read the SRE book, I’ve figured out what the silver bullet is: it’s sweat.

Over and over while reading it, I thought to myself, “well, yeah, I knew that was the solution to the problems I was facing growing Tellme in 2001, but we just weren’t in a position to put in that work”. I’d also think while reading, “man, I can see how that would totally work, but you’d really need an immense amount of goodwill, dedication, and good leadership to make it happen; I’ve been in teams where there’s not a critical mass of team members who sweat the details to make that work.”

So that’s my 10,000-meter takeaway from the SRE book: Wow, man, that looks like a lot of work. It is as if Thomas Edison came back to life and restated his maxim: “Reliability is 99.999% perspiration and 0.001% downtime”.

But that’s not the end of the story. The other thing I felt time and time again reading the book was a sense of longing to get back in the game. It made me ready to sweat. The book, told as it is by passionate, proud, smart people who have been sweating in the trenches, is as intoxicating as a CrossFit promotional video.

To outsiders, who think they understand IT operations or software development based just on the English-language definitions of the constituent terms, SRE might look easy. But what I really liked about the SRE book was how, time and again, it talked about how the values of SRE inform the successful approach to a certain problem. When a team needs to introspect on its values in order to choose a way forward, you are no longer in the realm of technology: that’s about culture.

In my last job, from the first moment, my colleagues looked to me to guide the culture of the team. My title was not “tech lead”, but there were some behaviors I knew we needed to be encouraging and I knew how to model them. Reading the SRE book triggered the same instincts in me again. A lot of the info in the SRE book I already had learned in my own way, from my own experiences. But lots of the information was a new take on the old problems I knew about, and inspired me to say, “wow, yes, of course that’s the answer, I’d like to be in a team that was acting like that!”

But the fact that integrating SRE into an organization is a cultural, not technical, affair dooms it to partial, spotty uptake. There will be organizations that don’t have the right kind of cultural flexibility, or leadership that is able to bring people around to SRE. They will carry on with what they are doing, but they will pay the price by forgoing the benefits that Google has shown that SRE can bring to an organization. Their dev teams and ops teams will forever be locked in battle, and the only action item from their postmortems will continue to be “we need more change control meetings”.

I pity the fools.

Dell and the NSA

While I was reading this blog about how NSA’s bad-BIOS malware probably works, I was struck by a “coincidence”: Dell does a significant amount of government contracting work. In fact, Ed Snowden worked for Dell at one point. NSA’s bad-BIOS targets the RAID cards in Dell servers.

Now, Dell servers are widely deployed. I’ve used them in several jobs, for example. So it’s not unreasonable that NSA would target them, to get the best bang for the buck. But it also seems possible that in order to achieve the things Dell’s executives promised to NSA executives in fancy sales calls, some Dell engineers would find themselves using what they know about Dell servers to write bad-BIOS malware to attack those very servers.

Which made me think about my company, Cisco. We publicly said we don’t put in backdoors. But we also have a big sales organization staffed with people with clearances who make special products for government organizations. It isn’t hard to imagine, especially with the revolving door between military, intelligence and defense contractors, that some of those people would find their allegiances split between intelligence people asking them for hints from the source code, and Cisco’s Code of Business Conduct.

As Bruce Schneier reminds us, once you start wondering if you can trust your suppliers, it is very hard to stop wondering.

Double take

The nicest thing happened to me on the way home from work… I got one of those movie-perfect double takes from a guy I passed.

Of course, it was probably due to the pretty rainbow umbrella I was carrying, while cruising along on my unicycle. Years ago I realized that if it’s a grey rainy day it is nice to have bright colours overhead. Black umbrellas should be outlawed!

Just Married

I’ve been away from the web for a while because I was in Olivone, Ticino, Switzerland getting married!

Thanks to friends and family who came from so far away to witness such a special day.

And thanks also to our wonderful vendors, who made the day go so well. If you are thinking of putting on a wedding in Blenio, give these guys a call:

It feels different to be married. It seems like it shouldn’t… the house is the same, we still sleep on the same sides of the bed. But it’s different. Good. And safe. And happy. And… different.

Viva gli sposi, all of them, wherever they are in the world. This is why getting married is special! Now I know!

Just DO It

There’s a guy in Philly who realized he’s rich, because he makes $30,000 as a community organizer, and many people in the world live on far less. He decided to go on a diet, for his own health, but also to understand what it means to live on a simple diet.

That’s already an interesting story, but what’s really interesting is this video he posted, where he talks about what held him back from starting the project (fear) and what the reality has been (support). This is exactly the same thing I found when I made the big change to leave IT and be a log for MSF.

To anyone else out there, and I know some of you are out there because you come to me for advice, here’s the cornerstone of all advice: Just do it. What do you care what other people think? Why do you think your guess about what they will think is right anyway? Just do it, and then be surprised by the reactions, and by the good that comes into your life as the payoff for the risk you took.

Learning to Love Social Welfare

When you fall in love with someone from another country, it just happens. Then what comes after that takes more effort. You have to learn to love the other culture you’ve thrown your lot in with. That process is, and will continue to be, a joy for Marina and me.

Here’s an article that describes some of what I’ve learned about how life is organized in Europe. I followed the same course, roughly, as the author. Though I have to admit, I was never as skeptical as he was. Perhaps that’s because I was given humanist values by my mother, and I always understood that something wasn’t working right in my homeland.

Another great example is something I told Mari the other day. She said something about a dream for the future. I told her, “Congratulations! A little-known benefit of marrying an American is that you are entitled to the American dream. As long as we are together, if you can dream it, we will work to make it happen.”

This is not fluff-speak. The problem of “self advancement” is a serious one for Europeans, and particularly for Swiss, who see themselves as coming from very small cantons deep in the middle of a very small country (the “we’re just a bunch of farmers” effect). There are some Swiss people who look down on those who seek to improve themselves. There are some who criticize their neighbors when they travel too far, or study too much. Entrepreneurialism is much rarer in Europe. The most active, engaged, and entrepreneurial Europeans I have ever met live in Silicon Valley. They have to escape; they can’t imagine a life stuck in the rank and file of Siemens, biding their time until they get a turn to be a boss and have a tiny bit of control.

At the same time I’m coming to understand what real social welfare is, Marina is coming to understand what the freedom to dream is, and what it feels like when any dream (big or small) is met with “Great! Let’s go do it!”

Magic!

Repair vs Replace

What are the economic effects of the repair vs replace decision? Interesting question, that.

If you missed “more local employment of the blue collar type”, go read this:

There is a slight diversion of purchasing strategy, repair rather than replace. This feeds a blue collar industry in the local region.

It used to be, when money was loose, replace with new was the norm. If the repair was 75 percent of the cost of replacement a new motor was ordered, the old motor was scrapped, and the country of origin (Mexico, China, Taiwan) benefited. These beautiful USA built, 50 year old, 200 horsepower motors, were going to the scrap heap.

Now the repair industry is swamped.

Africans have known this for a long time, though for them, the causality is the other way around. If you don’t have the communications, transport, logistics and capital it takes to access new products made in far off lands, then you take the next best thing, which is to keep what you have working by repairing it.