If you’re not using microservices already, you’ve probably heard people raving about how amazing they are. If you’re tempted to dip your toes in the water, then this podcast with independent tech consultant, author, and former FT technical director for engineering enablement, Sarah Wells, is a must-listen.
During her chat with Zoe Cunningham, she talks about what organisations need to consider before diving into the microservices world (spoiler alert, it’s not just about tech), and the things you need to have in place before you take the plunge.
The discussion also covers situations where microservices may actually not be the most appropriate way of building software.
You can listen to the full podcast on this page, or wherever you get your podcasts, and read the full transcript below.
Digital Lighthouse is our industry expert mini-series on Softwire Techtalks, bringing you industry insights, opinions and news impacting the tech industry, from the people working within it. Follow us on SoundCloud to never miss an episode.
***
Transcript
[music]
Zoe Cunningham:
Hello, and welcome to the Digital Lighthouse. I’m Zoe Cunningham. On the Digital Lighthouse, we get inspiration from tech leaders to help us shine a light through turbulent times. We believe that if you have a lighthouse, you can harness the power of the storm.
Today, I’m super-excited to welcome Sarah Wells, who is an independent tech consultant and author, and the ex-tech director at the Financial Times. So hello, Sarah, and welcome to the Digital Lighthouse.
Sarah Wells:
Very happy to be here. Hi!
Zoe:
Can you start by telling us a bit about what you do now, and the journey you’ve been on to get there?
Sarah:
So last year, I started up my own company. And I’m working as a consultant doing work around microservices, platform engineering, and technology leadership. And I’m writing a book.
This is the first time I’ve done that kind of thing. I’ve always worked permanently full time in an organisation. I’ve been working in software engineering for nearly 25 years. I actually started out in publishing, and then I went back and did a conversion course. Then I got a job as a Java developer, and I probably spent 10 to 15 years working as a senior engineer and a tech lead.
And then, in 2011, I joined the Financial Times. And it was a really interesting time, because there was a lot of change going on in the technology department there. So over that decade that I was there, we moved to the cloud, we adopted microservices, we started to work in a very DevOps way. We adopted containers and Kubernetes very early on – we were really early adopters of a lot of things. And really, I just had a great opportunity to learn a lot of stuff around all of these things that are now fairly mainstream, but we were very early adopters.
At the FT, I was a principal engineer, leading the content platform. And then I became a tech director for operations and reliability. So I moved over to actually run operations, having never done that before, and then engineering enablement. So every team at the FT that built tools, or anything for other engineering teams, was part of my group for the last couple of years I worked at the FT.
Zoe:
Fascinating. Let’s talk about your new book, Enabling Microservice Success. And it’s going to be published by O’Reilly in early 2024. So first, congratulations! I’d like to ask how much of the book is based on your personal experience, on that journey you just described?
Sarah:
It’s all based on my experience at the FT. I think that there’s something quite interesting about staying at a company a long time after you adopt some new technology. I saw the problems we faced. We were such early adopters that we had to solve a lot of problems when there wasn’t much advice out there on how to do it. And there wasn’t the tooling that there is now. We had to build our own solutions.
But I also was there long enough to see where things we tried didn’t work. So I can really draw on that experience. And also, I was there to the point where some of the systems we’d built were now five years, six years old. So you have a smaller team running them. And the people probably aren’t the same people who made some of the decisions. Think about what that means for keeping everything up and running, and safe, and secure.
Zoe:
Yes, super-interesting. Obviously, lots of people are talking about microservices now, and it’s really become mainstream. But you talk in your book about the microservice architectural side. Why is it important to understand that, as well as what a microservice is and how to use it?
Sarah:
I actually think that it’s more about where microservices fit into the whole of your organisation. I think it’s easy to start doing them and create something that doesn’t really give you the benefits, but you’ve made something that’s much harder to understand and to operate. So the architectural side of it is around working out how you find the right boundaries between your services and how you decide how many different technologies you want to have in your microservices architecture.
But I actually think the main challenge is around the organisational structure and the culture, because if you want to be successful with microservices, it’s beyond the architecture. You really have to have the right setup. I talk in the book about Conway’s Law, because Conway’s Law is just pretty foundational to it. It’s the idea that the systems you design look like the organisational structure you have. The idea is that you ship your organisational structure. So if you have three teams, you’ll have three systems. So you really need to make sure your team structure is aligned to sensible boundaries so that your architecture is as well.
And you want those teams to be pretty independent, so that they don’t have to coordinate with other people, which means they have to have a whole set of skills. They have to be cross-functional. They have to have everything they need to release that service to production.
And that can be a big change for organisations who may have been set up with guilds and groups of particular skills. You may have a front-end team, and a back-end team, and a testing team. Well, that doesn’t really work so much.
And then you need to have an amount of autonomy for those teams. That can be a massive change in company culture. Because you have to trust, for example, that a team can release code when they think it’s ready, rather than going through some process where someone else says, ‘Yes, we can put that change out.’
Zoe:
Gosh, it makes me imagine how this could go very wrong. And how you could have someone just saying, ‘All right, well, we had some Python engineers. Now we need some microservices architecture engineers. We’ll just stick them into our organisation and they’ll go through the same processes.’ You could possibly even be going backwards from where you were before, if you’ve not got that structure, and the people side of it, right.
Sarah:
Yeah, the thing about microservices is that they can be more complicated to operate, because they’re a much more distributed system. And there are all kinds of things that you have to do slightly differently, because you have a lot of services. So if you’re not actually benefiting in terms of being able to move more quickly and release more value, then you’re paying a cost but not getting the benefit. So I think it’s really essential to think about whether your organisation can actually allow 100 releases to be done in a week. Because if you don’t think that’s possible, I think you’re really going to struggle to benefit from microservices.
Zoe:
As technology leaders, we only ever look into these technologies to try and make things better, right? So if you’re thinking that microservices is the answer, it might still be. There might just be a step you have to do first, before you can even start to look at that.
Sarah:
Absolutely, I think there are definitely things that you want to have in place to be able to do it successfully. And they’re all things that are actually really good to have in place anyway. So for example, if you don’t have continuous delivery, if you don’t have the ability to have automated pipelines that you can deploy your code through, with automated testing, you’re going to really struggle with microservices. But they’re a really good thing to have anyway, because they make all of your development life much easier.
Zoe:
All right, well, let’s chat a bit, then. I kind of blithely said, ‘Oh, microservices architecture is becoming the industry standard.’ But why is that? What are the real benefits that people are getting out of it once they do go through this process?
Sarah:
So I think we were already doing microservices at the point where the State of DevOps report started to be created. But there was a survey that became the book Accelerate, that identified: ‘What does it mean to be a high-performing technology organisation?’
And it’s a really great book. When they’re talking about high-performing, they’re saying technology organisations that make a positive impact on the business they’re part of. So the business becomes more profitable or takes on more market share. And what they identified really, is that you need to have this flow of value to your customers.
And there are a whole set of things that they found were correlated with that, but one of them was loosely coupled architectures. And obviously, microservices aren’t the only loosely coupled architecture you can have. But they’re actually very much focused on that as a benefit. So with microservices, if you get the boundaries right, you should be able to work on your service without having an unexpected impact on teams outside the boundaries of your system. So you can work pretty autonomously, and release changes frequently.
Zoe:
And I suppose for me, as an ex-engineer a little while ago, I can imagine that the obvious benefit of a loosely coupled architecture is that you’re not in someone else’s way. Literally, while you’re coding, you’re not waiting on other people and dealing with a hold-up. Is that essentially the benefit? Or are there other, more complex benefits?
Sarah:
I think that is the key benefit. There are lots of reasons to adopt microservices. But I think the primary one is about solving an organisational problem, which is that you’ve got so many people working on your system that you actually just can’t communicate, you’re always waiting for people. So you need to be able to carve off a small piece of that system where people can work without having to go and ask someone else to do something. So you really just need to be looking for: ‘Can people make progress?’
When I first worked at the FT, I would go to a meeting on a Tuesday to decide whether I could release code to test on a Thursday. And that kind of coordination just does slow things down. And it doesn’t necessarily make you more likely to catch issues either. If you’re making small changes, it’s much easier to understand what the impact of that change is. And the person who makes the decision to release that code is the person who understands it the best: the team that have worked on it. So with a small change, you can test it, and you can release it, and if your boundaries are good, it’s pretty unlikely you’re going to affect someone else. I mean, obviously it happens, but it’s less likely than you think.
Zoe:
Yes [laughs]. We all know that there is no 100% in technology. It’s always ‘make it better’. So is microservices always the right architecture to choose?
Sarah:
Oh no. It’s really interesting because you start to hear a lot of stories now that ‘microservices didn’t work for us’. And when you look into it, a lot of it is, ‘Well, we didn’t quite have the right boundaries, or they were too small.’ A lot of our systems are distributed now anyway, because we make a lot of use of the cloud, we make a lot of use of software as a service. So you’re probably making calls over the network anyway. But the major thing is, if you’re one team, you don’t have to separate your system up into microservices to get that independence of being able to work on stuff.
You might choose microservices for another reason. Perhaps you need to have different technology choices in different parts of your stack, or you want to be able to scale rapidly and only in some parts of the system. But really, I would not start with microservices, because there’s a lot more complexity in the tooling around it and operating them, and [instead] wait until you find yourself slowing down and then consider that as a prompt for, ‘Could we separate out some part of our system?’ And it doesn’t have to be 1500 microservices, it can just be, ‘Well, let’s divide our system into four different areas and we’ve got a team working on each.’ That can be very effective.
Zoe:
So if you’re coming at this building a new platform, or looking at architecting a new platform, how do you go about making good platform engineering decisions?
Sarah:
It’s an interesting challenge, because when microservices were first coming around, there was this idea that they give you the ability to choose the right technology for your needs. So you could program this in a different language, you could use a different data store. But every time that you do that, you make it more difficult for people to provide common tools or platforms.
So I think that you want product engineering teams to be focused on business value. You don’t want them all solving the same problems because they’re autonomous, and they’re all building a CI/CD pipeline. You don’t really want that.
You want to have a platform engineering team that solves these problems that are not key business things, but they’re just something you need. And you want them to do that in a way where it’s very self-service for those product engineering teams: they can work independently and use these things without having to go and talk to someone. And that platform team is really about reducing the cognitive load for those product engineering teams so that they’re really focused on the difficult business problems and not on, ‘How do I spin up a server? How do I add logging or monitoring into my system?’ So you want the platform engineering [team] doing that.
And you also probably want to have a reasonable level of standardisation. So you only really want to say, hey, use a different language, use a different data store, if it gives you some real benefit for something that really matters to your business. So, for example, you might want to use a graph database, because you’re solving a problem where the information really represents a graph. And that’s an example we definitely had at the FT. It was much easier for us to have a document store for a content article and a graph database for the metadata for that article, than it was shoving all of that into a relational database.
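[Editor’s illustration, not part of the conversation: a minimal Python sketch of the split Sarah describes, with each article kept as a self-contained document and its metadata kept as a graph. The in-memory dictionaries stand in for real data stores, and every name and sample value here (`articles`, `edges`, `link`, `related_articles`, the IDs and titles) is hypothetical rather than anything taken from the FT’s systems.]

```python
from collections import defaultdict

# Document store stand-in: each article is one self-contained document.
articles = {
    "a1": {"title": "Rates rise again", "body": "Article text here."},
    "a2": {"title": "Bank profits jump", "body": "Article text here."},
    "a3": {"title": "New phone launched", "body": "Article text here."},
}

# Graph store stand-in: undirected edges between article IDs and concepts.
edges = defaultdict(set)


def link(a, b):
    """Record an undirected relationship between two nodes."""
    edges[a].add(b)
    edges[b].add(a)


link("a1", "topic:interest-rates")
link("a2", "topic:banking")
link("a3", "topic:smartphones")
link("topic:interest-rates", "org:central-bank")
link("topic:banking", "org:central-bank")


def related_articles(article_id, max_hops=4):
    """Breadth-first walk over the metadata graph, collecting other articles."""
    seen = {article_id}
    frontier = [article_id]
    found = []
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            # sorted() gives a stable visiting order for the example.
            for neighbour in sorted(edges[node]):
                if neighbour in seen:
                    continue
                seen.add(neighbour)
                next_frontier.append(neighbour)
                if neighbour in articles:
                    found.append(neighbour)
        frontier = next_frontier
    return found


print([articles[a]["title"] for a in related_articles("a1")])
```

[In this toy data, the article on rates reaches the article on bank profits through a shared organisation, while the phone article stays unconnected. Following relationships hop by hop is the kind of query that suits a graph; the article bodies never need to be involved until the final lookup.]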
Zoe:
Right. That’s a lovely way to break it down, though, and a lovely way to think about it. Because something that I find comes up time and time again, is this tension. And I think it’s good to talk about where the tensions are. Because again, with this ‘there’s no 100% in technology,’ there is never one true answer. And if anyone’s got the one true answer, all you know is that it’s wrong.
So I think looking at the tension is interesting, but the tension that I find comes up again and again is standardisation versus autonomy and flexibility. And actually, by breaking it down like that, and saying, ‘Which problems are these engineers solving?’ I think that’s actually a really good way to think about it. And perhaps if your engineers are solving the platform challenges over and over, then something’s obviously not working. But the answer isn’t necessarily to say, ‘Well, all right, you all go off and fix that’. Maybe the problem’s elsewhere, and you need to refocus the platform team or whatever.
Sarah:
Yeah, there’s a really interesting idea that if you make the platform optional, so that there is a process where people could choose to use their own version, you are encouraging the platform team to really talk to their customers and consider it as a product where they’re competing with other people: with external suppliers, or with the team building it themselves. But you should be able to beat all of those [competitors], because you really understand your customers, and they’re right there. You do have to have some guardrails, where it’s like, ‘You know what, this may not be the perfect deployment pipeline, but you could spend time building it, it’d be 5% better. And I just don’t think that’s enough of a difference.’
Charity Majors wrote a really interesting blog post about migrations, and said you have to get a massive jump to make up for the fact that you’ve got months of turmoil while you move everybody. So it’s got to be quite a bit better. Once you factor in all of the pain of moving, will it be sufficiently better to make a massive difference? I think that’s quite a good thought. It’s like, ‘Yes, it’s better, but is it better plus I’m prepared to go through three months of chaos?’
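[Editor’s illustration, not part of the conversation: the trade-off above can be put as rough arithmetic. This Python sketch assumes a deliberately simple model that is ours, not taken from Charity Majors’ post: output drops by some fraction while the migration runs, then improves by some fraction afterwards. The function name and all figures are hypothetical.]

```python
import math


def months_to_break_even(improvement, migration_months, disruption):
    """Months after the migration ends until the lost output is recovered.

    improvement: fractional gain once migrated (0.05 means 5% better)
    migration_months: how long the move takes
    disruption: fraction of normal output lost during the move
    """
    if improvement <= 0:
        return math.inf  # a move that is no better never pays back
    lost_output = migration_months * disruption  # in months of normal output
    return lost_output / improvement


# A 5% better pipeline after three months at half speed.
print(months_to_break_even(0.05, 3, 0.5))
# A 50% better pipeline after the same three months.
print(months_to_break_even(0.50, 3, 0.5))
```

[Under these made-up numbers, the 5% gain takes about 30 months to pay back the disruption, while the 50% gain takes about three. That gap is the intuition behind asking for a big jump before moving everybody.]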
Zoe:
Absolutely. So on that note, what are the important things that you need to consider when you’re moving to a microservices architecture?
Sarah:
I think the first thing really is to think about whether your team can do it. Whether you have the kind of organisation that can allow teams to be autonomous and empowered and to trust them.
So you need a culture that’s pretty open and looking to learn things. So if you’ve got your team split out to be a front-end team and a back-end team and a tester, have you got the appetite to actually change those to be cross-functional teams? Will you let them make their own decisions? Can you accept that you’re not mandating one choice for everything, and that you’re going to let people be autonomous? And that is about the culture of your company and your organisation. So is there an openness to letting people have power?
Zoe:
So it’s much more an organisational and people challenge, than a technical challenge? It’s not actually a, ‘Do you have this much server capability or this much money to invest?’
Sarah:
There is a technical maturity level as well. So if you’re an organisation that doesn’t have any automation, [if] you don’t have infrastructure as code, you don’t have automated testing, you don’t have automated deployment pipelines, you really ought to do all of those things first.
So there are some technical sides of it. But then there’s also this cultural side of it, which is, you want teams that want to be autonomous. So there’s a flipside: you can grant people the option to make their own choices, but do your teams actually want to do that?
And there is also an element of, ‘Can you move to thinking of things as products rather than projects?’ Because really, if you’re building microservices, you should consider it a product. And it’s not done until that product doesn’t exist anymore, rather than, ‘Let’s put together 15 people and have them do this project. And at the end, you’ve got a bunch of services that no one owns.’
In the old days, you might build this thing and hand it over to a maintenance team. It’s really difficult to do that if you’ve been making decisions as an autonomous team. Because let’s say you did build this one in a different language, or you used a different database, it’s pretty hard to throw that over the wall to a completely different team. They’re the ones that then feel the consequences of the decisions. I think there’s a really core thing, which is you ought to be supporting your service in production. So that with every decision you make, you’re thinking about it in terms of, ‘Well, I’m the one that’s going to have to wake up at three o’clock if this all goes wrong.’
Zoe:
[Laughs] Again, getting that accountability for making the decisions. I was just thinking, presumably, it means that also the longevity, or the expected longevity, of the software also plays into this decision. It sounds like this is going to be more suitable for something that is going to run for a long time, not a piece of software that you know is going to expire in 12 months or so.
Sarah:
I think so. I think the ultimate aim with microservices is that you never have to stop and rebuild the whole system again. So as things get out of date, or you decide you need to change your approach to something, you build a new service and you decommission the old one. And in theory, it’s just constantly there. It’s just being mutated over time. There’s a thought that microservices architecture is about being able to handle change. So yeah, if you’re building something that’s just going to be used for two months, maybe you don’t need it. The only reason you might need it is if it’s absolutely huge and you’ve got to divide it up into different systems just to be able to work on it.
Zoe:
Although also with the caveat, I know I’ve worked on many software projects where when they were commissioned, everyone thought they were going to be around for six months and then in reality… [they were around considerably longer]. So with that caveat!
This has been an absolutely fascinating chat. Thank you so much, Sarah. As soon as your book is out, we will share that because I, for one, am certainly very much looking forward to reading it.
Sarah:
Excellent. Thank you.
ENDS