Global Cloud-Native Platforms: Faster TTM, Reduce Operational Burden & Grow your Business!
I’d like to share my story of how we scaled an idea from a meeting room in Boston, all backed by a serverless-first mindset:
To a POT (Proof of Technology);
To a single country platform;
To a regional platform;
To a global platform backing a multi-billion-dollar business across multiple LOBs (Lines of Business)!
Let me paint a picture:
Technology Approach Explained
- Cloud Native — Cloud-native is an approach to building and running applications that exploit the advantages of the cloud computing delivery model. Cloud-native is about how applications are created and deployed, not where.
- IAC (Infrastructure As Code) — I’m privileged to work with a strong team of engineers who were firmly committed to driving towards an IAC approach, coupled with a multi-repo approach. This allows us to build once and deploy to any of our regional AWS accounts: all we need to do is supply the region-specific properties to our platform deployment build.
- Multi Tenancy — We design at a container level, with tenants separated into their own ECS clusters and parameterised commodities (like functions, DynamoDB and other assets). This allows you to isolate environments within a VPC. We first realised we needed this when we wanted a separate stack for evolving our architecture.
- Serverless-first mindset — To quote the AWS CTO, Werner Vogels: “No server is easier to manage than no server.” What does this mean? It means you can focus on the core features you want to build, instead of worrying about managing and operating servers, databases, or storage systems. Once you have developed patterns and expertise in this space, you can move really quickly. It doesn’t mean that you have no operational concerns, just that there are fewer of them and they are abstracted to a different level. Let me give you some of our working examples:
- In my programme we needed a local logging stack in EMEA due to local compliance rules around logging. We assessed existing org SaaS offerings, but data residency remained a key blocker. So my team experimented with a custom logging stack. We quickly spun up a suite of containers to provision the ELK (Elasticsearch, Logstash, Kibana) stack. We realised very quickly that we would need to manage this infrastructure ourselves, including maintaining EBS (cloud disk storage) volumes, containers, clusters, etc. Our team wanted to prioritise other work, so we moved to a stack using Cognito (to manage federated access), functions and ELK as a service to reduce TCO and operational burden. To me, this is a serverless-first approach.
- Our team also provisioned clusters for running compute, and we saw overhead in having to maintain the AMIs backing those clusters. So we added a feature to our templates that lets us switch from managed EC2 to AWS Fargate. This gave us the flexibility to fall back to EC2 where more granular compute options were needed, while also providing a pathway for AWS to manage our compute, reducing the operational burden on our teams.
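To make the "build once, deploy anywhere" idea and the Fargate/EC2 switch a little more concrete, here is a minimal Python sketch. It is an illustration under assumptions rather than the real templates: the property names, the region entries, the service name and the `render_deployment` helper are all hypothetical.

```python
# Hypothetical shared defaults: the part of the platform definition that is
# built once and reused for every regional account.
BASE_PROPERTIES = {
    "service_name": "policy-api",   # assumed name, for illustration only
    "launch_type": "FARGATE",       # prefer AWS-managed compute by default
    "desired_count": 2,
}

# Hypothetical per-region overrides: the only input that changes per account.
REGION_OVERRIDES = {
    "eu-west-1": {"account_alias": "emea"},
    "us-east-1": {
        "account_alias": "na",
        "launch_type": "EC2",        # fall back where finer compute control is needed
        "instance_type": "c5.xlarge",
    },
}

VALID_LAUNCH_TYPES = {"FARGATE", "EC2"}


def render_deployment(region: str, tenant: str) -> dict:
    """Merge shared defaults with region overrides for one tenant stack."""
    if region not in REGION_OVERRIDES:
        raise KeyError(f"no deployment properties defined for region {region!r}")

    # Region-specific values win over the shared defaults.
    props = {**BASE_PROPERTIES, **REGION_OVERRIDES[region]}

    launch_type = props["launch_type"]
    if launch_type not in VALID_LAUNCH_TYPES:
        raise ValueError(f"unsupported launch type {launch_type!r}")
    if launch_type == "FARGATE":
        # Instance sizing is AWS's concern on Fargate, so drop it if present.
        props.pop("instance_type", None)
    elif "instance_type" not in props:
        raise ValueError("EC2 launch type requires an instance_type")

    # Each tenant gets its own cluster name, keeping stacks isolated.
    props["region"] = region
    props["tenant"] = tenant
    props["cluster_name"] = f"{tenant}-{props['account_alias']}-cluster"
    return props


emea = render_deployment("eu-west-1", tenant="evolve")
print(emea["cluster_name"], emea["launch_type"])
```

The point of the design is that the defaults never change per region; a new region is one more entry in the overrides, and swapping compute back to EC2 is a single property rather than a new template.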
What is our platform?
We are shipping Insurance as a Service wholly based in the public cloud, using containerised products, AWS managed services, third-party SaaS offerings and internally managed services, backed by a buy-over-build design philosophy to support TTM.
Evolutionary Architecture
We took the time to decompose our stack based on Wardley Mapping (see Simon Wardley). Wardley talks about pushing something from an idea across the spectrum to a commodity (think SaaS and rental). Once we had defined our stack and moved it to IAC, we worked on decomposition and multi-tenancy in parallel. To give you a small flavour of how we have pushed the platform towards a commodity: we migrated from an internally managed Kafka cluster to cloud-native eventing technologies like SNS & SQS, and from Oracle to AWS RDS Postgres.
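One practical detail of moving to SNS & SQS is that each consumer typically gets its own queue subscribed to a topic, and the queue needs an access policy that lets the topic deliver into it. Below is a minimal Python sketch of building that policy document. The ARNs and the `sns_to_sqs_policy` helper are made up for illustration, and this is not necessarily how our own templates express it.

```python
import json


def sns_to_sqs_policy(topic_arn: str, queue_arn: str) -> dict:
    """Build an SQS access policy allowing one SNS topic to send to one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowTopicToSendMessages",
                "Effect": "Allow",
                # The SNS service itself is the caller that writes to the queue.
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict delivery to this specific topic only.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


# Hypothetical ARNs, using a placeholder account ID.
policy = sns_to_sqs_policy(
    topic_arn="arn:aws:sns:eu-west-1:123456789012:policy-events",
    queue_arn="arn:aws:sqs:eu-west-1:123456789012:billing-policy-events",
)
print(json.dumps(policy, indent=2))
```

Compared with running a Kafka cluster, the wiring becomes a small piece of configuration per consumer, which is the kind of shift towards commodity that Wardley describes.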
Global Design & Time to Market
Our original build took us months to provision. We can now ship our platform to a new region in a week, because we have pushed our core stack towards more of a commodity.
It is also critical to know if your value chain is working. Feedback loops are a mixture of daily best practices, automation, and tools. The last thing you want is for your users to be really excited about what you have done, and then be mad when it blows up all over the place.
We have constantly worked with senior stakeholders to showcase our capabilities and how we are evolving, through live demos, videos and presentations.
What is a north star metric and why does it matter?
A north star metric is the key measure of success for the product team in a company. It defines the relationship between the customer problems that the product team is trying to solve and the revenue that the business aims to generate by doing so.
There are essentially 3 types of product games:
- The Attention Game: How much of your customers’ time can you capture in your product?
- The Transaction Game: How many commercial transactions does your user make on your platform?
- The Productivity Game: How many high-value digital tasks can your customer perform in your product?
Working with stakeholders to derive a roadmap and goals through the North Star framework (see Amplitude) was a valuable process for bringing about cultural change and forming a global team. We all bought into a common purpose: reducing operational expenditure through a serverless-first mindset, whilst focusing on new channels to increase NWP (net written premium) through standardised products. We are also lucky to have a very strong business that has accelerated acquisition into our business model by opening new channels like the aggregator. The creation of this platform has started a cultural revolution in our wider company as we look to drive reusable capabilities across the enterprise.
Now, as we continue to scale our BOB (book of business) across multiple LOBs and new channels, we need to keep focusing on decomposition, managing reusable products and supporting access for third parties. More to follow as we continue this journey! The good news is that having a cloud-native platform gives you the flexibility to scale workloads through code, which is amazing when you think that we used to have to manually provision compute in a physical data centre.
You can follow me on: https://mobile.twitter.com/belfast_nerd