Food delivery startup Deliveroo continues to suffer from regular outages at peak times, despite the inherent flexibility and resilience of the AWS cloud infrastructure its core applications run on.
Major outages were reported in January and again in February, prompting an apology email to affected customers that said: "We've had some problems with our server, which is how we connect your order with the restaurant and the rider who brings it to you.
"This kind of thing can happen to all sorts of businesses, but with hungry people waiting for food, we know it's super important that we don't let you down. We've learnt a lot from the incident that affected you, and we are doing our best to ensure it doesn't happen again."
Deliveroo credited those affected with free delivery on the following two orders, but the voucher code was only valid for a week.
Deliveroo raised a $275 million Series E round from the private equity firm Bridgepoint in August 2016 and is growing at breakneck speed, operating in 120 cities in 12 countries. However, outages are extremely harmful to a company like Deliveroo, especially as it faces fierce competition from UberEats and Just Eat, both of which have had outages of their own.
Deliveroo runs its core applications on AWS cloud infrastructure. AWS is favoured by startups and app developers because of its near-constant uptime and ability to scale according to demand, charging users as they consume.
AWS does suffer outages of its own, which can have a disastrous effect on organisations whose applications rely on its S3 storage service, but these tend to hit a large number of companies at once, not just one application.
Carl Brooks, a cloud infrastructure analyst at 451 Research, believes that these outages are a natural symptom of growing pains for a startup expanding at the pace of Deliveroo. "In the overwhelming number of cases we look at with outages, AWS isn't at fault," he told Computerworld UK.
"With a company like this you have to coordinate a lot of things: riders, orders and a lot of apps over a number of cellular networks and tie that back to this central app infrastructure, which they use AWS for. So you take a complex app and start throwing as many users as they can at it, that is probably why it is breaking."
Brooks believes that these outages fundamentally come down to engineering. "You have to architect for resilience and unpredictability in factors like web latency and things like that. Also, if they want to save money by turning servers off at times of low demand and are suddenly getting slammed with a rush you will see an outage or a slow down.