![]() What NaturalMotion had previously understood but not experienced at scale, was the self-healing of Kubernetes “we saw Amazon EKS self-heal quickly during the Zeus event, and we later discovered that our over-provisioning pods had been set up incorrectly. Amazon EKS managed this within 15 minutes.” Our legacy infrastructure, which was handling 80% of all requests, scaled up and started to respond to requests properly in 45 mins. “What happened was an incredible experience for our team. ![]() Our monitoring data ingestion failed to cope with the very sudden increase in requests to our game servers and for a period of 5 mins, we were riding blind wondering if the game had gone down,” said Siva.įig c – Kubernetes Cluster (handling 20% of all requests) We experienced an increase in traffic 4x our baseline in under a minute. “We were running 20% of our Production traffic through our new Kubernetes cluster. ![]() Highly anticipated by players, the Zeus event was the culmination of a 7 month long ‘Greek Gods’ campaign which saw mythical Gods like Athena, Poseidon, and Hades appear in the game as powerful Titans.ĭuring the five-day event, battles increased by over 200% as players joined to celebrate the addition of the brand-new Titan, Zeus. It took us 8 weeks to go from 0 – 100% of production traffic on our new Amazon EKS cluster.”įig b – simpler and consistent set up by NaturalMotion now have across all Dawn of Titans environments During migration – the Zeus eventĭawn of Titans ran its largest live event at the time. Instead, we introduced our Dev and QA environments to the new cluster first before weighting production traffic between our legacy infrastructure and Amazon EKS. “Running a live game meant that we couldn’t simply bring the game down to introduce a new architecture. Once the decision was made to migrate to Amazon EKS, Siva and his team created a roadmap for moving their live game. AWS also offered Amazon Elastic Kubernetes Service (Amazon EKS), a managed service that handles most of the heavy lifting for us.” explained Siva. We ended up choosing Kubernetes because even though the technology was considered cutting-edge, we felt it was pretty mature. “We evaluated Kubernetes, Mesos, and Docker Swarm. To reduce the burden on the Dawn of Titans development team, it began to investigate ways to simplify and automate their architecture. It introduced an extra step outside of our core development cycle.” Migrating to a simpler life “This allowed us to perform load tests and gameplay at scale, but it didn’t solve everything. That meant everything developed and tested potentially missed issues that were only seen in our production environment.” To resolve this problem, NaturalMotion introduced a staging stack to mirror production. “Our development and QA stacks also weren’t representative of our production set up. We depended heavily on our site reliability team to ensure our production cluster had a reasonable SLA,” said Siva. We couldn’t immediately tell if a drop in concurrent users was the result of an outage on a dependent system that our core services integrated with. “We had insufficient monitoring and alerting capabilities. Until 2018, we containerized workloads to power our game application as well as several other microservices,” but this posed multiple challenges for the team.įig a – NaturalMotion legacy architecture So the question is, how does any developer process that amount of data, let alone develop additional game features? NaturalMotion Engineering Manager, Siva Ganesamoorthy shared, “since launch, Dawn of Titans has relied on Amazon Elastic Compute Cloud (Amazon EC2) to power its core services infrastructure. The success of Dawn of Titans has led NaturalMotion to handle millions of player battles every day. Originally published in 2016, the action strategy game Dawn of Titans continues to maintain a lasting community by bringing console quality graphics and compelling gameplay to mobile audiences. That’s the situation that Zynga subsidiary NaturalMotion found itself in 2019. Game studios are expected to deliver new experiences to players in increasingly tight timeframes, all while juggling maintenance of live games – and with limited engineering resources. ![]() Players get to enjoy their favourite games for longer, and developers can see ongoing success from a single game for years.īut games with longer lifespans can also generate heavy overheads for development teams. Running games as services can have some serious perks.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |