
At Slack, we've gone through an evolution of our AWS infrastructure, from the early days of running a few hand-built EC2 instances all the way to provisioning thousands of EC2 instances across multiple AWS regions, using the latest AWS services to build reliable and scalable infrastructure. One of the pain points inherited from the early days caused us to spin up a brand-new network architecture redesign project, called Whitecastle. In this post, we'll discuss our design decisions and the technological choices we made along the way.

In recent times, Slack has become somewhat of a household name, though its origin is more humble: the messaging service began as the internal communication tool when Slack was Tiny Speck, a game company building Glitch. Once they realised the potential of this tool, it was officially launched as Slack in 2014. It was simpler back then, and there were only a few customers using it. As time went by, Slack started to evolve, and so did its infrastructure, allowing us to scale from tens of users to millions.

If you ever wondered what the infrastructure behind the original Slack looked like, it was all in a single AWS account. There were very few AWS instances in it, and they were all built manually. Eventually the manual processes were replaced by scripting, and then by a Chef configuration management system. As our customer base grew and the tool evolved, we developed more services and built more infrastructure as needed. Terraform was also introduced to manage our AWS infrastructure. However, everything we built still lived in one big AWS account.

Having all our infrastructure in a single AWS account led to AWS rate-limiting issues, cost-separation issues, and general confusion for our internal engineering service teams. To overcome this hurdle, we introduced the concept of child accounts. Now the service teams could request their own AWS accounts and could even peer their VPCs with each other when services needed to talk to other services that lived in a different AWS account.

This was great for a while, but as a company we continued to grow. Having hundreds of AWS accounts became a nightmare to manage when it came to CIDR ranges and IP spaces, because mismanagement of CIDR ranges meant that we couldn't peer VPCs with overlapping ranges. This led to a lot of administrative overhead, and the Slack Cloud Engineering team had to devise ways to simplify this problem and make our lives easier.

VPC sharing was a feature AWS introduced at the end of 2018, allowing one AWS account to create VPCs and subnets, and then share those subnets with other AWS accounts.
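As a rough illustration of how subnet sharing works, it is set up through AWS Resource Access Manager (RAM). A minimal Terraform sketch might look like the following; the share name, CIDR ranges, and account ID are placeholders for illustration, not our actual configuration:

```hcl
# Owner account: create the VPC and a subnet to be shared.
resource "aws_vpc" "shared" {
  cidr_block = "10.0.0.0/16" # example range only
}

resource "aws_subnet" "shared" {
  vpc_id     = aws_vpc.shared.id
  cidr_block = "10.0.1.0/24"
}

# A RAM resource share to hold the subnet.
resource "aws_ram_resource_share" "vpc_share" {
  name                      = "example-vpc-share"
  allow_external_principals = false # share only within the AWS Organization
}

# Attach the subnet to the share.
resource "aws_ram_resource_association" "subnet" {
  resource_arn       = aws_subnet.shared.arn
  resource_share_arn = aws_ram_resource_share.vpc_share.arn
}

# Grant a participant account access to the shared subnet.
resource "aws_ram_principal_association" "participant" {
  principal          = "111111111111" # placeholder account ID
  resource_share_arn = aws_ram_resource_share.vpc_share.arn
}
```

The participant account can then launch resources directly into the shared subnet without owning the VPC, which sidesteps both peering and overlapping-CIDR concerns.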
