Since it was started in a dorm room in 2003, Squarespace has made it simple for millions of people to create their own websites.
Behind the scenes, though, the company’s monolithic Java application was making things not so simple for its developers to keep improving the platform. So in 2014, the company decided to “go down the microservices path,” says Kevin Lynch, staff engineer on Squarespace’s Site Reliability team. “But we were always deploying our applications in vCenter VMware VMs [in our own data centers]. Microservices solved a problem on the development side, but it pushed that problem to the Infrastructure team. The infrastructure deployment process on our 5,000 VM hosts was slowing everyone down.”
Challenge
After experimenting with another container orchestration platform and “breaking it in very painful ways,” Lynch says, the team began experimenting with Kubernetes in mid-2016 and found that it “answered all the questions that we had.” Deploying it in the data center rather than the public cloud was their biggest challenge, and at the time, not a lot of other companies were doing that. “We had to figure out how to deploy this in our infrastructure for ourselves, and we had to integrate it with our other applications,” says Lynch.
At the same time, Squarespace’s Network Engineering team was modernizing its networking stack, switching from a traditional layer-two network to a layer-three spine-and-leaf network. “It mapped beautifully with what we wanted to do with Kubernetes,” says Lynch. “It gives us the ability to have our servers communicate directly with the top-of-rack switches. We use Calico for CNI networking for Kubernetes, so we can announce all these individual Kubernetes pod IP addresses and have them integrate seamlessly with our other services that are still provisioned in the VMs.”
Solution
Within a couple of months, they had a stable cluster for their internal use, and began rolling out Kubernetes for production. They also added Zipkin and CNCF projects Prometheus and fluentd to their cloud native stack. “We switched to Kubernetes, a new world, and we revamped all our other tooling as well,” says Lynch. “It allowed us to streamline our process, so we can now easily create an entire microservice project from templates, generate the code and deployment pipeline for that, generate the Dockerfile, and then immediately just ship a workable, deployable project to Kubernetes.” Deployments across Dev/QA/Stage/Prod were also “simplified drastically,” Lynch adds. “Now there is little configuration variation.”
And the whole process takes only five minutes, an almost 85% reduction in time compared to their VM deployment. “From end to end that probably took half an hour, and that’s not accounting for the fact that an infrastructure engineer would be responsible for doing that, so there’s some business delay in there as well.”
Impact
With faster deployments, “productivity time is the big cost saver,” says Lynch. “We had a team that was implementing a new file storage service, and they just started integrating that with our storage back end without our involvement”—which wouldn’t have been possible before Kubernetes. He adds: “When we started the Kubernetes project, we had probably a dozen microservices. Today there are twice that in the pipeline being actively worked on.”
There’s also been a positive impact on the application’s resilience. “When we’re deploying VMs, we have to build tooling to ensure that a service is spread across racks appropriately and can withstand failure,” he says. “Kubernetes just does it. If a node goes down, it’s rescheduled immediately and there’s no performance impact.”
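To make the rack-spreading concern concrete, here is a minimal Python sketch of the kind of check that hand-built VM tooling would have to perform. It is not Squarespace's tooling or the Kubernetes scheduler; the function names, the round-robin placement, and the rack labels are all assumptions made for illustration.

```python
from collections import Counter


def spread_across_racks(replica_count, racks):
    """Hypothetical helper: assign replicas to racks round-robin.

    Returns a dict mapping each rack name to the number of replicas on it.
    """
    placement = Counter({rack: 0 for rack in racks})
    for i in range(replica_count):
        placement[racks[i % len(racks)]] += 1
    return dict(placement)


def survives_rack_failure(placement, min_available):
    """Hypothetical check: does losing any single rack still leave
    at least `min_available` replicas running?"""
    total = sum(placement.values())
    return all(total - count >= min_available for count in placement.values())


placement = spread_across_racks(5, ["rack-a", "rack-b", "rack-c"])
print(placement)
print(survives_rack_failure(placement, min_available=3))
```

In Kubernetes this logic does not have to be written by the service owner: the scheduler places pods across nodes, and a replacement pod is created when a node is lost.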
Another big benefit is autoscaling. “It wasn’t really possible with the way we’ve been using VMware,” says Lynch, “but now we can just add the appropriate autoscaling features via Kubernetes directly, and boom, it’s scaling up as demand increases. And it worked out of the box.”
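As a rough illustration of what “scaling up as demand increases” means in practice, the sketch below reproduces the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The article does not say which autoscaling features Squarespace uses, so the function name, the metric values, and the replica bounds here are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA scaling rule; bounds and values are illustrative."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, desired))


# Average CPU at 90% against a 60% target with 4 replicas: scale up.
print(desired_replicas(4, 90, 60))
# Average CPU at 30% against a 60% target with 4 replicas: scale down.
print(desired_replicas(4, 30, 60))
```

The real controller also accounts for pods that are not yet ready and for missing metrics, which this sketch leaves out.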
For others starting out with Kubernetes, Lynch says his best advice is to “fail fast”: “Once you’ve planned things out, just execute. Kubernetes has been really great for trying something out quickly and seeing if it works or not.”