Evernote has over 200 million happy customers with approx. 5 billion user notes and 5 billion user-uploaded attachments. That’s over 3.5 petabytes of data. To put things into perspective, this is equivalent to roughly 10 copies of every book ever published.

They recently migrated their workload to Google Cloud Platform (GCP). The entire workload was moved to the cloud in just 70 days, backed by careful planning and execution.

The present Evernote cloud infrastructure is more scalable, elastic and secure than the on-prem one they had originally. Moving the workload to the cloud enabled Evernote engineering to focus on innovation and launching new features instead of managing the underlying infrastructure.



Reason for the Migration

Evernote Engineering originally managed its own on-prem servers and network. But as the business grew more popular, scaling, maintaining and upgrading the infrastructure became challenging.

Dynamically scaling the on-prem infrastructure wasn’t as smooth as it is on the cloud. Infrastructure maintenance was eating up resources that could be better spent innovating and developing new features, that is, focusing on what the customers wanted.

Migrating to the cloud and having the infrastructure managed by Google enabled them to move fast.


Core Modules of Evernote’s Service

Let’s have a look at the core modules of Evernote’s service:

USER NOTE STORE

This service module stores the user’s notes. It is built on shards, and every shard can support over 300 thousand users.

A shard comprises:

  • A client-facing front-end service that runs on Tomcat
  • A relational MySQL-based data storage layer
  • A Lucene search service that indexes user-generated content.

There are approx. 762 of these shards, which together serve the entire customer base of Evernote.
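As a quick sanity check on these figures, here is a back-of-the-envelope calculation. It is only a sketch: it assumes "over 300 thousand" means exactly 300,000 users per shard and "over 200 million" means exactly 200 million users, and the function names are made up for illustration.

```python
import math

USERS_PER_SHARD = 300_000    # assumption: lower bound of "over 300 thousand"
TOTAL_USERS = 200_000_000    # assumption: lower bound of "over 200 million"
SHARD_COUNT = 762            # approximate figure quoted in the article


def min_shards_needed(total_users: int, users_per_shard: int) -> int:
    """Smallest number of shards that can hold every user."""
    return math.ceil(total_users / users_per_shard)


def total_capacity(shard_count: int, users_per_shard: int) -> int:
    """Number of users the whole shard fleet can hold."""
    return shard_count * users_per_shard


if __name__ == "__main__":
    needed = min_shards_needed(TOTAL_USERS, USERS_PER_SHARD)
    capacity = total_capacity(SHARD_COUNT, USERS_PER_SHARD)
    print(f"Shards needed at minimum: {needed}")
    print(f"Capacity of {SHARD_COUNT} shards: {capacity:,} users")
    print(f"Headroom: {capacity - TOTAL_USERS:,} users")
```

Under these assumptions, the quoted shard count leaves some headroom above the stated user base, which is consistent with both numbers being reported as "over" a round figure.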

ATTACHMENT STORAGE

This is a file storage layer that separately stores all the attachments that users upload. This layer alone comprises 206 servers. Every uploaded attachment is stored in two locations: one copy locally and the other at a remote disaster recovery data center.
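The two-location write can be pictured with a small sketch. This is not Evernote's implementation: the function name `store_attachment` is hypothetical, and two local directories stand in for the local storage server and the remote disaster recovery site, which in reality would involve a network transfer.

```python
import hashlib
import tempfile
from pathlib import Path


def store_attachment(name: str, data: bytes, local_root: Path, dr_root: Path) -> str:
    """Write one attachment to both locations and return its SHA-256 digest.

    local_root and dr_root are stand-ins (assumptions) for the local store
    and the remote disaster recovery data center.
    """
    digest = hashlib.sha256(data).hexdigest()

    # Write a copy to each location.
    for root in (local_root, dr_root):
        root.mkdir(parents=True, exist_ok=True)
        (root / name).write_bytes(data)

    # Read both copies back and confirm they match what was uploaded.
    for root in (local_root, dr_root):
        stored = (root / name).read_bytes()
        if hashlib.sha256(stored).hexdigest() != digest:
            raise IOError(f"Copy of {name} under {root} is corrupted")

    return digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        digest = store_attachment("photo.jpg", b"fake image bytes",
                                  base / "local", base / "dr")
        print("Stored in two locations, sha256 =", digest)
```

Verifying both copies before reporting success is a design choice of this sketch; the article does not say how Evernote confirmed that the two copies were in sync.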

USER DATA STORE

This is a central user database that holds user information and handles concerns such as authentication.

Besides these, there are quite a number of additional services such as handwriting and text recognition, caching, batch processing, etc. All of these are powered by an additional 200 Linux servers.


Migration Approach

Moving a workload with petabytes of data from an on-prem data center to the cloud is no easy feat. It requires a lot of care, focus and strategic planning. The entire migration was pulled off in 70 days.
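To get a feel for the scale, here is a rough estimate of the average transfer rate these numbers imply. It is a sketch under loud assumptions: decimal petabytes (1 PB = 10^15 bytes), exactly 3.5 PB of data, and the copy spread evenly across all 70 days. The article does not say how long the data transfer itself took, so the real rate may well have been higher.

```python
PETABYTE = 10**15            # assumption: decimal petabytes
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

TOTAL_BYTES = 3.5 * PETABYTE  # assumption: lower bound of "over 3.5 petabytes"
MIGRATION_DAYS = 70           # assumption: data copied evenly over the whole window


def terabytes_per_day(total_bytes: float, days: int) -> float:
    """Average volume to move per day, in terabytes."""
    return total_bytes / days / TERABYTE


def gigabits_per_second(total_bytes: float, days: int) -> float:
    """Average sustained network rate, in gigabits per second."""
    bytes_per_second = total_bytes / (days * SECONDS_PER_DAY)
    return bytes_per_second * 8 / 10**9


if __name__ == "__main__":
    print(f"{terabytes_per_day(TOTAL_BYTES, MIGRATION_DAYS):.0f} TB per day")
    print(f"{gigabits_per_second(TOTAL_BYTES, MIGRATION_DAYS):.2f} Gbit/s sustained")
```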

The migration followed a divide and conquer approach, split into several phases. Small chunks of the workload were migrated to the cloud, then validated and tested before moving on.
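The migrate-then-validate loop can be sketched in a few lines. This is a minimal illustration, not Evernote's tooling: plain dictionaries stand in for the on-prem source and the cloud target, and the names `migrate_in_phases` and `chunk_size` are hypothetical.

```python
import hashlib
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def checksum(data: bytes) -> str:
    """SHA-256 digest used to compare source and target copies."""
    return hashlib.sha256(data).hexdigest()


def batches(keys: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield the keys in consecutive groups of at most `size`."""
    iterator = iter(keys)
    while batch := list(islice(iterator, size)):
        yield batch


def migrate_in_phases(source: Dict[str, bytes], target: Dict[str, bytes],
                      chunk_size: int) -> int:
    """Copy `source` into `target` one chunk at a time.

    Each chunk is validated before the next one starts, so a problem is
    caught while only a small part of the workload is affected.
    Returns the number of phases (chunks) completed.
    """
    phases = 0
    for batch in batches(sorted(source), chunk_size):
        # Phase step 1: migrate this chunk.
        for key in batch:
            target[key] = source[key]
        # Phase step 2: validate this chunk before moving on.
        for key in batch:
            if checksum(target[key]) != checksum(source[key]):
                raise RuntimeError(f"Validation failed for {key}")
        phases += 1
    return phases


if __name__ == "__main__":
    on_prem = {f"note-{i}": f"content {i}".encode() for i in range(5)}
    cloud: Dict[str, bytes] = {}
    print("Phases completed:", migrate_in_phases(on_prem, cloud, chunk_size=2))
```

Validating per chunk rather than once at the end is the point of the approach: a failure stops the run early and leaves the already validated chunks intact.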

A couple of services, such as the User Store and the Note Store, ran on Linux servers and had a direct counterpart on Google Cloud, so they were easy to move.

These services ran on physical Linux servers in the on-prem data center and were migrated to Linux virtual machines on Google Cloud.

The rest of the modules, such as the User Attachment Store, the Recognition service, etc., required significant modification before they could be moved to the cloud.

The entire service migration and movement of data went through a thorough security audit. Data privacy and infrastructure security were both accounted for.

Having evolved from a monolith into loosely coupled microservices, the application now leverages container technology and serverless functions.

The application’s uptime and performance are monitored with Datadog, which gives better visibility across the workload. It enables the devs to create custom dashboards that further improve visibility into workload stats and metrics from Google Compute Engine, Kubernetes and App Engine.

Datadog is a Google Cloud partner. It’s a monitoring service for applications powered by the cloud.

This write-up is a gist of a five-part blog series published by Evernote about their migration to GCP.

Related read: Twitter’s migration to Google Cloud

Well, Folks, this is pretty much it. If you liked the article, do share it with your network for better reach.



I’ll see you in the next write-up.
Until then
Cheers!