Application Hosting – How PolyHaven Manages 5 Million Page Views and 80TB Traffic a Month for < $400
Polyhaven, a 3D asset library, recently provided insightful data on how they manage to serve 5 million page views a month, consuming a bandwidth of 80 terabytes just under 400 USD.
If the same amount of traffic was served from AWS S3 object storage, that would cost more than 4K USD a month.
So how do they pull this off every month?
Here are the highlights:
3D Asset Storage
The caching layer of the app is powered by Cloudflare. Static asset downloads from the cache account for 85% of the traffic. This cuts down most of the load on the storage server with a cost of just 40 USD a month.
The cache layer is further optimized by a Cloudflare product Argo, this improves the cache hit ratio to 93%.
Argo is an additional Cloudflare service that optimizes the DNS routes increasing the latency. It also adds an additional layer to the cache.
So, if someone downloads a file in London, Cloudflare, besides caching that file for London, will also cache that file for the whole of Western Europe. This increases the cache hit ratio for the application. But this additional feature comes with a cost. 160 USD per month has to be paid for the Polyhaven data that goes through the Argo network. But this is still less in comparison to the database costs that would add up serving 80TB of data.
All the 3D asset data is stored on Backblaze cloud storage. Empowered by a partnership of Cloudflare and Backblaze, Polyhaven just have to pay 11 USD a month for the storage service; this includes the storage fee, data upload costs and the data fetch API requests.
The images on the website that are thumbnails of the assets, renders, previews, etc., are stored using another CDN service Bunny.net. Besides providing hosting, the CDN service also dynamically resizes and compresses the images for the website costing 27 USD per month.
The site is built using Next.js powered by Vercel servers. Vercel manages the site’s servers offering a serverless environment. The base fee for this is 20 USD a month. Also, the Cloudflare cache is leveraged to avoid unnecessary hits to the server.
The 3D assets are stored on the cloud object storage and the website database that is Google Firestore (a serverless document database) stores info on the 3D assets, download logs and so on. This costs them approx. 100 USD a month. Again a cache layer in front of it averts the DB hits.
A separate API powered by Vultr compute connects the frontend to the database and gives them control over caching the database reads.
Running the API on a separate compute service keeps the costs of the Vercel server within limits. The Vultr API costs them 5 USD a month.
Adding up the costs (Caching Cloudflare with Argo: 200 USD, Asset storage Backblaze: 11 USD, Image Storage Bunny.net: 27 USD, Database Firestore: 100 USD, Vultr API: 5 USD) costs them less than 400 USD.
More writeups on real-world architectures here.
Mastering the Fundamentals of the Cloud
If you need to understand the fundamentals of cloud computing in-depth. Check out my platform-agnostic Cloud Computing 101 course. After having spent a decade in the industry writing code, I strongly believe that every software engineer should have knowledge of cloud computing. It’s the present and the future of application development and deployment.
Zero to Mastering Software Architecture Learning Track - Starting from Zero to Designing Web-Scale Distributed Applications Like a Pro. Check it out.
Master system design for your interviews. Check out this blog post written by me.
- System Design: Hone Your System Design Skills By Exploring Real-World Web-Scale System Architectures [Feed Updated Daily]
- Single-threaded Event Loop Architecture for Building Asynchronous, Non-Blocking, Highly Concurrent Real-time Services
- Understanding SLA (Service Level Agreement) In Cloud Services: How Is SLA Calculated In Large-Scale Services?
- Database Architecture – Part 2 – NoSQL DB Architecture with ScyllaDB (Shard Per Core Design)
- Parallel Processing: How Modern Cloud Servers Leverage Different System Architectures to Optimize Parallel Compute
- Database Architecture – A Deep Dive – Part 1