Home / Companies / Acceldata / Blog / Post Details
Content Deep Dive

How LinkedIn Scales Its Analytical Data Platform

Blog post from Acceldata

Post Details
Company
Date Published
Author
Acceldata Product Team
Word Count
2,308
Language
English
Hacker News Points
-
Summary

LinkedIn operates one of the largest analytics platforms in the world, with approximately 20,000 Hadoop nodes storing one exabyte of data. The company's largest data cluster comprises 10,000 nodes storing 500 PB of data including one billion objects (directories, files and blocks), all using the Hadoop Distributed File System (HDFS). LinkedIn has developed various tools to optimize its storage and compute performance, such as Azkaban for workflow job scheduling, Robin for load balancing, and DynoYARN for performance forecasting. The company is currently in the process of migrating its on-premises Hadoop clusters to the Azure public cloud owned by parent Microsoft.