Powers of Ten – Part II

Blog post from DataStax

Post Details
Company: DataStax
Author: Stephen Mallette
Word Count: 2,202
Language: English
Summary

This article discusses strategies for bulk loading data into Titan at varying scales, focusing on hundreds of millions and billions of edges using Faunus as the loading tool. It provides a step-by-step guide to loading the DocGraph dataset with approximately 1 million vertices and 154 million edges using a single Hadoop node running in pseudo-distributed mode. The article also demonstrates how to load the Friendster social network dataset, which represents a graph with 117 million vertices and 2.5 billion edges, using a four-node Hadoop cluster. It emphasizes that while there are common strategies for loading data at different scales, the actual approach must be adapted to the specific data and domain.