Turn a Harry Potter Book into a Knowledge Graph

Blog post from Neo4j

Post Details
Company: Neo4j
Author: Tomaž Bratanič
Word Count: 2,065
Language: English
Summary

This post describes building a knowledge graph from the book "Harry Potter and the Philosopher's Stone" using Neo4j, spaCy, and Selenium. The author scraped the list of characters from the book's Fandom page and preprocessed the text to resolve co-references. The author then used spaCy's rule-based pattern matching to extract character mentions, giving priority to longer, multi-word matches to work around ambiguous single-word matches and to help disambiguate characters. The extracted interactions between characters were stored in a Neo4j graph database and visualized to inspect the results. The author concludes that the approach worked well, but notes that entity disambiguation might need fine-tuning for subsequent books.
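
The idea of preferring longer matches can be illustrated with a minimal sketch in plain Python. It is not the author's code and does not use spaCy's matcher. The alias table, the `Match` type, and the helper names `find_matches` and `keep_longest` are all hypothetical, chosen only for illustration. The sketch finds every alias occurrence in a token list and then discards any match that overlaps a longer one, so that "Lily Potter" is not also counted as a mention of Harry through the bare surname "Potter".

```python
from typing import NamedTuple


class Match(NamedTuple):
    start: int      # index of the first token in the match
    end: int        # index one past the last token
    character: str  # canonical character name the alias maps to


def find_matches(tokens: list[str], aliases: dict[tuple[str, ...], str]) -> list[Match]:
    """Return every alias occurrence, including overlapping ones."""
    matches = []
    for alias, character in aliases.items():
        n = len(alias)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == alias:
                matches.append(Match(i, i + n, character))
    return matches


def keep_longest(matches: list[Match]) -> list[Match]:
    """Drop any match that overlaps a longer one; return the rest in text order."""
    taken: set[int] = set()  # token positions already claimed by a kept match
    kept = []
    # Visit the longest matches first; ties are broken by earliest start.
    for m in sorted(matches, key=lambda m: (-(m.end - m.start), m.start)):
        if taken.isdisjoint(range(m.start, m.end)):
            kept.append(m)
            taken.update(range(m.start, m.end))
    return sorted(kept)


# Hypothetical alias table (assumption, not taken from the post).
ALIASES = {
    ("Harry", "Potter"): "Harry Potter",
    ("Harry",): "Harry Potter",
    ("Lily", "Potter"): "Lily Potter",
    ("Potter",): "Harry Potter",
    ("Hagrid",): "Rubeus Hagrid",
}

tokens = "Hagrid told Harry about Lily Potter".split()
entities = keep_longest(find_matches(tokens, ALIASES))
for m in entities:
    print(m.character, tokens[m.start:m.end])
```

In a spaCy pipeline the same filtering step could likely be delegated to `spacy.util.filter_spans`, which also prefers the longest span among overlapping candidates. The resolved character mentions would then feed whatever interaction logic populates the graph.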