
Text2Cypher Across Languages: Evaluating Foundational Models Beyond English

Blog post from Neo4j

Post Details
Company: Neo4j
Author: Makbule Gulcin Ozsoy
Word Count: 1,242
Language: English
Summary

The blog post evaluates large language models (LLMs) on the Text2Cypher task, which converts natural language questions into Cypher queries for Neo4j graph databases, and focuses on multilingual performance across English, Spanish, and Turkish. The authors released a multilingual test set and analyzed model performance on it. LLMs performed best in English, followed by Spanish and then Turkish, a gap the authors attribute to differences in available language resources and in linguistic similarity to English. Translating the prompts had minimal impact on performance. Schema elements remained in English throughout, so the authors suggest that future research explore fully localized setups and language-specific tuning to improve cross-lingual query generation. The findings aim to promote broader research in structured query generation and to contribute to the multilingual capabilities of LLMs.
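The setup described above, in which the question and optionally the instructions are in the target language while the schema stays in English, can be sketched as a small prompt builder. This is an illustrative sketch only: the schema, the instruction wording, the example questions, and the `build_prompt` helper are all assumptions made for this example and are not taken from the released test set or from the post's actual prompts.

```python
# Hypothetical graph schema, kept in English for every language
# (mirrors the post's setup; the schema itself is invented for illustration).
SCHEMA = "(:Person {name: STRING})-[:ACTED_IN]->(:Movie {title: STRING, released: INTEGER})"

# Hypothetical instruction text per language.
INSTRUCTIONS = {
    "en": "Generate a Cypher query for the question, using only the schema below.",
    "es": "Genera una consulta Cypher para la pregunta, usando solo el esquema siguiente.",
    "tr": "Aşağıdaki şemayı kullanarak soru için bir Cypher sorgusu oluştur.",
}

# Hypothetical parallel questions in the three evaluated languages.
QUESTIONS = {
    "en": "Which movies did Tom Hanks act in?",
    "es": "¿En qué películas actuó Tom Hanks?",
    "tr": "Tom Hanks hangi filmlerde oynadı?",
}


def build_prompt(question: str, lang: str, translate_instructions: bool = False) -> str:
    """Assemble a Text2Cypher prompt.

    The question is passed through in its own language. The instructions are
    in English unless translate_instructions is True. The schema is always
    the same English string.
    """
    instruction = INSTRUCTIONS[lang] if translate_instructions else INSTRUCTIONS["en"]
    return f"{instruction}\n\nSchema:\n{SCHEMA}\n\nQuestion:\n{question}\n\nCypher:"


if __name__ == "__main__":
    # Print one prompt per language with translated instructions.
    for lang, question in QUESTIONS.items():
        print(build_prompt(question, lang, translate_instructions=True))
        print("-" * 40)
```

Toggling `translate_instructions` gives the two prompt variants the summary compares, while a fully localized setup of the kind proposed as future work would also require translating the labels and property names inside `SCHEMA`.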