How to troubleshoot the Elastic App Search web crawler
Blog post from Elastic
Elastic App Search's new web crawler simplifies ingesting publicly available web content and making it searchable on a website, but problems can arise when pages are not indexed as expected. Setup involves deploying App Search, creating an engine, and configuring the web crawler with crawl rules that target the desired content. Misconfigured rules, in particular allow and disallow rules listed in the wrong order, can result in no documents being indexed at all. To troubleshoot, the article uses Kibana to inspect the crawler's detailed logs and find the underlying error, such as the rule_engine_denied message produced by improperly ordered rules. Searching the crawler logs in the Logs app, which is backed by Elasticsearch, pinpoints which URLs were denied, and reordering the rules then lets the crawler index the intended pages. The article closes by encouraging readers to try the web crawler themselves with a free trial of Elastic App Search.
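
To make the rule-ordering pitfall concrete, here is a minimal Python sketch of how an ordered rule list can deny every URL. It is a simplified model, not the crawler's actual implementation: it assumes rules are checked from top to bottom and that the first matching rule decides the outcome. The CrawlRule class, the evaluate function, the rule kinds, and the example paths are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlRule:
    """Hypothetical model of one crawl rule (not an App Search API object)."""
    policy: str   # "allow" or "disallow"
    kind: str     # "begins", "ends", "contains", or "regex"
    pattern: str

    def matches(self, path: str) -> bool:
        if self.kind == "begins":
            return path.startswith(self.pattern)
        if self.kind == "ends":
            return path.endswith(self.pattern)
        if self.kind == "contains":
            return self.pattern in path
        if self.kind == "regex":
            return re.search(self.pattern, path) is not None
        raise ValueError(f"unknown rule kind: {self.kind}")


def evaluate(rules: list[CrawlRule], path: str, default: str = "allow") -> str:
    """Return the policy of the first rule that matches the path.

    Assumption: first match wins, and unmatched paths fall back to `default`.
    """
    for rule in rules:
        if rule.matches(path):
            return rule.policy
    return default


# Misordered: the catch-all disallow rule comes first, so it shadows the
# allow rule below it and every path is denied.
misordered = [
    CrawlRule("disallow", "regex", ".*"),
    CrawlRule("allow", "begins", "/blog"),
]

# Reordered: the specific allow rule is checked before the catch-all.
reordered = [
    CrawlRule("allow", "begins", "/blog"),
    CrawlRule("disallow", "regex", ".*"),
]

for label, rules in (("misordered", misordered), ("reordered", reordered)):
    for path in ("/blog/post-1", "/about"):
        print(f"{label:10s} {path:14s} -> {evaluate(rules, path)}")
```

Under these assumptions, the misordered list denies /blog/post-1 even though an allow rule for /blog exists, which is the kind of situation the article associates with the rule_engine_denied message; moving the allow rule above the catch-all lets that path through while still denying everything else.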