Fulltext analyzer not set in cluster installation

We are overriding the analyzer for fulltext indexes to use the simple analyzer. This works on a local single-node installation but not on our cluster installation, where we can see that a query using English stop words does not yield any matches.

I create the index with CALL db.index.fulltext.createNodeIndex("test",["Company"],["name"],{analyzer:"simple"})

CALL db.index.fulltext.listAvailableAnalyzers lists the analyzer.

In the fulltext-index.properties created, we see that the index uses the standard analyzer:

analyzer=standard

In the local installation, we get:

analyzer=simple

I've tried to re-create the index with several of the other available analyzers as well, but it always selects standard according to the properties file.

We are running on enterprise version 3.5.3. We saw the same problem on 3.5.0 with a custom analyzer, but upgrading (and removing our custom analyzer, since the simple analyzer is now included out of the box) and then re-creating the index did not solve the problem.

The debug log does not contain any information about the analyzer:

2019-03-04 13:55:08.404+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index population started: [NODE:Company(name) [provider: {key=fulltext, version=1.0}]]
2019-03-04 13:55:16.919+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Completed node store scan. Flushing all pending updates.
BatchingMultipleIndexPopulator{activeTasks=0, executor=java.util.concurrent.ThreadPoolExecutor@385bcd3a[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1], batchedUpdates = [0 updates], queuedUpdates = 0}
2019-03-04 13:55:16.951+0000 INFO [o.n.k.i.a.i.IndexPopulationJob] Index creation finished. Index [NODE:Company(name) [provider: {key=fulltext, version=1.0}]] is ONLINE.

Could this be a bug when running in a cluster?

This is a known limitation of the fulltext indexes in 3.5. See the NOTE on https://neo4j.com/docs/cypher-manual/3.5/schema/index/#schema-index-fulltext-create-and-configure

The documentation on the procedures themselves also warns about this.

If you want analyzer configurations to be consistent in a cluster, you need to configure them in neo4j.conf.
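
As a rough sketch of what that could look like, assuming the 3.5 setting name dbms.index.fulltext.default_analyzer and that the simple analyzer is the one wanted for every fulltext index, the same line would go into neo4j.conf on each cluster member:

```
# neo4j.conf, identical on every cluster member
# Assumption: "simple" is the analyzer wanted for all fulltext indexes
dbms.index.fulltext.default_analyzer=simple
```

Each member would presumably need a restart to pick up the change, and an index created before the change would likely have to be dropped and re-created, since the analyzer is written to fulltext-index.properties when the index is created.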

Thank you Chris, this solved our problem. I am not sure how I managed to miss that note. Fortunately, we can currently use the same analyzer for all fulltext indexes.

Do you have any plans to support overriding the default fulltext analyzer in a cluster soon? It may save us from indexing some data in Elasticsearch in the future.

I don't know about "soon". It will be cluster-safe in 4.0. It's a consequence of architectural limitations that can't be fixed in 3.5. (I have actually been working on this exact problem for the past 3-odd months already.)

Thanks for the update and good luck with 4.0!