Saving an actual search and its results within graph technique

PeteM · December 12, 2020, 7:05am

Hi, so we are developing a new web application and utilising Neo4j to build reports on data.

Here is a use case scenario:

1. User enters some search parameters into a web form (selecting gender, age, etc.)
2. A query is built and run against the database
3. The results are presented to the user
4. The user now wants to save this search and its results
..
5. Later on, the user wants to run a further search against those saved search results

I can think of a couple of different ways of achieving point 4, but I wondered if the more experienced here might be able to offer better solutions or whether there is an efficient way of doing it?

One crude technique:

Save the search query as a node and relate it to the user node in graph

(u:User)-[:MADE_SEARCH]->(s:Search)

Create a relationship between that search node and the result nodes it found

(s:Search)-[:RESULT]->(c:Campaign)
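
As a rough sketch of those two steps, the Python below only builds a parameterised Cypher statement and splits the result IDs into batches, so that a search with millions of hits would not have to be written in a single transaction. The property names (`id`, `query`), the batch size and the helper names are assumptions for illustration, not something we have settled on. Running each parameter set through a Neo4j driver is deliberately left out.

```python
from typing import Any, Dict, Iterator, List

# Assumed property names: User.id, Search.id, Search.query, Campaign.id.
SAVE_SEARCH = """
MATCH (u:User {id: $user_id})
MERGE (s:Search {id: $search_id})
  ON CREATE SET s.query = $query
MERGE (u)-[:MADE_SEARCH]->(s)
WITH s
UNWIND $campaign_ids AS cid
MATCH (c:Campaign {id: cid})
MERGE (s)-[:RESULT]->(c)
"""


def batches(ids: List[int], size: int = 10_000) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def save_search_params(
    user_id: int,
    search_id: str,
    query: str,
    campaign_ids: List[int],
    size: int = 10_000,
) -> List[Dict[str, Any]]:
    """Build one parameter dict per batch for SAVE_SEARCH.

    An empty result set still yields one dict, so the Search node
    itself would be created even when nothing matched.
    """
    chunks = list(batches(campaign_ids, size)) or [[]]
    return [
        {
            "user_id": user_id,
            "search_id": search_id,
            "query": query,
            "campaign_ids": chunk,
        }
        for chunk in chunks
    ]
```

Each dict would be passed with `SAVE_SEARCH` in its own transaction. Because every clause uses `MERGE`, re-running a batch should not duplicate the search node or its relationships.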

However, the database is reasonably big (85m nodes, 600m+ relationships), and there will be many different searches run per day, each returning anywhere from a couple of hundred to potentially a few million results.

This would end up with a lot of relationships being created from the search node to results. If we then wanted to actually perform a search against those results, is having that many relationships efficient?

Am I just worrying over nothing, and is this an OK solution?

I am worried that, say, after 1 year of running this web app, there will be thousands of saved searches, with millions of those search-campaign relationships.

Thanks

clem · December 12, 2020, 8:10pm

Are there any properties associated with the Nodes or Relationships that can be indexed to limit the scope of the query? E.g. Date/Time, Info about the User, the Campaign, etc.

If so, it might not be so bad.

PeteM · December 13, 2020, 6:31am

Yes, hadn't considered that, but there would be some attributes that would be indexed.

First the User node will have a user ID.
The Search node will have a unique search ID as well. So, in reality when searching against those search results, we would match by search ID, to find the records related to it.
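
To make that concrete, a query against a saved result set could anchor on the Search node by its ID and only then filter the campaigns attached to it. The sketch below builds such a statement in Python. The helper name and the `status` and `region` properties in the usage line are hypothetical, since I have not described the Campaign properties here. Property names are validated before being interpolated, and all values go in as parameters.

```python
import re
from typing import Any, Dict, Tuple

# Only plain identifiers may be interpolated as property names.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def search_within(search_id: str, filters: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (cypher, params) for filtering the results of one saved search."""
    params: Dict[str, Any] = {"search_id": search_id}
    clauses = []
    # Sort the filters so the generated statement is deterministic.
    for i, (prop, value) in enumerate(sorted(filters.items())):
        if not _IDENT.fullmatch(prop):
            raise ValueError(f"unsafe property name: {prop!r}")
        params[f"p{i}"] = value
        clauses.append(f"c.{prop} = $p{i}")
    where = ("WHERE " + " AND ".join(clauses) + "\n") if clauses else ""
    cypher = (
        "MATCH (s:Search {id: $search_id})-[:RESULT]->(c:Campaign)\n"
        + where
        + "RETURN c"
    )
    return cypher, params


# Hypothetical usage: status and region are made-up Campaign properties.
cypher, params = search_within("s1", {"status": "active", "region": "EU"})
```

With an index or uniqueness constraint on the search ID, the lookup of `s` should be cheap, so the cost would mostly come from walking that one search's `RESULT` relationships rather than the whole graph.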

I guess my concern is having hundreds of thousands or potentially several million nodes all connected to one single node. There would then be multiple instances of this within the graph.

Most of these will also become redundant as the user becomes less interested in using those search results. In this case we could perhaps notify the user that a stored search would be deleted after 3 months of inactivity and offer them the choice to keep it.
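
That clean-up could be a periodic job along the lines below. This is only a sketch: the `last_used` and `keep` properties, the 90-day window standing in for "3 months", and the helper name are all assumptions, and the Search nodes would need to record when they were last used for it to work.

```python
from datetime import datetime, timedelta, timezone

# Assumed properties: Search.last_used (a datetime) and Search.keep (a boolean
# set when the user chooses to keep the search).
DELETE_STALE = """
MATCH (s:Search)
WHERE s.last_used < datetime($cutoff) AND coalesce(s.keep, false) = false
DETACH DELETE s
"""


def stale_cutoff(now: datetime, days: int = 90) -> str:
    """Return the ISO 8601 timestamp before which a search counts as inactive."""
    return (now - timedelta(days=days)).isoformat()


# Parameters for DELETE_STALE at the moment the job runs.
params = {"cutoff": stale_cutoff(datetime.now(timezone.utc))}
```

For a search with millions of `RESULT` relationships, the delete itself would probably need to be batched as well rather than run as one `DETACH DELETE`.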
