Counting nodes based on date

I have a set of nodes with a creationDate property that is indexed, and I am trying to count how many nodes were created for every day in a given date range, including days for which none were created. To do this for the month of December, I tried:

// Query 1
MATCH (m:item)
WHERE datetime('2019-12-01') <= m.creationDate < datetime('2020-01-01')
RETURN date(m.creationDate) AS date, COUNT(*) AS count

However, this query does not return dates for which there are no nodes. I managed to get this working:

// Query 2
WITH date('2019-12-01') AS startDate, date('2019-12-31') AS endDate
WITH startDate, duration.inDays(startDate, endDate).days AS days
WITH [day in range(0, days) | startDate + duration({days: day})] AS dates
UNWIND dates AS date
MATCH (m:item)
WHERE date(m.creationDate) = date
RETURN date, COUNT(*) AS count

The problem is that query 2 takes 27 seconds, while query 1 takes 1 second. This makes sense–query 2 is essentially running query 1 once for each day in the range.

I know I can post-process the data in my code to fill in the gaps, but I would prefer not to have to do the extra step if possible. Is there a way to do this in Cypher without the time penalty of rerunning query 1 once for each day in the range?

unwind(['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-03']) as dt 
create (:Sample { creationDate: datetime(dt) });

(Note that these are datetimes in there and not dates)

match (s:Sample) 
return distinct(date(s.creationDate)) as date, count(s) as nodesWithThisDate


"2020-01-01"    2
"2020-01-02"    1
"2020-01-03"    1

If there's a node in there with a null, it will show up and count nulls.

Thanks for the response David!

In your example, if I changed the first statement to be:

unwind(['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-04']) as dt 
create (:Sample { creationDate: datetime(dt) });

then run the same query you give, the results would be:

"2020-01-01"    2
"2020-01-02"    1
"2020-01-04"    1

The results I need are:

"2020-01-01"    2
"2020-01-02"    1
"2020-01-03"    0
"2020-01-04"    1

In a relational database, this would be an OUTER JOIN operation between the aggregated data and a list of dates. In Neo4j, I'm not sure if an analogous operation is possible. Query 2 isn't the same–it's more of an iterated approach. OPTIONAL MATCH seems like the closest thing to OUTER JOIN, but

The only option I see (other than post-processing) is to create "date bin" nodes for each possible creation date:

UNWIND (['2019-12-01', '2019-12-02', ..., '2019-12-31') AS dt
CREATE (:DateBin {date: date(dt)})

Then I can establish relationships between those nodes and the item nodes:

MATCH (m:item)
MATCH (n:DateBin {date: date(m.creationDate})
MERGE (m)-[:CREATED_ON]->(n)

Then the report can be run using those nodes:

MATCH (n:DateBin)
WHERE date('2019-12-01') <= < date('2020-01-01')
RETURN AS date, COUNT(m) AS count

(I haven't actually tried this. Apologies if there are errors; I just wanted to illustrate the idea.)

The only real downside to this approach is the additional structure needed in the database to support these queries. Any chance there's an easier way that isn't O(n) on the size of the date range?