One thing that's not clear is the scope of the aggregation, the grouping key is not clear if we were to use it in a WHERE clause.

Would it be asking for the average of all persons who acted in all movies (so the same average would display on every result row)? Would persons be counted multiple times because they act in multiple movies, and should that duplication be captured in the average? Or is it asking for the average per movie? What if the pattern was more complicated, or there were more variables in scope?

Consider this MATCH and WHERE clause:

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)-[:ATTENDED_COLLEGE_AT]->(college:College)
WHERE AVG(p.born) > 1967

How could we tell what the grouping key is here? Is it per movie? Per college? Neither, and you want the average of all persons who acted in a movie and attended college? What if they acted in multiple movies, or attended multiple colleges?

When we restrict aggregations to WITH clauses, the grouping key becomes much clearer, as in the workaround you cited: you're getting, per movie, the average born year of that movie's actors, and then filtering that row based on whether the average is over 1967.

For fun, I made an some other interesting queries:

Find movies who's average actor's birth year is less or equal than the average from the movie 'The Matrix', (but not including 'The Matrix')

MATCH (m1:Movie{title:'The Matrix'})<-[:ACTED_IN]-(p1:Person)
WITH AVG(p1.born) AS matrixAvg, m1
MATCH (m2:Movie)<-[:ACTED_IN]-(p2:Person)
WITH AVG(p2.born) AS bornAvg, matrixAvg, m2, m1
WHERE bornAvg <= matrixAvg AND id(m2) <> id(m1)
RETURN m2.title, bornAvg

Or yet still more interesting, is to get the actor's average age for a movie at the time the movie was released and select for the average age less than the average age for 'The Matrix".

MATCH (m1:Movie{title:'The Matrix'})<-[:ACTED_IN]-(p1:Person)
WITH m1.released - AVG(p1.born) AS matrixAgeAvg, m1
MATCH (m2:Movie)<-[:ACTED_IN]-(p2:Person)
WITH m2.released - AVG(p2.born) AS movieAgeAvg, matrixAgeAvg, m2, m1
WHERE movieAgeAvg <= matrixAgeAvg AND id(m2) <> id(m1)
RETURN m2.title, m2.released, movieAgeAvg

Unless you want to create your own graph database engine some questions cannot be entirely understand by a Cypher user, and it's the point of having a short and sweet language as Cypher.

But roughly, Cypher works on two data domains at the same time, the graph itself, let's describe it as a bunch of physical links in the memory, and the projected data from this graph who are rows with data in it, you can visualise it as a table.

Some clauses or function are build to work with the graph domain and others are build to work with the projected data. If you think about it carefully, it doesn't make any sens to use avg in the graph domain because the data doesn't exists as a list or rows of something in the graph domain.

Read and try to understand the behaviour of the clause MATCH and what's a graph and so you will be able to understand the answer of @andrew_bowman

After years of mainly using procedural languages, thinking in declaratively can be a bit tricky. Adding Graph thinking on top of that adds an additional dimension.

I certainly find Cypher more intuitive than SQL but every so often, I get tripped up on the finer points.