What is a row?

andrew.bowman mentioned...

As mentioned, a record and a row refer to the same thing in Cypher. We don't have tables, so we're never referring to a table structure kept as part of the durable graph data.

"Record" is more correct for the reasons you mentioned, and I'd argue that we should probably change the language in our query plans to use "records" and not "rows". That's something I can push for, and it aligns with our usage of "record" in other parts of the Neo4j Browser. It makes sense to standardize the language we use.

That said, when we return these, the most intuitive return format looks a lot like a row. And these rows as a whole appear to make up a table, so we won't completely escape the similarities. We even have a Table result view in the browser.

andrew.bowman mentioned...

And for clarification, a pattern is what you provide to a MATCH: it's the "this is what you're looking for" instruction to the db.

A path is a piece of the graph data that matches the desired pattern. An instance of a pattern, if you like.

oren214 mentioned...

Yeah, I'm not trying to complain (too much) about semantics. I get why "rows" is totally relevant and helpful, especially for folks coming from table-based databases. It's familiar.

oren214 mentioned...

so couldn't we get a collection of paths instead of a collection of records? wouldn't that open some doors?

andrew.bowman mentioned...

Sure...if you collect() the paths. But now you have a single list of paths. How do you refer to the elements in each path that you are interested in?
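
To make that concrete, here is a rough sketch, assuming a graph with :Movie and :Person nodes joined by :ACTED_IN relationships. The `title` property is an assumption about the data model, not something confirmed in this thread.

```cypher
// Collect every matched path into one list: a single record with a single column.
MATCH path = (:Movie)<-[:ACTED_IN]-(:Person)
WITH collect(path) AS paths
// To reach anything inside, each path has to be unpacked by position.
// Assumption: Movie nodes carry a `title` property.
RETURN [p IN paths | nodes(p)[0].title] AS movieTitles
```

The pattern starts at the movie, so `nodes(p)[0]` is the :Movie node of each path. Having to index by position like this is the awkwardness being described.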

oren214 mentioned...

well, in the browser when you look at the table view, you see what is basically a collection of hashes. We all know how to dig into a hash, and no reason we couldn't have a dot syntax for that too.

oren214 mentioned...

Same with text view

andrew.bowman mentioned...

Sure...but we basically have a single column for all the paths. But people don't want to work with a list of paths, they want to use separate variables referring to the parts of the paths (usually nodes) that they are interested in. And they want to perform aggregations, and calculations, and projections, so you're usually not going to be working with pure paths at the end.

oren214 mentioned...

I mean, now that I think about it, it could really be represented the way we represent a graphql response...

oren214 mentioned...

I guess that's dgraph's approach, if I understand what they are doing over there (I don't).

andrew.bowman mentioned...

GraphQL responses aren't that much different from Cypher results...it's just a JSON representation of the results. But returning just a giant JSON structure of results isn't really the best when you're anticipating a lot of results, or if you're trying to do something like export to a CSV.

oren214 mentioned...

but I'm not saying a giant json structure of results. I'm saying a collection of json structures of paths

oren214 mentioned...

and I would want a convenience syntax so I don't have to feel like I'm dealing with json

oren214 mentioned...

anyhow. I've wasted enough of your time with my ponderings. I'm so grateful for your explanations and time.

andrew.bowman mentioned...

If you want that you can get that.

MATCH path = (:Movie)<-[:ACTED_IN]-(:Person)
RETURN path

Or nodes(path) or relationships(path), if you're interested in just the nodes or just the relationships.
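
A minimal sketch of that variant, using the same pattern. The aliases `pathNodes` and `pathRels` are just illustrative names.

```cypher
// One record per matched path, with the path split into its parts.
MATCH path = (:Movie)<-[:ACTED_IN]-(:Person)
RETURN nodes(path) AS pathNodes,          // list of the nodes on the path, in order
       relationships(path) AS pathRels    // list of the relationships on the path
```

For this pattern, each record should hold a two-element node list (the movie, then the person) and a one-element relationship list.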

oren214 mentioned...

It's been extremely helpful

oren214 mentioned...

I guess I was just saying I wish that was the default instead of an option because FOR ME it makes it easier to think about

andrew.bowman mentioned...

Got it. It does depend upon what is important to each user, for what they want returned. For some, paths are the most important. And it also varies with the kinds of queries you want to run.

We're flexible, but that flexibility introduces complexity, no real way around that.

andrew.bowman mentioned...

Glad to help!

andrew.bowman mentioned...

I guess, put a better way: if the results you desire are the paths, then work with the path variables in addition to the important components of the path.

But for many, many queries, paths are just a means to an end, not the end itself. For example, if you only need, per movie, the movie title and the names of actors (and you have no need to visualize it graphically), then the paths don't matter in the end.
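
As a sketch of that case, assuming movies have a `title` property and people have a `name` property (neither is confirmed in this thread):

```cypher
// One record per movie: the title plus a list of actor names.
// No path variable is bound, because the paths themselves are never returned.
MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
RETURN m.title AS movie, collect(p.name) AS actors
```

The pattern is still matched path by path, but collect() folds those matches into one record per distinct title.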

If you want to calculate the average rating given to a movie, then when returning the final results you don't care about all the tens of thousands or more :REVIEW relationships between people and the movie. You just want to output the movie and the calculated average review score. Having to output every path that was referenced to calculate the average isn't useful, and it will make the query slower and output far more than what's needed.
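
Roughly, and with some assumptions: the sketch below supposes that :REVIEW relationships point from the person to the movie and carry a numeric `rating` property, and that movies have a `title`. None of those details are confirmed here.

```cypher
// Aggregate over all matching :REVIEW relationships, returning one record per movie.
// Assumptions: direction (Person)-[:REVIEW]->(Movie), and a numeric `rating` property.
MATCH (m:Movie)<-[r:REVIEW]-(:Person)
RETURN m.title AS movie, avg(r.rating) AS averageRating
```

However many reviews a movie has, the output stays at one record per movie title, which is the point being made about not returning the paths.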