How to model markets and associated datasets?


Just finished the Neo4j Academy series up past data modelling. Very good. Wish I'd done it earlier! Couldn't find answers to this specific question so wanted to ask: Does anyone have views on how to optimally model financial markets and associated datasets (and data-producers) in a scalable and generalisable way?

This is for a knowledge management system. Business queries include things like:

  1. marketshare – which jurisdictions have the highest share of the forex market?
  2. marketshare – which 10 currencies have the highest share of the forex market?
  3. source/reference – what datasets support these numbers? who produced these datasets? etc.
  4. source/file – what file or files did this dataset come from?
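
To make these four queries concrete, here is a minimal in-memory sketch in Python of the kind of structure they imply. It is only an illustration, not a proposed schema: the class names (`Dataset`, `Observation`), the helper functions, and every value in it (currency codes, shares, producer and file names) are made-up placeholders rather than real BIS figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset node: knows its producer and its source files."""
    name: str
    producer: str
    source_files: tuple


@dataclass(frozen=True)
class Observation:
    """Hypothetical data point: one share value, linked to its dataset."""
    market: str
    dimension: str    # "jurisdiction" or "currency"
    key: str          # e.g. a country name or currency code
    year: int
    share_pct: float  # placeholder value, not a real figure
    dataset: Dataset


def top_shares(observations, market, dimension, year, n=10):
    """Queries 1 and 2: the n largest shares of a market along one dimension."""
    rows = [
        o for o in observations
        if o.market == market and o.dimension == dimension and o.year == year
    ]
    return sorted(rows, key=lambda o: o.share_pct, reverse=True)[:n]


def provenance(observation):
    """Queries 3 and 4: which dataset, producer and files back this number."""
    d = observation.dataset
    return {"dataset": d.name, "producer": d.producer, "files": list(d.source_files)}


# Placeholder data only.
example_dataset = Dataset(
    name="example-turnover-table",
    producer="Example Producer",
    source_files=("example_table.csv",),
)
observations = [
    Observation("forex", "currency", "BBB", 2019, 30.0, example_dataset),
    Observation("forex", "currency", "AAA", 2019, 50.0, example_dataset),
    Observation("forex", "jurisdiction", "Country X", 2019, 40.0, example_dataset),
]

top = top_shares(observations, "forex", "currency", 2019, n=10)
print([(o.key, o.share_pct) for o in top])
print(provenance(top[0]))
```

The point of the sketch is that every number carries a link to its dataset, so the provenance questions become a traversal from the data point rather than a separate lookup.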

For an initial MVP example I'm using the global forex market, based on BIS OTC forex data from "Foreign exchange turnover in April 2019".

To keep the model simple, I've only included one row for each data table in the current model and limited it to one year, though the goal is to convert all key fields into a graph model.
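
As a rough illustration of what "one row becomes a piece of graph" could mean, the Python sketch below turns a single table row into nodes and relationships. All of it is assumed for the example: the node labels (`Market`, `Currency`, `Observation`, `Dataset`, `Producer`, `File`), the relationship types, the `row_to_graph` helper, and the row values are hypothetical placeholders, not the actual model or real BIS numbers.

```python
def row_to_graph(row, market, dataset, producer, source_file):
    """Convert one table row into (nodes, edges) of a small property graph.

    nodes: dict mapping a node id to its properties
    edges: list of (start_id, relationship_type, end_id) tuples
    """
    market_id = f"market:{market}"
    currency_id = f"currency:{row['currency']}"
    dataset_id = f"dataset:{dataset}"
    producer_id = f"producer:{producer}"
    file_id = f"file:{source_file}"
    # One observation node per (dataset, currency, year) keeps each value traceable.
    obs_id = f"obs:{dataset}:{row['currency']}:{row['year']}"

    nodes = {
        market_id: {"label": "Market", "name": market},
        currency_id: {"label": "Currency", "code": row["currency"]},
        obs_id: {
            "label": "Observation",
            "year": row["year"],
            "share_pct": row["share_pct"],
        },
        dataset_id: {"label": "Dataset", "name": dataset},
        producer_id: {"label": "Producer", "name": producer},
        file_id: {"label": "File", "name": source_file},
    }
    edges = [
        (obs_id, "OF_MARKET", market_id),
        (obs_id, "FOR_CURRENCY", currency_id),
        (obs_id, "FROM_DATASET", dataset_id),
        (dataset_id, "PRODUCED_BY", producer_id),
        (dataset_id, "EXTRACTED_FROM", file_id),
    ]
    return nodes, edges


# Placeholder row; the values are invented for the example.
example_row = {"currency": "AAA", "year": 2019, "share_pct": 50.0}
nodes, edges = row_to_graph(
    example_row,
    market="forex",
    dataset="example-turnover-table",
    producer="Example Producer",
    source_file="example_table.csv",
)
for start, rel, end in edges:
    print(f"({start})-[:{rel}]->({end})")
```

Because the node ids are derived from the values themselves, loading a second row from the same dataset would reuse the same market, dataset, producer and file nodes and add only a new observation (and currency, if new), which is roughly what `MERGE` does in Cypher.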

Can you see any key problems or improvements?
Are there other data models I should check out that might help?
Or perhaps this is silly and there are other more worthy approaches?
The goal is to have a system that can easily pull in many datasets from many providers across many markets, etc., place them on a global timeline, and provide real-time, automated traceability down to the individual data field.