Create schema for website analytics and make dynamic queries to segment users

Journey:

  • User VISITED Home page (Amazon)
  • User VISITED Shoes Category page (Amazon)
  • User VISITED Home page (Amazon)
  • User VISITED Electronics Category page (Amazon)
  • User VISITED Mobiles Category page (Amazon)
  • User VISITED iPhone 11 Pro Max page (Amazon)
  • User has done ADD_TO_CART { name: 'iPhone 11 Pro Max', price: 1000, category: 'mobiles', brand: 'Apple', color: 'gold' } (Amazon)
  • User has done ORDER_PLACED { orderId: 1234, orderedAt: '2021-06-13T05:30:26.346+00:00, name: 'iPhone 11 Pro Max', price: 1000, category: 'mobiles', brand: 'Apple', color: 'gold' } (Amazon)
  • User VISITED Home page (Amazon)

Ex:

  • VISITED { url: 'https://www.amazon.com/mobiles', visitedAt: '2021-06-13T05:30:26.346+00:00', visitId: 'uuid-its-uniq-for each visit' }
  • ADD_TO_CART { name: 'iPhone 11 Pro Max', price: 1000, category: 'mobiles', brand: 'Apple', color: 'gold' }

Points to note:

  • Multiple websites will be there (like Amazon, EBay, Alibaba etc.,)
  • Each website may have (n) of users
  • Each user may have (n) No. of relations with website (Like above Journey)

Queries:

  • Find users who placed mobile orders where price greater than 500 in Amazon,
  • Find users who has done ADD_TO_CART, but not done ORDER_PLACED in Amazon
  • Find users who has done ADD_TO_CART in the last 7 days, but not done ORDER_PLACED in Amazon
  • Find users who VISITED shoes category (based on url), but not done ORDER_PLACED in Amazon

Need a better solution for this, I have already one solution, but this needs to just review and give me better suggestions. For more info, please send a personal message

What about something like this:

@ronny.de.winter
User journey I provided here is just to understanding purpose, the schema should be optimal for all types of websites (not only for ecommerce).

Let consider every action by user on the site as an event.

Ex: VISITED a page, (default for all sites)
ADD_TO_CART, ORDER_PLACED etc., in the case of ecomerce
SUBSCRIBED in the case SaaS
CAMPAIGN_CREATED in the case Marketing App etc., as events

Here VISITED is default event when someone visit any page in the site, it may be ecommerce or any other site.
And rest all or manual events, that site owner wants to track.

So, I think we can't define a fixed schema.
And one is issue is we need to segment users based on their events (Making queries dynamycally).

Note:
Its similer to google analytics we can consider, and should able to store multiple sites user journies and segment users based on their events

Queries Ex:

  • Find users who has done ORDER_PLACED event in mobiles category where price greater than 500,
  • Find users who has done ADD_TO_CART in the last 7 days, but not done ORDER_PLACED in Amazon
  • Find users who VISITED a perticual page (based on url), but not done CAMPAIGN_CREATED

I propose you develop your graph data model iteratively and refactor it along the way.
Getting a rough first model quickly reduces the total time you need to reach the more refined version. Refactoring a graph is relatively inexpensive.

Based on your questions identify the entities and connections. Build and test your query to answer the first question. Test it and refactor it when needed, ie to improve performance or simplify traversing.

Take it question by question and improve with every step.

1 Like