


The hardest thing to wrap my head around when using a graph database is choosing the level of granularity. Let's say I have a graph for things that occur on certain days of the week: trash day, Taco Tuesday, BYOB Friday, etc.

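  • I could make each day its own node (Monday, Tuesday, Wednesday, ...), which makes it quick to query a specific day.
  • I could create a node called Day and add the name of the day of the week as a property, which makes it easy to query all the days in the graph.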
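To make the two options concrete, here is a minimal sketch in plain Python, with no graph database involved. The dictionary layout, the `label` and `name` keys, and the function names are all illustrative assumptions, not any real graph database API.

```python
# Option 1: one node per day, identified by the day itself.
# Finding a specific day is a direct key lookup.
nodes_by_identity = {
    "Monday": {},
    "Tuesday": {},
    "Friday": {},
}

# Option 2: generic nodes that carry a label and a "name" property.
# Listing every day is a filter on the "Day" label.
nodes_by_label = [
    {"label": "Day", "name": "Monday"},
    {"label": "Day", "name": "Tuesday"},
    {"label": "Day", "name": "Friday"},
    {"label": "Event", "name": "Taco Tuesday"},
]


def find_day_option1(name):
    """Return the node for one specific day, or None if it does not exist."""
    return nodes_by_identity.get(name)


def all_days_option2():
    """Return the names of all nodes labelled 'Day'."""
    return [n["name"] for n in nodes_by_label if n["label"] == "Day"]


print(find_day_option1("Tuesday") is not None)
print(all_days_option2())
```

Each layout makes one of the two queries trivial, which is why the choice is hard to make without knowing which query matters more.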


My thinking is that making nodes very specific is bad because there is no limit to granularity: for example, Saturday morning, evening and night, or worse, a new node per hour of each day. I could also make edges a component of the granularity, for example by linking the Saturday node to the trash day node with an "evening" edge.
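The edge-based variant can be sketched the same way. This is only an illustration in plain Python: the `Edge` tuple, the sample edges other than Saturday/evening/trash day, and the `events_on` helper are assumptions made up for the example.

```python
from collections import namedtuple

# An edge connects a day node to an event node; the relation name
# carries the finer-grained "part of day" information.
Edge = namedtuple("Edge", ["source", "relation", "target"])

edges = [
    Edge("Saturday", "evening", "Trash Day"),
    Edge("Tuesday", "evening", "Taco Tuesday"),
    Edge("Friday", "night", "BYOB Friday"),
]


def events_on(day, part_of_day=None):
    """Return events linked to a day, optionally restricted to one part of the day."""
    return [
        e.target
        for e in edges
        if e.source == day and (part_of_day is None or e.relation == part_of_day)
    ]


print(events_on("Saturday"))
print(events_on("Saturday", "morning"))
```

With this layout the day nodes stay coarse, and the extra detail only costs anything when a query actually filters on the relation.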


I come across similar problems every now and then. For example, should I create a new node named after a person's full name, or a node called "Person" with a "name" property? So far I make nodes either specific or general based on convenience, but I feel there may be some best practice or higher-level principle I'm missing. It's not clear to me how to judge which way is better.



The level of granularity of your data model should be driven by your query requirements, not the other way around. That is, when modeling your database, you should ask yourself: "What kind of queries will I run over my data?" The answers to this question give you a good starting point for a model with an appropriate level of granularity.


In the book Learning Neo4j by Rik Van Bruggen (you can download it at this link), the author discusses designing graph databases for query-ability:


So, based on this, the answer to your question "what level of specificity should be used when the granularity level can be unlimited?" is: it depends on your query requirements. Think first about the queries you will run, and only then about the data model.
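One way to apply this is to write the required queries down first and then check the simplest model that can answer them. The sketch below does that in plain Python; the two queries, the `occurs_on` pairs, and the function names are hypothetical examples, not taken from the book.

```python
# Candidate model: (event, day) pairs, i.e. generic Event and Day nodes
# joined by a single "occurs on" relationship.
occurs_on = [
    ("Trash Day", "Saturday"),
    ("Taco Tuesday", "Tuesday"),
    ("BYOB Friday", "Friday"),
]


def events_for_day(day):
    """Required query 1: what happens on a given day?"""
    return sorted(event for event, d in occurs_on if d == day)


def busy_days():
    """Required query 2: which days have anything scheduled?"""
    return sorted({d for _, d in occurs_on})


print(events_for_day("Tuesday"))
print(busy_days())
```

Neither query mentions the time of day, so under these assumptions per-morning or per-hour nodes would add nothing. If a later requirement asks what happens on Saturday evening, that would be the point at which to refine the model.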


My suggestion is: keep your model as simple as possible in the beginning and, when required, make gradual changes.

