问题描述
我有一个名为 Calendar 的数据库表,带有字段
I have a db table called Calendar with fields
- ID (PK)
- 姓名
- 说明
- CalendarTypeId(FK 到CalendarType 表)
我有另一个名为 CalendarType 的表,带有字段
I have another table called CalendarType with fields
- ID (PK)
- 姓名
- 说明
问题是我需要为日历类型为 2 的每个日历存储一个附加字段.(但该字段与任何其他日历类型无关).
The issue is that i need to store an additional field for every calendar where the calendar Type is 2. (but this field would be irrelevant for any other calendar type).
我是否应该只在 Calendar 表中创建一个新字段并忽略具有不同 calendarTypeid 的所有其他日历的该字段,或者是否有更好的方法来组织此架构以支持此需求.
Should i just create a new field in the Calendar table and ignore that field for all other calendar that have a different calendarTypeid or is there a better way to organize this schema to support this need.
推荐答案
好的,这是您当前拥有的 ER 模型(省略基数):
Ok, this is the ER model of what you currently have (omitting cardinalities):
现在,让我们关注日历和子日历.很明显,你在那里有一个层次结构.但是层次结构是如何变成表格的呢?执行此操作的常用方法有以下三种:
Now, let's focus on the Calendar and SubCalendar. Clearly, you have a hierarchy there. But how are hierarchies turned into tables? There are three common ways to do this:
1) 杀死父实体并保留子实体:在这种情况下,您移除父实体并将该实体的所有字段发送给每个子实体.在您的示例中,您只有一个孩子,因此所有父母的属性都将只适用于它.
1) Kill the parent and keep the children: In this case you remove the parent entity and send all the fields from that entity to each of the children. In your example you only have one child, so all the parent's attributes will go only to it.
优点:没有空值,因为每个表都有它所需要的.也不需要连接.如果您将运行查询仅搜索一种类型的子项,则此模式将非常有用,因为您不需要按类型过滤,因为每个表将只存储一种类型
Advantages: No null values as each table will have all it needs. No joins are required either. If you will be running queries searching just for one type of children this schema will be useful because you won't need to filter by type because each table will store only one type
缺点:此架构不适用于您有重叠子项的情况.换句话说,如果在向每个子项发送字段时父行可以有多个子项,则父行数据将在每个子项中重复.不好,所以如果是这种情况,请不要使用此策略.此外,如果您有很多子级,每个子级中的记录很少,那么您将有很多表,每个表都很少记录,因此可能会变得有点难以管理
Disadvantages: This schema is not appropriate for cases where you have overlapping children. In other words, if a parent row can have more than one children when sending the fields to each child the parent data will be duplicated in each children. Not good, so don't use this strategy if that is the case. Additionally, if you have many children and very few records in each, you'll have many tables with few records each, so it might become a little harder to manage
2) 杀死孩子并保留父母:在这种情况下,您删除所有孩子并将他们的所有属性发送给父母.由于父级现在是它自己和它所有子级的混合体,因此需要一种方法来确定哪一行属于什么类型的子级.这是通过向父实体添加一个新属性来实现的,该属性将确定每一行的类型(无论数据类型如何).
2) Kill children and keep the parent: In this case you remove all the children and send all of their attributes to the parent. As the parent is now a mix of itself and all of its children it needs a way to determine what row belongs to what type of children. This is accomplished by adding a new attribute to the parent entity that will determine the type of each row (no matter the data type).
优点:所有孩子只有一张桌子,所以很容易管理.不需要联接.如果针对此表运行的大多数查询都需要来自多个类型的子项的结果,则可能会很有用.
Advantages: There will only be one table for all children, so that is easy to manage. No joins are required. Might be useful if the most queries that are run against this table require results from more than one type of children.
缺点:同样,如果父级可以有与多个子级相关的行,则数据将被复制,因为每个子级都有一行,因此这里有一个限制解决方案.此外,必须添加一个新列作为元数据.表中的记录量会更大.必须将空值分配给孩子拥有的数据以及父母或其他孩子拥有的数据.
Disadvantages: Again, if a parent can have a row that relates to more than one children data will be duplicated as there will be one row per each of them, so there is a limitation in this solution. Additionally, a new column will have to be added as metadata. The volume of records in the table will be the larger. Null values will have to be assigned to the data that children have and the parent nor other children have.
3) 保留所有:最不血腥的解决方案是不杀死任何东西:) 在这种情况下,层次结构由父级和每个子级之间的关系代替.这样,子表就必须通过外键连接到父表才能访问父表的数据.
3) Keep all: The least bloody solution is not to kill anything :) In this case the hierarchy is replaced by a relationship between the parent and each of the children. That way, a child will have to join to the parent table by means of a foreign key to reach the parent's data.
优点:没有数据重复,也没有空值.每个实体只有最少量的数据,其余的可以通过加入父表来获得.在这种情况下,父行可以链接到多个子行而无需复制数据.如果将运行许多查询,而这些查询只能用一个表(通常是父表)来满足,这是一个不错的选择.还有一点是很容易扩展到更多的日历,比如要添加一个新的日历,需要新的字段,那么就得添加一个新的表,而不用修改当前的表
Advantages: There is no data duplication nor null values. Each entity has just the minimal amount of data and the rest can be obtained by joining to the parent table. In this case, a parent row can be linked to more than one children without duplicating data. If there will be run many queries that can be satisfied with only one table (usually the parent) this is a good option. One more thing is that it is easy to extend to more calendars, for example, if a new calendar is to be added that requires new fields, then a new table has to be added, without modifying the current ones
缺点:需要最多的表(实际上比第一个多一个).每个孩子都需要一个连接,这会随着数据集的增大而降低性能.此外,将需要外键来连接两个表.如果大多数查询需要来自父级和子级的数据,则此架构在性能方面将是最差的
Disadvantages: Requires the most tables (actually one more than the first one). A join is needed per each children which will degrade performance the larger the dataset becomes. Additionally, foreign keys will be needed to join both tables. If the most queries will require data from parent and children this schema will be the worst in terms of performance
现在,您问哪个是最佳
数据库架构.我认为现在很清楚这取决于需求、将运行的查询类型、数据的结构方式等.
Now, you asked which is the best
database schema. I think it is now clear it depends on the requirements, types of queries that will be run, the way data is structured, etc.
不过,我可以稍微分析一下.您说您有一个日历表,有时其中之一需要更多数据.所以我们可以说我们有两种类型的日历,父日历和子日历.因此,我们可能认为采用解决方案 2 是一种很好的可能性,因为您将有 2 行代表每种类型,但我们错了.这是因为在这种情况下,每个孩子都包括其父母.现在,如果我们可以假设如果 SubAttribute
对于子项始终为非空而对于父项始终为空,我们甚至可以删除 CalendarType
,这实际上将导致解决方案 1.
However, I can analyze this a little bit more. You said that you had a Calendar table and sometimes more data is needed for one of them. So we can say we have 2 types of calendars, the parent and the child. So we might think that going for solution 2 is a good possibility because you'll have 2 rows representing each type, but we would be wrong. This is because each child includes its parent in this case. Now, if we can assume that if SubAttribute
will always be non-null for a child and null for a parent we can even remove CalendarType
, which will actually lead to solution 1.
最后,根据经验法则(主要是因为大多数查询在现实生活中都有很多连接),如果您想专注于性能,则应该选择解决方案 1,否则,如果您想专注于规范化设计你应该去解决方案 3.
Finally, as a rule of thumb (and mainly because most queries have lots of joins in real life), if you want to focus on performance, you should go for solution 1, otherwise, if you want to focus on having normalized design you should go for solution 3.
我希望这消除了一些疑虑,并可能产生其他疑虑:)
I hope this has cleared some doubts and possibly generated others :)
这篇关于支持仅适用于特定行的值的最佳数据库架构是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!