
Problem description

Preface: I was thinking the other day about a database structure for a new application and realized that we need an efficient way to store historical data. I would like someone else to take a look and see whether there are any problems with this structure. This method of storing data has very likely been invented before (I am almost certain it has), but I have no idea whether it has a name, and the Google searches I tried didn't yield anything.

Problem: Let's say you have a table for orders, and each order is related to a row in a customers table for the customer that placed it. In a normal database structure you might expect something like this:

orders
------
orderID
customerID


customers
---------
customerID
address
address2
city
state
zip

Pretty straightforward: the orders table has a customerID foreign key that references the primary key of the customers table. But if we run a report over the orders table, we join the customers table to the orders table, which brings back the current record for that customer ID. What if the customer's address was different when the order was placed and has since been changed? Now our order no longer reflects the customer's address at the time the order was placed. Basically, by changing the customer record, we just changed all history for that customer.
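A minimal sketch of the problem, assuming Python's built-in sqlite3 module and made-up sample data (the addresses, IDs, and the `REPORT` query name are illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customerID INTEGER PRIMARY KEY,
        address    TEXT,
        city       TEXT
    );
    CREATE TABLE orders (
        orderID    INTEGER PRIMARY KEY,
        customerID INTEGER REFERENCES customers(customerID)
    );
    INSERT INTO customers VALUES (1, '1 Old Street', 'Springfield');
    INSERT INTO orders VALUES (100, 1);
""")

# The report joins each order to the customer's *current* row.
REPORT = """
    SELECT o.orderID, c.address
    FROM orders o
    JOIN customers c ON c.customerID = o.customerID
"""

before = conn.execute(REPORT).fetchall()

# The customer moves after the order was placed.
conn.execute("UPDATE customers SET address = '2 New Avenue' WHERE customerID = 1")

after = conn.execute(REPORT).fetchall()

print(before)  # address seen when the order was placed
print(after)   # the same order now reports the new address
```

The order row never changed, yet the report for it did, which is exactly the loss of history described above.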

Now there are several ways around this, one of which would be to copy the record when an order is created. What I have come up with, though, is what I think is an easier and perhaps slightly more elegant way to do this, with the added bonus of logging every time a change is made.

What if I used a structure like this instead:

orders
------
orderID
customerID
customerHistoryID


customers
---------
customerID
customerHistoryID


customerHistory
---------------
customerHistoryID
customerID
address
address2
city
state
zip
updatedBy
updatedOn

Please forgive the formatting, but I think you can see the idea. Basically, any time a customer is changed (inserted or updated), a new customerHistory row is written with the next customerHistoryID, and the customers table is updated to hold that latest customerHistoryID. The orders table now points not only to the customerID (which lets you see all revisions of the customer record) but also to the customerHistoryID, which identifies a specific revision of the record. Now the order reflects the state of the data at the time the order was created.
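Here is one way the proposed structure might look in practice. This is only a sketch, assuming Python's sqlite3 module; the helper names `save_customer`, `place_order`, and `order_address` are hypothetical, and the address columns are trimmed to two for brevity:

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (
    customerID        INTEGER PRIMARY KEY,
    customerHistoryID INTEGER  -- points at the latest revision
);
CREATE TABLE customerHistory (
    customerHistoryID INTEGER PRIMARY KEY AUTOINCREMENT,
    customerID        INTEGER NOT NULL REFERENCES customers(customerID),
    address           TEXT,
    city              TEXT,
    updatedBy         TEXT,
    updatedOn         TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
    orderID           INTEGER PRIMARY KEY,
    customerID        INTEGER NOT NULL REFERENCES customers(customerID),
    customerHistoryID INTEGER NOT NULL REFERENCES customerHistory(customerHistoryID)
);
"""

def save_customer(conn, customer_id, address, city, updated_by):
    """Insert or update a customer by appending a new history row."""
    with conn:  # one transaction: history row plus pointer update
        conn.execute(
            "INSERT OR IGNORE INTO customers (customerID) VALUES (?)",
            (customer_id,),
        )
        cur = conn.execute(
            "INSERT INTO customerHistory (customerID, address, city, updatedBy) "
            "VALUES (?, ?, ?, ?)",
            (customer_id, address, city, updated_by),
        )
        conn.execute(
            "UPDATE customers SET customerHistoryID = ? WHERE customerID = ?",
            (cur.lastrowid, customer_id),
        )
    return cur.lastrowid

def place_order(conn, customer_id):
    """Create an order pinned to the customer's current revision."""
    with conn:
        (history_id,) = conn.execute(
            "SELECT customerHistoryID FROM customers WHERE customerID = ?",
            (customer_id,),
        ).fetchone()
        cur = conn.execute(
            "INSERT INTO orders (customerID, customerHistoryID) VALUES (?, ?)",
            (customer_id, history_id),
        )
    return cur.lastrowid

def order_address(conn, order_id):
    """Return the address as it was when the order was placed."""
    return conn.execute(
        "SELECT h.address FROM orders o "
        "JOIN customerHistory h ON h.customerHistoryID = o.customerHistoryID "
        "WHERE o.orderID = ?",
        (order_id,),
    ).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
save_customer(conn, 1, "1 Old Street", "Springfield", "alice")
order_id = place_order(conn, 1)
save_customer(conn, 1, "2 New Avenue", "Springfield", "bob")
print(order_address(conn, order_id))  # still the address at order time
```

History rows are only ever inserted, never updated, so an order's customerHistoryID keeps pointing at the same data no matter how often the customer changes afterwards.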

By adding updatedBy and updatedOn columns to the customerHistory table, you also get an "audit log" of the data, so you can see who made each change and when.
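Reading that audit log back could look roughly like this. The sketch assumes Python's sqlite3 module; the `audit_log` helper, the user names, and the timestamps are invented sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerHistory (
        customerHistoryID INTEGER PRIMARY KEY,
        customerID        INTEGER NOT NULL,
        address           TEXT,
        updatedBy         TEXT,
        updatedOn         TEXT
    );
    INSERT INTO customerHistory VALUES
        (1, 1, '1 Old Street', 'alice', '2024-01-05 09:00:00'),
        (2, 2, '7 Side Road',  'alice', '2024-01-06 10:30:00'),
        (3, 1, '2 New Avenue', 'bob',   '2024-03-01 14:15:00');
""")

def audit_log(conn, customer_id):
    """Return every revision of one customer, oldest first."""
    return conn.execute(
        "SELECT updatedOn, updatedBy, address FROM customerHistory "
        "WHERE customerID = ? ORDER BY customerHistoryID",
        (customer_id,),
    ).fetchall()

for revision in audit_log(conn, 1):
    print(revision)
```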

One potential downside could be deletes, but I am not really worried about that for this need, as nothing should ever be deleted. Even so, the same effect could be achieved with an activeFlag or something like it, depending on the domain of the data.
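A rough sketch of the activeFlag idea, assuming Python's sqlite3 module; the column default and the helper names `deactivate_customer` and `active_customer_ids` are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customerID        INTEGER PRIMARY KEY,
        customerHistoryID INTEGER,
        activeFlag        INTEGER NOT NULL DEFAULT 1  -- 1 = active, 0 = "deleted"
    );
    INSERT INTO customers (customerID, customerHistoryID) VALUES (1, 10), (2, 11);
""")

def deactivate_customer(conn, customer_id):
    """Soft delete: hide the customer without removing any rows."""
    with conn:
        conn.execute(
            "UPDATE customers SET activeFlag = 0 WHERE customerID = ?",
            (customer_id,),
        )

def active_customer_ids(conn):
    """List only customers that have not been soft deleted."""
    rows = conn.execute(
        "SELECT customerID FROM customers WHERE activeFlag = 1 ORDER BY customerID"
    )
    return [row[0] for row in rows]

deactivate_customer(conn, 1)
print(active_customer_ids(conn))
```

Because the row stays in place, existing orders that reference the customer and its history keep resolving.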

My thought is that all tables would use this structure. Any time historical data is retrieved, it would be joined against the history table using the customerHistoryID to show the state of the data for that particular order.

Retrieving a list of current customers is easy: it just takes a join from the customers table to the customerHistory table on customerHistoryID.
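That join might be written as follows. This is a sketch assuming Python's sqlite3 module, with invented sample rows and a hypothetical `current_customers` helper:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customerID        INTEGER PRIMARY KEY,
        customerHistoryID INTEGER
    );
    CREATE TABLE customerHistory (
        customerHistoryID INTEGER PRIMARY KEY,
        customerID        INTEGER NOT NULL,
        address           TEXT,
        city              TEXT
    );
    INSERT INTO customerHistory VALUES
        (1, 1, '1 Old Street', 'Springfield'),
        (2, 2, '7 Side Road',  'Shelbyville'),
        (3, 1, '2 New Avenue', 'Springfield');
    -- Each customer points at its latest revision.
    INSERT INTO customers VALUES (1, 3), (2, 2);
""")

def current_customers(conn):
    """Return one row per customer, using only the latest revision."""
    return conn.execute("""
        SELECT c.customerID, h.address, h.city
        FROM customers c
        JOIN customerHistory h ON h.customerHistoryID = c.customerHistoryID
        ORDER BY c.customerID
    """).fetchall()

print(current_customers(conn))
```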

Can anyone see any problems with this approach, either from a design standpoint or for performance reasons? Remember, no matter what I do, I need to make sure that the historical data is preserved so that subsequent updates to records do not change history. Is there a better way? Is this a known idea that has a name, or is there any documentation on it?

Thanks for your help.

Update: This is a very simplified example of what I am really going to have. My real application will have "orders" with several foreign keys to other tables: origin/destination location information, customer information, facility information, user information, etc. It has been suggested a couple of times that I could copy the information into the order record at that point, and I have seen it done this way many times, but this would result in a record with hundreds of columns, which really isn't feasible in this case.

Recommended answer

When I've encountered such problems, one alternative is to make the orders table itself the history table. It functions the same way, but it's a little easier to follow:

orders
------
orderID
customerID
address
city
state
zip



customers
---------
customerID
address
city
state
zip

If the number of columns gets too high for your liking, you can separate it out however you like.
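A small sketch of this copy-on-order approach, assuming Python's sqlite3 module; the `place_order` helper and the sample values are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customerID INTEGER PRIMARY KEY,
        address TEXT, city TEXT, state TEXT, zip TEXT
    );
    CREATE TABLE orders (
        orderID    INTEGER PRIMARY KEY,
        customerID INTEGER REFERENCES customers(customerID),
        address TEXT, city TEXT, state TEXT, zip TEXT
    );
    INSERT INTO customers VALUES (1, '1 Old Street', 'Springfield', 'IL', '62701');
""")

def place_order(conn, order_id, customer_id):
    """Copy the customer's current address onto the new order row."""
    with conn:
        conn.execute(
            "INSERT INTO orders (orderID, customerID, address, city, state, zip) "
            "SELECT ?, customerID, address, city, state, zip "
            "FROM customers WHERE customerID = ?",
            (order_id, customer_id),
        )

place_order(conn, 100, 1)

# A later change to the customer no longer touches the order.
conn.execute("UPDATE customers SET address = '2 New Avenue' WHERE customerID = 1")

snapshot = conn.execute(
    "SELECT address, city FROM orders WHERE orderID = 100"
).fetchone()
print(snapshot)
```

The trade-off is the one raised in the question: every historical attribute becomes a column on the order, so the table widens as more related entities need to be frozen.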

If you do go with the other option and use history tables, you should consider using bitemporal data, since you may have to deal with the possibility that historical data needs to be corrected. For example, a customer changes his current address from A to B, but you also have to correct the address on an existing order that is currently being fulfilled.
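To make the bitemporal idea concrete, here is one common layout, sketched with Python's sqlite3 module. Each row carries two time ranges: when the fact was true in the real world, and when the database believed it. The table name, column names, dates, and the `address_as_of` helper are all assumptions for illustration, not something prescribed above:

```python
import sqlite3

FOREVER = "9999-12-31"  # sentinel meaning "still open"

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customerAddress (
        customerID   INTEGER NOT NULL,
        address      TEXT NOT NULL,
        validFrom    TEXT NOT NULL,  -- when the address started being true
        validTo      TEXT NOT NULL,  -- when it stopped being true
        recordedAt   TEXT NOT NULL,  -- when the database learned this
        supersededAt TEXT NOT NULL   -- when a correction replaced this row
    )
""")
conn.executemany(
    "INSERT INTO customerAddress VALUES (?, ?, ?, ?, ?, ?)",
    [
        # First belief: the customer lives at A.
        (1, "A", "2023-01-01", FOREVER, "2023-01-01", "2024-03-01"),
        # On 2024-03-01 we record a move to B effective that day.
        (1, "A", "2023-01-01", "2024-03-01", "2024-03-01", "2024-03-10"),
        (1, "B", "2024-03-01", FOREVER, "2024-03-01", "2024-03-10"),
        # On 2024-03-10 we learn the move really happened on 2024-02-15.
        (1, "A", "2023-01-01", "2024-02-15", "2024-03-10", FOREVER),
        (1, "B", "2024-02-15", FOREVER, "2024-03-10", FOREVER),
    ],
)

def address_as_of(conn, customer_id, valid_at, known_at):
    """Address that was true on valid_at, as the database believed on known_at."""
    row = conn.execute(
        """
        SELECT address FROM customerAddress
        WHERE customerID = ?
          AND validFrom <= ? AND ? < validTo
          AND recordedAt <= ? AND ? < supersededAt
        """,
        (customer_id, valid_at, valid_at, known_at, known_at),
    ).fetchone()
    return row[0] if row else None

# Same real-world date, asked before and after the correction was recorded.
print(address_as_of(conn, 1, "2024-02-20", "2024-03-05"))
print(address_as_of(conn, 1, "2024-02-20", "2024-03-15"))
```

Corrections never overwrite rows; they close the old belief with supersededAt and add new rows, so both "what was true" and "what we thought was true" remain queryable.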

Also, if you are using MS SQL Server, you might want to consider using indexed views. That will allow you to trade a small incremental insert/update performance decrease for a large select performance increase. If you're not using MS SQL Server, you can replicate this using triggers and tables.
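As a sketch of the "triggers and tables" route, the example below uses SQLite through Python's sqlite3 module, since SQLite has no indexed views. A trigger keeps a precomputed report table in step with the orders table. The `orderReport` table and `orders_fill_report` trigger are hypothetical names, and the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerHistory (
        customerHistoryID INTEGER PRIMARY KEY,
        customerID        INTEGER NOT NULL,
        address           TEXT,
        city              TEXT
    );
    CREATE TABLE orders (
        orderID           INTEGER PRIMARY KEY,
        customerID        INTEGER NOT NULL,
        customerHistoryID INTEGER NOT NULL
    );
    -- Precomputed join result, playing the role of the indexed view.
    CREATE TABLE orderReport (
        orderID    INTEGER PRIMARY KEY,
        customerID INTEGER NOT NULL,
        address    TEXT,
        city       TEXT
    );
    -- Each new order pays a small extra cost to fill the report row.
    CREATE TRIGGER orders_fill_report AFTER INSERT ON orders
    BEGIN
        INSERT INTO orderReport (orderID, customerID, address, city)
        SELECT NEW.orderID, NEW.customerID, h.address, h.city
        FROM customerHistory h
        WHERE h.customerHistoryID = NEW.customerHistoryID;
    END;

    INSERT INTO customerHistory VALUES
        (1, 1, '1 Old Street', 'Springfield'),
        (2, 1, '2 New Avenue', 'Springfield');
    INSERT INTO orders VALUES (100, 1, 1), (101, 1, 2);
""")

# Reports read the flat table directly, with no join at query time.
report = conn.execute(
    "SELECT orderID, customerID, address, city FROM orderReport ORDER BY orderID"
).fetchall()
print(report)
```

Because history rows are append-only in this design, an insert trigger on orders is enough here; a table whose source rows can change would also need update and delete triggers.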


08-03 22:44