Question
I have an app that cycles through a huge number of records in a database table and performs a number of SQL and .NET operations on records within that database (currently I am using Castle.ActiveRecord on PostgreSQL).
I added some basic btree indexes on a couple of the fields and, as you would expect, the performance of the SQL operations increased substantially. To make the most of DBMS performance, I want to make better-educated choices about what I should index across all my projects.
I understand that there is a detriment to performance when doing inserts (as the database needs to update the index as well as the data), but what suggestions and best practices should I consider when creating database indexes? How do I best select the fields or combinations of fields for a set of database indexes (rules of thumb)?
Also, how do I best select which index to use as a clustered index? And when it comes to the access method, under what conditions should I use a btree over a hash, a gist, or a gin (what are they anyway?).
Recommended answer
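Some of my rules:
- Index all primary keys (I think most RDBMSs do this).
- Create more indexes only if:
  - Queries are slow.
  - You know the data volume is going to increase significantly.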
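As a small illustration of the first rule, the sketch below checks whether a primary key already has an index behind it. It uses Python's built-in `sqlite3` module only as a convenient, self-contained stand-in; the `country` table and its columns are hypothetical. PostgreSQL likewise creates a unique index for a `PRIMARY KEY` constraint, but there you would inspect it through a catalog view such as `pg_indexes` instead of the SQLite-specific `PRAGMA` used here.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical table with a non-integer primary key.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")

# PRAGMA index_list returns one row per index on the table;
# column 1 is the index name and column 2 is the "unique" flag.
index_rows = conn.execute("PRAGMA index_list('country')").fetchall()
auto_indexes = [(row[1], bool(row[2])) for row in index_rows]

# No CREATE INDEX was issued, yet a unique index already exists.
print(auto_indexes)
```

Because the index already exists, adding a second index on the same primary key column would only add write overhead.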
If a query is slow, look at the execution plan and:
- If the query on a table uses only a few columns, put all of those columns in an index so the RDBMS can answer the query from the index alone.
- Don't waste resources indexing tiny tables (hundreds of records).
- Index multiple columns in order from high cardinality to low: first the columns with more distinct values, followed by the columns with few distinct values.
- If a query needs to access more than 10% of the data, a full scan is normally better than an index.
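The workflow above (read the plan, then add an index that contains every column the query touches) can be sketched as follows. This is a minimal example under stated assumptions, not a PostgreSQL recipe: it uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for PostgreSQL's `EXPLAIN`, and the `orders` table, its columns, the index name, and the `plan` helper are all hypothetical. The index lists `customer_id` (many distinct values) before `status` (two distinct values), following the cardinality rule.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, notes TEXT)"
)
# Hypothetical data: 1000 distinct customers, 2 distinct statuses.
conn.executemany(
    "INSERT INTO orders (customer_id, status, notes) VALUES (?, ?, ?)",
    [(i % 1000, "open" if i % 2 else "closed", "x" * 20) for i in range(10000)],
)

# The query only reads customer_id and status.
query = "SELECT customer_id, status FROM orders WHERE customer_id = 42"

# Plan without any secondary index: the whole table is scanned.
before = plan(conn, query)

# Composite index holding both columns, high-cardinality column first.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)

# Plan with the index: the query can be answered from the index alone.
after = plan(conn, query)

print("before:", before)
print("after: ", after)
```

In SQLite the second plan reports a covering index, meaning the table rows are never read. PostgreSQL calls the comparable plan node an index-only scan, and whether its planner chooses it depends on table statistics and visibility information, so treat the output here as an illustration of the idea only.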