What are some best practices and "rules of thumb" for creating database indexes?

Question

I have an app that cycles through a huge number of records in a database table and performs a number of SQL and .Net operations on records within that database (currently I am using Castle.ActiveRecord on PostgreSQL).

I added some basic btree indexes on a couple of the fields and, as you would expect, the performance of the SQL operations increased substantially. To make the most of DBMS performance, I want to make better-educated choices about what I should index across all my projects.

I understand that there is a detriment to performance when doing inserts (as the database needs to update the index as well as the data), but what suggestions and best practices should I consider when creating database indexes? How do I best select the fields or combinations of fields for a set of database indexes (rules of thumb)?

Also, how do I best select which index to use as a clustered index? And when it comes to the access method, under what conditions should I use a btree over a hash, a gist, or a gin (what are they anyway)?

Recommended answer

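Some of my rules:


  • Index all primary keys (I think most RDBMSs do this).

  • Create more indexes only if:


    • Queries are slow.

    • You know the data volume is going to increase significantly.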
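To illustrate the point about primary keys being indexed for you, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, purely so that it runs without a server; the table `country` and its columns are made up for illustration, and other engines expose this information through different catalog queries.

```python
import sqlite3

# In-memory SQLite database; a stand-in for a real RDBMS (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table with a non-integer primary key.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")

# PRAGMA index_list returns one row per index on the table.
# Position 1 is the index name, position 2 is 1 for a unique index.
indexes = conn.execute("PRAGMA index_list('country')").fetchall()

for row in indexes:
    print(row[1], "unique" if row[2] else "non-unique")
```

No CREATE INDEX statement was issued, yet the table already has a unique index backing the primary key, so adding another index on the same column would only add write overhead.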

    If a query is slow, look at the execution plan and:


    • If a query against a table uses only a few columns, put all of those columns in an index; the RDBMS can then answer the query from the index alone.

    • Don't waste resources indexing tiny tables (hundreds of records).

    • Index multiple columns in order from high cardinality to low: first the columns with more distinct values, followed by the columns with fewer distinct values.

    • If a query needs to access more than 10% of the data, a full scan is normally better than an index.
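The first and third rules above can be tried out with a small sketch. It again assumes Python's built-in sqlite3 module instead of PostgreSQL so that it is self-contained; the `orders` table, its columns, and the index name are hypothetical. PostgreSQL has its own `EXPLAIN` command with a different output format, so treat this only as an illustration of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: customer_id has many distinct values, status only two.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, total REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, total, notes) VALUES (?, ?, ?, ?)",
    [(i % 1000, "open" if i % 3 else "closed", i * 1.5, "n/a") for i in range(10000)],
)

# The query touches only three of the five columns.
query = "SELECT customer_id, status, total FROM orders WHERE customer_id = ?"


def plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query, (42,))

# High-cardinality column first, low-cardinality column after it; total is
# included so that every column the query needs lives in the index.
conn.execute(
    "CREATE INDEX idx_orders_cust_status_total "
    "ON orders (customer_id, status, total)"
)
plan_after = plan(query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan reports a scan of the whole table; afterwards it reports a search using a covering index, meaning the table rows themselves are never read. Comparing the plan before and after a change like this is the quickest way to confirm that a new index is actually being used.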

