问题描述
我在使用概念上简单的
数据库时遇到了严重的性能问题。该数据库有一个表,包含天气观测。
该表目前有大约1350万行,并且正在不断更新
。 (该数据库运行在512MHz RAM的双550MHz PIII上。
我在i686-pc-linux-gnu上有PostgreSQL 7.1.3,由GCC编译2.96
on RedHat 7.2)
总的来说,查询形式如下:
SELECT? FROM obs WHERE station =?
AND valid_time< ? AND valid_time> ?
或:
SELECT? FROM obs WHERE station IN(?,?,...)
AND valid_time< ? AND valid_time> ?
像这样的查询每次花费大约4到5分钟,这对我来说似乎过于缓慢(或者我的期望过于乐观了吗?)。
例如:
SELECT站,air_temp FROM obs
WHERE station =''EGBB''
AND valid_time> ''28 / 8/03 00:00''
AND valid_time< ''28 / 10/03 00:00''
需要4分钟32秒。
以上查询的解释说:
注意:查询计划:
在obs上使用obs_pkey进行索引扫描(成本= 0.00..9.01行= 1宽度= 20)
一个简单的来自obs的SELECT count(*)查询需要花费大约一段时间
。
我已经运行了vacuumdb --analyze obs,收效甚微。
如何加快速度?我哪里错了?
表结构或索引是否有问题?是否持续更新
数据库(以每秒1-3个条目之间的速率)导致
问题?
表和索引定义如下:
表obs
属性|输入|修饰符
---------------------------- + ------------ -------------- + ----------
valid_time |带时区的时间戳|
metar_air_temp |双精度|
relative_humidity |双精度|
pressure_change |双精度|
上限|双精度|
metar_dew_point |双精度|
metar_gusts |双精度|
wet_bulb_temperature |双精度|
past_weather |文字|
公开性|双精度|
metar_visibility |双精度|
降水|双精度|
站|字符(10)|
pressure_msl |双精度|
metar_min_temperature_6hr |双精度|
precipitation_period |双精度|
metar_wet_bulb |双精度|
saturation_mixing_ratio |双精度|
metar_pressure |双精度|
metar_sky_cover |文字|
露点|双精度|
wind_direction |双精度|
actual_time |带时区的时间戳|
gust_speed |双精度|
high_cloud_type |文字|
precipitation_24hr |双精度|
metar_precipitation_24hr |双精度|
pressure_tendency |文字|
metar_relative_humidity |双精度|
low_cloud_type |文字|
metar_max_temperature_6hr |双精度|
middle_cloud_type |文字|
air_temp |双精度|
low_and_middle_cloud_cover |文字|
metar_wind_dir |双精度|
metar_weather |文字|
snow_depth |双精度|
metar_snow_depth |双精度|
min_temp_12hr |双精度|
present_weather |文字|
wind_speed |双精度|
snow_cover |文字|
metar_wind_speed |双精度|
metar_ceiling |双精度|
max_temp_12hr |双精度|
mixing_ratio |双精度|
pressure_change_3hr |双精度|
total_cloud |整数|
max_temp_24hr |双精度|
min_temp_24hr |双精度|
snow_amount_6hr |双精度|
指数:obs_pkey,
obs_station,
obs_valid_time
指数" obs_pkey" ;
属性|输入
------------ + --------------------------
valid_time |带时区的时间戳
station |字符(10)
唯一btree
索引" obs_station"
属性|输入
----------- + ---------------
station |字符(10)
btree
索引" obs_valid_time"
属性|输入
------------ + --------------------------
valid_time |带时区的时间戳
btree
(我怀疑obs_valid_time索引是多余的,因为
obs_pkey索引 - 是对吗?)
我会很感激任何建议和任何线索来帮助加快这一点。
非常感谢,
Dave
---------------------------(播出结束)--- ------------------------
提示3:如果通过Usenet发布/阅读,请发送相应的
subscribe-nomail命令以便
消息可以干净利落地到达邮件列表
I''m having severe performance issues with a conceptually simple
database. The database has one table, containing weather observations.
The table currently has about 13.5 million rows, and is being updated
constantly. (The database is running on a dual 550MHz PIII with 512MB RAM.
I have PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96
on RedHat 7.2)
On the whole, queries are of the form:
SELECT ? FROM obs WHERE station = ?
AND valid_time < ? AND valid_time > ?
or:
SELECT ? FROM obs WHERE station IN (?, ?, ...)
AND valid_time < ? AND valid_time > ?
Queries like these are taking around 4 to 5 minutes each, which seems
excessively slow to me (or are my expectations far too optimistic?).
For instance:
SELECT station, air_temp FROM obs
WHERE station = ''EGBB''
AND valid_time > ''28/8/03 00:00''
AND valid_time < ''28/10/03 00:00''
takes 4 mins 32 secs.
An EXPLAIN of the above query says:
NOTICE: QUERY PLAN:
Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20)
A simple "SELECT count(*) from obs" query takes around that sort of time
too.
I have run "vacuumdb --analyze obs", to little effect.
How can I speed this up? Where am I going wrong? Is there a problem with
the table structure, or the indexes? Does the continual updating of the
database (at the rate of somewhere between 1-3 entries per second) cause
problems?
The table and indexes are defined as follows:
Table "obs"
Attribute | Type | Modifier
----------------------------+--------------------------+----------
valid_time | timestamp with time zone |
metar_air_temp | double precision |
relative_humidity | double precision |
pressure_change | double precision |
ceiling | double precision |
metar_dew_point | double precision |
metar_gusts | double precision |
wet_bulb_temperature | double precision |
past_weather | text |
visibility | double precision |
metar_visibility | double precision |
precipitation | double precision |
station | character(10) |
pressure_msl | double precision |
metar_min_temperature_6hr | double precision |
precipitation_period | double precision |
metar_wet_bulb | double precision |
saturation_mixing_ratio | double precision |
metar_pressure | double precision |
metar_sky_cover | text |
dew_point | double precision |
wind_direction | double precision |
actual_time | timestamp with time zone |
gust_speed | double precision |
high_cloud_type | text |
precipitation_24hr | double precision |
metar_precipitation_24hr | double precision |
pressure_tendency | text |
metar_relative_humidity | double precision |
low_cloud_type | text |
metar_max_temperature_6hr | double precision |
middle_cloud_type | text |
air_temp | double precision |
low_and_middle_cloud_cover | text |
metar_wind_dir | double precision |
metar_weather | text |
snow_depth | double precision |
metar_snow_depth | double precision |
min_temp_12hr | double precision |
present_weather | text |
wind_speed | double precision |
snow_cover | text |
metar_wind_speed | double precision |
metar_ceiling | double precision |
max_temp_12hr | double precision |
mixing_ratio | double precision |
pressure_change_3hr | double precision |
total_cloud | integer |
max_temp_24hr | double precision |
min_temp_24hr | double precision |
snow_amount_6hr | double precision |
Indices: obs_pkey,
obs_station,
obs_valid_time
Index "obs_pkey"
Attribute | Type
------------+--------------------------
valid_time | timestamp with time zone
station | character(10)
unique btree
Index "obs_station"
Attribute | Type
-----------+---------------
station | character(10)
btree
Index "obs_valid_time"
Attribute | Type
------------+--------------------------
valid_time | timestamp with time zone
btree
(I suspect the obs_valid_time index is redundant, because of the
obs_pkey index - is that right?)
I''d be grateful for any advice and any clues to help speed this up.
Many thanks,
Dave
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly
推荐答案
检查 for general
调整提示,如果表经常更新,请填充真空。真空后重新索引
指数。这需要相当长的时间。
看看是否有帮助。
Shridhar
---- -----------------------(播出结束)---------------------- -----
提示8:解释分析是你的朋友
Check http://www.varlena.com/varlena/Gener...bits/perf.html for general
tuning tips and do a vacuum full if table is updated frequently. Also reindex
the indexes after vacuum. It will take quite some time though.
See if that helps.
Shridhar
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
应该多少行返回?
[解释分析会告诉你]
虽然运行是你的磁盘颠簸? vmstat的bi / bo列将
告诉你。
7.1已经很老了,但我确实理解了迁移到7.3 [或
..4beta]带有巨大的数据库
-
Jeff Trout< je ** @ jefftrout.com> ;
---------------------------(广播结束)--------------- ------------
提示9:如果您的
加入列的数据类型,规划人员将忽略您选择索引扫描的愿望不匹配
How many rows should that return?
[explain analyze will tell you that]
and while that runs is your disk thrashing? vmstat''s bi/bo columns will
tell you.
7.1 is quite old, but I do understand the pain of migraing to 7.3 [or
..4beta] with huge db''s
--
Jeff Trout <je**@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column''s datatypes do not match
" explain analyze"似乎不是这个postgres版本的一部分
(或者我误解了一些东西)。
特定查询返回了24行。
虽然运行是你的磁盘颠簸? vmstat'的bi / bo专栏将告诉你。
机器在建筑物的另一侧,所以我不能实际上
看看磁盘是否正在颠簸。
我不知道如何解释vmstat输出;运行vmstat 1在我开始查询之前,显示
我bi / bo都为零(大部分)。然后bi将
提高到2500左右(bo保持在零左右),直到查询结束。
7.1已经很老了,但我确实理解了用巨大的数据库移动到7.3 [或
.4beta]
"explain analyze" doesn''t seem to be part of this postgres version
(or I misunderstood something).
That particular query returned 24 rows.
and while that runs is your disk thrashing? vmstat''s bi/bo columns will
tell you.
The machine''s over the other side of the building, so I can''t physically
see if the disk is thrashing.
I''m not sure how to interpret the vmstat output; running "vmstat 1" shows
me bi/bo both at zero (mostly) until I start the query. Then bi shoots up
to around 2500 (bo remains around zero) until the query finishes.
7.1 is quite old, but I do understand the pain of migraing to 7.3 [or
.4beta] with huge db''s
升级是否有可能产生影响?
我仍然没有明白我的问题是由于:
1 Postgres版本
2数据库大小
3表结构
4配置问题
5慢速硬件
6以上所有
7没有以上
8别的
感谢您的帮助,
Dave。
---------------------------(广播结束)----------------- ----------
提示6:您是否搜索了我们的列表档案?
Is the upgrade likely to make a difference?
I''m still none-the-wiser wether the problem I have is due to:
1 Postgres version
2 Database size
3 Table structure
4 Configuration issues
5 Slow hardware
6 All of the above
7 None of the above
8 Something else
Thanks for the help,
Dave.
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
这篇关于查询性能慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!