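I have an SQL query that selects from two inner-joined tables. Executing the select statement takes about 50 seconds. However, fetchall() takes 788 seconds and fetches only 981 results. Here is the query and the fetch code: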

time0 = time.time()
self.cursor.execute("SELECT spectrum_id, feature_table_id "+
                    "FROM spectrum AS s "+
                    "INNER JOIN feature AS f "+
                    "ON f.msrun_msrun_id = s.msrun_msrun_id "+
                    "INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax "+
                                 "FROM convexhull GROUP BY feature_feature_table_id) AS t "+
                    "ON t.feature_feature_table_id = f.feature_table_id "+
                    "WHERE s.msrun_msrun_id = ? "+
                    "AND s.scan_start_time >= t.rtMin "+
                    "AND s.scan_start_time <= t.rtMax "+
                    "AND base_peak_mz >= t.mzMin "+
                    "AND base_peak_mz <= t.mzMax", spectrumFeature_InputValues)
print('query took: %s seconds' % (time.time() - time0))

time0 = time.time()
spectrumAndFeature_ids = self.cursor.fetchall()
print('fetchall took: %s seconds' % (time.time() - time0))


Is there a reason why fetchall takes so long?



Update

Doing this:

while 1:
    info = self.cursor.fetchone()
    if info:
        <do something>
    else:
        break


is just as fast (or slow) as this:

allInfo = self.cursor.fetchall()
for info in allInfo:
    <do something>

Best answer

By default, fetchall() is as slow as looping over fetchone(), because the arraysize of the Cursor object is set to 1.

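To speed things up, you can loop over fetchmany(). To see a performance gain, though, you need to pass it a size argument larger than 1; otherwise it fetches "many" in batches of arraysize, that is, 1.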
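
As a minimal sketch of that fetchmany() loop: the helper name iter_rows, the demo table, and the batch size of 1000 are assumptions made up for illustration, not part of the original question. Whether a given batch size actually helps is something to measure on the real query.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an already executed cursor, batch_size rows per fetch."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty list means the result set is exhausted.
            break
        for row in rows:
            yield row


# Hypothetical in-memory table, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
cu = conn.cursor()
cu.execute("CREATE TABLE demo (id INTEGER)")
cu.executemany("INSERT INTO demo (id) VALUES (?)", [(i,) for i in range(2500)])

cu.execute("SELECT id FROM demo ORDER BY id")
ids = [row[0] for row in iter_rows(cu, batch_size=1000)]
print("%d rows fetched" % len(ids))
```

In the question's code, the same helper could wrap self.cursor after the execute() call, replacing the fetchone() loop.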

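You may well get the performance gain simply by raising the value of arraysize, but I have no experience doing that, so you may want to experiment with it first by doing something like: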

>>> import sqlite3
>>> conn = sqlite3.connect(":memory:")
>>> cu = conn.cursor()
>>> cu.arraysize
1
>>> cu.arraysize = 10
>>> cu.arraysize
10
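
To see what arraysize changes in practice, here is a small sketch that calls fetchmany() without a size argument before and after raising arraysize. The demo table and its 25 rows are assumptions for illustration only; the sketch shows batch sizes, not timings, so any speed-up on the real query still has to be measured.

```python
import sqlite3

# Hypothetical in-memory table with 25 rows.
conn = sqlite3.connect(":memory:")
cu = conn.cursor()
cu.execute("CREATE TABLE demo (id INTEGER)")
cu.executemany("INSERT INTO demo (id) VALUES (?)", [(i,) for i in range(25)])

# With the default arraysize of 1, fetchmany() returns a single row.
cu.execute("SELECT id FROM demo ORDER BY id")
default_batch = cu.fetchmany()

# After raising arraysize, fetchmany() returns up to 10 rows per call.
cu.arraysize = 10
cu.execute("SELECT id FROM demo ORDER BY id")
batch_sizes = []
while True:
    rows = cu.fetchmany()
    if not rows:
        break
    batch_sizes.append(len(rows))

print("default batch: %d row(s)" % len(default_batch))
print("batches with arraysize=10: %s" % batch_sizes)
```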


More on the above here: http://docs.python.org/library/sqlite3.html#sqlite3.Cursor.fetchmany

Source: the Stack Overflow question "performance - Is it normal for sqlite.fetchall() to be so slow?": https://stackoverflow.com/questions/10336492/
