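I have an SQL query that selects from two inner-joined tables. Executing the select statement takes about 50 seconds. However, fetchall() takes 788 seconds and fetches only 981 results. Here is the query and the fetch code: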
time0 = time.time()
self.cursor.execute("SELECT spectrum_id, feature_table_id "+
"FROM spectrum AS s "+
"INNER JOIN feature AS f "+
"ON f.msrun_msrun_id = s.msrun_msrun_id "+
"INNER JOIN (SELECT feature_feature_table_id, min(rt) AS rtMin, max(rt) AS rtMax, min(mz) AS mzMin, max(mz) as mzMax "+
"FROM convexhull GROUP BY feature_feature_table_id) AS t "+
"ON t.feature_feature_table_id = f.feature_table_id "+
"WHERE s.msrun_msrun_id = ? "+
"AND s.scan_start_time >= t.rtMin "+
"AND s.scan_start_time <= t.rtMax "+
"AND base_peak_mz >= t.mzMin "+
"AND base_peak_mz <= t.mzMax", spectrumFeature_InputValues)
print 'query took:',time.time()-time0,'seconds'
time0 = time.time()
spectrumAndFeature_ids = self.cursor.fetchall()
print time.time()-time0,'seconds since to fetchall'
Is there a reason why fetchall takes so long?
Update
Doing:
while 1:
    info = self.cursor.fetchone()
    if info:
        <do something>
    else:
        break
runs at the same speed as:
allInfo = self.cursor.fetchall()
for info in allInfo:
    <do something>
Accepted answer
By default, fetchall() is as slow as looping over fetchone(), because the arraysize of the Cursor object is set to 1.
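To speed things up you can loop over fetchmany(), but to see a performance gain you need to pass it a size parameter larger than 1. Otherwise it fetches "many" in batches of arraysize, i.e. 1.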
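The fetchmany() loop described above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the `demo` table, the `iter_rows` helper and the batch size of 500 are all made up for the example, and the right batch size for a real query has to be found by measuring.

```python
import sqlite3


def iter_rows(cursor, batch_size=500):
    """Yield rows one by one while pulling them from the cursor in batches.

    iter_rows is a hypothetical helper; batch_size is an arbitrary choice.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # An empty list means the result set is exhausted.
            break
        for row in batch:
            yield row


# Throwaway in-memory table, only so that the sketch runs on its own.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, value TEXT)")
cur.executemany(
    "INSERT INTO demo (value) VALUES (?)",
    [("row %d" % i,) for i in range(1200)],
)

cur.execute("SELECT id, value FROM demo ORDER BY id")
rows = list(iter_rows(cur, batch_size=500))
conn.close()
```

In the question's code the same loop would wrap self.cursor after the execute() call, with `<do something>` applied to each yielded row.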
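It's quite possible that you can get the performance gain simply by raising the value of arraysize, but I have no experience doing this, so you may want to experiment with it first by doing something like: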
>>> import sqlite3
>>> conn = sqlite3.connect(":memory:")
>>> cu = conn.cursor()
>>> cu.arraysize
1
>>> cu.arraysize = 10
>>> cu.arraysize
10
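One way to run that experiment is to time fetchall() under several arraysize values. The sketch below does only that; the `demo` table, the row count and the `time_fetchall` helper are invented for illustration, and it makes no assumption about whether the timings will actually differ on your setup, since that is what it measures.

```python
import sqlite3
import time


def time_fetchall(conn, arraysize):
    """Run the same SELECT and time only the fetchall() call.

    time_fetchall is a hypothetical helper, not part of sqlite3.
    """
    cur = conn.cursor()
    cur.arraysize = arraysize
    cur.execute("SELECT id, value FROM demo")
    start = time.time()
    rows = cur.fetchall()
    elapsed = time.time() - start
    cur.close()
    return len(rows), elapsed


# Throwaway in-memory data; replace with the real database and query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO demo (value) VALUES (?)",
    [("row %d" % i,) for i in range(50000)],
)

results = {}
for size in (1, 10, 100, 1000):
    results[size] = time_fetchall(conn, size)
    print("arraysize=%d: %d rows in %.4f s" % ((size,) + results[size]))
conn.close()
```

Timing execute() and fetchall() separately, as the question already does, shows which of the two steps any change actually affects.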
More on the above: http://docs.python.org/library/sqlite3.html#sqlite3.Cursor.fetchmany
Source: the Stack Overflow question on whether it is normal for sqlite fetchall() to be this slow: https://stackoverflow.com/questions/10336492/