问题描述
我无法弄清楚如何过滤器功能如此之快。我已经使用过滤器的各种数据,不管数据类型,过滤器涂掉任何替代方法,我使用。我经常使用的二进制搜索算法和书面由斯蒂芬·布伦(在QuickArraySort算法专业Excel的发展)。二进制搜索是快如闪电(以最快的速度过滤功能,因为该数组排序)和快速排序算法是已知最快的排序算法之一。
我已经写了一些测试code以下比较发现速度在一个非常大的数组(大小= 200万)的随机元素。我故意填充一个未有序的方式排列(应当指出的是,我曾尝试过各种非有序分配方法,其结果是相似的,无论分配方法)。
子SearchTest()
昏暗我长,strMyArray()作为字符串,lngSize长,strTest作为字符串
昏暗TimeBinarySearch长,TimeFilterSearch只要
昏暗lngResultBinary长,lngResultFilter只要
昏暗StartHour长,StartMinute长,StartSecond只要
昏暗StartMiliSecond长,开始时间长
昏暗EndHour长,EndMinute长,EndSecond只要
昏暗EndMiliSecond长,结束时间长
lngSize = 2000000
strTest = CStr的(1735674 * 987)
REDIM strMyArray(lngSize)
对于i = 1到UBound函数(strMyArray)
如果我国防部2 = 0然后
strMyArray(ⅰ)= CStr的((ⅰ - 1)* 987)
其他
strMyArray(ⅰ)= CStr的(第(i + 1)* 987)
结束如果
接下来我
''过滤器测试
******************* ******************
StartHour =小时(NOW())* 60 * 60 * 1000
StartMinute =分(NOW())* 60 * 1000
StartSecond =二(NOW())* 1000
StartMiliSecond =格式(NOW(),MS)
开始时间= StartHour + StartMinute + StartSecond + StartMiliSecond
lngResultFilter = CLng函数(筛选(strMyArray,strTest)(0))
EndHour =小时(NOW())* 60 * 60 * 1000
EndMinute =分(NOW())* 60 * 1000
EndSecond =二(NOW())* 1000
EndMiliSecond =格式(NOW(),MS)
结束时间= EndHour + EndMinute + EndSecond + EndMiliSecond
TimeFilterSearch =结束时间 - 开始时间
******************* ******************
''二进制测试
******************* ******************
StartHour =小时(NOW())* 60 * 60 * 1000
StartMinute =分(NOW())* 60 * 1000
StartSecond =二(NOW())* 1000
StartMiliSecond =格式(NOW(),MS)
开始时间= StartHour + StartMinute + StartSecond + StartMiliSecond
QuickSortString1D strMyArray
lngResultBinary = strMyArray(CLng函数(BinarySearchString(strTest,strMyArray)))
EndHour =小时(NOW())* 60 * 60 * 1000
EndMinute =分(NOW())* 60 * 1000
EndSecond =二(NOW())* 1000
EndMiliSecond =格式(NOW(),MS)
结束时间= EndHour + EndMinute + EndSecond + EndMiliSecond
TimeBinarySearch =结束时间 - 开始时间
******************* ******************
MSGBOX lngResultFilter和放大器; VBCR和放大器; VBCR和放大器; lngResultBinary
MSGBOX TimeFilterSearch和放大器; VBCR和放大器; VBCR和放大器; TimeBinarySearch
结束小组
子QuickSortString1D(为ByRef saArray()作为字符串_
可选BYVAL bSortAscending由于布尔=真,_
可选BYVAL lLow1为Variant,_
可选BYVAL lHigh1为Variant)
维变量
昏暗lLow2只要
昏暗lHigh2只要
昏暗SKEY作为字符串
昏暗sSwap作为字符串
对错误转到ErrorExit
如果没有提供,排序整个数组
如果ISMISSING(lLow1)然后lLow1 = LBOUND(saArray)
如果ISMISSING(lHigh1)然后lHigh1 = UBound函数(saArray)
设置新的极端旧极端
lLow2 = lLow1
lHigh2 = lHigh1
获取新的极端中间数组项的值
SKEY = saArray((lLow1 + lHigh1)\ 2)
'循环的极端之间的阵列中的所有项
做,而lLow2< lHigh2
如果bSortAscending然后
查找大于中点项中的第一项
做,而saArray(lLow2)< SKEY而lLow2< lHigh1
lLow2 = lLow2 + 1
循环
查找小于中点项的最后一个项目
做,而saArray(lHigh2)> SKEY而lHigh2> lLow1
lHigh2 = lHigh2 - 1
循环
其他
查找小于中点项中的第一项
做,而saArray(lLow2)> SKEY而lLow2< lHigh1
lLow2 = lLow2 + 1
循环
查找大于中点项目的最后一项
做,而saArray(lHigh2)< SKEY而lHigh2> lLow1
lHigh2 = lHigh2 - 1
循环
结束如果
如果这两个项目都是以错误的顺序,调剂行
如果lLow2< lHigh2然后
sSwap = saArray(lLow2)
saArray(lLow2)= saArray(lHigh2)
saArray(lHigh2)= sSwap
结束如果
如果指针不在一起,前进到下一个项目
如果lLow2< = lHigh2然后
lLow2 = lLow2 + 1
lHigh2 = lHigh2 - 1
结束如果
循环
'递归到极端的下半部排序
如果lHigh2> lLow1然后
QuickSortString1D saArray,bSortAscending,lLow1,lHigh2
结束如果
'递归到极端的上半部分进行排序
如果lLow2< lHigh1然后
QuickSortString1D saArray,bSortAscending,lLow2,lHigh1
结束如果
ErrorExit:
结束小组
******************* **********
评语:使用二进制搜索算法来快速定位
字符串排序数组中的字符串
参数:sLookFor的字符串数组中搜索
saArray字符串数组,升序排列
lMethod无论vbBinaryCompare或vbTextCompare
默认为vbTextCompare
lNotFound要返回的值,如果文本没有
发现。默认为-1
返回:长的位置位于阵列中,
或者lNotFound如果未找到
日期开发行动
--------------------------------
04年6月2日斯蒂芬·布伦创建
功能BinarySearchString(为ByRef sLookFor作为字符串_
为ByRef saArray()作为字符串_
可选BYVAL lMethod作为VbCompareMethod = vbTextCompare,_
可选BYVAL lNotFound只要= -1)只要
昏暗lLow只要
昏暗LMID只要
昏暗lHigh只要
昏暗LCOMP只要
对错误转到ErrorExit
'假设我们没有发现它
BinarySearchString = lNotFound
获取起始位置
lLow = LBOUND(saArray)
lHigh = UBound函数(saArray)
做
查找数组的中点
LMID =(lLow + lHigh)\ 2
中点元素进行比较,以被搜索的字符串的
LCOMP = STRCOMP(saArray(LMID),sLookFor,lMethod)
如果LCOMP = 0则
我们发现它,所以返回的位置和退出
BinarySearchString = LMID
退出待办事项
elseif的LCOMP = 1,则
中点产品比我们大 - 扔掉的上半部分
lHigh = LMID - 1
其他
中点产品比我们小 - 扔掉下半区
lLow = LMID + 1
结束如果
继续,直到我们的三分球越过
循环直到lLow> lHigh
ErrorExit:
端功能
编辑:看来,我应该首先做了一些野蛮的力量测试。通过简单地遍历数组以线性方式约翰·科尔曼提出的过滤功能执行,返回时间相同的情况下上述0毫秒。请看下图:
子Test3的()
昏暗我长,strMyArray()作为字符串,lngSize长,strTest作为字符串
昏暗lngResultBrute长,TimeBruteSearch只要
lngSize = 2000000
strTest = CStr的(936740 * 97)
REDIM strMyArray(lngSize)
对于i = 1到UBound函数(strMyArray)
如果我国防部2 = 0然后
strMyArray(I)= CStr的((I - 1)* 97)
其他
strMyArray(ⅰ)= CStr的(第(i + 1)* 97)
结束如果
接下来我
开始时间=计时器
蛮力搜索
对于i = 1到UBound函数(strMyArray)
如果strMyArray(ⅰ)= strTest然后
lngResultBrute = CLng函数(strTest)
退出对于
结束如果
接下来我
结束时间=计时器
TimeBruteSearch =结束时间 - 开始时间
MSGBOX TimeBruteSearch
结束小组
过滤器
并使用线性搜索 - 它只是执行它减轻快,因为它是在高度实施优化的C / C ++ code。看到这一点,运行以下code:
功能RandString(正长)作为字符串
返回一个随机字符串B-Z
昏暗我只要
昏暗的参考译文字符串
对于i = 1到n
S = S&放大器; CHR(66 + INT(25 *的Rnd()))
接下来我
RandString = S
端功能
子测试()
昏暗的时间(1〜20)作为双
昏暗我长,N当
暗淡了()作为字符串
昏暗的开端,双
昏暗的参考译文字符串
随机
S = RandString(99)
REDIM A(1 200万)
对于i = 1 200万
A(I)= S + RandString(1)
接下来我
S = S&放大器; 一个
对于i = 20至1步骤-1
N = I * 100000
REDIM preserve A(1到N)
启动=计时器
Debug.Print UBound函数(过滤器(A,S))应为-1
次(I)=定时器 - 启动
接下来我
对于i = 1到20
细胞(ⅰ,1)= I
细胞(1,2)=倍(ⅰ)
接下来我
结束小组
此code创建的2000000随机串长度100,其每一个的不同之处的最后一个位置目标字符串的数组。然后将其送至子阵列,其尺寸分别为10的倍数为过滤器
,定时所花费的时间。输出看起来是这样的:
清晰的线性趋势并不完全证明,但强有力的证据表明VBA的过滤器
正在执行一个简单的线性搜索。
I can't figure out how the Filter function works so fast. I have used Filter on all sorts of data and regardless of data-type, Filter obliterates any alternative method I employ. I regularly use the Binary search algorithm and the QuickArraySort algorithm written by Stephen Bullen (found in Professional Excel Development). The Binary Search is lightning fast (as fast as the Filter function, given that the array is sorted) and the Quick Sort algorithm is one of the fastest sorting algorithms known.
I have written some test code below comparing speeds of finding a random element in a very large array (size = 2,000,000). I intentionally populate the array in an un-ordered fashion (it should be noted that I have tried various un-ordered assigning methods, and the results are similar regardless of assigning method).
Sub SearchTest()
Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim TimeBinarySearch As Long, TimeFilterSearch As Long
Dim lngResultBinary As Long, lngResultFilter As Long
Dim StartHour As Long, StartMinute As Long, StartSecond As Long
Dim StartMiliSecond As Long, StartTime As Long
Dim EndHour As Long, EndMinute As Long, EndSecond As Long
Dim EndMiliSecond As Long, EndTime As Long
lngSize = 2000000
strTest = CStr(1735674 * 987)
ReDim strMyArray(lngSize)
For i = 1 To UBound(strMyArray)
If i Mod 2 = 0 Then
strMyArray(i) = CStr((i - 1) * 987)
Else
strMyArray(i) = CStr((i + 1) * 987)
End If
Next i
''Filter Test
'*******************************************************************
StartHour = Hour(Now()) * 60 * 60 * 1000
StartMinute = Minute(Now()) * 60 * 1000
StartSecond = Second(Now()) * 1000
StartMiliSecond = Format(Now(), "ms")
StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond
lngResultFilter = CLng(Filter(strMyArray, strTest)(0))
EndHour = Hour(Now()) * 60 * 60 * 1000
EndMinute = Minute(Now()) * 60 * 1000
EndSecond = Second(Now()) * 1000
EndMiliSecond = Format(Now(), "ms")
EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond
TimeFilterSearch = EndTime - StartTime
'*******************************************************************
''Binary Test
'*******************************************************************
StartHour = Hour(Now()) * 60 * 60 * 1000
StartMinute = Minute(Now()) * 60 * 1000
StartSecond = Second(Now()) * 1000
StartMiliSecond = Format(Now(), "ms")
StartTime = StartHour + StartMinute + StartSecond + StartMiliSecond
QuickSortString1D strMyArray
lngResultBinary = strMyArray(CLng(BinarySearchString(strTest, strMyArray)))
EndHour = Hour(Now()) * 60 * 60 * 1000
EndMinute = Minute(Now()) * 60 * 1000
EndSecond = Second(Now()) * 1000
EndMiliSecond = Format(Now(), "ms")
EndTime = EndHour + EndMinute + EndSecond + EndMiliSecond
TimeBinarySearch = EndTime - StartTime
'*******************************************************************
MsgBox lngResultFilter & vbCr & vbCr & lngResultBinary
MsgBox TimeFilterSearch & vbCr & vbCr & TimeBinarySearch
End Sub
Both methods return the same result, however the Filter method's return time is 0 ms and the QuickSort/BinarySearch method's return time is nearly 20 seconds. That is a huge difference!! As mentioned earlier, if the array is sorted the binary search method returns 0 ms as well (As most know, binary search requires that the array is sorted to begin with)
So, how can the Filter function look through 2,000,000 un-sorted entries and find the correct result immediately? It can't simply loop through every entry and compare it with the filtervalue (this is by far the slowest method), but based off of these preliminary test, it can't be utilizing a binary search either, as it would have to sort the array first. Even if there was an awesome sorting algorithm that was already compiled, I find it hard to believe that it could sort an array of size greater than a million instantaneously.
By the way, below is the QuickSort algorithm and the Binary Search algorithm.
Sub QuickSortString1D(ByRef saArray() As String, _
Optional ByVal bSortAscending As Boolean = True, _
Optional ByVal lLow1 As Variant, _
Optional ByVal lHigh1 As Variant)
'Dimension variables
Dim lLow2 As Long
Dim lHigh2 As Long
Dim sKey As String
Dim sSwap As String
On Error GoTo ErrorExit
'If not provided, sort the entire array
If IsMissing(lLow1) Then lLow1 = LBound(saArray)
If IsMissing(lHigh1) Then lHigh1 = UBound(saArray)
'Set new extremes to old extremes
lLow2 = lLow1
lHigh2 = lHigh1
'Get value of array item in middle of new extremes
sKey = saArray((lLow1 + lHigh1) \ 2)
'Loop for all the items in the array between the extremes
Do While lLow2 < lHigh2
If bSortAscending Then
'Find the first item that is greater than the mid-point item
Do While saArray(lLow2) < sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is less than the mid-point item
Do While saArray(lHigh2) > sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
Else
'Find the first item that is less than the mid-point item
Do While saArray(lLow2) > sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is greater than the mid-point item
Do While saArray(lHigh2) < sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
End If
'If the two items are in the wrong order, swap the rows
If lLow2 < lHigh2 Then
sSwap = saArray(lLow2)
saArray(lLow2) = saArray(lHigh2)
saArray(lHigh2) = sSwap
End If
'If the pointers are not together, advance to the next item
If lLow2 <= lHigh2 Then
lLow2 = lLow2 + 1
lHigh2 = lHigh2 - 1
End If
Loop
'Recurse to sort the lower half of the extremes
If lHigh2 > lLow1 Then
QuickSortString1D saArray, bSortAscending, lLow1, lHigh2
End If
'Recurse to sort the upper half of the extremes
If lLow2 < lHigh1 Then
QuickSortString1D saArray, bSortAscending, lLow2, lHigh1
End If
ErrorExit:
End Sub
'***********************************************************
' Comments: Uses a binary search algorithm to quickly locate
' a string within a sorted array of strings
'
' Arguments: sLookFor The string to search for in the array
' saArray An array of strings, sorted ascending
' lMethod Either vbBinaryCompare or vbTextCompare
' Defaults to vbTextCompare
' lNotFound The value to return if the text isn’t
' found. Defaults to -1
'
' Returns: Long The located position in the array,
' or lNotFound if not found
'
' Date Developer Action
' ———————————————————————————————-
' 02 Jun 04 Stephen Bullen Created
'
Function BinarySearchString(ByRef sLookFor As String, _
ByRef saArray() As String, _
Optional ByVal lMethod As VbCompareMethod = vbTextCompare, _
Optional ByVal lNotFound As Long = -1) As Long
Dim lLow As Long
Dim lMid As Long
Dim lHigh As Long
Dim lComp As Long
On Error GoTo ErrorExit
'Assume we didn’t find it
BinarySearchString = lNotFound
'Get the starting positions
lLow = LBound(saArray)
lHigh = UBound(saArray)
Do
'Find the midpoint of the array
lMid = (lLow + lHigh) \ 2
'Compare the mid-point element to the string being searched for
lComp = StrComp(saArray(lMid), sLookFor, lMethod)
If lComp = 0 Then
'We found it, so return the location and quit
BinarySearchString = lMid
Exit Do
ElseIf lComp = 1 Then
'The midpoint item is bigger than us - throw away the top half
lHigh = lMid - 1
Else
'The midpoint item is smaller than us - throw away the bottom half
lLow = lMid + 1
End If
'Continue until our pointers cross
Loop Until lLow > lHigh
ErrorExit:
End Function
Edit: It seems I should have done some "brute" force tests first. By simply looping through the array in a linear fashion as John Coleman suggests the Filter function performs, the return time for the same scenario above is 0 ms. See below:
Sub Test3()
Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim lngResultBrute As Long, TimeBruteSearch As Long
lngSize = 2000000
strTest = CStr(936740 * 97)
ReDim strMyArray(lngSize)
For i = 1 To UBound(strMyArray)
If i Mod 2 = 0 Then
strMyArray(i) = CStr((i - 1) * 97)
Else
strMyArray(i) = CStr((i + 1) * 97)
End If
Next i
StartTime = Timer
' Brute force search
For i = 1 To UBound(strMyArray)
If strMyArray(i) = strTest Then
lngResultBrute = CLng(strTest)
Exit For
End If
Next i
EndTime = Timer
TimeBruteSearch = EndTime - StartTime
MsgBox TimeBruteSearch
End Sub
Filter
does use a linear search -- it just executes it lightening quick because it is implemented in highly optimized C/C++ code. To see this, run the following code:
Function RandString(n As Long) As String
'returns a random string in B-Z
Dim i As Long
Dim s As String
For i = 1 To n
s = s & Chr(66 + Int(25 * Rnd()))
Next i
RandString = s
End Function
Sub test()
Dim times(1 To 20) As Double
Dim i As Long, n As Long
Dim A() As String
Dim start As Double
Dim s As String
Randomize
s = RandString(99)
ReDim A(1 To 2000000)
For i = 1 To 2000000
A(i) = s + RandString(1)
Next i
s = s & "A"
For i = 20 To 1 Step -1
n = i * 100000
ReDim Preserve A(1 To n)
start = Timer
Debug.Print UBound(Filter(A, s)) 'should be -1
times(i) = Timer - start
Next i
For i = 1 To 20
Cells(i, 1) = i
Cells(i, 2) = times(i)
Next i
End Sub
This code creates an array of 2,000,000 random strings of length 100, each of which differs from the target string in the last position. Then it feeds subarrays whose sizes are multiples of 100,000 into Filter
, timing the time it takes. The output looks like this:
The clear linear trend doesn't exactly prove but is strong evidence that VBA's Filter
is executing a straightforward linear search.
这篇关于使用VBA过滤功能时,性能注意事项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!