问题描述
我处理了一个很长的excel文件(最多11000行和7列),该文件在一个单元格内有很多重复的数据.我正在寻找摆脱它的宏,但找不到任何宏.
I was handled a very long excel file (up to 11000 rows and 7 columns) that has many repeated data inside a cell. I am looking for a macro to get rid of it but couldn't find any.
其中一个这样的单元格的示例:
Example of one such cells:
Ciencias de laEducación,Educación,佩达哥亚,Ciencias de laEducación,Educación,Pedagogía
Ciencias de la Educación,Educación,Pedagogía,Ciencias de la Educación,Educación,Pedagogía
它应该像这样:
Cedencias de laEducación,Educación,佩达哥亚(Pedagogía)
Ciencias de la Educación,Educación,Pedagogía
我如何摆脱成千上万的重复(更不用说多余的,孤立的逗号了?)?
How can I get rid of the thousands of repeats (not to mention the extra, orphaned, commas)?
推荐答案
此代码在我的计算机上运行6秒,在@SiddharthRout的计算机上运行2秒:)(数据包含在单元格A1:G20000
中:20000x7 = 140000非空单元格中)
This code runs 6 seconds on my machine and 2 seconds on @SiddharthRout's machine:)(with data in cells A1:G20000
: 20000x7=140000 non empty cells)
Sub test2()
Dim c, arr, el, data, it
Dim start As Date
Dim targetRange As Range
Dim dict As Object
Set dict = CreateObject("Scripting.dictionary")
Application.ScreenUpdating = False
Set targetRange = Range("A1:G20000")
data = targetRange
start = Now
For i = LBound(data) To UBound(data)
For j = LBound(data, 2) To UBound(data, 2)
c = data(i, j)
dict.RemoveAll
arr = Split(c, ",")
For Each el In arr
On Error Resume Next
dict.Add Trim(el), Trim(el)
On Error GoTo 0
Next
c = ""
For Each it In dict.Items
c = c & it & ","
Next
If c <> "" Then c = Left(c, Len(c) - 1)
data(i, j) = c
Next j
Next i
targetRange = data
Application.ScreenUpdating = True
MsgBox "Working time: " & Format(Now - start, "hh:nn:ss")
End Sub
您可以通过更改后两行
You can make this code slightly faster by changing next two lines
Dim dict As Object
Set dict = CreateObject("Scripting.dictionary")
到
Dim dict As new Dictionary
添加对库的引用后:转到工具->参考并选择"Microsoft脚本运行时"
after adding reference to library: go to Tools->References and select "Microsoft Scripting Runtime"
这篇关于excel-如何删除Excel中单元格内用逗号分隔的重复项?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!