Question
Apologies if this has already been asked and answered, but I couldn't find a satisfactory answer.
I have a list of chemical formulas containing, in this order, C, H, N and O, and I would like to pull out the number after each of these letters. The problem is that not all the formulas contain an N, although all of them contain C, H and O. The number can have one or two digits or, in the case of H only, three digits.
So the data looks like this:
- C20H37N1O5
- C10H12O3
- C20H19N3O4
- C23H40O3
- C9H13N1O3
- C14H26O4
- C58H100N2O9
I'd like the number for each element in a separate column. So for the first example it would be:
20 37 1 5
I've been trying:
=IFERROR(MID(LEFT(A2,FIND("H",A2)-1),FIND("C",A2)+1,LEN(A2)),"")
to separate out the number after C. However, after this I get stuck, because the number after H is followed by either an O or an N.
Is there an Excel formula or VBA that can do this?
Recommended answer
Use Regular Expressions
This is a good task for regular expressions (regex). Because VBA doesn't support regular expressions out of the box, we first need to reference a Windows library.
- Add a reference to the regex library: under Tools, then References, select Microsoft VBScript Regular Expressions 5.5.
- Add this function to a module:
Option Explicit

Public Function ChemRegex(ChemFormula As String, Element As String) As Long
    Dim strPattern As String
    strPattern = "([CNHO])([0-9]*)"
    'this pattern is limited to the elements C, N, H and O only.

    Dim regEx As New RegExp
    Dim Matches As MatchCollection, m As Match

    If strPattern <> "" Then
        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        Set Matches = regEx.Execute(ChemFormula)
        For Each m In Matches
            'SubMatches(0) is the element symbol, SubMatches(1) the digits after it
            If m.SubMatches(0) = Element Then
                ChemRegex = IIf(Not m.SubMatches(1) = vbNullString, m.SubMatches(1), 1)
                'this IIf ensures that in CH4O the C and O are counted as 1
                Exit For
            End If
        Next m
    End If
End Function
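If setting the library reference is not convenient, the same logic can probably be written with late binding. The following is a minimal sketch, not part of the original answer: the name ChemRegexLate is hypothetical, and it assumes a Windows machine where CreateObject("VBScript.RegExp") is available. It replaces IIf with a plain If block so that the digits are converted explicitly with CLng.

```vba
' Late-bound variant of ChemRegex: no library reference required.
' ChemRegexLate is a hypothetical name chosen for this sketch.
Public Function ChemRegexLate(ByVal ChemFormula As String, ByVal Element As String) As Long
    Dim regEx As Object
    Dim matches As Object
    Dim m As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Global = True
        .IgnoreCase = False
        .Pattern = "([CNHO])([0-9]*)"   ' same pattern as above: C, N, H and O only
    End With

    Set matches = regEx.Execute(ChemFormula)
    For Each m In matches
        If m.SubMatches(0) = Element Then
            If Len(m.SubMatches(1)) > 0 Then
                ChemRegexLate = CLng(m.SubMatches(1))
            Else
                ChemRegexLate = 1       ' no digits after the symbol means one atom
            End If
            Exit For                    ' only the first occurrence is used
        End If
    Next m
End Function
```

An element that does not occur in the formula leaves the return value at 0, the default for Long, so a formula without N gives 0 in the N column.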
Use the function like this in a cell formula, e.g. in cell B2 (B$1 refers to a header cell holding the element symbol, so put C, H, N and O in B1:E1):
=ChemRegex($A2,B$1)
and copy it to the other cells.
Also recognize chemical formulas with multiple occurrences of elements, like CH3OH or CH2COOH
Note that the code above cannot correctly count a formula like CH3OH, where an element occurs more than once. Only the first H3 is counted; the last H is ignored.
If you also need to recognize formulas in a format like CH3OH or CH2COOH (and sum up the occurrences of each element), then you need to change the code inside the loop as follows:
If m.SubMatches(0) = Element Then
    ChemRegex = ChemRegex + IIf(Not m.SubMatches(1) = vbNullString, m.SubMatches(1), 1)
    'Exit For needs to be removed.
End If
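Put together, the summing version might look like the sketch below. It is an illustration rather than the answer's own code: the name ChemRegexSum is hypothetical, and it uses CreateObject("VBScript.RegExp") (late binding, Windows only) so that it runs without the library reference.

```vba
' Sums every occurrence of Element in ChemFormula, e.g. H in CH3OH gives 4.
' ChemRegexSum is a hypothetical name chosen for this sketch.
Public Function ChemRegexSum(ByVal ChemFormula As String, ByVal Element As String) As Long
    Dim regEx As Object
    Dim matches As Object
    Dim m As Object
    Dim total As Long

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Global = True
        .IgnoreCase = False
        .Pattern = "([CNHO])([0-9]*)"   ' limited to C, N, H and O
    End With

    Set matches = regEx.Execute(ChemFormula)
    For Each m In matches
        If m.SubMatches(0) = Element Then
            If Len(m.SubMatches(1)) > 0 Then
                total = total + CLng(m.SubMatches(1))
            Else
                total = total + 1       ' a bare symbol counts as one atom
            End If
            ' no Exit For here: later occurrences must be added too
        End If
    Next m

    ChemRegexSum = total
End Function
```

For CH2COOH this should give 2 for C, 3 for H and 2 for O.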
In addition to the change above for multiple occurrences of elements, use this pattern, which also matches two-letter element symbols such as Ca or Cl:
strPattern = "([A-Z][a-z]?)([0-9]*)" 'https://regex101.com/r/nNv8W6/2
- Note that the symbols need to be in the correct upper/lower case. CaCl2 works, but cacl2 or CACL2 does not: cacl2 produces no matches at all, and CACL2 is read as the single-letter symbols C, A, C and L2.
- Note that this does not check whether these letter combinations are existing elements of the periodic table. So it will also recognize e.g. Xx2Zz5Q as the fictitious elements Xx = 2, Zz = 5 and Q = 1.
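To see what the generic pattern actually extracts from a given formula, a small helper can list every symbol with its summed count. This is only a sketch under assumptions: ListElements is a hypothetical name, and it relies on the late-bound Windows objects VBScript.RegExp and Scripting.Dictionary.

```vba
' Returns a summary such as "Ca=1, Cl=2" for the formula passed in.
' ListElements is a hypothetical helper, not part of the original answer.
Public Function ListElements(ByVal ChemFormula As String) As String
    Dim regEx As Object, matches As Object, counts As Object, m As Object
    Dim sym As String, n As Long
    Dim key As Variant, result As String

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    regEx.IgnoreCase = False
    regEx.Pattern = "([A-Z][a-z]?)([0-9]*)"    ' one capital letter, optional lowercase letter

    Set counts = CreateObject("Scripting.Dictionary")
    Set matches = regEx.Execute(ChemFormula)
    For Each m In matches
        sym = m.SubMatches(0)
        If Len(m.SubMatches(1)) > 0 Then n = CLng(m.SubMatches(1)) Else n = 1
        counts(sym) = counts(sym) + n          ' a missing key starts as Empty, which adds as 0
    Next m

    ' Keys come back in the order in which the symbols first appeared
    For Each key In counts.Keys
        If Len(result) > 0 Then result = result & ", "
        result = result & key & "=" & counts(key)
    Next key

    ListElements = result
End Function
```

Calling ListElements("Xx2Zz5Q") from the Immediate window or a cell should return Xx=2, Zz=5, Q=1, which makes it easy to spot formulas that were typed in the wrong case.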
To accept only combinations that exist in the periodic table, use the following pattern:
strPattern = "([A][cglmrstu]|[B][aehikr]?|[C][adeflmnorsu]?|[D][bsy]|[E][rsu]|[F][elmr]?|[G][ade]|[H][efgos]?|[I][nr]?|[K][r]?|[L][airuv]|[M][cdgnot]|[N][abdehiop]?|[O][gs]?|[P][abdmortu]?|[R][abefghnu]|[S][bcegimnr]?|[T][abcehilms]|[U]|[V]|[W]|[X][e]|[Y][b]?|[Z][nr])([0-9]*)"
'https://regex101.com/r/Hlzta2/3
'This pattern includes all 118 elements known to date.
'If scientists discover or synthesize new elements, they need to be added to the pattern.