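I'm new to Python and pandas, so I'd be very grateful for any help. My problem is as follows:
Suppose I have a .txt file containing a set of reactions (R1, R2, ...) as strings. Each reaction involves compounds (A, B, C, D, ...) with their stoichiometric coefficients (1, 2, 3, ...), for example:
R1: A + 2B + C <=> D
R2: A + B <=> C
How can I create a DataFrame in Python in the form of a stoichiometric matrix (compounds as rows, reactions as columns), like this: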
   R1  R2
A  -1  -1
B  -2  -1
C  -1   1
D   1   0
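Note: compounds on the left-hand side of the equation should get negative stoichiometric values, and compounds on the right-hand side should get positive ones.
Thanks =D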
Best answer
Try this:
import pandas as pd
import re  # regular expressions

def coeff_comp(s):
    # Separate the stoichiometric coefficient from the compound name
    result = re.search(r'(?P<coeff>\d*)(?P<comp>.*)', s)
    coeff = result.group('coeff')
    comp = result.group('comp')
    if not coeff:
        coeff = '1'  # coefficient is 1 if it is missing
    return comp, int(coeff)

equations = ['R1: A + 2B + C <=> D', 'R2: A + B <=> C']  # some test data
reactions_dict = {}  # results dictionary: reaction id -> {compound: coeff}

for equation in equations:
    compounds = {}  # dict -> compound: coeff
    eq = equation.replace(' ', '')
    r_id, reaction = eq.split(':')  # separate id from chemical reaction
    lhs, rhs = reaction.split('<=>')  # split left- and right-hand side
    reagents = lhs.split('+')  # get list of reagents
    products = rhs.split('+')  # get list of products
    for reagent in reagents:
        comp, coeff = coeff_comp(reagent)
        compounds[comp] = -coeff  # negative on lhs
    for product in products:
        comp, coeff = coeff_comp(product)
        compounds[comp] = coeff  # positive on rhs
    reactions_dict[r_id] = compounds

# insert dict into DataFrame, replace NaN with 0, let values be int
df = pd.DataFrame(reactions_dict).fillna(value=0).astype(int)
print(df)
The output looks like this:
   R1  R2
A  -1  -1
B  -2  -1
C  -1   1
D   1   0
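The question starts from a .txt file while the answer uses a hard-coded list, so here is a minimal sketch of the same parsing step applied to any iterable of lines, such as an open file. The name parse_reactions is hypothetical, not part of pandas or the answer. It assumes one reaction per line in the "ID: lhs <=> rhs" format, non-empty sides, and compound names that do not begin with a digit. Unlike the loop in the answer, it sums coefficients, so a compound that appears on both sides of a reaction nets out instead of being overwritten.

```python
import re

# Optional integer coefficient followed by the compound name
TERM = re.compile(r"(?P<coeff>\d*)(?P<comp>.+)")

def parse_reactions(lines):
    """Return {reaction_id: {compound: signed coefficient}} (hypothetical helper)."""
    reactions = {}
    for line in lines:
        line = line.replace(" ", "").strip()
        if not line:
            continue  # skip blank lines
        r_id, reaction = line.split(":", 1)
        lhs, rhs = reaction.split("<=>")
        coeffs = {}
        # left-hand side is negative, right-hand side is positive
        for side, sign in ((lhs, -1), (rhs, 1)):
            for term in side.split("+"):
                m = TERM.fullmatch(term)
                coeff = int(m.group("coeff") or 1)  # missing coefficient means 1
                comp = m.group("comp")
                # accumulate so a compound on both sides nets out
                coeffs[comp] = coeffs.get(comp, 0) + sign * coeff
        reactions[r_id] = coeffs
    return reactions
```

With a file (the name reactions.txt is only an example), open it with open("reactions.txt") as f and build the matrix as in the answer: pd.DataFrame(parse_reactions(f)).fillna(0).astype(int).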