I have a file energy.txt:

path   energy      counter
AXX    100.00          1
AXX     99.99          2
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7

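I want to compare the values in the second column: if the difference between two consecutive values is less than 0.02, keep the second value and add their counters together.
For example, the first step is 100.00-99.99=0.01 (less than 0.02), so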
path   energy      counter
AXX     99.99          3
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7

Second step: 99.99-99.98=0.01, so
path   energy      counter
AXX     99.98          4
AXX     99.50          1
AXX     99.00          7

Third step: 99.98-99.50=0.48 (greater than 0.02)
Fourth step: 99.50-99.00=0.50 (greater than 0.02).
I would like to do this in Python.
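
The rule described above can be sketched as a single pass over the rows in plain Python. This is only a minimal illustration under stated assumptions: each row is compared with the row immediately before it, the path column is ignored when deciding whether to merge, and the names `merge_close_rows` and `threshold` are made up for this example.

```python
def merge_close_rows(rows, threshold=0.02):
    """Merge consecutive rows whose energies differ by less than `threshold`.

    rows: list of (path, energy, counter) tuples in file order.
    When two neighbours are close, the later energy is kept and the
    counters are added together. (Hypothetical helper for illustration.)
    """
    merged = []
    for path, energy, counter in rows:
        # merged[-1] always carries the energy of the previous input row,
        # so this compares each row with its direct predecessor.
        if merged and abs(merged[-1][1] - energy) < threshold:
            merged[-1] = (path, energy, merged[-1][2] + counter)
        else:
            merged.append((path, energy, counter))
    return merged


# Sample data copied from energy.txt above.
rows = [
    ("AXX", 100.00, 1),
    ("AXX", 99.99, 2),
    ("AXX", 99.98, 1),
    ("AXX", 99.50, 1),
    ("AXX", 99.00, 7),
]
merged = merge_close_rows(rows)
print(merged)
# [('AXX', 99.98, 4), ('AXX', 99.5, 1), ('AXX', 99.0, 7)]
```

Because the comparison is always against the previous row, a long run of small steps collapses into one row even if its first and last energies differ by more than the threshold, which matches the worked example (100.00 and 99.98 end up merged).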

Best answer

Pandas-style:

import pandas as pd

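filename = 'energy.txt'
# Columns are separated by runs of whitespace; the raw string avoids an invalid-escape warning
df = pd.read_table(filename, sep=r'\s+')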

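# Generate a label with which we can group rows together.
# diff() is current minus previous row, so a drop of more than 0.02 marks
# the start of a new group (this assumes the energies are in descending order).
label = (df['energy'].diff() < -0.02).astype('int')
# The running total of those markers gives each group its own number
df['label'] = label.cumsum()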
print(df)
#   path  energy  counter  label
# 0  AXX  100.00        1      0
# 1  AXX   99.99        2      0
# 2  AXX   99.98        1      0
# 3  AXX   99.50        1      1
# 4  AXX   99.00        7      2

# Aggregate the count for each label group
grouped = df.groupby(['label'])
counts = grouped[['counter']].agg('sum')
print(counts)
#        counter
# label
# 0            4
# 1            1
# 2            7

# Find the index of the row with the minimum energy per group
idx = grouped['energy'].idxmin()

# Select only those rows from df (.ix was removed from pandas; use label-based .loc)
result = df.loc[idx, ['path', 'energy', 'label']]

# Merge in the computed counts
result = pd.merge(result, counts, left_on='label', right_index=True)
# Keep only the columns of the original file (.ix is no longer available)
result = result[['path', 'energy', 'counter']]
print(result)

yields
  path  energy  counter
2  AXX   99.98        4
3  AXX   99.50        1
4  AXX   99.00        7
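
The same grouping idea can probably be written more compactly with named aggregation, which is available in pandas 0.25 and later. The following is a sketch rather than the answer's own method: it assumes the energies are in descending order (so the minimum of each group is also its last row), reads the sample data from an in-memory string instead of energy.txt, and picks the path of the last row in each group.

```python
import io

import pandas as pd

# Sample data copied from the question, read from a string for self-containment.
text = """path   energy      counter
AXX    100.00          1
AXX     99.99          2
AXX     99.98          1
AXX     99.50          1
AXX     99.00          7
"""
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

# A new group starts wherever the energy drops by more than 0.02;
# the cumulative sum turns those start markers into group numbers.
label = (df["energy"].diff() < -0.02).cumsum().rename("label")

# One row per group: last path, lowest energy, summed counter.
result = (
    df.groupby(label)
    .agg(
        path=("path", "last"),
        energy=("energy", "min"),
        counter=("counter", "sum"),
    )
    .reset_index(drop=True)
)
print(result)
```

This gives the same three rows as the output above, but with a fresh 0..2 index instead of the original row numbers, because the group labels are dropped by `reset_index(drop=True)`.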

Regarding "python - how to compare the values of a column and add up a counter when the values are close", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16255276/
