python - Iterating over a DataFrame with strings

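I am trying to create a cognitive task called the 2-back test.

I created a semi-random list of letters under certain conditions, and now I want to know what the correct answer is for the participant at each position.

I want a column in the DataFrame that says yes or no depending on whether the same letter appeared 2 letters earlier.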

Here is my code:

from random import choice, shuffle
import pandas as pd

num = 60

letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']

# letters_1 = [1, 2, 3, 4, 5, 6]

my_list = [choice(letters), choice(letters)]
probab = list(range(num - 2))
shuffle(probab)

# We want 20% of the letters to repeat the letter 2 letters back
pourc = 20
repeatnum = num * pourc // 100
for i in probab:
    ch = prev = my_list[-2]
    if i >= repeatnum:
        while ch == prev:
            ch = choice(letters)
    my_list.append(ch)

df = pd.DataFrame(my_list, columns=["letters"])

df.head(10)
  letters
0       F
1       I
2       D
3       I
4       H
5       C
6       L
7       G
8       D
9       L

# Create a list to store the data
response = []

# For each row in the column,
for i in df['letters']:
    # if more than a value,
    if i == [i - 2]:
        response.append('yes')
    else:
        response.append('no')

# Create a column from the list
df['response'] = response

The first error:

if i == [i - 2]:
TypeError: unsupported operand type(s) for -: 'str' and 'int'

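If I use numbers instead of letters, I can get past this error, but I would prefer to keep the letters.

After that, if I run it with numbers, I get no error, but my new response column contains only "no". I know there should be 12 "yes" values.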

Best answer

It looks like you want to compare the column with the same column shifted by two elements. Use shift + np.where:

import numpy as np

df['response'] = np.where(df.letters.eq(df.letters.shift(2)), 'yes', 'no')
df.head(10)

  letters response
0       F       no
1       I       no
2       D       no
3       I      yes
4       H       no
5       C       no
6       L       no
7       G       no
8       D       no
9       L       no
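
To see why the shifted comparison works, here is a minimal sketch on the first five letters of the sample output above. The names s, shifted and matches are only illustrative. shift(2) moves every value down two rows and leaves missing values in the first two, so those rows can never match.

```python
import pandas as pd

# First five letters from the sample output (illustrative data only)
s = pd.Series(['F', 'I', 'D', 'I', 'H'], name='letters')

# Each row now holds the letter that appeared two rows earlier;
# the first two rows have no predecessor and become missing values
shifted = s.shift(2)

# Element-wise comparison; a missing value never equals a letter
matches = s.eq(shifted)

print(pd.DataFrame({'letters': s, 'two_back': shifted, 'match': matches}))
```

Only row 3 (the second "I") matches the letter two rows before it, which agrees with the single "yes" in the first rows of the table.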

But I know there should be 12 "yes" values.

df.response.eq('yes').sum()
12
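
As for why the original loop failed: iterating over df['letters'] yields the letters themselves, not their positions, so i - 2 tries to subtract an integer from a string. With numbers, i == [i - 2] compares a number to a one-element list, which is never equal, hence every row was "no". If you prefer an explicit loop over shift, a rough sketch with enumerate could look like this (shown on a plain list holding the ten sample letters from the question, not on the full random data):

```python
# The ten sample letters from the question's df.head(10) output
letters = ['F', 'I', 'D', 'I', 'H', 'C', 'L', 'G', 'D', 'L']

response = []
for pos, letter in enumerate(letters):
    # pos is the position, letter is the value at that position;
    # the first two positions have nothing two steps back
    if pos >= 2 and letter == letters[pos - 2]:
        response.append('yes')
    else:
        response.append('no')

print(response)
```

On a DataFrame the same loop can run over df['letters'].tolist(), though the vectorized shift version is shorter and usually faster.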

Regarding python - Iterating over a DataFrame with strings, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47894078/