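Can we run scikit-learn models on a Pandas DataFrame, or do we need to convert the DataFrame to a NumPy array?

Best answer

You can pass a pandas.DataFrame to sklearn directly, with no conversion needed. For example: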

import pandas as pd
from sklearn.cluster import KMeans

data = [(0.2, 10),
        (0.3, 12),
        (0.24, 14),
        (0.8, 30),
        (0.9, 32),
        (0.85, 33.3),
        (0.91, 31),
        (0.1, 15),
        (-0.23, 45)]

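# Build a two-column DataFrame from the list of tuples
p_df = pd.DataFrame(data)
# Cluster the rows into 3 groups, keeping the best of 10 initializations
kmeans = KMeans(init='k-means++', n_clusters=3, n_init=10)
# fit() accepts the DataFrame as-is
kmeans.fit(p_df)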

Result:
>>> kmeans.labels_
array([0, 0, 0, 2, 2, 2, 2, 0, 1], dtype=int32)
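If you do want an explicit NumPy array, the sketch below compares both routes. It is an illustration rather than part of the original answer: the column names "x" and "y", the helper fit_labels, and random_state=0 are assumptions added here, the last one so that both fits start from the same initialization and can be compared.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

data = [(0.2, 10), (0.3, 12), (0.24, 14),
        (0.8, 30), (0.9, 32), (0.85, 33.3),
        (0.91, 31), (0.1, 15), (-0.23, 45)]

# Column names are illustrative; the original answer uses the default integer names
p_df = pd.DataFrame(data, columns=["x", "y"])

# Explicit conversion to a NumPy array (optional, shown only for comparison)
X = p_df.to_numpy()

def fit_labels(features):
    # Hypothetical helper: fixed random_state makes the two fits comparable
    model = KMeans(init="k-means++", n_clusters=3, n_init=10, random_state=0)
    return model.fit(features).labels_

labels_from_df = fit_labels(p_df)
labels_from_array = fit_labels(X)

# Prints whether both inputs produced identical cluster assignments
print(np.array_equal(labels_from_df, labels_from_array))
```

Without a fixed random_state, the cluster numbering (which group is called 0, 1, or 2) can differ between runs even when the grouping itself is the same.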

For the question "python - Putting a Pandas dataset into an array for modeling in Scikit-Learn", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/22562540/
