Problem description
This is a follow-up question to "Creating a matrix from Pandas dataframe to display connectedness". The difference is in the matrix: here only the lower triangle is filled.
I have my data in this format in a pandas dataframe:
Customer_ID Location_ID
Alpha A
Alpha B
Alpha C
Beta A
Beta B
Beta D
I want to study the mobility patterns of the customers. My goal is to determine the clusters of locations that are most frequented by customers. I think the following matrix can provide such information:
    A  B  C  D
A   0  0  0  0
B   2  0  0  0
C   1  1  0  0
D   1  1  0  0
How do I do so in Python?
My dataset is quite large (hundreds of thousands of customers and about a hundred locations).
Recommended answer
Just for completeness, here's the modified version of my previous answer. Basically, you add a condition when updating the matrix: if edge > node:
import pandas as pd

# I'm assuming you can get your data into a pandas data frame:
data = {'Customer_ID': [1, 1, 1, 2, 2, 2], 'Location': ['A', 'B', 'C', 'A', 'B', 'D']}
df = pd.DataFrame(data)

# Initialize an empty (all-zero) matrix, one row/col per distinct location
matrix_size = len(df.groupby('Location'))
matrix = [[0 for col in range(matrix_size)] for row in range(matrix_size)]

# To make life easier, I made a map to go from locations
# to row/col positions in the matrix
location_set = list(set(df['Location'].tolist()))
location_set.sort()
location_map = dict(zip(location_set, range(len(location_set))))

# Group data by customer, and create an adjacency list (dyct) for each
# Update the matrix accordingly
for name, group in df.groupby('Customer_ID'):
    locations = set(group['Location'].tolist())
    dyct = {}
    for i in locations:
        # Subtract a one-element set; difference(i) would treat a
        # multi-character location name as a sequence of characters
        dyct[i] = list(locations - {i})

    # Loop through the adjacency list and update matrix
    for node, edges in dyct.items():
        for edge in edges:
            # Add this condition to fill only the bottom half of the symmetric matrix
            if edge > node:
                matrix[location_map[edge]][location_map[node]] += 1
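
Since the question mentions hundreds of thousands of customers, a vectorized variant may be worth trying instead of the Python-level loop. The following is only a sketch, not part of the original answer: it assumes the column names from the question (Customer_ID, Location_ID), builds a customer-by-location incidence table with pd.crosstab, multiplies it by its own transpose to count shared customers, and keeps the strictly lower triangle with np.tril. The variable names incidence, co and lower are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (column names assumed to match it)
df = pd.DataFrame({
    'Customer_ID': ['Alpha', 'Alpha', 'Alpha', 'Beta', 'Beta', 'Beta'],
    'Location_ID': ['A', 'B', 'C', 'A', 'B', 'D'],
})

# Customer x location table; clip so repeat visits by one customer count once
incidence = pd.crosstab(df['Customer_ID'], df['Location_ID']).clip(upper=1)

# Location x location counts: how many customers visited both locations
co = incidence.T.dot(incidence)

# Keep only the strictly lower triangle (zero the diagonal and upper half)
lower = pd.DataFrame(np.tril(co.to_numpy(), k=-1),
                     index=co.index, columns=co.columns)

print(lower)
```

On the sample data this should reproduce the matrix from the question (B/A = 2; C/A, C/B, D/A and D/B = 1). With about a hundred locations the result is only a 100 x 100 table, so the main cost is the crosstab itself.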