The code is as follows:

from elasticsearch import helpers, Elasticsearch
import csv

es = Elasticsearch()

with open(r'C:\Users\user\Desktop\police.csv') as f:
    index_name = 'census_data_records'
    doctype = 'census_record'
    reader = csv.reader(f)
    headers = []
    index = 0
    es.indices.delete(index=index_name, ignore=[400, 404])
    es.indices.create(index=index_name, ignore=400)
    action_list = []
    for row in reader:
        record ={
            '_op_type': 'index',
            '_index': index_name,
            '_type' : doctype,
            '_source': row
        }
        action_list.append(record)
    helpers.bulk(es, action_list)
The dataset is as follows:
,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y
0,120058272,WEAPON LAWS,POSS OF PROHIBITED WEAPON,Friday,01/29/2016 12:00:00 AM,11:00,Kan,"ARREST, BOOKED",800 Block of BRYANT ST,10.98727872,75.44928793
1,120058272,WEAPON LAWS,"FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE",Friday,01/29/2016 12:00:00 AM,11:00,Kan,"ARREST, BOOKED",800 Block of BRYANT ST,10.93029836,75.85839714
2,141059263,WARRANTS,WARRANT ARREST,Monday,04/25/2016 12:00:00 AM,14:59,Thi,"ARREST, BOOKED",KEITH ST / SHAFTER AV,10.02948575,74.81278836
3,160013662,NON-CRIMINAL,LOST PROPERTY,Tuesday,01/05/2016 0:00,23:50,Pat,NONE,JONES ST / OFARRELL ST,10.91399488,75.39788708
4,160002740,NON-CRIMINAL,LOST PROPERTY,Friday,01/01/2016 0:00,0:30,Ala,NONE,16TH ST / Alapuzha ST,12.35918751,74.87851143
5,160002869,ASSAULT,BATTERY,Friday,01/01/2016 0:00,21:35,Ern,NONE,1700 Block of BUSH ST,10.87491543,75.96476576
6,160003130,OTHER OFFENSES,PAROLE VIOLATION,Saturday,01/02/2016 0:00,0:04,Kan,"ARREST, BOOKED",MARY ST / HOWARD ST,10.6450246,75.7202032
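When pushing the data, I get this error:

BulkIndexError: ('500 document(s) failed to index.',

Is there another way to push a CSV file into Elasticsearch? Any documentation or blog posts would also help.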

Best answer

    from elasticsearch import helpers, Elasticsearch
    import pandas as pd
    import json
    
    df = pd.read_csv("police.csv")
    json_str = df.to_json(orient='records')
    
    json_records = json.loads(json_str)
    
    es = Elasticsearch()
    index_name = 'census_data_records'
    doctype = 'census_record'
    es.indices.delete(index=index_name, ignore=[400, 404])
    es.indices.create(index=index_name, ignore=400)
    action_list = []
    for row in json_records:
        record ={
            '_op_type': 'index',
            '_index': index_name,
            '_type' : doctype,
            '_source': row
        }
        action_list.append(record)
    helpers.bulk(es, action_list)
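
A likely reason the pandas version works while the original does not: `csv.reader` yields each row as a list (and also yields the header line as a row), whereas Elasticsearch expects each document's `_source` to be a JSON object, i.e. a dict in Python. `df.to_json(orient='records')` produces exactly that, one dict per row keyed by column name. The same effect can be had without pandas by using `csv.DictReader`. The following is a minimal sketch under a few assumptions, not a definitive implementation: the function names `csv_to_actions` and `push_csv` and the replacement column name `row_id` are hypothetical; the empty first header cell in the sample file is renamed because Elasticsearch does not accept an empty field name; `_type` is left out because mapping types are deprecated in Elasticsearch 7 and removed in 8.

```python
import csv
import io


def csv_to_actions(file_obj, index_name, blank_header_name="row_id"):
    """Yield one bulk action per CSV row, with the row as a dict.

    blank_header_name is a hypothetical name given to any header cell
    that is empty (the sample file's first column has no name).
    """
    reader = csv.DictReader(file_obj)
    # Reading .fieldnames consumes the header line; replace empty names.
    reader.fieldnames = [
        name or blank_header_name for name in (reader.fieldnames or [])
    ]
    for row in reader:
        # row is a dict of column name -> string value, which serializes
        # to a JSON object as Elasticsearch requires.
        yield {"_op_type": "index", "_index": index_name, "_source": row}


def push_csv(es, path, index_name):
    """Stream a CSV file into Elasticsearch; returns (ok_count, errors)."""
    # Third-party import kept local so csv_to_actions can be tried
    # without the elasticsearch package installed.
    from elasticsearch import helpers

    with open(path, newline="", encoding="utf-8") as f:
        # raise_on_error=False collects per-document failures in a list
        # instead of raising BulkIndexError, so the reasons can be read.
        return helpers.bulk(
            es, csv_to_actions(f, index_name), raise_on_error=False
        )


# Quick local check of the row conversion, no cluster needed.
sample = io.StringIO(
    ',IncidntNum,Category,Resolution\n'
    '0,120058272,WEAPON LAWS,"ARREST, BOOKED"\n'
)
actions = list(csv_to_actions(sample, "census_data_records"))
```

With a running cluster, a call such as `ok, errors = push_csv(Elasticsearch(), "police.csv", "census_data_records")` would index the rows, and printing the first few entries of `errors` should show why any document was rejected. Note that every value arrives as a string here; numeric or date typing would need an explicit mapping or a conversion step.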
    

Regarding python - BulkIndexError: ('500 document(s) failed to index.', with Python + Elasticsearch, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/62869254/
