csv - Need help determining the type of documents ingested into Elasticsearch (via Logstash)

I am using Logstash to ingest the CSV files from https://www.kaggle.com/wcukierski/the-simpsons-by-the-data and store them in Elasticsearch. First, I ingested simpsons_characters.csv with the following conf:

input {
  file {
    path => "/Users/xyz/Downloads/the-simpsons-by-the-data/simpsons_characters.csv"
    start_position => beginning
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    columns   => ["id", "name", "normalized_name", "gender"]
    separator => ","
  }
}

output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts   => "localhost"
    action  => "index"
    index   => "simpsons"
  }
}

However, when I query it like this: http://localhost:9200/simpsons/name/Lou, where simpsons is the index name and name is the type (I think... I'm not sure)

I get the following response:

{
   "_index": "simpsons",
   "_type": "name",
   "_id": "Lou",
   "found": false
}

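So the question is: why am I not getting the correct response? Also, when you bulk-ingest documents from a CSV, what is the type of those documents?

Thanks!

Best answer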

The default type in the Logstash Elasticsearch output is logs. So however the ID is defined (taken from the CSV with document_id => "%{id}", or generated by ES itself), you can retrieve these documents at http://localhost:9200/simpsons/logs/THE_ID.
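The path of a single-document GET has the form /index/type/id, and the last segment is the document's _id, not the value of a field. That is why /simpsons/name/Lou returns "found": false: it asks for a document of type name whose ID is Lou. A minimal Python sketch of building the right URL follows; the helper name document_url and its host default are assumptions made for illustration, not part of Logstash or Elasticsearch.

```python
from urllib.parse import quote


def document_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the GET URL for one document: <host>/<index>/<type>/<id>.

    Hypothetical helper for illustration; the host default assumes a
    local Elasticsearch node, as in the question.
    """
    # Percent-encode each path segment so IDs with spaces or slashes survive.
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return host + "/" + "/".join(parts)


# "THE_ID" is a placeholder for a real document ID.
print(document_url("simpsons", "logs", "THE_ID"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client.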

If you don't know the ID and only want to check that the documents exist: http://localhost:9200/simpsons/logs/_search?pretty.
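The same _search endpoint also accepts a q query-string parameter, which is one way to look a character up by the name field instead of by ID. The sketch below only builds the URL; search_url is a hypothetical helper, and the field name "name" is assumed to match the column defined in the csv filter.

```python
from urllib.parse import urlencode


def search_url(index, doc_type, field=None, value=None,
               host="http://localhost:9200"):
    """Build a _search URL, optionally filtered with a URI query (q=field:value).

    Hypothetical helper; the host default assumes a local Elasticsearch node.
    """
    params = {"pretty": "true"}
    if field is not None and value is not None:
        # URI search syntax: q=<field>:<value>
        params["q"] = f"{field}:{value}"
    return f"{host}/{index}/{doc_type}/_search?{urlencode(params)}"


# List documents of type "logs" in the "simpsons" index.
print(search_url("simpsons", "logs"))
# Find documents whose "name" field matches "Lou".
print(search_url("simpsons", "logs", field="name", value="Lou"))
```

Note that urlencode percent-encodes the colon in the query, which Elasticsearch decodes back on receipt.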

If you want to see the index's mapping, for example to find out its _type: http://localhost:9200/simpsons/_mapping?pretty.
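In Elasticsearch versions that still use mapping types, the _mapping response nests each type name under the index's "mappings" key, so the type can be read from the parsed JSON. The sketch below assumes that response shape; types_in_mapping is a hypothetical helper and the sample document is illustrative, not real output from this index.

```python
import json


def types_in_mapping(mapping_response, index):
    """Return the type names found under <index>.mappings, sorted.

    Assumes the typed response shape: {index: {"mappings": {type: {...}}}}.
    """
    return sorted(mapping_response[index]["mappings"].keys())


# Illustrative response shape only (assumed, not captured from a real cluster).
sample = json.loads(
    '{"simpsons": {"mappings": {"logs": {"properties": {}}}}}'
)
print(types_in_mapping(sample, "simpsons"))
```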

To change the default _type:

  elasticsearch {
    hosts   => "localhost"
    action  => "index"
    index   => "simpsons"
    document_type => "characters"
    document_id => "%{id}"
  }