How to split column values in DataFrames.jl (version 0.19)?

Problem description

I am following Mr. Tom Kwong's https://github.com/tk3369/data-wrangling-techniques-in-julia tutorial.

The DataFrame is as follows:

julia> df3=stack(df2, Not(:County), variable_name=:Year_Gender, value_name=:Suicides)
486×3 DataFrame
│ Row │ Year_Gender       │ Suicides │ County       │
│     │ Symbol            │ Int64    │ String       │
├─────┼───────────────────┼──────────┼──────────────┤
│ 1   │ Female (2012)     │ 0        │ Asotin       │
│ 2   │ Female (2012)     │ 0        │ Benton       │
│ 3   │ Female (2012)     │ 0        │ Chelan       │
│ 4   │ Female (2012)     │ 0        │ Clallam      │
│ 5   │ Female (2012)     │ 1        │ Clark        │
│ 6   │ Female (2012)     │ 0        │ Columbia     │
│ 7   │ Female (2012)     │ 0        │ Cowlitz      │
│ 8   │ Female (2012)     │ 0        │ Douglas      │
│ 9   │ Female (2012)     │ 0        │ Grays Harbor │
│ 10  │ Female (2012)     │ 0        │ Island       │
│ 11  │ Female (2012)     │ 0        │ Jefferson    │
│ 12  │ Female (2012)     │ 3        │ King         │
│ 13  │ Female (2012)     │ 0        │ Kitsap       │
│ 14  │ Female (2012)     │ 0        │ Lewis        │
│ 15  │ Female (2012)     │ 0        │ Mason        │
│ 16  │ Female (2012)     │ 0        │ Okanogan     │
│ 17  │ Female (2012)     │ 0        │ Pacific      │
│ 18  │ Female (2012)     │ 1        │ Pierce       │
│ 19  │ Female (2012)     │ 0        │ Skagit       │
│ 20  │ Female (2012)     │ 0        │ Snohomish    │
│ 21  │ Female (2012)     │ 0        │ Spokane      │
⋮
│ 465 │ Total (2008-2012) │ 1        │ Columbia     │
│ 466 │ Total (2008-2012) │ 1        │ Cowlitz      │
│ 467 │ Total (2008-2012) │ 2        │ Douglas      │
│ 468 │ Total (2008-2012) │ 6        │ Grays Harbor │
│ 469 │ Total (2008-2012) │ 2        │ Island       │
│ 470 │ Total (2008-2012) │ 1        │ Jefferson    │
│ 471 │ Total (2008-2012) │ 33       │ King         │
│ 472 │ Total (2008-2012) │ 1        │ Kitsap       │
│ 473 │ Total (2008-2012) │ 1        │ Lewis        │
│ 474 │ Total (2008-2012) │ 1        │ Mason        │
│ 475 │ Total (2008-2012) │ 2        │ Okanogan     │
│ 476 │ Total (2008-2012) │ 3        │ Pacific      │
│ 477 │ Total (2008-2012) │ 20       │ Pierce       │
│ 478 │ Total (2008-2012) │ 3        │ Skagit       │
│ 479 │ Total (2008-2012) │ 11       │ Snohomish    │
│ 480 │ Total (2008-2012) │ 6        │ Spokane      │
│ 481 │ Total (2008-2012) │ 2        │ Stevens      │
│ 482 │ Total (2008-2012) │ 2        │ Thurston     │
│ 483 │ Total (2008-2012) │ 1        │ Walla Walla  │
│ 484 │ Total (2008-2012) │ 5        │ Whatcom      │
│ 485 │ Total (2008-2012) │ 1        │ Whitman      │
│ 486 │ Total (2008-2012) │ 8        │ Yakima       │

I am trying to split the Year_Gender column values as follows:

julia> df3.Year=[split(x, " ")[1] for x in df3.Year_Gender]
ERROR: MethodError: no method matching split(::Symbol, ::String)
Closest candidates are:
  split(::T, ::Any; limit, keepempty) where T<:AbstractString at strings/util.jl:313
Stacktrace:
 [1] (::var"#3#4")(::Symbol) at ./none:0
 [2] iterate at ./generator.jl:47 [inlined]
 [3] collect(::Base.Generator{Array{Symbol,1},var"#3#4"}) at ./array.jl:665
 [4] top-level scope at REPL[9]:1

julia>

Please guide me in splitting the column values in DataFrames version 0.19, as I am unable to update.

Recommended answer

Let me start by saying that I would not recommend using DataFrames 0.19: the current release is 1.2, so 0.19 is quite old at this point. Since DataFrames passed its first major release with 1.0, the API is now considered stable, so it is best to learn the current way of doing things, as that will likely serve you well for the foreseeable future.

With that said, your issue has nothing to do with DataFrames; it is just how base Julia works:

julia> split(Symbol("Female (2012)"), " ")
ERROR: MethodError: no method matching split(::Symbol, ::String)
Closest candidates are:
  split(::T, ::Any; limit, keepempty) where T<:AbstractString at strings/util.jl:401
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

If you want to use split, the value you split must be a String rather than a Symbol, so convert each column value with string first:

julia> split(string(Symbol("Female (2012)")), " ")
2-element Vector{SubString{String}}:
 "Female"
 "(2012)"

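Applied to a whole column, the same conversion might look like the sketch below. It uses a plain Vector{Symbol} as a stand-in for df3.Year_Gender so that it does not depend on any DataFrames version; the names year_gender, parts, gender and period are illustrative only and do not come from the question.

```julia
# Hypothetical stand-in for the df3.Year_Gender column (a Vector{Symbol}).
year_gender = [Symbol("Female (2012)"), Symbol("Male (2012)"), Symbol("Total (2008-2012)")]

# Convert each Symbol to a String before splitting on the space.
parts = [split(string(x), " ") for x in year_gender]

# The first piece is the gender label; the last piece is the bracketed period.
gender = [String(first(p)) for p in parts]
period = [String(last(p)) for p in parts]
```

Note that index [1], as used in the question, selects the gender label rather than the year. The resulting vectors could then be assigned to new columns in the same way the question attempts with df3.Year.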
You can get the second element of the result with last. You may also want to strip the brackets and then call parse(Int, x) on what remains to get a number out.
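A minimal sketch of that last step, using only Base functions. The helper name extract_year is hypothetical, and the sketch swaps parse for tryparse as an assumption about how to handle labels such as "Total (2008-2012)", whose bracketed text is a range rather than a single integer.

```julia
# Hypothetical helper: pull the year out of a label such as "Female (2012)".
# Returns `nothing` when the bracketed text is not a single integer,
# for example "(2008-2012)".
function extract_year(label)
    piece = last(split(string(label), " "))   # e.g. "(2012)"
    inner = strip(piece, ['(', ')'])          # e.g. "2012"
    return tryparse(Int, inner)               # Int on success, otherwise nothing
end

single_year = extract_year(Symbol("Female (2012)"))      # 2012
year_range  = extract_year(Symbol("Total (2008-2012)"))  # nothing
```

Using parse(Int, x) directly, as suggested above, would throw an error on the range labels, so whether to skip them, keep them as strings, or split them into start and end years depends on what the analysis needs.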

