Problem description
I would like to know how to access an attribute of a Row
when the attribute's name contains a space.
For example, I have this Row object:
Row(ONE CATEGORY=u'category')
How can I access the ONE CATEGORY value? Normally I would use row.oneCategory
to access it, but that is not possible here because of the space. If possible, I would prefer suggestions in Python.
Thanks
Recommended answer
In Python you can use the getattr function:
from pyspark.sql import Row

row = Row("ONE CATEGORY")("category")
row
## Row(ONE CATEGORY='category')
getattr(row, u"ONE CATEGORY")
## 'category'
or the Row.asDict method:
row.asDict()["ONE CATEGORY"]
## 'category'
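As a rough illustration of why getattr succeeds where dot syntax cannot, here is a minimal sketch that runs without Spark. FakeRow is a hypothetical stand-in written for this example and is not part of PySpark. It resolves field names in __getattr__, which is, as far as I understand it, broadly how pyspark.sql.Row looks up fields. The point is that getattr accepts any string as a name, while row.ONE CATEGORY is simply not valid Python syntax.

```python
class FakeRow(tuple):
    """Hypothetical stand-in for pyspark.sql.Row, for illustration only."""

    def __new__(cls, **fields):
        row = super().__new__(cls, fields.values())
        # Keep the field names alongside the tuple of values.
        row._names = list(fields)
        return row

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[self._names.index(name)]
        except ValueError:
            raise AttributeError(name)

    def as_dict(self):
        # Pair each field name with its value.
        return dict(zip(self._names, self))


# Keyword names that are not valid identifiers must be passed via ** unpacking.
row = FakeRow(**{"ONE CATEGORY": "category", "id": 1})

by_getattr = getattr(row, "ONE CATEGORY")  # any string works as a name here
by_dict = row.as_dict()["ONE CATEGORY"]    # dictionary lookup also works
by_dot = row.id                            # dot syntax needs a valid identifier
```

The same two lookups (getattr and a dictionary view) are what the answer applies to a real Row.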
In Scala, Row does not support dot syntax for fields at all, so this is not really an issue there. If you want to access a field by name, you can use Row.getAs:
val row = sc.parallelize(Tuple1("category") :: Nil).toDF("ONE CATEGORY").first
row.getAs[String]("ONE CATEGORY")
or Row.getValuesMap:
row.getValuesMap[String](Seq("ONE CATEGORY"))("ONE CATEGORY")
In both Python and Scala, you can access values by index:
row[0]
## 'category'
row(0)
// Any = category
row.getString(0)
// String = category
Finally, you can use the alias method during select to avoid the issue completely:
df.select(col("ONE CATEGORY").alias("ONE_CATEGORY"))
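When several columns contain spaces, aliasing each one by hand gets tedious. The sketch below shows one possible way to derive identifier-friendly names in plain Python. safe_name is a hypothetical helper introduced for this example, and the column list is made up for illustration.

```python
import re


def safe_name(name):
    """Replace each run of non-word characters with a single underscore."""
    return re.sub(r"\W+", "_", name).strip("_")


# Made-up column names for illustration.
columns = ["ONE CATEGORY", "id", "unit price (USD)"]
renamed = [safe_name(c) for c in columns]
print(renamed)  # ['ONE_CATEGORY', 'id', 'unit_price_USD']

# With a PySpark DataFrame `df` (not created here), the new names could be
# applied in one step, for example:
#   df = df.toDF(*[safe_name(c) for c in df.columns])
```

After renaming, fields of the resulting rows can be read with ordinary dot syntax, such as row.ONE_CATEGORY. Note that two different original names could map to the same cleaned name, so it is worth checking for collisions before applying this to a real DataFrame.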