Problem description
I would like to remove records from a DataFrame where demo_name is NULL or empty.
demo_name is a column of String datatype in that DataFrame.
I am trying the code below. I want to apply trim because some records have a demo_name made up only of spaces.
val filterDF = demoDF.filter($"demo_name".isNotNull && $"demo_name".trim != "" )
But I get the error: cannot resolve symbol trim
Could someone help me fix this issue?
Recommended answer
You are calling trim as if you were acting on a String, but the $ interpolator uses an implicit conversion to turn the name of the column into a Column instance. The problem is that Column doesn't have a trim method.
You need to import the library functions and apply them to your column:
import org.apache.spark.sql.functions._
demoDF.filter($"demo_name".isNotNull && length(trim($"demo_name")) > 0)
Here I use the library functions trim and length: trim to strip the spaces, and then length to verify that the result has anything in it.
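To see the filter end to end, here is a minimal sketch that runs it against a small in-memory DataFrame. The local SparkSession, the sample rows, the id column, and the names DropBlankNames and dropBlankNames are all assumptions made for illustration; only the filter expression comes from the answer above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, length, trim}

object DropBlankNames {
  // Hypothetical helper: keeps only rows whose demo_name is non-null
  // and still has at least one character after trimming spaces.
  def dropBlankNames(df: DataFrame): DataFrame =
    df.filter(col("demo_name").isNotNull && length(trim(col("demo_name"))) > 0)

  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("drop-blank-names")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: a normal name, a null, an empty string,
    // a spaces-only string, and a name padded with spaces.
    val demoDF = Seq(
      (1, "alice"), (2, null), (3, ""), (4, "   "), (5, " bob ")
    ).toDF("id", "demo_name")

    dropBlankNames(demoDF).show() // rows with id 1 and 5 remain

    spark.stop()
  }
}
```

The isNotNull check mainly makes the intent explicit: in Spark SQL a comparison involving NULL evaluates to NULL, which filter treats as not matching, so the length test would drop null rows as well. An equivalent way to write the second condition is trim(col("demo_name")) =!= "" using the =!= operator on Column (Spark 2.0 and later). Plain != is Scala's object inequality and does not build a Column expression, so it would not work in the original attempt even once trim is resolved.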