I'm working on an Apache Spark project in Scala, using Eclipse.
I want to change the date format from yyyy-mm-dd to dd-mm-yyyy.
Here is my code:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setMaster("local").setAppName("trying")
val sc = new SparkContext(conf)
// Read the file and drop rows that contain a NULL marker
val x = sc.textFile("/home/amel/1MB")
  .filter(!_.contains("NULL")).filter(!_.contains("Null"))
val re = x.map(row => {
  val cols = row.split(",")
  val Cycle = cols(2)
  // Derive the duration of study from the cycle name
  val Duration = Cycle match {
    case "Licence" => "3 years"
    case "Master" => "2 years"
    case "Ingéniorat" => "5 years"
    case "Ingeniorat" => "5 years"
    case "Doctorat" => "3 years"
    case _ => "NULL"
  }
  (cols(0) + "," + cols(1) + "," + Cycle + "," + cols(3) + "," + Duration)
})
re.collect.foreach(println)
Here is a sample of the output I get:
0000023497,2007-06-27,Master,SI,2 years
And this is what I would like the result to look like:
0000023497,27-06-2007,Master,SI,2 years
Best answer
This can be done with a regular expression.
val ymd = raw"(\d+)-(\d+)-(\d+)".r
ymd.replaceAllIn("2007-06-27", m => s"${m group 3}-${m group 2}-${m group 1}")
//res0: String = 27-06-2007
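To use this on the rows from the question, one option is to rewrite only the date column instead of the whole line. The following is a minimal sketch, assuming the date is always the second field (cols(1)), as in the sample row. The names isoDate and swapDate are hypothetical, and the stricter 4-2-2 digit pattern is an assumption about the data.

```scala
// Assumed shape of the date column: four digits, two digits, two digits.
val isoDate = raw"(\d{4})-(\d{2})-(\d{2})".r

// Hypothetical helper: reorders yyyy-MM-dd into dd-MM-yyyy.
// A Regex pattern match must cover the whole string, so any
// value that is not a date is returned unchanged.
def swapDate(s: String): String = s match {
  case isoDate(y, m, d) => s"$d-$m-$y"
  case other            => other
}

val row = "0000023497,2007-06-27,Master,SI"
val cols = row.split(",")
// Replace only the second column, then join the fields again.
val out = cols.updated(1, swapDate(cols(1))).mkString(",")
// out: String = 0000023497,27-06-2007,Master,SI
```

Pattern matching on the regex leaves the other columns untouched, even if one of them happens to contain hyphenated digits.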
It can also be done by formatting with the java.time library.
import java.time.LocalDate
import java.time.format.DateTimeFormatter
LocalDate.parse("2019-01-04")
.format(DateTimeFormatter.ofPattern("dd-MM-yyyy"))
//res1: String = 04-01-2019
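LocalDate.parse throws a DateTimeParseException on input that is not a valid ISO date, which would fail the Spark job on a single bad row. Here is a hedged sketch of a wrapper that falls back to the original string instead. The name toDayFirst is hypothetical, and returning the input unchanged is only one possible policy.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Output pattern: uppercase MM is month-of-year (lowercase mm would be minutes).
val outFmt = DateTimeFormatter.ofPattern("dd-MM-yyyy")

// Hypothetical helper: reformats a valid yyyy-MM-dd date,
// and returns the input unchanged if parsing fails.
def toDayFirst(s: String): String =
  Try(LocalDate.parse(s).format(outFmt)).getOrElse(s)

toDayFirst("2007-06-27")
// 27-06-2007
```

In the map from the question, this would mean writing toDayFirst(cols(1)) in place of cols(1) when building the output string.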