I'm working on an Apache Spark project in Scala, using Eclipse.

I want to change the date format from yyyy-mm-dd to dd-mm-yyyy.

Here is my code:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setMaster("local").setAppName("trying")
val sc = new SparkContext(conf)
// Read the file and drop any line that contains a NULL marker
val x = sc.textFile("/home/amel/1MB")
  .filter(!_.contains("NULL"))
  .filter(!_.contains("Null"))

val re = x.map(row => {
  val cols = row.split(",")
  val Cycle = cols(2)
  // Derive the duration from the degree cycle
  val Duration = Cycle match {
    case "Licence" => "3 years"
    case "Master" => "2 years"
    case "Ingéniorat" => "5 years"
    case "Ingeniorat" => "5 years"
    case "Doctorat" => "3 years"
    case _ => "NULL"
  }
  cols(0) + "," + cols(1) + "," + Cycle + "," + cols(3) + "," + Duration
})
re.collect.foreach(println)


Here is a sample of the output I get:

0000023497,2007-06-27,Master,SI,2 years


And this is what I would like the output to look like:

0000023497,27-06-2007,Master,SI,2 years

Best answer

This can be done with a regular expression.

val ymd = raw"(\d+)-(\d+)-(\d+)".r

ymd.replaceAllIn("2007-06-27", m => s"${m group 3}-${m group 2}-${m group 1}")
//res0: String = 27-06-2007
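
To use the regex approach inside the question's map, one option is to rewrite only the date column rather than run the replacement over the whole row. The following is a minimal sketch, not the answerer's code: the object RegexDateExample and the helpers toDmy and reformatRow are hypothetical names, and it assumes the date is always the second comma-separated field in yyyy-MM-dd form.

```scala
object RegexDateExample {
  // Stricter pattern than (\d+)-(\d+)-(\d+): 4-digit year, 2-digit month and day.
  private val Ymd = raw"(\d{4})-(\d{2})-(\d{2})".r

  // Hypothetical helper: a Regex used as an extractor must match the whole
  // string, so anything that is not exactly yyyy-MM-dd is returned unchanged.
  def toDmy(date: String): String = date match {
    case Ymd(y, m, d) => s"$d-$m-$y"
    case other        => other
  }

  // Hypothetical helper: rewrites only the second column of a CSV row.
  // The -1 limit keeps trailing empty columns when splitting.
  def reformatRow(row: String): String = {
    val cols = row.split(",", -1)
    if (cols.length > 1) cols.updated(1, toDmy(cols(1))).mkString(",") else row
  }
}
```

In the Spark job, toDmy could be called on cols(1) just before the output string is assembled.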


It can also be done by formatting with the java.time library.

import java.time.LocalDate
import java.time.format.DateTimeFormatter

LocalDate.parse("2019-01-04")
         .format(DateTimeFormatter.ofPattern("dd-MM-yyyy"))
//res1: String = 04-01-2019
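
Two details of the java.time approach are worth spelling out. In DateTimeFormatter patterns, uppercase MM means month while lowercase mm means minute, so the target pattern has to be dd-MM-yyyy. Also, LocalDate.parse with a single argument expects ISO input (yyyy-MM-dd) and throws DateTimeParseException on anything else. Below is a minimal sketch that wraps the call so a bad value does not fail the whole job; the object JavaTimeDateExample and the helper toDmy are hypothetical names, not part of the original answer.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

object JavaTimeDateExample {
  // Output pattern: day-month-year (MM = month; mm would mean minutes).
  private val outFmt = DateTimeFormatter.ofPattern("dd-MM-yyyy")

  // Hypothetical helper: returns None when the input is not a valid ISO date,
  // instead of letting the parse exception propagate.
  def toDmy(date: String): Option[String] =
    Try(LocalDate.parse(date).format(outFmt)).toOption
}
```

Inside the question's map this could be used as JavaTimeDateExample.toDmy(cols(1)).getOrElse(cols(1)), which keeps the original value when parsing fails. Unlike the regex, this also rejects impossible dates such as month 13.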
