I'm running a sequence model in Spark using the Scala API. This is the line I use to view the results:

model.freqSequences.collect().foreach { freqSequence => println(freqSequence.sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]") + ", " + freqSequence.freq)}


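The problem is that the result keeps getting bigger, so I no longer want to use collect(). Instead I want to save it to HDFS or to a local file. I tried this: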

scala> val outcome = model.freqSequences.foreach { freqSequence => println(freqSequence.sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]") + ", " + freqSequence.freq)}

scala> outcome.saveAsTextFile("tmp/outcome1/")

error: saveAsTextFile is not a member of Unit


The result is a Unit, so I can't call saveAsTextFile on it. Is there another way to save this result? Thanks.

Best answer

foreach returns Unit, so outcome holds nothing you can save.

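You want to map each element to a String first, which gives you an RDD[String] that can be saved as a text file. Something like: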

val outcome = model.freqSequences.map { freqSequence => freqSequence.sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]") + ", " + freqSequence.freq}
// print (on a cluster, this output goes to the executors' stdout, not the driver console)
outcome.foreach(println)
// save
outcome.saveAsTextFile("tmp/outcome1/")
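
As a minimal sketch of the same idea, the formatting can be pulled into a small pure function so that printing and saving share one code path. The names FreqSequenceFormat and formatFreqSequence are hypothetical (they are not part of Spark), and the sketch assumes each frequent sequence exposes sequence as an array of arrays and freq as a Long, as the code in the question suggests.

```scala
object FreqSequenceFormat {
  // Hypothetical helper, not a Spark API: renders one frequent sequence
  // in the same "[[a, b], [c]], freq" layout used in the question.
  def formatFreqSequence[A](sequence: Array[Array[A]], freq: Long): String =
    sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]") + ", " + freq
}
```

With the model from the question, this would be used as model.freqSequences.map(fs => FreqSequenceFormat.formatFreqSequence(fs.sequence, fs.freq)).saveAsTextFile("tmp/outcome1/"). Keep in mind that saveAsTextFile writes a directory of part files rather than a single file, that by default it fails if the output directory already exists, and that a path without a scheme is resolved against the default filesystem (use a file:// prefix if you want the local filesystem).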
