Problem description
Sometimes I find myself wishing Scala collections included some missing functionality. It is rather easy to "extend" a collection and provide a custom method.
This is a bit more difficult when it comes to building the collection from scratch. Consider useful methods such as .iterate. I'll demonstrate the use case with a similar, familiar function: unfold.
unfold is a method that constructs a collection from an initial state z: S and a function that returns either an optional tuple of the next state and an element E, or an empty option indicating the end.
The method signature, for some collection type Coll[T], should look roughly like:
def unfold[S,E](z: S)(f: S ⇒ Option[(S,E)]): Coll[E]
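To make the contract concrete, here is a minimal sketch specialised to List. The names UnfoldSketch and unfoldList are made up for illustration and are not part of any library; the tuple order (next state, element) follows the signature above.

```scala
object UnfoldSketch {
  // Hypothetical helper: unfold specialised to List.
  // `f` returns Some((nextState, element)) to continue, or None to stop.
  def unfoldList[S, E](z: S)(f: S => Option[(S, E)]): List[E] = {
    val b = List.newBuilder[E]
    var step = f(z)
    while (step.isDefined) {
      val (nextState, element) = step.get
      b += element          // emit the element
      step = f(nextState)   // advance to the next state
    }
    b.result()
  }

  // Counts down from 3: the state is the counter, the element is its current value.
  val countdown: List[Int] =
    unfoldList(3)(n => if (n == 0) None else Some((n - 1, n)))
}
```

Here countdown evaluates to List(3, 2, 1): the function is called with 3, 2 and 1, each time emitting the current value, and returns None at 0.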
Now, IMO, the most "natural" usage would be, e.g.:
val state: S = ??? // initial state
val arr: Array[E] = Array.unfold(state) { s ⇒
  // code to convert s to some Option[(S,E)]
  ???
}
This is pretty straightforward to do for a specific collection type:
import scala.reflect.ClassTag

implicit class ArrayOps(arrObj: Array.type) {
  def unfold[S,E : ClassTag](z: S)(f: S => Option[(S,E)]): Array[E] = {
    val b = Array.newBuilder[E]
    var s = f(z)
    while(s.isDefined) {
      val Some((state,element)) = s
      b += element
      s = f(state)
    }
    b.result()
  }
}
With this implicit class in scope, we can generate an array for the Fibonacci sequence like this:
val arr: Array[Int] = Array.unfold(0->1) {
  case (a,b) if a < 256 => Some((b -> (a+b)) -> a)
  case _ => None
}
But if we want to provide this functionality for all other collection types, I see no option other than to copy and paste the code and replace every occurrence of Array with List, Seq, etc.
So I tried another approach:
import scala.collection.mutable
import scala.reflect.ClassTag

trait BuilderProvider[Elem,Coll] {
  def builder: mutable.Builder[Elem,Coll]
}
object BuilderProvider {
  object Implicits {
    implicit def arrayBuilderProvider[Elem : ClassTag] = new BuilderProvider[Elem,Array[Elem]] {
      def builder = Array.newBuilder[Elem]
    }
    implicit def listBuilderProvider[Elem : ClassTag] = new BuilderProvider[Elem,List[Elem]] {
      def builder = List.newBuilder[Elem]
    }
    // many more logicless implicits
  }
}
def unfold[Coll,S,E : ClassTag](z: S)(f: S => Option[(S,E)])(implicit bp: BuilderProvider[E,Coll]): Coll = {
  val b = bp.builder
  var s = f(z)
  while(s.isDefined) {
    val Some((state,element)) = s
    b += element
    s = f(state)
  }
  b.result()
}
Now, with the above in scope, all one needs is an import for the right type:
import BuilderProvider.Implicits.arrayBuilderProvider
val arr: Array[Int] = unfold(0->1) {
  case (a,b) if a < 256 => Some((b -> (a+b)) -> a)
  case _ => None
}
But this doesn't feel right either. I don't like forcing the user to import something, let alone an implicit method that creates a useless wiring class on every method call. Moreover, there is no easy way to override the default logic. Think of collections such as Stream, where it would be most appropriate to create the collection lazily, or of other collections with their own special implementation details.
The best solution I could come up with was to use the first solution as a template and generate the sources with sbt:
sourceGenerators in Compile += Def.task {
  val file = (sourceManaged in Compile).value / "myextensions" / "util" / "collections" / "package.scala"
  val colls = Seq("Array","List","Seq","Vector","Set") //etc'...
  // the generated file needs the ClassTag import, since the template below uses it
  val prefix = s"""package myextensions.util
                  |
                  |import scala.reflect.ClassTag
                  |
                  |package object collections {
                  |
                  """.stripMargin
  val all = colls.map{ coll =>
    s"""
       |implicit class ${coll}Ops[Elem](obj: ${coll}.type) {
       |  def unfold[S,E : ClassTag](z: S)(f: S => Option[(S,E)]): ${coll}[E] = {
       |    val b = ${coll}.newBuilder[E]
       |    var s = f(z)
       |    while(s.isDefined) {
       |      val Some((state,element)) = s
       |      b += element
       |      s = f(state)
       |    }
       |    b.result()
       |  }
       |}
       """.stripMargin
  }
  IO.write(file,all.mkString(prefix,"\n","\n}\n"))
  Seq(file)
}.taskValue
But this solution suffers from other issues and is hard to maintain. Just imagine if unfold is not the only function to add globally. Overriding the default implementation is still hard, too. Bottom line: this is hard to maintain and does not "feel" right either.
So, is there a better way to achieve this?
Recommended answer
First, let's make a basic implementation of the function that takes an explicit Builder argument. For unfold, it can look like this:
import scala.language.higherKinds
import scala.annotation.tailrec
import scala.collection.GenTraversable
import scala.collection.mutable
import scala.collection.generic.{GenericCompanion, CanBuildFrom}
object UnfoldImpl {
  def unfold[CC[_], E, S](builder: mutable.Builder[E, CC[E]])(initial: S)(next: S => Option[(S, E)]): CC[E] = {
    @tailrec
    def build(state: S): CC[E] = {
      next(state) match {
        case None => builder.result()
        case Some((nextState, elem)) =>
          builder += elem
          build(nextState)
      }
    }
    build(initial)
  }
}
Now, what would be an easy way to get a builder for a collection by its type?
I can propose two possible solutions. The first is to make an implicit extension class that wraps a GenericCompanion, the common superclass of the companion objects of most of Scala's built-in collections. GenericCompanion has a method newBuilder that returns a Builder for the provided element type. An implementation may look like this:
implicit class Unfolder[CC[X] <: GenTraversable[X]](obj: GenericCompanion[CC]) {
  def unfold[S, E](initial: S)(next: S => Option[(S, E)]): CC[E] =
    UnfoldImpl.unfold(obj.newBuilder[E])(initial)(next)
}
It's very easy to use:
scala> List.unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res1: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
One drawback is that some collections don't have companion objects extending GenericCompanion, for example Array or user-defined collections.
Another possible solution is to use an implicit 'builder provider', like you have proposed. Scala already has such a thing in its collection library: CanBuildFrom. An implementation with CanBuildFrom may look like this:
object Unfolder2 {
  def apply[CC[_]] = new {
    def unfold[S, E](initial: S)(next: S => Option[(S, E)])(
      implicit cbf: CanBuildFrom[CC[E], E, CC[E]]
    ): CC[E] =
      UnfoldImpl.unfold(cbf())(initial)(next)
  }
}
Usage example:
scala> Unfolder2[Array].unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res1: Array[Int] = Array(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
This works with Scala's collections and with Array, and may work with user-defined collections if the user has provided a CanBuildFrom instance.
Note that neither approach works lazily with Streams. That's mostly because the underlying implementation, UnfoldImpl.unfold, uses a Builder, which for a Stream is eager.
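The difference can be observed by counting how often the step function runs. The following is only a sketch: EagerVsLazy, viaBuilder and viaCons are hypothetical names, and the loop in viaBuilder mirrors the builder-based unfold rather than calling UnfoldImpl itself.

```scala
object EagerVsLazy {
  // Counts how many times the step function has been invoked.
  var calls = 0

  // Step function: emits the square of the state until the state exceeds 5.
  def next(s: Int): Option[(Int, Int)] = {
    calls += 1
    if (s > 5) None else Some((s + 1, s * s))
  }

  // Builder-based unfold: the loop runs `next` to exhaustion before returning.
  def viaBuilder(z: Int): Stream[Int] = {
    val b = Stream.newBuilder[Int]
    var step = next(z)
    while (step.isDefined) {
      val (state, elem) = step.get
      b += elem
      step = next(state)
    }
    b.result()
  }

  // Cons-based unfold: only the head is computed up front,
  // the tail is a by-name argument of #:: and runs on demand.
  def viaCons(z: Int): Stream[Int] =
    next(z) match {
      case None                => Stream.empty
      case Some((state, elem)) => elem #:: viaCons(state)
    }
}
```

After resetting calls to 0, viaBuilder(1) leaves calls at 6 as soon as it returns (five elements plus the terminating None), whereas viaCons(1) leaves it at 1 until more of the stream is forced.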
To unfold a Stream lazily, you can't use the standard Builder. You'd have to provide a separate implementation using Stream.cons (or #::). To choose an implementation automatically, depending on the collection type requested by the user, you can use the typeclass pattern. Here is a sample implementation:
trait Unfolder3[E, CC[_]] {
  def unfold[S](initial: S)(next: S => Option[(S, E)]): CC[E]
}
trait UnfolderCbfInstance {
  // provides unfolder for types that have a `CanBuildFrom`
  // this is used only if the collection is not a `Stream`
  implicit def unfolderWithCBF[E, CC[_]](
    implicit cbf: CanBuildFrom[CC[E], E, CC[E]]
  ): Unfolder3[E, CC] =
    new Unfolder3[E, CC] {
      def unfold[S](initial: S)(next: S => Option[(S, E)]): CC[E] =
        UnfoldImpl.unfold(cbf())(initial)(next)
    }
}
object Unfolder3 extends UnfolderCbfInstance {
  // lazy implementation, that overrides `unfolderWithCBF` for `Stream`s
  implicit def streamUnfolder[E]: Unfolder3[E, Stream] =
    new Unfolder3[E, Stream] {
      def unfold[S](initial: S)(next: S => Option[(S, E)]): Stream[E] =
        next(initial).fold(Stream.empty[E]) {
          case (state, elem) =>
            elem #:: unfold(state)(next)
        }
    }
  def apply[CC[_]] = new {
    def unfold[E, S](initial: S)(next: S => Option[(S, E)])(
      implicit impl: Unfolder3[E, CC]
    ): CC[E] = impl.unfold(initial)(next)
  }
}
Now this implementation works eagerly for normal collections (including Array and user-defined collections with an appropriate CanBuildFrom), and lazily for Streams:
scala> Unfolder3[Array].unfold(1)(a => if (a > 10) None else Some(a + 1, a * a))
res0: Array[Int] = Array(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
scala> com.Main.Unfolder3[Stream].unfold(1)(a => if (a > 10) None else { println(a); Some(a + 1, a * a) })
1
res2: Stream[Int] = Stream(1, ?)
scala> res2.take(3).toList
2
3
res3: List[Int] = List(1, 4, 9)
Note that if Unfolder3.apply is moved to another object or class, the user won't have to import anything to do with Unfolder3 at all.
If you don't understand how this implementation works, you can read up on the typeclass pattern in Scala and on the order of implicit resolution.
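As a rough illustration of the prioritisation trick used above, here is a stripped-down sketch with made-up names (Describe, LowPriorityDescribe, generic, forInt, of). An implicit defined directly in the companion object is preferred over one inherited from a parent trait, which is the same mechanism that lets streamUnfolder win over unfolderWithCBF.

```scala
// Hypothetical typeclass used only to show implicit prioritisation.
trait Describe[A] { def describe: String }

trait LowPriorityDescribe {
  // Fallback instance, applicable to every type A.
  implicit def generic[A]: Describe[A] =
    new Describe[A] { def describe: String = "generic" }
}

object Describe extends LowPriorityDescribe {
  // Specific instance, defined in the companion itself,
  // so it takes precedence over the inherited fallback for Int.
  implicit val forInt: Describe[Int] =
    new Describe[Int] { def describe: String = "int-specific" }

  // Summons whichever instance implicit resolution selects for A.
  def of[A](implicit d: Describe[A]): String = d.describe
}
```

With these definitions, Describe.of[Int] selects forInt, while Describe.of[String] falls back to generic, and the caller needs no import because both instances live in the implicit scope of Describe.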