问题描述
我试图推行以计算积分图像使用Repa的累加值功能。我的当前实现如下所示:
I am attempting to implement a cumulative sum function using Repa in order to calculate integral images. My current implementation looks like the following:
cumsum :: (Elt a, Num a) => Array DIM2 a -> Array DIM2 a
cumsum array = traverse array id cumsum'
where
elementSlice inner outer = slice array (inner :. (0 :: Int))
cumsum' f (inner :. outer) = Repa.sumAll $ elementSlice inner outer
的问题是在elementSlice功能。在MATLAB或说numpy的,这可能被指定为数组[内,0:外部]。所以,我在找的是沿着线的东西:
The problem is in the elementSlice function. In matlab or say numpy this could be specified as array[inner,0:outer]. So what I am looking for is something along the lines of:
slice array (inner :. (Range 0 outer))
然而,似乎切片不允许在Repa当前指定以上范围。我考虑过使用分区,在高效的并行模板卷积在Haskell的讨论,但是这似乎是一个相当重量级的方法,如果他们将改变每个迭代。我也算是屏蔽片(由二元矢量相乘) - 但这又似乎像它会在大型矩阵非常糟糕,因为执行我会在矩阵中的每个点......
However, it seems that slices are not allowed to be specified over ranges currently in Repa. I considered using partitions as discussed in Efficient Parallel Stencil Convolution in Haskell, but this seems like a rather heavyweight approach if they would be changed every iteration. I also considered masking the slice (multiplying by a binary vector) - but this again seemed like it would perform very poorly on large matrices since I would be allocating a mask vector for every point in the matrix...
我的问题 - 没有人知道是否有计划增加对一系列切片到Repa支持?还是有高性能的方法可以让我已经与不同的方法攻击这个问题,也许?
My question - does anyone know if there are plans to add support for slicing over a range to Repa? Or is there a performant way I can already attack this problem, maybe with a different approach?
推荐答案
其实,我认为主要的问题是,Repa没有扫描原始。 (但是,非常相似的文库的一样。)有扫描的两个变体,preFIX扫描和后缀扫描。由于一维数组
Actually, I think the main problem is that Repa does not have a scan primitive. (However, a very similar library Accelerate does.) There are two variants of scan, prefix scan and suffix scan. Given a 1D array
[A_1,...,A_N]
preFIX扫描返回
prefix scan returns
[0,A_0,A_0 + A_1,...,A_0 + ... + A_ {N-1}]
而后缀扫描产生
[A_0,A_0 + A_1,...,A_0 + A_1 + ... + A_N]
我想这是你要的东西与你的累计总和( cumsum
)的功能。
I assume this is what you are going for with your cumulative sum (cumsum
) function.
preFIX和后缀扫描推广很自然地到多维数组并基于简树的高效实现。就这个话题比较旧纸是。此外,Conal埃利奥特最近写的的的帖子在哈斯克尔推导高效的并行扫描。
Prefix and suffix scans generalise quite naturally to multi-dimensional arrays and have an efficient implementation based on tree reduction. A relatively old paper on the topic is "Scan Primitives for Vector Computers". Also, Conal Elliott has recently written several blog posts on deriving efficient parallel scans in Haskell.
的积分图像(2D阵列上)可以通过执行两次扫描一水平和一个垂直地进行计算。在没有扫描原始的我已经实现的,效率很低。
The integral image (on a 2D array) can be calculated by doing two scans, one horizontally and one vertically. In the absence of a scan primitive I have implemented one, very inefficiently.
horizScan :: Array DIM2 Int -> Array DIM2 Int
horizScan arr = foldl addIt arr [0 .. n - 1]
where
addIt :: Array DIM2 Int -> Int -> Array DIM2 Int
addIt accum i = accum +^ vs
where
vs = toAdd (i+1) n (slice arr (Z:.All:.i))
(Z:.m:.n) = arrayExtent arr
--
-- Given an @i@ and a length @len@ and a 1D array @arr@
-- (of length n) produces a 2D array of dimensions n X len.
-- The columns are filled with zeroes up to row @i@.
-- Subsequently they are filled with columns equal to the
-- values in @arr.
--
toAdd :: Int -> Int -> Array DIM1 Int -> Array DIM2 Int
toAdd i len arr = traverse arr (\sh -> sh:.(len::Int))
(\_ (Z:.n:.m) -> if m >= i then arr ! (Z:.n) else 0)
,用于计算积分图像的功能可以被定义为
The function for calculating the integral image can then be defined as
vertScan :: Array DIM2 Int -> Array DIM2 Int
vertScan = transpose . horizScan . transpose
integralImage = horizScan . vertScan
由于扫描已实施加速,它不应该太硬将其添加到Repa。我不知道,如果使用现有的Repa原语的高效实现是可能或没有。
Given that scan has been implemented for Accelerate, it shouldn't be too hard to add it to Repa. I am not sure if an efficient implementation using the existing Repa primitives is possible or not.
这篇关于如何采取一个数组片断与Repa在一定范围内的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!