How to normalize the pixel values of a UIImage in Swift?

Problem Description

We are attempting to normalize a UIImage so that it can be passed correctly into a CoreML model.

The way we retrieve the RGB values from each pixel is by first initializing a [CGFloat] array called rawData, with one position per pixel for each of the red, green, blue, and alpha components. bitmapInfo describes the layout of the raw pixel values taken from the original UIImage itself, and it fills the bitmapInfo parameter of context, a CGContext variable. We later use the context variable to draw a CGImage, and then convert the normalized CGImage back into a UIImage.

Using a nested for-loop iterating through the x and y coordinates, we find the minimum and maximum color values (read from the rawData array of CGFloats) across all channels of all pixels. A bound variable is set to terminate the for-loop early; otherwise it would raise an out-of-range error.

range indicates the range of possible RGB values (i.e. the difference between the maximum and minimum color values).

Each pixel value is then normalized using the equation:

A = Image
curPixel = current pixel (R, G, B, or Alpha)
NormalizedPixel = (curPixel - minPixel(A)) / range

A similarly designed nested for-loop then parses the rawData array and modifies each pixel's colors according to this normalization.
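
As a minimal sketch of that per-pixel formula in Swift (the standalone helper and its name are ours, for illustration only):

import CoreGraphics

func normalizeComponent(_ value: CGFloat, minPixel: CGFloat, maxPixel: CGFloat) -> CGFloat {
    let range = maxPixel - minPixel
    guard range > 0 else { return 0 }   // guard against division by zero on flat images
    return (value - minPixel) / range   // maps [minPixel, maxPixel] onto [0, 1]
}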

Most of our code is from:

  1. UIImage to UIColor array of pixel colors
  2. Change color of certain pixels in a UIImage
  3. https://gist.github.com/pimpapare/e8187d82a3976b851fc12fe4f8965789

We use CGFloat instead of UInt8 because the normalized pixel values should be real numbers between 0 and 1, not just 0 or 1.

func normalize() -> UIImage?{

    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = Int(size.width)
    let height = Int(size.height)

    var rawData = [CGFloat](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bytesPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue & CGBitmapInfo.alphaInfoMask.rawValue

    let context = CGContext(data: &rawData,
                            width: width,
                            height: height,
                            bitsPerComponent: bytesPerComponent,
                            bytesPerRow: bytesPerRow,
                            space: colorSpace,
                            bitmapInfo: bitmapInfo)

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context?.draw(cgImage, in: drawingRect)

    let bound = rawData.count

    //find minimum and maximum
    var minPixel: CGFloat = 1.0
    var maxPixel: CGFloat = 0.0

    for x in 0..<width {
        for y in 0..<height {

            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel

            if(byteIndex > bound - 4){
                break
            }
            minPixel = min(CGFloat(rawData[byteIndex]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 1]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 2]), minPixel)

            minPixel = min(CGFloat(rawData[byteIndex + 3]), minPixel)


            maxPixel = max(CGFloat(rawData[byteIndex]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 1]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 2]), maxPixel)

            maxPixel = max(CGFloat(rawData[byteIndex + 3]), maxPixel)
        }
    }

    let range = maxPixel - minPixel
    print("minPixel: \(minPixel)")
    print("maxPixel : \(maxPixel)")
    print("range: \(range)")

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel

            if(byteIndex > bound - 4){
                break
            }
            rawData[byteIndex] = (CGFloat(rawData[byteIndex]) - minPixel) / range
            rawData[byteIndex+1] = (CGFloat(rawData[byteIndex+1]) - minPixel) / range
            rawData[byteIndex+2] = (CGFloat(rawData[byteIndex+2]) - minPixel) / range

            rawData[byteIndex+3] = (CGFloat(rawData[byteIndex+3]) - minPixel) / range

        }
    }

    let cgImage0 = context!.makeImage()
    return UIImage.init(cgImage: cgImage0!)
}

Before normalization, we expect the pixel values to range from 0 to 255; after normalization, they should range from 0 to 1.

The normalization formula does map pixel values to values between 0 and 1. But when we print the pixel values before normalization (by simply adding print statements while looping through them) to verify that we are reading the raw pixel values correctly, we find that their range is off. For example, one pixel component has the value 3.506e+305 (far larger than 255). We think we are reading the raw pixel values incorrectly from the start.

We are not familiar with image processing in Swift and are not sure whether the whole normalization process is right. Any help would be appreciated!

Recommended Answer

A few observations:

  1. Your rawData is a floating-point (CGFloat) array, but your context isn't populating it with floating-point data; it is filling it with UInt8 data. If you want a floating-point buffer, build a floating-point context with CGBitmapInfo.floatComponents and tweak the context parameters accordingly. E.g.:

func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = cgImage.width
    let height = cgImage.height

    var rawData = [Float](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 16      // 4 components × 4 bytes per Float
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 32   // each component is a 32-bit float

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.floatComponents.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo) else { return nil }

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context.draw(cgImage, in: drawingRect)

    var maxValue: Float = 0
    var minValue: Float = 1

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {   // scan R, G, B only; skip alpha
            let value = rawData[offset]
            if value > maxValue { maxValue = value }
            if value < minValue { minValue = value }
        }
    }
    let range = maxValue - minValue
    guard range > 0 else { return nil }

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            rawData[offset] = (rawData[offset] - minValue) / range
        }
    }

    return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
}
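
Since normalize() refers to cgImage, scale, and imageOrientation, it is presumably declared inside an extension UIImage { ... }. A call site might then look like this (a minimal sketch; the image name is hypothetical):

if let normalized = UIImage(named: "sample")?.normalize() {
    // use the contrast-stretched image, e.g. hand it to the CoreML pipeline
}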

  • But this begs the question of why you'd bother with floating-point data at all. If you were returning this floating-point data to your ML model, I could imagine it being useful, but here you're just creating a new image. Because of that, you also have the opportunity to retrieve just the UInt8 data, do the floating-point math, then update the UInt8 buffer and create the image from that. Thus:

    func normalize() -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
    
        guard let cgImage = cgImage else {
            return nil
        }
    
        let width = cgImage.width
        let height = cgImage.height
    
        var rawData = [UInt8](repeating: 0, count: width * height * 4)
        let bytesPerPixel = 4
        let bytesPerRow = bytesPerPixel * width
        let bitsPerComponent = 8
    
        let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue
    
        guard let context = CGContext(data: &rawData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: bitsPerComponent,
                                      bytesPerRow: bytesPerRow,
                                      space: colorSpace,
                                      bitmapInfo: bitmapInfo) else { return nil }
    
        let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
        context.draw(cgImage, in: drawingRect)
    
        var maxValue: UInt8 = 0
        var minValue: UInt8 = 255
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                let value = rawData[offset]
                if value > maxValue { maxValue = value }
                if value < minValue { minValue = value }
            }
        }
        let range = Float(maxValue - minValue)
        guard range > 0 else { return nil }
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                // normalize to 0...1, then scale back up to the 0...255 UInt8 range
                rawData[offset] = UInt8(Float(rawData[offset] - minValue) / range * 255)
            }
        }
    
        return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
    }
    

    It just depends upon whether you really need this floating-point buffer for your ML model (in which case you might return the array of floats in the first example rather than creating a new image, as sketched below) or whether the goal was simply to create a normalized UIImage.
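
    If the model is what needs the normalized values, a hedged sketch of that variant might look like the following (the name normalizedPixelData is ours; the setup mirrors the floating-point example above, declared in the same UIImage extension, but it hands back the floats instead of rendering an image):

    func normalizedPixelData() -> [Float]? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()

        guard let cgImage = cgImage else { return nil }

        let width = cgImage.width
        let height = cgImage.height

        var rawData = [Float](repeating: 0, count: width * height * 4)
        let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.floatComponents.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

        guard let context = CGContext(data: &rawData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 32,
                                      bytesPerRow: 16 * width,
                                      space: colorSpace,
                                      bitmapInfo: bitmapInfo) else { return nil }

        context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))

        var maxValue: Float = 0
        var minValue: Float = 1

        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {   // color channels only
                maxValue = max(maxValue, rawData[offset])
                minValue = min(minValue, rawData[offset])
            }
        }

        let range = maxValue - minValue
        guard range > 0 else { return nil }

        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                rawData[offset] = (rawData[offset] - minValue) / range
            }
        }

        return rawData   // normalized RGBA floats, ready to feed to the model
    }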

    I benchmarked this, and it was a tad faster on an iPhone XS Max than the floating-point rendition, but it takes a quarter of the memory (e.g. a 2000×2000 px image takes 16 MB with UInt8, but 64 MB with Float, since each pixel occupies 4 bytes rather than 16).

    Finally, I should mention that vImage has a highly optimized function, vImageContrastStretch_ARGB8888, that does something very similar to what we've done above. Just import Accelerate and then you can do something like:

    func normalize3() -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
    
        guard let cgImage = cgImage else { return nil }
    
        var format = vImage_CGImageFormat(bitsPerComponent: UInt32(cgImage.bitsPerComponent),
                                          bitsPerPixel: UInt32(cgImage.bitsPerPixel),
                                          colorSpace: Unmanaged.passRetained(colorSpace),
                                          bitmapInfo: cgImage.bitmapInfo,
                                          version: 0,
                                          decode: nil,
                                          renderingIntent: cgImage.renderingIntent)
    
        // wrap the source CGImage's pixels in a vImage buffer
        var source = vImage_Buffer()
        var result = vImageBuffer_InitWithCGImage(
            &source,
            &format,
            nil,
            cgImage,
            vImage_Flags(kvImageNoFlags))
    
        guard result == kvImageNoError else { return nil }
    
        defer { free(source.data) }
    
        // allocate a destination buffer with the same dimensions (32 bits per pixel)
        var destination = vImage_Buffer()
        result = vImageBuffer_Init(
            &destination,
            vImagePixelCount(cgImage.height),
            vImagePixelCount(cgImage.width),
            32,
            vImage_Flags(kvImageNoFlags))
    
        guard result == kvImageNoError else { return nil }
    
        // stretch the histogram so the output spans the full 0-255 range
        result = vImageContrastStretch_ARGB8888(&source, &destination, vImage_Flags(kvImageNoFlags))
        guard result == kvImageNoError else { return nil }
    
        defer { free(destination.data) }
    
        return vImageCreateCGImageFromBuffer(&destination, &format, nil, nil, vImage_Flags(kvImageNoFlags), nil).map {
            UIImage(cgImage: $0.takeRetainedValue(), scale: scale, orientation: imageOrientation)
        }
    }
    

    While this employs a slightly different algorithm, it's worth considering, because in my benchmarking on my iPhone XS Max it was over five times as fast as the floating-point rendition.


    A few unrelated observations:

    1. Your code snippet is normalizing the alpha channel, too. I'm not sure you'd want to do that. Usually the color and alpha channels are independent, so above I assume you really want to normalize just the color channels. If you also want to normalize the alpha channel, keep a separate min-max range for the alpha values and process them separately; it doesn't make much sense to normalize the alpha channel with the same range of values as the color channels (or vice versa).

    2. Rather than using the UIImage width and height, I'm using the values from the CGImage. This is an important distinction in case your images might not have a scale of 1.

    3. You might want to consider an early exit if, for example, the range is already 0-255 (i.e. no normalization is needed); a minimal sketch of that check follows below.
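
    For instance, in the UInt8 rendition, right after computing minValue and maxValue you could bail out when the image already spans the full range (this check is our illustration, not part of the original answer):

    if minValue == 0 && maxValue == 255 {
        return self   // already spans the full 0-255 range; nothing to stretch
    }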
