Problem description
This might be a silly question, but I can't figure out how Spark reads my image when I use spark.read.format("image").load(....).
Importing my image gives me the following:
>>> image_df.select("image.height", "image.width", "image.nChannels", "image.mode", "image.data").show()
+------+-----+---------+----+--------------------+
|height|width|nChannels|mode|                data|
+------+-----+---------+----+--------------------+
|   430|  470|        3|  16|[4D 55 4E 4C 54 4...|
+------+-----+---------+----+--------------------+
I arrive at the following conclusions:
- my image is 430 pixels high and 470 pixels wide,
- my image is in colour (3 channels, since nChannels = 3), stored in an OpenCV-compatible type,
- my image mode is 16, which corresponds to a particular OpenCV type code.
- Does someone know which website/documentation I could browse to learn more about it?
- the data in the data column is of type Binary, but when I run image_df.select("image.data").take(1) I get an output which seems to be a single array (see below):
>>> image_df.select("image.data").take(1)
# 1/ Here are the last elements of the result
....<<One Eternity Later>>....x92\x89\x8a\x8d\x84\x86\x89\x80\x84\x87~'))]
# 2/ Several parts of the result also look like this:
.....\x89\x80\x80\x83z|\x7fvz}tpsjqtkrulsvmsvmsvmrulrulrulqtkpsjnqhnqhmpgmpgmpgnqhnqhn
qhnqhnqhnqhnqhnqhmpgmpgmpgmpgmpgmpgmpgmpgnqhnqhnqhnqhnqhnqhnqhnqhknejmdilcilchkbh
kbilcilckneloflofmpgnqhorioripsjsvmsvmtwnvypx{ry|sz}t{~ux{ry|sy|sy|sy|sz}tz}tz}tz}
ty|sy|sy|sy|sz}t{~u|\x7fv|\x7fv}.....
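On the mode value of 16 shown in the table: OpenCV type codes pack the element depth into the low three bits and the channel count minus one into the bits above them, so 16 decodes to 8-bit unsigned with 3 channels (CV_8UC3). A minimal sketch of that decoding follows; the helper name decode_ocv_type is made up for illustration and is not part of Spark or OpenCV.

```python
# Depth codes used by OpenCV type constants (the low 3 bits of a type code).
OCV_DEPTHS = {0: "CV_8U", 1: "CV_8S", 2: "CV_16U", 3: "CV_16S",
              4: "CV_32S", 5: "CV_32F", 6: "CV_64F"}


def decode_ocv_type(mode):
    """Hypothetical helper: split an OpenCV type code into (depth name, channels)."""
    depth = mode & 7            # low 3 bits: element depth
    channels = (mode >> 3) + 1  # remaining bits: number of channels minus one
    return OCV_DEPTHS[depth], channels


print(decode_ocv_type(16))  # ('CV_8U', 3)
```

Read this way, nChannels = 3 and mode = 16 say the same thing: one unsigned byte per channel, three channels per pixel.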
What comes next relates to the results displayed above. My questions may stem from my lack of knowledge about OpenCV (or something else). Nonetheless:
- 1/ If I have an RGB image, I would expect 3 matrices, yet the output ends with .......\x84\x87~'))]. I was expecting something more like [(...),(...),(...\x87~')].
- 2/ Does this part have a special meaning? Is it a separator between the matrices, or something else?
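On question 2/, the readable runs are not separators. Python prints a bytes value by showing every byte that happens to be a printable ASCII character as that character and every other byte as a \xNN escape, so nqh is simply three bytes with the values 110, 113 and 104. A small sketch using a fragment copied from the output above (the variable names are illustrative):

```python
# A fragment copied from the output above: printable bytes show up as letters.
fragment = b"\x89\x80\x80\x83z|\x7fvz}tpsjqtk"

# Iterating over a bytes value yields plain integers in the range 0-255.
values = list(fragment)
print(values[:4])    # [137, 128, 128, 131]
print(list(b"nqh"))  # [110, 113, 104]

# A flat buffer for a 430 x 470 image with 3 channels needs no separators:
# its length alone accounts for every channel of every pixel.
expected_len = 430 * 470 * 3
print(expected_len)  # 606300
```

This also bears on question 1/: the output is one flat sequence of bytes, not three nested matrices.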
To be clearer about what I'm trying to achieve: I want to process images in order to compare pixels between them. Therefore, I want to know the pixel values at a given position in my image (I assume that with an RGB image I will have 3 values for a given position).
Example: let's say that I have a webcam pointing at the sky, only during the day, and I want to know the values of a pixel at a position corresponding to the top-left part of the sky. Suppose I find that the concatenation of those values gives the colour Light Blue, which means the photo was taken on a sunny day. Let's say that Light Blue is the only colour a sunny day can produce.
Next, I want to compare that concatenation with the concatenation of the pixel values at the exact same position in a picture taken the next day. If they are not equal, I conclude that the second picture was taken on a cloudy/rainy day. If they are equal, it was a sunny day.
Any help on that would be highly appreciated. I have simplified my example to make it easier to understand, but my goal is pretty much the same. I know that ML models exist for this kind of thing, but I would be happy to try this first. My first goal is to split this column into 3 columns, one per colour: a red matrix, a green matrix and a blue matrix.
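The comparison described above can be sketched without any image library, assuming the data column is a flat, row-major buffer with nChannels bytes per pixel (which is what the experiment further down suggests). The helper names pixel_at and same_colour, the tolerance parameter and the tiny 2x2 images are illustrative assumptions, not part of Spark.

```python
def pixel_at(data, width, n_channels, row, col):
    """Return the channel values of the pixel at (row, col) in a flat row-major buffer."""
    start = (row * width + col) * n_channels
    return tuple(data[start:start + n_channels])


def same_colour(p, q, tolerance=0):
    """True when every channel differs by at most `tolerance` (0 means exact equality)."""
    return all(abs(a - b) <= tolerance for a, b in zip(p, q))


# Two made-up 2x2 images with 3 channels each, standing in for two days' photos.
day1 = bytes([230, 216, 173, 10, 10, 10,
              20, 20, 20, 30, 30, 30])
day2 = bytes([128, 128, 128, 10, 10, 10,
              20, 20, 20, 30, 30, 30])

top_left_1 = pixel_at(day1, width=2, n_channels=3, row=0, col=0)
top_left_2 = pixel_at(day2, width=2, n_channels=3, row=0, col=0)
print(top_left_1, top_left_2, same_colour(top_left_1, top_left_2))
```

With real photos an exact match is unlikely even between two sunny days, so a non-zero tolerance is probably more practical than strict equality.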
I think I have the logic. I used the keras.preprocessing.image.img_to_array() function to understand how the values are laid out (since I have an RGB image, I must have 3 matrices: one for each of the colours R, G and B). I'm posting this in case someone wonders how it works. I might be wrong, but I think I have something:
from keras.preprocessing import image
import numpy as np
from PIL import Image
# Using spark built-in data source (imageSchema must already be defined, e.g. taken from pyspark.ml.image.ImageSchema)
first_img = spark.read.format("image").schema(imageSchema).load(".....")
raw = first_img.select("image.data").take(1)[0][0]
np.shape(raw)
(606300,) # which is 470*430*3
# Using keras function
img = image.load_img(".../path/to/img")
yy = image.img_to_array(img)
>>> np.shape(yy)
(430, 470, 3) # the shape is right, but the channel order differs:
>>> raw[0], raw[1], raw[2]
(77, 85, 78)
>>> yy[0][0]
array([78., 85., 77.], dtype=float32)
# Therefore I used the numpy reshape function directly on raw
# to get 430 matrices of 470 rows and 3 columns:
array = np.reshape(raw, (430,470,3))
xx = image.img_to_array(array) # OPTIONAL and not used here
>>> array[0][0] == (raw[0],raw[1],raw[2])
array([ True, True, True])
>>> array[0][1] == (raw[3],raw[4],raw[5])
array([ True, True, True])
>>> array[0][2] == (raw[6],raw[7],raw[8])
array([ True, True, True])
>>> array[0][3] == (raw[9],raw[10],raw[11])
array([ True, True, True])
So if I understood correctly, Spark reads the image as one big flat array, (606300,) here, in which the elements are ordered pixel by pixel, each pixel contributing one value per colour channel. Note that raw[0], raw[1], raw[2] is the reverse of what keras returns for the same pixel, so the channels appear to be stored as B G R rather than R G B.
After my small transformation, I obtain 430 matrices of 470 rows x 3 columns. Since my image is 470x430 (width x height), each matrix corresponds to one height position (one row of pixels); inside each matrix there is one row per width position (470) and one column per colour channel (3).
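The original goal of getting one matrix per colour can be sketched with NumPy slicing on the reshaped array. This assumes the bytes are stored in B, G, R order, which is what the (77, 85, 78) versus [78., 85., 77.] comparison above suggests; the small synthetic buffer below stands in for the real raw value.

```python
import numpy as np

height, width, n_channels = 2, 3, 3  # tiny stand-in for 430 x 470 x 3

# Synthetic flat buffer: every pixel is (B, G, R) = (77, 85, 78), like raw[0:3] above.
raw = bytearray([77, 85, 78] * (height * width))

# Interpret the bytes as uint8 and give them the (height, width, channels) shape.
img = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, n_channels)

# One (height, width) matrix per channel, assuming B, G, R storage order.
blue, green, red = img[:, :, 0], img[:, :, 1], img[:, :, 2]

# Reversing the last axis gives R, G, B order, matching what keras returned.
rgb = img[:, :, ::-1]

print(blue.shape, rgb[0, 0].tolist())  # (2, 3) [78, 85, 77]
```

The pixel at a given position is then img[row, col], and red[row, col] gives a single channel value at that position.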
Hope that helps someone :)!