Problem description
In Python 3.x + TensorFlow, if I have two TF vectors, point_x and point_y (same shape), that represent the X and Y coordinates of some number of points, how do I find all unique points?
I was able to hack this together in Theano using a complex vector, with X in the real part and Y in the imaginary part:
complex_points = point_x + point_y * 1j
unique_points, idxs, groups = T.extra_ops.Unique(True, True, False)(complex_points)
The TF equivalent I'm trying is:
complex_points = tf.complex(point_x, point_y)
unique_points, groups = tf.unique(complex_points)
TensorFlow fails with something like:
InvalidArgumentError: No OpKernel was registered to support Op 'Unique' with these attrs.
... # supported types include the float/int/string types, no complex types
[[Node: Unique_1 = Unique[T=DT_COMPLEX64, out_idx=DT_INT32](Complex_1)]]
Clearly, no one has implemented/registered a complex version of the "unique" op. Any idea how to accomplish this task?
Recommended answer
Well, here's an even hackier solution: use a bit-level cast.
If your tensors are all of type tf.float32, you can use:
xy = tf.transpose(tf.stack([point_x, point_y]))  # tf.stack was named tf.pack before TensorFlow 1.0
xy64 = tf.bitcast(xy, type=tf.float64)
unique64, idx = tf.unique(xy64)
unique_points = tf.bitcast(unique64, type=tf.float32)
The principle behind this is to put the x and y coordinates side by side and let TensorFlow treat each (x, y) pair of 32-bit floats as a single 64-bit float, so that tf.unique can work on the resulting 1-D tensor. Finally, the 64-bit floats are bitcast back into pairs of genuine float32 values, as desired.
Note: this method is really hacky, and you risk running into NaN, infinity, or other strange values. But the chance is really slim.
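To make the bit-level idea above concrete, here is a minimal sketch of the same reinterpretation done in NumPy rather than TensorFlow, with `ndarray.view` standing in for `tf.bitcast`. The sample values are made up for illustration. Note that `np.unique` returns sorted values, whereas `tf.unique` keeps first-occurrence order and also returns an index tensor, so this only mirrors the principle, not the exact output.

```python
import numpy as np

# Hypothetical sample data; points (1, 2) and (3, 4) each appear twice.
point_x = np.array([1.0, 3.0, 1.0, 5.0, 3.0], dtype=np.float32)
point_y = np.array([2.0, 4.0, 2.0, 6.0, 4.0], dtype=np.float32)

# Shape (N, 2): each row holds one (x, y) pair in 8 contiguous bytes.
xy = np.ascontiguousarray(np.stack([point_x, point_y], axis=1))

# Reinterpret each 8-byte row as one float64 (the bits are not converted).
xy64 = xy.view(np.float64).reshape(-1)

# Deduplicate the 1-D array, then reinterpret the bits as float32 pairs again.
unique64 = np.unique(xy64)
unique_points = unique64.view(np.float32).reshape(-1, 2)
```

Two pairs compare equal after the cast only if their bit patterns match, so 0.0 and -0.0 in the inputs would count as different points here.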
Another possible workaround: if your data type is integer, you can pack two integers into one, the way a compiler converts 2-D indices into 1-D ones. Say x = [1, 2, 3, 2] and y = [0, 1, 0, 1]; you can compress x and y into one tensor with x*10+y, then find the unique values in this compressed array. Here 10 is just a large enough multiplier: for non-negative y, any value larger than max(y) should work, as long as the packed result does not overflow the integer type.
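The integer packing can be sketched as follows, again in NumPy for brevity; the same elementwise arithmetic should carry over to TensorFlow tensors. The multiplier `base` is a name introduced here for illustration, chosen as max(y) + 1 under the assumption that y is non-negative.

```python
import numpy as np

x = np.array([1, 2, 3, 2], dtype=np.int64)
y = np.array([0, 1, 0, 1], dtype=np.int64)

# Assumes y is non-negative; base must exceed max(y) so codes cannot collide.
base = int(y.max()) + 1

# One integer code per (x, y) pair.
packed = x * base + y
unique_packed = np.unique(packed)

# Undo the packing: integer division recovers x, the remainder recovers y.
unique_x = unique_packed // base
unique_y = unique_packed % base
```

With the sample data this leaves the three distinct points (1, 0), (2, 1) and (3, 0).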
Lastly, if you don't have any reason to do this inside TensorFlow, it might be better to do it outside, say, in numpy. You can evaluate the tensors, remove the duplicate values in numpy, then use the resulting numpy arrays to generate new tensors and feed them to the rest of your network.
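As a rough sketch of the numpy route, assuming the two tensors have already been evaluated into arrays (the values below are placeholders), `np.unique` with `axis=0` deduplicates whole rows, so no complex or bitcast trick is needed. The `axis` argument requires NumPy 1.13 or later.

```python
import numpy as np

# Stand-ins for the arrays obtained by evaluating point_x and point_y.
point_x = np.array([1.0, 3.0, 1.0, 5.0], dtype=np.float32)
point_y = np.array([2.0, 4.0, 2.0, 6.0], dtype=np.float32)

# Shape (N, 2): one row per point.
points = np.stack([point_x, point_y], axis=1)

# axis=0 treats each row as one item; return_inverse maps every input
# point to the index of its row in unique_points.
unique_points, groups = np.unique(points, axis=0, return_inverse=True)
groups = groups.reshape(-1)  # flatten, since the inverse shape differs across NumPy versions
```

The resulting arrays can then be fed back into the graph as new tensors.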