In short: How to hash a free polyomino?

This could be generalized into: How to efficiently hash an arbitrary collection of 2D integer coordinates, where a set contains unique pairs of non-negative integers, and a set is considered unique if and only if no translation, rotation, or flip can map it identically to another set?

For impatient readers, please note I'm fully aware of a brute force approach. I'm looking for a better way -- or a very convincing proof that no other way can exist.

I'm working on some different algorithms to generate random polyominos. I want to test their output to determine how random they are -- i.e. are certain instances of a given order generated more frequently than others. Visually, it is very easy to identify different orientations of a free polyomino, for example the following Wikipedia illustration shows all 8 orientations of the "F" pentomino (Source):

How would one put a number on this polyomino - that is, hash a free polyomino? I don't want to depend on a prepopulated list of "named" polyominos. Broadly agreed-upon names only exist for orders 4 and 5, anyway.

This is not necessarily equivalent to enumerating all free (or one-sided, or fixed) polyominos of a given order. I only want to count the number of times a given configuration appears. If a generating algorithm never produces a certain polyomino it will simply not be counted.

The basic logic of the counting is:

testcount = 10000 // Arbitrary
order = 6         // Create hexominos in this test
hashcounts = new hashtable
for i = 1 to testcount
    poly = GenerateRandomPolyomino(order)
    hash = PolyHash(poly)
    if hashcounts.contains(hash) then
        hashcounts[hash] = hashcounts[hash] + 1
    else
        hashcounts[hash] = 1
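
For reference, the counting loop can be sketched in Python. This is only a sketch: `tally_hashes` is a name made up here, and the lambda is a stand-in that merely ignores the order in which cells are listed. It is not the free-polyomino hash being asked for.

```python
from collections import Counter

def tally_hashes(polyominoes, poly_hash):
    """Count how many times each hash value occurs among the polyominoes."""
    return Counter(poly_hash(p) for p in polyominoes)

# Stand-in hash (an assumption for illustration): sorted tuple of the cells.
samples = [[(0, 0), (1, 0)], [(1, 0), (0, 0)], [(0, 0), (0, 1)]]
counts = tally_hashes(samples, lambda cells: tuple(sorted(cells)))
for key, count in counts.items():
    print(key, count)
```

Any real PolyHash can be passed in place of the lambda, as long as its return value is hashable.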

What I'm looking for is an efficient PolyHash algorithm. The input polyominos are simply defined as a set of coordinates. One orientation of the T tetromino could be, for example:

[[1,0], [0,1], [1,1], [2,1]]:

 |012
-+---
0| X
1|XXX

You can assume that the input polyomino will already be normalized to be aligned against the X and Y axes and have only non-negative coordinates. Formally, each set:

  • Will have at least 1 coordinate where the x value is 0
  • Will have at least 1 coordinate where the y value is 0
  • Will not have any coordinates where x < 0 or y < 0

I'm really looking for novel algorithms that avoid the increasing number of integer operations required by a general brute force approach, described below.

Brute force

A brute force solution suggested here and here consists of hashing each set as an unsigned integer using each coordinate as a binary flag, and taking the minimum hash of all possible rotations (and in my case flips), where each rotation / flip must also be translated to the origin. This results in a total of 23 set operations for each input set to get the "free" hash:

  • Rotate (6x)
  • Flip (1x)
  • Translate (7x)
  • Hash (8x)
  • Find minimum of computed hashes (1x)

Where the sequence of operations to obtain each hash is:

  1. Hash
  2. Rotate, Translate, Hash
  3. Rotate, Translate, Hash
  4. Rotate, Translate, Hash
  5. Flip, Translate, Hash
  6. Rotate, Translate, Hash
  7. Rotate, Translate, Hash
  8. Rotate, Translate, Hash
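
To make the baseline concrete, here is a minimal Python sketch of that brute force. The function names are invented for this example, and it assumes a connected polyomino, so that after translation every coordinate is smaller than the number of cells and each cell maps to its own bit. For simplicity it applies a rotation before each of the eight hashes instead of reusing the unrotated input.

```python
def normalize(cells):
    """Translate the cells so the smallest x and the smallest y are both 0."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return [(x - min_x, y - min_y) for x, y in cells]

def bit_hash(cells):
    """Use each cell as a binary flag in an n-by-n grid, n = number of cells."""
    n = len(cells)
    return sum(1 << (y * n + x) for x, y in cells)

def free_hash(cells):
    """Minimum bit hash over all 8 orientations (4 rotations x optional flip)."""
    hashes = []
    for flip in (False, True):
        current = [(-x, y) if flip else (x, y) for x, y in cells]
        for _ in range(4):
            current = [(y, -x) for x, y in current]  # rotate by 90 degrees
            hashes.append(bit_hash(normalize(current)))
    return min(hashes)

t_up = [(1, 0), (0, 1), (1, 1), (2, 1)]
t_side = [(0, 0), (0, 1), (0, 2), (1, 1)]
print(free_hash(t_up) == free_hash(t_side))
```

The two T tetromino orientations hash to the same value, while a different tetromino such as the L gets a different one.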

Well, I came up with a completely different approach. (Also thanks to corsiKa for some helpful insights!) Rather than hashing / encoding the squares, encode the path around them. The path consists of a sequence of 'turns' (including no turn) to perform before drawing each unit segment. I think an algorithm for getting the path from the coordinates of the squares is outside the scope of this question.

This does something very important: it destroys all location and orientation information, which we don't need. It is also very easy to get the path of the flipped object: you do so by simply reversing the order of the elements. Storage is compact because each element requires only 2 bits.

It does introduce one additional constraint: the polyomino must not have fully enclosed holes. (Formally, it must be simply connected.) Most discussions of polyominos consider a hole to exist even if it is sealed only by two touching corners, as this prevents tiling with any other non-trivial polyomino. Tracing the edges is not hindered by touching corners (as in the single heptomino with a hole), but it cannot leap from one outer loop to an inner one as in the complete ring-shaped octomino:

It also produces one additional challenge: finding the minimum ordering of the encoded path loop. This is because any rotation of the path (in the sense of string rotation) is a valid encoding. To always get the same encoding we have to find the minimal (or maximal) rotation of the path instructions. Thankfully this problem has already been solved: see for example http://en.wikipedia.org/wiki/Lexicographically_minimal_string_rotation.
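
As a rough sketch in Python, the canonical form can be found by simply trying every rotation. This is quadratic in the path length, which is fine for short paths; the linked article describes linear-time algorithms. The function names are made up for this example. Taking the smaller of the forward and reversed results is my own way of combining the rotation step with the reversal trick for flipped shapes.

```python
def min_rotation(seq):
    """Lexicographically smallest rotation, found by trying every start index."""
    seq = tuple(seq)
    return min(seq[i:] + seq[:i] for i in range(len(seq)))

def canonical_path(turns):
    """Smallest encoding over the rotations of the path and of its reversal."""
    turns = tuple(turns)
    return min(min_rotation(turns), min_rotation(turns[::-1]))

# The turns are small integers; any consistent numbering works.
path = [2, 2, 3, 1, 2, 2, 3, 2, 2, 3, 2, 1]
print(canonical_path(path))
```

Because the reversed path is considered as well, a shape and its mirror image end up with the same canonical tuple.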


If we arbitrarily assign the following values to the move operations:

  • No turn: 1
  • Turn right: 2
  • Turn left: 3
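
I said the tracing step is out of scope, but for completeness here is one possible Python sketch using those values. It is an illustration, not a tuned implementation. The name `trace_turns` is made up, cells are assumed to be (x, y) unit squares with y pointing up, and preferring the right turn at corner contacts is a choice made here so that corner-sealed holes are still traced as one loop.

```python
def trace_turns(cells):
    """Turn codes (1 = no turn, 2 = right, 3 = left) of the clockwise outline.

    Raises ValueError when the outline is not a single loop, for example
    when there is a fully enclosed hole.
    """
    cells = {tuple(c) for c in cells}
    outgoing = {}  # vertex -> directions of the boundary edges leaving it
    for x, y in cells:
        # (start vertex, direction, cell across the side); the filled cell
        # always lies to the right of the direction of travel.
        sides = (
            ((x, y), (0, 1), (x - 1, y)),           # left side, heading up
            ((x, y + 1), (1, 0), (x, y + 1)),       # top side, heading right
            ((x + 1, y + 1), (0, -1), (x + 1, y)),  # right side, heading down
            ((x + 1, y), (-1, 0), (x, y - 1)),      # bottom side, heading left
        )
        for start, direction, across in sides:
            if across not in cells:
                outgoing.setdefault(start, []).append(direction)

    start = min(outgoing)  # leftmost, then lowest vertex: exactly one way out
    vertex, d = start, outgoing[start][0]
    dirs = []
    while True:
        dirs.append(d)
        vertex = (vertex[0] + d[0], vertex[1] + d[1])
        if vertex == start:
            break
        right, left = (d[1], -d[0]), (-d[1], d[0])
        # Where two cells touch only at a corner, turning right keeps the
        # walk on the same cell.
        d = next(c for c in (right, d, left) if c in outgoing[vertex])

    if len(dirs) != sum(len(v) for v in outgoing.values()):
        raise ValueError("outline is not a single loop")

    turns = []
    for prev, cur in zip(dirs[-1:] + dirs[:-1], dirs):
        if cur == prev:
            turns.append(1)
        elif cur == (prev[1], -prev[0]):
            turns.append(2)
        else:
            turns.append(3)
    return turns

f_pentomino = [(1, 2), (2, 2), (0, 1), (1, 1), (1, 0)]
print(trace_turns(f_pentomino))
```

The starting point of the returned loop is arbitrary, so the result still has to be rotated into its minimal form before it is used as a hash.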

Here is the F pentomino traced clockwise:

An arbitrary initial encoding for the F pentomino (starting at the bottom right corner) is:

  2,2,3,1,2,2,3,2,2,3,2,1

The resulting minimum rotation of the encoding is:

  1,2,2,3,1,2,2,3,2,2,3,2
With 12 elements, this loop can be packed into 24 bits if two bits are used per instruction, or into only 19 bits if the instructions are encoded as base-3 digits. Even the 2-bit encoding easily fits in a single unsigned 32-bit integer, 0x6B6BAE:

   1- 2- 2- 3- 1- 2- 2- 3- 2- 2- 3- 2
= 01-10-10-11-01-10-10-11-10-10-11-10
= 00000000011010110110101110101110
= 0x006B6BAE
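
The same packing as a short Python sketch (the function name is made up for this example); the first instruction ends up in the most significant bits:

```python
def pack_2bit(turns):
    """Pack turn codes (1, 2 or 3) into an integer, two bits per instruction."""
    value = 0
    for turn in turns:
        value = (value << 2) | turn
    return value

loop = [1, 2, 2, 3, 1, 2, 2, 3, 2, 2, 3, 2]
print(hex(pack_2bit(loop)))  # 0x6b6bae
```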

The base-3 encoding with the start of the loop in the most significant powers of 3 is 0x5795F:

    1*3^11 + 2*3^10 + 2*3^9 + 3*3^8 + 1*3^7 + 2*3^6
  + 2*3^5  + 3*3^4  + 2*3^3 + 2*3^2 + 3*3^1 + 2*3^0
= 0x0005795F
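
In Python the sum above can be evaluated with Horner's rule. The function name is made up for this example, and the codes 1 to 3 are used as written in the sum, without shifting them down to 0 to 2.

```python
def pack_base3(turns):
    """Evaluate the turn codes as digits of a base-3 number, first digit highest."""
    value = 0
    for turn in turns:
        value = value * 3 + turn
    return value

loop = [1, 2, 2, 3, 1, 2, 2, 3, 2, 2, 3, 2]
print(hex(pack_base3(loop)))  # 0x5795f
```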

The maximum number of vertices in the path around a polyomino of order n is 2n + 2. For 2-bit encoding the number of bits is twice the number of moves, so the maximum number of bits needed is 4n + 4. For base-3 encoding it's:

  ⌈log2(3^(2n + 2))⌉ = ⌈(2n + 2) · log2(3)⌉

Where the "gallows" is the ceiling function. Accordingly, with base-3 encoding any polyomino up to order 9 can be encoded in a single 32-bit integer. Knowing this, you can choose your platform-specific data structure for the fastest hash comparison given the maximum order of the polyominos you'll be hashing.
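
A quick way to check these bounds is the Python sketch below. The function names are made up, and the base-3 bound assumes the three turn codes are stored as digits 0 to 2 (the worked example above uses 1 to 3, so subtract 1 from each code first if you rely on the bound). It uses exact integer arithmetic instead of floating-point logarithms.

```python
def max_bits_2bit(order):
    """Worst-case size in bits with two bits per move: 2 * (2n + 2)."""
    return 2 * (2 * order + 2)

def max_bits_base3(order):
    """Worst-case size in bits of a base-3 number with 2n + 2 digits.

    The largest such number is 3**(2n + 2) - 1, and its bit length equals
    ceil(log2(3**(2n + 2))).
    """
    return (3 ** (2 * order + 2) - 1).bit_length()

for n in range(5, 11):
    print(n, max_bits_2bit(n), max_bits_base3(n))
```

With two bits per move, 32 bits are enough up to order 7; with base-3 digits they are enough up to order 9. For order 5 the base-3 worst case is 20 bits, although the F pentomino value shown earlier happens to fit in 19.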

