Problem description
I am having problems writing data with HBase and reading it with Phoenix. These are the steps to reproduce the problem:
Create a table using Phoenix:
CREATE TABLE test (
id varchar not null,
t1.a unsigned_int,
t1.b varchar
CONSTRAINT pk PRIMARY KEY (id))
COLUMN_ENCODED_BYTES = 0;
I add a row to the table through Phoenix with an upsert:
upsert into test (id, t1.a, t1.b) values ('a1',1,'foo_a');
When I then query the table, I get this:
select * from test;
+-----+----+--------+
| ID | A | B |
+-----+----+--------+
| a1 | 1 | foo_a |
+-----+----+--------+
At this point everything works as expected, but now I am going to add a new entry using HBase directly:
put 'TEST', 'id_1','T1:A', 2
put 'TEST', 'id_1','T1:B','some text'
After that I can no longer query the table, and I get this error:
select * from test;
Error: ERROR 201 (22000): Illegal data. Expected length of at least 4 bytes, but had 1 (state=22000,code=201)
I know that the problem is related to how HBase stores the unsigned_int, and if I remove this column from the table, the queries work again. How can this problem be solved?
Recommended answer
The problem seems to be related to how HBase stores the data. If I scan the table, I get this:
ROW COLUMN+CELL
a1 column=T1:A, timestamp=1551274930165, value=\x00\x00\x00\x01
a1 column=T1:B, timestamp=1551274930165, value=foo_a
a1 column=T1:_0, timestamp=1551274930165, value=x
id_1 column=T1:A, timestamp=1551274993067, value=2
id_1 column=T1:B, timestamp=1551275070577, value=some text
This means the new integer value was stored as a string (the single byte for the character 2) rather than as the 4-byte value Phoenix expects for an unsigned_int, so the right way to store this data is:
put 'TEST', 'id_1','T1:A', "\x00\x00\x00\x02"
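To see where those four bytes come from, here is a minimal Python sketch. It assumes the UNSIGNED_INT layout is a 4-byte big-endian integer in the range 0 to 2147483647, which is what the scan output for row a1 suggests. The helper names encode_unsigned_int and to_hbase_shell_literal are made up for illustration and are not part of HBase or Phoenix.

```python
import struct

# Assumed upper bound for a Phoenix UNSIGNED_INT (same as a signed 32-bit max).
MAX_UNSIGNED_INT = 2**31 - 1


def encode_unsigned_int(value: int) -> bytes:
    """Pack a non-negative integer as 4 big-endian bytes."""
    if not 0 <= value <= MAX_UNSIGNED_INT:
        raise ValueError(f"{value} is outside the assumed UNSIGNED_INT range")
    return struct.pack(">i", value)


def to_hbase_shell_literal(raw: bytes) -> str:
    """Render bytes as a double-quoted, \\x-escaped string for the HBase shell."""
    return '"' + "".join(f"\\x{b:02x}" for b in raw) + '"'


if __name__ == "__main__":
    # What the original put stored: the text "2", which is a single byte.
    print(len(str(2).encode("ascii")))
    # What the column needs: four bytes.
    print(to_hbase_shell_literal(encode_unsigned_int(2)))
```

The literal printed by the second call is the value to pass as the last argument of the put command.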
After re-running the put with the byte-encoded value, the scan gives:
ROW COLUMN+CELL
a1 column=T1:A, timestamp=1551274930165, value=\x00\x00\x00\x01
a1 column=T1:B, timestamp=1551274930165, value=foo_a
a1 column=T1:_0, timestamp=1551274930165, value=x
id_1 column=T1:A, timestamp=1551274993067, value=\x00\x00\x00\x02
id_1 column=T1:B, timestamp=1551275070577, value=some text
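The T1:A values in the two scans can be sanity-checked with a small decoder. This is only a sketch under the same assumption of a 4-byte big-endian layout. The function decode_unsigned_int is hypothetical, and its error message merely imitates the wording of the Phoenix error; it is not Phoenix's actual code.

```python
import struct


def decode_unsigned_int(raw: bytes) -> int:
    """Decode a cell value assumed to hold a 4-byte big-endian integer."""
    if len(raw) < 4:
        # Mirrors the situation Phoenix reported for the value written as "2".
        raise ValueError(
            f"Expected length of at least 4 bytes, but had {len(raw)}"
        )
    return struct.unpack(">i", raw[:4])[0]


if __name__ == "__main__":
    print(decode_unsigned_int(b"\x00\x00\x00\x01"))  # row a1
    print(decode_unsigned_int(b"\x00\x00\x00\x02"))  # row id_1 after the fix
    try:
        decode_unsigned_int(b"2")  # row id_1 before the fix
    except ValueError as err:
        print(err)
```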
With the value stored this way, the data is accessible from Phoenix without any problem.
Thanks to Boris for the hint.