I am trying to filter my input in Pig based on a string pattern that contains double quotes.

For example, suppose input.txt contains:

field1="value1" field2="value2"
field1="value1" field2="val2"

I want to keep only the lines that contain field2="value2", so I run the following script:

A = LOAD 'input.txt' AS line:chararray;
B = FILTER A BY line MATCHES '.*field2="value2".*';
DUMP B;

The snippet above returns 0 records. If I leave out the closing double quote, it works:
B = FILTER A BY line MATCHES '.*field2="value2.*';

I would like to know why the former does not work.

Best answer

Try escaping the quotes:

A = LOAD 'input.txt' AS line:chararray;
B = FILTER A BY line MATCHES '.*field2=\\"value2\\".*';
DUMP B;
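
For comparison, here is a minimal Java sketch of how the same two patterns behave under plain java.util.regex. Pig's MATCHES is documented as using Java regular expression syntax, and in java.util.regex a double quote is not a metacharacter, so both the unescaped and the escaped pattern accept the first sample line and reject the second. The class name QuoteMatchSketch and the helper matchesWhole are hypothetical names made up for this illustration. The sketch does not reproduce Pig's own pattern handling, so it only suggests that the 0-record result comes from how Pig processes the pattern, not from the regex itself.

```java
import java.util.regex.Pattern;

public class QuoteMatchSketch {
    // Pattern.matches succeeds only if the whole input matches,
    // which is why the patterns are wrapped in leading and trailing ".*".
    static boolean matchesWhole(String regex, String line) {
        return Pattern.matches(regex, line);
    }

    public static void main(String[] args) {
        // Regex as written in the question: .*field2="value2".*
        String plain = ".*field2=\"value2\".*";
        // Regex the answer's Pig literal is assumed to produce: .*field2=\"value2\".*
        String escaped = ".*field2=\\\"value2\\\".*";

        String[] lines = {
            "field1=\"value1\" field2=\"value2\"",
            "field1=\"value1\" field2=\"val2\""
        };

        for (String line : lines) {
            System.out.println(matchesWhole(plain, line) + " "
                    + matchesWhole(escaped, line) + " " + line);
        }
    }
}
```

In plain Java the escape is redundant but harmless, since a backslash before a non-alphabetic character simply matches that character literally.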

This question, "hadoop - Using double quotes with MATCHES in Pig", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/40093424/
