I am trying to filter my input in Pig based on a string pattern that contains double quotes.
For example, suppose input.txt contains:
field1="value1" field2="value2"
field1="value1" field2="val2"
I want to keep only the lines that contain field2="value2", so I run the following script:
A = LOAD 'input.txt' AS line:chararray;
B = FILTER A BY line MATCHES '.*field2="value2".*';
DUMP B;
The snippet above returns 0 records. If I leave off the closing double quote, it works:
B = FILTER A BY line MATCHES '.*field2="value2.*';
I would like to know why the first version does not work.
Best answer
Try escaping the double quotes:
A = LOAD 'input.txt' AS line:chararray;
B = FILTER A BY line MATCHES '.*field2=\\"value2\\".*';
DUMP B;
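Pig's MATCHES operator uses Java regular expressions and requires the whole string to match, so the two patterns can be compared outside Pig. The following is a minimal sketch, not part of the original answer: the class name QuotePatternCheck and the helper matchesWholeLine are hypothetical, and it assumes Pig reduces \\" in the script to \" before the pattern reaches the regex engine. It suggests that both the plain and the escaped pattern are valid and equivalent as Java regexes, so the difference seen in Pig most likely comes from how the script's string literal is processed rather than from the regex itself.

```java
import java.util.regex.Pattern;

class QuotePatternCheck {
    // Whole-string match, mirroring the semantics of Pig's MATCHES operator.
    static boolean matchesWholeLine(String line, String regex) {
        return Pattern.matches(regex, line);
    }

    public static void main(String[] args) {
        // The two sample lines from input.txt.
        String hit = "field1=\"value1\" field2=\"value2\"";
        String miss = "field1=\"value1\" field2=\"val2\"";

        // Regex as the engine sees it: .*field2="value2".*
        String plain = ".*field2=\"value2\".*";
        // Regex as the engine sees it: .*field2=\"value2\".*
        // (assumed to be what the escaped Pig pattern turns into)
        String escaped = ".*field2=\\\"value2\\\".*";

        System.out.println(matchesWholeLine(hit, plain));    // true
        System.out.println(matchesWholeLine(hit, escaped));  // true
        System.out.println(matchesWholeLine(miss, plain));   // false
        System.out.println(matchesWholeLine(miss, escaped)); // false
    }
}
```

In a Java regex, a backslash in front of a non-alphabetic character such as " simply matches that character literally, which is why the escaped form is safe to use.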
This question about hadoop, "Using double quotes with MATCHES in Pig", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/40093424/