Problem description
I am using a small map file in my Java UDF, and I want to pass the name of this file from Pig through the constructor.
The following is the relevant part of my UDF:
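// No-argument constructor: delegates with no map file.
public GenerateXML() throws IOException {
    this(null);
}

// Constructor used by DEFINE: receives the map filename from the Pig script.
public GenerateXML(String mapFilename) throws IOException {
    if (mapFilename != null) {
        // do processing
    }
}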
In the Pig script I have the following line:
DEFINE GenerateXML com.domain.GenerateXML('typemap.tsv');
This works in local mode, but not in distributed mode. I am passing the following parameters to Pig on the command line:
pig -Dmapred.cache.files="/path/to/typemap.tsv#typemap.tsv" -Dmapred.create.symlink=yes -f generate-xml.pig
I get the following exception:
2013-01-11 10:39:42,002 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse:
<file generate-xml.pig, line 16, column 42> Failed to generate logical plan. Nested exception: java.lang.RuntimeException: could not instantiate 'com.domain.GenerateXML' with arguments '[typemap.tsv]'
Any idea what I need to change to make it work?
Recommended answer
The problem is solved.
It seems that when I run the Pig script with the following parameters
pig -Dmapred.cache.files="/path/to/typemap.tsv#typemap.tsv" -Dmapred.create.symlink=yes -f generate-xml.pig
the /path/to/typemap.tsv part should be a local path and not a path in HDFS.
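The exception in the question is raised while Pig generates the logical plan, which suggests the constructor runs on the machine that launches the script, before any job is submitted. One possible way to make the constructor less sensitive to where the file lives is to store only the filename there and read the file on first use. The following is a minimal sketch under stated assumptions, not the original UDF: LazyTypeMap, lookup and load are hypothetical names, the file is assumed to hold two tab-separated columns per line, and the approach has not been checked against a cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Pig or of the original UDF.
class LazyTypeMap {
    private final String mapFilename;
    private Map<String, String> map; // stays null until the first lookup

    LazyTypeMap(String mapFilename) {
        // Only remember the name; the file is not touched here.
        this.mapFilename = mapFilename;
    }

    // Loads the file on the first call, then answers from memory.
    String lookup(String key) throws IOException {
        if (map == null) {
            map = load(mapFilename);
        }
        return map.get(key);
    }

    // Assumption: each useful line is "key<TAB>value"; other lines are skipped.
    static Map<String, String> load(String filename) throws IOException {
        Map<String, String> result = new HashMap<>();
        if (filename == null) {
            return result; // mirrors the null check in the question's constructor
        }
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(filename), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split("\t", 2);
                if (parts.length == 2) {
                    result.put(parts[0], parts[1]);
                }
            }
        }
        return result;
    }
}
```

In a UDF, the constructor would create such an object and the per-record method would call lookup, so a missing file would surface when the first record is processed rather than when the script is parsed.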