Problem description
I need to write to two HBase tables, say table1 and table2. Whenever a write happens on table1, I need to perform some operation on table2, for example incrementing a counter (like a trigger). For this I need to write to both tables from the same task of a MapReduce program. I have heard that this can be done with MultiTableOutputFormat, but I could not find a good example that explains it in detail. Is this possible, and if so, how should I do it? Thanks in advance.
Please provide an answer that does not involve coprocessors.
Recommended answer
To write to more than one table in a MapReduce job, you have to specify that in the job configuration. You are right that this can be done with MultiTableOutputFormat.
Normally, for a single table, you would use:
TableMapReduceUtil.initTableReducerJob("tableName", MyReducer.class, job);
Instead of that, write:
// Route each output record to the table named by its key
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);
job.setNumReduceTasks(2);
// Ship the HBase jars and the job's classes with the job
TableMapReduceUtil.addDependencyJars(job);
TableMapReduceUtil.addDependencyJars(job.getConfiguration());
Then, when writing data, use the name of the target table as the output key:
context.write(new ImmutableBytesWritable(Bytes.toBytes("tableName1")), put1);
context.write(new ImmutableBytesWritable(Bytes.toBytes("tableName2")), put2);
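To make the routing idea concrete, here is a minimal plain-Java model of it. This is not HBase code: FakePut, FakeMultiTableWriter and the reduce helper are hypothetical stand-ins that only imitate the contract that the output key names the table and the value is the mutation for that table. The sketch also assumes the "counter" for table2 is written as an ordinary Put holding a count computed in the reducer, because MultiTableOutputFormat accepts only Put and Delete values, so an Increment cannot be sent through it.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of routing records to tables by output key; not HBase code. */
public class MultiTableRoutingSketch {

    /** Hypothetical stand-in for an HBase Put: one row key and one cell value. */
    static final class FakePut {
        final String row;
        final String value;

        FakePut(String row, String value) {
            this.row = row;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the record writer: one buffer per table name. */
    static final class FakeMultiTableWriter {
        private final Map<String, List<FakePut>> tables = new LinkedHashMap<>();

        // Imitates context.write(tableNameBytes, put): the key selects the table.
        void write(byte[] tableName, FakePut put) {
            String name = new String(tableName, StandardCharsets.UTF_8);
            tables.computeIfAbsent(name, k -> new ArrayList<>()).add(put);
        }

        // Returns the records buffered for one table, or an empty list.
        List<FakePut> rows(String table) {
            return tables.getOrDefault(table, Collections.<FakePut>emptyList());
        }
    }

    /** Reducer-like step: one record per value to table 1, one count record to table 2. */
    static void reduce(String key, List<String> values, FakeMultiTableWriter out) {
        byte[] t1 = "tableName1".getBytes(StandardCharsets.UTF_8);
        byte[] t2 = "tableName2".getBytes(StandardCharsets.UTF_8);
        int written = 0;
        for (String v : values) {
            out.write(t1, new FakePut(key + "-" + written, v));
            written++;
        }
        // The "trigger": record how many writes went to table 1 for this key.
        out.write(t2, new FakePut(key, Integer.toString(written)));
    }

    public static void main(String[] args) {
        FakeMultiTableWriter out = new FakeMultiTableWriter();
        List<String> values = new ArrayList<>();
        values.add("a");
        values.add("b");
        values.add("c");
        reduce("user42", values, out);
        System.out.println("tableName1 rows: " + out.rows("tableName1").size());
        System.out.println("tableName2 count: " + out.rows("tableName2").get(0).value);
    }
}
```

Running main prints 3 for both lines: three records are routed to tableName1 and a single record with the value "3" is routed to tableName2. In a real job the same shape would live in MyReducer, with real Put objects in place of FakePut and context.write in place of the stand-in writer.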