Problem Description
Based on my investigation of Spark SQL, I learned that more than two tables can't be joined directly; we have to use a subquery to make it work. So I am using a subquery and am able to join three tables with the following query:
SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU,
       ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website
FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType,
             subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime,
             cdr.endTime, cdr.isRoaming
      FROM SUBSCRIBER_META subsc, CDR_FACT cdr
      WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp,
     DPI_FACT dpi
WHERE temp.msisdn = dpi.msisdn
But when I try to join 4 tables using the same pattern, it throws the following exception:
java.lang.RuntimeException: [1.517] failure: identifier expected
Query to join 4 tables:
SELECT name, dueAmount
FROM (SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU,
             ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website
      FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType,
                   subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime,
                   cdr.endTime, cdr.isRoaming
            FROM SUBSCRIBER_META subsc, CDR_FACT cdr
            WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp,
           DPI_FACT dpi
      WHERE temp.msisdn = dpi.msisdn) inner,
     BILLING_META billing
WHERE inner.msisdn = billing.msisdn
Can anyone please help me make this query work?
Thanks in advance. The error is as follows:
09/02/2015 02:55:24 [ERROR] org.apache.spark.Logging$class: Error running job streaming job 1423479307000 ms.0
java.lang.RuntimeException: [1.517] failure: identifier expected
SELECT name, dueAmount FROM (SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU, ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType, subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime, cdr.endTime, cdr.isRoaming FROM SUBSCRIBER_META subsc, CDR_FACT cdr WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp, DPI_FACT dpi WHERE temp.msisdn = dpi.msisdn) inner, BILLING_META billing where inner.msisdn = billing.msisdn
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.SqlParser.apply(SqlParser.scala:60)
at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:73)
at org.apache.spark.sql.api.java.JavaSQLContext.sql(JavaSQLContext.scala:49)
at com.hp.tbda.rta.examples.JdbcRDDStreaming5$7.call(JdbcRDDStreaming5.java:596)
at com.hp.tbda.rta.examples.JdbcRDDStreaming5$7.call(JdbcRDDStreaming5.java:546)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:274)
at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:274)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:527)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:527)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:172)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The exception occurred because you used "inner", a reserved keyword in Spark SQL, as a table alias in your query. Avoid using reserved keywords as custom identifiers in Spark SQL.
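Following that answer, a minimal fix is to rename the outer subquery alias from the reserved word "inner" to any non-reserved identifier; "flat" below is an arbitrary choice, and the rest of the query is unchanged:

```sql
SELECT flat.name, billing.dueAmount
FROM (SELECT name, age, gender, dpi.msisdn, subscriptionType, maritalStatus, isHighARPU,
             ipAddress, startTime, endTime, isRoaming, dpi.totalCount, dpi.website
      FROM (SELECT subsc.name, subsc.age, subsc.gender, subsc.msisdn, subsc.subscriptionType,
                   subsc.maritalStatus, subsc.isHighARPU, cdr.ipAddress, cdr.startTime,
                   cdr.endTime, cdr.isRoaming
            FROM SUBSCRIBER_META subsc, CDR_FACT cdr
            WHERE subsc.msisdn = cdr.msisdn AND cdr.isRoaming = 'Y') temp,
           DPI_FACT dpi
      WHERE temp.msisdn = dpi.msisdn) flat,   -- alias renamed from the reserved word "inner"
     BILLING_META billing
WHERE flat.msisdn = billing.msisdn
```

The parser error pointed at position 1.517, which is exactly where the alias "inner" appears after the closing parenthesis of the subquery: the parser expected an identifier there but found a keyword instead.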