Question
I cannot access AWS Glue tables even though I have been granted all the required IAM permissions. I can't even list all the databases. Here is the code.
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
# New recommendation from AWS Support 2018-03-22
# `sc` is assumed to be the SparkContext predefined by the dev endpoint's PySpark shell
newconf = sc._conf.set("spark.sql.catalogImplementation", "in-memory")
sc.stop()
sc = SparkContext.getOrCreate(newconf)
# End AWS Support Workaround
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
The error occurs here, when accessing one of the Glue tables:
datasource_history_1 = glueContext.create_dynamic_frame.from_catalog(database = "dev", table_name = "history", transformation_ctx = "datasource_history_1")
I also tried listing the databases, but I can see only the default one and none of the databases I created in Glue.
I tried following the question linked below, but it did not help.
Unable to run scripts properly in AWS Glue PySpark Dev Endpoint
Recommended answer
You seem to have taken your code straight from this question, braj: Unable to run scripts properly in AWS Glue PySpark Dev Endpoint. That code is specific to my AWS Glue environment, and the tables I reference there won't exist in yours.
For this command to work:
datasource_history_1 = glueContext.create_dynamic_frame.from_catalog(database = "dev", table_name = "history", transformation_ctx = "datasource_history_1")
Check your own Glue Catalog (https://eu-west-1.console.aws.amazon.com/glue/home) and make sure it has a table called history inside a database called dev. If it doesn't, I'm not sure what behaviour you expect to see from this code.
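If you would rather check this from code than from the console, something like the following sketch might help. It assumes boto3 is installed and that your credentials are allowed to call the Glue GetDatabases, GetTables and GetTable APIs. The helper names (list_catalog, table_exists, make_glue_client) are made up for this example, and the eu-west-1 default is only taken from the console URL above, so swap in your own region.

```python
def list_catalog(glue_client):
    """Return {database_name: [table_name, ...]} for the Glue Data Catalog."""
    catalog = {}
    # Paginate, because a catalog can hold more entries than one response returns.
    for page in glue_client.get_paginator("get_databases").paginate():
        for db in page["DatabaseList"]:
            catalog[db["Name"]] = []
    for name in catalog:
        pages = glue_client.get_paginator("get_tables").paginate(DatabaseName=name)
        for page in pages:
            catalog[name].extend(t["Name"] for t in page["TableList"])
    return catalog


def table_exists(glue_client, database, table):
    """Return True if `table` exists in `database`, False if Glue cannot find it."""
    try:
        glue_client.get_table(DatabaseName=database, Name=table)
        return True
    except glue_client.exceptions.EntityNotFoundException:
        # Raised when either the database or the table is missing.
        return False


def make_glue_client(region_name="eu-west-1"):
    """Build a Glue client; the default region is an assumption, change it to yours."""
    import boto3  # imported lazily so the helpers above do not need boto3 to load

    return boto3.client("glue", region_name=region_name)
```

Calling table_exists(make_glue_client(), "dev", "history") should return True before from_catalog has any chance of working, and list_catalog(make_glue_client()) shows which databases and tables your credentials can actually see in that region. Permission problems are not swallowed here: they surface as an exception from the API call.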
Instead of starting from a script taken from someone else's StackOverflow answer, I suggest you create a Job in Glue and let it generate the source connection code for you first. Use that as your starting point. It will generate the create_dynamic_frame.from_catalog command for you in that script.