Problem description
I am new to GCP and Airflow and am trying to run my Python 3 pipelines through a simple pyodbc connection. I believe I have found what I need to install on the machines ([Microsoft doc] https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-2017), but I am not sure where in GCP to run these commands. I have gone down several deep rabbit holes looking for answers, but I still don't know how to solve the problem.
Here is the error I keep seeing when I upload the DAG:
Here is the pyodbc connection:
pyodbc.connect('DRIVER={Microsoft SQL Server};SERVER=servername;DATABASE=dbname;UID=username;PWD=password')
When I open the gcloud shell in my environment and run the Microsoft install commands, it just aborts. When I downloaded the SDK and connected to the project from my local machine, it either aborts automatically or doesn't recognize the commands from Microsoft. Can anyone give some simple instructions on where to start and what I am doing wrong?
Recommended answer
Keep in mind that Composer is a Google-managed implementation of Apache Airflow, so expect it to behave differently from an Airflow installation you manage yourself.
With this in mind, custom Python dependencies and binary dependencies that are not available in the Cloud Composer worker image can be handled with the KubernetesPodOperator option.
Essentially, this lets you build a custom container image with all your requirements, push it to a container image repository (Docker Hub, GCR), and then pull it into your Composer environment, so all of your dependencies are met.
This scales better, since you don't need to interact with the machines directly (the approach described in your original question), and it is easier to just build a container image with whatever you need in it.
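As a rough illustration of the KubernetesPodOperator approach, a DAG could look something like the sketch below. This is only a sketch under assumptions: the image name, task names, namespace, and script path are hypothetical placeholders, and the image is assumed to already contain the Microsoft ODBC driver, pyodbc, and the pipeline script. The import path shown is the Airflow 1.10 one; in Airflow 2 the operator ships in the apache-airflow-providers-cncf-kubernetes package instead.

```python
from datetime import datetime

# Hypothetical image: assumed to be built with the Microsoft ODBC driver,
# pyodbc and the pipeline script, then pushed to a registry the cluster can reach.
PIPELINE_IMAGE = "gcr.io/my-project/pyodbc-pipeline:latest"


def pod_task_kwargs(image=PIPELINE_IMAGE):
    """Return the arguments for the task that runs the pipeline container."""
    return {
        "task_id": "run_pyodbc_pipeline",
        "name": "run-pyodbc-pipeline",           # name given to the pod
        "namespace": "default",                  # assumed namespace
        "image": image,
        "cmds": ["python", "/app/pipeline.py"],  # hypothetical entry point
    }


def build_dag():
    """Build a DAG with a single KubernetesPodOperator task."""
    # Imported here so the helper above can be checked without Airflow installed.
    # Airflow 1.10 import path (assumption); adjust for the Airflow version in use.
    from airflow import DAG
    from airflow.contrib.operators.kubernetes_pod_operator import (
        KubernetesPodOperator,
    )

    dag = DAG(
        dag_id="pyodbc_pipeline_in_pod",
        start_date=datetime(2019, 1, 1),
        schedule_interval=None,  # no schedule; trigger manually
    )
    KubernetesPodOperator(dag=dag, **pod_task_kwargs())
    return dag
```

In the DAG file that gets uploaded, the result would be bound at module level (dag = build_dag()), since Airflow only picks up DAG objects it finds in a module's global namespace.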
Regarding pyodbc specifically, and in this context of installing dependencies with Composer, there is a feature request to address this issue, which also outlines a workaround (basically what is described in this answer). You might want to check it out.
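Inside such a container, the connection string would probably also need to name the driver as it is registered on Linux. The sketch below assumes version 17 of the Microsoft ODBC driver is installed in the image, which registers itself as "ODBC Driver 17 for SQL Server"; the helper names are hypothetical, and the server, database, and credential values are the placeholders from the question.

```python
def build_connection_string(server, database, username, password,
                            driver="ODBC Driver 17 for SQL Server"):
    """Assemble an ODBC connection string for SQL Server.

    The default driver name is an assumption: it matches version 17 of
    Microsoft's ODBC driver and must match what is installed in the image.
    """
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": server,
        "DATABASE": database,
        "UID": username,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


def connect(server, database, username, password):
    """Open a pyodbc connection using the assembled connection string."""
    # Imported here so the string helper can be used without pyodbc installed.
    import pyodbc

    return pyodbc.connect(
        build_connection_string(server, database, username, password)
    )
```

If the connection fails with a driver-not-found error, calling pyodbc.drivers() inside the container lists the driver names that are actually installed, which can then be passed as the driver argument.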