Problem description
I have two blob files to copy to Azure SQL tables. My pipeline has two activities:
{
"name": "NutrientDataBlobToAzureSqlPipeline",
"properties": {
"description": "Copy nutrient data from Azure BLOB to Azure SQL",
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "BlobSource"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 10000,
"writeBatchTimeout": "60.00:00:00"
}
},
"inputs": [
{
"name": "FoodGroupDescriptionsAzureBlob"
}
],
"outputs": [
{
"name": "FoodGroupDescriptionsSQLAzure"
}
],
"policy": {
"timeout": "01:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst"
},
"scheduler": {
"frequency": "Minute",
"interval": 15
},
"name": "FoodGroupDescriptions",
"description": "#1 Bulk Import FoodGroupDescriptions"
},
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "BlobSource"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 10000,
"writeBatchTimeout": "60.00:00:00"
}
},
"inputs": [
{
"name": "FoodDescriptionsAzureBlob"
}
],
"outputs": [
{
"name": "FoodDescriptionsSQLAzure"
}
],
"policy": {
"timeout": "01:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst"
},
"scheduler": {
"frequency": "Minute",
"interval": 15
},
"name": "FoodDescriptions",
"description": "#2 Bulk Import FoodDescriptions"
}
],
"start": "2015-07-14T00:00:00Z",
"end": "2015-07-14T00:00:00Z",
"isPaused": false,
"hubName": "gymappdatafactory_hub",
"pipelineMode": "Scheduled"
}
}
As I understand it, the second activity starts once the first one is done. How do I then execute this pipeline without going to the dataset slices and running them manually? Also, how can I set pipelineMode to OneTime instead of Scheduled?
Recommended answer
To make the activities run sequentially (in order), the output of the first activity needs to be an input of the second activity.
{
"name": "NutrientDataBlobToAzureSqlPipeline",
"properties": {
"description": "Copy nutrient data from Azure BLOB to Azure SQL",
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "BlobSource"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 10000,
"writeBatchTimeout": "60.00:00:00"
}
},
"inputs": [
{
"name": "FoodGroupDescriptionsAzureBlob"
}
],
"outputs": [
{
"name": "FoodGroupDescriptionsSQLAzureFirst"
}
],
"policy": {
"timeout": "01:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst"
},
"scheduler": {
"frequency": "Minute",
"interval": 15
},
"name": "FoodGroupDescriptions",
"description": "#1 Bulk Import FoodGroupDescriptions"
},
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "BlobSource"
},
"sink": {
"type": "SqlSink",
"writeBatchSize": 10000,
"writeBatchTimeout": "60.00:00:00"
}
},
"inputs": [
{
"name": "FoodGroupDescriptionsSQLAzureFirst"
},
{
"name": "FoodDescriptionsAzureBlob"
}
],
"outputs": [
{
"name": "FoodDescriptionsSQLAzureSecond"
}
],
"policy": {
"timeout": "01:00:00",
"concurrency": 1,
"executionPriorityOrder": "NewestFirst"
},
"scheduler": {
"frequency": "Minute",
"interval": 15
},
"name": "FoodDescriptions",
"description": "#2 Bulk Import FoodDescriptions"
}
],
"start": "2015-07-14T00:00:00Z",
"end": "2015-07-14T00:00:00Z",
"isPaused": false,
"hubName": "gymappdatafactory_hub",
"pipelineMode": "Scheduled"
}
}
Note that the output of the first activity, "FoodGroupDescriptionsSQLAzureFirst", is also an input of the second activity, so the second activity will not run until the first has produced that dataset.
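If you want to sanity-check the chaining locally before deploying, the dependency rule can be sketched in a few lines of Python. This is only an illustration and not part of any Data Factory API: the function name `activity_order` is hypothetical, and the `pipeline` dict is a trimmed copy of the JSON above that keeps only the activity names, inputs, and outputs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def activity_order(pipeline: dict) -> list:
    """Return activity names in an order that respects dataset dependencies.

    Hypothetical helper: an activity is treated as depending on whichever
    activity lists one of its input datasets as an output.
    """
    activities = pipeline["properties"]["activities"]

    # Map each output dataset name to the activity that produces it.
    producer = {
        out["name"]: act["name"]
        for act in activities
        for out in act.get("outputs", [])
    }

    # For each activity, collect the activities that produce its inputs.
    graph = {
        act["name"]: {
            producer[inp["name"]]
            for inp in act.get("inputs", [])
            if inp["name"] in producer
        }
        for act in activities
    }
    return list(TopologicalSorter(graph).static_order())


# Trimmed copy of the pipeline definition (names, inputs, outputs only).
pipeline = {
    "name": "NutrientDataBlobToAzureSqlPipeline",
    "properties": {
        "activities": [
            {
                "name": "FoodGroupDescriptions",
                "inputs": [{"name": "FoodGroupDescriptionsAzureBlob"}],
                "outputs": [{"name": "FoodGroupDescriptionsSQLAzureFirst"}],
            },
            {
                "name": "FoodDescriptions",
                "inputs": [
                    {"name": "FoodGroupDescriptionsSQLAzureFirst"},
                    {"name": "FoodDescriptionsAzureBlob"},
                ],
                "outputs": [{"name": "FoodDescriptionsSQLAzureSecond"}],
            },
        ]
    },
}

print(activity_order(pipeline))
```

With the chained datasets, "FoodGroupDescriptions" comes out before "FoodDescriptions". If the second activity did not list "FoodGroupDescriptionsSQLAzureFirst" among its inputs, the two activities would have no dependency between them and their relative order would not be fixed.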