Problem description
I have been running UNLOAD in Redshift and want the S3 file name to be dynamic by executing a function (to_char) in the TO line. That doesn't work, and of course I suspect that you can't execute functions (to_char) in the "TO" line. Is there any other way I can do it?
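For example, something along the lines of the statement below fails; the query, bucket and credentials here are only placeholders:

-- Hypothetical sketch: the TO clause only accepts a plain string literal,
-- so concatenating a function result into it is rejected.
UNLOAD ('select * from orders')
TO 's3://my-bucket/orders_' || to_char(current_date, 'YYYY-MM-DD')
CREDENTIALS 'aws_access_key_id=<key>;aws_secret_access_key=<secret>'
ALLOWOVERWRITE
PARALLEL OFF;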
And if UNLOAD is not the way, do I have any other options for automating such tasks with the currently available infrastructure (Redshift + S3 + Data Pipeline; our Amazon EMR is not active yet)?
The only thing that I thought could work (but I'm not sure) is, instead of pointing to a script file, to copy the script into the Script option in SQLActivity (at the moment it points to a file) and reference {@ScheduleStartTime}.
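Roughly, I imagine that variant would be a SqlActivity object with the UNLOAD pasted into its script field, something like the sketch below. All of the ids, the bucket and the IAM role are placeholders, and whether the #{...} expression is actually evaluated inside an inline script is exactly the part I am not sure about:

{
  "id": "UnloadSqlActivityId1",
  "name": "HypotheticalUnloadActivity",
  "type": "SqlActivity",
  "myComment": "Sketch only: ids, bucket and role are placeholders.",
  "database": { "ref": "RedshiftDatabaseId1" },
  "runsOn": { "ref": "Ec2ResourceId1" },
  "schedule": { "ref": "ScheduleId1" },
  "script": "UNLOAD ('select * from orders') TO 's3://my-bucket/orders_#{@scheduledStartTime}_' CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRedshiftRole' ALLOWOVERWRITE PARALLEL OFF"
}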
Recommended answer
Why not use RedshiftCopyActivity to copy from Redshift to S3? The input is a RedshiftDataNode and the output is an S3DataNode, where you can specify an expression for directoryPath.
You can also specify the transformSql property in RedshiftCopyActivity to override the default of select * from + inputRedshiftTable.
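For illustration, a transformSql on the copy activity from the example pipeline below could look like this; the filter itself is made up, and the input table here is the orders table from that example:

{
  "id": "RedshiftCopyActivityId1",
  "name": "DefaultRedshiftCopyActivity1",
  "type": "RedshiftCopyActivity",
  "myComment": "Illustrative only: the transformSql query is a made-up filter on the orders input table.",
  "transformSql": "select * from orders where order_date = current_date - 1",
  "input": { "ref": "RedshiftDataNodeId1" },
  "output": { "ref": "S3DataNodeId1" },
  "schedule": { "ref": "ScheduleId1" },
  "runsOn": { "ref": "Ec2ResourceId1" }
}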
Example pipeline:
{
  "objects": [
    {
      "id": "CSVId1",
      "name": "DefaultCSV1",
      "type": "CSV"
    },
    {
      "id": "RedshiftDatabaseId1",
      "databaseName": "dbname",
      "username": "user",
      "name": "DefaultRedshiftDatabase1",
      "*password": "password",
      "type": "RedshiftDatabase",
      "clusterId": "redshiftclusterId"
    },
    {
      "id": "Default",
      "scheduleType": "timeseries",
      "failureAndRerunMode": "CASCADE",
      "name": "Default",
      "role": "DataPipelineDefaultRole",
      "resourceRole": "DataPipelineDefaultResourceRole"
    },
    {
      "id": "RedshiftDataNodeId1",
      "schedule": { "ref": "ScheduleId1" },
      "tableName": "orders",
      "name": "DefaultRedshiftDataNode1",
      "type": "RedshiftDataNode",
      "database": { "ref": "RedshiftDatabaseId1" }
    },
    {
      "id": "Ec2ResourceId1",
      "schedule": { "ref": "ScheduleId1" },
      "securityGroups": "MySecurityGroup",
      "name": "DefaultEc2Resource1",
      "role": "DataPipelineDefaultRole",
      "logUri": "s3://myLogs",
      "resourceRole": "DataPipelineDefaultResourceRole",
      "type": "Ec2Resource"
    },
    {
      "myComment": "This object is used to control the task schedule.",
      "id": "DefaultSchedule1",
      "name": "RunOnce",
      "occurrences": "1",
      "period": "1 Day",
      "type": "Schedule",
      "startAt": "FIRST_ACTIVATION_DATE_TIME"
    },
    {
      "id": "S3DataNodeId1",
      "schedule": { "ref": "ScheduleId1" },
      "directoryPath": "s3://my-bucket/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}",
      "name": "DefaultS3DataNode1",
      "dataFormat": { "ref": "CSVId1" },
      "type": "S3DataNode"
    },
    {
      "id": "RedshiftCopyActivityId1",
      "output": { "ref": "S3DataNodeId1" },
      "input": { "ref": "RedshiftDataNodeId1" },
      "schedule": { "ref": "ScheduleId1" },
      "name": "DefaultRedshiftCopyActivity1",
      "runsOn": { "ref": "Ec2ResourceId1" },
      "type": "RedshiftCopyActivity"
    }
  ]
}