Question
I am trying to generate a dynamic FROM clause in U-SQL so that we can extract data from different files based on the outcome of a previous query. Something like this:
@filesToExtract = SELECT whatevergeneratesthepaths FROM @foo; // yields a rowset with all the files we want to extract, e.g. [/path/file1.csv, /path/file2.csv]
SELECT * FROM @filesToExtract; // here we want to extract the data from file1 and file2
I'm afraid that this kind of dynamic query is not supported yet, but can someone point me to a way of achieving this? It seems that the only feasible approach is to generate another U-SQL script and execute it afterwards.
Thanks.
Recommended answer
It is not fully clear from your question whether you want the file names to be dynamically retrieved and passed to an EXTRACT statement, or the names of tables/rowsets to be passed to a SELECT's FROM clause, or both.
In general, you cannot dynamically generate source names from a U-SQL expression. You may want to file a feature request at http://aka.ms/adlfeedback for dynamically or statically parameterizable sources.
Having said that, depending on your exact requirements, there may be ways to achieve your goal without the workaround you describe.
For example, you could write your code as a parameterized table-valued function (TVF) and pass it different rowsets from different scripts. Alternatively, if you can decide statically which rowset to choose, you can use the IF statement.
Here is a pseudo-code example:
DECLARE EXTERNAL @someconditionparameter Boolean = true;
IF (@someconditionparameter) THEN
    @data = EXTRACT a int, b string FROM @fileset1 USING Extractors.Csv();
ELSE
    @data = EXTRACT a int, b string FROM @file2 USING ...;
END;
@results = MyTableValuedFunction (@data);
...
If your files have different schemas, you may be able to use flexible column sets (currently in preview, see the release notes) in the TVF to handle the variability of the rowset schema.
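If the list of paths really is known only at run time, the script-generation workaround mentioned in the question is still an option. The following is only a rough sketch of that idea, written in Python as an arbitrary host language. The helper name build_usql_script, the output path /output/combined.csv, and the reuse of the a int, b string schema from the pseudo-code above are all assumptions for illustration. It emits one EXTRACT per file and combines the rowsets with UNION ALL. Submitting the generated script as a job is left out.

```python
def build_usql_script(paths, output_path="/output/combined.csv"):
    """Return U-SQL text that extracts every file in `paths` and unions the rows.

    Hypothetical helper: the schema (a int, b string) and the output path
    are placeholders, not values taken from a real job.
    """
    if not paths:
        raise ValueError("at least one input path is required")

    lines = []
    rowsets = []
    for i, path in enumerate(paths):
        # Keep the sketch simple: refuse characters that would need escaping
        # inside a U-SQL string literal.
        if '"' in path or "\\" in path:
            raise ValueError(f"unsupported character in path: {path!r}")
        name = f"@data{i}"
        rowsets.append(name)
        lines.append(
            f'{name} = EXTRACT a int, b string FROM "{path}" USING Extractors.Csv();'
        )

    # Combine all extracted rowsets into a single rowset.
    union = " UNION ALL ".join(f"SELECT * FROM {name}" for name in rowsets)
    lines.append(f"@all = {union};")
    lines.append(f'OUTPUT @all TO "{output_path}" USING Outputters.Csv();')
    return "\n".join(lines)


# In practice the paths would come from the result of the first job.
script = build_usql_script(["/path/file1.csv", "/path/file2.csv"])
print(script)
```

The first job would write the paths to a file, a small driver like this would read them and build the second script, and that script would then be submitted as a separate job.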