Problem description
I'm working with a medical-record system that stores data in a construct that resembles a spreadsheet: date/time in the column headers, measurements (e.g. physician name, Rh, blood type) in the first column of each row, and a value in the intersecting cell. Reports that are based on this construct often require 10 or more of these measures to be displayed.
For reporting purposes, the dataset needs to have one row for each patient, the date/time the measurement was taken, and a column for each measurement. In essence, one needs to pivot the construct by 90 degrees.
At one point, I actually used SQL Server's PIVOT functionality to do just that. For a variety of reasons, it became apparent that this approach wouldn't work. I decided that I would use an inline view (IV) to massage the data into the desired format. The simplified query resembles:
SELECT patient_id,
       datetime,
       m1.value AS physician_name,
       m2.value AS blood_type,
       m3.value AS rh
FROM patient_table
INNER JOIN (complex query here
            WHERE measure_id=1) m1...
INNER JOIN (complex query here
            WHERE measure_id=2) m2...
LEFT OUTER JOIN (complex query here
                 WHERE measure_id=3) m3...
As you can see, in some cases these IVs are used to restrict the resulting dataset (INNER JOIN); in other cases they do not restrict it (LEFT OUTER JOIN). However, the 'complex query' part is essentially the same for each of these measures, except for the difference in measure_id. While this approach works, it leads to fairly large SQL statements, limits reuse, and makes the query error-prone.
My thought was to replace the 'complex query' and WHERE clause with an inline table-valued UDF. This would simplify the queries quite a bit, reduce errors, and increase code reuse. The only question on my mind is performance. Will the UDF approach lead to a significant decrease in performance? Might it improve matters?
Thank you for your time and consideration.
Recommended answer
A correctly defined TVF will not introduce any problems. You'll find many claims on the internet blasting TVFs for performance problems compared with views, temp tables, and table variables. What is usually not understood is that a TVF behaves differently from a view. A view definition is placed into the original query, and the optimizer then rearranges the query tree as it sees fit (unless the NOEXPAND hint is used on indexed views). A TVF has different semantics, and sometimes, especially when updating data, this results in the TVF output being spooled for Halloween protection. It helps to mark the function WITH SCHEMABINDING; see Improving query plans with the SCHEMABINDING option on T-SQL UDFs.
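As a rough sketch of what the question proposes, the repeated 'complex query' could be wrapped in one schema-bound inline TVF that takes the measure id as a parameter. Everything named here is an assumption for illustration: the table dbo.measurement and its columns stand in for the real 'complex query', dbo.fn_measure is a hypothetical function name, and the join conditions are guesses, since the original query elides them.

```sql
-- Hypothetical stand-in for the 'complex query': assumes a table
-- dbo.measurement(patient_id, measured_at, measure_id, value).
CREATE FUNCTION dbo.fn_measure (@measure_id int)
RETURNS TABLE
WITH SCHEMABINDING          -- requires two-part object names and an explicit column list
AS
RETURN
(
    SELECT m.patient_id,
           m.measured_at,
           m.value
    FROM dbo.measurement AS m
    WHERE m.measure_id = @measure_id
);
GO

-- The report query then reuses the function once per measure.
-- The join conditions below are assumed, not taken from the original.
SELECT p.patient_id,
       m1.measured_at,
       m1.value AS physician_name,
       m2.value AS blood_type,
       m3.value AS rh
FROM dbo.patient_table AS p
INNER JOIN dbo.fn_measure(1) AS m1
        ON m1.patient_id = p.patient_id
INNER JOIN dbo.fn_measure(2) AS m2
        ON m2.patient_id = p.patient_id
       AND m2.measured_at = m1.measured_at
LEFT OUTER JOIN dbo.fn_measure(3) AS m3
        ON m3.patient_id = p.patient_id
       AND m3.measured_at = m1.measured_at;
```

The function body is a single SELECT, which is what makes it an inline TVF rather than a multi-statement one. Whether the plan matches the hand-written inline views is worth confirming by comparing execution plans on your own data.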
It is also important to understand the concepts of deterministic and precise functions. Although they apply mostly to scalar-valued functions, TVFs can also be affected. See User-Defined Function Design Guidelines.
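To see how SQL Server has classified a function, its metadata can be queried with OBJECTPROPERTY. The sketch below assumes a function named dbo.fn_measure, which is a hypothetical name; substitute your own. Each property returns 1, 0, or NULL (NULL if the object is not found or the property does not apply).

```sql
-- dbo.fn_measure is a hypothetical function name used for illustration.
SELECT OBJECTPROPERTY(OBJECT_ID(N'dbo.fn_measure'), 'IsInlineFunction') AS is_inline,
       OBJECTPROPERTY(OBJECT_ID(N'dbo.fn_measure'), 'IsSchemaBound')    AS is_schema_bound,
       OBJECTPROPERTY(OBJECT_ID(N'dbo.fn_measure'), 'IsDeterministic')  AS is_deterministic,
       OBJECTPROPERTY(OBJECT_ID(N'dbo.fn_measure'), 'IsPrecise')        AS is_precise;
```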