BackgroundIn January 2011 I posted Inline SQL PL vs compiled SQL PL.One of the points I made in this entry was that inline SQL PL is preferable for simple SQL PL logic because it is faster.This implies of course that compiled SQL PL has some headroom when it comes to performance improvements.There is always room for improvement just about anywhere, so it can be interesting to talk about what motivated investments in this area.In this case an IBM partner was enabling an Oracle application to DB2.Most times we find that applications operate at competitive speeds after moving to DB2 thanks to the law of averages.So DB2 may be slower for some SQL statements, but it will be faster for others. As long as the fast statements cancel out the slow statements for all the time critical transactions everything is good.In this case however the partner ended up with one critical transaction that exposed a performance regression.Some further analysis revealed that there was a single PL/SQL function which consumed a lot more time in DB2.  A look at the function revealed that the function did not contain a single SQL statement. All the function did was some sophisticated  scanning of strings to mine for patterns.In this post I will describe the problem and how DB2 10 has fixed it.Compiled SQL PL in DB2 9.7Naturally I cannot reveal the partner's source code here. So I have chosen an artificial test to illustrate the problem. Some of you may know the "3*a+1" sequence. I don't even recall when I bumped into it the first time. But it's easy to understand and code:Take a whole number nIf n is 1 then the sequence finishesIf n odd then multiply n by 3 and add one. This is the new nIf n is even then divide n by 2. This is the new nGo back to step 2 until the sequence finishesThe question is whether this sequence is always finite.The following SQL PL function tries out the first n whole numbers and verifies the sequence reaches 1 in each case.To ensure we are comparing apples to apples we make this test in DB2 10, but restricting it to DB2 9.,7 technology: 点击(此处)折叠或打开CALL set_routine_opts('PVM_OPTLEVEL 127');CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO IF (a / 2) * 2 = a THEN SET a = a / 2; ELSE SET a = 3 * a + 1; END IF; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 209This is the baseline to beat. But where does time go here? The SQL PL Profiler described previously in this blog can tell us that: 点击(此处)折叠或打开LINE EXEC TIME(ACT) CPU_TIME TEXT ---- --------- --------- --------- -------------------------------------------------------------------- 1 CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) 2 RETURNS INTEGER SPECIFIC three_a_plus_one 3 BEGIN 4 DECLARE len INTEGER; 5 DECLARE n, a INTEGER DEFAULT 1; 6 DECLARE stamp TIMESTAMP; 7 1 0 0 SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); 8 WHILE n 9 SET a = n; 10 WHILE a 1 DO 11 10753712 46283 64847998 IF (a / 2) * 2 = a THEN 12 SET a = a / 2; 13 ELSE 14 SET a = 3 * a + 1; 15 END IF; 16 END WHILE; 17 SET n = n + 1; 18 END WHILE; 19 1 0 0 RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp); 20 ENDWhat this is showing is that several of the statements inside of the loop are seen by DB2 as SQL statements.Most of the math already happens within the Procedure Virtual Machine (PVM). For example comparison on line 8 or the division in line 12.It is interesting to see that about 10 million statements are being executed here. Now let's take a look at the best case.Inlined SQL PLThe densest way to drive the logic is to inline the SQL PL.This is done by adding an ATOMIC keyword to the function.When this is done DB2  will macro expand the logic into the VALUES statement. 点击(此处)折叠或打开CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN ATOMIC DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO IF (a / 2) * 2 = a THEN SET a = a / 2; ELSE SET a = 3 * a + 1; END IF; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 17Due to inlining there are no SQL statements by definition. So I cannot show profiler output here. Instead we would have to look at the optimizer plan which I will skip to not distract from the goal of the post. Now, let's take a look at normal DB2 10 operation.Compiled SQL PL in DB2 10We reset the "magic switch" we used to place DB2 10 into  DB2 9.7 mode and recreate the function without the ATOMIC keyword. 点击(此处)折叠或打开CALL set_routine_opts('');CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER)RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO IF (a / 2) * 2 = a THEN SET a = a / 2; ELSE SET a = 3 * a + 1; END IF; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 9We have reduced the time 20 fold! Even better we have beaten inline SQL PL by a factor of 2!A look at the profiler shows why 点击(此处)折叠或打开LINE EXEC TIME(ACT) CPU_TIME TEX ---- ---------- ---------- ---------- ------------------------------------------------------------- 1 CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) 2 RETURNS INTEGER SPECIFIC three_a_plus_one 3 BEGIN 4 DECLARE len INTEGER; 5 DECLARE n, a INTEGER DEFAULT 1; 6 DECLARE stamp TIMESTAMP; 7 1 0 0 SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); 8 WHILE n 9 SET a = n; 10 WHILE a 1 DO 11 IF (a / 2) * 2 = a THEN 12 SET a = a / 2; 13 ELSE 14 SET a = 3 * a + 1; 15 END IF; 16 END WHILE; 17 SET n = n + 1; 18 END WHILE; 19 1 0 0 RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp); 20 ENDThe only remaining SQL statements are the ones used for the timer. The reason being that DB2 presently cannot execute expressions locally which are non deterministic.GENERATE_UNIQUE() falls into this category.Other limiting factors are global variables or user defined function invocations. Restoring the Honor of inline SQL PLBeating inline SQL PL was unexpected. The reason for this is in the IF THEN ELSE statement.Inline SQL PL has to emulate that in a relational access plan which is not trivial complex.If we squeeze that IF out we should restore inline SQL PL to the top: 点击(此处)折叠或打开CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN ATOMIC DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO SET a = CASE WHEN (a / 2) * 2 = a THEN a / 2 ELSE 3 * a + 1 END; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 4That's much better!But really we are not comparing apples to apples anymore. Will the same trick help compiled SQL PL as well 点击(此处)折叠或打开CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO SET a = CASE WHEN (a / 2) * 2 = a THEN a / 2 ELSE 3 * a + 1 END; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 7That's better as well, but doesn't beat inline SQL PL. To complete the circle let's try the same routine without the optimizations of DB2 10 点击(此处)折叠或打开CALL set_routine_opts('PVM_OPTLEVEL 127');CREATE OR REPLACE FUNCTION three_a_plus_one(IN stop INTEGER) RETURNS INTEGER SPECIFIC three_a_plus_oneBEGIN DECLARE len INTEGER; DECLARE n, a INTEGER DEFAULT 1; DECLARE stamp TIMESTAMP; SET (len, stamp) = (0, TIMESTAMP(GENERATE_UNIQUE())); WHILE n SET a = n; WHILE a 1 DO SET a = CASE WHEN (a / 2) * 2 = a THEN a / 2 ELSE 3 * a + 1 END; END WHILE; SET n = n + 1; END WHILE; RETURN midnight_seconds(TIMESTAMP(GENERATE_UNIQUE()))- midnight_seconds(stamp);END/VALUES three_a_plus_one(100000); 1----------- 211Roughly the same speed.So with 210s/7s that's a factor of 30 for a well tuned function. ConclusionDB2 10 significantly improves SQL PL (and with that also PL/SQL) performance. In extreme cases this may be a factor of 30! As SQL statements such as SELECT or UPDATE are added to a routine this relative increase will drop. But for logic-heavy routines you can expect solid improvements which reduce the need to resort to inline SQL PL. Note that what we have not measured here was the overhead of invoking a compiled routine over inlining SQL PL.
12-17 05:30