AppLib PENULTIMATE VALUES
From Eigenpedia
Contents |
Syntax
select * from table(
applib.penultimate_values(
cursor( SQL_QUERY ),
row( GROUPING_COLUMNS ),
row( DESIGNATED_VALUE_AND_TIMESTAMP )
)
);
Purpose
The Penultimate values UDX takes in a sorted input table which has a timestamp column, a designated column, and grouping column(s), among other columns, and returns a table with the same columns as the input table, plus an additional timestamp column. One row per group is outputted - it contains the next to last value change for each column other than the grouping column and the timestamp column. Since each column does not necessarily change per row, the output row for a grouping may not be an actual row in the input table. The timestamp column contains the timestamp value taken from the input row for which the next to last value change occurred for the designated column. The additional timestamp column is calculated by looking at the last value change for the designated column and taking the timestamp from that row in the input table.
Parameters
Input:
- SQL_QUERY: rows to be aggregated, presorted on GROUPING_COLUMNS plus timestamp value from DESIGNATED_VALUE_AND_TIMESTAMP.
- GROUPING_COLUMNS: - one or more columns of the input which are used to form the key of the output (input rows will be grouped accordingly)
- DESIGNATED_VALUE_AND_TIMESTAMP: - exactly two columns from the input; the first (of any datatype) is used for determining which rows timestamps are taken from; the second selects the timestamp column
Output:
- a table with the same columns as the input table, plus one additional UNTIL_TIMESTAMP column of type TIMESTAMP
Calculation:
The output contains one row per distinct key from GROUPING_COLUMNS, in the same order as the input.
- For GROUPING_COLUMNS, the output values for a given group are the same as the input values (which are the same for the entire group).
- For the timestamp column, the output value for a given group is determined as follows. First, find the penultimate value of the designated column. (If the group contains only one value for the designated column, then the penultimate value will be NULL.) Next, find the last run of contiguous rows within the group which have that penultimate value for the designated column. Take the timestamp from the first row within that run. If the penultimate value of the designated column is NULL, the timestamp value is also NULL.
- For the UNTIL_TIMESTAMP column, the output value for a given group is determined by taking the timestamp value from the next row following the penultimate value run defined in the previous bullet. If the group contains only one distinct value for the designated column, then return the timestamp of the first row of the group for UNTIL_TIMESTAMP instead.
- For any other column, the output value for a given group is equal to the penultimate value in the input for that group according to the timestamp ordering. If the group contains only one value for that column, then the output value is NULL.
Boundary cases:
- Empty input will return an empty table with the additional UNTIL_TIMESTAMP column
- There is no special handling for null values in columns other than the timestamp column
- This means that if a penultimate value for a column is a null, it will be outputted as is.
- The same goes for nulls in the grouping column - null is a valid group
- all string fields are right trimmed prior to comparison for equality. This means that 'Field' is the same as 'Field ', and strings that consist of spaces are equivalent to the empty string.
Exceptions:
A fatal exception is thrown under any of the following conditions:
- DESIGNATED_VALUE_AND_TIMESTAMP is not a row of exactly two columns
- Second column of DESIGNATED_VALUE_AND_TIMESTAMP is not a TIMESTAMP
- Input table not presorted correctly in ascending order of group keys plus timestamp (as defined by SQL:2003 ORDER BY semantics, with NULLS FIRST, and PAD SPACE collation characteristic for group keys of VARCHAR datatype)
- for rules related to NULLS FIRST ordering, see SQL:2003 Part 2 Section 10.10 General Rule 1.e
- for specifics of VARCHAR with PAD SPACE, see SQL:2003 Part 2 Section 4.2.2
- If any of the columns in DESIGNATED_VALUE_AND_TIMESTAMP are in GROUPING_COLUMNS
- If the timestamp input column (2nd column of DESIGNATED_VALUE_AND_TIMESTAMP) contains a NULL value
Example
select
"OpportunityId",
"OpportunityChangeTimestamp" as "PreviousStageStart",
"Stage" as "PreviousStage",
"Amount" as "PreviousAmount",
"ExpectedRevenue" as "PreviousExpectedRevenue"
until_timestamp as "PreviousStageEnd"
from table(
penultimate_values(
cursor(
select * from opportunity_change_history
order by "OpportunityId", "OpportunityChangeTimestamp"
),
row("OpportunityId"),
row("Stage", "OpportunityChangeTimestamp")));
| OpportunityId | OpportunityChangeTimestamp | Stage | Amount | ExpectedRevenue |
|---|---|---|---|---|
| (null) | 1800-01-01 00:00:00 | (null) | 0 | (null) |
| (null) | 1802-01-01 00:00:00 | (null) | (null) | 0 |
| A | 2007-10-12 12:23:41 | F | 100 | (null) |
| A | 2007-10-14 00:00:00 | F | 150 | 100 |
| A | 2007-12-01 09:00:00 | G | 150 | 200 |
| A | 2007-12-01 15:00:00 | F | (null) | 100 |
| A | 2007-12-29 11:45:00 | H | 200 | 100 |
| B | 2007-09-09 04:00:23 | (null) | 10 | (null) |
| B | 2007-09-09 05:00:23 | F | 10 | 90 |
| C | 2008-01-01 07:27:00 | G | 900 | 2000 |
| D | 2007-12-04 16:56:09 | F | 10 | 15 |
| D | 2007-12-05 00:00:00 | G | 10 | 15 |
| D | 2007-12-05 01:00:00 | H | 15 | 15 |
| D | 2007-12-05 05:00:00 | H | 20 | 20 |
| OpportunityId | PreviousStageStart | PreviousStage | PreviousAmount | PreviousExpectedRevenue | PreviousStageEnd |
|---|---|---|---|---|---|
| (null) | (null) | (null) | 0 | (null) | 1800-01-01 00:00:00 |
| A | 2007-12-01 15:00:00 | F | (null) | 200 | 2007-12-29 11:45:00 |
| B | 2007-09-09 04:00:23 | (null) | (null) | (null) | 2007-09-09 05:00:23 |
| C | (null) | (null) | (null) | (null) | 2008-01-01 07:27:00 |
| D | 2007-12-05 00:00:00 | G | 15 | 15 | 2007-12-05 01:00:00 |

