LucidDbAppLib SPLIT STRINGS
From Eigenpedia
Contents |
Purpose
Split one or more input strings, by a provided separator string, into separate rows.
Syntax
There are six variants of this UDX, taking as input either a string, a one-column table, or a multi-column table, all optionally generating a column of sequence numbers:
Single-string input:
select * from table(
applib.split_string_to_rows(
IN_STRING,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS)
);
Single-column input:
select * from table(
applib.split_rows(
IN_CURSOR,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS)
);
Multi-column input:
select * from table(
applib.split_rows(
IN_CURSOR,
COL_NAME,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS)
);
Single-string input with sequence numbers:
select * from table(
applib.split_string_to_rows(
IN_STRING,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS,
START_NUM,
INCREMENT_BY)
);
Single-column input with sequence numbers:
select * from table(
applib.split_rows(
IN_CURSOR,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS,
START_NUM,
INCREMENT_BY)
);
Multi-column input with sequence numbers:
select * from table(
applib.split_rows(
IN_CURSOR,
COL_NAME,
SEPARATOR_CHAR,
ESCAPE_CHAR,
TRIM_TOKENS,
START_NUM,
INCREMENT_BY)
);
Single-string input
Parameters
Input:
- INPUT_STRING varchar(65535): a string to be split
- SEPARATOR_CHAR char(1): a single-character separator to split by.
- ESCAPE_CHAR char(1): an escape character to escape the separator when needed
- TRIM_TOKENS boolean: trim leading and trailing whitespace from the split tokens, TRUE or FALSE
Output:
- OUT_STRINGS: a single-column result set containing the split tokens from INPUT_STRING.
- If INPUT_STRING is empty, or null, or if it's whitespace-only and TRIM_TOKENS is TRUE, no output rows will be produced.
Example
Input
- SEPARATOR_CHAR: "~"
- ESCAPE_CHAR: '!'
- TRIM_TOKENS: TRUE
- IN_STRING: "AA~BB~CC"
Query
select * from table(
applib.split_string_to_rows(
'AA~BB~CC',
'~',
'!',
TRUE)
);
Output
OUT_STRINGS ----------- AA BB CC
Single-column input
Parameters
Input:
- INPUT_CURSOR cursor: a single-column cursor containing strings to be split.
- SEPARATOR_CHAR varchar(1): a single-character separator to split by
- ESCAPE_CHAR varchar(1): an escape character to escape the separator when needed
- TRIM_TOKENS boolean: trim leading and trailing whitespace from the split tokens, TRUE or FALSE
Output:
- a single-column result set containing the split tokens from all strings from INPUT_CURSOR, named the same as the single column in INPUT_CURSOR.
Example
Input
- SEPARATOR_CHAR: "~"
- ESCAPE_CHAR: '!'
- TRIM_TOKENS: TRUE
- INPUT_CURSOR:
MY_COLUMN_NAME -------------- AA~BB CC~DD
Query
select * from table(
applib.split_rows(
cursor(select MY_COLUMN_NAME from ...),
'~',
'!',
TRUE)
);
Output
MY_COLUMN_NAME -------------- AA BB CC DD
Multi-column input
Parameters
Input:
- INPUT_CURSOR cursor: a multi-column cursor containing, among other things, a column of strings to be split
- COL_NAME: a ROW reference to the single column to be split
- SEPARATOR_CHAR char(1): a single-character separator to split by
- ESCAPE_CHAR char(1): an escape character to escape the separator when needed
- TRIM_TOKENS boolean: trim leading and trailing whitespace from the split tokens, TRUE or FALSE
Output:
- a multi-column result set, with the split column containing the split tokens from all strings of that column in INPUT_CURSOR, with the other columns' values duplicated.
Example
Input
- SEPARATOR_CHAR: '~'
- ESCAPE_CHAR: '!'
- TRIM_TOKENS: TRUE
- COL_NAME: row(COL2)
- INPUT_CURSOR:
COL1 |COL2 |COL3 -----+------+------- A1 |AA~BB |C1 A2 |CC~DD |C2
Query
select * from table(
applib.split_rows(
cursor(select COL1, COL2, COL3 from ...),
row(COL2),
'~',
'!',
TRUE)
);
Output
COL1 |COL2 |COL3 -----+------+---- A1 |AA |C1 A1 |BB |C1 A2 |CC |C2 A2 |DD |C2
Sequence Numbers
The variants with sequence numbers all add a column (SEQ_NUM bigint) to the resulting table, filled with a sequence of numbers.
Parameters
- START_NUM bigint: the sequence number used for the first row. If NULL, this will default to 1.
- INCREMENT_BY bigint: added to the previous sequence number to generate the next. Can be positive (for increasing sequence) or negative (for decreasing sequence). Must not be 0. If NULL, will default to 1.
Escaping
The "escape" parameter sets which character to be used for escaping the splitter character in cases where that character legally appears in tokens. For example, if exclamation mark, '!', is used, a splitter character preceded by an exclamation mark will not cause a string split but instead be included in the output string, unescaped:
- INPUT_STRING: AA!~B!B~CC
- out:
OUT_STRINGS ----------- AA~B!B CC
Exceptions
Exceptions are thrown in the following situations:
- (SEPARATOR_CHAR is more or less than one single character) - currently disabled due to FRG-261
- (ESCAPE_CHAR is more or less than one single character) - currently disabled due to FRG-261
- INPUT_CURSOR is more than one single column in single-column mode
- COL_NAME is more than one column
- START_NUM and INCREMENT_BY are present and INCREMENT_BY=0.

