TEXTTABLE
The TEXTTABLE
function processes character input to produce tabular output. It supports both fixed and delimited file format parsing. The function itself defines what columns it projects. The TEXTTABLE
function is implicitly a nested table and may be correlated to the preceding FROM
clause entries.
Usage
TEXTTABLE(expression [SELECTOR string] COLUMNS <COLUMN>, ... [NO ROW DELIMITER | ROW DELIMITER char] [DELIMITER char] [(QUOTE|ESCAPE) char] [HEADER [integer]] [SKIP integer] [NO TRIM]) AS name
Where <COLUMN>
COLUMN := name (FOR ORDINALITY | ([HEADER string] datatype [WIDTH integer [NO TRIM]] [SELECTOR string integer]))
This function has the following parameters:
To view the full table, click the expand button in its top right corner
Parameter | Description | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
expression | Text content to process, should be convertible to CLOB | ||||||||||||
SELECTOR | Specifies that delimited lines should only match if the line begins with the selector string followed by a delimiter. The selector value is a valid column value. If a TEXTTABLE SELECTOR is specified, a SELECTOR may also be specified for column values. A column SELECTOR argument will select the nearest preceding text line with the given SELECTOR prefix and select the value at the given 1-based integer position (which includes the selector itself). If no such text line or position with a given line exists, a null value will be produced | ||||||||||||
NO ROW DELIMITER | Indicates that fixed parsing should not assume the presence of newline row delimiters | ||||||||||||
ROW DELIMITER | Sets the row delimiter/new line to an alternate character. Default: new line character with built-in handling for treating the carriage-return new line as a single character. If ROW DELIMITER is specified, carriage return will be given no special treatment | ||||||||||||
DELIMITER | Sets the field delimiter character to use. Default: , | ||||||||||||
QUOTE | Sets the quote (or qualifier) character used to wrap field values. Default: " . May be set to nothing by specifying QUOTE '' | ||||||||||||
ESCAPE | Sets the escape character to use if no quoting character is in use. This is used in situations where the delimiter or newline characters are escaped with a preceding character, e.g. \ | ||||||||||||
HEADER | Set for the TEXTTABLE specifies the text line number (counting every new line) on which the column names occur. All lines before the header will be skipped. If HEADER is specified, the header line will be used to determine the TEXTTABLE column position by case-insensitive name matching. This is especially useful when only a subset of the columns is needed. If the HEADER value is not specified, it defaults to 1 . If HEADER is not specified, then columns are expected to match positionally with the text contents. If the HEADER option for a column is specified, then that will be used as the expected header name. | ||||||||||||
SKIP | Specifies the number of text lines (counting every new line) to skip before parsing the contents. HEADER may still be specified with SKIP | ||||||||||||
A FOR ORDINALITY | This column is typed as an integer and will return the 1-based item number as its value | ||||||||||||
WIDTH | Indicates the fixed-width length of a column in characters, not bytes. With the default ROW DELIMITER , a CR NL sequence counts as a single character | ||||||||||||
NO TRIM | When specified on the TEXTTABLE , will affect all column and header values. When specified on a column, the fixed or unqualified text value not be trimmed of a leading and trailing whitespace | ||||||||||||
SKIPERRORS[maxerrors] | |||||||||||||
ERRORLOG tablename | Failed lines are written into the named table using the following format:
|
Syntax Rules
- If width is specified for one column it must be specified for all columns and be a non-negative integer;
- If width is specified, then fixed-width parsing is used, and
ESCAPE
,QUOTE
,SELECTOR
, andHEADER
should not be specified; - If width is not specified, then
NO ROW DELIMITER
cannot be used; - The columns names must contain no duplicates;
QUOTE
,DELIMITER
, andROW DELIMITER
must all be different characters.
Examples
1. Use of the HEADER
parameter, returns 1 row ['b']:
SELECT * FROM TEXTTABLE(UNESCAPE('col1,col2,col3\na,b,c') COLUMNS col2 string HEADER) x
2. Use of fixed width, returns 2 rows ['a', 'b', 'c'], ['d', 'e', 'f']:
SELECT * FROM TEXTTABLE(UNESCAPE('abc\ndef') COLUMNS col1 string width 1, col2 string width 1, col3 string width 1) x
3. Use of fixed width without a row delimiter, returns 3 rows ['a'], ['b'], ['c']:
SELECT * FROM TEXTTABLE('abc' COLUMNS col1 string width 1 NO ROW DELIMITER) x
4. Use of the ESCAPE
parameter, returns 1 row ['a,', 'b']:
SELECT * FROM TEXTTABLE('a:,,b' COLUMNS col1 string, col2 string ESCAPE ':') x
5. As a nested table:
SELECT x.* FROM t, TEXTTABLE(t.clobcolumn COLUMNS first string, second date SKIP 1) x
6. Use of SELECTOR
s, returns 2 rows ['c', 'd', 'b'], ['c', 'f', 'b']:
SELECT * FROM TEXTTABLE(UNESCAPE('a,b\nc,d\nc,f') SELECTOR 'c' COLUMNS col1 string, col2 string, col3 string SELECTOR 'a' 2) x
7. Use of an escaped literal to set specific chars as column and row delimiters (here: carriage return as column delimiter and line feed as row delimiter) whilst reading a CSV file from the local file system:
SELECT
"csv_table".*
FROM (CALL "ds_file".getFiles('planets_export.csv')) f,
TEXTTABLE(to_chars(f.file,'utf-8')
COLUMNS
"id" STRING
, "name" STRING
, "population" STRING
, "diameter" STRING
, "gravity" STRING
ROW DELIMITER E'\n'
DELIMITER E'\r'
QUOTE '"'
MAXWIDTH 8000
SKIP 1
)
"csv_table";;
8. Specifying any Unicode character as a delimiter with referencing their number. Same case as above but different implementation:
SELECT
"csv_table".*
FROM (CALL "ds_file".getFiles('planets_export.csv')) f,
TEXTTABLE(to_chars(f.file,'utf-8')
COLUMNS
"id" STRING
, "name" STRING
, "population" STRING
, "diameter" STRING
, "gravity" STRING
ROW DELIMITER E'\u000A'
DELIMITER E'\u000D'
QUOTE '"'
MAXWIDTH 8000
SKIP 1
)
"csv_table";;
See Also
Using TEXTTABLE to Implement SPLIT / SPLIT_PART for a guide on how to split large strings into columns based on a delimiter