Expressions
You are looking at an older version of the documentation. The latest version is found here.
Identifiers, literals, and functions can be combined into expressions. Expressions can be used almost anywhere in a query: SELECT
, FROM
(if specifying join criteria), WHERE
, GROUP BY
, HAVING
, or ORDER BY
.
The Data Virtuality Server supports the following types of expressions:
- Column Identifiers
- Literals
- Scalar Functions
- Arrays
- Aggregate Functions
- Window Functions
- Case and Searched Case
- Scalar Subqueries
- Parameter References
Column Identifiers
Column identifiers are used to specify the output columns in SELECT
statements, the columns and their values for INSERT
and UPDATE
statements, and criteria used in WHERE
and FROM
clauses. They are also used in GROUP BY
, HAVING
, and ORDER BY
clauses. For the syntax for column identifiers, please refer to the Identifiers section.
Literals
Literal values represent fixed values. These can be any of the 'standard' data types.
Syntax rules:
- Integer values will be assigned an integral data type big enough to hold the value (integer, long, or biginteger);
- Floating point values will always be parsed as a double;
- The keyword 'null' represents an absent or unknown value and is inherently untyped. In many cases, a null literal value will be assigned an implied type based on context. For example, in the function '5 + null', the null value will be assigned the type 'integer' to match the type of the value '5'. A null literal used in the
SELECT
clause of a query with no implied context will be assigned to type 'string'.
Some examples of simple literal values
'
abc'
- escaped single quote:
'
isn'
'
t true'
5
- scientific notation:
37.75e01
- exact numeric type
BigDecimal
:100.0
true
false
- Unicode character:
'\u0027'
- binary:
X'0F0A'
Date/Time Literals can use JDBC Escaped Literal Syntax or the ANSI keyword syntax:
JDBC Escaped Literal Syntax | ANSI keyword syntax | |
---|---|---|
Date Literal | {d '...' } | DATE '...' |
Time Literal | {t '...' } | TIME '...' |
Timestamp Literal | {ts '...' } | TIMESTAMP '...' |
Using Unicode characters
You can specify any Unicode characters (with codes 0-65535) in SQL strings. Normally if you try to include special characters in the usual way using a backslash, it will be taken literally by DV. To use special characters, you should use an 'E' letter before the first apostrophe character bounding a string constant.
The syntax supports the following codes: \b, \t, \n, \f, \r and the standard way to specify a Unicode character: \uXXXX, where X is a hexadecimal digit.
Examples:
SQL code | Output |
---|---|
SQL
| some\ntext |
SQL
| some text |
SQL
| danke schön |
Arrays
An array is an ordered list of elements of a single data type. Each element has its own index value by which it can be accessed.
Defining Arrays
General Array Syntax
(expr, expr ... [,])
ARRAY(expr, ...)
Empty Arrays
()
(,)
ARRAY()
Single Element Arrays
(expr,)
ARRAY(expr)
Using Arrays
Here are some things to keep in mind when working with arrays:
- In an SQL statement, arrays use 1-based indexes, i.e. the first element of an array must be referenced as
ARRAY[1]
; - For the parser to recognize a single element expression as an array with parentheses, a trailing comma is necessary;
- If all of the elements of the array have the same type, the array will have a matching base type. If the element types are different, the base type of the array will be
object
.
And here's how to define and use an array:
BEGIN
// Defining an array
DECLARE OBJECT favourite_birds = ARRAY('seagull', 'eagle', 'nightingale', 'sparrow');
// Using the array
SELECT favourite_birds[2]
END
// Output
eagle
The Data Virtuality Server also has special functions for working with arrays: Array Functions.
Aggregate Functions
Aggregate functions take sets of values from a group produced by an explicit or implicit GROUP BY and return a single scalar value computed from the group.
Data Virtuality Server supports the following aggregate functions:
To view the full table, click the expand button in its top right corner
Function | Description |
---|---|
COUNT(*) | Counts the number of values (including nulls and duplicates) in a group |
COUNT(x) | Counts number of values (excluding nulls) in a group |
SUM(x) | Sum of values (excluding nulls) in a group |
AVG(x) | Average of values (excluding nulls) in a group |
MIN(x) | Minimum value in a group (excluding null) |
MAX(x) | Maximum value in a group (excluding null) |
ANY(x)/SOME(x) | Returns TRUE if any value in the group is TRUE (excluding null) |
EVERY(x) | Returns TRUE if every value in the group is TRUE (excluding null) |
VAR_POP(x) | Biased variance (excluding null) logically equals (sum(x^2) - sum(x)^2/count(x))/count(x) ; returns a double; null if count = 0 |
VAR_SAMP(x) | Sample variance (excluding null) logically equals (sum(x^2) - sum(x)^2/count(x))/(count(x) - 1) ; returns a double; null if count < 2 |
STDDEV_POP(x) | Standard deviation (excluding null) logically equals SQRT(VAR_POP(x)) |
STDDEV_SAMP(x) | Sample standard deviation (excluding null) logically equals SQRT(VAR_SAMP(x)) |
| CSV text aggregation of all expressions in each row of a group.
|
XMLAGG(xml_expr [ ORDER BY ... ]) | XML concatenation of all XML expressions in a group (excluding null). The ORDER BY clause cannot reference alias names or use positional ordering
SQL
|
JSONARRAY_AGG(x [ORDER BY …]) | Creates a JSON array result as a Clob including null value. The ORDER BY clause cannot reference alias names or use positional ordering. See also the JSONArray function.
SQL
|
STRING_AGG(x, delim) | Creates a lob result from the concatenation of x using the delimiter
SQL
|
ARRAY_AGG(x [ORDER BY ...]) | Creates an array with a base type matching the expression
SQL
|
| User-defined aggregate function |
Syntax Rules
Some aggregate functions may contain a keyword 'DISTINCT
' before the expression, indicating that duplicate expression values should be ignored. DISTINCT
is not allowed in COUNT(*)
and is not meaningful in MIN
or MAX
(the result would be unchanged), but it can be used in COUNT
, SUM
, and AVG
.
- Aggregate functions cannot be used in
FROM
,GROUP BY
, orWHERE
clauses without an intervening query expression; - Aggregate functions cannot be nested within another aggregate function without an intervening query expression;
- Aggregate functions may be nested inside other functions;
Any aggregate function may take an optional
FILTER
clause of the following form:CODEFILTER ( WHERE condition )
The condition may be any boolean value expression that does not contain a subquery or a correlated variable. The filter will logically be evaluated for each row prior to the grouping operation. If
FALSE
, the aggregate function will not accumulate a value for the given row;- User-defined aggregate functions need
ALL
specified if no other aggregate-specific constructs are used to distinguish the function as an aggregate rather than a normal function. For more information on aggregates, see the sections onGROUP BY
andHAVING
.
Window Functions
Data Virtuality Server supports ANSI SQL 2003 window functions. A window function allows an aggregate function to be applied to a subset of the result set without the need for a GROUP BY
clause. A window function is similar to an aggregate function, but requires the use of an OVER
clause or window specification.
Usage
aggregate|ranking OVER ([PARTITION BY ...] [ORDER BY ...] [<FRAME-CLAUSE>])
aggregate can be any aggregate function. Ranking can be one of
ROW_NUMBER()
, RANK()
, DENSE_RANK()
.
Syntax Rules
- Window functions can only appear in the
SELECT
andORDER BY
clauses of a query expression; - Window functions cannot be nested in one another;
- Partitioning and order by expressions cannot contain subqueries or outer references;
- The ranking (
ROW_NUMBER
,RANK
,DENSE_RANK
) functions require the use of the window specificationORDER BY
clause; - An
XMLAGG ORDER BY
clause cannot be used when windowed; - The window specification
ORDER BY
clause cannot reference alias names or use positional ordering; - Windowed aggregates may not use
DISTINCT
if the window specification is ordered; - The Frame Clause can only be used in combination with the Partition Clause (at least one of
PARTITION BY
orORDER BY
should be present).
Frame Clause
The frame clause allows the user to specify a dynamic group of rows, or a range, inside the partition (analogous to a sliding frame in a window). It can be used in two ways:
[ RANGE | ROWS ] frame_start
[ RANGE | ROWS ] BETWEEN frame_start AND frame_end
Where frame_start and frame_end can be one of:
UNBOUNDED PRECEDING
value PRECEDING (ROWS only)
CURRENT ROW
value FOLLOWING (ROWS only)
UNBOUNDED FOLLOWING
Restrictions are that frame_start cannot be UNBOUNDED FOLLOWING
, frame_end cannot be UNBOUNDED PRECEDING
, and the frame_end choice cannot appear earlier in the above list than the frame_start choice. For example, RANGE BETWEEN CURRENT ROW AND
value PRECEDING
is not allowed.
Examples of Usage
Let us assume we have the following table called my_table
:
x | y |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 3 |
Examples of Frame Clause usage
SELECT x, SUM(y) OVER (PARTITION BY y ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS window_column
FROM data_source.my_table;
-- Result:
-- |-----------|---------------|
-- | x | window_column |
-- |-----------|---------------|
-- | 1 | 1 |
-- | 2 | 2 |
-- | 3 | 3 |
-- | 4 | 2 |
-- | 5 | 3 |
-- |-----------|---------------|
SELECT x, COUNT(y) OVER (PARTITION BY y RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS window_column
FROM data_source.my_table;
-- Result:
-- |-----------|---------------|
-- | x | window_column |
-- |-----------|---------------|
-- | 1 | 3 |
-- | 2 | 3 |
-- | 3 | 3 |
-- | 4 | 1 |
-- | 5 | 1 |
-- |-----------|---------------|
SELECT x, y*100/SUM(y) OVER (PARTITION BY y RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS window_column
FROM data_source.my_table;
-- Result:
-- |-----------|---------------|
-- | x | window_column |
-- |-----------|---------------|
-- | 1 | 33 |
-- | 2 | 33 |
-- | 3 | 33 |
-- | 4 | 100 |
-- | 5 | 100 |
-- |-----------|---------------|
Function Definitions
Function | Description |
---|---|
ROW_NUMBER() | Functionally same as COUNT(*) with the same window specification. Assigns a number to each row in a partition starting at 1 |
RANK() | Assigns a number to each unique ordering value within each partition starting at 1, such that the next rank is equal to the count of prior rows |
DENSE_RANK() | Assigns a number to each unique ordering value within each partition starting at 1, such that the next rank is sequential |
LEAD(scalar_expression [, offset [, default]]) | Returns scalar_expression evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead returns default (which must be of the same type as scalar_expression ). Both offset and default are evaluated for the current row. If omitted, offset defaults to 1 and default to null |
LAG(scalar_expression [, offset [, default]]) | Returns scalar_expression evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead returns default (which must be of the same type as scalar_expression ). Both offset and default are evaluated for the current row. If omitted, offset defaults to 1 and default to null |
FIRST_VALUE(scalar_expression) | Returns scalar_expression evaluated at the row that is the first row of the window frame |
LAST_VALUE(scalar_expression) | Returns scalar_expression evaluated at the row that is the last row of the window frame |
Processing
Window functions are logically processed just before creating the output from the SELECT
clause. Window functions can use nested aggregates if a GROUP BY
clause is present. The is no guaranteed effect on the output ordering from the presence of window functions. The SELECT
statement must have an ORDER BY
clause to have a predictable ordering.
The Data Virtuality Server will process all window functions with the same window specification together. Generally, a full pass over the row values coming into the SELECT
clause will be required for each unique window specification. For each window specification, the values will be grouped according to the PARTITION BY
clause. If no PARTITION BY
clause is specified, the entire input is treated as a single partition. The output value is determined based upon the current row value, its peers (that is, rows that are the same with respect to their ordering), and all prior row values based upon ordering in the partition. The ROW_NUMBER
function will assign a unique value to every row regardless of the number of peers.
Example Windowed Results
SELECT name, salary, max(salary) over (partition by name) as max_sal,
rank() over (order by salary) as rank, dense_rank() over (order by salary) as dense_rank,
row_number() over (order by salary) as row_num
FROM data_source.employees
name | salary | max_sal | rank | dense_rank | row_num |
---|---|---|---|---|---|
John | 100000 | 100000 | 2 | 2 | 2 |
Henry | 50000 | 50000 | 5 | 4 | 5 |
John | 60000 | 100000 | 3 | 3 | 3 |
Suzie | 60000 | 150000 | 3 | 3 | 4 |
Suzie | 150000 | 150000 | 1 | 1 | 1 |
Case and Searched Case
The Data Virtuality Server supports two forms of CASE
expression, allowing conditional logic in a scalar expression. The supported forms are the following:
CASE <expr> ( WHEN <expr> THEN <expr>)+ [ELSE expr] END
CASE ( WHEN <criteria> THEN <expr>)+ [ELSE expr] END
Each form allows for an output based on conditional logic. The first form starts with an initial expression, evaluates WHEN
expressions until the values match, and outputs the THEN
expression. If no WHEN
is matched, the ELSE
expression is output. If no WHEN
is matched and no ELSE
is specified, a null literal value is output. The second form (the searched case expression) searches the WHEN
clauses, which specify arbitrary criteria to evaluate. If any criteria evaluates to true, the THEN
expression is evaluated and output. If no WHEN
is true, the ELSE
is evaluated or NULL
is output if none exists.
Scalar Subqueries
Subqueries can be used to produce a single scalar value in the SELECT
, WHERE
, or HAVING
clauses only. A scalar subquery must have a single column in the SELECT
clause and should return either 0 or 1 row. If no rows are returned, null will be returned as the scalar subquery value. For other types of subqueries, see the Subqueries section below.
Parameter References
Parameters are specified using the '?' symbol. Parameters may only be used with PreparedStatement
or CallableStatements
in JDBC. Each parameter is linked to a value specified by a 1-based index in the JDBC API.