Azure Table Connector
You are looking at an older version of the documentation. The latest version is found here.
The Azure Table Connector is used to access Azure Tables data and make it available in the Data Virtuality Server in the relational form. This way, you can query an Azure Table using the SQL query language and write complex queries directed to multiple Data Sources. For example, a single SQL query can join tables from MySQL and combine them with data coming from Azure Table.
Connector Configuration
Connecting to Azure Table
Before issuing SQL queries to the Azure Table, you need to represent Azure Table's Metadata in relational format, for example represent non-relation collections as relation tables. This is done by connecting to Azure Table data source using the createConnection
/createDatasource
commands.
The complete list of connection properties can be found in Connection Configuration Interface under azuretable
.
The Azure Table endpointSuffix
connection property is explained below.
Example
CALL "SYSADMIN.createConnection"(
"name" => 'azuretable',
"jbossCLITemplateName" => 'azuretable',
"connectionOrResourceAdapterProperties" =>
'accountName=<accountName>
,accountKey=<accountKey>
,defaultEndpointsProtocol=https
,endpointSuffix=<suffix>'
) ;;
CALL "SYSADMIN.createDataSource"(
"name" => 'azuretable',
"translator" => 'azuretable',
"modelProperties" => 'importer.useFullSchemaName=false,importer.defaultSchema=<accountName>,importer.schemaPattern=<accountName>',
"translatorProperties" => ''
) ;;
Azure Table endpointSuffix
Azure domain name (e.g. core.windows.net)
Usage
Azure Tables are represented as relational tables in the schema named the same as in createDatasource()
.
SELECT Query
To query these tables, a regular SQL SELECT
query can be used:
SELECT * FROM azuretable.table1
Write Capability
- Azure Table Storage table names must be alpha-numeric and must not contain any special characters (underscore is not allowed);
- Any Table in Azure Table Storage has system columns ("partitionkey", "rowkey", "time_stamp");
- The column combination of ("partitionkey", "rowkey") has to be unique for each table in Azure Table Storage;
- The name of the
time_stamp
column is "time_stamp" in the Data Virtuality Server metadata and "timestamp" in Azure Table Storage metadata; - Column order in the Data Virtuality Server metadata is as follows: "partitionkey", "rowkey", "time_stamp". All other columns are ordered alphabetically;
- Empty tables in Azure Table Storage have no metadata in the Azure Table Storage. The Data Virtuality Server will not load such tables on data source refresh.
CREATE TABLE / INSERT INTO
- CREATE table may specify system columns as ("partitionkey" string(1024), "rowkey" string(1024), time_stamp timestamp);
INSERT INTO
must contain unique values for ("partitionkey", "rowkey") column combination;- Target columns in
INSERT INTO
queries have to be specified explicitly; - The timestamp column is maintained by Azure Table Storage.
INSERT INTO
andSELECT INTO
may specify a value for the "time_stamp" column. This value will be ignored by Azure Table Storage.
SELECT INTO
SELECT INTO
must contain unique values for ("partitionkey", "rowkey") column combination;- The timestamp column is maintained by Azure Table Storage.
INSERT INTO
andSELECT INTO
may specify a value for the "time_stamp" column. This value will be ignored by Azure Table Storage.
Data types
- Data Virtuality Server
BLOB
data type columns (Snowflake'sVARBINARY
) may be explicitly cast toSTRING
for inserting them into Azure Table Storage; - Numeric Data Virtuality Server data types
BIGDECIMAL
/BIGINTEGER
will be created as data type string in Azure Table Storage. If that is not desired, they may be explicitly cast toDOUBLE
orINTEGER
/LONG
in the Data Virtuality Server before being written to Azure Table Storage.
Recommendations
- Avoid table creation without specifying system columns;
- Refresh the Azure Table Storage before issuing a
SELECT INTO
orCREATE TABLE
query. The connection is not kept alive and needs to be activated if idle for longer than one hour.
Further hints
- Technically, tables without ("partitionkey", "rowkey", "time_stamp") can be created.
- If such a table is created via
SELECT INTO
, only one row may be written. That is because metadata columns will be added in Azure Table Storage automatically with values (null, null) for ("partitionkey", "rowkey") column combination and one row containing that value will disallow the presence of any other due to uniqueness constraint. - If such a table is created via
CREATE TABLE
, exactly one row may be inserted, for the same reason as stated above;
- If such a table is created via
- Drop operation could take up to 40 seconds due to internal treatment of drop requests in Azure Table Storage;
- Upsert and History update procedures do neither work nor make sense towards Azure Table Storage. Any such workflows should be covered by a customized approach.
Translator properties for Azure Tables Connector
- numberOfThreads:
- controls the number of threads being used to write to Azure Table Storage
- default: 3
- reliable values: up to 15
Azure Table connector available since v2.4.6
Write capability for Azure Table connector available since v2.4.12