Azure Table Connector
The Azure Table Connector is used to access Azure Tables data and make it available in the CData Virtuality Server in the relational form. This way, you can query an Azure Table using the SQL query language and write complex queries directed to multiple Data Sources. For example, a single SQL query can join tables from MySQL and combine them with data coming from Azure Table.
Connector Configuration
Connecting to Azure Table
Before issuing SQL queries to the Azure Table, you need to represent Azure Table's Metadata in relational format, for example represent non-relation collections as relation tables. This is done by connecting to Azure Table data source using the createConnection/createDatasource commands.
The complete list of connection properties can be found in Connection Configuration Interface under azuretable.
The Azure Table endpointSuffix connection property is explained below.
Example
CALL "SYSADMIN.createConnection"(
"name" => 'azuretable',
"jbossCLITemplateName" => 'azuretable',
"connectionOrResourceAdapterProperties" =>
'accountName=<accountName>
,accountKey=<accountKey>
,defaultEndpointsProtocol=https
,endpointSuffix=<suffix>'
) ;;
CALL "SYSADMIN.createDataSource"(
"name" => 'azuretable',
"translator" => 'azuretable',
"modelProperties" => 'importer.useFullSchemaName=false,importer.defaultSchema=<accountName>,importer.schemaPattern=<accountName>',
"translatorProperties" => ''
) ;;
Azure Table endpointSuffix
Azure domain name (e.g. core.windows.net)
Usage
Azure Tables are represented as relational tables in the schema named the same as in createDatasource().
SELECT Query
To query these tables, a regular SQL SELECT query can be used:
SELECT * FROM azuretable.table1
Write Capability
Azure Table Storage table names must be alpha-numeric and must not contain any special characters (underscore is not allowed);
Any Table in Azure Table Storage has system columns ("partitionkey", "rowkey", "time_stamp");
The column combination of ("partitionkey", "rowkey") has to be unique for each table in Azure Table Storage;
The name of the
time_stampcolumn is "time_stamp" in the CData Virtuality Server metadata and "timestamp" in Azure Table Storage metadata;Column order in the CData Virtuality Server metadata is as follows: "partitionkey", "rowkey", "time_stamp". All other columns are ordered alphabetically;
Empty tables in Azure Table Storage have no metadata in the Azure Table Storage. The CData Virtuality Server will not load such tables on data source refresh.
CREATE TABLE / INSERT INTO
CREATE table may specify system columns as ("partitionkey" string(1024), "rowkey" string(1024), time_stamp timestamp);
INSERT INTOmust contain unique values for ("partitionkey", "rowkey") column combination;Target columns in
INSERT INTOqueries have to be specified explicitly;The timestamp column is maintained by Azure Table Storage.
INSERT INTOandSELECT INTOmay specify a value for the "time_stamp" column. This value will be ignored by Azure Table Storage.
SELECT INTO
SELECT INTOmust contain unique values for ("partitionkey", "rowkey") column combination;The timestamp column is maintained by Azure Table Storage.
INSERT INTOandSELECT INTOmay specify a value for the "time_stamp" column. This value will be ignored by Azure Table Storage.
Data Types
The CData Virtuality Server
BLOBdata type columns (Snowflake'sVARBINARY) may be explicitly cast toSTRINGfor inserting them into Azure Table Storage;Numeric CData Virtuality Server data types
BIGDECIMAL/BIGINTEGERwill be created as data type string in Azure Table Storage. If that is not desired, they may be explicitly cast toDOUBLEorINTEGER/LONGin the CData Virtuality Server before being written to Azure Table Storage.
Recommendations
Avoid table creation without specifying system columns;
Refresh the Azure Table Storage before issuing a
SELECT INTOorCREATE TABLEquery. The connection is not kept alive and needs to be activated if idle for longer than one hour.
Further Hints
Technically, tables without ("partitionkey", "rowkey", "time_stamp") can be created.
If such a table is created via
SELECT INTO, only one row may be written. That is because metadata columns will be added in Azure Table Storage automatically with values (null, null) for ("partitionkey", "rowkey") column combination and one row containing that value will disallow the presence of any other due to uniqueness constraint.If such a table is created via
CREATE TABLE, exactly one row may be inserted, for the same reason as stated above;
Drop operation could take up to 40 seconds due to internal treatment of drop requests in Azure Table Storage;
Upsert and History update procedures do neither work nor make sense towards Azure Table Storage. Any such workflows should be covered by a customized approach.
Translator Properties for Azure Tables Connector
numberOfThreads:Controls the number of threads being used to write to Azure Table Storage
Sefault:
3Reliable values: up to
15
sampleSize:Defines number of rows that will be used to determine column type
Default:
100
supportsWhereColumnEqualsColumn:Controls pushdown of queries like
WHERE col1 = col2Default:
FALSE