Skip to main content
Skip table of contents

Azure Synapse Analytics (Former SQL Data Warehouse)

Upload Mode

The Azure Synapse Analytics (Former SQL Data Warehouse) connector supports two different upload modes: Standard (default) and Blob. As the Standard mode is very slow, we do not recommend using it, except for simple test scenarios. For better performance, we recommend the Blob mode which uses Azure Blob Storage for uploading and copying data into the target table. In this section, we describe how to set up the Blob mode and how to use it.

Automatic Setup

Normally, the connector is set up automatically, with the CData Virtuality Server performing three steps: first, it creates a master key, then it creates the database for credentials, and finally, it defines an external file format depending on the stringDelimiter property.

Please note that if at a later stage you will need to update one of the definitions (e.g. if the secret key has changed), the related object (e.g. the database for credentials) must be deleted first manually in Azure.

Manual Setup and Defining External File Format

If automatic setup described above is impossible for some reason, you will have to perform the three steps (creating a master key, creating the database for credentials, and creating an external file format depending on the stringDelimiter property) manually.

The external file format is defined as follows:

SQL
CREATE EXTERNAL FILE FORMAT DVTextFileFormat WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '<stringDelimiter>',
        USE_TYPE_DEFAULT = FALSE
    )
);

The default value of the stringDelimiter property is |. To change it, you can specify the value you need in both createConnection() and createDatasource() calls. For better performance, we recommend to avoid changing the value of the property after you have set the connector up and used it.

Copying Data using Azure Blob Storage

When the connector has been set up, you can copy the data using Azure Blob Storage as follows:

  1. Create an external data source - it depends on the storage account name and on the container name (needed once for each copy);
  2. Generate and upload a text file (blob) to the container in the storage (needed once for each copy);
  3. Create an external table and copy data from the blob into it;
  4. Copy data from the external table into the final target table.

Deleting and Recreating External File Format

Should the need arise, you can delete the external file format using this command:

SQL
CALL "<your Azure Synapse ds name>.native"(
    "request" => 'DROP EXTERNAL FILE FORMAT DVTextFileFormat'
);;

After executing the command, just drop the Azure data source in the CData Virtuality Server and create it again.

Using Blob Storage with Managed Service Identity

If the Blob Storage has an MSI (Managed Service Identity) configured, a special CLI template (azure_msi) should be used to create the data source connection. There are several things to keep in mind:

  • Translator parameter useMSI must be set to TRUE;
  • Connector property storageAccountKey is not needed;
  • Translator property storageAccountKey must be set.

Distribution Modes

Azure Synapse Analytics supports two table distribution options: ROUND_ROBIN (default) and HASH. You can change the default distribution method when the table is created by providing an appropriate option (applicable to both CREATE TABLE and SELECT INTO commands).

Here are examples for both methods:

ROUND_ROBIN distribution

SQL
CREATE TABLE azure_blob.testdist ( col1 integer, col2 string )
OPTIONS (distribution 'ROUND_ROBIN') ;; --this line can be skipped

HASH distribution for CREATE TABLE command

SQL
CREATE table azure_blob.testdist ( col1 integer, col2 string )
OPTIONS (distribution 'hash(col1)') ;;

HASH distribution for SELECT INTO command

SQL
SELECT 1 AS col1, null AS col2, 'a' AS col3, null AS col4 INTO azure_blob.testdistselectinto
OPTIONS (distribution 'hash(col1)') ;;

Another way to change the distribution mode is to define the DISTKEY index when materialized tables are created by the replicator. In our example, the materialized table will be created with a HASH distribution on column a:

SQL
CALL "SYSADMIN.createIndex"(
    "recOptId" => 1,
    "columnName" => 'a',
    "indexKind" => 'DISTKEY'
) ;;

Indexing

Tables are created as heaps, thus without a clustered index. WITH (HEAP) table option is used on table creation. To add an index, use the native() procedure or do it directly in the database.

String and BLOB size limitations

When the Blob upload mode is used, String and BLOB values are limited to 500 kilobytes due to blob storage limitations. For the standard upload mode, the limitation is set to 500,000 symbols.

Usage Examples

Standard Upload Mode

SQL
CALL SYSADMIN.createConnection(name => 'azure_std', jbossCLITemplateName => 'azure', connectionOrResourceAdapterProperties => 'db=<bd>,user-name=<user-name>,password=<password>,host=<host>,port=1433') ;;
CALL SYSADMIN.createDataSource(name => 'azure_std', translator => 'azure', modelProperties => 'importer.useFullSchemaName=false,importer.TableTypes="TABLE,VIEW",importer.schemaPattern=dbo,importer.importIndexes=false,importer.defaultSchema=dbo', translatorProperties => 'supportsNativeQueries=true') ;;

Blob Upload Mode

For the two examples that follow, please note that the storageAccountKey must be provided both in createConnection() and createDatasource() procedure calls.

Blob Upload Mode

SQL
CALL SYSADMIN.createConnection(name => 'azure_blob', jbossCLITemplateName => 'azure', connectionOrResourceAdapterProperties => 'db=<db>,user-name=<user-name>,password=<password>,host=<host>,port=1433,storageAccountKey=<storageAccountKey>') ;;
CALL SYSADMIN.createDataSource(name => 'azure_blob', translator => 'azure', modelProperties => 'importer.useFullSchemaName=false,importer.TableTypes="TABLE,VIEW",importer.schemaPattern=dbo,importer.importIndexes=false,importer.defaultSchema=dbo', translatorProperties => 'supportsNativeQueries=true,uploadMode=BLOB,storageAccountName=dvblobtest,storageAccountKey=<storageAccountKey>,defaultEndpointsProtocol=https,deleteContainer=false,tempFolder=D:\tmp\azure') ;;

If tempFolder is not specified, the default OS tmp folder is used.


Blob Upload Mode with Managed Service Identity

SQL
CALL SYSADMIN.createConnection(name => 'azure_msi', jbossCLITemplateName => 'azure_msi', connectionOrResourceAdapterProperties => 'db=<db>,user-name=<user-name>,password=<password>,host=<host>,port=1433,stringDelimiter=|') ;;
CALL SYSADMIN.createDataSource(name => 'azure_msi', translator => 'azure', modelProperties => 'importer.useFullSchemaName=false,importer.TableTypes="TABLE,VIEW",importer.schemaPattern=dbo,importer.importIndexes=false,importer.defaultSchema=dbo', translatorProperties => 'supportsNativeQueries=true,uploadMode=BLOB,useMSI=true,storageAccountName=dvsecuredstorage,storageAccountKey=<storageAccountKey>,defaultEndpointsProtocol=https,deleteContainer=true,stringDelimiter=|') ;;
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.