Snowflake Data Warehouse
Snowflake is a cloud-based data storage and analytics service providing relational database support for both structured data, such as CSV files and tables, and semi-structured data, such as JSON, Avro, and Parquet. It can be used both as a data source and as analytical storage.
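For illustration, here is a minimal sketch of the semi-structured side in Snowflake SQL (the table and data are hypothetical):
-- A VARIANT column stores semi-structured JSON documents natively
CREATE TABLE raw_events (payload VARIANT);
INSERT INTO raw_events SELECT PARSE_JSON('{"login_name": "alice", "action": "login"}');
-- Path notation plus a cast extracts typed fields from the JSON payload
SELECT payload:login_name::STRING AS login_name,
       payload:action::STRING AS action
FROM raw_events;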
Connector Configuration
Snowflake requires the following parameters:
Parameter | Description
---|---
host | A Snowflake endpoint based on the account name
user-name | Name of the user's account (provided by Snowflake)
password | Password for the specified user
db | Default database to use once connected (data source parameter)
warehouse | Virtual warehouse to use once connected
Example
CALL SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'supportsNativeQueries=true') ;;
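With importer.useFullSchemaName=FALSE, the imported tables are exposed directly under the data source name. A minimal usage sketch (my_table is a hypothetical table in the PUBLIC schema):
SELECT * FROM snowflake.my_table ;;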
Translator Properties
Amazon AWS S3 Upload
Inserting data into Snowflake via the standard JDBC protocol can be extremely slow, so we recommend configuring and using Amazon AWS S3 upload (S3LOAD) for production use.
To configure S3LOAD, you will need the following translator parameters:
Parameter | Description
---|---
uploadMode=s3load | Explicitly specifies the S3LOAD upload mode
region | AWS S3 region endpoint
bucketName | Optional parameter; name of the bucket to upload data files to. It does not have to be an already existing bucket: you can use the createBucket property to have it created
createBucket | Optional parameter; specifies if the bucket set in the bucketName property should be created if it does not exist yet
bucketPrefix | Optional parameter; prefix of the temporary bucket to upload data files to if bucketName is not set
keyId | AWS S3 key ID
secretKey | AWS S3 secret key
Examples
1. Snowflake as Data Source
CALL SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'supportsNativeQueries=true,uploadMode=s3Load,region=<region>,bucketName=<bucket_name>,createBucket=true,keyId=<key_ID>,secretKey="<secret_key>"') ;;
2. Snowflake as Analytical Storage
CALL SYSADMIN.createConnection(name => 'dwh', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'dwh', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'supportsNativeQueries=true,uploadMode=s3Load,region=<region>,bucketName=<bucket_name>,createBucket=true,keyId=<key_ID>,secretKey="<secret_key>"') ;;
CALL SYSADMIN.setCurrentDWH('dwh', 'PUBLIC') ;;
To read data from an S3 bucket, the security and access management policies on the bucket must allow Snowflake to access it, and there must be an IAM user with the required permissions on the specified bucket. This is a one-time setup: you establish the access permissions on the bucket and associate them with an IAM user. Access management instructions can be found in Configuring AWS IAM User Credentials.
Proxy settings for Amazon AWS S3 upload can be configured using System Properties.
Internal Stage Upload
As an alternative to Amazon AWS S3 upload mode, you can configure data upload via Internal Stage.
Here are the translator properties you need:
Parameter | Description
---|---
uploadMode=internal_stage | Explicitly specifies the Internal Stage upload mode
tempFolder | Value: a path to a folder (relative or absolute). Specifies the folder to be used for creating temporary files instead of the system-configured one. If not specified, the default OS tmp folder is used
keepTempFiles | If set to TRUE, temporary files will not be deleted after uploading
internalStageName | The name of the internal stage that will be used or created for uploading the file. If not set, a named stage with a unique name starting with TEMPORARY_NAMED_STAGE_ will be created for uploading the file and removed when the command is done using it
Example
CALL SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'supportsNativeQueries=true,uploadMode=INTERNAL_STAGE,tempFolder=D:/tmp,keepTempFiles=true,internalStageName=justSomeName') ;;
Note: the INTERNAL_STAGE upload mode is available since v4.8.
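Once data has been uploaded via a named internal stage, the stage contents can be inspected on the Snowflake side. A sketch, assuming the stage name from the example above and the connection's default database and schema:
-- Lists the files currently held in the named internal stage
LIST @justSomeName;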
Microsoft Azure Blob Storage Upload
As an alternative to the above modes, you can configure data upload via Microsoft Azure Blob Storage.
Here are the translator parameters you will need:
Parameter | Description
---|---
uploadMode=blob_azure | Explicitly specifies the Microsoft Azure Blob Storage upload mode
storageAccountName | Microsoft Azure Blob Storage account name
storageAccountKey | Microsoft Azure Blob Storage account key
containerName | Name of the container defined in the Microsoft Azure Blob Storage account. This property is case-sensitive, as the container name is also case-sensitive in Microsoft Azure Blob Storage
defaultEndpointsProtocol | Endpoint protocol
tempFolder | Value: a path to a folder (relative or absolute). Specifies the folder to be used for creating temporary files instead of the system-configured one. If not specified, the default OS tmp folder is used
keepTempFiles | If set to TRUE, temporary files will not be deleted after uploading
uploadZipped | If set to TRUE, zipped files will be uploaded
azureKey | Key for encrypting data while it is stored in Azure Blob Storage; must be a base64-encoded 128-, 192-, or 256-bit value. When set to an empty value (i.e. azureKey=), the data does not get encrypted. The default behaviour depends on the way the data source connects to the Azure Blob Storage (see below)
To set up secure access to data files in a Microsoft Azure Blob container, you can use one of two options: creating a SAS token or configuring a Snowflake Storage integration.
Connecting via SAS Token
To connect via SAS token, configure the following properties:
Parameter | Description
---|---
sasToken | SAS token used to access the Microsoft Azure Blob Storage from Snowflake
createContainer | Specifies if the container set in the containerName property should be created if it does not exist yet
Please keep in mind the following:
- If the containerName parameter is not set, a temporary Azure Blob Storage container will be used (created and deleted for each transaction);
- By default, when azureKey is not set, the data is encrypted with a key generated by the server.
Example
CALL SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'uploadMode=blob_azure,supportsNativeQueries=true,storageAccountName=<storageAccountName>,storageAccountKey=<storageAccountKey>,sasToken="<sasToken>",defaultEndpointsProtocol=https,tempFolder=d:\tmp\azure\,keepTempFiles=false') ;;
Connecting via Snowflake Storage Integration
To use the Snowflake Storage Integration, configure the following properties:
Property | Description
---|---
externalStageName | Fully qualified name of the external stage predefined and configured in Snowflake
containerName | Name of the container defined in the Microsoft Azure Blob Storage account. Must be set to the same container as defined in the external stage. Please note that this property is case-sensitive, as the container name is also case-sensitive in Microsoft Azure Blob Storage
Please keep in mind the following:
- The value of the azureKey parameter should be the same as defined by the master_key parameter of the external stage;
- By default, when azureKey is not set, the data is not encrypted.
Example
CALL SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
CALL SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'uploadMode=blob_azure,supportsNativeQueries=true,storageAccountName=<storageAccountName>,storageAccountKey=<storageAccountKey>,externalStageName=<fully_qualified_name>,containerName=<case_sensitive_container_name>,defaultEndpointsProtocol=https,tempFolder=d:\tmp\azure\,keepTempFiles=false') ;;
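The external stage referenced by externalStageName must exist in Snowflake before the data source is used. A minimal Snowflake SQL sketch of creating one via a storage integration (all names and the key are hypothetical; note that MASTER_KEY must match the azureKey translator property if encryption is used):
-- One-time setup: a storage integration scoped to the target container
CREATE STORAGE INTEGRATION my_azure_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant_id>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://<storage_account>.blob.core.windows.net/<container>');
-- The external stage whose fully qualified name goes into externalStageName
CREATE STAGE <database>.PUBLIC.my_azure_stage
  URL = 'azure://<storage_account>.blob.core.windows.net/<container>'
  STORAGE_INTEGRATION = my_azure_int
  ENCRYPTION = (TYPE = 'AZURE_CSE' MASTER_KEY = '<base64_key>');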
Session Keepalive
By default, Snowflake terminates open sessions after four hours of inactivity. To prevent this, you can set the JDBC parameter CLIENT_SESSION_KEEP_ALIVE to TRUE:
CALL SYSADMIN.createConnection(name => 'dwh', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>,jdbc-properties="CLIENT_SESSION_KEEP_ALIVE=TRUE"') ;;
CALL SYSADMIN.createDataSource(name => 'dwh', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'supportsNativeQueries=true,uploadMode=s3Load,region=<region>,bucketName=<bucket_name>,keyId=<key_ID>,secretKey="<secret_key>"') ;;
CALL SYSADMIN.setCurrentDWH('dwh', 'PUBLIC') ;;
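Whether the parameter took effect can be checked on the Snowflake side. A sketch, to be run in a session opened through this connection:
-- Shows the session-level value of the keepalive parameter
SHOW PARAMETERS LIKE 'CLIENT_SESSION_KEEP_ALIVE' IN SESSION;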
Miscellanea
Indexes and native statistics are not supported by Snowflake data sources.