AWS S3 Connector
The AWS S3 connector, known by the type name s3, exposes stored procedures to leverage resources stored in AWS S3.
Connector-specific Connection Properties
Name | Description |
---|---|
keyId | S3 key id |
secretKey | S3 secret key |
bucketName | S3 bucket name to work with |
region | S3 region (optional) |
prefix | pathAndPattern prefix to be used when handling files |
Example
CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1, keyId=<id>, secretKey="<secret>", bucketName=dv-redshift-upload-test');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;
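Once the data source has been created, the connector's stored procedures can be called through it. A minimal sketch, assuming the s3alias source created above; the quoted "alias.procedure" call syntax and the wildcard pattern are assumptions here, while listFiles itself is the procedure referenced in the Prefix section below:

-- List every object in the bucket (a NULL pattern matches everything)
CALL "s3alias.listFiles"(pathAndPattern => NULL);;

-- Restrict the listing to a sub-path and file pattern (pattern semantics assumed)
CALL "s3alias.listFiles"(pathAndPattern => 'folder/*.csv');;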
IAM Role Authorization
When IAM Role authorization is configured, the keyId and secretKey connector parameters can be omitted:
CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1, bucketName=dv-redshift-upload-test');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;
Example
This example shows an IAM policy that can be used on the AWS side:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccountLevelS3Actions",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:HeadBucket"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowListAndReadS3ActionOnMyBucket",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::mk-s3-test/*",
                "arn:aws:s3:::mk-s3-test"
            ]
        }
    ]
}
Multi-part Upload
The AWS S3 connector can be configured to perform multi-part uploads using the following properties:
Name | Description | Default value |
---|---|---|
multipartUpload | TRUE for performing multi-part upload (optional) | FALSE |
numberOfThreads | Number of threads for multi-part upload (optional) | 5 |
partSize | Part size for multi-part upload in bytes (optional) | 5 MB |
The partSize can be specified between 5 MB and 5 TB. If the specified value is out of this range, it will be automatically changed to 5 MB or 5 TB, respectively.
Example
CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1,keyId=<id>,secretKey="<secret>",bucketName=dv-redshift-upload-test,multipartUpload=true,partSize=1024,numberOfThreads=5');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;
Prefix
The prefix property enables limiting the result set (see the AWS SDK documentation):
- The prefix property value gets passed in connectionOrResourceAdapterProperties;
- All procedures of the connector automatically take the prefix into consideration (e.g. calling listFiles(pathAndPattern => NULL) still applies the prefix from the data source settings);
- If the data source has a prefix configured and a pathAndPattern gets passed, the values are concatenated. For example, if the data source is configured with the prefix a/b and listFiles(pathAndPattern => 'c/d') is called, the result is a/b/c/d (see the sketch after this list).
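A minimal sketch of the concatenation behavior, assuming the same bucket and credentials as in the earlier examples (the s3prefixed alias and the a/b prefix value are illustrative, and the quoted "alias.procedure" call syntax for listFiles is an assumption):

CALL SYSADMIN.createConnection('s3prefixed', 's3', 'region=eu-west-1, keyId=<id>, secretKey="<secret>", bucketName=dv-redshift-upload-test, prefix=a/b');;
CALL SYSADMIN.createDatasource('s3prefixed', 'ufile', 'importer.useFullSchemaName=false', null);;

-- The configured prefix a/b is prepended to the passed pathAndPattern,
-- so this call effectively lists objects under a/b/c/d
CALL "s3prefixed.listFiles"(pathAndPattern => 'c/d');;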
Ceph Support
Ceph is an open-source distributed storage solution that can be accessed via the S3 API. Please note that for the CData Virtuality S3 connector to work with Ceph, the RADOS Gateway (RGW) service must be configured.
A data source connected to Ceph via S3 API can be configured with the following properties:
Name | Description |
---|---|
endPoint | Mandatory in the case of Ceph; otherwise, the S3 API uses its Amazon endpoints by default |
passStyleAccess | Mandatory in the case of Ceph if DNS is not configured on the server running it; otherwise, by default, the S3 library adds the bucket name to the initial endpoint |
Example
CALL SYSADMIN.createConnection(name => 'test_ceph_rgw', jbossCLITemplateName => 's3', connectionOrResourceAdapterProperties => 'endPoint=<endPoint>,keyId=<keyID>,secretKey=<secretKey>,bucketName=<bucketName>,passStyleAccess=true');;
CALL SYSADMIN.createDataSource(name => 'test_ceph_rgw', translator => 'ufile', modelProperties => 'importer.useFullSchemaName=false', translatorProperties => '');;