Amazon RedShift as Analytical Storage
You are looking at an older version of the documentation. The latest version is found here.
When using Amazon Redshift as analytical storage, keep in mind the following:
- Loading data using S3 (S3LOAD) should be configured for any productive usage, as inserting data into Redshift using standard JDBC protocol can be extremely slow;
- Redshift does not support
BLOB
orCLOB
type. - The maximum length of the
VARCHAR
type is 65534 bytes. Redshift calculates theVARCHAR
length in bytes, whereas most other SQL databases, including the Data Virtuality Server, calculate the size in characters.
That means that thevarchar(X)
field on RedShift is sometimes able to store fewer characters than comparable types on other systems, especially if and when international characters are used.
Please use the translator propertiesvarcharReserveAdditionalSpacePercent and truncateStrings
to configure your Analytical Storage if needed; - Default query concurrency on Redshift - 5 concurrent queries - should be increased for the Data Virtuality Server.
We recommend allowing at least 15 concurrent queries. For heavy loads, an even higher number will be necessary.
Please consult Amazon Redshift documentation for details: how to configure query concurrency on Amazon Redshift. - For optimal operation, Redshift requires that the
VACUUM
andANALYZE
commands are run at regular intervals of time.
This can be achieved by scheduling the following SQL job, for example, to run every night. The queries presume that the Redshift data source is configured to support native queries, which is done by adding"supportsNativeQueries=TRUE"
as a translator property to the data source configuration.
CALL "dwh.native"( "request" => 'COMMIT; VACUUM;' );
CALL "dwh.native"( "request" => 'COMMIT; ANALYZE;' );