Parquet Connector
The Parquet connector enables exporting data in Parquet format to the local filesystem. In the current implementation, the Parquet connector is write-only.
Connector-specific Connection Properties
Property name | Description |
---|---|
path | Path to the folder on the local filesystem where the exported files are written |
Parquet Connector Data Source Creation
CALL SYSADMIN.createConnection(name => <parquetalias>, jbossCLITemplateName => 'parquet', connectionOrResourceAdapterProperties => 'path="path/to/folder"') ;;
CALL SYSADMIN.createDataSource(name => <parquetalias>, translator => 'parquet', modelProperties => null, translatorProperties => null) ;;
Usage
Data is exported by using the SELECT INTO command:
SELECT *
INTO <parquet data source name>.<file name>
FROM ...
The data will be exported into the folder that is specified via the path connection property. The filename is generated according to the following pattern: <parquet data source name>_<name provided in select into command>.parquet. Please note that if a file with this name already exists, it will be overwritten.
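Because an existing file is overwritten without warning, it can help to compute the expected target path before running an export. The following is a minimal Python sketch, not part of the connector: the helper name expected_parquet_path and the sample values are hypothetical, and it assumes the filename pattern exactly as stated above (data source name and target name joined by an underscore).

```python
from pathlib import PurePosixPath


def expected_parquet_path(folder: str, data_source: str, target_name: str) -> str:
    """Return the path an export is expected to produce (hypothetical helper).

    Assumes the documented pattern <data source name>_<target name>.parquet
    inside the folder configured via the `path` connection property.
    """
    filename = f"{data_source}_{target_name}.parquet"
    # PurePosixPath only manipulates the string; it never touches the filesystem.
    return str(PurePosixPath(folder) / filename)


# Hypothetical values, chosen only for illustration.
print(expected_parquet_path("/data/exports", "parquet_export", "orders"))
```

A script running on the same host as the server could check whether this path already exists and rename or archive the old file before triggering the export.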
Example
CALL SYSADMIN.createConnection(name => 'parquet_1', jbossCLITemplateName => 'parquet', connectionOrResourceAdapterProperties => 'path="/home/exportuser/examples"') ;;
CALL SYSADMIN.createDataSource(name => 'parquet_1', translator => 'parquet', modelProperties => null, translatorProperties => null) ;;
SELECT *
INTO parquet_1.example_salesorderdetail
FROM adventureworks.salesorderdetail ;;
As a result of this call, the content of the salesorderdetail table in the adventureworks schema will be exported into the parquet_1.example_salesorderdetail.parquet file in the /home/exportuser/examples folder.
See Also
Parquet File Creation and S3 Storage with Data Virtuality to learn how to take any data source table and create a local Parquet file
Query Parquet Files in Data Virtuality Using Amazon Athena for information on how to read from Parquet