Parquet Connector

The Parquet connector enables exporting data in Parquet format to the local filesystem.

Parquet Connector Data Source Creation

SQL
CALL SYSADMIN.createConnection(name => <parquetalias>, jbossCLITemplateName => 'ufile', connectionOrResourceAdapterProperties => 'ParentDirectory="directory"') ;;
CALL SYSADMIN.createDataSource(name => <parquetalias>, translator => 'parquet', modelProperties => null, translatorProperties => null) ;;

Model Properties

Name | Description | Default value
importer.loadMetadata | When set to TRUE, the data source loads the metadata of the tables that were already present in the folder before the data source was created | FALSE

Usage

Data is exported using the SELECT INTO command:

SQL
SELECT *
INTO <parquet data source name>.<table name> 
FROM ... 

The data is exported into the folder specified by the ParentDirectory connection property. Each table is represented by a folder named according to the following pattern: <parquet data source name>_<table name>.parquet. The folder contains files named like <table name>_<UID>.parquet. When new data is inserted into a table, a new file is created in the respective table folder, so the new rows are added to the table without rewriting the existing files.

You can also create a table using the CREATE TABLE statement. However, the physical file is only created once data is inserted into the table using an INSERT VALUES or INSERT SELECT statement.
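
The CREATE TABLE workflow described above could look roughly like the following sketch. It assumes a Parquet data source named parquet_1 already exists (such as the one created in the Example section). The table example_customers, its columns, and the source table some_source.customers are hypothetical names used only for illustration.

```sql
-- Hypothetical table and column names, for illustration only.
-- Creating the table does not yet produce a physical file.
CREATE TABLE parquet_1.example_customers (id integer, name string) ;;

-- The physical file is created once data is inserted with INSERT VALUES ...
INSERT INTO parquet_1.example_customers VALUES (1, 'Alice') ;;

-- ... or with INSERT SELECT (some_source.customers is an assumed source table).
INSERT INTO parquet_1.example_customers
SELECT id, name
FROM some_source.customers ;;
```

Each insert is expected to add a new file to the table folder, as described for inserts above.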

Example

SQL
CALL SYSADMIN.createConnection(name => 'parquet_1', jbossCLITemplateName => 'ufile', connectionOrResourceAdapterProperties => 'ParentDirectory="/home/exportuser/examples"') ;;
CALL SYSADMIN.createDataSource(name => 'parquet_1', translator => 'parquet', modelProperties => 'importer.loadMetadata=true', translatorProperties => null) ;;
 
SELECT * 
INTO parquet_1.example_salesorderdetail
FROM adventureworks.salesorderdetail ;;

As a result, the content of the salesorderdetail table in the adventureworks schema is exported into a file with a name such as example_salesorderdetail_1e04e8d5-f963-11ed-a1bc-0a0027000003.parquet in the /home/exportuser/examples/parquet_1.example_salesorderdetail.parquet folder.
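
Since reading from Parquet tables is possible as of v.3.9, the exported table can presumably be queried back through the same data source. The following is a minimal sketch of such a check, reusing the names from the example above. The column alias exported_rows is an arbitrary choice.

```sql
-- Count the rows that were written to the Parquet table by the export above.
SELECT COUNT(*) AS exported_rows
FROM parquet_1.example_salesorderdetail ;;
```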

See Also

Parquet File Creation and S3 Storage with Data Virtuality to learn how to take any data source table and create a local Parquet file.

Query Parquet Files in Data Virtuality Using Amazon Athena for information on how to read from Parquet.

Since v.3.9:

  • ufile jbossCLITemplateName is used for creating Parquet data sources;
  • importer.loadMetadata model property is available;
  • Tables are stored in dedicated folders;
  • Files are not re-written when inserting data;
  • Reading from Parquet tables is possible.