The Parquet connector enables exporting data in Parquet format to the local filesystem.

Connector-specific Connection Properties

Property name | Description
path | Path to the local filesystem

Parquet Connector Data Source Creation

CALL SYSADMIN.createConnection(name => <parquetalias>, jbossCLITemplateName => 'ufile', connectionOrResourceAdapterProperties => 'path="path/to/folder"') ;;
CALL SYSADMIN.createDataSource(name => <parquetalias>, translator => 'parquet', modelProperties => null, translatorProperties => null) ;;

Model Properties

Name | Description | Default value
importer.loadMetadata | When set to TRUE, the data source will load the metadata of the tables that were present in the folder prior to data source creation | FALSE

Usage

Data is exported using the SELECT INTO command:

SELECT *
INTO <parquet data source name>.<table name> 
FROM ... 

The data is exported into the folder specified in the path connection property. Each table is represented by a folder named according to the pattern <parquet data source name>_<table name>.parquet, and the folder contains files named <table name>_<UID>.parquet. When new data is inserted into a table, a new file is created in the respective table folder; the existing files are not rewritten, so the new data is appended to the old data.

You can also create a table using the CREATE TABLE statement. However, the physical file will only be created when some data is inserted into this table using the INSERT VALUES or INSERT SELECT statement.
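
The two-step flow described above can be sketched as follows. This is only an illustration: it reuses the parquet_1 data source from the Example section, while the table name example_customers, its columns, and the inserted values are hypothetical, and the integer and string column types are assumed to be accepted by the connector.

```sql
-- Hypothetical table and columns, for illustration only.
-- According to the text above, no physical file exists yet after this statement.
CREATE TABLE parquet_1.example_customers (id integer, name string) ;;

-- The first insert is the point at which a Parquet file is created
-- in the folder that represents the table.
INSERT INTO parquet_1.example_customers (id, name) VALUES (1, 'Alice') ;;
```

An INSERT SELECT statement against the same table should behave the same way, adding one more file to the table folder.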

Example

CALL SYSADMIN.createConnection(name => 'parquet_1', jbossCLITemplateName => 'ufile', connectionOrResourceAdapterProperties => 'path="/home/exportuser/examples"') ;;
CALL SYSADMIN.createDataSource(name => 'parquet_1', translator => 'parquet', modelProperties => 'importer.loadMetadata=true', translatorProperties => null) ;;
 
SELECT * 
INTO parquet_1.example_salesorderdetail
FROM adventureworks.salesorderdetail ;;

As a result of this call, the content of the salesorderdetail table in the adventureworks schema will be exported into the example_salesorderdetail_1e04e8d5-f963-11ed-a1bc-0a0027000003.parquet file in the /home/exportuser/examples/parquet_1.example_salesorderdetail.parquet folder.
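
Because reading from Parquet tables is possible since v.3.9, the exported data can presumably be checked with an ordinary query against the same data source. A minimal sketch, assuming the parquet_1 data source and the example_salesorderdetail table from the example above:

```sql
-- Read back the rows written by the SELECT INTO statement above.
SELECT *
FROM parquet_1.example_salesorderdetail ;;
```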

See Also

Parquet File Creation and S3 Storage with Data Virtuality to learn how to take any data source table and create a local Parquet file

Query Parquet Files in Data Virtuality Using Amazon Athena for information on how to read from Parquet

Since v.3.9:

  • ufile jbossCLITemplateName is used for creating Parquet data sources;
  • importer.loadMetadata model property is available;
  • Tables are stored in dedicated folders;
  • Files are not re-written when inserting data;
  • Reading from Parquet tables is possible.