File-based Connectors
The file-based connectors connect CData Virtuality Server to local and remote file storage systems.
Type name | Description | Specific features |
---|---|---|
ufile | Accessing and managing files on the local filesystem | |
ftp | Accessing and managing files via FTP | |
sftp | Accessing and managing files via SFTP | |
scp | Accessing and managing files via SCP | |
s3 | Accessing and managing files stored in Amazon S3 | |
blob | Accessing and managing files stored in Azure Blob Storage | |
Metadata
Before issuing queries to the file data source, we need to configure the data source using the appropriate CData Virtuality Server procedures:
CALL SYSADMIN.createConnection( name => <alias>, jbossCLITemplateName => <type name>, connectionOrResourceAdapterProperties => '<connector specific setting depending on type>');
CALL SYSADMIN.createDatasource( name => <alias>, translator => 'ufile', modelProperties => '', translatorProperties => '');
The translator has to be ufile for all file-based data sources.
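As an illustration, the two calls might look as follows for a local filesystem source. This is only a sketch under assumptions: the alias `local_files`, the directory `/data/incoming`, and the `ParentDirectory` connection property are placeholders that are not given above, so check the connector-specific settings for your type before relying on them.

```sql
-- Hypothetical alias and directory; ParentDirectory is an assumed property name.
CALL SYSADMIN.createConnection(
    name => 'local_files',
    jbossCLITemplateName => 'ufile',
    connectionOrResourceAdapterProperties => 'ParentDirectory=/data/incoming'
);

-- The translator is ufile regardless of the connection type (ufile, ftp, sftp, scp, s3, blob).
CALL SYSADMIN.createDatasource(
    name => 'local_files',
    translator => 'ufile',
    modelProperties => '',
    translatorProperties => ''
);
```

Note that only `jbossCLITemplateName` and the connection properties change between connector types; the `createDatasource` call keeps the same translator.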
The CData Virtuality Studio provides a convenient way to connect to data sources using graphical wizards. To do so, use the corresponding data source type under the File section in the Add data source wizard.
Usage
File data sources use stored procedures shared by all file-based connectors to retrieve data from their sources. This data can then be processed further by the CData Virtuality Server. This is commonly done with table functions (such as TABLE, TEXTTABLE, and XMLTABLE) in combination with parsing functions, depending on the data structure.
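The combination of a file procedure and a table function can be sketched as below for a comma-separated file with a header row. The alias `local_files`, the file name `customers.csv`, and the column list are hypothetical, and the example assumes that `getTextFiles` returns the file content in a column named `file`; verify the actual result columns on your server.

```sql
-- Sketch only: alias, file name, and columns are hypothetical.
-- Assumes getTextFiles exposes the file content in a column named "file".
SELECT t.id, t.name
FROM (CALL local_files.getTextFiles('customers.csv')) AS f,
     TEXTTABLE(f.file
               COLUMNS id INTEGER, name STRING
               DELIMITER ','
               HEADER 1) AS t;
```

The procedure call supplies one row per file, and TEXTTABLE expands each file's content into rows, so the same query shape would also work when the path matches several files.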
The CData Virtuality Studio provides a variety of Query Builders for that purpose. They make it easy to specify the file encoding and the structure of the data. These Query Builders are accessible via SQL editor -> Tools.
In Amazon S3, buckets and objects are the primary resources, and objects are stored in buckets. Amazon S3 has a flat structure instead of a hierarchy like you would see in a file system. However, for the sake of organizational simplicity, CData Virtuality supports the folder concept as a means of grouping objects. CData Virtuality does this by using a shared name prefix for the grouped objects. In other words, the grouped objects have names that begin with a common string. This common string, or shared prefix, is the folder name.
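To make the prefix idea concrete, suppose a bucket holds objects named `reports/2023/jan.csv` and `reports/2023/feb.csv`. Both names share the prefix `reports/2023/`, so they are presented as files in a folder `reports/2023`. A listing call might then look like the sketch below; the alias `s3_files` and the object names are hypothetical, and passing the path as a single string argument is an assumption modeled on the `CALL <alias>.deleteFile('')` form used in this document.

```sql
-- Hypothetical alias and object names; the argument form is an assumption.
-- Lists the objects whose names start with the shared prefix "reports/2023/".
CALL s3_files.listFiles('reports/2023/*.csv');

-- For S3, wildcards may appear anywhere in the path, not only in the file name.
CALL s3_files.listFiles('reports/*/jan.csv');
```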
Stored Procedures Shared by All File-based Connectors
Procedure name | Input parameter (data type / nulls allowed) | Example call & purpose |
---|---|---|
getFiles | | Retrieves files from the given path. If an extension pattern is specified, it filters the files in the directory referenced by the base path. If no extension pattern is specified and the path is a directory, all files in the directory are returned. Otherwise, the single referenced file is returned. Wildcards are supported; for the S3 and Azure Blob connectors, they can be used anywhere in the file name or path. |
getTextFiles | | Retrieves files from the given path; the same files are returned as with getFiles. Supported wildcards: * for all file-based connectors and ? for the S3 and Azure Blob connectors. For the S3 and Azure Blob connectors, wildcards can be used anywhere in the file name or path. |
saveFile | | Saves a file to the data source. |
listFiles | | Lists all files in the specified directory. Wildcards are supported; for the S3 and Azure Blob connectors, they can be used anywhere in the file name or path. |
deleteFile | | Deletes all files matching the pattern. Note that CALL <alias>.deleteFile('') deletes all files in the directory without further confirmation. |
The listFiles(), getFiles(), and getTextFiles() procedures have supported the wildcards * and ? since v4.0.7.