Skip to main content
Skip table of contents

File-based Connectors

You are looking at an older version of the documentation. The latest version is found here.

The file-based connectors bridge between Data Virtuality Server and local and remote file storage systems. 

Type nameDescriptionSpecific features
ufileAccessing and managing files at local filesystem
ftpAccessing and managing files via FTP
sftpAccessing and managing files via SFTP

scpAccessing and managing files via SCP
s3Accessing and managing files stored at Amazon AWS S3 storage
blobAccessing and managing files stored at Azure Blob Storage

Metadata

Before issuing queries to the file data source, we need to configure the data source using the appropriate Data Virtuality Server procedures:

SQL
CALL SYSADMIN.createConnection( name => <alias>, jbossCLITemplateName => <type name>, 'connectionOrResourceAdapterProperties => '<connector specific setting depending on type>');
CALL SYSADMIN.createDatasource( name => <alias>, translator => 'ufile', modelProperties => '', translatorProperties => '');

(info) The translator has to be ufile for all file-based data sources.

(info) The Data Virtuality Studio provides a comfortable way to connect to data sources using graphical wizards. In order to do so, use the corresponding data source type under the File section in the Add data source wizard.

Usage

File data sources are utilizing stored procedures shared by all file-based connectors to gather data from their sources. These data may be further processed by the Data Virtuality Server. This is commonly done with table functions (like TABLE, TEXTTABLE, and XMLTABLE) in combination with parsing functions depending on a data structure.

(info) The Data Virtuality Studio provides a variety of Query Builders for that purpose. They allow an easy specification for file encoding and structure of data. These Query Builders are accessible via SQL editor -> Tools.

(info) In Amazon S3, buckets and objects are the primary resources, and objects are stored in buckets. Amazon S3 has a flat structure instead of a hierarchy like you would see in a file system. However, for the sake of organizational simplicity, Data Virtuality supports the folder concept as a means of grouping objects. Data Virtuality does this by using a shared name prefix for the grouped objects. In other words, the grouped objects have names that begin with a common string. This common string, or shared prefix, is the folder name.

Stored Procedures Shared by All File-based Connectors

To view the full table, click the expand button in its top right corner


Procedure nameInput parameter (data type / nulls allowed)Example call & purpose
getFiles

pathAndPattern (string / not null)

Retrieves all files as blobs with an optional extension at the given path.

If the extension path is specified, then it will filter all of the files in the directory referenced by the base path.

If the extension pattern is not specified and the path is a directory, all files in the directory will be returned. Otherwise, the single file referenced will be returned.

Supported wildcards:

  • * for all file-based connectors
  • ? for S3 and Azure Blob connectors

(info) For the S3 and Azure Blob connectors, the wildcards can be used anywhere - in the file name or path.

Usage:

SQL
CALL <alias>.getFiles('pathAndPattern')

Example:

SQL
CALL s3.getFiles('marketing/*.csv')
getTextFiles

pathAndPattern (string / not null)

Retrieves all files as clobs with an optional extension at the given path.

All the same files as with getFiles will be retrieved, the only difference is that the results will be clob values.

Supported wildcards:

  • * for all file-based connectors
  • ? for S3 and Azure Blob connectors

(info) For the S3 and Azure Blob connectors, the wildcards can be used anywhere - in the file name or path.


 Usage:

SQL
CALL <alias>.getTextFiles('pathAndPattern')

Example:

SQL
CALL s3.getTextFiles('marketing/*.csv')
saveFile

pathAndPattern (string / not null)

value (clob, blob, xml / not null)

Saves the clob, blob or xml value to a given path. The path should reference a new file location or an existing file to overwrite completely. Both absolute and relative paths are valid as input.

 Usage:

SQL
CALL <alias>.saveFile('pathAndPattern', value)

(info) PathAndPattern has slightly different meaning here than for the other file-based connector procedures. Actually, it's specifying path and filename, not a pattern.

 Example:

SQL
CALL <alias>.saveFile('marketing/july_2016.csv', (SELECT cast(data as xml) FROM report_server.marketing_data_july_2016));

listFiles

pathAndPattern (string / not null)

Lists all files from the specified directory.

Supported wildcards:

  • * for all file-based connectors
  • ? for S3 and Azure Blob connectors

(info) For the S3 and Azure Blob connectors, the wildcards can be used anywhere - in the file name or path.

 Usage:

SQL
CALL <alias>.listFile('pathAndPattern')

Example:

SQL
CALL s3.getTextFiles('marketing/*.csv'))

deleteFiles

pathAndPattern (string / not null)

Deletes all files matching the pattern.

 Usage:

SQL
CALL <alias>.deleteFile('pathAndPattern')


CALL <alias>.deleteFile('') will delete all files in the directory without further confirmation.

The listFiles(), getFiles(), getTextFiles() procedures work with wildcards * and ? since v4.0.7

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.