Optimization Jobs

The optimization jobs perform a full replication of data sources related to the recommended optimization into the Analytical Storage. If the job is scheduled more often than once, a new materialized table will be created and assigned to the optimization with every run. Stale tables, which are outdated because a new replication has been done in the meantime, will be deleted by the cleanUp job.

Here is how to create a job for a recommended optimization:

SQL

SYSADMIN.createOptimizationJob(IN recOptId biginteger, IN allowIndexCreation string, IN gatherNativeStats boolean, IN indexCreationByStatus string, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN jobName string, IN uuid string, IN recOptUuid string, OUT id biginteger NOT NULL RESULT)

This procedure returns the id of the newly created job which can then be used for changing it, if necessary:

SQL

SYADMIN.changeOptimizationJob(IN jobId biginteger, IN allowIndexCreation string, IN gatherNativeStats boolean, IN indexCreationByStatus string, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN uuid string, IN recOptUuid string, OUT id biginteger NOT NULL RESULT)

Both procedures use the following parameters:

To view the full table, click the expand button in its top right corner

Parameter	Description
`recOptId`	ID of the recommended optimization corresponding to the job to be created. `recOptId` or `recOptUuid` or both should be set
`jobId`	Id of the optimization job to be changed (for `changeOptimizationJob`)
`allowIndexCreation`	String representing the types of the indexes that the system must create after completing the replication. See the `ALLOW_INDEX_CREATION` section in Default Options for information on which values are allowed. This parameter is optional and the default value is ALL
`gatherNativeStats`	Flag to define if native statics on the materialized table should be gathered or not when the replication has been completed. This parameter is optional and the default value is `TRUE`
`indexCreationByStatus`	String representing the status that the indexes must have in order to be created after completing the replication. Please refer to Recommended Index Statusfor a list of possible values. The default value is the value of the recommended optimization's `allowIndexCreationByStatus` field.
`parallelRunsAllowed`	Number of parallel runs the job can have; If the number is reached, further starts of the particular job will be ignored. Default: `1`
`retryCounter`	Number of retry attempts before the job is set to `FAILED` state
`retryDelay`	Delay in seconds between the retry attempts
`runTimeout`	Individual job timeout in minutes. If no `runTimeout` is set, configuration set via default value option `JOB_RUN_TIMEOUT` will be considered for the job; if `runTimeout` is reached before the job is finished, the job terminates with the `INTERRUPTED` state
`jobName`	Job's unique identifier; please see jobName for more information
`uuid`	Custom uuid of the job to be created
`recOptUuid`	UUID of the recommended optimization corresponding to the job to be created. `recOptId` or `recOptUuid` or both should be set

Here are some examples with and without optional parameters:

SQL

CALL SYSADMIN.createOptimizationJob(recOptId => 5) 

CALL SYSADMIN.createOptimizationJob(recOptId => 5, allowIndexCreation => 'JOIN, MANUAL') 

CALL SYSADMIN.createOptimizationJob(recOptId => 5, allowIndexCreation => 'NONE', gatherNativeStats => FALSE)

uuid and recOptUuid parameters in SYSADMIN.createOptimizationJob and changeOptimizationJob are available since v4.1

Deleting a Scheduler Job

This special procedures deletes a scheduler job by id or uuid, only one parameter can be used:

SQL

SYSADMIN.deleteSchedulerJob(IN id biginteger, IN uuid string)

uuid parameter in SYSADMIN.deleteSchedulerJob is available since v4.1

Incremental Optimization Jobs

This special type of optimization jobs supports incremental load of data into materialized tables. The CData Virtuality Server updates the rows in the materialized table that corresponds to the evaluation of the newRowCheckExpression parameter, deleting (optionally) and inserting the corresponding rows.

The first run of such a job will be a full replication and include the creation of a new materialized table. It is directly followed by an incremental load, as the first full replication run can take some time and new data may have arrived at the source meanwhile. All further runs will not create new materialized tables: data will be added to the existing one in accordance with the settings of the incremental optimization schedule jobs.

And here is how to create a schedule job for such incremental update:

SQL

SYSADMIN.createIncrementalOptimizationJob(IN recOptId biginteger, IN newRowCheckExpression string NOT NULL, IN identityExpression string, IN deleteOldData boolean NOT NULL, IN gatherNativeStats boolean, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN jobName string, IN uuid string, IN recOptUuid string, OUT id biginteger NOT NULL RESULT)

This procedure returns the id of the newly created job which can then be used for changing it, if necessary:

SQL

SYSADMIN.changeIncrementalOptimizationJob(IN jobId biginteger, IN newRowCheckExpression string NOT NULL, IN identityExpression string, IN deleteOldData boolean NOT NULL, IN gatherNativeStats boolean, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN uuid string, IN recOptUuid string, OUT id biginteger NOT NULL RESULT)

Both procedures use the following parameters:

To view the full table, click the expand button in its top right corner

Parameter	Description
`recOptId`	ID of the recommended optimization corresponding to the schedule job to be created. `recOptId` or `recOptUuid` or both should be set
`jobId`	Id of the incremental optimization job to be changed (for `changeIncrementalOptimizationJob`)
`newRowCheckExpression`	String representing the check expression for the incremental optimization schedule job. All field names used within a `newRowCheckExpression` should be wrapped in double quotes
`identityExpression`	String representing the identity expression for the incremental optimization schedule job expressed in SQL language. If this parameter is null, no identity expressions will be used to delete old rows from the materialized table. All field names used within an `identityExpression` should be wrapped in double quotes
`deleteOldData`	Determines how data which is already in the materialized table and has a matching check expression or a matching identity expression is treated; see below for more information
`gatherNativeStats`	Flag to define if native statics on the materialized table should be gathered or not when the replication has been completed. This parameter is optional and the default value is `FALSE`
`parallelRunsAllowed`	Number of parallel runs the job can have; If the number is reached, further starts of the particular job will be ignored. Default: `1`
`retryCounter`	Number of retry attempts before the job is set to `FAILED` state
`retryDelay`	Delay in seconds between the retry attempts
`runTimeout`	Individual job timeout in minutes. If no `runTimeout` is set, configuration set via default value option `JOB_RUN_TIMEOUT` will be considered for the job; if `runTimeout` is reached before the job is finished, the job terminates with the `INTERRUPTED` state
`jobName`	Job's unique identifier; please see jobName for more information
`uuid`	Custom uuid of the job to be created
`recOptUuid`	UUID of the recommended optimization corresponding to the job to be created. `recOptId` or `recOptUuid` or both should be set

uuid and recOptUuid parameters in SYSADMIN.createIncrementalOptimizationJob and SYSADMIN.changeIncrementalOptimizationJob are available since v4.1

Here is how the deleteOldData parameter works:

To view the full table, click the expand button in its top right corner

`deleteOldData`	identity expression is used	data matching row-check expression will be deleted in materialized table	data no longer present at source and matching row-check expression will be lost in materialized table	"match row-check expression in source" means
`TRUE`	no	yes	yes	the value in the source is equal or greater then the maximal value in materialized table
`TRUE`	yes	no*	no
`FALSE`	no	no	no	the value in the source is greater than the maximal value in materialized table
`FALSE`	yes	no	no

* Technically, some data will be deleted, but only data

- which is still present in the source;
- which has a matching row-check-expression in source;
- whose identity-expression is already present in materialized table.

It will be inserted into the materialized table again with the new values from the source within a replication job, so effectively those rows are updated.

Here are some examples with different values:

SQL

CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR(orderDate)', identityExpression => null, deleteOldData => true)

CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR(orderDate)', identityExpression => 'id', deleteOldData => false)

CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR(orderDate)', identityExpression => 'id', deleteOldData => false, gatherNativeStats => true)

The accessState of the corresponding materialized table (in SYSADMIN.MaterializedTable) will be set to UPDATING during the execution of the incremental update bob to inform the user that an incremental job is running on the current materialized table. The old (non-updated) values from this materialized table will be returned unless the update has finished. If an error occurs during the incremental update, accessState will be set to INCOMPLETE.

If the server is killed when an incremental job is updating a materialized table, its state will be set to INCOMPLETE when the server restarts.

At the moment, we only support incremental updates on a single table (Recommended Optimizations with type MAT_TABLE). If an incremental optimization job for a MAT_JOIN is created, an error message will be shown to the user.

Example: Check Expressions with Non-unique Results

Example: Using an Identity Expression