Optimization Jobs
Optimization jobs perform a full replication of the data sources related to the recommended optimization into the Analytical Storage. If the job is scheduled to run more than once, a new materialized table is created and assigned to the optimization with every run. Stale tables, which have become outdated because a newer replication has completed in the meantime, are deleted by the cleanUp job.
Here is how to create a job for a recommended optimization:
SYSADMIN.createOptimizationJob(IN recOptId biginteger NOT NULL, IN allowIndexCreation string, IN gatherNativeStats boolean, IN indexCreationByStatus string, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN jobName string, OUT id biginteger NOT NULL RESULT)
This procedure returns the id of the newly created job which can then be used for changing it, if necessary:
SYSADMIN.changeOptimizationJob(IN jobId biginteger NOT NULL, IN allowIndexCreation string, IN gatherNativeStats boolean, IN indexCreationByStatus string, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, OUT id biginteger NOT NULL RESULT)
Both procedures use the following parameters:
Parameter | Description |
---|---|
recOptId | ID of the recommended optimization corresponding to the job to be created |
jobId | ID of the optimization job to be changed (for changeOptimizationJob) |
allowIndexCreation | String representing the types of indexes the system must create after completing the replication. See the ALLOW_INDEX_CREATION section in ALLOW_INDEX_RECOMMENDATION & ALLOW_INDEX_CREATION for the allowed values. This parameter is optional; the default value is ALL |
gatherNativeStats | Flag defining whether native statistics on the materialized table should be gathered after the replication has been completed. This parameter is optional; the default value is TRUE |
indexCreationByStatus | String representing the status the indexes must have in order to be created after completing the replication. Please refer to Recommended Index Status for a list of possible values. The default value is taken from the recommended optimization's allowIndexCreationByStatus field |
parallelRunsAllowed | Number of parallel runs the job can have; if this number is reached, further starts of the particular job will be ignored. Default: 1 |
retryCounter | Number of retry attempts before the job is set to the FAILED state |
retryDelay | Delay in seconds between retry attempts |
runTimeout | Individual job timeout in minutes. If no runTimeout is set, the configuration set via the default value option JOB_RUN_TIMEOUT will be used for the job; if runTimeout is reached before the job finishes, the job terminates with the INTERRUPTED state |
jobName | Job's unique identifier; please see jobName for more information |
Here are some examples with and without optional parameters:
CALL SYSADMIN.createOptimizationJob(recOptId => 5)
CALL SYSADMIN.createOptimizationJob(recOptId => 5, allowIndexCreation => 'JOIN, MANUAL')
CALL SYSADMIN.createOptimizationJob(recOptId => 5, allowIndexCreation => 'NONE', gatherNativeStats => FALSE)
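An existing job can be adjusted with changeOptimizationJob in the same way; a sketch, assuming the job created above received the hypothetical id 42:

CALL SYSADMIN.changeOptimizationJob(jobId => 42, gatherNativeStats => FALSE)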
Deleting a Scheduler Job
This special procedure deletes a scheduler job by its id:
SYSADMIN.deleteSchedulerJob(IN id biginteger NOT NULL)
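For example, to delete a previously created job (assuming it received the hypothetical id 42):

CALL SYSADMIN.deleteSchedulerJob(id => 42)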
Incremental Optimization Jobs
This special type of optimization job supports incremental loading of data into materialized tables. The Data Virtuality Server updates the rows in the materialized table that correspond to the evaluation of the newRowCheckExpression parameter, optionally deleting and then inserting the corresponding rows.
The first run of such a job will be a full replication and include the creation of a new materialized table. It is directly followed by an incremental load, as the first full replication run can take some time and new data may have arrived at the source meanwhile. All further runs will not create new materialized tables: data will be added to the existing one in accordance with the settings of the incremental optimization schedule jobs.
And here is how to create a schedule job for such an incremental update:
SYSADMIN.createIncrementalOptimizationJob(IN recOptId biginteger NOT NULL, IN newRowCheckExpression string NOT NULL, IN identityExpression string, IN deleteOldData boolean NOT NULL, IN gatherNativeStats boolean, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, IN jobName string, OUT id biginteger NOT NULL RESULT)
This procedure returns the id of the newly created job which can then be used for changing it, if necessary:
SYSADMIN.changeIncrementalOptimizationJob(IN jobId biginteger NOT NULL, IN newRowCheckExpression string NOT NULL, IN identityExpression string, IN deleteOldData boolean NOT NULL, IN gatherNativeStats boolean, IN parallelRunsAllowed integer, IN retryCounter integer, IN retryDelay integer, IN runTimeout integer, OUT id biginteger NOT NULL RESULT)
Both procedures use the following parameters:
Parameter | Description |
---|---|
recOptId | ID of the recommended optimization corresponding to the schedule job to be created |
jobId | ID of the incremental optimization job to be changed (for changeIncrementalOptimizationJob) |
newRowCheckExpression | String representing the check expression for the incremental optimization schedule job. All field names used within a newRowCheckExpression should be wrapped in double quotes |
identityExpression | String representing the identity expression for the incremental optimization schedule job, expressed in SQL. If this parameter is null, no identity expression will be used to delete old rows from the materialized table. All field names used within an identityExpression should be wrapped in double quotes |
deleteOldData | Determines how data which is already in the materialized table and matches the check expression or the identity expression is treated; see below for more information |
gatherNativeStats | Flag defining whether native statistics on the materialized table should be gathered after the replication has been completed. This parameter is optional; the default value is FALSE |
parallelRunsAllowed | Number of parallel runs the job can have; if this number is reached, further starts of the particular job will be ignored. Default: 1 |
retryCounter | Number of retry attempts before the job is set to the FAILED state |
retryDelay | Delay in seconds between retry attempts |
runTimeout | Individual job timeout in minutes. If no runTimeout is set, the configuration set via the default value option JOB_RUN_TIMEOUT will be used for the job; if runTimeout is reached before the job finishes, the job terminates with the INTERRUPTED state |
jobName | Job's unique identifier; please see jobName for more information |
Here is how the deleteOldData parameter works:
deleteOldData | Identity expression used | Data matching row-check expression deleted from materialized table | Data no longer present at source and matching row-check expression lost from materialized table | "matches row-check expression in source" means |
---|---|---|---|---|
TRUE | no | yes | yes | the value in the source is equal to or greater than the maximal value in the materialized table |
TRUE | yes | no* | no | |
FALSE | no | no | no | the value in the source is greater than the maximal value in the materialized table |
FALSE | yes | no | no | |
* Technically, some data will be deleted, but only data
- which is still present in the source;
- which has a matching row-check expression in source;
- whose identity expression is already present in materialized table.
Such rows are inserted into the materialized table again with the new values from the source during the replication job, so effectively they are updated.
Here are some examples with different values:
CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR("orderDate")', identityExpression => null, deleteOldData => true)
CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR("orderDate")', identityExpression => '"id"', deleteOldData => false)
CALL SYSADMIN.createIncrementalOptimizationJob(recOptId => 159, newRowCheckExpression => 'YEAR("orderDate")', identityExpression => '"id"', deleteOldData => false, gatherNativeStats => true)
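An existing incremental job can be adjusted with changeIncrementalOptimizationJob, which takes a jobId instead of a recOptId; a sketch, assuming the created job received the hypothetical id 77:

CALL SYSADMIN.changeIncrementalOptimizationJob(jobId => 77, newRowCheckExpression => 'YEAR("orderDate")', identityExpression => '"id"', deleteOldData => true)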
The accessState of the corresponding materialized table (in SYSADMIN.MaterializedTable) will be set to UPDATING during the execution of the incremental update job to inform the user that an incremental job is running on the current materialized table. The old (non-updated) values from this materialized table will be returned until the update has finished. If an error occurs during the incremental update, accessState will be set to INCOMPLETE. It will also be set to INCOMPLETE when the server restarts.
At the moment, we only support incremental updates on a single table (recommended optimizations of type MAT_TABLE). If an incremental optimization job is created for a MAT_JOIN, an error message will be shown to the user.