Create run configuration

Data Integrity Suite

Copyright 2000-2025

A run configuration defines source and target datasets as well as the pipeline engine that runs a pipeline. You can choose to immediately run the pipeline when you create a run configuration.

  1. On the main navigation menu, click Quality > Pipelines.
  2. Find the pipeline for which you want to create a new run configuration.
  3. In the Name column, click the ellipsis, then click Edit.
  4. Click Run.
  5. Click + Create New Run Configuration. This expands the New Run Configuration settings.
  6. Configure settings for the new run configuration.
    • Name: Provide a meaningful name for the run configuration that reflects its purpose or the type of data processing it will perform.
    • Type: There are two types of run configurations that can be configured:
      • Batch: The batch type is ideal for processing datasets in bulk.
      • Service: The service type exposes the pipeline as a real-time API for on-demand execution.
        Note: When creating a service-based run configuration, only pipelines with one input and one output can be deployed as a service.
    Batch run configuration
    Pipeline Engine: Select the pipeline engine on which you want to process the data. You can review the list of supported combinations to make your selection.
    Warning: When choosing a pipeline engine, ensure that the connection of the pipeline engine is compatible with the source dataset.
    Note: When creating a new run configuration during the on-boarding flow for a connection, a compatible pipeline engine is automatically selected by default.
    Inputs
    • Input Name: A unique name for each input to identify it in the configuration.
    • Source Dataset: Select the dataset from which the data will be sourced. A list of connections compatible with the pipeline engine is provided to assist with your selection.
      Note: If an invalid connection is selected, an error will occur. While you won’t be able to schedule or manually execute invalid run configurations, you will still be able to save them.
    Outputs
    • Output Name: A name that identifies the result of the data processing.
    • Target Options: Choose the processing options that apply to the output data. Here are the options you can choose from:
      • Append: Output is appended to columns in the target dataset if the pipeline output schema matches the target dataset schema.
      • Overwrite Truncate: Clears the target dataset and writes new data to it, after first checking that the pipeline output schema matches the target dataset schema.
      • Overwrite Drop: Deletes and recreates the target dataset, writing new data without matching or verifying the schema with the pipeline output.
    • Target Dataset: Select the target dataset where the processed data will be stored. Ensure that the target location is compatible with the output specifications and accessible by the pipeline engine.
    Note: When setting up an output for a Databricks pipeline, ensure that the output dataset table name is specified in lowercase letters only.
    Service run configuration
    Limited Availability: This feature is currently available only in select workspaces and might be subject to change before general availability.
    Description: Provide a meaningful description for the run configuration that reflects its purpose.
    Service Identifier URL: By default, the Service Identifier URL is automatically populated with the pipeline name. The system checks for duplicates and prevents deployment if a matching URL already exists. After deployment, the generated Service URL becomes available for use.
    Preview and Copy URL: Once the service is deployed, the Service URL is displayed in the form https://pipeline-realtime-dev.dqcore.cloud.precisely.services/v1/api/service/serviceName and additional actions become available:
    • Preview Service: Open the service details in the Swagger page, showing the available endpoints.
    • Copy URL: Copies the generated Service URL so you can easily access it or open it in the Swagger page.
    The Swagger page displays all deployed services, and you can test or execute the selected service directly by clicking the POST button.
  7. Click Create to save the settings for the new run configuration, or click Create and Run Pipeline to save the settings and immediately run the pipeline.
  8. Alternatively, you can select Create or Create and Deploy to create a new service. Once it is successfully created, a Service tag will be displayed at the top of your pipeline canvas, indicating its availability.
  9. Once batch or service run configurations are created successfully, you can find them listed under Run or Deploy.
  10. You can now directly click the Run or Deploy button to execute any batch configuration or deploy any service.
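
The three target options described in step 6 differ in whether existing rows are kept and whether the schema is checked. The following is a minimal sketch of that behavior, not the product's implementation: `Table`, `TargetOption`, and `write_output` are hypothetical names, the target dataset is modeled as an in-memory list of rows, and a schema "match" is assumed to mean identical column names in the same order.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class TargetOption(Enum):
    APPEND = "append"      # keep existing rows, add new ones
    TRUNCATE = "truncate"  # clear rows, keep the dataset and its schema
    DROP = "drop"          # delete and recreate the dataset


@dataclass
class Table:
    """Hypothetical in-memory stand-in for a target dataset."""
    schema: tuple[str, ...]
    rows: list[tuple] = field(default_factory=list)


def write_output(target: Table, out_schema: tuple[str, ...],
                 out_rows: list[tuple], option: TargetOption) -> Table:
    """Return the target dataset as it looks after the write."""
    if option is TargetOption.DROP:
        # Drop and recreate: the old schema is discarded without any check.
        return Table(schema=out_schema, rows=list(out_rows))

    # Append and truncate both require the output schema to match the target.
    if out_schema != target.schema:
        raise ValueError(
            f"output schema {out_schema} does not match target schema {target.schema}"
        )
    if option is TargetOption.TRUNCATE:
        target.rows.clear()
    target.rows.extend(out_rows)
    return target
```

For example, writing rows with schema `("id", "city")` to a target whose schema is `("id", "name")` raises an error under the append and truncate options, but succeeds under the drop option and leaves the target with the new schema.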

Note: Views and Target Dataset Configuration in Databricks
  • Views under Default Catalog: Views from the default catalog will appear as options for the Target Dataset when creating a Run Configuration. However, selecting a view from the default catalog as the Target Dataset will result in job execution failure in Databricks.

    Recommendation: Avoid selecting views from the default catalog for the Target Dataset due to this Databricks limitation.

  • Views under Custom Catalog: Views from a custom catalog are not displayed as options for the Target Dataset during Run Configuration.
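
The Databricks notes on this page (lowercase output table names, and no default-catalog views as targets) can be expressed as a small pre-flight check. This is an illustrative sketch only: `TargetCandidate` and `databricks_target_problems` are hypothetical names, not part of any Data Integrity Suite or Databricks API, and "lowercase letters only" is assumed to mean that the name contains no uppercase characters.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TargetCandidate:
    """Hypothetical description of a Databricks target dataset choice."""
    table_name: str
    is_view: bool = False
    in_default_catalog: bool = False


def databricks_target_problems(target: TargetCandidate) -> list[str]:
    """List the reasons a target is likely to cause trouble; empty means none found."""
    problems = []
    # Assumption: any uppercase character violates the lowercase-name rule.
    if target.table_name != target.table_name.lower():
        problems.append("output table name must be lowercase")
    # Default-catalog views are selectable, but the job fails at execution.
    if target.is_view and target.in_default_catalog:
        problems.append("views from the default catalog fail at job execution")
    return problems
```

A check like this could be run against a planned target before the run configuration is created, since the default-catalog view problem otherwise only surfaces when the job runs.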

Browse or edit existing run configuration

You can browse and edit existing run configurations for a pipeline. You can only edit a run configuration if your Data Integrity Suite administrator has granted you permission to do so.

  1. On the main navigation menu, click Quality > Pipelines.
  2. On the Pipelines tab, find the pipeline for which you want to browse or edit existing run configurations.
  3. Alternatively, you can manage configurations directly from the Pipeline list page, where you can run, deploy, edit, or delete the service.
  4. Click the ellipsis, and then click Edit. Alternatively, you can click the pipeline Name to edit a pipeline.
  5. Click the Settings button. This expands the Data Quality Pipeline Settings.
  6. On the Run Configurations tab, click the run configuration that you want to view or edit. You can only edit a pipeline if you are authorized to do so by the Data Integrity Suite administrator.
  7. Make any necessary changes to the run configuration settings.
  8. Click Save Changes.
For more information on running a quality pipeline, refer to Run a quality pipeline.
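
A saved run configuration is not necessarily one that can be executed. Assuming the constraints described for new run configurations also apply after an edit, the sketch below separates "can be saved" from "can be run or deployed". All names (`RunType`, `RunConfiguration`, `execution_blockers`) are hypothetical, and the compatibility check is reduced to a boolean flag.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RunType(Enum):
    BATCH = "Batch"
    SERVICE = "Service"


@dataclass
class RunConfiguration:
    """Hypothetical, simplified view of a run configuration."""
    name: str
    run_type: RunType
    input_count: int
    output_count: int
    # True when the source connection is compatible with the pipeline engine.
    connection_compatible: bool = True


def execution_blockers(cfg: RunConfiguration) -> list[str]:
    """Reasons the configuration cannot be run or deployed.

    None of these prevent saving: an invalid configuration can still be
    saved, it just cannot be scheduled or executed.
    """
    blockers = []
    if not cfg.connection_compatible:
        blockers.append("source connection is not compatible with the pipeline engine")
    if cfg.run_type is RunType.SERVICE and (cfg.input_count, cfg.output_count) != (1, 1):
        blockers.append("a service requires exactly one input and one output")
    return blockers
```

An empty list means the configuration is ready to run or deploy; a non-empty list explains what to fix before the Run or Deploy button would succeed.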

Browse or edit existing service run configuration

Limited Availability: This feature is currently available only in select workspaces and might be subject to change before general availability.
You can browse and edit existing service run configurations for a pipeline. If the service run configuration is created successfully, you can find it listed under the Run or Deploy tab. You can only edit a service run configuration if your Data Integrity Suite administrator has granted you permission to do so.
Warning: Once the service is deployed, any modifications to the pipeline will cause the service to be out of sync. To synchronize the service with the pipeline, it will need to be redeployed.
  1. To configure your deployed service, go to Settings or update it directly from the Pipeline list page, where you can run, deploy, edit, or delete the service.
  2. Under Services, you will find a list of available services for the pipeline, along with their respective statuses.
  3. From this service panel, you can deploy, edit, or undeploy any service, or redeploy it to synchronize it with the pipeline.
  4. To delete an existing service run configuration, click the ellipsis next to the service name in the service list.
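
The service lifecycle described on this page (duplicate identifiers block deployment, pipeline changes put a deployed service out of sync, redeploying restores sync) can be modeled as a small state sketch. This is an assumption-laden illustration, not the product's behavior in detail: `ServiceRegistry`, `ServiceStatus`, and the integer pipeline version are hypothetical, and the actual status names shown in the service list may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ServiceStatus(Enum):
    NOT_DEPLOYED = "not deployed"
    IN_SYNC = "in sync"
    OUT_OF_SYNC = "out of sync"


@dataclass
class ServiceRegistry:
    """Hypothetical registry mapping a service identifier to the pipeline
    version that was current when the service was last deployed."""
    deployed_version: dict[str, int] = field(default_factory=dict)

    def deploy(self, identifier: str, pipeline_version: int) -> None:
        # A matching identifier already in use prevents deployment.
        if identifier in self.deployed_version:
            raise ValueError(f"service identifier {identifier!r} is already in use")
        self.deployed_version[identifier] = pipeline_version

    def redeploy(self, identifier: str, pipeline_version: int) -> None:
        # Redeploying picks up the current pipeline and restores sync.
        if identifier not in self.deployed_version:
            raise KeyError(identifier)
        self.deployed_version[identifier] = pipeline_version

    def undeploy(self, identifier: str) -> None:
        self.deployed_version.pop(identifier, None)

    def status(self, identifier: str, pipeline_version: int) -> ServiceStatus:
        deployed = self.deployed_version.get(identifier)
        if deployed is None:
            return ServiceStatus.NOT_DEPLOYED
        # Any pipeline change after deployment leaves the service out of sync.
        if deployed != pipeline_version:
            return ServiceStatus.OUT_OF_SYNC
        return ServiceStatus.IN_SYNC
```

In this model, editing the pipeline corresponds to incrementing the pipeline version, which is why a deployed service reports out of sync until it is redeployed.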