Run quality pipeline

Data Integrity Suite

© 2000, 2025 Precisely

Before running a quality pipeline, ensure that all prerequisites are met so that the pipeline can execute without errors.

Prerequisites

These prerequisites assume that you have already built a quality pipeline and that it is ready to run. Complete the following to ensure everything is set up correctly:
  • Validate the pipeline: Fix any issues flagged in the pipeline, particularly invalid transformation step configurations and input dataset errors.
  • Check for a valid data subscription: Some transformation steps, such as Enrich, Identify Country, and Verify & Geocode Address, require a data subscription. You can add these steps during pipeline design, but a valid subscription is necessary for the pipeline to run successfully.
  • Set up an agent: This step is applicable only if you intend to run the pipeline on-premises. Create an agent that will be utilized in the run configuration settings of the pipeline.
  • Create a pipeline engine: The pipeline engine is the processing engine that executes the pipeline. Create one using supported connections.
  • Set up run configuration settings: Every pipeline requires run configuration settings. These settings define the target dataset where pipeline changes are applied and the pipeline engine used for execution.
  • Filter sensitive table data using row filters and column masks: To filter sensitive table data, ensure that your workspace is configured for serverless compute and that the instance pool uses Databricks Runtime 15.4 LTS or later.
    Note: This guideline applies specifically to running jobs in a Databricks environment.
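The Databricks Runtime requirement above can be verified before a job is submitted. A minimal sketch, assuming the runtime version is available as a string in Databricks' usual `<major>.<minor>` form (e.g. "15.4.x-scala2.12"); the helper name is illustrative, not part of the product:

```python
import re

def supports_row_filters(runtime_version: str) -> bool:
    """Return True if a Databricks Runtime version string (e.g.
    "15.4.x-scala2.12" or "14.3 LTS") is 15.4 or later, the minimum
    for row filters and column masks per the prerequisite above."""
    match = re.match(r"(\d+)\.(\d+)", runtime_version)
    if not match:
        raise ValueError(f"Unrecognized runtime version: {runtime_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (15, 4)
```

Comparing `(major, minor)` as a tuple avoids the classic string-comparison bug where "15.10" would sort before "15.4".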

To run a quality pipeline:

  1. Navigate to Quality > Pipelines and find the pipeline you want to run.
  2. In the Name column, click the ellipsis and select Edit. Then, click Run.
  3. Click + Create New Run Configuration. This expands the New Run Configuration settings.
  4. Configure settings for the new run configuration.
  5. Click Create to save the settings for the new run configuration, or click Create and Run Pipeline to save the settings and immediately run the pipeline.
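Conceptually, the run configuration created in the steps above pairs a target dataset with a pipeline engine, and an on-premises run additionally needs an agent. The sketch below models that relationship for illustration only; the class and field names are hypothetical, not the product's API:

```python
from dataclasses import dataclass

@dataclass
class RunConfiguration:
    """Illustrative model of a run configuration: the target dataset
    where pipeline changes are applied, plus the engine that runs it."""
    name: str
    target_dataset: str
    pipeline_engine: str
    on_premises: bool = False  # an on-premises run also requires an agent
    agent: str = ""

    def missing_prerequisites(self) -> list[str]:
        """Return the prerequisites still unmet (empty list means ready)."""
        problems = []
        if not self.target_dataset:
            problems.append("target dataset is not set")
        if not self.pipeline_engine:
            problems.append("pipeline engine is not set")
        if self.on_premises and not self.agent:
            problems.append("on-premises run requires an agent")
        return problems
```

For example, a cloud run with both fields set reports no missing prerequisites, while an on-premises run without an agent reports one.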

Adhering to these prerequisites ensures that the quality pipeline runs without errors. You can view job details on the Quality > Jobs page.