Pipelines

Data Integrity Suite

Product guide | en-US | Copyright 2000-2026

A Data Quality pipeline can be executed in any supported processing environment. Every pipeline run or catalog connection is logged as a job in Data Integrity Suite.

Once you have created and tested a pipeline using a sample dataset, you can set up a run configuration. A run configuration is used to process data from a data source; it defines the processing environment as well as the input and output data assets. Data Quality pipeline jobs are defined by a pipeline run configuration, which you create and edit on the Pipeline Editor page. A run configuration specifies:

  • Connection
  • Pipeline engine
  • Source dataset and target dataset
  • Target options
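The four items above can be pictured as a simple record. The sketch below is illustrative only: the field names and the `missing_fields` helper are assumptions for this example, not the Suite's actual API or schema.

```python
from dataclasses import dataclass, fields

# Illustrative model only: these fields mirror the four items a run
# configuration specifies. They are not the Suite's real data model.
@dataclass
class RunConfiguration:
    connection: str        # connection to the data source
    pipeline_engine: str   # processing environment that executes the pipeline
    source_dataset: str    # input data asset
    target_dataset: str    # output data asset
    target_options: dict   # e.g. write behavior (hypothetical options)

    def missing_fields(self) -> list[str]:
        """Return the names of any unset (falsy) fields."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

cfg = RunConfiguration("s3-prod", "spark-cluster", "customers_raw", "", {})
print(cfg.missing_fields())  # the fields left blank in this example
```

A check like this only catches blank fields; as described later, a job can still fail at run time if a referenced connection or engine has been deleted.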

View pipeline jobs

Limited Availability: This feature is currently available only in select workspaces and might be subject to change before general availability.

To see data quality jobs, navigate to Quality > Jobs from the main navigation menu and go to the Pipelines tab. On this tab, you can do the following:

  • Click a column heading to sort jobs in ascending or descending order by the values in that column. Click the Filter button in a column heading to filter jobs by column values or to clear an existing filter.
  • Select the check box next to a job's ID to mark the job for filtering or deletion.
  • Click the Refresh button on the toolbar to refresh entries in the table.
The Pipelines tab displays the following columns:

ID: An integer assigned sequentially in the order that quality jobs are started. Click the ellipsis to Delete the job or generate a Quick Run.
Pipeline: The pipeline on which the quality job was run. Click the pipeline name to preview the pipeline definition page.
Run Configuration: The environment in which the quality job was run.
Start Time: The date and time at which the quality job started.
Duration: The time taken to complete the quality job, in HH:MM:SS format.
User: The user name associated with the quality job.
Status: The current status of the quality job. The available statuses are:
  • Ready
  • Pending
  • Running
  • Paused
  • Successful
  • Failed
  • Terminating
  • Cancelled
  • Unknown
Tip: You can use the Filter and Refresh controls to quickly find or update job entries.
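If you monitor jobs programmatically, the statuses above split into in-progress and terminal states. The sketch below assumes that grouping; the status strings come from this page, but the `is_finished` helper is a hypothetical client-side utility, not part of the Suite.

```python
# Status strings as listed in the table above.
TERMINAL = {"Successful", "Failed", "Cancelled"}
IN_PROGRESS = {"Ready", "Pending", "Running", "Paused", "Terminating"}

def is_finished(status: str) -> bool:
    """True once a job reaches a terminal state. 'Unknown' is treated as
    unfinished so a caller keeps polling rather than assuming completion."""
    if status not in TERMINAL | IN_PROGRESS | {"Unknown"}:
        raise ValueError(f"unrecognized status: {status}")
    return status in TERMINAL

print([is_finished(s) for s in ("Running", "Successful", "Unknown")])
# [False, True, False]
```

Treating "Unknown" as non-terminal is a deliberately conservative choice for this sketch; a real monitor might instead alert on it after a timeout.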

What causes a pipeline job to fail

Any of the following issues with a run configuration will cause a job to fail:

  • The schema of the source dataset does not match the schema of the pipeline input.
  • The schema of the target dataset does not match the schema of the pipeline output.
  • The connection is deleted. The connection specified by the run configuration must be available when a job runs; deleting a connection renders invalid any run configurations that reference it.
  • The pipeline engine is deleted. The pipeline engine specified by a run configuration must be available when a job runs; deleting a pipeline engine renders invalid any run configurations that reference it.
Warning: Ensure that all referenced connections and pipeline engines exist before running a job to avoid failures.
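The four failure causes above amount to a pre-flight check you could run before submitting a job. The sketch below is a hedged illustration: the function name, dictionary keys, and schema representation are all assumptions made for this example; the Suite performs its own validation when the job actually runs.

```python
def check_run_config(config: dict, available_connections: set,
                     available_engines: set) -> list[str]:
    """Return the problems that would cause a pipeline job to fail,
    mirroring the four failure causes listed above."""
    problems = []
    if config["source_schema"] != config["pipeline_input_schema"]:
        problems.append("source dataset schema does not match pipeline input")
    if config["target_schema"] != config["pipeline_output_schema"]:
        problems.append("target dataset schema does not match pipeline output")
    if config["connection"] not in available_connections:
        problems.append("connection has been deleted or is unavailable")
    if config["engine"] not in available_engines:
        problems.append("pipeline engine has been deleted or is unavailable")
    return problems

cfg = {
    "source_schema": ["id", "name"], "pipeline_input_schema": ["id", "name"],
    "target_schema": ["id", "score"], "pipeline_output_schema": ["id", "score"],
    "connection": "s3-prod", "engine": "spark-old",
}
print(check_run_config(cfg, {"s3-prod"}, {"spark-new"}))
# ['pipeline engine has been deleted or is unavailable']
```

An empty result means none of the four documented failure causes apply; it does not guarantee the job will succeed for other reasons (for example, a transient environment error).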