General Setup

This step specifies the pipeline's general information: its name, description, replication type, and related options. This foundational information sets the context for the entire pipeline, enabling clear identification and configuration throughout its lifecycle.
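
For orientation, the general settings gathered in this step can be pictured as a simple record. The sketch below is a Python illustration; the structure and field names are assumptions, not the Suite's actual schema or API.

    from dataclasses import dataclass, field
    from typing import Optional

    # Illustrative only: field names are assumptions, not the Suite's schema.
    @dataclass
    class ReplicationPipeline:
        name: str                              # identifies the pipeline within its project
        description: Optional[str] = None      # free-form notes
        replication_type: str = "Synchronize"  # Synchronize | Replicate | Copy | Audit
        replication_options: dict = field(default_factory=dict)
        copy_options: dict = field(default_factory=dict)

    pipeline = ReplicationPipeline(
        name="orders_to_snowflake",
        description="Bulk copy plus ongoing replication of the ORDERS schema",
    )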

  • Replication pipeline: Provide a name for the replication pipeline. Every project contains one or more replication pipelines. The pipeline properties you configure determine what happens when the pipeline is run or started: source data is moved to a target database or file system in bulk, or change data capture and replication processes are run.
  • Description: Additional information about the replication pipeline.
  • Type: Choose the replication type (a conceptual sketch contrasting the four types follows these descriptions):
    • Synchronize: Copies a snapshot of a set of source tables from the database source and distributes it to the target. Data changed during the copy process is captured. When the copy process finishes, a change-based replication begins, applying all changes made during the copy up through the time the copy finished. Thereafter, changes made to the source data are detected, captured, and replicated to the Kafka or Snowflake target, synchronizing the table column data.
    • Replicate: Detects changes made to data, captures the changes, and applies them to the Kafka or Snowflake target.
    • Copy: Copies data extracted from the database source and distributes it to the target. Not supported for pipelines that use Db2 for IBM i data source connections.
    • Audit: Provides an audit log of all changes made to the source without altering the target dataset. Changes to the source data are detected and captured, but rather than being applied to the target, information about each change is recorded in the target dataset. This includes details such as whether the change on the source was an insert, update, or delete, as well as a timestamp of when the change occurred.
      Note:
      • This type of replication is available only for journal mapping.
      • Target tables in audit pipelines should not have primary keys. If primary keys are required, they should be based on values that are NOT part of the source table, such as a unique row metadata value. This is because, in audit pipelines, rows are always inserted into the target table and never updated.
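
A minimal sketch of how the four types relate, using hypothetical stub functions in place of the Suite's internals (snapshot_copy and capture_changes are invented for illustration):

    # Conceptual sketch; the helpers are hypothetical stubs, not the Suite's API.
    def snapshot_copy(source, target):
        print(f"bulk copy {source} -> {target}")

    def capture_changes(source):
        # Stand-in for change data capture; yields (operation, row, timestamp).
        yield ("update", {"order_id": 42}, "2025-01-01T00:00:00Z")

    def run_pipeline(kind, source, target):
        if kind == "Copy":                # one-time bulk copy only
            snapshot_copy(source, target)
        elif kind == "Replicate":         # capture and apply changes only
            for op, row, ts in capture_changes(source):
                print(f"apply {op} {row} to {target}")
        elif kind == "Synchronize":       # bulk copy, then ongoing change capture
            snapshot_copy(source, target)
            for op, row, ts in capture_changes(source):
                print(f"apply {op} {row} to {target}")
        elif kind == "Audit":             # record each change; target rows are only inserted
            for op, row, ts in capture_changes(source):
                print(f"insert audit row: {op} at {ts}: {row}")

    run_pipeline("Synchronize", "ORDERS", "SNOWFLAKE.ORDERS")
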
  • Replication Options:
    • Transaction error mode: Select what happens when an error occurs while the replication pipeline transfers data to the target and the error does not affect the connection to the target server (a sketch of the two modes follows this list):
      • Delete from queue and continue: Processing continues despite the error. The statement and record that caused the error are copied to the log and then deleted from the replication backlog tables.
      • Shut down apply component: The errored replication pipeline transaction is rolled back, the transfer of data to the target stops, and all connections are shut down. The source server's replication backlog tables remain in the state that the last successful transfer left them.
    • Conflict resolution between source and target (DBMS only): Select one of the following options, depending on how you want to manage and resolve conflicts between source and target data found during data replication activities:
      • Source data is correct. Update the target: Selected by default.
      • Target data is correct. Do not update the target.
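
As noted above, the two transaction error modes reduce to skip-and-log versus fail-fast. The sketch below is a hedged Python illustration; the class and function names are invented, not the Suite's internals:

    import logging

    class TargetError(Exception):
        """Stand-in for an error that does not break the target connection."""

    def apply_transactions(backlog, apply_fn, mode="delete_and_continue"):
        for txn in list(backlog):
            try:
                apply_fn(txn)
                backlog.remove(txn)          # applied successfully; clear from backlog
            except TargetError as err:
                if mode == "delete_and_continue":
                    # Log the failing statement/record, delete it from the
                    # replication backlog tables, and keep processing.
                    logging.error("skipped %r: %s", txn, err)
                    backlog.remove(txn)
                else:  # "shut_down_apply"
                    # Stop transferring; the backlog tables remain in the
                    # state the last successful transfer left them.
                    raise
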
  • Copy Options:
    • Copy method
      • Use load mode when possible: This option is available exclusively for SQL Server, Oracle, and Teradata, and utilizes the database's fast load utility to enhance data loading performance. Additionally, it allows data to be copied in batches of 100,000 rows.
        Note: The loader files are automatically cleaned up when the load process is successful. However, they will remain if an error occurs during the load or if debug mode is enabled. In the event of an error, the loader files will be cleaned up once the copy operation runs successfully during the next execution.
    • Error mode: Select the type of error mode to use when a non-recoverable error occurs for a request to the target connection.
      • Terminate run (default): Terminates the request.
      • Skip record: Continues to the next target row.
      • Next table: Continues to the next table in the request.
    • Source isolation level: Select the transaction isolation level to set at the source server (a mapping to the standard SQL isolation levels appears at the end of this topic):
      • Read committed (default): Extracts only committed changes. Extracted rows can be updated by other concurrent transactions; however, the extracted data will contain only committed changes.
      • Read uncommitted: The extracted data may include uncommitted changes from running transactions. This option may improve processing performance.
      • Repeatable read: Extracts only committed changes and ensures that the rows being extracted are not updated during the extract transaction. Other concurrent transactions may add new rows that satisfy a search condition of the extraction; however, the extract transaction will not include those rows.
      • Serializable: Extracts only committed changes and ensures that during the extract transaction rows being extracted cannot be updated by other concurrent transactions and new rows that satisfy a search condition of the extraction cannot be added by other concurrent transactions.
    • Source locking: Select the table locking approach to use at the source during the copy process.
      • By individual table (default): The process accesses and locks the source tables selected for copying one at a time, committing and unlocking each table as soon as it has been processed.
      • All tables: The process accesses and locks the source tables selected for copying one at a time, but commits and unlocks them only after all of the tables have been processed.
    • Source table retrieval order: Select the order to copy data tables.
      • Retrieve in any order (default): Copies tables in the order they are received.
      • Retrieve in order of mappings: Copies parent tables before their child tables. Select this option when the tables being copied have referential integrity constraints.
    • Conflict resolution between source and target (DBMS only): Select one of the following options, depending on how you want to manage and resolve conflicts between source and target data found during data copy activities (a SQL sketch of these options appears at the end of this topic).
      • Clear the target table before copying. Records not on the source will be lost. Selected by default.
      • Only copy records that are not on the target.
      • Source record is correct. Update the target.
      • Target record is correct. Only update the mapped columns.
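
As referenced under Source isolation level, the four options correspond to the standard SQL transaction isolation levels. A hedged mapping, sketched in Python (the exact SET syntax varies by database, and the Suite issues the database-specific equivalent):

    # Illustrative mapping only; comments summarize the guarantees described above.
    ISOLATION_SQL = {
        "Read committed":   "SET TRANSACTION ISOLATION LEVEL READ COMMITTED",    # default
        "Read uncommitted": "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED",  # may include dirty reads
        "Repeatable read":  "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ",   # rows stable; phantoms possible
        "Serializable":     "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE",      # no updates, no phantoms
    }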
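
Likewise, as referenced under Conflict resolution for copies, the four options imply roughly the following generic SQL, sketched against hypothetical tables src and tgt keyed by pk. This is an interpretation for orientation only; the Suite generates the actual statements for each database:

    # Hedged interpretation; table, column, and key names are hypothetical.
    COPY_CONFLICT_SQL = {
        # Clear the target table before copying (default): target-only rows are lost.
        "clear_target_first": [
            "TRUNCATE TABLE tgt",
            "INSERT INTO tgt SELECT * FROM src",
        ],
        # Only copy records that are not on the target.
        "copy_missing_only": [
            "INSERT INTO tgt SELECT * FROM src s"
            " WHERE NOT EXISTS (SELECT 1 FROM tgt t WHERE t.pk = s.pk)",
        ],
        # Source record is correct: update matching rows, insert the rest.
        "source_is_correct": [
            "MERGE INTO tgt t USING src s ON t.pk = s.pk"
            " WHEN MATCHED THEN UPDATE SET t.col = s.col"
            " WHEN NOT MATCHED THEN INSERT (pk, col) VALUES (s.pk, s.col)",
        ],
        # Target record is correct: touch only the mapped columns of matching rows.
        "target_is_correct": [
            "UPDATE tgt SET col ="
            " (SELECT s.col FROM src s WHERE s.pk = tgt.pk)"
            " WHERE EXISTS (SELECT 1 FROM src s WHERE s.pk = tgt.pk)",
        ],
    }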