About continuous replication components

Components are fundamental elements of continuous replication that facilitate data transfer and synchronization between systems. Each component plays an important role in ensuring data integrity and operational efficiency. The main components include:

Replication projects

A replication project is a structured framework that includes multiple replication pipelines designed to manage data across your datasets. These projects can operate in both test and production environments, facilitating the following key functions, illustrated in the sketch after this list:

  1. Data change capture: The project captures changes from source systems.
  2. Data retrieval: It retrieves modified data for processing.
  3. Data application: The project applies the captured data to target systems.
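
The Data Integrity Suite performs these steps internally; as a purely illustrative sketch (the class, function, and method names below are hypothetical, not part of the product API), the capture-retrieve-apply cycle can be modeled as:

```python
# Illustrative sketch only: models the capture -> retrieve -> apply cycle
# of a replication project. All class, function, and method names here are
# hypothetical and are not part of the Data Integrity Suite API.
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Change:
    table: str
    operation: str  # "insert", "update", or "delete"
    key: dict
    values: dict


def capture_changes(source) -> Iterable[Change]:
    """Step 1: capture committed changes from the source system's log."""
    yield from source.read_log()


def apply_changes(target, changes: Iterable[Change]) -> None:
    """Steps 2 and 3: retrieve the modified data and apply it to the target."""
    for change in changes:
        if change.operation == "insert":
            target.insert(change.table, change.values)
        elif change.operation == "update":
            target.update(change.table, change.key, change.values)
        else:  # delete
            target.delete(change.table, change.key)
```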

Monitoring and status: Once a replication project is initiated, you can monitor the following aspects, as illustrated in the sketch after this list:

  • Process status: View the current status of capture, replication, and log reader processes.
  • Server processes: Check the status of server processes associated with the project.
  • Data backlog: Monitor the number of data rows in the backlog waiting for replication.
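
As a minimal sketch of what such monitoring amounts to (the project object and its status fields below are hypothetical, not the product's actual interface), a simple polling loop could report these same aspects:

```python
# Illustrative sketch only: polls the status fields described above.
# The project object and its accessors are hypothetical.
import time


def watch_project(project, interval_seconds: int = 30) -> None:
    """Periodically report process statuses, server processes, and backlog."""
    while project.is_running():
        status = project.get_status()  # hypothetical status accessor
        print(f"capture process:     {status.capture}")
        print(f"replication process: {status.replication}")
        print(f"log reader process:  {status.log_reader}")
        print(f"server processes:    {status.server_processes}")
        print(f"backlog:             {status.backlog_rows} rows pending")
        time.sleep(interval_seconds)
```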

You can oversee any replication project within the repository. Configuration changes can be deployed with minimal impact on the production environment. Captured data is retained, and replication information continues to be collected even if a process is stopped or disabled.

Metabases: Certain data connections in a replication project, such as Db2 for IBM i, Db2 for LUW, Oracle, and SQL Server JDBC, must have an associated metabase. In the Data Integrity Suite, metabases are created and stored on the source database server accessed through the JDBC data connection. After creation, metabases can be managed from the Metabases page.

Data flow mechanics: The operation of data flows depends on the type of replication project. These projects allow you to configure replication pipelines to enable or disable data capture for specific tables or data connections, and to start or stop replication as needed.

The pipelines within a project are interconnected: actions taken at the project level, such as starting or stopping replication, can significantly impact the overall configuration, while actions taken at the pipeline level target only individual data flows, as the sketch below illustrates.
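
The following is a minimal sketch of this scoping, assuming hypothetical class and method names that are not part of the product:

```python
# Illustrative sketch only: contrasts project-level and pipeline-level
# actions. Class and method names are hypothetical.
class Pipeline:
    def __init__(self, name: str):
        self.name = name
        self.replicating = False

    def start(self) -> None:  # pipeline-level: affects one data flow
        self.replicating = True

    def stop(self) -> None:
        self.replicating = False


class Project:
    def __init__(self, pipelines: list):
        self.pipelines = pipelines

    def stop_all(self) -> None:  # project-level: cascades to every pipeline
        for pipeline in self.pipelines:
            pipeline.stop()
```

Stopping at the project level halts every pipeline, whereas stopping a single pipeline leaves the others running.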

Replication pipeline

Replication pipelines used in a replication project copy, synchronize, and replicate your data from a source to a target data source.

You can monitor details of a selected replication pipeline and perform certain actions based on the component. When you click a replication pipeline, additional details are displayed, such as the health of the pipeline, its current phase, any required actions, and the statuses of its processes. For example, an overall health status of Stopped indicates that all processes and replication pipelines for the project are stopped, including kernel processes on all runtime engines and replication processes for all replication pipelines. Capture can still be enabled, and projects that are staged but not active also appear as Stopped. You can also perform actions on the components; for example, clicking the ellipsis next to a replication pipeline opens a context menu of available actions.

Metabase

Metabases are repositories of database tables and objects that define, enable, and manage data distribution for replication projects. Metabases are tied to a project, and each project has a unique metabase. Metabases contain replication backlogs and metadata about which tables are enabled for capture on the source. Db2 for IBM i, Db2 for LUW, Oracle, and SQL Server data connections used for a replication project must be associated with a metabase.
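
As a conceptual illustration only (the structure below is a deliberate simplification, not the product's actual schema), a metabase can be thought of as tracking two kinds of state per project:

```python
# Conceptual sketch only: a simplified model of the state a metabase
# tracks. This is not the product's actual schema.
from dataclasses import dataclass, field


@dataclass
class Metabase:
    project: str  # each project has exactly one metabase
    capture_enabled_tables: set = field(default_factory=set)
    backlog: list = field(default_factory=list)  # changes awaiting replication


mb = Metabase(project="orders_replication")
mb.capture_enabled_tables.add("SALES.ORDERS")
mb.backlog.append({"table": "SALES.ORDERS", "operation": "insert"})
```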

Engine

The engine of any continuous replication pipeline serves as the core component that coordinates and executes the entire data replication process. Its importance lies in its ability to handle crucial tasks such as Change Data Capture (CDC), where it captures and monitors changes made to data in real time or near real time from source systems. This ensures that any modifications, whether inserts, updates, or deletes, are accurately tracked and promptly replicated to designated target systems.
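
One general way a CDC engine achieves this kind of guarantee is by tracking its position in the source's change log. The sketch below illustrates that idea only; the source_log, target, and checkpoint_store interfaces are hypothetical, not the product's:

```python
# Illustrative sketch only: one way a CDC engine can track its position in
# the source log so that no change is missed. All names are hypothetical.
def run_engine(source_log, target, checkpoint_store) -> None:
    """Read changes from the last checkpoint onward and apply them in order."""
    position = checkpoint_store.load()  # resume from the last saved position
    for event in source_log.read_from(position):  # inserts, updates, deletes
        target.apply(event)
        checkpoint_store.save(event.position)  # record progress after applying
```

In this sketch, saving the checkpoint only after an event is applied yields at-least-once delivery, so the target would typically apply events idempotently.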

Furthermore, the engine helps in monitoring the replication process, diagnosing issues promptly, and ensuring smooth operation. By providing detailed logs and alerts, the engine facilitates effective troubleshooting and maintenance, thereby minimizing downtime and maximizing data availability and reliability.

Mapping

Mapping focuses on how data elements are mapped between source and target systems. It covers field mapping, where specific fields or columns in source data are matched with corresponding fields in the target system. Transformation rules and business logic integration are documented to ensure that data is accurately transformed and aligned with organizational requirements during replication. Examples of mappings for different scenarios are included to illustrate how data integrity and consistency are maintained throughout the replication process.
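
As a generic illustration of field mapping with transformation rules (the column names and rules below are hypothetical examples, not taken from the product):

```python
# Illustrative sketch only: field mapping with transformation rules.
# The column names and rules are hypothetical examples.
FIELD_MAP = {
    "cust_nm": "customer_name",  # source column -> target column
    "ord_dt": "order_date",
    "amt_usd": "amount",
}

TRANSFORMS = {
    "customer_name": str.title,  # business rule: normalize casing
    "amount": lambda value: round(float(value), 2),
}


def map_row(source_row: dict) -> dict:
    """Rename source fields and apply per-field transformation rules."""
    target_row = {}
    for source_field, target_field in FIELD_MAP.items():
        value = source_row[source_field]
        transform = TRANSFORMS.get(target_field, lambda v: v)
        target_row[target_field] = transform(value)
    return target_row


print(map_row({"cust_nm": "ada lovelace", "ord_dt": "2025-01-15", "amt_usd": "19.5"}))
# -> {'customer_name': 'Ada Lovelace', 'order_date': '2025-01-15', 'amount': 19.5}
```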

Together, these components create a robust framework that enables reliable and efficient data replication.