Estimate the workload

Estimate the number of pipelines/schemas per integration use case:

  • Continuous Replication: The real-time or near-real-time synchronization of data changes (inserts, updates, and deletes) from a source system to a target. It is commonly used where up-to-date data is critical, such as replicating from DB2, SQL Server, or Oracle to platforms like BigQuery, Snowflake, or Kafka. This use case typically involves multiple pods, such as connect-cdc, connect-hub, and cloud-applier, which remain active continuously to monitor and apply data changes. Because of this sustained activity, the resource requirements for each pipeline are modest but must be provisioned consistently to keep replication reliable and latency low.
  • Mainframe Replication: The transfer of data from legacy mainframe systems (e.g., IBM DB2z) to modern platforms for analytics or integration. It supports high-throughput replication of mainframe data structures and formats into platforms like Kafka or cloud data warehouses. This use case generally requires fewer pods, with the sqdata-management pod being the primary component. Although fewer in number, these pods often need more CPU and memory per pipeline because of the complexity and size of the data being handled.
  • Fast Load: High-speed, bulk data loading, typically used for initial data migration or full dataset refreshes. Unlike continuous replication, it does not track incremental changes; it focuses on loading large volumes of data quickly from sources like DB2 into cloud targets such as BigQuery. This use case relies heavily on the connect-cdc and connect-hub pods, but with significantly higher CPU and memory requirements per pipeline to sustain large data throughput in a short time frame. A rough sizing calculation that combines these per-pipeline profiles is sketched after this list.
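
As a complement to these descriptions, the sketch below shows how the pipeline counts you estimate for each use case can be turned into a first-pass capacity figure. The pod names are taken from the descriptions above; every per-pipeline CPU and memory value, and the estimate helper itself, is an illustrative assumption rather than published Precisely sizing guidance, so substitute the figures that apply to your own deployment.

# Rough sizing sketch. The per-pipeline CPU/memory figures are
# illustrative placeholders, not official sizing recommendations.
from dataclasses import dataclass

@dataclass
class UseCaseProfile:
    name: str
    pods: list[str]          # pods that stay active for this use case (informational)
    cpu_per_pipeline: float  # vCPU per pipeline (assumed value)
    mem_per_pipeline: float  # GiB per pipeline (assumed value)

# Assumed profiles mirroring the qualitative guidance above: continuous
# replication is modest but sustained, mainframe replication is heavier
# per pipeline, and fast load is the most CPU/memory intensive.
PROFILES = {
    "continuous_replication": UseCaseProfile(
        "Continuous Replication",
        ["connect-cdc", "connect-hub", "cloud-applier"],
        cpu_per_pipeline=0.5, mem_per_pipeline=1.0),
    "mainframe_replication": UseCaseProfile(
        "Mainframe Replication",
        ["sqdata-management"],
        cpu_per_pipeline=1.0, mem_per_pipeline=2.0),
    "fast_load": UseCaseProfile(
        "Fast Load",
        ["connect-cdc", "connect-hub"],
        cpu_per_pipeline=2.0, mem_per_pipeline=4.0),
}

def estimate(pipeline_counts: dict[str, int]) -> tuple[float, float]:
    """Return (total vCPU, total GiB) for the given pipeline counts."""
    total_cpu = total_mem = 0.0
    for use_case, count in pipeline_counts.items():
        profile = PROFILES[use_case]
        total_cpu += count * profile.cpu_per_pipeline
        total_mem += count * profile.mem_per_pipeline
    return total_cpu, total_mem

if __name__ == "__main__":
    # Example: 10 continuous-replication pipelines, 3 mainframe pipelines,
    # and 2 fast-load pipelines running concurrently.
    cpu, mem = estimate({"continuous_replication": 10,
                         "mainframe_replication": 3,
                         "fast_load": 2})
    print(f"Estimated steady-state need: {cpu} vCPU, {mem} GiB memory")

Running the example prints a single steady-state total; in practice you would compare that total against the capacity of your cluster and add headroom for peak activity such as initial loads or full refreshes.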