Calculate total resources

Data Integrity Suite


This step walks through the complete process of estimating CPU, memory, and storage required for Data Integration, using two example use cases:

  • Use Case A: DB2i to BigQuery (Continuous Replication) with 32 pipelines
  • Use Case B: DB2z to Kafka (Mainframe Replication) with 32 pipelines

The process includes multiplying per-pipeline resource values (from Step 2), applying fixed memory logic, accounting for system overhead, and rounding off for provisioning.

Table 1: Base Resource Requirements per Pipeline (from Step 2)

These per-pipeline values are sourced from Step 2 and form the foundation for the calculations in this step.

| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ (Continuous) | connect-cdc | 50 | 103 | 154 | 0.3 |
| | connect-hub | 12 | 24 | 48 | - |
| | cloud-applier | 56 | 70 | 112 | 20 |
| DB2z to Kafka (Mainframe) | sqdata-management | 170 | N/A | 162 | 1 |

How to Calculate JVM and vCPU for Pods Based on the Number of Pipelines

To determine the required JVM memory and vCPU for each relevant pod, multiply the number of pipelines or schemas (for example, 32 pipelines) by the per-pipeline or per-schema resource allocation from Step 2.

Table 2: Calculated Resources (32 Pipelines × Base Values)

This table multiplies the base values in Table 1 by 32 pipelines to determine unadjusted totals.

| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ | connect-cdc | 50 × 32 = 1600 | 103 × 32 = 3296 | 154 × 32 = 4928 | 0.3 × 32 = 9.6 |
| | connect-hub | 12 × 32 = 384 | 24 × 32 = 768 | 48 × 32 = 1536 | - |
| | cloud-applier | 56 × 32 = 1792 | 70 × 32 = 2240 | 112 × 32 = 3584 | 20 × 32 = 640 |
| DB2z to Kafka | sqdata-management | 170 × 32 = 5440 | N/A | 162 × 32 = 5184 | 1 × 32 = 32 |
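The multiplication in Table 2 can be sketched as a small calculation. This is a minimal sketch; the pod names and per-pipeline base values are taken from Table 1.

```python
# Sketch: scale per-pipeline base values (Table 1) by the pipeline count.
# Tuples are (vCPU millicores, JVM MiB, pod memory MiB, disk GB);
# None marks values that do not apply (JVM for sqdata-management,
# disk for connect-hub).
BASE = {
    "connect-cdc":       (50, 103, 154, 0.3),
    "connect-hub":       (12, 24, 48, None),
    "cloud-applier":     (56, 70, 112, 20),
    "sqdata-management": (170, None, 162, 1),
}

def scale(pod: str, pipelines: int):
    """Multiply each per-pipeline base value by the number of pipelines."""
    return tuple(v * pipelines if v is not None else None for v in BASE[pod])

print(scale("connect-cdc", 32))        # unadjusted totals for 32 pipelines
print(scale("sqdata-management", 32))
```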

Add fixed memory for pods

  1. cloud-applier: Add 500 MiB of JVM memory and 250 millicores vCPU.
  2. connect-cdc: Allocate 25% of the total pod memory for the listener process, but ensure this memory is assigned outside the JVM. (Refer to Section C for the calculation of total pod memory.) The listener is a separate process that receives its memory allocation based on the total pod memory.
  3. Add extra JVM memory for continuous replication (based on pipeline complexity).
    • For multiple projects:
      connect-cdc: resources required for the largest project × number of projects.
      connect-hub: resources required for the largest project.
      Note: If pipelines are grouped into projects, consider the largest project.
    • Determining the Largest Project

      The largest project is determined by the number of pipelines it has configured. The project with the most pipelines is considered the largest.

      Example:

      | Project | Pipelines |
      |---|---|
      | A | 10 |
      | B | 15 |
      | C | 32 |

      In this scenario, Project C is the largest project. Therefore, the resource allocation is connect-cdc = Project C × 3 (the number of projects) and connect-hub = Project C.
  4. connect-hub: no fixed memory
  5. sqdata-management: no fixed memory
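The largest-project rule in step 3 can be sketched as follows. This is a minimal sketch; the project names and pipeline counts are the hypothetical ones from the example above.

```python
# Sketch: pick the largest project (most configured pipelines) and derive
# the pipeline counts used to size connect-cdc and connect-hub.
projects = {"A": 10, "B": 15, "C": 32}  # project -> configured pipelines

largest = max(projects, key=projects.get)      # project with most pipelines
cdc_units = projects[largest] * len(projects)  # connect-cdc: largest x number of projects
hub_units = projects[largest]                  # connect-hub: largest project only

print(largest, cdc_units, hub_units)
```

The resulting counts then feed the per-pipeline multiplication shown in Table 2.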

Example: applying the criteria above, fixed memory is incorporated wherever it is applicable.

Table 3: Resources After Fixed Additions

| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ | connect-cdc | 1600 | 3296 | 4928 + 25% of 4928 = 6160 | 9.6 |
| | connect-hub | 384 | 768 | 1536 | - |
| | cloud-applier | 1792 + 250 = 2042 | 2240 + 500 = 2740 | 3584 + 750 = 4334 | 640 |
| DB2z to Kafka | sqdata-management | 5440 | N/A | 5184 | 32 |
Note: For every MiB added to the JVM, 50% of that amount should be allocated to off-heap memory for the cloud-applier. For example, if 500 MiB is added to the JVM, an additional 250 MiB should be allocated for off-heap memory, resulting in a total of 750 MiB added to the pod memory.
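The fixed additions in Table 3 can be sketched as the arithmetic below. This is a minimal sketch; the 25% listener share and the 50% off-heap rule come from the criteria and note above.

```python
# Sketch: apply the fixed additions from Table 3 to the 32-pipeline totals.

# cloud-applier: +250 millicores vCPU and +500 MiB JVM; per the note, 50% of
# the added JVM (250 MiB) goes to off-heap memory, so pod memory grows by 750.
applier_vcpu = 1792 + 250
applier_jvm = 2240 + 500
applier_pod = 3584 + 500 + 500 // 2

# connect-cdc: reserve 25% of total pod memory for the listener process,
# allocated outside the JVM on top of the pod memory itself.
cdc_pod = 4928 + int(4928 * 0.25)

print(applier_vcpu, applier_jvm, applier_pod, cdc_pod)
```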

Aggregate vCPU and Memory Usage for All Pods

Calculate the total vCPU and memory usage across all pods.

Table 4: Aggregate Totals (Before Defaults & OS Overhead)

Summed totals of vCPU and memory for each use case (from Table 3), prior to accounting for default pods or system overhead.

| Use Case | Total vCPU (millicores) | Total JVM Memory (MiB) | Total Pod Memory (MiB) | Total Disk (GB) |
|---|---|---|---|---|
| DB2i to BQ | 1600 + 384 + 2042 = 4026 | 3296 + 768 + 2740 = 6804 | 6160 + 1536 + 4334 = 12030 | ~650 |
| DB2z to Kafka | 5440 | N/A | 5184 | 32 |
Note: We need to set a default memory allocation for pods that are not directly part of our use case. This ensures efficient resource management and prevents unnecessary memory consumption.
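The per-use-case aggregation in Table 4 can be sketched as a simple sum over the adjusted pod values from Table 3:

```python
# Sketch: sum per-pod resources for the DB2i-to-BQ use case (Table 3 values).
pods = {
    "connect-cdc":   {"vcpu": 1600, "jvm": 3296, "pod_mem": 6160},
    "connect-hub":   {"vcpu": 384,  "jvm": 768,  "pod_mem": 1536},
    "cloud-applier": {"vcpu": 2042, "jvm": 2740, "pod_mem": 4334},
}

# Totals: vCPU (millicores), JVM memory (MiB), pod memory (MiB).
totals = {k: sum(p[k] for p in pods.values()) for k in ("vcpu", "jvm", "pod_mem")}
print(totals)
```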

Default Resource Allocation for Unused Pods: To ensure stability and completeness of the deployment environment, a default set of CPU and memory resources is added for pods that are not directly involved in the current use case. This baseline allocation helps prevent under-provisioning and supports essential background processes managed by the Precisely Agent.

Table 5: Add Default Pod Memory for Unused Pods

| Use Case | vCPU (millicores) | Pod Memory (MiB) |
|---|---|---|
| DB2i to BQ | 4026 + 200 = 4226 | 12030 + 500 = 12530 |
| DB2z to Kafka | 5440 + 3200 = 8640 | 5184 + 6144 = 11328 |
Note: Defaults added for pods unused in a given use case:
  • DB2i to BQ: +200 vCPU, +500 MiB (sqdata-management)
  • DB2z to Kafka: +3200 vCPU, +6144 MiB (connect-cdc, connect-hub, cloud-applier)
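Adding the defaults for unused pods can be sketched as follows, using the values listed in the note above:

```python
# Sketch: add default allocations for the pods a use case does not run.
# DB2i to BQ does not run sqdata-management; DB2z to Kafka does not run
# connect-cdc, connect-hub, or cloud-applier.
db2i_vcpu = 4026 + 200    # + sqdata-management default vCPU (millicores)
db2i_mem = 12030 + 500    # + sqdata-management default memory (MiB)
db2z_vcpu = 5440 + 3200   # + connect-* defaults (millicores)
db2z_mem = 5184 + 6144    # + connect-* defaults (MiB)

print(db2i_vcpu, db2i_mem, db2z_vcpu, db2z_mem)
```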

Operating System Overhead: In addition to workload-specific resources, it's important to reserve system-level resources for the operating system itself. This includes a small allocation of CPU, memory, and disk for running the Ubuntu (non-GUI) environment, which hosts the Data Integration components.

Table 6: Add OS Overhead

| Use Case | vCPU (millicores) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|
| DB2i to BQ | 4226 + 1000 = 5226 | 12530 + 2048 = 14578 | 650 + 20 = 670 |
| DB2z to Kafka | 8640 + 1000 = 9640 | 11328 + 2048 = 13376 | 32 + 20 = 52 |
Note: Add system-level resource requirements (Ubuntu non-GUI):
  • +1000 vCPU
  • +2048 MiB memory
  • +20 GB disk
Add agent memory requirement: The agent requires 8 vCPUs, 16 GB RAM, and 300 GB storage. This includes the default memory allocated for four Data Integration pods: connect-cdc, connect-hub, cloud-applier, and sqdata-management. When excluding the default memory for these pods, the agent's resource requirements are reduced to:
  • CPU: 4600 millicores (4.6 vCPUs)
  • RAM: 9740 MiB (approximately 9.74 GB)
  • Storage: 240 GB
To calculate the total resource requirements, simply add the default memory back to these values.

Table 7: Add Agent Resource Requirements

| Use Case | vCPU (millicores) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|
| DB2i to BQ | 5226 + 4600 = 9826 | 14578 + 9740 = 24318 | 670 + 240 = 910 |
| DB2z to Kafka | 9640 + 4600 = 14240 | 13376 + 9740 = 23116 | 52 + 240 = 292 |
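The OS overhead and agent additions (Tables 6 and 7) can be sketched together for the DB2i-to-BQ use case:

```python
# Sketch: add OS overhead (Ubuntu non-GUI) and agent requirements to the
# DB2i-to-BQ totals from Table 5.
OS_VCPU, OS_MEM, OS_DISK = 1000, 2048, 20            # millicores, MiB, GB
AGENT_VCPU, AGENT_MEM, AGENT_DISK = 4600, 9740, 240  # agent minus pod defaults

vcpu = 4226 + OS_VCPU + AGENT_VCPU   # millicores
mem = 12530 + OS_MEM + AGENT_MEM     # MiB
disk = 650 + OS_DISK + AGENT_DISK    # GB (workload disk + OS + agent)

print(vcpu, mem, disk)
```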

Round off: round the totals up to provisioning-friendly values; the rounded figures below are used for VM provisioning in Step 4.

Table 8: Rounded Resource Totals (Final)

| Use Case | vCPU | RAM | Disk |
|---|---|---|---|
| DB2i to BQ | ~10 vCPU | ~24 GB | ~1 TB |
| DB2z to Kafka | ~15 vCPU | ~23 GB | ~300 GB |
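The rounding step can be sketched as rounding each total up to a provisioning-friendly unit. The specific units (whole cores, whole GB, 100 GB disk steps) are an assumption inferred from the rounded values in Table 8.

```python
import math

def round_for_provisioning(millicores: int, mem_mib: int, disk_gb: int):
    """Round up: vCPU to whole cores, memory to whole GB, disk to 100 GB steps."""
    vcpus = math.ceil(millicores / 1000)
    ram_gb = math.ceil(mem_mib / 1024)
    disk = math.ceil(disk_gb / 100) * 100
    return vcpus, ram_gb, disk

print(round_for_provisioning(9826, 24318, 910))   # DB2i to BQ
print(round_for_provisioning(14240, 23116, 292))  # DB2z to Kafka
```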

When to Modify Default Resource Allocation?

| Condition | Action | Rationale |
|---|---|---|
| Calculated < Default | Use default | Defaults support the load; no changes needed |
| Calculated > Default | Proceed to Step 5 to manually update resource settings | Ensure performance and stability under heavier workloads |