This step walks through the complete process of estimating CPU, memory, and storage required for Data Integration, using two example use cases:
- Use Case A: DB2i to BigQuery (Continuous Replication) with 32 pipelines
- Use Case B: DB2z to Kafka (Mainframe Replication) with 32 pipelines
The process includes multiplying per-pipeline resource values (from Step 2), applying fixed memory logic, accounting for system overhead, and rounding off for provisioning.
Table 1: Base Resource Requirements per Pipeline (from Step 2)
These per-pipeline values are sourced from Step 2 and form the foundation for the calculations in this step.
| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ (Continuous) | connect-cdc | 50 | 103 | 154 | 0.3 |
| | connect-hub | 12 | 24 | 48 | - |
| | cloud-applier | 56 | 70 | 112 | 20 |
| DB2z to Kafka (Mainframe) | sqdata-management | 170 | N/A | 162 | 1 |
How to Calculate JVM and vCPU for Pods Based on the Number of Pipelines
To determine the required JVM and vCPU resources for pods, simply multiply the number of pipelines or schemas (e.g., 32 pipelines) by the resource allocation per pipeline or schema (as outlined in Step 2) for each relevant pod.
Table 2: Calculated Resources (32 Pipelines × Base Values)
This table multiplies the base values in Table 1 by 32 pipelines to determine unadjusted totals.
| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ | connect-cdc | 50 × 32 = 1600 | 103 × 32 = 3296 | 154 × 32 = 4928 | 0.3 × 32 = 9.6 |
| | connect-hub | 12 × 32 = 384 | 24 × 32 = 768 | 48 × 32 = 1536 | - |
| | cloud-applier | 56 × 32 = 1792 | 70 × 32 = 2240 | 112 × 32 = 3584 | 20 × 32 = 640 |
| DB2z to Kafka | sqdata-management | 170 × 32 = 5440 | N/A | 162 × 32 = 5184 | 1 × 32 = 32 |
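The multiplication in Table 2 can be sketched in a few lines; the base values are copied from Table 1, and the dictionary layout is illustrative:

```python
# Sketch: scale per-pipeline base values (Table 1) by the pipeline count.
PIPELINES = 32

base = {  # vCPU (millicores), JVM (MiB), pod memory (MiB), disk (GB)
    "connect-cdc":   {"vcpu": 50, "jvm": 103, "pod_mem": 154, "disk": 0.3},
    "connect-hub":   {"vcpu": 12, "jvm": 24,  "pod_mem": 48,  "disk": 0},
    "cloud-applier": {"vcpu": 56, "jvm": 70,  "pod_mem": 112, "disk": 20},
}

# Multiply every base value by the number of pipelines.
scaled = {
    pod: {metric: value * PIPELINES for metric, value in res.items()}
    for pod, res in base.items()
}
print(scaled["connect-cdc"]["vcpu"])  # 1600 millicores
```

The same pattern applies to the DB2z-to-Kafka case (sqdata-management: 170 × 32 = 5440 millicores).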
Add fixed memory for pods
- cloud-applier: Add 500 MiB of JVM memory and 250 millicores vCPU.
- connect-cdc: Allocate an additional 25% of the total pod memory for the listener process, assigned outside the JVM. The listener is a separate process whose memory allocation is based on the total pod memory. (Refer to Section C for the calculation of total pod memory.)
- Add extra JVM memory for continuous replication (based on pipeline complexity).
- connect-hub: no fixed memory.
- sqdata-management: no fixed memory.
- For multiple projects:

| Pod | Resource rule |
|---|---|
| connect-cdc | Resource required for the largest project × number of projects |
| connect-hub | Resource required for the largest project |

Note: If pipelines are grouped into projects, size based on the largest project.

Determining the Largest Project
The largest project is determined by the number of pipelines it has configured: the project with the most pipelines is considered the largest.

Example:

| Project | Pipelines |
|---|---|
| A | 10 |
| B | 15 |
| C | 32 |

In this scenario, Project C is the largest project. Therefore, the resource allocation is connect-cdc = Project C × 3 (three projects) and connect-hub = Project C.
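The multi-project rule described above can be sketched as follows; the project names and pipeline counts are the example values from the text, and the variable names are illustrative:

```python
# Sketch: pick the largest project (most pipelines) for multi-project sizing.
projects = {"A": 10, "B": 15, "C": 32}  # example projects and pipeline counts

largest = max(projects, key=projects.get)        # project with most pipelines: "C"
cdc_factor = projects[largest] * len(projects)   # connect-cdc: largest × number of projects
hub_factor = projects[largest]                   # connect-hub: largest project only
```

With these example values, connect-cdc is sized for 32 × 3 = 96 pipeline-equivalents and connect-hub for 32.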
Example: Based on the criteria above, the fixed memory and vCPU additions are applied wherever applicable.
Table 3: Resources After Fixed Additions
| Use Case | Pod Name | vCPU (millicores) | JVM Memory (MiB) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|---|---|
| DB2i to BQ | connect-cdc | 1600 | 3296 | 4928 + 25% of 4928 = 6160 | 9.6 |
| | connect-hub | 384 | 768 | 1536 | - |
| | cloud-applier | 1792 + 250 = 2042 | 2240 + 500 = 2740 | 3584 + 750 = 4334 | 640 |
| DB2z to Kafka | sqdata-management | 5440 | N/A | 5184 | 32 |
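The fixed additions in Table 3 can be verified with a short sketch; the constants (+250 millicores and +500 MiB for cloud-applier, +25% pod memory for the connect-cdc listener) are taken directly from the rules above:

```python
# Sketch: apply the fixed additions from Table 3.
applier_vcpu = 1792 + 250        # cloud-applier: +250 millicores vCPU
applier_jvm  = 2240 + 500        # cloud-applier: +500 MiB JVM memory
cdc_pod_mem  = 4928 * 1.25       # connect-cdc: +25% pod memory for the listener
print(applier_vcpu, applier_jvm, int(cdc_pod_mem))  # 2042 2740 6160
```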
Aggregate vCPU and Memory Usage for All Pods
Calculate the total vCPU and memory usage across all pods.
Table 4: Aggregate Totals (Before Defaults & OS Overhead)
Summed vCPU and memory totals for each use case (from Table 3), before accounting for default pods or system overhead.
| Use Case | Total vCPU (millicores) | Total JVM Memory (MiB) | Total Pod Memory (MiB) | Total Disk (GB) |
|---|---|---|---|---|
| DB2i to BQ | 4026 (1600+384+2042) | 3296 + 768 + 2740 = 6804 | 6160 + 1536 + 4334 = 12030 | ~650 |
| DB2z to Kafka | 5440 | N/A | 5184 | 32 |
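The aggregation in Table 4 is a straight sum over the adjusted per-pod values; this sketch reproduces the DB2i-to-BQ row using the Table 3 numbers:

```python
# Sketch: aggregate adjusted per-pod values (Table 3) for DB2i to BQ.
pods = {
    "connect-cdc":   {"vcpu": 1600, "jvm": 3296, "pod_mem": 6160},
    "connect-hub":   {"vcpu": 384,  "jvm": 768,  "pod_mem": 1536},
    "cloud-applier": {"vcpu": 2042, "jvm": 2740, "pod_mem": 4334},
}

# Sum each metric across all pods.
totals = {
    metric: sum(p[metric] for p in pods.values())
    for metric in ("vcpu", "jvm", "pod_mem")
}
print(totals)  # {'vcpu': 4026, 'jvm': 6804, 'pod_mem': 12030}
```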
Default Resource Allocation for Unused Pods: To ensure stability and completeness of the deployment environment, a default set of CPU and memory resources is added for pods that are not directly involved in the current use case. This baseline allocation helps prevent under-provisioning and supports essential background processes managed by the Precisely Agent.
Table 5: Add Default Pod Memory for Unused Pods
| Use Case | vCPU (millicores) | Pod Memory (MiB) |
|---|---|---|
| DB2i to BQ | 4026 + 200 = 4226 | 12030 + 500 = 12530 |
| DB2z to Kafka | 5440 + 3200 = 8640 | 5184 + 6144 = 11328 |
- DB2i to BQ: +200 vCPU, +500 MiB (sqdata-management)
- DB2z to Kafka: +3200 vCPU, +6144 MiB (connect-cdc, connect-hub, cloud-applier)
Operating System Overhead: In addition to workload-specific resources, it is important to reserve system-level resources for the operating system itself. This includes a small allocation of CPU, memory, and disk for running the Ubuntu (non-GUI) environment, which hosts the Data Integration components.
Table 6: Add OS Overhead
| Use Case | vCPU (millicores) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|
| DB2i to BQ | 4226 + 1000 = 5226 | 12530 + 2048 = 14578 | 650 + 20 = 670 |
| DB2z to Kafka | 8640 + 1000 = 9640 | 11328 + 2048 = 13376 | 32 + 20 = 52 |
- +1000 vCPU
- +2048 MiB memory
- +20 GB disk
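The default-pod and OS additions (Tables 5 and 6) can be chained as a running total; this sketch follows the DB2i-to-BQ case with the overhead constants stated above:

```python
# Sketch: add default-pod and OS overheads to the DB2i-to-BQ totals.
vcpu, mem, disk = 4026, 12030, 650       # Table 4 aggregate totals
vcpu += 200;  mem += 500                 # default pods (sqdata-management)
vcpu += 1000; mem += 2048; disk += 20    # OS overhead (Ubuntu, non-GUI)
print(vcpu, mem, disk)  # 5226 14578 670
```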
The Precisely Agent itself requires:
- CPU: 4600 millicores (4.6 vCPUs)
- RAM: 9740 MiB (approximately 9.5 GiB)
- Storage: 240 GB
Table 7: Add Agent Resource Requirements
| Use Case | vCPU (millicores) | Pod Memory (MiB) | Disk (GB) |
|---|---|---|---|
| DB2i to BQ | 5226 + 4600 = 9826 | 14578 + 9740 = 24318 | 670 + 240 = 910 |
| DB2z to Kafka | 9640 + 4600 = 14240 | 13376 + 9740 = 23116 | 52 + 240 = 292 |
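Adding the agent requirements is the final arithmetic step; this sketch reproduces the DB2z-to-Kafka row of Table 7:

```python
# Sketch: add the Precisely Agent requirements (DB2z to Kafka, Table 7).
AGENT = {"vcpu": 4600, "mem": 9740, "disk": 240}  # agent requirements above

vcpu = 9640 + AGENT["vcpu"]    # 14240 millicores
mem  = 13376 + AGENT["mem"]    # 23116 MiB
disk = 52 + AGENT["disk"]      # 292 GB
```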
Round off to the nearest provisioning size: the totals are rounded for VM provisioning (used in Step 4).
Table 8: Rounded Resource Totals (Final)
| Use Case | vCPU | RAM (MiB) | Disk |
|---|---|---|---|
| DB2i to BQ | ~10 vCPU | ~24 GB | ~1 TB |
| DB2z to Kafka | ~15 vCPU | ~23 GB | ~300 GB |
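A rounding helper along these lines produces the Table 8 figures; the function name and the choice to round vCPU up and RAM to the nearest GiB are illustrative conventions, not a prescribed rule:

```python
# Sketch: round final totals for VM provisioning (Table 8).
import math

def round_for_vm(vcpu_millicores, mem_mib, disk_gb):
    return (
        math.ceil(vcpu_millicores / 1000),  # round vCPUs up to a whole core
        round(mem_mib / 1024),              # RAM to the nearest GiB
        disk_gb,                            # disk: provision at the next convenient tier
    )

print(round_for_vm(9826, 24318, 910))   # (10, 24, 910) -> ~10 vCPU, ~24 GB, ~1 TB
```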
When to Modify Default Resource Allocation?
| Condition | Action | Rationale |
|---|---|---|
| Calculated < Default | Use default | Defaults support the load, no changes needed |
| Calculated > Default | Proceed to Step 5 to manually update resource settings | Ensure performance and stability under heavier workloads |