Pipeline job execution throws an Out of Memory error.
When pipeline jobs are executed with the Spark connector on the Precisely Agent, an unexpected Out of Memory error can occur, causing the job execution to fail.
Problem:
The pipeline engine lacks the required memory for successful job execution.
Error message:
SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (10.42.0.87 executor 4): ExecutorLostFailure (executor 4 exited caused by one of the running tasks) Reason: The executor with id 4 exited with exit code 137(SIGKILL, possible container OOM).
Workaround:
Add the following keys to the Spark properties of the pipeline engine to provide additional buffer (overhead) memory, which helps mitigate Out of Memory scenarios. It is recommended to set the overhead factor (buffer memory) in the Spark connector to at least 0.2.
Spark provides several configuration options to fine-tune memory-related settings. These properties control the memory overhead allocated beyond the heap size, which is crucial for handling off-heap memory, task execution overhead, and other internal memory usage.
For instance, if the allocated executor memory is 3 GB, setting an overhead factor of 0.4 in the Spark properties allocates an additional 1.2 GB of memory, facilitating successful job execution.
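The arithmetic behind this is simple. The sketch below (plain Python; the helper name is hypothetical and not a Spark API) shows how the overhead factor translates into extra memory:

```python
def overhead_gb(executor_memory_gb: float, overhead_factor: float) -> float:
    """Extra off-heap buffer requested on top of the heap (illustrative only)."""
    return executor_memory_gb * overhead_factor

# Example from the text: 3 GB of executor memory with a 0.4 overhead factor
print(overhead_gb(3, 0.4))  # 1.2 (GB), so the container requests roughly 4.2 GB in total
```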
Keys to add in the Spark properties section of the pipeline engine (an illustrative example follows the table):
| Spark property |
| --- |
| spark.driver.memoryOverhead |
| spark.driver.memoryOverheadFactor |
| spark.executor.memoryOverhead |
| spark.executor.memoryOverheadFactor |
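How these keys are entered in the pipeline engine's Spark properties section is product-specific and not shown here. As a rough illustration, the sketch below sets the same keys on a standalone PySpark session; the values are examples only, and the *OverheadFactor properties require Spark 3.3 or later.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pipeline-oom-workaround-example")
    # Fixed overhead per process; accepts a size string ("1g") or a value in MiB.
    .config("spark.driver.memoryOverhead", "1g")
    .config("spark.executor.memoryOverhead", "1g")
    # Overhead expressed as a fraction of heap memory (Spark 3.3+). Ignored if
    # the corresponding memoryOverhead value above is set explicitly.
    .config("spark.driver.memoryOverheadFactor", "0.2")
    .config("spark.executor.memoryOverheadFactor", "0.2")
    .getOrCreate()
)
```

In practice, set either the fixed overhead or the factor; when both are present, the explicit memoryOverhead value takes precedence.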