The Custom Coding Step in the Data Quality Suite enables you to address advanced use cases or requirements that cannot be fulfilled using the existing Quality pipeline steps. This feature is particularly beneficial for users who prefer coding or scripting and need to implement complex logic such as conditional statements, loops, or integrations with scripts from other vendors or solutions.
- Custom Coding Capability: Solve use cases beyond the scope of pre-defined Quality pipeline steps.
- Flexibility with Existing Scripts: Leverage custom scripts or existing code from other solutions.
- Enhanced Editing Environment: Provides a dedicated code block with auto-complete (IntelliSense) and error validation.
AI Capabilities
The Custom Coding step in the Quality pipeline now supports AI-powered Natural Language Processing (NLP) capabilities. This allows users to describe logic in plain language, which the system automatically converts into equivalent JavaScript code.
Key Benefits
- Reduces time spent translating business requirements into code
- Enables non-technical users to contribute logic using natural language
- Minimizes errors by automating syntax and logic generation
| Scenario | Natural Language Input | Generated JavaScript |
|---|---|---|
| Data Validation | Check if ID is greater than 18 and mark as valid. | |
| String Normalization | Trim and convert full name to lowercase. | |
| Date Transformation | Convert all dates in the birth_date column to YYYY-MM-DD format. | |
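The table above lists the natural-language prompts only. As a rough, hand-written sketch (not actual AI output), the first two scenarios might translate into code along these lines. The `input` and `output` objects here are stand-ins built from `Map` so the snippet runs outside the pipeline, and the field names `ID`, `full_name`, `is_valid`, and `full_name_normalized` are hypothetical.

```javascript
// Stand-ins for the pipeline's input/output objects (assumption: these only
// exist inside the pipeline, so Map is used because it offers get/set).
const input = new Map([
  ['ID', 42n],                        // Integer fields arrive as BigInt
  ['full_name', '  Ada LOVELACE '],   // String field
]);
const output = new Map();

// Data validation: "Check if ID is greater than 18 and mark as valid"
output.set('is_valid', input.get('ID') > 18n);

// String normalization: "Trim and convert full name to lowercase"
const fullName = input.get('full_name');
output.set(
  'full_name_normalized',
  fullName == null ? null : fullName.trim().toLowerCase()
);

console.log(output.get('is_valid'));             // true
console.log(output.get('full_name_normalized')); // ada lovelace
```

Code produced by the AI feature may differ in structure and naming, so review it before applying it to a pipeline.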
Setup Instructions
To use AI capabilities with the Custom Coding step, make sure the feature is enabled for your workspace in AI Manager. For detailed configuration steps, refer to Features in the AI documentation.
How to configure the custom coding step
Step Name: Defines the name for the step. Provide a meaningful name so that others editing the pipeline can easily identify its purpose.
Input Fields and Output Fields: Input and output field types are mapped to JavaScript types as follows:
| Field Type | JavaScript Type | Description | Example |
|---|---|---|---|
| String | String | Represents text values. | if (input.get('name') === 'John') |
| Boolean | Boolean | Represents true/false values. | if (input.get('isActive') === true) |
| Integer | BigInt | Used for large whole numbers. Note: When working with BigInt, use the n suffix for number literals. | if (input.get('intField') > 100n) |
| Float | Number | Represents floating-point numbers. | let price = input.get('price') + 2.5; |
| Double | Number | Similar to Float but with double precision. | let total = input.get('total') * 1.1; |
| Decimal | BigDecimal | High-precision decimal values. Uses Java’s BigDecimal class. | if (input.get('decimalField').equals(new BigDecimal('100'))) |
| Datetime | LocalDateTime | Represents date and time without a timezone. | let timestamp = input.get('timestamp'); |
| Date | LocalDate | Represents a calendar date (year, month, day). | let today = input.get('dateField'); |
| Time | LocalTime | Represents a time of day (hours, minutes, seconds). | let currentTime = input.get('timeField'); |
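Because Integer fields map to BigInt while Float and Double fields map to Number, mixing the two in arithmetic needs care. The following is a minimal sketch of standard JavaScript BigInt behavior, using a `Map` as a stand-in for the pipeline's `input` object; the field names `quantity` and `unitPrice` are hypothetical.

```javascript
// Stand-in for the pipeline's input object (assumption for illustration only).
const input = new Map([
  ['quantity', 250n],    // Integer field -> BigInt
  ['unitPrice', 19.99],  // Float/Double field -> Number
]);

const quantity = input.get('quantity');
const unitPrice = input.get('unitPrice');

// Comparing a BigInt with a BigInt literal (note the n suffix).
const isBulkOrder = quantity > 100n;

// BigInt-only arithmetic stays exact.
const doubled = quantity * 2n;

// Mixing BigInt and Number in arithmetic throws a TypeError,
// so convert explicitly before combining the two.
let mixError = null;
try {
  quantity * unitPrice;
} catch (err) {
  mixError = err;
}
const lineTotal = Number(quantity) * unitPrice;

console.log(isBulkOrder);                   // true
console.log(doubled);                       // 500n
console.log(mixError instanceof TypeError); // true
console.log(lineTotal);
```

Note that `Number(quantity)` can lose precision for values beyond `Number.MAX_SAFE_INTEGER`, so convert only when the value range is known to be safe.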
- For the DateTime, Date, and Time data types, use the Java Time API instead of the JavaScript Date object. Example:
    var currentDate = LocalDate.now(); // assigns today's date to currentDate; LocalDate is a class in the Java Time API
- To use a class that is not whitelisted, reference it by its fully qualified class name. Example:
    java.time.temporal.ChronoUnit.YEARS.between(startDate, endDate); // the temporal package is part of the Java Time API and can be used this way even if it is not whitelisted
Input Fields:
- Select multiple fields at once if needed.
- Click on an input field name in the list to insert a "get" snippet with the correct syntax into the code.
- The input object is read-only: input.set('field', value) is not allowed and throws an error.
- If an input field is removed from the list but still referenced in the code, validation errors will occur.
Output Fields:
- Field Name: Assign a unique name for new fields or select an input field to overwrite it.
- Datatype: Specify the data type for the field.
- Click the list item to insert a "Put" script snippet at the cursor point in the code.
Code Editor:
- The script is executed against each row of input.
- Full-screen mode is available for a better coding experience.
- Syntax errors are highlighted in real time within the editor.
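To illustrate what per-row execution and a read-only input object mean in practice, here is a small simulation. The `runPerRow` helper is hypothetical and is not the product's implementation; it only mimics calling a script once per row with a read-only `input` and a writable `output`. The field names are also made up for the example.

```javascript
// Hypothetical helper that mimics how a pipeline might call a script once per
// row. This is NOT the product's implementation, only an illustration.
function runPerRow(rows, script) {
  return rows.map((row) => {
    // Read-only view of the row: get works, set throws.
    const input = {
      get: (field) => row[field],
      set: () => {
        throw new Error('input is read-only');
      },
    };
    const output = new Map();
    script(input, output);
    return Object.fromEntries(output);
  });
}

// The "script": runs once for every row and writes one output field.
const script = (input, output) => {
  output.set('NameLength', input.get('name').length);
};

const results = runPerRow([{ name: 'Ada' }, { name: 'Grace' }], script);
console.log(results); // [ { NameLength: 3 }, { NameLength: 5 } ]
```

Each row gets a fresh `output`, so values written for one row do not carry over to the next in this sketch.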
Example: Creating a user-defined isPrime() function to find prime numbers from an existing column.
function isPrime(num) {
    // Handle numbers less than 2
    if (num <= 1) {
        return false;
    }
    // Check divisibility from 2 to the square root of num
    for (let i = 2; i <= Math.sqrt(num); i++) {
        if (num % i === 0) {
            return false; // num is divisible by i, so it's not prime
        }
    }
    return true; // num is prime
}
output.set('IsPrime', isPrime(Number(input.get('CustomerId'))));
Example: Handling errors with a try...catch block and writing a fallback value.
try {
    // script code
} catch (error) {
    output.set('field', -1);
}
AI Assist: An NLP-based tool that helps you get started with AI capabilities in the Custom Coding step. You enter instructions in natural language, and AI Assist converts them into JavaScript code snippets. To use AI Assist:
- In the dialog box, enter instructions in natural language for the AI to generate the required JavaScript code.
- Click Generate to create the code snippet.
- You can choose to:
  - Click Apply to insert the generated code directly into the pipeline. Note: If you click Apply and the code editor already contains code, a confirmation prompt appears. You must confirm before the existing code is replaced with the AI-generated code.
  - Copy the code for later use.
Additional Features
- Data Preview: Use the data preview feature to validate that your script works as intended.
- Issues Tab: Syntax errors are highlighted in real time within the Monaco editor. If a script has errors but you still want to preview it, navigate to the Issues Tab to review and resolve the errors.
Best Practices
- Always test the script using the data preview functionality before finalizing.
- Ensure referenced fields are correctly declared to avoid validation errors.
- Use meaningful and consistent naming conventions for steps, input fields, and output fields.
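As one way to combine the try...catch fallback shown earlier with the advice to test scripts before finalizing, the sketch below exercises a script against a few sample values outside the pipeline. The `processRow` function, the `amount_text` and `amount` field names, and the `Map`-based stand-ins for `input` and `output` are all assumptions made for illustration.

```javascript
// Script logic wrapped in a function so it can be run against sample rows.
// input/output are Map stand-ins for the pipeline objects (assumption).
function processRow(input, output) {
  try {
    const raw = input.get('amount_text');
    // raw.trim() throws a TypeError when raw is null or undefined.
    const amount = Number.parseFloat(raw.trim());
    if (Number.isNaN(amount)) {
      throw new Error(`Not a number: ${raw}`);
    }
    output.set('amount', amount);
  } catch (error) {
    // Sentinel fallback value, mirroring the -1 used in the earlier example.
    output.set('amount', -1);
  }
}

// Sample values: one valid, one unparseable, one missing.
const samples = [' 12.5 ', 'abc', null];
const amounts = samples.map((raw) => {
  const output = new Map();
  processRow(new Map([['amount_text', raw]]), output);
  return output.get('amount');
});

console.log(amounts); // [ 12.5, -1, -1 ]
```

A sentinel such as -1 only works when it cannot be mistaken for a legitimate value; otherwise a separate flag field may be a clearer choice.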