Iterator Operations
Iterator operations in Raikoo allow you to process collections of data by applying the same operation(s) repeatedly to each item. This guide explains the different types of iterator operations and how to use them effectively in your workflows.
Overview of Iterator Operations
Iterator operations are a powerful way to process multiple items with the same logic. They allow you to:
- Process files in a directory
- Iterate through rows in structured data
- Execute operations a specified number of times
- Apply consistent processing to varying inputs
All iterator operations act as "wrapper operations": they contain child operations that run once for each iteration. This creates a nested workflow structure that keeps iterations isolated from one another while allowing either parallel or sequential processing.
For Each Loop
The "For Each Loop" operation iterates over a collection of files in the workspace, executing specified operations for each file.
Configuration
- Request: Specify the file(s) in the workspace you wish to loop over
- Parallel (optional): Control execution mode
  - If unset, it will inherit the execution mode of the parent workflow
  - If set to "true", items will be processed in parallel
  - If set to any value other than "true", items will be processed sequentially
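The three Parallel rules above can be summarized as a small decision function. This is only an illustrative sketch, not Raikoo's implementation: the function name `resolve_parallel` is hypothetical, `None` is assumed to model an unset value, and the comparison with "true" is assumed to be exact and case-sensitive.

```python
from typing import Optional


def resolve_parallel(setting: Optional[str], parent_parallel: bool) -> bool:
    """Return True when iterations should run in parallel.

    Hypothetical helper: `setting` models the Parallel field, and
    `parent_parallel` models the execution mode of the parent workflow.
    """
    if setting is None:
        # Unset: inherit the parent's execution mode.
        return parent_parallel
    # Only the exact string "true" enables parallel processing
    # (case sensitivity is an assumption); anything else is sequential.
    return setting == "true"


print(resolve_parallel(None, True))
print(resolve_parallel("true", False))
print(resolve_parallel("no", True))
```

Note that under these rules a value such as "no" or "false" forces sequential processing even when the parent runs in parallel.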
Parameters Provided to Child Operations
For Each Loop automatically provides several parameters to the child operations during execution:
| Parameter | Description | Example |
|---|---|---|
| BasePath | Path of the file's parent directory | /animals/mammals |
| Extension | File extension, including the leading dot | .csv |
| FileName | Name of the file with extension | customers.csv |
| FileNameNoExtension | Name of the file with no extension | customers |
| FilePath | Full path of the current file | /animals/mammals/dog.txt |
| FilePathNoExtension | File path including file name with no extension | /data/customers |
| Content | The current file's content as a string | "Woof woof \n Bark bark!" |
| Count | Total number of iterations | 100 |
| Current | 1-based index of the current iteration | 1 (first iteration) |
Example Usage
In your operations within the For Each Loop, you can reference these parameters as:
{{WrapperOperation.BasePath}}
{{WrapperOperation.Content}}
{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.Extension}}
{{WrapperOperation.FileName}}
{{WrapperOperation.FileNameNoExtension}}
{{WrapperOperation.FilePath}}
{{WrapperOperation.FilePathNoExtension}}
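To see how the path-related parameters relate to one another, the sketch below derives them from a single file path using Python's standard `posixpath` module. It is an illustration under assumptions, not Raikoo's code: the function name `file_parameters` is hypothetical, and workspace paths are assumed to use forward slashes.

```python
import posixpath


def file_parameters(file_path: str) -> dict:
    """Derive the path-related loop parameters from one file path.

    Hypothetical helper for illustration only.
    """
    base_path, file_name = posixpath.split(file_path)
    stem, extension = posixpath.splitext(file_name)
    return {
        "BasePath": base_path,
        "Extension": extension,  # includes the leading dot
        "FileName": file_name,
        "FileNameNoExtension": stem,
        "FilePath": file_path,
        "FilePathNoExtension": posixpath.join(base_path, stem),
    }


for name, value in file_parameters("/data/customers.csv").items():
    print(f"{name}: {value}")
```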
Common Use Cases
- Processing multiple files with the same operation
- Converting a batch of files from one format to another
- Applying the same AI operation to multiple texts
- Generating multiple documents from template files
For Loop
The "For Loop" operation iterates over a numerical range, executing specified operations for each iteration.
Configuration
- Start: The starting value for the loop
- Stop: The ending value for the loop
- Step (optional): The increment value between iterations (defaults to 1)
- Parallel (optional): Control execution mode
  - If unset, it will inherit the execution mode of the parent
  - If set to "true", iterations will be processed in parallel
  - If set to any value other than "true", iterations will be processed sequentially
Parameters Provided to Child Operations
For Loop provides the following parameters to child operations:
| Parameter | Description | Example |
|---|---|---|
| Current | The number of the current loop iteration | 3 (third iteration) |
| Start | The start iteration | 1 (first iteration) |
| Step | The iteration step | 1 (step of one) |
| Stop | The stop iteration | 10 (last iteration) |
Example Usage
In your operations within the For Loop, you can reference these parameters as:
{{WrapperOperation.Current}}
{{WrapperOperation.Start}}
{{WrapperOperation.Step}}
{{WrapperOperation.Stop}}
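The sequence of Current values can be pictured with a short sketch. Two points here are assumptions rather than documented behavior: Stop is treated as inclusive (the table's example describes 10 as the last iteration), and only positive Step values are handled. The function name `for_loop_values` is hypothetical.

```python
from typing import List


def for_loop_values(start: int, stop: int, step: int = 1) -> List[int]:
    """List the values Current would take, assuming Stop is inclusive."""
    if step <= 0:
        # Negative or zero steps are outside the scope of this sketch.
        raise ValueError("step must be a positive integer")
    return list(range(start, stop + 1, step))


print(for_loop_values(1, 10))
print(for_loop_values(1, 10, 3))
```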
Common Use Cases
- Creating a specific number of outputs
- Generating sequences (e.g., numbering, dates)
- Testing with different parameter values
- Building iterative processes with numerical control
Data Loop
The "Data Loop" operation iterates over structured data in various formats, such as CSV, JSON, or YAML, executing specified operations for each data row or item.
Configuration
- Request: The file in the workspace containing structured data
- DataType (optional): Specifies how to parse and iterate over the data:
  - csv_array: For CSV files without a header row; each row is passed as a JSON-stringified array
  - csv_object: For CSV files with a header row; each row is passed as an array of JSON objects with key and value fields
  - json: For JSON files:
    - Arrays: iterates over each element
    - Objects: iterates over each key-value pair
  - yaml: For YAML files; follows the same rules as json
  - newlines: For plain text files; iterates over each line of text
- Parallel (optional): Control execution mode (same behavior as other iterator operations)
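The DataType rules can be sketched as a function that turns file text into the list of per-iteration items. This is a rough model built from the descriptions above, not Raikoo's parser: the name `data_loop_items` is hypothetical, the key/value shape used for JSON objects is an assumption, plain lines are returned unmodified for newlines (also an assumption), and yaml is omitted because it would need a third-party parser.

```python
import csv
import io
import json
from typing import List


def data_loop_items(text: str, data_type: str) -> List[str]:
    """Return one string per iteration for the given DataType (sketch)."""
    if data_type == "csv_array":
        # No header row: each row becomes a JSON-stringified array.
        return [json.dumps(row) for row in csv.reader(io.StringIO(text))]
    if data_type == "csv_object":
        # Header row: each row becomes an array of {key, value} objects.
        rows = csv.DictReader(io.StringIO(text))
        return [
            json.dumps([{"key": k, "value": v} for k, v in row.items()])
            for row in rows
        ]
    if data_type == "json":
        data = json.loads(text)
        if isinstance(data, list):
            return [json.dumps(element) for element in data]
        if isinstance(data, dict):
            # Assumed shape for key-value pairs; the guide does not specify it.
            return [json.dumps({"key": k, "value": v}) for k, v in data.items()]
        raise ValueError("JSON input must be an array or an object")
    if data_type == "newlines":
        return text.splitlines()
    raise ValueError(f"unsupported DataType in this sketch: {data_type}")


for item in data_loop_items("name,occupation\nFred,Singer\n", "csv_object"):
    print(item)
```

The number of items returned corresponds to Count, and each item's 1-based position corresponds to Current.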
Parameters Provided to Child Operations
Data Loop provides the following parameters to child operations:
| Parameter | Description | Example |
|---|---|---|
| BasePath | Path of the file's parent directory | /data |
| Content | The current data row/item as a JSON string | "{ \"name\": \"Fred\", \"occupation\": \"Singer\" }" |
| Count | Total number of iterations | 100 |
| Current | 1-based index of the current iteration | 1 (first iteration) |
| Extension | File extension, including the leading dot | .csv |
| FileName | Name of the file with extension | customers.csv |
| FileNameNoExtension | Name of the file with no extension | customers |
| FilePath | Full path of the data file | /data/customers.csv |
| FilePathNoExtension | File path including file name with no extension | /data/customers |
Example Usage
In your operations within the Data Loop, you can reference these parameters as:
{{WrapperOperation.Content}}
{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.BasePath}}
{{WrapperOperation.Extension}}
{{WrapperOperation.FileName}}
{{WrapperOperation.FileNameNoExtension}}
{{WrapperOperation.FilePath}}
{{WrapperOperation.FilePathNoExtension}}
For the Content parameter, you would typically parse the JSON string to work with the structured data:
{{#with (parseJson WrapperOperation.Content) as |data|}}
{{data.name}} - {{data.occupation}}
{{/with}}
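For readers who think in code rather than templates, the same parse-then-read step looks roughly like this in Python. It only mirrors the template above for illustration; the function name `render_row` is hypothetical and the input is the example Content value from the parameter table.

```python
import json


def render_row(content: str) -> str:
    """Parse a Content JSON string and format two of its fields."""
    data = json.loads(content)
    return f"{data['name']} - {data['occupation']}"


print(render_row('{ "name": "Fred", "occupation": "Singer" }'))
```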
Common Use Cases
- Processing customer records or user data
- Generating personalized content for multiple recipients
- Analyzing multiple data points
- Converting data from one format to another
Folder Loop
The "Folder Loop" operation iterates over files and folders in a specified directory, executing operations for each item found.
Configuration
- Request: The path to the directory you wish to iterate over
- Parallel (optional): Control execution mode (same behavior as other iterator operations)
Parameters Provided to Child Operations
Folder Loop provides the following parameters to child operations:
| Parameter | Description | Example |
|---|---|---|
| Count | Total number of iterations | 25 |
| Current | 1-based index of the current iteration | 1 (first iteration) |
| FolderPath | The path to the current folder being looped over | /path/to/folder |
Example Usage
In your operations within the Folder Loop, you can reference these parameters as:
{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.FolderPath}}
Common Use Cases
- Processing all files in a directory structure
- Batch converting files in a folder
- Analyzing directory contents
- Creating a directory index or catalog
Best Practices for Iterator Operations
Performance Considerations
- Choose Parallel Execution Carefully: Parallel execution improves performance for independent operations, but use sequential execution when operations depend on each other's results or when maintaining a specific order is important.
- Manage Resource Usage: For large datasets, consider limiting the number of parallel operations to avoid overloading resources.
- Watch Memory Usage: When processing large files, monitor memory consumption, especially when loading file contents.
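The idea of capping concurrent work can be illustrated outside Raikoo with Python's standard thread pool. This is a generic sketch, not a description of how Raikoo schedules iterations: `process_all`, `worker`, and `max_parallel` are hypothetical names, and `ThreadPoolExecutor` stands in for whatever mechanism the platform uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def process_all(items: Iterable, worker: Callable, max_parallel: int = 4) -> List:
    """Run `worker` on every item with at most `max_parallel` at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in input order, even though the
        # individual calls may finish in a different order.
        return list(pool.map(worker, items))


print(process_all([1, 2, 3], lambda n: n * 2, max_parallel=2))
```

A low cap trades throughput for predictable resource usage, which matters most when each iteration loads a large file into memory.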
Error Handling
- Add Error Handling: Include error handling operations within your loops to manage issues with individual items without failing the entire workflow.
- Validate Inputs: Verify that input data matches the expected format before processing.
- Consider Partial Success: Design workflows that can still provide partial results if some iterations fail.
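The partial-success pattern described above can be sketched as a loop that records failures per item instead of stopping at the first one. The names `run_iterations` and `worker` are hypothetical, and the sketch models the concept only; it does not show Raikoo's error-handling operations.

```python
from typing import Callable, List, Sequence, Tuple


def run_iterations(
    items: Sequence, worker: Callable
) -> Tuple[List[tuple], List[tuple]]:
    """Process every item, collecting successes and failures separately."""
    results, failures = [], []
    for current, item in enumerate(items, start=1):  # 1-based, like Current
        try:
            results.append((current, worker(item)))
        except Exception as exc:
            # Record the failure and continue with the remaining items.
            failures.append((current, str(exc)))
    return results, failures


done, failed = run_iterations(["1", "x", "3"], int)
print(done)
print([index for index, _ in failed])
```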
Organization
- Name Operations Clearly: Use descriptive names for operations within loops to distinguish between different iterations in logs and monitoring.
- Manage Output Paths: Create a clear naming convention for outputs to avoid collisions between iterations.
- Document Iteration Logic: Add comments explaining the purpose and expected behavior of iterator operations.
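One possible naming convention for collision-free outputs is to embed the iteration index, zero-padded to the width of the total count, in each file name. The sketch below shows the arithmetic; `output_path` and its parameters are hypothetical, and the `current` and `count` arguments are assumed to come from the Current and Count loop parameters.

```python
def output_path(
    base_dir: str, name: str, current: int, count: int, extension: str = ".txt"
) -> str:
    """Build a unique, sortable output path for one iteration."""
    width = len(str(count))  # e.g. a count of 100 gives 3 digits
    return f"{base_dir}/{name}-{current:0{width}d}{extension}"


print(output_path("/out", "customers", 7, 100))
```

Zero-padding keeps alphabetical order identical to iteration order, which helps when outputs are later listed or merged.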
Advanced Techniques
Nested Iteration
Iterator operations can be nested to handle multi-dimensional data or complex processing requirements:
- Add an outer iterator operation (e.g., Folder Loop)
- Add an inner iterator operation (e.g., Data Loop) as a child operation
- Configure nested dependencies as needed
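Conceptually, a nested iterator behaves like one loop inside another: the inner loop runs to completion for every item of the outer loop. The sketch below models that with plain Python data; `nested_iterations` is a hypothetical name, and the dictionary of folder paths to rows stands in for a Folder Loop wrapping a Data Loop.

```python
from typing import Dict, List, Tuple


def nested_iterations(folders: Dict[str, List[str]]) -> List[Tuple[str, int, str]]:
    """Pair every row with its folder and its 1-based inner index."""
    visited = []
    for folder_path, rows in folders.items():  # outer loop
        for current, row in enumerate(rows, start=1):  # inner loop restarts at 1
            visited.append((folder_path, current, row))
    return visited


for entry in nested_iterations({"/a": ["r1", "r2"], "/b": ["r3"]}):
    print(entry)
```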
Conditional Iteration
Combine iterator operations with conditional operations to filter items:
- Add an iterator operation
- Include a condition operation as the first child
- Use the condition's result to determine whether to proceed with other operations
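The condition-first pattern amounts to a guard at the top of each iteration. In the sketch below, `run_with_condition`, `condition`, and `worker` are hypothetical names used to model the idea; how a Raikoo condition operation exposes its result is not covered here.

```python
from typing import Callable, List, Sequence


def run_with_condition(
    items: Sequence, condition: Callable, worker: Callable
) -> List:
    """Apply `worker` only to items for which `condition` is true."""
    results = []
    for item in items:
        if not condition(item):
            continue  # skip the remaining operations for this item
        results.append(worker(item))
    return results


print(run_with_condition(["a.csv", "b.txt"], lambda f: f.endswith(".csv"), str.upper))
```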
Aggregating Results
To combine results from multiple iterations:
- Have each iteration write to a distinct output file
- After the iterator operation completes, use a "Merge Workspace Items" operation to combine results
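The write-then-merge pattern can be modeled as combining per-iteration outputs in a deterministic order. This sketch does not reproduce the "Merge Workspace Items" operation, whose exact behavior is not described here; `merge_outputs` is a hypothetical function, and the dictionary maps assumed output paths to their contents.

```python
from typing import Dict


def merge_outputs(outputs: Dict[str, str], separator: str = "\n") -> str:
    """Join per-iteration outputs, ordered by their output path."""
    return separator.join(outputs[path] for path in sorted(outputs))


parts = {
    "/out/part-002.txt": "second",
    "/out/part-001.txt": "first",
}
print(merge_outputs(parts))
```

Sorting by path makes the merged result independent of the order in which parallel iterations happened to finish.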
Conclusion
Iterator operations are powerful tools for processing collections of data or repetitive tasks in Raikoo workflows. By understanding the different types and configuring them appropriately, you can create efficient, scalable workflows that handle multiple items with consistent logic.