Iterator Operations

Iterator operations in Raikoo allow you to process collections of data by applying the same operation(s) repeatedly to each item. This guide explains the different types of iterator operations and how to use them effectively in your workflows.

Overview of Iterator Operations

Iterator operations are a powerful way to process multiple items with the same logic. They allow you to:

Process files in a directory
Iterate through rows in structured data
Execute operations a specified number of times
Apply consistent processing to varying inputs

All iterator operations act as "wrapper operations": they contain child operations that run once for each iteration. This creates a nested workflow structure that keeps iterations isolated from one another while allowing them to run in parallel or sequentially.

For Each Loop

The "For Each Loop" operation iterates over a collection of files in the workspace, executing specified operations for each file.

Configuration

Request: Specify the file(s) in the workspace you wish to loop over
Parallel (optional): Control execution mode
- If unset, it will inherit the execution mode of the parent workflow
- If set to "true", items will be processed in parallel
- If set to any value other than "true", items will be processed sequentially
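The three rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not Raikoo's implementation: the function name `resolve_parallel` is hypothetical, "unset" is represented as `None`, and the comparison with "true" is assumed to be exact and case-sensitive.

```python
from typing import Optional


def resolve_parallel(setting: Optional[str], parent_parallel: bool) -> bool:
    """Return True when iterations should run in parallel.

    Hypothetical helper: models the documented rules, not Raikoo's own code.
    """
    if setting is None:
        # Unset: inherit the execution mode of the parent workflow.
        return parent_parallel
    # Assumption: only the exact string "true" enables parallel processing;
    # every other value means sequential processing.
    return setting == "true"


print(resolve_parallel(None, parent_parallel=True))
print(resolve_parallel("true", parent_parallel=False))
print(resolve_parallel("false", parent_parallel=True))
```

Under these rules a value such as "yes" or "1" would select sequential execution, so it is safest to set the value to "true" explicitly when parallel processing is wanted.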

Parameters Provided to Child Operations

For Each Loop automatically provides several parameters to the child operations during execution:

Parameter	Description	Example
`BasePath`	Path of the file's parent directory	`/animals/mammals`
`Extension`	File extension, including the leading dot	`.csv`
`FileName`	Name of the file with extension	`customers.csv`
`FileNameNoExtension`	Name of the file with no extension	`customers`
`FilePath`	Full path of the current file	`/animals/mammals/dog.txt`
`FilePathNoExtension`	File path including file name with no extension	`/data/customers`
`Content`	The current file's content as a string	`"Woof woof \n Bark bark!"`
`Count`	Total number of iterations	`100`
`Current`	1-based index of the current iteration	`1` (first iteration)
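To see how the path-related parameters relate to one another, the following Python sketch derives them from a single file path using the standard library. It is an illustration under assumptions: `file_parameters` is a hypothetical helper, and Raikoo may treat edge cases (for example, files with several dots in the name) differently.

```python
from pathlib import PurePosixPath


def file_parameters(file_path: str) -> dict:
    """Derive the path-related loop parameters from one workspace path.

    Hypothetical helper for illustration only.
    """
    path = PurePosixPath(file_path)
    return {
        "BasePath": str(path.parent),            # parent directory
        "Extension": path.suffix,                # extension with leading dot
        "FileName": path.name,                   # name with extension
        "FileNameNoExtension": path.stem,        # name without extension
        "FilePath": str(path),                   # full path
        "FilePathNoExtension": str(path.with_suffix("")),
    }


params = file_parameters("/data/customers.csv")
for name, value in params.items():
    print(f"{name}: {value}")
```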

Example Usage

In your operations within the For Each Loop, you can reference these parameters as:

{{WrapperOperation.BasePath}}
{{WrapperOperation.Content}}
{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.Extension}}
{{WrapperOperation.FileName}}
{{WrapperOperation.FileNameNoExtension}}
{{WrapperOperation.FilePath}}
{{WrapperOperation.FilePathNoExtension}}

Common Use Cases

Processing multiple files with the same operation
Converting a batch of files from one format to another
Applying the same AI operation to multiple texts
Generating multiple documents from template files

For Loop

The "For Loop" operation iterates over a numerical range, executing specified operations for each iteration.

Configuration

Start: The starting value for the loop
Stop: The ending value for the loop
Step (optional): The increment value between iterations (defaults to 1)
Parallel (optional): Control execution mode
- If unset, it will inherit the execution mode of the parent
- If set to "true", iterations will be processed in parallel
- If set to any value other than "true", iterations will be processed sequentially

Parameters Provided to Child Operations

For Loop provides the following parameters to child operations:

Parameter	Description	Example
`Current`	The number of the current loop iteration	`3` (third iteration)
`Start`	The configured starting value of the loop	`1`
`Step`	The configured increment between iterations	`1`
`Stop`	The configured ending value of the loop	`10`
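As a rough mental model, the values a For Loop walks through can be sketched in Python. The sketch assumes that Stop is inclusive, which the guide does not state explicitly, and it only handles positive steps. `for_loop_values` is a hypothetical name.

```python
def for_loop_values(start: int, stop: int, step: int = 1) -> list:
    """List the values of Current for one loop (hypothetical helper).

    Assumption: the Stop value itself is included in the iteration.
    """
    if step <= 0:
        raise ValueError("this sketch only supports a positive step")
    # range() excludes its upper bound, so add 1 to make Stop inclusive.
    return list(range(start, stop + 1, step))


print(for_loop_values(1, 10))
print(for_loop_values(1, 10, step=3))
```

If Stop turns out to be exclusive in your environment, the last value is dropped; a quick test with a small range confirms which behavior applies.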

Example Usage

In your operations within the For Loop, you can reference these parameters as:

{{WrapperOperation.Current}}
{{WrapperOperation.Start}}
{{WrapperOperation.Step}}
{{WrapperOperation.Stop}}

Common Use Cases

Creating a specific number of outputs
Generating sequences (e.g., numbering, dates)
Testing with different parameter values
Building iterative processes with numerical control

Data Loop

The "Data Loop" operation iterates over structured data in various formats, such as CSV, JSON, or YAML, executing specified operations for each data row or item.

Configuration

Request: The file in the workspace containing structured data
DataType (optional): Specifies how to parse and iterate over the data:
- csv_array: For CSV files without a header row, each row is passed as a JSON-stringified array
- csv_object: For CSV files with a header row, each row is passed as an array of JSON objects with key and value fields
- json: For JSON files:
  - Arrays: iterates over each element
  - Objects: iterates over each key-value pair
- yaml: For YAML files, follows the same rules as json
- newlines: For plain text files, iterates over each line of text
Parallel (optional): Control execution mode (same behavior as other iterator operations)
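The DataType rules above can be approximated with Python's standard library. Treat this as a sketch under assumptions rather than a description of Raikoo's parser: `iterate_items` is a hypothetical function, the `csv_object` branch follows a literal reading of "an array of JSON objects with key and value fields", the shape used for JSON object entries is a guess, and `yaml` is omitted because it needs a third-party parser.

```python
import csv
import io
import json


def iterate_items(text: str, data_type: str) -> list:
    """Return the item handed to each iteration (hypothetical helper)."""
    if data_type == "csv_array":
        # No header row: every row becomes a JSON-stringified array.
        return [json.dumps(row) for row in csv.reader(io.StringIO(text))]
    if data_type == "csv_object":
        # Header row present. Assumption: each row becomes a list of
        # {"key": ..., "value": ...} objects, one per column.
        reader = csv.DictReader(io.StringIO(text))
        return [
            json.dumps([{"key": k, "value": v} for k, v in row.items()])
            for row in reader
        ]
    if data_type == "json":
        data = json.loads(text)
        if isinstance(data, list):
            return [json.dumps(element) for element in data]
        if isinstance(data, dict):
            # Assumption: each key-value pair is passed as one object.
            return [json.dumps({"key": k, "value": v}) for k, v in data.items()]
        raise ValueError("expected a JSON array or object")
    if data_type == "newlines":
        # Assumption: each line is passed through as plain text.
        return text.splitlines()
    raise ValueError(f"unsupported DataType in this sketch: {data_type}")


for item in iterate_items("name,occupation\nFred,Singer\n", "csv_object"):
    print(item)
```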

Parameters Provided to Child Operations

Data Loop provides the following parameters to child operations:

Parameter	Description	Example
`BasePath`	Path of the file's parent directory	`/data`
`Content`	The current data row/item as a JSON string	`"{ \"name\": \"Fred\", \"occupation\": \"Singer\" }"`
`Count`	Total number of iterations	`100`
`Current`	1-based index of the current iteration	`1` (first iteration)
`Extension`	File extension, including the leading dot	`.csv`
`FileName`	Name of the file with extension	`customers.csv`
`FileNameNoExtension`	Name of the file with no extension	`customers`
`FilePath`	Full path of the data file	`/data/customers.csv`
`FilePathNoExtension`	File path including file name with no extension	`/data/customers`

Example Usage

In your operations within the Data Loop, you can reference these parameters as:

{{WrapperOperation.Content}}
{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.BasePath}}
{{WrapperOperation.Extension}}
{{WrapperOperation.FileName}}
{{WrapperOperation.FileNameNoExtension}}
{{WrapperOperation.FilePath}}
{{WrapperOperation.FilePathNoExtension}}

For the Content parameter, you would typically parse the JSON string to work with the structured data:

{{#with (parseJson WrapperOperation.Content) as |data|}}
  {{data.name}} - {{data.occupation}}
{{/with}}
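For readers more familiar with general-purpose languages, the template above does roughly what this Python snippet does: it decodes the JSON string held in Content and reads two fields from the result. The sample value mirrors the Content example from the parameter table; the variable names are illustrative.

```python
import json

# Sample Content value, as a Data Loop might provide it for one row.
content = '{ "name": "Fred", "occupation": "Singer" }'

# Equivalent of (parseJson WrapperOperation.Content): decode the string.
data = json.loads(content)

# Equivalent of {{data.name}} - {{data.occupation}}.
line = f"{data['name']} - {data['occupation']}"
print(line)
```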

Common Use Cases

Processing customer records or user data
Generating personalized content for multiple recipients
Analyzing multiple data points
Converting data from one format to another

Folder Loop

The "Folder Loop" operation iterates over files and folders in a specified directory, executing operations for each item found.

Configuration

Request: The path to the directory you wish to iterate over
Parallel (optional): Control execution mode (same behavior as other iterator operations)

Parameters Provided to Child Operations

Folder Loop provides the following parameters to child operations:

Parameter	Description	Example
`Count`	Total number of iterations	`25`
`Current`	1-based index of the current iteration	`1` (first iteration)
`FolderPath`	The path to the current folder being looped over	`/path/to/folder`

Example Usage

In your operations within the Folder Loop, you can reference these parameters as:

{{WrapperOperation.Count}}
{{WrapperOperation.Current}}
{{WrapperOperation.FolderPath}}

Common Use Cases

Processing all files in a directory structure
Batch converting files in a folder
Analyzing directory contents
Creating a directory index or catalog

Best Practices for Iterator Operations

Performance Considerations

Choose Parallel Execution Carefully: Parallel execution improves performance for independent operations, but use sequential execution when operations depend on each other's results or when maintaining a specific order is important.
Manage Resource Usage: For large datasets, consider limiting the number of parallel operations to avoid overloading resources.
Watch Memory Usage: When processing large files, monitor memory consumption, especially when loading file contents.
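The idea of capping concurrency can be illustrated outside Raikoo with a small Python sketch. It shows the general technique (a worker pool with a fixed size), not a Raikoo setting; `process_item` and `run_bounded` are hypothetical names, and the uppercase transformation is a stand-in for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item: str) -> str:
    """Stand-in for the work done in one iteration."""
    return item.upper()


def run_bounded(items, max_workers: int = 4) -> list:
    """Process items in parallel with at most max_workers running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even when work overlaps.
        return list(pool.map(process_item, items))


print(run_bounded(["a", "b", "c"], max_workers=2))
```

A small pool trades some throughput for predictable resource usage, which matters most when each iteration loads a large file into memory.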

Error Handling

Add Error Handling: Include error handling operations within your loops to manage issues with individual items without failing the entire workflow.
Validate Inputs: Verify that input data matches the expected format before processing.
Consider Partial Success: Design workflows that can still provide partial results if some iterations fail.
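The partial-success pattern described above can be sketched in Python as a loop that records failures per item instead of stopping at the first one. This is a conceptual illustration, not Raikoo's error-handling mechanism; `run_with_partial_success` is a hypothetical name, and indexes are 1-based to match the Current parameter.

```python
def run_with_partial_success(items, operation):
    """Apply operation to every item, collecting results and failures."""
    results, failures = {}, {}
    for index, item in enumerate(items, start=1):
        try:
            results[index] = operation(item)
        except Exception as error:
            # Record the failure and continue with the next item.
            failures[index] = str(error)
    return results, failures


results, failures = run_with_partial_success(["1", "x", "3"], int)
print("succeeded:", results)
print("failed iterations:", sorted(failures))
```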

Organization

Name Operations Clearly: Use descriptive names for operations within loops to distinguish between different iterations in logs and monitoring.
Manage Output Paths: Create a clear naming convention for outputs to avoid collisions between iterations.
Document Iteration Logic: Add comments explaining the purpose and expected behavior of iterator operations.
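One possible naming convention for collision-free outputs combines the iteration index with the source file name. The Python sketch below shows the resulting paths; the `output_path` helper, the `/out` directory, and the four-digit zero padding are all illustrative choices rather than Raikoo conventions.

```python
def output_path(output_dir: str, file_name_no_extension: str,
                current: int, extension: str) -> str:
    """Build a distinct output path for one iteration (hypothetical helper).

    Zero-padding the index keeps the files in iteration order when sorted.
    """
    return f"{output_dir}/{current:04d}-{file_name_no_extension}{extension}"


print(output_path("/out", "customers", 1, ".json"))
print(output_path("/out", "customers", 12, ".json"))
```

In a workflow, the same shape can be built from the loop parameters, for example by combining `{{WrapperOperation.Current}}` with `{{WrapperOperation.FileNameNoExtension}}`.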

Advanced Techniques

Nested Iteration

Iterator operations can be nested to handle multi-dimensional data or complex processing requirements:

Add an outer iterator operation (e.g., Folder Loop)
Add an inner iterator operation (e.g., Data Loop) as a child operation
Configure nested dependencies as needed

Conditional Iteration

Combine iterator operations with conditional operations to filter items:

Add an iterator operation
Include a condition operation as the first child
Use the condition's result to determine whether to proceed with other operations
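Conceptually, the steps above amount to a guard at the top of the loop body. The Python sketch below illustrates that control flow only; `should_process` and `run_loop` are hypothetical, and the ".csv" check and uppercase step are placeholders for a real condition and real child operations.

```python
def should_process(file_name: str) -> bool:
    """Condition evaluated first in every iteration (placeholder rule)."""
    return file_name.endswith(".csv")


def run_loop(file_names) -> list:
    """Run the remaining child operations only for items that pass."""
    processed = []
    for name in file_names:
        if not should_process(name):
            continue  # skip the rest of this iteration
        processed.append(name.upper())  # stand-in for the other operations
    return processed


print(run_loop(["a.csv", "b.txt", "c.csv"]))
```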

Aggregating Results

To combine results from multiple iterations:

Have each iteration write to a distinct output file
After the iterator operation completes, use a "Merge Workspace Items" operation to combine results
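The two steps above follow a write-then-merge pattern, sketched here in Python. The sketch does not describe how "Merge Workspace Items" works internally; `merge_outputs` is a hypothetical function, and joining the contents in sorted path order with newlines is an assumption made for the example.

```python
def merge_outputs(outputs: dict) -> str:
    """Combine per-iteration outputs, keyed by path, into one document.

    Sorting by path gives a stable order even if iterations ran in parallel.
    """
    return "\n".join(outputs[path] for path in sorted(outputs))


# Each iteration wrote to its own file; completion order may vary.
outputs = {
    "/out/0002-result.txt": "second",
    "/out/0001-result.txt": "first",
}
merged = merge_outputs(outputs)
print(merged)
```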

Conclusion

Iterator operations are powerful tools for processing collections of data or repetitive tasks in Raikoo workflows. By understanding the different types and configuring them appropriately, you can create efficient, scalable workflows that handle multiple items with consistent logic.