If the steps of file processing are common across all file types and are well defined, the best practice is to have one template (dynamic) flow process the files.
For example, suppose you have a Claims file in CSV format that needs to be processed into a tab-delimited file (the common format). You could build a unique flow to handle just this file, but the more efficient way is to create a dynamic flow that processes not only this file type but any other type that follows the same common steps; the conversion itself is sketched below.
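Purely to make the Claims example concrete, here is a minimal Java sketch of the CSV-to-tab conversion (naive comma splitting with no quoted-field handling, and the file names are illustrative); in the actual flow, the mapping activity performs this work:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvToTab {
    public static void main(String[] args) throws IOException {
        // Read the CSV claims file and write each record tab-delimited.
        try (BufferedReader in = Files.newBufferedReader(Paths.get("claims.csv"));
             BufferedWriter out = Files.newBufferedWriter(Paths.get("claims.tab"))) {
            String line;
            while ((line = in.readLine()) != null) {
                // Naive split: assumes no quoted fields containing commas.
                out.write(String.join("\t", line.split(",", -1)));
                out.newLine();
            }
        }
    }
}
```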
A unique flow would have activities like this:
- Start > Source > Source Schema > Mapping > Target Schema > Target > End
- A dynamic flow would look like this:
- Start > Source > Lookup > Assign activity IDs to next steps > Source Schema > Mapping > Target Schema > Target > End
- Individual activities in a unique flow can also be dynamically overridden, but for the purpose of this scenario we will assume that we are overriding everything after the Source activity.
- To build a dynamic flow, we need to look at the following factors:
- For each file type, we need to define its schema (file definition). For Claims, that means creating a text schema.
- For each file type, we need to define the mapping that produces the correct output file. For Claims, that means mapping it to the common format.
- The target schema and target location can also be unique, depending on the mapping used for that source file.
- As part of creating the above activities (source/target locations, source and target data schemas, mappings), a proper naming convention is important so that each name identifies the activity's purpose in the overall context of your business (for example, Claims_SourceSchema or Claims_To_Common_Mapping).
- We also need to design a flow that encapsulates the above activities as individual steps in a process design.
- To create the flow, open the Process Designer and first lay out the flow:
- Start > Source > Source Schema > Mapping > Target Schema > Target > End
- A File Event triggers the flow.
- Remember that a schema is the metadata, that is, the data structure of the file.
- As part of the process flow design, one of the steps we need to add is a lookup into a table based on a file-name pattern, which fetches the related activity IDs to be used in the process.
- The lookup table needs one column for each activity being overridden in the flow, such as FileName, SourceSchemaID, MappingID, TargetSchemaID, and TargetID. It can be created in any database, such as SQL Server. Make all columns varchar and non-nullable; a sketch follows.
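A minimal sketch of creating the table, assuming SQL Server reached over JDBC; the table name FlowLookup, the connection details, and the column sizes are illustrative assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class CreateLookupTable {
    public static void main(String[] args) throws SQLException {
        // One column per overridable activity, all varchar and non-nullable.
        String ddl = "CREATE TABLE FlowLookup ("
                   + "  FileName       VARCHAR(100) NOT NULL,"
                   + "  SourceSchemaID VARCHAR(50)  NOT NULL,"
                   + "  MappingID      VARCHAR(50)  NOT NULL,"
                   + "  TargetSchemaID VARCHAR(50)  NOT NULL,"
                   + "  TargetID       VARCHAR(50)  NOT NULL)";
        try (Connection con = DriverManager.getConnection(
                 "jdbc:sqlserver://localhost;databaseName=Flows", "user", "password");
             Statement st = con.createStatement()) {
            st.executeUpdate(ddl);
        }
    }
}
```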
- Now, one of the new steps in the above flow is added after the Source: it does a lookup into the table based on the file name (use 'like' in the where condition). This can be custom Java code (one of the links in the support forum has a code sample) or a mapping; I prefer the mapping activity. A Java sketch of the lookup follows.
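If the custom-code route is taken, the lookup might look like this sketch; FlowLookup and the column names come from the table above, and the query tests the incoming file name against the stored pattern with LIKE:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

public class ActivityLookup {
    // The FileName column holds a pattern such as 'Claims%'; the incoming
    // file name is tested against it with LIKE.
    public static Map<String, String> lookup(Connection con, String fileName)
            throws SQLException {
        String sql = "SELECT SourceSchemaID, MappingID, TargetSchemaID, TargetID "
                   + "FROM FlowLookup WHERE ? LIKE FileName";
        Map<String, String> ids = new HashMap<>();
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, fileName);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    ids.put("SourceSchemaID", rs.getString("SourceSchemaID"));
                    ids.put("MappingID", rs.getString("MappingID"));
                    ids.put("TargetSchemaID", rs.getString("TargetSchemaID"));
                    ids.put("TargetID", rs.getString("TargetID"));
                }
            }
        }
        return ids;
    }
}
```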
- The other activity after the lookup assigns the activity IDs to the related steps in the process. We use put-context-var to do that, and that is all that is needed to make the flow dynamic; a conceptual sketch follows.
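The sketch below is only a conceptual stand-in for put-context-var; the Map-based context and the variable names are assumptions for illustration, not the tool's actual API:

```java
import java.util.Map;

public class AssignActivityIds {
    // Copy the activity IDs fetched by the lookup into the process context.
    // The downstream proto (dummy) activities read their overrides from
    // these context variables; the key names here are assumptions.
    public static void assign(Map<String, String> ids, Map<String, String> context) {
        context.put("SourceSchemaID", ids.get("SourceSchemaID"));
        context.put("MappingID",      ids.get("MappingID"));
        context.put("TargetSchemaID", ids.get("TargetSchemaID"));
        context.put("TargetID",       ids.get("TargetID"));
    }
}
```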
- To manage the lookup data entry, we can use the Form Builder available in the tool to create a web front end that lets users enter values for the table columns above. We can add additional links for edit, delete, and so on.
- Now the Source activity would be "event enabled" (we can use proto-source, proto-schema, and proto-mapping for all the activities; these are dummy activities), meaning that the dummy values we have placed in the activity will be overwritten by the particular File Event we attach to the flow.
- A File Event can have properties such as "look for any file with the extension .txt, .tab, or .dat," and when a new file arrives (or an existing one is modified), it triggers the process. The sketch below illustrates the matching rule.
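As an illustration of what the File Event's matching rule does (the folder path and extension list are examples; a real File Event is configured in the tool rather than coded):

```java
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class FileEventSketch {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/data/incoming");
        // Match any file ending in .txt, .tab, or .dat.
        PathMatcher matcher =
            dir.getFileSystem().getPathMatcher("glob:*.{txt,tab,dat}");
        try (WatchService watch = dir.getFileSystem().newWatchService()) {
            dir.register(watch, StandardWatchEventKinds.ENTRY_CREATE,
                                StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watch.take();
                for (WatchEvent<?> ev : key.pollEvents()) {
                    Path name = (Path) ev.context();
                    if (matcher.matches(name)) {
                        // In the real tool this is where the flow is triggered.
                        System.out.println("Trigger flow for: " + name);
                    }
                }
                key.reset();
            }
        }
    }
}
```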
- The dynamic framework is a good fit where there is commonality in how files get processed; with one template flow you can process any file type that conforms to those common steps.
- Where a file requires more unique handling, such as multiple targets with unique rules, I recommend creating a unique flow for that file type to handle that scenario.