Hierarchical Stage in DataStage
The Hierarchical stage in DataStage is used to parse or compose XML (Extensible Markup Language) and JSON data. The stage was introduced in Version 11.3. When there are large volumes of data to process, the Hierarchical stage is preferred over the XML pack.
Usage and benefits of Hierarchical stage:
- XML data can be read in two ways:
1. Using the XML pack
2. Using the Hierarchical Data stage
- The XML pack includes the XML Input, XML Output and XML Transformer stages, which are suited to small transformations.
- The Hierarchical Data stage is suited to complex transformations on large amounts of data.
- For large datasets, the Hierarchical stage is generally the better choice.
- The Hierarchical Data stage is used to create, parse and transform XML or JSON data.
- The stage is available in the Real Time section of the DataStage palette.
- This article explains how XML data and JSON data are transformed with this stage.
- Opening the stage displays the home page of the Hierarchical stage.
- Click Edit assembly; the Assembly Editor opens with all available steps listed in the palette.
- The Input step and the Output step are present by default.
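To make the "compose" direction concrete, here is a minimal sketch in Python of what composing XML from flat rows means conceptually. It is not DataStage code: the `compose_orders` function, the `orders`/`order`/`item` element names and the sample rows are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def compose_orders(rows):
    """Build a hierarchical XML document from flat (order_id, item, qty) rows.

    Hypothetical helper: element and attribute names are illustrative only.
    """
    root = ET.Element("orders")
    orders = {}  # order_id -> <order> element, so items nest under their order
    for order_id, item, qty in rows:
        order = orders.get(order_id)
        if order is None:
            order = ET.SubElement(root, "order", id=str(order_id))
            orders[order_id] = order
        line = ET.SubElement(order, "item", qty=str(qty))
        line.text = item
    return ET.tostring(root, encoding="unicode")

# Flat input rows, as they might arrive on an input link (sample data).
rows = [(1, "pen", 2), (1, "ink", 1), (2, "pad", 5)]
print(compose_orders(rows))
```

The flat rows repeat the order id, while the composed document stores each order once with its items nested below it. That nesting is what distinguishes hierarchical data from the row-and-column data on an ordinary link.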
Example 1 – With Aggregate step
In this example, a job reads an XML file and applies an Aggregate step.
First, add the schema files to a library, either as described below or through Import Table Definitions.
- On the Libraries tab → click New Library → enter the library name, description and category.
- After creating the library, select it → click Import New Resource → add the schema files.
- The columns defined by the schema are now visible.
- Test the source file: open the Test Assembly tab → select the source file → click Run Test.
- If there are no errors, the test is reported as completed.
- Double-click the XML_Parser step in the palette to use it as the input; it appears after the Input step in the Assembly Outline.
- The XML Source tab offers three options: String set, Single file and File set.
- Select Single file and click Insert Parameter to supply the source file (chosen from the job parameters already defined).
- In Document Root, click Browse and select the document root for the XML file.
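As a rough illustration of what the parser step does with a single XML document, the following Python sketch turns repeating elements into flat rows. It uses the standard library rather than anything from DataStage; the `employees` document, its element names and the `parse_employees` function are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the real file and schema are not shown in the article.
SAMPLE_XML = """
<employees>
  <employee><name>Asha</name><dept>ETL</dept><salary>100</salary></employee>
  <employee><name>Ravi</name><dept>ETL</dept><salary>120</salary></employee>
  <employee><name>Mei</name><dept>BI</dept><salary>90</salary></employee>
</employees>
"""

def parse_employees(xml_text):
    """Return one flat row (dict) per repeating <employee> element."""
    root = ET.fromstring(xml_text.strip())  # the document root
    rows = []
    for emp in root.findall("employee"):
        rows.append({
            "name": emp.findtext("name"),
            "dept": emp.findtext("dept"),
            "salary": float(emp.findtext("salary")),
        })
    return rows

for row in parse_employees(SAMPLE_XML):
    print(row)
```

The document root plays the same role here as in the step: it tells the parser which element is the top of the tree, and the repeating child elements become the list that later steps work on.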
Double-click the Aggregate step in the palette. It contains:
- List to Aggregate: select the list (table) from the drop-down box.
- Scope: select the scope from the drop-down.
- Aggregation Items: select the column to aggregate and the function to apply, such as sum, avg, count, min, max or concat.
- Aggregation Keys: select the group-by column.
- The Output step contains the columns to be displayed.
- Add the columns manually, along with their data types.
- In Mappings → map the source columns to the given output columns.
- Click OK, then compile and run the job.
- After the run, review the log for the job.
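Outside DataStage, the grouping performed by the Aggregate step can be sketched in a few lines of Python. This is only an illustration under assumptions: the `aggregate` function, the `dept` and `salary` columns and the sample rows are hypothetical, and the comma used for concat is an arbitrary choice rather than the stage's documented separator.

```python
from collections import defaultdict

def aggregate(rows, key, column):
    """Group rows by `key` (the aggregation key) and summarise `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])

    result = {}
    for group_key, values in groups.items():
        result[group_key] = {
            "sum": sum(values),
            "avg": sum(values) / len(values),
            "count": len(values),
            "min": min(values),
            "max": max(values),
            # Separator is an assumption made for this sketch.
            "concat": ",".join(str(v) for v in values),
        }
    return result

# Sample list to aggregate (hypothetical data).
rows = [
    {"dept": "ETL", "salary": 100},
    {"dept": "ETL", "salary": 120},
    {"dept": "BI", "salary": 90},
]
summary = aggregate(rows, key="dept", column="salary")
print(summary["ETL"])
```

Here `rows` corresponds to List to Aggregate, `key` to Aggregation Keys, and `column` together with the function names to Aggregation Items.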
Example 2 – With Union step
- Add the input files.
- Double-click the Union step and map the columns.
Here the metadata from XML_Parser and XML_Parser_1 is mapped to the output metadata.
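Conceptually, the union of two parsed inputs into one output structure looks like the Python sketch below. It is an illustration only: the two sample documents, the `parse_items` and `union` functions and the `source` column are hypothetical, and the real step works on mapped schema structures rather than dictionaries.

```python
import xml.etree.ElementTree as ET

# Two hypothetical input documents, one per parser step.
XML_A = "<items><item id='1'>pen</item><item id='2'>ink</item></items>"
XML_B = "<items><item id='3'>pad</item></items>"

def parse_items(xml_text, source):
    """Parse one document into rows that share the same output columns."""
    root = ET.fromstring(xml_text)
    return [
        {"id": item.get("id"), "name": item.text, "source": source}
        for item in root.findall("item")
    ]

def union(*row_lists):
    """Append the rows of every input into a single output list."""
    combined = []
    for rows in row_lists:
        combined.extend(rows)
    return combined

rows = union(parse_items(XML_A, "XML_Parser"), parse_items(XML_B, "XML_Parser_1"))
for row in rows:
    print(row)
```

The point of the mapping is visible in `parse_items`: both inputs are shaped into the same columns before they are combined, which is why the columns of each parser have to be mapped to the common output metadata.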
Example 3 – Hierarchical stage with JSON data
Step 1: Import the XSD file
- Go to Libraries → click New Library → click Import New Resource.
Step 2: Add the JSON_Parser step
- Click the Assembly Editor tab; the Input step and Output step are present by default.
- Double-click JSON_Parser in the palette to place it between the Input and Output steps.
Step 3: In the JSON_Parser step
- From JSON Source → select the Single file radio button and insert the parameter for the source file.
- In Document Root → click Browse and select the document root.
Step 4: Click Mappings and map the corresponding columns to the given output columns.
Step 5: Click OK and run the job.
- After the run, review the log of the job.
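For comparison with the XML examples, here is a minimal Python sketch of parsing a JSON document and mapping its nested lists to flat output columns. The sample document, the `flatten_orders` function and the column names are assumptions made for illustration and do not come from the job described above.

```python
import json

# Hypothetical source document with a nested, repeating structure.
SAMPLE_JSON = """
{"customers": [
  {"id": 1, "name": "Asha",
   "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]},
  {"id": 2, "name": "Ravi",
   "orders": [{"sku": "C3", "qty": 4}]}
]}
"""

def flatten_orders(json_text):
    """Emit one flat row per order, repeating the parent customer columns."""
    doc = json.loads(json_text)
    rows = []
    for customer in doc["customers"]:
        for order in customer["orders"]:
            rows.append({
                "customer_id": customer["id"],
                "name": customer["name"],
                "sku": order["sku"],
                "qty": order["qty"],
            })
    return rows

for row in flatten_orders(SAMPLE_JSON):
    print(row)
```

The inner loop is the equivalent of mapping the innermost list to the output: each order becomes a row, and the customer columns are carried down from the parent level.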
The examples above should help beginners get started with the Hierarchical stage in place of the XML pack.