Monday, 22 June 2015

COPY and AGGREGATOR Stage in Datastage

In this article we are going to discuss about two simple stages in datastage.
  1. COPY Stage
  2. AGGREGATOR Stage
Let’s study them with example one by one.
  1. COPY Stage
Name itself describes the function of this stage. It is a Processing stage of having one input link and number of output links which helps to create multiple copies of input data source.
Copy Stage also helps to make a backup for data while performing another operation on that data. We can also drop or change order of columns, modify data type of columns.
Let’s see with an example:
Consider following table.
ID
FName
LName
Location
Network ID
EmailID
DOJ
1
Jach
Simmons
LA
JS524
Letsc@gmail.com
20140101
2
Shumas
Jane
Las
Sj145
Jaene@ymail.com
20130623
3
Jonty
Waughn
Syd
JW927
JontyW@sdbh.com
20131123
3
Suhana
Safar
Mexico
SS99
Sas@gmail.com
20130631
8
Abc
Cde
Mex
ab254
lmn@gmail.com
20131225
4
Shumas
Jane
Las
Sj145
Jaene@ymail.com
20130625
 Step 1: Design your structure like below.

Configure above table with the input sequential file ‘sf_emp_det’.
Step 2: Let’s configure the ‘copy’ stage. Double click on it. Following window will pop up. Under ‘Output’ tab we just have to do ‘mappings’ now.

By using highlighted drop down we can map the required columns to the desired link. As shown here, columns ID, Fname, Lname are mapped to the link ‘lnk_id_name’ by dragging and dropping from left to right pane.
Step 3: Repeat it for other columns Loc, LanID, Email, DOJ as shown below.
3
4
Step 4: Let’s see what output is.
For ‘lnk_id_name’, columns mapped are ID, Fname, Lname.

For ‘lnk_email_lanid’, columns mapped are LanID, Email.
6
For ‘lnk_loc_doj’, columns mapped are Loc, DOJ.
7
       2. AGGREGATOR Stage
As name suggest this stage is used to perform Aggregate functions in Datastage like Min, Max, Sum, Variance, Mean value etc. In some Business scenarios we need to use Aggregate function for that purpose this stage plays a part. This is simple and efficient.
Here, let’s try to implement some aggregate functions in Datastage.
Consider following table.
ID
Name
Salary
Department
101
John
700
EnTC
102
Christian
400
MECH
103
Ronaldo
700
CS
104
Manoher
555
EnTC
105
Sachin
752
IT
106
Kevin
896
EnTC
107
Alistair
400
IT
108
Adam
756
CS

Step 1: Design your job structure like below
Aggregator stage job design
Configure above table with input Sequential file ‘sf_emp_det’.
Step 2: Double click on ’Aggregator’ stage following window will pop up.
Aggregator properties
Step 3: Select ‘Group Key’ as ‘Dept’ and ‘Aggregation Type’= ‘Calculation’ from right side drop down box as shown above.
Now we have to select ‘Column for Calculation’ on which we want to perform aggregate functions. Here we are using ‘Salary’ column for calculation.
For this particular column ‘Salary’, select properties from right bottom pane named ‘Available properties to add’ as here we are using Maximum, Mean, Minimum and Sum properties as shown in step 2 and give name accordingly.
Step 4: Just map the output columns as below.
Optput mapping
Step 5: Compile and run the job. Let’s see what happen to the output.
Output


This how we developed a job for calculating Maximum, average, Minimum and sum of ‘Salary’ column.  Practice for other properties also.

No comments:

Post a Comment