Datastage for Learners: COPY and AGGREGATOR Stage in Datastage

In this article we are going to discuss about two simple stages in datastage.

COPY Stage
AGGREGATOR Stage

Let’s study them with example one by one.

COPY Stage

Name itself describes the function of this stage. It is a Processing stage of having one input link and number of output links which helps to create multiple copies of input data source.
Copy Stage also helps to make a backup for data while performing another operation on that data. We can also drop or change order of columns, modify data type of columns.
Let’s see with an example:
Consider following table.

ID	FName	LName	Location	Network ID	EmailID	DOJ
1	Jach	Simmons	LA	JS524	Letsc@gmail.com	20140101
2	Shumas	Jane	Las	Sj145	Jaene@ymail.com	20130623
3	Jonty	Waughn	Syd	JW927	JontyW@sdbh.com	20131123
3	Suhana	Safar	Mexico	SS99	Sas@gmail.com	20130631
8	Abc	Cde	Mex	ab254	lmn@gmail.com	20131225
4	Shumas	Jane	Las	Sj145	Jaene@ymail.com	20130625

Step 1: Design your structure like below.

Configure above table with the input sequential file ‘sf_emp_det’.
Step 2: Let’s configure the ‘copy’ stage. Double click on it. Following window will pop up. Under ‘Output’ tab we just have to do ‘mappings’ now.

By using highlighted drop down we can map the required columns to the desired link. As shown here, columns ID, Fname, Lname are mapped to the link ‘lnk_id_name’ by dragging and dropping from left to right pane.
Step 3: Repeat it for other columns Loc, LanID, Email, DOJ as shown below.

Step 4: Let’s see what output is.
For ‘lnk_id_name’, columns mapped are ID, Fname, Lname.

For ‘lnk_email_lanid’, columns mapped are LanID, Email.

For ‘lnk_loc_doj’, columns mapped are Loc, DOJ.

2. AGGREGATOR Stage
As name suggest this stage is used to perform Aggregate functions in Datastage like Min, Max, Sum, Variance, Mean value etc. In some Business scenarios we need to use Aggregate function for that purpose this stage plays a part. This is simple and efficient.
Here, let’s try to implement some aggregate functions in Datastage.
Consider following table.

ID	Name	Salary	Department
101	John	700	EnTC
102	Christian	400	MECH
103	Ronaldo	700	CS
104	Manoher	555	EnTC
105	Sachin	752	IT
106	Kevin	896	EnTC
107	Alistair	400	IT
108	Adam	756	CS

Step 1: Design your job structure like below

Configure above table with input Sequential file ‘sf_emp_det’.
Step 2: Double click on ’Aggregator’ stage following window will pop up.

Step 3: Select ‘Group Key’ as ‘Dept’ and ‘Aggregation Type’= ‘Calculation’ from right side drop down box as shown above.
Now we have to select ‘Column for Calculation’ on which we want to perform aggregate functions. Here we are using ‘Salary’ column for calculation.
For this particular column ‘Salary’, select properties from right bottom pane named ‘Available properties to add’ as here we are using Maximum, Mean, Minimum and Sum properties as shown in step 2 and give name accordingly.
Step 4: Just map the output columns as below.

Step 5: Compile and run the job. Let’s see what happen to the output.

This how we developed a job for calculating Maximum, average, Minimum and sum of ‘Salary’ column. Practice for other properties also.

Datastage for Learners

Monday, 22 June 2015

COPY and AGGREGATOR Stage in Datastage

No comments:

Post a Comment