Monday, 22 June 2015

FILTER Stage in Datastage

As name suggest, this stage is use to filter the input data according to the required condition. Here, we are going to learn how Where clause works in Datastage.
Filter Stage in Datastage is processing stage which can have single input link and multiple output link and optional reject link. Depending upon the requirements input records are filtered out unmodified.
Let’s see the implementation of it.
Consider a following table :
Pno
Pname
Color
Weight
City
P1
Nut
Red
12.0
London
P2
Bolt
Green
17.0
Paris
P3
Screw
Blue
17.0
Oslo
P4
Screw
Red
14.0
London
P5
Cam
Blue
12.0
Paris
P6
cog
Red
19.0
London
Step 1: Design your job structure like below.
Configure above table with input sequential file ‘sf_parts’.
Job Design
Step 2: Consider we need to filter out input data on two conditions.
  1. City=London
  2. Weight between 10 and 15
Double click on filter stage, following window will pop up. Here we need to define ‘Where clause’ under Properties tab as shown below.
Configure Filter Stage
We have to define output link for each ‘Where clause’ so that filtered record for that clause will get stored in that output file. We need to decide Link label from link ordering tab which is numeric i.e. 0,1,2,3 etc. as shown below.
Mention this link label for output link under particular ‘Where clause’ as shown. No need to mention anything for Reject link as all records which are not filtered out by ‘Where clause’ will get routed to the reject link.
Link labels
Step 3: Map the output for each output link by using Drop Down under ‘Output Name’ as shown.
Output Mapping
Step 4: Compile and Run the job.
Output:
  1. City=”London”
Output City='London'
  1. Weight between 10 and 15
Weight between 10 to 15
  1. Reject link output
Reject link output
In this way, we can filter out the records as per the requirements also its useful to filter raw or unwanted data.

No comments:

Post a Comment