Tuesday, 23 June 2015

Load XML files using a DataStage Parallel job

Introduction

The intention of this tutorial is to give novice developers a quick start with loading XML data using a DataStage parallel job.

Steps

Step 1:
Create a simple XML file named test.xml
<xml> <customer>Mike</customer> <customer>Anna</customer> </xml>
Step 2:
Create a new DataStage parallel job with three stages linked together: a Sequential File stage, an XML Input stage (located under the Real Time category), and a Peek stage.
Step 3:
The first trick is to load the entire XML file into a single column of a single row. You do this by creating a column in the Sequential File stage of type LongVarChar[Max=9999]; the max size here is arbitrary. Set the input file to test.xml. Next, remove all properties in the [Format] tab and add these two:
In the Record level:
Record type=implicit
In the Field defaults:
Delimiter=none
Step 4:
Now that we have the XML in a single column, we can set the XML Input stage properties. In the [Transformation settings] tab under the [Stage] tab, check the [Repetition element required] box. In the [Input] tab, select the column that you defined in step 3 and check the [XML document] box. In the [Output] tab, define a column named [customer] of type VarChar[max=255] and set it as the key. In the description box, enter the XPath expression; in this case /xml/customer/text()
Tip: To reference XML attributes, use @. For example, /xml/customer/@id would equal 1 when using this XML: <xml><customer id="1">Mike</customer></xml>
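If you want to sanity-check these XPath expressions outside DataStage, here is a minimal Python sketch (not part of the job; it uses a two-customer variant of the sample file with an id attribute added to each customer):

```python
# Quick sanity check of the XPath expressions used above, outside DataStage.
import xml.etree.ElementTree as ET

root = ET.fromstring('<xml><customer id="1">Mike</customer>'
                     '<customer id="2">Anna</customer></xml>')

# /xml/customer/text() -> the repeating element's text values
print([c.text for c in root.findall('customer')])       # ['Mike', 'Anna']

# /xml/customer/@id -> the attribute values
print([c.get('id') for c in root.findall('customer')])  # ['1', '2']
```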
Step 5:
Compile and run. Peek will produce log records that list the customers from the XML file.


Monday, 22 June 2015

SWITCH Stage in Datastage

Switch Stage in Datastage:
  • The Switch stage is a processing stage. It has one input link and a number of output links depending on the 'cases' defined; it can have a maximum of 128 output links plus one reject link.
  • The function of the Switch stage in DataStage is analogous to the switch statement in the C language: for each specific output there is a specific 'case'. Since it has one reject link, records that do not match any of the cases move to the reject link.

Let’s try to configure this Switch stage.
Consider the following table:
ID | First Name | Last Name | Location | Network ID | EmailID
1 | Jach | Simmons | LA | JS524 | Letsc@gmail.com
2 | Shumas | Jane | Las | Sj145 | Jaene@ymail.com
3 | Jonty | Waughn | Syd | JW927 | JontyW@sdbh.com
3 | Suhana | Safar | Mexico | SS99 | Sas@gmail.com
8 | Abc | Cde | Mex | ab254 | lmn@gmail.com
4 | Shumas | Jane | Las | Sj145 | Jaene@ymail.com
Step 1: Design your job Structure like below.
Configure the input sequential file sf_emp_det with the table above.
Step 2: Now, to configure the Switch stage, double-click on it. The following window will pop up.
There are three categories under the 'Properties' tab:
  1. Input
  2. User-defined Mapping
  3. Options
Let's study them one by one.
Step 3:
  1. Input: 'Selector' is the field where we select the column on which the 'cases' will be decided. Here, 'Selector' = ID.
In 'Selector Mode', since we are going to define the cases ourselves, select 'User-defined Mappings'.
  2. User-defined Mappings: Here we actually define the case statements. As shown above, 1, 2, 3 and 8 are the 'ID' values for the cases, and 0, 1, 2 are the link labels, which we get from the 'Link Ordering' tab.
Refer to the following window, where the 'link label' for each corresponding 'Link name' is given in the right pane. These link labels are used to define the cases as follows.

Case (ID) | Link Label | Link Name
1 | 0 | lnk_ID_1
2 | 1 | lnk_ID_2
3 | 2 | lnk_id_3
8 | 5 (Discard Value) | (no output link; discarded)

Records with ID=8 get discarded. For the discard value, the 'Link Label' can be anything; here it is 5. Records that do not match any case get rejected and are stored on the reject link, here 'rjct_id_4'.
3. Options:
Discard value: This assigns a 'link label' to the discarded records. In our case, 'Case 8 = 5': records with ID=8 are routed to link label 5, which is the value given here as the 'Discard value'. Since no output link carries label 5, these records are discarded.
If not found: This decides the action taken on records that fail to match any case statement.
  1. Drop: Drops the entire row.
  2. Fail: Causes the job to fail.
  3. Output: Routes the row to the reject link.
Note: A reject link is not allowed with the 'Drop' and 'Fail' options; it will throw an error during compilation.
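For intuition, the routing described above behaves like this rough Python sketch (an analogy only, not DataStage; the record layout and names are illustrative):

```python
# Rough analogy of the Switch stage's routing (illustrative, not DataStage).
records = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 3}, {"id": 8}, {"id": 4}]

case_map = {1: 0, 2: 1, 3: 2, 8: 5}   # ID value -> link label; 8 maps to the discard value
discard_label = 5                     # no output link carries this label

outputs = {0: [], 1: [], 2: []}       # lnk_ID_1, lnk_ID_2, lnk_id_3
rejects = []                          # 'If not found' = Output -> reject link

for rec in records:
    label = case_map.get(rec["id"])
    if label is None:
        rejects.append(rec)           # ID=4: no case defined
    elif label == discard_label:
        continue                      # ID=8: discarded
    else:
        outputs[label].append(rec)

print(outputs)
print(rejects)
```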
Step 4: Let's see what happens to the output!
Compile and run the job. We will look at the output link by link.
  1. Link lnk_ID_1, for which 'Case 1 = 0'.
  2. Link lnk_ID_2, for which 'Case 2 = 1'.
  3. Link lnk_id_3, for which 'Case 3 = 2'.
  4. Reject link lnk_rjct_id_4: as we didn't mention any case statement for ID=4, it gets routed to the reject link.
  5. 'Case 8 = 5': the record for ID=8 gets discarded.

PIVOT ENTERPRISE Stage in Datastage

The Pivot Enterprise stage is a processing stage that pivots data horizontally or vertically depending on the requirements. There are two types:
  1. Horizontal
  2. Vertical
A horizontal pivot maps a set of input columns onto multiple output rows. A vertical pivot does exactly the opposite: it maps multiple input rows onto a set of output columns.
Let’s try to understand it one by one with following example.
  1. Horizontal Pivot Operation.
Consider the following table:
Product Type | Color_1 | Color_2 | Color_3
Pen | Yellow | Blue | Green
Dress | Pink | Yellow | Purple
Step 1: Design Your Job Structure Like below.
Job structure for Pivot Enterprise stage
Configure the above table with the input sequential file stage 'se_product_clr_det'.
Step 2: Let's configure the 'Pivot Enterprise stage'. Double-click on it. The following window will pop up.

Select 'Horizontal' as the Pivot Type from the drop-down menu under the 'Properties' tab for a horizontal pivot operation.
Step 3: Click on the 'Pivot Properties' tab. Here we need to check the box against 'Pivot Index', after which a column named 'Pivot_Index' will appear under the 'Name' column. Also declare a new column named 'Color' as shown below.

Step 4: Now we have to mention the columns to be pivoted under 'Derivation' against the column 'Color'. Double-click on it. The following window will pop up.

Select the columns to be pivoted from the 'Available columns' pane as shown, then click 'OK'.
Step 5: Under the 'Output' tab, map only the pivoted columns as shown.

Configure the output stage and give the file path; see the image below for reference.
Step 6: Compile and run the job. Let's see what happens in the output.

This is how we can map multiple input columns to a single column (here, the colors).
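As a mental model, the horizontal pivot is equivalent to this Python sketch (an illustration of the transformation, not DataStage code):

```python
# Horizontal pivot: one row with Color_1..Color_3 becomes three rows (illustrative).
rows = [
    {"Product Type": "Pen",   "Color_1": "Yellow", "Color_2": "Blue",   "Color_3": "Green"},
    {"Product Type": "Dress", "Color_1": "Pink",   "Color_2": "Yellow", "Color_3": "Purple"},
]

pivoted = []
for row in rows:
    for idx, col in enumerate(["Color_1", "Color_2", "Color_3"]):
        # Pivot_Index records which source column each output row came from.
        pivoted.append({"Product Type": row["Product Type"],
                        "Pivot_Index": idx,
                        "Color": row[col]})

for p in pivoted:
    print(p)   # six rows: one per (product, color) pair
```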
Vertical Pivot Operation:
Here, we are going to use the Pivot Enterprise stage to pivot data vertically, i.e. to map multiple input rows onto a single row. The main advantage here is that we can apply aggregation functions such as avg, sum, min, max, first, last etc. to the pivoted column. Let's see how it works.
Consider the output data of the horizontal operation as the input data for the Pivot Enterprise stage. We will add one extra column for the aggregation function, as shown in the table below.
Product | Color | Price
Pen | Yellow | 38
Pen | Blue | 43
Pen | Green | 25
Dress | Pink | 1000
Dress | Yellow | 695
Dress | purple | 738
Let's study the vertical pivot operation step by step.
Step 1: Design your job structure like below. Configure the above table data with the input sequential file 'se_product_det'.

Step 2: Open the Pivot Enterprise stage and select 'Vertical' as the Pivot type under the Properties tab.

Step 3: Under the Pivot Properties tab, define at least one pivot column and one group-by column. Here we declared Product as the group-by column, and Color and Price as the pivot columns. Let's see how to use 'Aggregation functions' in the next step.

Step 4: On clicking 'Aggregation functions required for this column' for a particular column, the following window will pop up, in which we can select whichever functions are required for that column. Here we are using the 'min', 'max' and 'average' functions, with proper precision and scale, for the Price column as shown.

Step 5: Now we just have to do the mapping under the output tab as shown below.

Step 6: Compile and run the job. Let's see what the output is.
Output:
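The result is one row per product, with the aggregates computed over that product's rows. In plain Python, this vertical pivot behaves roughly like the following sketch (illustrative only):

```python
# Vertical pivot: rows per product collapse into one row, with aggregates (illustrative).
from collections import defaultdict

rows = [
    ("Pen", "Yellow", 38), ("Pen", "Blue", 43), ("Pen", "Green", 25),
    ("Dress", "Pink", 1000), ("Dress", "Yellow", 695), ("Dress", "purple", 738),
]

groups = defaultdict(list)
for product, color, price in rows:
    groups[product].append((color, price))

for product, items in groups.items():
    prices = [p for _, p in items]
    print({"Product": product,
           "Colors": [c for c, _ in items],        # pivoted Color values
           "Price_min": min(prices),               # aggregation functions
           "Price_max": max(prices),
           "Price_avg": sum(prices) / len(prices)})
```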

CHANGE CAPTURE Stage in Datastage

In this article we are going to study the Change Capture stage in DataStage. It is a processing stage. As the name suggests, it captures the change between two input data sets by comparing them based on a key column.
The two input links are attached to the Change Capture stage under the two default link names, 'Before' and 'After'. The captured change is reported in the output as a code in a separate column:
0 = the data is copied as-is from the 'Before' link to the 'After' link.
1 = the data is newly inserted in the 'After' link.
2 = the data is deleted from the 'Before' link.
3 = the data is edited in the 'After' link relative to the 'Before' link.
Let’s see working of this stage with the help of following example.
Consider the following two tables, which we are going to use as the 'Before' and 'After' data.
Before Link Data
e_id | e_name | e_add
11 | John | Singapore
22 | Jack | Melbourne
33 | Jessy | Sydney
44 | Lampard | canada
55 | Pomerbachk | perth

After Link Data
e_id | e_name | e_add
11 | John | Odisha
22 | Jack | Melbourne
44 | Lampard | canada
55 | Pomerbachk | perth
66 | Rahul | Mumbai

Step #1: Design your job structure like below
Job Design
Configure the above tables with their respective input sequential files, i.e. the 'After' table with sf_after_data and the 'Before' table with sf_before_data, as shown.
Step #2: Double click on Change Capture stage, following window will pop up.
Configure Change Capture Stage 1
Here, select 'key = e_id' under 'Change Keys'; this key is used for sorting.
Now comes the most important point in the discussion.
'Change Values' names the column that is taken into consideration for capturing the change. Here, select 'Value = e_add'. According to the values in the 'e_add' column, the Change Capture stage decides whether a particular record is copied, edited, deleted or inserted.
'Change Mode' is the option that lets you define keys and values explicitly or implicitly.
Step #3: For simplicity I have fed the Before and After links with the respective Before and After data so that they are easy to recognize.
Set these links under the Link Ordering tab as shown below.
Link Ordering 1
Step #4: Do the output mapping under the Mapping tab. A new column named change_code() is available for mapping; it will show whether the data is edited, copied, deleted or inserted.
Output Mapping 1
Step #5: Compile and Run the job.  
Double click on Dataset output file ds_capture_change and click on View Output. Following window will pop up.
Output
It shows the output with an additional column named Change_Code, which takes the values 0, 1, 2 and 3 depending on whether the data is copied, inserted, deleted or edited.
For e_id 22, 44 and 55, the records are copied as-is from Before to After, so their code is 0.
For e_id 66, a new record is inserted in After, so its code is 1.
For e_id 33, the record is deleted from Before, so its code is 2.
For e_id 11, the e_add value is edited, so its code is 3.
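To make the codes concrete, here is a small Python sketch (an analogy to the stage's comparison, assuming one row per key) that reproduces the codes for the tables above:

```python
# Reproduce the Change_Code values by comparing Before and After on e_id (illustrative).
before = {11: "Singapore", 22: "Melbourne", 33: "Sydney", 44: "canada", 55: "perth"}
after  = {11: "Odisha", 22: "Melbourne", 44: "canada", 55: "perth", 66: "Mumbai"}

for e_id in sorted(before.keys() | after.keys()):
    if e_id not in before:
        code = 1                          # insert: key only in After
    elif e_id not in after:
        code = 2                          # delete: key only in Before
    elif before[e_id] == after[e_id]:
        code = 0                          # copy: change value identical
    else:
        code = 3                          # edit: change value differs
    print(e_id, code)
```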


In this way we can use the Change Capture stage for analysis purposes. It is especially useful when there is a large amount of input data.

REMOVE DUPLICATE Stage in Datastage

  • This stage helps remove duplicate records from a data set based on key columns. Two rows are considered duplicates if they have identical values in the key columns. We can define more than one key column.
  • The input data should be sorted for this stage so that all records with the same key values are adjacent. Instead of adding a separate Sort stage, we can perform a link-level sort.
Let’s try to implement Remove Duplicate stage in Datastage:

Consider the following table:
ID | First Name | Last Name | Location | Network ID | EmailID | Date
1 | Jach | Simmons | Chicago | JS524 | Letsc@gmail.com | 20140506
2 | Shumas | Jane | LA | Sj145 | Jaene@ymail.com | 20140506
8 | Shumas | Jane | LA | Sj145 | Jaene@ymail.com | 20140506
3 | Jonty | Waughn | Sydney | JW927 | JontyW@sdbh.com | 20140506
3 | Jonty | Waughn | Sydney | JW927 | JontyW@sdbh.com | 20130506
3 | Jonty | Waughn | Melbourn | SP417 | JontyW@sdbh.com | 20150506
4 | Suhana | Safar | Maxico | SS99 | Sas@gmail.com | 20180506
We are going to select 'ID' as the key column. In this table, ID=2 and ID=8 carry the same data under different keys, and ID=3 appears three times with some changed values. Let's see what happens to these records!
We have added a 'Date' column to illustrate the 'Duplicate To Retain' option, which we discuss in Step 6.
Step 1: Design Your job structure like below

Step 2: Configure the employee detail file containing the table above with the sequential file sf_emp_det, and change the 'Date Format' to %yyyy%mm%dd. For reference, see the article "Sequential file in DataStage".
Step 3: Double-click on the Remove Duplicates stage, which we have named rd_on_ID. The window below will pop up.
Select 'Key = ID' under 'Keys that define Duplicates'. You can select multiple keys from the 'Available properties to add' pane in the bottom-right corner.
The 'Duplicate To Retain' option lets you retain either the 'First' or the 'Last' record for a given duplicate key.
Step 4: As we are doing a link-level sort, a yellow 'sort' symbol appears on the link 'lnk_emp_det'. To perform this sort, go to the 'Input' tab, select any 'Partition type' (here 'Hash') under the 'Partitioning' tab, select the column you want to sort on from the 'Available' left pane, and check the 'Perform sort' box as shown below.
Select only 'Date' for sorting, by right-clicking 'Date' in the right pane as shown; there is no need to partition on it.
Step 5: Map all the required output columns under the 'Output' tab as shown.
Step 6: Now let's see the output. Just compile and run the job.
Output:

There were three records for ID=3. Since the data is sorted in ascending order by date and 'Duplicate To Retain = First', the Remove Duplicates stage retains the record with the earliest date:
3 | Jonty | Waughn | Sydney | JW927 | JontyW@sdbh.com | 20130506
If we select 'Duplicate To Retain = Last', the record with the latest date is retained instead, and this is reflected in the output:
3 | Jonty | Waughn | Melbourn | SP417 | JontyW@sdbh.com | 20150506
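The behaviour corresponds to this small Python sketch (an illustration, assuming the rows are sorted on the Date column first):

```python
# Remove duplicates on a key after sorting, keeping the first (or last) record.
rows = [
    (3, "Sydney",   "20140506"),
    (3, "Sydney",   "20130506"),
    (3, "Melbourn", "20150506"),
]

rows.sort(key=lambda r: r[2])      # link-level sort on Date, ascending

retained = {}
for row in rows:
    key = row[0]                   # key column: ID
    if key not in retained:        # 'Duplicate To Retain' = First
        retained[key] = row
    # For 'Last', assign unconditionally: retained[key] = row

print(retained[3])                 # (3, 'Sydney', '20130506')
```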

COLUMN and ROW Generator Stage in Datastage

In this article, instead of developing or modifying data as we have discussed earlier, we are going to create rows and columns with the help of the Column Generator and Row Generator stages in DataStage.

            1.     Column Generator:

It's a Development/Debug stage with one input link and one output link. This stage adds a generated column value to each record. Let's study it in detail with the following example.
Consider the following table:
ID | Fname | Lname | Loc | LanID | Email | Date
1 | Jach | Simmons | LA | JS524 | Letsc@gmail.com | 20140101
2 | Shumas | Jane | Las | Sj145 | Jaene@ymail.com | 20130623
3 | Jonty | Waughn | Syd | JW927 | JontyW@sdbh.com | 20131123
3 | Suhana | Safar | Mexico | SS99 | Sas@gmail.com | 20130631
4 | Abc | Cde | Mex | ab254 | lmn@gmail.com | 20131225
4 | Shumas | Jane | Jane | Sj145 | Jaene@ymail.com | 20130625
Step #1: Design your job structure like below.
Job Design Column
Configure the above table with the input sequential file sf_emp_data.
Step #2: Double-click on the Column Generator stage; the following window will pop up.
Properties Column
Here, against 'Column To Generate', mention the name of the column you want to add or generate, and set 'Column Method' to Explicit.
Step #3: Now we have to define the metadata for this newly generated column. To do this, go to the Input tab, then Columns, right-click on the newly added column (Unique_Id) and click Properties. The following window will pop up.
Edit Column Meta Data
Here, in the Properties pane, we can add properties for Unique_Id such as its initial value, the increment, its final value, and so on. Refer to the image above for an example.
Step #4: Do the required output mapping. Refer below image.
Output Mapping Column 1
Step #5: Compile and run the job. View the output on the target data set file ds_trgt_column as shown below.
Column Generated Output 1
In this way we have created a new column named Unique_Id with an initial value of 3, incremented by 1 for each subsequent record. This column helps distinguish records, since the ID column has repeated values in the table above.
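Conceptually, the stage does something like the following Python sketch (illustrative; the initial value and increment mirror the configuration above):

```python
# Add a generated surrogate-key column to each record (illustrative).
from itertools import count

records = [{"ID": 3, "Fname": "Jonty"}, {"ID": 3, "Fname": "Suhana"},
           {"ID": 4, "Fname": "Abc"},   {"ID": 4, "Fname": "Shumas"}]

unique_id = count(start=3, step=1)   # initial value 3, increment 1

for rec in records:
    rec["Unique_Id"] = next(unique_id)
    print(rec)                       # every record now carries a distinct Unique_Id
```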

           2.     Row Generator:

We saw the Column Generator stage above. Here we are going to discuss the Row Generator stage in DataStage.
This stage is used mainly for testing purposes. We will study it step by step with one simple example job.
Step #1: Design your job design like below.
Job Design Row Generator
The Sort stage is used only to sort the generated records, for simplicity.
Step #2: Double-click on the Row Generator stage; the following window will pop up. Here, just mention the number of records you want to generate.
Properties Row Generator
Step #3: Mention the column names, with their metadata, used to generate the records. See the image below for reference.
Metadata For Row Generator
You can set the properties of each column in Edit Column Meta Data, as discussed in Step 3 of the Column Generator article above.
Step #4: Compile and run the job. View the output on the target sequential file sf_tgt_se. You will get the following output.
Output Row Generator


As we set an increment value of 1 for each column, you can see each column's value increases by 1. As we didn't specify any initial value for any column, each takes its default value, which you can notice in the output above. For a string column, the repetition of characters depends on the length specified for that column.
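A rough Python equivalent of generating test rows from column metadata might look like this (purely illustrative; numeric columns only):

```python
# Generate N test rows from simple numeric column metadata (illustrative).
def generate_rows(n, columns):
    """columns maps a column name to (initial_value, increment)."""
    rows = []
    for i in range(n):
        rows.append({name: initial + i * step
                     for name, (initial, step) in columns.items()})
    return rows

for row in generate_rows(5, {"id": (0, 1), "amount": (100, 10)}):
    print(row)   # id runs 0..4, amount runs 100..140
```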

FILTER Stage in Datastage

As the name suggests, this stage is used to filter the input data according to required conditions. Here, we are going to learn how the Where clause works in DataStage.
The Filter stage in DataStage is a processing stage that has a single input link, multiple output links and an optional reject link. Input records that satisfy the requirements are passed through unmodified; the rest are filtered out.
Let’s see the implementation of it.
Consider the following table:
Pno | Pname | Color | Weight | City
P1 | Nut | Red | 12.0 | London
P2 | Bolt | Green | 17.0 | Paris
P3 | Screw | Blue | 17.0 | Oslo
P4 | Screw | Red | 14.0 | London
P5 | Cam | Blue | 12.0 | Paris
P6 | cog | Red | 19.0 | London
Step 1: Design your job structure like below.
Configure the above table with the input sequential file 'sf_parts'.
Job Design
Step 2: Suppose we need to filter the input data on two conditions:
  1. City = London
  2. Weight between 10 and 15
Double-click on the Filter stage; the following window will pop up. Here we need to define a 'Where clause' under the Properties tab as shown below.
Configure Filter Stage
We have to assign an output link to each 'Where clause' so that the records filtered by that clause are stored in that output file. We choose the link labels, which are numeric (0, 1, 2, 3, ...), from the Link Ordering tab, as shown below.
Mention this link label for the output link under the particular 'Where clause' as shown. There is no need to mention anything for the reject link: all records not matched by any 'Where clause' are routed to it.
Link labels
Step 3: Map the output for each output link by using the drop-down under 'Output Name' as shown.
Output Mapping
Step 4: Compile and Run the job.
Output:
  1. City = 'London'
Output City='London'
  2. Weight between 10 and 15
Weight between 10 to 15
  3. Reject link output
Reject link output
In this way we can filter records as per the requirements; it is also useful for filtering out raw or unwanted data.
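The routing logic corresponds to this Python sketch (an analogy; the where-clauses mirror the two conditions above):

```python
# Filter stage analogy: each where-clause feeds its own output link (illustrative).
parts = [
    ("P1", "Nut", "Red", 12.0, "London"), ("P2", "Bolt", "Green", 17.0, "Paris"),
    ("P3", "Screw", "Blue", 17.0, "Oslo"), ("P4", "Screw", "Red", 14.0, "London"),
    ("P5", "Cam", "Blue", 12.0, "Paris"), ("P6", "cog", "Red", 19.0, "London"),
]

where = [
    lambda p: p[4] == "London",        # link 0: City = London
    lambda p: 10 <= p[3] <= 15,        # link 1: Weight between 10 and 15
]

outputs = [[] for _ in where]
reject = []

for part in parts:
    matched = False
    for link, clause in enumerate(where):
        if clause(part):
            outputs[link].append(part)  # a record may satisfy several clauses
            matched = True
    if not matched:
        reject.append(part)             # unmatched records go to the reject link

print(outputs[0])   # P1, P4, P6
print(outputs[1])   # P1, P4, P5
print(reject)       # P2, P3
```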

LOOK UP Stage in Datastage

Lookup Stage:
The Lookup stage is most appropriate when the reference data for all lookup stages in a job is small enough to fit into available physical memory. Each lookup reference requires a contiguous block of shared memory. If the Data Sets are larger than available memory resources, the JOIN or MERGE stage should be used.
Lookup stages do not require data on the input link or reference links to be sorted. Be aware, though, that large in-memory lookup tables will degrade performance because of their paging requirements. Each record of the output data set contains columns from a source record plus columns from all the corresponding lookup records where corresponding source and lookup records have the same value for the lookup key columns. The lookup key columns do not have to have the same names in the primary and the reference links.

The optional reject link carries source records that do not have a corresponding entry in the input lookup tables.
You can also perform a range lookup, which compares the value of a source column to a range of values between two lookup table columns. If the source column value falls within the required range, a row is passed to the output link. Alternatively, you can compare the value of a lookup column to a range of values between two source columns. Range lookups must be based on column values, not constant values. Multiple ranges are supported.
There are some special partitioning considerations for Lookup stages. You need to ensure that the data being looked up in the lookup table is in the same partition as the input data referencing it. One way of doing this is to partition the lookup tables using the Entire method.
Lookup stage configuration: Equal lookup




You can specify what action to perform if the lookup fails.
Scenario 1: Continue

Choose Entire partitioning on the reference link.

Scenario 2: Fail
The job aborts with the following error:
stg_Lkp,0: Failed a key lookup for record 2 Key Values: CUSTOMER_ID: 3
Scenario 3: Drop
Scenario 4: Reject
If we select Reject as the lookup-failure action, we need to add a reject link; otherwise we get a compilation error.
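In spirit, an equal lookup is an in-memory hash probe, and the four failure actions behave as in this Python sketch (illustrative, not DataStage internals; the names are made up):

```python
# Equal lookup as an in-memory hash probe (illustrative).
reference = {1: "BUENA PARK", 2: "CYPRESS"}     # small reference data held in memory

def lookup(records, on_failure="continue"):
    out, rejects = [], []
    for rec in records:
        key = rec["CUSTOMER_ID"]
        if key in reference:
            out.append({**rec, "CITY": reference[key]})
        elif on_failure == "continue":
            out.append({**rec, "CITY": None})   # pass through with nulls
        elif on_failure == "reject":
            rejects.append(rec)                 # requires a reject link
        elif on_failure == "fail":
            raise RuntimeError(f"Failed a key lookup for CUSTOMER_ID: {key}")
        # "drop": the record is silently discarded
    return out, rejects

print(lookup([{"CUSTOMER_ID": 1}, {"CUSTOMER_ID": 3}], on_failure="continue"))
```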
Range Lookup:
Business scenario: We have input data with customer id, customer name and transaction date. We have a customer dimension table with customer address information. A customer can have multiple records with different start and end dates, and we want to select the record where the incoming transaction date falls between the customer's start and end dates in the dim table.
Example input data:
CUSTOMER_ID CUSTOMER_NAME TRANSACTION_DT
1 UMA 2011-03-01
1  UMA 2010-05-01
Example dim data:
CUSTOMER_ID CITY ZIP_CODE START_DT END_DT
1 BUENA PARK 90620 2010-01-01 2010-12-31
1 CYPRESS 90630 2011-01-01 2011-04-30
Expected Output:
CUSTOMER_ID CUSTOMER_NAME TRANSACTION_DT CITY ZIP_CODE
1 UMA 2011-03-01 CYPRESS 90630
1  UMA 2010-05-01  BUENA PARK 90620
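The expected matching corresponds to this Python sketch of a range lookup (illustrative; ISO date strings compare correctly as text):

```python
# Range lookup: match when TRANSACTION_DT falls between START_DT and END_DT (illustrative).
dim = [
    {"CUSTOMER_ID": 1, "CITY": "BUENA PARK", "ZIP_CODE": "90620",
     "START_DT": "2010-01-01", "END_DT": "2010-12-31"},
    {"CUSTOMER_ID": 1, "CITY": "CYPRESS", "ZIP_CODE": "90630",
     "START_DT": "2011-01-01", "END_DT": "2011-04-30"},
]

inputs = [
    {"CUSTOMER_ID": 1, "CUSTOMER_NAME": "UMA", "TRANSACTION_DT": "2011-03-01"},
    {"CUSTOMER_ID": 1, "CUSTOMER_NAME": "UMA", "TRANSACTION_DT": "2010-05-01"},
]

for rec in inputs:
    for ref in dim:                               # probe every row for the key
        if (ref["CUSTOMER_ID"] == rec["CUSTOMER_ID"]
                and ref["START_DT"] <= rec["TRANSACTION_DT"] <= ref["END_DT"]):
            print({**rec, "CITY": ref["CITY"], "ZIP_CODE": ref["ZIP_CODE"]})
```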
Configure the lookup stage as shown below. Double-click on the Lnk_input.TRANSACTION_DATE column (specifying the condition on the input link).



You need to specify that multiple rows are returned from the reference link, otherwise you will get the following warning in the job log. Even though the two rows are distinct based on the customer_id, start_dt and end_dt columns, DataStage considers them duplicates based on the customer_id key alone.
stg_Lkp,0: Ignoring duplicate entry; no further warnings will be issued for this table

Compile and Run the job:

Scenario 2: Specify range on the reference link:


This concludes lookup stage configuration for different scenarios.

  • Like the Join and Merge stages, the Lookup stage has multiple input links: one is primary and the others are reference links, against which the lookup operation takes place.
  • But unlike the Merge stage, it has no condition that the number of reject links must equal the number of update input links, nor does it require the data on any of the input links to be sorted. This stage provides four lookup-failure conditions on which the fate of the output data depends. We will see these conditions in Step 4.
Now, let's try to implement the Lookup stage with the help of the following tables.
Table 1
ID | First Name | Last Name | Location | Network ID | EmailID
1 | Jach | Simmons | Chicago | JS524 | Letsc@gmail.com
2 | Shumas | Jane | LA | Sj145 | Jaene@ymail.com
3 | Jonty | Waughn | Sydney | JW927 | JontyW@sdbh.com
4 | Suhana | Safar | Maxico | SS99 | Sas@gmail.com
Table 2
ID | Dept | Dept Head
1 | Electronics | Paul
3 | CS | Jack
4 | TS | Summur
5 | IT | Sean
Table 3
ID | Training Cent
1 | CKG
2 | AMD
3 | WC
Step 1 : Design a job structure like below.
Look up DataStage 1
Consider the Employee table as the primary link, as shown. For each record on the primary link, the Lookup stage performs a lookup on the reference links using the key column.
Consider the employees' department details (Table 2) and the employees' training centre details (Table 3) as the data on the two reference links.
Step 2 : Now we are going to configure the Lookup stage (named lkp_emp_det in the design). Double-click on the Lookup stage; the following window will pop up. The left pane is for all inputs and the right pane is for the output. The first link detail table is for the primary link; the second and third are for the reference links. The order of these reference links can be changed by using the symbol on the title bar, as shown.
Look up DataStage 3
Step 3 : In the left pane, map the key column (here 'ID') by dragging it onto the respective key column in each reference link. Map all the remaining required columns to the right pane as shown.
Step 4 : One of the most important steps is to set the lookup conditions, which we can do using the second option on the title bar. Just click on it, and the following window will pop up.
Look up DataStage 4
There is a list of reference links in the 'Link Name' column. In the 'Condition' column we can give a condition for each reference link. If the condition is not met, what happens to that record is decided by the 'Condition Not Met' column; if the lookup fails, it is decided by the 'Lookup Failure' column.
Continue : Data will be sent to the Output link.
Reject : Data will be sent to the Reject link.
Drop : Data will neither go to Output link nor to Reject link.
Fail : Job will fail.
In our case, let's first try it without any condition in the 'Condition' column, with 'Continue' and 'Reject' in the other columns.
Step 5 : Compile and run the job. Let's see what the output is.
Output : Stream link
Look up DataStage 5
It shows two records. As we set the 'Lookup Failure' condition to 'Reject', the records from the primary link that did not match the reference-link data are collected on the reject link 'rjct_primary_rec', as shown below.
Output : Reject Link
Look up DataStage 6
Step 6 : Let's try configuring the 'Condition' column in the 'Lookup Stage Conditions' pane.
Just put the condition ID=3, with 'Reject' under 'Condition Not Met', as shown below.
Look up DataStage 7
All records except ID=3 fail the condition, get rejected and are stored on the reject link, and we get the output on the stream link as shown.
Look up DataStage 8
Output for ‘Reject Link’
Look up DataStage 9
Note : The reject link shows rejected records from the primary input link only.