Wednesday, 26 August 2015

How to Invoke REST Web services from DataStage Designer

InfoSphere Information Server Designer provides a stage called Hierarchical Data stage  
 (which was called XML Stage in prior releases to V11.3) that has the capability to parse  
 and compose the hierarchical data formats like JSON and XML. Along with that, it  
  provides the capability to invoke the REST Web services.
 It supports the REST Web services that are configured in different ways as below:
 1. That provides the responses in different formats like JSON, XML, HTML, JPEG, Audio  
     etc.
 2. Supports different authentication mechanisms like BASIC, DIGEST, LTPA, OAUTH
 3. Configured with SSL (Server and Client authentication)
 4. Headers and Cookies
 5. Different HTTP methods like GET, POST, PUT, DELETE, PATCH etc.
 Scenario:
 There is so much data on the social media sites, which you want to pull, transform and 
 send for the  analytics. Here I am taking an example of the IBM Facebook page. From 
 which, I want to retrieve the details for how many likes does IBM page have, which 
 all websites talk about IBM, To fulfill the above requirement, You can use our new 
 capability called REST step from the Hierarchical Datastage stage in DataStage Designer.
  Extracting data from Facebook Page
  The Figure 1 illustrates the DataStage job which retrieves the data from Facebook using 
  REST step  and parses the response data from Facebook to fetch the required data for the
 analytics.


 Figure 1: Extract_And_Parse_DataFromFaceBookPage
 




The figure 2 shows the assembly design of the Hierarchical datastage. The data
is obtained from Facebook by invoking the REST API using REST step. As the 
data provided by Facebook is in format of JSON, JSON parser step is used to parse 
the data. The REST Step “GetFBPageData” and JSON Parser step called 
“ParseFBData” are added to the Assembly Outline.
Figure 2 : Assembly Editor Design

 The Rest Step “GetFBPageData“ is configured as below:
1.In the General tab: The HTTP Method “GET” is selected , URL of the IBM Facebook
   page is mentioned in the “URL” field. The URL is https://graph.facebook.com/IBM as 
   in  figure 3.
2. In the Security tab: As Facebook is configured with SSL, select the checkbox Enable the  
    the SSL and Accept the Self signed certificate as shown in figure 4.
3. In the Request tab: Facebook returns the response in different formats. Here we need in  
    the JSON format. So Specify the Content-Type as application/json under the custom 
    header. As shown in figure 5.
4. In the Response tab : Select the check box “ Pass the received body to” and a radio
    button   “A text node named body in the Output Schema” and specify text/javascript  
    under  the content type as shown in figure 6.
Figure 3: General tab of Rest Step "GetFBPageData"
 Figure 4 : Security tab of Rest Step “GetFBpageData”
 
 Figure 5 : Request tab of Rest Step “GetFBPageData”
 Figure 6 : Response tab of Rest Step “GetFBPageData”
The Output of the REST call which is invoked in the “GetFBPageData” REST step is available in the body element in the output schema tab. The output schema of the REST step is as in figure 7.
Figure 7 : The Output Schema of the Rest Step "GetFBPageData"
 The JSON parser step “ParseFBData” is configured as below:
1.Under the JSON Source tab : Select the String set option, and from the drop down 
   select the bodyelement coming from the REST step “GetFBPageData” as shown in the   
   figure 8.  
2.Under the Document root: Browse and select the schema which conforms with the json
   data retrieved from earlier REST step as shown in figure 9.  
3.Under Validation tab: Minimal Validation is selected by default.  
Figure 8 : JSON Source tab of JSON Parser step “ParseFBData”  
 

 Figure 9 : Document root of JSON Parser step “ParseFBData”


The Output step is configured as below: 

Under the Mappings tab, the target link “DETAILS” is mapped to the top to fetch the parsed details of IBM Facebook page as shown in figure 10.
Figure 10 : Mappings tab of Output step
 
Compile and Execute the job to fetch the required details of the IBM Facebook page shown in figure 11:
Figure 11 : Data from IBM facebook page.
 
 
The output describes the details like ID of the Page, what it is about , when it is founded, 
 how many likes it got, the count talking about this page, username, websites where it is  
 talked etc. This data can be used in the analytics.
Conclusion : The REST capability in the hierarchical Datastage stage can be used to fetch 
 the data like social media data from different applications like Facebook, LinkedIn, 
 Twitter etc which exposes the services using REST
Please also articles published on developer works for the integration scenarios with SOFT 
Layer, Cloudant and Information Governance Catalog Glossary using DataStage.
 http://www.ibm.com/developerworks/data/library/techarticle/dm-1407governrest/index.html 
 Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”

Wednesday, 19 August 2015

Some trickey but useful transformation rules

1. How to validate a Email Address

If
AlNum(
Left(inputcolumn,index(inputcolumn, '@', 1) -1 ):Right(inputcolumn,
index(inputcolumn, '@', 1) - 1)
)
Then "Valid" else "Not"

2. How to replace field separator if a string contain any?

Ereplace(EC_File_Nm.$CommandOutput, @FM,'')

it will replace any field separator if any from the command out put of execute command stage.
Lets say the command output is a file name and it contain any filed separator inside it it will be replaced by null.

3. How to get

Monday, 17 August 2015

How to Integrating Web Services with Datastage

Web services expose operations of custom functions accessible to applications through internet or intranet. Web services are interoperable, reusable and follow strict standards and protocols to be followed for accessing them.
SOAP (Simple Object Access Protocol) is a specification for exchanging information between application and Web Service. Web service request and response are encoded in XML format.
This blog is to provide step-by-step instruction on invoking a web service through Datastage.
At the high level, following are the steps to be followed.
  1. Identify and Save Web Service Definition 
  2. Import Web Service Definition 
  3. Configure Web Service Operation 
  4. Configure Web Service Request 
  5. Configure Web Service Response

Identify and Save Web Service Definition

Many web services are publicly available for free. In this example, let us try to access Country web service published through webservicex site.
WSDL definition is listed on the XML frame. 
If you are using Firefox,
Right click on WSDL definition frame
Select “This Frame” option from popup menu
Select “Save frame as” option to locally save the file



If you are using Internet Explorer
          Click on WSDL definition frame
          Select File Menu and Click Save As option to save the file locally

Import Web Service Definition

This step will help to import the WSDL file saved in previous step into Datastage

  • Login to Datastage Designer 
  • Select Web Service Function Definitions in “Import” menu.
   


  • It opens Web Services browser which lists all the definitions already available on the Datastage project
  •  Click on “Metadata Importer” option


  •  Click on “Open File” option and select the location of WSDL file saved earlier


  • Web service explorer will lists the available web service operations
  • Right click one of the service operation and select Import all operations

 Configure Web Service Operation
Next step is to create a Server Job to access “Country” Web Service through Web Services Client Stage.
  • Login to Datastage Designer and Create a new Server Job 
  • From Palette, Select Web Services Client from Real Time group 
  • Place a sequential file stage as output

  • Open Web Services Client stage properties
  • Click on “Select Web Service Operation”




  • Click on “Select Web Service Operation”
  • Select “GetCountries” option. This web services does not have any input argument to be passed and hence Web Services Client stage is used for data retrieval.


  • Select “GetCountries” option and click on “Select this item” link
If the Datastage server is running behind a proxy for internet access (as most of the site does), configure proxy settings in Proxy tab
  • Enable HTTP/HTTPS option
  • Supply user name and password for accessing internet 
  • Specify proxy host name (without http). If host name and port details are not known, its a general practice to contact Datastage Administrator to obtain such information. 
If Datastage sever and Datastage client machine are directed through same internet proxy, these details can be extracted using the steps below. However, this is not recommended and demonstrated here for test purpose only.


Open Control Panel, Internet Options.
Select LAN Settings from Connections tab


Open the URL specified in LAN settings on a web browser
Host name and port details are generally specified at the “return” statement in configuration script. Supply this information in Web Services Client stage.
NOTE:
  • This would work only when the server is already rightly configured for internet access through proxy 
  • Include the server details in Web Client Stage excluding the file name

Configure Web Service Request

Open Web Services Client stage, Output tab
Open Input Arguments tab
Click on “Load Arguments Information” button. This loads namespace definitions into canvas

Configure Web Service Response

Open Web Services Client stage, Output tab
Open Output Message tab
Click on “Load Arguments Information” button. This loads output namespace definitions into canvas

Open Columns tab and check whether output column is populated 


Save, Compile and Run the job
Output file should have contents similar to the following