InfoSphere Information Server Designer provides a stage called the Hierarchical Data stage
(http://www.ibm.com/developerworks/data/library/techarticle/dm-1407governrest/index.html),
which was called the XML stage in releases prior to V11.3. The stage can parse and compose
hierarchical data formats such as JSON and XML. It can also invoke REST web services.
It supports REST web services that are configured in different ways, as listed below:
1. Services that return responses in different formats such as JSON, XML, HTML, JPEG, audio, etc.
2. Services that use different authentication mechanisms such as BASIC, DIGEST, LTPA, and OAUTH.
3. Services configured with SSL (server and client authentication).
4. Services that use headers and cookies.
5. Services that use different HTTP methods such as GET, POST, PUT, DELETE, PATCH, etc.
Scenario:
Social media sites hold a great deal of data that you may want to pull, transform, and
send for analytics. Here I take the IBM Facebook page as an example. From it, I want to
retrieve details such as how many likes the IBM page has and which websites talk about IBM.
To fulfill this requirement, you can use our new capability, the REST step of the
Hierarchical Data stage in DataStage Designer.
Extracting data from the Facebook page
Figure 1 illustrates the DataStage job, which retrieves the data from Facebook using the
REST step and parses the response to fetch the data required for analytics.
Figure 1: Extract_And_Parse_DataFromFaceBookPage
Figure 2 shows the assembly design of the Hierarchical Data stage. The data is obtained
from Facebook by invoking the REST API using the REST step. Because Facebook provides the
data in JSON format, a JSON Parser step is used to parse it. The REST step “GetFBPageData”
and the JSON Parser step “ParseFBData” are added to the Assembly Outline.
The REST step “GetFBPageData” is configured as below:
1. In the General tab: The HTTP method “GET” is selected and the URL of the IBM Facebook
page is specified, as shown in figure 3.
2. In the Security tab: As Facebook is configured with SSL, select the checkboxes to enable
SSL and to accept the self-signed certificate, as shown in figure 4.
3. In the Request tab: Facebook returns the response in different formats. Here we need
the JSON format, so specify the Content-Type as application/json under the custom headers,
as shown in figure 5.
4. In the Response tab: Select the check box “Pass the received body to” and the radio
button “A text node named body in the Output Schema”, and specify text/javascript under
the content type, as shown in figure 6.
The output of the REST call invoked in the “GetFBPageData” REST step is available in the
body element on the Output Schema tab. The output schema of the REST step is shown in
figure 7.
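For readers who want to see what the REST step settings amount to outside the Designer, here is a minimal Python sketch of a comparable request. It is not how the Hierarchical Data stage is implemented. The URL is a hypothetical placeholder (the actual page URL appears only in figure 3), and the function names are invented for illustration.

```python
import ssl
import urllib.request

# Hypothetical placeholder: the real page URL is only shown in figure 3.
PAGE_URL = "https://graph.example.com/ExamplePage"


def build_page_request(url: str) -> urllib.request.Request:
    """Build a GET request with the custom Content-Type header (General and Request tabs)."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Content-Type": "application/json"},
    )


def fetch_body(request: urllib.request.Request, accept_self_signed: bool = False) -> str:
    """Send the request over SSL and return the response body as text.

    accept_self_signed loosely mirrors the Security tab option; it turns off
    certificate verification, so it should only be used for testing.
    """
    context = ssl.create_default_context()
    if accept_self_signed:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_body() would.
request = build_page_request(PAGE_URL)
```

The string returned by fetch_body plays the role of the body text node that the REST step adds to its output schema.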
The JSON Parser step “ParseFBData” is configured as below:
1. Under the JSON Source tab: Select the String set option, and from the drop-down select
the body element coming from the REST step “GetFBPageData”, as shown in figure 8.
2. Under the Document Root tab: Browse and select the schema that conforms to the JSON
data retrieved from the earlier REST step, as shown in figure 9.
3. Under the Validation tab: Minimal Validation is selected by default.
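As a rough analogy for what the JSON Parser step does with the body string, the sketch below parses a JSON document and checks that a couple of expected fields are present. The sample body, its field names, and its values are hypothetical and are not taken from the actual Facebook response. The check is a loose stand-in for validating against a schema, not the stage's Minimal Validation.

```python
import json

# Hypothetical sample body: field names and values are illustrative only.
SAMPLE_BODY = """
{
  "id": "0000000000",
  "about": "Example description",
  "founded": "1900",
  "likes": 12345,
  "talking_about_count": 678,
  "username": "ExamplePage",
  "website": "http://www.example.com"
}
"""

# Fields this sketch insists on; an assumption, not part of the article's schema.
REQUIRED_FIELDS = ("id", "likes")


def parse_page_body(body: str) -> dict:
    """Parse the JSON text and verify that the document root has the required fields."""
    document = json.loads(body)
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object at the document root")
    missing = [name for name in REQUIRED_FIELDS if name not in document]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return document


page = parse_page_body(SAMPLE_BODY)
```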
The Output step is configured as below:
Under the Mappings tab, the target link “DETAILS” is mapped to top to fetch the parsed
details of the IBM Facebook page, as shown in figure 10.
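Conceptually, the mapping turns the parsed hierarchical document into a flat row on the “DETAILS” link. The sketch below shows the same idea in Python under stated assumptions: the column names are hypothetical (the real columns are those mapped in figure 10), and CSV is used only as a convenient flat output for illustration.

```python
import csv
import io

# Hypothetical column list for the DETAILS link; the real one is in figure 10.
DETAILS_COLUMNS = [
    "id", "about", "founded", "likes",
    "talking_about_count", "username", "website",
]


def to_details_row(document: dict) -> dict:
    """Keep only the mapped columns, using an empty string where a field is absent."""
    return {column: document.get(column, "") for column in DETAILS_COLUMNS}


def write_details(rows: list) -> str:
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DETAILS_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative parsed document; "extra" stands for any field that is not mapped.
parsed = {"id": "0000000000", "likes": 12345, "username": "ExamplePage", "extra": "ignored"}
row = to_details_row(parsed)
output = write_details([row])
```

Fields that are not in the column list are dropped, much as unmapped elements do not reach the output link.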
Compile and execute the job to fetch the required details of the IBM Facebook page, as
shown in figure 11:
Figure 11: Data from the IBM Facebook page.
The output contains details such as the ID of the page, what it is about, when it was
founded, how many likes it has, the talking-about count for this page, the username, the
websites where it is talked about, etc. This data can be used in analytics.
Conclusion: The REST capability in the Hierarchical Data stage can be used to fetch data,
such as social media data, from applications like Facebook, LinkedIn, Twitter, etc., that
expose their services using REST.
Please also see the articles published on developerWorks for integration scenarios with
SoftLayer, Cloudant, and Information Governance Catalog Glossary using DataStage.
Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”