Sunday, 11 September 2016

Difference between Informatica and Datastage

I have used both DataStage and Informatica. In my opinion, DataStage is far more powerful and scalable than Informatica. Informatica has more developer-friendly features, but when it comes to scaling performance, it is much inferior to DataStage.
Here are a few areas where Informatica is inferior:
1. Partitioning - DataStage PX provides many more robust partitioning options than Informatica. You can also re-partition the data whichever way you want.
2. Parallelism - Informatica does not support full pipeline parallelism (although it claims to).
3. File Lookup - Informatica supports flat file lookup, but the caching is horrible. DataStage supports hash files, lookup filesets, and datasets for much more efficient lookups.
4. Merge/Funnel - DataStage has very rich functionality for merging or funnelling streams. In Informatica the only way is to do a Union, which by the way is always a Union-all.

Thursday, 1 September 2016

Datastage Interview Question

http://www.datawarehousing-praveen.com/2013/10/datastage-interview-questions-part-1_720.html

Tuesday, 9 August 2016

MDM

http://news.sap.com/consolidation-harmonization-and-central-management/

https://scn.sap.com/thread/1034712

http://www.sourcemediaconferences.com/MDM/pdf/Tues/MDS/Shewale_Young.pdf

Monday, 14 March 2016

Data Warehousing



Q1 :- Four-Step Dimensional Design Process


Objective: design of a dimensional database by considering four steps in a particular order:

1. Select the business process to model

2. Declare the grain of the business process

3. Choose the dimensions that apply to each fact table row.

4. Identify the numeric facts that will populate each fact table row

https://dwbi1.wordpress.com/2010/03/05/transaction-dimension/
http://byobi.com/blog/2013/09/dimensional-modeling-junk-vs-degenerate/
http://dwhlaureate.blogspot.in/2012/08/junk-dimension.html
https://bintelligencegroup.wordpress.com/2012/06/05/different-types-of-dimensions-and-facts-in-data-warehouse/
http://www.disoln.org/2013/12/Design-Approach-to-Handle-Late-Arriving-Dimensions-and-Late-Arriving-Facts.html

5. MDM :- Consolidation, Harmonization, Centralization, Distribution

Friday, 8 January 2016

Unix Interview Questions

1. Delete all spaces or tabs from a line
:- cat file.txt | tr -d ' \t'
:- sed 's/[ \t]//g' file_A > new_file_A
   [GNU sed; 's/[ \t]*$//' strips only the trailing spaces and tabs]
2. Delete consecutive spaces from a line
:- cat file.txt | tr -s ' \t'
3. Remove blank line from file
:- sed '/^$/d' file.txt
:- grep -v '^$' filename > newfilename
4. Search for a pattern in the context of multiple xml files

:- grep -lR 'pattern' * | grep -E '\.xml$'
5. Print the fields from the 10th to the 12th in a file
:- cut -d',' -f10-12 filename.csv

6. Print the line number with the line containing a pattern in a file

:- grep -nf pattern.txt file.txt
7. Search for repeating lines in a file [- is used after -f to read the patterns from standard input]

:-  head -1  Crane4C1.dat| grep -nf -  Crane4C1.dat

8. Remove CTRL+M from a file

:- sed -e "s/^M//" filename.txt
   [type ^M as CTRL+V followed by CTRL+M; with GNU sed, 's/\r$//' also works]

9. How to find CTRL+M in a file

:-  cat -v filename.txt

10. Print the 10th to 20th lines of a file
:- sed -n 10,20p filename.txt

11. Looping in AWK

12. Create duplicate record from a file using awk.



13. Show the content between two tags in an XML file using sed.

sed -n '/<Tag>/,/<\/Tag>/p' file.xml

14. Find the counts of all the records of files in a directory.

wc -l `find /path/to/directory/*PATTERN* -type f`
 
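15. Remove duplicate records from a file using sed.
 
sort file | sed '$!N; /^\(.*\)\n\1$/!P; D'
sort -u file    [same result without sed; note that uniq -d prints only the duplicated lines]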
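
Items 11 and 12 above are listed without an answer. The following is only a minimal sketch of one way to approach them, assuming comma-separated records; the function names and the sample data are hypothetical and chosen purely for illustration.

```shell
#!/bin/sh
# Item 11: loop over the comma-separated fields of each record with an awk
# for loop. Prints "<record number>:<field number>=<value>" for every field.
list_fields() {
    awk -F',' '{ for (i = 1; i <= NF; i++) print NR ":" i "=" $i }' "$1"
}

# Item 12: print every record twice, i.e. create a duplicate of each record.
duplicate_records() {
    awk '{ print; print }' "$1"
}

# Hypothetical sample data, written to a temporary file for the demonstration.
sample=$(mktemp)
printf 'a,b\nc,d\n' > "$sample"
list_fields "$sample"
duplicate_records "$sample"
rm -f "$sample"
```

To duplicate only some records, a condition can be placed in front of the second print, for example awk '{ print } NR == 1 { print }' to repeat just the first record.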
  

Wednesday, 16 December 2015

Data Stage Admin Question

1. How to configure Oracle Client in Datastage?
2. How to configure ODBC in Datastage?
3. What steps are needed to create a user in DataStage?
=> Log in to the DataStage Administrator client, then do the steps below:
 a. Create the user
 b. Grant a role to the user
 c. Associate a project with the user [i.e. link the user to a project]
 

Wednesday, 25 November 2015

How to set memory size in Hierarchical Stage

Check whether the DataStage installation is 32-bit or 64-bit:

$ . ./dsenv
$ file $APT_ORCHHOME/bin/osh
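
The output of the file command is a free-form description, so the bitness has to be read out of it. A small sketch of how that could be scripted follows; it assumes the description contains a token such as "32-bit" or "64-bit" (typical for ELF binaries on Linux, other platforms may word it differently), and the function name is hypothetical.

```shell
#!/bin/sh
# Read one or more lines of `file` output on stdin and print the first
# "32-bit" or "64-bit" token found. Prints nothing if no such token exists.
bitness_from_file_output() {
    grep -Eo '(32|64)-bit' | head -n 1
}

# Usage, assuming dsenv has been sourced so that APT_ORCHHOME is set:
#   file "$APT_ORCHHOME/bin/osh" | bitness_from_file_output
```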

Run the command below to check whether the environment allows a maximum heap size of 4096 MB.

java -Xmx4096m myClass
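
To find the largest size that is accepted rather than testing a single value, the same check can be repeated over several candidates. The sketch below swaps myClass for java -version so that no application class is needed; the function name and the candidate sizes are assumptions made for illustration, and the result only reflects the JVM that is first on the PATH.

```shell
#!/bin/sh
# Print the largest heap size (in MB) among the given candidates for which
# the JVM starts successfully, or 0 if none of them is accepted.
probe_max_heap() {
    best=0
    for size in "$@"; do
        # `java -version` exits non-zero when the requested heap is rejected.
        if java "-Xmx${size}m" -version >/dev/null 2>&1; then
            if [ "$size" -gt "$best" ]; then
                best=$size
            fi
        fi
    done
    echo "$best"
}

# Example call with candidate sizes chosen only for illustration:
#   probe_max_heap 1024 2048 4096
```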

If the JVM cannot be loaded with that heap size, go to the AppServer directory and check wsadmin.sh:

vi /home/IBM/WebSphere/AppServer/bin/wsadmin.sh

Search for PLATFORM, find the PREF_JVM_OPTION entry for Linux, and check whether MaxPermSize is set.
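
Instead of searching inside vi, the same check can be done non-interactively with grep. This is only a sketch: the function name is hypothetical, and it looks for the literal string MaxPermSize without interpreting the surrounding script logic.

```shell
#!/bin/sh
# Print the lines (with line numbers) of a script that mention MaxPermSize.
# The exit status is non-zero when the option does not appear in the file.
show_maxpermsize() {
    grep -n 'MaxPermSize' "$1"
}

# Usage with the path from the step above:
#   show_maxpermsize /home/IBM/WebSphere/AppServer/bin/wsadmin.sh
```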

Now set heap_size in the properties of the Hierarchical stage to 4096 MB (the default is 256 MB).