Thursday, 27 March 2014

Exercise 7: How to create and use Pipeline Lookup or How to create Active Lookup?


Exercise 7: How to Achieve Target below

(Source)
ins_id,premium_year,premium_Amount
101,2010,NULL
101,2011,4500
101,2012,NULL
102,2010,NULL
102,2011,6000
102,2012,NULL

Note: NULL is just a representation. There is no value over there.

(Target)
ins_id,premium_year,premium_Amount
101 2010 4500
101 2011 4500
101 2012 4500
102 2010 6000
102 2011 6000
102 2012 6000

Let's Chat: I know we have already solved this exercise in Exercise 6, CASE 2. Here I am taking the same scenario and will resolve it through an Active Pipeline Lookup.

What is Pipeline Lookup?
A pipeline lookup performs a lookup on an application source that is not a relational table or flat file.
A Source Qualifier is used as the lookup source for a Pipeline Lookup.
The Source Qualifier that is used as the lookup source remains unconnected (abandoned) in the mapping.

What is the Use of Pipeline Lookup?
To improve performance when processing relational or flat file lookup sources, create a pipeline Lookup transformation instead of a relational or flat file Lookup Transformation. 

Pipeline Lookup can be created in Connected or Unconnected mode.


Theory/Analysis:
After analyzing the source and target we can easily conclude that the premium amount of each ID repeats for every row of that ID. Here, however, the value appears in the second row rather than the first, which makes it slightly more challenging. For this we will create an Active Pipeline Lookup that takes all rows of an ID and passes only the value where Premium_Amount is NOT NULL. If you want to create an Active Lookup, you simply have to select one check box while creating the lookup.
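
Before building the mapping, here is a minimal Python sketch of the target logic (an illustration only, not the Informatica engine): collect the one non-null premium per ins_id, then emit it for every row of that ID.

src = [(101, 2010, None), (101, 2011, 4500), (101, 2012, None),
       (102, 2010, None), (102, 2011, 6000), (102, 2012, None)]

# Lookup + Filter: keep only the premium value that is NOT NULL for each ins_id
premium = {}
for ins_id, year, amount in src:
    if amount is not None:
        premium[ins_id] = amount

# Every source row goes to the target with that ID's premium attached
for ins_id, year, _ in src:
    print(ins_id, year, premium[ins_id])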

Practically:

Step 1:
Drag and Drop your source twice and Target once.

Step 2:
Lookup

While creating the Lookup, select the Source Qualifier as the lookup source and select the check box Return All Values on Multiple Match. (See picture below.)


Connect Ins_ID and Premium_Year from the Source Qualifier (the one upon which you are not doing the lookup) to the Lookup Transformation.
Condition Tab
Ins_Id = Ins_Id1

Step 3
Filter
Connect Ins_ID, Premium_Year (columns of the source) and the Premium_Amount column of the lookup from the Lookup to the Filter Transformation.

Filter Condition: NOT ISNULL(Premium_Amount)

Step 4
Connect Your Columns from Filter to Target.

Step 5:
Give the source connection or Location/File_Name for both Source Qualifiers.
There is no need to give a source connection or Location/File_Name for the lookup table.

Note: The second source and Source Qualifier upon which you are doing the Pipeline Lookup will remain unconnected (abandoned) in the mapping; no need to do anything with them. Your mapping will look like the image below...

BINGO!!!

Note: You can also achieve this by doing an Active Lookup directly upon your source. The approach explained above is just a demonstration of how to create a Pipeline Lookup.


Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.

hpandey.harsh@gmail.com

Wednesday, 26 March 2014

Exercise 6: How to replace Null with Value

Exercise 6: How to replace NULL with value and achieve target below?

CASE 1
(Source)
ins_id,premium_year,premium_Amount
101,2010,4500
101,2011,NULL
101,2012,NULL
102,2010,6000
102,2011,NULL
102,2012,NULL

Note: NULL is just a representation. There is no value over there.

(Target)
ins_id,premium_year,premium_Amount
101 2010 4500
101 2011 4500
101 2012 4500
102 2010 6000
102 2011 6000
102 2012 6000

Theory/Analysis: After analyzing the source and target we can easily conclude that the premium amount of each ID repeats for every row of that ID. So, if we can somehow store the first Premium_Amount value of each ID and pass it to every row, then we are done.

Practically:
Step 1
Drag and Drop your source and Target Instance

Step 2
Expression
Create Three ports

PREMIUM_COMPARE (Variable Port) = IIF(ISNULL(premium_Amount),OLD_PREMIUM,premium_Amount)

OLD_PREMIUM (Variable Port) = PREMIUM_COMPARE

O_PREMIUM_AMOUNT (Output Port) = PREMIUM_COMPARE
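
A minimal Python sketch of how these three ports evaluate row by row (an illustration of the variable-port logic only, using the CASE 1 source data; this is not the Informatica engine):

rows = [(101, 2010, 4500), (101, 2011, None), (101, 2012, None),
        (102, 2010, 6000), (102, 2011, None), (102, 2012, None)]

old_premium = None                        # OLD_PREMIUM variable port
for ins_id, year, amount in rows:
    premium_compare = amount if amount is not None else old_premium  # PREMIUM_COMPARE
    old_premium = premium_compare         # OLD_PREMIUM = PREMIUM_COMPARE
    print(ins_id, year, premium_compare)  # O_PREMIUM_AMOUNT goes to the target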

Step 3
Connect your target from the Expression. Only one thing needs care while connecting: connect the Premium_Amount column through the O_PREMIUM_AMOUNT port.

That's it... Wow!!! It was this easy, completed in only three steps.

BINGO!!!




CASE 2
(Source)
ins_id,premium_year,premium_Amount
101,2010,NULL
101,2011,4500
101,2012,NULL
102,2010,NULL
102,2011,6000
102,2012,NULL

Note: NULL is just a representation. There is no value over there.

(Target)
ins_id,premium_year,premium_Amount
101 2010 4500
101 2011 4500
101 2012 4500
102 2010 6000
102 2011 6000
102 2012 6000

Theory/Analysis: After analyzing the source and target we can easily conclude that the premium amount of each ID repeats for every row of that ID, but here the value appears in the second row rather than the first. That makes it a little more challenging, and our CASE 1 solution does not work here. We have to think about some other logic.

Practically:

Step 1
Drag and Drop your source and Target Instance

Step 2
Aggregator
Connect ins_id and Premium_Amount from Source Qualifier
Create One Port

O_Premium_Amount (Output Port) = MAX(Premium_Amount)

Step 3
Sorter
Connect Ins_Id and Premium_Year from Source Qualifier

Select Ins_Id as key port.

Ins_ID (key)

(Clarification: We need a Sorter here because we are going to join two pipelines of the same Source Qualifier, and for that the Joiner Transformation needs sorted input.)

Step 4
Joiner
Connect Ins_Id and Premium_Year from Sorter and Ins_Id, O_Premium_Amount from Aggregator

Set Properties below...

Condition Tab

Ins_Id1 = Ins_Id

Join Type: Normal

Step 5
Connect Ins_Id and Premium_Year and Premium_Amount from Joiner to Target.
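
For clarity, here is a small Python sketch of what the Aggregator plus Joiner pipeline computes (an illustration only, using the CASE 2 source data):

rows = [(101, 2010, None), (101, 2011, 4500), (101, 2012, None),
        (102, 2010, None), (102, 2011, 6000), (102, 2012, None)]

# Aggregator: MAX(Premium_Amount) grouped by ins_id (MAX ignores NULLs)
max_premium = {}
for ins_id, year, amount in rows:
    if amount is not None:
        max_premium[ins_id] = max(max_premium.get(ins_id, amount), amount)

# Joiner (Ins_Id1 = Ins_Id, normal join), then load to the target
for ins_id, year, _ in rows:
    print(ins_id, year, max_premium[ins_id])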

BINGO!!!




Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.

hpandey.harsh@gmail.com 

Monday, 24 March 2014

Exercise 5: How to find sum of negative and sum of positive integer separately.

Exercise 5: How to find sum of negative and sum of positive integer separately.

Ex: 
Source 
Account_Balance
500
600
-300
-200
400
-100

Target
Total_Account_Balance
1500
-600

Theory/Analysis: After analyzing the source and target we can conclude that we have to find the sums of the positive and negative numbers separately. The only thing we need to do is identify the positive and negative integers; after that, finding the sum is not a big task. That's it. Identifying them is also very easy :-)

Step 1
Drag and drop one instance of source and two instance of target.

Step 2
Router
Create one Group
Negative_Number=SUBSTR(TO_CHAR(Account_Balance), 1, 1)='-'

Step 3
Aggregator
Create two Aggregator Transformation

Connect Aggregator1 from Negative_Number Group of Router Transformation
Create one port
Total_Balance (Output Port) = Sum(Account_Balance1)

Connect Aggregator2 from Default Group of Router Transformation
Create one port
Total_Balance (Output Port) = Sum(Account_Balance2)

Step 4:
Connect both your target instances from the separate Aggregator Transformations.
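
A short Python sketch of what the Router plus the two Aggregators compute (an illustration only, using the source data above; the sign test mirrors the SUBSTR(TO_CHAR(...)) condition):

balances = [500, 600, -300, -200, 400, -100]

# Negative_Number group: first character of the value is '-'
negative_total = sum(b for b in balances if str(b).startswith('-'))
# DEFAULT group: everything else
positive_total = sum(b for b in balances if not str(b).startswith('-'))

print(positive_total)  # 1500
print(negative_total)  # -600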

BINGO!!!


Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.

hpandey.harsh@gmail.com 

Sunday, 23 March 2014

Exercise 4: How to load top 2 salaries for each department without using Rank Transformation and SQL Queries in Source Qualifier?

Exercise 4: How to load top 2 salaries for each department without using Rank Transformation and SQL queries in Source Qualifier.

Analysis/Theory: After analyzing the requirement we can conclude that after sorting the data by salary in descending order, if we can pass only two records for each department group, then we are done. That's it!!! Isn't it easy? :-)

Practically

Step 1:
Drag and drop one source and three instance of target table.

Step 2: 
Sorter
Sort the data based upon SAL.

Select SAL as key port and Descending order

Step 3
Router
Create three groups.

Dept10_Data =  Dept=10
Dept20_Data =  Dept=20
Dept30_Data =  Dept=30

Step 4
Sequence Generator
Create Three sequence Generator Transformation
(Change End Value = 2, Start Value = 1, and check Cycle.)
This way your mapping will always give you the expected output.

In this step we assign an ID to each and every group of records. (If you don't want to use a Sequence Generator, use an Expression and build logic to assign a unique ID to each record; but in that case we also need three different Expression Transformations. :-)

Step 5
Filter
Create three different Filter Transformation

Connect from each group of Router Transformation to different Filter Transformation like below
Dept10_Data To Filter 1
Dept20_Data To Filter 2
Dept30_Data To Filter 3

Set the filter condition NEXTVAL <= 2 in all Filter Transformations.

Step 6
Connect all three instance of target from different Filter Transformation.

 Filter 1 To Target Instance 1
 Filter 2 To Target Instance 2
 Filter 3 To Target Instance 3
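
A hedged Python sketch of the intended result (an illustration only, with hypothetical EMP rows): sort by salary descending and keep the first two rows seen per department, which is what the per-group ID plus the NEXTVAL <= 2 filter is meant to achieve.

emps = [(7839, 10, 5000), (7782, 10, 2450), (7934, 10, 1300),
        (7788, 20, 3000), (7566, 20, 2975), (7369, 20, 800),
        (7698, 30, 2850), (7499, 30, 1600), (7521, 30, 1250)]

emps.sort(key=lambda e: e[2], reverse=True)   # Sorter: SAL descending

seen = {}                                     # per-department row counter
for empno, deptno, sal in emps:
    seen[deptno] = seen.get(deptno, 0) + 1
    if seen[deptno] <= 2:                     # Filter: NEXTVAL <= 2
        print(empno, deptno, sal)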

Yahoo!!! It was pretty easy. Keep practicing and learning.




Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.

hpandey.harsh@gmail.com 

Saturday, 22 March 2014

Scenario 16: How to handle comma in a comma delimited file?





Scenario 16: How to handle comma in a comma delimited file?

Theory/Analysis: There are multiple cases here. We will examine each case one by one. First of all, we need to do data profiling on our source file.

What is Data Profiling?
Data Profiling is a validation process through which we analyze the data to check whether it is fit to be used as a source. If not, we decide what changes need to be made to make it fit, or sometimes we even push it back to the client with the reason that the data is not in a format that can be treated as a source as per the requirement.

CASE 1: More than one comma [,] appears just after the delimiter [,].
CASE 2: A comma [,] appears in the data itself. Ex: the address column has data like 'CA,USA'.
CASE 3: Some other character, like [+], appears anywhere in the data.

Practically

CASE 1: More than one comma [,] appears just after the delimiter [,].

Data
CID,  CNAME, CITY, CNTRY,  PHONE NUMBER
1234,,,SMITH,CA,USA,,,9845679000
1235,JOHN,NY,USA,9996805944
1236,NORMA,,,CA,USA,,,7804567893
1237,CARL,,,NY,USA,889950493

Solution: It's very easy to handle. Informatica has already provided an option to handle this case.

While importing the source file metadata, in the Flat File Import Wizard, Step 2 of 3, check Treat consecutive delimiters as one.





Now it's good to go ahead with the source file.

CASE 2: A comma [,] appears in the data itself. Ex: the address column has data like 'CA,USA'.

Data
CID, CNAME, ADD, PHONE_NUMBER
1234,SMITH,CA,USA,9845679000
1235,JOHN,NY,USA,9996805944
1236,NORMA,CA,USA,7804567893
1237,CARL,NY,USA,8899504934

Solution: There are two approaches to resolve this issue.

Approach 1: Through a script, modify your source and put single quotes or double quotes around each and every value.

Data After Modification
CID, CNAME, ADD, PHONE_NUMBER
'1234','SMITH','CA,USA','9845679000'
'1235','JOHN','NY,USA','9996805944'
'1236','NORMA','CA,USA','7804567893'
'1237','CARL','NY,USA','8899504934'

While importing the file metadata, in the Flat File Import Wizard, Step 2 of 3, select the Single Quotes radio button in the Text Qualifier section. (If you have modified your source with double quotes, then please select Double Quotes.)





Approach 2: Read the complete file as fixed width and, in an Expression, split it into columns with the help of the SUBSTR() and INSTR() functions. The only thing we need to take care of: while splitting the third column's data we have to start from the second occurrence of the comma and go till the fourth occurrence of the comma.
(We have already done this in many other scenarios.)
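
Here is a small Python sketch of that splitting idea (an illustration only; the instr helper mimics Informatica's INSTR, and the sample line is from the data above):

line = "1234,SMITH,CA,USA,9845679000"

def instr(s, sub, start, occurrence):
    # 1-based position of the nth occurrence, like Informatica's INSTR
    pos = start - 1
    for _ in range(occurrence):
        pos = s.find(sub, pos) + 1
        if pos == 0:
            return 0
    return pos

cid   = line[: instr(line, ',', 1, 1) - 1]
cname = line[instr(line, ',', 1, 1): instr(line, ',', 1, 2) - 1]
add   = line[instr(line, ',', 1, 2): instr(line, ',', 1, 4) - 1]   # 'CA,USA'
phone = line[instr(line, ',', 1, 4):]
print(cid, cname, add, phone)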

A real-time secret :-) We create a reusable transformation for splitting records into columns based upon the delimiter character, and we use it whenever we need it.
A second case where we depend upon a reusable transformation is removing unwanted spaces from data while reading from files. Yes, LTRIM(RTRIM()).




CASE 3: Some other character, like [+], appears anywhere in the data.
CID, CNAME, ADD, PHONE_NUMBER
1234,SMITH,CA,USA,+19845679000
1235,JOHN,NY,USA,+19996805944
1236,NORMA,CA,USA,+17804567893
1237,CARL,NY,USA,+18899504934

Solution: While importing the file metadata, in the Flat File Import Wizard, Step 2 of 3, check Remove escape character from data and give your character + in the Escape Character text box.


BINGO!!!

Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Thursday, 20 March 2014

Scenario 15: How to Implement Hybrid SCD or SCD Type 6




Scenario 15: How to Implement Hybrid SCD or SCD Type 6

Source Table Structure
EMPLOYEE_ID    NUMBER(4,0)
FIRST_NAME    VARCHAR2(10 BYTE)
LAST_NAME    VARCHAR2(10 BYTE)
DOB                     DATE
EMAIL                     VARCHAR2(30 BYTE)
PHONE_NUMBER   NUMBER(10,0)
HIRE_DATE             DATE
JOB_ID                     NUMBER(4,0)
SALARY                     NUMBER(10,2)
COMMISSION     NUMBER(10,2)
MANAGER_ID     NUMBER(4,0)
DEPARTMENT_ID    NUMBER(4,0)

Target Table Structure
DWID                         NUMBER
EMPLOYEE_ID         NUMBER(4,0)
FIRST_NAME        VARCHAR2(10 BYTE)
LAST_NAME        VARCHAR2(10 BYTE)
DOB                        DATE
EMAIL                        VARCHAR2(30 BYTE)
PHONE_NUMBER NUMBER(10,0)
HIRE_DATE                DATE
JOB_ID                        NUMBER(4,0)
SALARY                        NUMBER(10,2)
COMMISSION        NUMBER(10,2)
MANAGER_ID       NUMBER(4,0)
DEPARTMENT_ID    NUMBER(4,0)
OLD_JOB_ID        NUMBER
VERSION               NUMBER

Theory/Analysis
We all know the concepts of SCD1, SCD2 and SCD3. Here we have to combine all three in a single dimension table.
CASE 1: If new record then insert into target
CASE 2: If MANAGER_ID updated in Source then Update same in Target (SCD1)
CASE 3: If PHONE_NUMBER updated in Source then Insert it as a new record to target and update version for previous record in target. (SCD2)
CASE 4: If JOB_ID updated in Source then insert previous JOB_ID to OLD_JOB_ID column and insert current JOB_ID to JOB_ID column in target. (SCD3)

Practically: 
Step 1
Drag and drop one instance of source and five instance of target
(Make sure your target table has a primary key on the DWID column. If it's not there at the database level, create it at the Informatica level.)
Target Designer --> Right-click on your target --> Edit --> Columns tab (on the right side of the column you will see NOT A KEY; flip it to PRIMARY KEY).

Step 2
Lookup
Create Lookup on your target table.
Drag and drop EMPLOYEE_ID from SQ to Lookup.
Condition Tab
EMPLOYEE_ID = EMPLOYEE_ID1
Set Lookup Policy on Multiple Match in Properties Tab = Use Last Value

Step 3
Expression
Drag and drop DWID, EMPLOYEE_ID, JOB_ID, OLD_JOB_ID, MANAGER_ID, PHONE_NUMBER from Lookup to Expression and all columns from SQ.
Create six ports.

NEW_REC  (Output Port) =IIF(ISNULL(EMPLOYEE_ID_TGT), 1, 0)

UPD_SCD1 (Output Port) =IIF(NOT ISNULL(EMPLOYEE_ID_TGT) AND (MANAGER_ID_TGT != MANAGER_ID), 2, 0)

UPD_SCD2  (Output Port) =IIF(NOT ISNULL(EMPLOYEE_ID_TGT) AND (PHONE_NUMBER_TGT !=PHONE_NUMBER), 3, 0)

UPD_SCD3 (Output Port) = IIF(NOT ISNULL(EMPLOYEE_ID_TGT) AND (JOB_ID_TGT != JOB_ID), 4, 0)

NEW_VERSION  (Output Port) = 1
OLD_VERSION  (Output Port) = 0

Note: For better understanding, I have suffixed _TGT to all ports coming from the Lookup.
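
A simplified Python sketch of this flagging logic (an illustration only; it assumes one change type per row, while the ports above are evaluated independently; src is the source row and tgt is the row returned by the lookup on the target, or None if there is no match):

def classify(src, tgt):
    if tgt is None:
        return 'NEW_REC'      # CASE 1: brand new record, insert it
    if tgt['MANAGER_ID'] != src['MANAGER_ID']:
        return 'UPD_SCD1'     # CASE 2: overwrite MANAGER_ID in place
    if tgt['PHONE_NUMBER'] != src['PHONE_NUMBER']:
        return 'UPD_SCD2'     # CASE 3: insert a new version, retire the old row
    if tgt['JOB_ID'] != src['JOB_ID']:
        return 'UPD_SCD3'     # CASE 4: move the old JOB_ID to OLD_JOB_ID
    return 'NO_CHANGE'

print(classify({'MANAGER_ID': 7839, 'PHONE_NUMBER': 999, 'JOB_ID': 20}, None))  # NEW_REC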

Step 4
Router
We need to carry only DWID, OLD_JOB_ID and JOB_ID_TGT from the Lookup; the rest of the columns are source columns.
Create four groups
NEW_REC = 1
UPD_SCD1 = 2
UPD_SCD2 = 3
UPD_SCD3 = 4

Step 5:
Sequence Generator
(No need to change any default properties.)

Step 6
Connect all source column ports along with NEW_VERSION from the NEW_REC group to Target Instance 1.
Connect NEXTVAL port from sequence generator to DWID


Step 7
Update Strategy
Drag and Drop MANAGER_ID and DWID from UPD_SCD1 group
Set Update Strategy expression = DD_UPDATE

Step 8
Connect the MANAGER_ID and DWID columns from the Update Strategy to Target Instance 2.

Step 9
Connect all source column ports along with NEW_VERSION and OLD_JOB_ID from the UPD_SCD2 group to Target Instance 3.
Connect NEXTVAL port from Sequence generator to DWID.

Step 10
Update Strategy
Drag and Drop OLD_VERSION and DWID from UPD_SCD2 group
Set Update Strategy expression = DD_UPDATE

Step 11
Connect OLD_VERSION and DWID from update strategy to Target instance 4

Step 12
Update Strategy
Drag and drop JOB_ID_TGT, JOB_ID and DWID from the UPD_SCD3 group.
Set Update Strategy expression = DD_UPDATE

Step 13
Connect JOB_ID_TGT to OLD_JOB_ID and JOB_ID to JOB_ID and DWID to DWID to Target instance 5.

Step 14:
Session-->Properties Tab
Set Treat Source Rows as: DATA DRIVEN
Don't forget to give the relational connection for the Lookup Transformation as well.

What are you waiting for? Run it, Test it, Analyse it...

BINGO!!!



 Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com 

Tuesday, 18 March 2014

Scenario 5C: How to Convert Column To Row (4)

Scenario 5C: How to achieve target below...

(Source)

ID,NAME,LOC
101,null,null
null,SIMITH,null
null,null,USA
102,null,null
null,JOHN,null
null,null,UK

Note: NULL is just a representation. There is no value in that column.

(Target)
ID    NAME    LOC
101 SIMITH USA
102 JOHN UK

Let's Chat: This scenario is just the opposite of Scenario 13 A. There we first converted row to column, then column to row. Here we are going to do just the opposite.
In case you want to brush up on Scenario 13 A:

http://informaticachamp.blogspot.in/2014/03/scenario-13-how-to-convert-row-to-column_16.html

Theory/Analysis:
By analyzing the source and target we can conclude that first we need to convert column to row using a Normalizer; then we will get three occurrences for each record, which we filter based on whether the value is not null. Then it's simple: concatenate the values in sets of three, i.e. from the ID value through the LOC value. OK... now we have all the data in rows but in a single column, so use an Expression to split it into three column values. That's it. Got it, or confused? Let's do it practically.

Practically

Step 1: 
Drag and Drop your source and target to mapping designer.

Step 2:
Normalizer
Create one port in Normalizer tab

Column Name  Occurs    Data Type   Prec
ROWS             3               String          100

Step 3
Filter
Connect ROWS and GCID_ROWS from Normalizer to Filter

Filter Condition= NOT ISNULL(ROWS)

Step 4
Expression
Create three ports

ADD_ROWS (Variable Port) = IIF(ISNULL(OLD_ROWS) OR GCID_ROWS=1, ROWS, OLD_ROWS||','||ROWS)

OLD_ROWS (Variable Port) = ADD_ROWS

O_ADD_ROWS (Output Port) = ADD_ROWS

Step 5
Filter
Connect GCID_ROWS and O_ADD_ROWS column from Expression

Filter Condition=GCID_ROWS=3

Step 6
Expression
Create three ports

ID (Output Port)= SUBSTR(O_ADD_ROWS, 1, INSTR(O_ADD_ROWS, ',',1,1)-1)

NAME (Output Port)= SUBSTR(O_ADD_ROWS, (INSTR(O_ADD_ROWS, ',',1, 1)+1), ((INSTR(O_ADD_ROWS, ',',1, 2)- INSTR(O_ADD_ROWS, ',',1, 1)-1)))

LOCATION (Output Port)= SUBSTR(O_ADD_ROWS, (INSTR(O_ADD_ROWS, ',',1, 2)+1))

Step 7
Connect ID, NAME, LOCATION to your target.
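
To see the whole flow end to end, here is a compact Python sketch of this pipeline (an illustration only, not the mapping itself):

src = [(101, None, None), (None, 'SIMITH', None), (None, None, 'USA'),
       (102, None, None), (None, 'JOHN', None), (None, None, 'UK')]

# Normalizer + Filter: flatten the three columns to rows and drop the NULLs
values = [v for row in src for v in row if v is not None]

# Expression + Filter: build sets of three values, then split into ID, NAME, LOC
for i in range(0, len(values), 3):
    ins_id, name, loc = values[i:i + 3]
    print(ins_id, name, loc)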

Now, what are you waiting for? Run it and get your expected output. :-)

BINGO!!!

In case you are interested in seeing the other Column-to-Row conversion scenarios...

http://informaticachamp.blogspot.in/2014/03/scenario-5-how-to-convert-column-to-row.html

http://informaticachamp.blogspot.in/2014/03/scenario-5a-how-to-convert-column-to-row.html

http://informaticachamp.blogspot.in/2014/03/scenario-5b-how-to-convert-column-to.html



Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Monday, 17 March 2014

Scenario 13 A: How to Convert Row to Column

Scenario 13 A: How to Convert Row to Column

How to achieve target below...

(Source)

SUBJECT,MARKS
JAVA,60
ORACLE,50
INFORMATICA,78
JAVA,45
ORACLE,55
INFORMATICA,52

(Target)
Column1  Column2   Column3
JAVA ORACLE    INFORMATICA
60             50             78
JAVA ORACLE    INFORMATICA
45              55             52

Do you want to write it into a single column? No issue, I have covered both; for a single column you just have to skip Step 5.

Theory/Analysis
After analyzing source and target we can conclude that first we have to convert row to column then column to row. Confused.... What are you talking about first first row to column then column to row....
Wait wait... Let me help you.
First Row to Column: This we have to achieve because JAVA, ORACLE, INFORMATICA. Is in't it converted from row to column. Agreed? OK..
Second Column to Row: This we have to do because in Source we have two columns and second column data is coming to second row.
Hope it's clear by now. Let's do it Practically

Practically

Step 1:
Drag and Drop your source and target to Mapping Designer.

Step 2:
Expression
Create 10 Ports

ID (Variable Port) = 1

NEW_ID (Variable Port) = IIF(OLD_ID=3, ID,  OLD_ID+1)

OLD_ID (Variable Port) = NEW_ID

SUBJECT_CONCAT (Variable Port) = IIF(NEW_ID = 1 OR ISNULL(OLD_SUBJECT), SUBJECT, OLD_SUBJECT||','||SUBJECT)

OLD_SUBJECT (Variable Port) = SUBJECT_CONCAT

MARKS_CONCAT (Variable Port) = IIF(NEW_ID =1 OR ISNULL(OLD_MARKS), MARKS,OLD_MARKS||','||MARKS)

OLD_MARKS (Variable Port) = MARKS_CONCAT

O_NEW_ID (Output Port) = NEW_ID

O_SUBJECT (Output Port) = SUBJECT_CONCAT

O_MARKS (Output Port) = MARKS_CONCAT

Step 3
Filter
Filter Condition = O_NEW_ID=3
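
A small Python sketch of what Steps 2 and 3 accomplish (an illustration only): number the rows 1..3 in a cycle, concatenate SUBJECT and MARKS within each set of three, and keep only the row where the counter reaches 3.

src = [('JAVA', 60), ('ORACLE', 50), ('INFORMATICA', 78),
       ('JAVA', 45), ('ORACLE', 55), ('INFORMATICA', 52)]

new_id, subj_concat, marks_concat = 0, '', ''
for subject, marks in src:
    new_id = 1 if new_id == 3 else new_id + 1                        # NEW_ID / OLD_ID
    subj_concat = subject if new_id == 1 else subj_concat + ',' + subject
    marks_concat = str(marks) if new_id == 1 else marks_concat + ',' + str(marks)
    if new_id == 3:                                                  # Filter: O_NEW_ID = 3
        print(subj_concat)    # JAVA,ORACLE,INFORMATICA
        print(marks_concat)   # 60,50,78  (then 45,55,52)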

Step 4
Normalizer
Create one Column in Normalizer Tab

Column Name  Occurs    Data Type   Prec
COLUMN1        2                String          100

Step 5
(Optional step, your choice. If you want to skip it because you want to load the data into a single-column target, go ahead and connect the COLUMN1 port from the Normalizer to the target.)
I am using this step for better understanding: split the column value into three different columns based upon the comma.
Expression
Create three ports
COL1 (Output Port) =SUBSTR(COLUMN1,1,INSTR(COLUMN1, ',',1, 1)-1)

COL2 (Output Port) =SUBSTR(COLUMN1, (INSTR(COLUMN1, ',',1, 1)+1),  ((INSTR(COLUMN1, ',',1, 2)- INSTR(COLUMN1, ',',1, 1)-1)))

COL3 (Output Port) =SUBSTR(COLUMN1, (INSTR(COLUMN1, ',',1, 2)+1))

Step 6: 
Excel Target: At the session level, for the target file properties, give the extension .csv and enjoy your Excel target. What did you say? It's not an Excel target? OK!! Give the extension .xls, but then you will receive an error while opening the target file; ignore it and proceed to see the data. If you give .xlsx then you won't be able to open it at all; it will throw an error saying the file is corrupted or not in the proper format.
Conclusion: Informatica doesn't create Excel targets. To be on the safer side, what we do in real time is create the target in .csv format and have a script convert it to an Excel file with the .xls or .xlsx extension.

BINGO!!!


Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Friday, 14 March 2014

Scenario 14: How to send a notification for long running Workflow/Sessions?



Scenario 14: How to send notification for long running workflow/sessions?

Ex: If a particular workflow runs longer than a specified time, we need to send a notification to the concerned team.
This solution helps Production Support folks a lot; it saves them from being a watchdog.

Let's Chat: There are several approaches to this, and unfortunately every approach has some drawback. It totally depends upon your requirement which drawback you can live with, or which approach gets approval in the Solution Review Meeting. :-(

Theory/Analysis:
The scenario is crystal clear and yes, we need a Timer task to handle this, but we have to look after a few cases, so let's talk about the cases first.

CASE 1: If the workflow (sessions) exceeds its specified running time, send a mail.
CASE 2: If the workflow (sessions) completes within the specified time, then we need to stop the timer.

Practically

Step 1:
Create a Timer task in a concurrent manner, so that the timer starts immediately when the workflow starts. (See image below.)
Set Relative time
DAY      HH      MM
0             1          30

Note: Expected time to complete workflow is 1:30 Hrs from time of starting.

Step 2:
Create a Email Task.
Set
Email User Name: 24*7 workers@team.com
Email Subject: Workflow is going beyond our control
Email Text: Wake Up, Dude. It's your turn now.

Step 3: 
Create a Control Task at the end of your workflow execution pipeline. (See Image Below)
Set Control Option: Stop top-level workflow

Note: To test this solution, please reduce the timer's time so that it is lower than the time the workflow takes to complete.

BINGO!!!

Drawback of the approach: your workflow and Timer status will show as Stopped, not Succeeded, if the workflow completes within the specified time.

Do you want me to explain another approach as well, the one we opted for in our project? If yes, then leave me a comment :-) Ha ha ha, it seems I have started blackmailing... No need to do this, my friend; I am working on that approach as well and will post it soon. I want your comments from your heart, not from your mind. Thanks.

Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Thursday, 13 March 2014

Scenario 5B: How to Convert Column To Row (3)

Scenario 5B: How to Achieve Target below...

(Source)
ENAME,ENAME1,ENAME2
John,Mathew,Stev
Parker,Heden,Waugh

(Target)
FNAME,LNAME
John,Parker
Mathew,Heden
Stev,Waugh

Theory/Analysis
After analyzing the source and target we can conclude that LNAME is coming from the second row. After converting column to row, we have to combine the first row and the second row in such a way that they load as FNAME and LNAME to the target.

Practically

Step 1: 
Drag and Drop your source and target to mapping designer.

Step 2:
Normalizer
Create a port in Normalizer Tab

PORT_NAME       DATA_TYPE      OCCURS
NAME                   String                     3

Step 3
Sorter
Take Name and GCID_NAME column to Sorter from Normalizer.
GCID_NAME (key)  Check

Step 4
Expression
Create Six Ports
GCID_COMPARE (Variable Port)= IIF(GCID_NAME!=OLD_GCID, GCID_NAME,OLD_GCID)

FNAME_LNAME_FLAG (Variable Port)= IIF(GCID_COMPARE=OLD_GCID, 1,0)

OLD_GCID  (Variable Port)= GCID_COMPARE

FNAME_LNAME_CONCAT (Variable Port)= IIF(NOT ISNULL(FNAME_LNAME) AND FNAME_LNAME_FLAG=1, FNAME_LNAME||','||NAME, NAME)

FNAME_LNAME (Variable Port)=FNAME_LNAME_CONCAT

OUT_FNAME_LNAME (Output Port)=FNAME_LNAME

Step 5
Aggregator
Drag and drop OUT_FNAME_LNAME and GCID_NAME column to Aggregator
Select GCID_NAME as Group Port (Group Check)
Create two Output Port

FNAME (Output Port)= SUBSTR(OUT_FNAME_LNAME, 1, INSTR(OUT_FNAME_LNAME, ',',1,1) -1)

LNAME(Output Port)= SUBSTR(OUT_FNAME_LNAME, INSTR(OUT_FNAME_LNAME, ',',1,1) +1)

Step 6: Connect FNAME, LNAME to Target.
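
A compact Python sketch of this flow (an illustration only): flatten the three name columns to rows, pair the values that share the same column position (the GCID), and emit FNAME and LNAME.

src = [('John', 'Mathew', 'Stev'), ('Parker', 'Heden', 'Waugh')]

pairs = {}                           # GCID_NAME -> names accumulated for that column
for row in src:
    for gcid, name in enumerate(row, start=1):
        pairs.setdefault(gcid, []).append(name)

for gcid in sorted(pairs):           # Aggregator grouped by GCID_NAME
    fname, lname = pairs[gcid]       # the real mapping splits on the comma instead
    print(fname, lname)              # John Parker / Mathew Heden / Stev Waugh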

BINGO!!!

LIMITATION: As I have already said, solutions for this kind of scenario are always closed-ended, meaning that if your source changes then the solution above will not work. The solution above works only if there are two records in the source. If you have more than two rows in your source, then the logic to split FNAME and LNAME in the Aggregator will also have to change.


Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Wednesday, 12 March 2014

Scenario 13: How to Convert Row to Column

In this section we have several scenarios based upon source data and expected output. We will explore them one by one.

Scenario 13: How to achieve target below...

(Source)
BRAND_ID,PRODUCT,PRICE
100,Cellphone,10000
100,Television,15000
100,Fridge,9000
101,Cellphone,8000
101,Television,12000
102,Cellphone,9000

(Target)
BRAND_ID, PRODUCT, TOTAL_PRICE
100,Cellphone,Television,Fridge,34000
101,Cellphone,Television,20000
102,Cellphone,9000

Analysis/Theory
After analyzing our source and target we can say that we have to pass one record for each group, after combining all the products and adding the prices of all the products as the Total Price. Fine, so once again: sorting, comparison, and if the current record matches the previous one then keep appending the Product and adding the Price. The challenge here is when to pass the records to the target. Ah!!! Simple answer: when the old record and the current record do not match, pass the record... No, no, no; logically that's correct, but practically we won't get the expected result that way :-(
Want to feel the HEAT? Sit on the hot seat.

Practically

Step 1:
Drag and Drop your source and target to mapping designer.

Step 2:
Sorter
Sort the data based upon BRAND_ID

Step 3:
Expression
Create Eleven ports
BRAND_COMPARE (Variable Port) = IIF(OLD_BRAND=BRAND_ID, 1, 0)

OLD_BRAND (Variable Port) = BRAND_ID

PRODUCT_LIST (Variable Port) = IIF(BRAND_COMPARE=1, O_PRODUCT_LIST||','||PRODUCT, PRODUCT)

O_PRODUCT_LIST (Variable Port) = PRODUCT_LIST

ADD_PRICE (Variable Port) = IIF(BRAND_COMPARE=1, (O_PRICE_ADD + PRICE), PRICE)

O_PRICE_ADD (Variable Port) = ADD_PRICE
NEW_ID (Variable Port) = OLD_ID+1
OLD_ID (Variable Port) = NEW_ID
ID (Output Port) = NEW_ID
OUT_PRODUCT_LIST (Output Port) = PRODUCT_LIST
OUT_PRICE (Output Port) = ADD_PRICE

Step 4:
Rank
Create Rank Transformation and drag and drop BRAND_ID, ID, OUT_PRODUCT_LIST, OUT_PRICE to it from expression.
Set properties in Ports Tab.
Select ID as the rank port (the R check box should be checked).
Select BRAND_ID as the group-by port.
Set Top/Bottom --> Top in the Properties tab.

Clarification: We are using the Rank Transformation to get the last record of each BRAND_ID, because we are using variable ports to combine product and price, so the values would otherwise be overwritten when a new BRAND_ID comes along. If anyone has another approach for passing the combined product list and total price to the target, please let us know through the comment section; and no verbal answers, please... post it in a practical way. I spent approximately 2 hours applying different approaches, and in the end I settled on Rank. After all, I am not a genius; I am a beginner and still learning Informatica. Spread your knowledge if you have it; it will help you a lot in your growth. Once again, too much lecture... it seems I am too talkative. Sorry... though it's not true. :-)

Step 5
Connect the BRAND_ID, OUT_PRODUCT_LIST and OUT_PRICE columns to the target.
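
A short Python sketch of what the whole mapping computes (an illustration only; the real mapping does this with variable ports and keeps the last row per brand via Rank):

src = [(100, 'Cellphone', 10000), (100, 'Television', 15000), (100, 'Fridge', 9000),
       (101, 'Cellphone', 8000), (101, 'Television', 12000),
       (102, 'Cellphone', 9000)]

result = {}                                  # BRAND_ID -> (product list, total price)
for brand_id, product, price in sorted(src, key=lambda r: r[0]):   # Sorter: BRAND_ID
    products, total = result.get(brand_id, ('', 0))
    products = product if not products else products + ',' + product
    result[brand_id] = (products, total + price)

for brand_id, (products, total) in result.items():
    print(brand_id, products, total)         # e.g. 100 Cellphone,Television,Fridge 34000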

BINGO!!!



Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Exercise 3: How to remove duplicates? If source is a flat file

Exercise 3: How to remove duplicates? If source is a flat file
OR
How to send distinct records to one target and duplicates to others

Theory/Analysis: When records repeat, we say the table has duplicates. We can detect this by comparing the current record with the previous record. If they match, the record is flagged as a duplicate; otherwise it is a distinct record. Yes, we need to sort the data first before comparing the current and previous records.

Practically:

Step 1:
Drag and Drop your source/target to mapping designer.

Step 2: 
Sorter
Sort the data based upon the port for which you are finding duplicates.

EMPNO     Key (Check)

Step 3:
Expression
Create three ports.
EMPNO_COMPARE (Variable Port) = IIF(EMPNO=OLD_EMPNO, 1, 0)
OLD_EMPNO (Variable Port) = EMPNO
DUP_RECORD (Output Port) = EMPNO_COMPARE
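
A tiny Python sketch of how this duplicate flag behaves (an illustration only, with hypothetical EMPNO values already sorted by the Sorter):

empnos = [7369, 7499, 7499, 7521, 7566, 7566]

old_empno = None
for empno in empnos:
    dup_record = 1 if empno == old_empno else 0   # EMPNO_COMPARE -> DUP_RECORD
    old_empno = empno                             # OLD_EMPNO = EMPNO
    print(empno, 'duplicate target' if dup_record else 'distinct target')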

Step 4:
Router
Create two Groups
New_Record: DUP_RECORD = 0
Dup_Record: DUP_RECORD = 1

Step 5:
Connect New_Record group to distinct target and Dup_Record group ports to duplicate target.

BINGO!!!

NOTE: The above solution eliminates only column-level duplicates, meaning a record is routed to the duplicate target table only if EMPNO repeats.
This exercise is the best one for understanding how variable ports work and how records are processed through the Expression and its ports. (Use the Debugger and analyse it.)



Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com

Tuesday, 11 March 2014

Scenario 12: How to Create Parameter File Dynamically?



Scenario 12: How to Create Parameter File Dynamically?

Let's Chat: There could be any number of cases for which we need to create a parameter file dynamically. Here I am taking a simple case for explanation.

CASE: Our source file name changes for each and every run. Let's say Customer_US for United States customers one day, Customer_Australia for Australian customers the next day, and so on.
We have only one source folder where these files arrive through FTP. Let's say Infa_Shared/SrcFiles/Cust.

Theory/Analysis:
First we need to list the files in the source directory and write them into a file. Yes, by redirecting the dir command output to a file. Treat this file as the source for the mapping which we will use to create the parameter file.
One point we have to take care of here: if more than one file is present in the source folder, then we need to stop creating the parameter file, because we can pass only one file name through the parameter file for processing by the actual mapping.

Practically

Step 1:
Create source metadata manually in Source Analyzer.
Sources Menu-->Create
Source File Name: CUST_PARAM
Database Name: Flat_File
Database Type: Flat File
Create one column: File_Name  Data Type String  Prec 50

Step 2:
Create target metadata manually in the Target Designer.
Targets Menu-->Create
File_Name: Cust_Param_File
Database Type: Flat File
Create one column: Rows  Data Type String  Prec 500

Important Note: We need to decide the session/workflow names of the actual session/workflow which is going to use this parameter file, because we have to hard-code these values in the mapping for the header section of the parameter file.

Ex: Session_Name: S_Actual_Session
      Workflow_Name: WKF_S_Actual_Workflow


Step 3: Drag and drop your created source and target to the mapping designer.

Step 4: 
Aggregator
Drag and drop File_Name from Source Qualifier to Aggregator
Create two ports
Total_Record (Output Port) = COUNT(FILE_NAME)
Dummy (Output Port)= 1
(No Need to select any port as a Group Port)

Step 5:
Expression
Drag and drop File_Name from Source Qualifier to Expression
Create one Port
Dummy (Output Port) = 1

Step 6:
Sorter
Drag and drop File_Name and Dummy column to Sorter from expression
Select File_Name as key

Note: We need to sort the data before sending it to the Joiner, as we are joining two pipelines of the same Source Qualifier.

Step 7:
Joiner
Drag and Drop columns from aggregator and expression to Joiner.
Join Condition: Dummy = Dummy1
Join Type: Normal Join

Step 8:
Expression
Drag and Drop Total_Record and File_Name Port.
Create three output port.

ERROR_HANDLING (Output Port) = IIF(TOTAL_RECORD > 1, ABORT('More than one file is there in the source folder'))

HEADER: (Output Port) = '[SCENARIOS.WF:WKF_S_Actual_Workflow.ST:S_Actual_Session]'

O_FILE_NAME: (Output Port) = '$InputFile_Name='||FILE_NAME
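
For intuition, here is a small Python sketch of what this mapping ends up producing (an illustration only; the folder path and parameter name are taken from the command task and ports in this post, and the real work is of course done by the mapping, not a script):

import os, sys

src_dir = r'C:\Informatica\9.5.1\server\infa_shared\SrcFiles\CUST'
files = os.listdir(src_dir)

if len(files) > 1:                       # ERROR_HANDLING: ABORT()
    sys.exit('More than one file is there in the source folder')

with open('CUST_PARAM.pmf', 'w') as param:
    param.write('[SCENARIOS.WF:WKF_S_Actual_Workflow.ST:S_Actual_Session]\n')   # HEADER
    param.write('$InputFile_Name=' + files[0] + '\n')                           # O_FILE_NAME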

Step 9:
Normalizer
Create a Normalizer. (We need a Normalizer because two column values have to be written out as two rows.)
Right Click-->Edit-->Normalizer Tab

Column_Name   Level    Occurs     Datatype     Prec    Scale
Row                     0           2                String          500       0

Connect HEADER to Row1 and O_FILE_NAME to Row2.

Step 10
Connect Row column from Normalizer to Target.

Step 11:
In Workflow Manager we need to create a Command task before the session, to create the source file for the mapping which we are using to build the parameter file.
Command Task
Create a Command Task in Workflow Manager
Command 1=dir C:\Informatica\9.5.1\server\infa_shared\SrcFiles\CUST\*.* /b > C:\Informatica\9.5.1\server\infa_shared\SrcFiles\Parameters\PARAM_SRC.txt

Set Error Handling Properties in Properties tab: Fail Task if any Commands fail (Check)

Note: My server is a Windows server, so I have written DOS commands. Unix server users, please convert the commands above for Unix.

Step 12:
We need to set a link condition on the link which connects the command task to the session:
$Command.PrevTaskStatus=Succeeded

Step 13:
Set Source and Target File Name in Session as below.
Source: PARAM_SRC.txt
Target: CUST_PARAM.pmf
Note: I have set the Output File Directory and Source File Directory to the same location: $PMSourceFileDir\Parameters.
That means I am reading and creating the parameter file in the same directory.

Yahoo!!! Our parameter file is ready in $PMSourceFileDir\Parameters with the name CUST_PARAM.pmf. Use it in your actual mapping.

The step below has to be added to the actual workflow in which you are using this dynamically created parameter file.

Step 14:
As we have to take care of cleaning up the current parameter file and the source directory for the next run, we need to use one more command task after your actual session, S_Actual_Session.

Command Task
Command 1: del /Q C:\Informatica\9.5.1\server\infa_shared\SrcFiles\Parameters\*.*
Command 2: del /Q C:\Informatica\9.5.1\server\infa_shared\SrcFiles\CUST\*.*

Set Error Handling Properties in Properties tab: Fail Task if any Commands fail (Check)

Create a link condition from your session to the command task: $S_Actual_Session.PrevTaskStatus=Succeeded

BINGO!!!




Feel free to post your doubt/your marks for my effort in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.
hpandey.harsh@gmail.com