Scenario 10: Implementing SCD1 using MD5.
Let's Chat: I am assuming you already know about SCD1. SCD1 type of Dimension in which we have to insert only new record and update if any of record updated in Source.
In conventional implementing mostly we are taking only one column to implement SCD1.
Ex: If SAL updated in source then update SAL in target as well. What if JOB has been changed? Oh!! No, once again I have to write one more logic. Through MD5() We can achieve this easily without writing multiple logic for checking updated record.
Want to Brush up MD5() visit: http://informaticachamp.blogspot.in/p/c.html
Theory/Analysis:
Implement SCD1, means If any new record inserted then Insert in target table and if any record updated then update target as well. If value of any column like ENAME, JOB, MGR, HIREDATE, COMM, SAL, DEPTNO has been changed then update target with same value.
Note: I am not taking EMPNO for update because EMPNO is Primary Key and we all know rules of Primary Key :-)
Practically
Step 1:
While creating your target table (Dimension Table) please add one column
CHEKSUM Varchar2 (32).
If you have already created your target table no issue alter it like below.
Alter table EMP_SCD1 add checksum varchar2 (32);
Assign Primary Key to EMPNO at database level. Not comfortable.... Create it at Informatica Level :-)
Creating Primary key at Informatica Level
Target Designer-->Right Click on your table-->Edit-->Column Tab ( In the right corner you will see NOT A KEY select PRIMARY KEY for EMPNO column from drop down list)
Step 2: We need one instance of Source and two instance of target.
Step 3:
Lookup
Create Lookup on your target table and follow step below.
A) Drag and Drop EMPNO column to Lookup.
B) Create Lookup Condition
Lookup Table Column Operator Transformation Port
EMPNO = EMPNO1
Step 4:
Expression
A) Drag and Drop EMPNO and CHECKSUM column from Lookup to Expression.
Note: In my expression I have used EMPNO_TGT and CHECKSUM_TGT for better understanding that these ports are target ports which are coming from lookup.
Create four Ports
CHECKSUM (Variable Port)= MD5(TO_CHAR(EMPNO)||ENAME||JOB||TO_CHAR(MGR)||TO_CHAR(HIREDATE)||TO_CHAR(SAL)||TO_CHAR(COMM)||TO_CHAR(DEPTNO))
NEW_RECORD (Output Port)= IIF(ISNULL(EMPNO_TGT),1,0)
UPD_RECORD (Output Port)= IIF(NOT ISNULL(EMPNO_TGT) AND
(CHECKSUM!=CHECKSUM_TGT),1,0)
CHECKSUM_OUT (Output Port)=CHECKSUM
Step 5:
Router
Create Two Group
New_Record: New_Record=1
Update_Record: Upd_Record=1
Step 6
Connect all Source Qualifier ports including CHECKSUM from New_Record Group to Target.
Step 7:
Create Update Strategy
A) Drag and Drop all Source Qualifier ports including CHECKSUM from Update_Record Group to Update Strategy
B) Update Stratgey-->Right Click-->Edit-->Properties Tab
Update Stratgey Expression: DD_UPDATE
Step 8: Connect all ports to second instance of target.
Step 9: Create Session and set
Session-->Right Click-->Edit-->Properties-->Set
Treat Source Rows as: Data Driven
Step 10: Don't forget to give relation connection for your Lookup also.
BINGO!!!
Feel free to post your doubt/your marks for my effort in comment section.
Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)
If you like the approach and solution feel free to share with others.
Thanks.
hpandey.harsh@gmail.com
Let's Chat: I am assuming you already know about SCD1. SCD1 type of Dimension in which we have to insert only new record and update if any of record updated in Source.
In conventional implementing mostly we are taking only one column to implement SCD1.
Ex: If SAL updated in source then update SAL in target as well. What if JOB has been changed? Oh!! No, once again I have to write one more logic. Through MD5() We can achieve this easily without writing multiple logic for checking updated record.
Want to Brush up MD5() visit: http://informaticachamp.blogspot.in/p/c.html
Theory/Analysis:
Implement SCD1, means If any new record inserted then Insert in target table and if any record updated then update target as well. If value of any column like ENAME, JOB, MGR, HIREDATE, COMM, SAL, DEPTNO has been changed then update target with same value.
Note: I am not taking EMPNO for update because EMPNO is Primary Key and we all know rules of Primary Key :-)
Practically
Step 1:
While creating your target table (Dimension Table) please add one column
CHEKSUM Varchar2 (32).
If you have already created your target table no issue alter it like below.
Alter table EMP_SCD1 add checksum varchar2 (32);
Assign Primary Key to EMPNO at database level. Not comfortable.... Create it at Informatica Level :-)
Creating Primary key at Informatica Level
Target Designer-->Right Click on your table-->Edit-->Column Tab ( In the right corner you will see NOT A KEY select PRIMARY KEY for EMPNO column from drop down list)
Step 2: We need one instance of Source and two instance of target.
Step 3:
Lookup
Create Lookup on your target table and follow step below.
A) Drag and Drop EMPNO column to Lookup.
B) Create Lookup Condition
Lookup Table Column Operator Transformation Port
EMPNO = EMPNO1
Step 4:
Expression
A) Drag and Drop EMPNO and CHECKSUM column from Lookup to Expression.
Note: In my expression I have used EMPNO_TGT and CHECKSUM_TGT for better understanding that these ports are target ports which are coming from lookup.
Create four Ports
CHECKSUM (Variable Port)= MD5(TO_CHAR(EMPNO)||ENAME||JOB||TO_CHAR(MGR)||TO_CHAR(HIREDATE)||TO_CHAR(SAL)||TO_CHAR(COMM)||TO_CHAR(DEPTNO))
NEW_RECORD (Output Port)= IIF(ISNULL(EMPNO_TGT),1,0)
UPD_RECORD (Output Port)= IIF(NOT ISNULL(EMPNO_TGT) AND
(CHECKSUM!=CHECKSUM_TGT),1,0)
CHECKSUM_OUT (Output Port)=CHECKSUM
Step 5:
Router
Create Two Group
New_Record: New_Record=1
Update_Record: Upd_Record=1
Step 6
Connect all Source Qualifier ports including CHECKSUM from New_Record Group to Target.
Step 7:
Create Update Strategy
A) Drag and Drop all Source Qualifier ports including CHECKSUM from Update_Record Group to Update Strategy
B) Update Stratgey-->Right Click-->Edit-->Properties Tab
Update Stratgey Expression: DD_UPDATE
Step 8: Connect all ports to second instance of target.
Step 9: Create Session and set
Session-->Right Click-->Edit-->Properties-->Set
Treat Source Rows as: Data Driven
Step 10: Don't forget to give relation connection for your Lookup also.
BINGO!!!
Feel free to post your doubt/your marks for my effort in comment section.
Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)
If you like the approach and solution feel free to share with others.
Thanks.
hpandey.harsh@gmail.com
Hi Champ,
ReplyDeleteI must say your methods are really nice and i love them.
I have a query in which i always get confused.can you please throw some more light on ISNULL and NOT isNULL methods via the above transformation as example?
Thanks in advance.
Hi, Sumit
DeleteThank you for your comments.
A small session on ISNULL() function.
1. ISNULL() function returns true if the value of column is NULL.
Unfortunately Informatica has not provided us facility to See/Use that return value, so, we are returning our value through IIF(). IIF( ISNULL(EMPNO), 1, 0). Means If EMPNO is NULL (Value is not there) then retrun 1 else 0.
2.We are using NOT operator to ISNULL() function for negative return, Yes means FLASE return. This is user-define override for default return of ISNULL() function. IIF(NOT ISNULL(EMPNO) 1, 0)
Means if If EMPNO is NOT NULL (Value is there) then retrun 1
else 0.
These two functions are same as SQL Function NULL and NOT NULL.
Hope by now it's clear... Let me know if you still have doubt.
Thanks
Thanks for your reply.
DeleteIf you can give me any link from where i can understand all these expression functions,that will be really nice.
ReplyDeleteBest link Informatica Help... Open you system read help and do it practically... All doubt would be clear... Always try to find your answer practically.. Treat Informatica as Science not as an Art Subject, :-)
ReplyDeleteHi harsh pandey.I am intersted to learn informatica.I need suggestions like job opening in market and feature of informatica.
ReplyDeleteHi, Naresh
DeleteLet's make it private.. Drop me a mail at hpandey.harsh@gmail.com for further discussion.
Thanks
Thanks for Information Informatica is one of the widely used ETL Tool for extracting the source data and loading it into the target after applying the required transformation. It provides a single enterprise data integration platform to help organization access, transform, and integrate data from a large variety of systems and deliver information to other transactional systems.Informatica Online Training
ReplyDeleteIt will be helpful
ReplyDeleteAfter reading above Articular , i must have appreciate your effort in right this useful articular ....which is better understandably for any guy...-):
ReplyDeleteJi. What is the explanation of new and update expression in expression transformation.. Yesterday I have gone to a company for interview. They asked me.. Why do you use it..
ReplyDelete