Scenario 3: How to load multi-format (Delimited) column Flat File source data into two different targets?
Ex:
cust_id,cname,phone,state_id
101,AMITH,9897045679,201
102,JAGAN,9908907869,203
103,MAHESH,890899089,201
state_id||Sname||Region
201||Karntaka||South
202||Delhi||North
203||Maharashtra||West
Theory/Analysis:
Read complete file as fixed width file. Remove first line through Start Importing row at option while Importing Source File metadata to informatica. First use REPLACESTR() function to replace '||' to ',' then use SUBSTR() till first Comma position. Check whether it's is number or not through ISNUMBER().
Assign and id for each and every record; Use Router to Route data according to ID less than/greater than Mapping Variable. Use SUBSTR, INSTR to split each record for columns.
Practical:
Step 1: Create a Mapping Variable $$SPLIT (Integer) Count
Step 2: Read File as Fixed Width (Though it's comma delimited)
Step 3:
Expression
Create Total Eight Port
RECORDS(Variable Port) = REPLACESTR(0,RECORDS_IN,'||', ',')
COLUMN1 (Variable Port) = SUBSTR(RECORDS,1,INSTR(RECORDS, ',',1, 1)-1)
Rec_ID (Variable Port) = OLD_ID + 1
OLD_ID (Variable Port) = Rec_ID
SPLIT_REC (Variable Port) = IS_NUMBER(COLUMN1)
Var_Assign (Variable Port) = IIF(SPLIT_REC= 0, SETVARIABLE($$SPLIT, Rec_ID))
Out_Rec_ID (Output Port) = Rec_ID
Records_OUT(Output Port) = RECORDS
Note: RECORDS_IN is Input Port and coming from Source Qualifier.
Step 4:
Router:
Create two Group
Cust Rec_ID < $$SPLIT
State Rec_ID > $$SPLIT
Step 5:
Expression:
Create Four Port
CUST_ID (Output Port) = SUBSTR(RECORDS1,1,INSTR(RECORDS1, ',',1, 1)-1)
CNAME (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 1)+1), ((INSTR(RECORDS1, ',',1, 2)- INSTR(RECORDS1, ',',1, 1)-1)))
PHONE (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 2)+1), ((INSTR(RECORDS1, ',',1, 3)- INSTR(RECORDS1, ',',1, 2)-1)))
STATE_ID (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 3)+1))
Step 6:
Expression:
Create Three Port
STATE_ID (Output Port) = SUBSTR(RECORDS3,1,INSTR(RECORDS3, ',',1, 1)-1)
STATE (Output Port) = SUBSTR(RECORDS3, (INSTR(RECORDS3, ',',1, 1)+1), ((INSTR(RECORDS3, ',',1, 2)- INSTR(RECORDS3, ',',1, 1)-1)))
REGION (Output Port) = SUBSTR(RECORDS3, (INSTR(RECORDS3, ',',1, 2)+1))
BINGO!!!
Feel free to post your doubt in comment section.
Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)
If you like the approach and solution feel free to share with others.
Thanks.
hpandey.harsh@gmail.com
Ex:
cust_id,cname,phone,state_id
101,AMITH,9897045679,201
102,JAGAN,9908907869,203
103,MAHESH,890899089,201
state_id||Sname||Region
201||Karntaka||South
202||Delhi||North
203||Maharashtra||West
Theory/Analysis:
Read complete file as fixed width file. Remove first line through Start Importing row at option while Importing Source File metadata to informatica. First use REPLACESTR() function to replace '||' to ',' then use SUBSTR() till first Comma position. Check whether it's is number or not through ISNUMBER().
Assign and id for each and every record; Use Router to Route data according to ID less than/greater than Mapping Variable. Use SUBSTR, INSTR to split each record for columns.
Practical:
Step 1: Create a Mapping Variable $$SPLIT (Integer) Count
Step 2: Read File as Fixed Width (Though it's comma delimited)
Step 3:
Expression
Create Total Eight Port
RECORDS(Variable Port) = REPLACESTR(0,RECORDS_IN,'||', ',')
COLUMN1 (Variable Port) = SUBSTR(RECORDS,1,INSTR(RECORDS, ',',1, 1)-1)
Rec_ID (Variable Port) = OLD_ID + 1
OLD_ID (Variable Port) = Rec_ID
SPLIT_REC (Variable Port) = IS_NUMBER(COLUMN1)
Var_Assign (Variable Port) = IIF(SPLIT_REC= 0, SETVARIABLE($$SPLIT, Rec_ID))
Out_Rec_ID (Output Port) = Rec_ID
Records_OUT(Output Port) = RECORDS
Note: RECORDS_IN is Input Port and coming from Source Qualifier.
Step 4:
Router:
Create two Group
Cust Rec_ID < $$SPLIT
State Rec_ID > $$SPLIT
Step 5:
Expression:
Create Four Port
CUST_ID (Output Port) = SUBSTR(RECORDS1,1,INSTR(RECORDS1, ',',1, 1)-1)
CNAME (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 1)+1), ((INSTR(RECORDS1, ',',1, 2)- INSTR(RECORDS1, ',',1, 1)-1)))
PHONE (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 2)+1), ((INSTR(RECORDS1, ',',1, 3)- INSTR(RECORDS1, ',',1, 2)-1)))
STATE_ID (Output Port) = SUBSTR(RECORDS1, (INSTR(RECORDS1, ',',1, 3)+1))
Step 6:
Expression:
Create Three Port
STATE_ID (Output Port) = SUBSTR(RECORDS3,1,INSTR(RECORDS3, ',',1, 1)-1)
STATE (Output Port) = SUBSTR(RECORDS3, (INSTR(RECORDS3, ',',1, 1)+1), ((INSTR(RECORDS3, ',',1, 2)- INSTR(RECORDS3, ',',1, 1)-1)))
REGION (Output Port) = SUBSTR(RECORDS3, (INSTR(RECORDS3, ',',1, 2)+1))
BINGO!!!
Feel free to post your doubt in comment section.
Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)
If you like the approach and solution feel free to share with others.
Thanks.
hpandey.harsh@gmail.com
I have a doubt that why u have used router and mapping variable .
ReplyDeleteCan we derive all the expression in a single expression transformation rather then using router and two different exp transformation.
The need of mapping variable and router is to route the data to two different target. If you see the source file we are getting customer and state information in single flat file with different delimiter and with the first expression we finding the row from where we have to split the record for second target and assigning it to Mapping variable and through router we are splitting record flow in two group one for customer and second one for state.
DeleteHope it would be clear now. If not feel free to ask you doubt.
Thanks
hpandey.harsh@gmail.com
In 1st expression transformation u have mentioned Var_Assign= IIF(SPLIT_REC= 0, SETVARIABLE($$SPLIT, Rec_ID)) , what is split_rec
ReplyDeleteis it a variable port or what please explain me the above expression and also the router condition
Thanks In Advance ,
I am sorry Rajat, My typing mistake creates confusion. I have corrected it now.
DeleteYes, SPLIT_REC is variable port and in this we evaluating whether first column value is number or not. If first column value is not number then it will store 0 and if SPLIT_REC is 0 then assign that record_id to variable and in router I am routing record like this if rec_id < Variable value then it's customer data. If rec_id > Variable value then State data.
I would suggest instead of only reading try to implement it practically and use debugger to analyze record. It would be more clear by that way.
Thanks.
HI,
ReplyDeletewhat about false part of below IIF
Var_Assign (Variable Port) = IIF(SPLIT_REC= 0, SETVARIABLE($$SPLIT, Rec_ID))
This is brilliant...took me a while to understand that we are just looking for the heading of the 2nd set of recs to set the value of the mapping variable $$SPLIT...got it eventually...Cheers!!!!
ReplyDeleteHow to achieve if we have n number of dynamic files with different file structure and need to load to a single staging table which should be created during the runtime of the workflow.
ReplyDeleteThank you very much ! This is very useful for beginners like me in self learning
ReplyDeleteHi,
ReplyDeleteWhen we have two delimiters , we can use a pre-session unix command like sed 's/||/,/g' input.txt > tmp.txt
which standardize the delimiter .
then we can simply make it a comma delimited file and split columns to multiple targets.
Hi Harsh,
ReplyDeleteFirst of all I am total beginner in Informatica, and I have the following message after saving a mapping: Transformation: EXPTRANS Field: Rec_ID
<> [+]: operand cannot be converted to a number
... >>>>OLD_ID<<<< + 1
Could You help me with this?
Many Thanks
Hi Karol,
DeleteThere are two possibilities for this error.
1. Check the Data Type of the port. It should be INTEGER.
2. Check wheather OLD_ID variable port is available or not. OLD_ID port should must be available which is 4th port in Expression and it is also variable port.
Hope it will help.
Thanks.
Hi Harsh,
DeleteGreat, first of Your solution was correct. I've
also change to INTEGER ports in Router because it was similiar problem. But it is still the same with port Var_Assign though I've change datatype to integer:
<> [=]: function cannot resolve operands of ambiguously mismatching types.
... IIF(SPLIT_REC=>>>>0<<<<, SETVARIABLE($$SPLIT, Rec_ID))
Thanks
Source: Multi record format
ReplyDeleteTarget: 3 tables Base, J1, J2
1 row contains multiple fixed width records and dynamic in nature, need to load data into target tables.
Eg:- (Base Record 100 length [c1 string 50, c2 string 50] +J1 Record 200 length [c1 string 50, c2 string 50, c3 string50, c4 string 50]+J2 Record length 150 [c1 string 50, c2 string 100])
Base Record + J1 + J2+ J1+ J2+J1 -> 1 Record goes to base table, 3 records goes to J1 table , 2 records goes to J2 table
Base Record + J2 + J2+ J1+ J2+J1 -> 1 Record goes to base table, 2 records goes to J1 table , 3 records goes to J2 table
please help me on this.
ReplyDelete