Sunday 23 March 2014

Exercise 4: How to load top 2 salaries for each department without using Rank Transformation and SQL Queries in Source Qualifier?

Exercise 4: How to load top 2 salaries for each department without using Rank Transformation and SQL queries in Source Qualifier.

Analysis/Theory: After analyzing requirement we can conclude that after sorting the data based upon salary in descending order, if we can pass only two records for each group of department then we are done. That's it!!! Is in't it too easy :-)

Practically

Step 1:
Drag and drop one source and three instance of target table.

Step 2: 
Sorter
Sort the data based upon SAL.

Select SAL as key port and Descending order

Step 3
Router
Create three groups.

Dept10_Data =  Dept=10
Dept20_Data =  Dept=20
Dept30_Data =  Dept=30

Step 4
Sequence Generator
Create Three sequence Generator Transformation
(Change END VALUE =2, START VALUE=1 and Check Cycle) 
By this way your mapping will always give you your expected output.

This step we need to assign an ID for each and every group of records. (If you don't want to use Sequence generator use Expression and build a logic to assign an unique id for each and every record but in that case also we need three different Expression transformation :-)

Step 5
Filter
Create three different Filter Transformation

Connect from each group of Router Transformation to different Filter Transformation like below
Dept10_Data To Filter 1
Dept20_Data To Filter 2
Dept30_Data To Filter 3

Develop Filter condition = NEXTVAL <=2 in all Filter Transformation.

Step 6
Connect all three instance of target from different Filter Transformation.

 Filter 1 To Target Instance 1
 Filter 2 To Target Instance 2
 Filter 3 To Target Instance 3

Yahoo!!! it was pretty easy. Keep Practicing and Learning.




Feel free to post your doubt in comment section.

Due to limitation, I can't attach exported .xml file which you can Import in your repository and see or analyse. (Feel free to mail me if you need .xml file with Scenario number and Question.)

If you like the approach and solution feel free to share with others.

Thanks.

hpandey.harsh@gmail.com 

9 comments:

  1. Hi Harsh,

    Very nice scenario, but here we need to use Seq generator with default end value and reset option checked, due to two reasons:
    1.If there are 5 records for a departmentin one session run , it will assign nextval as 1,2,1,2,1, and we will be getting all the values in output.
    2.If we run the session next time again, we want to reset the Current value to 1.

    Please let me know if i am correct.

    Thanks for the knowledge you are sharing:)

    Regards
    Sheetal

    ReplyDelete
    Replies
    1. Hi, Sheetal

      Yes, you are correct but I want to correct few sentence of your comment. Instead of default end value we are using user define end value which is 2 and instead of Reset we are selecting Cycle. :-)

      Delete
    2. Hi Harsh,
      I just did what you have explained in the exercise. I am getting for dept10 three records and dept20 5 records and 6 records respectively, I have 14 records in my source. I have even set current value as 1 and checked the reset option. Please tell me what I am doing wrong. Thank you

      Delete
    3. Instead of Current Value, and Reset... you have to set END VALUE =2, START VALUE=1 and CYCLE option should be checked.

      Read Step 4

      Let me know if you are still not getting expected output.
      Thanks.

      Delete
    4. Great scenario and excellent explanation...however I think we have to keep the end value in the sequence generator as default and check the reset option.

      Delete
  2. Shall we need to load top 2 salaries from each dept to same target or different targets?

    ReplyDelete
  3. SQ-->Sorter-->Exp-->Filter-->Target
    In Sorter first sort by dept in Ascending order then salary by descending order and connect output pipeline to Expression.
    In expression create a following logic to generate sequence number for each department.
    var_seq=iif(DEPTNO=prev_deptno,iif(SALARY!=prev_salary,var_seq+1,var_seq),1)(Variable Port)
    prev_deptno=deptno(Variable Port)
    prev_salary=Salary(Variable Port)
    out_seq=var_seq(Output Port)
    Now connect output pipeline of expression to filter to send records to same target
    or router in case of sending records to multiple targets based on Dept.

    ReplyDelete
  4. Hi Harsh,

    Here we are assuming the dept no's as 10,20,30.what would be the solution if we don't know them in advance.How to filter the dept no's???

    Thanks in Advance

    ReplyDelete
    Replies
    1. You can use transaction control transformation in that case.

      Delete