• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Compare two tables with 2 different criteria

vrunda

Member
I have 2 tables with amount, gst no, invoice no. column, in both tables. Now I have to match gst no. column with exact criteria, & invoice no column on fuzzy match option. How I can achieve the results with power query.
Thank You!!!
 
Provide us with a sample file showing 8-10 records of data from each table. Then provide a mock up of your desired solution. No pictures as we can not manipulate data in a picture.
 
Please find attached the file
 

Attachments

  • New Microsoft Excel Worksheet (2).xlsx
    11.2 KB · Views: 4
In your sample Sr. no 1 on either side isn't exact match. In PQ, text comparison is case sensitive.

At any rate, see below for issue with your data set.

Note that you can't mix FuzzyNestedJoin with regular join. Join type is defined at table level and not at column level.

If it's inner join, you can simply nest join within join. But since this requires outer join, that isn't feasible.

Also fuzzy lookup uses Jaccard similarity using tokenization, token weight, transformation, edit distance etc to give similarity score.

This creates issue in your data set, as Sr. no 3 on left table to Sr. no 4 on right table has higher similarity score than Sr. no 11 on left side and Sr. no 7 on right side (or Sr. no 2 on either side). Thus creating false positive and duplicating Sr. no 4 on left table.

You can adjust similarity score, but either way, you will have false positive or negative one way or the other.

There is no solution that will provide you with "exact" solution that you are looking for. Closest you will get will be something like below. Similarity score is set extremely high to avoid false positive.

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Sr. no", Int64.Type}, {"Name", type text}, {"GStn", type text}, {"Invoice no", type text}, {"Amount", Int64.Type}}),
    #"Merged Queries" = Table.FuzzyNestedJoin(#"Changed Type", {"GStn", "Invoice no"}, Table2, {"GStn", "Invoice no"}, "Table2", JoinKind.FullOuter, [IgnoreCase=true, IgnoreSpace=false, NumberOfMatches=1, Threshold=0.98]),
    #"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Sr. no", "GStn", "Invoice no", "Amount"}, {"Table2.Sr. no", "Table2.GStn", "Table2.Invoice no", "Table2.Amount"})
in
    #"Expanded Table2"

There are few other methods you can apply, but for ease of maintenance. I'd recommend going with above or higher threshold and then doing manual adjustment on remaining.
 
This gave some wrong result , attaching the file with yellow highlighted rows . Please check.
Also you said "If it's inner join, you can simply nest join within join. But since this requires outer join, that isn't feasible." can we combine inner join then outer join and produce the result
 

Attachments

  • New Microsoft Excel Worksheet (2).xlsx
    27.8 KB · Views: 1
Also is there any other way than power query where we can match 2 tables ===one column with exact match criteria
& one column with fuzzy match
 
Back
Top