• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Defining Range to use based on a matching value

ArcanaET

New Member
Hi Guys,

I'm needing to match a product code to a price closest matching the value threshold (we chose 20th percentile within the range). I managed to make a formula that seems to correctly identify closest match, however I'm having issues with automatically defining the range. I was able to define the range for the percentile formula in column D, maybe by stroke of luck (basically I need it to only select a range where the Generic ID is the same. Like, select from range within B2:B4147 where it is equal to B value in the same row as formula).
I tried making something like the below to define the range, however it's just not panning out.
=INDEX(A2:A46100,MATCH(MIN(ABS(IF($B$2:$B$41471=B2,$C$2:$C$46098-D2))),ABS($C$2:$C$46098-D2)),0))

I could manually define the range like I did in the example E column, however I have over 40000 rows to work with and isn't realistic to do manually. I need it to return a matching product that closely matches the price in the 22th percentile, but it needs to be in the same Generic class (column E). If there's a simpler way to determine this, I'm all ears.
Should return a TNDC, where the ProperContractPrice closely matches the value in the 22th Percentile Price, from a range where the Generic ID is the same.
1710865706314.png
 

Attachments

  • Data Sample.xlsx
    12.6 KB · Views: 6
Hello

We can use the PERCENTILE function to find the 22nd percentile price for each Generic ID class. You can use the following formula in an empty column (let's say column F):

Code:
=PERCENTILE(IF($B$2:$B$41471=B2,$C$2:$C$41471),0.22)

Now, in another column (let's say column G), use the following formula to find the closest matching product code to the 22nd percentile price within the same Generic ID class:

Code:
=INDEX($A$2:$A$41471,MATCH(MIN(ABS(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2))), ABS(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2)), 0))

Try and let me know..
 
Hello Monty,
Unfortunately the second formula doesn't appear to be matching to the correct value, matching to a value outside of what should be the correct range.

1710871502168.png
 

Attachments

  • Data Sample.xlsx
    14.3 KB · Views: 1
Hello

Let's try refining the formula. It seems the issue might be related to how the conditions are applied. Let's adjust the formula to ensure it's correctly filtering based on the Generic ID and the price being above the 22nd percentile. Here's the revised formula for column G:


Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2, IFERROR(1/0)))), IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2, IFERROR(1/0)), 0))
 
The above formula should find the closest matching product code within each Generic ID class to the 22nd percentile price. Adjust the range if your data extends beyond row 41471. Let me know if this resolves the issue.
 
Hello

Let's try refining the formula. It seems the issue might be related to how the conditions are applied. Let's adjust the formula to ensure it's correctly filtering based on the Generic ID and the price being above the 22nd percentile. Here's the revised formula for column G:


Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2, IFERROR(1/0)))), IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2, IFERROR(1/0)), 0))
Hi Monty,

It looks like there's an error in this formula "You've entered too few arguments for this function" and references to the 0 in the IFERROR(1/0)
 
Hello

Inhave changed by removing the extra , 0 at the end of the formula. This was causing the "too few arguments" error. Now the formula should work


Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2, IFERROR(1/0)))), IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=F2), $C$2:$C$41471-F2), 0)
 
This one is still having the too few arguments for this function issue. Replacing IFERROR(1/0) with IFERROR(1,0) seems to fix it, and it seems to be working, except in some instances, it refers back to cell A2. I'm assuming it is doing this whenever the formula results in an error. Not sure if you have any thoughts.

=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,IFERROR(1,0))),IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,IFERROR(1,0)),0))
 
This one is still having the too few arguments for this function issue. Replacing IFERROR(1/0) with IFERROR(1,0) seems to fix it, and it seems to be working, except in some instances, it refers back to cell A2. I'm assuming it is doing this whenever the formula results in an error. Not sure if you have any thoughts.

=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,IFERROR(1,0))),IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,IFERROR(1,0)),0))
Did you try my previous formula provide with the change
 
Okay...

I Think there was a misunderstanding in how the IFERROR function was used. Let's adjust the formula to avoid the reference back to cell A2 when encountering an error. We can achieve this by using a large number as the default value instead of IFERROR. Here's the revised formula:

Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,10^10)),IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,10^10),0))
In this formula, if there's an error, it will use a large number (10^10) instead of trying to divide by zero. This should prevent the formula from referencing back to cell A2 in case of an error. Let me know if this resolves the issue!
 
Okay...

I Think there was a misunderstanding in how the IFERROR function was used. Let's adjust the formula to avoid the reference back to cell A2 when encountering an error. We can achieve this by using a large number as the default value instead of IFERROR. Here's the revised formula:

Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,10^10)),IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2),$C$2:$C$41471-D2,10^10),0))
In this formula, if there's an error, it will use a large number (10^10) instead of trying to divide by zero. This should prevent the formula from referencing back to cell A2 in case of an error. Let me know if this resolves the issue!
It looks like it's working correctly now! I appreciate all your help! I'll let you know if I run into any issues, but I did a spot check and it looks like it's referencing correctly to cells in the correct range.
 
Modified in case!

Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2), $C$2:$C$41471-D2, 10^10)), IF(($B$2:$B$41471=B2)*($C$2:$C$41471>=D2), $C$2:$C$41471-D2, 10^10), 0))
 
Last edited:
It looks like it's working correctly now!
Really?
I'm not so sure. Take a look at genericID 101; the smallest difference from the 22% percentile is 00781305995, not the 00641141035 that @Monty 's formula returns.

1710939483263.png

Is this AI screwing up again @Monty ? Or have I misunderstood?

Later, I'll try to put something together that works.

ps. @ArcanaET , on a separate thing, GenericID 130 has 2 tndcs closest to the 22 percentile, 70954020110 & 59651043701; how do you want to handle that?
 
Last edited:
Really?
I'm not so sure. Take a look at genericID 101; the smallest difference from the 22% percentile is 00781305995, not the 00641141035 that @Monty 's formula returns.

View attachment 86780

Is this AI screwing up again @Monty ? Or have I misunderstood?

Later, I'll try to put something together that works.
I noticed that as well, this is due to the ABS function being removed, which means negative values are essentially ignored (so items where the net difference is shorter, but were less than the 22nd percentile were not being considered).

However this enabled me to get to where I needed to go. I decided to take that step out of the nested and into a separate column, and then simplified the formula he gave me, which leads to this one. I created a condition where all of the conditions return true where they need to, (no negatives being arbitrarily defaulted to 10^10), but still fall under the same Generic ID.

=INDEX($A$2:$A$1315,MATCH(MIN(IF(($B$2:$B$1315=B2),$E$2:$E$1315,10^10)),$E$2:$E$1315,0))
Monty's formula would have returned the line in Yellow, however the new one correctly returns the one in Green.
I appreciate everyone's help! You're more than welcome to take a crack at it though, getting the formula to work all in one go without needing to create additional columns would be a great learning experience.
1710940703118.png
 

Attachments

  • Data Sample.xlsx
    20.1 KB · Views: 1
Really?
I'm not so sure. Take a look at genericID 101; the smallest difference from the 22% percentile is 00781305995, not the 00641141035 that @Monty 's formula returns.

View attachment 86780

Is this AI screwing up again @Monty ? Or have I misunderstood?

Later, I'll try to put something together that works.

ps. @ArcanaET , on a separate thing, GenericID 130 has 2 tndcs closest to the 22 percentile, 70954020110 & 59651043701; how do you want to handle that?
Absolutely misunderstood, Am hear to help and learn...No shortcuts..
 
A human-produced formula on the lines of @Monty 's which on cursory checking seems OK save for when there is a GenericID of 0.
Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF($B$2:$B$41471=B2,ABS($C$2:$C$41471-D2))),IF(($B$2:$B$41471=B2),ABS($C$2:$C$41471-D2)),0))
You have which version of Excel?
MS 365, for both columns:
Code:
=LET(f,$A$2:$C$41471,g,B2,a,FILTER(f,CHOOSECOLS(f,2)=g),c,TAKE(a,,-1),b,PERCENTILE(c,0.22),d,ABS(c-b),e,INDEX(TAKE(a,,1),MATCH(MIN(d),d,0)),HSTACK(b,e))
but are you saying you only need to see tndc?
Then similar:
Code:
=LET(f,$A$2:$C$41471,g,B2,a,FILTER(f,CHOOSECOLS(f,2)=g),c,TAKE(a,,-1),b,PERCENTILE(c,0.22),d,ABS(c-b),e,INDEX(TAKE(a,,1),MATCH(MIN(d),d,0)),e)

But do you need to see those repeated rows? Would this be any use:?

1710943189617.png
 
A human-produced formula on the lines of @Monty 's which on cursory checking seems OK save for when there is a GenericID of 0.
Code:
=INDEX($A$2:$A$41471,MATCH(MIN(IF($B$2:$B$41471=B2,ABS($C$2:$C$41471-D2))),IF(($B$2:$B$41471=B2),ABS($C$2:$C$41471-D2)),0))
You have which version of Excel?
MS 365, for both columns:
Code:
=LET(f,$A$2:$C$41471,g,B2,a,FILTER(f,CHOOSECOLS(f,2)=g),c,TAKE(a,,-1),b,PERCENTILE(c,0.22),d,ABS(c-b),e,INDEX(TAKE(a,,1),MATCH(MIN(d),d,0)),HSTACK(b,e))
but are you saying you only need to see tndc?
Then similar:
Code:
=LET(f,$A$2:$C$41471,g,B2,a,FILTER(f,CHOOSECOLS(f,2)=g),c,TAKE(a,,-1),b,PERCENTILE(c,0.22),d,ABS(c-b),e,INDEX(TAKE(a,,1),MATCH(MIN(d),d,0)),e)

But do you need to see those repeated rows? Would this be any use:?

View attachment 86784
My current Org hasn't migrated to 365, still on 2019. I just needed to choose an NDC within the same generic class that isn't necessarily the cheapest option (Won't always be available), but isn't unsustainable to stock (too expensive). We chose 22nd percentile as an arbitrary value. Repeated rows aren't important as long as the NDCs that are visible are the ones we want to focus on.
 
In the file you attached to msg#15, the formula in column F:
=INDEX($A$2:$A$126,MATCH(MIN(IF(($B$2:$B$126=B2),$E$2:$E$126,10^10)),$E$2:$E$126,0))
is dangerous in the event that if the same minimum ABS value in column E occurs in a different GenericID above, you'll get the wrong tndc returned. I've artificially done that in cell E15 to demonstrate. This also changes column G (although it's correct with the new minimum). Restore the data to as it should be by copying cell E14 down 1 cell.
Instead, that formula can be tweaked to:
=INDEX($A$2:$A$126,MATCH(MIN(IF($B$2:$B$126=B2,$E$2:$E$126)),IF($B$2:$B$126=B2,$E$2:$E$126),0))
The 10^10 is not necessary. See column G.

The table at cell I1 of the attached only looks at columns A:C, so all the other columns aren't needed and allows this:
just needed to choose an NDC within the same generic class
This is a query which is compatible with Excel 2019, and, when you change the data in colums A:C, needs refreshing (as you would a pivot table) by right-clicking the table and choosing Refresh.
I note that you've changed to the 25th percentile; the table reflects that. You could put the percentile you want to use in a single cell somewhere and have the query (or formulae) look at that for added flexibility.
 

Attachments

  • Chandoo56454Data Sample v2.xlsx
    32.7 KB · Views: 2
In the file you attached to msg#15, the formula in column F:
=INDEX($A$2:$A$126,MATCH(MIN(IF(($B$2:$B$126=B2),$E$2:$E$126,10^10)),$E$2:$E$126,0))
is dangerous in the event that if the same minimum ABS value in column E occurs in a different GenericID above, you'll get the wrong tndc returned. I've artificially done that in cell E15 to demonstrate. This also changes column G (although it's correct with the new minimum). Restore the data to as it should be by copying cell E14 down 1 cell.
Instead, that formula can be tweaked to:
=INDEX($A$2:$A$126,MATCH(MIN(IF($B$2:$B$126=B2,$E$2:$E$126)),IF($B$2:$B$126=B2,$E$2:$E$126),0))
The 10^10 is not necessary. See column G.

The table at cell I1 of the attached only looks at columns A:C, so all the other columns aren't needed and allows this:

This is a query which is compatible with Excel 2019, and, when you change the data in colums A:C, needs refreshing (as you would a pivot table) by right-clicking the table and choosing Refresh.
I note that you've changed to the 25th percentile; the table reflects that. You could put the percentile you want to use in a single cell somewhere and have the query (or formulae) look at that for added flexibility.
Ah, yea that does fix the same min ABS value. Good catch! Thank you!
 
@p45cal

I should admitt it...Your meticulous analysis of problem statements has inspired me to be more deliberate in my thinking and avoid jumping to conclusions too quickly. Your dedication and expertise are truly admirable.Hat's off to you!
 
Back
Top