fbpx
Search
Close this search box.

Lookup most frequent item [Homework]

Excel Challenges - 25 comments

Here is an interesting problem to keep your brain cells fight boredom on this Friday & weekend.

Let’s say you have some data like this.

most-frequent-lookup

And you want to know, for a given customer name (in cell G4),

  • What is the most frequent quantity?
  • What is the most often purchased item?

How would you write formulas to get these answers?

Assume that all the data is in a table, conveniently named data. So you can access customer names by using structured name – data[Customer] etc.

[Related: Introduction to Excel Structured Referencing]

If there are ties, you may pick any one value.

Click here to download sample file with raw data.

Go ahead and post your answers in the comments.

Want more? Check out our Excel Homework problems page and get cracking.

Chandoo

Hello Awesome...

My name is Chandoo. Thanks for dropping by. My mission is to make you awesome in Excel & your work. I live in Wellington, New Zealand. When I am not F9ing my formulas, I cycle, cook or play lego with my kids. Know more about me.

I hope you enjoyed this article. Visit Excel for Beginner or Advanced Excel pages to learn more or join my online video class to master Excel.

Thank you and see you around.

Related articles:

Written by Chandoo
Tags: , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

25 Responses to “Lookup most frequent item [Homework]”

  1. mma173 says:

    For most frequent quantity:
    {=INDEX(data[Qty],SMALL(IF(COUNTIF(data[Qty],data[Qty])=MAX(COUNTIF(data[Qty],data[Qty])), ROW(data[Qty]),""),1))}

    For most often purchased item:
    {=INDEX(data[Item],SMALL(IF(COUNTIF(data[Item],data[Item])=MAX(COUNTIF(data[Item],data[Item])), ROW(data[Item]),""),1))}

  2. Hugo says:

    first sort the table (high to low)
    then with a simple index/match
    (first occurrence of that name is the most frequent item for that name)

  3. Desk Lamp says:

    {=MODE(IF(data[Customer]=$G4,data[Qty]))}

    {=INDEX(data[Item],MATCH(TRUE,COUNTIFS(data[Item],IF(data[Customer]=$G4,data[Item]),data[Customer],$G4)=MAX(COUNTIFS(data[Item],IF(data[Customer]=$G4,data[Item]),data[Customer],$G4)),0))}

    I think the second formula could be simplified but I'll leave it at that for now.

    • Desk Lamp says:

      Ok, slightly simpler:

      {=INDEX(data[Item],MATCH(TRUE,COUNTIFS(data[Item],data[Item],data[Customer],$G4)=MAX(COUNTIFS(data[Item],data[Item],data[Customer],$G4)),0))}

      • Daniel H says:

        {=MODE(IF(G4=data[Customer],data[Qty]))}

        Same as yours. It’s curious why Excel doesn’t treat “FALSE” as zero

        {=INDEX(data[Item],MODE(IF(G4=data[Customer],MATCH(data[Item],data[Item],0))))}

        It just works, don’t ask me why 😉

        • David N says:

          Daniel, that's a crazy cool and elegant solution!

          My initial thought was to use FREQUENCY on the concatenation of Customer and Item because the COUNTIFS solutions would be very slow if the array were much larger. Then a MATCH on the MAX (or AGGREGATE) of that FREQUENCY would have determined the corresponding position. And that approach works, as shown below. It's long, but it's also always going to be fast.

          You on the other hand recognized that the position of an Item relative to its Customer doesn't matter as long as every possible Item is represented by a unique position (number) and ignored unless it corresponds with the correct Customer. And your solution capitalizes on that perfectly. I also appreciate how your formula avoids repeating any of its pieces, something I particularly dislike and avoid whenever I can.

          {=INDEX(data[Item],MATCH(AGGREGATE(14,6,IF(data[Customer]=G4,FREQUENCY(MATCH(data[Customer]&"#"&data[Item],data[Customer]&"#"&data[Item],0),MATCH(data[Customer]&"#"&data[Item],data[Customer]&"#"&data[Item],0))),1),IF(data[Customer]=G4,FREQUENCY(MATCH(data[Customer]&"#"&data[Item],data[Customer]&"#"&data[Item],0),MATCH(data[Customer]&"#"&data[Item],data[Customer]&"#"&data[Item],0))),0))}

  4. Michael (Micky) Avidan says:

    In cell H4: {=MAX((data3[Customer]=G4)*data3[Qty])}
    In cell I4: {=INDEX(data3[Item],MATCH(G4&H4,data3[Customer]&data3[Qty],))}
    and copy down...
    ***Both Array Formulas
    --------------------------------
    Michael (Micky) Avidan

    • Michael (Micky) Avidan says:

      DATA3 should read DATA.
      It became DATA3 because I copied the sheet in oredr to work on a copy.
      ----------------------------
      Michael (Micky) Avidan

      • Michael (Micky) Avidan says:

        Sorry..., I misssed the MODE part.
        Ignore my posts.
        ----------------------------
        Michael (Micky) Avidan

  5. Ola says:

    Make a PivotTable (sort) --> enables drilldown

  6. Sabeesh says:

    Qty: =MODE(IF(data[Customer]=G4,data[Qty]))

    Item:=INDEX(data[Item],MATCH(MAX(COUNTIFS(data[Customer],G4,data[Item],data[Item])),COUNTIFS(data[Customer],G4,data[Item],data[Item]),0))

    Both are array formulas

    • Rojo Loco says:

      In copy and past of the solution I get results, but not the correct results. For example, I got by your mode index formula:
      Willie Larson 5 Bags

      when i should have gotten:

      Willie Larson Beef 10

  7. Sabeesh says:

    =INDEX(data[Item],MATCH(MAX(COUNTIFS(data[Customer],G4,data[Item ],data[Item])),COUNTIFS(data[Customer],G4,data[Item],data[Item]),0))

  8. ak says:

    Hello,

    Sorry for posting here but I couldn't find a place to post questions here. Is there an area where I can post questions.

    How can I delete every other row quickly in excel, because I ran a report and it skips a row for each entry.

    Thanks so much and apologize if this is not allowed.

  9. ZORRO2005 says:

    {=INDEX(data[Item],MODE(IF(data[Customer]=G4,MATCH(data[Item],data[Item],))))}

    • Andy says:

      Great stuff. I am getting most of this, but I do not understand how the array data[Item] in the Match lookup_value works. How does this make sense?

      MATCH(data[Item],data[Item],)

      • ZORRO2005 says:

        Andy, it will be clear if put the formula below to column E
        =IF(data[Customer]=$G$4,MATCH(data[Item],data[Item],))

      • Arpita says:

        Hi Andy,

        Hope this isn't too late.

        The MATCH function works by assigning a number (which is the item's first array position) to each unique item in Array 1.

        For example, let's say Array 1 is {Bags, Butter, Bags, Butter, Bread, Burgers, Bags}, which has 4 unique items

        Next, we'll take Array 2 {Bags, Butter, Bread, Bread, Burgers, Bread} to match against Array 1. Each item in Array 2 is matched against the whole of Array 1 and the MATCH function determines the first instance of a "match" and outputs the corresponding position from Array 1

        If we had to compare this to MATCH(data[Item],data[Item],0), the formula would be MATCH(Array 2, Array 1, 0)

        The answer to this formula would be {1;2;5;5;6;5). The mode of this array is 5, which is the unique code mapped to the item Bread.

        Hope this helps!

  10. Rojo Loco says:

    Create a pivot table off the data. Put customer and items on rows, put count of item on values. Sort highest to lowest on count of item. Go to design view of pivot table and change report lay out to tabular and repeat values, turn off grand totals and subtotals. Each customer by each item now has a value sorted highest to lowest. Index match the customer name to the item and count.

    This allows for changes to the pivot table to flow to the output, such as minimum value, average value (if your data allowed for it).

  11. Rojo Loco says:

    In looking for a more elegant and "break proof" option, here is what I have:

    from the raw data file you have 2 choices, top item per customer, or ranking item per customer, both have their advantages.

    Here is the top item per customer formula:
    =IF(COUNTIFS($B$4:$B$270,B4,$D$4:$D$270,">"&D4)+1>1,"",COUNTIFS($B$4:$B$270,B4,$D$4:$D$270,">"&D4)+1)
    if you paste this into cell E4 and copy down it should work

    Here is the rank item per customer formula:
    `
    if you paste this into cell F4 and copy down after the E4 formula above it should work

    Then you can filter by column E for 1 and get the top items (note some customers have tied for 1st place).

    Also, by column F you can see additional customer info quickly.

    (I call the new formula columns Top Item Per Customer for Column E and Customer Item Rank for Column F).

    Now you can sort by your preferred column (customer, item, quantity) and the table is always correct. (Sort by preferred item 1st and then next item, such as, customer them top item, or item, then top customer). Item A to Z, then Customer Item Rank Smallest to Largest gives you per item largest customer; Customer Item smallest to largest, then customer a to z gives you the largest items per customer etc.

  12. Rojo Loco says:

    In looking for a more elegant and "break proof" option, here is what I have:

    from the raw data file you have 2 choices, top item per customer, or ranking item per customer, both have their advantages.

    Here is the top item per customer formula:
    =IF(COUNTIFS($B$4:$B$270,B4,$D$4:$D$270,">"&D4)+1>1,"",COUNTIFS($B$4:$B$270,B4,$D$4:$D$270,">"&D4)+1)
    if you paste this into cell E4 and copy down it should work

    Here is the rank item per customer formula:
    =COUNTIFS($B$4:$B$270,B4,$D$4:$D$270,">"&D4)+1

    if you paste this into cell F4 and copy down after the E4 formula above it should work

    Then you can filter by column E for 1 and get the top items (note some customers have tied for 1st place).

    Also, by column F you can see additional customer info quickly.

    (I call the new formula columns Top Item Per Customer for Column E and Customer Item Rank for Column F).

    Now you can sort by your preferred column (customer, item, quantity) and the table is always correct. (Sort by preferred item 1st and then next item, such as, customer them top item, or item, then top customer). Item A to Z, then Customer Item Rank Smallest to Largest gives you per item largest customer; Customer Item smallest to largest, then customer a to z gives you the largest items per customer etc.

  13. ZORRO2005 says:

    For cases when some customers will not have repeats
    {=MODE(IF(data[Customer]=G4,data[Qty]*{1,1}))}
    {=INDEX(data[Item],MODE(IF(G4=data[Customer],MATCH(data[Item],data[Item],)*{1,1})))}

  14. Noori says:

    Hi Chandoo,

    Is there a simple formula for these two questions. I am getting confused with the comments.

Leave a Reply


« »