Generating sequence numbers from cluster values [VLOOKUP to the rescue]
Last night I got an email from Joshua, one of our readers, with the subject "Hard Excel problem". Hard?!? At this stage of summer, the hard problems seem to be (in no particular order):
 Lack of good quality mangoes to eat
 Intense heat and humidity
 Lack of good quality mangoes to eat
Yes, I like mangoes.
Anyhow, back to Joshua’s email. I got curious and read it. He is facing an interesting problem:
I have a very difficult inquiry I am hoping you might be able to solve…
Is there a formula (i.e., without using VBA) that will look at another column’s values and provide a new sequential number (i.e., reordered) when the value changes, while keeping the same sequential number for the duplicates?
Below is a table with two columns. […] I now need to rank order those cluster groups. Since cluster 12 appears first it would get a value of ‘1’ and all of the cluster 12’s should now be a ‘1’. Since cluster 4 appears next it would get a rank of 2, etc…
Well, it is an interesting problem for sure. But hard problem, it isn’t. For really hard problems, refer to my list above.
So how to generate the sequence numbers?
Logic: If a value is already listed above, we fetch its existing sequence number. Otherwise, we generate a new sequence number.
Implementation: Simple, we use VLOOKUP.
Assuming the cluster values are in column B, from B4 onwards (with row 3 as the header row), write this in C4 and fill it down:
=IFERROR(VLOOKUP(B4,$B$3:C3,2,FALSE), SUM(MAX($C$3:C3),1))
Let’s examine the formula.
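VLOOKUP(B4,$B$3:C3,2,FALSE) portion: This looks for the value of B4 among the rows above it in column B and, if it finds a match, returns the corresponding sequence number from column C.
SUM(MAX($C$3:C3),1) portion: Gives us the next sequence number, that is, the largest number assigned so far plus one.
IFERROR(VLOOKUP(…), SUM(…)) portion: This does the magic of choosing either the existing sequence number or, when VLOOKUP returns an error because the value has not appeared yet, generating a new one.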
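For readers who think more easily in code, the three parts of the formula can be mirrored in a few lines of Python. This is only a sketch of the same logic, not part of the workbook; the function name and the sample list are made up for illustration.

```python
def sequence_numbers(clusters):
    """Number clusters in order of first appearance; repeats reuse their number."""
    results = []  # plays the role of column C, filled top to bottom
    for row, value in enumerate(clusters):
        # VLOOKUP part: search the rows above for the same cluster value.
        earlier = [results[i] for i in range(row) if clusters[i] == value]
        if earlier:
            # IFERROR found no error: reuse the existing sequence number.
            results.append(earlier[0])
        else:
            # SUM(MAX(...),1) part: highest number so far plus one (0 if none yet).
            results.append(max(results, default=0) + 1)
    return results


# Hypothetical sample data, not taken from the workbook.
print(sequence_numbers([12, 12, 4, 20, 4, 12, 7]))  # [1, 1, 2, 3, 2, 1, 4]
```

Like the formula, each row depends only on the rows above it, which is why the formula has to be filled down from the first data row.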
For more, read about VLOOKUP and IFERROR formulas.
Sequence number generation – Example spreadsheet
Play with the sequence number generation spreadsheet embedded below or Click here to grab a copy of the file.
How would you generate the sequence numbers?
It's your turn to take a crack at the hard problem. How would you solve it? Go ahead and share your answers in the comments.
More hard problems – solved:
Hard problems are not new at Chandoo.org. We regularly lob VLOOKUPs and SUMPRODUCTs at them to crack them. Here are a few examples:
 Generating unique invoice numbers in Excel
 Extracting unique values list and sort them by occurrence
 Finding unique, duplicate and missing values in a list
 More hard problems in Excel
Hello Awesome...
My name is Chandoo. Thanks for dropping by. My mission is to make you awesome in Excel & your work. I live in Wellington, New Zealand. When I am not F9ing my formulas, I cycle, cook or play lego with my kids. Know more about me.
I hope you enjoyed this article. Visit Excel for Beginner or Advanced Excel pages to learn more or join my online video class to master Excel.
Thank you and see you around.
13 Responses to “Generating sequence numbers from cluster values [VLOOKUP to the rescue]”
What about =IFERROR(VLOOKUP(B4,$B$3:C3,2,FALSE),MAX($C$3:C3)+1)? It can omit one SUM function.
I know it is simple, but not EASY. Thanks for inspiring me.
This is going to be very useful, thanks Chandoo.
Fellow mango lover 🙂
Hi Chandoo,
I found your site recently and have learned a lot of Excel from it. Thanks a ton for all your contributions.
The solution that struck me for the above is:
In column E we can list the values with all the duplicates filtered out (these are dynamic), and in column F the serial numbers 1 to 10 (these are static).
In column B we use a VLOOKUP formula...
This may not be the most accurate solution, but I have given one more option.
Please correct me if I am wrong.
Awesome as usual 🙂
I would have used the following formula in cell C4 and copied it down to the remaining cells (although your formula is much more concise!) :
=IF(COUNTIF(B$1:B3,B4),INDEX(C$1:C3,MATCH(B4,B$1:B3,0)),MAX(C$1:C3)+1)
Unlike the solution from Chandoo's file and the solutions from previous comments, this one does not reference the results from previous rows. It is shown referencing B4 and could be copied down to cover the other rows, but getting the correct answer for row 10 (as an example) does not require that the formula be entered for rows 4-9 as well.
=SUMPRODUCT((MATCH($B$4:$B$20,$B$4:$B$20,0)<MATCH(B4,$B$4:$B$20,0))/COUNTIFS($B$4:$B$20,$B$4:$B$20))+1
Dear David,
Great formula, but it would be easier to learn if you could please explain how it works.
Since I mentioned row 10 in my original post, let's use that for this example. The first MATCH assigns a unique number to each value corresponding to the position of its first occurrence in the list.
=SUMPRODUCT(({1;1;1;1;5;6;6;1;9;1;1;6;5;5;15;1;5}<MATCH(B10,$B$4:$B$20,0))/COUNTIFS($B$4:$B$20,$B$4:$B$20))+1
The advantage is that those numbers are in a sort of ascending order, very much like using the RANK function, only with position being what is important as opposed to size. Then the combination of the second MATCH and the COUNTIFS is borrowed from a trick for getting a rank (without using the RANK function) where there are no skips in the sequence for duplicate values. It checks, for every value in the list, whether that value's first position is less than the first position of the current value, and divides by the number of times each value appears in the list.
=SUMPRODUCT({1;1;1;1;1;0;0;1;0;1;1;0;1;1;0;1;1}/{8;8;8;8;4;3;3;8;1;8;8;3;4;4;1;8;4})+1
This effectively creates an array of fractions that allow us to cheat our way into an accurate count. Here we see the ones (trues) in the numerator correspond to the eights and fours in the denominator such that we have 1/8 a total of 8 times and 1/4 a total of 4 times, with all other fractions equating to zero.
Finally, SUMPRODUCT adds up the fractions for a total of 2, and the +1 increments that count because the 20 on row 10 is in fact the 3rd unique value in the list. Technically, I could even have used an LTE (less than or equal to) between the two MATCH functions and avoided the need for a +1.
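The same counting trick can be sketched in Python to check the arithmetic. The function name is hypothetical, and the sample list is reconstructed only so that it reproduces the arrays shown in this comment: clusters 12, 4 and 20 come from the post and the comment, while the values 7 and 9 in positions 9 and 15 are invented placeholders.

```python
from fractions import Fraction


def rank_by_first_appearance(values, target):
    """Mimic SUMPRODUCT((MATCH(range) < MATCH(target)) / COUNTIFS(range)) + 1."""
    first_pos = [values.index(v) + 1 for v in values]  # MATCH(range, range, 0)
    counts = [values.count(v) for v in values]         # COUNTIFS(range, range)
    target_pos = values.index(target) + 1              # MATCH(target, range, 0)
    # Each value that first appears before the target contributes 1/count per
    # occurrence, so every such distinct value adds exactly 1 in total.
    total = sum(Fraction(int(p < target_pos), c) for p, c in zip(first_pos, counts))
    return int(total) + 1


# Stand-in for B4:B20; the 7 and the 9 are assumed values.
values = [12, 12, 12, 12, 4, 20, 20, 12, 7, 12, 12, 20, 4, 4, 9, 12, 4]
print(rank_by_first_appearance(values, 20))  # 3
```

Exact fractions are used here so that eight copies of 1/8 sum to exactly 1, which sidesteps any floating-point rounding in the sketch.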
The simplest, but less elegant, solution would be to sort the values in B, then set C4 to:
=IF(ISNUMBER(B3),IF(B4=B3,C3,C3+1),1)
Of course, this assumes there is no intrinsic meaning to the cluster IDs (i.e., 12 doesn't have to be 1), which isn't exactly clear from the question.
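A rough Python sketch of this sort-first approach shows the caveat in action. The function name and sample data are made up for illustration; note how cluster 12 ends up as 2 rather than 1 because numbering follows sort order, not order of appearance.

```python
def sequence_numbers_sorted(clusters):
    """Sort the clusters, then bump the number each time the value changes."""
    ordered = sorted(clusters)
    results = []
    for i, value in enumerate(ordered):
        if i == 0:
            results.append(1)                # nothing above: start at 1
        elif value == ordered[i - 1]:
            results.append(results[-1])      # same as the row above: reuse
        else:
            results.append(results[-1] + 1)  # value changed: next number
    return list(zip(ordered, results))


# Hypothetical sample data.
print(sequence_numbers_sorted([12, 12, 4, 20, 4]))
# [(4, 1), (4, 1), (12, 2), (12, 2), (20, 3)]
```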
It's an awesome way to learn the lesson. Thanks for posting this.
I entered numbers in F13 to F23, and the formula started from G16. Default value in G15 = 1.
=IF(ISERROR(VLOOKUP(F16,F$15:G15,2,0))=FALSE,VLOOKUP(F16,F$15:G15,2,0),MAX(G$15:G15)+1)