• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Identifying the top level in a hierarchy of numbers

In the attached file, in column C, I'm trying to identify which project IDs are at the "top level" with a formula. The rows in orange are top level. So the problems I'm running into are the two green cells. Cell C4 should say "Top Level" but it's blank. Cell C30 shouldn't say "Top Level."

The project IDs in column A are sorted. Starting with any level 4 project ID, I am trying to find which project ID is the "Top Level."

The orange cells show what is the top level project ID for the group of project IDs below it.

How is "Top Level" defined? Any given project ID may have a parent ID and that parent ID may have a parent ID. In other words, any level 4 project ID will have a parent ID at level 3. However any level 3 project ID may or may not have a parent ID at level 2.

The highest level you can go up to, is the parent ID for any group of project IDs. For example, any level 4 project ID in A26:31 has a level 3 parent ID (there are 2 of them). And both level 3 project IDs, have a level 2 parent ID. Since level 2 is the highest we can go, that level 2 on row 25 is the "top level" for the project codes below it.

Another example, the level 4 project ID in A12 has a parent ID at level 3 (A11), however that level 3 in A11 does not have a parent ID at level 2 (which would have been A8081.11234 if it were in column A). Therefore starting with level 4 in A12, the highest we can go up is the level 3 ID in row 11. Therefore row 11, is a "Top Level."
 

Attachments

  • Chandoo.org - Identifying Top Level.xlsx
    11.4 KB · Views: 8
Test: in a cell in row 2:
=IF(MIN(IF(TEXTBEFORE($A$2:$A$31,".")=TEXTBEFORE($A2,"."),$B$2:$B$31))=$B2,"top level", "")
copy down.
Nice. This worked. Thank you very much. I spent a few hours trying to figure this one out in the attached file, but when I tried it on my dataset of 980 rows, I got an #N/A error. If I look only at cell D2, I narrowed it down to this part:

TEXTBEFORE($A$2:$A$31,".")=TEXTBEFORE($A2,".")

I used formula evaluator to pin this down. The value returned by TEXTBEFORE($A2,".") appears only twice (in the first two rows) which is why

TEXTBEFORE($A$2:$A$31,".")=TEXTBEFORE($A2,".")

evaluates to TRUE;TRUE;FALSE;FALSE;FALSE;FALSE......to the end except one of those FALSES is the #N/A error (let's say position N). I can't count the number of FALSES that appear after the #N/A error to identify position N for the #N/A error. There are just way too many entries to count.

Do you have any suggestions on how to identify the position N of the #N/A error? I added one column to extract the first 5 characters from each row, checked each one with the EXACT ( ) to see if equaled the value returned by TEXTBEFORE($A2,"."). I got TRUE for the first two rows and FALSE for all remaining rows. No #N/A errors in the remaining rows, so I don't know how else I can pin down what is causing the #N/A error.
 
I'd guess, unlike the dataset you gave, that there are some Project IDs without a dot at all, truly top level ones.
To debug, you can enter parts of the equation into a cell, eg.
=TEXTBEFORE($A$2:$A$982,".")=TEXTBEFORE($A2,".")
to get the full column of results. You can change the $A2 to another cell to see the results for that row.
Anyway, a quick, 'sticking-plaster', resolution could be to use this instead:
=IF(MIN(IF(TEXTBEFORE($A$2:$A$31 & ".",".")=TEXTBEFORE($A2 & ".","."),$B$2:$B$31))=$B2,"top level", "")
not forgetting to adjust the 31 to where your last row of data is.
 
Did it work?
This part worked:
=IF(MIN(IF(TEXTBEFORE($A$2:$A$31 & ".",".")=TEXTBEFORE($A2 & ".","."),$B$2:$B$31))=$B2,"top level", "")

I still wish I knew the reason for the error I mentioned above, but at everything seems fine now. Thank you for your help. This is a really complex formula. I figured the solution had something to do with an array/range.
 
Back
Top