• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Flatten a hierarchy with recursive LAMBDA formula ?

Lolo

Member
Hello,

In excel I have this table (it is just an example) :
Table called T_DATA

idlParentId
dtm|fact
dtm|dim
fact|raw1
dim|raw2
dtm2|raw3
Etc…

I want to flat the hierarchy based on a value the user populate in cella D1 for example

For example if in D1 I populate : dtm
the result expected would be :
(row 1) : dtm>fact>raw1
(row 2) : dtm>dim>raw2

if in A1 I populate : Dtm2
the result expected would be :
(row 1) : dtm2>raw3

So in summary, each path found must be concatened (separated by a '>'),
and there is 1 path by row.

I managed to do it and it works for the 1st path only, with the following recursive function/formula called GetPath:
=LAMBDA(start,
LET(
nextItems,FILTER(T_DATA[ParentId];T_DATA[Id]=start);
IF(nextItem="";start;start&">"&GetPath(nextItem))
)
)
and by using it like this : GetPath(D1)

But I cannot figure out a way to get ALL paths based on the criteria searched in A1
I tried other options found on internet, but the formula returns #N/A each time :(

See attached file.

If anyone can help me, it would be very appreciated.
 

Attachments

  • SampleFlatHierarchy.xlsx
    11.9 KB · Views: 3
Please see the attached workbook for a possible solution. The recursive function is defined as follows:

GET.PATH:
Code:
=LAMBDA(node,lookup_arr,return_arr,
   IF(
      ROWS(node) = 1,
      LET(
         v, @node,
         a, FILTER(return_arr, lookup_arr = TEXTAFTER(v, ">", -1,, 1)),
         IF(ISNUMBER(ROWS(a)), GET.PATH(v & ">" & a, lookup_arr, return_arr), v)
      ),
      VSTACK(
         GET.PATH(TAKE(node, ROWS(node) / 2), lookup_arr, return_arr),
         GET.PATH(DROP(node, ROWS(node) / 2), lookup_arr, return_arr)
      )
   )
)

It should work fine if you only intend to search by parent nodes (e.g. "APP1" and "APP2" in your sample workbook):

Code:
=LET(
   pId, SCAN("", IF(LEFT(T_DATA[Dataset], 3) = "app", T_DATA[Dataset], ""), LAMBDA(a,v, IF(v = "", a, v))),
   arr, FILTER(T_DATA, pId = Param.SearchNode),
   IF(ISNUMBER(ROWS(arr)), GET.PATH(Param.SearchNode, TAKE(arr,, 1), TAKE(arr,,-1)), NA())
)

Searching by child nodes is also possible, as demonstrated in the Example2 worksheet...
 

Attachments

  • SampleFlatHierarchy_recursive_lambda.xlsx
    17.5 KB · Views: 1
Thank you I will have a look tomorrow. Seems by far more complex than I m able to produce in term of formula ... :(
I will let you know. Thank you again for your help
 
You're welcome. In all likelihood, it may end up returning some invalid paths. It will really only work properly with a one-to-many table, where one parent can have many child records. With a many-to-many table, where child records can also have many parents, it will return all subsequent levels for both parents. For example, in your sample file, Raw4 appears as a child record for both Dim1 and Dim2. If Raw4 were to then have a child record, Test1, it would be picked up by both parents (e.g. APP1>DTM1>Dim1>Raw4>Test1 and APP1>DTM1>Dim2>Raw4>Test1), whether that was intended or not. The same goes for Raw5, which appears as a child record for both Dataset2 and Dim2.

Furthermore, the formula will return #NUM! as written, when circular parent-child records are present, e.g. Dataset2>Dataset3; Dataset3>Raw2; Raw2>Dataset2. To help mitigate this issue, the formula can be modified to only search the remaining records at each iteration:

GET.PATH:
Code:
=LAMBDA(node,lookup_arr,return_arr,
   IF(
      ROWS(node) = 1,
      LET(
         v, @node,
         b, lookup_arr = TEXTAFTER(v, ">", -1,, 1),
         a, FILTER(return_arr, b),
         IF(
            ISNUMBER(ROWS(a)),
            LET(
               n, XMATCH(TRUE, b) - 1,
               GET.PATH(v & ">" & a, DROP(lookup_arr, n), DROP(return_arr, n))
            ),
            v
         )
      ),
      VSTACK(
         GET.PATH(TAKE(node, ROWS(node) / 2), lookup_arr, return_arr),
         GET.PATH(DROP(node, ROWS(node) / 2), lookup_arr, return_arr)
      )
   )
)

But it's definitely not perfect...
 

Attachments

  • SampleFlatHierarchy_v2.xlsx
    17.7 KB · Views: 0
Thank you.

1/ to comment your example :
APP1>DTM1>Dim1>Raw4>Test1 and APP1>DTM1>Dim2>Raw4>Test1
this is what I want indeed. It is normal. It means that Raw4 is sourced from Test1

2/ Not yet time to check formulas in details, but I opened the Excel file and it seems to do the job.
I will try to understand the formula, this is impressive ! :)
Thank you so much.

However seems I cannot select an intermediate node, for example Dataset2
and get this result :
- Dataset2>Dataset3>Raw2
- Dataset2>Raw5
But not a big issue. Your proposal with method 'Selected Node via ALL Nodes:' is also valid (but it need to have all hierarchies flatten before)
- APP1>DTM1>Dataset2>Dataset3>Raw2
- APP1>DTM1>Dataset2>Raw5

If we could have a Third option 'Selected Node via ALL Nodes (but see only the selection and childs):' to get only the result :
- Dataset2>Dataset3>Raw2
- Dataset2>Raw5
It would be perfect

NB1: Now, I need to see if it works with a big and real hierarchy (and see the performance)
NB2: I found also a solution via Power Query, but I need to adapt the M code to fit exactly my needs. when done I will share it.
 
Back
Top