• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Array formula for sorted & filtered results

Vatsy

New Member
Hello,
I am new to excel and still learning INDEX & MATCH functions, your help here is greatly appreciated.

I have a data set which can have large number of records (several 1000s). The data is unsorted and contains date and few other fields. See attached sample data excel.

The challenge I have are:
1. Need to find specific values in the 'Value' column that are with in the date range given at cell F3 and G3
2. I also need to map these values to specific dates in 'Date' column
3. Once these values are obtained, I need to plot them on a scatter plot

For #1 above, I am trying to use INDEX and MATCH functions. I have figured out a way to create an array with Row numbers for valid entries (i.e. rows that meet the Date criterion specified above).

{=(Table1[Boolean]=TRUE)*(Table1[Date]>=F3)*(Table1[Date]<=G3)*MATCH(ROW(Table1[Boolean]), ROW(Table1[Boolean]))} (entered with Ctrl+Shift+Enter) See cell E7

This array formula does create an array of values with '0' (Zero) for rows that do not meet the criterion and actual row number for those that do meet the criterion. Example:

{1;2;3;4;5;6;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;95;96;97;0;99;100;101;102;...} <Trimmed rest of the array>

I am lost on how to get the actual data from 'Value' and 'Date' columns. I tried using INDEX but that's not returning an array (I only get the first value). For example:

{=INDEX(Table1[Value], (Table1[Boolean]=TRUE)*(Table1[Date]>=F3)*(Table1[Date]<=G3)*MATCH(ROW(Table1[Boolean]), ROW(Table1[Boolean])))} (entered with Ctrl+Shift+Enter) results in 104 (a scalar value, not an array). This should return an array, but it works sometimes and not others...See cell E8

Could you please help me define a formula that can:
1. Return only the valid row numbers (without the zeros) i.e. return {1;2;3;4;5;6;95;96;97;0;99;100;101;102;...}
2. Return the 'Value' and 'Date' fields corresponding to those rows
3. Is there a way to sort these values based on date?

Thank you so much!
-vatsy
 

Attachments

  • SampleData.xlsx
    24.6 KB · Views: 14
I added a column to your table that displays the row number when valid. Column G displays the list of valid row numbers. Column H and I are the associated dates and values.
 

Attachments

  • SampleData.xlsx
    72.1 KB · Views: 8
Quite honestly I think it is time to ditch whatever version of Excel you are using and license Office 365 and set it to insider. Otherwise you will waste many more times the cost of the year's licence developing this one workbook.

= FILTER( Table1,
(Table1[Boolean]=TRUE)*
(Table1[Date]>=StartDate)*
(Table1[Date]<=EndDate) )


… and this formula didn't keep crashing Excel the way your whole column array formula did!
 
I added a column to your table that displays the row number when valid. Column G displays the list of valid row numbers. Column H and I are the associated dates and values.
Thank you, that does exactly what I want. I am now looking to convert this in to a named formula. The reason being, the table is auto generated by another tool. I could add these columns to another sheet and generate the required graphs from the combination of the data. :)
 
Quite honestly I think it is time to ditch whatever version of Excel you are using and license Office 365 and set it to insider. Otherwise you will waste many more times the cost of the year's licence developing this one workbook.

= FILTER( Table1,
(Table1[Boolean]=TRUE)*
(Table1[Date]>=StartDate)*
(Table1[Date]<=EndDate) )


… and this formula didn't keep crashing Excel the way your whole column array formula did!
Yes, this was my original thought as well. I got excited when I saw the FILTER function, but the organization doesn't allow insider preview builds and I am stuck to the old method of dealing with large data until this becomes mainstream. Thank you so much for the reply!
 
The attached uses a short sequence of named formulas. Since that is the way that all my workbooks are built, I at least thought I should make the effort! The row index 'k' is given by
= ROW(Table1) - ROW(Table1[#Headers])
The consolidated row numbers of valid rows 'validRowNum' are
= SMALL( IF( (Table1[Boolean]=TRUE) * (Table1[Date]>=startDate) * (Table1[Date]<=endDate), k ), k )
This is calculated twice, once to generate the output and the second time to limit the length of the output array using 'N'
= COUNT( validRowNum )
The only worksheet formula is
= IF( k<=N, INDEX( Table1[[Date]:[Value]], validRowNum, {1,2} ), "" )
This will be a spilt array for me or a CSE array for others.
 

Attachments

  • FilteredData (PB).xlsx
    48.5 KB · Views: 19
The attached uses a short sequence of named formulas. Since that is the way that all my workbooks are built, I at least thought I should make the effort! The row index 'k' is given by
= ROW(Table1) - ROW(Table1[#Headers])
The consolidated row numbers of valid rows 'validRowNum' are
= SMALL( IF( (Table1[Boolean]=TRUE) * (Table1[Date]>=startDate) * (Table1[Date]<=endDate), k ), k )
This is calculated twice, once to generate the output and the second time to limit the length of the output array using 'N'
= COUNT( validRowNum )
The only worksheet formula is
= IF( k<=N, INDEX( Table1[[Date]:[Value]], validRowNum, {1,2} ), "" )
This will be a spilt array for me or a CSE array for others.

Thank you Peter for the quick reply! This is excellent!

I do have one challenge though (this is like pealing onions)...As I said before, the table is auto generated by an external tool and for whatever reason, the tool will recreate the table with a new name each time. That means, I need to update all formulas with the table name each time the data gets refreshed (which can be on a daily basis). The alternative is to use Sheet1!$A:$A instead of Table1[Date]. This is ugly because Excel will then process all 10,000+ rows even if there is no valid data in them (needless to say the crash every so often).

I found a way to detect the table name using the following formula (thank to this post: https://stackoverflow.com/questions/36561623/excel-return-table-name-using-formula):

M12 =MID(FORMULATEXT(A1),2,FIND("[",FORMULATEXT(A1))-2)
where A1=Table1[[#Headers],[Date]]

Is there a way for me to use this table name in the formula?

Alternatively, now I can figure out how many rows are in the table:
M13=ROWS(INDIRECT(M12))

Can I use the total number of rows in M13 in the formula? E.g. Sheet1!$A2:$A&M13?

Thank you,
vatsy
 
@John Jairo V 's approach of reverting to a pre-tables style of dynamic array will work. Normally I avoid referencing data by its location and base the entire solution of Names; I simply do not care where the data is as long as I can reference it and I can relate it to concepts within the business domain.
Your situation of knowing precisely where the input data resides but not knowing what it is called is a new one for me.

What happens if you rename the table back to its pre-update value? Do the references correct themselves?
#REF! errors will not repair but #NAME? errors should.

Can you use a macro to rename the table and repair damaged formulae?
The code required is very simple; something like
Sub RenameTable()
Dim LO As ListObject
Set LO = Sheet1.ListObjects(1)
LO.Name = "InputData"
End Sub
but many organisations do not like their employees loading programs onto company computers.
 
I have inserted additional names to reduce the nesting of formulas and added a macro to rename the Table once it has been overwritten by a new version with a different name. I would be curious to know what breaks and in what manner. As I said before, it is a use case that I haven't encountered and one which I have no way of testing.
 

Attachments

  • FilteredData (dev).xlsm
    47.6 KB · Views: 3
Back
Top