• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

MultiCheck Across Multiple Tabs [SOLVED]

Pofski

Member
Hello again everybody;


I am trying to achieve the following.

I have a row with unique id's. In a column next to that row i would like to have a formula that automatically puts a 1 if the id isn't found in any of the tabs that contain data.( i have included an example, because I know that it is going to get a bit confusing, sorry for that)


The entire thing should be done on the tab Calculation. If an equipment group starts with Q, it would look in the tab DataQ.

If it starts with an E, it looks in the tab with DataE.

If an exception (like an F) it should be possible to add the F to the DataE tab as well.


If the unique id is found, then there would be a 0 in the column MultiChk, if it is nowhere to be found, then it would put a 1.


Also, this should be done with formula, and not a macro (this is a returning topic for me, for which I apologise, but it is company policy)


Also, the end result is going to have to be used with 6 different tabs, with normally anywhere between 100.000 to 1 mil. lines of unique id's in the tab Calculation. So I am also trying to get it as low as possible on used resources.


Thank you in advance.


Sincerely,


Pofski


https://docs.google.com/file/d/0B0naOkYo4pCmQkVYOWpSVHF0aGs/edit?usp=sharing
 
Pofski - if someone suggests use of the INDIRECT function, you probably want to steer clear of it in this case because that function is Volatile, meaning it will recalculate any time anythig in the workbook changes. And given you have more than a million cells to search, that will be slow.


Instead, some thoughts. Using the CHOOSE function will be a good option, because it isnt volatile, and it allows you to scan ranges stored on different sheets. You can see more on volatility - and also download a workbook with some examples of lookups with non-volatile functions - at a recent question I answered at http://www.excelguru.ca/forums/showthread.php?1828-How-to-display-a-list-if-conditions-are-met-excel-2010


Or if you have excel 2010 then you could use PowerPivot.


I'll take a look in half a day if i get time. Got to go to bed now.
 
Hi Pofski ,


Why don't you try a compound IF statement , and see how the response time is ?


=--ISNA(MATCH(B2,IF(LEFT(B2)="Q",DataQ!$B$2:$B$5,IF(LEFT(B2)="E",DataE!$B$2:$B$6)),0))


Copy this down on the Calculation tab.


Narayan
 
Hey all,


I'm doing a comparison now between the suggested formula from Narayan, and the following.


I put an extra info square in containing what eachfirst letter of the equipment group that corresponds to in cells E1:F3; and then i put the following code:


=CHOOSE(VLOOKUP(LEFT(A2;1);$E$1:$F$3;2;FALSE);IF(ISNA(VLOOKUP(B2;DataQ!$B$2:$B$5;1;0));1;0);IF(ISNA(VLOOKUP(B2;DataE!$B$2:$B$5;1;0));1;0))


I'm checking now how this runs on the full program, will let you guys know.
 
Here’s a couple of key points you want to consider, as well as a revision of your formula in light of them. Note that my version of Excel uses commas to separate arguments, whereas yours uses semicolons. So you’ll need to adjust these accordingly if you cut and paste from here.


Firstly, should we use VLOOKUP or MATCH in this particular case?

I’d go for VLOOKUP. From http://msdn.microsoft.com/en-us/library/office/ff726673(v=office.14).aspx

VLOOKUP is slightly faster (approximately 5 percent faster), simpler, and uses less memory than a combination of MATCH and INDEX, or OFFSET. However, the additional flexibility that MATCH and INDEX offer often enables you to significantly save time. For example, you can store the result of an exact MATCH in a cell and reuse it in several INDEX statements.


Secondly, can we optimise your VLOOKUP so that it works faster?

Yes we can, according to http://fastexcel.wordpress.com/2012/03/29/vlookup-tricks-why-2-vlookups-are-better-than-1-vlookup/

Here’s the specifics: VLOOKUPs that use the EXACT MATCH argument – i.e. the FALSE or Zero argument in VLOOKUP(B2,DataQ!$B$2:$B$5,1,FALSE) – are very slow/resource intensive.

VLOOKUPS that do an approximate match on sorted data– i.e. VLOOKUP(B2,DataQ!$B$2:$B$5,1,TRUE) – are much faster/less resource intensive.

However, the problem with this is if your lookup term isn't in the lookup list, you’ll get a false match. That is, approximate VLOOKUPS always return something, even if the item you fed it isn't in the lookup list. And in your case we know that there will be items that are not in the lookup list, because those items are exactly what we are looking for.

But there’s a clever workaround from Charles Williams at http://fastexcel.wordpress.com/2012/03/29/vlookup-tricks-why-2-vlookups-are-better-than-1-vlookup/

Basically we check to see whether the item returned from the 'approximate VLOOKUP'formula is the same thing we originally fed to the 'approximate VLOOKUP' formula. If it is, we know your item is in the lookup list. If not, then we know the item we looked for isn’t in the lookup list.

Here’s how that looks:

IF(VLOOKUP(B2,DataQ!$B$2:$B$5,1,TRUE)=B2,0,1)

Note this revision replaces the ISNA step with a simple equality check.

So the revised formula looks like this:

=CHOOSE(VLOOKUP(LEFT(A2,1),$E$1:$F$3,2,FALSE),IF(VLOOKUP(B2,DataQ!$B$2:$B$5,1,TRUE)=B2,0,1),IF(VLOOKUP(B2,DataE!$B$2:$B$5,1,TRUE)=B2,0,1))


Lastly, can we better structure the lookup data so that the job is made even easier?
Quite possibly.


If a range is sorted across several columns then you can often write a formula that will dynamically work out where a specific sub-set of data sits within a column of interest, so that you can work directly with that rather than processing an entire column.

For instance, say we have 500,000 rows of data sorted alphabetically by Year, then by Country, and then by City. And say we routinely want to perform some ad hoc calculations regarding a particular city in a particular country in a particular year. We have three input cells where a user can choose which Year, Country, and City they are interested in from some dropdowns. Say they choose 2010, New Zealand, and Wellington from those dropdowns.

• We can write a formula that scans the Year column and returns the first and last row numbers that contain the year of interest (2010). Let’s say this tells us that the 2010 data starts in row 250,010 and ends in row 302,084.

• We can then search just within rows 250,010 and 302,084 of the Country column for our country of interest (New Zealand). Let’s say this tells us that the New Zealand data for that year falls between rows 274,813 and 289,002.

• We can then search just within rows 274,813 and 289,002 of the City column for our city of interest (Wellington). Let’s say this tells us that the Wellington data for New Zealand in 2010 falls between rows 282,703 and 283,055.

• We can then point our computationally intensive aggregation or filtering formula just at the 352 rows between 282,703 and 283,055 that contain data for Wellington, New Zealand in 2010, rather than looking for a Year, Country, and City match across three columns for all 500,000 rows of data.

In your case, you should find that it is significantly more efficient to first scan the Equipment Group column to determine the applicable range that each item will fall within in the Unique ID column. And then use that range to set upper and lower bounds in which to do the VLOOKUP in that Unique ID column.


I’ll whip up an example.
 
Last edited:
Okay, here’s a very complicated range slicing formula that should work pretty fast on your large datasets. https://www.dropbox.com/s/3qqnvf8yn...tilpe sheets with Range Slicing 20130629.xlsx


Note that I’ve changed all your lookup tables to Excel Tables/ListObjects using the Ctrl+T Shortcut. This means everything is dynamic…add more data to the tables and the formula will still work.


And also note that this formula returns “True” in the case that the lookup item is missing, and “False” in the case that it is there.


=ISNA(CHOOSE(VLOOKUP(LEFT([@Group],1),ChooseID,2,FALSE),VLOOKUP([@ID],INDEX(DataE[ID],MATCH([@Group],DataE[Group],FALSE),1):INDEX(DataE[Group],MATCH([@Group],DataE[Group],TRUE),1),TRUE)=[@ID],VLOOKUP([@ID],INDEX(DataQ[ID],MATCH([@Group],DataQ[Group],FALSE),1):INDEX(DataQ[Group],MATCH([@Group],DataQ[Group],TRUE),1),TRUE)=[@ID]))


This bit in bold finds the correct sheet tab:

=ISNA(CHOOSE(VLOOKUP(LEFT([@Group],1),ChooseID,2,FALSE) ,VLOOKUP([@ID],INDEX(DataE[ID],MATCH([@Group],DataE[Group],FALSE),1):INDEX(DataE[Group],MATCH([@Group],DataE[Group],TRUE),1),TRUE)=[@ID],VLOOKUP([@ID],INDEX(DataQ[ID],MATCH([@Group],DataQ[Group],FALSE),1):INDEX(DataQ[Group],MATCH([@Group],DataQ[Group],TRUE),1),TRUE)=[@ID]))

And this bit handles the lookup for the DataE tab:

=ISNA(CHOOSE(VLOOKUP(LEFT([@Group],1),ChooseID,2,FALSE), VLOOKUP([@ID],INDEX(DataE[ID],MATCH([@Group],DataE[Group],FALSE),1):INDEX(DataE[Group],MATCH([@Group],DataE[Group],TRUE),1),TRUE)=[@ID] ,VLOOKUP([@ID],INDEX(DataQ[ID],MATCH([@Group],DataQ[Group],FALSE),1):INDEX(DataQ[Group],MATCH([@Group],DataQ[Group],TRUE),1),TRUE)=[@ID]))

As does this bolded bit for the DataQ tab:

=ISNA(CHOOSE(VLOOKUP(LEFT([@Group],1),ChooseID,2,FALSE), VLOOKUP([@ID],INDEX(DataE[ID],MATCH([@Group],DataE[Group],FALSE),1):INDEX(DataE[Group],MATCH([@Group],DataE[Group],TRUE),1),TRUE)=[@ID] , VLOOKUP([@ID],INDEX(DataQ[ID],MATCH([@Group],DataQ[Group],FALSE),1):INDEX(DataQ[Group],MATCH([@Group],DataQ[Group],TRUE),1),TRUE)=[@ID]) )

To incorporate a new lookup tab (say DataX

1. Make a copy of an existing data tab

2. Select the table, and change the name of the table to DataX

3. Populate the table with the relevant product groups and unique IDs

4. Add the tab at the bottom of the ChooseID lookup table

5. Amend the megaformula above by copying the blod bit to the end, and changing any instance of DataQ to DataX in than end bid, so you end up with this:

=ISNA(CHOOSE(VLOOKUP(LEFT([@Group],1),ChooseID,2,FALSE), VLOOKUP([@ID],INDEX(DataE[ID],MATCH([@Group],DataE[Group],FALSE),1):INDEX(DataE[Group],MATCH([@Group],DataE[Group],TRUE),1),TRUE)=[@ID] , VLOOKUP([@ID],INDEX(DataQ[ID],MATCH([@Group],DataQ[Group],FALSE),1):INDEX(DataQ[Group],MATCH([@Group],DataQ[Group],TRUE),1),TRUE)=[@ID]) ,VLOOKUP([@ID],INDEX(DataX[ID],MATCH([@Group],DataX[Group],FALSE),1):INDEX(DataX[Group],MATCH([@Group],DataX[Group],TRUE),1),TRUE)=[@ID]) )

…And you’re good to go.
 
Last edited:
Back
Top