• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Vlookup for 200,000 rows is very slow...any alternative?

PP3321

Active Member
Dear Chandoo Community,

Thank you always for your generous help.
Today, I have 1 question...

I have to do vlookup for 200,000 rows.
It shows this message, and it takes may be 2 or 3 minutes for calculation to finish.

Calculating:(4 Processor(s)):
screenshot.PNG

What is the best practice when it comes to big file containing more than 200,000 rows?
Am I doing the right thing to use Excel?
Is there alternative, for example like getting add-ins...?

Thanking you in advance...
 
Single column lookup on unsorted list, be it INDEX/Match or VLOOKUP, the speed is pretty much comparable. Takes about 1.5 min 1.75 min in most cases (for about 200k rows of data).

If it takes more than that... It may be due to other factors within the workbook.

Several ways to speed things up.
If you can sort the list on key field, double approximate VLOOKUP is much faster than any other method.

I often use, VBA, MS QUERY, PowerQuery for large lookup.

See thread below for some example.
http://chandoo.org/forum/threads/vba-code-to-vlookup-between-workbooks.32161/page-2#post-190961
 
@Chihiro,
Thank you so much!
If you do not mind asking, I have 1 question.
Why using VBA is faster than using worksheet?
Does it take more memory to do something on the worksheet,
than doing loops in VBA Editor?
I guess using WorksheetFucntion Object to do vlookup would not make difference...?

@AVK,
Thank you. It is great to understand that...

"Try using the INDEX and MATCH functions instead of VLOOKUP. VLOOKUP is slightly faster (approximately 5 percent faster), simpler, and uses less memory than a combination of MATCH and INDEX, or OFFSET. However, the additional flexibility that MATCH and INDEX offer often enables you to significantly save time. For example, you can store the result of an exact MATCH in a cell and reuse it in several INDEX statements."
 
@Chihiro,
I watched these videos.
I discover many things from you.

How to do vlookup with Power Query.

How to do vlookup with MS Query.
 
Why using VBA is faster than using worksheet?
Does it take more memory to do something on the worksheet,
than doing loops in VBA Editor?

This will depend largely on your data structure.

For a single column lookup on sorted key field. Double approximate VLOOKUP will probably be the fastest. Assuming that there are no other calculations on the workbook (dependent or otherwise).

VBA using combination of Array & Dictionary object is very fast (typically, 400k rows of lookup takes 0.25~0.5 sec). Since, all operation can be done within memory and then result cast onto sheet in one shot. Also, added benefit is that you can turn off all other calculation etc while process is being completed.

I don't recommend VLOOKUP using WorksheetFunction, with very few exceptions. Since, it is usually faster to use Formula and is more cumbersome to edit code when change is made on the worksheet.

Advantage of using INDEX/MATCH isn't speed of calculation. But in it's flexibility. It can look from right to left, match agaist both columns and rows, etc.

My personal preference is using SQL (be it MS Query, ADODB or PowerQuery). It does not require source workbook to be open and deals very well with unsorted data. It's also quite easy to adopt for multiple condition lookup.
 
Back
Top