
Huge Excel sheet: optimizing it and reducing its size

AshutoshSethi

New Member
Hi,


I am interning at a mutual funds division of a large firm and working on a revenue model. The model is already developed and is huge. I need to optimize it in terms of speed and efficiency overall.


Problems are: it is very slow. It has more than 42 worksheets and is 23 MB right now. It links to a lot of other Excel files on my drive, but not to any files on networked drives.


It uses VLOOKUP, SUMIF, and SUM a lot. All worksheets are linked to all the others through formulas. The only macros are in one sheet; they insert buttons that link to other tabs in the workbook.


Any advice on improving formulas or speed would be great. Also, is there a way I can split the Excel file and keep the important tabs in one workbook?


I know this is a generic question, so any advice would be helpful.

Thanks,

Ashu
 
Good day AshutoshSethi


If you start splitting your workbook by removing some worksheets, you may run into big problems: if those sheets have formulas linking to other worksheets, you will lose the references.


It is difficult to suggest other formulas without seeing what is going on.


You say it is very slow, but how slow?
 
Hi,


Any kind of external linkage should be avoided, unless you are linking to values which are used by several applications in various workbooks; so the fact that there are a lot of external links is itself a source of delays.


Another source of delays is data and formulae being intertwined unnecessarily. If there is a base set of data on which different models run, it would be better to keep all the data in one worksheet and have all the other worksheets / workbooks reference that one worksheet, rather than have some data along with some formulae spread out over many worksheets / workbooks.


Another possibility is to use helper cells / columns and range names to store intermediate results, so that the same calculations are not repeated multiple times.
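The helper-cell idea has a direct analogue in any language: compute an expensive intermediate once and let everything downstream reference it. A minimal Python sketch, with made-up numbers purely for illustration:

```python
# Hypothetical illustration of the "helper cell" idea: store an
# intermediate result once instead of recomputing it in every formula.

revenues = [120.0, 95.5, 143.2, 88.7, 110.1]

# Without a helper: every "formula" recomputes the total.
pct_without_helper = [r / sum(revenues) for r in revenues]  # sum() runs 5 times

# With a helper: the total is computed once, like a helper cell,
# and every downstream formula references it.
total = sum(revenues)            # the "helper cell"
pct_with_helper = [r / total for r in revenues]

assert pct_without_helper == pct_with_helper
```

In a workbook the same pattern means putting `SUM(range)` in one cell and pointing every percentage formula at that cell, instead of repeating the SUM inside each formula.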


The ideal way to start the process would be to prepare a detailed data flow diagram: what are the fundamental model constants, what are the base inputs, what are the different outputs? Can the model itself be partitioned into multiple sections, in which the outputs of one section feed into another?


Excel's recalculation speed is a function of the dependencies in your application; the more you reduce the dependencies towards the optimal configuration, the faster your application will run.
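To make the dependency point concrete, here is a tiny Python sketch (the cell names and graph are invented for illustration) that counts how many cells must recalculate when one input changes. Fewer edges in the graph means fewer cells to recalculate:

```python
from collections import deque

# Hypothetical dependency graph: cell -> cells whose formulas reference it.
dependents = {
    "Input!A1":  ["Calc!B1", "Calc!B2"],
    "Calc!B1":   ["Report!C1"],
    "Calc!B2":   ["Report!C1", "Report!C2"],
    "Report!C1": [],
    "Report!C2": [],
}

def cells_to_recalc(changed):
    """Breadth-first walk of everything downstream of a changed cell."""
    seen, queue = set(), deque([changed])
    while queue:
        for dep in dependents.get(queue.popleft(), []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

print(cells_to_recalc("Input!A1"))
# {'Calc!B1', 'Calc!B2', 'Report!C1', 'Report!C2'}
```

Excel maintains a graph like this internally; every extra cross-sheet or cross-workbook reference adds edges, and every edit triggers a walk of everything downstream.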


If you are fluent in Excel, I strongly recommend you visit this link:


http://ramblings.mcpher.com/Home/excelquirks/optimizationlink


Download the profiler available there, and check your workbook for the areas which are responsible for the delays. Based on the data you get, you can decide which segment needs to be worked on.


Narayan
 
Greetings Ashu, and welcome to the forum.

First, congrats on the job opportunity. Next, as a newcomer to the forum, I'd encourage you to use the Custom Search box in the top-right of the screen here at Chandoo; it will help you find several useful articles and threads.


To answer your question, I'd point you first to this article:

http://chandoo.org/wp/2012/03/26/excel-speedup-optimization-tips-by-experts/

and then this one:

http://chandoo.org/wp/2012/03/27/75-excel-speeding-up-tips/


The biggest drivers of formula slowdown are probably the links to so many files, and large references (e.g., referencing a whole column like C:C instead of just the relevant data, C1:C1000).

You could try moving the dashboard/important info to a new workbook (the easiest way is to right-click the sheet tab and move it to a new book), but that won't make things run faster.


If the macros are running slowly, there are several tips/tricks that can help speed things up. For that, you could post a sample of one of the macros (if possible) to the forum and we'll look it over to see if there's any way to speed it up.
 
Those are all excellent links above.


Some additional thoughts:

"The model is already developed and is huge. It has more than 42 worksheets and is 23 MB right now."


23 MB isn't that big for Excel (although, granted, if your model were better structured it might well need only 2 MB), so the file size alone likely isn't the cause of major issues.


"It links to a lot of other Excel files on my drive, but not to any files on networked drives."


Oh dear. Linking files IMHO is a no-no. It's a recipe for disaster. It's very easy for someone to make a structural change to one of those other files (like adding a column or row) while your model is closed, meaning your model won't pick up on the fact that everything has moved a row or column. Meaning you'll likely get a wrong answer.


Has performance suffered to the point that you have to set calculation to Manual?

If so, note that many people think that having to run a spreadsheet in manual calculation mode is simply a byproduct of having a big spreadsheet with lots of data. In fact, in most cases I've seen, it is due to bad/inefficient spreadsheet design. The worst culprit is volatile functions such as OFFSET, INDIRECT, TODAY, or NOW that are referenced by tens of thousands of 'downstream' formulas.


I suggest you simply do a 'Find All' search (Ctrl + F) on OFFSET, INDIRECT, etc. within your workbook and see how many hits you get. Make sure you select 'Within: Workbook' and 'Look in: Formulas', then hit 'Find All'.


The other thing to look out for is copying large ranges throughout the workbook. For instance, the other day I saw a spreadsheet that copied a big table of around 15,000 rows by 11 columns to multiple places throughout the workbook. In all, 1.4 million formulas were involved in duplicating that block, for no good reason. I restructured it so that aggregation formulas (e.g. SUMIF, SUMPRODUCT, etc.) pointed directly at the raw data.
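The same restructuring can be sketched in Python (the table and regions are hypothetical): rather than copying the raw table several times and summing the copies, keep one copy and point every aggregation straight at it, SUMIF-style:

```python
from collections import defaultdict

# Hypothetical raw table of (region, amount) rows; imagine 15,000 of them.
raw = [("East", 100), ("West", 250), ("East", 75), ("North", 30), ("West", 20)]

# SUMIF-style aggregation pointed directly at the raw data:
totals = defaultdict(float)
for region, amount in raw:
    totals[region] += amount     # one pass over one copy, no duplicated tables

print(dict(totals))  # {'East': 175.0, 'West': 270.0, 'North': 30.0}
```

One pass over one copy of the data replaces every duplicated block; in the workbook, that means a handful of SUMIF cells instead of a million copied formulas.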


Some other things to check:

* You say you have lots of VLOOKUPs. VLOOKUP is MUCH MUCH faster if your data is sorted and you set the last argument to TRUE (although you need to read http://fastexcel.wordpress.com/2012/03/29/vlookup-tricks-why-2-vlookups-are-better-than-1-vlookup/ for more info and some gotchas).

* If you use SUMPRODUCT a lot, see if you can replace it with SUMIFS (new in Excel 2007), because it's faster.

* Don't use =IF(ISERROR(SomeFormula), SomeOtherFormula, SomeFormula), because SomeFormula gets evaluated twice whenever it doesn't error (once inside ISERROR, and again as the result). Instead, use the newer IFERROR function: =IFERROR(SomeFormula, SomeOtherFormula). It's much simpler, and it evaluates SomeFormula only once.

* Don't use IF statements like =IF(F1=0,"",F1) to replace zeros with blanks, because this suppression adds significant overhead for something you can accomplish with no overhead whatsoever via custom number formats.

So instead of something like this:

=IF(F1=0,"",F1)

… we could just use this:

= F1

…in conjunction with the following custom number format:

#,##0.0;- #,##0.0;;

(A custom number format has up to four semicolon-separated sections: positive; negative; zero; text. Leaving the zero section empty hides zeros entirely.) Do a Google search on Custom Number Formats for more info, or check out http://peltiertech.com/Excel/NumberFormats.html or http://www.corality.com/tutorials/custom-number-formats-excel
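For intuition on why the sorted/TRUE form of VLOOKUP in the tips above is so much faster: an exact-match lookup scans the table row by row, while an approximate match on sorted data can binary-search it. A rough Python analogue, with made-up keys and values:

```python
import bisect

# Hypothetical lookup table, sorted by key, like a sorted VLOOKUP range.
keys   = ["ACME", "BETA", "DELTA", "OMEGA", "ZETA"]
values = [ 1.10,   2.25,   3.40,    4.55,    5.70]

def lookup_exact_linear(k):
    """Analogue of VLOOKUP(..., FALSE): linear scan, O(n)."""
    for i, key in enumerate(keys):
        if key == k:
            return values[i]
    raise KeyError(k)

def lookup_sorted(k):
    """Analogue of the two-VLOOKUP trick on sorted data: binary search,
    O(log n), with an exact-match check so a missing key raises an error
    instead of silently returning the nearest smaller entry."""
    i = bisect.bisect_left(keys, k)
    if i < len(keys) and keys[i] == k:
        return values[i]
    raise KeyError(k)

assert lookup_exact_linear("DELTA") == lookup_sorted("DELTA") == 3.40
```

On a 100,000-row table that is the difference between ~100,000 comparisons and ~17, which is why the fastexcel article linked above recommends the sorted two-VLOOKUP pattern.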


And finally, read and reread this link: http://msdn.microsoft.com/en-us/library/office/ff726673%28v=office.14%29.aspx
 