• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Reducing calculation time

posaune76

New Member
Hello.

I have this (rather ungainly) workbook in which we track student information: demographics, tutoring notes, attendance, etc. I'd like to say right up front that I inherited the overall structure of the workbook, and I've been working hard both to streamline things and add functionality. I am also working on a complete redesign, but for now I'm stuck with the current structure. I feel strongly that having a separate worksheet of each type for each month is most likely hugely inefficient, but it's what I have for now.

One function I've added is a sort of "autocomplete" for the Student Info worksheets so that when a chronically ill patient is readmitted, much of their demographic information (DOB, grade, school, district, city, county, state) from previous months is automatically filled in. I suspect that this is part of my problem, but I'm not convinced, and I don't have good ideas for alternatives. The number of cells depending on any given entry for this information lessens through the semester: February looks back at January, March looks back at February & January, etc.; June doesn't have anything looking at it.

Notes on structure:
-Each month has a Student Info worksheet, where demographic info & tutoring notes go. Unfortunately, the tables in these worksheets are named Entry: JanEntry, FebEntry, etc. Sorry about that.
-Each month has an Attendance worksheet that refers to Student Info for that month to pull in student ID, name, school district. This can be sorted independently of Student Info, and attendance will remain properly attributed.
-Each month has a Communication worksheet, which is used to generate a printed worksheet for tutors to use as a reference throughout the day. It refers to Student Info. Nothing refers to Communication.
-Each month has a 150 Form worksheet, which is used to generate a form letter for the local district office on a monthly basis. It refers to Student Info and Attendance.
-There is a single Schools worksheet that is used to provide dynamic data validation menus for entering school & district information. We add to it as we discover new kids from schools we haven't served yet.
-The Data worksheet pulls together the information from the Student Info and Attendance worksheets.
-The Menus worksheet has information used for data validation menus & INDEX lookups of Zip/City/County information.
-The Reports worksheet provides summary information, referring to the Data worksheet.

The only volatile functions I can find in the whole workbook are the OFFSET formulas that are used in the data validation for districts and schools on the Student Info worksheets. It is my understanding that using OFFSET in data validation doesn't increase volatility, so I've worked to keep it there!

My problem, in a nutshell: When I enter information that doesn't have a lot of other cells depending on that information (a new student's grade level in June, for example), there's still a significant amount of calculation time following the entry before I can move on. I'd like to understand why this is the case, and it'd be even better to find a fix!

Sorry for the long post, but I thought a bit of a tour would be useful. If you're interested in helping and have any questions, please don't hesitate to ask. I've added 3 sample students to January and February so you can see a little bit of the functioning of the workbook. I haven't added any attendance data, though, since it's the Student Info worksheets that made me scratch my head.

The file is too large to upload; it can be downloaded here: https://drive.google.com/file/d/0B7kLqoPt5NtfalJEZkdEUFgwcTQ/edit?usp=sharing
 
Hi ,

One problem is that there is excessive formatting ; this includes cells with colour formatting and merged cells. This can contribute to higher recalculation times.

You need to restructure the entire workbook so that the data is stored in one worksheet , without any formatting ; another worksheet can contain the printable template ; as and when you need to print , you select the data , transfer it to the template , and print it. This will probably need VBA , but this will solve your problem.

Narayan
 
Narayan:

Thanks for the input. I don't disagree; things could be much simpler, including formatting issues. The new version of things that I'm working on does much of what you suggest. However, if there's something that I could do beyond getting rid of some formatting (does having conditional formatting, even on cells not related to the new data entered, cause recalculation?) to make things a bit smoother at least for the rest of the school year, that would be a great thing.

There's also no guarantee that my "new way" will be accepted -- there's a lot of "but it's always been done this way" that I try to work around every week.

I'll be sure to get rid of what formatting I can, but I also have to keep those who work with the tools more than my 3 hours/week happy.

An idea: would it be better for me to have the "autocomplete" formulas look for the the information in the Data worksheet, rather than with nested IF functions referring to successive previous months' worksheets?


Again, thanks for any and all input.
 
Hi ,

I can see that the sheets :

Student Info
Attendance
150 Form

have provision for up to 250 students.

However , the Communication tab also has provision for 250 students ; why ?

The Communications tab contains material which is to be printed out ; it does not need to be stored , since the information which is on this tab is being derived from information which is on the other 3 tabs.

The Communications tab should be laid out as a single printed sheet ; you design it to contain material pertaining to at most 4 students ; when it is to be printed out , you specify up to 4 ID numbers of students , the respective cells on the Communications tab get populated with data pertaining to the 4 students whose ID numbers have been specified , and when you are done , if more than 4 students need to be communicated with , you specify the next set of 4 student ID numbers.

Thus , the Communications tabs for each month are not required ; have just one tab where the month can also be specified.

You will reduce the number of tabs by 11 , and the number of rows in that tab from 2000 to 35.

Narayan
 
Narayan:

I completely understand where you're coming from with the Communication pages, and I'm coming up with ways to cut them down as I type:

1) I don't need to repeat the ID in all 8 rows of each kid's form.
2) I can reduce the number of entries to, say, 75 (which still feels high, but who knows how many kids we may serve per day in the future), then add a helper column in Student Info to determine which kids are in the hospital. They can be sorted by this column, then by name or whatever else, and Communication can just look at the top 75 lines. Even with the addition of the helper column, I'm saving 1500 formula cells each month.

As for why it's set up this way, here's a description of how the workbook is used on a daily basis. In the morning, the classroom teacher learns from a separate system (one from which she can't just copy & paste, which is its own annoyance) who the kids are who are school age & hospitalized. She enters any new information on the Student Info worksheet: new kids, new rooms, new isolation procedures, etc. She then sorts the Student Info worksheet by name, I think. Then she goes to the Communication worksheet, where she filters the right-hand "Print?" column for PRINT, which will exclude discharged kids and kids with no needs ("NN") as their Teacher Note. She adjusts print area & prints. Volunteer tutors then use the printed form to note their interactions with the students. The following day, those notes will be transcribed into the appropriate places in Excel: Student Info and Attendance. So, that's the flow. It could be better, but change in form, rather than function, is slow -- comfort levels need to be high to keep volunteers completing their parts, and the teacher needs to be able to work quickly within a system she knows. In short, a more manual process, rather than a more automated one -- even a slow one -- just won't fly.

I'm there once a week for 3 hours ,where I do lots of straight entry and any troubleshooting that's needed. On my own time at home, I'm designing something I hope will both feel familiar and be far better: a unified Student info worksheet, a unified Attendance worksheet, a unified Communication sheet, and reports.

The thing I'm still wondering, though, and it's the thing that led me to post: What's causing the recalculation for something innocuous like a grade level entry? Even if I get rid of every formula I can find that refers to grades -- the Grade column in Data, the autofilling formulas in subsequent months -- it still recalculates! There's not even any conditional formatting that's reliant on grade level.


Thanks,

-Brian
 
Hi Brian ,

I am not able to understand your terminology of grade level entry ; which sheet tab , and which cell ?

Probably the zoom factor also plays a role ; as far as possible having a zoom of 100 % may help.

Narayan
 
Sorry about that. I'm talking about worksheet January Student Info, any cell in column JanEntry[Grade], for example.
 
Hi Brian ,

Just found out one surprising fact about the way your workbook has been designed !

Every StudentInfo tab has formulae which refer to the earlier month's StudentInfo tab ! Thus when you enter data on the February StudentInfo tab , not only do you have formulae which refer to the January StudentInfo tab , but every subsequent month's StudentInfo tab formulae will be getting recalculated because you have entered some fresh data , even though the subsequent months will not have any data !

Out of the 26 or so columns on every StudentInfo tab , almost 15 have such formulae , which is 25 rows multiplied by 15 formulae per row , some of which are quite complicated , and there can be 11 such tabs which will have all their formulae getting recalculated ! I can't even imagine such a dependency tree !

I think this is one design which needs to be redone , period.

Narayan
 
I agree. I am redoing it, as I've mentioned. I also described the structure you noted in your last post in the second paragraph of the OP. As I said, I would understand if this structure somehow contributed to the phenomenon I'm trying to figure out (the recalculation issue), but I can't fathom just how it would be contributing.

The reason I say this is that if you:

1) Delete or rename the AllSpringData[Grade] column in the Data worksheet; and
2) Make an entry in the JunEntry[Grade] column on the June Student Info worksheet, which at this point has no formulas referring to it, anywhere,

the workbook recalculates. Why?

I understand that redesign is needed, and that there are many inefficiencies. This is the result of the sporadic evolution of a legacy document that I've been trying to work with in limited time over the course of a couple of school years, with no formal instruction in Excel.

The purpose of my posting of this thread, though, is the discovery and understanding of the particular principle that causes the recalculation -- so that I don't duplicate the problem in my redesigned product! Any help with this specific issue would be appreciated. I do thank you, Narayan, for your considered and useful feedback on other aspects of the workbook, many of which suggestions I will implement in some form. At this point, though, I'm trying to maintain focus on my original inquiry.
 
Hi Brian ,

Even June has formulae , and sufficiently complicated formulae at that , in 11 columns , each of them on the lines of the following :

=IF('June Student Info'!$Z5=" ","",IF(COUNTIF(MayEntry[Name],'June Student Info'!$Z5)>0,INDEX(MayEntry,MATCH('June Student Info'!$Z5,MayEntry[Name],0),MATCH(G$1,MayEntry[#Headers],0)),IF(COUNTIF(AprEntry[Name],'June Student Info'!$Z5)>0,INDEX(AprEntry,MATCH('June Student Info'!$Z5,AprEntry[Name],0),MATCH(G$1,AprEntry[#Headers],0)),IF(COUNTIF(MarEntry[Name],'June Student Info'!$Z5)>0,INDEX(MarEntry,MATCH('June Student Info'!$Z5,MarEntry[Name],0),MATCH(G$1,MarEntry[#Headers],0)),IF(COUNTIF('February Student Info'!$AA$2:$AA$4,'June Student Info'!$Z5)>0,INDEX('February Student Info'!$A$2:$AA$4,MATCH('June Student Info'!$Z5,'February Student Info'!$AA$2:$AA$4,0),MATCH(G$1,'February Student Info'!$A$2:$AA$4,0)),IF(COUNTIF(JanEntry[Name],'June Student Info'!$Z5)>0,INDEX(JanEntry,MATCH('June Student Info'!$Z5,JanEntry[Name],0),MATCH(G$1,JanEntry[#Headers],0)),""))))))

When I compare the times between entering a name in the January StudentInfo tab , and the June StudentInfo tab , the response time on the former is 27 seconds , while on the latter it is 4 seconds.

I think the formulae such as the one above are contributing to the 4 second recalculation time.

Narayan
 
Hi Brian ,

Another point about the way the formulae have been written ; to retrieve information about a student from an earlier sheet tab , in case that student is not a new one , the headers are being matched in every formula ! Isn't this overkill ?

If every month StudentInfo tab follows the same template , then surely we can expect that the column headers will be labelled the same in all the tabs , and they will also be in the same columnar order ? Thus the Gender column will always be the 7th column in every StudentInfo tab ; why should the text "Gender" be matched with the table header ?

Rather than try and investigate why the response time is high , I think the correct method would be to set right all the things that need to be set right ; when you are done , I am absolutely sure that you will find the response times are where you expect them to be.

Narayan
 
Back
Top