Recently, Bluetaurean asked in the Chandoo.org Forums about ways to allocate work durations for various product lines across 24 hour days to create a daily schedule.
Both formula-based and VBA-based solutions were offered.
Today at Formula Forensics we will take a look at the formula-based approach.
As always at Formula Forensics you can follow along, Download Here – Excel 2007-2013.
Set the Scene
Since one might encounter a similar need in a variety of contexts (manufacturing, engineering, project planning, etc.), we will look at a more general problem of allocating a set of tasks and corresponding durations to one or more days, as shown in the following diagram.
We will create two output views:
- One that is a flat list that can then be manipulated further using Excel’s Pivot table feature, and
- Another view that mimics a pivot-table (and is similar to a typical project Gantt view, but with actual values listed instead of a bar chart).
You can follow along using the attached Excel document. Download here Excel 2007+
Problem Specifics
- We have a list of tasks and their durations.
- We need to distribute the tasks to different days, without exceeding the maximum available duration in a given day.
- When the hours in a day are “used up”, we need to allocate the remaining task duration to the next day, and so on.
- On the other hand, if a given task does not use up all of the hours in a given day, we will need to assign more than one task for that day, provided the combined durations do not exceed the available hours for that day.
- In other words, we will need to split a task across one or more days, or combine one or more tasks into a single day, as needed, to maximize the work performed in a given day.
Developing the Approach
Before we tackle this problem in Excel, let us review how we might do this manually. Like most things, we might use the following three-step process:
- Take the first task and assign its duration to Day 1. If the task’s duration exceeds the maximum hours available in a day, allocate the portion of the duration that does not fit into Day 1 into Day 2.
- Take the second task, and see whether it can fit into an existing day, or whether it needs to be distributed to multiple days.
- Etc. (OK… so that three-step process was a stretch!)
Statistics show that most people think in terms of IF-THEN-ELSE statements. So here it is…
For a given Day, and for a given Task:
If [Hours Not Allocated For that Task] > [Hours Available for that Day] Then
    Set Duration for that Day as [Hours Available for that Day]
Else
    Set Duration for that Day as [Hours Not Allocated for that Task]
End
Continue the above evaluation until all tasks have been allocated to days.
Of course, the above IF() logic can be condensed as follows:
MIN( [Hours Not Allocated For that Task] , [Hours Available for that Day] )
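To see the MIN() rule in action outside of Excel, here is a minimal Python sketch of the manual process described above. The function name allocate and the sample durations (40, 20, 5 and 15 hours, with 24 hours per day) are illustrative assumptions that mirror the sample workbook. This is not the formula-based solution itself, only the same logic written as a loop.

```python
def allocate(durations, max_hours_per_day):
    """Walk the tasks in order and return (day, task_index, hours) tuples.

    Assumes max_hours_per_day is greater than zero.
    """
    schedule = []
    day = 1
    available = max_hours_per_day  # hours still free on the current day
    for task_index, duration in enumerate(durations):
        remaining = duration  # hours of this task not yet allocated
        while remaining > 0:
            # MIN([Hours Not Allocated For that Task], [Hours Available for that Day])
            hours = min(remaining, available)
            schedule.append((day, task_index, hours))
            remaining -= hours
            available -= hours
            if available == 0:  # the day is used up, so move on to the next day
                day += 1
                available = max_hours_per_day
    return schedule


schedule = allocate([40, 20, 5, 15], 24)
for day, task_index, hours in schedule:
    print(f"Day {day}: task #{task_index + 1} gets {hours} hours")
```

Each tuple corresponds to one allocation: a task is split when it does not fit into the current day, and several tasks share a day when there is room.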
Putting it All Together: Output Option 1: Gantt-like View
Let us employ the above approach to create the Gantt-like view.
To make our approach more generic, we will use an Excel Name called “MaxHrsPerDay” to indicate the maximum available hours in a given day. (In the sample worksheet, it has been set to 24 hours.)
Our source data is set up as shown in the diagram below:
- Tasks are in the range A2:A5
- Durations are in the range B2:B5
We will create the output in a separate worksheet, in the range A1:E5 as shown below:
Put the following formula into cell A2 and copy down to A5:
=SourceData!$A2
(This formula is merely referencing the values from the SourceData sheet. The sample workbook also includes an approach to make this reference more location independent.)
Put the following formula in cell B2, and copy it down and right:
=MIN((SourceData!$B2-SUM($A2:A2)), (MaxHrsPerDay-SUM(B$1:B1)))
Set up the header row (B1:E1) as desired. (I have used text values for the header. You could also calculate the header text using formulas. Since that is straightforward, I will leave that as an exercise for the reader.)
Now let us look at what the formula in cell B2 is doing:
- SUM($A2:A2) is calculating the sum of the allocated durations for TaskA. (Please note the use of absolute and relative references. The formula is anchored on column A, but the starting row, ending row and ending column are free to expand.) SUM($A2:A2) returns zero since SUM() ignores text values.
– If you look at cell C2, the reference changes to SUM($A2:B2).
– In cell B3, the reference changes to SUM($A3:A3). You get the idea.
- (SourceData!$B2-SUM($A2:A2)) calculates the difference between the duration for TaskA (40 in the example) and the hours allocated as of that point (0), to return 40-0=40.
- SUM(B$1:B1) is calculating the sum of the allocated hours for Day1. (Again, we are using a combination of absolute and relative references to keep the calculation anchored on column B.) In this case, the value is zero, since this is the first allocation for Day1.
- (MaxHrsPerDay-SUM(B$1:B1)) calculates the hours remaining (i.e. available) for Day1. Since this is for cell B2, the calculation returns 24 – 0 = 24.
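The cell-by-cell evaluation described above can be mimicked with a small Python sketch. The function name gantt_grid and the sample values are assumptions for illustration. Each grid entry is computed the same way as the formula in B2: the minimum of the hours left for the task (the row so far) and the hours left in the day (the column so far).

```python
def gantt_grid(durations, max_hours_per_day, num_days):
    """Build a tasks-by-days grid of allocated hours, one cell at a time."""
    grid = []
    for duration in durations:
        row = []
        for day in range(num_days):
            allocated_to_task = sum(row)                        # like SUM($A2:A2)
            allocated_to_day = sum(prev[day] for prev in grid)  # like SUM(B$1:B1)
            row.append(min(duration - allocated_to_task,
                           max_hours_per_day - allocated_to_day))
        grid.append(row)
    return grid


grid = gantt_grid([40, 20, 5, 15], 24, 4)
for task_row in grid:
    print(task_row)
```

As in the worksheet, the order of evaluation matters: each cell only looks at the cells to its left and above it.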
That is it!
We put those absolute and relative references to good use!
This approach was easy because all we had to do was calculate the duration for a given task for a given day.
On the other hand, if we had to figure out what the Task was, or which Day it was, the calculation gets a little more involved. Since this is “formula forensics”, we would not have it any other way! 🙂
Putting it All Together: Output Option 2: A Sequential List of Tasks and Durations for Each Day (i.e. a Flat List)
As before, we will use the Excel Name “MaxHrsPerDay” to refer to the maximum hours in a Day.
As shown in the following diagram, we will turn the source data into a flat list of Days, Tasks and Durations:
Unlike VBA, a formula cannot choose which row and column to write its output to, so we have to enter the formula in every cell where we suspect there might be a value.
In the above sample diagram, we copy the formulas from row 2 to row 9. However, row 9 shows “…” indicating that the list was completed by row 8.
Let us look at how to determine the value for Day, Task and Allocated Duration.
For ease of description, I have created the following Excel Names:
WorkList: =A2:A5 in the source data.
WorkDuration: =B2:B5 in the source data
While creating the Gantt-like view earlier, we were able to take advantage of the static “Day” and “Task” values to determine the Remaining Duration, Available Duration, etc. Since we now have to determine all three values (Day, Task, Allocated Duration), we will need some “helper” data.
We will add a column alongside the source data that shows the cumulative duration (for reasons that will become clear shortly), as shown in the following diagram:
Cumulative Duration is calculated as the sum of all durations up to a given row.
- For example, in cell C2, the Cumulative Duration is 40.
- In cell C3, the Cumulative Duration is 40+20=60
- And so on.
For ease of referencing, we will use an Excel Name called CumulativeDuration =C2:C5.
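For readers who like to check the arithmetic, the helper column is just a running total. A minimal Python sketch, assuming the sample durations of 40, 20, 5 and 15 hours (the variable names are illustrative):

```python
from itertools import accumulate

work_duration = [40, 20, 5, 15]  # the WorkDuration values (B2:B5)

# Running total: each entry is the sum of all durations up to that row.
cumulative_duration = list(accumulate(work_duration))

print(cumulative_duration)
```

The last entry equals the total of all durations, which is the value the formulas later compare against to decide when the list is complete.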
Let us look at why we need the “CumulativeDuration” helper column:
The circular logic problem
In order to determine the durations already allocated for a given day, we will need to know which Day it is.
We also need to know which Task we are trying to calculate the duration for.
So… do we calculate the Day or the Task or the Duration first?!! As you can imagine, that will soon land us in some circular logic.
Some helpful observations about the output:
- In column C of the output (on worksheet FlatList), the sum of allocated durations adds up to the total duration for all tasks. (No surprise here!)
- If every task had duration equal to the MaxHrsPerDay, you would have the same duration value for all days. (Not surprising, but interesting!)
- In other words, you could think of the Allocated Duration column as the total duration for all tasks, allocated MaxHrsPerDay at a time.
- Now we need a way to iterate through the duration values one at a time and account for the durations already processed. In other words, each value needs to contain all of the previous values. Welcome to an array of the cumulative durations!
- For example, in the cumulative array “{40;60;65;80}”, the value 60 already includes the previous value 40 in it. This allows us to subtract all durations allocated up to a given row, to get the duration value that is remaining to be allocated.
- Since Excel is good with numbers, we will base the calculation for AllocatedDuration and Tasks on the Duration values.
- By calculating the two values separately, we avoid the circular logic.
Let’s now look at the formulas for Day, WorkItem and AllocatedDuration.
It would be easier if we looked at the formulas in reverse order, starting with AllocatedDuration, then WorkItem, and finally Day.
Formula for “AllocatedDuration”
Enter the following formula into cell C2, ending with Ctrl+Shift+Enter, as shown in the following diagram:
=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", MIN(INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) - SUMIFS(C$1:C1, B$1:B1, B2), MaxHrsPerDay-SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)))) Ctrl+Shift+Enter
Let us look at the formula closely (using the formula in row 2):
- SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)) -> This calculates the sum of all allocated durations up to the previous row, where the Day = current row’s day. Please note the use of absolute and relative references. They allow us to expand the range as we go down the rows, while remaining anchored to the first row.
– Since this is the first data row, C$1:C1 returns “Allocated Duration” and the ISNUMBER() function returns FALSE, and consequently, the IF() function returns 0.
– A$1:A1 returns “Day”, and the test A$1:A1=A2 returns FALSE. Please note that in this case, it does not matter whether A2 has a value in it, whether it has the value 1, etc.
– SUMPRODUCT() provides the result of FALSE * 0 = 0
- MaxHrsPerDay - SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)) -> This calculates the difference between maximum duration available for a day and the sum of durations allocated for the current day. In other words, it calculates the available duration for the current row’s day.
– In this example, the calculation results in MaxHrsPerDay (24 in our example) – 0 = 24
- SUMIFS(C$1:C1, B$1:B1,B2) -> This calculates the sum of all allocated durations for the current row’s task. Since B$1:B1 is the text value “Work Item”, the SUMIFS() returns 0. Again, it does not matter if B2 is blank or has a value like “TaskA”, since Excel correctly evaluates the condition whether B$1:B1 equals B2.
- SUM(C$1:C1) -> This calculates the sum of all allocated durations up to the previous row.
- CumulativeDuration - SUM(C$1:C1) -> CumulativeDuration evaluates to {40;60;65;80}. SUM(C$1:C1) evaluates to zero. As such, the expression evaluates to {40;60;65;80} - 0, or {40;60;65;80}.
– If we look at the calculation for this expression in cell C3 (the expression would be “CumulativeDuration-SUM(C$1:C2)”), we would get the result of {40;60;65;80} - (0+24) = {16;36;41;56}. (As you know, subtracting a scalar value from an array results in an array with each value reduced by the scalar value.)
– If we look at the calculation for this expression in cell C4 (the expression would be “CumulativeDuration-SUM(C$1:C3)”), we would get the result of {40;60;65;80} - (0+24+16) = {0;20;25;40}
– As you can see, each successive calculation reduces the CumulativeDuration array by the amount of hours already allocated. By reducing the CumulativeDuration array in this fashion, we ensure that we do not “double count” a duration.
– If a value in the array evaluates to zero, it means the corresponding duration has been fully allocated. (In cell C4, the first value in the array is zero, indicating that the original 40 hours has been fully allocated.) We will put this knowledge to good use in the next expression.
- MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0) -> The expression CumulativeDuration-SUM(C$1:C1) > 0 evaluates to {TRUE;TRUE;TRUE;TRUE} because all values are greater than zero. By performing a MATCH() for TRUE, we are able to find the first location in the array that has a non-zero value.
– If we look at the result of this expression in cell C3, we get {16;36;41;56} > 0 = {TRUE;TRUE;TRUE;TRUE}
– If we look at the result of this expression in cell C4, we get {0;20;25;40} > 0 = {FALSE;TRUE;TRUE;TRUE}
– As you recall, the zero values (or FALSE) correspond to the durations that have been fully allocated, whereas, the non-zero values (or TRUE) correspond to the durations that have NOT been fully allocated.
– It is helpful to note that MATCH() returns the LOCATION of what it finds. As such, the returned location is that of the first duration value that has not been fully allocated! Since the CumulativeDuration array is the same size as the WorkDuration array, we will be able to put this returned location value to good use in the next expression.
- INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) -> By using the location value (of the first duration value that has not been fully allocated), we find the corresponding original duration value from the WorkDuration array.
– As we saw earlier, the expression “CumulativeDuration - SUM(C$1:C1)” reduces the CumulativeDuration by the duration values allocated to that point. However, the resulting array could have partial duration values as well. By referencing the corresponding duration value from the WorkDuration array, we ensure that we retrieve the original (full) duration value that was to be allocated.
- MIN(…) -> This expression calculates the value of MIN([Hours Not Allocated For that Task], [Hours Available for that Day])
– [Hours Not Allocated For that Task] is returned by INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) - SUMIFS(C$1:C1, B$1:B1, B2)
– [Hours Available for that Day] is returned by the second half of the MIN() expression: MaxHrsPerDay-SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)).
– So, we essentially got back to the logic we started from, which is the same logic we used for creating the Gantt-like view as well.
- The remaining portion of the formula (the IF() check) determines if all of the hours have been allocated. If all hours have been allocated, it returns “…”.
– SUMPRODUCT(WorkDuration) -> This expression calculates the total of all work duration values. In cell C2, it evaluates to SUMPRODUCT({40;20;5;15}) = 80
– SUM(C$1:C1)>=SUMPRODUCT(WorkDuration) -> Determines if the sum of durations allocated up to that point is greater than or equal to the total for all durations. (Since this is part of an array formula, you could also use the SUM function in place of SUMPRODUCT. But I am partial to the SUMPRODUCT function!! So, unless you are in a competition where the winner is determined by the shortest formula, feel free to use either one!)
Formula for “WorkItem”
Enter the following formula into cell B2, ending with Ctrl+Shift+Enter, as shown in the following diagram.
=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", INDEX(WorkList, MATCH(TRUE, (CumulativeDuration-SUM(C$1:C1)) > 0, 0))) Ctrl+Shift+Enter
You are already familiar with most of the formula components since you saw them in the formula for AllocatedDuration. The only difference is that in this formula, we are returning a value from WorkList. (i.e. we locate the position of the first non-zero duration in CumulativeDuration array, and since that array is the same size as the WorkList array, we are able to find the first Task that has not been fully allocated.)
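The shared MATCH()/INDEX() lookup can be sketched in Python as follows. The function name lookup and the sample values are assumptions for illustration. The sketch reduces the cumulative array by the hours already allocated, finds the first position that is still positive, and then reads the task name and the full duration from the two parallel lists.

```python
from itertools import accumulate

work_list = ["TaskA", "TaskB", "TaskC", "TaskD"]       # WorkList
work_duration = [40, 20, 5, 15]                        # WorkDuration
cumulative_duration = list(accumulate(work_duration))  # CumulativeDuration


def lookup(allocated_so_far):
    """Return (task, full duration) for the first task not fully allocated."""
    # CumulativeDuration - SUM(C$1:C1): the cumulative array reduced by allocated hours
    reduced = [c - allocated_so_far for c in cumulative_duration]
    # MATCH(TRUE, reduced > 0, 0): first position holding a positive value
    flags = [value > 0 for value in reduced]
    if True not in flags:
        return None  # everything is allocated; the IF() guard returns "…" here
    position = flags.index(True)  # zero-based, unlike MATCH()
    return work_list[position], work_duration[position]


# Hours allocated before rows 2, 3 and 4 of the flat list, then after the last row.
results = [lookup(hours) for hours in (0, 24, 40, 80)]
print(results)
```

Because both lookups use the same position, the task name and its duration always stay in step, which is what lets the two formulas be calculated separately.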
Formula for “Day”
Enter the following formula into cell A2, ending with Ctrl+Shift+Enter, as shown in the following diagram:
=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", MAX( N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay), 1)) Ctrl+Shift+Enter
Let us look at the formula in detail (using the formula in row 2):
- SUMIFS(C$1:C1, A$1:A1, A1) -> This expression calculates the sum of all durations (in column C) where the Days (in column A) equal the previous day.
– In cell A2, this expression evaluates to “SUMIFS(“Allocated Duration”, “Day”, “Day”)” = 0. (Excel smartly ignores any non-numeric values in the first argument.)
– In cell A3, this expression evaluates to “SUMIFS({“Allocated Duration”;24}, {“Day”;1}, 1)” = 24.
- SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay -> This expression checks if the sum of all durations where the Days equal the previous day is greater than or equal to MaxHrsPerDay.
– In cell A2, this expression evaluates to FALSE
– In cell A3, this expression evaluates to TRUE
- N(A1) -> This expression returns the numeric value for its argument. Since N() returns zero for any non-numeric arguments, we use this function to return zero for the heading (“Day”) in A1. (Any numeric values are returned as is.)
- MAX( N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay), 1) -> The first argument of the MAX function “N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay)” returns the next increment for day, if the previous day has been fully allocated. Otherwise, it returns the same value as the previous day.
– In cell A2, this expression evaluates to MAX( N(“Day”) + (SUMIFS(“Allocated Duration”, “Day”, “Day”)>=24), 1), which evaluates to MAX( N(“Day”) + (0>=24), 1), which evaluates to MAX( 0 + (FALSE), 1), which finally evaluates to 1.
– In cell A3, this expression evaluates to MAX( N(1) + (SUMIFS({“Allocated Duration”;24}, {“Day”;1}, 1)>=24), 1), which evaluates to MAX( N(1) + (24>=24), 1), which evaluates to MAX( 1+ (TRUE), 1), which finally evaluates to 2 since 1 + TRUE = 2.
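To see how the three formulas work together row by row, here is a hedged Python sketch that evaluates Day, WorkItem and AllocatedDuration in the same order as the worksheet. The function name flat_list and the sample values are assumptions for illustration, and the sketch assumes task names are unique (as SUMIFS on the task name does too).

```python
from itertools import accumulate

DONE = "…"  # the marker the formulas return once everything is allocated


def flat_list(work_list, work_duration, max_hours_per_day, num_rows):
    """Mimic the three flat-list formulas, one output row at a time."""
    cumulative = list(accumulate(work_duration))
    total = sum(work_duration)
    rows = []  # each row is (day, task, hours) or (DONE, DONE, DONE)
    for _row in range(num_rows):
        filled = [r for r in rows if r[0] != DONE]
        allocated = sum(h for _d, _t, h in filled)  # SUM(C$1:C1)
        if allocated >= total:                      # the IF() guard
            rows.append((DONE, DONE, DONE))
            continue
        # Day: MAX(N(A1) + (SUMIFS(...) >= MaxHrsPerDay), 1)
        prev_day = filled[-1][0] if filled else 0
        prev_day_hours = sum(h for d, _t, h in filled if d == prev_day)
        day = max(prev_day + (prev_day_hours >= max_hours_per_day), 1)
        # WorkItem: first position where cumulative - allocated > 0
        pos = next(i for i, c in enumerate(cumulative) if c - allocated > 0)
        task = work_list[pos]
        # AllocatedDuration: MIN(hours left for the task, hours left in the day)
        task_left = work_duration[pos] - sum(h for _d, t, h in filled if t == task)
        day_left = max_hours_per_day - sum(h for d, _t, h in filled if d == day)
        rows.append((day, task, min(task_left, day_left)))
    return rows


rows = flat_list(["TaskA", "TaskB", "TaskC", "TaskD"], [40, 20, 5, 15], 24, 8)
for row in rows:
    print(row)
```

With eight output rows (matching rows 2 to 9 of the worksheet), the last row comes back as the "…" marker, showing that the list was completed one row earlier.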
Download
You can download a copy of the above file and follow along, Download Here – Excel 2007-2013.
Final Thoughts
While we used the same basic logic for both output options in this article, there are probably many other ways to tackle the age-old problem of production scheduling.
I would love to hear about some of your ideas, as well as ways to extend the concepts described here.
In the meantime, I wish you continued EXCELlence!
Sajan.
Other Chandoo.org Posts related to Scheduling
Here at Chandoo.org you can find the following related posts:
http://www.chandoo.org/wp/2010/11/18/scheduling-variable-sources/
http://chandoo.org/wp/2009/06/16/gantt-charts-project-management/
http://chandoo.org/wp/project-management-templates/gantt-charts/
Thank You
This was Sajan’s second post at Chandoo.org and so a special thank you to Sajan for putting pen to paper to describe the technique here.
You may want to read Sajan’s first post here or thank him in the comments below:
Formula Forensics “The Series”
This is the 31st post in the Formula Forensics series.
You can learn more about how to pull Excel Formulas apart in the following posts: Formula Forensic Series
Formula Forensics Needs Your Help
I need more ideas for future Formula Forensics posts and so I need your help.
If you have a neat formula that you would like to share like above, try putting pen to paper and draft up a Post like Sajan has done above or;
If you have a formula that you would like explained, but don’t want to write a post, send it to Hui or Chandoo.