Excel Howtos, Formula Forensics, Posts by Sajan

Formula Forensics No. 031 – Production Scheduling using Excel

Last updated on October 12, 2012

Hui...

Recently, Bluetaurean asked in the Chandoo.org Forums about ways to allocate work durations for various product lines across 24 hour days to create a daily schedule.

Both formula-based and VBA-based solutions were offered.

Today at Formula Forensics, we will take a look at the formula-based approach.

As always at Formula Forensics, you can follow along with the sample workbook: Download Here – Excel 2007-2013.

Set the Scene

Since one might encounter a similar need in a variety of contexts (manufacturing, engineering, project planning, etc.), we will look at a more general problem of allocating a set of tasks and corresponding durations to one or more days, as shown in the following diagram.

We will create two output views:

One that is a flat list that can then be manipulated further using Excel’s Pivot table feature, and
Another view that mimics a pivot-table (and is similar to a typical project Gantt view, but with actual values listed instead of a bar chart).

You can follow along using the attached Excel document. Download here Excel 2007+

Problem Specifics

We have a list of tasks and their durations.
We need to distribute the tasks to different days, without exceeding the maximum available duration in a given day.
When the hours in a day are “used up”, we need to allocate the remaining task duration to the next day, and so on.
On the other hand, if a given task does not use up all of the hours in a given day, we will need to assign more than one task for that day, provided the combined durations do not exceed the available hours for that day.
In other words, we will need to split a task across one or more days, or combine one or more tasks into a single day, as needed, to maximize the work performed in a given day.

Developing the Approach

Before we tackle this problem in Excel, let us review how we might do this manually. Like most things, we might use the following three-step process:

Take the first task and assign its duration to Day 1. If the task’s duration exceeds the maximum hours available in a day, allocate the portion of the duration that does not fit into Day 1 into Day 2.
Take the second task, and see whether it can fit into an existing day, or whether it needs to be distributed to multiple days.
Etc. (OK… so that three-step process was a stretch!)

Statistics show that most people think in terms of IF-THEN-ELSE statements. So here it is…

For a given Day, and for a given Task:

If [Hours Not Allocated for that Task] > [Hours Available for that Day] Then
    Set Duration for that Day as [Hours Available for that Day]
Else
    Set Duration for that Day as [Hours Not Allocated for that Task]
End If

Continue the above evaluation until all tasks have been allocated to days.

Of course, the above IF() logic can be condensed as follows:

MIN( [Hours Not Allocated For that Task] , [Hours Available for that Day] )
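As a rough illustration outside Excel, the IF logic and its MIN() shortcut can be compared in a few lines of Python. The function names `allocate_if` and `allocate_min` are made up for this sketch and are not part of the workbook.

```python
def allocate_if(hours_not_allocated, hours_available):
    """Spell out the IF-THEN-ELSE rule for one task on one day."""
    if hours_not_allocated > hours_available:
        return hours_available
    else:
        return hours_not_allocated


def allocate_min(hours_not_allocated, hours_available):
    """The condensed form: take the smaller of the two values."""
    return min(hours_not_allocated, hours_available)


# Compare both forms for a few (remaining, available) pairs.
samples = [(40, 24), (16, 24), (5, 5), (0, 24)]
results = [(allocate_if(r, a), allocate_min(r, a)) for r, a in samples]
print(results)
```

Both functions should agree for every pair, which is why the shorter MIN() form can stand in for the IF statement.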

Putting it All Together: Output Option 1: Gantt-like View

Let us employ the above approach to create the Gantt-like view.

To make our approach more generic, we will use an Excel Name called “MaxHrsPerDay” to indicate the maximum available hours in a given day. (In the sample worksheet, it has been set to 24 hours.)

Our source data is set up as shown in the diagram below:

Tasks are in the range A2:A5
Durations are in the range B2:B5

We will create the output in a separate worksheet, in the range A1:E5 as shown below:

Put the following formula into cell A2 and copy down to A5:

=SourceData!$A2

(This formula is merely referencing the values from the SourceData sheet. The sample workbook also includes an approach to make this reference more location independent.)

Put the following formula in cell B2, and copy it down and right:

=MIN((SourceData!$B2-SUM($A2:A2)), (MaxHrsPerDay-SUM(B$1:B1)))

Set up the header row (B1:E1) as desired. (I have used text values for the header. You could also calculate the header text using formulas. Since that is straightforward, I will leave that as an exercise for the reader.)

Now let us look at what the formula in cell B2 is doing:

SUM($A2:A2) is calculating the sum of the allocated durations for TaskA. (Please note the use of absolute and relative references. The formula is anchored on column A, but the starting row, ending row and ending column are free to expand.) SUM($A2:A2) returns zero since SUM() ignores text values.

– If you look at cell C2, the reference changes to SUM($A2:B2).
– In cell B3, the reference changes to SUM($A3:A3). You get the idea.

(SourceData!$B2-SUM($A2:A2)) calculates the difference between the duration for TaskA (40 in the example) and the hours allocated as of that point (0), to return 40-0=40.
SUM(B$1:B1) is calculating the sum of the allocated hours for Day1. (Again, we are using a combination of absolute and relative references, this time to keep the calculation anchored on row 1 while the column and the ending row remain free to move.) In this case, the value is zero, since this is the first allocation for Day1.
(MaxHrsPerDay-SUM(B$1:B1)) calculates the hours remaining (i.e. available) for Day1. Since this is for cell B2, the calculation returns 24 – 0 = 24.
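To see the whole grid fill in, here is a small Python sketch that mimics the B2 formula cell by cell. It assumes the sample durations of 40, 20, 5 and 15 hours and four day columns; the function name `gantt_grid` is invented for this illustration.

```python
MAX_HRS_PER_DAY = 24          # stands in for the Excel Name MaxHrsPerDay
durations = [40, 20, 5, 15]   # sample durations from the SourceData sheet
num_days = 4                  # output columns B:E


def gantt_grid(durations, max_hrs, num_days):
    """Fill a task-by-day grid using the same MIN() rule as the worksheet."""
    grid = [[0] * num_days for _ in durations]
    for r, duration in enumerate(durations):
        for d in range(num_days):
            # Like SourceData!$B2-SUM($A2:A2): hours of this task not yet placed.
            remaining_for_task = duration - sum(grid[r][:d])
            # Like MaxHrsPerDay-SUM(B$1:B1): hours still free in this day.
            available_in_day = max_hrs - sum(grid[i][d] for i in range(r))
            grid[r][d] = min(remaining_for_task, available_in_day)
    return grid


grid = gantt_grid(durations, MAX_HRS_PER_DAY, num_days)
for row in grid:
    print(row)
```

Each cell only looks at the cells to its left (same task) and above it (same day), which is what the mixed absolute and relative references achieve in the worksheet.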

That is it!

We put those absolute and relative references to good use!

This approach was easy because all we had to do was calculate the duration for a given task for a given day.

On the other hand, if we had to figure out what the Task was, or which Day it was, the calculation gets a little more involved. Since this is “formula forensics”, we would not have it any other way! 🙂

Putting it All Together: Output Option 2: A Sequential List of Tasks and Durations for Each Day (i.e. a Flat List)

As before, we will use the Excel Name “MaxHrsPerDay” to refer to the maximum hours in a Day.

As shown in the following diagram, we will turn the source data into a flat list of Days, Tasks and Durations:

Unlike VBA, a formula cannot choose which row and column to write its output to, so we have to enter the formula in every cell where we suspect there might be a value.

In the above sample diagram, we copy the formulas from row 2 to row 9. However, row 9 shows “…” indicating that the list was completed by row 8.

Let us look at how to determine the value for Day, Task and Allocated Duration.

For ease of description, I have created the following Excel Names:

WorkList: =A2:A5 in the source data.

WorkDuration: =B2:B5 in the source data

While creating the Gantt-like view earlier, we were able to take advantage of the static “Day” and “Task” values to determine the Remaining Duration, Available Duration, etc. Since we now have to determine all three values (Day, Task, Allocated Duration), we will need some “helper” data.

We will add a column alongside the source data that shows the cumulative duration (for reasons that will become clear shortly), as shown in the following diagram:

Cumulative Duration is calculated as the sum of all durations up to a given row.

For example, in cell C2, the Cumulative Duration is 40.
In cell C3, the Cumulative Duration is 40+20=60
And so on.

For ease of referencing, we will use an Excel Name called CumulativeDuration =C2:C5.
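For readers who like to check the arithmetic elsewhere, the helper column is simply a running total. A minimal Python sketch, assuming the sample durations 40, 20, 5 and 15:

```python
from itertools import accumulate

work_duration = [40, 20, 5, 15]                       # WorkDuration (B2:B5)
cumulative_duration = list(accumulate(work_duration))  # CumulativeDuration (C2:C5)
print(cumulative_duration)
```

Each entry contains all of the durations before it, so the last entry equals the total of all durations.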

Let us look at why we need the “CumulativeDuration” helper column:

The circular logic problem

In order to determine the durations already allocated for a given day, we will need to know which Day it is.

We also need to know which Task we are trying to calculate the duration for.

So… do we calculate the Day or the Task or the Duration first?!! As you can imagine, that will soon land us in some circular logic.

Some helpful observations about the output:

In column C of the output (on worksheet FlatList), the sum of allocated durations adds up to the total duration for all tasks. (No surprise here!)
If every task had duration equal to the MaxHrsPerDay, you would have the same duration value for all days. (Not surprising, but interesting!)
In other words, you could think of the Allocated Duration column as the total duration for all tasks, allocated MaxHrsPerDay at a time.
Now we need a way to iterate through the duration values one at a time and account for the durations already processed. In other words, each value needs to contain all of the previous values. Welcome to an array of the cumulative durations!
For example, in the cumulative array “{40;60;65;80}”, the value 60 already includes the previous value 40 in it. This allows us to subtract all durations allocated up to a given row, to get the duration value that is remaining to be allocated.
Since Excel is good with numbers, we will base the calculation for AllocatedDuration and Tasks on the Duration values.
By calculating the two values separately, we avoid the circular logic.

Let’s now look at the formulas for Day, WorkItem and AllocatedDuration.

It would be easier if we looked at the formulas in reverse order, starting with AllocatedDuration, then WorkItem, and finally Day.

Formula for “AllocatedDuration”

Enter the following formula into cell C2, ending with Ctrl+Shift+Enter, as shown in the following diagram:

=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", MIN(INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) - SUMIFS(C$1:C1, B$1:B1, B2), MaxHrsPerDay-SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0))))
(Confirm with Ctrl+Shift+Enter.)

Let us look at the formula closely (using the formula in row 2):

SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)) -> This calculates the sum of all allocated durations up to the previous row, where the Day = current row’s day. Please note the use of absolute and relative references. They allow us to expand the range as we go down the rows, while remaining anchored to the first row.

– Since this is the first data row, C$1:C1 returns “Allocated Duration” and the ISNUMBER() function returns FALSE, and consequently, the IF() function returns 0.
– A$1:A1 returns “Day”, and the test A$1:A1=A2 returns FALSE. Please note that in this case, it does not matter whether A2 has a value in it, whether it has the value 1, etc.
– SUMPRODUCT() provides the result of FALSE * 0 = 0

MaxHrsPerDay - SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)) -> This calculates the difference between maximum duration available for a day and the sum of durations allocated for the current day. In other words, it calculates the available duration for the current row’s day.

– In this example, the calculation results in MaxHrsPerDay (24 in our example) – 0 = 24

SUMIFS(C$1:C1, B$1:B1,B2) -> This calculates the sum of all allocated durations for the current row’s task. Since B$1:B1 is the text value “Work Item”, the SUMIFS() returns 0. Again, it does not matter if B2 is blank or has a value like “TaskA”, since Excel correctly evaluates the condition whether B$1:B1 equals B2.
SUM(C$1:C1) -> This calculates the sum of all allocated durations up to the previous row.
CumulativeDuration-SUM(C$1:C1) -> CumulativeDuration evaluates to {40;60;65;80}. SUM(C$1:C1) evaluates to zero. As such, the expression evaluates to {40;60;65;80} - 0, or {40;60;65;80}.

– If we look at the calculation for this expression in cell C3 (the expression would be “CumulativeDuration-SUM(C$1:C2)”), we would get the result of {40;60;65;80} - (0+24) = {16;36;41;56}. (As you know, subtracting a scalar value from an array results in an array with each value reduced by the scalar value.)

– If we look at the calculation for this expression in cell C4 (the expression would be “CumulativeDuration-SUM(C$1:C3)”), we would get the result of {40;60;65;80} - (0+24+16) = {0;20;25;40}

– As you can see, each successive calculation reduces the CumulativeDuration array by the amount of hours already allocated. By reducing the CumulativeDuration array in this fashion, we ensure that we do not “double count” a duration.

– If a value in the array evaluates to zero, it means the corresponding duration has been fully allocated. (In cell C4, the first value in the array is zero, indicating that the original 40 hours has been fully allocated.) We will put this knowledge to good use in the next expression.
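The array subtraction described above can be reproduced with a short Python sketch. The helper name `reduce_by_allocated` is hypothetical; the numbers are the ones used in the walkthrough for cells C2, C3 and C4.

```python
cumulative_duration = [40, 60, 65, 80]


def reduce_by_allocated(cumulative, allocated_so_far):
    """Subtract a single number from every element, as Excel does for array - scalar."""
    return [value - allocated_so_far for value in cumulative]


in_c2 = reduce_by_allocated(cumulative_duration, 0)        # nothing allocated yet
in_c3 = reduce_by_allocated(cumulative_duration, 24)       # 24 hours allocated above
in_c4 = reduce_by_allocated(cumulative_duration, 24 + 16)  # 40 hours allocated above
print(in_c2, in_c3, in_c4)
```

A zero in the result marks a task whose hours have all been placed.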

MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0) -> The expression CumulativeDuration-SUM(C$1:C1) > 0 evaluates to {TRUE;TRUE;TRUE;TRUE} because all values are greater than zero. By performing a MATCH() for TRUE, we are able to find the first location in the array that has a non-zero value.

– If we look at the result of this expression in cell C3, we get {16;36;41;56} > 0 = {TRUE;TRUE;TRUE;TRUE}

– If we look at the result of this expression in cell C4, we get {0;20;25;40} > 0 = {FALSE;TRUE;TRUE;TRUE}

– As you recall, the zero values (or FALSE) correspond to the durations that have been fully allocated, whereas, the non-zero values (or TRUE) correspond to the durations that have NOT been fully allocated.

– It is helpful to note that MATCH() returns the LOCATION of what it finds. As such, the returned location is that of the first duration value that has not been fully allocated! Since the CumulativeDuration array is the same size as the WorkDuration array, we will be able to put this returned location value to good use in the next expression.
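The lookup step can be sketched in Python as follows. `match_true` is a hypothetical stand-in for MATCH(TRUE, …, 0), and the task names other than TaskA are assumed for the sake of the example.

```python
def match_true(flags):
    """Return the 1-based position of the first True, like MATCH(TRUE, flags, 0)."""
    for position, flag in enumerate(flags, start=1):
        if flag:
            return position
    raise ValueError("no TRUE value found")  # Excel would return #N/A here


work_list = ["TaskA", "TaskB", "TaskC", "TaskD"]  # assumed task names
work_duration = [40, 20, 5, 15]

reduced_in_c4 = [0, 20, 25, 40]                # the reduced array seen in cell C4
flags = [value > 0 for value in reduced_in_c4]
position = match_true(flags)

print(flags, position, work_list[position - 1], work_duration[position - 1])
```

Because the lists are the same length, the position found in the reduced array can be used to look up either the task name or its original duration.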

INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) -> By using the location value (of the first duration value that has not been fully allocated), we find the corresponding original duration value from the WorkDuration array.

– As we saw earlier, the expression “CumulativeDuration-SUM(C$1:C1)” reduces the CumulativeDuration by the duration values allocated to that point. However, the resulting array could have partial duration values as well. By referencing the corresponding duration value from the WorkDuration array, we ensure that we retrieve the original (full) duration value that was to be allocated.

MIN(…) -> This expression calculates the value of MIN([Hours Not Allocated For that Task], [Hours Available for that Day])

– [Hours Not Allocated For that Task] is returned by INDEX(WorkDuration, MATCH(TRUE, CumulativeDuration-SUM(C$1:C1) > 0, 0)) - SUMIFS(C$1:C1, B$1:B1, B2)

– [Hours Available for that Day] is returned by the second half of the MIN() expression: MaxHrsPerDay-SUMPRODUCT((A$1:A1=A2)* IF(ISNUMBER(C$1:C1), C$1:C1, 0)).

– So, we essentially got back to the logic we started from, which is the same logic we used for creating the Gantt-like view as well.

The remaining portion of the formula (the IF() check) determines if all of the hours have been allocated. If all hours have been allocated, it returns “…”.

– SUMPRODUCT(WorkDuration) -> This expression calculates the total of all work duration values. In cell C2, it evaluates to SUMPRODUCT({40;20;5;15}) = 80

– SUM(C$1:C1)>=SUMPRODUCT(WorkDuration) -> Determines if the sum of durations allocated up to that point is greater than or equal to the total for all durations. (Since this is part of an array formula, you could also use the SUM function in place of SUMPRODUCT. But I am partial to the SUMPRODUCT function! So, unless you are in a competition where the winner is determined by the shortest formula, feel free to use either one!)

Formula for “WorkItem”

Enter the following formula into cell B2, ending with Ctrl+Shift+Enter, as shown in the following diagram.

=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", INDEX(WorkList, MATCH(TRUE, (CumulativeDuration-SUM(C$1:C1)) > 0, 0)))
(Confirm with Ctrl+Shift+Enter.)

You are already familiar with most of the formula components since you saw them in the formula for AllocatedDuration. The only difference is that in this formula, we are returning a value from WorkList. (i.e. we locate the position of the first non-zero duration in CumulativeDuration array, and since that array is the same size as the WorkList array, we are able to find the first Task that has not been fully allocated.)

Formula for “Day”

Enter the following formula into cell A2, ending with Ctrl+Shift+Enter, as shown in the following diagram:

=IF(SUM(C$1:C1)>=SUMPRODUCT(WorkDuration), "…", MAX( N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay), 1))
(Confirm with Ctrl+Shift+Enter.)

Let us look at the formula in detail (using the formula in row 2):

SUMIFS(C$1:C1, A$1:A1, A1) -> This expression calculates the sum of all durations (in column C) where the Days (in column A) equal the previous day.

– In cell A2, this expression evaluates to “SUMIFS(“Allocated Duration”, “Day”, “Day”)” = 0. (Excel smartly ignores any non-numeric values in the first argument.)

– In cell A3, this expression evaluates to “SUMIFS({“Allocated Duration”;24}, {“Day”;1}, 1)” = 24.

SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay -> This expression checks if the sum of all durations where the Days equal the previous day is greater than or equal to MaxHrsPerDay.

– In cell A2, this expression evaluates to FALSE

– In cell A3, this expression evaluates to TRUE

N(A1) -> This expression returns the numeric value for its argument. Since N() returns zero for any non-numeric arguments, we use this function to return zero for the heading (“Day”) in A1. (Any numeric values are returned as is.)
MAX( N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay), 1) -> The first argument of the MAX function, “N(A1) + (SUMIFS(C$1:C1, A$1:A1, A1)>=MaxHrsPerDay)”, returns the next increment for day if the previous day has been fully allocated. Otherwise, it returns the same value as the previous day. The second argument, 1, ensures that the first data row starts at Day 1.

– In cell A2, this expression evaluates to MAX( N(“Day”) + (SUMIFS(“Allocated Duration”, “Day”, “Day”)>=24), 1), which evaluates to MAX( N(“Day”) + (0>=24), 1), which evaluates to MAX( 0 + (FALSE), 1), which finally evaluates to 1.

– In cell A3, this expression evaluates to MAX( N(1) + (SUMIFS({“Allocated Duration”;24}, {“Day”;1}, 1)>=24), 1), which evaluates to MAX( N(1) + (24>=24), 1), which evaluates to MAX( 1 + (TRUE), 1), which finally evaluates to 2 since 1 + TRUE = 2.
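To check how the three formulas work together, the following Python sketch builds the flat list row by row. It is only an approximation of the worksheet logic under stated assumptions: the task names after TaskA are assumed, `next_row` is a made-up helper, and the `rows` list plays the role of "everything above the current row".

```python
from itertools import accumulate

MAX_HRS_PER_DAY = 24
work_list = ["TaskA", "TaskB", "TaskC", "TaskD"]   # assumed task names
work_duration = [40, 20, 5, 15]
cumulative_duration = list(accumulate(work_duration))


def next_row(rows):
    """Return (day, task, hours) for the next output row, or None when done."""
    allocated_total = sum(hours for _, _, hours in rows)     # SUM(C$1:C1)
    if allocated_total >= sum(work_duration):                 # the IF() check
        return None                                           # the sheet shows "…"

    # Day: same as the previous row, plus 1 if that day is already full.
    prev_day = rows[-1][0] if rows else 0                     # N(A1)
    prev_day_hours = sum(h for d, _, h in rows if d == prev_day)
    day = max(prev_day + (prev_day_hours >= MAX_HRS_PER_DAY), 1)

    # Task: first position whose reduced cumulative duration is still positive.
    index = next(i for i, c in enumerate(cumulative_duration)
                 if c - allocated_total > 0)
    task = work_list[index]

    # Allocated duration: MIN(hours left for the task, hours left in the day).
    task_hours = sum(h for _, t, h in rows if t == task)
    day_hours = sum(h for d, _, h in rows if d == day)
    hours = min(work_duration[index] - task_hours, MAX_HRS_PER_DAY - day_hours)
    return (day, task, hours)


flat_list = []
row = next_row(flat_list)
while row is not None:
    flat_list.append(row)
    row = next_row(flat_list)

for row in flat_list:
    print(row)
```

Note that each value depends only on rows already produced, which mirrors how the worksheet formulas avoid the circular logic discussed earlier.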

Download

You can download a copy of the above file and follow along, Download Here – Excel 2007-2013.

Final Thoughts

While we used the same basic logic for both output options in this article, there are probably many other ways to tackle the age-old problem of production scheduling.

I would love to hear about some of your ideas, as well as ways to extend the concepts described here.

In the meantime, I wish you continued EXCELlence!

Sajan.

Other Chandoo.org Posts related to Scheduling

Here at Chandoo.org you can find the following related posts:

http://www.chandoo.org/wp/2010/11/18/scheduling-variable-sources/

http://chandoo.org/wp/2009/06/16/gantt-charts-project-management/

http://chandoo.org/wp/project-management-templates/gantt-charts/

Thank You

This was Sajan’s second post at Chandoo.org and so a special thank you to Sajan for putting pen to paper to describe the technique here.

You may want to read Sajan’s first post here or thank him in the comments below:

Formula Forensics “The Series”

This is the 31st post in the Formula Forensics series.

You can learn more about how to pull Excel Formulas apart in the following posts: Formula Forensic Series

Formula Forensics Needs Your Help

I need more ideas for future Formula Forensics posts and so I need your help.

If you have a neat formula that you would like to share, like the one above, try putting pen to paper and draft a post as Sajan has done; or

If you have a formula that you would like explained, but don’t want to write a post, send it to Hui or Chandoo.
