Chandoo: Did somebody just chart?
Jeff: Yes. Yes I did. More on that later. But first, let’s take a sniff of Mike Alexander’s outliers, shall we?
Over at the Bacon Bits blog, Mike has an interesting post on using something called the Tukey Method to identify outliers in a data set. That article is worth reading for John Walkenbach’s comment alone.
Here’s Mike’s sample dataset, with the data points identified as outliers highlighted in orange:
The Tukey method that Mike blogs about constructs a fence around “reasonable” readings, and that fence is described mathematically by an arbitrary numerical factor:
Lower fence = (Quartile 1) – (Arbitrary_Factor × IQR)
Upper fence = (Quartile 3) + (Arbitrary_Factor × IQR)
Here IQR is the interquartile range: (Quartile 3) – (Quartile 1).
Typically a factor of 1.5 is used. Check out Mike’s blog for a detailed explanation of this stuff.
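If you'd like to poke at the fence calculation outside Excel, here's a minimal Python sketch. It is not Mike's workbook: the function names and the readings are made up for illustration, and I'm assuming the "inclusive" quartile method from Python's statistics module, which may not match the quartile convention your own tool uses.

```python
from statistics import quantiles


def tukey_fences(values, factor=1.5):
    """Return (lower_fence, upper_fence) around the 'reasonable' readings."""
    # Assumption: quartiles use the 'inclusive' interpolation method.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


def tukey_outliers(values, factor=1.5):
    """Return the readings that fall outside the fences."""
    lower, upper = tukey_fences(values, factor)
    return [v for v in values if v < lower or v > upper]


# Made-up readings, for illustration only.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 16]

print(tukey_fences(readings))               # fences with the usual factor
print(tukey_outliers(readings))             # flagged with factor 1.5
print(tukey_outliers(readings, factor=3))   # flagged with a wider fence
```

Try a couple of different factors on the same readings and watch which points get flagged: the cut-off moves with whatever factor you pick.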
That’s all good, but it also produces a fairly arbitrary cut-off, depending on what factor you use. So rather than using an algorithm to determine outliers, my preference is to sort the data from lowest to highest value, then plot it and look at the resulting shape:

—Edit— Jon says in the comments:
Your line chart would be easier to read if you’d used markers. I use markers to indicate where the data actually IS, and help show that the line only ties the data together and doesn’t indicate more data, until the points are nearly touching.
Trust Jon to chart in my face. But he’s right. So here it is:

[Aside: That chart’s done in Excel 2013. What’s weird is that those markers aren’t centered on the line, but seem to sit just above it by a point or two. Whoops, Microsoft.]
And here it is with data labels, so it’s easier to see the actual values:

Some may say that the data labels are redundant, because you can gauge the values from the axis. My mature response to that is “Ffffffrrrrrt”. I like the data labels…once I’ve used the line to quickly judge what may be outliers, the labels let me confirm the jump in values without having to move my head back and forth like I’m watching Roger Federer play Andy Murray at Wimbledon.
In fact, maybe I can combine the marker with the labels, and get rid of that axis altogether:

Hey, that looks cool. Anyone going to get Tufte on me?
—Edit over—
This is akin to making a bunch of actors line up in order of shortest to tallest, and saying:
Okay…Elijah, Dominic, Billy, and Sean…you’re shortest. And by golly, you four look a lot shorter than the others. You guys can be the Hobbits.

[Aside: I recreated the below graph from one on a site called SFScope. Check out the outliers at both ends, and click on the picture to visit the original.]

I like this graphical approach. I think it takes less effort to visually identify outliers than to programmatically identify them. For instance, let’s look at Mike’s sample data again for a moment:

Looking at this data, I visually identify pretty much the same outliers as Tukey would – points 1, 2, 3, 19, and 20. In addition, that 4th data point – with a value of 13 – looks like it has outlier stamped all over it too, when you see it in the context of the other data.
Another benefit of plotting ranked data is that it lets you ask questions about interesting trends within the datapoints that clearly are not outliers. For instance, what caused the sudden ‘acceleration’ in the trend between datapoints 16 and 17? Understanding drastic changes within non-outlier points might be worth as much money to a business as understanding the outliers themselves.
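For anyone who wants to try the sort-then-eyeball idea without firing up a chart, here's a rough Python sketch that ranks the readings, shows the jump from the previous value, and draws a crude text bar. The function name and the readings are hypothetical (this is not Mike's dataset), and the bars assume positive values.

```python
def ranked_rows(values, width=40):
    """Return (rank, value, jump, bar) rows for values sorted low to high."""
    ordered = sorted(values)
    if not ordered:
        return []
    top = max(ordered)
    rows = []
    previous = None
    for rank, value in enumerate(ordered, start=1):
        # Jump from the previous ranked value; zero for the first point.
        jump = 0 if previous is None else value - previous
        # Crude bar, scaled so the largest value spans the full width.
        bar = "#" * round(width * value / top) if top > 0 else ""
        rows.append((rank, value, jump, bar))
        previous = value
    return rows


# Made-up readings, for illustration only.
readings = [52, 48, 3, 50, 47, 95, 49, 51, 5, 53]

for rank, value, jump, bar in ranked_rows(readings):
    print(f"{rank:>2} {value:>4} (+{jump:>2}) {bar}")
```

The jump column is only there to back up what your eye picks out from the shape: a big step between neighbouring ranks is worth a second look, whether or not you end up calling it an outlier.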
Lose the horizontal axis?
Sometimes with larger datasets, that horizontal axis can be distracting, because Excel only has enough space along that axis to display the labels for every nth rank.
For instance, take the below graph, which looks at just how much money an organization receives from each of its customers by way of annual membership subscription:

See what I mean? You find yourself trying to decipher the trend in the data labels, and this really draws your eye away from the incredible trend shown in the graph above.
With the horizontal axis gone, that’s much less distracting. Wow: many of our customers hardly subscribe to anything, and a few practically keep this place afloat!
What else can we show on a graph like this?
Sorting your data like this also lends itself to visually segmenting your customers by how much they contribute to your total revenue.
For instance, the below graph shows just how many customers it takes to account for each subsequent 25% of revenue, and what the average annual subscription within each group is. This gives you a real appreciation of just how valuable your larger customers are in comparison to smaller customers:

Wow, half our subscription revenue comes from our Key Accounts and Large Customers groups, who make up just 10% of our subscription base. Let’s be especially nice to those customers. And lots of our effort is spent in servicing small clients that don’t buy much. Can we grow their business? Should we sack some of them as customers, so we can spend that effort finding bigger ones?
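To make the bucketing concrete, here's a minimal Python sketch of one way to do it. This is not the logic inside my template: the function name, the breakpoints parameter and the subscription figures are all made up for illustration. It assumes customers are ranked largest first, and that the customer who tips the running total over a breakpoint stays in the earlier bucket.

```python
def segment_by_revenue(revenues, breakpoints=(0.25, 0.50, 0.75, 1.00)):
    """Group customers, largest first, into cumulative-revenue buckets.

    Returns one dict per bucket: customer count, share of all customers,
    and average revenue within the bucket.
    """
    ordered = sorted(revenues, reverse=True)
    total = sum(ordered)
    groups = [[] for _ in breakpoints]
    running = 0
    index = 0
    for revenue in ordered:
        groups[index].append(revenue)
        running += revenue
        # Move on once the running total reaches this bucket's cap.
        while index < len(breakpoints) - 1 and running >= breakpoints[index] * total:
            index += 1
    customer_count = len(ordered) or 1  # avoid dividing by zero on empty input
    return [
        {
            "customers": len(group),
            "share_of_customers": len(group) / customer_count,
            "average": sum(group) / len(group) if group else 0.0,
        }
        for group in groups
    ]


# Made-up annual subscriptions, for illustration only.
subscriptions = [250, 125, 125] + [50] * 5 + [25] * 10

for bucket in segment_by_revenue(subscriptions):
    print(bucket)
```

Passing different breakpoints, say `(0.5, 1.0)`, gives you fewer and bigger buckets without touching the rest of the code.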
Using revenue ‘buckets’ of 25% was a fairly arbitrary choice. What if we designed a chart template that let you dynamically choose different sized revenue buckets, as well as let you use more buckets if you wanted to?
For instance, looking at the above graph, it looks to me that we have a whole bunch of ‘Tiny Customers’. And we also might want to segment that group of Median customers that all have exactly the same sized subscription into a group of their own.
Well, the chart template I’ve put together for this post lets you do just that:

Wow. Jeff charted again. Man, look at all those time-wasting small accounts…they’re about as welcome as a chart in an elevator!
Note that the above graph was produced using Excel 2013. Excel 2013 automatically puts in those grey lines connecting the data labels with the series. Those are called Leader Lines. They rock.
Unfortunately, earlier versions of Excel only use leader lines for pie charts. But fear not, intrepid reader, for my chart template uses a bit of VBA to automatically put lines in for you using shapes, if you’re using Excel 2010:

What’s cool about this template is that all the data labels are dynamic: change the ‘breakpoints’ between groups or the number of groups in the ‘Controls’ table [see screenshot below], and the details within the data labels are updated automatically. Bing!

I modified a version of Jon Peltier’s great Label Last Point routine to refresh the placement of the data labels. (Thanks, Jon). Here’s the template, so you can play around in the privacy of your own screen:
Segmenting customers by revenue contribution_V1 [Not tested in Excel 2007 or earlier]
Oh yes. I most definitely charted, boss.
Updates
—Update 1—
Prompted by some great action in the comments below, I whipped up this redesign in both grey and white:

While I like the grey, I do think it’s harder on the eyes than black text on a white background. And I don’t think a grey chart would work well on, say, a dashboard. But that said, there’s no doubt in my mind that this chart is sexier than my original. Might look nice in the Economist. Here’s a link to the revised sample file: Segmenting-customers-by-revenue-contribution_with_Leader_Lines V1
—Update 2—
Kaiser Fung has some great ideas on how to redesign this in his post Visualizing Uneven Distributions. Go check it out, and be sure to subscribe to both his Junk Charts blog as well as his Big Data, Plainly Spoken blog. Both are gold. Both will make you a better analyst.
23 Responses to “Shift Calendar Template – FREE Download”
Hi Chandoo,
Your recent postings include only Excel 2007 templates. Unfortunately the company I work at still runs Excel 2003. Is it possible to get your awesome files in other Excel versions as well?
Thanks so much for your great excel stuff!
Is it possible to do this for shifts with hours instead of days? To organise a three shift day?
Thanks in advance,
Stelios
In my organization there are 45 employees. I need to split them into three shifts, e.g. A shift: 14, B shift: 14, C shift: 14 and week off: 3. Kindly help me with this.
@Masthan
You need to understand what rules your company has for the various shifts / roster combinations
Chandoo, I once did a shift control spreadsheet for my team. I put one person in each line, the columns were the days. I put a shift code in each cell indicating in which shift that person should work, or if the person were out that day. I have two codes for being out. One is for vacations and one is to compensate days worked in weekends. This way I was able to count how many persons I have in each shift, how many were on vacations and how many were out compensating (that's the term we use here) weekend worked hours.
Later I included the possibility of a person be in two lines one for normal hours other for overtime. This is mainly used for planning purposes. If you would like I can send you an example. The only problem of this spreadsheet is that we don't have a person view, only this consolidated view.
Hi George, I would like to have a copy of your spreadsheet if you can share it.
Thanks in advance, Chuck
Hi Chandoo,
Where is the code located? Is it VBA? If so, how do you hide it? Or is it .NET?
Thx
@Idan
No VBA or code, it is all done with mirrors.
Only joking.
But there is no VBA or code.
It is all done with Named Formulas and Lookups.
Have a look at the cells in the calendar area, and at the Named Formulas under Name Manager on the Formulas tab.
How can i calculate between two or more different workbooks? Please, reply me as early as possible.
@Anand
Open the workbooks you want to link to
Start a formula = and click and change between workbooks as required.
You can use the View, Switch window menu to change workbooks mid formula
The format for using workbooks is
=[Workbook.xlsm]Sheet1!$A$1
or
=SUM('[Book2.xls]Sheet1'!$A$1:$D$10)
etc
Hi Chandoo,
I work with a call centre where, at month end, I need to update the login hours of 20 to 30 employees who are in deficit. Tracking this at month end is very difficult. Is there a template that can track why someone who should have been on calls on a particular day was not?
Thank you so much Chandoo. This is really helping me. As usual, you rock.
What's FortyTwoDays and Calendar in Name manager?
Both are unused and FortyTwoDays doesn't make any sense.
I have a SQL db that contains records of events scheduled/completed on a particular date. Can this method of building a calendar be used to display those events on the respective day?
Positively awesome!
I'm attempting to help a friend create a schedule for adult classes - and of course it's not "paid help". Here is the scenario:
20 classes, instructor, room#, student class size, start date, number of class days (need to subtract weekends)
| class | instructor | room | students | start | #days |
| --- | --- | --- | --- | --- | --- |
| PATH | karen | 201 | 21 | 01/01/13 | 11 |
| BILLING | jane | 401 | 15 | 01/12/13 | 13 |
| MEDISOFT | mike | 301 | 11 | 01/25/13 | 9 |
He'd like to see these classes show up in different colors within the same month's calendar chart. He can draw it, but I'd like to see it done automatically through data, and I just can't visualize it, but I KNOW this will work - can you help?
Jan 🙂
Dear chandoo,
I tried many ways to download but still can't access the file. Anyway, we want to try out 3 shifts with 3 guys in a group, e.g. Group A morning, Group B night and Group C rest. Each group must take turns working on Sunday. We are a security team, so that's why Sunday work is required. Please guide us and show how to put this in the working calendar. Thank you in advance.
I've been trying to copy and/or recreate this to use in a workbook I'm doing for the transportation department I'm working for. I need to have the calendar on the first sheet in my document (it has graphs from data on another sheet). I'm trying to use it to track (with the conditional formatting) accidents and injuries. I've redone the conditional formatting to do 4 different accident types (no injury, near miss, OSHA recordable injury and work loss injury), but when I enter the formulas you have in the calendar portion where it says "DateOfFirst-FirstWeekDay" I can't figure out how you did that. Are you able to help?
I would like to use Excel to solve the following problem for community work. I want to create a driver schedule for a given month from a pool of volunteers for a community service. Each of these volunteers can drive only on specific days in a week. I would like to populate the driving schedule for each weekday with primary, secondary and tertiary drivers in a random fashion so that I do not overburden one person. I would greatly appreciate any help you can provide.
Hi chandoo,
Thanks for your valuable effort in creating this template. Please let me know how to add multiple employees to the roster.
Hi Chandoo,
This article on the shift roster is very helpful. Could you please let me know how I can use the same for n number of resources who work 24/7, considering their leaves and holidays?
Thanks,
Savitha
Hi Chandoo,
This article on the shift roster is very helpful to all. Could you please let me know how I can use the same if I want to add some more shifts? The color does not change if I add more shifts like 4, 5, etc.
Thanks,
Murali
nice post
How can I change the date to 2017 under the Shift Data worksheet?