Awesome chart to visualize Salary Increases for 3,500+ people [Tutorial]


Game for some charting awesomeness?

Of late, I have been doing a lot of data analysis and visualization on performance ratings, salary hikes, gender pay equality etc. Today, let me share with you an awesome way to visualize massive amounts of data.

Scenario: Your organization of 3,686 people recently went through the annual performance ratings & review process. At the end of it, everyone was offered some salary increase (from $0 to $24,000 per year). You have 7 business groups. How do you tell the story of all these salary hikes in one chart?

How about this one?

finalize-jitter-plot-visualizing-employee-salary-hikes

Ready to know how to create this in Excel? Read on.

Tutorial: Creating jittered scatter plot in Excel

That is right, what you are seeing above is a good old scatter plot with a bit of jitter (random noise added to X values). This way, when too many dots are at a single point, we spread them apart to show more.

Let’s look at data:

Here is a sample of 3,500+ employees' ratings and salary hikes (randomly made up), with the usual columns:

performance-ratings-and-sal-increase-raw-data

Convert rating and group names to numbers:

Since we can’t use rating and group names in an XY plot (we need numbers, not text), let’s convert these into numbers using a simple MATCH() formula.

We get two new columns, like below:

rating-and-group-converted-to-numbers
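For readers who want to see the lookup logic outside Excel, here is a minimal Python sketch of what an exact-match MATCH() does. The rating and group labels are hypothetical (the real names live in the workbook); the only detail taken from the tutorial is that "unrated" ends up as number 5.

```python
# Hypothetical labels, for illustration only. The real ones are in the workbook.
RATINGS = ["Outstanding", "Exceeds", "Meets", "Below", "Unrated"]
GROUPS = ["Group A", "Group B", "Group C", "Group D",
          "Group E", "Group F", "Group G"]


def match(value, lookup):
    """Return the 1-based position of value in lookup, like MATCH(value, lookup, 0)."""
    return lookup.index(value) + 1


rating_number = match("Meets", RATINGS)   # position in the rating list
group_number = match("Group C", GROUPS)   # position in the group list
```

One difference to keep in mind: `list.index` is case-sensitive, while Excel's MATCH is not.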

Creating X & Y values from data:

Next up, we need to generate the X & Y values for our plot.

Y value:  This is easy. It is the amount of salary increase with two twists:

  • If an employee got a $0 hike, we want to omit them from the plot. This removes many dots from the plot (less clutter).
  • If an employee is unrated (even if they got a hike), we want to omit them too. This is because our plot has only 4 rating levels per group. There are very few unrated people and they are not the focus of this chart.

We can create Y value using a simple IF formula like below:

  • =IF(OR([@[Salary Increase $]]=0,[@[Rating 17 (number)]]=5),NA(),[@[Salary Increase $]])
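The same rule, as a small Python sketch. It assumes, as the formula above does, that rating number 5 means "unrated"; `math.nan` stands in for NA(), which makes the chart skip the point.

```python
import math

UNRATED = 5  # rating number for unrated employees, as in the formula above


def y_value(salary_increase, rating_number):
    """Return the hike to plot, or NaN when the employee should be left out."""
    if salary_increase == 0 or rating_number == UNRATED:
        return math.nan  # plays the role of NA()
    return salary_increase
```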

X value:  This is the tricky bit. Since there are 7 groups, each with 4 ratings (excluding the unrated), we have 28 possible X values. We want to space these out so dots for one group + rating combination don’t encroach on another combination.

Let’s say we give 10 units of space per group.

That means we have 2.5 units of space per rating in that group (and a total of 70 units of space).

Now, the dot needs to be plotted at the center of this 2.5 units of space (i.e. at 1.25).

The basic formula would be: =[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25
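To check the arithmetic, here is a short Python sketch of the same formula. It assumes groups are numbered 1 to 7 and ratings 1 to 4, which is what the MATCH() step produces.

```python
GROUP_WIDTH = 10                               # units of X space per group
RATINGS_PER_GROUP = 4                          # unrated employees are not plotted
SLOT_WIDTH = GROUP_WIDTH / RATINGS_PER_GROUP   # 2.5 units per rating


def x_center(group_number, rating_number):
    """Center of the slot for one group + rating combination."""
    return (group_number * GROUP_WIDTH
            + (rating_number - 1) * SLOT_WIDTH
            + SLOT_WIDTH / 2)


# All 28 slot centers, from the first group/rating to the last.
centers = [x_center(g, r) for g in range(1, 8) for r in range(1, 5)]
```

The first center is 11.25 (group 1, rating 1) and the last is 78.75 (group 7, rating 4), so every dot lands between 10 and 80.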

But what about the jitter?

Aah, right. We need to add random noise to the X value. Since each rating has 2.5 units of space, how about noise between -0.7 and 0.7? This still leaves plenty of space on both ends, thus keeping the plot clear.

jittering-a-dot-with-random-noise

We can use below formula to generate the noise.

=RANDBETWEEN(-700,700)/1000

The final formula for X value goes like this:

=[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25+[@Noise]
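Here is a rough Python equivalent of the noise and the final X value. `random.randint` includes both end points, just like RANDBETWEEN. The `rng` parameter is an addition for this sketch so the jitter can be seeded and reproduced; the Excel formula has no such option and recalculates whenever the sheet does.

```python
import random


def noise(rng=random):
    """Random offset between -0.7 and 0.7 in steps of 0.001."""
    return rng.randint(-700, 700) / 1000


def x_value(group_number, rating_number, rng=random):
    """Slot center plus jitter, mirroring the final X formula above."""
    center = group_number * 10 + (rating_number - 1) * 2.5 + 1.25
    return center + noise(rng)


# A seeded generator gives the same jitter on every run.
rng = random.Random(2017)
sample = [x_value(3, 2, rng) for _ in range(5)]
```

Every value in `sample` stays within 0.7 of 33.75, the center for group 3, rating 2, so the dots never spill into the neighbouring slot.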

Here is how our X,Y looks at this stage:

x and y values for scatter jitter plot

Data prep done, let’s move to the plot.

Creating jittered scatter plot

  1. Select both X & Y values and insert an XY plot. We get this:
     jitter-plot-step-1
  2. Set X axis limits and remove title: As all our dots are between 10 and 80, let’s set them as limits for the X axis. Also, let’s remove the chart title.
     jitter-plot-step-2-x-axis-limits
  3. Add vertical gridlines: Although our dot towers are separated from each other, adding gridlines makes the chart easier to read.
     jitter-plot-step-3-vert-lines
  4. Format the markers: Set fill to solid color and 25% transparency. This makes the dots look nice and shows the density when there are too many people at the same coordinates.
     jitter-plot-step-4-transperent-dots
  5. Set Y axis limit: Cap the Y axis so that we can focus on people getting a salary increase of up to $10,000. This zooms the chart to the meaty part while still showing plenty of outliers. We get this:
     jitter-plot-step-5-y-axis-max-limit
  6. Last step:  Remove plot and chart borders, so we can add extra info, labels etc.
     jitter-plot-step-6-remove-borders

Ok, now our chart is almost ready. Next step, making it a story.

Create a wireframe in a 10-column area, as shown below:

chart-layout-wireframe-employee-jitter-plot-v1

Next, place the chart inside the red box. Adjust the plot area size so it fits into 7 columns. Hold the ALT key while adjusting so the plot area snaps to the column edges. You need to repeat this step every time you fiddle with the chart, so do it last.

Add extra story points:

  • A clear and descriptive title
  • A sub-title explaining what is going on and how to read the chart.
  • Group names and rating names. You can use the below trick to align the rating labels inside the cell nicely.
    demo-horizontal-distribution-cell-text
  • Show some more stats like median hike, median new pay (if you have it), head counts and unrated counts.
  • Add any footers, disclaimers (about excluded people in the plot etc.)
  • Add a border around this entire wire frame so it all looks like one piece.
  • Shade alternate columns in some dull color. This improves the readability. As our chart is transparent, cell fill colors will show up nicely.
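The extra stats mentioned in the list (median hike, head counts, unrated counts) are easy to cross-check outside the workbook. Below is a minimal Python sketch. The rows are made up for illustration, and it assumes that everyone in a group counts towards the median, including $0 hikes and unrated people. Adjust that if your story needs something else.

```python
from collections import defaultdict
from statistics import median

UNRATED = 5  # rating number for unrated employees


def group_stats(rows):
    """rows: iterable of (group, rating_number, salary_increase) tuples."""
    hikes = defaultdict(list)
    unrated = defaultdict(int)
    for group, rating, hike in rows:
        hikes[group].append(hike)
        if rating == UNRATED:
            unrated[group] += 1
    return {
        group: {
            "head_count": len(values),
            "median_hike": median(values),
            "unrated": unrated[group],
        }
        for group, values in hikes.items()
    }


# Made-up rows: (group, rating number, salary increase in $)
rows = [("A", 1, 1000), ("A", 2, 3000), ("A", 5, 0), ("B", 3, 500)]
stats = group_stats(rows)
```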

We are done.

finalize-jitter-plot-visualizing-employee-salary-hikes

Inspiration for this – R

That is right. You can create a similar plot quicker and better using R. ggplot2, an R library, has built-in support for jittering dots on XY plots. Using that, you can create the chart below with just 7 lines of code. This is what you get (yes, you can show each rating's dots in a different color, and yes, you can order the groups by the number of people in them).

employee-ratings-jitter-plot-r

Here is the R script if you want to experiment.

Download Excel Chart

Click here to download the workbook containing this chart, tutorial and raw data. Try re-creating it in Excel (or your favorite visualization tool) to learn more.

How do you like this chart?

I had lots of fun making and tweaking this chart. It shows some interesting patterns about how salary hikes are distributed across groups and where everyone is.

How do you like this? Do you plan to add some jitter to your busy scatter plots? Please share your thoughts in comments section. And if you want some inspiration, check out more such charts.

Jittery about charts?

If you love storytelling and beautiful visualizations but are not sure how to get there, consider enrolling in our Excel School or 50 ways to Analyze Data programs. In these powerful courses, I teach you all about awesome data analysis and visualization techniques.


15 Responses to “Highlight Employees by Performance Rating – Conditional Formatting Challenge”

  1. Stephen says:

    While this might solve the question Shelly asked, there is another option that might be more useful - a pivot table could make a list of people who fall into the various categories, so, if you needed to simply see who got in the top bracket to give them a bonus, you would have that list

    Simply sorting by the rankings would work too, but you would knock them out of alphabetical order. 

  2. Darin Myers says:

    The solution I chose makes use of the percentile formula.
     
    The percentile formula returns the value representing the K-th percentile of a range of values. The range of values is the first criteria, and K is the second criteria in the formula.

    I applied Conditional Formatting according to the formulas in the order below:

    5%    =$C6>=PERCENTILE($C$6:$C$33,0.95)   Dark Blue
    15%  =$C6>=PERCENTILE($C$6:$C$33,0.85)   Light Blue
    65%  =$C6>=PERCENTILE($C$6:$C$33,0.1)     Green
    10%  =$C6>=PERCENTILE($C$6:$C$33,0.05)   Light Red
    5%    =$C6<PERCENTILE($C$6:$C$33,0.05)     Dark Red
     
    The issue I noted with this approach is that Zambi was not highlighted in my solution as it is in the solution provided. Unless I am mistaken, and I very well may be, the 10th percentile for this data set is at 2.21, so Zambi would fall above the 10th percentile with a PR of 2.3.
     
    The first step to this was figuring out the 'buckets'; what scores should fall into each range. In attempting to match the formatting of the spreadsheet, I determined the buckets below.
     
    5% = 95% to 100%
    10% = 90% up to but not including 95%
    65% = 10% up to but not including 90%
    10% = 5% up to but not including 10%
    5% = under 5%
     
    After that, it is a relatively simple matter to plug the necessary values into the conditional formatting formulas as shown above.

    One final consideration is that while the buckets above match the color banding on the spreadsheet, I believe that the original request suggests a different color banding with 6 buckets shown below.
     
    Top 5%    = 95 to 100%    Dark blue
    Top 10%  = 85 up to but not including 95%    Light blue
    Top 65%  = 35 up to but not including 85%    Green

    Bottom 10% = 10% down to but not including 5%   Light Red
    Bottom 5%   = 5% or under    Dark Red
     
    This leaves one final bucket of 10 to 35% (exclusive of both values) that is not highlighted and so would remain white.
     
    Thank you Chandoo and Shelly for an interesting and useful exercise. This is certainly a valuable technique to have in my reporting bag of tricks.
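A minimal Python sketch of the rule order described in this comment, assuming the thresholds from its formula list (0.95, 0.85, 0.1, 0.05) and assuming PERCENTILE uses inclusive linear interpolation. `percentile_inc` and `bucket` are illustrative names, not part of any workbook.

```python
import math


def percentile_inc(values, k):
    """k-th percentile (0 <= k <= 1) by linear interpolation between ranks."""
    data = sorted(values)
    pos = k * (len(data) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def bucket(score, scores):
    """First matching rule wins, like ordered conditional formatting rules."""
    rules = [
        (0.95, "Dark Blue"),
        (0.85, "Light Blue"),
        (0.10, "Green"),
        (0.05, "Light Red"),
    ]
    for k, color in rules:
        if score >= percentile_inc(scores, k):
            return color
    return "Dark Red"
```

Because the rules are checked from the top down, a score that clears the 95th percentile never falls through to the lower bands.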
     

  3. PSG says:

    Use of PERCENTILE is a smarter way of doing it.  Below is my solution.
     
    First 5 % = Apply conditional formatting (Dark Blue) as highlight ">=" =PERCENTILE(C:C,0.95)

    Next 15% = Apply conditional formatting (Lighter Blue) as highlight between =PERCENTILE(C:C,0.95)-0.01 and  =PERCENTILE(C:C,0.8)

    Next 65% = Apply conditional formatting as highlight (Olive Green) between =PERCENTILE(C:C,0.8)-0.01 and  =PERCENTILE(C:C,0.15)

    Next 10% = Apply conditional formatting as highlight (Lighter Red) between =PERCENTILE(C:C,0.15)-0.01 and  =PERCENTILE(C:C,0.05)

    Bottom 5% = Apply conditional formatting (Red) as less than =PERCENTILE(C:C,0.05)

    • Shailesh says:

      I agree, this is a challenge faced by HR managers every year and use of percentile formulae is the most popular solution which permits further processing like making bell curve, applying increments based on segmentation etc.

  4. Mayank Bhatia says:

    Hi Chandoo,

    I came at the same solution as yours (not looking at yours first) but I have hard coded the conditions in the conditional formatting. For example:

    =AND($C6>=$D$10,$C6<$D$9)

    I have done the same thing 5 times for each condition.   This makes the formatting independent of the order of specification. I think it will work better across versions of excel.

    To copy the same thing in all sheets, Shelly can copy these formatted cells with format painter and apply it to the relevant cells in the next sheet and so on! I know 700 sheets will be difficult but I don't know of any other way to apply conditional formatting rules to the whole sheet.

      

  5. Sameer Srivastava says:

    First i have used percentile formula in the next column of "percentile Threshold" where E5, E6.. is input to colour code.
    The idea behind doing this is to replicate the formula for any range and any threshold

    =PERCENTILE($C$3:$C$30,1-E5)

    =PERCENTILE($C$3:$C$30,1-E6)

    =PERCENTILE($C$3:$C$30,1-E7)

    =PERCENTILE($C$3:$C$30,1-E8)

    =PERCENTILE($C$3:$C$30,1-E9)

    Now i have given logic to different employee by applying "if Formula"

    =+IF(J3>=$G$5,1,IF(J3>=$G$6,2,IF(J3>=$G$7,3,IF(J3>=$G$8,4,5))))

    where "J" refers to PR and "G" refers to the percentile derived from the above mentioned formula.
    Once again it is replicable (just change reference points).

    Now comes the major part of Conditional Formatting, i have used "use a formula to determine which cells to be formatted" 
    Formula =$j=5, format "required colour" Applies to "$I$3:$J$30" 
    plus put tick on stop if true

    This solves the query, important point that this is repeatable and can be done for n number of departments

    Thanks !

  6. Deepa says:

    I had done some reading on it and in Excel 2010 a new function has been introduced, percentile.exc. Attaching a video which also talks why the old percentile function shouldn't be used as it acts erroneous at times. Might be worth a watch Chandoo,
    http://www.itechtalk.com/thread10579.html

    • Hui... says:

      @Deepa

      Quite correct.

      Wherever you use statistical spreadsheet functions and are using Excel 2010, you should use the new versions of the functions, as MS did a lot of work to speed up and fix errors in the old functions.

      Warning: If you use the new Excel 2010 statistical functions in Named Formulas most of them will crash excel so do keep that in mind.
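To see how an inclusive and an exclusive percentile definition can disagree on the same data, here is a small sketch using Python's standard library. The scores are made up, and the mapping of these two methods to Excel's .INC and .EXC variants is an assumption based on how both are documented, not something tested against the challenge workbook.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5]  # made-up scores for illustration

# Quartile cut points under each definition.
inclusive = quantiles(scores, n=4, method="inclusive")  # [2.0, 3.0, 4.0]
exclusive = quantiles(scores, n=4, method="exclusive")  # [1.5, 3.0, 4.5]
```

The medians agree, but the outer cut points move, which is enough to shift a borderline score from one band to another.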

  7. Kishore says:

    Hello Chandoo,
    When i first read the challenge file, i thought, the color that need to be applied for a given rule, also need to be picked dynamically as given in rule set. But in the solution file, i found that color is hard Coded. So in case, someone has same data, but wants different colors, he/she needs to goto manage rules and change colors.
    Let me know if my understanding is correct, and if yes, can we also make the color to be applied dynamic?
     
    Thanks
    Kishore

  8. Roger L Moreno says:

    Hi, I also used the PERCENTILE function. However, I went a step further and, using the SMALL() function, I sorted the data by percentile so the color scheme would be grouped based on the value. This way it is better and easier to view.

  9. [...] recently posted a challenge to help a reader with a [...]

    • Balraj says:

      Hi, I have a doubt regarding the percentages that have been put in Chandoo's spreadsheet. I can't understand how he put them in directly. Can someone please explain how Chandoo arrived straight away at the percentages that I have stated below?

      5%

      15%

      60%

      10%

      5%

  10. I have stumbled on this post as the solution has been already given so I have taken the liberty to record a video where I show the implementation of it as well as adding a filtering feature which I hope can prove to be useful.

    Thank you

    http://www.xlninja.com/2012/06/28/how-to-use-excel-to-highlight-employee-performance-rating/  

  11. [...] write even a single word of the next article. Today I read my email and, whoa, a challenge from Chandoo. How could I refuse something like that? So I started reading, and after 5 minutes I told my husband that this one [...]

  12. Yves S says:

    Question for Chandoo:
    I came to your site late but am totally loving these challenges 🙂

    I guess it all boils down to how the bins are set up.
    I agree with the PERCENTILE.INC function.

    pls help me understand where I am wrong.

    I have determined following the bins:

    bottom 5% <=2.00 (F6:F33 <=PERCENTILE(range,.05))
    lower 15% (5+10) <= 2.40 (F6:F33 <=PERCENTILE(range,.15))
    lower 80% (5+10+65) <=3.46 (F6:F33 <=PERCENTILE(range,.80))
    lower 95% (5+10+65+15) <=4.00 (F6:F33 <=PERCENTILE(range,.95))
    top 5% <=4.20 (F6:F33 <=PERCENTILE(range,1.00))

    I find that only Tom is highest scorer and unique top 5% achiever.

    I notice that Chandoo has included Christy and Daniel in top 5% achievers. How can there be 3 people in top 5% out of a population of 28 (5% of 28 = 1.4, i.e. only one person can achieve that status)?

    I tried different ways but cannot get to that distribution.

    Rest of the work is simply organizing the conditional formatting rules with Stop If True box checked.

    Thanks for your insights
