Awesome chart to visualize Salary Increases for 3,500+ people [Tutorial]

Game for some charting awesomeness?

Of late, I have been doing a lot of data analysis and visualization on performance ratings, salary hikes, gender pay equality etc. Today let me share with you an awesome way to visualize massive amounts of data.

Scenario: Your organization of 3,686 people recently went through its annual performance rating & review process. At the end of it, everyone was offered some salary increase (from $0 to $24,000 per year). You have 7 business groups. How do you tell the story of all these salary hikes in one chart?

How about this one?

[Image: finished jitter plot visualizing employee salary hikes]

Ready to know how to create this in Excel? Read on.

Tutorial: Creating jittered scatter plot in Excel

That is right, what you are seeing above is a good old scatter plot with a bit of jitter (random noise added to the X values). This way, when too many dots fall on a single point, we spread them apart to show more.

Let’s look at data:

Here is a sample of 3,500+ employees’ ratings and salary hikes (randomly made up), with the usual columns:

[Image: raw data with performance ratings and salary increases]

Convert rating and group names to numbers:

Since we can’t use rating and group names in an XY plot (we need numbers, not text), let’s convert these into numbers using a simple MATCH() formula.

We get two new columns, like below:

[Image: rating and group converted to numbers]
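If it helps to see the lookup logic outside Excel, here is a minimal Python sketch of what MATCH() with an exact match does: it returns the 1-based position of a name in a list. The rating and group names are made-up placeholders, not the ones in the workbook. The only detail taken from this tutorial is that "unrated" ends up as number 5, as the Y value formula later assumes.

```python
# Hypothetical lookup lists; the real names live in the workbook.
RATINGS = ["Rating A", "Rating B", "Rating C", "Rating D", "Unrated"]
GROUPS = ["Group 1", "Group 2", "Group 3", "Group 4",
          "Group 5", "Group 6", "Group 7"]


def match(value, lookup):
    """Mimic Excel's MATCH(value, lookup, 0): 1-based position, exact match."""
    return lookup.index(value) + 1


rating_number = match("Rating C", RATINGS)    # third item in the list
group_number = match("Group 7", GROUPS)       # seventh item in the list
unrated_number = match("Unrated", RATINGS)    # last item, so 5
```

Because the numbers are just positions, the order of the lookup list decides the order of the dot towers on the chart.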

Creating X & Y values from data:

Next up, we need to generate the X & Y values for our plot.

Y value:  This is easy. It is the amount of salary increase with two twists:

  • If an employee got a $0 hike, we want to omit them from the plot. This removes many of the dots from the plot (less clutter).
  • If an employee is unrated (even if they got a hike), we want to omit them too. This is because our plot has only 4 rating levels per group. There are very few unrated people and they are not the focus of this chart.

We can create Y value using a simple IF formula like below:

  • =IF(OR([@[Salary Increase $]]=0,[@[Rating 17 (number)]]=5),NA(),[@[Salary Increase $]])
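For readers who think better in code, the same rule can be sketched in Python. This is only an illustration of the IF formula above; the function name is made up, and None stands in for Excel's NA(), which the chart skips.

```python
def y_value(salary_increase, rating_number, unrated=5):
    """Return the hike to plot, or None where the Excel formula returns NA()."""
    # Omit people with a $0 hike and people who are unrated (rating number 5).
    if salary_increase == 0 or rating_number == unrated:
        return None
    return salary_increase


plotted = y_value(1500, 2)        # rated, got a hike: kept
zero_hike = y_value(0, 2)         # $0 hike: omitted
unrated_hike = y_value(1500, 5)   # unrated, even with a hike: omitted
```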

X value:  This is the tricky bit. Since there are 7 groups, each with 4 ratings (excluding the unrated), we have 28 possible X values. We want to space these out so the dots for one group + rating combination don’t encroach on another combination.

Let’s say we give 10 units of space per group.

That means we have 2.5 units of space per rating in each group (and a total of 70 units of space).

Now, each dot needs to be plotted at the center of its 2.5 units of space (i.e. 1.25 units from the start of that space).

The basic formula would be: =[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25
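To check the arithmetic, here is a small Python sketch that computes the center of every group + rating slot. It assumes group numbers run from 1 to 7 and rating numbers from 1 to 4, which is what MATCH() positions would give; the constant and function names are made up for illustration.

```python
GROUP_WIDTH = 10.0                             # units of space per group
RATINGS_PER_GROUP = 4                          # unrated people are left out
SLOT_WIDTH = GROUP_WIDTH / RATINGS_PER_GROUP   # 2.5 units per rating


def slot_center(group_number, rating_number):
    """Same arithmetic as the basic X formula, without any jitter."""
    return (group_number * GROUP_WIDTH
            + (rating_number - 1) * SLOT_WIDTH
            + SLOT_WIDTH / 2)


# One center per group + rating combination: 7 x 4 = 28 positions.
centers = [slot_center(g, r) for g in range(1, 8) for r in range(1, 5)]
first_center = slot_center(1, 1)   # 10 + 0 + 1.25
last_center = slot_center(7, 4)    # 70 + 7.5 + 1.25
```

Under these assumptions the first tower sits at 11.25 and the last at 78.75, so all dots land between 10 and 80 on the X axis.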

But what about the jitter?

Aah, right. We need to add random noise to the X value. Since each rating has 2.5 units of space, how about noise between -0.7 and 0.7? This still leaves plenty of space on both ends, thus keeping the plot clear.

[Image: jittering a dot with random noise]

We can use the formula below to generate the noise.

=RANDBETWEEN(-700,700)/1000

The final formula for X value goes like this:

=[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25+[@Noise]
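Here is a rough Python equivalent of the noise and final X formulas, in case you want to convince yourself that jittered dots never leave their 2.5 unit slot. The function names and the sample group and rating numbers are made up; random.randint is used because, like RANDBETWEEN, it includes both end points.

```python
import random


def noise(rng=random):
    """Mimic =RANDBETWEEN(-700,700)/1000: from -0.7 to 0.7 in steps of 0.001."""
    return rng.randint(-700, 700) / 1000


def x_value(group_number, rating_number, rng=random):
    """Slot center plus jitter, as in the final X formula."""
    center = group_number * 10 + (rating_number - 1) * 2.5 + 1.25
    return center + noise(rng)


rng = random.Random(17)   # fixed seed so reruns give the same dots
# 1,000 jittered dots for (hypothetical) group 3, rating 2; the center is 33.75.
xs = [x_value(3, 2, rng) for _ in range(1000)]
slot_start, slot_end = 32.5, 35.0   # the 2.5 units owned by this combination
```

Every value in xs stays within 0.7 of 33.75, well inside the slot. One difference to keep in mind: RANDBETWEEN recalculates whenever the sheet recalculates, so the dots in Excel shift a little each time, while the fixed seed here is just a Python convenience.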

Here is how our X and Y values look at this stage:

[Image: X and Y values for the jittered scatter plot]

Data prep done, let’s move to the plot.

Creating jittered scatter plot

  1. Select both X & Y values and insert an XY (scatter) plot. We get this: [Image: jitter plot, step 1]
  2. Set X axis limits and remove the title: As all our dots are between 10 and 80, let’s set them as the limits for the X axis. Also, let’s remove the chart title. [Image: jitter plot, step 2, X axis limits]
  3. Add vertical gridlines: Although our dot towers are separated from each other, adding gridlines makes the chart easier to read. [Image: jitter plot, step 3, vertical gridlines]
  4. Format the markers: Set the fill to a solid color with 25% transparency. This makes the dots look nice and shows the density when there are too many people at the same coordinates. [Image: jitter plot, step 4, transparent dots]
  5. Set Y axis limit: Set the maximum to $10,000 so that we can focus on people getting a salary increase of up to $10,000. This zooms the chart in on the meaty part while still showing plenty of outliers. We get this: [Image: jitter plot, step 5, Y axis max limit]
  6. Last step: Remove the plot and chart borders, so we can add extra info, labels etc. [Image: jitter plot, step 6, borders removed]

Ok, now our chart is almost ready. Next step, making it a story.

Create a wireframe in a 10-column area, as shown below:

[Image: chart layout wireframe for the employee jitter plot]

Next, place the chart inside the red box. Adjust the plot area size so it fits into 7 columns; hold the ALT key while resizing so the plot area snaps to the cell grid. You need to repeat this step every time you fiddle with the chart, so do it last.

Add extra story points:

  • A clear and descriptive title
  • A sub-title explaining what is going on and how to read the chart.
  • Group names and rating names. You can use the below trick to align the rating labels inside a cell nicely. [Image: demo of horizontal distribution of cell text]
  • Show some more stats like median hike, median new pay (if you have it), head counts and unrated counts.
  • Add any footers, disclaimers (about excluded people in the plot etc.)
  • Add a border around this entire wire frame so it all looks like one piece.
  • Shade alternate columns in some dull color. This improves readability. As our chart is transparent, cell fill colors will show up nicely.
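The extra stats (median hike, head counts and unrated counts per group) are simple aggregations. Here is one way they could be computed, sketched in Python with a handful of made-up rows. It assumes the median is taken over everyone in the group, including $0 hikes and unrated people; you may prefer to exclude them, to match what the plot shows.

```python
from collections import defaultdict
from statistics import median

# Made-up rows: (group, rating number, salary increase). Rating 5 = unrated.
rows = [
    ("Group 1", 1, 2000),
    ("Group 1", 2, 0),
    ("Group 1", 5, 1000),
    ("Group 2", 3, 3000),
    ("Group 2", 4, 5000),
    ("Group 2", 2, 4000),
]


def group_stats(rows, unrated=5):
    """Head count, unrated count and median hike for each business group."""
    by_group = defaultdict(list)
    for group, rating, hike in rows:
        by_group[group].append((rating, hike))

    stats = {}
    for group, people in by_group.items():
        stats[group] = {
            "head_count": len(people),
            "unrated_count": sum(1 for rating, _ in people if rating == unrated),
            "median_hike": median(hike for _, hike in people),
        }
    return stats


stats = group_stats(rows)
```

Whichever definition you pick, say so in the footer, since the plot itself leaves out $0 hikes and unrated people.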

We are done.

[Image: finished jitter plot visualizing employee salary hikes]

Inspiration for this – R

That is right. You can create a similar plot quicker and better using R. ggplot2, an R library, has built-in support for jittering dots on XY plots. Using that, you can create the chart below with just 7 lines of code. This is what you get (yes, you can show each rating’s dots in a different color, and yes, you can order the groups by the number of people in them).

[Image: employee ratings jitter plot made in R]

Here is the R script if you want to experiment.

Download Excel Chart

Click here to download the workbook containing this chart, tutorial and raw data. Try re-creating it in Excel (or your favorite visualization tool) to learn more.

How do you like this chart?

I had lots of fun making and tweaking this chart. It shows some interesting patterns about how salary hikes are distributed across groups and where everyone is.

How do you like this? Do you plan to add some jitter to your busy scatter plots? Please share your thoughts in the comments section. And if you want some inspiration, check out more such charts.

Jittery about charts?

If you love storytelling and beautiful visualizations but are not sure how to get there, consider enrolling in our Excel School or 50 ways to Analyze Data programs. In these powerful courses, I teach you all about awesome data analysis and visualization techniques.

19 Responses to “Awesome chart to visualize Salary Increases for 3,500+ people [Tutorial]”

  1. Vishal Srivastava says:

    Integration with R is really awesome chandoo! Looking forward for more articles on Excel+R.

    Thanks

  2. Vishal Srivastava says:

    Hi Chandoo,
    I am getting following error in R script:

    Error: Faceting variables must have at least one value
    Please tell where i am missing something.

    Thanks

  3. N says:

    Very nice. But unfortunately this is too advanced for me. I'd like to learn about the basics, eg. the IF formula. Do you already have a post that explains that? Many thanks!

  4. Michael says:

    Chandoo,

    An excellent example of charting...especially the jitters !
    This will find its way into my portfolio

    Thank you

  5. Chihiro says:

    Too bad that tidyverse isn't supported right now in PowerBI to publish online.

    I'm still very new to R... But from my understanding, tidyverse is collection of individual packages... so I guess I can load individual library(s) and should work. Now to go and find where each function belongs 🙂

    • Chihiro says:

      Success!

      Minor modification: You need these 4 libraries
      library(ggplot2)
      library(magrittr)
      library(dplyr)
      library(readr)

      And add all columns into Values field and change rem_data line to...
      rem_data <- dataset

  6. Brian says:

    is there a way to color a particular data record red?

    Thinking of trying to use this in a dashboard to automate the last 30 days data where the red is "Today"

    • Chandoo says:

      You can use another series to separate today (or any other criteria records). If you are not sure, send me the dummy data set @chandoo.d@gmail.com and I will show in a future blog post.

  7. Chihiro says:

    I was playing a bit more with the jitter plot and found issue where R visual returns error when slicer is used on it.

    With sample data when slicer is used to filter for "Operations", the visual returns error. While it works fine on "Development".

    Looks like the issue is when there is no record (visible on chart or not) that belongs to particular rating (Ex: NME).

    I circumvented this issue by adding dummy data for each rating category to Group Name (I added 0 increase since these are filtered from the chart in script).

    Will report back if I find more elegant solution within R script.

  8. Deb says:

    Thank you so much. The download works great and the tutorial is excellent. I customized it like crazy and didn't manage to break it. You are awesome!

  9. Adil says:

    Great post. I'm unable to understand the X axis formula setup? Can someone please take the time out to explain why it is so. I tried reading the article twice but not getting it. Thanks.
    =[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25+[@Noise]

  10. DENIS says:

    Excellent article, thanks! What should be the formula for placing the points so that they are not distributed in random order, but evenly from the center? I distributed wage values for 150 ranges. And in each of them I want the points to be uniformly distributed from the center to the right and to the left within [-0.7, +0.7]. Something like beeswarm method Center in R (http://www.cbs.dtu.dk/~eklund/beeswarm/).

  11. […] before I started building this visual, I’d fortunately been reading an article by my good friend Chandoo, in which he “jittered” some dots in an Excel scatter […]

  12. Jayanta Moitra says:

    Hi Chandoo
    I need to create a program to print stagewise fare chart. The chart will be printed based on fare & stagewise Kms. Example -

    Km Stage
    0 Source
    5 A
    10 B
    14 C

    Fare Chart will be printed as -
    Km Stage
    0 Siliguri 00
    5 Fulbari 10
    10 Fatapukur 20 10
    15 Jalpaiguri 30 20 10

    Can you please help me.

  13. Issac Alex says:

    Hi Chandoo,

    Thank you for the opportunity and for all the awesome excel, you are amazing

    Really love them all,

    Best wishes
