Awesome chart to visualize Salary Increases for 3,500+ people [Tutorial]


Game for some charting awesomeness?

Of late, I have been doing a lot of data analysis and visualization on performance ratings, salary hikes, gender pay equality etc. Today let me share with you an awesome way to visualize massive amounts of data.

Scenario: Your organization of 3,686 people recently went through the annual performance ratings & review process. At the end of it, everyone was offered some salary increase (from $0 to $24,000 per year). You have 7 business groups. How do you tell the story of all these salary hikes in one chart?

How about this one?

finalize-jitter-plot-visualizing-employee-salary-hikes

Ready to know how to create this in Excel? Read on.

Tutorial: Creating jittered scatter plot in Excel

That is right, what you are seeing above is a good old scatter plot with a bit of jitter (random noise added to X values). This way, when too many dots sit at a single point, we spread them apart to show more.

Let’s look at data:

Here is a sample of 3,500+ employees’ ratings and salary hikes (randomly made up), with the usual columns:

performance-ratings-and-sal-increase-raw-data

Convert rating and group names to numbers:

Since we can’t use rating and group names in an XY plot (we need numbers, not text), let’s convert these into numbers using a simple MATCH() formula.

We get two new columns, like below:

rating-and-group-converted-to-numbers
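
If you want to sanity check the conversion outside Excel, here is a minimal Python sketch of what an exact-match MATCH() does. The rating and group names are made up for illustration (the workbook's actual lists may differ); the only thing taken from this tutorial is that "Unrated" ends up as 5.

```python
# Hypothetical lookup lists; the real workbook's names and order may differ.
RATINGS = ["Outstanding", "Exceeds", "Meets", "Below", "Unrated"]
GROUPS = ["Finance", "HR", "IT", "Marketing", "Operations", "Sales", "Support"]


def match(value, lookup):
    """Mimic Excel's MATCH(value, lookup, 0): 1-based position of an exact match."""
    return lookup.index(value) + 1


rating_number = match("Meets", RATINGS)   # 3
group_number = match("Finance", GROUPS)   # 1
```

Like MATCH() with an exact match, this fails loudly (a ValueError instead of #N/A) when a name is missing from the list.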

Creating X & Y values from data:

Next up, we need to generate the X & Y values for our plot.

Y value:  This is easy. It is the amount of salary increase with two twists:

  • If an employee got a $0 hike, we want to omit them from the plot. This removes many of the dots (less clutter).
  • If an employee is unrated (even if they got a hike), we want to omit them too. This is because our plot has only 4 rating levels per group. There are very few unrated people and they are not the focus of this chart.

We can create Y value using a simple IF formula like below:

  • =IF(OR([@[Salary Increase $]]=0,[@[Rating 17 (number)]]=5),NA(),[@[Salary Increase $]])
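
The same rule can be sketched in Python, which may help if you want to test the logic on a few rows. This is only an illustration: NaN stands in for Excel's NA(), and the function and constant names are my own, not part of the workbook.

```python
import math

UNRATED = 5  # rating number used for unrated employees in the formula above


def y_value(salary_increase, rating_number):
    """Return the hike to plot, or NaN (standing in for NA()) to skip the dot."""
    if salary_increase == 0 or rating_number == UNRATED:
        return math.nan
    return salary_increase
```

For example, y_value(1500, 2) returns 1500, while y_value(0, 2) and y_value(1500, 5) both return NaN, so those employees are left out of the plot.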

X value:  This is the tricky bit. Since there are 7 groups, each with 4 ratings (excluding the unrated), we have 28 possible X values. We want to space these out so dots for one group + rating combination don’t encroach on other combinations.

Let’s say we give 10 units of space per group.

That means, we have 2.5 units of space per rating in that group (and total of 70 units of space).

Now, the dot needs to be plotted at the center of this 2.5 units of space (i.e. at 1.25)

The basic formula would be: =[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25
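
To check the arithmetic, here is a small Python sketch of the same spacing. It assumes groups are numbered 1 to 7 and ratings 1 to 4, as produced by the MATCH() step; the names are illustrative only.

```python
GROUP_WIDTH = 10                              # units of space per group
RATINGS_PER_GROUP = 4                         # unrated people are excluded
SLOT_WIDTH = GROUP_WIDTH / RATINGS_PER_GROUP  # 2.5 units per rating


def base_x(group_number, rating_number):
    """Center of the slot for a group + rating combination (no jitter yet)."""
    return (group_number * GROUP_WIDTH
            + (rating_number - 1) * SLOT_WIDTH
            + SLOT_WIDTH / 2)


first = base_x(1, 1)   # 11.25, center of the first slot
last = base_x(7, 4)    # 78.75, center of the last slot
```

So all 28 slot centers fall between 11.25 and 78.75, which is why an X axis running from 10 to 80 fits the data.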

But what about the jitter?

Aah, right. We need to add random noise to the X value. Since each rating has 2.5 units of space, how about noise between -0.7 and 0.7? This still leaves plenty of space on both ends, thus keeping the plot clear.

jittering-a-dot-with-random-noise

We can use the formula below to generate the noise.

=RANDBETWEEN(-700,700)/1000

The final formula for X value goes like this:

=[@[Business Group]]*10+([@[Rating 17 (number)]]-1)*2.5+1.25+[@Noise]
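
Here is a hedged Python sketch of the jittered X value, using random.randint as a stand-in for RANDBETWEEN (both include the end points). The function names are my own and the optional rng parameter is only there so the result can be made repeatable.

```python
import random


def noise(rng=random):
    """Random offset between -0.7 and 0.7 in steps of 0.001."""
    return rng.randint(-700, 700) / 1000


def jittered_x(group_number, rating_number, rng=random):
    """Slot center for the group + rating combination, plus random noise."""
    center = group_number * 10 + (rating_number - 1) * 2.5 + 1.25
    return center + noise(rng)


rng = random.Random(2017)  # seeded so the same dots come out every run
xs = [jittered_x(1, 1, rng) for _ in range(5)]
```

Every value stays within 0.7 of the slot center, so dots never cross into the neighboring rating. One thing to keep in mind on the Excel side: RANDBETWEEN recalculates whenever the sheet changes, so the dots will shuffle unless you paste the noise column as values.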

Here is how our X and Y values look at this stage:

x and y values for scatter jitter plot

Data prep done, let’s move to the plot.

Creating jittered scatter plot

  1. Select both X & Y values and insert an XY plot. We get this:
     jitter-plot-step-1
  2. Set X axis limits and remove title: As all our dots are between 10 and 80, let’s set them as limits for the X axis. Also, let’s remove the chart title.
     jitter-plot-step-2-x-axis-limits
  3. Add vertical gridlines: Although our dot towers are separated from each other, adding gridlines makes the chart easier to read.
     jitter-plot-step-3-vert-lines
  4. Format the markers: Set fill to solid color and 25% transparency. This makes the dots look nice and shows the density when there are too many people at some coordinates.
     jitter-plot-step-4-transperent-dots
  5. Set Y axis limit: So that we can focus on people getting a salary increase of up to $10,000. This zooms the chart to the meaty part while still showing plenty of outliers. We get this:
     jitter-plot-step-5-y-axis-max-limit
  6. Last step:  Remove plot and chart borders, so we can add extra info, labels etc.
     jitter-plot-step-6-remove-borders

Ok, now our chart is almost ready. Next step, making it a story.

Create a wireframe in a 10-column area, as shown below:

chart-layout-wireframe-employee-jitter-plot-v1

Next, place the chart inside the red box. Adjust the plot area size so it fits into 7 columns. Hold the ALT key while adjusting so the chart’s plot area fits into 7 columns. You need to repeat this step every time you fiddle with the chart, so do it last.

Add extra story points:

  • A clear and descriptive title
  • A sub-title explaining what is going on and how to read the chart.
  • Group names and rating names. You can use the trick below to align the rating labels nicely inside a cell:
    demo-horizontal-distribution-cell-text
  • Show some more stats like median hike, median new pay (if you have it), head counts and unrated counts.
  • Add any footers, disclaimers (about excluded people in the plot etc.)
  • Add a border around this entire wire frame so it all looks like one piece.
  • Shade alternate columns in some dull color. This improves readability. As our chart is transparent, cell fill colors will show up nicely.
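
For the extra stats (median hike, head counts and unrated counts), here is a rough Python sketch of the per-group calculation. It assumes each row is a (group, rating number, hike) tuple with 5 meaning unrated, and it takes the median over everyone in the group, including $0 hikes. Both are my assumptions, so adjust them to match how you want to tell the story. The sample rows are made up.

```python
from collections import defaultdict
from statistics import median

UNRATED = 5  # rating number for unrated employees


def group_stats(rows):
    """Summarize (group, rating_number, hike) rows per business group."""
    hikes = defaultdict(list)
    unrated = defaultdict(int)
    for group, rating_number, hike in rows:
        hikes[group].append(hike)
        if rating_number == UNRATED:
            unrated[group] += 1
    return {
        group: {
            "head_count": len(values),
            "median_hike": median(values),
            "unrated": unrated[group],
        }
        for group, values in hikes.items()
    }


# Made-up sample rows, only to show the shape of the input
sample = [("Sales", 1, 2000), ("Sales", 3, 0), ("Sales", 5, 1000), ("IT", 2, 500)]
stats = group_stats(sample)
```

In Excel itself you would get the same numbers from a pivot table or from formulas on the data table.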

We are done.

finalize-jitter-plot-visualizing-employee-salary-hikes

Inspiration for this – R

That is right. You can create a similar plot quicker and better using R. ggplot2, an R library, has built-in support for jittering dots on XY plots. Using that, you can create the chart below with just 7 lines of code. This is what you get (yes, you can show each rating’s dots in a different color, and yes, you can order the groups by the number of people in them).

employee-ratings-jitter-plot-r

Here is the R script if you want to experiment.

Download Excel Chart

Click here to download the workbook containing this chart, tutorial and raw data. Try re-creating it in Excel (or your favorite visualization tool) to learn more.

How do you like this chart?

I had lots of fun making and tweaking this chart. It shows some interesting patterns about how salary hikes are distributed across groups and where everyone is.

How do you like this? Do you plan to add some jitter to your busy scatter plots? Please share your thoughts in the comments section. And if you want some inspiration, check out more such charts.

Jittery about charts?

If you love storytelling and beautiful visualizations but are not sure how to get there, consider enrolling in our Excel School or 50 ways to Analyze Data programs. In these powerful courses, I teach you all about awesome data analysis and visualization techniques.


22 Responses to “Formula Forensic No 019. Converting uneven Text Strings to Time”

  1. Joe Carsto says:

    Why not let the TIME function take care of the math:
    =TIME(LEFT(TEXT(A1,"000000"),2),MID(TEXT(A1,"000000"),3,2),RIGHT(TEXT(A1,"000000"),2))

    • Ben Niebuhr says:

      I was going to point out the same thing, except to note that using the TIME function and doing the divide method are not interchangeable.

      I have spent hours investigating a spreadsheet working with a couple of years worth of hourly data, and found that the reason things weren't working is because the rounding on the divide method is only close to the correct time values. In order to have it work for comparisons, (like sub-totaling by time value, or pivoting) you MUST use the TIME function.

      Great use of the TEXT function, Hui. I will be using this concept for sure.

  2. Elias says:

    Why not just.

    =TEXT(A1,"00\:00\:00")*1

    Regards

    • Joe Carsto says:

      Elegant!

    • Manick says:

      Hi Elias,

      I tried to use your formula. But, it doesn't seem to work for me. I am getting an error message "The formula you typed contains an error". It seems I have the problem in using \: in the format. How can I overcome this?

      Thanks

      • Greg G says:

        Manick, it isn't the \: that causes the problem. If you copy/paste it, you're getting “'s instead of the actual quotation marks that Excel uses. Fix the quotation marks by deleting them from the pasted formula and retyping them.

      • modeste says:

        Hi Manick...
        use this alternate formula :
        =1*TEXT(A1,"00"":""00"":""00")

        note twice double quote each side of :

  3. Elias says:

    @Manick,

    Did you copy the formula and paste it into Excel, or did you type it? Also, do you use , or ; as the separator of arguments?

    Regards

    • Joe Carsto says:

      @Elias: I had no problem using your formula, in fact, I have used your method to convert a number such as 20120419 to an Excel date using =TEXT(A1,"0000\/00\/00")*1. Thanks for posting.

      • Elias says:

        @Joe: For date conversion you can use this as well.

        =TEXT(A1,"00-00-00")*1

        Regards

        • Joe Carsto says:

          Sweet! It appears this also works with =TEXT(A1,"0-00-00")*1. I come from the old days when you counted every byte. I also like to try and make formulas as small as possible for the fun of it 🙂

  4. Haseen says:

    Elias's suggestion is the simplest, but here is yet another way with TIME and MOD functions...

    =TIME(MOD(A2/10000,100),MOD(A2/100,100),MOD(A2,100))

  5. Since the seconds appear to always be 0, why not simplify the input to minutes and above and save yourself the trouble of typing those zeroes...

    0 => 0:00
    1 => 1:00
    10 => 10:00
    100 => 1:00:00
    etc.

    Then just use this formula...

    =TEXT(A1,"0\:00\:")*1

    • Elias says:

      @ Rick, the numbers to convert are not typed, they are imported. So your formula will return the wrong result.

      Regards.

  6. Hmm! My formula lost some backslash-zero combinations (two of them to be exact). The formula was supposed to be this...

    =TEXT(A1,"0\:00\:\zero\zero")*1

    where the words "zero" should actually be the number 0. Another way to write the formula is this...

    =TEXT(A1,"0\:00\:""00""")*1

  7. Rajagopal says:

    Hi Master,
    While writing the formulae you have considered only upto "seconds factor" . I think you should take the centi-seconds factor also to achieve best results. Please look into it and rectify the problem...?

    For Example.
    In horse racing timings are noted in minute, seconds and centi-seconds, like if a horse finished in 70 seconds over a scurry of 1200 metres, is noted as 1.10 min. Nowadays it is noted in centi-seconds everywhere, like 70.00 if you want to convert it to centi seconds (should multiply by 100) = 7000 centi seconds. If you put this figure into your formula as a general number (7000) it will return as 1:10:00. As per your formula, it should be taken as 1 hour 10 seconds 0 minutes. However for a racing enthusiast like me it can be taken as 1 minute 10 seconds also.

    Just look what happens if we race goers use this figure as 7000 centi seconds in your formulae, it will correctly show as 1 minute 10 seconds(?) Suppose a horse finishing over a 1200m in 70.60 seconds or in racing terms written as 1.10.60 mins, where 1 minute 10 seconds, & 60 centi-seconds can be counted as 7060, if you put this figure in the formula it will return as 1 minute 11 seconds, that is correct.

    My point is if you can incorporate Centi Seconds in the formulae, it would be of great help to us also.

    Thanks and regards.
    Rajagopal (Mumbai)

  8. Vishy says:

    Awesome techniques !

    I tried with 235960 just to see if it will fail but this is great.

  9. CMC says:

    Although a little longer, this works too:

    =CHOOSE(LEN(A2);A2/(24*3600);A2/(24*3600);LEFT(A2;1)/(24*60) + RIGHT(A2;2)/(24*3600);LEFT(A2;2)/(24*60) + RIGHT(A2;2)/(24*3600);LEFT(A2;1)/24 + MID(A2;2;2)/(24*60) + RIGHT(A2;2)/(24*3600);LEFT(A2;2)/24 + MID(A2;3;2)/(24*60) + RIGHT(A2;2)/(24*3600))

  10. Converting uneven Text Strings to Time I have imported some data that comes in as a number that I need to convert to h:mm.

  11. Sudhir Gawade says:

    Just come across this while googling

    find interesting challenge and come up with this 

    =TEXT(TEXT(SUBSTITUTE(A1,RIGHT(A1,1),""),"000000"),"00\:00\:00")

  12. Renee Keel says:

    I need to convert a string of numbers representing average minutes, to reflect correct time values. For example, the numbers below currently represent 5.79 minutes, 15.82 minutes, etc.

    I need to convert these values to their correct corresponding value within time parameters. So 5.79 would be something close to 5 minutes and 45 seconds.

    5.79
    15.82
    3.92
    12.40
    6.70
    3.62

    I know there has to be a way to compute this in Excel, it can do anything, I believe!

    Thank you for any and all assistance~

    • Chandoo says:

      @Renee... You can use a formula like this. Assuming A1 has the minutes.seconds,

      =INT(A1) + MOD(A1, 1)*0.6

      If you want to see it in 5 minutes 45 seconds format, use

      =INT(A1) & " mins " & ROUND(MOD(A1, 1)*0.6,2) & " secs"
