How your country did in Commonwealth Games – Power BI Viz and Tutorial

The 2018 Commonwealth Games ended over the weekend. Let’s take a look at the games data through Power BI to understand how various countries performed.

Here is my viz online (or you can see a snapshot below, click on it to expand).

Looks good, doesn’t it? Well, read on to know how it is put together.

This is a high-level tutorial, aimed at existing Power BI users rather than newbies. If you are new to Power BI, start with my Power BI beginners tutorial.

Commonwealth Games Performance – Power BI Visualization – Tutorial

Step 1: Define goals for your visualization

Whenever you are making anything more than a bar chart (come to think of it, even bar charts need a bit of noodling beforehand), it is prudent to spend time thinking about what you want to accomplish with the visual.

For me the goals are:

  • Understand how various countries performed in 2018, and compare that to previous editions of the games (say 2014, 2010 and 2006)
  • See which countries have improved their medal performance since the last games
  • Understand how top 10 countries performed – which events they excel in
  • Prepare everything in less than 2 hours

I made a rough sketch of the visualization too. But I deviated quickly once I started playing with the data in Power BI.

Step 2: Gather the data

The data for this visualization came from 2 sources:

  • gc2018.com for 2018 games data
    • https://results.gc2018.com/en/all-sports/medal-standings.htm
    • https://results.gc2018.com/en/all-sports/medallist-by-sport-<country name>.htm
  • thecgf.com for previous games data
    • https://thecgf.com/results/games/3052 for 2014
    • https://thecgf.com/results/games/3046 for 2010
    • https://thecgf.com/results/games/3026 for 2006 medals

I mashed up most of the data in Power Query, but had to use a bit of Python (more on this in a future blog post) for the medallist-by-sport page (https://results.gc2018.com/en/all-sports/medallist-by-sport-<country name>.htm). That page has weird formatting: each event name is an A tag followed by a table of medallists, and this was too much to process in PQ.
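For a rough idea of what parsing that kind of layout can look like, here is a minimal sketch using only Python's standard library `html.parser`. It is not the script used for this viz. The sample HTML is made up to mimic the "A tag followed by a table" pattern, and the names `MedalListParser` and `parse_medallists`, as well as the two-cells-per-row assumption, are hypothetical. The real page's markup may well differ.

```python
from html.parser import HTMLParser

# Made-up sample that mimics the described layout: an <a> tag holding the
# event name, followed by a table of medallists. The real page may differ.
SAMPLE_HTML = """
<a href="#">100m Sprint</a>
<table>
  <tr><td>Gold</td><td>Athlete A</td></tr>
  <tr><td>Silver</td><td>Athlete B</td></tr>
</table>
<a href="#">Long Jump</a>
<table>
  <tr><td>Gold</td><td>Athlete C</td></tr>
</table>
"""


class MedalListParser(HTMLParser):
    """Collects (event, medal, athlete) rows from the layout above."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished (event, medal, athlete) tuples
        self.event = None      # most recent event name seen in an <a> tag
        self.in_link = False   # True while inside <a>...</a>
        self.in_cell = False   # True while inside <td>...</td>
        self.cells = []        # cell texts of the table row being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.event = ""
        elif tag == "tr":
            self.cells = []
        elif tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
            self.event = self.event.strip()
        elif tag == "td":
            self.in_cell = False
        elif tag == "tr" and len(self.cells) == 2:
            # Assumption: each row is exactly (medal, athlete).
            medal, athlete = self.cells
            self.rows.append((self.event, medal.strip(), athlete.strip()))

    def handle_data(self, data):
        if self.in_link:
            self.event += data
        elif self.in_cell:
            self.cells[-1] += data


def parse_medallists(html):
    """Return a flat list of (event, medal, athlete) tuples."""
    parser = MedalListParser()
    parser.feed(html)
    return parser.rows
```

The point of the sketch is the flattening: the event name is remembered when the A tag is seen and then stamped onto every table row that follows, which gives one tidy row per medal that Power Query can load directly.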

Step 3: Set up the data model

After gathering all the data in PQ, we need to bring only the relevant tables into the Power BI model. I brought in the tables below:

  • medals – with medal tables for the current (2018) and previous three editions of the Commonwealth Games
  • top 10 countries – event-level medal data for the top 10 countries in 2018
  • Countries – generated table with the top 10 country names and their 3-letter abbreviations
  • medal types – typed-in table with URLs for medal images and a custom sort order of Gold, Silver and Bronze

Step 4: Create measures

Since one of the goals for this visual is to keep everything under 2 hours, I created only basic measures.

  • Medal Count = sum(medals[Medals])
  • Medal Count for 2014 = CALCULATE([Medal Count], 'medals'[Games] IN { "2014" })
  • Medal Count for 2018 = CALCULATE([Medal Count], 'medals'[Games] IN { "2018" })
  • Medal Count (all) = CALCULATE([Medal Count], all(medals[Games]))
  • Country Name = SELECTEDVALUE(medals[Country]) for showing in tooltip & chart header
  • % increase - 2014 to 2018 = DIVIDE([Medal Count for 2018]-[Medal Count for 2014], [Medal Count for 2014], 0) for showing in tooltip
  • medal count - top 10 = countrows('top 10 countries')
  • total medal count for country = CALCULATE([medal count - top 10], all('top 10 countries'[Event]))
  • medal % = [medal count - top 10] / [total medal count for country]

As you can see, these are basic arithmetic or simple CALCULATE measures. I used the excellent quick measure feature to create the Medal Count for 2014 measure and learned about the IN keyword along the way. #awesome
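If it helps to see the arithmetic outside DAX, here is a rough Python analogue of the Medal Count and % increase measures. It is only a sketch: the rows in `MEDALS` are made up (not real results), and the function names are hypothetical. The `divide` helper imitates only the zero-denominator fallback of the `DIVIDE(..., 0)` call above.

```python
# Made-up medal rows (games, country, medals); not real results.
MEDALS = [
    ("2014", "Country A", 10),
    ("2018", "Country A", 15),
    ("2014", "Country B", 8),
    ("2018", "Country B", 6),
    ("2018", "Country C", 4),  # no 2014 row at all
]


def medal_count(rows, country, games=None):
    """Sum medals for a country, optionally restricted to a set of games."""
    return sum(
        medals
        for g, c, medals in rows
        if c == country and (games is None or g in games)
    )


def divide(numerator, denominator, alternate=0):
    """Return the alternate result instead of failing on a zero denominator."""
    return alternate if denominator == 0 else numerator / denominator


def pct_increase_2014_to_2018(rows, country):
    """(2018 count - 2014 count) / 2014 count, with 0 as the fallback."""
    count_2014 = medal_count(rows, country, {"2014"})
    count_2018 = medal_count(rows, country, {"2018"})
    return divide(count_2018 - count_2014, count_2014, 0)
```

Note how a country with no 2014 medals gets 0 rather than an error, which is why the fallback argument matters for the tooltip.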

Step 5: Create visuals

Visual for exploring medal performance by country

I started with a simple slicer on games year and a matrix visual with country in rows, medal type in columns and medal count in values. Then I added data bars to the medal count.

Visual for exploring change over time:

Then I added a ribbon chart with Games, Medal Type and Medal Count to see how total medals have changed over time. When you pick a country from the matrix, this visual updates to show how that country’s performance changed over time.

Visual for seeing which countries improved in 2018:

I added a scatter chart with Country as legend, Medal Count for 2018 as X and Medal Count for 2014 as Y. Then I added symmetry shading to this chart from the analytics pane. Voilà, we can see which countries did better or worse in this round compared to 2014.
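The reading of that chart boils down to comparing each point against the diagonal where X equals Y. With 2018 on X and 2014 on Y, a point below the diagonal means more medals in 2018 than in 2014. A minimal sketch of that comparison, with made-up numbers and hypothetical function names:

```python
def classify_change(count_2018, count_2014):
    """Compare a point (X=2018, Y=2014) against the diagonal X = Y."""
    if count_2018 > count_2014:
        return "improved"   # below the diagonal
    if count_2018 < count_2014:
        return "declined"   # above the diagonal
    return "unchanged"      # exactly on the diagonal


def summarize(points):
    """points maps country -> (count_2018, count_2014), i.e. (X, Y)."""
    return {
        country: classify_change(x, y)
        for country, (x, y) in points.items()
    }


# Made-up sample points; not real medal counts.
SAMPLE_POINTS = {
    "Country A": (15, 10),
    "Country B": (6, 8),
    "Country C": (4, 4),
}
```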

Visual for tool tip

I inserted a new page (called Country Medals), changed the format to Tooltip and added a few visuals to make it a tool tip for the scatter chart.

Setting up tooltips is still painful, but this is a new feature, so I am sure MS will add more teeth to this power.

Linking scatter chart and tooltip

Select the scatter chart and, from the Format pane, set the tooltip to a report page and select the Country Medals page.

Visual for seeing where top 10 countries excel

I added another matrix visual with Event in rows, abbreviated country name in columns and medal % in values. Then I added conditional formatting > Background color scales to spot bigger numbers easily.

This visual and the scatter plot are then linked to a slicer on medal type (Gold / Silver / Bronze) so you can see event performance and change over time for any type of medal.
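To make the medal % calculation concrete, here is a small Python sketch of the same idea: count medals per country and event, then divide by that country's total across all events. It assumes one row per medal in the top 10 countries table (which is what a COUNTROWS-based measure implies). The rows and the `medal_share` name are made up for illustration, and the optional `medal_type` argument stands in for the medal type slicer.

```python
from collections import Counter

# Made-up rows (country, event, medal), one row per medal; not real results.
TOP10_ROWS = [
    ("AAA", "Swimming", "Gold"),
    ("AAA", "Swimming", "Silver"),
    ("AAA", "Athletics", "Gold"),
    ("AAA", "Cycling", "Bronze"),
    ("BBB", "Athletics", "Gold"),
    ("BBB", "Athletics", "Bronze"),
]


def medal_share(rows, medal_type=None):
    """Share of each country's medals won in each event.

    If medal_type is given, only rows of that type are counted, in both
    the numerator and the denominator.
    """
    if medal_type is not None:
        rows = [r for r in rows if r[2] == medal_type]
    per_cell = Counter((country, event) for country, event, _ in rows)
    per_country = Counter(country for country, _, _ in rows)
    return {
        (country, event): count / per_country[country]
        for (country, event), count in per_cell.items()
    }
```

Because the shares are computed per country, each column of the matrix adds up to 100%, so the color scale highlights where a country is concentrated rather than which country won the most medals overall.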

Formatting the visuals

The default colors for visuals use the Power BI color scheme. I changed the colors to match the medals (gold, silver and bronze) so that they are easy to spot. Unfortunately, this does not sync across visuals, so we have to format each visual separately (well, only two: the ribbon chart and the bar chart on the tooltip page).

Download Commonwealth games Power BI Viz

Click here to download the workbook. Examine the query definitions (especially top 10 countries) to learn some quirky ways to work with Power Query. Enable interactions from view ribbon to see how each visual interacts with others. Play with it and mash up your own data to create something equally awesome. If you end up making another viz from this data, feel free to post it in the comments section so we all can see and learn from you.

 

Want to learn Power BI? Check out Power BI Play Date

If you like what you have seen and want to learn how to build such cool visualizations and reports for your work, sign up for my Power BI Play Date. We are opening the next batch very soon.

 

 

