Search

Analyzing half a million customer complaints – Regional Trends [Part 2 of 3]

Share

This is part two of our three part series on how to analyze half a million customer complaints. Read part 1 here.

This is the second part of 3 part series on this theme. Please use below links to access other parts.

1. What do they complain about – Part 1
2. Regional trends – Part 2
3. My bank vs. your bank in a scorecard – Part 3

Analyzing Regional Trends

As introduced in part 1, our complaints dataset has geographical information too. We know the state & zip code for each complaint. Please note that zip codes are partial or missing for a 10% of the data.

1. Regional trends by state, product & issue
2. Complaints per million by state
3. Complaints by zip code

1. Regional trends by state, product & issue

First, we need to identify the goals for this analysis. You can define any number of goals for this data set. Let’s start with these goals,

• Understand yearly trend by state between 2012 & 2015 (we don’t have full data for 2011 or 2016)
• Explore 2015 monthly trend to understand patterns
• Understand top 3 issues bothering the customers
• Look at the trends at whole or by any individual product
• Don’t let bigger states hijack the report
• Rearrange the report so we can focus on states that require most attention.

Feel free to add your own goals and analyze data to fulfill them.

Mockup for our visual analysis

Given the geographical nature of this data, we would be tempted to go with a maps based visualization. Unfortunately, maps tend to distort the data, as bigger states get more space. So for this analysis, let’s stick with good old tables, lines & dots.

Here is a mockup of the analysis we are trying to achieve.

From raw data to analysis – the journey

I will be honest here. The journey from raw data with 500,000 rows to the analysis we need is not an easy one. As I wanted to analyze this data with tools & features native to Excel 2010 (so that maximum readers can enjoy this), I had to say no to Power Pivot, VBA, Power Maps, Timelines etc.

Using formulas like SUMIFS, INDEX, MATCH, VLOOKUP or SUMPRODUCT is also a no-no as we have too much data. So what to do?

Here is the process:

• Remove un-necessary columns from the raw data. We end up with just 7 columns (from 16)
• Create pivot tables from raw data
• Pivot 1 – Complaint count by Year, Month, State, Product
• Pivot 2 – Complaint count by Product, Issue, State for the year 2015
• Pivot 3 – Complaint count by Issue, State for the year 2015
• Build calculations that talk to pivot tables to fetch the numbers we want
• Define named ranges for the pivot table data, row & column labels so that we can easily access them in our formulas
• Get the numbers we need using GETPIVOTDATA, INDEX & SUMIFS
• Define form controls to capture user choices for product, sort order & indexed data
• Sort data by given criteria using LARGE formula
• Plug the calculations to output worksheet.
• Set up a scrollbar form control to show 12 states at a time, as we have limited space on the screens.
• Show trend using sparklines
• Show alert icon for states with >20% increase in complaint volume using conditional formatting.
• Tidy up the output

Regional trend analysis – final output

Here is the final output of the regional analysis report.

2. Complaints per million by state

Of course, as the data has geographical component, I am tempted to try some map based visualizations to analyze it. So for this & next type of analysis, I have used new Excel 2016 feature – 3D Maps (known as Power Maps in earlier versions).

If we chart the complaint data directly, bigger states (like CA, FL etc.) will hijack the map. So I downloaded 2015 population estimates (from Wikipedia) and combined this information to calculate a new metric complaints per million.

I created two map layers – one with complaints per million in 2015 & another to mark states with above average complaint rate.

Check out the visualization below. For more, download the 3D maps workbook.

3. Complaints by zip code

While many of us think zip codes are actual geographical regions (ie polygons), they are not. They are simply postal routes. So gathering population by zip code is hard. Fortunately, I found a Population by zipcode – 2010 – dataset here. It is for year 2010, but the data is a good proxy for our analysis.

In our dataset, zip code data is not available for roughly 10% records. So I eliminated them and analyzed the rest. When calculating the complaints per thousand metric, I noticed another anomaly. Certain zipcodes have very little population (~ 3,000 zipcodes have less than 500 people). So if they have even 1 complaint in 2015, their CPT (complaints per thousand) will be very high.

Here is the first visualization – Complaints per thousand by Zipcode

To avoid the too little population problem, I have applied a filter in 3D maps so that if a ZIP code has less than 10 complaints, we don’t show them on the map.

Here is the second visualization

Quick demo of the visualizations

In case you are unable to download the files or want to just peek at these awesome visualizations, see below video.

You can also see this on our YouTube Channel.

How would you analyze the complaint data?

Go ahead and play with this data yourself. How would you analyze it? Please share your ideas, analysis & charts in the comments section. If you wish to publish a chart, email it to chandoo.d@gmail.com with the subject “Half a million complaints”. I will add your chart at the end of this post.

I cover this topic and 50 other case studies in my online class 50 ways to analyze data. If you are keen to learn advanced analytics and data science thru Excel, I highly recommend 50 ways course. Check it out here.

For something less intense, but still as much fun and detailed, check out Excel School program. This course teaches intermediate to advanced Excel with specific focus on Dashboard Reporting. Check out Excel School here.

Welcome to Chandoo.org

Thank you so much for visiting. My aim is to make you awesome in Excel & Power BI. I do this by sharing videos, tips, examples and downloads on this website. There are more than 1,000 pages with all things Excel, Power BI, Dashboards & VBA here. Go ahead and spend few minutes to be AWESOME.

Read my storyFREE Excel tips book

Chandoo is an awesome teacher
5/5

– Jason

From simple to complex, there is a formula for every occasion. Check out the list now.

Calendars, invoices, trackers and much more. All free, fun and fantastic.

Still on fence about Power BI? In this getting started guide, learn what is Power BI, how to get it and how to create your first report from scratch.

How to compare two Excel sheets using VLOOKUP? [FREE Template]

You are the boss of ACME Inc. And one day, both of your accounts receivables team members Sara and James come to you with two versions of the customer payment data. How do you compare these two Excel sheets and reconcile the data? In this article, let me explain the step by step process.

3 Responses to “Analyzing half a million customer complaints – Regional Trends [Part 2 of 3]”

1. Mehmet Gunal OLCER says:

The size of the file named as "consumer-complaints-3dmap.xlsx" is 66.1 MB. It is too big. Isn't it?
Would you like to shrink the size to 55.9 MB. Saving more than 15% is possible by saving the same file with XLSB extention instead of XLSX extention.

2. Daniel says:

Hi, I have Excel 2013 Prof Plus, but I dont find Power Maps under the Com Add Ins - any idea? Thanks and best regards, Daniel

Get FREE Excel & Power-BI Newsletter

One email per week with Excel and Power BI goodness. Join 100,000+ others and get it free.