Analyzing half a million consumer complaints [Part 1 of 3]

Posted on February 16th, 2016 in Analytics , Charts and Graphs - 10 comments

How would you analyze data when you have lots of it? That is the inspiration for this series.

analyzing-consumer-complaints-interactive-treemap-th

A little background

We are re-opening enrollments in to the 50 ways to analyze data program in another week (on 24th of February, Wednesday). I want to share the process, techniques and visualizations you can use to analyze any business data with a case study. In this series of articles, let’s analyze fairly complex and large data set to derive insights.

Please join our Analytics Email Newsletter to get all three parts of this series and information about 50 Ways to Analyze Data program.

Let’s meet our data – Finance Industry Consumer Complaints

As part of open data initiatives, US government & Consumer Financial Protection Bureau maintain a list of all consumer complaints made against financial institutions (banks, credit unions etc.) You can download this data from the catalog page here. I have obtained the data on 1st of February, 2016. The download has 513,824 records. Each row contains one complaint.

The data has these columns.

about-the-consumer-complaint-data

Analyzing with a blank slate – how to start?

Here is how we can find things to analyze / discover / visualize / present in similar situations.

Metrics & KPIs

All industries have metrics / KPIs. Can you identify the metrics that matter in a given situation? For example, if you are the CEO of Citi Bank and you want to understand your bank’s performance in the customer service area, you could define below metrics:

  • Complaints per million transactions: A high rate indicates poor customer experience.
  • Average response time: Low response times indicate smooth & well defined processes and empowered support staff.
  • % of disputed resolutions: A high percentage indicates customers are not happy with the solution offered by the bank.

Top x items

Knowing which products / regions / customer groups are making most complaints (or most anything) can help you identify underlying issues and address them. Let’s say you found out that identity theft is a big issue, you could launch a campaign to educate customers about safe banking, offer additional security measures like 2 step verification etc.

Trends

You could look at the trend of complaints (or any other metrics) over time to understand improvement in the performance. Let’s say you want to understand how your safe banking campaign impacted number of complaints made in the identity theft category. A trend analysis will provide the information you want.

Pareto Analysis

This is a variation of top x items. If you are short on resources and want to know which items are easy pickings to improve your customer service. A Pareto analysis on the issues raised by customers can tell you just that.

Distribution

Knowing how things are spread on an axis can help you optimize your processes. For example, if you know that every Wednesday we get a lot of complaints, we can plan our contact center staffing in a such a way to reduce the average waiting time.

Many more…

A blank canvas (in this case, raw data) offers many possibilities. As an analyst, our job is to dig out the information that helps our company succeed.

Ways to analyze this complaint data

As there are many options, I just picked three to showcase what we can do. Feel free to download the data and do your own analysis.

  1. What do they complain about – Part 1
  2. Regional trends – Part 2
  3. My bank vs. your bank in a scorecard – Part 3

What do they complain about?

There are 11 types of products & 48 sub-products spread across 3,496 companies. There are 95 types of issues faced by customers, further categorized in to 68 sub issues. And there is complaint narrative (only available on 50,902 complaints or roughly 10%).

Making sense of all this text can be hard and complex.

This is where pruning comes in to picture. If we can understand just top 25 issues and give user the power to drill down to any product / state / time frame, then users can discover issues that matter most.

Analyzing top 25 issues

The first step is simple. We just make a pivot table from the data (make sure you add the pivot table to data model to capitalize on new Excel features) and drop the issue in rows area and complaint ID in to values area.

We get this:

pivot-table-with-issue-and-count

Now, add below filters

  • Timeline on the Date Received
  • Slicer on Product
  • Slicer on State

Sort the pivot table by descending order of Complaint Count.

Although the sorted pivot table (first 25 rows) provides the answer we want, it is a tough  nut to chew. So let’s jazz it up.

Visualizing top issues – Treemap

For the visualization, let’s use Tree map chart introduced in Excel 2016. If you are using Excel 2013 or earlier, don’t worry. We will learn another way to visualize the data further down this page.

If you select the pivot data and try to insert a treemap chart, you get a message booing you.

error-when-adding-treemap-from-pivot-data

No worries, we will trick Treemap to work with pivot data. Just copy the first 25 rows of the pivot and paste them as link in a blank sheet.

Now that the data is in a regular Excel range, we can create a Treemap from it. Before we do that, let’s just add 26th row to the data, with the issue label [Other], which is the total of any issues outside top 25.

Select all this data and insert a treemap from Insert ribbon.

Polishing the visualization

Let’s move the chart, slicers & timeline to a separate sheet. Just cut and paste them.

Polish the slicers using styles and customization features (disabling the header, adjusting number of columns etc.) For more on slicer customization, read Introduction to slicers.

Adjust timeline granularity to Quarters.

Our polished slicers

slicers-and-time-line-for-interactive-treemap

Set up descriptive chart title & captions by harvesting the data from slicers, timeline & pivot. You may refer to download workbook for the actual formulas. Note, these work only in Excel 2013 or above.

The final visualization looks like this, click on it to enlarge.

analyzing-consumer-complaints-interactive-treemap

Demo of the visualization

Check out the video demo of this interactive visualization & analysis.

Alternative Visualization for Excel 2010

As Treemap is not available in Excel 2013 or earlier, we can use other visualizations like bar charts or conditional formatting heatmaps to visualize this data. Here is an attempt, using bar charts.  Click to enlarge the image.

alternative-visualization-excel-2010

Download Example Workbook

Please click here to download the consumer complaints analysis – part 1 workbook. Play with the visualizations to learn more.

Note: this file is 100 + mb. So give it a few minutes to download.

Next part – Regional trends & analysis

The 2nd part of this case study is now ready. Please click here to read regional trends & analysis of customer complaints data.

How would you analyze the complaint data?

Go ahead and play with this data yourself. How would you analyze it? Please share your ideas, analysis & charts in the comments section. If you wish to publish a chart, email it to chandoo.d@gmail.com with the subject “Half a million complaints”. I will add your chart at the end of this post.

Sign up to receive remaining two parts & more…

If you like this and want to learn more about data analytics, understand how our 50 ways to analyze data program can help you, sign up for our course newsletter. Please click here.

10 Responses to “Analyzing half a million consumer complaints [Part 1 of 3]”

  1. John says:

    The only disappointments foreseen that keep me from enrolling in these courses (at my own personal expense, of course, no way my employer would fund it) are: 1) how long the course material is accessible, considering the price (it IS an online course, after all, so overhead can't be that high) 2) especially considering the price, it's a shame to see how much time I'll spend covering subjects I'm already familiar with, and 3) a lot of the stuff I've been seeing requires Excel later than 2007, which I currently have (sorry, but I got tired of repeatedly forking out hundreds of dollars at every new version, usually to find myself stuck having to relearn the new steps for old functions (the ribbons took a while, and finding macros was tedious, then the custom toolbar Ï had in 2003 for those functions I use all the time was suddenly impossible)

    • John Omohundro says:

      Hi John. Like you, my (now former) employer would not have paid for these courses, and I was stuck at the 2007 version for similar reasons. One solution for upgrading Excel at minimal up-front expense is to get a Microsoft 365 Pro Plus subscription. It's about $12 a month, and you can get the latest updates from now until you decide to stop subscribing.

      Custom toolbars are possible in later versions of Excel, too...this may be worth your while.

      I greatly feared becoming obsolete, and I wanted to learn Power Pivot and Power Query, neither of which exist in 2007. Subscribing (yes, at my own expense) was the best choice. Chandoo's courses have also helped me in my quest to stay up-to-date. Yes, his courses are pricey, but (1) they are great courses, and (2) this is how Chandoo puts food on the table, and I don't blame him for not giving them away. He puts lots of effort into the content, and if I was him I would charge top dollar too.

    • Chandoo says:

      Hi John. Thanks for your comments. We offer full downloads of all lesson videos in our programs. These do not expire. It is something very few online training providers do, but I have made a conscious choice to give the downloadable videos so that users can refer to the content anytime. The online access is valid for an year but you can extend for 1 more by paying a fraction of the fee (usually $50 per 6 months and $100 per year). Although our cost for giving access is low, our costs on admin & tech support are high. As long as you remain a student in the classroom, you can post questions and doubts which we answer.

      Please note that there is no compulsion to join any of our courses. You can read all the free content, download example files, listen to podcasts and watch free videos I post on YouTube without even paying me a penny. The courses enable me to I support my family and work on my mission of making people awesome.

      All the best.

  2. MF says:

    Hi Chandoo,
    FYI. The pivot table is not working if the file is opened with Excel 2010... 🙁
    Here's the error message for your information:
    This workbook contains an Excel data model that is created in a newer version of Excel. You can open this workbook in an older version of Excel, but you will be unable to load or work with PowerPivot when it coexists with a Excel data model.

  3. James P says:

    Chandoo -

    Thanks for this. Gave me a good working example of the cube functions with slicers.

    When I add the data to the data model to use the cube functions to identify slicer selections, the number formatting doesn't carry over to the slicer from the data table. Is that what you have found? Any ideas on why that is? If it's not part of the data model, whatever number format I set in my table shows up in the slicer buttons, no problem.

    Thanks.

  4. Jim says:

    Chandoo,

    Your other readers might be interested to know that I too had an issue with being stuck in the dark ages of Excel 2007 (both at home and at work) until I discovered Office 2013 on eBay. There are plenty of reputable sellers offering valid licence keys taken from scrapped PCs which work well with downloaded versions of the software from Microsoft. At the end of last year I paid £66 (about $100) for 2 licence keys which both activated perfectly. Seems to me a much better deal than $12 per month, indefinitely, for Office 365.

  5. Anthony says:

    How to determine the reputable sellers offering valid licensing keys on eBay? Would like to do this too. I cant follow along with Excel 2010 so I need to upgrade....

  6. Richard says:

    Hi Everyone,

    I've got Office Standard 2016 and when I opened it in Excel 2016 I still got the broken chart message. Not sure if it is a a 2016 update issue, as that is controlled by the Admins and it's only pushed out typically monthly if not a critical issue. I had been copying the steps so I had a working chart with some variances but was able to figure out my problems from looking at the download. Liked this part and moving on to 2 and 3 as my work allows me to work at 'em.

    • Chandoo says:

      Hmm.. I use standalone installation of Office 2016, but I am under the impression that O365 2016 versions are even more up to date (with newer features added frequently), so may be it was something else that caused the error.

  7. Mobilken says:

    Chandoo,
    I'm new to excel Pivot/BI. I have created the pivot table and slicers. what is meant

    Timeline on the Date Received -

    Thank you for your help and GREAT web site ,,,

Leave a Reply