• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Frequency Distribution of Text Data

Dr. Debbie

New Member
I have a list of text data, (AMI, PNA, CHF, Pedi, Med, Surg), that I need to do a frequency distribution on. The list contains about 2000 entries, with about 200 different "choices" in the list. From what I understand, with the "frequency" formula, I have to create a "bin" with the choices that I want to get info on. My question is this: There are over 200 different choices; do I have to type each one individually into a bin (please say "no"!)? Is there an easier way to find out that there were 150 AMI, 200 CHF, 3 PNA?


Thanks, Dr. Debbie
 
Dr. Debbie

Without knowing the format of your data I would suggest creating a Pivot Table from the data, doing a simple count or sum of an appropriate field

That will both create a unique list as well as doing the count.

Then disconnect the pivot table from its source by Copying the Pivot table and Pasting the Pivot Data as values

That should be a good start
 
If it were me I would approach this one of two ways:


#1: Pull a list of unique values, and then use CountIF or CountIFS to simply count the number of matches. This is essentially the same as a frequency distribution and just as accurate (and in this case a lot faster). If you have more than 200 different categories, this might prove difficult to manage (although me myself, I'd always want to see the full data set and then parse), however:


#2 If your values have some specific relationships or groupings that you can identify, you could write an IF formula in a helper column to assign a number or super-category to that item. In other words use the super categories as your bins, and then run your frequency formula (or better yet CountIF(S))
 
Back
Top