• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Pivot Table - count if within range - label values, or...?

k1s

Member
Hi

I'm struggling to create a pivot table that can reproduce the layout in the attached.

Clipboard01.jpg


I've got 6 types of test scores that I want to analyse with lots of permuations, like gender, test type, first language of the pupil, etc, so I want to be able to use slicers to make it easier to view subsets of results.

I've been trying to use label filters in a pivot table to create % of pupils making each grade-band. I can see that it can produce columns of each score greater than or less than a value, but I can't work out how to get it to count within a range or sum the columns greater than or less than. I could create calculated columns in my data table, but there would be 180 of them given the number of tests, scores and re-tests that there are in the actual data, so I'd rather have the calculations done in the analysis table.

Is it possible?

Would greatly appreciate some expert input. Thanks
 

Attachments

  • Pivot like this.xlsx
    17.9 KB · Views: 3
First, stick test score into Row Label field.

Right click and choose Group.

Un-check auto and enter like below screen shot.
upload_2016-12-20_7-36-46.png

This will generate 3 groupings. <100, 100-109, & >110 (actually it's >=110).

Now change the label for each and move to Column field.

Add Test score to Values field as "Count of", and set "Show Value As" to % of Grand total.

See attached.

Edit: If you need more than 3 groupings... you'll need to group it manually, leverage PowerQuery (utilizing custom bucket function, outlined in link below).
http://chandoo.org/forum/threads/power-query-without-using-if-function.31541/#post-187511

Or add calculated column to the source data table.
 

Attachments

  • Pivot like this.xlsx
    23.9 KB · Views: 9
Last edited:
It works with 2010 (free add-in) and 2016 (comes standard).

For 2013, you'll need specific edition (ProPlus, Standalone Excel etc).

Unfortunately it's not available for older versions.
 
If you need more than 3 groupings... you'll need to group it manually, leverage PowerQuery...
Or add calculated column to the source data table.

Well, first time using power query and the link that you linked to was certainly very helpful.

If I've understood correctly, I've effectively set-up a copy of my data table somewhere in the background (Power Query land) in which I've added a column/field that contains the name of the band that the score falls into by querying the new function.

One question though: Is this any more efficient than adding a calculated column to the orginal table?

My issue is that if I need to add a helper column for each set of scores, that'll be 180 helper columns. Does it make any difference if it's in the main source table or the power query table?
 

Attachments

  • Pivot like this using power query function for banding.xlsx
    185.8 KB · Views: 2
@k1s You can use Power Pivot to set up such dynamic calculations. Or here is another option.
Set up a new table that lists all possible scores (0 thru 200) in one column and band in the next column.
then set up a relationship between your raw data and the grade mapping table. Now you can set up grade as column label in the pivot.

You can do all this with Excel 2013 or above pivot tables. Just use the relationships feature.

Read this: http://chandoo.org/wp/2013/07/01/introduction-to-excel-2013-data-model-relationships/
 
@r2c2

Thanks for the link. Reading through it, it sounds easy,

I'd have a Table_3 with a column of [Possible Scores] (actually70-140), and another column [Bands] with say Band 1 for every row between 70 & 79, etc. but...

...set up a relationship between your raw data and the grade mapping table...

How would the relationship work? What would be the related columns ("Foreign" & "Primary"). Doesn't the each table have to have an identical column (field and with unique values down the column)?
 
My issue is that if I need to add a helper column for each set of scores, that'll be 180 helper columns.

Not sure I understand you. Can you demonstrate why you would need 180 helper columns? In most situation, bucket function is used to assign every possible score to as many bucket as needed, and need single custom column only.

I also noticed that you intermixed PowerQuery's data model and source table in your Pivot. That's something you should avoid (as you would have noticed, it would give wrong result).

See attached for updated PivotTable.

FYI - You can replace ">=120" etc with text representation (i.e. "A").
 

Attachments

  • Pivot like this using power query function for banding.xlsx
    178.2 KB · Views: 1
@r2c2

Thanks for the link. Reading through it, it sounds easy,

I'd have a Table_3 with a column of [Possible Scores] (actually70-140), and another column [Bands] with say Band 1 for every row between 70 & 79, etc. but...



How would the relationship work? What would be the related columns ("Foreign" & "Primary"). Doesn't the each table have to have an identical column (field and with unique values down the column)?

Select Table3[Possible Scores] as primary column and map it to the score column in your original data set.
 
...why you would need 180 helper columns? In most situation, bucket function is used to assign every possible score to as many bucket as needed, and need single custom column only.

I actually have Table of many pupils test data containing results for 180 different tests (over several years and subjects), so if I understood correctly I'd need to add a caculated helper column - either in the main data table or the background powerquery data model version for each of those tests - e.g. [Year 1 Spring Maths score] would need a helper column indicating which band the score was in then [Year 1 Summer Maths score] would need anotehr column, etc.?

Select Table3[Possible Scores] as primary column and map it to the score column in your original data set.
I think this will have cracked it. Now I need to see if I can make it work with the main data

Many thanks to all of you - I learned a lot

Clipboard02.jpg
 

Attachments

  • Pivot like this using data relationship banding.xlsx
    149.6 KB · Views: 3
e.g. [Year 1 Spring Maths score] would need a helper column indicating which band the score was in then [Year 1 Summer Maths score] would need anotehr column, etc.?

Be it bucket function or data relationship tables, it does the same thing (see the link I gave you, both options are mentioned in it).

As long as there's single score column, you only need 1 table or helper column for buckets.

If there's more than one score column... well, you will need multiple helper column (be it generated via data relationship table or bucket function).

I'd suggest uploading what your table actually looks like with enough sample.
 
Be it bucket function or data relationship tables, it does the same thing ...more than one score column...will need multiple helper column (be it generated via data relationship table or bucket function).

Won't it be that I have to set-up a relationship for each score column, rather than new columns, if I go the data relationships route? Then if I understood correctly, I need to drag a single banding column into the Pivot Table (whichis what I did in the previosly attached)?

Whereas if I go the powerquery function route, I'd have to create a helper column for each score column.

I'd suggest uploading what your table actually looks like with enough sample.
It's quite complicated. I have 760 pupil records, with a unique identifier plus >200 fields, 180 of which are test scores, the others (gender, year group, etc.) will be the filtering factors for the pivot tables.

90 of Table fields/columns contain scores between 1.00 and 3.00. The buckets for these are 1-1.84, 1.85-2.44 and 2.45-3
70 contain scores between 80 and 120. The buckets for these are 70-79,80-89,90-99,100-109, 110-114 and 115-120
10 contain scores between 85 and 115. The buckets for these are 80-89,90-99,100-109, 110-114 and 115-120
10 contain scores between 80 and 120. The buckets for these are 85-94,95-104, and 105-115.
 
Hi Guys, Happy New Year.

I've been trying to extend the relationships method to a second column (because I have scores for many tests in the same table) but I'm running into problems.

When I try to create a second relationship Excel says "This relationship would create a duplicate path so it will be created as inactive".

Clipboard04.jpg


What am I doing wrong?
 

Attachments

  • Pivot like this using data relationship banding.xlsx
    162 KB · Views: 0
The safest and simplest option is to restructure your score data to below layout

Gender - Test - Score - Unique Code

This way, you have only score column and the relationship won't be ambiguous.

If you can't restructure the data, then I suggest using Power Pivot to define the relationships. That way, you can use the excellent USERELATIONSHIP() DAX formula to turn on / off the inactive relationships to calculate the measure values.

If this sounds complicated, so go back to my suggestion and restructure your data.
 
When I try to create a second relationship Excel says "This relationship would create a duplicate path so it will be created as inactive".

This is the reason why I've recommended unpivoting Test columns. Since relationship is based on scores and not on "Unique Pupil ID", this causes issue.
 
The safest and simplest option is to restructure your score data to below layout:
Gender - Test - Score - Unique Code
My main data is 1 record per pupil with 20 or so contextual fields (gender, etc.) and ~180 sets of test scores. so structuring this way is not really feasible

If you can't restructure the data, then I suggest using Power Pivot to define the relationships. That way, you can use the excellent USERELATIONSHIP() DAX formula to turn on / off the inactive relationships to calculate the measure values. If this sounds complicated, so go back to my suggestion and restructure your data

This is the reason why I've recommended unpivoting Test columns. Since relationship is based on scores and not on "Unique Pupil ID", this causes issue

I've upgraded to a version of Excel with PowerPivot, and I'm reading up on how to create relationships... Will post back when I've either worked out how to do it or get stuck! Thanks again
 
Back
Top