Pivot Table - count if within range - label values, or...?

k1s · Dec 20, 2016

Hi

I'm struggling to create a pivot table that can reproduce the layout in the attached.

I've got 6 types of test scores that I want to analyse with lots of permuations, like gender, test type, first language of the pupil, etc, so I want to be able to use slicers to make it easier to view subsets of results.

I've been trying to use label filters in a pivot table to create % of pupils making each grade-band. I can see that it can produce columns of each score greater than or less than a value, but I can't work out how to get it to count within a range or sum the columns greater than or less than. I could create calculated columns in my data table, but there would be 180 of them given the number of tests, scores and re-tests that there are in the actual data, so I'd rather have the calculations done in the analysis table.

Is it possible?

Would greatly appreciate some expert input. Thanks

Chihiro · Dec 20, 2016

First, stick test score into Row Label field.

Right click and choose Group.

Un-check auto and enter like below screen shot.

This will generate 3 groupings. <100, 100-109, & >110 (actually it's >=110).

Now change the label for each and move to Column field.

Add Test score to Values field as "Count of", and set "Show Value As" to % of Grand total.

See attached.

Edit: If you need more than 3 groupings... you'll need to group it manually, leverage PowerQuery (utilizing custom bucket function, outlined in link below).
http://chandoo.org/forum/threads/power-query-without-using-if-function.31541/#post-187511

Or add calculated column to the source data table.

k1s · Dec 20, 2016

Chihiro said:
First, stick test score into Row Label field....

Edit: If you need more than 3 groupings... you'll need to group it manually, leverage PowerQuery (utilizing custom bucket function, outlined in link below).
http://chandoo.org/forum/threads/power-query-without-using-if-function.31541/#post-187511
.

Many thanks. I'll open the attachment even i get back to my desk. Is power query backwardly compatible with older versions of excel?

Chihiro · Dec 20, 2016

It works with 2010 (free add-in) and 2016 (comes standard).

For 2013, you'll need specific edition (ProPlus, Standalone Excel etc).

Unfortunately it's not available for older versions.

k1s · Dec 20, 2016

Chihiro said:
If you need more than 3 groupings... you'll need to group it manually, leverage PowerQuery...
Or add calculated column to the source data table.

Well, first time using power query and the link that you linked to was certainly very helpful.

If I've understood correctly, I've effectively set-up a copy of my data table somewhere in the background (Power Query land) in which I've added a column/field that contains the name of the band that the score falls into by querying the new function.

One question though: Is this any more efficient than adding a calculated column to the orginal table?

My issue is that if I need to add a helper column for each set of scores, that'll be 180 helper columns. Does it make any difference if it's in the main source table or the power query table?

r2c2 · Dec 20, 2016

@k1s You can use Power Pivot to set up such dynamic calculations. Or here is another option.
Set up a new table that lists all possible scores (0 thru 200) in one column and band in the next column.
then set up a relationship between your raw data and the grade mapping table. Now you can set up grade as column label in the pivot.

You can do all this with Excel 2013 or above pivot tables. Just use the relationships feature.

Read this: http://chandoo.org/wp/2013/07/01/introduction-to-excel-2013-data-model-relationships/

k1s · Dec 20, 2016

@r2c2

Thanks for the link. Reading through it, it sounds easy,

I'd have a Table_3 with a column of [Possible Scores] (actually70-140), and another column [Bands] with say Band 1 for every row between 70 & 79, etc. but...

...set up a relationship between your raw data and the grade mapping table...

How would the relationship work? What would be the related columns ("Foreign" & "Primary"). Doesn't the each table have to have an identical column (field and with unique values down the column)?

Chihiro · Dec 20, 2016

My issue is that if I need to add a helper column for each set of scores, that'll be 180 helper columns.

Not sure I understand you. Can you demonstrate why you would need 180 helper columns? In most situation, bucket function is used to assign every possible score to as many bucket as needed, and need single custom column only.

I also noticed that you intermixed PowerQuery's data model and source table in your Pivot. That's something you should avoid (as you would have noticed, it would give wrong result).

See attached for updated PivotTable.

FYI - You can replace ">=120" etc with text representation (i.e. "A").

r2c2 · Dec 20, 2016

k1s said:
@r2c2

Thanks for the link. Reading through it, it sounds easy,

I'd have a Table_3 with a column of [Possible Scores] (actually70-140), and another column [Bands] with say Band 1 for every row between 70 & 79, etc. but...

How would the relationship work? What would be the related columns ("Foreign" & "Primary"). Doesn't the each table have to have an identical column (field and with unique values down the column)?

Select Table3[Possible Scores] as primary column and map it to the score column in your original data set.

k1s · Dec 21, 2016

Chihiro said:
...why you would need 180 helper columns? In most situation, bucket function is used to assign every possible score to as many bucket as needed, and need single custom column only.

I actually have Table of many pupils test data containing results for 180 different tests (over several years and subjects), so if I understood correctly I'd need to add a caculated helper column - either in the main data table or the background powerquery data model version for each of those tests - e.g. [Year 1 Spring Maths score] would need a helper column indicating which band the score was in then [Year 1 Summer Maths score] would need anotehr column, etc.?

r2c2 said:
Select Table3[Possible Scores] as primary column and map it to the score column in your original data set.

I think this will have cracked it. Now I need to see if I can make it work with the main data

Many thanks to all of you - I learned a lot

Chihiro · Dec 21, 2016

e.g. [Year 1 Spring Maths score] would need a helper column indicating which band the score was in then [Year 1 Summer Maths score] would need anotehr column, etc.?

Be it bucket function or data relationship tables, it does the same thing (see the link I gave you, both options are mentioned in it).

As long as there's single score column, you only need 1 table or helper column for buckets.

If there's more than one score column... well, you will need multiple helper column (be it generated via data relationship table or bucket function).

I'd suggest uploading what your table actually looks like with enough sample.

k1s · Dec 21, 2016

Chihiro said:
Be it bucket function or data relationship tables, it does the same thing ...more than one score column...will need multiple helper column (be it generated via data relationship table or bucket function).

Won't it be that I have to set-up a relationship for each score column, rather than new columns, if I go the data relationships route? Then if I understood correctly, I need to drag a single banding column into the Pivot Table (whichis what I did in the previosly attached)?

Whereas if I go the powerquery function route, I'd have to create a helper column for each score column.

Chihiro said:
I'd suggest uploading what your table actually looks like with enough sample.

It's quite complicated. I have 760 pupil records, with a unique identifier plus >200 fields, 180 of which are test scores, the others (gender, year group, etc.) will be the filtering factors for the pivot tables.

90 of Table fields/columns contain scores between 1.00 and 3.00. The buckets for these are 1-1.84, 1.85-2.44 and 2.45-3
70 contain scores between 80 and 120. The buckets for these are 70-79,80-89,90-99,100-109, 110-114 and 115-120
10 contain scores between 85 and 115. The buckets for these are 80-89,90-99,100-109, 110-114 and 115-120
10 contain scores between 80 and 120. The buckets for these are 85-94,95-104, and 105-115.

Chihiro · Dec 21, 2016

Code:

Whereas if I go the powerquery function route, I'd have to create a helper column for each score column.

You'd do unpivot operation in PowerQuery, giving you single score column. This has added benefit of giving you flat table structure, which will give you more flexibility in Pivot Table.

See links for detail on how.
https://support.office.com/en-us/ar...er-Query-0f7bad4b-9ea1-49c1-9d95-f588221c7098
https://wessexbi.wordpress.com/2014/02/27/unpivot-nested-headings-with-power-query/

k1s · Jan 9, 2017

Hi Guys, Happy New Year.

I've been trying to extend the relationships method to a second column (because I have scores for many tests in the same table) but I'm running into problems.

When I try to create a second relationship Excel says "This relationship would create a duplicate path so it will be created as inactive".

What am I doing wrong?

r2c2 · Jan 9, 2017

The safest and simplest option is to restructure your score data to below layout

Gender - Test - Score - Unique Code

This way, you have only score column and the relationship won't be ambiguous.

If you can't restructure the data, then I suggest using Power Pivot to define the relationships. That way, you can use the excellent USERELATIONSHIP() DAX formula to turn on / off the inactive relationships to calculate the measure values.

If this sounds complicated, so go back to my suggestion and restructure your data.

Chihiro · Jan 9, 2017

When I try to create a second relationship Excel says "This relationship would create a duplicate path so it will be created as inactive".

This is the reason why I've recommended unpivoting Test columns. Since relationship is based on scores and not on "Unique Pupil ID", this causes issue.

k1s · Jan 10, 2017

The safest and simplest option is to restructure your score data to below layout:
Gender - Test - Score - Unique Code

My main data is 1 record per pupil with 20 or so contextual fields (gender, etc.) and ~180 sets of test scores. so structuring this way is not really feasible

If you can't restructure the data, then I suggest using Power Pivot to define the relationships. That way, you can use the excellent USERELATIONSHIP() DAX formula to turn on / off the inactive relationships to calculate the measure values. If this sounds complicated, so go back to my suggestion and restructure your data

This is the reason why I've recommended unpivoting Test columns. Since relationship is based on scores and not on "Unique Pupil ID", this causes issue

I've upgraded to a version of Excel with PowerPivot, and I'm reading up on how to create relationships... Will post back when I've either worked out how to do it or get stuck! Thanks again

Pivot Table - count if within range - label values, or...?

Member

Attachments

Excel Ninja

Attachments

Member

Excel Ninja

Member

Attachments

Active Member

Member

Excel Ninja

Attachments

Active Member

Member

Attachments

Excel Ninja

Member

Excel Ninja

Member

Attachments

Active Member

Excel Ninja

Member