Need help getting frequency of values in a huge asymmetrical range/ table

sauronbaggins

New Member
Hi guys,

I have tried googling and the things that I know. However, I am not getting the solution. I have an asymmetrical data set of names. It has approx 20000 rows. Column A is never empty for any of the rows. The farthest column that holds a value in it is "HZ" (not all rows have values until HZ, meaning a LOT of the columns are empty. I made it a "Table" and tried to summarize with a pivot table but that doesn't work. What's the best way to find frequency of all the names in such a huge range?

To give you an idea of how the data looks, please see this table.

GameColumn 1Column 2Column 3
Tennis​
Chess​
Golf​
Tennis​
Baseball​
Table Tennis​
Tennis​
Golf
Hockey

The output I want is this:

 Game Count Tennis 3 Golf 2 Hockey 1 Badminton 1 Baseball 1 Basketball 1 Chess 1 Table Tennis 1

If there is a way to consolidate/ transpose all the values across this range in a single column, that might also work.

Peter Bartholomew

Well-Known Member
I think the answer may be simpler than you expect. Using dynamic arrays
= COUNTIFS(Table, sport)
gives the result. Without dynamic arrays it might be better to use a table for the output list as well, so the name of each game is referenced separately
= COUNTIFS(Table, [@Sport])
and the formula copies down the output table automatically.

Getting the list of distinct sports is harder but you may already have that.
Your idea of consolidation into a single column is also possible but it represents very much more work.

sauronbaggins

New Member
I think the answer may be simpler than you expect. Using dynamic arrays
= COUNTIFS(Table, sport)
gives the result. Without dynamic arrays it might be better to use a table for the output list as well, so the name of each game is referenced separately
= COUNTIFS(Table, [@Sport])
and the formula copies down the output table automatically.

Getting the list of distinct sports is harder but you may already have that.
Your idea of consolidation into a single column is also possible but it represents very much more work.
Thanks for your reply, but I do not have the list of all the unique "sport" values. The unique values run in a few thousands, so it is not possible to do it manually. I first need to create a list of unique values and then find the frequency of each of those across the range. Sorry, if that wasn't clear in the first post.

Peter Bartholomew

Well-Known Member
If you have a dynamic array version of Excel, I could provide a formula-based solution. If not, I would suggest the use of PowerQuery.

Code:
``````let
Source = Excel.CurrentWorkbook(){[Name="Table"]}[Content],
#"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([Value] <> "")),
#"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"Value", Order.Ascending}})
in
#"Sorted Rows"``````
Guido might have some improvements and could possibly even return the count of instances from PQ. Otherwise, extend the PQ output table with the formula from above and it will refresh along with the table.

Last edited:

GraH - Guido

Well-Known Member
Hi Peter, sauronbaggins

Code:
``````let
Source = Excel.CurrentWorkbook(){[Name="tGames"]}[Content],
GroupAllRows = Table.Group(Source, {}, {{"Games", each _, type table [Game=text, Column1=text, Column2=text, Column3=text]}}),
RemoveAllTable = Table.RemoveColumns(UnpivotAllColumns,{"Games"}),
ExpandGamesUnpivot = Table.ExpandTableColumn(RemoveAllTable, "GamesUnpivot", {"Games"}, {"Games"}),
GroupGames_CountRows = Table.Group(ExpandGamesUnpivot, {"Games"}, {{"Count", each Table.RowCount(_), type number}}),
SortCountDesc_GamesAsc = Table.Sort(GroupGames_CountRows,{{"Count", Order.Descending}, {"Games", Order.Ascending}})
in
SortCountDesc_GamesAsc``````

Attachments

• 18.4 KB Views: 6

p45cal

Well-Known Member
another:
Code:
``````let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ExpandedCustom = Table.ExpandTableColumn(RemovedOtherColumns, "x", {"Value"}, {"Value"}),
FilteredRows = Table.SelectRows(ExpandedCustom, each ([Value] <> null)),
GroupedRows = Table.Group(FilteredRows, {"Value"}, {{"Count", each Table.RowCount(_), type number}}),
SortedRows = Table.Sort(GroupedRows,{{"Count", Order.Descending}})
in
SortedRows``````
although an adaptation of @Peter Bartholomew 's get's there quite slickly:
Code:
``````let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
UnpivotedColumns = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
RemovedColumns = Table.RemoveColumns(UnpivotedColumns,{"Attribute"}),
GroupedRows = Table.Group(RemovedColumns, {"Value"}, {{"Count", each Table.RowCount(_), type number}}),
SortedRows = Table.Sort(GroupedRows,{{"Count", Order.Descending}, {"Value", Order.Ascending}})
in
SortedRows``````

Last edited:

Ateeb Ali

Member
is this resolved or you still need help?
I can do it via vbcode

GraH - Guido

Well-Known Member
Haha, in my effort to avoid hard coded column names, I forced a step where they are hard coded . Really like your step Record.ToTable. It seems I never think in records, but only table and lists. So that's a good reminder not to skip the obvious. On larger data it would mean a significant reduction in running time. Good stuff p45cal.

sauronbaggins

New Member
Whoa. Lots of solutions. Thanks guys. Imma try these and report back.

sauronbaggins

New Member
Thanks everyone. I tried this solution at another board and it worked perfectly (sharing as it can be helpful to other people in similar situation). Link

vletm

Excel Ninja
sauronbaggins
As a new member, You should reread Forum Rules
You have skipped few basic steps.