• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

PowerQuery Group by Max, but keep rows from other field

plujan

New Member
Hello,

Came across a challenge for which i hope i can get some help.
I have a table which I am pulling into PowerQuery. From that table I have various fields, but there are two I need to use in conjunction.

One field is "TimeStamp" and the other is "Value".

Value field is an entry field, where users are typing in a value.
TimeStamp is WHEN the value was typed in.

I need to be able to keep the most recent value from the Value field by referencing the TimeStamp field.

When I try grouping it, of course it won't work as it's two separate fields.

Are there any ideas how I could pull this in without any major manipulation?

Thank you!
 
Attach a representative sample file and a mocked up solution. Hard to visualize what you specifically need. No pictures as we cannot manipulate data in a picture.
 
Made some random data
1705152952942.png

Basic query
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Grouped Rows" = Table.Group(Source, {"Other cols"}, {{"Max", each Table.Max(_,"TimeStamp"), type nullable record}}),
    #"Expanded Max" = Table.ExpandRecordColumn(#"Grouped Rows", "Max", {"TimeStamp", "Value"}, {"TimeStamp", "Value"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded Max",{{"TimeStamp", type datetime}, {"Value", type number}, {"Other cols", type text}})
in
    #"Changed Type1"
 

Attachments

  • Chandoo_55834.xlsx
    18.8 KB · Views: 1
Academic comment only:
There are 2 as with timestamp of 13 Jan 2024 14:00 of which it only chooses one.
Good catch @p45cal! I randomized and did not bother to check all lines as I only wanted to show a technique.
I'm guessing that might not occur in the real dataset. If it does, then we can search for another solution.
Which maybe is doing a group by all rows, adding a custom column in which a select rows on the max list of timestamps occurs. That will return all lines.
 
that might not occur in the real dataset
True, I feel it's unlikely in the real dataset unless the user changes multiple rows at once, perhaps with cut and pasting or entering values in multiple cells at once with Ctrl + Enter. It's only because I was playing around with other methods that 2 as appeared:
1705162796832.png
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    ChangedType = Table.TransformColumnTypes(Source,{{"TimeStamp", type datetime}, {"Value", type number}, {"Other cols", type text}}),
    GroupedRows = Table.Group(ChangedType, {"Other cols"}, {{"TimeStamp", each List.Max([TimeStamp]), type nullable datetime}}),
    MergedQueries = Table.NestedJoin(GroupedRows, {"Other cols", "TimeStamp"}, ChangedType, {"Other cols", "TimeStamp"}, "Table", JoinKind.LeftOuter),
    ExpandedTable = Table.ExpandTableColumn(MergedQueries, "Table", {"Value"}, {"Value"})
in
    ExpandedTable
 
To return duplicated rows:
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Grouped Rows" = Table.Group(Source, {"Other cols"}, {{"All", each Table.SelectRows(_, let latest = List.Max(_[TimeStamp]) in each [TimeStamp] = latest), type table [TimeStamp=datetime, Value=number, Other cols=text]}}),
    #"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"TimeStamp", "Value"}, {"TimeStamp", "Value"})
in
    #"Expanded All"
1705234736104.png
 
Back
Top