1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Useful PowerQuery tricks: Chihiro's notes

Discussion in 'Power Pivot, Power Map etc' started by Chihiro, Aug 30, 2017.

  1. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    For last 6 months or so, I've been learning PowerQuery and slowly building up bag of tricks to help in data analysis.

    PowerQuery is the best tool available in Excel for data transformation and hope some of these will help you in making use of it.
  2. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    1. Use of #shared to list all available M functions.
    Open blank query and paste in below code in the advanced editor.
    Code (vb):
    let
        Source = Record.ToTable(#shared)
    in
        Source
    Or... in two steps.
    Open blank query and enter #shared in formula bar.
    Use Record Tools to convert into Table.

    Just learned this from Chandoo's sample file. This is very useful as PowerQuery/M is function driven language.
  3. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    2. How to refer to previous row in PowerQuery
    Often this type of calculation is performed in DAX, after data is loaded to Data Model. However, there are instances where user does not have access to PowerPivot, and only Get & Transform (PQ).

    NOTE:
    If the table does not have Index column, add one. This is required as you'll need sequential numbering to refer to previous row.

    Formula/Code to refer to previous row value for specific column.
    Code (vb):
    #"PreviousStep"{[IndexColumnName]-1}[ColumnName]
    Previous step will be #"Added Index" if you just added index column prior.
  4. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    3. Get current date time (like Excel's NOW()).

    For datetime:
    Code (vb):
    DateTime.LocalNow()
    For date only:
    Code (vb):
    DateTime.Date(DateTime.LocalNow())
  5. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    4. Passing parameter from worksheet to PowerQuery
    There are cases where you want to allow user to set parameter without going into PowerQuery Editor. A single cell named range can be used for this purpose.

    Set up single cell with named range (ex. vVal = Sheet1!$A$1)

    Then in advanced editor enter following:
    Code (vb):
    mVal = Excel.CurrentWorkbook(){[Name="vVal"]}[Content]{0}[Column1],
    Now, you can use mVal in subsequent steps in PowerQuery.

    Note:
    While I find above easiest to use. You can also use Table and other methods.

    Ex. To refer to single cell in a table.
    Code (vb):
    = Excel.CurrentWorkbook(){[Name="TableName"]}[Content]{0}[ColumnName]
    Where {0} indicates first row of data range.

    For referencing worksheet, you can't use Excel.CurrentWorkbook method. You will need to use Excel.Workbook and specify full path of the file. Since, workbook location may change, this method isn't recommended.

    To make path dynamic you can use named range method explained above. Combining it with Excel formula to get file path.

    Excel formula for path (excluding file name):
    Code (vb):
    =LEFT(CELL("filename"),FIND("[",CELL("filename"),1)-1)
    Last edited: Aug 30, 2017
    NARAYANK991 likes this.
  6. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    5. Using variable in Native Query
    In most instances, it is easy enough to filter data in query editor, but there are instances where you want to limit data size that's being brought in. I recommend doing it on SQL side by creating View (or prameterized query) . But this isn't feasible when you don't have permission in the database.

    When passing parameter to query, you should always take care to convert variable to text type. This is especially important for numeric values. Since, & operator (concatenate) can't be used on non-text data types.

    Sample native query to SQL server.
    Code (vb):
    SELECT *
    FROM TableName as t1
    WHERE t1.Date >= '8/1/2017'
    This will be shown in Advanced query as...
    Code (vb):
    =Sql.Database("Server:Port", "Dbname", [Query="SELECT * #(lf)FROM TableName as t1 #(lf)WHERE t1.Date >= '8/1/2017'"])
    Note that #(lf) represents new line.

    Refer to Post #5 above for how to pass variable from worksheet to PowerQuery.

    Ex: Where named range vVal holds date value
    Code (vb):
    mVal = Text.From(Excel.CurrentWorkbook(){[Name="vVal"]}[Content]{0}[Column1]),
    Text.From is used to convert to text type.

    You can then edit query to...
    Code (vb):
    =Sql.Database("Server:Port", "Dbname", [Query="SELECT * #(lf)FROM TableName as t1 #(lf)WHERE t1.Date >= '" & mVal & "'"])
    One drawback to this method is that editing source by database query window will not be available and you must edit query directly in formula bar or in advanced editor.

    Note: You may encounter security level mismatch when variable is used in native query. If that happens, go to "Data source settings" in query editor, and set the database's security and current workbook's security to same level.
    Last edited: Aug 30, 2017
    NARAYANK991 and r2c2 like this.
  7. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    6. Check if value is number
    Normally, in PowerQuery, you'd keep column data type consistent, be it Numeric; Text or some other type.
    However, in some cases, when the data is brought in to PowerQuery, it isn't of consistent type. In order to do transformation sometimes it is necessary to identify what type of data is contained in each row of a column. Following formula can be used to identify numeric data.
    Code (vb):
    = Value.Is(Value.FromText([ColumnName]), type number)
    This will return True for numeric and False for other data types.

    Alternately you can use following to return actual numeric value.
    Code (vb):
    = try Number.From([ColumnName]) otherwise null
    This will return actual numeric value from column row if numeric and null if not.
    Thomas Kuriakose and NARAYANK991 like this.
  8. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    7. PowerTrim
    Unlike Excel's Trim function Text.Trim in PQ does not trim mid string spaces. I used to use Text.Replace with double spaces. But found this great function by Ken Puls.
    Code (vb):
    (text as text, optional char_to_trim as text) =>
    let
    char = if char_to_trim = null then " " else char_to_trim,
    split = Text.Split(text, char),
    removeblanks = List.Select(split, each _ <> ""),
    result=Text.Combine(removeblanks, char)
    in
    result
    This is really useful custom function to keep in your bag of tricks and I always have it handy in my NotePad ++ Library of code/function snippets.

    Below is link to Ken's article on the function.
    https://www.excelguru.ca/blog/2015/10/08/clean-whitespace-in-powerquery/
    r2c2 likes this.
  9. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    8. Max Date across Multiple Column
    Recently I was asked to help client analyze customer purchase data. To obtain average duration from previous purchase made, for list of customers that made purchase in a given month. This is normally very easy to do with flat table.

    However, client had data stored in View in database and had 3 columns tracking previous purchase made using different payment method. These columns had null value when there is no record.
    upload_2018-2-23_11-4-52.png

    Typically this can be handled using SQL Unpivot (see link below for explanation).
    https://blogs.msdn.microsoft.com/sa...-finding-the-max-value-from-multiple-columns/

    But client was using SQL Server 2003... which doesn't support Unpivot function and also eliminated several other methods I typically use (Cross Apply, NULLIF).
    This had me scratching my head for a bit. Then I remembered that PowerQuery had unpivot tool.

    Process:
    1. Data was brought in from SQL without any transformation or aggregation, as there was need to validate process and compare against result obtained using SQL Query (I imported table into SQL2012 for this purpose and used previously mentioned Unpivot combined with Row_Number function).

    2. Once data is brought into PowerQuery, select the 3 prevType columns (i.e. date columns) and replace null values with some non-existent date value (typically 1/1/1900). Without this step, when all 3 columns have null, that line will disappear from result.

    upload_2018-2-23_10-59-42.png

    Note: Make sure you use all lower case for null.

    3. With same 3 columns selected, unpivot the columns.

    4. Select all columns except the new attribute column and value column and Group by, using Max of Value column as aggregation.

    5. Replace 1/1/1900 with null. Done

    upload_2018-2-23_11-2-45.png

    BONUS: Duration in days.
    Code (vb):
    = if [LastPrevPurchase] = null then null else Duration.Days(Duration.From([purchaseTime]-[LastPrevPurchase]))
  10. GraH - Guido

    GraH - Guido Active Member

    Messages:
    611
  11. Chihiro

    Chihiro Excel Ninja

    Messages:
    4,556
    9. Wildcard search (string) in PowerQuery
    In simplest form comparing string to string.
    Code (vb):
    = Text.Contains("String","Substring")
    It will return True if substring is within string.

    However, in many cases, you'd want to ignore case sensitivity. In which case you'd use 3rd argument for Text.Contains.
    Code (vb):
    = Text.Contains("String","Substring", Comparer.OrdinalIgnoreCase)
    But, Text.Contains does not handle multiple substring. You could write multiple Text.Contains function, but that's tedious.

    This is where Splitter.SplitTextByAnyDelimiter comes in handy. This function can accept list of delimiters to split string. Instead of just one delimiter (SplitTextByDelimiter).

    As Splitter.SplitTextByxxx will always produce 1 item in list (i.e. delimiter isn't present in string = 1, meaning string isn't split). You can use List.Count on the result to test if substring is contained.

    Ex: From following check if field contains "cat" or "dog", case sensitive.
    upload_2018-5-14_16-37-26.png

    Code (vb):
    = List.Count(Splitter.SplitTextByAnyDelimiter({"dog","cat"})([Text]))>1
    If you want to exclude string containing "cat" or "dog"...
    Code (vb):
    = List.Count(Splitter.SplitTextByAnyDelimiter({"dog","cat"})([Text]))=1
    There is no internal argument to make it case insensitive, so you'd need to make [Text] all lower case etc to accommodate for cases (or the other way around).

    Since list can be a table column. If you have another table with list of substrings you could use that in place of hard coded list... see attached for how this is done.

    Attached Files:

    Thomas Kuriakose likes this.

Share This Page