Convert unevenly spaced list to table [Data from Hell]

Posted on August 30th, 2017 in Power Query - 23 comments

Introducing Data from Hell:

Watch out, its data from hell. In this new video series, we are going to look at some nutty, frustrating and fun data reshaping challenges and solve them using Excel. We will use Power Query, Formulas, VBA or other features as needed to free this data from damnation.

For our first installment, let’s reshape unevenly spaced list of values to a table.

Unevenly Spaced List -> Tabular format

Let’s say you have a list with some values. They are separated by blank cells. You want to convert this to a table, so related values are on same row. But the table can have any number of columns depending on how big the chunks are.

Something like this:

d4h-unevenly-spaced-list-to-table-power-query

Enter Power Query (Get & Transform Data)

You guessed right. This data is too much out of whack for formula based solution. So let’s reshape it using Power Query aka Get & Transform Data.

The process goes like this:

  1. Load the data in to query editor
  2. Replace nulls (blank cells) with something that is not part of your data, like %^%
  3. Using Text.Combine, concatenate all values to a single value, separated by some symbol that is not part of data, like =
  4. Convert this single value to a table and split by the symbol used in step 2.
  5. Calculate number of = symbols in each row using Text.Length and Text.Replace functions in to a new column
  6. Figure out the maximum of this new column as a variable. Name it as MaxCols
  7. Remove the extra column created in step 5
  8. Split the first column by delimiter set in step 3 with number of columns set to MaxCols.
  9. Load data back to Excel.

Data from Hell – Convert unevenly spaced list to table – Power Query tutorial

As this process requires a bit of tinkering with Power Query functions and a few false starts, I made a tutorial. I woke at 4AM to record this. Just kidding, I woke up because my nose was all blocked up and couldn’t go back to sleep. So I opened up Excel for some play and recorded this.  Check it out below.

You can also watch this video on our YouTube Channel.

Download Example Workbook

Click here to download the example workbook for this Power Query tutorial. Try reshaping the data yourself first, you will learn a lot about PQ and how to use it for your advantage.

More Power Query Recipes for you

Power Query (or Get & Transform Data as it is called in newer versions of Excel) is an incredibly powerful tool to extract, reshape and merge your data sets. Check out below tutorials and recipes for more.

Written by Chandoo
Tags: , , , , , ,
Home: Chandoo.org Main Page
? Doubt: Ask an Excel Question

23 Responses to “Convert unevenly spaced list to table [Data from Hell]”

  1. Oficientes says:

    Thanks for the tutorial Chandoo!

  2. Chihiro says:

    Thanks for sharing Chandoo. Never knew about #shared use as source to list M functions.

    It's very useful tool as PQ is function driven language, and makes looking up function syntax so much easier than using Web search every time. 🙂

  3. GraH says:

    Get well soon, Chandoo.
    When I'm feeling ill, the last thing that comes to my mind is: "I'm firing up Excel and play for a while."
    You are a bit of a weird dude! 🙂
    Thanks for starting to share some fine PQ wizardry.

  4. Simon Nuss says:

    Below is an alternative solution. It takes a set-based approach by separating each group of cells into their own table. This avoids the string manipulation and allows for more flexibility if you want to take the data in different direction:

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
    #"Added Custom" = Table.AddColumn(#"Added Index", "IndexFillDown", each if [Index] = 0 then [Index] else if [Data] = null then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"IndexFillDown"}),
    #"Grouped Rows" = Table.Group(#"Filled Down", {"IndexFillDown"}, {{"SeparatedData", each Table.FromList(List.RemoveNulls(_[Data])), type table}}),
    #"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Transpose", each Table.Transpose([SeparatedData])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Transpose"}),
    #"List of All Columns" = List.Distinct(List.Combine(List.Transform(#"Removed Other Columns"[Transpose], each Table.ColumnNames(_)))), // This step dynamically gathers all column names, e.g. column1..columnN, so the next step is not hard-coded.
    #"Expanded Transpose" = Table.ExpandTableColumn(#"Removed Other Columns", "Transpose", #"List of All Columns",#"List of All Columns")
    in
    #"Expanded Transpose"

  5. Kuldeep Mishra says:

    I have no idea about power query however i love the new tab in excel 2016 tried to learn but there is less documentation available about power query and power pivot.

    Sub Test()
    Dim I As Long, LRow As Long, Lcol As Long
    Lcol = 4
    LRow = 4
    With Sheet1
    For I = 4 To .Cells(Rows.Count, "B").End(xlUp).Row
    If Not VBA.IsEmpty(.Cells(I, 2)) Then
    .Cells(LRow, Lcol) = .Cells(I, 2)
    Lcol = Lcol + 1
    End If

    If VBA.IsEmpty(.Cells(I, 2)) Then
    LRow = LRow + 1
    Lcol = 4
    End If

    Next I
    End With
    End Sub

  6. lori says:

    You could also add another column to the right of the data labelled "<table><td>" then fill down the formula:

    =IF([@Data]="","<tr><td>","<td>")

    The result should be as below (note closing tags are not needed.) Copy these two columns to notepad then copy and paste to another cell on the sheet.


    Data <table><td>
    Arif <td>
    Lala <td>
    <tr><td>
    Donny <td>
    Kay <td>
    Chou <td>
    <tr><td>
    Marry <td>
    John <td>
    Smith <td>
    something <td>
    <tr><td>
    Harry <td>
    Matthew <td>
    Rery <td>
    Brave <td>
    Let's <td>
    <tr><td>
    Sea <td>
    Earth <td>
    Ace <td>
    Few <td>
    More <td>
    things <td>
    <tr><td>
    at <td>
    the <td>
    end <td>

  7. Abhay Gadiya says:

    Checkout this solution:

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Demoted Headers" = Table.DemoteHeaders(Source),
    #"Replaced Value" = Table.ReplaceValue(#"Demoted Headers","Data",null,Replacer.ReplaceValue,{"Column1"}),
    #"Added Index" = Table.AddIndexColumn(#"Replaced Value", "Index", 1, 1),
    #"Added Conditional Column" = Table.AddColumn(#"Added Index", "Custom", each if [Index] = 1 then "1" else if [Column1] = null then [Index] else null ),
    #"Filled Down" = Table.FillDown(#"Added Conditional Column",{"Custom"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Filled Down",{{"Custom", Int64.Type}}),
    #"Inserted Subtraction" = Table.AddColumn(#"Changed Type", "Inserted Subtraction", each [Index] - [Custom], type number),
    #"Removed Columns" = Table.RemoveColumns(#"Inserted Subtraction",{"Index"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Column1] null)),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Filtered Rows", {{"Inserted Subtraction", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Filtered Rows", {{"Inserted Subtraction", type text}}, "en-US")[#"Inserted Subtraction"]), "Inserted Subtraction", "Column1"),
    #"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Custom"})
    in
    #"Removed Columns1"

    Also checkout video on my youtube channel.

    • Simon Nuss says:

      There's an error in your script, #"Filtered Rows" should read:

      #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Column1] null)),

  8. Simon Nuss says:

    Hmm, interesting, the "" ("not equal") sign is getting filtered out of our code when we submit our comments. Chandoo - just a heads up, this seems like a bug.

  9. Abhay Gadiya says:

    Imagine a new data set like this

    name app date
    may 1-Jan-16
    Abhay 7-Jan-16
    justin 1-Jan-16
    john 9-Jan-16
    hugh 4-Jan-16
    Abhay 5-Jan-16
    will 1-Jan-16
    justin 1-Feb-16
    jackie 5-Jan-16
    jackie 7-Jan-16
    john 1-Jan-16
    Abhay 4-Jan-16
    Abhay 1-Jan-16
    may 3-Jan-16
    john 2-Jan-16

    No blank rows in between. The final output will be similar to above case study like this...

    name 1 2 3 4
    Abhay 1/4/2016 1/1/2016 1/5/2016 1/7/2016
    hugh 1/4/2016
    jackie 1/5/2016 1/7/2016
    john 1/1/2016 1/2/2016 1/9/2016
    justin 1/1/2016 2/1/2016
    may 1/1/2016 1/3/2016
    will 1/1/2016

    Seems to be one of the challenging example for power query

    • Simon Nuss says:

      See below for a solution:

      let
      Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
      #"Grouped Rows" = Table.Group(Source, {"Name"}, {{"Count", each Text.Combine(List.Transform([Date], each Text.From(_)), "|"), type text}}),
      MaxColumns = List.Accumulate(#"Grouped Rows"[Count], 0, (state, current) => let count = List.Count(Text.Split(current, "|")) in if count > state then count else state),
      #"Split Column by Delimiter" = Table.SplitColumn(#"Grouped Rows", "Count", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), MaxColumns)
      in
      #"Split Column by Delimiter"

  10. Simon Nuss says:

    Here's another solution completely from left field. I decided to challenge myself and do the whole thing using a variety of List functions. Enjoy!

    let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content][Data],
    Transform = List.Transform(Source, each _ & "|" ),
    ReplaceValue = List.ReplaceValue(Transform, null, "#(lf)", Replacer.ReplaceValue),
    Combine = Text.Combine(ReplaceValue),
    Lines = Lines.FromText(Combine),
    TransformAgain = List.Transform(Lines, each try Text.Start(_,Text.Length(_)-1) otherwise ""), // "try" catches nulls
    MaxColumns = List.Accumulate(TransformAgain, 0, (state, current) => let count = List.Count(Text.Split(current, "|")) in if count > state then count else state),
    #"Converted to Table" = Table.FromList(TransformAgain, Splitter.SplitTextByDelimiter("|"), MaxColumns)
    in
    #"Converted to Table"

  11. Vishal says:

    I have been working into Excel 2007. How can I use it?

Leave a Reply