• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Splitting uneven length cells into ordered columns

AZBYCX

New Member
Hi, first time poster, hoping you can help :)

Have been asked to separate out address data in large document to get all Towns into one column. All addresses are contained in single cells (in same column) but they contain content of different lengths. All delimiters are comma space. For example:
(row 1 cell) 12 Any Street, Region, Town, Postcode, Country
(row 2 cell) 12a, Any Street, Region, Town, Postcode, Country
(row 3 cell) House name, Any Street, Town, Postcode, Country
(row 4 cell) Some long location (eg Accessible road from B222 off A111 towards some place), Town, Postcode, Country
(row 5 cell) 12 Any Street, Town, Postcode, Country
etc etc

I'm new to PQ and I'm struggling to get all Town references into a single column. I can almost get there with a very lengthy process using lots of conditional columns etc but there's always a 'rogue' entry. Is there any way I can do this? Any help appreciated.
 
AZBYCX
As first time poster then
You have today read Forum Rules
You would reread those to refresh You memory
eg from How to get the Best Results at Chandoo.org
You would notice some basic details.
 
I'm new to PQ and I'm struggling to get all Town references into a single column. I can almost get there with a very lengthy process using lots of conditional columns etc but there's always a 'rogue' entry. Is there any way I can do this? Any help appreciated.

While it should not be that hard.
Given your sample data is somehow representative, the TOWN is always the 3rd last column. So this pattern works:
  1. add an index
  2. group by this index, use AllRows as aggregate
  3. Split this sub table by delimiter
  4. Get the column headers
  5. find the column header by position that is your town (3rd last)
  6. Extract the data from that list.
  7. done
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Column1", "Addresses"}}),
    #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"Index"}, {{"All", each _, type table [Addresses=text, Index=number]}}),
    SplitTableByComma = Table.AddColumn(#"Grouped Rows", "SplitByComma", each Table.SplitColumn([All],"Addresses", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv))),
    GetColumnsHeaders = Table.AddColumn(SplitTableByComma, "ColumnNames", each Table.ColumnNames([SplitByComma])),
    SelectColByPostionOfTown = Table.AddColumn(GetColumnsHeaders, "SelectPositionOfTown", each List.First(
    List.LastN(
    List.RemoveLastN([ColumnNames],1)
    ,3)
)),
    ExtractTheTown = Table.AddColumn(SelectColByPostionOfTown, "Get Town", each Table.Column([SplitByComma],[SelectPositionOfTown]){0})
in
    ExtractTheTown

EDIT: a shorter version, you actually do not need the index (I'm almost always using an index)
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "Addresses"}}),
    #"Grouped Rows" = Table.Group(#"Renamed Columns", {"Addresses"}, {{"All", each _, type table [Addresses=text, Index=number]}}),
    SplitTableByComma = Table.AddColumn(#"Grouped Rows", "SplitByComma", each Table.SplitColumn([All],"Addresses", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv))),
    GetColumnsHeaders = Table.AddColumn(SplitTableByComma, "ColumnNames", each Table.ColumnNames([SplitByComma])),
    SelectColByPostionOfTown = Table.AddColumn(GetColumnsHeaders, "SelectPositionOfTown", each List.First(
    List.LastN(
    [ColumnNames],3)
    )),
    ExtractTheTown = Table.AddColumn(SelectColByPostionOfTown, "Get Town", each Table.Column([SplitByComma],[SelectPositionOfTown]){0}),
    RemoveHelperCols = Table.RemoveColumns(ExtractTheTown,{"All", "SplitByComma", "ColumnNames", "SelectPositionOfTown"})
in
    RemoveHelperCols
 

Attachments

  • Chandoo_44636.xlsx
    19.2 KB · Views: 4
Last edited:
While it should not be that hard.
Given your sample data is somehow representative, the TOWN is always the 3rd last column. So this pattern works:
  1. add an index
  2. group by this index, use AllRows as aggregate
  3. Split this sub table by delimiter
  4. Get the column headers
  5. find the column header by position that is your town (3rd last)
  6. Extract the data from that list.
  7. done
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Column1", "Addresses"}}),
    #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"Index"}, {{"All", each _, type table [Addresses=text, Index=number]}}),
    SplitTableByComma = Table.AddColumn(#"Grouped Rows", "SplitByComma", each Table.SplitColumn([All],"Addresses", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv))),
    GetColumnsHeaders = Table.AddColumn(SplitTableByComma, "ColumnNames", each Table.ColumnNames([SplitByComma])),
    SelectColByPostionOfTown = Table.AddColumn(GetColumnsHeaders, "SelectPositionOfTown", each List.First(
    List.LastN(
    List.RemoveLastN([ColumnNames],1)
    ,3)
)),
    ExtractTheTown = Table.AddColumn(SelectColByPostionOfTown, "Get Town", each Table.Column([SplitByComma],[SelectPositionOfTown]){0})
in
    ExtractTheTown

EDIT: a shorter version, you actually do not need the index (I'm almost always using an index)
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "Addresses"}}),
    #"Grouped Rows" = Table.Group(#"Renamed Columns", {"Addresses"}, {{"All", each _, type table [Addresses=text, Index=number]}}),
    SplitTableByComma = Table.AddColumn(#"Grouped Rows", "SplitByComma", each Table.SplitColumn([All],"Addresses", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv))),
    GetColumnsHeaders = Table.AddColumn(SplitTableByComma, "ColumnNames", each Table.ColumnNames([SplitByComma])),
    SelectColByPostionOfTown = Table.AddColumn(GetColumnsHeaders, "SelectPositionOfTown", each List.First(
    List.LastN(
    [ColumnNames],3)
    )),
    ExtractTheTown = Table.AddColumn(SelectColByPostionOfTown, "Get Town", each Table.Column([SplitByComma],[SelectPositionOfTown]){0}),
    RemoveHelperCols = Table.RemoveColumns(ExtractTheTown,{"All", "SplitByComma", "ColumnNames", "SelectPositionOfTown"})
in
    RemoveHelperCols
Thanks GraH - Guido! I'll give that a go asap!
 
Or what about this 1 stop shopping UI experience:

On the Transform ribbon select "Extract" -> "Text Between Delimiters"
69838
69839
 
Back
Top