• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Pairs of numbers

BTW, analysis of the formula approach shows that I have split the number into its constituent digits 18 times. A good candidate for a helper range perhaps?
 
Hi David,

For a self-contained formula that returns an array and allows for any number of digits, perhaps this:

=MODE.MULT(IF(LEN(SUBSTITUTE(SUBSTITUTE(C1,{0;1;2;3;4;5;6;7;8;9},,1),{0,1,2,3,4,5,6,7,8,9},,1))=LEN(C1)-2,(11*({0,1,2,3,4,5,6,7,8,9}+{0;1;2;3;4;5;6;7;8;9})-9*ABS({0,1,2,3,4,5,6,7,8,9}-{0;1;2;3;4;5;6;7;8;9}))/2),11*{0,1,2,3,4,5,6,7,8,9})

The first part checks if each combination of digits is contained and the last part returns a symmetric matrix of digits (00 .. 99) and an extra column for duplicates.

Also, as an alternative to VBA, you might consider Python which has lots of built-in functionality for this sort of thing without requiring detailed programming. From the Python Shell (IDLE), the following commands match the results from the examples…

Code:
>>> import itertools
>>> list(set(itertools.combinations([1,2,3,4],2)))
[(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (2, 4)]
>>> list(set(itertools.combinations([5,6,6,8],2)))
[(5, 6), (6, 8), (5, 8), (6, 6)]
>>> list(set(itertools.combinations([5,7,7,7],2)))
[(5, 7), (7, 7)]
>>> list(set(itertools.combinations([7,7,7,7],2)))
[(7, 7)]

To incorporate these functions into the sheet, see:
https://newtonexcelbach.com/2018/06/01/returning-arrays-from-vba-and-python/

Now to study the other formula and power query solutions...
 
Hi Peter,

You got me thinking in another direction. I got this far for the moment (work in progress).
Code:
let
  Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
  #"Changed Type" = Table.TransformColumnTypes(Source,{{"Original Numbers", Int64.Type}}),
  AddIndex = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
  GroupRows = Table.Group(AddIndex, {"Index"}, {{"AsTable", each _, type table}}),
  ConvertToTable = Table.AddColumn(GroupRows, "AsSingleColumnTable", each Table.FromList(Table.Column([AsTable],"Original Numbers"), Splitter.SplitByNothing(), null, null, ExtraValues.Error)),
  TransposeTable = Table.AddColumn(ConvertToTable, "TransposeTable", each Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes([AsSingleColumnTable], {{"Column1", type text}}, "nl-BE"), {{"Column1", Splitter.SplitTextByRepeatedLengths(1), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1")),
  AddSubIndex0 = Table.AddColumn(TransposeTable, "AddIndex0", each Table.AddIndexColumn([TransposeTable],"Index0",0,1)),
  AddSubIndex1 = Table.AddColumn(AddSubIndex0, "AddIndex1", each (Table.AddIndexColumn([AddIndex0],"Index1",1,1))),
  MergeTable = Table.AddColumn(AddSubIndex1, "MergedSubTable", each Table.NestedJoin([AddIndex0], "Index0", [AddIndex1], "Index1", "MergedTable", JoinKind.LeftOuter))
in
  MergeTable

I'll work on it some more when I have time. I want to use Table.RowCount, and Table.Repeat to create a modulo index.
Then merge the table on its own. Combine the correct columns to make the pairs, filter out duplicates and finally extend those subtables. Got stuck with adding the column using the rowcount.
 
A self-contained formula

An amazing piece of ingenuity, there should be a prize!
The use of MODE.MULT is a neat piece of lateral thinking.
SUBSTIITUTE / LEN is not pretty but it does the job.
The array formulas made my head spin; 11s and 9s?

Just to be perverse, I worked my way through that section with named formulas
Code:
I: ={1,2,3,4,5,6,7,8,9}
J: =TRANSPOSE( I )
X: = 10*I + J
Y: = TRANSPOSE( X )
Σ: = X + Y
Δ: = ABS( X - Y )
M: = ( Σ - Δ ) / 2

Σ holds the multiples of 11 and Δ the 9s.
 
Hi Lori,

Beautiful solution, also covers more than 4 digits in the cell.

My solution, which is the basis
--(SMALL(--MID(C$1,{1;2;3;4},1),{1;1;1;2;2;3})&SMALL(--MID(C$1,{1;2;3;4},1),{2;3;4;3;4;4})),
I come to 6 combinations,{12;13;14;23;24;34},From here, I have two options :

1.=XlSortUnique(--(SMALL(--MID(C1,{1;2;3;4},1),{1;1;1;2;2;3})&
SMALL(--MID(C1,{1;2;3;4},1),{2;3;4;3;4;4}))),

XlSortUnique> https://tkljh.s3.amazonaws.com/quant/tools/excel.html

2.=IFERROR(AGGREGATE(15,6,--(SMALL(--MID(C$1,{1;2;3;4},1),{1;1;1;2;2;3})&SMALL(--MID(C$1,{1;2;3;4},1),{2;3;4;3;4;4}))/(--(SMALL(--MID(C$1,{1;2;3;4},1),{1;1;1;2;2;3})&SMALL(--MID(C$1,{1;2;3;4},1),{2;3;4;3;4;4}))>D1),1),"")


David
 

Attachments

  • Pairs of numbers.xlsx
    9.4 KB · Views: 6
David, yes the basic logic of the solution is easier to see in the specific case that you show.

As Peter says, the previous formula generalises to more digits by piecing together three basic building blocks:

- MODE.MULT which is a goto function for returning variable length arrays and can often be used as an alternative to longer SMALL/AGGREGATE alternatives.

- LEN/SUBSTITUTE which is a commonly used combination for counting occurrences of characters within a string.

- The identity: MIN(X,Y) = (X + Y - ABS (X - Y))/2 which is frequently useful (as i was recently reminded by a formula from John Jairo V). It is here applied to arrays with Y = TRANSPOSE(X) to reverse digits (eg 73 -> 37), details are shown in Peter's deconstruction .

I'm working through the power query solutions about which I am pretty clueless. To create the table of pairs, i was wondering if there was a way to do a cross join to generate a list of n x n entries like adding the same table twice to a sql query?
 
{=SMALL(IF(MID(TEXT(REPT(1, LEN($C$1)-ROW($A$1:INDEX($A:$A, LEN($C$1)-1))), REPT(0, LEN($C$1))), COLUMN($A$1:INDEX($1:$1,, LEN($C$1))), 1)+0, MID($C$1, ROW($A$1:INDEX($A:$A, LEN($C$1)-1)), 1)*10 + MID($C$1, COLUMN($A$1:INDEX($1:$1,, LEN($C$1)))), 1)), ROW($A$1:INDEX($A:$A, LEN($C$1)*(LEN($C$1)-1)/2)))}

Also returning an array with the answers, small first.
 
I'm working through the power query solutions about which I am pretty clueless. To create the table of pairs, i was wondering if there was a way to do a cross join to generate a list of n x n entries like adding the same table twice to a sql query?
Hi Lori, it is where I'm stuck too. There is a way. I was thinking of using the modulo and replacing the value by a parameter based on a row count. Did not find the time to try and got errors all over.
Then, when typing this, PQ knows many to many relationship.
Not yet there, but getting closer. Cross join function is this part
Code:
MergeTable = Table.AddColumn(AddXJoinIndex1, "MergedSubTable", each Table.NestedJoin([AddXjoin0], {"Index0X", "Index0"}, [AddXjoinIndex], {"Index1X", "Index1"}, "MergedTable", JoinKind.FullOuter))
Code:
let
  Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
  #"Changed Type" = Table.TransformColumnTypes(Source,{{"Original Numbers", Int64.Type}}),
  AddIndex = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
  GroupRows = Table.Group(AddIndex, {"Index"}, {{"AsTable", each _, type table}}),
  ConvertToTable = Table.AddColumn(GroupRows, "AsSingleColumnTable", each Table.FromList(Table.Column([AsTable],"Original Numbers"), Splitter.SplitByNothing(), null, null, ExtraValues.Error)),
  TransposeTable = Table.AddColumn(ConvertToTable, "TransposeTable", each Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes([AsSingleColumnTable], {{"Column1", type text}}, "nl-BE"), {{"Column1", Splitter.SplitTextByRepeatedLengths(1), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1")),
  AddSubIndex0 = Table.AddColumn(TransposeTable, "AddIndex0", each Table.AddIndexColumn([TransposeTable],"Index0",0,1)),
    AddXJoinIndex0 = Table.AddColumn(AddSubIndex0, "AddXjoin0", each Table.AddIndexColumn([AddIndex0],"Index0X",1,0)),
    AddSubIndex1 = Table.AddColumn(AddXJoinIndex0, "AddIndex1", each (Table.AddIndexColumn([TransposeTable],"Index1",1,1))),
    AddXJoinIndex1 = Table.AddColumn(AddSubIndex1, "AddXjoinIndex", each Table.AddIndexColumn([AddIndex1],"Index1X",1,0)),
  MergeTable = Table.AddColumn(AddXJoinIndex1, "MergedSubTable", each Table.NestedJoin([AddXjoin0], {"Index0X", "Index0"}, [AddXjoinIndex], {"Index1X", "Index1"}, "MergedTable", JoinKind.FullOuter)),
    #"Removed Columns" = Table.RemoveColumns(MergeTable,{"AsTable", "AsSingleColumnTable", "TransposeTable", "AddIndex0", "AddXjoin0", "AddIndex1", "AddXjoinIndex"}),
    #"Expanded MergedSubTable" = Table.ExpandTableColumn(#"Removed Columns", "MergedSubTable", {"Column1", "MergedTable"}, {"Column1", "MergedTable"}),
    #"Expanded MergedTable" = Table.ExpandTableColumn(#"Expanded MergedSubTable", "MergedTable", {"Column1"}, {"Column1.1"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded MergedTable", each ([Column1] <> null) and ([Column1.1] <> null)),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Pairs", each [Column1.1]&[Column1]),
    #"Removed Duplicates" = Table.Distinct(#"Added Custom", {"Pairs", "Index"}),
    #"Removed Columns1" = Table.RemoveColumns(#"Removed Duplicates",{"Column1", "Column1.1"})
in
  #"Removed Columns1"
 
I think I have done it! (I have updated my workbook from #25.)
I have calculated the foreign key pairings in the junction table to cater for exactly 9 digits and then allowed inner joins to remove the null references.
I have separately identified multiply occurring digits and appended those pairs to the distinct digit pairs.
 

Attachments

  • challenge-n-digit PQ.xlsx
    31.1 KB · Views: 15
Further to Lori's post No: 27, here is complete Python code that can be used direct from Excel as a UDF, via xlwings:

Code:
import xlwings as xlw
import itertools

@xlw.func
@xlw.arg('num',numbers = int)
def listcombs(vals, num=2):
    return list(set(itertools.combinations(vals, num)))

To use:
If you don't have Python and xlwings, install Anaconda Python, which includes xlwings by default.
Copy the code above to a file called ListComb.py
Open the xlwings Quickstart.xlsm file and save as ListComb.xlsm (or ListComb.xlsb if you prefer), in the same folder as the .py file.
Click the Import Functions icon on the xlwings add-in tab.

That's all. You should now be able to call the listcombs function as a UDF.

For the array-function auto-sizing VBA code, download the attached zip file with examples, or follow the link to my blog in Lori's post
 

Attachments

  • XLWings.zip
    22.5 KB · Views: 1
Comparing with Peter Bartholomew's last results, to avoid getting pairs repeated in reverse, the input needs to be sorted in ascending order, so I have added a sort to the Python code. I have also sorted the results, so the results list is now in the same order as found by Peter.
Revised files attached.
 

Attachments

  • ListComb.zip
    20.6 KB · Views: 3
I think I have done it! (I have updated my workbook from #25.)
I fail to turn this into a function (if someone has a tip, highly appreciate that), but I believe the following transformation steps also work.
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="t4fct"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Original Numbers", type text}}),
    AsTable = Table.AddColumn(#"Changed Type", "AsTable", each Table.FromList({[Original Numbers]},Splitter.SplitByNothing(), null, null, ExtraValues.Error)),
    SplitTable = Table.AddColumn(AsTable, "SplitTable", each Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes([AsTable], {{"Column1", type text}}, "nl-BE"), {{"Column1", Splitter.SplitTextByRepeatedLengths(1), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1")),
    DistinctTable = Table.AddColumn(SplitTable, "DistinctTable", each Table.Distinct([SplitTable],{"Column1"})),
    AddXjoin = Table.AddColumn(DistinctTable, "AddXjoin", each Table.AddIndexColumn([DistinctTable],"IndexX",1,0)),
    AddIndex = Table.AddColumn(AddXjoin, "AddIndex", each Table.AddIndexColumn([AddXjoin],"Index",1,1)),
    MergeTable = Table.AddColumn(AddIndex, "MergeTable", each Table.NestedJoin([AddIndex], {"IndexX"}, [AddIndex],{"IndexX"},"MergeTable" ,JoinKind.FullOuter)),
    CountRows = Table.AddColumn(MergeTable, "Counter", each Table.RowCount([MergeTable])),
    ExpandMergeTable1 = Table.ExpandTableColumn(CountRows, "MergeTable", {"Column1", "Index", "MergeTable"}, {"Column1", "Index", "MergeTable.1"}),
    ExpandMergeTable2 = Table.ExpandTableColumn(ExpandMergeTable1, "MergeTable.1", {"Column1", "Index"}, {"Column1.1", "Index.1"}),
    MakePairs = Table.AddColumn(ExpandMergeTable2, "Pairs", each if [Counter] = 1 then [Column1]&[Column1.1] else if [Index.1] > [Index] then [Column1]&[Column1.1] else null),
    FilterNull = Table.SelectRows(MakePairs, each ([Pairs] <> null)),
    RemoveColumns = Table.RemoveColumns(FilterNull,{"AsTable", "SplitTable", "DistinctTable", "AddXjoin", "AddIndex", "Column1", "Index", "Column1.1", "Index.1", "Counter"})
in
    RemoveColumns
 

Attachments

  • MakingPairs.xlsx
    19 KB · Views: 4
I believe the following transformation steps also work

You are ahead of me in terms of the sophistication of your use of PQ. I think you have kept tables (how is that done? is it by typing a line within the advanced editor or is there a button I haven't noticed?) where I keep on loading as 'connection only' to retain partial results. Your outer join looks more compact than anything I have attempted.

I have only ever written one function by following step-by-step guidance so I have nothing to offer at present on that front.

One thing you might consider borrowing from my attempt is the test for multiple occurrences of a digit which you are currently filtering out unless it happens to be the only digit used for the string.

Aside: The Python and xlwings looks interesting but might need further description and I am not sure where such a discussion might fit?
 
Hi Peter Bartholomew,

You call me sophisticated? Yet, I was inspired by your solution, and I did some serious trial-and-error to accomplish what I did. Together with typing #shared a lot in the formula bar as well.

Honestly, I reverted to tables since I was unable to find anything working with lists. I would love to see how that's done, if possible, but I believe it is.

Both tables and lists are standard object types in PQ - that you know - and I used table functions to write formulas in an added column. So it was done without the advanced editor.
For these 2 formulas - the parts in red specifically - I really needed google and an example I could dig:
  • AsTable = Table.AddColumn(#"Changed Type", "AsTable", each Table.FromList({[Original Numbers]},Splitter.SplitByNothing(), null, null, ExtraValues.Error)),
  • SplitTable = Table.AddColumn(AsTable, "SplitTable", each Table.ExpandListColumn(Table.TransformColumns(Table.TransformColumnTypes([AsTable], {{"Column1", type text}}, "nl-BE"), {{"Column1", Splitter.SplitTextByRepeatedLengths(1), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"))
The key was understanding how to build the pattern. Almost all of the above posts, both formula and PQ based, really helped. Call it a "joined" effort. BTW, did we already say thanks to David for this fine challenge?
 
Aside: The Python and xlwings looks interesting but might need further description and I am not sure where such a discussion might fit?

The download file should just work, once Python and xlwings are installed, but I am happy to answer questions.

Why wouldn't here be an appropriate place to discuss it? It's a simple to implement way of extending Excel, using free software, and by the look of it very much easier to implement than the Power Query solutions.
 
Doug, I second the xlwings approach. It seems simplest overall when compared with formula, vba and PQ and in addition to pairs can return triples and higher combinations.

It can also be made fully dynamic to autoresize on calculate which I don't think the other ways can (PQ resizes the table but needs a manual refresh and formulas automatically calculate but don't resize.)

Further discussions could continue on your blog if there's a feeling that this might veer off topic.
 
It's a simple to implement way of extending Excel, using free software
I was wondering whether the 'vault' might be an appropriate sub-forum. The issues raised clearly go beyond the scope of this discussion thread that addresses combinations of digits. I would like to be able to be able to refer to the instructions and relevant advice when I am in a position to devote time to the task of implementing and learning something new.
If the topic were raised as a new discussion I would feel more free to ask questions concerning its general scope and applicability.
 
PQ resizes the table but needs a manual refresh and formulas automatically calculate but don't resize
I think functionality from PQ is within the Excel Object model so it should be possible to implement an event-driven update (maybe not in my 2010 PQ add-in though). I must admit I find the expectation that one should hit refresh something of a drawback when one is simply responding to a worksheet change.

As for resizing, my strategy was to accommodate 18 digits because the output doesn't change from there on. Unless, of course, you use general characters rather than just digits - the PQ solutions will cope with non-numeric input.

The solution is sized at the point one splits the string to a column. That is the point at which the index column should be added.
 
For PQ, i guess you could always use a (heavy-handed) sheet change event. Is there one for a PQ refresh? I can see one for Table update. It would be nice if there was a non-code option so that the workbook wouldn't need to be macro-enabled and so that code only got called when the input cell changed. Add-in functions like those that Doug and David cited recalculate efficiently using the workbook tree of dependencies and can be referenced without needing to be saved as *.xlsm.

For formulas that return variable length arrays, one approach is to use,

=IFERROR( INDEX( <formula> , ROW() - MIN( ROW() ) + 1 ), "")

array-entered over a sufficiently large range. <formula> can be a formula returning an array or a name like 'output' as in the 'challenge-n-digit' attachment. This appears to be significantly faster than copying down individual cell formulas. (I should add this technique derives from one of Peter's sample files)
 
VBA for PQ or table, I believe it is the same.
For non VBA, you do have this, but not "event" driven.
Formulae do win over PQ when it comes to auto-update.
upload_2018-8-3_17-34-32.png
 
You can find bit of discussion on xlWings & Python in threads below.
https://chandoo.org/forum/threads/web-data-pulling-need-help-in-improving-code.36872/
https://chandoo.org/forum/threads/whats-in-your-ribbon.37498/
https://chandoo.org/forum/threads/is-it-possible-to-use-list-comprehension-in-vba.38083/

It's handy tool to have, especially when iterator algebra is needed. Along with various other functionality (such as reading and transforming netCDF file; statistical analysis etc).

Link to pretty good article on itertools library below.
https://realpython.com/python-itertools/
 
I have just posted a blog article with a summary of the different approaches here, and link to updated spreadsheet with Hui's VBA (slightly modified), and Lori's on-sheet formula:
VBA vs Power Query vs Python

Feel free to comment on the blog with any questions or comments on the Python or xlwings side of things.

Lori - using xlwings you do need to save the spreadsheet as xlsm (or xlsb), because it uses VBA wrapper functions to call the Python code. I believe that pyxll can be run from an xlsx.

Chihiro - thanks for the links. The real python site looks like a great resource.
 
@Doug, Nice summary. i hadn't really taken in the VBA requirement - thanks for clarifying that. Much of my experience in this area has been with Exceldna which doesn't need any vba interaction.

On a side note, i suppose the vba part could be stored as an addin (*.xlam) if desired and worksheet formulas could point to that from a xlsx format workbook. However that would create workbook links to the add-in that may require updating as is a "feature" of vba add-in functions.
 
Back
Top