Multi Threading - VBA Excel macro

James989

Member
Hi,

I have created a macro:

Code:
For i = 1 To d
    R = Rnd ^ 2 + Rnd ^ 2
    If R <= 1 Then
        count = count + 1
    End If
Next

My problem is that "d" is a very large number (billions), and the macro takes a very long time to run.

Is it possible to do a multi-threaded simulation, i.e. something similar to MATLAB's Parallel For-Loops (PARFOR)? In MATLAB you can run the same loop on 8 MATLABPOOL workers and it takes much less time to finish. See link:
http://www.mathworks.com/products/p...shipping/distcomp/paralleldemo_parfor_pi.html


James
 
By the way, I just ran the following code with d = 100,000,000 and it took 30 seconds on my PC, so 1 billion will take about 300 seconds (5 minutes).

Code:
Sub H()
    Dim i As Long, d As Long
    Dim count As Long, R As Double
    d = 100000000
    For i = 1 To d
        R = Rnd ^ 2 + Rnd ^ 2
        If R <= 1 Then count = count + 1
    Next
    Debug.Print d, count, count / d
End Sub

The results were:
Code:
          d      Count     Count/d
         10          5     0.5
        100         82     0.82
       1000        813     0.813
      10000       7894     0.7894
     100000      78691     0.78691
    1000000     785520     0.78552
   10000000    7853046     0.7853046
  100000000   78540454     0.78540454
If you're using 64-bit Excel, you can also move up to the LongLong data type.
 

This is only a simple example; my actual case is more complex and has to run 2e13 rounds! It takes days to finish.
 
Are you able to tell us more about the problem, and why you need to run 20 trillion iterations? There may be other ways to skin the cat.
 

Problem:

Choose 5 cards randomly from a deck, choose 1 card from the remaining deck, and choose 4 cards from the remaining deck. Total combinations = 52C5 * 47C1 * 46C4 ≈ 20 trillion.
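As a sanity check on that figure, the combination count can be computed directly in VBA with WorksheetFunction.Combin. A quick sketch (the sub name is mine):

Code:
Sub TotalCombinations()
    ' Deals: C(52,5) player hands * C(47,1) up-cards * C(46,4) dealer hole cards
    Dim total As Double                   ' too big for Long, so use Double
    With Application.WorksheetFunction
        total = .Combin(52, 5) * .Combin(47, 1) * .Combin(46, 4)
    End With
    Debug.Print Format(total, "#,##0")    ' about 1.99E+13, i.e. ~20 trillion
End Sub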
 
Hi James,

Correct me if I am wrong, but what you want to do is select 5 unique random numbers between 1 and 52; is this correct?

If yes, then there are any number of ways to do this, none of which involves looking at trillions of combinations.

Check the code here for one way:

http://www.cpearson.com/excel/randomnumbers.aspx

Narayan
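For what it's worth, one common way to draw unique cards (not necessarily the method on Pearson's page) is a partial Fisher-Yates shuffle. A sketch, with all names mine:

Code:
Sub DealFiveCards()
    Dim deck(1 To 52) As Long
    Dim i As Long, j As Long, tmp As Long
    For i = 1 To 52: deck(i) = i: Next
    Randomize
    For i = 1 To 5
        ' swap a random card from the remaining deck into position i
        j = i + Int(Rnd * (53 - i))       ' random index in i..52
        tmp = deck(i): deck(i) = deck(j): deck(j) = tmp
        Debug.Print deck(i)               ' the i-th card dealt, 1..52
    Next
End Sub

Each card is guaranteed unique because once a card is swapped into the dealt positions it is never considered again.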
 
OK, for the casino game Caribbean Stud Poker: choose 5 cards for the player, choose 1 card (face up) and then 4 cards from the remaining deck for the dealer, then compare the hands to see who is the winner. Total combinations ≈ 20 trillion.
 
James

The next question is: what do you want to do with the cards once they are chosen?
 
Once the 5 cards are chosen for the player and the dealer, you have to compare the hands to see who is the winner. If you want to run a full complete cycle, the total is 20 trillion combinations!
 
Hate to mention this:

* Let's assume each number takes up 1 byte of memory.
* So 1 run will take up 5 bytes.
* So 20 trillion runs will take up 100 trillion bytes (a little less than 100 terabytes).
* A single Excel sheet has about 1 million rows and 16,000 columns, so let's say it can hold about 16 billion numbers.
* So you'd need about 6,250 Excel sheets, every cell filled, to store that many runs.
* Lucky for you, later versions of Excel automatically compress workbooks (they're actually ZIP files, and you can open them with WinZip). Unlucky for you, I highly doubt you'll get a good compression ratio, because you are filling the spreadsheet with random numbers. Let's assume 20% compression, which is more than you'll probably get.
* The file then takes up about 80 terabytes. Of course, you can split it over multiple files, but it's going to total about 80 TB.
* I just googled the largest internal hard drive available, and it's 3 TB.
* So you'd need 27 hard drives to store this spreadsheet. I don't know how many hard drives you can install in one PC, but you're definitely going to need more than one computer or a dedicated server.
* Conclusion: this spreadsheet isn't happening.

But there is hope. Thanks to something called the Law of Large Numbers, you don't need to run through every possible iteration to figure out the properties of cards chosen at random from a deck. With the method you listed, you would inevitably wind up with duplicate runs, where you randomly pick the same 5 cards twice. At first the odds are overwhelmingly against it; a few billion runs in, the chance becomes quite significant. Avoiding duplicates would mean keeping a list of hands already seen, which is not realistic given the storage constraints described above.

An example of how the Law of Large Numbers works: let's say you didn't know the average value of a six-sided die roll. You could roll a die a few thousand times and average the results, and you'd find the average is extremely close to the value predicted by statistics (which is 3.5).

You can do the same thing with random cards. Excel can easily simulate 5 cards drawn at random from a 52-card deck. You could write a macro to run this a few thousand times, put the results in a spreadsheet, and compare and evaluate them however you want. Due to randomness, the results won't exactly match the mathematically exact probabilities, but the more runs you do, the closer you'll get. It doesn't matter that you won't come close to covering every possible set of 5 cards -- the trend you're looking for will begin to emerge.

Hope this helps.
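The die-roll illustration above is easy to try yourself. A minimal sketch (names are mine):

Code:
Sub DieAverage()
    ' Monte Carlo check that the mean die roll converges on 3.5
    Dim n As Long, i As Long, total As Double
    n = 100000
    Randomize
    For i = 1 To n
        total = total + Int(Rnd * 6) + 1  ' one uniform roll of 1..6
    Next
    Debug.Print total / n                 ' typically very close to 3.5
End Sub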
 
Why do you need to hold all the outcomes/results of each iteration in Excel? You can just sum them and only show the final result after the 20 trillion iterations! Am I missing something?
 
Yes, the fact that you don't need to run 20 trillion iterations due to the Law of Large Numbers.

Given Hui's simple example, it would take about 70 days to run 20 trillion iterations of that program. However, the count/d value is converging on the true mathematical probability. Look at the difference between 10 million and 100 million: 10 million trials give 78.53% and 100 million trials give 78.54%. In fact, the probability isn't changing significantly beyond 100,000 trials.

Also, Hui is choosing two random numbers between 0 and 1: infinitely many possibilities.

My point is, because of the Law of Large Numbers, you can either run all 20 trillion combinations and tie up a computer for 70 days, or run 1 million random trials and tie up a computer for a few seconds, with the result from 1 million trials only about 0.01% off. In my opinion, 70 days isn't worth a 0.01% gain in accuracy.
 
Law of Large Numbers in action:

I worked out the probability in Hui's example: the points (Rnd, Rnd) are uniform in the unit square, and the quarter-disc x^2 + y^2 <= 1 has area Pi/4, so count/d should converge to Pi/4, about 0.78540. The 100,000,000-trial result matches to about 5 digits.
 
You may need 70 days or less to run through the complete cycle.

If you can split a single workbook (iterations 1 to r) into 5 workbooks covering iterations 1 to n, n+1 to m, m+1 to p, p+1 to q and q+1 to r, then run these 5 workbooks simultaneously, your total time may reduce to 70/5 = 14 days?
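If you do split the run across workbooks, the chunk boundaries can be computed rather than hard-coded. A rough sketch, assuming the total divides evenly among the workers (all names and the example figures are mine):

Code:
Sub PrintChunkRanges()
    ' Split iterations 1..r into contiguous ranges, one per workbook
    Dim r As Double, workers As Long, w As Long
    Dim chunk As Double
    r = 20000000000000#    ' total iterations (example figure)
    workers = 5
    chunk = r / workers    ' assumes r divides evenly by the worker count
    For w = 1 To workers
        Debug.Print "Workbook " & w & ": " & _
            Format((w - 1) * chunk + 1, "0") & " to " & Format(w * chunk, "0")
    Next
End Sub

Each workbook then runs its own range and the partial counts are summed at the end.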
 
James, I'm trying to help you out here.

Of course you need less than 70 days, because you only need 30 seconds.

If you're trying to figure out the probability of winning a card game, I'm suggesting that it's MUCH faster to simulate 100,000,000 random trials than to set up some nested For loops to iterate over all possible hands, and the hit to the accuracy won't even be noticeable.

In the 4th post on this thread, Hui is actually proving my point. Look at what his program is doing:

For 100,000,000 trials:
1) Select 2 random numbers between 0 and 1 and square them.
2) Add the squares together.
3) If the sum is at most 1, the trial is a success, so count it.

The probability of a successful trial is Pi/4, or about 78.540%. The more runs he does, the closer his count/d (which is the probability of a successful trial) gets to Pi/4. If you ran the same program on your computer you would get nearly the same results despite the fact that your computer is choosing completely different random numbers.

So, by that logic, you could set up a card-game simulation to run random trials. Running 100,000,000 random hands will give you a win-loss ratio accurate to within about 0.001%, which should be good enough for you.
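To be concrete about what such a simulation might look like: the sketch below deals two 5-card hands per trial and counts player wins. The hand comparison is a deliberately over-simplified placeholder (higher card number wins); a real Caribbean Stud evaluator would replace that part. All names are mine:

Code:
Sub SimulateToyGame()
    Dim deck(1 To 52) As Long
    Dim trials As Long, t As Long, wins As Long
    Dim i As Long, j As Long, tmp As Long
    Dim playerBest As Long, dealerBest As Long
    trials = 100000
    Randomize
    For t = 1 To trials
        For i = 1 To 52: deck(i) = i: Next
        ' deal 10 unique cards: positions 1-5 player, 6-10 dealer
        For i = 1 To 10
            j = i + Int(Rnd * (53 - i))
            tmp = deck(i): deck(i) = deck(j): deck(j) = tmp
        Next
        playerBest = 0: dealerBest = 0
        For i = 1 To 5
            If deck(i) > playerBest Then playerBest = deck(i)
            If deck(i + 5) > dealerBest Then dealerBest = deck(i + 5)
        Next
        ' placeholder rule: the hand holding the higher card number wins
        If playerBest > dealerBest Then wins = wins + 1
    Next
    Debug.Print wins / trials   ' estimated player win rate under the toy rule
End Sub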
 
Hi,

I knew that I could solve the problem (in less than 30 s) by simulation BEFORE I posted here!

What I want is to run through the complete cycle for the card game (not the Pi case!) so that I can get the EXACT answer.
 
Hi, James989!
70 days 1 PC :eek:, 70 PCs 1 day :oops:, anything intermediate :),... otherwise use MatLab :mad:.
Regards!
 
I know this is an old thread, but I thought it might make sense to share for future generations :)

Multithreading for loops in VBA is possible and there are several approaches to this. Read more here: http://blog.tkacprow.pl/excel-multithreading-vba-vs-vbscript-vs-c-net/

Using this Excel tool which I developed (in VBA): http://blog.tkacprow.pl/excel-vba-multithreading-tool/ you can easily achieve multithreading of your example like this:

Code:
Sub ForLoop(workbookName As String, seqFrom As Long, seqTo As Long)
    Dim i As Long, R As Double, count As Long
    For i = seqFrom To seqTo
        R = Rnd ^ 2 + Rnd ^ 2
        If R <= 1 Then
            count = count + 1
        End If
    Next
    'Write this thread's partial count back to the master workbook
    ParallelMethods.SetRangeToMaster workbookName, "Sheet1", "A" & seqFrom, count
End Sub

Sub ParallelForLoop()
    Dim parallelClass As Parallel
    Dim result As Double
    Set parallelClass = New Parallel
    parallelClass.SetThreads 4
    Call parallelClass.ParallelFor("ForLoop", 1, 1000) 'Replace 1000 with any number
    result = Application.WorksheetFunction.Sum(Range("A:A"))
End Sub
 