Dummy Data – How to use the Random Functions
Using collected or known data is the best when developing Excel models, but from time to time this may not be available when you are developing your model.
This post will look at some options for setting up Dummy Data using Excel's Random functions.
Variability
Real data displays variability, but that variability generally falls within known ranges or distributions.
All field types can contain variability
ie: Country, State Names and Zip/Postal Codes: may be large lists but are fixed
People's Names: may be a large list but are fixed by local rules
Ages: generally less than 80, never less than 0
Dates: rarely before 1990, or before 1900 in rare cases
Lists: are fixed
Numbers: generally random or conforming to a fixed distribution or known trend
Numbers: may include integers, decimals, negatives, extremely large numbers or all combinations
In generating random lists you will need to choose whether you want purely random data, random data within constraints or random data that follows a distribution. The choice is really yours and should in part be based on what the data is being used for and how accurately it needs to reflect reality.
Techniques
The techniques described below are all shown with a worked example in the attached Examples File or the Excel 2003 Example
Each example is annotated below like (Example 4.). ie: Refer to Example 4 in the above example files.
Dates
Setting up Random Dates is a simple process using the Randbetween and Date functions.
=Randbetween(StartDate,EndDate)
Dates in a Range of Years
=Randbetween(Date(2000,1,1),Date(2011,12,31))
Will give a list of Random dates between 1 Jan 2000 and 31 Dec 2011 (Example 1.)
(Thanx Mike W)
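For readers who want to see the same idea outside Excel, here is a minimal Python sketch (Python is not part of the example files, and the function name `random_date` is made up for illustration). It relies on the same trick as the formula: dates are just whole day numbers, so a random integer between the start and end day numbers is a random date.

```python
import random
from datetime import date

def random_date(start: date, end: date, rng: random.Random) -> date:
    """Return a random date between start and end, both inclusive."""
    # Dates map to whole day numbers, much like Excel's serial dates,
    # so picking a random integer in that span picks a random date.
    return date.fromordinal(rng.randint(start.toordinal(), end.toordinal()))

rng = random.Random(42)  # fixed seed so the list is repeatable
dates = [random_date(date(2000, 1, 1), date(2011, 12, 31), rng)
         for _ in range(1000)]
```

Unlike Rand and Randbetween, which recalculate whenever the sheet does, the seeded generator here returns the same list on every run.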
Dates in a Month
=Date(2010, 6, Randbetween(1,30))
Will give a list of Random dates between 1 June 2010 and 30 June 2010 (Example 2.)
Don’t worry that this style of formula (Example 2) can actually produce a date such as 31 Feb 2005; the Date function will happily convert that to 3 March 2005 (Example 3.)
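The rollover arithmetic is easy to check by hand: February 2005 has 28 days, so "day 31" lands 3 days into March. The Python sketch below mimics that behaviour. It is an illustration only: Python's own `date()` rejects 31 Feb, so the hypothetical helper `excel_like_date` starts from the 1st of the month and counts forward. It only covers day overflow, not out-of-range months.

```python
from datetime import date, timedelta

def excel_like_date(year: int, month: int, day: int) -> date:
    """Mimic the Date function's rollover for day numbers past month end."""
    # Start on the 1st of the month and step forward (day - 1) days,
    # so any overflow spills into the following month.
    return date(year, month, 1) + timedelta(days=day - 1)

rolled = excel_like_date(2005, 2, 31)
print(rolled)  # 2005-03-03
```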
Dates within a Date Distribution
=DATE(2011,7,NORMINV(RAND(), 0,60))
Will give a list of Random dates between approximately 1 Jan 2011 and 31 Dec 2011, with a mean of about July 1 and a standard deviation of 2 Months (60 days) (Example 4.)
Where NORMINV(RAND(), 0,60) will return values between -180 and +180, 99.7% of the time
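The 99.7% figure is the usual "three standard deviations" rule: 3 x 60 days = 180 days either side of the centre. A rough way to check it is to simulate the offsets. The Python sketch below does that as an illustration only. The seed, the sample size and the rounding to whole days are assumptions. The centre date and the 60 day spread come from the formula.

```python
import random
from datetime import date, timedelta

rng = random.Random(1)       # fixed seed, purely for repeatability
centre = date(2011, 7, 1)    # year and month from the DATE(2011,7,...) formula
sd_days = 60                 # standard deviation used in the formula

# Draw normally distributed day offsets and round them to whole days
# (rounding is an assumption made for this sketch).
offsets = [round(rng.gauss(0, sd_days)) for _ in range(10_000)]
sample_dates = [centre + timedelta(days=o) for o in offsets]

# Share of offsets within three standard deviations (180 days).
share_within_180 = sum(abs(o) <= 180 for o in offsets) / len(offsets)
```

`share_within_180` should come out close to 0.997, which means a handful of dates in a long list will still fall outside the calendar year.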
Text Fields
Depending on how many items you require in the list, there are 3 techniques available
Choose
For small lists of less than 6 to 10 items you can use a simple Choose function (Example 5.)
=Choose(Randbetween(1,6), "Item 1", "Item 2", "Item 3", "Item 4", "Item 5", "Item 6")
VLookup
Using VLookup (Example 6.)
=Vlookup(Randbetween(1,List Length), List, 2)
Index
Using Index (Example 7.)
=Index(List, Randbetween(1, Counta(List) ))
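All three techniques boil down to the same two steps: pick a random position between 1 and the length of the list, then return the item at that position. A small Python sketch of those steps follows, for illustration only. The helper name `pick` and the sample list are assumptions.

```python
import random

rng = random.Random(7)  # fixed seed so the picks are repeatable
items = ["Item 1", "Item 2", "Item 3", "Item 4", "Item 5", "Item 6"]

def pick(values, rng):
    """Pick one entry the way Index(List, Randbetween(1, Counta(List))) does."""
    # 1-based position, as Randbetween(1, Counta(List)) would produce
    position = rng.randint(1, len(values))
    # Python lists are 0-based, so shift the position down by one
    return values[position - 1]

picks = [pick(items, rng) for _ in range(600)]
```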
Numbers
Small Random List of Numbers
Random from a small list of numbers (Example 8.)
=Choose(Randbetween(1,6), Numb 1, Numb 2, Numb 3, Numb 4, Numb 5, Numb 6 )
Note that the numbers:
- Don’t have to be in any order,
- Can be integers, negatives or contain decimals
- Can be repeated
eg: =Choose(Randbetween(1,6), 18, 21, -19, 36.4, 18, 24)
Random Integers
Return Integers between Start and Finish (Example 9.)
=Randbetween(Start, Finish)
=Randbetween(50, 100)
Will return an Integer between 50 and 100
Random Numbers
=Rand()
Will return a random number between 0 and 1
=Round(Rand()*100, 2)
Will Return Numbers between 0 and 100 with 2 Decimal places (Example 10.)
Random Numbers Based on a Distribution
=Norminv(Rand(), Mean, SD)
Will return a random number drawn from a Normal distribution with Average = Mean and Standard Deviation = SD. Note the result is not limited to the range 0 to 1; only the Rand() input is.
=Norminv(Rand(), 50, 17)
Will return a random number that falls between roughly 0 and 100 about 99.7% of the time, based on a distribution of Average = 50 and Standard Deviation = 17 (Example 11.)
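The technique here is inverse transform sampling: feed a uniform random number between 0 and 1 into the inverse of the Normal cumulative distribution. The Python sketch below does the same using the standard library's `NormalDist.inv_cdf`. It is an illustration only; the helper name `norminv_rand`, the seed and the sample size are assumptions.

```python
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(11)               # fixed seed for repeatability
dist = NormalDist(mu=50, sigma=17)    # same Mean and SD as the formula above

def norminv_rand(rng):
    """Python counterpart of Norminv(Rand(), 50, 17)."""
    u = rng.random()
    while u == 0.0:                   # inv_cdf needs a value strictly above 0
        u = rng.random()
    return dist.inv_cdf(u)

values = [norminv_rand(rng) for _ in range(20_000)]
share_0_100 = sum(0 <= v <= 100 for v in values) / len(values)
```

The sample average and standard deviation should land close to 50 and 17, and a small share of values will fall below 0 or above 100. If the dummy data must stay within hard limits you will need to clamp or redraw those values.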
Random Numbers Fitting a Trend
If your data has to match a trend, add a Random component to the trend's equation (Example 12.)
Y=mX+c
= rand() * X + rand()*5
= rand() * A2 + rand()*5
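The formulas above randomise both the slope and the intercept. A common variation, sketched below in Python, keeps the slope m and intercept c fixed and adds a small random amount on top. This is an illustrative alternative, not the method used in the example file. The values of `m`, `c` and `spread` and the helper name `noisy_trend` are all assumptions.

```python
import random

rng = random.Random(3)  # fixed seed for repeatability

def noisy_trend(x, rng, m=2.0, c=1.0, spread=5.0):
    """Return m*x + c plus uniform noise in [-spread/2, +spread/2)."""
    # m, c and spread are illustrative values, not taken from the post
    noise = (rng.random() - 0.5) * spread
    return m * x + c + noise

xs = list(range(1, 101))
ys = [noisy_trend(x, rng) for x in xs]
```

In Excel terms this corresponds to something like = m * A2 + c + (Rand() - 0.5) * 5, with m and c replaced by your own trend values.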
True/False
Choose
Use Choose and Randbetween (Example 13.)
=Choose(Randbetween(1,2), True, False)
If
Use If and Rand (Example 14.)
=If(Rand()<0.5, True, False)
Combination Text and Numbers
The above techniques can be combined to make lists of Alpha Numeric Data
Say your business has a fleet of vehicles (TR=Truck, VN=Van, CAR=Car)
=Choose(Randbetween(1,3), "TR", "VN", "CAR") & Text(Randbetween(1,15), "0#")
Will randomly choose 1 of "TR", "VN", "CAR" and append a random number between 1 and 15, formatted with a leading 0, eg: TR05 (Example 15.)
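The same two-part construction, a random prefix plus a zero-padded random number, looks like this as a Python sketch. It is for illustration only, and the helper name `vehicle_id` is an assumption.

```python
import random

rng = random.Random(5)  # fixed seed for repeatability

def vehicle_id(rng):
    """Build an ID such as TR05: random prefix plus a two-digit number."""
    prefix = rng.choice(["TR", "VN", "CAR"])   # Truck, Van or Car
    number = rng.randint(1, 15)                # 1 to 15 inclusive
    return f"{prefix}{number:02d}"             # pad to 2 digits, e.g. 5 -> 05

ids = [vehicle_id(rng) for _ in range(200)]
```

Note that nothing here, or in the Excel formula, stops the same ID from appearing twice; if you need unique IDs you will have to remove duplicates afterwards.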
Other Sources of Data
Random Data
There are a number of web sites where Random Data is available.
http://www.fakenamegenerator.com/order.php
http://www.generatedata.com/#generator
http://www.melissadata.com/lookups/
Open Source Data
There are a number of web sites where Open Source Data is available.
http://www.readwriteweb.com/archives/where_to_find_open_data_on_the.php
Functions Used:
Rand: Returns a random number between 0 and 1.
Randbetween: Returns a random Integer between lower and upper limits. Pre Excel 2007 Randbetween was only available through installation of the Analysis Toolpak (Thanx Luke).
Norminv: Returns the inverse of the normal cumulative distribution. That is, it returns the X value from a Normal Distribution that has a known Mean and Standard Deviation, where a known cumulative percentage is supplied.
Choose: Choose an item from a list of up to 254 items.
Vlookup: Lookup the matching value from a list and return a data item from another column from the same location.
Index: Retrieves an item from a defined location within a range.
Text: Displays a number as Text with a defined format.
Other Uses of Random Functions
Of course the techniques shown here don’t have to be used for setting up Dummy Data.
One area where Random numbers are used is in Monte Carlo Simulation. This has been discussed at Chandoo.org at Data Tables and Monte-Carlo Simulations in Excel a Comprehensive Guide
Techniques
The techniques described above are all shown with a worked example in the attached Examples File or the Examples File 2003 ver
Limitations in Pre Excel 2007 versions
The Excel function Randbetween only became a built-in function in Excel 2007. As such, the examples above will only work in earlier versions if the Analysis Toolpak is installed.
However a simple alternative is available
Randbetween(Low, High) = Low + Int(Rand()*(High-Low+1))
Randbetween(90, 100) = 90 + Int(Rand()*11)
Examples using this approach are shown in the 2003 Version of the Examples files above.
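A quick way to check any Int/Rand substitute for Randbetween is to simulate it and confirm that both limits actually turn up. The Python sketch below tests the form Low + Int(Rand()*(High-Low+1)), which covers Low and High inclusive because Rand() is always less than 1. It is an illustration only; the seed and the number of draws are assumptions.

```python
import math
import random

rng = random.Random(2003)  # fixed seed for repeatability

def randbetween(low, high, rng):
    """Low + Int(Rand() * (High - Low + 1)), inclusive of both limits."""
    # rng.random() is in [0, 1), so the product is in [0, high - low + 1)
    # and flooring it gives a whole number from 0 to high - low.
    return low + math.floor(rng.random() * (high - low + 1))

draws = [randbetween(90, 100, rng) for _ in range(5000)]
print(min(draws), max(draws))  # 90 100
```

If the multiplier is only (High-Low) and 1 is added outside the Int, the lowest result is Low+1 and Low itself never appears.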
How have you made Dummy Data or used the Random Functions?
How have you made Dummy Data, and how have you used it?
How have you used Random Numbers in your workbooks?
Let us know in the comments below:

















37 Responses to “Pie of a Pie of a Pie chart [Good or Bad?]”
If I could have the same quality of graphics in Office Apps (Excel, PPT) I would certainly use it.
Chandoo,
First, let me say I love your blog. I like this post, and I think that technically (in terms of readability of data) your argument is correct. The bar of bars, and the table, are much better for readability and accuracy, and as you say would be much easier to produce.
But these points ignore the context of the chart. If the chart was part of a scientific paper, your solution would be a valid one. The context in this case is an illustrated atlas of wildlife. A companion graphic to go with written text. The importance of aesthetic goes up over readability and accuracy. Much of the data and points (I assume) will be covered in the text.
There's always a pure technical tufte-esque argument. But I sometimes think it ignores the value of aesthetics. (Which I admit are quite subjective)
Great post though. Thanks.
The Treemap makes the scope of the data much clearer! The 3D pie chart depiction is deceptive.
This reminds me of the videos I've seen on the internet that compare the relative sizes of the earth with the larger planets, then the sun, then other stars in the galaxy. Eventually there is an image showing the largest star in the sky with a little pixel representing the sun.
My point is if you varied the size of the charts it would help convey the message. The first chart (salt vs fresh) would be the biggest and the rest would be arranged in descending order. I feel this would be more accurate.
It may be helpful to consider the advice of Steven Few and Edward Tufte regarding pie charts in general. To summarize, they are seldom the most useful way to present data. Here's Few's thoughtful piece on the subject.
http://www.perceptualedge.com/articles/08-21-07.pdf
Try putting the percentages on the bar charts instead of actual amounts. Lakewater would be .013 % instead of 52.
That is very good pie chart example.
Please send example file if it is possible.
It will work, even though colors may be confusing, it can be labeled well. Also it can be called a drilled chart, as it drills into the information further: the first chart may show business in a region, the second may drill into a particular region, the third may further drill into what products are there in that region. It works well for me, I would vote more for the 2nd option.
Overall all this site is awesome ,
p.s : just like me
The risk with pie of a pie of a pie chart is that Jon may have a seizure by looking at it. Also, it isn't easy to read. 😉
I dunno. The only thing worse than a pie chart is a cascading series of pie charts. I don't even think they really lend themselves to this sort of thing. It just becomes a big hide-the-ball game with your viewer.
Those goofy connectors between the pies are pure chart junk. I can't really tell if the second chart has 2 series or 3 - because the connector is a different color than the 2 labeled slices. Despite that, even whereas the drill down kind of works, still the individual components suffer from the same old weaknesses that 3d pie charts have.
Use a large bar chart as your "cover story", and fill in the sub points with smaller bar charts - or even go grab the Fabrice SFE project for extra butter. Use page orientation, color, and some text styles to guide your audience through the drill downs.
FWIW, if you check out the guy's site, you can find several other truly mortifying charts:
http://www.andrewdavies.com.au/index.html
The methane emissions one is particularly heinous. Although, I'm kind of debating what I think about the 'Glacier Changes" chart. I'd kind of like to see the data on that to see how it would look in a more traditional horizon chart.
Thank you, that was scary. I don't understand the "Glacier changes" Chart at all...
Its a very nice way to represent the data, especially when we have sets and sub-sets within the data.
I like these!
Except for the fact that they aren't dynamic and hence must be setup manually each time
It would also be nice if they could be interrogated as in select a different segment and the new data falls out automagically, but then none of the standard Excel charts do that either.
I'd like it better if the bars were stacked. How about this idea (I hope I can convey it in words):
First bar is vertical and stacked.
Second bar is horizontal, stacked horizontally and the same proportion had it been on the first bar.
Third bar is vertical, stacked vertically and the same proportion had it been on the second bar.
Then it would really look like you are zooming in on the chart, like the Powers of Ten video, or maybe like the golden ratio spiral.
These looks shunting but setting up for each step makes kicks them out. However if these can be arranged automatically by native excel or by VBA, these will be the part of my "Archery"
I agree with Chandoo's Suggestion about the Bar Graph which represents data in a very appropriate manner. Even I prefer doing the same. I seldom use Pie Chart unless required.
That's a real nice example of a misleading infographic. But to be honest, I think Chandoo's suggestion is not much better!
Why are pie charts bad? I think because they don't show the real size-relations. The biggest pie in that example is 300k big. The 2nd one has only the size of 10k, about 3% of the first one. Neither the pies nor the bars show the real sizes. I know, it's hard to show the sizes because the values of the second and the third pie are so small. But that's what visualizations are about - showing relations to allow the reader to see the real sizes!
So how to show the real figures?
First possibility is to use a 1:1 scaling. Well then, you need a very big screen to show it, even after a 90° rotation, which I would prefer because it's a structural comparison and not a timeline. Maybe that solution is not the perfect way.
The other chance you have is to zoom in but to really show that you zoom in! http://www.pro-chart.de/images/Water_Fall.png maybe gives you a first impression of what I mean. (It was a quick try, done in 10 minutes)
The next way is, maybe to fold the bars like in the financial report 2011 of the Post of Switzerland page 22. That chart is based on an excel chart. Maybe can explain you how to do it 😉
Financial Statement: http://www.post.ch/en/post-startseite/post-berichterstattung/post-berichterstattung-service/post-berichterstattung-downloads/post-gb-2011-finanzbericht.pdf
page 22: http://www.pro-chart.de/images/FS_Schweizer_Post.png
A way that is not so very common is to divide the bars in a lot of single datapoints. So maybe the 390k bar then consists of about 5,000 single datapoint. That's not possible - it is! Have a look:
http://www.pro-chart.de/images/Dotted_WF.png
It's pure excel!
Now one single point is 0.2% of the whole (in the example above). Add more datapoints and you can visualize the very big and the very small numbers!
Wish you a lot of fun - visualizing with excel can be very powerful!
Joerg
...if you would like to know how these charts work, just send an email to J.Decker@pro-chart.de
Hey Joerg,
I don't dig so much the dotted waterfall thing. But this is kind of awesome:
http://www.pro-chart.de/images/FS_Schweizer_Post.png
Can you help me on the bar of bar graph? Would it be possible to create that from pivot table? Can you show me how to create the bar of bar graph?
do nothing but say "Awesome!"
You are a Rock star.....This seemed an answer as if someone was reading my mind and just had the solution to my questions on what I exactly was looking for .....What a Fab !!
can u explian me step by step
Can anyone please explain how to make this chart please.
Do you mean the pie of a pie chart or the folded bar chart?
Joery PIE OF PIE Chart please
Can someone please explain how to make PIE OF PIE Chart.
@Mandeep
The last line of the post is:
PS: If you want to know to create this pie of pie of pie chart in excel, see here.
Due to forum migration, link is now:
http://chandoo.org/forum/threads/multiple-pie-chart.7343/
Hi... i love these charts.... can any one show me how to draw these charts in excel 2010
@Vamshi
The very last line of the post refers you to:
PS: If you want to know to create this pie of pie of pie chart in excel, see here. http://chandoo.org/forums/topic/multiple-pie-cahrt
Where is the attachment....it used to be there...i have seen this before but now i am not able to find...
See this:
http://img.chandoo.org/playground/WaterDistribution-chandoo.xlsx
And this:
http://chandoo.org/forum/threads/how-do-you-create-this-chart.9743/
Normally I don't learn post on blogs, however I would like to
say that this write-up very compelled me to try and do so!
Your writing style has been amazed me. Thank you,
quite great article.
This is very impressive, I would like to learn how to build this for myself. I have tried for some time now, is there a step by step process on how to create these waterfall pie of pie charts?
I am novice to excel and use it very seldom. But your blog contains to the point information one needs to get going.
I was searching for a trick to do a Pie chart drill down - for example the first pie chart shows how the prices are distributed between perishable and non-perishable items.
Now if we want to know how the perishable items are distributed - one can click the segment and it will draw another pie chart with distribution of all different perishable items (milk,meat,fruit,veg etc)
So do you have any such trick?
Regards,
electrojit
I like the look of your pie of pie of pie chart, although I understand that the relative size of each pie does not represent the actual percentages.