September has been by far the best month since I started this blog, with 103k page views, 34 posts and 246 comments (great discussions, everybody 🙂). The RSS subscriber count crossed the 1000 milestone last month and is growing steadily each day.
With so many posts and great discussions every month, it may be difficult for you as a reader to keep track of the best articles. So from this month onwards, Pointy Haired Dilbert will feature a monthly post on the best articles from the previous month.
- Extracting Initials from Names using Excel Formulas 33 Comments [September 02]
- Micro Charting in Excel – 7 Alternatives Reviewed 14 Comments [September 05]
- Building Sexy Dashboards using Excel – 4 Part tutorial 11 Comments [September 10]
- Petal Charts – Debatable Alternative to Radar Charts 12 Comments [September 18]
- Brilliant and Fun to Watch – Microsoft I am a PC ads 7 Comments [September 19]
- Fuzzy Searching in Excel – Handling Spelling Mistakes 6 Comments [September 25]
- Love & Sex – 10 Infographics that can WOW you 2 Comments [September 26]
- Cleaning up Phone Numbers Using Excel 3 Comments [September 30]
Subscribe
You can subscribe to Pointy Haired Dilbert regular updates by RSS Reader, E-mail newsletters, or in many other ways.
Follow the great discussions on the blog posts by subscribing to comments
Join the small but passionate group of 1000+ Excel learners, users and pros today!
You will do me a great favor by adding this blog to your Technorati favorites or Delicious bookmarks.
I am happy to see this small blog growing each day and I owe this success to You, my readers. Thank you :).
As always, feel free to drop your suggestions and ideas in the comments or by e-mail (chandoo.d at gmail.com) and I will be *very* happy to respond to you.
8 Responses to “Pivot Tables from large data-sets – 5 examples”
Do you have links to any sites that provide free, large test data sets? Large both in diversity and in total number of rows.
Good question, Ron. I suggest checking out kaggle.com or data.world, or creating your own with RANDBETWEEN(). You can also get a complex business data set from the Microsoft Power BI website: the Contoso retail data.
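If you would rather script the "create your own" option outside Excel, here is a minimal Python sketch of the same idea. It is only an illustration: the column names, categories and value ranges are made up, and `random.randint` stands in for RANDBETWEEN().

```python
import csv
import io
import random


def make_test_rows(n_rows, seed=0):
    """Return n_rows of made-up sales records as a list of dicts."""
    rng = random.Random(seed)  # fixed seed so the data is reproducible
    regions = ["North", "South", "East", "West"]  # hypothetical categories
    products = ["Widget", "Gadget", "Gizmo"]      # hypothetical categories
    rows = []
    for order_id in range(1, n_rows + 1):
        rows.append({
            "order_id": order_id,
            "region": rng.choice(regions),
            "product": rng.choice(products),
            "units": rng.randint(1, 50),        # inclusive, like RANDBETWEEN(1, 50)
            "unit_price": rng.randint(5, 200),  # inclusive, like RANDBETWEEN(5, 200)
        })
    return rows


def write_csv(rows, handle):
    """Write the records to any open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


rows = make_test_rows(100_000)
buffer = io.StringIO()  # in-memory stand-in for a real file
write_csv(rows, buffer)
```

To get an actual file, pass `open("test_data.csv", "w", newline="")` instead of the in-memory buffer (the file name is just an example) and open the result in Excel.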
Hi Chandoo,
I work with large data sets all the time (80-200MB files with 100Ks of rows and 20-40 columns) and I've taken a few steps to reduce the size (to 20-60MB) so they can be shared more easily and work more quickly. These steps include: creating custom calculations in the pivot instead of having additional data columns, deleting the data tab and saving as an xlsb. I've even tried INDEX/MATCH instead of VLOOKUP, although I'm not sure that saved much. Are there any other tricks to further reduce the file size? Thanks, Steve
Hi Steve,
Good tips on how to reduce the file size and / or processing time. Another thing I would definitely try is to use the Data Model to load the data rather than keep it in the file. You would:
1. connect to source data file thru Power Query
2. filter away any columns / rows that are not needed
3. load the data to model
4. make pivots from it
This would reduce the file size while providing all the answers you need.
Give it a try. See this video for some help - https://www.youtube.com/watch?v=5u7bpysO3FQ
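For readers who think in code, the four steps above can be sketched in Python. This is not Power Query or the Data Model, just a rough analogue of the same filter-then-aggregate flow, and the sample data and function names are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up source standing in for a large external data file.
SOURCE = """region,product,units,notes
North,Widget,10,a
South,Widget,5,b
North,Gadget,7,c
North,Widget,3,d
"""


def load_filtered(handle, keep_columns, keep_row):
    """Steps 1-3: read the source, drop unneeded rows and columns."""
    for record in csv.DictReader(handle):
        if keep_row(record):
            yield {name: record[name] for name in keep_columns}


def pivot_sum(records, row_field, value_field):
    """Step 4: total value_field for each distinct row_field value."""
    totals = defaultdict(int)
    for record in records:
        totals[record[row_field]] += int(record[value_field])
    return dict(totals)


records = load_filtered(
    io.StringIO(SOURCE),
    keep_columns=["region", "units"],
    keep_row=lambda r: r["product"] == "Widget",
)
totals = pivot_sum(records, "region", "units")
print(totals)
```

The point is the same as in the steps: only the filtered rows and columns are ever held for summarizing, so the raw data does not need to live inside the workbook.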
Normally when Excel processes data it utilizes all four cores on a processor. Is it true that Excel drops to using only two cores when calculating tables? And likewise, if only two cores were present, would it drop to one for a table?
I ask because I have personally noticed that when I use tables, processing the data is much slower than if I had just filtered it. I like tables for obvious reasons when working with datasets. Is this true?
John:
I don't know if it is true that Excel Table processing only uses 2 threads/cores, but it is entirely possible. The program has to be enabled to handle multiple parallel threads. Excel Lists/Tables were added long ago, at a time when 2 processes was a reasonable upper limit. And, it could be that there simply is no way to program table processing to use more than 2 threads at a time...
When I've got a large data set, I will set my Excel priority to High thru Task Manager to allow it to use more available processing. Never use RealTime priority or you're completely locked up until Excel finishes.
That is a good tip Jen...