• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Best way to build a data model

Bakkone

New Member
Hello!


Im looking into PowerPivot and building a few simple databases. I've been making a few VBA scripts collecting data from various reports and sources to build my own little data warehouse.


Having never really done this before Im stuck on a question regarding the "theory" of building a database. What is the best way to do it? Which way is better in regards to use with Pivot, PowerPivot without taking up too much computing power.


Look at these Examples:


1,

Period Product Type Amount

130630 Banana Cost -1

130630 Banana Sales 2

130630 Apple Cost -3

130630 Apple Sales 45

130630 Oranges Cost -156

130630 Oranges Sales 687


2,

Period Product Cost Sales

130630 Banana -1 2

130630 Apple -3 45

130630 Oranges -156 687


I want to use the first example since I find it easier to have one column with the same type of data in it. If I want thet result for apples I can just sum all rows with apple. But the first example has twice as many cells as number two. Im thinking after a few hundreds of thousand rows this will be too much. And I can make calculated fields in pivot to calculate the net result for me. Note that these are examples. IRL Cost and Sales would probably be 15 different things, so 15 columns or x15 rows. Then again PowerPivot is great with handling massive amounts of rows and data


Im looking to this forum for some help on how to think about this. Anyone know a good book where I can read up on the theory behind building databases like this? Any special blogs I should read? Whitepaper? If you had the above examples, what would you do?
 
Bakkone


It depends on the purpose of the final result


If you want to know the Sum of Sales on a Day or teh count of sales you need to keep the source data


If you are only looking at Summary data, Summary data will do


Keep in mind that if you use Power Pivot it will handle large data sources ie: >20,000,000 records with relative ease


So I'd suggest keeping all the data for a while and see how the system goes

You can always revert to Summary data later, but once you've lost the source data it's gone forever.


If you keep it in Excel setup0 macros to produce calculated fields and save them as values daily.

This reduces the computation overhead by removing redundant calculations
 
Thanks for your reply. Ill experiment a bit and see what I like the most.


If you had to choose between the two above setups? Which would you prefer?
 
Managers and others always think up new measures/KPI's etc that they require

So I'd suggest keeping all the data for a while and see how the system goes

You can always revert to Summary data later, but once you've lost the source data it's gone forever.
 
Back
Top