
Pivot table memory using appended queries

I have a Power Query that appends several separate Excel tables. When I insert a PivotTable and connect it to that query, everything works fine, but the file size jumped considerably. In PivotTable Options, the Save source data with file checkbox is grayed out, so I can't clear it. I think this is the source of the problem.

Can anything be done about this?

Paul
 
No. Power Pivot based on the Data Model requires the data to be stored in a cache, and that cached data stays with the workbook.

The only way to reduce size is to look at your data and see whether you can shrink its footprint:
  1. Data granularity - Do you actually need all the rows in the data model, or can you pre-aggregate some of the data so there is less of it?
  2. Table structure - Is every attribute that belongs in rows actually stored vertically? In other words, can you reduce the column count? The Data Model's VertiPaq engine is optimized for vertical compression.
  3. Data model design - Do you have redundant data? Optimize storage by using fact and dimension tables, and relate tables through an ID/numeric column rather than storing the same text values repeatedly (see the M sketch after this list).
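
For illustration, here is a minimal M sketch of points 2 and 3 for the Advanced Editor. The query name Appended and all column names are hypothetical; substitute your own.

    // Point 2: keep only the columns the PivotTable actually uses.
    let
        Source = Appended,
        KeepNeeded = Table.SelectColumns(Source,
            {"OrderID", "OrderDate", "CustomerName", "Amount"})
    in
        KeepNeeded

    // Point 3: a separate "Customers" dimension query with distinct names
    // and a numeric key, so the fact table can store an ID instead of
    // repeating the same text value on every row.
    let
        Source = Appended,
        Customers = Table.Distinct(Table.SelectColumns(Source, {"CustomerName"})),
        WithKey = Table.AddIndexColumn(Customers, "CustomerID", 1, 1, Int64.Type)
    in
        WithKey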
 
Thanks. #2 seems the most promising. The source data came with many columns that the PivotTable doesn't use. I'll delete those unused columns in the Query Editor and see what difference it makes.
Paul
 
Update: Yes, deleting the unneeded columns in the Query Editor cut a huge amount of file size: it went from 140 MB down to 90 MB, about 36% smaller. Still large, but much better.
Paul
 
I have a follow-up question: if I filter out unneeded rows in the Query Editor, would that also reduce the file size, or is file size only sensitive to the number of columns?
Paul
 
I'd say it's both, though the number of columns has the bigger impact. As @Chihiro said, the "vertipaq engine is optimized for vertical compression."
VertiPaq optimizes data in several ways, but fewer unique values generally helps. That's also why granularity is important (as Chihiro noted). Storing individual database transactions quickly adds up to very high row counts, while minute-by-minute, hourly, daily, or monthly aggregations increasingly improve the performance of a data model and make it drastically smaller too, as sketched below.
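
A minimal sketch of that idea, assuming a query named Transactions with hypothetical Timestamp and Amount columns: truncating the timestamp to a date collapses many unique datetime values into far fewer dates, and grouping then stores one row per day instead of one per transaction.

    let
        Source = Transactions,
        // reduce unique values: keep only the date part of each timestamp
        AsDate = Table.TransformColumns(Source, {{"Timestamp", DateTime.Date, type date}}),
        // pre-aggregate: one row per day instead of one per transaction
        Daily = Table.Group(AsDate, {"Timestamp"},
            {{"DailyTotal", each List.Sum([Amount]), type number}})
    in
        Daily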
 
Can I filter out unneeded rows in the editor, or do I need to go to the source data and eliminate rows before importing into Power Query?
Paul
 
Filtering in the UI will work, Paul; it would be strange if it didn't. And if the filter can be applied earlier, at the source, the job for Power Query becomes easier too.
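
For reference, a filter applied in the editor UI simply writes a Table.SelectRows step for you. A hand-written equivalent, using a hypothetical OrderDate column and cutoff date:

    let
        Source = Appended,
        // same step the filter dropdown would generate for you
        Recent = Table.SelectRows(Source, each [OrderDate] >= #date(2023, 1, 1))
    in
        Recent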
 