1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Reduce the size of Very Large Excel Data Model

Discussion in 'Ask an Excel Question' started by Dokat, Dec 22, 2016.

  1. Dokat

    Dokat Member

    Messages:
    163
    Hi,

    I created a data model with 295K rows and 27 columns which comes out to be 17 MB when i ran a power pivot table. After adding another table with 227K row to the data model all sudden data size jumps to over 42 mb? Can anyone advise me on how to reduce the size of the data model...

    Thanks
  2. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,615
    It really depends on what your doing with it?

    Are you saving it as an Excel Binary file ? (*.xlsb file type)

    Do you need to keep it all in Excel?
    Can it be offloaded to Access or a CSV table and accessed via a Power tool ?

    There are techniques for summarising large tables in Excel that are blindingly fast, so don't let the actual file size worry you, it is the speed that is generally an issue
  3. Dokat

    Dokat Member

    Messages:
    163
    I am saving as .xlsm file as data excel dashboard. Yes i need to keep it all in Excel. Thanks
  4. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,615
    So what are you trying to achieve ?
  5. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,394
  6. Dokat

    Dokat Member

    Messages:
    163
    I am trying to update a excel dashboard....source data is in another excel workbook with data model over 184mb. I am trying to see if there is a way to reduce the size of the excel file so when i update the dashboard file stays under 20 mb.

    Thanks
  7. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,615
    How often does the data change? Or is it just added to?
    Are there any fields that aren't required?
  8. Dokat

    Dokat Member

    Messages:
    163
    Data changes once a week and yes all columns are required. Thanks
  9. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,615
    Where does the data come from and how ?
  10. Dokat

    Dokat Member

    Messages:
    163
    From another excel file...Its a data pull from proprietary system
  11. Hui

    Hui Excel Ninja Staff Member

    Messages:
    10,615
    Why can't the dashboard be setup to extract the data from the other file using SQL or other linkages?
  12. Dokat

    Dokat Member

    Messages:
    163
    Hi, i dont have any experience with SQL. Thanks
  13. David Evans

    David Evans Active Member

    Messages:
    577
    Here's your chance to learn a new skill, then!
  14. Dokat

    Dokat Member

    Messages:
    163
  15. David Evans

    David Evans Active Member

    Messages:
    577
    Great! You're in the right place with the right people to get some help looking at your problem from different perspectives. Give us some information on what your data source looks like
    Khalid NGO likes this.
  16. Dokat

    Dokat Member

    Messages:
    163
    It's a excel workbook with 2 sheets. One sheet has 197K rows and 27 columns, the other one has 266K rows and 20 columns. Total size of the workbook is 176 mb. I have a vba code in the spreadsheet thus i have to save the file in /xlsm format. I have a dashboard (another workbook) that uses this workbook as a source file. When i ran pivot table using power pivot data model dashboard size reduces to 46 MB but it is still very large file to open and use. I am open to any recommendations or suggestions you may have. Thank you
    Last edited: Dec 29, 2016
  17. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,577
    A few suggestion.

    1. Reduce data size by performing aggregation prior to loading to data model (PowerQuery, ADODB, MS Query etc)
    2. Use XLSB format instead of XLSM. It will further reduce size.
    3. Check PivotTable Cache and that you don't have unnecessary duplicates
    David Evans likes this.
  18. Dokat

    Dokat Member

    Messages:
    163
    Hi,

    1) How can i perform aggregation prior to loading the data.

    2) XLSB reduces the size of the file. I use external connections to link to database and run pivot table. However when i do that size of the dashboard increases exponentially.

    2) How can i check the PivotTable Cache that it doesnt have unnecessary duplicates?

    Thanks
  19. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,577
    1) Depends on method and type of aggregation. As well as your source data structure.

    You can find some example for PowerQuery below:
    Custom sumifs function
    http://forum.chandoo.org/threads/powerquiery-to-allocate-numbers-if-stock-is-available.32443/

    Using SQL statement in PowerQuery (connecting to MS SQL Server)
    • Aggregation performed on hourly IVR data to daily level and also aggregated on market level (Original data was broken out by city and Phone # called), joined with other lookup/related tables.
    • Data also snipped to relevant date range (Previous month & Current month to date)
    • Similar operation can be done in MS Query/ADODB (slightly different syntax) for workbook/sheets
    Code (vb):
    Select
        t1.reportdate as rDate,
        t1.Product,
        (Case When t1.Product='Product1-E' or t1.Product='Product1-F' Then
       'Product1' When t1.Product='Product2-E' or t1.Product='Product2-F' Then
       'Product2' Else t1.Product End) As pType,
       t3.regionName As Region,
        t2.marketName,
        SUM(t1.SalesCalls) As SalesCalls,
        SUM(t1.InfoCalls) As InfoCalls,
        SUM(t1.CallDuration) As CallDuration,
        SUM(t1.HoldDuration) As HoldDuration,
        SUM(t1.NewAni) As NewAni,
        SUM(t1.ExistingAni) As ExistingAni,
        SUM(t1.amount) As SalesAmount,
        SUM(t1.Support) As SupportCalls
    From
        View_IVR_Sales_Data as t1
        Left Outer Join View_MarketCodeMap as t2
        ON t1.cityCode = t2.cityCode
        Left Outer Join MarketRegions as t3
        On t1.cityCode = t3.cityCode
    Where
        t1.reportdate >= DATEADD(month, -1, Dateadd(month, datediff(month, 0, Getdate()),0))
    Group by
        t1.reportdate as rDate,
        t1.Product,
        (Case When t1.Product='Product1-E' or t1.Product='Product1-F' Then
       'Product1' When t1.Product='Product2-E' or t1.Product='Product2-F' Then
       'Product2' Else t1.Product End) As pType,
       t3.regionName As Region,
        t2.marketName
    2) Your set up isn't optimal then. Likely duplicating pivot caches, unnecessary formatting etc etc. Also consider splitting dashboard workbook into multiple workbooks. Organized by related data category etc. Any case, use XLSB over other formats for better performance.

    3) See link.
    http://ramblings.mcpher.com/Home/excelquirks/snippets/pivotcache

Share This Page