1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Excel File size

Discussion in 'VBA Macros' started by Monty, May 11, 2017.

  1. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Everyone!

    Am facing a big issue at work as my excel file consists of 8 lakhs of data while opening/closing and saving file take hell out of time...Problem is i can have data only in excel no other choice.

    What i did...Converted my file to .xlsb to make file size reduce...but the problem still exists.

    Any comments/Suggestions on how to handle files with huge data...

    Monty!
  2. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    That's... 800,000 rows? How many columns?

    What Version of OS, Ram & CPU along with Excel version (32/64 bit) do you have?
    Monty likes this.
  3. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Chihiro.

    Thaks' for the quick response.

    Details

    OS : Win 7
    RAM : 4gb
    XlVer : 2013 (64 Bit)
    Columns : 36

    But still no escape need to work on these requirements with excel file.
    Last edited: May 11, 2017
  4. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    Hmm Ram is bit lacking. You'd want 8gb, preferably 16gb.

    Tested 26 column 1mil row file, with 8 char length data in each cell.

    Took about 10 sec on my machine to save and open.
    OS - Win7 (64 bit)
    CPU - i5-4200M @ 2.5GHz
    RAM - 16GB
    Excel - 2016 (64 Bit)

    If you have any add-in try disabling it to speed up things.

    Unfortunately there aren't much else you can do to speed it up, if you need to directly interact with the workbook.

    How is the workbook used?
    Monty likes this.
  5. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Chihiro.

    Thanks testing file for me...i have to live with it at work...Can not change at system front.

    Am running a project with two sheets

    Sheet Names:

    Tempate: User will input the rows which may go upto 50k

    Database : We have some 8 laks rows.


    1) Need to loop through rows and create a concatenation of row as do not have provision of creating a helper column called combination in the template sheet so need to handle with vba.
    2) that combination should be checked with "Database sheet" if found not action to be taken but not found there are certain validation to check.

    Please find the simple test file.

    Thanks
    Monty!

    Attached Files:

    Last edited: May 11, 2017
  6. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Chihiro.

    This is how i just started with!

    Code (vb):
    Sub Test()
    Dim Tempsht As Worksheet
    Dim DataSht As Worksheet
    Dim Lrow As Long
    Dim Concat As String

    Set Tempsht = Worksheets("Template")
    Set DataSht = Worksheets("Database")

    Tempsht.Activate

    Lrow = Tempsht.Range("A" & Rows.Count).End(xlUp).Row

        For i = 2 To Lrow
       'Creating concatination from template to check with Database sheet.
           Concat = Range("A" & i).Value & "|" & Range("B" & i).Value
       
       
        Next i

    End Sub
     
  7. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    I'm bit confused with your description.

    While you state if combination is found in database no action is to be taken... But Case 1 to 3 state that there is action needed.

    Also in your sample file, every combination in Template is found in Database.
    Monty likes this.
  8. Monty

    Monty Well-Known Member

    Messages:
    836
    Yes chihiro.

    We are not just checking the combination, if the combination exist need to check further below columns
    Approve 1 Approve 2

    Valid Entry

    If combination found and if you have some names in Approve 1 and Approve 2 then it is valid entry

    What is not valid entry

    1) If combination found and if you have Approve 1 or Approve 2 "Blank" then it is Invalid entry.

    2) If combination found and if you have Approve 1 or Approve 2 as some text as "Testing" then it is Invalid entry.

    3) If combination found and if you have Approve 1 or Approve 2 as some text as "Inprogress" then it is Invalid entry.
    Above three validation makes an entry in the template sheet as invalid and need to high light with red color

    Example:
    Database sheet
    upload_2017-5-12_0-32-8.png

    Template sheet
    upload_2017-5-12_0-33-2.png


    As per the screen shot only "E" is valid combination when checking with database sheet as Approver 1 and Approver 2 has some names apart from "Blank", "Inprogress" and "Testing".

    Hope am not confused you..Please let me know any questions.
    Last edited: May 11, 2017
  9. Monty

    Monty Well-Known Member

    Messages:
    836
    Attached file as per the requirement!

    Attached Files:

  10. SirJB7

    SirJB7 Excel Rōnin

    Messages:
    8,894
    Hi, Monty!
    Just because of the 800k+ rows, with formulas, I'd suggest you to move to Access or any version of SQL database.
    Regards!
    Monty likes this.
  11. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Sir.

    I was Testing with Access initially then SQL server database so that 150 users can access the data for validation..But unfortunately we did't get the access for server as per policy .

    So trying to do the same validation in Within excel with same data dumb in Tab "Database".

    Am sure this would become more easy working for access or Sql server but no option for me.

    Thanks
    Monty!
  12. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    Wait... so each user is to have the database sheet in the workbook that they will be working in?
    Monty likes this.
  13. Monty

    Monty Well-Known Member

    Messages:
    836
    Every user will hold separate file as there is no server concept to access.

    This macro file will be passed on to every user.
  14. Monty

    Monty Well-Known Member

    Messages:
    836
    Every file will have Two tabs "Template sheet" and "Database sheet" which will used by every use standalone.
  15. SirJB7

    SirJB7 Excel Rōnin

    Messages:
    8,894
    Hi, Monty!
    IMHO you shouldn't go on with Excel. If corporate policies don't allow you to use the right tool, then someone doesn't think that it has to be implemented... What's this person opinion?
    Regards!
    Monty likes this.
  16. Monty

    Monty Well-Known Member

    Messages:
    836
    Hello Sir.

    This project started with expecting for SQL server so that total dump will be in the tables and user will hold only template sheet on clicking on the button this entry should go and check with the sql server tables...But now we end up using excel individually.
  17. Monty

    Monty Well-Known Member

    Messages:
    836
    Approx 150 users will have this file doing there validation now at there own desk..
  18. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    Ouch. Whomever is responsible for shooting down DB option should be held responsible for creating the application :p
    Monty likes this.
  19. SirJB7

    SirJB7 Excel Rōnin

    Messages:
    8,894
    Hi, Monty!
    Then that guy is the system owner... Once found the father of the beast, everybody else is responsibility free... Hence, enjoy the future as far as you can from that monster... Unless you want to share or be charged with the delegated responsibility.
    Regards!
  20. Monty

    Monty Well-Known Member

    Messages:
    836
    Thanks for the feedback.

    But still i wanted to check with one user working performance after writing the code.Please let me know any help.

    Code (vb):
    Sub Test()
    Dim Tempsht As Worksheet
    Dim DataSht As Worksheet
    Dim Lrow As Long
    Dim Concat As String

    Set Tempsht = Worksheets("Template")
    Set DataSht = Worksheets("Database")

    Tempsht.Activate

    Lrow = Tempsht.Range("A" & Rows.Count).End(xlUp).Row

        For i = 2 To Lrow
      'Creating concatination from template to check with Database sheet.
         Concat = Range("A" & i).Value & "|" & Range("B" & i).Value
     
     
        Next i

    End Sub
     
  21. Marc L

    Marc L Excel Ninja

    Messages:
    3,173
    Hi !
    .xlsb file format is just to reduce size on disk,
    nothing with how Excel "works" !

    Excel needs to load all in memory : application, add-ins, ribbon,
    workbooks & worksheets, formulas, data, …
    As it is just how Excel works so there is almost nothing to do
    except to increase the RAM of computer !

    Excel can create and manage some Access database.
    But for big data, Access is far away the best choice.

    On a recent thread on my local forum, a guy asked for a help to end
    its project. An helper warned him again like several weeks ago
    for this kind of database project Excel is under level, better is to move
    to Access. The guy did'nt want that way again as it was so closed to ending.
    After some help, two days later the guy was happy to finish but
    complained about a long time execution for a main procedure.
    Helper tested on his side : 53 minutes to finish the process.
    The guy said : « Ok it's normal, almost same time on my side. »

    Few hours later, the helper tried on Access : less than a minute !
    The guy was surprised and disappointed o_O and wrote
    « Ok but 'cause your are an expert too on Access. »
    The helper answered his code was like a beginner but yes,
    as an expert he optimized his code and his new version lasts
    around 20 seconds ! Versus 53 minutes under Excel … :eek:

    Access is the right choice for big data, safety of data,
    shared work between users, work on a lan, …

    Often database and maintenance procedures are under Access
    but some data can be exported to Excel
    (or Excel can directly grab Access data via SQL)
    for analysis for example …
    Monty and SirJB7 like this.
  22. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,579
    Hi Monty ,

    Execution time of code , or recalculation time of a worksheet is solely a factor of the complexity of data processing or calculations that are taking place. Opening and closing a workbook is also a factor of these , since if recalculation mode is Automatic , Excel will do this operation both before saving a workbook , and after opening it.

    This in turn is to some extent dependent on the algorithm that is used to carry out the data processing or calculations.

    Can you mention your working file size in MB ?

    Narayan
  23. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    @Monty

    Did bit of test. Using approx 800k rows, takes about 8.4 sec to just find matching row# for 8 items from template.

    There may be faster way... but I wouldn't pursue this avenue if I were you.
    Code (vb):
    Sub Test()
    Dim TempArr
    Dim cel As Range, x
    Dim sTime As Single, eTime As Single

    sTime = Timer
    myArray = Sheets("DataBase").Range("A1").CurrentRegion

    ReDim TempArr(1 To UBound(myArray, 1), 1 To 3)

    For i = 1 To UBound(myArray, 1)
        TempArr(i, 1) = myArray(i, 1) & "|" & myArray(i, 2)
        TempArr(i, 2) = myArray(i, 3)
        TempArr(i, 3) = myArray(i, 4)
    Next

    For Each cel In Range("A1:A8")
        x = Application.Match(cel.Value & "|" & cel.Offset(, 1).Value, Application.Index(TempArr, , 1), 0)
        Debug.Print x
    Next
    eTime = Timer

    Debug.Print eTime - sTime
    End Sub
    Edit: Also did test with Scripting.Dictionary, but just adding 800k unique keys took about 28 sec.
    Last edited: May 12, 2017
    Monty likes this.
  24. NARAYANK991

    NARAYANK991 Excel Ninja

    Messages:
    15,579
    Hi ,

    What are the operations involved here ?

    Can we assume that the Database tab is not updated often ? If so , then the concatenation of the fields in each record in this tab can be done using formulae , as a one-time or at least infrequent occurrence.

    The concatenation of the fields in each record in the Template tab needs to be done using VBA ; for 50,000 records , with 36 fields , this is bound to take time. If this were to be done using VBA and Access , what would be the difference ?

    Matching the 50,000 concatenated values with the values in the Database tab can be made fast if we sort the Database tab on the concatenated field ; this can be done in code or manually as a one-time activity.

    Repeating the above match operation over 50,000 records is again going to take time. The fact that all of this is happening in computer RAM means there is not going to be any great advantage if this were to be done using Access.

    The basic point is that Monty needs to mention what the present benchmark time is before we conclude that Excel is at fault.

    Narayan
  25. Chihiro

    Chihiro Well-Known Member

    Messages:
    3,669
    Concat can be done for 800K rows in about 3 sec. I initially tested with code and formula. Didn't make much difference so I pasted them as values for second test.

    Checking if the matching record exist in database will be the bottle neck here.

    Code (vb):
    Sub Test()
    Dim TempArr
    Dim cel As Range, i
    Dim sTime As Single, eTime As Single

    sTime = Timer
    myArray = Sheets("DataBase").Range("A1").CurrentRegion

    For Each cel In Range("A1:A8")
        i = Application.Match(cel.Value & "|" & cel.Offset(, 1).Value, Application.Index(myArray, , 5), 0)
        If IsError(i) Then
            cel.Resize(, 2).Interior.Color = vbRed
            cel.Resize(, 2).Font.Color = vbWhite
            GoTo Skip
        End If
        If myArray(i, 3) = "" Or myArray(i, 4) = "" Then
            cel.Resize(, 2).Interior.Color = vbRed
            cel.Resize(, 2).Font.Color = vbWhite
        ElseIf myArray(i, 3) = "Testing" Or myArray(i, 4) = "Testing" Then
            cel.Resize(, 2).Interior.Color = vbRed
            cel.Resize(, 2).Font.Color = vbWhite
        ElseIf myArray(i, 3) = "InProgress" Or myArray(i, 4) = "InProgress" Then
            cel.Resize(, 2).Interior.Color = vbRed
            cel.Resize(, 2).Font.Color = vbWhite
        End If
    Skip:
    Next
    eTime = Timer
    Debug.Print eTime - sTime
    End Sub
    It really depends on how many record needs to be checked from Template sheet.

    But added issue is when Database needs any change. Updating for all 150 users is going to be a pain.

    If using DB, you can ensure everyone is using same database table and perform checking operation using temp table in under 1 sec.
    Monty likes this.

Share This Page