• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Conditionally extract datatable from one workbook to another using ADO SQL

inddon

Member
Hello There,

I would like to use the ADODB.Connection (SQL) to fetch the data from other workbook.

I have 3 excel workbooks as follows. :
1. Master File.xlsx
This contains all Master Data Tables: Customer, Vendor, Currency and Transaction.
The Customer, Vendor and Currency has 2 columns, Code and it's respective names.
The Data Table Transaction has columns: Transaction Number Transaction Date, Customer Code, Vendor Code, Currency Code, Amount

2. Application File.xlsm
This file contains all the logic and various UserForms to read and write data to the above Master File. In the original file , we do not use any SQL yet.
There are 2 modules:
. A. InitializeEnvironment: Initialize the Public variables for workbooks, worksheets, tables, etc.
. B. ExecuteReport: Which calls the VBA code for report

3. Report Template File.xlsx
This is the report template file. A new file for below requirement.
This contains a Data Table: Tbl_Report and contains one column Seq.

All the above files are located in the same folder. For sample purpose I have kept the files focused on this requirement only.

The requirement is as follows:
In the Application File, worksheet Criteria there are 2 parameters Transaction From and Transaction To Date and a push button.

A. On click Push Button, it should check if the file 3. Report Template File exists, if yes then proceed, else Msgbox "File Not exists" and exit sub.

B. It should get all the rows from DataTable: Transactions (Master File.xlsx), whose Transaction Date is between parameters Transaction From and Transaction To Date.
It should link to the related Data Tables Customer, Vendor and Currency and get it's related Codes' Names for every Transaction row fetched.

C. If the SQL number of rows fetched is greater than 1, then it should:
. 1. make a copy of the 3. Report Template File and rename it to "Transaction Report.xlsx"
. 2. Insert fethced rows into table Tbl_report
. 3. Save file in a folder "Reports" in the same path of the workbook.
. 4. Close the "Transaction Report.xlsx"

D. Display message "Report created " .Name of the File & Total Records in the file


I have attached the sample files for your reference.

Look forward to hearing from you.

Regards,
Don
 

Attachments

  • Application File.xlsm
    25.2 KB · Views: 28
  • Master FIle.xlsx
    16 KB · Views: 25
  • Report Template File.xlsx
    8.8 KB · Views: 24
Before you start on this. I'd recommend few things.

1. Each sheet should only house table (header & data). Nothing else. Meaning removing value from B2 of each sheet.
2. Note that ADODB cannot query table by name. You should use sheet name to query each tables.
3. In order to do multiple joins ADO, you need nested SELECT

In fact... I'd recommend doing away with Tables altogether in this type of operation using ADO. And storing headers in Row1. Data starting from A2 down and across, with no empty rows.

Excel tables in ADO code serves no purpose and just adds additional complexity.
 
Here's sample code to bring data into Application File.xlsm. Assuming values in B2 are deleted.

Note that I added "Sheet1" for demo purpose in Application File.xlsm
Code:
Sub Demo()
Dim fPath As String, strQry As String
Dim fName As String: fName = "Report Template File.xlsx"
Dim cn As Object, rs As Object

fPath = ThisWorkbook.Path & "\"
If Len(Dir(fPath & fName)) = 0 Then
    MsgBox "Report template file does not exist"
    Exit Sub
End If
Set cn = CreateObject("ADODB.Connection")
With cn
    .Provider = "Microsoft.ACE.OLEDB.12.0"
    .ConnectionString = "Data Source=" & fPath & "Master File.xlsx;" & _
                        "Extended Properties=""Excel 12.0 Xml;HDR=Yes;"""
    .CursorLocation = 3
    .Open
End With

strQry = "SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM [Transactions$] t"
strQry = strQry & " INNER JOIN [Customers$] c ON t.[Customer Code] = c.[Customer Code]) a"
strQry = strQry & " INNER JOIN [Vendors$] v ON a.[Vendor Code] = v.[Vendor Code]) b"
strQry = strQry & " INNER JOIN [Currencies$] r ON b.[Currency Code] = r.[Currency Code]) d"
strQry = strQry & " WHERE d.[Transaction Date] >= " & Range("TFromDate").Value2 & _
                " AND d.[Transaction Date] <= " & Range("TToDate").Value2 & ""

On Error GoTo ErrHandle:
Set rs = CreateObject("ADODB.Recordset")
rs.Open strQry, cn, 1, 3
Sheets("Sheet1").Range("A1").CopyFromRecordset rs
rs.Close
ErrHandle:
Debug.Print Err.Description
cn.Close
End Sub
 
Before you start on this. I'd recommend few things.

1. Each sheet should only house table (header & data). Nothing else. Meaning removing value from B2 of each sheet.
2. Note that ADODB cannot query table by name. You should use sheet name to query each tables.
3. In order to do multiple joins ADO, you need nested SELECT

In fact... I'd recommend doing away with Tables altogether in this type of operation using ADO. And storing headers in Row1. Data starting from A2 down and across, with no empty rows.

Excel tables in ADO code serves no purpose and just adds additional complexity.

Hi Chihiro,

Thank you for very much for your response and appreciate your recommendation.

In the actual master file all the worksheets contain Data Tables. There is only 1 data table in a worksheet.
All the Data Table Headers Start from B3 and Data body range from B4.
Getting away with the Data Tables from the master file will affect the working of the application.

In the sample 'Application File.xlsm', there is the below Sub, which initializes the workbook, worksheets, tables. This is how it is setup in the actual Application File VBA code as well.

In the ADO FROM cluase, if the Data Table Range can be used to derive the worksheet name and it's data table range would be great.
eg. [Transactions$B3:$G19]

Code:
Sub InitializeEnvironment
..
..
..
  '----------------------------------------------------------------------------
  'Set Master Worksheet and DataTable Transactions
  '----------------------------------------------------------------------------
  Set MstWSName_Transactions = MstWB.Worksheets("Transactions")
  MstTblName_Transactions = "Tbl_Transactions"
  Set O_MstTbl_Transactions = MstWSName_Transactions.ListObjects(MstTblName_Transactions)
..
..


Many thanks and look forward to hearing from you.

Regards,
Don
 
Here's sample code to bring data into Application File.xlsm. Assuming values in B2 are deleted.

Note that I added "Sheet1" for demo purpose in Application File.xlsm
Code:
Sub Demo()
Dim fPath As String, strQry As String
Dim fName As String: fName = "Report Template File.xlsx"
Dim cn As Object, rs As Object

fPath = ThisWorkbook.Path & "\"
If Len(Dir(fPath & fName)) = 0 Then
    MsgBox "Report template file does not exist"
    Exit Sub
End If
Set cn = CreateObject("ADODB.Connection")
With cn
    .Provider = "Microsoft.ACE.OLEDB.12.0"
    .ConnectionString = "Data Source=" & fPath & "Master File.xlsx;" & _
                        "Extended Properties=""Excel 12.0 Xml;HDR=Yes;"""
    .CursorLocation = 3
    .Open
End With

strQry = "SELECT * FROM (SELECT * FROM (SELECT * FROM (SELECT * FROM [Transactions$] t"
strQry = strQry & " INNER JOIN [Customers$] c ON t.[Customer Code] = c.[Customer Code]) a"
strQry = strQry & " INNER JOIN [Vendors$] v ON a.[Vendor Code] = v.[Vendor Code]) b"
strQry = strQry & " INNER JOIN [Currencies$] r ON b.[Currency Code] = r.[Currency Code]) d"
strQry = strQry & " WHERE d.[Transaction Date] >= " & Range("TFromDate").Value2 & _
                " AND d.[Transaction Date] <= " & Range("TToDate").Value2 & ""

On Error GoTo ErrHandle:
Set rs = CreateObject("ADODB.Recordset")
rs.Open strQry, cn, 1, 3
Sheets("Sheet1").Range("A1").CopyFromRecordset rs
rs.Close
ErrHandle:
Debug.Print Err.Description
cn.Close
End Sub


Thank you Chihiro for your help.

I will apply your solution and will update you tomorrow.

Thanks again and regards,
Don
 
FYI - Below section of your Initialize module will fail without actually opening workbooks.
Code:
  '----------------------------------------------------------------------------
  'Set Workbooks
  '----------------------------------------------------------------------------
  Set MstWB = Workbooks(MstDBFileName)                      '"Master File.xlsx"
  Set EntWB = ThisWorkbook                             '"Application File.xlsm"
  Set RptWB1 = Workbooks(RptTmplFileName)          '"Report Template File.xlsx"
  Set RptWB2 = Workbooks(RptFileName)                '"Transaction Report.xlsx"

In order to Set Workbook object, it must be opened already. Kind of defeats the purpose of using ADO code.
 
FYI - Below section of your Initialize module will fail without actually opening workbooks.
Code:
  '----------------------------------------------------------------------------
  'Set Workbooks
  '----------------------------------------------------------------------------
  Set MstWB = Workbooks(MstDBFileName)                      '"Master File.xlsx"
  Set EntWB = ThisWorkbook                             '"Application File.xlsm"
  Set RptWB1 = Workbooks(RptTmplFileName)          '"Report Template File.xlsx"
  Set RptWB2 = Workbooks(RptFileName)                '"Transaction Report.xlsx"

In order to Set Workbook object, it must be opened already. Kind of defeats the purpose of using ADO code.

Hi Chihiro,

Yes you are right, I realized it a bit later.

The first 2 files will always be open (MstWB and EntWB).

For the last 2 workbooks, I thought of doing it this way:
1. If the number of rows returned by the SQL is greater than 1,
then make a copy of "Report Template File.xlsx" and store it in a sub-folder "Reports" and rename it to "Transaction Report.xlsx"
2. Open workbook "Transaction Report.xlsx", and paste the sql fetched rows, save and close the file.

The above should be okay?

Your VBA code works good. :awesome: However, even though the TFromDate and TToDate are filled, sometimes I get an error message "No value given for one or more parameters". I have to close the file re-open then it works.

Regards,
Don
 
Then likely there are some errors during code execution. I've just set it to print out error message in the sample code.
You'll need to create robust error handling to deal with that. When connection/recordset isn't properly disposed/removed during error, it will cause issue until you destroy object by closing workbook.
 
Then likely there are some errors during code execution. I've just set it to print out error message in the sample code.
You'll need to create robust error handling to deal with that. When connection/recordset isn't properly disposed/removed during error, it will cause issue until you destroy object by closing workbook.

Hi Chihiro,

I added the objects to be cleared.

When I keep the 'Master File.xlsx' open, I receive this error "No value given for one or more parameters"
But when the 'Master File.xlsx' is closed, the records are fetched.

in the actual situation the 'Master File.xlsx', is always open. How can the above error be resolved?

Thanks & regards,
Don
 
Then ADO isn't the right method.
ADO has memory leak issue along with the problem you discovered when source workbook is open.

If the workbook is open. Use advanced filter copy or other method.
 
Then ADO isn't the right method.
ADO has memory leak issue along with the problem you discovered when source workbook is open.

If the workbook is open. Use advanced filter copy or other method.

Thank you Chihiro. That made me search further for the solution.

I came across a solution using DAO (which works when the Master File is open).
Please let me know if the below use of DAO, resolves the memory leak you mentioned.

Herewith, the below code for your reference:


Code:
'https://www.exceltip.com/import-and-export-in-vba/use-a-closed-workbook-as-a-database-dao-using-vba-in-microsoft-excel.html
Sub GetWorksheetData()
  Dim db As DAO.Database, rs As DAO.Recordset, f As Integer, r As Long
  Dim fPath As String, FileName, DBFullName, strQry As String
  
  fPath = ThisWorkbook.Path & "\"
  FileName = "Master File.xlsx"
  DBFullName = fPath & "Master File.xlsx"
  strSourceFile = DBFullName
  
  strQry = "SELECT Tran.[Transaction Number] "
  strQry = strQry & ",      Tran.[Transaction Date] "
  strQry = strQry & ",      Cust.[Customer Name] "
  strQry = strQry & ",      Vend.[Vendor Name] "
  strQry = strQry & ",      Curr.[Currency Name] "
  strQry = strQry & ",      Tran.[Amount] "
  strQry = strQry & "FROM   [Transactions$B3:G20] Tran "
  strQry = strQry & ",      [Customers$B3:C7]    Cust "
  strQry = strQry & ",      [Vendors$B3:C7]      Vend "
  strQry = strQry & ",      [Currencies$B3:C5]   Curr "
  strQry = strQry & "WHERE  Tran.[Transaction Date] >= " & Range("TFromDate").Value2 & " "
  strQry = strQry & "AND    Tran.[Transaction Date] <= " & Range("TToDate").Value2 & " "
  strQry = strQry & "AND    Cust.[Customer Code]    = Tran.[Customer Code] "
  strQry = strQry & "AND    Vend.[Vendor Code]       = Tran.[Vendor Code] "
  strQry = strQry & "AND    Curr.[Currency Code]       = Tran.[Currency Code] "
  strQry = strQry & "ORDER BY Tran.[Transaction Date] "


  On Error Resume Next
  Set db = OpenDatabase(strSourceFile, False, True, "Excel 8.0;HDR=Yes;")

  On Error GoTo 0
  If db Is Nothing Then
      MsgBox "Can't find the file!", vbExclamation, ThisWorkbook.Name
      Exit Sub
  End If

  ' open a recordset
  On Error Resume Next
      Set rs = db.OpenRecordset(strQry)
   'Set rs = db.OpenRecordset("SELECT * FROM [Transactions$B3:G20]")

  On Error GoTo 0
  If rs Is Nothing Then
     MsgBox "Can't open the file!", vbExclamation, ThisWorkbook.Name
     db.Close
     Set db = Nothing
     Exit Sub
 End If

 RS2WS rs, Sheets("Sheet1").Range("A1")

 ' write field names
 For intColIndex = 0 To rs.Fields.Count - 1
   Sheets("Sheet1").Range("A1").Offset(0, intColIndex).Value = rs.Fields(intColIndex).Name
 Next

 ' write recordset
 Sheets("Sheet1").Range("A1").CopyFromRecordset rs

 rs.Close
 Set rs = Nothing
 db.Close
 Set db = Nothing
End Sub

Code:
Sub RS2WS(rs As DAO.Recordset, TargetCell As Range)
Dim f As Integer, r As Long, c As Long
    If rs Is Nothing Then Exit Sub
    If TargetCell Is Nothing Then Exit Sub

    With Application
        .Calculation = xlCalculationManual
        .ScreenUpdating = False
        .StatusBar = "Writing data from recordset..."
    End With

    With TargetCell.Cells(1, 1)
        r = .Row
        c = .Column
    End With

    With TargetCell.Parent
        .Range(.Cells(r, c), .Cells(.Rows.Count, c + rs.Fields.Count - 1)).Clear
        ' clear existing contents
        ' write column headers
        For f = 0 To rs.Fields.Count - 1
            On Error Resume Next
            .Cells(r, c + f).Formula = rs.Fields(f).Name
            On Error GoTo 0
        Next f
        ' write records
        On Error Resume Next
        rs.MoveFirst
        On Error GoTo 0
        Do While Not rs.EOF
            r = r + 1
            For f = 0 To rs.Fields.Count - 1
                On Error Resume Next
                .Cells(r, c + f).Formula = rs.Fields(f).Value
                On Error GoTo 0
            Next f
            rs.MoveNext
        Loop
        .Rows(TargetCell.Cells(1, 1).Row).Font.Bold = True
        .Columns("A:IV").AutoFit
    End With

    With Application
        .StatusBar = False
        .Calculation = xlCalculationAutomatic
        .ScreenUpdating = True
    End With
End Sub


Thank you again for taking the time to help me out.

Regards,
Don
 
It would have similar issue as far as I know. Though I haven't used DAO much myself (I don't use VBA to query workbook, unless there is no other alternative).

It's never a good idea to query an open workbook. Though you could always make a copy of open workbook. Then query the copy.
 
It would have similar issue as far as I know. Though I haven't used DAO much myself (I don't use VBA to query workbook, unless there is no other alternative).

It's never a good idea to query an open workbook. Though you could always make a copy of open workbook. Then query the copy.
I did thought about making a file copy (with ADO method) and after the job is complete delete the file copy.
Would the problem of memory leakage (using ADO) still be there?

Regards,
Don
 
If you are querying only the closed workbook. Then no. It would not cause memory leak, as long as recordset and connection objects are properly handled.

But then, I see little benefit in using ADO for this purpose.
 
If you are querying only the closed workbook. Then no. It would not cause memory leak, as long as recordset and connection objects are properly handled.

But then, I see little benefit in using ADO for this purpose.
Thank you Chihiro.

On Error and at the end, I am setting the connection and recordset object variables to Nothing. This should work, right?
Set cn = Nothing
Set rs = Nothing

I will use the ADO method.

Regards,
Don
 
It's never a good idea to query an open workbook. Though you could always make a copy of open workbook. Then query the copy.
Why "never"? Because of the memory leak issue? Or some other reason?

Are we sure the OP experienced the memory leak?

Chihiro said:
ADO has memory leak issue
Are you quite sure it still has? It may have been resolved. Here's why i suspect so:

- The original MS article describing the leak is no longer available on the MS website.
- Here's an archived copy of the article. It says "APPLIES TO Excel 2000, OLE DB Provider for Jet 4.0, Excel 97, Excel 2002". Not sure if ADO-to-worksheet uses Jet.
- This coder tried to duplicate the original leak, and failed.

Altho this guy has experienced the problem as recently as 2014. Not sure what version he's on.
And this person in 2016.
 
Why "never"? Because of the memory leak issue? Or some other reason?
That and also, since data queried is going to be from last saved data and not necessarily latest live data on open workbook. When querying closed workbook, you know it's the latest data stored in the workbook.

ADODB has memory leak on querying open workbook. This has never been fixed to my knowledge. Though last time I checked was year or two ago.

But then, you are responding to 2+ years old post ;)
I'd recommend using Power Query rather than ADODB for most data querying needs. Unless you need to update closed workbook (i.e. Insert/Update recordset).
 
since data queried is going to be from last saved data and not necessarily latest live data on open workbook.
Can you share any situations or code where this happens?

ADODB has memory leak on querying open workbook. This has never been fixed to my knowledge. Though last time I checked was year or two ago.
How and where did you check?

I'd recommend using Power Query rather than ADODB
My understanding is that Power Query is a front-end tool, no? I'm trying to pull data into VBA, not for display purposes.
 
I'd recommend doing some test of your own...

1. When another user has workbook open and actively working on it. At the same time you are querying workbook.
2. Where? I just ran multiple successive query on open workbook and monitored memory usage.
3. Huh? PQ is ETL tool not just displaying data.
 
I'd recommend doing some test of your own...
1. When another user has workbook open and actively working on it. At the same time you are querying workbook.
- Unclear how that tests the memory leak.
3. Huh? PQ is ETL tool not just displaying data.
Ok, thx for that. i don't need ETL for my use-case.

I just ran the test code from the original leak article. The original code failed on the following line:
Code:
cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & "Data Source=" & WorkbookFullName & "Extended Properties=Excel 8.0"
Error: Provider cannot be found. It may not be properly installed.

The reason is:
The Microsoft OLE DB Provider for Jet and the Jet ODBC driver are available in 32-bit versions only. You can't run them in 64 bit mode.
https://www.connectionstrings.com/using-jet-in-64-bit-environments/

This affects me because i'm on Excel 2016 64 bit. The 64-bit solution is:
With Office 2010, there are new drivers, the 2010 Office System Driver, which will be provided in both 32-bit and 64-bit versions. You can use these drivers to let your application connect to Access, Excel and text files in a 64 bit environment utilizing the new 64-bit drivers. The provider name is "Microsoft.ACE.OLEDB.12.0".

The driver is already installed, so i just need to use the ACE connection string.

Code:
cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & WorkbookFullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"
https://www.connectionstrings.com/ace-oledb-12-0/

My data contained 1 million+ records. While running, i watched Excel in Task manager (i didn't use the memory counters from the original MS article). On each loop, Excel memory consumption varied from about 200 MB to about 400 MB. Never grows beyond 400 MB, never crashes, always jumps back to 250 MB on each loop. There was no steady increase, as you'd see with a leak.

76563

76564

So i think i can conclude that the memory leak doesn't affect 64-bit Excel with the 64-bit driver.
 
Sure. If it works for you. Go with it. Part 1 isn't about memory leak, but about data integrity. Excel and ADO isn't built for ACID, you always run some risk of inconsistency / conflict when working with open workbook. As there is no native data lock mechanism. But if you are only reading data and not performing other parts of CRUD operation, risk isn't that great.
 
Sure. If it works for you. Go with it. Part 1 isn't about memory leak, but about data integrity. Excel and ADO isn't built for ACID, you always run some risk of inconsistency / conflict when working with open workbook. As there is no native data lock mechanism. But if you are only reading data and not performing other parts of CRUD operation, risk isn't that great.
Yep, people should be aware of multi-user access to the same file. My use-case isn't a mult-user scenario, so i can perform all CRUD operations without concern. I didn't notice that the OP was concerned about locking for a multi-user environment.

thx
 
Back
Top