• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

VBA extract data from csv

Rodrigues

Member
All

With ninjas help, have been working successfully on an excel file (Transaction.xlsx) using formulas columns C to U.

I'm wondering if it is possible use :

-vba to read the source file (Production Report Detailed xxxxxxx.csv) without opening it

-convert the formulas into vba

Each tab will have a different month.
Please note that, had to convert "Production Report Detailed 2017-02-27.csv" to xlsx in order to upload the file.

Thanks in advance

R
 

Attachments

  • Transaction.xlsx
    720.6 KB · Views: 11
  • Production Report Detailed 2017-02-27.xlsx
    70.8 KB · Views: 13
Hi !

Do not convert text file but just rename it as .csv.txt !
Join also a workbook with an expected worksheet
as a result from this source text file …

The better initial explanation, the better result !
 
Hi Marc
Thanks for your reply.
File renamed to .txt = source file Production Report Detailed 2017-02-27.csv (it's a file generated daily from a system);
Transaction.xlsx = destination file (file with formulas)
The aim of the transaction file, is to show the very first transaction of the day from each machine.
So as an example, each day will have:
Production Report Detailed 2017-02-25.csv
Production Report Detailed 2017-02-26.csv
Production Report Detailed 2017-02-27.csv

What I would like to achieve is that, using VBA to replace formulas on destination file, so I can extract the same data from source file without opening the source file.
Then link the Transaction file to a .ppt, so the data will be displayed in a monitor as the data is populated.
The source file is located on a network drive (in this example C), on a folder called AutoReports.
The destination file will be located on a different network drive, if it helps can be located on the same path as source file (I'm open to suggestions).

The problem I have at the moment is that, on Transaction.xlsx file I have to copy the formula and amend the date to match the file date and then drag across, also if convert the data into a table (for .ppt) the data disappears.
Thanks again.
R
 

Attachments

  • Production Report Detailed 2017-02-27.txt
    125.6 KB · Views: 9
  • Transaction.xlsx
    720.6 KB · Views: 7
See if this is how you wanted.
Code:
Sub test()
    Dim fn, e, txt As String, x, y, i As Long
    Dim myMonth As String, myDate As Date, myDay As String, myTime As Date, myCol As Long, temp
    fn = Application.GetOpenFilename("CSVFiles,*.csv", , "Select File(s)", , True)
    If Not IsArray(fn) Then Exit Sub
    For Each e In fn
        txt = CreateObject("Scripting.FileSystemObject").OpenTextFile(e).ReadAll
        With CreateObject("VBScript.RegExp")
            .Global = True: .MultiLine = True
            .Pattern = "^ *[lL, ].+$"
            txt = .Replace(txt, Chr(2))
        End With
        x = Split(txt, Chr(2))
        For i = 0 To UBound(x)
            If Application.Clean(Trim$(x(i))) <> vbNullString Then
                y = Split(Split(x(i), vbCrLf)(0), ",")
                If UBound(y) > 13 Then
                    If (IsDate(y(4))) * (IsNumeric(y(14))) Then
                        myDate = Split(y(4))(0)
                        myMonth = Format$(myDate, "mmm")
                        myDay = Format(myDate, "dddd")
                        myTime = Split(y(4))(1)
                        myCol = y(14) + 2
                        If Not Evaluate("isref('" & myMonth & "'!a1)") Then
                            With Sheets.Add(after:=Sheets(Sheets.Count))
                                .Name = myMonth
                                Sheets(1).Rows(1).Copy .Rows(1)
                            End With
                        End If
                        With Sheets(myMonth)
                            temp = Application.Match(CLng(myDate), .Columns("b"), 0)
                            If IsError(temp) Then
                                With .Range("a" & Rows.Count).End(xlUp)(2)
                                    temp = .Row: .Resize(, 2) = Array(myDay, myDate)
                                End With
                            End If
                            .Cells(temp, myCol).Value = myTime
                        End With
                    End If
                End If
            End If
        Next
    Next
End Sub
 
A note to Marc L:
Rodrigues has a formula in cell C28 (because that's the row that corresponds to the .csv file we've been given) of:
Code:
=IFERROR(INDEX('C:\Users\zzzz\AppData\Local\Temp\[Production Report Detailed 2017-02-27.csv]Production Report Detailed 2017'!$E:$E,MATCH(C1,'C:\Users\zzzz\AppData\Local\Temp\[Production Report Detailed 2017-02-27.csv]Production Report Detailed 2017'!$O:$O,0)),"")

This worked, but not out of the box.
While trying to get vba to put that formula in it raised an eyebrow that a .csv file had a sheet name (Production Report Detailed 2017)! I checked the .csv file in notepad and it was a normal .csv file, that is, no sheet name. Turns out this sheet name is generated by Excel as (up to) the first 31 characters of the .csv file's filename. OK, include that in the formula-generating code - done.
I manage to generate the exact same formulae throughout the grid on Rodriques' sheet. Only, of course, I've only got one date's .csv file; no matter it should give results for that row at least.
It doesn't.
I compare formulae - they're the same. I look at the workbook's Edit links dialogue and check the status; I find "Warning: open source to update values" which I do, values update, I close it again and all is well.
But it's crazy that I have to open the source files - we don't want that!

I did try opening and closing the .csv file in Excel in the code but (a) it only produced text (no attempt by Excel to convert to date) and (b) the whole point was to do all this while not opening the .csv files in Excel.

Some searching (http://www.excelforum.com/excel-gen...h-external-csv-file-updating.html#post2004182 ) brought up a Richard Scollar comment: "Those kind of links will only persist whilst the csv docuument is open (as Excel effectively converts the csv to a temporary xls format file whilst it is open). As soon as you close the csv it reverts to what it is - a simple text file with next to no structure. Do you have to save it as a csv file? If you could save it down as an xls file this wouldn't (or shouldn't) happen."
Saving those .csv files as .xlsx files in the first instance might be a possible solution for Rodrigues, I'm not sure (he would be able to give a constant sheet name to make coding easier).

My next attack would be to create an array-entered udf which would perhaps sql the .csv files and produce one row's data in one shot.

Trouble is I'm running out of time and won't be able to get to do this over the next 10 days or so, hence this note, to save you some time, perhaps.

This is the useless code:
Code:
Sub Macro16()
myFolder = GetFolder  'asks for the folder containing the csv files
StartOfFileName = "Production Report Detailed "  'adjust this to be the bit before the date part of the file names (note trailing space).
'ThisWorkbook.UpdateLinks = xlUpdateLinksNever 'you might need this line and its counterpart commented out lower down.
Application.DisplayAlerts = False
For Each cll In Range("B28:B28") '"B2:B28" later.
  csvSheetName = Left(StartOfFileName & cll.Text, 31)  'this because  excel wantsto see a sheet name for a csv file in a formula
  myformula = "=IFERROR(INDEX('" & myFolder & "\[" & StartOfFileName & cll.Text & ".csv]" & csvSheetName & "'!C5,MATCH(R1C,'" & myFolder & "\[" & StartOfFileName & cll.Text & ".csv]" & csvSheetName & "'!C15,0)),"""")"
  With cll.Offset(, 1).Resize(, 19)
    .FormulaR1C1 = myformula
    '.Value = .Value  'removes formula leaving value'
  End With
Next cll
'ThisWorkbook.UpdateLinks = xlUpdateLinksAlways
Application.DisplayAlerts = True
End Sub


Function GetFolder() As String
Dim fldr As FileDialog
Dim sItem As String
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
  .Title = "Select a Folder"
  .AllowMultiSelect = False
  .InitialFileName = Application.DefaultFilePath
  If .Show <> -1 Then GoTo NextCode
  sItem = .SelectedItems(1)
End With
NextCode:
GetFolder = sItem
Set fldr = Nothing
End Function
 
Jindon
Thank you for your help. Do what I'm looking for in part.
It's possible to read data without opening source file, so all files will be read? Also, when tested with today's date file just fills in C2 & D2, not sure what I'm doing wrong. File attached with result. Also, one thing haven't mentioned (apologies my fault) on Auto report folder I have others .csv files, could we add the name on this code I.e.: Production Report Detailed (this will be unique).
Thanks again, much appreciated.
R
 

Attachments

  • Book1.xlsm
    20.1 KB · Views: 7
1) It reads data directly from csv file(s).
2) I wrote the code fully based on the file you attached, so if the file has different format, it will not get the information properly.
I need to see the source file of "today's date", not the result.
3) The file pick up dialogue allows you to select multiple files, so it is not reading all the csv files in the folder.
 
My next attack would be to create an array-entered udf which would perhaps sql the .csv files and produce one row's data in one shot.
Well, in the attached is just such a function, called GetTimes.
Used as follows:
Code:
=GetTimes(B28,$C$1:$U$1,"c:\Users\Public\Documents\Production Report Detailed ")
upload_2017-3-2_0-52-2.png
The above is the process of array-entering the formula in cells C28:U28 of the Feb sheet in the attached (Array-Entering=Ctrl+Shift+Enter, not just Enter)
Note that the PathNameAndFileNameStart argument is everything except the date part of the filename (in this case it includes a space at the end), the .csv isn't needed, the code tacks that onto the end immediately after the date bit.
Once entered you can copy up/down.
Of course, you can wrap it in an IFERROR.
Also the TheGrps range should be the same size as the range you're entering the formula into.

I really am out of time now and so have not written any robust code to place the formulae automatically into the cells of a sheet and perhaps convert to plain values afterwards, but it would be something along the lines of:
Code:
Sub Macro1()
With Range("C2:U29")
  '.Rows(1).FormulaArray = "=GetTimes(RC[-1],R1C3:R1C21,""c:\Users\Public\Documents\Production Report Detailed "")"
  .Rows(1).FormulaArray = "=IFERROR(GetTimes(RC[-1],R1C3:R1C21,""c:\Users\Public\Documents\Production Report Detailed ""),"""")"
  .Rows(1).AutoFill Destination:=.Cells, Type:=xlFillDefault
  .Value = .Value
End With
End Sub

Anyway, the values obtained seem to be right.
 

Attachments

  • chandoo33395Transaction_02.xlsm
    22.5 KB · Views: 8
Last edited:
1) It reads data directly from csv file(s).
2) I wrote the code fully based on the file you attached, so if the file has different format, it will not get the information properly.
I need to see the source file of "today's date", not the result.
3) The file pick up dialogue allows you to select multiple files, so it is not reading all the csv files in the folder.

Hi Jindon
Apologies delay in reply, have been away from the desk and it is my fault not explaining it properly.
Attached is an example of today's data source file (Production Report Detailed 2017-03-13.csv) and the destination with layout I would like to achieve if possible.
Also today noticed that, some transactions on source file are not order by time, therefore Cells L14; N14 on worksheet Mar are incorrect.
Cell L14 the result should be 06:44 (cell E1174);
Cell N14 the result should be 06:25 (cell E1401).
Taking this an example, if possible, the code/formula needs to look at transactions times founded and select the earliest time.
The destination file will update the results each day, as soon the source file is generated and stored, so tomorrow, row15 will show the very first transaction of the day on each column.
The plan is that, have a pc with destination file open, so when source file is on shared drive, the results will be automatically populated on destination file, which it will be linked to a power point, so when results are populated the ppt will reflect that.
Source file = Production Report Detailed 2017-03-13.csv (had to rename to .txt to upload it)
Destination file= Book2.xlsx
Thanks again and sorry for the confusion created.
R
 

Attachments

  • Production Report Detailed 2017-03-13.txt
    293.3 KB · Views: 12
  • Book2.xlsx
    521.2 KB · Views: 9
See if this extract correct values.
Code:
Sub test()
    Dim fn, e, txt As String, x, y, z, i As Long, ii As Long
    Dim myMonth As String, myDate As Date, myDay As String, myTime As String, myCol As Long, temp
    fn = Application.GetOpenFilename("CSVFiles,*.csv", , "Select File(s)", , True)
    If Not IsArray(fn) Then Exit Sub
    For Each e In fn
        txt = CreateObject("Scripting.FileSystemObject").OpenTextFile(e).ReadAll
        With CreateObject("VBScript.RegExp")
            .Global = True: .MultiLine = True
            .Pattern = "^ *[lL, ].+$"
            txt = .Replace(txt, Chr(2))
        End With
        x = Split(txt, Chr(2))
        For i = 0 To UBound(x)
            If Trim$(x(i)) <> vbNullString Then
                y = GetMinRow(x(i))
                If IsArray(y) Then
                    myDate = CDate(y(0))
                    myMonth = Format$(myDate, "mmm")
                    myDay = Format(myDate, "dddd")
                    myTime = Format(Split(y(0))(1), "h:mm")
                    myCol = y(1)
                    If Not Evaluate("isref('" & myMonth & "'!a1)") Then
                        With Sheets.Add(after:=Sheets(Sheets.Count))
                            .Name = myMonth
                            Sheets(1).Rows(1).Copy .Rows(1)
                        End With
                    End If
                    With Sheets(myMonth)
                        temp = Application.Match(CLng(myDate), .Columns("b"), 0)
                        If IsError(temp) Then
                            With .Range("a" & Rows.Count).End(xlUp)(2, 2)
                                temp = .Row: .Resize(, 2) = Array(myDay, Format$(myDate, "yyyy-mm-dd"))
                            End With
                        End If
                        .Cells(temp, myCol).Value = myTime
                    End With
                End If
            End If
        Next
    Next
End Sub

Function GetMinRow(ByVal txt As String)
    Dim e, x, temp, myDate As Date
    For Each e In Split(txt, vbLf)
        If Trim$(e) <> vbNullString Then
            x = Split(e, ",")
            If UBound(x) > 13 Then
                If (IsDate(x(4))) * (IsNumeric(x(14))) Then
                    If IsEmpty(temp) Then
                        temp = e: myDate = CDate(x(4))
                    Else
                        If CDate(x(4)) < myDate Then
                            temp = e: myDate = CDate(x(4))
                        End If
                    End If
                End If
            End If
        End If
    Next
    If temp <> vbNullString Then
        GetMinRow = Array(myDate, Split(temp, ",")(14) + 3)
    End If
End Function
 
See attached, two buttons on two new sheets to press to result in plain values and formulae.
Since my posts seem to be invisible to Rodrigues I won't pursue further.
 

Attachments

  • chandoo33395Transaction_03.xlsm
    549.7 KB · Views: 16
Back
Top