1. Welcome to Chandoo.org Forums. Short message for you

    Hi Guest,

    Thanks for joining Chandoo.org forums. We are here to make you awesome in Excel. Before you post your first question, please read this short introduction guide. When posting or responding to questions please remember our values at Chandoo.org are: Humility, Passion, Fun, Awesomeness, Simplicity, Sharing Remember that we have people here for whom English is not there first language and we need to allow for this in our dealings.

    Yours,
    Chandoo
  2. Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  3. When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Help with Macro To Organize Gas Chromatography Instrument Data

Discussion in 'VBA Macros' started by NCase, Dec 12, 2018.

  1. NCase

    NCase Member

    Messages:
    39
    Hello,
    I have multiple csv files in multiple folders which contain data from a gas chromatography instrument (see attached csv data files and a version of the data that I processed in an excel)

    The raw data is organised in an terrible format where the measured analyte peaks and mass spectral library identification data are not arranged side by side. This makes manual processing of the data in excel very labor intensive
    I am an including example csv files and the way I would like the data to be formatted in excel.

    Is there a way to create a macro to read the csv files in the folders and organize it into a final format in attached spreadsheet (see Final Data Layout Worksheet)

    I hope someone can help as I am flooded with a huge amount of useful data but its in useless format.

    Best,

    Francis

    Attached Files:

  2. Hui

    Hui Excel Ninja Staff Member

    Messages:
    11,692
    Francis

    Have you spoken to the Gas Chromatography operators and asked is there a more user friendly output format ?

    My experience with these says there is
  3. NCase

    NCase Member

    Messages:
    39
    Hui,

    It is an old instrument running on windows XP so excel macros are the only way other instrument users recommend and even the instrument company says excel is only option.

    Best,

    Francis
  4. Hui

    Hui Excel Ninja Staff Member

    Messages:
    11,692
    Its not an impossible task, but I just don't have time to assist you until middle January '19.

    Anybody else want a VBA challenge ?
  5. NCase

    NCase Member

    Messages:
    39
    Thank you Hui. Yes to anybody else that can help with challenge. Willing to provide a reasonable paypal donation to anyone that achieve the goal.
  6. Deepak

    Deepak Excel Ninja

    Messages:
    2,875
    You will get an update soon...
    NCase likes this.
  7. Kenneth Hobson

    Kenneth Hobson Active Member

    Messages:
    238
    It is easier to help if zip is used to compress files as most have it.

    Do you need the 2nd sheet's output or just the last sheet? I can look at it tonight. Looks simple enough...
    NCase likes this.
  8. NCase

    NCase Member

    Messages:
    39
    Hi,

    The last sheet as an output. I created a template of what I was trying to do manually.

    Here is the file in zip format for others interested in solving the problem.

    Attached Files:

  9. Deepak

    Deepak Excel Ninja

    Messages:
    2,875
    Hi,

    Pls find...
    Feeling sleepy so I will do rest later like as data compile from folders and colors...
    Do let me know for if any changes required.

    Code (vb):
    Option Explicit

    Sub DataProcessingV1()
    Dim r As Range, iFristBlock As Integer, oFind As Range, oRnageToHide As Range, oRangeBold As Range
    Dim iPasteRow As Integer, iCopyRow1 As Range, iCopyRow2 As Range, iRows As Range

    Application.ScreenUpdating = False

    With ActiveSheet
    iFristBlock = .UsedRange.Columns.Count - 1
    Set iRows = .UsedRange.SpecialCells(xlCellTypeLastCell) '.Offset(, -iFristBlock + 1)

    For Each r In .Columns(1).Cells
    If r.Value = "Name=" Then
        iPasteRow = r.Row + 1
            Set oFind = .Columns(1).Cells.Find("[PBM Apex]", r, , , , xlByRows)
                If Not oFind Is Nothing Then Set iCopyRow1 = oFind.Offset(-3) Else Exit Sub
            Set oFind = Nothing
           
            Set oFind = .Columns(1).Cells.Find("[contents]", iCopyRow1, , , , xlByRows)
           
                If Not oFind Is Nothing Then
                    If oFind.Address = "$A$1" Then
                        Set iCopyRow2 = iRows
                    Else
                        Set iCopyRow2 = oFind.Offset(-2).Offset(, iFristBlock + 2)
                    End If
                Else
                    Set iCopyRow2 = oFind
                End If
            Set oFind = Nothing
            Set oFind = .Range(iCopyRow1, iCopyRow2).Resize(, iFristBlock + 1)

                    oFind.Copy .Cells(iPasteRow, iFristBlock + 2)
                    oFind.EntireRow.Delete
           
                   
                   
                    If Not oRangeBold Is Nothing Then
                          Set oRangeBold = Union(oRangeBold, .Rows(iPasteRow + 5), .Rows(iPasteRow + 3))
                    Else
                        Set oRangeBold = Union(.Rows(iPasteRow + 5), .Rows(iPasteRow + 3))
                    End If
                   
                   
                    If Not oRnageToHide Is Nothing Then
                          Set oRnageToHide = Union(oRnageToHide, .Rows(iPasteRow + 4), .Rows(iPasteRow - 3).Resize(6))
                    Else
                        Set oRnageToHide = Union(.Rows(iPasteRow + 4), .Rows(iPasteRow - 3).Resize(6))
                    End If
    Set iCopyRow1 = Nothing
    Set iCopyRow2 = Nothing
           
    End If
    Next


    oRnageToHide.EntireRow.Hidden = True
    oRangeBold.Font.Bold = True
    .Range("D:H,L:O,Q:Q,T:V").EntireColumn.Hidden = True
    .Columns("A:B").ColumnWidth = 24
    .Columns("C").ColumnWidth = 6
    .Columns("I:K").ColumnWidth = 9
    .Columns("P").ColumnWidth = 70
    .Columns("R:S").ColumnWidth = 12


    End With
    Application.ScreenUpdating = True
    End Sub
     
    NCase likes this.
  10. NCase

    NCase Member

    Messages:
    39
    Deepak,

    This is great. The trick is now to read from each individual files in there own folders. Some data files may not have library data or no data is acquired but the file includes just empty header files. So the macro can get confused.

    I am including a larger example of a typical dataset.

    Thank you for your great help so far.

    Best,

    Francis

    Attached Files:

  11. NCase

    NCase Member

    Messages:
    39
    ps colors are not essential it was done to highlight areas and how I wanted to rearrange the data.

    Best,

    Francis

  12. Kenneth Hobson

    Kenneth Hobson Active Member

    Messages:
    238
    I am not sure what sheet Deepak was formatting.

    There are 2 approaches to get the data. One might use all arrays. Another might be to do a data query and import the files and then format and parse that.

    This is a small beginning for the first method. Parts could be used for the 2nd method.

    Code (vb):
    Sub Main()
      Dim p$, a, v, fso As Object, b, s$
      'Change path p to suit...
     p = "C:\Users\lenovo1\Dropbox\Excel\CSV\ParseFilesToMaster\Results.csv"
      a = Filter(Split(CreateObject("wscript.shell").exec _
        ("cmd /c Dir " & """" & p & """" & " /b /a-d /s").stdout.ReadAll, vbCrLf), ".")
      Set fso = CreateObject("Scripting.FileSystemObject")

      For Each v In a
        b = Split(fso.OpenTextFile(fn, 1, 0).ReadAll, vbLf)
     
        Exit For 'used for testing to just work on 1 file for testing.
     Next v

    End Sub
    NCase likes this.
  13. NCase

    NCase Member

    Messages:
    39
    Hi Kenneth,

    The raw data is in the first sheet. Sheet 2 and 3 were my step by attempts to show how I wanted to transform the data into a useful format. The color codes were to make it easy to read.

    Best,

    Francis
  14. Chihiro

    Chihiro Excel Ninja

    Messages:
    5,171
    A question, did you want to completely remove those hidden rows and columns? Or keep and and just hide them?

    My personal preference is to use FreeFile() and use RegEx to test pattern and add specific rows to dictionary object for later processing.

    Sample code for bringing in only rows you are interested in.
    Code (vb):
    Sub Demo()
    Dim iFile As String: iFile = "C:\Test\Downloads\Dataprocessing\2018-11-28_0936\RESULTS.CSV"
    Dim intFF As Integer: intFF = FreeFile()
    Open iFile For Input As #intFF
    Set dic = CreateObject("Scripting.Dictionary")
    Do Until EOF(1)
        Line Input #intFF, ReadData
        With CreateObject("VBScript.RegExp")
            .Global = True
            .Pattern = "(\[INT TIC: \d{4}-\d{2}-\d{2}_\d{4}.+?\])$|(Header=,.+)|(\d.?\=\,\s.?\d.+$)"
            If .Test(ReadData) Then
                dic(ReadData) = Split(Replace(ReadData, """", ""), ",")
            End If
        End With
    Loop

    Close #intFF
    For Each Key In dic.keys
        Debug.Print Key
    Next

    End Sub
    Returns.
    Code (vb):
    [INT TIC: 2018-11-28_0936 BLANK SYSTEM.D\data.ms]
    Header=,"Peak","R.T.","First","Max","Last","PK  TY","Height","Area","Pct Max","Pct Total"
    1=,  1,  2.058,  81, 163, 188,"BV 2", 176958, 74614385,100.00, 76.695
    2=,  2,  2.382,  188, 189, 197,"VV 2", 167496,  6909203,  9.26,  7.102
    3=,  3,  2.575,  197, 205, 245,"VB 3", 105000, 13971177, 18.72, 14.361
    4=,  4,  3.709,  290, 297, 307,"BV  ",  61950,  1791997,  2.40,  1.842
    Header=,"PK","RT","Area Pct","Library/ID","Ref","CAS","Qual"
    1=,  1,  2.0629,76.6953,"Carbon dioxide",    81,"000124-38-9", 4
    2=,  2,  2.3780, 7.1019,"Carbon dioxide",    81,"000124-38-9", 4
    3=,  3,  2.5719,14.3608,"Benzene",  1018,"000071-43-2",95
    4=,  4,  3.7140, 1.8420,"Cyclotrisiloxane, hexamethyl-", 79619,"000541-05-9",90
    Last edited: Dec 13, 2018
    NCase likes this.
  15. NCase

    NCase Member

    Messages:
    39
    Hello,

    It would be nice to bring together raw data from multiple results.csv files into sheet 1 for the record. Then process that data into the desired format into sheet 2.

    Hiding or getting of rows works for me.

    Best,

    Francis
  16. Chihiro

    Chihiro Excel Ninja

    Messages:
    5,171
    Try below code. I haven't fully tested, so check your result thoroughly.
    Code "CheckAllSubFold" should be run with blank sheet active. ImportData is called in that sub.
    Code (vb):
    Sub ImportData(iFile As String)
    Dim dic As Object, ar
    Dim y As Long, x As Long
    Dim intFF As Integer: intFF = FreeFile()
    Open iFile For Input As #intFF
    y = 1
    Set dic = CreateObject("Scripting.Dictionary")
    Do Until EOF(1)
        Line Input #intFF, ReadData
        With CreateObject("VBScript.RegExp")
            .Global = True
            .Pattern = "(\[INT TIC: \d{4}-\d{2}-\d{2}_\d{4}.+?data\.ms\])$|(Header=,.+)|(\d.?\=\,\s.?\d.+$)"
            If .Test(ReadData) Then
                dic(ReadData) = 1
            End If
        End With
    Loop

    Close #intFF
    sRow = IIf(Cells(Rows.Count, "A").End(xlUp).Row = 1, 1, Cells(Rows.Count, "A").End(xlUp).Row + 1)
    x = Application.RoundUp(dic.Count / 2, 0)
    For Each Key In dic.Keys
        If y <= x Then
            Range("A" & sRow + y - 1) = Key
        Else
            Range("L" & sRow + y - x) = Key
        End If
        y = y + 1
    Next
    Set dic = Nothing
    End Sub

    Sub CheckAllSubFold()
    Dim path As String: path = ThisWorkbook.path & "\"
    Dim fName As String: fName = "RESULTS.CSV"
    Dim cPath As String
    cPath = Dir(path, vbDirectory)
    Do While Len(cPath) > 0
        If Left(cPath, 1) <> "." And _
            (GetAttr(path & cPath) And vbDirectory) = vbDirectory Then
            ImportData path & cPath & "\" & fName
        End If
        cPath = Dir()
    Loop

    Range("A:A").TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Comma:=True
    Range("L:L").TextToColumns Destination:=Range("L1"), DataType:=xlDelimited, _
            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Comma:=True
    Range("L:L").Delete
    End Sub
    NCase likes this.
  17. NCase

    NCase Member

    Messages:
    39
    Hi Chihiro,

    Can you include the two macros in excel file to test.I have tried doing this in excel 2016 and its not working for me so far.

    Best,

    Francis
  18. NCase

    NCase Member

    Messages:
    39
    Deepak,

    If your code could be adapted to reading my directory of results.csv files then you would have achieved the desired goal in a very elegant way.

    Best,

    Francis

  19. Chihiro

    Chihiro Excel Ninja

    Messages:
    5,171
    Here, extract the attached and run the code in Sample_Master.xlsm.

    Note that master should be in the folder containing all the sub-folders with csv files (same location where you had compiled sample).

    If you need master to be elsewhere, I'd suggest using FileDialog to pick a folder containing all the sub-folders.

    Attached Files:

    NCase likes this.
  20. NCase

    NCase Member

    Messages:
    39
    Hi Chihiro,

    I know you don't reply on weekends but here is my update on the macro so far

    The macro is working at a 95% success rate but it missed 3 files in the directory for some unknown reason.

    I am including the raw files, the macro with the data process and the list of files it missed. I cannot see an obvious reason why the macro missed them. It might be useful to have a list of the folder names processed in a separate sheet to make sure its captured and processed all the required files in the data directory.

    Other than that issue its working perfectly. Thank you so much for your help and have a great weekend.

    Best,

    Francis

    Attached Files:

  21. NCase

    NCase Member

    Messages:
    39
    Hi Chihiro,

    Another final chink in my dataset. It turns out the results.csv file sometimes has a variable name. Is it possible to just process any .csv file in the folder and not worry about its exact name.

    Have a great weekend.

    Best,

    Francis
  22. Chihiro

    Chihiro Excel Ninja

    Messages:
    5,171
    I don't use rar. Could you re-upload in zip format?
  23. NCase

    NCase Member

    Messages:
    39
    ok. Here it is

    Attached Files:

  24. Chihiro

    Chihiro Excel Ninja

    Messages:
    5,171
    Here. Use this code then. It will grab info from all 67 files in your sample (even if file name is different from RESULTS.CSV, as long as it is csv file).

    However, if there are other patterns that could occur, you'll need to adjust.

    Only changes made are. ".Pattern" line and additional logic to store folder names and using that to grab file names stored in each folders dynamically in "CheckAllSubFold" sub (using collection and additional loop over the collection).
    Code (vb):
    Sub ImportData(iFile As String)
    Dim dic As Object, ar
    Dim y As Long, x As Long
    Dim intFF As Integer: intFF = FreeFile()
    Open iFile For Input As #intFF
    y = 1
    Set dic = CreateObject("Scripting.Dictionary")
    Do Until EOF(1)
        Line Input #intFF, ReadData
        With CreateObject("VBScript.RegExp")
            .Global = True
            .Pattern = "(\[INT.+?: \d{4}-\d{2}-\d{2}_\d{4}.+?data\.ms\])$|(Header=,.+)|(\d.?\=\,\s.?\d.+$)"
            If .Test(ReadData) Then
                dic(ReadData) = 1
            End If
        End With
    Loop

    Close #intFF
    sRow = IIf(Cells(Rows.Count, "A").End(xlUp).Row = 1, 1, Cells(Rows.Count, "A").End(xlUp).Row + 1)
    x = Application.RoundUp(dic.Count / 2, 0)
    For Each Key In dic.Keys
        If y <= x Then
            Range("A" & sRow + y - 1) = Key
        Else
            Range("L" & sRow + y - x) = Key
        End If
        y = y + 1
    Next
    Set dic = Nothing
    End Sub

    Sub CheckAllSubFold()
    Dim path As String: path = ThisWorkbook.path & "\"
    Dim fName As String: fName = "*.CSV"
    Dim cPath As String, coll As New Collection
    cPath = Dir(path, vbDirectory)
    Do While Len(cPath) > 0
        If Left(cPath, 1) <> "." And _
            (GetAttr(path & cPath) And vbDirectory) = vbDirectory Then
            coll.Add path & cPath & "\"
        End If
        cPath = Dir()
    Loop
    For I = 1 To coll.Count
        fName = Dir(coll.Item(I) & "*.csv")
        ImportData coll.Item(I) & fName
        fName = Dir()
    Next
    Range("A:A").TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Comma:=True
    Range("L:L").TextToColumns Destination:=Range("L1"), DataType:=xlDelimited, _
            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Comma:=True
    Range("L:L").Delete
    End Sub
    NCase likes this.
  25. NCase

    NCase Member

    Messages:
    39
    Hi Chihiro,

    I keep getting the error "Path not found (Error 76)" when I try to run the code.

    Best,

    Francis

Share This Page