• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Find strings listed in column into files and mark Yes or No in respective file name column

I have strings listed in "A" column and file names are listed in column headers. I want to find whether string in "A" column exists in a file or not and add result in a file name column. I am hopeful that it is possible using vba.

I tried using formula "vlookup" (when added data into sheet) but it is time consuming calculation and as each file has 22K lines and for just 100 files all rows counts to 22 lakh lines.
Any help will be appreciated.
 

Attachments

  • New Microsoft Excel Worksheet.xlsm
    19.4 KB · Views: 6
  • file1.txt
    626 bytes · Views: 7
Last edited:
I'd recommend uploading at least 1 sample target file (the file where you are performing search on).

Without knowing structure of data being searched. It's bit hard to optimize it.
Is the search being performed on flat text, tab delimited, comma separated or some other format? etc etc.
 
Assuming all files are in single folder and exact file names are used as header...

Code:
Sub Demo_StringSearch_txt()
Dim fPath As String: fPath = "C:\test\search\"
Dim strContent As String
Dim intFF As Integer: intFF = FreeFile()
Dim myArr
Dim i As Long, j As Long
myArr = Range("A1", Cells(Cells(Rows.count, 1).End(xlUp).Row, Cells(1, Columns.count).End(xlToLeft).Column)).Value

For i = 2 To UBound(myArr, 2)
    Open fPath & myArr(1, i) For Input As #intFF
    strContent = Input(LOF(intFF), intFF)
    Close #intFF
    For j = 2 To UBound(myArr)
        If InStr(strContent, myArr(j, 1)) > 0 Then
            myArr(j, i) = "Yes"
        End If
    Next
Next

Range("A1").Resize(UBound(myArr), UBound(myArr, 2)) = myArr


End Sub

See attached zip. Demo_Search_txtFile.xlsb has the code and output sheet ("Demo"). And 3 text files to search on is included.

Change fPath to folder of your choosing.
 

Attachments

  • Demo_Search_txtFile.zip
    13.9 KB · Views: 9
Last edited:
I would also like to see file name which is currently in progress, so modified code. But that shows first file and then after macro is complete, last file. I do not see continuous progress. For an average, macro takes 1-2 minutes for each file and I need to process 100 to 500 files in one go.
Code:
Sub Demo_StringSearch_txt()
Application.ScreenUpdating = False
Dim fPath As String: fPath = "D:\test\"
Dim strContent As String
Dim intFF As Integer: intFF = FreeFile()
Dim myArr
Dim i As Long, j As Long
myArr = Range("A1", Cells(Cells(Rows.Count, 1).End(xlUp).Row, Cells(1, Columns.Count).End(xlToLeft).Column)).Value
'Application.StatusBar = file.Name
For i = 2 To UBound(myArr, 2)
Application.StatusBar = myArr(1, i)
    Open fPath & myArr(1, i) For Input As #intFF
    strContent = Input(LOF(intFF), intFF)
    Close #intFF
    For j = 2 To UBound(myArr)
        If InStr(strContent, myArr(j, 1)) > 0 Then
            myArr(j, i) = "Yes"
        End If
    Next
Application.StatusBar = myArr(1, i)
Next

Range("A1").Resize(UBound(myArr), UBound(myArr, 2)) = myArr

Application.ScreenUpdating = True
End Sub
 
Assuming all files are in single folder and exact file names are used as header...

Code:
Sub Demo_StringSearch_txt()
Dim fPath As String: fPath = "C:\test\search\"
Dim strContent As String
Dim intFF As Integer: intFF = FreeFile()
Dim myArr
Dim i As Long, j As Long
myArr = Range("A1", Cells(Cells(Rows.count, 1).End(xlUp).Row, Cells(1, Columns.count).End(xlToLeft).Column)).Value

For i = 2 To UBound(myArr, 2)
    Open fPath & myArr(1, i) For Input As #intFF
    strContent = Input(LOF(intFF), intFF)
    Close #intFF
    For j = 2 To UBound(myArr)
        If InStr(strContent, myArr(j, 1)) > 0 Then
            myArr(j, i) = "Yes"
        End If
    Next
Next

Range("A1").Resize(UBound(myArr), UBound(myArr, 2)) = myArr


End Sub

See attached zip. Demo_Search_txtFile.xlsb has the code and output sheet ("Demo"). And 3 text files to search on is included.

Change fPath to folder of your choosing.

Having strings in "A" column (around 40 K) and around 500 file names in columns, vba takes approx. 1.5 to 2 min. for each file (Overall 16 hours for the run).
I have also made screen updating false during run. So can there be any way to increase performance of this macro?
 
Don't use VBA? :p

Really, using right tool for the job may ease your pain. Look into investing time and/or money into proper DB.

Or use GREP (GnuWIN, AstroGrep etc etc). Which can be used to spit out list of files that contain match to an expression.

Other than that... try using RegEx pattern matching instead of Instr and see if it's any faster. I'm bit busy right now to write the code.
 
Don't use VBA? :p

Really, using right tool for the job may ease your pain. Look into investing time and/or money into proper DB.

Or use GREP (GnuWIN, AstroGrep etc etc). Which can be used to spit out list of files that contain match to an expression.

Other than that... try using RegEx pattern matching instead of Instr and see if it's any faster. I'm bit busy right now to write the code.

Ok. No problem. My final goal is to compare 2 consecutive files if they contains same string. Yeah, I will keep finding other better solution.:)
 
Back
Top