• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Find/replace across multiple files in directory/subdirectories from Excel list

jamo3162

New Member
I'm very new to VBA, to start. I have an Excel list of a few hundred values in Column A and what they're supposed to be in Column B but they're spread across hundreds of files.

Essentially I need a find/replace that uses the Excel columns as reference, searches every file in a directory and its subfolders, and makes the find/replace swap.

I'm trying to use a VBA I found elsewhere but it only searches ONE folder, when I need it to search the subdirectories too. The rest of the code is perfect for what I'm doing. Source thread:
https://chandoo.org/forum/threads/m...text-string-files-in-folder.30905/post-184075

Here's what I've been trying so far:

Code:
Sub ReplaceStringInFile()
'http://stackoverflow.com/questions/14840574/find-and-replace-string-in-all-excel-files-in-folder
Dim objFSO As Object
Dim objFil As Object
Dim objFil2 As Object
Dim regex As Object: Set regex = CreateObject("VBScript.RegExp")
Dim StrFileName As String
Dim StrFolder As String
Dim strAll As String
Dim FindStr As String, ReplaceStr As String

Set objFSO = CreateObject("scripting.filesystemobject") 'enable microsoft scripting runtime
StrFolder = "C:\Users\me\Top-Level\" 'choose folder to go through
StrFileName = Dir(StrFolder & "*.txt") 'choose extension to find and replace

Do While StrFileName <> vbNullString
    Set objFil = objFSO.Opentextfile(StrFolder & StrFileName)
    strAll = objFil.readall
    objFil.Close
    Set objFil2 = objFSO.createtextfile(StrFolder & StrFileName)
    For i = 2 To Range("A" & Rows.Count).End(xlUp).Row
        With regex
            .MultiLine = True
            .Global = True
            .IgnoreCase = False '\\ Change to true if you don't want case sensitivity
            .Pattern = Range("A" & i).Value
            strAll = .Replace(strAll, Range("B" & i).Value)
        End With
    Next i
    objFil2.Write strAll 'choose what to find and replace *case sensitive
    objFil2.Close
    StrFileName = Dir
Loop
Set regex = Nothing
End Sub

I tried playing around with asterisks\regex with the StrFolder and StrFileName values as a sort of wildcard/catch-all but haven't been able to get it to work for multiple folder levels.

Thank you in advance!
 
If the hard drive is mechanical - not a SSD - see this faster way rather than the slower FSO way :​
 
I don't do this sort of thing very often, but any faster way of doing anything is interesting. But what is this Dir$ function? I looked in the VBA reference and I don't see it as one of the statements. Oh, duh, it's a function!, got it. I've never used that function, but I learned recently that the VBA Like operator is much faster than my usual way of doing things, eg:
Code:
If str Like "ABC*"
'...is much, much faster than...
If Left(str, 3) = "ABC"
So I'm keeping an eye out for other efficiencies.
 
Not by my timing.
Code:
Sub Timing()
  Const nit = 9999999
  vs = "ABC"
  vstr = "ABCDEF"

  ' Method #1:
  va = Timer
  For ji = 1 To nit
    If Left(vstr, 3) = vs Then
      End If
    Next ji
  t1 = Timer - va

  ' Method #2:
  vsl = "ABC*"
  va = Timer
  For ji = 1 To nit
    If vsl Like vstr Then
      End If
    Next ji
  t2 = Timer - va

  ' Method #2:
  va = Timer
  For ji = 1 To nit
    If InStr(vstr, vs) = 1 Then
      End If
    Next ji
  t3 = Timer - va

  MsgBox "1: " & t1 & ", 2: " & t2 & ", 3: " & t3
  End Sub
It says
84096

InStr is faster than examining the substring but slower than Like.
 
While we're talking timing, I learned some time ago that it's about ten times faster to access a scalar variable than the property of an object. Much bigger difference than I expected! So nowadays I'm in the habit of grabbing a property and saving it in a variable if it's going to be in a loop or even used more than once:
Code:
Set owb = ThisWorkbook
Set ows = owb.ActiveSheet
Set ocs = ows.Cells 'saves time instead of making VBA look up ows.Cells(row, col) each time
vavg = ocs(23, 1).Value 'get the average value from the bottom of the col
For jr = 2 to 22
  If ocs(jr, 1) > vavg then 'do something here
  End If
Although actually it's even faster to use an array:
Code:
Set owb = ThisWorkbook
Set ows = owb.ActiveSheet
Set ocs = ows.Cells
ar = Range(ocs(2, 1) to ocs(23, 1)).Value
rz = UBound(ar, 1)
vavg = ar(rz, 1) 'get the average value from the bottom of the col
For jr = 1 to rz - 1
  If ar(jr, 1) > vavg then 'do something here
  End If
Working with individual cells gets slow if you're working with more than one or two hundred of them. Putting them all in an array is more work, but is blindingly fast even if you're working with tens of thousands of cells.
 
InStr is faster than examining the substring but slower than Like.
The same on my side even once the Like 'logic error' fixed and without checking Instr for a particular value,​
the same with my below try with counters results …​
So weird some Excel VBA experts, Gurus, MVP and so on claim Instr is faster than Like. :rolleyes:
On my side I never care I just use the more appropriate function according to what I need …​
84098
Code:
Sub Timing3()
  Const C = "ABCDEF", Z = 9999999
    Dim S$, B!, L&, R&(1 To 3), T!(1 To 3), V
        S = "ABC"
' #1
        B = Timer
        For L = 1 To Z:  R(1) = R(1) - (Left(C, 3) = S):  Next
        T(1) = Timer - B
' #2
        B = Timer
        For L = 1 To Z:  R(2) = R(2) - (InStr(C, S) > 0):  Next
        T(2) = Timer - B
' #3
        S = S & "*"
        B = Timer
        For L = 1 To Z:  R(3) = R(3) - (C Like S):  Next
        T(3) = Timer - B
'
        V = [{"Left ","InStr","Like "}]
        S = ""
    For L = 1 To 3
        S = S & L & ") " & V(L) & " : " & Format$(T(L), "0.000s") & " / " & R(L) & vbLf & vbLf
    Next
        MsgBox S, , "Timer Results"
End Sub
 
InStr is faster than examining the substring but slower than Like.

Your example is executing the if statement for Left and InStr, but not for like. If I arrange things so that each statement satisfies the if statement, I get Instr as slightly faster,.

Code:
Sub Timing()
    Const nit = 9999999                               '9,999,999
    vs = "ABC"
    vstr = "ABCDEF"
    Dim A, B, C

    ' Method #1:
    va = Timer
    For ji = 1 To nit
        If Left(vstr, 3) = vs Then
            A = A + 1
        End If
    Next ji
    t1 = Timer - va

    ' Method #2:
    vsl = "A*F"
    va = Timer
    For ji = 1 To nit
        If vstr Like vsl Then
            B = B + 1
        End If
    Next ji
    t2 = Timer - va

    ' Method #2:
    va = Timer
    For ji = 1 To nit
        If InStr(vstr, vs) = 1 Then
            C = C + 1
        End If
    Next ji
    t3 = Timer - va

    Debug.Print A
    Debug.Print B
    Debug.Print C

    MsgBox "1 Left: " & t1 & vbCr & "2 Like: " & t2 & vbCr & "3 Instr: " & t3
End Sub

84116
 
I just went back and checked my code, and yeah, I see the mistake in the Like test. But I'm still getting Like slightly faster now that it's fixed. Here's the same program only using literals, for easier reading:
Code:
Sub Timing()
  Const nit = 9999999

  ' Method #1:
  va = Timer
  For ji = 1 To nit
    If Left("ABCDEF", 3) = "ABC" Then
      End If
    Next ji
  t1 = Timer - va

  ' Method #2:
  va = Timer
  For ji = 1 To nit
    If "ABCDEF" Like "ABC*" Then
      End If
    Next ji
  t2 = Timer - va

  ' Method #2:
  va = Timer
  For ji = 1 To nit
    If InStr("ABCDEF", "ABC") = 1 Then
      End If
    Next ji
  t3 = Timer - va

  MsgBox "1: " & t1 & ", 2: " & t2 & ", 3: " & t3
  End Sub
84117

And by the way, I suspect InStr would do worse if the target string doesn't happen to be near the beginning. But then I suppose so would the other two methods, so never mind.
 
As on my side​
84118
so to conclude it depends on the way it is coded, the Excel version, the OS and the system setup …​
 
Please, focus to this:
I'm trying to use a VBA I found elsewhere but it only searches ONE folder, when I need it to search the subdirectories too.
Results or speed ... without correct results is it matter of speed?
 
As the parenthesis is already closed with my previous post and as as I gave a way in post #2, still waiting for OP news … :rolleyes:
 
If the hard drive is mechanical - not a SSD - see this faster way rather than the slower FSO way :​
I'm working off my local C: laptop for work so lets assume Mechanical with files that I'm going to reuploading to a knowledge repository. Thank you for the link! I'm super new to a lot of this so I think that's a bit over my head right now but I'm reading into it.

Apologies if I used terminology that may be slightly different here, but when I said "faster" I only meant that in terms of "please don't make me open every single folder and run this VBA code".

My main issue is still that I need a find/replace function/feature/code/script that can scan an entire directory and its subfolders and their files for terms in an Excel spreadsheet Column A and replace them with the phrase/code/paragraph/etc from column B
 
Last edited:
So in this case you can try a mix :​
  • Windows FSO for scanning folders
  • VBA funcion Dir for opening workbooks
  • both in a stand alone recursive procedure (calling itself) …
 
Back
Top