Find and replace in Word macro

CobolDrudge

New Member
This is a VBA question for Word, not Excel. I hope that’s OK.

I have a large Word document which needs an edit. I’m pretty sure it requires VBA because of a Boolean condition. Here is a sample of the file (4 paragraphs; a couple of non-ASCII characters are not showing properly):

Viola riviniana, common dog-violet {het bleeksporig bosviooltje}; large square sepal appendages >1.5mm, spur curved up, blunt, notched, paler, much branched purple veins esp. lower petal
V. reichenbachiana, early dog-violet {het donkersporig bosviooltje}; sepal appendages <1.5mm, spur straight, pointed, unnotched, violet, darker, upper petals cf rabbit, veins on lower  unbranched
V. odorata, sweet violet {de maarts viooltje}; closely downy, sepals blunt, broad glossy lvs, rooting runners, fragrant, fl dark violet, white, other colours; vc
V. palustris, marsh violet {het moerasviooltje}; lvs reniform, 1-4cm wide, fls 10-15mm wide, blunt petals with dark purple veins, blunt spurs

And here is what I need to change it to:

Viola riviniana, common dog-violet {bleeksporig bosviooltje (het)}; large square sepal appendages >1.5mm, spur curved up, blunt, notched, paler, much branched purple veins esp. lower petal
V. reichenbachiana, early dog-violet {donkersporig bosviooltje (het)}; sepal appendages <1.5mm, spur straight, pointed, unnotched, violet, darker, upper petals cf rabbit, veins on lower  unbranched
V. odorata, sweet violet {de maarts viooltje}; closely downy, sepals blunt, broad glossy lvs, rooting runners, fragrant, fl dark violet, white, other colours; vc
V. palustris, marsh violet {moerasviooltje (het)}; lvs reniform, 1-4cm wide, fls 10-15mm wide, blunt petals with dark purple veins, blunt spurs

All the action is in the curly brackets, which occur as a pair either once or not at all per paragraph. The job is to move the word ‘het’, if it occurs directly after the opening curly bracket, to just before the closing bracket, surrounding it with round brackets. In other words, the five characters ‘{het ’ become ‘{’ and ‘}’ becomes ‘ (het)}’ (note the added leading space).

In the third paragraph in the sample there is no change applied because we have ‘de’ instead of ‘het’.

I have a notion of how to do the looping. Going through the paragraphs will be something like:

For i = 1 To ActiveDocument.Paragraphs.Count
    ' ...
Next i

And within that I can loop through the words with something along the lines of:

For j = 1 To ActiveDocument.Paragraphs(i).Range.Words.Count
    ' ...
Next j


- but I’m not at all sure how to code the text manipulation and would appreciate help. I see the main problem as being that ‘{het’ is a single item in the Words collection because it has a space on either side, but ‘}’ is not a word because it never has a space on either side.
 
Hi,

Try this code and let me know how it goes.
Code:
Sub MoveHetOutsideBrackets_Regex()
    Dim docRange As Range
    Dim re As Object
    Dim docText As String

    ' Get the full document text as a plain string
    Set docRange = ActiveDocument.Content
    docText = docRange.Text

    ' Create the RegExp object: "{het " followed by anything up to "}"
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\{het ([^}]*)\}"
    re.Global = True
    re.IgnoreCase = False

    ' Move "het" to the end of each match, then write the text back
    If re.Test(docText) Then
        docText = re.Replace(docText, "{$1 (het)}")
        docRange.Text = docText
    End If
End Sub

If you like it, don't forget to click the thumbs up

Regards
 
It has very nearly worked! All the transformations have been done correctly but it has converted the whole file to bold format, possibly because the very first word in the file is in bold. The distinction between normal and bold is vital and needs to be preserved. Here is a longer sample of the file (line breaks not showing properly because I'm pasting directly from Word):

Droseraceae

Drosera rotundifolia, round-leaved sundew {de ronde zonnedauw}; lf stalks horizontal & hairy

D. intermedia, oblong-leaved sundew {de kleine zonnedauw}; lf stalks erect, hairless, infl from below

D. anglica, great sundew {de lange zonnedauw}; infl long, to 2x lvs, from centre of rosette

Cistaceae

Helianthemum nummularium (H. chamaecistus) {het geel zonneroosje}, common rock-rose

H. appenninum, white rock-rose {het rotszonneroosje}

H. oelandicum (canum), hoary rock-rose {het oelandzonneroosje}; fls 1-1.5cm, bent styles, lvs 2mm wide

Violaceae

Viola riviniana, common dog-violet {het bleeksporig bosviooltje}; large square sepal appendages >1.5mm, spur curved up, blunt, notched, paler, much branched purple veins esp. lower petal

V. reichenbachiana, early dog-violet {het donkersporig bosviooltje}; sepal appendages <1.5mm, spur straight, pointed, unnotched, violet, darker, upper petals cf rabbit, veins on lower ± unbranched

V. odorata, sweet violet {de maarts viooltje}; closely downy, sepals blunt, broad glossy lvs, rooting runners, fragrant, fl dark violet, white, other colours; vc

V. palustris, marsh violet {het moerasviooltje}; lvs reniform, 1-4cm wide, fls 10-15mm wide, blunt petals with dark purple veins, blunt spurs

The bold elements are botanical families and there are some hundreds of them in the file.
 
Hi,

Sorry, I didn’t realize that bold and other formatting was being removed.
Please try this fixed code:
Code:
Sub MoveHetOutsideBrackets_Regex()
    Dim re As Object, match As Object, matches As Object
    Dim rng As Range, matchRange As Range
    Dim i As Long

    ' Set up regex: "{het " followed by anything up to the closing "}"
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\{het ([^}]*)\}"
    re.Global = True
    re.IgnoreCase = False

    ' Search in the main document text
    Set rng = ActiveDocument.Content
    Set matches = re.Execute(rng.Text)

    ' Loop from last match to first so earlier offsets stay valid
    For i = matches.Count - 1 To 0 Step -1
        Set match = matches(i)

        ' Define a range at the match position. This assumes string offsets
        ' equal range positions, which may not hold with fields or tables.
        Set matchRange = rng.Duplicate
        matchRange.Start = rng.Start + match.FirstIndex
        matchRange.End = matchRange.Start + match.Length

        ' Rewrite only the matched span; text outside it keeps its formatting
        matchRange.Text = "{" & match.SubMatches(0) & " (het)}"
    Next i
End Sub

Regards
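
A possible alternative, offered only as an untested sketch: Word's built-in wildcard Find and Replace can make the same edit without mapping regex offsets onto document ranges, and it should leave formatting outside the matched spans alone. The macro name MoveHetWithWildcards is made up for illustration. The search pattern is the Word-wildcard equivalent of the regex above, and wildcard searches are case-sensitive, so ‘Het’ would not match.

```vba
Sub MoveHetWithWildcards()
    ' Hypothetical macro name; sketch only, try it on a copy of the document first.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        ' "{het " followed by one or more non-"}" characters, then "}"
        .Text = "\{het ([!\}]@)\}"
        ' \1 is the captured text between "{het " and "}"
        .Replacement.Text = "{\1 (het)}"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

The same pattern can be tried by hand first in the Replace dialog with "Use wildcards" ticked, which makes it easy to check a few matches before running it over the whole file.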
 