• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Sub-string to be extracted, updated and replaced back within string

saidhruv

Member
Dear All,

This query is about a specific string to be extracted from a full string within cell and thereby replacing that specific string back within the original string, i.e. within the same cell. Before this replacement, the extracted string would be manually appended with a specific code before it is fitted back within original string.

The explanation of query is very lengthy and i am really sorry about that.

The data that i have is a list of links which consists of . (dot) within the link. The output required is, replace the dot with specific code. However, this specific code is not generic, it changes based on where the dot is situated within the link. Example, NEWYORK. would have a html code for fullstop, hence output would be, NEWYORK'.. The other example is, p.t.o. In this case the dots will be replaced with html code of dot, i.e. p⋅t⋅o⋅

In this manner the application of whether dot is an html code ⋅ or ' would depend on situation and hence we cannot make a logic within formula/vba. The replacements have to be made manually.

But, what can be done is, LOGIC: extract the data just before the dot (along with the dot) and place it within a separate sheet. Manually one can replace the dot with specific code within that separate sheet. And finally the code to replace the extracted data back into the original sheet.

Now, a collegue of mine has already helped me in the case where the . are available within ADDRESS. However, we have another set of data that are a list of links. In this part the code available with us is not compatible. Hence we need help here.

Explaining the code that is compatible with ADDRESS. Have attached the file as well named DotCode.
Step1: Macro: MakeClean - this macro extracts data from within a pre-selected range. The data has the dot as well and is pasted in a separate sheet.
Step2: Manually replace the dots with specific code within that output available in separate sheet.
Step3: Macro: MultiReplace - this macro replaces the updated data within the original data, in a new sheet.

You may test this code. Please note that before manually replacing the dots as mentioned in step2, pls sort the output based on length of the data, large to small. This sorting is required as a workaround to overlook an existing bug within MultiReplace macro.

However, the macro that is required is as follows so as to extract data from LEFT of dot as per logic stated below:
Step1: remove all spaces available within the string. Please do not use TRIM cause we need to remove all spaces. This is required.
Step2: Extraction logic:
If punctuation (any special character) is available before dot, extract only that particular punctuation along with dot
.
If alphabet is available before dot, extract all alphabets till we get punctuation or integer or reach the start of cell..
If digit is available before dot, extract only that first digit just before dot and display along with dot

If . is available at the start, pls extract . along with the next set of data, i.e. if punctuation then only one punctuation, if word it should be extracted till it ends with punctuation or digit and if it is digit extract only that digit which is next to dot.

The above stated is the logic. Attaching the file LinkMakeClean. It has sheet link_example that has sample link data. The data is in column A and starts with A2.. Now, pls refer the other attachment DotCode. The macros available in this sample worksheet are required to be created with the logic stated above.

See if you can help cause we have tried almost many a times, but in vain.

Thanks!
 

Attachments

  • DotCode.xlsm
    25.7 KB · Views: 2
  • LinkMakeClean.xlsm
    26.1 KB · Views: 2
1) Which sheet are you referring to?
2) If you upload a workbook with before/after would help to understand more.
And the data should includes all the variations that you have stated in your first post.
 
1) Which sheet are you referring to?
2) If you upload a workbook with before/after would help to understand more.
And the data should includes all the variations that you have stated in your first post.

Dear Jindon..

DotCode.xlsm contains the code that is working fine. It is attached just for reference in case anyone wants to understand that code. However, the code only works for a particular set of data and it does not work for another set of data for which i have raised this question.

Now to answer your request i am attaching a before/after data.

Please find the attachment LinkMakeClean_Jindon.xlsm.. This excel has the following sheets:
link_example - this sheet has the sample of the data which has all variations mentioned in my first post.
Sheet10 - this sheet has the data which is expected to be extracted from the sheet link_example. Data has to be extracted based on the logic mentioned in my first post. However, i will once again mention the logic at the end part of this post, again. To extract this data, i request for a vba code.
ManualSubstitute - in this sheet i will manually substitute the dots which html codes. I will copy data from sheet10 into this sheet and will manually work on it.
FinalOutput - this sheet is the final part expected. The data from ManualSubstitute sheet must be placed within original data and pasted within this sheet. To replace this data, i request for a vba code.

Logic for Sheet10: Search for dot (.), and

a) If punctuation (any special character) is available before dot, extract only that particular punctuation along with dot..
b) If alphabet is available before dot, extract all alphabets till we get punctuation or integer or reach the start of cell..
c) If digit is available before dot, extract only that first digit just before dot and display along with dot..
d) If . is available at the start, pls extract . along with the next set of data, i.e. if punctuation then only one punctuation, if word it should be extracted till it ends with punctuation or digit and if it is digit extract only that digit which is next to dot.

Logic for sheet FinalOutput:
Please replace the data available in ManualSubstitute sheet within the original data available in link_example sheet and paste it here.

Thanks!!
 

Attachments

  • LinkMakeClean_Jindon.xlsm
    27.9 KB · Views: 1
See if this is how you wanted.
Code:
Sub test()
    Dim a, i As Long, m As Object, n As Long
    a = Application.Substitute(Range("a2", Range("a" & Rows.Count).End(xlUp)).Value, " ", "")
    ReDim b(1 To 10000, 1 To 1)
    With CreateObject("VBScript.RegExp")
        .Global = True: .IgnoreCase = True
        .Pattern = "^\.[\W_]|\.(.*?)(?=\W)|([a-z]*?)\.+|[\W_]\.+|\d\."
        For i = 1 To UBound(a, 1)
            For Each m In .Execute(a(i, 1))
                n = n + 1: b(n, 1) = m
            Next
        Next
    End With
    If n > 0 Then [b2].Resize(n).Value = b
End Sub
 

Attachments

  • LinkMakeClean_Jindon with code.xlsm
    28.8 KB · Views: 2
Last edited:
See if this is how you wanted.
Code:
Sub test()
    Dim a, i As Long, m As Object, n As Long
    a = Application.Substitute(Range("a2", Range("a" & Rows.Count).End(xlUp)).Value, " ", "")
    ReDim b(1 To 10000, 1 To 1)
    With CreateObject("VBScript.RegExp")
        .Global = True: .IgnoreCase = True
        .Pattern = "^\.[\W_]|\.(.*?)(?=\W)|([a-z]*?)\.+|[\W_]\.+|\d\."
        For i = 1 To UBound(a, 1)
            For Each m In .Execute(a(i, 1))
                n = n + 1: b(n, 1) = m
            Next
        Next
    End With
    If n > 0 Then [b2].Resize(n).Value = b
End Sub


Dear Jindon,

Can it get any more awesome.. It works perfectly. Thanks a lot..
Now about the other and the final part of the logic. Can you please replace the data back into the original string?
I have manually updated the data you extracted. The manually replaced data is available in sheet ManualSubstitute, Column D. Attaching the file back.
Can you replace this data back into original position within FinalOutput sheet.

Thanks aTon!
 

Attachments

  • LinkMakeClean_Jindon_with_code_TESTEDFINE.xlsm
    26.3 KB · Views: 1
Replace?
Code:
Sub myReplace()
    Dim a, i As Long
    a = Sheets("manualsubstitute").Cells(1).CurrentRegion.Value
    With Sheets("link_example").Columns(1)
        .Replace " ", "", 2
        For i = 2 To UBound(a, 1)
            If (a(i, 1) <> "") * (a(i, 4) <> "") Then .Replace a(i, 1), a(i, 4), 2
        Next
    End With
End Sub
 
Super LIKE dear jindon!!!! May God Bless You with the choicest of His Blessings!! You have done something which we tried since a month and failed. You did it in say less than 30 minutes... Fabulous!!!
 
Back
Top