• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Altering column values

vk7

Member
Hello All,

I have an excel worksheet with few columns, and in one of the column I wanted to remove all the Garbage Values (i.e all special characters including periods, comma, semicolon, colon, apostrophe, single quotes, double quotes, less than, greater than, bullets, tilde, dashes and all other possible special characters including alphanumeric characters). Also I wanted to convert the values from uppercase to lowercase.

This is just an example but it should eliminate all of the special characters in the column named Garbage in the attached excel sheet.

The following data

-This is line11- that talk's about chandoo.org, where people Get master in Excel; "operation's".And help other people in the forum. There have-been many helpful tutorial's available in this//forum // that helps many users to improve themselves.

Should look like the below one after conversion

this is that talks about chandoo org where people get master in excel operations and help other people in the forum there have been many helpful tutorials available in this forum that helps many users to improve themselves
 

Attachments

  • Document.xlsx
    9.6 KB · Views: 7
Hi ,

Replacing chandoo.org with chandoo org implies that the period has been replaced by a space character. Either this can be done for all period symbols , where ever they may occur , or it can be done for the text .org ; which is it ?

Narayan
 
  • Like
Reactions: vk7
Basically that line has the following data.

(TAM123) Remove the alphanumeric characters. And all ^ special characters from this line like :);'?//..<>\|\\||{}[]()~``!@#$%^&*)

So after conversion, it should be like the following one

remove the alphanumeric characters and all special characters from this line like
 
The position is not fixed, it can be anywhere in the sentence. First, Last, Middle anywhere it can be. The logic is (parenthesis) is a special character and it should be removed. After that TAM123 is remaining which is alphanumeric character as it is alphanumeric it should be removed as well.
 
Hi ,

I did not mean First , Last or Middle.

Your uploaded file has the following in cell B2 :

• This is 1st sentence, where I'm trying to remove Garbage Character's like period, apostrophe, bullets, …….

The highlighted 3 characters are also an alphanumeric string , just like TAM123 ; how do you keep one and remove another ? What is the logic ?

Narayan
 
  • Like
Reactions: vk7
Seems like that is making confusion for you. I apologize for the same.

The following string
• This is 1st sentence, where I'm trying to remove Garbage Character's like period, apostorphe, bullets, …….

After comparison, should look like
this is sentence where im trying to remove garbage characters like period apostorphe bullets

Logically, all alphanumeric characters should be removed.
 
No I'm even more confused: what would you keep then?
Hi ,

When the term alphanumeric is used , what is meant here is that if both alphabets and digits occur in the same word , that word will be removed. Of course , if digits occur on their own too , they are to be removed.

What will remain at the end of this data processing will be only alphabets.

Narayan
 
  • Like
Reactions: vk7
@NARYANK991, I see it is not removing numeric characters. It also needs to remove numeric characters. In addition, it doesn't convert it to lowercase.

loyal customers are present in huge numbers around 12780909 In this line remove the special character and fill the gap with the space
 
@NARAYANK991,

There is a minor error, I believe so. It cannot remove quotations and dashes(–). Please see the attached file (B1)
 

Attachments

  • Test (1).xlsm
    16.1 KB · Views: 1
Hi ,

When special characters were mentioned initially , only those characters described in the data were considered.

Now , there seem to be characters with ASCII codes 160 , now 150 ; every new character will have to be added afresh to the list. Nothing can be done.

Narayan
 

Attachments

  • Test.xlsm
    14.7 KB · Views: 5
  • Like
Reactions: vk7
Hello @NARAYANK991, I understood now. One final question, how do you find that those are ascii characters? Also is there a quick way to list all ascii characters and their decimal values?
 
Hi ,

ASCII characters are those whose codes range from 0 through 255 , with codes 0 through 127 being standard ASCII , and codes beyond that till 255 being extended ASCII.

https://en.wikipedia.org/wiki/ASCII

https://en.wikipedia.org/wiki/Extended_ASCII

The Excel values for these are available using the CODE function.

Thus , if you have the letter A in uppercase in cell A1 , using a formula such as :

=CODE(A1)

will return the value 65 , which is the ASCII code for uppercase A.

Narayan
 
  • Like
Reactions: vk7
Back
Top