All articles with 'text processing' Tag
Excel text functions are essential for cleaning up data, extracting parts or combining results. Learn the most important functions and their usage in this article.
Ever wanted to make a random sentence or text? You can use Excel formulas to make totally random sentences. This is a great way to generate test data or dummy data-sets.
This is CRAZY!!! I stumbled on a weird use for FILTERXML() while reading a forum post earlier today. So I couldn’t wait to test it. I am happy to share the results.
Say you have some text (sentence / phrase / keyword etc.) in a cell and you want to extract the nth word. Unfortunately Excel doesn’t have a SPLIT() formula. So we end up writing obscenely long array formulas or using a gazillion helper columns.
Here is the super sneaky trick. Use FILTERXML() instead.
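Here is a rough sketch of the idea in Python rather than Excel. It assumes the trick is to wrap every word in its own XML tag and then pick one out by position with XPath. The function name `nth_word` and the tag names `t` and `s` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def nth_word(text: str, n: int) -> str:
    """Return the n-th word (1-based) of text, or "" if there is no such word."""
    if n < 1:
        return ""
    words = text.split()
    # Wrap every word in its own <s> node so the sentence becomes a tiny XML document.
    # escape() keeps characters such as & and < from breaking the XML.
    xml = "<t>" + "".join(f"<s>{escape(word)}</s>" for word in words) + "</t>"
    root = ET.fromstring(xml)
    # XPath positions are 1-based, so s[2] is the second word.
    node = root.find(f"s[{n}]")
    return node.text if node is not None else ""


print(nth_word("the quick brown fox", 3))
```

Asking for a word that is not there (say the 9th word of a 4-word sentence) gives back an empty string instead of an error.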
We all know that Pivot Tables are the best thing since avocado on toast. But they can’t slice text values and spread them in a table. So how do we take a large blob of text and turn it into something meaningful like above?
Simple, we use Power Query.
Let’s talk about the untrimmable spaces.
We all know that TRIM() removes extra spaces from the beginning, end and middle of a text.
So for example, if A1 has “ something and one more ”, then =TRIM(A1)
will give “something and one more”.
We can use CLEAN() function to remove non-printable characters (like the ASCII codes 0 to 31). Of course, SPACE is technically a printable character, so CLEAN() won’t remove spaces.
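To make the two behaviours concrete, here is a small Python imitation of them. It is only a sketch of what is described above, not Excel's actual implementation, and the names `excel_trim` and `excel_clean` are invented for this example.

```python
def excel_trim(text: str) -> str:
    """Drop leading/trailing spaces and squeeze runs of spaces down to one."""
    # Split on the ordinary space only (code 32), throw away empty pieces, rejoin.
    return " ".join(piece for piece in text.split(" ") if piece)


def excel_clean(text: str) -> str:
    """Drop non-printable characters with codes 0 to 31; spaces (code 32) stay."""
    return "".join(ch for ch in text if ord(ch) > 31)


print(repr(excel_trim("  something   and one more  ")))
print(repr(excel_clean("line one\nline two")))
```

Notice that `excel_clean` leaves every space alone, which is exactly why the two functions are often used together.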
The untrimmable spaces…?
The other day Sreekanth emailed me a sample of data and asked, “how do I remove the spaces in this list and convert them to numbers?”
Naturally I tried to TRIM().
But the data won’t budge. See above.
Hmm, let’s investigate why.
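One way to investigate is to look at the character codes hiding in the cell. The Python sketch below does that. It assumes, and this is only a guess since the sample data is not shown here, that the culprit is a non-breaking space (code 160), which looks like a space but is a different character. The helper names are hypothetical.

```python
def char_codes(text: str) -> list:
    """List every character next to its code, so odd 'spaces' stand out."""
    return [(ch, ord(ch)) for ch in text]


def strip_odd_spaces(text: str, odd_space: str = "\u00a0") -> str:
    """Turn the odd space into an ordinary one, then trim as usual."""
    # odd_space defaults to the non-breaking space (code 160); that is an assumption.
    normal = text.replace(odd_space, " ")
    return " ".join(piece for piece in normal.split(" ") if piece)


sample = "\u00a01234\u00a0"       # made-up value that looks like " 1234 "
print(char_codes(sample))         # the codes reveal 160 instead of 32
print(int(strip_odd_spaces(sample)))
```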
So here is some news from the strange-but-true department. Microsoft Excel blamed for gene study errors [bbc.com].
Microsoft’s Excel has been blamed for errors in academic papers on genomics.
Researchers trying to raise awareness of the issue claim that the spreadsheet software automatically converts the names of certain genes into dates.
Gene symbols like SEPT2 (Septin 2) were found to be altered to “September 2”.
This is what happens when you spend countless hours learning genome sequencing and very little about the software tools where your data goes. Maybe we need Clippy back to warn people about such sticky situations.
Here is an interesting problem to start your day.
Let’s say you work as a DNA sequencing engineer at The Enterprise. And you just unlocked the sequence that is responsible for all male problems. The early onset of baldness. The sequence code is AAAA. And you want to find out how many times this sequence is found in a sample of DNA strings, in the range B6:B19. Essentially you want the above.
So how do you write the formula?
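Before writing the formula, it helps to pin down what "how many times" means. The problem does not say whether overlapping matches count (is AAAAA one hit or two?), so this Python sketch supports both readings. The function names and the sample strands are made up.

```python
def count_occurrences(strand: str, pattern: str = "AAAA", overlapping: bool = True) -> int:
    """Count how often pattern appears in one DNA string."""
    if not pattern:
        return 0
    if not overlapping:
        # str.count() only counts matches that do not overlap.
        return strand.count(pattern)
    # Otherwise test every starting position, so AAAAA counts twice.
    last_start = len(strand) - len(pattern)
    return sum(1 for i in range(last_start + 1) if strand.startswith(pattern, i))


def total_occurrences(strands, pattern: str = "AAAA", overlapping: bool = True) -> int:
    """Add up the counts over a whole column of DNA strings."""
    return sum(count_occurrences(s, pattern, overlapping) for s in strands)


sample = ["AAAAGT", "GTCA", "AAAAAAAA"]   # hypothetical data, not the B6:B19 values
print(total_occurrences(sample, overlapping=True))
print(total_occurrences(sample, overlapping=False))
```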
Okay, time for another challenge.
Imagine you have some data like this. Each cell contains 3 numbers separated by line break – CHAR(10) and you need to extract the number that is 10 digits long.
Go ahead and solve this riddle.
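If you want to check your answer against something, here is one possible approach sketched in Python: split the cell on the line break and keep the piece that is all digits and exactly 10 long. The function name and the sample cell are invented, and returning an empty string when nothing matches is my own choice.

```python
def ten_digit_number(cell: str, digits: int = 10) -> str:
    """Return the first line of the cell that is a number with the given digit count."""
    # CHAR(10) in Excel is the line feed, "\n" in Python.
    for line in cell.split("\n"):
        candidate = line.strip()
        if candidate.isdigit() and len(candidate) == digits:
            return candidate
    return ""


print(ten_digit_number("12345\n9876543210\n42"))
```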
Here is a quick homework to keep you busy this weekend.
Can you extract the number of days from the text below?
Nov15 PUTS (23 days)
March15 TIKS (3 days)
March1 TIKS (25 days)
June11 TIKS (10 days)
Assume the data is from cell A1.
Your solution should return the following:
Post your answers (formulas, VBA code or Power Query M code) in the comments.
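For comparison, here is how the same homework might look in Python with a regular expression. It assumes the days always sit inside parentheses right before the word "days", as in the four sample lines. The name `days_in` is made up.

```python
import re

# A number inside parentheses, followed by "day" or "days".
DAYS_PATTERN = re.compile(r"\((\d+)\s*days?\)")


def days_in(text: str):
    """Return the number of days found in text, or None if there is none."""
    match = DAYS_PATTERN.search(text)
    return int(match.group(1)) if match else None


rows = [
    "Nov15 PUTS (23 days)",
    "March15 TIKS (3 days)",
    "March1 TIKS (25 days)",
    "June11 TIKS (10 days)",
]
print([days_in(row) for row in rows])  # [23, 3, 25, 10]
```

Anchoring on the parentheses keeps the 15, 1 and 11 glued to the month names from being picked up by mistake.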
Last night I asked members of our Chandoo.org Facebook page to share an Excel problem they are struggling with. Francis asked,
How to save a file as .txt in vba without quotes? When I save as .txt, the file has got quotes inside of it. I used the code Print, but it didnt work because the file loses its delimitation.
Does anyone know how to solve this?
Let’s understand how to save a range as text and overcome the double quote problem.
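The general idea can be sketched outside VBA too. The Python stand-in below assumes the fix is to build each line yourself, joining the cell values with the delimiter, and to write the result as plain text so nothing adds quotes for you. The function names and the tab delimiter are assumptions, not taken from Francis's file.

```python
def rows_to_text(rows, delimiter: str = "\t") -> str:
    """Join each row with the delimiter and the rows with line breaks, adding no quotes."""
    return "\n".join(delimiter.join(str(value) for value in row) for row in rows)


def save_rows(path: str, rows, delimiter: str = "\t") -> None:
    """Write the rows to a plain text file exactly as built above."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(rows_to_text(rows, delimiter) + "\n")


sample = [["Name", "Note"], ["Ann", 'said "hi", then left']]  # hypothetical range values
print(rows_to_text(sample))
```

Quotes that are really part of a cell's text are kept as they are; only the wrapping quotes never get added.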
If you deal with customers or colleagues in Europe, often you may see numbers like this:
When these numbers are pasted in Excel, they become text, because Excel can’t understand them.
Here is a simple way to convert the European numbers to regular ones.
Use NUMBERVALUE() Function.
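To see what such a conversion has to do, here is a small Python sketch in the same spirit. It assumes the European numbers look like 1.234,56, with a dot between thousands and a comma before the decimals. The name `number_value` and its defaults are made up for this example.

```python
def number_value(text: str, decimal_separator: str = ",", group_separator: str = ".") -> float:
    """Convert text such as '1.234,56' into the number 1234.56."""
    cleaned = text.strip()
    # Drop the thousands separators first, then switch the decimal mark to a dot.
    cleaned = cleaned.replace(group_separator, "")
    cleaned = cleaned.replace(decimal_separator, ".")
    return float(cleaned)


print(number_value("1.234,56"))
```

The order matters: if you swapped the comma for a dot first, the next step would wipe out the decimal point along with the thousands separators.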
We all know that VLOOKUP (and its cousins MATCH, HLOOKUP and LOOKUP) are great for finding information you want. But they are helpless when you want to do a case-sensitive lookup.
So how do we write case sensitive VLOOKUP formulas?
Simple. We can use the EXACT formula.
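The difference between the two kinds of lookup is easy to see in a few lines of Python. This is only an illustration of the idea of comparing keys exactly, not the Excel formula itself; the function names and the sample table are hypothetical.

```python
def case_insensitive_lookup(key: str, table):
    """Return the value of the first row whose key matches when case is ignored."""
    for candidate, value in table:
        if candidate.lower() == key.lower():
            return value
    return None


def case_sensitive_lookup(key: str, table):
    """Return the value of the first row whose key matches exactly, case included."""
    for candidate, value in table:
        if candidate == key:  # the exact comparison does the real work
            return value
    return None


table = [("abc", 1), ("ABC", 2)]  # made-up lookup table
print(case_insensitive_lookup("ABC", table))
print(case_sensitive_lookup("ABC", table))
```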
On Wednesday (15th July), I ran my first ever webinar, on a topic called, “How to be a BETTER Analyst?” (here is the replay link, in case you missed it). It was a huge success. More than 1,100 people attended the live webinar and hundreds more watched the replay. As part of the webinar, we had interactive Q&A. Viewers posted their questions and I replied to as many of them as I could.
After the webinar, I wanted to make sure I covered all the questions. So I downloaded the chat history. There were more than 700 messages in it. And I was not in the mood to read them line by line to find out the questions. A good portion of chat messages were not questions but stuff like “hello everyone, I am from Idaho”, “Wow, Chandoo has beard!”, “Enjoying a beer in Belgium while watching webinar” etc. So I wanted a quick way to flag the messages as question or not.
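One simple way to do this kind of flagging, sketched in Python: treat a message as a question if it contains a question mark or starts with a typical question word. This rule and the word list are my own assumptions for illustration, and a rule this crude will misjudge some messages.

```python
QUESTION_WORDS = (
    "how", "what", "why", "when", "where", "which", "who",
    "can", "could", "do", "does", "is", "are", "will", "should",
)


def looks_like_question(message: str) -> bool:
    """Flag a chat message as a question using two rough rules."""
    text = message.strip().lower()
    if "?" in text:
        return True
    # No question mark: fall back to checking the first word.
    first_word = text.split(" ", 1)[0]
    return first_word in QUESTION_WORDS


messages = [
    "hello everyone, I am from Idaho",
    "How do I make a dashboard",
    "Is this recorded?",
]
print([looks_like_question(m) for m in messages])  # [False, True, True]
```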
Recently my iPhone 4 crashed. It is 3.5 years old. And just like any other 3 year old, it started acting weird & crazy one night. The next morning it went silent. It won’t go beyond the Apple logo whenever I start it. Since I couldn’t wait for the phone to start, I took out the SIM card (the phone is unlocked, if you are wondering) and placed it in my old Nokia phone. But alas, none of my contacts are on the SIM. They are in the “cloud”.
After a day of answering phone calls from everyone including my mom as “Chandoo here”, I’ve decided to get my contacts back. So I logged in to iCloud to download a backup. And the backup was a .VCF file.
Since I wanted to have all my contact numbers in a spreadsheet, I did what any Excel nerd would do. I built a template that can convert VCF data to an Excel worksheet.
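To give a feel for what such a conversion involves, here is a minimal Python sketch that reads VCF text and collects names and phone numbers. It is not the template itself. It assumes each contact sits between BEGIN:VCARD and END:VCARD lines, with the name on an FN line and numbers on TEL lines, and it ignores everything else. The sample contact is made up.

```python
def parse_vcf(vcf_text: str) -> list:
    """Turn VCF text into a list of {'name': ..., 'phones': [...]} dictionaries."""
    contacts = []
    current = None
    for raw_line in vcf_text.splitlines():
        line = raw_line.strip()
        if line.upper() == "BEGIN:VCARD":
            current = {"name": "", "phones": []}
        elif line.upper() == "END:VCARD":
            if current is not None:
                contacts.append(current)
            current = None
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            # "TEL;TYPE=CELL" and "item1.TEL;type=pref" both boil down to "TEL".
            prop = key.split(";", 1)[0].split(".")[-1].upper()
            if prop == "FN":
                current["name"] = value
            elif prop == "TEL":
                current["phones"].append(value)
    return contacts


sample = "BEGIN:VCARD\nVERSION:3.0\nFN:John Doe\nTEL;TYPE=CELL:+1 555 0100\nEND:VCARD\n"
print(parse_vcf(sample))
```

Each dictionary then maps naturally to one spreadsheet row, with the name in the first column and the numbers after it.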
Today let’s rescue John Doe from John_doe@email.com.
Extract first & last name from email address
Given an email address in the format
You need to extract first name & last name using formulas.
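The steps are easier to see outside a formula, so here is a Python sketch of them. It assumes the address looks like the John_doe@email.com example, with an underscore between first and last name, and it capitalizes both names, which is my own addition. The function name is hypothetical.

```python
def name_from_email(email: str) -> tuple:
    """Split 'first_last@domain' into ('First', 'Last')."""
    # Everything before the @ is the part that holds the name.
    local_part = email.split("@", 1)[0]
    # Cut at the first underscore; last is "" if there is no underscore.
    first, _, last = local_part.partition("_")
    return first.capitalize(), last.capitalize()


print(name_from_email("John_doe@email.com"))  # ('John', 'Doe')
```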