Cleaning Up Imported Data – A Recent Case Study


Yesterday in Formula Forensics 008 we looked at Elkhan's MaxIf problem.

However, the formula presented there was the final solution to his problem.

Elkhan's original worksheet contained other problems, and today we will look at one of them.

I have attached the original file as a sample file, which you can Download Here.

You will see that the MaxIf cell F13 is returning 0, where it should be showing 0.246.

 

Houston, We’ve Had a Problem!

Cell F13 has the same formula we looked at in yesterday’s Formula Forensics: =MAX(IF((Parameter_3=D13)*(Parameter_4=E13),Parameter_5,0))

A quick check showed that the formula was technically correct, yet the answer was wrong.

To solve this I tried several steps, which are the topic of this post:

Examine the logic of the If’s Criteria

The formula =MAX(IF((Parameter_3=D13)*(Parameter_4=E13),Parameter_5,0)) works by calculating the maximum value from the If array.
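As a rough illustration of what the array formula does, here is a minimal Python sketch. The three lists are made-up stand-ins for Parameter_3, Parameter_4 and Parameter_5, not the values in Elkhan's file, and `max_if` is a hypothetical helper name.

```python
# Hypothetical rows standing in for the named ranges in the workbook.
parameter_3 = ["A", "A", "B", "A"]
parameter_4 = ["C1", "C2", "C1", "C1"]
parameter_5 = [0.08, 0.30, 0.15, 0.246]


def max_if(crit_3, crit_4):
    # (Parameter_3=D13)*(Parameter_4=E13): 1 where both tests pass, else 0
    mask = [int(p3 == crit_3) * int(p4 == crit_4)
            for p3, p4 in zip(parameter_3, parameter_4)]
    # IF(mask, Parameter_5, 0): keep the value on a match, otherwise 0
    picked = [value if flag else 0 for flag, value in zip(mask, parameter_5)]
    # MAX(...) over the resulting array
    return max(picked)


result = max_if("A", "C1")      # two rows match; the larger value wins
no_match = max_if("A", "ZZ")    # nothing matches, so every entry is 0
```

When no row satisfies both tests, every entry of the array is 0 and so is the maximum, which is the symptom seen in F13.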

So step 1 was to look at the logic in the If’s Criteria

That is (Parameter_3=D13)*(Parameter_4=E13)

In cell F15 I entered =(Parameter_3=D13)*(Parameter_4=E13) and pressed F9 to evaluate it

Excel returns: = {0;0;0;0;0;0;0;0;0;0;0}

This tells me that none of the cells match the criteria. Strange!

Yet by eye I can see 4 matching records.

Check Cell Length

The next quick step was to look at the length of the text in each cell.

In Column I, I added =Len(E2) and copied it down. There were only 2 characters in each cell, which ruled out leading or trailing spaces.
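The same length check can be sketched in Python. The sample strings are hypothetical. The point is only that a length test exposes padding and nothing else.

```python
clean = "C1"
padded = "C1 "   # hypothetical cell with a trailing space

# Equivalent of =LEN(cell): padding shows up as an extra character
lengths = (len(clean), len(padded))

# Equivalent of =TRIM(cell)=criteria: stripping the padding restores the match
matches_after_trim = padded.strip() == clean
```

A length of 2 therefore rules out stray spaces, but it does not prove that the two characters are the ones you expect.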

Retype the Data

Elkhan supplied the source data in an Excel file.

But the criteria were typed in manually by me.

 

So the next step was to retype some of the original data in Cell E2

Wow, an answer!

 

What is Wrong Here?

So obviously there was a difference between the C1 in cell E2 and the C1 in cell E13.

I checked this in 3 ways:

1. Type the value “C1” into Cell E2, without the quotes

This returned an answer of 0.08 in F2, as it should have.

2. Copy an old “C1” value to E13

This resulted in the correct answer of 0.246 in F13

 

3. Use a quick Formula

In F17 type =E2=E13

Excel returns False, showing that the value of cell E2 does not match E13.

 

So what is in E2:E12 ?

As I had typed the values into the Criteria Cells D13:E13, I knew what they were: a plain and simple “C1”.

So what was in E2:E12 ?

The next step was to look at the ASCII values of the 2 characters in Column E.

In K2: =Code(Left(E2,1))

In L2: =Code(Right(E2,1))

Copy both down to Row 13

Bingo !

The character C in cell E2 wasn’t the same as the character C in E13!

Yet both cells used the Calibri font.

If I now type into a couple of spare cells:

F18 =Char(63), Excel displays a “?”

F19 =Char(67), Excel displays a “C”

Yet Cell E2 is clearly displaying C1 with a first-character ASCII code of 63, which should be a question mark.

I suspected that it had been copied and pasted from MS Access, so I shot an email back to Elkhan, asking what the source of the data was.

The response came back that the data had been copied from a Russian (Cyrillic) version of an MS Word file and pasted into an English version of Excel.

I can’t explain exactly what happened, but somehow the character sets and associated values got scrambled when the data was copied from the Russian Word document into Excel.

If you have had experiences like this or can explain what has happened, please do so in the comments below.
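One possible explanation, offered as an assumption rather than something confirmed from Elkhan's file: the C in the imported cells may have been the Cyrillic capital letter Es (U+0421), which looks identical to the Latin C in most fonts. Excel's CODE function works with the Windows ANSI code page, and a character with no equivalent there comes back as 63, the code for a question mark. The Python sketch below imitates that behaviour, treating cp1252 as the code page (an assumption about an English-language setup). `ansi_code` is a hypothetical helper, not an Excel or Python built-in.

```python
import unicodedata

latin_c = "C"
cyrillic_es = "\u0421"   # assumed culprit: Cyrillic capital letter Es


def ansi_code(ch, codepage="cp1252"):
    # Rough stand-in for Excel's CODE(): characters missing from the
    # code page are replaced by "?" before the byte value is read.
    return ch.encode(codepage, errors="replace")[0]


codes = (ansi_code(latin_c), ansi_code(cyrillic_es))
code_points = (ord(latin_c), ord(cyrillic_es))
names = (unicodedata.name(latin_c), unicodedata.name(cyrillic_es))
```

In Excel 2013 and later, =UNICODE(E2) reports the Unicode code point instead of the ANSI code. Under the assumption above it would return 1057 for the imported cell and 67 for a typed C, which makes this kind of lookalike visible.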

 

Solution

The solution was now easy:

Use Search/Replace

Copy the contents of cell E1,

Go to Search/Replace (Ctrl+H)

Find: Paste the contents of Cell E1

Replace with: C1
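For data cleaned outside Excel, the same replace step might look like the Python sketch below. It assumes the stray characters are Cyrillic capitals that resemble Latin ones. The `LOOKALIKES` table and the sample column are hypothetical and would need extending to whatever characters actually turn up.

```python
# Hypothetical lookalike table: Cyrillic capital -> Latin twin
LOOKALIKES = str.maketrans({
    "\u0421": "C",   # Cyrillic Es
    "\u0410": "A",   # Cyrillic A
    "\u0415": "E",   # Cyrillic Ie
})


def clean(value):
    # Replace every listed lookalike, leave all other characters untouched
    return value.translate(LOOKALIKES)


column_e = ["\u0421" + "1", "C2", "\u0421" + "1"]   # made-up imported values
cleaned = [clean(v) for v in column_e]
```

After cleaning, every entry compares equal to a value typed on an English keyboard.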

 

Conclusions:

  1. Be careful when receiving data from foreign language files, including Word and Excel files
  2. Check summations based on such data to ensure its integrity
  3. Be methodical in tracking down problem cells
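One way to be methodical about it is to scan an imported column for anything outside plain ASCII before building formulas on it. The sketch below is a minimal Python version of that idea. `suspicious_cells` is a hypothetical helper and the sample values are made up.

```python
def suspicious_cells(values):
    """Return (index, value, code points) for entries with non-ASCII characters."""
    flagged = []
    for index, value in enumerate(values):
        if not value.isascii():
            # Record each character's code point so the odd one stands out
            flagged.append((index, value, [hex(ord(ch)) for ch in value]))
    return flagged


report = suspicious_cells(["C1", "\u0421" + "1", "C2"])
```

Legitimate accented or non-Latin text will be flagged too, so the report is a list of cells to inspect, not a list of errors.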

 

Let us know about your Data Transfer Nightmares

Have you had any strange data transfer issues?

Let us know in the comments below.

 

