• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Comparing two spreadsheets for differences

sborah99

Member
I have two spreadsheets with similar data. The first spreadsheet has close to 7500 entries while the second spreadsheet has close to 19,000 entries. Both spreadsheets have data concerning baseball players (bio details, measurements, baseball stats, etc.), but I need to transfer certain information from one sheet to the other. So, I need to know if I can accomplish a few things with Excel. First, I need to know which entries from the first spreadsheet appear in the second spreadsheet. Secondly, I need to know if the extraneous information in the second spreadsheet can be moved out easily so that the matching info aligns with the first spreadsheet so that I can transfer the new info I need easily. Thirdly, if I cannot do the second thing easily, is there another way? I have downloaded a trial version of Excel Compare but I cannot get it to do what I want. Any help is greatly appreciated. Thanks.


Stacy
 
Hi Stacy,


This can be done using Excel. Could you please post a sample worksheet and send me the link?


Refer: http://chandoo.org/forums/topic/posting-a-sample-workbook


PK
 
Thanks PK. Here is the link for the sample worksheet:


https://docs.google.com/spreadsheet/ccc?key=0Am9qgILQvLEedFlKMV9kNVduZHRnaFQ5Q3NaOFVPbUE&usp=sharing


The ranges from cells Z to CC are the ranges that will be changed from one spreadsheet to the other.


Stacy
 
You can find the solution here:


https://hotfile.com/dl/203208928/a543484/ML_Rosters.xlsx.html


I have added some extraneous values in column A (highlighted in bold )of sheet 2 to represent the actual situation. I am assuming that columns B to I in sheet 2 have other information.


1) In column J, I am checking if the values in column A match those in column A of sheet 1. For those extraneous values, you will get a "False".


2) You can use a filter in column J to sort the values in alphabetical order. All the "False" values will appear at the top. You can easily select those rows and delete them using the right click on your mouse. That way the "True" values will move up.


3) Using vlookup, you can look up values in columns Z to CC in sheet 2.


Hope this helps. Let me know if you have any questions.
 
PK, will this make it possible to compare both spreadsheets to find matching players in both spreadsheets then move the unmatched players out of the sheets? Thanks.


Stacy
 
Hi, sborah99!


First of all welcome to Chandoo's website Excel forums. Thank you for your joining us and glad to have you here.


As a starting point I'd recommend you to read the green sticky topics at this forums main page. There you'll find general guidelines about how this site and community operates (introducing yourself, posting files, netiquette rules, and so on).


Among them you're prompted to perform searches within this site before posting, because maybe your question had been answered yet.


Feel free to play with different keywords so as to be led thru a wide variety of articles and posts, and if you don't find anything that solves your problem or guides you towards a solution, you'll always be welcome back here. Tell us what you've done, consider uploading a sample file as recommended, and somebody surely will read your post and help you.


And about your question...


If you haven't performed yet the search herein, try going to the topmost right zone of this page (Custom Search), type the keywords used in Tags field when creating the topic or other proper words and press Search button. You'd retrieve many links from this website, like the following one(s) -if any posted below-, maybe you find useful information and even the solution. If not please advise so as people who read it could get back to you as soon as possible.


As you said you've found the answer by yourself, would you be as kind as to post it here so as people who read this topic may have the question and the answer too? Thank you.


Regards!
 
I was a bit hasty when I said I found my answer. I tried using the VLookup function that PK suggested, but it never worked, no matter how I arranged the equation. So, I searched on the forums as you suggested, SirJB7, and I found info about the Match and SumProduct functions, which appeared to hold the keys to the answers I need. But I can't figure out how to get any of them to operate properly. I always end up with an error in the A1 cell when I don't include that cell in any of the function equations.


I'll clarify what I would like to achieve: I have two spreadsheets that list baseball players and their various attributes. The first spreadsheet lists almost 7500 baseball players while the second spreadsheet lists almost 18,800 players. The first spreadsheet has attributes that I want to transfer to the second spreadsheet but only to players that appear in both spreadsheets. Those attributes appear in the cells ranging from Z to CC, and I want to line up the duplicate baseball players in the same order in both spreadsheets so that I can copy the attributes from the first spreadsheet and then paste those attributes into the second spreadsheet in one go. I'll admit that I thought I had better Excel skills than this but I am completely stumped as to how to get anything to work. Thanks, guys.
 
Hi, sborah99!

Consider uploading a sample file (including manual examples of desired output), it'd be very useful for those who read this and might be able to help you. Thank you.

Give a look at the green sticky posts at this forums main page for uploading guidelines.

Regards!
 
Here are portions of the two spreadsheets I am working with:


#1 -- https://docs.google.com/spreadsheet/ccc?key=0Am9qgILQvLEedFlKMV9kNVduZHRnaFQ5Q3NaOFVPbUE&usp=sharing


#2 -- https://docs.google.com/spreadsheet/ccc?key=0Am9qgILQvLEedDVEdlY3SmgyZ0E0UG15RG5zLWotcUE&usp=sharing


#1 has the updated attributes that I need to transfer over to #2. There are players who don't appear in both spreadsheets so I would like to move those out of the sheets and align up all the duplicate players in both spreadsheets. I have highlighted the first two players on each list to show that they have the same attributes in the Z to CC cells. That should clarify what I would like done. Thanks.
 
I tried using this function in sheet #2:


=ISNUMBER(MATCH(E36,[SHEET1]ml_rosters!$E$36:$EQ$18767,0))


and had it check those against the values in sheet #1. All I got was a "FALSE" error in sheet #2. Did I use too wide of a search, should I have just used the values in cells E and F to check the names in sheet #2 with those in sheet #1? Thanks.
 
Hi, sborah99!


Regarding your post about ISNUMBER and MATCH formula there's a syntax error in it, which must be as follows -without considering if it's working right-:

=ISNUMBER(MATCH(E36,'[ml_rosters.xlsx]Sheet1'!$E$36:$EQ$18767,0)

if you want to check for value at E36 into range E36:EQ18767, but looking at your worksheet values I guess you wanted to look for it in a portion of column E only (E36:E18767); if so, the formula should be:

=ISNUMBER(MATCH(E36,'[ml_rosters.xlsx]Sheet1'!$E$36:$E$18767,0)


Regarding your uploaded files post:

a) What do you mean with "move those out of the sheets and align up all the duplicate players in both spreadsheets"? Deleting all rows that don't match in column A against the other workbook and keeping only the coincident ones, ergo the same no. of rows on each worksheet? If not, please elaborate a bit more.

b) Values of columns Z:CC of ML file should directly overwrite values at same columns of OBSL file? If not, same request.


Regards!


EDIT: corrected typo error.
 
Thanks for fixing my equation. I was wondering why I couldn't get it to work. In response to your questions:


a) The first spreadsheet only has 7440 players in it, and those 7440 players have updated attributes. I want to find those same 7440 players in the second spreadsheet, have them listed in alphabetical order in both sheet, then copy the updated attributes from the first spreadsheet and paste those attributes into their corresponding cells in the second spreadsheet.


b) Yes, that is essentially what should happen.


Thanks for your help. I'll try out your fix.
 
Hi, sborah99!

Please confirm if players in 2nd workbook others than those 7440 in 1st workbook should be deleted or what, and what if any of those 7440 in 1st workbook doesn't exist in 2nd?

Yes, you're right, there's a typo error, updated yet, could you retry?

Regards!


EDITED
 
The players in the second workbook should not be deleted, only moved out of the list. I will have another task for them.


This is the formula I am trying to use, and I keep getting an invalid reference error:


=ISNUMBER(MATCH(E36,[ml_rosters]SHEET1!$E$36:$EQ$18767,0))


I took out the [ml_rosters.xlsx] part and it asked me to browse to the file I wanted to compare. But when I selected the file, it gave me the invalid reference error.
 
Hi, sborah99!

Sorry, but I don't fully get what do you want to mean with "only moved out of the list".

About the formula, I just corrected the syntax error but don't analyzed what it actually tried to do. I'll give a look later.

Regards!
 
Players who have no duplicates should not be included in the lists but I would still like them on the spreadsheets somewhere, either above or below the lists of duplicate players.
 
Hi, sborah99!


Give a look at these files:

https://dl.dropboxusercontent.com/u/60558749/Comparing%20two%20spreadsheets%20for%20differences%20-%20ML%20Rosters%20%28for%20sborah99%20at%20chandoo.org%29.xlsm

https://dl.dropboxusercontent.com/u/60558749/Comparing%20two%20spreadsheets%20for%20differences%20-%20OBSL%20Rosters%20%28for%20sborah99%20at%20chandoo.org%29.xlsm


In both:

a) There're defined 2 dynamic named ranges:

"DataTable" as: =DESREF(Sheet1!$A$2;0;0;CONTARA(Sheet1!$A:$A)-2;CONTARA(Sheet1!$1:$1)) -----> in english: =OFFSEF(Sheet1!$A$2,0,0,COUNTA(Sheet1!$A:$A)-2,COUNTA(Sheet1!$1:$1))

"DataUpdateTable" as: =DESREF(DataTable;0;25;56) -----> in english: =OFFSEF(DataTable,0,25,,56)

b) In worksheet Sheet1 there's a procedure SortLinked in the VBA section:

-----

[pre]
Code:
Option Explicit

Sub SortLinked()
' constants
Const ksRng = "DataTable"
' declarations
Dim rng As Range
' start
Set rng = Range(ksRng)
' process
With ActiveSheet.Sort
With .SortFields
.Clear
.Add Key:=Range("ER:ER"), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
.Add Key:=Range("E:E"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
.Add Key:=Range("F:F"), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
.Add Key:=Range("A:A"), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
End With
.SetRange rng
.Header = xlGuess
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
' end
Set rng = Nothing
Beep
End Sub
[/pre]
-----

c) There are 2 new columns ER & ES:

ER1: title

ER2: Z (greater than below values for sorting descending because of damned row 2)

ER3 & down: =SI(ES3=0;"N";"Y") -----> in english: =IF(ES3=0,"N","Y")

ES1: title

ES3 & down: =SI.ERROR(COINCIDIR(A3;'[XXX.xlsm]Sheet1'!$A:$A;0);0) -----> in english: =IFERROR(MATCH(A3,'[XXX.xlsm]Sheet1'!$A:$A,0),0) (where XXX is the name of the other file)

In OBSL file (target) there's a module with a procedure UpdateLinked in the VBA section:

-----

Option Explicit

Sub UpdateLinked()
' constants
Const ksWB = "Comparing two spreadsheets for differences - ML Rosters (for sborah99 at chandoo.org).xlsm"
Const ksWS = "Sheet1"
Const ksRng = "DataTable"
Const ksRngU = "DataUpdateTable"
Const kiColumnExists = 148
Const kiColumnFullName = 150
' declarations
Dim rngSource As Range, rngSourceUpdate As Range
Dim rngTarget As Range, rngTargetUpdate As Range
Dim I As Long, J As Integer, K As Long, L As Long
' start
Set rngSource = Workbooks(ksWB).Worksheets(ksWS).Range(ksRng)
Set rngSourceUpdate = Workbooks(ksWB).Worksheets(ksWS).Range(ksRngU)
Set rngTarget = ThisWorkbook.Worksheets(ksWS).Range(ksRng)
Set rngTargetUpdate = ThisWorkbook.Worksheets(ksWS).Range(ksRngU)
' process
K = 0
L = 0
For I = 1 To rngTarget.Rows.Count
If rngTarget.Cells(I, kiColumnExists).Value = "Y" Then
If rngSource.Cells(I, kiColumnFullName).Value = _
rngTarget.Cells(I, kiColumnFullName).Value Then
For J = 1 To rngTargetUpdate.Columns.Count
rngTargetUpdate.Cells(I, J).Value = rngSourceUpdate.Cells(I, J).Value
Next J
K = K + 1
Else
MsgBox "Player: " & rngTarget.Cells(I, kiColumnFullName).Value & _
" not matching at row: " & I + 2, _
vbApplicationModal + vbCritical, "Warning"
L = L + 1
End If
End If
Next I
MsgBox "Matching records updated: " & K & vbCrLf & _
"Matching records not updated: " & L, _
vbApplicationModal + vbInformation, "Summary"
' end
Set rngTargetUpdate = Nothing
Set rngTarget = Nothing
Set rngSourceUpdate = Nothing
Set rngSource = Nothing
Beep
End Sub

-----


Only once steps:

1) update your workbooks as described in a), b) & c)

2) define the named ranges in them as specified


Each time steps:

1) in both files run the procedure SortLinked

2) in target (OBSL) file run the procedure UpdateLinked


Just advise if any issue.


Regards!
 
Those didn't work, either. They tell me where the duplicates are but they don't align the duplicates between the worksheets. I still have to do that manually. What I have done is colored the first sheet in red then copied and pasted into the second sheet and sorted everything by last and first names. So now the duplicates are sorted with each other. But I still have to go through them one by one.


I want to know if this is possible: Find all the duplicates in both sheets and have them aligned in the same rows on both sheets. The non-duplicates should be sorted into a different list. Can this be done? Thanks.
 
Hi, sborah99!


Could you verify that my uploaded files were correct? If so, go on; if not, I'll check them out.


As you may notice there weren't any matches between the two files you uploaded, so as to test the provided solution I took two values from one sheet (18619 and 26768) and replaced two random values in the other sheet.


After that, I sorted each one using seldom SortLinked procedures, ran UpdateLinked procedure in OBSL workbook, checked the output and it got updated. Those are the files that I uploaded.


If I were you I'd to this:

a) check that my actual source and target files have the two dynamic ranges defined at a)

b) check that my actual source and target files have the procedure SortLinked in module Módulo1

c) check that my actual target file has the procedure UpdateLinked in module Módulo1

d) run in my actual source and target file the procedure SortLinked

e) run in my target file the procedure UpdateLinked

f) check my target file and verify if it was correctly updated


If f) didn't happen please consider uploading both actual files. Give a look at the green sticky posts at this forums main page for uploading guidelines.


Regards!
 
Well, when you say there are no matches, I have to disagree with you. Most of the players on the ml_rosters have duplicates on the obsl_rosters. They just may be on different teams and have different ratings in the obsl-rosters. Those are the players I want to get aligned in the two spreadsheets. Here are links to the two actual spreadsheets I am using:


#1) https://docs.google.com/file/d/0B29qgILQvLEebk9TNUh2Y1BzNFk/edit?usp=sharing


#2) https://docs.google.com/file/d/0B29qgILQvLEecDhoZnlSRGh1RXM/edit?usp=sharing


Sheet #1 has 7440 players. Sheet #2 has almost 18,900 players. I want to find all the players that appear in both sheets and have them listed in alphabetical order separate from the players who don't appear in both sheets. Is this possible?
 
Hi, sborah99!


I checked again your original 2 uploaded files (not the last ones) and I noticed that the Player ID field (column A) differs for the same player between the 2 files, field which I was using to match them.


I added a third helper column at the end of both files, additionally to the 2 previous, as follows:

ER2: Z

ER3:ERxxxx : =SI(ES3=0;"N";"Y") ------> in english: =IF(ES3=0,"N","Y")

ES3:ESxxxx : =SI.ERROR(COINCIDIR(ET3;'Other path[Other file]Sheet1'!$ET:$ET;0);0) -----> in english: =IFERROR(MATCH(ET3,'Other path[Other file]Sheet1'!$ET:$ET,0),0)

ET3:ETxxxx : =E3&" "&F3 -----> in english: =E3&" "&F3


Please download again the updated uploaded files from the same links of previous post. Code of UpdateLinked procedure updated and edited in same post too.


Just advise if any issue.


Regards!
 
Back
Top