• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Splitting text/authors name

Status
Not open for further replies.

tango

Member
I have more than 10 thousands of scientific references placed under one cell and I would like to separate the authors name on another columns. I have searched the site and tried putting different formula to get the result I wanted but cant get it. Appreciate any assistance please.


Example:

A1

LName, B.A., Q.B. Lname and Y.Z. Lname 1978. Book Title text here. Page number.


Result

B1: LName,

C1: B.A.,

D1: Q.B.

E1: Lname

F1: and

G1: Y.Z.

H1: Lname

I1: 1978.

J1: Book Title text here. Page number.


Therefore whatever appears after the publication date (ie. 1978) can be placed fully on another cell. I dont want to apply "Data>Text to columns" as "J1" will be separated.


Attached is a sample worksheet contains three reference as an example for review.

https://docs.google.com/folder/d/0B4HzoNsXG4s8UDNLOWptN0g5ekE/edit


Thanks.
 
Hi, tango!

Would you please post a greater number of references, 10-20 maybe, and its actual data?

Just to check if and how words like "Book", "Title", "here", "Page", "Number", "Another", ... are in the data.

Regards!
 
Hi Tango!!


See this file: Text is split-ed but can't mark how to separate the Book Name. Same problem as SirJb7 facing.


http://dl.dropbox.com/u/60644346/T-Chandoo.xlsx


@SirJb7, Hi,


Searching for Charlsberg in your deep freezer?? :D


Regards,
 
@Faseeh

Hi!

Not yet, buddy, freezer's still not handy.

Regards!

PS: BTW, never dare again to rename Carlsberg or I'll be forced to... <censored>
 
Thanks for all your reply.


@SirJB7: See requested samples here: https://docs.google.com/file/d/0B4HzoNsXG4s8T2tYWVBqdEJSWWM/edit . I have manually split-ed one sample for reference (see A21).


@Faseeh: Thanks. All is good except that the title of the books (Book 1 Title here xxxxxxxxxxxxxx. Page number here) or the subsequent words after the publication date (in this case its, 2011, 1978 and 1989) are also separated. Perhaps you could check the link above for reference.


Thanks again.
 
Hi Faseeh ,


I am not sure you have used the following formulae in your file :


The position of the year of publishing , can be found out by :


Position_of_year : =MIN(IFERROR(FIND(ROW(1900:2012),A1),9999))


entered as an array formula , using CTRL SHIFT ENTER.


Year_published : =MID(A1,Position_of_year,4)


Book_Title : =MID(A1,Position_of_year+6,9999)


Narayan
 
Hi


Why do you want to do this - I ask as there might be a different way to achieve what you are looking for. For example, do you want sort by Author?


Cheers

Glen
 
@Glen. yes and other type of data manipulation like exporting the file as xml with its own fields.


@SirJB7 Thanks, @Faseeh thanks, @Narayan thanks... This is really good!


I incorporated the codes (see link below) but I think I have it wrong as it requires me to review cell by cell to remove unwanted data (marked in red). I could not imagine doing that with 10k of references.


Is there a better way to achieve the right format? Like simplifying or consolidating all codes together please.


https://docs.google.com/file/d/0B4HzoNsXG4s8bVpaRklUbjNKMVU/edit


Thanks a lot.
 
Hi ,


The basic problem is separating the authors' names , since the formats are many.


The formulae in columns U , V and W will directly give you the Year of Publishing , and the Book Title , assuming that everything to the right of the year is a part of the Book Title.


To separate the authors' names , first use the value obtained from the formulae in column U to get the complete list of authors , using the LEFT function.


Within the list of authors , the first author's name follows the NAME and then INITIALS format separated by commas , while succeeding authors' names follow the INITIALS first and then the NAME format , separated by commas. The last author's name , where there is more than one author , is separated from the previous author not by a comma , but by the word "and". With three different formats to take care of , doing this by formulae is going to be complicated ; can you accept a VBA solution ?


Narayan
 
Hi! I have the exact same issue. But I cant access the Google drive where you explain the solution. Could anyone lend me a hand with this?
 
DiegoEch
You should open a new thread
... with Your own sample Excel-file
as You've read.
This is over ten years old thread ... and ... this is closed now.
 
Status
Not open for further replies.
Back
Top