Hi,
I would advise against doing anything with this data other than examining it painstakingly.
Which of us can say whether the following two items of data are the same or different?
30%DEZinTolueneinB2
30%DEZinTolueneinB5
The difference is just one character, but for all we know, that one character could mark a significant difference between them.
If the data had been ordinary English text, it would have been another story. With technical text, especially names of chemicals or medicines, it would be difficult to apply an automated solution that eliminates the need for human scrutiny.
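To illustrate the point, here is a rough Python sketch using the standard library's difflib. The function name needs_human_review and the 0.9 threshold are my own assumptions, not anything from your data or tooling. It shows that a generic similarity measure rates the two items above as nearly identical, so the most such a tool can safely do is flag the pair for a person to check, not merge it.

```python
from difflib import SequenceMatcher


def needs_human_review(a: str, b: str, threshold: float = 0.9) -> bool:
    """Return True when two distinct strings are similar enough to be confused.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    if a == b:
        return False  # exact duplicates need no judgment call
    return SequenceMatcher(None, a, b).ratio() >= threshold


a = "30%DEZinTolueneinB2"
b = "30%DEZinTolueneinB5"

# Character-level similarity between 0.0 and 1.0.
similarity = SequenceMatcher(None, a, b).ratio()

# True here: the strings differ, yet they look almost the same to the metric.
flagged = needs_human_review(a, b)

print(f"similarity={similarity:.3f} flagged={flagged}")
```

The score says only that the strings look alike. It cannot say whether B2 and B5 mean the same thing, which is exactly the part that needs a human.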
Narayan