• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

How to remove mutiple random rows from somewhat good order?

Iscariot

New Member
Hello.

I have ran in to the problem where I do have an excel file with 12000+ rows and most of it are ordered somewhat fine:

first row: good data
second row: good data
third row: good data

Problem appears when sometimes it goes like this:

First row: good data
Second row: good data
Third row: BAD UNNEEDED DATA
Fourth row: good data

How can I remove that random row with bad data?

Context (look at the attached file):

1. First and second row always has good data.
2. Sometimes third row is occupied with random comment, data and etc.
3. The data which should be in the third row - goes to fourth row.

Is there a way to eliminate that "bad" order with the "comment-like" row? For example, is there maybe a way to set an example-like-patern-like preset? Or sort it somehow? I have tried to sort it and use some formulas, but I am not too good with excel...
 

Attachments

  • excel.png
    excel.png
    12.7 KB · Views: 6

Iscariot

Do You know clear rule ... what is "bad" row?
... after that, could You filter those rows ... and ... eliminate those?
That is the problem: I cant filter those because I do not know what kind of info would be there.
"what is a bad row" - as shown in the example, the "bad" row is a "comment-like" row which sometimes appears in the order of first, second and third row. I could go fish out "bad" rows manually - but that is what I am asking: is there a way to do it in a quick and efficient way.
 
As I tried to ask
The 1st step is:
Do You know clear rule
... what is "bad" row?
... ... what is a "comment-like" row?

After above, it's possible to do something 'safe'.

Maybe You can see those rows Yourself, but ... Excel should know clear rules for those.
I can notice, if previous row's fourth character is - and current row's length is less than ... 4 then that row could be a "bad" row,
Of course, there could be mark/remove some rows randomly ...

Have You tried to find out/solve - why those rows comes?
Aren't there now extra tasks - someway those rows comes and then You should 'take care' those rows.
Stop that procedure.
 
As I tried to ask
The 1st step is:
Do You know clear rule
... what is "bad" row?
... ... what is a "comment-like" row?

After above, it's possible to do something 'safe'.

Maybe You can see those rows Yourself, but ... Excel should know clear rules for those.
I can notice, if previous row's fourth character is - and current row's length is less than ... 4 then that row could be a "bad" row,
Of course, there could be mark/remove some rows randomly ...

Have You tried to find out/solve - why those rows comes?
Aren't there now extra tasks - someway those rows comes and then You should 'take care' those rows.
Stop that procedure.
Thing is, that the only acceptable pattern is, as shown in the example picture, are green 1, 2 and 3. That is the wanted, correct order. Yet, sometimes a "comment" like row appears and the order of the rows becomes 1, 2, BAD ROW (RANDOM DATA), 3. To filter bad data rows - its not an option, since the data there is just too random. I thought, that maybe there is a way to somehow define a rule for the correct order (the data in green 1, 2, 3 rows are always the same type) and just eliminate that random bad row according to the numbering of rows or something like that.

And yes, first, second and third row always has same type of data: first row has 4 numbers with "(K1)" or "(K2" - that is clear rule. Second row always has more digits. Third row with a good data would have random "1E54752" and/or "2E547852" and so on - or it would have BAD COMMENT LIKE DATA and "1E54785" would go to fourth row.

P.S. I have tried to order it, sort it, define rules - but as I said, random row with bad data is just too random. The only way was to manually delete them - that is why I was wondering if there is a better way.

Anyhow, HUUUGE THANK YOU for the reply.
 

Iscariot

Did You read my I can notice line?
About Your And yes ...
#1 You find row which starts with ???? (E)
#2 You check that next line is longer than ... 4
#3 You check that next line is longer than ... 4
>> if TRUE then continue with #1
>> if FALSE then You could find Your "bad" row >> mark row, that You can find it later.
repeat steps #1 to #3 with next rows as many times as it needs.

But, I would try to solve this as I wrote in my last three lines.
 
Back
Top