
Average score, grouped and sorted into lots of 10s from unsorted data [SOLVED]

V13RNE

New Member
Hi


I have two columns of data: column A is a "reference number", column B is the "score".


I need to compile (and pick out) batches of 10 "scores" with their "reference numbers" across the data set that have an average score of 12 (it can be slightly more, but not less than 12).

[pre]
Code:
A       B
Ref Nr	Score
4002	10
4056	8
4023	15
4016	14
4032	7
4100	9
4520	11
4612	23
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4109	5

etc
(several 100)
[/pre]

Would this require a loop, so that the first pass eliminates 10 records, the second pass another 10 records, and so on?


Many thanks for your help :)
 
Hi,


Can you explain the meaning of the following?



I need to compile (and pick out) batches




Suppose we extend your data a little bit, and have the following data:

[pre]
Code:
4002	10
4056	8
4023	15
4016	14
4032	7
4100	9
4520	11
4612	23
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4109	5
4612	23
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4108	23
4109	5
4612	23
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
[/pre]
The first such batch of 10 scores and IDs is:

[pre]
Code:
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4109	5
4612	23
[/pre]
What should be done with this data? Is it to be copied somewhere else?


What should happen next? Will the next calculation start from the next row, as shown below?

[pre]
Code:
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4108	23
4109	5
4612	23
4912	5
4523	6
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
[/pre]
In this case, the next such batch of 10 will be:

[pre]
Code:
4987	12
4321	13
4123	8
1561	7
4646	6
4544	13
4881	20
4008	11
4108	23
4108	23
[/pre]
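This consecutive-rows reading amounts to a sliding window over the score column. A minimal Python sketch of that idea (illustrative only, not the eventual macro; `first_batch` is my own name, and the list below is the extended score column from the data above):

```python
from statistics import mean

def first_batch(scores, size=10, target=12):
    """Return the 0-based start index of the first run of `size`
    consecutive scores whose average is at least `target`, or None."""
    for i in range(len(scores) - size + 1):
        if mean(scores[i:i + size]) >= target:
            return i
    return None

scores = [10, 8, 15, 14, 7, 9, 11, 23, 5, 6, 12, 13, 8, 7, 6,
          13, 20, 11, 23, 5, 23, 5, 6, 12, 13, 8, 7, 6, 13, 20,
          11, 23, 23, 5, 23, 5, 6, 12, 13, 8, 7, 6, 13, 20, 11, 23]

print(first_batch(scores))  # -> 11 (0-based), i.e. the window starting at ref 4321
```

This confirms the quoted batch: the first 11 windows all average below 12, and the window starting at ref 4321 averages 12.9.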
Narayan
 
Hi Narayan


Thanks for your reply. In answer to your questions:


1. Can you explain the meaning of the following?


I need to compile (and pick out) batches


The solution is first to select any 10 scores from the data that have an average of not less than 12, and move these results to a second sheet.


As these first 10 entries are now removed from the original data set, the second iteration does the same as before, and so on until, at the nth iteration, there is either:


1. no record left (they have all been batched together in groups of 10 with an average score of not less than 12);


2. a remainder, because the number of records in the original data set is not divisible by 10; or


3. a remainder, because the final iteration produces an average score < 12.


The second sheet (or subsequent sheets) will then contain the data set (ID and Score), split into batches of 10, where the average score is 12 or slightly more.
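This repeated-extraction idea can be sketched in Python (the eventual solution in this thread is an Excel macro; `extract_batches` and its greedy fill strategy are purely my own illustration of the logic, under the assumption that any 10 records may form a batch):

```python
def extract_batches(records, batch_size=10, target_avg=12):
    """Greedy sketch: repeatedly pull out batches of `batch_size`
    (ref, score) pairs whose average score is at least `target_avg`.

    Each batch is seeded with the highest remaining score, then filled
    from the low end for as long as the target total is still reachable,
    which keeps the batch average close to the target.
    Returns (batches, leftover_records).
    """
    remaining = sorted(records, key=lambda r: r[1], reverse=True)
    target_total = batch_size * target_avg
    batches = []
    while len(remaining) >= batch_size:
        batch = [remaining.pop(0)]              # highest score seeds the batch
        while len(batch) < batch_size:
            need = batch_size - len(batch)
            total = sum(s for _, s in batch)
            # best we could still add after taking the lowest score now
            best_rest = sum(s for _, s in remaining[:need - 1])
            if total + remaining[-1][1] + best_rest >= target_total:
                batch.append(remaining.pop())   # lowest score still works
            else:
                batch.append(remaining.pop(0))  # need a high score instead
        if sum(s for _, s in batch) >= target_total:
            batches.append(batch)
        else:                                   # no qualifying batch is left;
            remaining = sorted(batch + remaining,
                               key=lambda r: r[1], reverse=True)
            break                               # put the records back and stop
    return batches, remaining

# The 20 sample rows from the first post:
data = [(4002, 10), (4056, 8), (4023, 15), (4016, 14), (4032, 7),
        (4100, 9), (4520, 11), (4612, 23), (4912, 5), (4523, 6),
        (4987, 12), (4321, 13), (4123, 8), (1561, 7), (4646, 6),
        (4544, 13), (4881, 20), (4008, 11), (4108, 23), (4109, 5)]
batches, leftover = extract_batches(data)
```

On the sample data (total score 226) only one qualifying batch is possible, with the other 10 records left over as in outcome 3 above.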


I hope this explanation is clearer.


Many thanks


V13RNE
 
Hi ,


Sure; I think the only way this kind of iteration is possible is through a macro. I assume you are OK with this.


Narayan
 
Hi Narayan


Yes, I knew it was going to involve a macro. I have no problem with macros; I can get an idea of the logic from the code (I'm not a programmer, though!).


hope you can help?


Thanks once again


V13RNE
 
Hello,

I am not clear on what the OP is attempting to do. What Narayan outlined will identify batches of consecutive items that have an average score >= 12.


However, one could also interpret the requirement as identifying any 10 items to constitute a batch, provided an item is not re-used across batches, and the average of the batch is >= 12.

With this second interpretation, consider a batch whose scores are located at rows {1, 5, 9, 13, 16, 17, 19, 21, 22, 24} of the extended data.


The scores are {10,7,5,8,13,20,23,23,5,12}

The average is 12.6


i.e. in this second interpretation of the requirement, the scores are not necessarily in consecutive locations.
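That example can be checked directly; a quick Python sanity check (the row numbers are 1-based positions into the extended data quoted earlier):

```python
from statistics import mean

# Score column in row order (rows 1-24 of the extended data)
scores = [10, 8, 15, 14, 7, 9, 11, 23, 5, 6, 12, 13,
          8, 7, 6, 13, 20, 11, 23, 5, 23, 5, 6, 12]

rows = [1, 5, 9, 13, 16, 17, 19, 21, 22, 24]  # 1-based, non-consecutive
batch = [scores[r - 1] for r in rows]

print(batch)        # [10, 7, 5, 8, 13, 20, 23, 23, 5, 12]
print(mean(batch))  # 12.6
```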


Which is the correct interpretation of your requirement?


Also, what do you mean by an average slightly more than 12? Is 15 slightly more? How about 200? Remember that 200 is slightly more than 12 when compared to 20,000!


-Sajan.
 
On that last point, I'm slightly older than 21; slightly less rich than Mark Zuckerberg; slightly less masculine than Arnold Schwarzenegger; slightly less feminine than Greta Garbo; slightly better at Excel than a monkey; slightly less smelly than a monkey; and slightly mad.
 
Hi Sajan ,


I think you can interpret it either way, as long as you extract the selected 10 items to another sheet, and repeat the process as many times as required to extract all such batches of 10.


The point is: which is going to be easier? Doing this process for consecutive items, or for non-consecutive items?


The point about slightly is just that; when the average of 10 items exceeds the threshold, extract them. It does not matter whether the average is 12 or 15, since that will depend on the value of the 10th item.


Let us come up with an algorithm, which can always be modified according to what is actually meant.


Narayan
 
Thanks to you all


"Slightly" I would interpret as no more than 14, or 15 at most. The nearer to 12 the better for the initial iterations.


The selection - as in Sajan's post - for each batch of 10 is completely random.


V13RNE
 
V13RNE


Without seeing Narayan's response, it may be useful to define "slightly" as a percentage of the original number.


15 is slightly more than 12 if the allowable variance is 25%: 12 + 12 × 25/100 = 15.
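Bob's rule of thumb as a one-line Python helper (`slightly_more` is just an illustrative name):

```python
def slightly_more(base, allowed_pct):
    """Upper bound for 'slightly more than base': base + base * pct / 100."""
    return base + base * allowed_pct / 100

print(slightly_more(12, 25))  # 15.0
```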
 
Hi Bob ,


I am more worried that on all the other attributes, Jeff has given a comparison; on the last one he has not given any!


Narayan
 
Narayan


Wow! Thank you so much for doing this - I've messed around with it and it's just the job...


Great stuff!! Thanks once again


V13RNE
 
Hui


Narayan's solution (average >= 12) works out with a margin of 6.7%, i.e. an average of 12.8 across the test data. I'm happy with that!


Thanks to everyone for their comments and help.
 