• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Threshold line for scatter graph

kCd

New Member
HI!

I'm trying to put straight line across a scatter graph with a percentage of the data points (e.g., 33 out of 100 [33%]) above the line the remaining data points (67) below the line and identify the corresponding y-intercept value.


Scenerio: file download times are recorded once daily over the course of 100 days and graphed. The individual download times vary between 2 seconds and 120 seconds. I would like to have Excel calculate and plot a line that represents a threshold value where 67% of the download times occur. This is probably immediately below the 33rd highest data point, but does not necessarily have any relation to the data value collected on day 33. The y-coordinate for the line should be shown. 33 of the data points would be above the line, and 67 would be below the line.


I originally tried calculating a value from the data points, but realized that this line is related to the total number of data points and their respective order from the highest to lowest value. My brain hurts.


Thank you in advance for your help!
 
Hi ,


This can be done with named ranges as described below. First , let us assume that your data is from F5:F290 ; I have deliberately taken a larger data set , so that the formula is understood.


1. If we use the formula :


=LARGE(Sheet1!$F$5:$F$290,0.67*ROWS(Sheet1!$F$5:$F$290)


we get a value , say X , above which 67 % of the data set lies. Since the number of values in the data set is 286 , 67 % of 286 is 192. So 192 values will lie above the value X.


2. Create a named range Percentile_67 , and use the following formula in the refers to input box :


=ISNUMBER(Sheet1!$F$5:$F$290)*LARGE(Sheet1!$F$5:$F$290,0.67*ROWS(Sheet1!$F$5:$F$290))


What this does is create an array of values , all of them equal to the value X obtained in (1) above.


Plot this data as a second series in your X-Y scatter graph.


Narayan
 
Thank you for your response. I think that this is heading toward a working solution, but a little clarification is requested. When I used the first formula (and changed the references to the actual sheet name and data rows), what is returned is the value contained in the cell 67% from the cell with the greatest value. At this value 67% of the data points would be above a plotted line rather than below. If I use your formula and put in 0.33 rather than 0.67, I think that this may work.


My real question is about your second comment. I'm not really clear on the function of the second formula. The first formula provides the correct value on the y-axis (33% of the values above and 67% below), but I need to draw a horizontal line at this value over the entire period represent by the data. I'm not sure how to plot an array created by the second formula.
 
Hi ,


You are right ; I made a mistake in taking 67 ; taking 33 instead will give you the right value.


The line at the correct value is drawn by the second formula. Of course , even here , you would need to replace the 0.67 by 0.33


How this second formula works is :


1. Assume that your data range is from F5:F290 , which is 286 values.


2. We know these are all numbers , but using the =ISNUMBER(Sheet1!$F$5:$F$290) function generates an array of TRUE values ( 286 in number ) , which Excel converts to 1's. A FALSE value will be converted to 0.


3. 0.67*ROWS(Sheet1!$F$5:$F$290) gives us a number which is 67 % of the number of rows in the range F5:F290 i.e. 67 % of 286 = 192.


4. Applying this number 192 to the LARGE(Sheet1!$F$5:$F$290,0.67*ROWS(Sheet1!$F$5:$F$290)) function gives us that value in your data range above which 67 % of the values lie. The LARGE function takes two parameters ; the first one is the range of values ; the second one is a number from 1 upwards ; 1 will give the largest value in a range , 2 will give the second largest value in a range and so on.


5. Multiplying these two results will give us an array of values , all of them equal to the value obtained in (4) above.


Narayan
 
Back
Top