Excel 2003 rolling average by qty and part number

cookmd

New Member
I'm running into an issue providing data to our operations guys.


I have a list of part numbers (some of which are duplicates) in column A.

In column B, I have a list of target "run times" for each part, in minutes per piece.

In column C, I have a list of actual run times for those same parts.


I need to calculate a rolling average for each part number (removing duplicates) based on the last 5 or 10 jobs. Since this isn't date-specific, and there are multiple occurrences of each item number, I don't know how to retrieve unique item records with a rolling average run time.


To complicate things further, this data is being pulled from a 'semi-live' ODBC connection to Access. I have the connection set to update every time the spreadsheet is opened. I'd prefer an Excel formula solution, but if it would be easier to use VBA or alter the SQL statement that queries Access, I'll go that route.


Any guidance/solution would be greatly appreciated as I have been working on this for a few days now. Thanks in advance!
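
The per-part rolling average being asked for can be sketched outside Excel. The following is a minimal Python illustration of the logic (not the eventual worksheet solution; the function name and sample data are invented for the example): for each part number, keep its run-time history in job order and average the most recent N observations.

```python
from collections import defaultdict

def rolling_averages(parts, run_times, window=5):
    """For each part number, average its most recent `window` run times.

    `parts` and `run_times` are parallel lists in job order (oldest job
    first), mirroring columns A and C of the worksheet.
    """
    history = defaultdict(list)
    for part, rt in zip(parts, run_times):
        history[part].append(rt)
    # Average only the last `window` observations for each part.
    return {part: sum(times[-window:]) / len(times[-window:])
            for part, times in history.items()}

parts = ["1234", "5678", "1234", "1234", "5678", "1234"]
times = [10.0, 20.0, 12.0, 11.0, 22.0, 13.0]
print(rolling_averages(parts, times, window=3))
# → {'1234': 12.0, '5678': 21.0}
```

Parts with fewer than `window` jobs are averaged over whatever history exists, which is usually what is wanted when a part is new.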
 
Hi ,


Can you clarify a few points?


1. You say the connection updates every time you open the spreadsheet; does this mean that as long as the spreadsheet is open, what you have is a static data set, a snapshot, which can be worked on independent of new data? How much new data is added each time the connection updates: is it just one more 'record', or a set of fresh 'records'?


2. Let us assume your existing set of part numbers (including duplicates) extends from A2 through A300; by using an additional column, say column D, you can arrive at a list of unique part numbers, extending from, say, D2 through D120. For the part number in D2, will the target run times always be the same, or are they going to be different?


3. Suppose for the part number in D2, the actual run times are in cells C7, C8, C12, C33, C34, C35, C51, ...; to calculate the first moving average, you will use C7, C8, C12, C33 and C34, assuming you are using 5 values to arrive at the moving average. For the next moving average, the values in C8, C12, C33, C34 and C35 will need to be used. Is this correct?


Narayan
 
Hi Narayan,


Thanks for the response!


1. As the spreadsheet opens, the data updates. From that point, it is a static set of data that can be worked on. As the connection pulls in new data each time the spreadsheet is opened, there may be any number of new rows added to the original data set.


2. Target run times are based on each part number, and are set. (No matter how many times part number 1234 comes up, the target run time will always be the same.) I do need an automated solution to narrow down the part numbers to a unique part number list though.


3. Yes, the example you gave is correct. The first moving average would include C7, C8, C12, C33, and C34. Likewise, the second moving average calculation would be C8, C12, C33, C34, and C35.


Ideally, we are attempting to answer 2 questions with this report.

1. Is the target run time accurate, or does it need to be adjusted?

2. Based on the rolling average of the run times for each item, is the process getting faster, slower, or remaining the same? (efficiency)


Thanks again!
 
Hi ,


Thanks for the clarification. To get a unique list of part numbers from a list which can include duplicates, you can use the following formula (from http://www.get-digital-help.com/2009/03/30/how-to-extract-a-unique-list-and-the-duplicates-in-excel-from-one-column/):


=INDEX(List,MATCH(0,COUNTIF($B$1:B1,List),0))


List is the range which contains duplicates, e.g. A1:A273.


Try this and see if you are able to get the unique part number list.
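
The array formula works by having COUNTIF check, for each candidate, whether it has already appeared in the output rows above, and INDEX/MATCH pull the first entry that has not. The same first-occurrence logic can be sketched in Python (the function name and sample values are invented for illustration):

```python
def unique_in_order(values):
    """Return the first occurrence of each value, preserving order --
    the same result the INDEX/MATCH/COUNTIF array formula builds
    row by row down the output column."""
    seen = set()
    unique = []
    for v in values:
        if v not in seen:  # COUNTIF($B$1:B1, v) = 0 in the formula
            seen.add(v)
            unique.append(v)
    return unique

print(unique_in_order(["A100", "B200", "A100", "C300", "B200"]))
# → ['A100', 'B200', 'C300']
```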


Generating the moving average is going to be more difficult; do you happen to have any data we can work with? If so, please upload a sample worksheet with, say, 50 rows of data, and we can work it out.


Narayan
 
I got the array part sorted out and added it to the spreadsheet. Now the unique values show in column D. Please let me know if I need to add/remove anything else for the next part. Thanks again!


Test Data:

http://www.mediafire.com/download.php?ml6dtq2u838lug3
 
Hi, cookmd!

Would it be possible for you to copy the SQL statement used to get the data from the Access database too? And the other fields available (a date/time, an ordering sequence), just to evaluate sorting your data and being able to maintain chronology. Thanks.

Regards!
 
That's a little more complicated... One SQL statement pulls half the data, while the second SQL statement pulls the other half. From there, I'm using VLOOKUP between the two worksheets to join the data. Because of the way the database was created (long before I started with the company), I wasn't able to find any other way to avoid unnecessary duplication.


In reality, one query pulls item number, scheduled run time, actual run time, and other data I need; while the other query pulls work order information and close date.
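
The VLOOKUP step between the two query worksheets amounts to a keyed lookup. A minimal Python sketch of that join follows; the column names are taken from the queries below, but the assumption that `WORKORDER_BASE_ID` on the OPERATION side corresponds to `BASE_ID` on the WORK_ORDER side is mine, and the function name is invented:

```python
def vlookup_join(operations, work_orders):
    """Append the work-order part number to each operation row --
    the same effect as the VLOOKUP between the two query worksheets.
    Both arguments are lists of dicts (one dict per row)."""
    # Index work orders by BASE_ID (assumed join key).
    index = {wo["BASE_ID"]: wo for wo in work_orders}
    joined = []
    for op in operations:
        wo = index.get(op["WORKORDER_BASE_ID"])
        row = dict(op)
        # No match behaves like VLOOKUP's #N/A, represented here as None.
        row["PART_ID"] = wo["PART_ID"] if wo else None
        joined.append(row)
    return joined

ops = [{"WORKORDER_BASE_ID": "W001", "ACT_RUN_HRS": 4.5}]
wos = [{"BASE_ID": "W001", "PART_ID": "1234"}]
print(vlookup_join(ops, wos))
# → [{'WORKORDER_BASE_ID': 'W001', 'ACT_RUN_HRS': 4.5, 'PART_ID': '1234'}]
```

If the two tables really do share this key, a single SQL query with a JOIN would avoid the two-worksheet VLOOKUP step entirely, but that depends on the database schema.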

SQL1:

[pre]
SELECT OPERATION.WORKORDER_TYPE, OPERATION.WORKORDER_BASE_ID, OPERATION.SETUP_HRS, OPERATION.RUN, OPERATION.RUN_TYPE, OPERATION.ACT_SETUP_HRS, OPERATION.ACT_RUN_HRS, OPERATION.CLOSE_DATE
FROM VMFG.dbo.OPERATION OPERATION
WHERE (OPERATION.WORKORDER_TYPE='W') AND (OPERATION.CLOSE_DATE>={ts '2011-01-01 00:00:00'})
ORDER BY OPERATION.CLOSE_DATE
[/pre]

SQL2:

[pre]
SELECT WORK_ORDER.TYPE, WORK_ORDER.BASE_ID, WORK_ORDER.PART_ID, WORK_ORDER.RECEIVED_QTY, WORK_ORDER.CLOSE_DATE
FROM VMFG.dbo.WORK_ORDER WORK_ORDER
WHERE (WORK_ORDER.CLOSE_DATE Is Null) AND (WORK_ORDER.TYPE='W') AND (WORK_ORDER.RECEIVED_QTY>$.0000) OR (WORK_ORDER.CLOSE_DATE>={ts '2011-01-01 00:00:00'}) AND (WORK_ORDER.TYPE='W') AND (WORK_ORDER.RECEIVED_QTY>$.0000)
ORDER BY WORK_ORDER.PART_ID
[/pre]
 
Hi, cookmd!

Is there any field in the Access DB tables OPERATION or WORK_ORDER like a sequence number, a date&time field?

The idea is to sort the Excel file to a proper order, get the results and then re-sort it to its original order.

Regards!
 
OPERATION.CLOSE_DATE is a date/time field, as is WORK_ORDER.CLOSE_DATE. I've actually removed the "is Null" filters from both queries too. As far as sorting by proper order goes, we have 20-30 machines running at any time, so there are duplicate CLOSE_DATE entries. Not all machines log hh:mm either, which is another issue.


Thanks!
 
Hi ,


Can you check out the following link ?


https://skydrive.live.com/?lc=16393#!/view.aspx?cid=754467BA13646A3F&resid=754467BA13646A3F%21142


Please note that simply clicking on the link will not work; please copy the entire address and paste it into your browser.


The actual run times are obtained by selecting a large enough range of columns and entering the formula as an array formula; the extra cells will contain #N/A, but the rest of the data appears to be correct. Hopefully, starting with this, you can derive the moving averages.


Narayan
 
Thanks Narayan,


I'm not familiar with the TRANSPOSE function. Would you mind explaining to me how that formula works? I see that each column only has one value posted; is that the rolling average for the unique part list in column D? I'm sorry to ask so many questions, but I'd like to understand how and why those formulas work so I can learn the function, not just key in formulas.


Thank you so much!
 
Hi ,


Not at all. I welcome questions !


What you can do to understand the working of any formula is to place the cursor in the cell which contains the formula, and click on Evaluate Formula; this will allow you to step through the working of the formula, one step at a time.


The formulae from cell F2 through AR2, F3 through AR3, and so on, are all array formulae; whole arrays of data are evaluated at a time, instead of normal formulae, which evaluate single cells. A very good introduction to array formulae is:


http://www.cpearson.com/excel/ArrayFormulas.aspx


The formula used in the worksheet is :


($A$2:$A$40=D2)*($C$2:$C$40)


This considers the elements of the array C2:C40 one at a time, multiplying each by either 0 or 1: 0 where the corresponding element in the array A2:A40 is not equal to D2, and 1 where it is equal to D2. The result is another array of 39 elements (40 - 2 + 1), which would have occupied 39 rows; the TRANSPOSE function ensures that 39 columns are occupied instead.
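
The multiply-by-0-or-1 trick is element-wise masking. The same step, sketched in Python for illustration (the function name and sample data are invented):

```python
def mask_run_times(part_ids, run_times, part):
    """Mimic the array formula ($A$2:$A$40=D2)*($C$2:$C$40):
    each run time is multiplied by 1 where the part number matches
    the target part, and by 0 where it does not."""
    return [(pid == part) * rt for pid, rt in zip(part_ids, run_times)]

part_ids = ["1234", "5678", "1234", "9999"]
run_times = [10.0, 20.0, 12.0, 30.0]
masked = mask_run_times(part_ids, run_times, "1234")
print(masked)                       # → [10.0, 0.0, 12.0, 0.0]
# Dropping the zeros leaves the part's actual run-time history:
print([rt for rt in masked if rt])  # → [10.0, 12.0]
```

One caveat of the masking approach, in Python and in the worksheet alike: a genuine run time of exactly 0 is indistinguishable from a non-match.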


Narayan
 
Hi Narayan,


Thanks again for all your help. I think I need to give a total overview to make sure I'm going about this the best, most efficient way. I value any input you may want to offer.


Overview:

I'm using 2 ODBC connections that both update when the workbook is opened.

Each ODBC data pull has its own tab/worksheet.

I use VLOOKUP to append data from one ODBC connection to another.

I have attempted to account for future data updates by extending the VLOOKUPs beyond the actual data.

Ideally, I would like to collect the rolling average of the last 10 runs for each part number.

We have around 3,000 part numbers that need to be tracked.

I have limits on my SQL statements to only pull data from 2011 to present. Hopefully this will keep Excel from reaching its 65,536-row limit for a while.


Desired Delivery Method:

Stakeholders want this data delivered in pivot chart form. I need to calculate a rolling average, and pivot the results so we can do some trend analysis.

With this data, we should be able to more accurately calculate expected run times and improve our scheduling.

Since this data needs to be pivoted, and the data set is rather large, is the above method still the best option in your opinion?


Thanks for your feedback!

-Cookmd
 
Hi ,


I think Excel should be able to handle the data; however, only you can actually verify this, since you are working with the actual data!


When you say that the data is to be delivered in pivot chart form, the large data size should not be the reason for selecting any particular method. In my opinion, the whole purpose of data analysis is decision making; so if your scheduling is on a daily basis, you need to analyse a small set of data which is fresh and immediately relevant; if your scheduling is a monthly activity, you need to analyse a larger set of data; and if your purpose is to look at long-term trends, then a 5-year window may be more representative.


If you can cover just one full month in each worksheet tab, then you can probably store data for quite a few years. In addition, if your part numbers are standardised, is it possible to create a database of part numbers, so that any data can be looked up using this database as a master? This may need to be only a one-time activity; you could even look at adding a few macros to create and maintain/update this database.


Narayan
 
Thanks for your analysis Narayan, I appreciate it.


I did run into an issue with the array formula you provided to uniquely identify part numbers. For some reason, I'm receiving duplicates, and I don't know why. Do you have any pointers or need any additional information? I can provide a modified screencap for you to look at. (Trying not to share proprietary data...)


Thanks,

cookmd
 
Hi ,


Can you post a sample worksheet with only the part numbers data, so that I can check why you are getting duplicates even after using the array formula?


Narayan
 
Hi Narayan,


http://www.mediafire.com/download.php?816a2022qv2exn7


The above link contains a partial list. I was using a VLOOKUP function to pull the part numbers to a new worksheet. Then I was using the formula you provided earlier to try and make a list of unique part numbers. Could it be that referencing a cell that contains a formula is causing the issue? I don't know any other way to automate the process, if that is the issue.


Thanks again!

cookmd
 
Hi ,


There are no duplicates within your partial list; using the formula resulted in 297 unique part numbers. Then I copied the original list, pasted it into a different column as values, and removed duplicates using Excel's built-in facility; this gave the same 297 unique entries.


Narayan
 