• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Calculating a moving weighted average during Resource Scheduling

Hi ,

There are actually two key concepts which have been used here ; John Jairo's brilliance is putting them to work in this unfamiliar setting.

1. Using two running totals to firstly find out where to stop , and then to get the exact changeover value.

For example , it is easy to see that we need to calculate :

22,559 x 55.4% +

12,737 x 62.5%

However , the question is how do we calculate the third value , since part way through this third value , we are exceeding the target of 40,000.

This is where the idea of 2 running totals helps.

Consider the first running total starting with the first value ; we will have :

22,559 ; 35,296 ; 47,633

Now consider the second running total which starts with the blank cell above the data in column C ; we will have :

0 ; 22,559 ; 35,296

Now , we subtract these values from the target of 40,000 ; the first set gives us :

17,441 ; 4,704 ; -7,633

while the second gives us :

40,000 ; 17,441 ; 4,704

As long as the first set is positive , we take the original data values of :

22,559 ; 12,737

When it turns negative , we take from the second set , which will give us :

4,704

Now , the issue comes to the second key concept ; how do we ignore the remaining values ? Since 4,704 was the changeover point , all the remaining values in the second set will be negative.

Ignoring these negative values forms the second key concept viz. using the LN and EXP functions , which essentially return positive values as they were originally , and convert all negative values to error values.

These error values are then trapped by the IFERROR.

Narayan
 
Hi again, to all!

Good catch Hui!
Just for refine a little bit... I really like when all the solutions have a single formula for all the range. So, I make the changes:

[F6] : =MIN(40000,$C32-SUM($E6:E6))

[F38] (CSE) : =IFERROR((SUM(IFERROR(EXP(LN(IF(SUM($E6:F6)-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1))<_r,SUM($E6:F6)-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1)),_r))),)*$D10:$D31)-SUM(N(+$E6:E6)*N(+$E38:E38)))/F$6,)

Both formulas drag it to the right. Blessings!
 
=SUM(IFERROR(EXP(LN(IF(F$6-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1))<_r,F$6-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1)),_r))),)*$D10:$D31)/F$6

=IFERROR((SUM(IFERROR(EXP(LN(IF(SUM($E6:F6)-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1))<_r,SUM($E6:F6)-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1)),_r))),)*$D10:$D31)-SUM(N(+$E6:E6)*N(+$E38:E38)))/F$6,)

Hi John,

You are great :awesome:
Both formulas are astonishing and beyond my understanding.

Regards,
 

I call this kind "headache formula", even to just try to understand it !

The reason why I have the - very - bad habit to create
an around five minutes VBA code …
 
Hi again, to all!

Another approach, a bit shorter:
[F38] (CSE) : =IFERROR((SUM(IFERROR(EXP(LN(MMULT(MMULT(CHOOSE({1,2},SUM($E6:F6)-MMULT(N(ROW(_r1)>=TRANSPOSE(ROW(_r1))),N(+_r1)),_r),{1,1;1,-1}/2)^{1,2}^{1,0.5},{1;-1}))),)*$D10:$D31)-SUM(N(+$E6:E6)*N(+$E38:E38)))/F$6,)

Thanks for all your kind words.
By the way... Excellent explanation for @NARAYANK991. Blessings!
 
Sorry if I have double-posted this but I cannot find it above. I thought I posted it at 9:00 before going out for the day but it seems not. It still needs some tidying given that the workbook was written in a bit of a rush sometime between 8:00 and 9:00. Period 12 is wrong; this morning it was a #REF! error I am not sure whether that is better or worse!
 

Attachments

  • Scheduling Example.xlsx
    17.3 KB · Views: 7
I have worked through the problem again using more basic functions (no MMULT, just SUMIFS). There are no clever formulas, just a sequence of multi-cell array formulas that can be packaged within the corresponding Named formulas. The file with helper cells is the workbook in its original form, as developed, and does go some way to explaining the process.

I am sure no one will wish to adopt this approach but I hope it will be of some interest. The solution answers @Hui's challenge but without actually providing the formula. It is implicit within the solution if one cares to substitute out all of the names.
 

Attachments

  • Scheduling Example with Named formulas.xlsx
    16.8 KB · Views: 9
  • Scheduling Example with helper cells.xlsx
    18.3 KB · Views: 7
Peter

That is interesting

It appears similar to John's first approach using named Formula instead of formula
 
@Hui
It took me a long time to realise that named formulas were just that, formulas and not variables. Then Charles Williams's observations that named formulas are not evaluated unless they are used within a cell suddenly made sense.
Today's effort would have gone much faster if I hadn't changed strategies mid-stream. I was tempted to use matrix accumulation and differencing operators but then decided that SUMIFS would be more efficient despite doubling the calculation to work with start and end of period cases.
Peter
 

Attachments

  • temp - matrices.png
    temp - matrices.png
    21.8 KB · Views: 16
Late seeing the tag in this post. Interesting challenge and great solutions by JJV. One minor tweak might be to use EFFECT(.,1) in place of EXP(LN(.)) to minimise rounding issues.

Fwiw, here's one more non-cse option referring only to constant cell values filled across (F38):

=SUMPRODUCT(MMULT(ABS(F6-MMULT(MMULT(($C10:$C31+MMULT(MMULT(SUMIF($F8:$Q8,{"<=","<"}&F8,$F6:$Q6)-SUMIF($B10:$B31,{"<","<="}&$B10:$B31,$C10:$C31),{1,1;-1,1}/2)^{1,2}^{1,0.5},{1,-1;-1,1}))^{1,2}^{1,0.5},{1;-1}/2)^{1,2}^{1,0.5},{1,-1;1,-1}/2)),{-1;1}/2),$D$10:$D$31)/F$6

It follows JJV's technique of making pairwise MIN/MAX replacements using MMULT (+ headbanging!) in:

=MAX(0,MIN(F$6,$C10,(SUMIF($F$8:$Q$8,{"<=","<"}&F$8,$F$6:$Q$6)-SUMIF($B$10:$B$31,{"<","<="}&$B10,$C$10:$C$31))*{1,-1}))
 
@Hui
It took me a long time to realise that named formulas were just that, formulas and not variables. Then Charles Williams's observations that named formulas are not evaluated unless they are used within a cell suddenly made sense.
Today's effort would have gone much faster if I hadn't changed strategies mid-stream. I was tempted to use matrix accumulation and differencing operators but then decided that SUMIFS would be more efficient despite doubling the calculation to work with start and end of period cases.
Peter

Peter

Named Formula are also Array Formula
Which can be handy from time to time
 
As requested, here are some rough notes on the previous formula construction, maybe Hui can incorporate in a format suited to FF.

In another post, Narayan has spoken of compiling formula patterns for solving problems. The key patterns needed here are arrays of running sums together with combinations of constraints.

1. For the constraints, results need to be positive, less than row and column limits as well as less than cumulative row and column limits.

This led to the second formula from the previous post entered into F10 and filled over the range. One can see the row and column symmetry in the formula, I'll leave details to readers or others with better explanatory skills.

2. Normally, I would be more than happy to have got this far! The real challenge is in making suitable modifications to support arrays of running sums and constraints by making the changes:

$B10 -> $B10:$B31
$C10 -> $C10:$C31

For this purpose, we can break down the formula into pairs of inputs:

=MAX(0,MIN(r,MIN(c,MIN(x,y))))

Here, r and c are row and column quantities and x and y are the overlapping totals in the row and column quantities.

3. Consider first MAX(0,z) where z=MIN(...) is a long formula.

The trouble with the obvious IF alternative is repeating this long formula twice, i.e. IF(z>0,z,0).

A second way, used above by John, is to find an expression that errors for negative values like EXP(LN(z)) or z^0.5^2 or EFFECT(z,1), and then pass to IFERROR.

A third slightly less obvious alternative which ties in with methods in the next section is to use ABS(z) = z^2^0.5 as in:

=(z+ABS(z))/2
=MMULT(z^{1,2}^{1,0.5},{1;1}/2)

4. For MIN(x,y) the obvious replacement =IF(x<y, x, y) again results in formula duplication. Instead we can use:

=(x+y-ABS(x-y))/2
=MMULT((x+{1,-1}*y)^{1,2}^{1,0.5},{1;-1}/2)

5. This method can also be applied to MIN(c,.) and MIN(r,.). Since r and c are both positive the formula can be rewritten:

=MIN(r,MIN(c,MAX(0,MIN(x,y))))

and then the exponents ^{1,2}^{1,0.5} can be replaced by ABS for the r and c parts.

At each step of this formula conversion process, results were checked by filling over the whole range. Last working versions were stored in a text file which proved far too useful!
 
Just in case anyone is interested, I thought I should document my approach for comparison purposes. This time I have used a verbose notation which may assist readability.

The starting point is to name all the input data:
On the supply side -

An index column identifying each parcel: parcel.index
The ore in tonnes for each parcel: Ore.supply\parcel
The percentage iron within the parcel: Fe.percent
The iron content in tonnes for each parcel: Fe\parcel
On the demand side -
An index column identifying each period: period.index
The ore in tonnes required over each period: Ore.demand\period

The calculation is based upon matching the cumulative demand to the cumulative supply to identify the parcel used to fulfil each month's requirement.

The following is calculated -
On the supply side -

The cumulative weight of ore available prior to the current parcel: Ore.cumulative\parcel
= SUMIFS( Ore.supply\parcel, parcel.index, "<" & parcel.index )
The iron content available prior to the current parcel: Fe.cumulative\parcel
= SUMIFS( Fe\parcel, parcel.index, "<" & parcel.index )
Because the total ore available might place a limit on the demand that can be met the total is calculated: Ore.totalSupply
= SUM(Ore.supply\parcel)
On the demand side -
The cumulative total ore required at the start and end of each period:
Ore.cumulativeDemand\period
= SUMIFS( Ore.demand\period, period.index, {"<","<="} & period.index )
[this borrows from Lori]

Because supply may limit the extent to which the requirement is met the tonnage actually delivered within each period is calculated as a worksheet formula:
= MMULT( IF( Ore.cumulativeDemand\period<Ore.totalSupply, Ore.cumulativeDemand\period, Ore.totalSupply ), {-1;1} )
The range is named Ore.supplied\Period and is used in subsequent calculations, starting with a variation Ore.cumulativeSupply\period on the previous running total
= SUMIFS( Ore.supplied\Period, period.index, {"<","<="} & period.index )

The ore and iron content delivered for each period:
At the start and end of any period the total weight of ore delivered to date using entire parcels can be looked up and the balance to be taken from the current parcel Ore.balance\period is calculated by:
= Ore.cumulativeSupply\period -
LOOKUP( Ore.cumulativeSupply\period, Ore.cumulative\parcel )

The cumulative weight of iron Fe.cumulativeSupply\period supplied for the beginning and end of each period is given in terms of the total for the last parcel and the weight and iron content of the balance:

= LOOKUP( Ore.cumulativeSupply\period, Ore.cumulative\parcel, Fe.cumulative\parcel ) +
Ore.balance\period * LOOKUP( Ore.cumulativeSupply\period, Ore.cumulative\parcel, Fe.percent ) / 100

The iron content supplied over each period is then given by a final worksheet array formula:
= 100 * MMULT( Fe.cumulativeSupply\period, {-1;1} ) / Ore.supplied\Period

What I have used this challenge for is to demonstrate (to myself) that it is possible to develop a solution to a complex problem entirely within terms of the domain of application and with no reference to cells or their relative position. That is to deny the assumptions of end user computing?
 

Attachments

  • Scheduling Example with Named formulas - Alt.xlsx
    22 KB · Views: 8
@Lori

...there was a simpler approach...

Someday we might need a debate as to what constitutes 'simplicity'. For me 'simplicity' lies in capturing the structure that underpins the problem domain and carries meaning step by step towards a solution.

You and others here seem to specialise in producing solutions within a single formula, like a little jewel of amazing ingenuity that dazzles and bewitches.

For F1F9 and the FAST standard 'simplicity' lies in pedantry and the use of naïve techniques on the assumption that the user may wish to check the calculation with a pocket calculator.

Can we all be right?
 
Hi ,

I think Lori's usage of the word simpler is only in the context of a formula which does not use helper cells.

If you remove this constraint , nothing can be much simpler than Hui's original approach which used helper cells.

Narayan
 
Simplicity can take many forms

My original example is simple, can manually be followed and audited.

But I was recently confronted with a situation where the client wanted a solution which would have required between 50 and 100 blocks of that solution where each block could have extended to up to 1000 rows and 240 columns. Clearly unworkable.

So yes, it would have been simple to setup, simple to follow and audit, it is very visual and explainable, but it would not have been a usefull tool

Simplifying the problem to 50-100 data areas, with 50 solution rows, and a half dozen summary rows, gave a workable solution, despite the issues of unautitable cell formulas.

In terms of auditability I use my original simple mockup as a sample where they can see both solutions side by side.
 
Hi Hui ,

But the solution to situations which cannot use helper cells , is not single formulae which can be understood by barely 1% of Excellers.

The correct solution would be to use VBA , which can have code that is well developed , properly commented , adequately tested and which is scalable in future.

There have been enough instances in this forum itself , where brilliant formulae are presented , and which are found to have flaws later ; in some cases , these flaws are pointed out by others , and in some cases by the creator of the formula.

Clearly such a situation cannot be allowed to develop in a working process , where such flaws can prove to be costly.

Narayan
 
Simpler... in the sense of employing only five functions, Peter's approach is simpler in the sense of readability. And most would agree the simplest method to follow would be to use Hui's original set up with a data table attached - which is in effect a solution with helper cells though it may be quite slow.

A few reasons for adopting formula based solutions are:
  • speed - sometimes they are faster than udfs (sometimes they are slower too!)
  • portability - works with eg google sheets and mobile versions (previous formula was constructed on phone while on a train)
  • creativity - leads one to think "outside the box" and can help develop skills in problem solving
In most other respects udfs are the way to go (robustness, flexibility, scalability, etc.) And for sure if you were going to use a an approach like above in a real world scenario it would need to be tested inside out.
 
Back
Top