• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Find Duplicates - Drill Down to 2 Levels

fareedexcel

Member
Dear Experts,

I have a sheet where I need to find the duplicate entries. There are 6 companies each comprise grade levels from 1-13 with a total of 27 PDB descriptions (Column K).

I need to filter each grade and check each PDB descriptions (27 checks for each grades) whether the amount in column N,O,P are same.

If not same, then the particular cell to be highlighted. The attachment is enclosed with an example. Kindly help.
 

Attachments

  • Find Duplicates.xlsx
    90.5 KB · Views: 11
No pretty little formula I am afraid - not even an ugly great big formula!
I needed a copy of your data that I could sort by Pay Grade and by Codes. Then helper fields to determine the start of each block of data and its length (5 or 6). That allowed me to determine the mode of each measure within the data block and check that the value is equal to the mode.

Having fought my way through this way I probably could write a formula to filter and calculate the required mode for each cell but it would involve scanning the entire column for every condition. The limit would be a matter of computational efficiency. With 1000 rows one might expect to involve a few 100 times more calculation.
 

Attachments

  • Find Duplicates (PB).xlsx
    188.5 KB · Views: 18
What I set out to avoid was searching the entire table to evaluate each output cell. To achieve that I took a copy of your source data and sorted it by Pay Grade and Codes to bring the layout closer to your report format.

The first helper field I introduced was simply a record number. Sheet row numbers are irrelevant to my way of working whereas record numbers can be useful. The next step is to test whether a record has the same (Grade and) Code values as the line above. I sometimes Name the formula 'newBlock?' (say) or simply use it as it is nested within other formulas.
= IFERROR(Table1[@Codes]<>(prior Table1[Codes]), 1)

If it is TRUE I am starting a new block, if FALSE I simply copy down the record number of the first row of the block.
= IF( newBlock?, [@RecordNum], (prior [BlockStart]) )
where 'prior' is a relative reference to the record above the current cell
= Table1 data!1:1 etc.
when the active cell is in Row2.

I also need to know where the block ends so I MATCH the first index of a block against the list of BlockStarts beginning with the current record. That gives be the block length.
= IF(newBlock?,
MATCH([@BlockStart], [@BlockStart]:INDEX([BlockStart],ROWS(Table1))),
prior [BlockLength] )


The MODE formulas pick out the appropriate 5 or 6 values from a the part of a field that lies within the block row range using INDEX. This is only evaluated for the first record of a block and then copied down.
= IF( newBlock?,
MODE( INDEX([Amount or Rate], [@BlockStart] ) :
INDEX([Amount or Rate], [@BlockStart]+[@BlockLength]-1) ),
prior [Rate.mode] )


From there the conditional formats simple test each value to determine whether it is equal to the mode or not.
 
Dear Peter I gone through the explanation. It's in detail as the formula is in structural reference, I feel bit difficulty in understanding. But the way you presented is much appreciated.
 
@fareedexcel

Yes, I could be done. As you say, the logic would be much the same.

Sort the data range
Block-read the required columns to variant arrays
Iterate through the arrays
Each time you locate a new bloc, evaluate the previous block
Add results to array and reset small calculation array (5-6 values).
Next
Block-write results to helper range


To use conditional formatting without helper range would require more advanced programming techniques.

The "single click" is not the problem; the problem is writing the code!

On a different tack, time spent familiarising yourself with Tables and Structured References would be time well-spent. The new functionality such as Power Query is all build with tables in mind.
 
@fareedexcel

Yes, I could be done. As you say, the logic would be much the same.

Sort the data range
Block-read the required columns to variant arrays
Iterate through the arrays
Each time you locate a new bloc, evaluate the previous block
Add results to array and reset small calculation array (5-6 values).
Next
Block-write results to helper range


To use conditional formatting without helper range would require more advanced programming techniques.

The "single click" is not the problem; the problem is writing the code!

On a different tack, time spent familiarising yourself with Tables and Structured References would be time well-spent. The new functionality such as Power Query is all build with tables in mind.
Thanks Peter for your advise.
 
Back
Top