• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Counting unique values based on several conditions

droopy

New Member
I need to be able to count how many distinct ProjectID are in my table with the following criteria:

PorjectID should be counted if it appears

[list type=decimal]
[*]in the given quarter,
[*]for a given region,
but not if it was already existing in previous quarters.

The following array formula satisfies conditions 1 and 2, but not condition 3.

(Region, Qtr and ProjectID are named ranges)


Code:
SUM(IF(FREQUENCY(IF(Region=D$10,IF(Qtr=$A$11,IF(LEN(ProjectID)>0,MATCH(ProjectID,ProjectID,0),""),""),""), IF(Region=D$10,IF(Qtr=$A$11,IF(LEN(ProjectID)>0,MATCH(ProjectID,ProjectID,0),""),""),""))>0,1))

Any hints on how I can solve this one is welcome![/list type=decimal]
 
Welcome to the forums!


Any chance we could use a helper column? Assuming your data is sorted by time (previous quarters are at top of sheet), Region is column AA, Qtr is column AB, ProjectID is column AC, helper column formula would be:

=(AA2=D$10)*(AB2=$A$11)*(COUNTIF(AC$2:AC2,AC2)=1)


Then, to get your unique count, it's simply:

=SUM(HelperColumn)


There's probably a way to combine everything into one formula, but this method is certainly simpler (and easy to understand later).
 
Thanks for your answer Luke.

Data I use are raw data, and I'd rather not put an ordering constraint on that source, given that I perform many other calculations.

What if my other calculations need as well to rely on a specific (and distinct) ordering?

Moreover, I would need several additional columns, as I need to graph these counts for each quarter and each region simultaneously.

That's why I need something very flexible (though probably rather cryptic as well).

Anyway, these internal calculation sheets will be hidden and used as a basis to produce the overall graphs, and I plan to heavily comment these sheets for future maintenance (whether I do it myself, or it is handed over to someone else...)


Any other suggestion?
 
Can you elaborate on what the Qtr field looks like? Is it text, dates, numbers, etc? Need to know this so we can build the logic of "previous quarters".


Also, would a UDF be acceptable (function built in VB)?
 
Here are some additional details:

I have one macro-enabled workbook with raw data (in which I have some UDF running to populate calculated fields for easier data crunching).


I have a second workbook in which I have some slicers for the users to filter their data set, and present the resulting graphs.

This is also where I do the data crunching, in hidden sheets.


In raw data workbook, the formats are as follows:

  • Qtr format: text - yyyy-Q# (e.g. 2011-Q2), where yyyy is the fiscal year (from Nov yyyy - 1 to Oct yyyy)
  • Region format: text (e.g. AMS, EMEA etc.)
    ProjectID: textual.
    There's also Submission date: date

Actually, Qtr should reflect the date of submission (so if date is 28-oct-2011, Qtr should be 2011-Q4). But I have some cases where, in raw data, 28-oct-2011 will point on 2012-Q1 instead.

So basically, I need to count the # of distinct ProjectID that were submitted in each quarter (i.e. which submission dates are within corresponding fiscal quarter boundaries), and which were not already submitted in previous quarters.


I haven't yet explored the VB option, as I was trying to do it with excel formulae only, but if this is not an option I can live with UDF (my excel sheet is already macro-enabled).

Anyway, this will make formulas easier to read, and programming easier to comment...
 
Lengthy, but I think this does it:

=SUM(IF(FREQUENCY(MATCH(IF((COUNTIFS(ProjectID,ProjectID,Qtr,"<"&A11)=0)*(Region=D10)*(Qtr=A11),ProjectID,"XX"),IF((COUNTIFS(ProjectID,ProjectID,Qtr,"<"&A11)=0)*(Region=D10)*(Qtr=A11),ProjectID,"XX"),0),MATCH(IF((COUNTIFS(ProjectID,ProjectID,Qtr,"<"&A11)=0)*(Region=D10)*(Qtr=A11),ProjectID,"XX"),IF((COUNTIFS(ProjectID,ProjectID,Qtr,"<"&A11)=0)*(Region=D10)*(Qtr=A11),ProjectID,"XX"),0))>0,1))-1


Breaking it down:

Function to sort out what we care about based on first three criteria:

=IF((COUNTIFS(ProjectID,ProjectID,Qtr,"<"&A11)=0)*(Region=D10)*(Qtr=A11),ProjectID,"XX")


Function to get unique count (from XL help file)

=SUM(IF(FREQUENCY(MATCH(B2:B10,B2:B10,0),MATCH(B2:B10,B2:B10,0))>0,1))


To get our formula, replace every reference of B2:B10 in 2nd formula with entire first formula. As noted in help file, this is an array formula, needs to be confirmed using Ctrl+Shift+Enter. Note also that formula will not work pre-2007. Might be able to replace the COUNTIFS function with a SUMPRODUCT if really needed...
 
Thank you Luke. It works fine, except for my last quarter, where it seems many projects are not counted.


I'm trying to see which are selected (or which ones are left aside).


But all in all it satisfies my needs.


Thanks again.

Catherine
 
Hmm, curious as to what's going wrong in the last qty, but glad to hear that for the most part it's working. Let us know if you discover a more complete solution! =)
 
Yes I found it strange as well. I saw no reason why this should fail for that specific quarter.

Then digging into the results I expected vs. the results I got, I finally pinpointed it.

If you remember, I cannot rely on my raw Qtr data, because sometimes the Qtr data does not reflect the actual (fiscal) quarter of the submission date. So I added a column in which I calculate the actual quarter.

If month is nov or dec, the fiscal year should be calendar year +1. This is where I had the bug.

In nov and dec, I increment the value of the year and use Str before concatenation with the quarter.

I had just missed the fact that Str systematically inserts a space on the left of the digits for potential sign.

That's why it only happened in Q1. All the entries of Q1 that were either in nov or in dec were reported as having their Qtr value < "2012-Q1" (thus were not counted) because of this tiny (and poorly visible) space.

I trimmed the string, and it's fine now!

Wow, that stupid little error took me a while to figure out!
 
Phew, glad a solution was found.

I too have spent many hours debugging stuff because of a single character that is wrong...

=(
 
Back
Top