
DAX offers a powerful way to analyze “new” vs. “returning” customers. In this article, learn simple DAX measure patterns to count the number of new customers and the number of returning customers in your data.
What is a Returning Customer?

A returning customer is someone who comes back to our business and makes another transaction. For example, in the above illustration, CUST-001 and CUST-004 are repeat or returning customers.
What is a NEW customer?
A new customer is someone who is making their first transaction with us. In the above example data, all other customers (except CUST-001 & CUST-004) are technically NEW CUSTOMERS.
Note: A new customer today might be a returning customer in the future.
DAX measures for calculating new vs. returning customer counts
All the measures in this example are based on a simple “Data” table with 4 columns – Customer ID, Date, Order Qty and Product Name.

Customer Count Measure
Customer Count = DISTINCTCOUNT(data[Customer ID])
This is a simple distinct count measure that tells us how many distinct customers transacted with us. When used in the context of a date or product, it returns the number of customers for each.
Returning Customer Count Measure
Returning Customer Count =
var custs = DISTINCT(data[Customer ID])
var curr_date = LASTDATE(data[Date])
return
SUMX(custs, CALCULATE([Customer Count], data[Date] < curr_date))
This measure tells us how many returning customers there are in the context of the current “time-period”.
How does this returning customer count work?
Imagine the below output and let’s focus on the second row.

- For the date context of 6-January
- We create custs variable which gives us all the 92 customer IDs.
- The curr_date variable tells us the latest date – i.e. 6-January.
- We then iterate over each customer in the custs table and calculate the [Customer Count] prior to the curr_date. This is 1 if the customer has previously transacted with us and blank (treated as 0) otherwise.
- The SUMX adds up all these values (i.e. all the 1s) and tells us 33, which is the number of returning customers.
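The steps above amount to intersecting two sets: the customers in the current context and the customers who transacted on any earlier date. As a minimal sketch of that idea, assuming the same “data” table, here is a set-based formulation. The measure name and variable names are illustrative and not part of the original workbook, so compare its output against [Returning Customer Count] before relying on it.

```dax
Returning Customer Count - set based =
// Illustrative alternative; assumes the article's "data" table.
VAR curr_date = MAX ( data[Date] )                 // latest date in the current context
VAR custs_now = DISTINCT ( data[Customer ID] )     // customers in the current context
VAR custs_before =
    CALCULATETABLE (
        DISTINCT ( data[Customer ID] ),
        data[Date] < curr_date                     // replaces the date filter with "any earlier date"
    )
RETURN
    // Customers who appear both now and before the current date
    COUNTROWS ( INTERSECT ( custs_now, custs_before ) )
```

Because this version builds two customer lists and intersects them instead of evaluating a measure once per customer, it may behave differently in terms of performance on large tables; that is worth measuring rather than assuming.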
New Customers Measure
New Customers = [Customer Count] - [Returning Customer Count]
If you already have both the total [Customer Count] and the [Returning Customer Count], you can simply subtract one from the other to get the [New Customers] count.
But if you don’t have the [Returning Customer Count], or just want to calculate the new customers directly, you can use the DAX measure below.
New Customer Count - direct =
var custs = DISTINCT(data[Customer ID])
var curr_date = LASTDATE(data[Date])
return
SUMX(custs, IF(CALCULATE([Customer Count], data[Date] < curr_date) = 0, 1, 0))
The above measure uses the same approach as [Returning Customer Count] but flips the logic inside SUMX by using the IF function to negate the CALCULATE result.
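The same “flip” can also be expressed as a set difference: take the customers in the current context and remove those who transacted on an earlier date. This is only a sketch under the same assumptions as the measures above (a “data” table with Customer ID and Date columns); the measure name is made up for illustration.

```dax
New Customer Count - set based =
// Illustrative alternative; assumes the article's "data" table.
VAR curr_date = MAX ( data[Date] )                 // latest date in the current context
VAR custs_now = DISTINCT ( data[Customer ID] )     // customers in the current context
VAR custs_before =
    CALCULATETABLE (
        DISTINCT ( data[Customer ID] ),
        data[Date] < curr_date                     // customers seen on any earlier date
    )
RETURN
    // Customers in the current context with no earlier transaction
    COUNTROWS ( EXCEPT ( custs_now, custs_before ) )
```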
Returning Customers in Last 4 Weeks or similar

While the above [Returning Customer Count] works flawlessly, it may not be realistic to consider a customer to be returning if they rarely transact. So a more realistic calculation would be to consider a customer to be returning if they did some business in the last 4 weeks (or x periods). Here is the DAX pattern for that.
Returning Customers in Last 4 Weeks =
var custs = DISTINCT(data[Customer ID])
var curr_date = LASTDATE(data[Date])
var start_date = DATEADD(curr_date,-28,DAY)
return
SUMX(custs, CALCULATE([Customer Count], data[Date] < curr_date && data[Date] >= start_date))
Here we also calculate a “start_date” for the calculation window. I have used 28 days as an example, but you can change this to any window size.
We then apply the same SUMX logic, but the filter in CALCULATE now checks both boundaries of the date window.
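One caveat worth checking in your own model: DATEADD is a time intelligence function, and per Microsoft's documentation its result includes only dates that exist in the dates column. If no transaction happened exactly 28 days before the current date, start_date could come back blank and the lower boundary would stop filtering anything. A hedged variant that uses plain date arithmetic instead is sketched below. It assumes data[Date] is a date-typed column; the measure name and the window_days variable are illustrative.

```dax
Returning Customers in Last N Days =
// Illustrative variant; assumes data[Date] has a date data type.
VAR window_days = 28                               // change to any window size
VAR custs = DISTINCT ( data[Customer ID] )
VAR curr_date = MAX ( data[Date] )                 // scalar date, not a one-row table
VAR start_date = curr_date - window_days           // plain date arithmetic
RETURN
    SUMX (
        custs,
        CALCULATE (
            [Customer Count],
            data[Date] < curr_date && data[Date] >= start_date
        )
    )
```

Keeping the window size in a single variable also makes it easy to swap in a what-if parameter later if you want report users to pick the window themselves.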
Why not do this analysis in SQL or somewhere upstream?

When I mentioned this approach to my wife Jo, she said, why not do this in SQL directly and tag each customer as “new” or “returning”?
Here is why I prefer to do this with DAX:
- Business Rule Flexibility: With a DAX-based approach, we can easily change the business rule that defines a returning customer. For example, we can switch to the 4 week window shown above.
- Interactivity: We can add a product slicer (see below) to analyze which customers returned to purchase the same product. This is incredibly helpful to understand customer loyalty and campaign effectiveness.
Of course, there are advantages to the SQL approach too. Mainly:
- SQL tagging is faster: DAX measures are evaluated at query time, every time a visual or slicer changes, whereas SQL tagging is computed once, upstream. When you have millions or billions of records, running SUMX on the fly is likely to be slow.
- Consistency: Applying customer tagging at server side in the data layer means the business rule & logic is consistently applied for every report.
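If you like the idea of fixed tags but want to stay inside the model, a calculated column can play a similar role, since it is computed at data refresh and not at query time. The sketch below is an assumption on my part and not something from the sample workbook: it tags each row of the “data” table as “New” when the row falls on that customer's first transaction date and “Returning” otherwise. The column name is illustrative.

```dax
Customer Type =
// Calculated column on the "data" table (illustrative, not from the sample file).
VAR first_date =
    CALCULATE (
        MIN ( data[Date] ),
        ALLEXCEPT ( data, data[Customer ID] )      // keep only the current row's customer
    )
RETURN
    IF ( data[Date] = first_date, "New", "Returning" )
```

Like SQL tagging, this trades flexibility for consistency: the tag ignores slicers, so it cannot answer questions such as who returned to buy the same product.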
Sample Power BI Workbook:
If you want to play with these measures and understand the calculation better, check out the sample PBIX file here.
In conclusion
New vs. Returning Customer analysis is a must-have for customer analytics. The DAX required for this is easy to implement and works beautifully. Try this analysis to understand the effectiveness of marketing campaigns (lead gen, customer capture) and loyalty programs (reward points, notifications). Using a time-window based calculation (e.g. 4 weeks) is a great way to understand customer behavior and purchasing patterns.














19 Responses to “How to Distribute Players Between Teams – Evenly”
An excellent solution, especially for large data sets.
Another solution without using solver would be to assign the player with the highest score to Team 1, the 2nd to team 2, 3rd to team 3, 4th to team 3, 5th to team 2, 6th to team 1, 7th to team 1 and it continues. This method would end up with a Std Dev of 0.001247219. This works best with a distribution with lower Std Dev for the dataset.
Full Disclosure: this is not my idea, remember reading something a few years ago. Think it may have been Ozgrid
thinking back I now remember why I read about it. About 10 years back I had to distribute around 300 team members into 25-30 odd teams. Used this method based on their performance scores. I used the method I described to do this and the distribution was pretty fair.
Solver would have saved me a ton of time though 🙂
I think the issue with your first Solver approach was that you took the absolute value of the sum of team deviations (which should always be zero except for rounding) instead of the sum of the absolute values (which is a reasonable measure of how unbalanced the teams are).
Here's another simple algorithm you could use: you start from the top (with players sorted from high to low), and at each step allocate the next player to whichever team has the smallest total so far. You can implement it dynamically with some formulas so it will update automatically when the data changes.
If the scores were more widely distributed (so that this might end up with not all teams the same size), you could add a constraint to only pick among the teams which currently have fewest players at each step, or just stop adding to any team when it hits its quota.
When I tried it on the sample, I got the three teams below, with a STDEV of 0.000942809 (i.e. about half of what Solver got to).
Team 1: John, Hugo, Tom, Josh, Eric, Zane, Charles, Andrew
Team 2: Barry, Michael, Kenny, Joe, Xavier, Patrick, Oliver, William
Team 3: Henry, Steven, Ben, Frank, Kyle, Edward, Cameron, Lachlan
Thanks for sharing!
Hi,
I was looking at all the solutions and this is closest to what I intended to do. I am dividing a bunch of players into 3 soccer teams. Players availability is also a factor while deciding the teams.
So the steps the excel needs to do is as follows:
1) In availability column if "yes" go to next
2) Equally divide 'Goalkeepers', 'Strikers', 'Defenders' basis their quality
So the end result gives each 3 teams a balance of players playing at different positions.
Can this be done on Google spreadsheet with only availability as an input from the user and rest calculates by itself.
Sorry for asking such a pointed question, but I have been struggling to find a solution for it for sometime now!
Hi Ishaan,
I am working on a similar problem at the moment, so I am wondering if you ever found a solution and if you are willing to share what you did.
Hi everyone, this is a variation of the famous Knapsack Problem https://en.wikipedia.org/wiki/Knapsack_problem.
I had to use a VBA implementation recently as part of a problem where we are trying to allocate teams of an organization to different locations (we are a large company with many different teams). The goal was to optimally allocate teams to individual buildings without putting too many teams into one building and without splitting teams apart.
As we had around 400 teams of different sizes, solver couldn't handle it anymore. Luckily there is a Knapsack algorithm implementation in VBA readily available on the internet :).
I also went with a heuristic approach first!
An interesting mathematical solution, but what if Eric and Xavier can't stand each other or Patrick is best friends with Steven - the real life problems that affect "even" teams.
@Joe
You can add more criteria like
If Eric and Xavier can't stand each other
=OR(AND(E15=1,E16=1),AND(F15=1,F16=1),AND(G15=1,G16=1))
It must be False
If Patrick is best friends with Steven
=OR(AND(E5=1,E17=1),AND(F5=1,F17=1),AND(G5=1,G17=1))
It must be True
Note that the 2 formulas above are exactly the same
except for the ranges
One must be True = Friends
One must be False = Not Friends
Nice Post!
Just one question: what if the number of players is not even or not equally divisible?
Nice post Hui!
I downloaded your workbook and just tried changing, in the options, the Precision Restriction from 10E-6 to 10-8 and the Convergence from 10E-4 to 10E-10. The process took almost the same time, but the results were great.
The standard deviation I got was 0,000471.
Team 1: John, Tom, Kenny, Frank, Eric, Xavier, Edward, Zane
Team 2: Steven, Hugo, Ben, Joe, Josh, Oliver, Cameron, William
Team 3: Barry, Henry, Michael, Kyle, Patrick, Charles, Andrew, Lachlan
Great application of Solver! Thanks for the link!
Great explanation. Well done... However, I tried with 6 teams of 4 players and solver never did finish.
How about vba code for the same data set.
I have 3 column A B C wherein A has text and B has number Wherein C is blank. And in C1 been the header C2 where I want the name to come evenly distributed the number which is in Column B.
My Lastcolumn is 1000.
Sorry if I'm being slow here, but how is 'Team Score' calculated? I've gone through the explanation several times but it seems to just appear.
@Hrmft
This process uses the Solver Excel addin
Solver is effectively taking the model and trying different solutions until it gets a solution that meets all the criteria
Then solver puts the solution into the cell and moves to the next cell
So yes it appears to "just appear"
Hi ! Thank you so much ! Works great 🙂
I cannot get the fourth Equation to work in my excel spreadsheet
You have =($E$2:$G$25=0)+($E$2:$G$25=1)=1 as a SUMIF solution, I have, =($F$2:$H$13=0)+($F$2:$H$13=1)=1 as my solution but it does not work. The only thing I changed is the ranges. Any suggestions?
Thank you.
Jim
I cannot get the fourth Equation of TRUE or FALSE statements to work in my excel spreadsheet. You have =($E$2:$G$25=0)+($E$2:$G$25=1)=1 as a SUMIF solution, I have, =($F$2:$H$13=0)+($F$2:$H$13=1)=1 as my solution but it does not work. The only thing I changed is the ranges. Any suggestions?
Sorry I left some of it out in the previous question,
Thank you. Jim