An ABC Analysis is a very common requirement for for business users. It classifies e.g. Items, Products or Customers into groups based on their sales and how much impact they had on the cumulated overall sales. This is done in several steps.
I just published a new version of the Dynamic ABC Analysis at www.daxpatterns.com. The article can be found here.
1) Order products by their sales in descending order
2) Cumulate the sales beginning with the best selling product till the current product
3) Calculate the percentage of the cumulated sales vs. total sales
4) Assign a Class according to the cumulated percentage
Marco Russo already blogged about this here. He does the classification in a calculated column based on the overall sales of each product. As calculated columns are processed when the data is loaded, this is not dynamic in terms of your filters that you may apply in the final report. If, for example, a customer was within Class A regarding total sales but had no sales last year then a report for last year that uses this classification may give you misleading results.
In this blog I will show how to do this kind of calculation on-the-fly always in the context of the current filters. I am using Adventure Works DW 2008 R2 (download) as my sample data and create a dynamic ABC analysis of the products.
The first thing we notice is that our product table is a slowly changing dimension of type 2 and there are several entries for the same product as every change is traced in the same table.
So we want to do our classification on the ProductAlternateKey (=Business Key) column instead of our ProductKey (=Surrogate Key) column.
First we have to create a ranking of our products:
IF(NOT(ISBLANK([SUM SA])),
RANKX(
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA])))
Check if there is only one product in the current context and that this product also has sales. If this is the case we calculate our rank. We need to do the CALCULATETABLE to do the ranking within the currently applied filters on other columns of the DimProduct-table e.g. if a filter is applied to DimProduct[ProductSubcategoryKey] we want to see our ranking within that selected Subcategory and not against all Products.
I also created a measure [SUM SA] just to simplify the following expressions:
The second step is to create a running total starting with the best-selling product/the product with the lowest rank:
TOPN(
[Rank CurrentProducts],
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA]),
[SUM SA])
We use a combination of SUMX() and TOPN() here. TOPN() returns the top products ordered by [SUM SA]. Further we use our previously calculated rank to only get the products that have the same or more sales than the current product. For example if the current product has rank 3 we sum up the top 3 products to get our cumulated sum (=sum of the first 3 products) for this product. Again we need to use CALCULATETABLE() to retain other filters on the DimProduct-table.
The third step is pretty easy – we need to calculate percentage of the cumulated sales vs. the total sales:
[CumSA CurrentProducts]
/
CALCULATE([SUM SA], ALL(DimProduct[ProductAlternateKey]))
This calculation is straight forward and should not need any further explanation.
The result of those calculations can be seen here:
To do our final classification we have to extend our model with a new table that holds our classes and their border-values:
Class | LowerBoundary | UpperBoundary |
A | 0 | 0.7 |
B | 0.7 | 0.9 |
C | 0.9 | 1 |
Class A should contain products which’s cumulated sales are between 0 and 0.7 – between 0% and 70%.
Class B should contain products which’s cumulated sales are between 0.7 and 0.9 – between 70% and 90%.
etc.
(This table can later be extended to support any number of classes and any boundaries between 0% and 100%.)
To get the boundaries of the selected class we create two measures that are later used in our final calculation:
Our final calculation looks like this:
[SUM SA],
CALCULATE(
[SUM SA],
FILTER(
VALUES(DimProduct[ProductAlternateKey]),
[MinLowerBoundary] < [CumSA% CurrentProducts]
&& [CumSA% CurrentProducts] <= [MaxUpperBoundary])))
If our Classification-table is not filtered, we just show our [SUM SA]-measure. Otherwise we extend the filter on our DimProduct[ProductAlternateKey] using our classification filtering out all products that do not fall within the borders of the currently selected class.
This measure allows us to see the changes of the classification of a specific product e.g. over time:
In 2006 our selected product was in Class C. For 2007 and 2008 it improved and is now in Class A. Still, overall it resides in Class B.
We may also analyze the impact of our promotions on the sales of our classified products:
Our Promotion “Touring-1000 Promotion” only had impact on products in Class C so we may consider to stop that promotion and invest more into the other promotions that affect all classes.
The classification can be used everywhere you need it – in the filter, on rows or on columns, even slicers work. The only drawback is that the on-the-fly calculation can take quite some time. If I find some time in the future i may try to further tune them and update this blog-post.
The example workbook can be downloaded here:
Though it is already in Office 2013 format an may not be opened with any previous versions of Excel/PowerPivot.
It also includes a second set of calculations that use the same logic as described above but does all the calculations without retaining any filters on the DimProducts-table. This allows you to filter on Class “A” and ProductSubcategory “Bike Racks” realizing that “Bike Racks” are not a Class “A” product or to see which Subcategories or Categories actually contain Class A, B or C products!
Your running total methodology obtained with TOPN is very creative. It opens up a number of interesting possibilities, not the least of which is a new way to calculate moving averages.
I tried your solution. on my data (3,000,000 + lines of sales and 20,000 + products) measure CumSA CurrentProducts is calculated for too long. There are ideas for increasing productivity?
20k products is quite a lot for this kind of calculation
the only option I see to improve performance is to remove products that do not have sales at all before running all the calculations
running the calculation only on a subset of data should speed it up but this also depends on how sparse your fact-table is populated
how is the calculation for [Rank CurrentProducts] performing?
if it performs well you may consider to do the classification only based on the rank
e.g. Top 1000 products is Class A, 1001-5000 is Class B and rest is Class C
Thanks, Gerhard.
Needs is a classic ABC. I have the solution to the Multidimension. I’m using a dynamic set and function TopPercent to determine the minimum value of sales for the product in 80% (A Class), 95% (B Class) and then compare the sales of each product with these values.
iif([Measures].[SalesAmount]>=[minA SalesAmount],”A”,
iif([Measures].[SalesAmount]>=[minB SalesAmount],”B”,”C”))
In DAX no TopPercent function…
I tried to tune the calculation by also calculating [minA SalesAmount] first but the performance is worse than before as that calculation is also performed everytime
the only option I see is to precacluated those [minA] values for predefined slices – e.g. Year and Country:
Year Country MinA MinB MinC
2008 US 300.000 100.000 0
2009 UK 175.000 100.000 0
[MinA]/[MinB]/[MinC] could be calculated columns that do the expensive calculation at beforehand so no on-the-fly calculation is needed
but this makes the whole thing much more complex and also it is not flexible anymore
though, for your case it may be a viable solution
Hi Gerhard, your sample is a very good. Unfortunately I’ve yet Excel 2010 and so I cannot open the data model: the Classes table isn’t related to any other table, isn’t it? If I want to analyse by category, do I modify your formulas or is it better to create new measures?
Hi pscorca,
you are right, Classes table is not related to any other table
to analyze by category it is probably easier to change the current calculations instead of creating new ones
Hi,
Thank you for sharing the solution above. I am just getting stuck at one place – the [Rank]. This is the RANK formula I am using in the Pivot Table is
=IF(NOT(ISBLANK([Sessions conducted])),
RANKX(
CALCULATETABLE(
VALUES(Feedback[Organised by]),
ALL(Feedback[Organised by])),
[sessions conducted]))
The problem is that if multiple clients give me the same number of sessions, then it assigns the same rank. So if 3 clients give me 10 sessions each, then the rank will be 5 for all three. Is there a way to get the rank as 5,6 and 7 respectively.
The RANK.EQ will not work here because the second argument to the RANK function has to be a column name and not a measure.
Please help.
Hi Ashish,
I also replied to your MSDN post already
the RANKX()-function accepts a total of 5 arguments to handle ties and ordering
=IF(NOT(ISBLANK([Sessions conducted])),
RANKX(
CALCULATETABLE(
VALUES(Feedback[Organised by]),
ALL(Feedback[Organised by])),
[sessions conducted],
[sessions conducted],
1,
DENSE))
does this work for you?
Hi,
Thank you for replying but that does not work. The ranks assigned to same numbers is still the same.
could you share your workbook or send it to me per mail so i can take a closer look at the Problem?
you find my mail in the About-Tab on top of the page
Pingback: ABC analysis
Hi Gerhard,
I am working on the Top 10/ 20/30/50 Products and rank 31-50, 51-10 etc and its revenue contribute to total revenue (how many %). I use the similar method but I have spent so much time on this but it does not work for me. It would be appreciated if you could help me with this.
Thanks
do you have a sample workbook to share?
this would realy make things easier
just drop me a mail
I sent a sample file few days ago but not sure if you could open the file.
It was quite large
Hi Gerhard,
Thank you for sharing… I used your method to Group stores by Sales Value Growth and it worked perfectly.
Thanks
Erik
Glad to hear that Erik!
May I ask how many distinct stores have?
In your first measure,
Rank CurrentProducts:=IF(HASONEVALUE(DimProduct[ProductAlternateKey]),
IF(NOT(ISBLANK([SUM SA])),
RANKX(
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA])))
I had better query performance for the RANKX portion by using this DAX:
RANKX( SUMMARIZE(ALL(DimProduct), DimProduct[ProductAlternateKey]), [SUM SA] )
Any insight into the workings of things on why Summarize would be faster than Calculatetable? Run against my data, SQL Server Profiler recorded run times of 156 milliseconds, and 437 milliseconds. So not a trivial difference.
hmm, good questions
I guess it is related to the CALCULATETABLE which may not perform very well in this case. Could be that is evaluated every time when RANKX iterates over it and the SUMMARIZE may be cached – but that’s just a guess
It may also vary by the number of rows of [ProductAlternateKey]
how many do you have in your scenario?
-gerhard
Hi Gerhard, I need to set the rank of products over years as equal to the rank of the last year selected by the user in a timeline. For Example: if user selects years from 2009 to 2012, 2009/2010/2011 ranks should be equal to 2012 rank. At the moment I use this DAX Formula, but the LASTDATE function is not dynamic when user selects different time period (for your information, in the dim_time I have years from 2006 to 2012, so 2012 in this case is the last year).
CALCULATE(RANKX(dim_product),[SUM],,,Dense),FILTER(ALL(dim_time[YearDate]),dim_time[YearDate]=LASTDATE(dim_time[YearDate])))
Can you help me? Regards
In order to calculated the rank based on the last selected year only you need to calculate the last years value first. You can use TOPN() to do so:
SUM SA LY :=
CALCULATE (
[SUM SA];
TOPN (
1;
VALUES ( DimDate[CalendarYear] );
DimDate[CalendarYear]
)
)
then you can use this measure in all subsequent calculations or only for the rank calculation
-gerhard
Thank you for your support.
I ask you another question.
I have a dim_topN table with only one column with values 1,2,3,4,5 and I want to use it to select TOP N Products. This table has no relations with the others tables.
I created a TOPN slicer, and the requirement is that if user selects “1”, he want to see the TOP 1 Product, if he selects “2” he want to see both TOP 1 and TOP 2 Products and so on. So I need multiple selection on that slicer.
At the moment the result I reached is that the PivotTable shows the TOP 1 Product when “1” has been selected, the TOP 2 Product when “2” has been selected and so on. If I try to select multiple items from slicer, PowerPivot gives me error.
How can I enable multiple selection from the slicer?
If is not clear, I’ll give you more details.
Thank you in advance
Regards.
well, basically you would need a slight deviation of the CumSA-Measure from above:
SUM TopN Product Sales:=SUMX(
TOPN(
MAX(dim_topN[topNcount]),
CALCULATETABLE(
VALUES(DimProduct[ProductAlternateKey]),
ALL(DimProduct[ProductAlternateKey])),
[SUM SA]),
[SUM SA])
Hi,
Instead of the sales amount in each class, I’d like to have a distinctcount of the number of materials in a certain Category:
eg:
Class A: 500 items
Class B: 600 items
Class C: 900 items
Or in combination:
Class A: 500 items: 80€
Class B: 600 items: 15€
Class C: 900 items: 5 €
I can’t find the correct DAX expression to have the materials counted in that particular Class (because there is no relationship with the DimABC Classes?)
Can anyone help me to solve this please?
Thanks
Geoffrey
Hi Geoffrey,
you are right, there is no relationship between DimABC to any other table and its also not necessary. The values are related to DimABC only via the last calculated measure [SA Classified Current] which respects the current selection on DimABC
basically, to do the classification on a distinct count measure you would simply have to change the definition of the [SUM SA] measure to a distinct count instead of a sum
however, I am not quite sure yet if an ABC classification in general makes sense on a distinct count measure
you can rank by the distinct count measure but further accumulating the top N results may not be correct as a distinct count is not additive
-gerhard
Hi Gerhard,
thank you for your quick response.
Basically, I want to know how much materials are making my C Class.
So still necessary to have the classification based on the sales amount
I want to know how many (unique) products there are in each bucket. And of course adjusted by each filter you apply?
So I don’t think changing the formula from [Sum SA] to a distinct count will help?
Geoffrey
OK, now I got your point 🙂
in this case we only need to change the last calculated measure to the following:
DC Products Classified Current:=IF(
NOT(ISCROSSFILTERED(Classification[Class])),
[DC Products],
CALCULATE([DC Products],
FILTER(
VALUES(DimProduct[ProductAlternateKey]),
[MinLowerBoundary] < [CumSA% CurrentProducts] && [CumSA% CurrentProducts] <= [MaxUpperBoundary]))) the measure [DC Products] would simply be your distinct count measure does this work for you?
It works like a charm now … you’re the best gerhard. Thank you very much for your professional assistance!
Kind regards,
Geoffrey
I was able to work through this and get it working for my data set. The only change I need to make is that I want the ABC ranking to be dynamic regarding every possible filter EXCEPT time. So I may change sales regions or salesperson, etc…. but i want it to look at the entire time frame of my data set and classify them as A,B,or C regardless of it being week 1 or 2 or 3….etc..
I am guessing i need to incorporate “All(Dataset[Finish Date])” into one or more of the formulas, but I am not sure where. Any assistance would be much appreciated!
without having tested it I would say simply changing the [SUM SA]-measure by adding “ALL(Dataset[Finish Date]))” should do the trick
I am currently working on a new version for dynamic ABC analysis with much better performance
I may address your request there
-gerhard
Okay…. It APPEARS that worked. I can’t be sure… haha… I can’t honestly say I have less than a VERY general understanding of how these measures are working.
I have no doubt this “Dynamic ABC grouping” does work, but even though I have implemented this into my file, it is definitely slow to respond, and I can’t say I understand how it works very well at all. A new solution with better performance would be super useful. Even though this gave the grouping i am wanting, I doubt I would ever implement it into future files given its current state. The slow performance, and my overall inability to have a solid understand of how it works (and be confident it is working accurately). Just my two cents.
The reason I am looking for this ABC grouping is for a stacked bar graph of volume over time. With a large # of customers, it is too much noise, making it difficult to really see what matters. The grouping I implemented showed only the top 40% of volume customers over the entire time frame (4 customers), letting the smaller volume customers be grouped. A trend of the top customers volumes can be seen, and their impact on the overall volume.
Many thanks for posting the current solution! There doesn’t appear to be anything like it online that I have found, and that surprises me. There definitely is value in quickly being able to see only your key customers for a region, product, etc….
Thanks again!
Hi Gerhard,
Will you post the new version of the Dynamic ABC analysis in this same thread?
Or how can I be updated when it will become available?
Thanks
Geoffrey
Hi Goeffrey,
I will add a link here and also write a dedicated post for the new solution
-gerhard
Hi Gerhard,
thank you for the excellent tutorial.
However I still can’t achieve my desired output with your solution:
Instead of the “ProductAlternateKey” I want to show the “EnglishProductName” in the PivotTable in Excel for the User on the rows. If you replace the “ProductAlternateKey” with any other attribute from the same Dimension (for Example “EnglishProductName”), the PivotTable shows everything with Rank “1”, and doesn’t accumulate the values anymore.
It only works if you show ONLY the “ProductAlternateKey” on the rows.
I don’t quite understand this, it’s just a 1:1 relationship and the Name is in the same row…
I cannot carry out the calculation on my ProductName, because it could be not distinct.
Is there a way to do it?
I just found the solution myself with a bit more thinking 🙂
Instead of giving the VALUES and the ALL functions only one specific column, you have to give them the whole table:
Rank CurrentProducts:=IF(HASONEVALUE(DimProduct[ProductAlternateKey]),
IF(NOT(ISBLANK([SUM SA])),
RANKX(
CALCULATETABLE(
VALUES([TableNameHere]),
ALL(DimProduct[TableNameHere])),
[SUM SA])))
The same applies for the cumulated value etc
Hi Leonhard,
great that you found a working solution for your scenario on your own!
here are just two side-notes from me:
1) the measures [Rank], [CumSA] and [CumSA%] were not supposed to be exposed to the enduser. they were and should only be used internally for the final calculation
2) I will soon publish a new version of the Dynamic ABC calculation at http://www.daxpatterns.com with much better perofrmance
kind regards,
gerhard
Hi Gerhard,
due to the “nature” of the DAX-Calculations used in this ABC solution, the performance seems to be acceptable with about 1000-1500 products to analyze.
However the amount of my distinct products that have sales to analyze with this ABC solution is over 35.000. While tracing with the SQL Server Profiler I was able to conclude that the [Rank CurrentProducts] still runs quite fast, but [CumSA CurrentProducts] and [SA Classified Current] take 5-10 minutes to run, wenn the user selects a different year or company division from my connected slicers.
My Charts include “Top 5 Products” and “Bottom 5 Products” per selected A / B / C Group, selected Year and selected Company Division as well as a visual line-chart representation of Cumulated Percentage distribution (lorenz curve) of all products.
Can you share any advice at this point that I might utilize to improve my performance, besides calculated-columns to pre-calculate the ABC-status per filter?
Hi Leonhard,
as I already mentioned in my previous reply I am going to publish a new version of the calculation with much better performance within the next weeks
if you want to test the new calculation beforehand just drop me a mail
-gerhard
Hi Gerhard, any update on the new calculation?
Kind regards,
Geoffrey
Hi Goeffrey,
I dropped you a mail with a beta-version of the new calculation.
In case you have not received it just contact me via mail (see About-page for details)
kind regards,
-gerhard
Thanks, I received your email. I will work with it as soon I have some time available.
Kind regards,
Geoffrey
How could I get the total sales of each classification and respective product count for each classification?
Hi Christopher,
make sure to also read my published article at DAX Patterns: http://www.daxpatterns.com/abc-classification-dynamic/
the style and content there is much better and probably easier to understand
in general, you can use any measure you want, simply place it as the first argument of the first/outer CALCULATE (the so called “” according to the article above)
this also includes distinct counts, and even more complex calculated measures as we are basically just creating/adding a filter on a set of items/products
-gerhard
It’s the most usable way to calculate ABC. Thank you very much !
Hi Gerhard,
Thank you very much for your topic, it’s very useful for me. By the way, I’m trying to list each with its class in a table. How can i do that with a formula.
I think there is some code in the PBIX that exposes the class as a measure which you could use inside a table next to other measures