Finding common maximum in Power BI

Let’s get straight into the problems.

Problem #1

You get country sales data on monthly basis. But, some of the countries i.e. Switzerland and Italy lag the others. Let’s imagine a scenario when all countries latest data (June-2020) has arrived except Switzerland and Italy. This is how your data table would look like in this interim situation:

As you can see, Switzerland and Italy are missing from June data.

I have come across many clients who would prefer showing May-2020 in reports as latest data month of Europe region in this situation. Because May-2020 is the latest complete month for Europe. At the same time, they would like to see June-2020/May-2020 whichever is applicable as latest month for individual countries. In other words, client wants to see something like this in his report:

This is our first problem.

Now, let’s get started with implementation.

Let’s imagine our table’s name is CountryData. We will introduce a calculated column “Latest Month Column” using DAX:

Latest Month Column = FORMAT(CALCULATE(MAX(CountryData[Month]), FILTER(CountryData, CountryData[Country]=EARLIER(CountryData[Country]))),“mm/dd/yyyy”)

This column always shows the latest month available for respective country.

Now, we will introduce two DAX measures called “Latest Month” and “Latest Month Sales”

Latest Month = MIN (CountryData[Latest Month Column])

Latest Month Sales = SUMX(FILTER(CountryData, CountryData[Month]= CountryData[Latest Month]),CountryData[Sales])

That’s it. We will now use these two measures in our report:

As you can see, North America’s latest moth is June-2020 whereas Europe’s is May-2020.

Problem #2

This lagging nature of some countries will cause some problem in charts/graphs as well. Look at following charts

First bar chart is fine. I have placed it just to visualize that Switzerland and Italy are missing from June. But, the second one (line chart at right hand side) will give the client a worrying picture of his business. They would see Europe’s sales is suddenly going through a steep decline. It’s a confusing message because the decline was caused due to the absence of Switzerland and Italy. It will get fixed as soon as they arrive. It has nothing to do with actual sales. In this scenario, some clients would ask you to plot the latest month only if it has complete set of data for all countries. In other words, we will plot charts only up to the common maximum month for all countries within that region. In our case, last month plotted for Europe will be May-2020 and North America’s last month will be June-2020. This will avoid the misinformation about Europe’s sales. And when Switzerland and Italy eventually arrive, June-2020 will be plotted for Europe as well. This is our Problem#2.

Let’s implement the solution now:

Step 1: Let’s create a calculated table called “RegionLatestMonth” from our original CountryData table. This will contain region’s name and common maximum month from all its respective countries.

RegionLatestMonth = SUMMARIZE(CountryData,CountryData[Region],“Region’s Latest Month”, [Latest Month])

Step2: Create another table called “CountryData2” that will have one extra column than the original CountryData table. This extra column will have Region’s common maximum month for each country. In other words, we will join our CountryData table with the DAX table created at Step 1 based on Region names and add the first’s table’s column (Region’sLatestMonth) as an extra column into the second table called CountryData. Resultant table’s name is CountryData2:

CountryData2 =

var TempTable = TREATAS (RegionLatestMonth, CountryData[Region], RegionLatestMonth[Region’s Latest Month])

RETURN NATURALLEFTOUTERJOIN (CountryData, TempTable)

Step3: Finally, create another DAX table called CountryData3 from CountryData2 created at Step2 by excluding the rows whose month is larger than respective Region’s common maximum month.

CountryData3 = CALCULATETABLE (CountryData2, FILTER (CountryData2, CountryData2[Month]<=CountryData2[Region’s Latest Month]))

And then use the CountryData3 created at Step3 in your visualization:

Now you can see that Europe’s last month is May-2020 while North America’s last month is June-2020. I have highlighted both in above diagram.

Posted in Power BI, DAX | Tagged , , , , , , , , | Leave a comment

Links in Power BI

This is just a compilation of different types of linking mechanisms in Power BI. I have explained them from use case point of view. It’s mainly written for newbies like me who are wondering which linking feature is applicable for which use case? 🙂

Tiles: Link between dashboard and reportss

This is only available for Power BI Service. Click on ‘tiles’ icon on any visual.

Then you will see a prompt to select desired dashboard or create a new one.

And when user opens that dashboard, he can click on the tile to go into the original report from which this tiled visual is coming from. The dashboard gives you a landing page/home page or ‘summary’ of your measures/KPIs. You can then click visuals to see details inside your reports.

Bookmarks: Link to another page within same report (without parameters or filters)

This is usually needed when you just want users to click and land in another page of same report without thinking about any devilish parameters 😊

Steps:

  1. Create bookmark. I have created a bookmark called ‘Potential’ which links to a page called ‘Potential View’
  • Create a button/text whatever you like, give it a proper text or heading and define ‘Action’ for it. Action’s type will be ‘Bookmark’.

That’s all.

Hyperlink: Link your report from another report or another external site (with filter as parameter)

Two use cases for this type of linking:

  1. User visits your report from an external site. But you want to control his landing page and filters.
  2. User visits a new report from your original report. For example, if your users want to do some ad-hoc analysis separately from the original pre-defined report and you want to slice his data through filter.

Steps:

  1. Following is a sample URL you can embed anywhere you like (PBI buttons, links, external HTMLs etc.)

https://app.powerbi.com/groups/me/reports/zzzzzzzz/ReportSection?filter=SvGridFact_/ReportingGeo_Desc eq ‘Territory02’

It’s based on the following pattern

URL?filter=Table/Field eq ‘value

Visit https://docs.microsoft.com/en-us/power-bi/service-url-filters for more details.

2. Remember, if Page1 of your report is represented by ‘zzzzzzzz’ , Page2 might be represented by ‘yyyyyyy’. You can easily copy the appropriate link from browser address bar. Many people suggested to use &pageName=ReportSection, &pageName=ReportSection2 etc. as suffix to the main URL in order to land on different pages of your report but I could not make it work. I rather used the original URL coming from browser address bar for each individual page.

3. You can embed these links in grid tables as well. Look at the following example of a grid table.

‘Geography’ column shown in Grid table is coming from a column called ‘ReportingGeo_DESC’ of ‘SvGridFact’_ table.  Now, you want to show hyperlinks in the same grid with parameter coming from Geography column (aka SvGridFact_[ ReportingGeo_DESC]).

The report where the link will redirect you has a table called ‘AttrGeographySelection’ and a column called ‘GeographySelection_DESC’. You want to filter AttrGeographySelection[GeographySelection_DESC] column with Parameter value coming from SvGridFact_[ReportingGeo_DESC ] column. So, all you have to do is to introduce a new column called ‘Hyperlink’ with following definition:

Hyperlink = “https://app.powerbi.com/groups/me/reports/zzzzzz/ReportSection?filter=AttrGeographySelection/GeographySelection_DESC eq ‘”&SvGridFact_[ReportingGeo_Desc]&”‘”

Let me make it easy for you with a simplified formula:

Hyperlink_column = “https://app.powerbi.com/groups/me/reports/zzzzzz/ReportSection?filter=DestinationReportDataTable/DestinationReportDataColumn eq ‘”&SourceReportGridDataTable[SourceReportGridDataColumn]&”‘”

By the way, I have set ‘URL Icon’ on in Format->Values section

Data behind the grid table looks like this:

Drillthrough

This is handy when you want your users to click on any data element and jump to another page within the same report. Yes, the landing page will have data filtered by the data element clicked on the source page.

Step1: Identify the landing page. Drag the desired column into drillthrough area highlighted below

For example, I have dragged Country column into Drillthrough field

Now, you can go to another page having Country and right click to drillthrough. Then you will be redirected to the landing page defined above.

Cross-report drillthrough is also possible. See official documentation on Drillthrough: https://docs.microsoft.com/en-us/power-bi/desktop-drillthrough

Posted in Power BI, DAX | Tagged , , , , , | Leave a comment

Non-additive measures with different granularity in Power BI

I have found many articles on web regarding non-additive measures in power BI. But most of them are about inventory-type measures where normal aggregate functions are not suitable for use. However, in real life we often face situations where data providers deliver pre-calculated measures for different hierarchy levels. Numeric distribution, weighted distribution, market share etc. are some examples of such measures. And then we are left with no choice but to use fact tables with different granularity. It’s always a bit of challenge to show those measures in a single report in any BI tool.   

Look at following example. Here, we have a non-additive measure called Market Share and two fact tables with different granularities. One of them is at Region level and the other one at Country level.  

CountrySales Table

RegionSales Table

Let’s define a DAX measure called ‘MS’ in CountrySales table:

MS = if (selectedvalue(CountrySales[Country])=BLANK(), CALCULATE(SUM(RegionSales[Market Share]), filter(CountrySales,CountrySales[Region]=SELECTEDVALUE(RegionSales[Region]))), SUM(CountrySales[Market Share]))

Let’s look at the visuals now:

As you can see, Sales are getting summed up but MS (DAX based market share) are showing value directly from RegionSales table. I have used multi-select slicers here. Whenever all or multi-country is selected from Country slicer, a blank is returned and Region level MS (Market Share) is displayed. Otherwise Country level share is displayed.

In case you are wondering about relationship:

This solution can be improved a lot. I would be happy to hear from experts.

Posted in Power BI, DAX | Tagged , , , , , , , | 2 Comments

Restore SQL Server database using SAS from Azure Blob Storage

You need to create a SQL Server credential first to restore a database back up from Azure blob storage. A lot of writings can be found around the web on this. Most of them have used Azure Storage Account to create the credential in the first place. In this short post, I will be using SAS (Shared Access Signature) to create the credential first. So, here you go..

Following is a snapshot of a credential created using SAS. Identity will be SHARED ACCESS SIGNATURE and Password will be your <SAS token>.

Once, it is created you can now go to Restore database or Restore File or File-groups option

And finally use the credential created at the beginning to locate your back up file in Azure storage

Use your back up file’s URI in Shared Access Signature area.

If you like to use scripts to perform all these actions, you can follow this tutorial from Microsoft

https://docs.microsoft.com/en-us/azure/sql-database/sql-database-managed-instance-get-started-restore

 

Posted in Azure, SQL, Database | Tagged , , , , | Leave a comment

Device specific logic in Microstrategy

Sometimes you may need to set device specific logic in your microstrategy documents or reports. For example, some VI charts show different labels or different headers for iPad and you may want to hide them in iPad using a rectangular label. Or, you may not want to allow iPad users to click a certain button or link and at the same time you want to allow desktop or notebook users to click that link or button. In these cases, you can use a system prompt called “Mobile ID” to find out if the user is viewing your report from an iPad or from a desktop.

First, create a metric using this system prompt. Just put ?[Mobile ID] in your metric formula.

Device.JPG

Let’s say this Metric’s name is “DeviceID”. Now, put this metric in a dataset. When you run this dataset, it should give you user’s mobile device id. When you run this from a PC or notebook, it will give you some default digits – “1234567890123456“.

Device3.JPG

Now you can use this metric (“DeviceID”) in conditional formatting inside your documents. You need to write something like…

If (DeviceID = 1234567890123456) then hide

else show

Hope it helps. 🙂

 

Posted in Microstrategy, MSTR | Tagged , , , , , , , | Leave a comment

From SQL to Spark

When I started learning spark, I decided to learn it from the other way round. That means, rather than going through a step by step journey, I decided to learn it from BI reporting perspective. I took a table with some data. Then, I tried to answer some common reporting queries by using python+Spark. It was simple stuff but I decided to post it in my blog as it may help other people from BI/SQL background who are wondering where to start from.

My table looks like below. Each territory has some bricks and each brick has some sales and target amount.

P1

So, first question is which territory has the highest number of bricks?

It’s a simple group by/count problem in SQL. But pySpark is quite simple as well.

df.groupby(‘Territory’).count().orderBy(‘count’, asending=False).show(1)

p2

Second question is, which Territory has the highest target attainment? Here, we have to calculate sum of sales and sum of target for each territory and then divide sales by target to find out attainment percentage. Again, pySpark has some simple solution:

 

from pyspark.sql.functions import sum
from pyspark.sql.functions import format_number
df.groupby(‘Territory’).agg(format_number((sum(“BrickSales”)*100/sum(“Brick Target”)),2).alias(‘Attainment %’) ).show(1)

HighestAttainment

Finally, we will try to make a flat structured table a hierarchical one. Let’s see the input table.

Flat1.JPG

We have a hierarchy here; National->Territory->Brick. We will now make this table a classic hierarchical parent-child table. So, our expected table will be like this:

Child                            Parent                        Sales

AB16                             Terr0032                   0

BA06                             Terr0003                   374.0941667

Terr0032                      40012                          XXXX

Terr002                        40012                          XXXX

To solve this problem, we will create two dataframes and then make a union of them.

from pyspark.sql.functions import sum
df_1 = df.withColumnRenamed(“Brick”,”Child”).withColumnRenamed(“Territory”,”Parent”).withColumnRenamed(“BrickSales”,”Sales”)
.select(‘Child’,’Parent’,’Sales’)
df_2 = df.groupby(‘Territory’,’NATIONAL’).agg(sum(“BrickSales”).alias(‘Sales’) ).withColumnRenamed(“Territory”,”Child”).withColumnRenamed(“NATIONAL”,”Parent”)
df_1.union(df_2).show(700)

Flat2.JPG

Posted in Spark, Python | Tagged , , , , , | Leave a comment

Visual Insight charts inside Microstrategy documents

Microstrategy visual insight has lots of exciting beautiful looking charts. But many of them are not available in documents. You can import them into document following the steps described below. But, do not do this unless you really have to do it. Because, those imported charts become difficult to maintain inside document. most of the time, if you change any property (or change threshold etc.) that chart turns into a mess. And, you have to import it again.

So, here are the steps to import VI charts into document:

  1. Create your desired chart in a VI dashboard and convert it into a document. Then save that document.
  2. Now open your original document where you want to show the VI chart. Now, click on ‘Layout’ available in Insert menu at top menu bar.

insertPanel

3. Now, you will get a file browser where you have to select the document saved in step 1.

4. Now, that document will be imported into your original document. See ‘Layout’ tab. You now have two tabs. Now, all you have to do is to go to the second tab, copy your VI chart and paste it into your first tab (‘Layout1’).

Panels

Caution: Imported layout’s dataset is also imported into your original one. You will not get any notification. Often, it creates mess.

Posted in Microstrategy, MSTR | Tagged , , , , , | Leave a comment

Metric calculation based on selection in Microstrategy

Hello guys, I am back after a long time. I was learning Microstrategy during last one year. So, here I am, writing something on mstr.

Today, I am writing about a problem that almost every mstr developer faces. Let’s say, you have two base metrics: dollars and units. Now, you like to calculate Market Share (MS) based on these two base metrics. Everything stays fine up to this point. But, problem starts at dashboard or document side. It is very likely that your client would like to have one selector in web document to choose between dollars and units. And MS should be calculated based on his selection (either dollars or units). Given below is a snapshot which will make everything clear.

Usually, this problem is solved by using panels with a panel selector. For two base metrics, you have to take two identical panels. Both panels will be exactly same from visualization point of view. But, they will be using two different metrics. One panel will use MS_Dollars and the other one will use MS_Units. That means, whenever you need to change anything, you have to do it in two places. Imagine what would happen if you have three or four base metrics? You will end up copying the same panel three or four times. In one of my projects I had six identical panels (yuuuk..)!

Following is my workaround to avoid creating multiple identical panels; although with a small glitch…

  1. Add one table called DimFlag into warehouse catalog. It will contain two records:  ‘Units’ and ‘Dollars’

DataTable

2. Create an attribute called Flag. This attribute doesn’t have any relationship with any other table in my schema.

AttrFlag

3. Create a metric called Flag using the attribute created in previous step.

MetricFlag

4. Create one metric called MyMeasure. This metric holds key to our solution. This metric returns Dollar metric if Dollar is selected from flag attribute and it will return Units metric if Units is selected from flag attribute. Note that Flag ID has been used in if condition instead of flag description.

MyMeasure

5. By the way, we have another attribute called country. We want to show each country’s sales (dollar or units) and its corresponding share (%) in our web document. We need to use level dimensionality to derive ‘market sales in units’ and ‘market sales in dollars’ first. When I say ‘market’, I mean summation of all available countries. So, our level dimensionality will make sure Country attribute is ignored as ‘filter’ and ‘GroupBy’. This level dimensionality is not related to our main solution. Our main trick is the ‘if’ condition that was demonstrated in previous step. Finally, we will create another wrapper metric called ‘Market Total’ which will contain an ‘if’ condition just like the one in ‘MyMeasure’ metric created in previous step.

marketTotal

6. Now, we are ready to create a dataset. Given below is a snapshot of our dataset. If you look at corresponding generated SQL, you will notice a cross-join. It’s because Flag attribute is not connected anyway to our facts. This is the small glitch I was talking about. But, we can live with a cross-join for small scale projects where data volume is not that big and generated SQL is also simple.

DS1

7. We are almost done. All you have to do is to import this dataset into your document. Create a selector from Flag attribute, create a grid table using MyMeasure and MarketTotal metric and Country attribute. Finally, make this grid a target for your Flag selector. Metrics calculation will happen based on your selection in selector as per the ‘if’ condition defined inside Metric definitions.

Hope you’ve understood my idea. Get back to me for any question.

 

Posted in Microstrategy, MSTR | Tagged , , , , , | Leave a comment

What I have learned in the name of unit testing of ETL, DWH, OLAP Cube and MDX

In one of our projects, we ‘somehow’ decided that every part of the system should be ‘unit test’-ed . It was a classical ETL->DWH->Cube project where web dashboard was built to show BI reports from cube (using MDX as reporting language).  Off course, unit testing was nothing new to our web developers. But, our data warehouse developers had to figure it out.

Let’s take a simple example; we have a function in our ETL process that joins two table and returns the result set. The only way to test this function is to verify its entire output data. But, unlike a programming language where expected output of a unit test remains fixed, this simple function’s expected output cannot remain fixed. Because it depends on input data. The only way we can keep it fixed is to keep the input data fixed. Hence came the idea of ‘frozen environment’. So, all our data should be frozen as long as we are doing development and this way we can pass the unit test. Now this is where it gets difficult. Because, we can never make our system ‘go live’ keeping every data in a frozen state.  It has to change every now and then. And, every time input data changes, we need to change expected output data as well.

Another thing is, it hardly happens that we are changing something in our ETL flow and not expecting any data change in the system. In ETL we mostly do data manipulation, isn’t it? What kind of change we can do in a lame ‘join’ where we do not expect any data change in the end? Yes, performance tweaking, CLRs, string formatters may be some of those but not much other than that. So, we end up changing the unit test almost every time we change anything in the flow. Otherwise, unit tests will fail. So, more and more it is looking like data testing rather than unit testing.

All of above is also applicable for data warehouse, olap cubes and MDX. Actually, unit testing is something meant for programming languages. Cube or sophisticated ETL tools are more like design tools doing mostly data manipulation; therefore will not behave like a programming language. Personally, I will always call it ‘good old data testing’, not ‘unit testing’ anymore.

Posted in SQA, Testing, SQL, Database | Tagged , , , | 2 Comments

Common Latest Period in SSAS Cube

Recently I came across a situation where data was being delivered at different times for different countries and client wanted to see data for latest common month in Region/Continent level reports. Data arrive on monthly basis, but it arrives at different dates for different countries. So, at one point in our data warehouse, we have Dec-2014  data for UK and Qatar. Rest of the countries have fallen behind them (having data for up to Nov-2014, Oct-2014 etc.). Let’s analyze in more details.

First, let’s go into the cube. We have a simple Geo hierarchy. Countries are at lowest level and its parent is Region (Europe, Mid East).

GeoHierarchy

We have a simple month dimension as well. Its primary key is the digit produced by concatenating year with month. So, Dec-2014’s key will be 201412.

MonthDimensionDimMonthData

Let’s browse cube and see how data looks like…

CubeData

Look at rows inside red colored zone. Different countries have different latest months. User requirement is to see latest month’s sales for all geo levels. It can be achieved in many ways. One way to achieve this is to create a calculated measure like this…

CREATE MEMBER CURRENTCUBE.[Measures].[Last Month Sales]

AS sum(

tail(nonempty([Dim Month].[Id].[Id],([Measures].[Sales],[Dim Geography].[Hierarchy].currentmember)),1),

[Measures].[Sales])

,VISIBLE = 1 ;

If we drag this measure in cube browser, country level and region level sales will be like this:

CountrySales1 RegionSales1

It’s perfectly alright for country level sales . But, it will be interesting to analyze Region level sales. Actually, for regions, most of the business users would like to see common latest month’s total sales. In our data, Europe’s last month’s sales 4830 is probably giving the business user a wrong picture (of course it depends…. ), because only UK  data is available for Dec-2014. So showing a total for Dec-2014 may not make much sense. Latest common month for Europe is Nov-2014 and latest common month for Mid East region is Oct-2014. So, it would always make much sense if we show Nov-2014 total for Europe and Oct-2014 total for Mid East. To support this requirement of showing common latest month’s sales I have introduced another fact. This fact is nothing but a storage for country-wise latest month.

CubeStructure2 NewFact

This Fact table’s name is FactCountryMaxPeriod and declared a simple count measure called [Measures].[Fact Country Max Period Count] from it. We are almost done. Just declare a new calculated measure called [Measures].[Common Latest Month Sales] and see it in cube browser.

CREATE MEMBER CURRENTCUBE.[Measures].[Common Latest Month Sales]

as

sum(nonempty([Dim Month].[Id].[Id], [Measures].[Fact Country Max Period Count]).item(0), [Measures].[Sales])

,VISIBLE = 1 ;

LastMonthSalesCountry LastMonthSalesRegion

This time, Europe and Mid East is showing total sales of Nov-2014 and Oct-2014 respectively.

Posted in MDX, SSAS (Analysis Service), Cube | Tagged , , , , , , , | Leave a comment