Always Use Explicit Measures - Wait, What Is An Explicit Measure?

All of your analytics in Power BI are done with measures, and there are two ways to create measures. It is important to understand the difference between the two methods. If you drag a field from the data pane directly into a visual, that is an implicit measure. The model usually wraps it with SUM(), but other aggregations are possible, like COUNT() or AVERAGE().

Creating an implicit measure

In the above image, I am dragging the Profit field to the table, and Power BI is using the SUM() aggregation. The query Power BI uses to actually generate this table visual is below:

Implicit measure in the DAX query

You can see it wrote SUM(‘financials’[Profit]) and wrapped it with CALCULATE(). That CALCULATE() is important later. Side note: If you’ve ever heard someone say they don’t use CALCULATE() in their reports, that is just nonsense. Whether you use the CALCULATE() function in a measure or not, Power BI uses it all of the time.

An explicit measure is when you explicitly write a measure using the DAX functions you need. For example, Total Profit = SUM(‘financials’[Profit]) is an explicit measure. Once created, I would then drag [Total Profit] to my visuals. The DAX query when using an explicit measure might look like this:

Explicit measure in a DAX Query

Here it is using the [Total Profit] measure. Notice there is no CALCULATE() function in the query. That is because when you refer to a measure, there is an implicit CALCULATE() already. It is as if the above said CALCULATE(‘Financials’[Total Profit]). Got it?
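Since the screenshots don't come through in text, here is a simplified sketch of what those two generated queries roughly look like. The grouping column ('financials'[Country]) is assumed purely for illustration; the real queries Power BI writes are more verbose.

-- Implicit measure: Power BI wraps the column aggregation in an explicit CALCULATE()
EVALUATE
SUMMARIZECOLUMNS (
    'financials'[Country],
    "Profit", CALCULATE ( SUM ( 'financials'[Profit] ) )
)

-- Explicit measure: the measure reference already carries its implicit CALCULATE()
EVALUATE
SUMMARIZECOLUMNS (
    'financials'[Country],
    "Total Profit", 'financials'[Total Profit]
)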


  • When you use an explicit measure, the DAX query for your visual will have an implicit CALCULATE().

  • When you use an implicit measure, the DAX query for your visual will have an explicit CALCULATE().


Don’t worry. That is as complex as this article gets.😉 It just helps to understand what goes on under the hood.

Note that Power BI always puts the table names in front of measures within the queries it generates. This is a bad practice for you and me. Never do this when writing measures, calculated columns, or calculated tables. The rule (with a quick example after the list) is:

  • When referencing a measure, always exclude the table name.

  • When referencing a column, always include the table name.
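For example, applying both rules in a pair of measures (the [Total Sales] measure here is just an assumed example):

Total Profit = SUM ( 'financials'[Profit] )  -- column reference: include the table name

Profit Ratio = DIVIDE ( [Total Profit], [Total Sales] )  -- measure references: no table names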

What Power BI does when building its own queries is its business. You'll still need to know how to read that code, though: in the query above it looks like it is referencing a Total Profit column, but it isn't; it is the [Total Profit] measure with the table name in front. Ok, enough of that tangent.

So, what difference does it make whether you use implicit or explicit measures? Here are the differences:

Advantages of implicit measures:

  1. Easier when adding a field to a visual.

Disadvantages of implicit measures: (not comprehensive)

  1. You cannot edit the measure to change the calculation. Say you dragged the ‘financials’[sales] field into a visual and it created the implicit SUM(‘financials’[sales]) measure. You cannot change it beyond picking a different aggregation type (COUNT, AVERAGE, etc.). If you later need sales to be a net sales calculation that subtracts cash discounts, you cannot do it there. You now have to write an explicit measure that might look like Net Sales = SUM(‘financials’[sales]) - SUM(‘financials’[cash discounts]), then remove the implicit sales measure from every visual that was using it and replace it with the new business logic. That is a lot more work than simply editing an explicit measure had you created one in the first place. Be kind to your future self.

    1. This particular problem magnifies itself if you have added conditional formatting to that field. Once you swap the implicit measure out for an explicit measure, you have to redo all of the conditional formatting!

  2. You cannot build off of it. The beauty of measures is you can create new measures that use other measures, and if you change the original measure, the dependent measures will automatically take in that new logic.

  3. Implicit measures do not work with Calculation Groups.

The Calculation Group thing is HUGE. What is the problem here?

I’m not going to do a deep dive into Calculation Groups. Others have already done a fantastic job here, so go read their blogs.

Consider the following visual. It has two measures.

The Quantity column is an implicit measure. I just dragged the Sales[Units] field into the visual and called it Quantity. The Total Units column is the explicit measure Total Units = SUM(Sales[Units]).

Now, I want to have another set of columns that will show me the units in the same period last year. I could create a new measure for this, or I could use a Calculation Group. The latter is more versatile, so I’ll do that. Using Tabular Editor I created a Time Intelligence Calculation Group with Current Period and Prior Year Calculation Items. Here is the full DAX for how that will work. Again, this isn’t an article to teach you how to create Calculation Groups. I’m focused on the downsides of implicit measures.

Logic in the Calculation Group
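The screenshot doesn't reproduce here, so as a rough sketch, the two Calculation Items typically look something like this, assuming a Date table with a 'Date'[Date] column (illustrative, not the exact code in the image):

-- Calculation Item: Current Period
SELECTEDMEASURE ()

-- Calculation Item: Prior Year
CALCULATE (
    SELECTEDMEASURE (),
    SAMEPERIODLASTYEAR ( 'Date'[Date] )
)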

When I drop that Calculation Group to the Column section of the matrix, I get this visual:

Do you see the problem? The Current Period columns are fine. But look at the Prior Year columns. The Quantity column is showing the same as the Current Period. That is wrong. But the Total Units measure correctly shows the units from the prior year. Why is that?

It is the SELECTEDMEASURE() function in the Calculation Items shown above. That is how Calculation Groups work their magic. They have but one flaw. They cannot see implicit measures, so they ignore them. Power BI protects you from this though. When you create a Calculation Group, the model flips a switch that blocks the creation of additional implicit measures. Notice the DiscourageImplicitMeasures property. By default this is set to FALSE, but when you add a Calculation Group to a model, it gets set to TRUE.

Discourage Implicit Measures enabled

Now if you drag a field to a visual expecting a quick calculation, you won't get one. It will either just show the values in that field, or block you from adding it to the visual outright. The property is called “Discourage Implicit Measures” and not “Block Implicit Measures” because the tabular model cannot foresee every third-party client that might work with the model, but for Power BI Desktop, or for reports authored in the service, it is effectively a block.

So we’ve seen implicit measures are fast and easy to create, but have a number of shortcomings. I know it takes an extra 30-45 seconds to create an explicit measure, but trust me, creating explicit measures is a best practice. It will save you work in the future, and if you introduce Calculation Groups to your model, it will help ensure they return the expected results.

Tips for Debugging your DAX Code

When trying to get your DAX measures to work correctly, there are a number of tools you can use to debug, which I will briefly mention, but not go into depth on. Instead, this post is about how to think about debugging your code. I am focusing on DAX here, but many of the methods would apply to any type of code.

First, the tools you can use. This is not a comprehensive list, just something to get you started.

  1. Use Variables. Variables have several benefits:

    1. Variables are processed one time and can be reused later in the measure in multiple places. This can speed up your code.

    2. You can return a variable result to the visual itself. This allows you to see what each component is doing to ensure each part is calculating correctly.

  2. CONCATENATEX - If a variable returns a table, you cannot return that variable directly to the visual. A measure requires a scalar (single) value. CONCATENATEX will convert a table to a scalar value so you can see everything in that table. You can also use the newer TOCSV or TOJSON functions, which do similar work in different ways. TOCSV and TOJSON are the underlying functions that help EVALUATEANDLOG return results, and aren’t really meant to be used in a production model, but they are handy in debugging. (A short sketch using variables and CONCATENATEX follows this list.)

  3. EVALUATEANDLOG - This is a relatively new function that will generate a ton of output for each expression you wrap with this function. There is a massive 4 part series on EVALUATEANDLOG here, including links to download the free DAXDebugOutput tool that will be necessary to read what this function returns. Note: Be sure to remove EVALUATEANDLOG once you are done. It will slow your code down. You should never publish a report that has this function in it.

  4. Tabular Editor 3’s Debugger - This is simply the luxury vehicle for debugging DAX. More about this tool is here.
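To make the first two items concrete, here is a small sketch: a measure built with variables so you can return any intermediate result while debugging, plus a throwaway measure that uses CONCATENATEX to flatten a table into text. The [Total Sales] measure and the 'Date' columns are assumed names for illustration only.

Sales vs Prior Year =
VAR CurrentSales = [Total Sales]
VAR PriorSales =
    CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
VAR Delta = CurrentSales - PriorSales
RETURN
    -- While debugging, temporarily return CurrentSales or PriorSales here
    -- instead of Delta to confirm each piece in the visual
    Delta

-- A measure cannot return a table, but CONCATENATEX can flatten one into text for inspection
Months In Context =
CONCATENATEX ( VALUES ( 'Date'[Month] ), 'Date'[Month], ", " )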

Instead of an in-depth look at these tools though, I want to look at the thought process when debugging and offer some tips.

First, when debugging DAX, be sure to do it in a Table or Matrix visual. You need to be able to see what numbers are being returned for each value in the columns and rows. This is impossible to do efficiently in a bar or line chart. To quickly convert your visual to a table or matrix, make a copy of it in Power BI, then while it is selected, select the Table or Matrix visual icon in the Visualization Pane. You might need to move fields around to get the correct fields in the Rows, Columns, and Values section.

Next, make a note of any filters you have set in the Filter Pane, then clear or remove them. You need to understand what your measure's core calculation is doing before any filters are applied. This may not be the result you want, but you have to start with the unfiltered result. For example, you might only be interested in the average of the last 12 months of sales for Texas, but if the [Average 12 Months of Sales] measure is wrong at the global level, having your report filtered for United States is only muddying the issue. Make sure that figure is correct globally first. Then you can apply filters. If you apply the United States filter and Texas is still wrong, it is possible your measure is fine, but you have a problem in the data - zip codes assigned to the wrong state, for example. But you will never figure that out if you have no confidence the measure itself is correct. When debugging DAX measures and queries, we take for granted that our data is good. That isn't always the case, so keep that in mind.

We’ve all built reports where we create 3-4 measures, then create several measures with new logic that include those original measures, and before you know it, you have 40 measures with a complex web of dependencies. And the measure you are debugging references 8 of those measures.

dependency view from Tabular Editor

As you can see above, a single measure can have many, perhaps dozens, of fields and measures it is dependent upon. When this happens, I generally tear the measure down to its component parts, even replicating code from other measures into the one I am debugging. Often I find code in one of those dependencies that works great for that particular measure in a specific chart, but does not work well inside the measure I am debugging. Context transition can be invoked at the wrong granularity (the subtotal or total level) and return undesirable results, or the dependency may have a particular filter applied or removed using CALCULATE. Start simple, and rebuild it. Once you get it working, you can replace the duplicated code with the measure it came from. Sometimes, though, you might find your new code runs much faster. Iterators like SUMX and AVERAGEX often benefit from streamlined code vs using measures. As already noted, using measures in iterators can produce undesirable results at the total level. Look at these two examples:

Average Sales per KG = AVERAGEX(Sales, Sales[Units] * Sales[Price] / RELATED(Products[KG]))

Average Sales per KG = AVERAGEX(Sales, [Total Sales] / [Total KG])

If you put both of those measures in a table broken out by invoice number, they will both produce the correct results at the detail level. However, the second example, which uses measures, will not produce correct results at any sub-total or grand total level. It will total the sales for the entire table, then divide that by the total of the kilograms - a useless result. (Ignore that I am using the / symbol vs DIVIDE(). I am going for clarity here, not best practice.)

So here, reusing measures actually causes a problem. That is not often the case. I am a huge fan of reusing measures, but it isn't always the answer, and it can make debugging more challenging. Never assume your dependent measures are correct, or at least correct for your current needs.

Next, take a step back and try to understand the question being asked. Have the question phrased in plain business language. For example, here are two ways a report user might present a request to you:

  1. I need to get the total sales where the Relative Date is 0 through -7.

  2. I need to get the total sales from today going back 6 days.

Those are not even the same question, and there are several ways to interpret #2.

  1. You’ll notice that #1 is a range of 8 days. So if you are getting that type of request from a user or co-worker, you might not think to push back. Instead, you may give them a simple measure with a filter predicate like ‘Date’[Relative Date] <= 0 && ‘Date’[Relative Date] >= -7. That filter is faithful to what was asked, but it is likely answering the question in the wrong way.

  2. When the person says “today going back 6 days,” what does “today” mean? Today’s calendar date, or the latest date in the model, which might be a few days ago? Using TODAY() in a measure and then walking back 6 days will give the most recent 7 calendar days, but the last 2-3 days might have no values if there are weekends or holidays involved. That might be correct. It might not be. A sketch of both readings follows this list.
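To make the difference concrete, here is a sketch of the two readings of #2, assuming a Date table with a 'Date'[Date] column, a Sales[Date] column, and an existing [Total Sales] measure. The names are illustrative, not from the report above.

-- "Today" = today's calendar date
Sales Last 7 Days (Calendar) =
CALCULATE (
    [Total Sales],
    DATESBETWEEN ( 'Date'[Date], TODAY () - 6, TODAY () )
)

-- "Today" = the most recent date with data in the model
Sales Last 7 Days (Data) =
VAR LastDataDate =
    CALCULATE ( MAX ( Sales[Date] ), REMOVEFILTERS () )
RETURN
    CALCULATE (
        [Total Sales],
        DATESBETWEEN ( 'Date'[Date], LastDataDate - 6, LastDataDate )
    )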

The point is, understand in plain English what your user needs in the visual. Then you can start to craft the code to return that value. But if they start talking all DAXy to you, you might be debugging the wrong logic.

Do it in Excel first. Or have the user do it in Excel. Sometimes pointing directly to cells will allow you to understand the logic. You can rarely convert Excel formulas to DAX, but once you understand the logic of what is happening, writing the DAX becomes much easier. And you will also have the benefit of the correct answer in Excel you are trying to match.

You can also write it out on paper, or better yet, if you are on site, use the biggest whiteboard you can find and draw the logic out there.

Sometimes you just need a break. Go for a walk. I have solved tons of problems a few miles from home on my morning walks. Staring at the same code can reinforce that bad code in your head until you cannot see a way around it. Stepping away removes that barrier.

Ask for help. I cannot stress this enough. Innumerable times I’ve asked for help and, in just talking through the problem, found a new path myself without the other person having spoken a word. Some people call this “talking to the rubber duck.” Put a rubber duck on your desk and when you get stuck, tell the duck what the issue is. Talking out loud forces you to go much slower than thinking it through, and that alone helps.

Of course, asking someone can reveal the answer or a path forward. Ask a co-worker. Many organizations have Power BI or data analytics groups in Teams, Yammer, Slack, or other collaboration tools. Don’t be afraid to jump in with the question or ask for a meeting where you can do a screen share. I have helped others with super basic questions that someone new to Power BI would ask, and I’ve been on 2-3 hour screen shares where a co-worker and I untangled a gnarly problem. Sometimes it was my problem, sometimes theirs.

There are also the public forums, like the Power BI Community. In those communities it is especially important to give adequate sample data in a usable format - CSV or XLSX - and a mock-up of the desired results. Don’t take shortcuts with your question. The people you are talking to know nothing about your scenario; you can take shortcuts with co-workers who know the data and the company, but forum members don’t have that context, so explain the issue carefully.

One of the latest fads is to ask ChatGPT. That is ok as long as you don’t trust the code for the correct answer. I’ve found ChatGPT doesn’t always return code that even works, let alone code that returns an accurate result. However, it can help directionally, and suggest functions, or different ways to shape and filter the data you might not have thought of.

Regardless of where you might get an answer, verify it is correct. It may return a number, but verify it is the correct number.

Hopefully the above tips will help you rethink how you approach a problem you are stuck on. Of course, learn the debugging tools and techniques mentioned above. The best time to learn those tools is when you don’t need them. Have a complex measure that already works? Awesome. Throw a table visual with that measure into Tabular Editor 3 and see how the debugger works, or wrap some of the variables in EVALUATEANDLOG and see what the DAXDebugOutput tool generates. Then when you do have a problem the output from these tools will be familiar to you.

Why You Should Avoid Calculated Columns in Power BI

First off I want to make sure it is clear that I am not saying “never use calculated columns.” I am not. What I am saying is they should be avoided, and there are several reasons. I am going to focus on the model size, and thus performance. Note that the data I am going to show is contrived, for demonstration purposes only.

Power BI, which includes Power Pivot models and SSAS Tabular models, compresses the data in tables as it is loaded to the service. It never uncompresses that data. All calculations are done on the compressed data. If you want to know how this works, I suggest you read Chapter 17 in The Definitive Guide to DAX, Second Edition. It does a thorough job of explaining how the VertiPaq engine works.

Here is the scenario. I have a 200M record table with three columns:

  1. A list of numbers from 0 to 100,000,000 in 0.5 increments. This has 200M unique values.

  2. A list of numbers that rounds those numbers to whole numbers. It has 100M unique numbers. 0.5 and 1.0 from the first column are 1 in the second, 1.5 and 2.0 are 2 in the second, etc.

  3. A third column is a text value of “Even” or “Odd” that is based on the second column. This field only has 2 values.

The column to pay attention to is the [Even or Odd] column. I created this column two different ways. First in Power Query and then as a Calculated Column in DAX. This is important to understand.
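For reference, the Calculated Column version is a one-liner along these lines (the table name Numbers is assumed; the Power Query version adds an equivalent custom column before the data is loaded):

Even or Odd =
IF ( MOD ( Numbers[Integer], 2 ) = 0, "Even", "Odd" )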

All data in the model is compressed, but only imported data is taken into account when the engine chooses the sort order that produces the best possible compression. So when I loaded the table with [Even or Odd] as an imported column, the VertiPaq engine considered that column when compressing the data. When I created it as a Calculated Column, it was still compressed, but only the [Index] and [Integer] columns were taken into account for the best possible compression. Compression optimization happens at import and load. The data does not get re-sorted and re-compressed when you add new columns.

Using the VertiPaq Analyzer from DAX Studio, we can see how much data each of these columns consumes.

Focus on the Total Size column. There are only 2 values in the [Even or Odd] column - “Even” or “Odd.” Note: The scenario gets a bit more complex here as these text values create a dictionary, which is a further optimization I am not going to get into here because the effect is the same on both tables - again, see the Definitive Guide to DAX for more on this.

On the bottom table, called “Imported”, the [Even or Odd] column consumes roughly 18KB. Remember, there are 200M records here. However, the VertiPaq engine looked at this column, realized it had a low cardinality of only 2, and sorted by that column first. This means theoretically it could represent the data with just two values - “Even” for the first 100M records and “Odd” for the second 100M records - and that is, in effect, what it did. At 18KB that is certainly more than 2 text values, but again, this gets deeper into how VertiPaq splits the data, and this 200M record table is broken up into 191 segments in Power BI. Again, read the book.

Now look at the top table - Calculated Column. In this table, the VertiPaq engine only looked at the [Index] and [Integer] columns when optimizing for compression, and there isn’t a lot of compression going on here. The [Index] column isn’t compressed at all. There are 200M unique values. The [Integer] column can be compressed, but not by much, as there are still 100M unique values, which is why it is only a bit more than half the size of the [Index] column - roughly 0.86GB vs 1.49GB.

But the [Even or Odd] column had to settle for whatever sort order and compression VertiPaq chose for the imported columns. As a result, even with only 2 distinct values, it is almost 25MB in size, or 1,338 times larger than the imported version!

What does that mean in practical terms? As one example, if you do any calculations based on the [Even or Odd] column, the computer hosting the model has to use 25MB to hold the Calculated Column version vs 18KB for the imported version, and calculations will be faster on the smaller column. Again, this specific example is contrived, so the speed difference between 18KB and 25MB will not be very large, or perhaps even noticeable.

Note too that when processing a model, the entire model is loaded into RAM. This is very different from how relational databases like SQL Server or Oracle work. They only load what they need from disk to do the necessary calculations. VertiPaq doesn’t work that way. It loads it all.

Now consider what happens if you add 10 more columns to each of these tables. Doing it using DAX will greatly increase the size of the table vs doing it in Power Query or in your data source.

And that is the point. Calculated Columns will increase the model size, which in turn increases the amount of RAM needed to hold the model, and the time it takes to run calculations on it.

There are other reasons to transform your data in Power Query or further upstream. In this example, I could have removed the “Integer” column from the table before loading, keeping only two columns. You cannot do that with DAX. You can add to the model, but you cannot take away without destroying data and any Calculated Columns that depend on it. This is why I avoid at all costs any LEFT/RIGHT/MID logic in a Calculated Column. I get what I need in Power Query, then remove the source column. It saves space, refreshes are faster, and calculations are faster.

So when should you use a Calculated Column? Here are the questions I ask myself:

  1. Am I transforming data or adding data based on other columns? Do this in Power Query or earlier. This is often possible. Exceptions are when the DAX is looking at multiple groupings of data via a SUMMARIZE() or data in other tables. Power Query is inefficient here, but it may still be possible in an earlier source like SQL Server. This is especially important if you don’t need the original data you are transforming. You can delete unnecessary columns in Power Query when loading data. You cannot delete columns in the model without breaking Calculated Columns that rely on that data.

  2. Does this need to be in a slicer or filter? This has to be a column - it cannot be a measure. But if it can be done in PQ or earlier, do it there.

  3. Is the formula really complex and won’t work in a measure? This usually isn’t a good reason. I’ve yet to see a DAX formula in a Calculated Column that could not be reworked into a measure. Calculated Columns have row context and not filter context. Measures are the opposite. They have filter context and not row context. This is where it becomes necessary to really understand evaluation context, how to create row context in a measure using iterators, and when to invoke context transition via CALCULATE. It can be done; it just requires some work. Reach out for help in the Power BI community or talk to co-workers. (A short sketch follows this list.)

  4. If the formula in a measure is inefficient, then a Calculated Column helper column may be necessary. I had a particularly gnarly measure that used PERCENTILE.INC() three times over three groupings of data with SUMMARIZE(). The measure worked, but was horribly slow. PERCENTILE.INC() required the formula engine, and that was simply inefficient over 20M records as users filtered data. It was also impractical to attempt to do this in Power Query or the source systems, some of which were just CSV files. So I let a Calculated Column do the heavy lifting by moving all of the PERCENTILE.INC logic there, reducing the visual rendering from over 8,000ms to under 500ms when done. But it was a last resort, not the first thing I reached for.

  5. Am I just developing right now? I use Calculated Columns every day during development. They are super easy for analyzing data and starting the logic of the DAX code. But I make it a point to convert them to imported columns or measures, and to remove the Calculated Columns, before I even push a model to the Test environment.
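To illustrate point 3 above, here is a sketch of moving row-by-row logic out of a Calculated Column and into a measure by creating the row context with an iterator. It reuses the Sales[Units] and Sales[Price] columns from the earlier example purely for illustration.

-- Calculated Column version (row context comes for free on each row of Sales):
-- Line Sales = Sales[Units] * Sales[Price]

-- Measure version: an iterator supplies the row context instead
Total Line Sales =
SUMX ( Sales, Sales[Units] * Sales[Price] )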

Remember, Calculated Columns aren’t inherently bad. They are there for a reason. But they aren’t meant to be used like you’d use a column in Excel. Most of the time, I find they are not needed and have adverse impacts on the model, but there are definitely use cases for them and some of my models have them for very particular reasons. The key as a good Power BI developer is knowing when to use them and when to avoid them.

Working With Sparklines In Power BI - All About The Filter Context

With the December 2021 version of Power BI Desktop, as well as authoring reports in the Power BI Service, we can now add Sparklines to our Table and Matrix visuals. With just a few clicks you can make visuals that look like this:

This isn’t a tutorial on how to create sparklines as Microsoft has you covered there. What I want to discuss though is how Sparklines are impacted by the filter context in a visual. First though you need to enable this feature. It is in preview mode at launch, so go to File, Options and Settings, Options, and look for the Preview features section. Check Sparklines, then close and restart Power BI Desktop.

In the visual above, I added the “Month” field to the Sparkline settings and that created an additional filter on the visual. Before I added the Sparkline, Year was the only filter affecting the numbers. By adding the month to the sparkline, shown below, it breaks out the Total Net Sales figure by month.

But what if I don’t want the sparkline to show the full range of data? For example, I just want the sparkline to show the last 3 months of the year, so 3 data points, not 12. As a measure, this is simple.

Sales Last 3 Months =
CALCULATE(
    [Total Net Sales],
    'Date'[Month] IN { 10, 11, 12 }
)

As you can see, this returns the correct value as a value in the matrix, but the sparkline created for it using months is flat. Each of the data points is the same, and there are 12 of them, not 3. What is going on?

This is CALCULATE manipulating the filter context. When you apply a filter inside CALCULATE, it does several things. This article isn’t a deep dive into filter context, but the key issue here is that when I told the measure above to only show me the rows where the ‘Date’[Month] column is between 10 and 12, it replaced the existing filter on that column coming from the Date table, which for months would be the values 1-12. You can see that here if I add the Month field to the table.

When only the Year is showing in the visual, you don’t see this. But adding the Month field makes it clear what the model is doing. When I added the Month field to the sparkline setting, the exact same thing happened. Now, if you are pretty good at DAX, you’ve already figured out how to fix this. What we need to do is restore the filters on the Sales Last 3 Months measure so the total remains $2.7M, but the individual monthly amounts for October - December are retained. There are several ways to do this. First I’ll do it by restoring the values for month within the measure, and the VALUES() function is perfect for this.

Sales Last 3 Months =
CALCULATE(
    [Total Net Sales],
    'Date'[Month] IN { 10, 11, 12 },
    VALUES( 'Date'[Month] )
)

Now for each row in the matrix, it looks at the month. For January, there is a 1. VALUES(‘Date’[Month]) returns 1, but the predicate ‘Date’[Month] IN {10, 11, 12} is returning only 10 - 12. Remember that filters in DAX are tables, so it is a mini-table of 10, 11, and 12. And multiple filters in CALCULATE are combined using AND logic. So the table of 10, 11, 12, AND the table of 1 from the VALUES(‘Date’[Month]) function, returns nothing, because they don’t intersect at all.

This process repeats, and gives us blanks for months 1-9. But in October, VALUES(‘Date’[Month]) returns 10. That with AND logic for 10, 11, and 12 returns 10. So now it will show October sales. The same for months 11 and 12. We get this:

Now we have the correct value for months 10-12, and nothing for months 1-9. The total is still $2.7M, so it is working. Let’s remove the Months field from the visual and take a look at the sparklines now.

Bingo. That is exactly what we were looking for. Using VALUES is the easiest way for me to visualize what is going on in the model, but there is a better way. There is another function in DAX called KEEPFILTERS that is most often used within CALCULATE. Change the measure to this:

Sales Last 3 Months =
CALCULATE(
    [Total Net Sales],
    KEEPFILTERS( 'Date'[Month] IN { 10, 11, 12 } )
)

KEEPFILTERS() will keep CALCULATE() from overwriting the filter for the ‘Date’[Month] column, and, well, keep the filters from that column, which is being supplied by the sparkline settings.

Sparklines have been a long-requested feature, and I’m glad to see they have finally arrived. If you see unexpected results though, step back and think about the filter that the sparkline added, and check whether a CALCULATE in that field is impacting how the filters work. KEEPFILTERS() is likely the way to fix it.

Be Careful When Filtering for Blanks in DAX

It is often the case in Power BI you need to filter your data to include or exclude rows where there are blanks. Be careful though, because you might get unexpected results. Consider this table:

Blanks0.png

It has 11 records, 2 of which are blank, and 3 that have $0 sales. Why would I have both blanks and 0’s in the same table? It depends on the data. Maybe blanks represent orders that are currently not finished or were improperly filled out, whereas $0 represents orders where a customer ordered an item, but we were out of stock, so sales are $0 and we want that recorded. Let’s assume the data is correct, but we want to report on it in different ways.

Here is how you might try to count the number of records that are not complete - the blanks.

COUNTROWS( FILTER( 'Sales', 'Sales'[Sales] = BLANK() ) )

However, this may return different results than expected.

Blanks1.png

You might expect 2, but instead it returns 5. Why? If you want a deep dive into how Power BI handles blanks, see this excellent SQLBI article. The quick answer though is that when DAX evaluates BLANK() in a comparison like this, it considers it to be equal to zero. So it is combining the 2 blank and 3 zero-value records, thus 5 records.

The proper way to filter for blanks in this case is to use the ISBLANK() function.

Blanks2.png

Conversely, if you want to exclude only blanks, you wouldn’t write <>BLANK() in your filter, but you’d use the NOT operator.

Blanks3.png
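For reference, the two corrected patterns look roughly like this (the measure names are illustrative):

Incomplete Orders =
COUNTROWS ( FILTER ( 'Sales', ISBLANK ( 'Sales'[Sales] ) ) )

Orders Excluding Blanks =
COUNTROWS ( FILTER ( 'Sales', NOT ISBLANK ( 'Sales'[Sales] ) ) )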

If you run across a model where someone has used =BLANK() or <>BLANK(), be it in a measure or calculated column, be sure those results are correct. It is possible they knew it would also include or exclude zero values, but more than likely, they were unaware of this. When I use =BLANK() in my measures knowing it will include zero values, I put a comment in the measure explaining this is the expected result, mainly so future me doesn’t see that and think past me messed up somehow. 😀

You can download a copy of the PBIX file I used here.

Quickly Format All DAX Code in Tabular Editor

If you aren’t using Tabular Editor to format all of your DAX code for you, you are missing out. I’ve written a super short script that will format all of your DAX code in measures, and make it look picture perfect in Power BI Desktop.

First, you need to download Tabular Editor. The free version is here. There is a commercial version, which I highly recommend, available here. If you don’t have Tabular Editor yet, start with the free version for now. This is the code you need:

/* Cycle over all measures in the model, format them using DAX Formatter
   with Short Lines, then add 1 line feed to the start of each measure */

FormatDax(Model.AllMeasures, true);

foreach (var m in Model.AllMeasures)
{
    m.Expression = '\n' + m.Expression;
}

In Tabular Editor 2.x, paste that code in the advanced scripting window as shown. Tabular Editor 3 users just click on the New Script button on the ribbon and paste the code there. For both, press the green Run button.

TabularEditor2.png

This will do 2 things.

The FormatDax() function will take all of your DAX code for measures only, and send it to DAXFormatter.com for formatting, then get the results. It does this in one big batch to lessen the load on the DAX Formatter servers, vs hitting it 100 times if you have 100 measures. The ‘true’ parameter will use the Short line format. If you like long lines, just get rid of the “, true” part. But really, who likes long lines? 😉 In both versions it will tell you how many changes were made to the model in the status bar at the bottom.

The ‘foreach’ loop will then go through all of the measures and add a line feed. An example of the result as seen in Power BI Desktop is below.

FormatDAXTabularEditor1.png

Without the line feed added, it will start the measure on the top row, and it looks like this:

FormatDAXTabularEditor2.png

It still works, it just doesn’t look as nice. The line feed forces the measure to start on the row under the measure name.

Now you need to save these changes back to Power BI Desktop, or in the case of TE/3, to a dataset via an XMLA endpoint in a premium workspace, or SSAS/AAS models depending on your license. Just click the icon that has the floppy disk on top of a cube. (It is 2021. Will we ever get rid of the floppy disk icon for save operations?)

FormatDAXTabularEditor4.png

To save this script in Tabular Editor 2, either keep a copy of the text somewhere like Evernote or OneNote, or save it to a file. Tabular Editor 3 users can save it as a macro so it is always available to you. This is one of the productivity enhancements of TE/3.

FormatDAXTabularEditor3.png

You’ll notice I said it does this for measures. It will not format DAX in Calculated Columns nor in Calculated Tables. That is currently a limitation of what parts of the model Tabular Editor is able to change.

Now you have no excuse for having poorly formatted measures.

This is just one of the many features of Tabular Editor, but it makes it worth getting even if you are new to Power BI. Formatted DAX code is so much easier to read, understand, and edit, and now it is a simple button press away.

Add a Refresh Time Stamp To Your Power BI Reports

It is often useful to tell your users when the report refreshed so they have some idea of how current, or stale, the data is. Unfortunately, you cannot just add a TODAY() formula to your report as that will change each time they open the report. This method will give you the refresh time stamp and compensate for Daylight Savings Time.

First, let’s get the time and date in Power Query. This time will only change on the report refresh, which is what you want.

  1. Open Power Query. You can do this using Get Data on the main Power BI menu, and we want a Blank Query.

  2. Click on Advanced Editor on the Home ribbon of Power Query.

  3. Remove the default text.

  4. Replace it with this code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VZDLCQAhDAV78Sxonv9axP7bWLNPMN4mDMYhc7oUBAFRmvNOJBTy8r8RnTpNJh8TdRo0iUzT94BIIeTzRBdAaBr5GF0A0FTyMZqGdNM2mwDkG7CZZuhQKEA2ZdWI+pQ1U9ae/7v5v9vTYNzTYNiyFG/Z5rU+", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [dtDSTStart = _t, dtDSTEnd = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"dtDSTStart", type date}, {"dtDSTEnd", type date}}, "en-US"),
    varCurrentDate = DateTime.Date(DateTimeZone.SwitchZone(DateTimeZone.LocalNow(),-8)),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [dtDSTStart] < varCurrentDate and [dtDSTEnd] > varCurrentDate),
    varDSTOffset = Table.RowCount(#"Filtered Rows"),
    #"Last Refresh Date" = #table(
        type table
            [
                #"RefreshDate"=datetimezone
            ],
        {
            {DateTimeZone.SwitchZone(DateTimeZone.LocalNow(),-8 + varDSTOffset,0)}
        }
        )
in
    #"Last Refresh Date"

Wow. What is all of that? Just to get a refresh timestamp? Yes. Let’s look at the steps.

Source: This is the Daylight Savings Time Start and End dates for the US. You will need to change this for your region as applicable. This goes through 2030 right now, and is subject to change at the whims of politicians.

You can manually edit this table by clicking on the gear icon in the Source step.

You can also get rid of all of that and point the source to a table you have somewhere. A table on your SQL server, a SharePoint List, Excel File, whatever. Swapping out the code for that is beyond the scope of this article. I wanted to provide a complete solution in one query without having to rely on external data sources. I know there are websites with this data you can link to, but I don’t like relying on websites for my reports, and don’t want to add those sites to my on-prem gateway just for a DST table.

Changed Type: This just changes the data keyed in the Source line to dates. It is always text by default. Those are in the US Date format and I told Power Query to use the en-US culture setting to do the type conversion.

varCurrentDate: This gets today’s date, with an offset for my UTC timezone, which is -8 - Pacific Time. You’d need to use Google for yours, or use this handy article on Wikipedia. You can use a decimal number here, so if you are in the Marquesas Islands, your offset is -9:30, which means you would use -9.5 here.

Filtered Rows: This returns 1 row if the varCurrentDate variable falls between the DST start date and end date, and 0 rows otherwise.

varDSTOffset: This is a variable that counts the rows returned in the Filtered Rows step. It is either 1 or 0. This will add 1 or 0 hrs to the time calculated next, adjusting for DST.

Last Refresh Date: This uses the formula below to get the current date and time of the refresh, adjusting for the varDSTOffset calculated above. It then puts it into a table and changes the type to “datetimezone” so it can actually be loaded into the DAX data model. Note, again, you’ll need to adjust that -8 to your UTC offset.

=DateTimeZone.SwitchZone(DateTimeZone.LocalNow(),-8 + varDSTOffset,0)

The final result is a date with a timezone offset that looks like this. Remember, I am in the -8 Pacific Timezone, and DST is in effect, so it added 1, making the net UTC Offset -7.

Note: If you do not observe DST in your area, just remove the +varDSTOffset part of the formula. You can also remove about 90% of this code, but I’ll leave that to you. Leaving the DST code in and removing varDSTOffset from your formula will not impact performance at all.

The DAX model will understand that and adjust the time shown to the user. Finally, rename the query to “Refresh Time Stamp” and make sure it is set to load. Then close Power Query and apply your changes.

In DAX, create a new measure:

Last Update = 
     "Last Update on " 
     & UNICHAR(10) 
     & FORMAT(
         MAX('Refresh Time Stamp'[RefreshDate]),
         "MMM DD, YYYY H:MM AM/PM"
     )

Now drop that measure into a card visual. It will look something like this:

Resize and position the card as desired. If you do not like the forced line feed before the date, get rid of the UNICHAR(10) function. You can also customize the format of the time shown using the codes from this article by Microsoft.

Finally, you can right-click on the Refresh Time Stamp table in Power BI and hide it. The measure will still work, but the table won’t clutter up your list of measures, tables, and fields.

One final bit of minutia. The varCurrentDate variable does not adjust itself for DST, so if your report refreshes between midnight and 3am on the day DST switches, it may be off by one hour for that refresh. It will adjust, though, once it runs after the 3am switch.