Creating A Date Dimension
DataHub Consulting, Experts in Analytics, Business Intelligence, and Compliance 310 310Read it in 13 minutes
window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-YFZ1F7T6M6');
window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-YFZ1F7T6M6');Read it in 13 minutes
In this series of articles, we will look at some of the core knowledge for passing the Microsoft PL-300 (Microsoft Power BI Data Analyst) exam. In this article we will cover considerations for a Date Dimension when planning a Power BI data model.
As a Power BI consultant and Microsoft Certified Trainer, I was one of the experts that helped Microsoft create the syllabus for the PL-300 and the DP-500 exams. With that in mind I wanted to focus this article on a topic that will support anyone preparing for the PL-300.
With a Power BI data model, the most important dimension is the date dimension (or sometime called a calendar dimension). Whilst working with organisations on Power BI projects I’ve come across models that don’t have a date / calendar dimension. In these instances Power BI creates it’s own auto datetime tables (that are hidden) to allow for time intelligence. Power BI will create a hidden date table for any date columns in the model. This will work but can lead to performance and in-memory file size issues.
In this article we will look at:
So, let’s get started.
In majority of the cases there is a data source that will have a calendar dimension. With the modern technology this could involve either,
Using a calendar dimension from the source system like serverless or dedicated is, in my opinion, the best way for Power BI data modeling. But there are cases where this is not possible.
There are times where there is no data warehouse or the ability to create a date dimension in the source. When this occurs, there is a need to write the code to create a date table in order to report across a timeline. This also then allows for the use of time intelligence effectively and efficiently.
So, what are the options?
If we can’t utilize a date table in the source data, then there are two options. The first is to use Power Query, and the second is to create a calculated table in Power BI using DAX.
To use time intelligence in Power BI, and to create the calculated measures for YTD, MoM, Last year comparisons etc, there needs to be a date table with a specified column that is a continuous date range. How do we do this in Power BI.
When you select a table, in the Power BI, it takes you to the Table Tools ribbon.
With the table name selected click the “Mark as date table”.
When the Mark as date table dialogue box is visible, then select the date column that will have unique dates. Power BI will do the validation checks mentioned in the requirements above. Once validated, then click the OK button.
Power BI by default creates in the background its own internal date table. This is not visible to the developers and users but will be part of the model. If you are a developer and use the Vertipaq Analyzer tool, then you can identify if auto date/time tables are being created. From the vertipaq analyzer you will also be able to see the performance impact of these date tables.
If like myself, you prefer to manage and use your own date dimension then there isn’t a need for the auto date/time tables. These tables then only create additional overhead from a performance point of view.
If you want to understand more about Power BI Auto date/time tables, then have a look at this great video from the Guy In A Cube team. Reduce Power BI dataset size by 60% with One Checkbox
For more information on the Power BI Auto Date/Time this is a link to the Microsoft documentation Auto date/time in Power BI Desktop – Power BI | Microsoft Learn
When you have a dedicated date table for your time intelligence calculations then there isn’t a need for the Power BI Auto date/time tables. With this setting selected Power BI will create auto date/time tables that can increase the size of the model and reduce performance.
If there is a date table, then deselect this option in the settings. To do this:
File > Options and Settings > Options.
Either on the Global or Current File section select the Data Load page. Deselect the checkbox for the auto date/time.
If you select the Global setting this will be changed for all reports in the future. If the Current file is selected the setting will only apply to that particular report file.
With any Power BI model there needs to be a planning stage. As a Power BI consultant I always conduct a discovery session with the business to understand some core requirements and deliverables.
From this I will understand what time intelligence measures will be required, how much historic data will be in the model. This will tell me the start and end range of the date table. Also, I’m looking what additional columns will be needed in the date table.
There is a tendency to include all possible columns but are these all necessary. Are you making the model more complicated for the business users. Ask yourself the question, do I need some or all columns relating to Year, Season, Quarter, Month, Week, as well as the required date specific column. If you only require year, quarter, month then only include those.
Also consider:
Syntax – USERELATIONSHIP (ColumnName 1, ColumnName 2)
Where:
ColumnName 1 is the column in the table on the many side of the relationship.
ColumnName 2 is the column in the table on the one side of the relationship.
Reference to the Microsoft DAX page
USERELATIONSHIP function (DAX) – DAX | Microsoft Learn
OK, you have got to the stage where you plan to script a calendar date table. There are two options here. Script it in Power Query or as a DAX table. If you choose to use DAX, then from Power BI desktop select the modeling tab and, in the ribbon, select the New Table.
Both options work well with Power BI but there are some pros and cons to each method. Let’s have a look at these.
Start creating your DAX script.
Once that you have created your new table then you can start to script the code for the required columns. With every calendar dimension as mentioned above, there needs to be a continuous column of dates that does not include blanks, and can be set as a data type of date. The easiest wat to do this in DAX is to use one of the two DAX functions. The first is the CalendarAuto function. The second is the Calendar function.
CalendarAuto()
With CALENDARAUTO() it scans the model and will identify the min and max dates. With the max date the CALENDARAUTO() function will default to the end of the max year. So, if the max date in the model was October 15th 2022, the function would use December 31st 2022 as the max date.
From this it will create a table of continuous dates between the Min Date and Max Date.
Syntax – CALENDARAUTO()
For more information on this function CALENDARAUTO – DAX Guide
Calendar()
With the CALENDAR() function you need to specify the start and end dates in the function. The function returns a table of dates between the start date and the end date.
Syntax – CALENDAR( [StartDate] , [EndDate] )
For more information on this function CALENDAR – DAX Guide
For the PL-300 exam it’s important to understand data modeling, relationships, DAX functions, and also performance within Power BI. These are particularity relevant for implementing a date dimension. Every Power BI project that I’ve worked on has required a date table of some sort, so that users can report across timelines and for developers to implement time intelligence calculations. Understanding the requirements in the initial planning stage is key to optimizing the model and correctly setting up a date table.
We do not employ salespeople; our team are all experienced technical specialists that can talk you through any of our services.
Contact us