Table of contents
- Overview:
- Skills measured
- I. Module 1 Discover data analysis
- II. Module 2 Get started building with Power BI
- III. Module 3 Get data in Power BI
- 1. Introduction
- 2. Get data from files
- 3. Get data from relational data sources
- 4. Create dynamic reports with parameters
- 5. Create dynamic reports for multiple values
- 6. Get data from a NoSQL database
- 7. Get data from online services
- 8. Select a storage mode
- 9. Get data from Azure Analysis Services
- 10. Fix performance issues
- 11. Resolve data import errors
- 12. Exercise - Prepare data in Power BI Desktop
- 13. Check your knowledge
- 14. Summary
- Learning objectives
- IV. Module 4 Clean, transform, and load data in Power BI
- 1. Introduction
- 2. Shape the initial data
- 3. Simplify the data structure
- 4. Evaluate and change column data types
- 5. Combine multiple tables into a single table
- 6. Profile data in Power BI
- 7. Use Advanced Editor to modify M code
- 8. Exercise - Load data in Power BI Desktop
- i. Configure the Salesperson query
- ii. Configure the SalespersonRegion query
- iii. Configure the Product query
- iv. Configure the Reseller query
- v. Configure the Region query
- vi. Configure the Sales query
- vii. Configure the Targets query
- viii. Configure the ColorFormats query
- ix. Update the Product query
- x. Update the ColorFormats query
- xi. Finish up
- 9. Check your knowledge
- 10. Summary
- Learning objectives
- V. Module 5 Design a semantic model in Power BI
- 1. Introduction
- 2. Work with tables
- 3. Create a date table
- 4. Work with dimensions
- 5. Define data granularity
- 6. Work with relationships and cardinality
- 7. Resolve modeling challenges
- 8. Exercise - Model data in Power BI Desktop
- i. Create model relationships
- ii. Configure tables
- iii. Configure the Product table
- iv. Configure the Region table
- v. Configure the Reseller table
- vi. Configure the Sales table
- vii. Bulk update properties
- viii. Review the model interface
- ix. Create quick measures
- x. Create a many-to-many relationship
- xi. Relate the Targets table
- 9. Check your knowledge
- 10. Summary
- Learning objectives
- VI. Module 6 Add measures to Power BI Desktop models
- VII. Module 7 Add calculated tables and columns to Power BI Desktop models
- VIII. Module 8 Use DAX time intelligence functions in Power BI Desktop models
- IX. Module 9 Optimize a model for performance in Power BI
- 1. Introduction to performance optimization
- 2. Review performance of measures, relationships, and visuals
- 3. Use variables to improve performance and troubleshooting
- 4. Reduce cardinality
- 5. Optimize DirectQuery models with table level storage
- 6. Create and manage aggregations
- 7. Check your knowledge
- 8. Summary
- Learning objectives
- X. Module 10 Design Power BI reports
- 1. Introduction
- 2. Design the analytical report layout
- 3. Design visually appealing reports
- 4. Report objects
- 5. Select report visuals
- 6. Select report visuals to suit the report layout
- 7. Format and configure visualizations
- 8. Work with key performance indicators
- 9. Exercise - Design a report in Power BI Desktop
- 10. Check your knowledge
- 11. Summary
- Learning objectives
- XI. Module 11 Configure Power BI report filters
- 1. Introduction to designing reports for filtering
- 2. Apply filters to the report structure
- 3. Apply filters with slicers
- 4. Design reports with advanced filtering techniques
- 5. Consumption-time filtering
- 6. Select report filter techniques
- 7. Case study - Configure report filters based on feedback
- 8. Check your knowledge
- 9. Summary
- Learning objectives
- XII. Module 12 Enhance Power BI report designs for the user experience
- 1. Design reports to show details
- 2. Design reports to highlight values
- 3. Design reports that behave like apps
- 4. Work with bookmarks
- 5. Design reports for navigation
- 6. Work with visual headers
- 7. Design reports with built-in assistance
- 8. Tune report performance
- 9. Optimize reports for mobile use
- 10. Exercise - Enhance Power BI reports
- i. Lab story
- ii. Get started – Sign in
- iii. Get started – Open report
- iv. Sync slicers
- v. Configure drill through
- vi. Create a drill through page
- vii. Add conditional formatting
- viii. Add bookmarks and buttons
- ix. Add bookmarks
- x. Add buttons
- xi. Publish the report
- xii. Explore the report
- 11. Check your knowledge
- 12. Summary
- Learning objectives
- XIII. Module 13 Perform analytics in Power BI
- 1. Introduction to analytics
- 2. Explore statistical summary
- 3. Identify outliers with Power BI visuals
- 4. Group and bin data for analysis
- 5. Apply clustering techniques
- 6. Conduct time series analysis
- 7. Use the Analyze feature
- 8. Create what-if parameters
- 9. Use specialized visuals
- 10. Exercise - Perform Advanced Analytics with AI Visuals
- 11. Check your knowledge
- 12. Summary
- Learning objectives
- XIV. Module 14 Create and manage workspaces in Power BI
- XV. Module 15 Manage semantic models in Power BI
- 1. Introduction
- 2. Use a Power BI gateway to connect to on-premises data sources
- 3. Configure a semantic model scheduled refresh
- 4. Configure incremental refresh settings
- 5. Manage and promote semantic models
- 6. Troubleshoot service connectivity
- 7. Boost performance with query caching (Premium)
- 8. Check your knowledge
- 9. Summary
- Learning objectives
- XVI. Module 16 Create dashboards in Power BI
- 1. Introduction to dashboards
- 2. Configure data alerts
- 3. Explore data by asking questions
- 4. Review Quick insights
- 5. Add a dashboard theme
- 6. Pin a live report page to a dashboard
- 7. Configure a real-time dashboard
- 8. Set mobile view
- 9. Exercise - Create a Power BI dashboard
- 10. Check your knowledge
- 11. Summary
- Learning objectives
- XVII. Module 17 Implement row-level security
- Conclusion
- Source: Microsoft Power BI Data Analyst [Link]
- Author: Dheeraj.Yss
- Connect with me:
This article contains info about #MicrosoftPowerBIDataAnalyst and helps you prepare for Exam PL-300.
Foreword:
The entire content is owned by Microsoft; I am logging it for practice, and it is for educational purposes only.
All presented information is owned by Microsoft and intended solely for learning about the covered products and services in my Microsoft Learn AI Skills Challenge: Fabric Analytics Engineer Journey.
Overview:
This course covers the various methods and best practices that are in line with business and technical requirements for modeling, visualizing, and analyzing data with Power BI.
The course will show how to access and process data from a range of data sources including both relational and non-relational sources.
Finally, this course will also discuss how to manage and deploy reports and dashboards for sharing and content distribution.
As a candidate for this "Microsoft Power BI Data Analyst" certification, you should deliver actionable insights by working with available data and applying domain expertise.
You should:
Provide meaningful business value through easy-to-comprehend data visualizations.
Enable others to perform self-service analytics.
Deploy and configure solutions for consumption.
As a Power BI data analyst, you work closely with business stakeholders to identify business requirements.
You collaborate with enterprise data analysts and data engineers to identify and acquire data.
You use Power BI to:
Transform the data.
Create data models.
Visualize data.
Share assets.
You should be proficient at using Power Query and writing expressions by using Data Analysis Expressions (DAX).
You know how to assess data quality. Plus, you understand data security, including row-level security and data sensitivity.
Skills measured
Prepare the data
Model the data
Visualize and analyze the data
Deploy and maintain assets
Course | Microsoft Power BI Data Analyst | Modules: 17 |
I. Module 1 Discover data analysis
Would you like to explore the journey of a data analyst and learn how a data analyst tells a story with data?
1. Introduction
As a data analyst, you are on a journey.
With data and information as the most strategic asset of a business, the underlying challenge that organizations have today is understanding and using their data to positively effect change within the business.
Businesses continue to struggle to use their data in a meaningful and productive way, which impacts their ability to act.
You need to be able to look at the data and facilitate trusted business decisions. Then, you need the ability to look at metrics and clearly understand the meaning behind those metrics.
Data analysis exists to help overcome these challenges and pain points, ultimately assisting businesses in finding insights and uncovering hidden value in troves of data through storytelling.
As you read on, you will learn how to use and apply analytical skills to go beyond a single report and help impact and influence your organization by telling stories with data and driving that data culture.
2. Overview of data analysis
Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information.
The data is then crafted into a story through reports for analysis to support the critical decision-making process.
While the process of data analysis focuses on the tasks of cleaning, modeling, and visualizing data, the concept of data analysis and its importance to business should not be underestimated. The core components of analytics are divided into the following categories:
Descriptive - answers questions about what has happened, based on key performance indicators (KPIs) such as return on investment (ROI) and an organization's sales and financial data.
Diagnostic - answers questions about why events happened.
Predictive - answers questions about what will happen in the future.
Prescriptive - answers questions about which actions should be taken to achieve a goal or target.
Cognitive - draws inferences from existing data, creating a self-learning feedback loop.
As the amount of data grows, so does the need for data analysts. A data analyst knows how to organize information and distill it into something relevant and comprehensible. A data analyst knows how to gather the right data and what to do with it, in other words, making sense of the data in your data overload.
3. Roles in data
Business analyst
Data analyst
Data engineer
Data scientist
Database administrator
4. Tasks of a data analyst
Data preparation - profiling, cleaning, and transforming your data to get it ready to model and visualize.
Data modeling - determining how your tables are related to each other.
Data visualization - your report should tell a compelling story about that data.
Analyze - find insights, identify patterns and trends, predict outcomes, and then communicate those insights.
Manage - manage Power BI assets, such as reports and semantic models, including their distribution.
5. Answers:
A data analyst uses appropriate visuals to help business decision makers gain deep and meaningful insights from data.
An optimized and tuned semantic model performs better and provides a better data analysis experience.
A key benefit of data analysis is the ability to gain valuable insights from a business's data assets to make timely and optimal business decisions.
Learning objectives
In this module, you'll:
Learn about the roles in data.
Learn about the tasks of a data analyst.
II. Module 2 Get started building with Power BI
1. Introduction
Microsoft Power BI is a complete reporting solution that offers data preparation, data visualization, distribution, and management through development tools and an online platform.
Use Power BI to create visually stunning, interactive reports, even when they require complex data modeling, and to serve as the analytics and decision engine for the business.
2. Use Power BI
Power BI Desktop - development tool available to data analysts and other report creators.
Power BI service - allows you to organize, manage, and distribute your reports and other Power BI items, create high-level dashboards that drill down to reports.
The flow of Power BI is:
Connect to data with Power BI Desktop.
Transform in Power Query and model data with Power BI Desktop.
Create visualizations and reports with Power BI Desktop.
Publish report to Power BI service.
Distribute and manage reports in the Power BI service.
3. Building blocks of Power BI
The building blocks of Power BI are semantic models and visualizations.
Semantic model - consists of all connected data, transformations, relationships, and calculations.
4. Tour and use the Power BI service
Workspaces are the foundation of the Power BI service.
Distribute content - Apps are the ideal sharing solution within any organization.
Refresh a semantic model - configure scheduled refreshes of your semantic models in the Power BI service.
5. Answers:
The Power BI service lets you view and interact with reports and dashboards, but doesn't let you shape data.
Without a semantic model, you can't create visualizations, and reports are made up of visualizations.
An app is a collection of ready-made visuals, pre-arranged in dashboards and reports. You can get apps that connect to many online services from AppSource.
6. Summary
Microsoft Power BI offers a complete data analytics solution that includes data preparation, visualization, and distribution.
Semantic models and visualizations are the building blocks of Power BI.
The flow and components of Power BI include:
Power BI Desktop for creating semantic models and reports with visualizations.
Power BI service for creating dashboards from published reports and distributing content with apps.
Power BI Mobile for on-the-go access to the Power BI service content, designed for mobile.
By using Power BI, you can make data-informed decisions across your organization.
Learning objectives
In this module, you'll learn:
How Power BI services and applications work together.
How Power BI can make your business more efficient.
How to create compelling visuals and reports.
III. Module 3 Get data in Power BI
You'll learn how to retrieve data from a variety of data sources, including Microsoft Excel, relational databases, and NoSQL data stores.
You'll also learn how to improve performance while retrieving data.
1. Introduction
You'll create a suite of reports that depend on data in several different locations.
This module will focus on the first step of getting the data from the different data sources and importing it into Power BI by using Power Query.
2. Get data from files
A flat file is a type of file that has only one data table and every row of data is in the same structure.
The file doesn't contain hierarchies. Likely, you're familiar with the most common types of flat files, which are comma-separated values (.csv) files, delimited text (.txt) files, and fixed width files.
Another type of file would be the output files from different applications, like Microsoft Excel workbooks (.xlsx).
Change the source file - Data source settings.
3. Get data from relational data sources
SQL Server.
Change data source settings - Power BI / Home tab, select Transform data, and then select the Data source settings option.
Write an SQL statement - SQL is beneficial because it allows you to load only the required set of data by specifying exact columns and rows in your SQL statement and then importing them into your semantic model. You can also join different tables, run specific calculations, create logical statements, and filter data in your SQL query.
It's a best practice to avoid doing this directly in Power BI. Instead, consider writing a query like this in a view.
A view is an object in a relational database, similar to a table. Views have rows and columns, and can contain almost every operator in the SQL language.
If Power BI uses a view, when it retrieves data, it participates in query folding, a feature of Power Query. Query folding will be explained later, but in short, Power Query will optimize data retrieval according to how the data is being used later.
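As a sketch of this best practice, the filtering and joining logic can live in a view on the server, and Power BI then connects to the view instead of embedding an SQL statement. The table and column names below are hypothetical:

```sql
-- Hypothetical example: encapsulate query logic in a server-side view,
-- then point Power BI at the view. Views participate in query folding,
-- so the work stays on the database server.
CREATE VIEW Sales.vSalesByReseller AS
SELECT
    r.ResellerName,
    s.OrderDate,
    s.SalesAmount
FROM Sales.FactResellerSales AS s
JOIN Sales.DimReseller AS r
    ON r.ResellerKey = s.ResellerKey
WHERE s.OrderDate >= '20200101';  -- load only the rows you need
```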
4. Create dynamic reports with parameters
Creating dynamic reports allows you to give users more power over the data that is displayed in your reports; they can change the data source and filter the data by themselves.
Power Query Editor / Home tab, select Manage parameters > New parameter.
5. Create dynamic reports for multiple values
To accommodate multiple values at a time, you first need to create a Microsoft Excel worksheet that has a table consisting of one column that contains the list of values.
6. Get data from a NoSQL database
A NoSQL database (also referred to as non-SQL, not only SQL or non-relational) is a flexible type of database that doesn't use tables to store data.
If you're working with data stored in JSON format, it's often necessary to extract and normalize the data first. This is because JSON data is often stored in a nested or unstructured format, which makes it difficult to analyze or report on directly.
7. Get data from online services
To support their daily operations, organizations frequently use a range of software applications, such as SharePoint, OneDrive, Dynamics 365, Google Analytics and so on. These applications produce their own data. Power BI can combine the data from multiple applications to produce more meaningful insights and reports.
8. Select a storage mode
Most business requirements are satisfied when you import data into Power BI.
However, sometimes there may be security requirements around your data that make it impossible to directly import a copy. Or your semantic models may simply be too large and would take too long to load into Power BI, and you want to avoid creating a performance bottleneck.
Power BI solves these problems by using the DirectQuery storage mode, which allows you to query the data in the data source directly and not import a copy into Power BI. DirectQuery is useful because it ensures you're always viewing the most recent version of the data.
The three different types of storage modes you can choose from:
Import -
create a local Power BI copy of your semantic models from your data source, use all Power BI service features with this storage mode,
including Q&A and Quick Insights,
Data refreshes can be scheduled or on-demand.
DirectQuery -
useful when you don't want to save local copies of your data because your data won't be cached,
you can query the specific tables that you'll need by using native Power BI queries,
Essentially, you're creating a direct connection to the data source,
ensures that you're always viewing the most up-to-date data, and that all security requirements are satisfied,
Additionally, this mode is suited for when you have large semantic models to pull data from. Instead of slowing down performance by having to load large amounts of data into Power BI, you can use DirectQuery to create a connection to the source, solving data latency issues as well.
Dual (Composite) -
In Dual mode, you can identify some data to be directly imported and other data that must be queried. Any table that is brought in to your report is a product of both Import and DirectQuery modes.
Using the Dual mode allows Power BI to choose the most efficient form of data retrieval.
9. Get data from Azure Analysis Services
Azure Analysis Services is a fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud.
You can use advanced mashup and modeling features to combine data from multiple data sources, define metrics, and secure your data in a single, trusted tabular semantic model.
The semantic model provides an easier and faster way for users to perform ad hoc data analysis using tools like Power BI.
Notable differences between Azure Analysis Services and SQL Server are:
Analysis Services models have calculations already created.
If you don’t need an entire table, you can query the data directly. Instead of using Transact-SQL (T-SQL) to query the data, like you would in SQL Server, you can use multi-dimensional expressions (MDX) or data analysis expressions (DAX).
10. Fix performance issues
Occasionally, organizations will need to address performance issues when running reports.
Power BI provides the Performance Analyzer tool to help fix problems and streamline the process.
i. Optimize performance in Power Query
The performance in Power Query depends on the performance at the data source level.
The variety of data sources that Power Query offers is wide, and the performance tuning techniques for each source are equally varied.
For instance, if you extract data from a Microsoft SQL Server, you should follow the performance tuning guidelines for the product. Good SQL Server performance tuning techniques include index creation, hardware upgrades, execution plan tuning, and data compression.
These topics are beyond the scope here, and are covered only as an example to build familiarity with your data source and reap the benefits when using Power BI and Power Query.
Power Query takes advantage of good performance at the data source through a technique called Query Folding.
a. Query folding
Query folding is the process by which the transformations and edits that you make in Power Query Editor are simultaneously tracked as native queries, or simple Select SQL statements, while you're actively making transformations.
The reason for implementing this process is to ensure that these transformations can take place in the original data source server and don't overwhelm Power BI computing resources.
You can use Power Query to load data into Power BI. Then use Power Query Editor to transform your data, such as renaming or deleting columns, appending, parsing, filtering, or grouping your data.
Consider a scenario where you’ve renamed a few columns in the Sales data and merged a city and state column together in the “city state” format. Meanwhile, the query folding feature tracks those changes in native queries. Then, when you load your data, the transformations take place independently in the original source; this ensures that performance is optimized in Power BI.
The benefits to query folding include:
More efficiency in data refreshes and incremental refreshes. When you import data tables by using query folding, Power BI is better able to allocate resources and refresh the data faster because Power BI doesn't have to run through each transformation locally.
Automatic compatibility with DirectQuery and Dual storage modes. All DirectQuery and Dual storage mode data sources must have the back-end server processing abilities to create a direct connection, which means that query folding is an automatic capability that you can use. If all transformations can be reduced to a single Select statement, then query folding can occur.
Native queries aren't possible for the following transformations:
Adding an index column
Merging and appending columns of different tables with two different sources
Changing the data type of a column
A good guideline to remember is that if you can translate a transformation into a Select SQL statement, which includes operators and clauses such as GROUP BY, SORT BY, WHERE, UNION ALL, and JOIN, you can use query folding.
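As a hypothetical illustration, a Power Query "Group By" step applied after a filter step can fold into a single native statement like the following, so the work runs on the source server rather than in Power BI:

```sql
-- Hypothetical native query produced by query folding: the filter and
-- grouping steps defined in Power Query Editor reduce to one Select
-- statement that the source database executes.
SELECT
    ResellerKey,
    SUM(SalesAmount) AS TotalSales
FROM Sales.FactResellerSales
WHERE OrderDate >= '20200101'   -- a filter step folds into WHERE
GROUP BY ResellerKey;           -- a grouping step folds into GROUP BY
```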
While query folding is one option to optimize performance when retrieving, importing, and preparing data, another option is query diagnostics.
ii. Query diagnostics
Another tool that you can use to study query performance is query diagnostics. You can determine what bottlenecks may exist while loading and transforming your data, refreshing your data in Power Query, running SQL statements in Query Editor, and so on.
This tool is useful when you want to analyze performance on the Power Query side for tasks such as loading semantic models, running data refreshes, or running other transformative tasks.
iii. Other techniques to optimize performance
Other ways to optimize query performance in Power BI include:
Process as much data as possible in the original data source. Power Query and Power Query Editor allow you to process the data; however, the processing power that is required to complete this task might lower performance in other areas of your reports. Generally, a good practice is to process, as much as possible, in the native data source.
Use native SQL queries. When using DirectQuery for SQL databases, as is the case in our scenario, make sure that you aren't pulling data from stored procedures or common table expressions (CTEs).
Separate date and time, if bound together. If any of your tables have columns that combine date and time, make sure that you separate them into distinct columns before importing them into Power BI. This approach will increase compression abilities.
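A minimal sketch of that split, done at the source before import (the column names are hypothetical): distinct date and time columns compress much better than one high-cardinality datetime column.

```sql
-- Hypothetical example: split a combined datetime column into separate
-- date and time columns at the source before importing into Power BI.
SELECT
    CAST(OrderDateTime AS date) AS OrderDate,
    CAST(OrderDateTime AS time) AS OrderTime,
    SalesAmount
FROM Sales.FactResellerSales;
```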
11. Resolve data import errors
While importing data into Power BI, you may encounter errors resulting from factors such as:
Power BI imports from numerous data sources.
Each data source might have dozens (and sometimes hundreds) of different error messages.
Other components can cause errors, such as hard drives, networks, software services, and operating systems.
Data often can't comply with any specific schema.
The following sections cover some of the more common error messages that you might encounter in Power BI.
i. Query timeout expired
Relational source systems often have many people who are concurrently using the same data in the same database. Some relational systems and their administrators seek to limit a user from monopolizing all hardware resources by setting a query timeout. These timeouts can be configured for any timespan, from as little as five seconds to as much as 30 minutes or more.
For instance, if you’re pulling data from your organization’s SQL Server, you might see the error shown in the following figure.
ii. Power BI Query Error: Timeout expired
This error indicates that you’ve pulled too much data according to your organization’s policies. Administrators incorporate this policy to avoid slowing down a different application or suite of applications that might also be using that database.
You can resolve this error by pulling fewer columns or rows from a single table. While you're writing SQL statements, it might be a common practice to include groupings and aggregations. You can also join multiple tables in a single SQL statement. Additionally, you can perform complicated subqueries and nested queries in a single statement. These complexities add to the query processing requirements of the relational system and can greatly elongate the time of implementation.
If you need the rows, columns, and complexity, consider taking small chunks of data and then bringing them back together by using Power Query. For instance, you can combine half the columns in one query and the other half in a different query. Power Query can merge those two queries back together after you're finished.
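As a hypothetical illustration, the two smaller queries might look like this, with Power Query merging their results on the shared key afterwards:

```sql
-- Hypothetical example: split one wide, expensive query into two smaller ones.
-- Query 1: the key plus the first group of columns.
SELECT OrderKey, OrderDate, ResellerKey
FROM Sales.FactResellerSales;

-- Query 2: the same key plus the remaining columns.
-- Power Query can merge the two results on OrderKey after loading.
SELECT OrderKey, SalesAmount, TotalProductCost
FROM Sales.FactResellerSales;
```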
iii. We couldn't find any data formatted as a table
Occasionally, you may encounter the “We couldn’t find any data formatted as a table” error while importing data from Microsoft Excel. Fortunately, this error is self-explanatory. Power BI expects to find data formatted as a table from Excel. The error even tells you the resolution. Perform the following steps to resolve the issue:
Open your Excel workbook, and highlight the data that you want to import.
Press the Ctrl-T keyboard shortcut. The first row will likely be your column headers.
Verify that the column headers reflect how you want to name your columns. Then, try to import data from Excel again. This time, it should work.
iv. Couldn't find file
While importing data from a file, you may get the "Couldn't find file" error.
Usually, this error is caused by the file moving locations or the permissions to the file changing. If the cause is the former, you need to find the file and change the source settings.
Open Power Query by selecting the Transform Data button in Power BI.
Highlight the query that is creating the error.
On the left, under Query Settings, select the gear icon next to Source.
Change the file location to the new location.
v. Data type errors
Sometimes, when you import data into Power BI, the columns appear blank. This situation happens because of an error in interpreting the data type in Power BI. The resolution to this error is unique to the data source. For instance, if you're importing data from SQL Server and see blank columns, you could try to convert to the correct data type in the query.
Instead of using this query:
SELECT CustomerPostalCode FROM Sales.Customers
Use this query:
SELECT CAST(CustomerPostalCode as varchar(10)) FROM Sales.Customers
By specifying the correct type at the data source, you eliminate many of these common data source errors.
You may encounter different types of errors in Power BI that are caused by the diverse data source systems where your data resides.
12. Exercise - Prepare data in Power BI Desktop
Get Data in Power BI Desktop
Lab story
This lab is designed to introduce you to the Power BI Desktop application, how to connect to data, and how to use data preview techniques to understand the characteristics and quality of the source data. The learning objectives are:
Open Power BI Desktop
Connect to different data sources
Preview source data with Power Query
Use data profiling features in Power Query
i. Get data from SQL Server
ii. Preview Data in Power Query Editor
Observed:
a. DimEmployee query
DimEmployee table - Notice that the last five columns contain Table or Value links.
These five columns represent relationships to other tables in the database.
They can be used to join tables together. You’ll join tables in the Load Transformed Data in Power BI Desktop lab.
Column Quality for Position column - shows 94% empty.
Column Distribution for EmployeeKey column
When the distinct and unique counts are the same, it means the column contains unique values.
When modeling, it’s important that some model tables have unique columns. These unique columns can be used to create one-to-many relationships, which you’ll do in the Model Data in Power BI Desktop lab.
b. DimReseller query
Column Profile for BusinessType column header - Notice the data quality issue: there are two labels for warehouse (Warehouse, and the misspelled Ware House) - You’ll apply a transformation to relabel these five rows in the Load Transformed Data in Power BI Desktop lab.
c. DimSalesTerritory query
In the Model Data in Power BI Desktop lab, you’ll create a hierarchy to support analysis at region, country, or group level.
d. FactResellerSales query
Column Quality for the TotalProductCost column shows that 8% of the rows are empty - missing TotalProductCost values are a data quality issue. To address the issue, in the Load Transformed Data in Power BI Desktop lab, you’ll apply transformations to fill in missing values by using the product standard cost, which is stored in the related DimProduct table.
iii. Get data from a CSV file
ResellerSalesTargets query
Repeat the steps to create a query based on the D:\Allfiles\Resources\ColorFormats.csv file.
Save / apply later.
13. Check your knowledge
T-SQL is the query language that you would use for SQL Server.
You're creating a Power BI report with data from an Azure Analysis Services MDX Cube. When the data refreshes in the cube, you would like to see it immediately in the Power BI report. How should you connect? - Live connection.
What can you do to improve performance when you're getting data in Power BI? Always use the least amount of data needed for your project.
14. Summary
In this module, you learned about pulling data from many different data sources and into Power BI.
You can pull data from files, relational databases, Azure Analysis Services, cloud-based applications, websites, and more.
Retrieving data from different data sources requires treating each data source differently. For instance, Microsoft Excel data should be pulled in from an Excel table. Relational databases often have query timeouts.
You can connect to SQL Server Analysis Services with Connect live, which allows you to see data changes in real-time.
It's important to select the correct storage mode for your data.
Do you require that visuals interact quickly but don’t mind possibly refreshing the data when the underlying data source changes?
If so, select Import to import data into Power BI.
If you prefer to see updates to data as soon as they happen, at the cost of interaction performance, then choose DirectQuery for your data instead.
In addition, you learned how to solve performance problems and data import errors.
You learned that Power BI gives you tooling to identify where performance problems may exist. Data import errors can be alarming at first, but you can see that the resolution is easily implemented.
Learning objectives
By the end of this module, you'll be able to:
Identify and connect to a data source
Get data from a relational database, like Microsoft SQL Server
Get data from a file, like Microsoft Excel
Get data from applications
Get data from Azure Analysis Services
Select a storage mode
Fix performance issues
Resolve data import errors
IV. Module 4 Clean, transform, and load data in Power BI
Course | Microsoft Power BI Data Analyst |
Module 4/17 | Clean, transform, and load data in Power BI |
Power Query has an incredible number of features that are dedicated to helping you clean and prepare your data for analysis.
You'll learn how to simplify a complicated model, change data types, rename objects, and pivot data.
You'll also learn how to profile columns so that you know which columns have the valuable data that you’re seeking for deeper analytics.
1. Introduction
When examining the data, you discover several issues, including:
A column called Employment status only contains numerals.
Several columns contain errors.
Some columns contain null values.
The customer ID in some columns appears as if it was duplicated repeatedly.
A single address column has combined street address, city, state, and zip code.
Clean data has the following advantages:
Measures and columns produce more accurate results when they perform aggregations and calculations.
Tables are organized so that users can find the data in an intuitive manner.
Duplicates are removed, making data navigation simpler. It will also produce columns that can be used in slicers and filters.
A complicated column can be split into two, simpler columns. Multiple columns can be combined into one column for readability.
Codes and integers can be replaced with human readable values.
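As a sketch of the last two points, a combined address column could be split with Power Query's M language. The table and column names here are hypothetical, not from the lab data:

```m
let
    Source = Table.FromRecords({
        [CustomerID = 1, Address = "12 Main St, Seattle, WA, 98101"]
    }),
    // Split the single combined column into four simpler columns
    Split = Table.SplitColumn(
        Source,
        "Address",
        Splitter.SplitTextByDelimiter(", "),
        {"Street", "City", "State", "Zip"}
    )
in
    Split
```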
2. Shape the initial data
Power Query Editor in Power BI Desktop allows you to shape (transform) your imported data.
You can accomplish actions such as renaming columns or tables, changing text to numbers, removing rows, setting the first row as headers, and much more.
It is important to shape your data to ensure that it meets your needs and is suitable for use in reports.
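Each of those actions becomes one step in a query. A minimal M sketch, assuming a hypothetical Sales.csv source file and column names:

```m
let
    Source = Csv.Document(File.Contents("D:\Allfiles\Resources\Sales.csv")),
    // Set the first row as headers
    Promoted = Table.PromoteHeaders(Source),
    // Rename a column
    Renamed = Table.RenameColumns(Promoted, {{"Amt", "Amount"}}),
    // Change text to numbers
    Typed = Table.TransformColumnTypes(Renamed, {{"Amount", type number}}),
    // Remove the first two rows
    Removed = Table.Skip(Typed, 2)
in
    Removed
```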
i. Get started with Power Query Editor
Removing columns at an early stage in the process rather than later is best, especially when you have established relationships between your tables. Removing unnecessary columns will help you to focus on the data that you need and help improve the overall performance of your Power BI Desktop semantic models and reports.
3. Simplify the data structure
Rename a query
Replace values
Replace null values
Remove duplicates
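These operations map directly to M functions. A sketch that relabels the misspelled Ware House value from earlier in the module and removes duplicate rows (sample data, not the lab's):

```m
let
    Source = Table.FromRecords({
        [Reseller = "A", BusinessType = "Ware House"],
        [Reseller = "B", BusinessType = "Warehouse"],
        [Reseller = "B", BusinessType = "Warehouse"]
    }),
    // Replace values: relabel the misspelled entry
    Relabeled = Table.ReplaceValue(
        Source, "Ware House", "Warehouse", Replacer.ReplaceText, {"BusinessType"}),
    // Remove duplicates: keep only distinct rows
    Deduped = Table.Distinct(Relabeled)
in
    Deduped
```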
4. Evaluate and change column data types
Implications of incorrect data types
Incorrect data types will prevent you from creating certain calculations, deriving hierarchies, or creating proper relationships with other tables.
For example, if you try to calculate the Quantity of Orders YTD, you'll get an error stating that the OrderDate column data type isn't Date, which is required in time-based calculations.
Another issue with having an incorrect data type applied on a date field is the inability to create a date hierarchy, which would allow you to analyze your data on a yearly, monthly, or weekly basis.
Change the column data type in Power Query Editor
The change that you make to the column data type is saved as a programmed step called Changed Type.
5. Combine multiple tables into a single table
You can combine tables into a single table in the following circumstances:
Too many tables exist, making it difficult to navigate an overly complicated semantic model.
Several tables have a similar role.
A table has only a column or two that can fit into a different table.
You want to use several columns from different tables in a custom column.
You can combine the tables in two different ways: merging and appending.
i. Append queries
When you append queries, you'll be adding rows of data to another table or query.
For example, you could have two tables, one with 300 rows and another with 100 rows, and when you append queries, you'll end up with 400 rows.
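In M, appending corresponds to Table.Combine. A small sketch with sample tables:

```m
let
    // Two queries that share the same column headers
    ThisYear = Table.FromRecords({[OrderID = 1], [OrderID = 2], [OrderID = 3]}),
    LastYear = Table.FromRecords({[OrderID = 4]}),
    // Appending stacks the rows: 3 + 1 = 4 rows, same columns
    Appended = Table.Combine({ThisYear, LastYear})
in
    Appended
```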
ii. Merge queries
When you merge queries, you'll be adding columns from one table (or query) into another. To merge two tables, you must have a column that is the key between the two tables.
This process is similar to the JOIN clause in SQL.
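A merge in M uses Table.NestedJoin followed by an expand step. The table and column names below are illustrative:

```m
let
    Products = Table.FromRecords({[ProductKey = 1, Color = "Red"]}),
    Sales    = Table.FromRecords({[ProductKey = 1, Amount = 100]}),
    // Join on the key column shared by both tables (like a SQL LEFT OUTER JOIN)
    Merged = Table.NestedJoin(
        Sales, {"ProductKey"}, Products, {"ProductKey"},
        "Products", JoinKind.LeftOuter),
    // Expand the nested table to add the Color column to Sales
    Expanded = Table.ExpandTableColumn(Merged, "Products", {"Color"})
in
    Expanded
```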
6. Profile data in Power BI
Profiling data is about studying the nuances of the data: determining anomalies, examining and developing the underlying data structures, and querying data statistics such as row counts, value distributions, minimum and maximum values, averages, and so on.
i. Find data anomalies and data statistics
Column quality shows you the percentages of data that is valid, in error, and empty.
Column profile gives you a more in-depth look into the statistics within the columns for the first 1,000 rows of data.
Column distribution shows you the distribution of the data within the column and the counts of distinct and unique values, both of which can tell you details about the data counts.
7. Use Advanced Editor to modify M code
Each time you shape data in Power Query, you create a step in the Power Query process. Those steps can be reordered, deleted, and modified where it makes sense.
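In the Advanced Editor, each applied step appears as one named expression in a let … in block, and each step refers to the one before it. A minimal sketch with inline sample data:

```m
let
    // Step 1: the source
    Source = Table.FromRecords({[Name = "Bike", Price = "25"]}),
    // Step 2: appears as "Changed Type" in Applied Steps
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Price", type number}}),
    // Step 3: appears as "Renamed Columns"
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type", {{"Name", "Product"}})
in
    #"Renamed Columns"
```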
8. Exercise - Load data in Power BI Desktop
Load Transformed Data in Power BI Desktop
Lab story
In this lab, you’ll use data cleansing and transformation techniques to start shaping your data model.
You’ll then apply the queries to load each as a table to the data model.
In this lab you learn how to:
Apply various transformations
Load queries to the data model
i. Configure the Salesperson query
ii. Configure the SalespersonRegion query
iii. Configure the Product query
iv. Configure the Reseller query
v. Configure the Region query
vi. Configure the Sales query
vii. Configure the Targets query
viii. Configure the ColorFormats query
ix. Update the Product query
x. Update the ColorFormats query
In the Query Properties window, uncheck the Enable Load To Report checkbox.
Disabling the load means that the query will not load as a table to the data model. This is done because the query was merged with the Product query, which is enabled to load to the data model.
xi. Finish up
9. Check your knowledge
What is a risk of having null values in a numeric column?
AVERAGE takes the total and divides by the number of non-null values. If NULL is synonymous with zero in the data, the average will be different from the accurate average.
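The difference can be made explicit in DAX. Both measures below are sketches with assumed table and column names:

```dax
-- AVERAGE ignores blank (null) values: it divides by non-blank rows only
Average Ignoring Blanks = AVERAGE ( Sales[Discount] )

-- If null really means zero, divide by the total row count instead
Average Blank As Zero =
    DIVIDE ( SUM ( Sales[Discount] ), COUNTROWS ( Sales ) )
```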
If you have two queries that have different data but the same column headers, and you want to combine both tables into one query with all the combined rows, which operation should you perform?
Append will take two tables and combine them into one query. The combined query will have more rows while keeping the same number of columns.
Which of the following selections aren't best practices for naming conventions in Power BI?
Abbreviations lead to confusion because they're often overused or not universally agreed on.
10. Summary
This module explained how you can take data that is difficult to read, build calculations on, and explore, and make it simpler for report authors and others to use.
Additionally, you learned how to combine queries so that they were fewer in number, which makes data navigation more streamlined.
You also replaced values and renamed columns to make them human readable, and reviewed good naming conventions for objects in Power BI.
Learning objectives
By the end of this module, you’ll be able to:
Resolve inconsistencies, unexpected or null values, and data quality issues.
Apply user-friendly value replacements.
Profile data so you can learn more about a specific column before using it.
Evaluate and transform column data types.
Apply data shape transformations to table structures.
Combine queries.
Apply user-friendly naming conventions to columns and queries.
Edit M code in the Advanced Editor.
V. Module 5 Design a semantic model in Power BI
Course | Microsoft Power BI Data Analyst |
Module 5/17 | Design a semantic model in Power BI |
Building a great semantic model is about simplifying the disarray.
A star schema is one way to simplify a semantic model.
You will also learn about why choosing the correct data granularity is important for performance and usability of your Power BI reports.
Finally, you learn about improving performance with your Power BI semantic models.
1. Introduction
A good semantic model offers the following benefits:
Data exploration is faster.
Aggregations are simpler to build.
Reports are more accurate.
Writing reports takes less time.
Reports are easier to maintain in the future.
Typically, a smaller semantic model is composed of fewer tables and fewer columns in each table that the user can see.
To summarize, you should aim for simplicity when designing your semantic models.
i. Star schemas
Now that you have learned about the relationships that make up the data schema, you are able to explore a specific type of schema design, the star schema, which is optimized for high performance and usability.
ii. Fact tables
Fact tables contain observational or event data values: sales orders, product counts, prices, transactional dates and times, and quantities.
Fact tables can contain several repeated values.
For example, one product can appear multiple times in multiple rows, for different customers on different dates. These values can be aggregated to create visuals.
For instance, a visual of the total sales orders is an aggregation of all sales orders in the fact table.
With fact tables, it is common to see columns that are filled with numbers and dates. The numbers can be units of measurement, such as sale amount, or they can be keys, such as a customer ID. The dates represent time that is being recorded, like order date or shipped date.
The Sales table contains the sales order values, which can be aggregated, so it is considered a fact table.
iii. Dimension tables
Dimension tables contain the details about the data in fact tables: products, locations, employees, and order types.
These tables are connected to the fact table through key columns.
Dimension tables are used to filter and group the data in fact tables.
The fact tables, on the other hand, contain the measurable data, such as sales and revenue, and each row represents a unique combination of values from the dimension tables.
For the total sales orders visual, you could group the data so that you see total sales orders by product, in which product is data in the dimension table.
Fact tables are much larger than dimension tables because numerous events occur in fact tables, such as individual sales.
Dimension tables are typically smaller because you are limited to the number of items that you can filter and group on.
For instance, a year contains only so many months, and the United States is composed of only a certain number of states.
The Employee table contains the specific employee name, which filters the sales orders, so it would be a dimension table.
Star schemas and the underlying semantic model are the foundation of organized reports; the more time you spend creating these connections and design, the easier it will be to create and maintain reports.
2. Work with tables
This process of formatting and configuring tables can also be done in Power Query.
3. Create a date table
During report creation in Power BI, a common business requirement is to make calculations based on date and time. Organizations want to know how their business is doing over months, quarters, fiscal years, and so on.
You can create a common date table that can be used by multiple tables.
i. Create a common date table
Ways that you can build a common date table are:
Source data
DAX
Power Query
ii. DAX
You can use the Data Analysis Expression (DAX) functions CALENDARAUTO() or CALENDAR() to build your common date table.
The CALENDARAUTO() function returns a contiguous, complete range of dates that are automatically determined from your semantic model.
The starting date is chosen as the earliest date that exists in your semantic model, and the ending date is the latest date, extended as needed to complete the fiscal year whose final month you can optionally specify as an argument in the CALENDARAUTO() function.
Dates = CALENDAR(DATE(2011, 5, 31), DATE(2022, 12, 31))
Year = YEAR(Dates[Date])
iii. Power Query
You can use M-language, the development language that is used to build queries in Power Query, to define a common date table.
= List.Dates(
#date(2011,05,31), // start date
365*10, //Dates for everyday for the next 10 years
#duration(1,0,0,0) //Specifies duration of the period 1 = days, 0 = hours, 0 = minutes, 0 = seconds
)
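List.Dates returns a list, not a table. A common follow-up, shown here as a sketch, converts it into a single-column Date table:

```m
let
    Source = List.Dates(#date(2011, 5, 31), 365 * 10, #duration(1, 0, 0, 0)),
    // Turn the list of dates into a one-column table
    ToTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Date"}),
    // Set the proper data type for the new column
    Typed = Table.TransformColumnTypes(ToTable, {{"Date", type date}})
in
    Typed
```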
4. Work with dimensions
When building a star schema, you will have dimension and fact tables.
Fact tables contain information about events such as sales orders, shipping dates, resellers, and suppliers.
Dimension tables store details about business entities, such as products or time, and are connected back to fact tables through a relationship.
You can use hierarchies as one source to help you find detail in dimension tables. These hierarchies form through natural segments in your data.
For instance, you can have a hierarchy of dates in which your dates can be segmented into years, months, weeks, and days.
Hierarchies are useful because they allow you to drill down into the specifics of your data instead of only seeing the data at a high level.
i. Hierarchies
To create a hierarchy, go to the Fields pane in Power BI, right-click the column that you want the hierarchy for, and then select New hierarchy.
Parent-child hierarchy
Flatten parent-child hierarchy
The process of viewing multiple child levels based on a top-level parent is known as flattening the hierarchy.
In this process, you are creating multiple columns in a table to show the hierarchical path of the parent to the child in the same record. You will use PATH(), a simple DAX function that returns a text version of the managerial path for each employee, and PATHITEM() to separate this path into each level of managerial hierarchy.
Path = PATH(Employee[Employee ID], Employee[Manager ID])
To flatten the hierarchy, you can separate each level by using the PATHITEM function.
To view all three levels of the hierarchy separately, you can create three columns in the same way that you did previously, by entering the following equations. You will use the PATHITEM function to retrieve the value that resides in the corresponding level of your hierarchy.
Level 1 = PATHITEM(Employee[Path],1)
Level 2 = PATHITEM(Employee[Path],2)
Level 3 = PATHITEM(Employee[Path],3)
ii. Role-playing dimensions
Role-playing dimensions have multiple valid relationships with fact tables, meaning that the same dimension can be used to filter multiple columns or tables of data. As a result, you can filter data differently depending on what information you need to retrieve.
Calendar is the dimension table, while Sales and Order are fact tables.
The dimension table has two relationships: one with Sales and one with Order.
This example is of a role-playing dimension because the Calendar table can be used to group data in both Sales and Order.
If you wanted to build a visual in which the Calendar table references the Order and the Sales tables, the Calendar table would act as a role-playing dimension.
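A closely related pattern: when a single fact table has several date columns (such as order date and ship date) related to the same Calendar table, only one relationship can be active at a time. The sketch below, with assumed column names, uses USERELATIONSHIP to activate the inactive relationship for one measure:

```dax
Sales by Ship Date =
    CALCULATE (
        SUM ( Sales[Sales Amount] ),
        -- Temporarily use the inactive Calendar-to-ShipDate relationship
        USERELATIONSHIP ( 'Calendar'[Date], Sales[ShipDate] )
    )
```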
5. Define data granularity
Data granularity is the level of detail that is represented within your data: the more granularity your data has, the greater the level of detail.
6. Work with relationships and cardinality
Power BI has the concept of directionality to a relationship.
This directionality plays an important role in filtering data between multiple tables.
i. Relationships
The following are different types of relationships that you'll find in Power BI.
Many-to-one (*:1) or one-to-many (1:*) relationship:
Describes a relationship in which you have many instances of a value in one column that are related to only one unique corresponding instance in another column.
Describes the directionality between fact and dimension tables.
Is the most common type of directionality and is the Power BI default when you are automatically creating relationships.
One-to-one (1:1) relationship:
Describes a relationship in which only one instance of a value is common between two tables.
Requires unique values in both tables.
Is not recommended because this relationship stores redundant information and suggests that the model is not designed correctly. It is better practice to combine the tables.
Many-to-many (*:*) relationship:
Describes a relationship where many values are in common between two tables.
Does not require unique values in either table in a relationship.
Is not recommended; a lack of unique values introduces ambiguity and your users might not know which column of values is referring to what.
ii. Cross-filter direction
Data can be filtered on one or both sides of a relationship.
With a single cross-filter direction:
- Only one table in a relationship can be used to filter the data. For instance, Table 1 can be filtered by Table 2, but Table 2 cannot be filtered by Table 1.
Note:
Follow the direction of the arrow on the relationship between your tables to know which direction the filter will flow. You typically want these arrows to point to your fact table.
- For a one-to-many or many-to-one relationship, the cross-filter direction will be from the "one" side, meaning that the filtering will occur in the table that has many values.
With both cross-filter directions or bi-directional cross-filtering:
One table in a relationship can be used to filter the other. For instance, a dimension table can be filtered through the fact table, and the fact tables can be filtered through the dimension table.
You might have lower performance when using bi-directional cross-filtering with many-to-many relationships.
iii. Cardinality and cross-filter direction
For one-to-one relationships, the only option that is available is bi-directional cross-filtering.
For many-to-many relationships, you can choose to filter in a single direction or in both directions by using bi-directional cross-filtering.
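Rather than setting bi-directional filtering on the relationship itself, a single calculation can override the cross-filter direction with the CROSSFILTER function. A sketch with assumed table and column names:

```dax
Resellers With Sales =
    CALCULATE (
        DISTINCTCOUNT ( Sales[ResellerKey] ),
        -- Filter in both directions for this measure only
        CROSSFILTER ( Sales[ProductKey], 'Product'[ProductKey], Both )
    )
```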
iv. Create many-to-many relationships
7. Resolve modeling challenges
When you are creating these relationships, a common pitfall that you might encounter is circular relationships.
i. Relationship dependencies
8. Exercise - Model data in Power BI Desktop
Design a Data Model in Power BI
Lab story
It will involve creating relationships between tables, and then configuring table and column properties to improve the friendliness and usability of the data model.
You’ll also create hierarchies and create quick measures.
In this lab you learn how to:
Create model relationships
Configure table and column properties
Create hierarchies
i. Create model relationships
You can interpret the cardinality that is represented by the 1 and (*) indicators.
Filter direction is represented by the arrowhead.
A solid line represents an active relationship; a dashed line represents an inactive relationship.
Hover the cursor over the relationship to highlight the related columns.
ii. Configure Tables
iii. Configure the Product table
To organize columns into a display folder, in the Data pane, first select the Background Color Format column.
While pressing the Ctrl key, select the Font Color Format column.
In the Properties pane, in the Display Folder box, enter Formatting.
Display folders are a great way to declutter tables—especially for tables that comprise many fields. They’re logical presentation only.
iv. Configure the Region table
In the Properties pane, expand the Advanced section (at the bottom of the pane), and then in the Data Category dropdown list, select Country/Region.
Data categorization can provide hints to the report designer. In this case, categorizing the column as country or region provides more accurate information to Power BI when it renders a map visualization.
v. Configure the Reseller table
vi. Configure the Sales table
vii. Bulk update properties
In the Properties pane, slide the Is Hidden property to Yes.
The columns were hidden because they’re either used by relationships or will be used in row-level security configuration or calculation logic.
viii. Review the Model Interface
In the Data pane, notice the following:
Columns, hierarchies and their levels are fields, which can be used to configure report visuals
Only fields relevant to report authoring are visible
The SalespersonRegion table isn’t visible—because all of its fields are hidden
Spatial fields in the Region and Reseller table are adorned with a spatial icon
Fields adorned with the sigma symbol (Ʃ) will summarize, by default
A tooltip appears when hovering the cursor over the Sales | Cost field
ix. Review the model interface
To turn off Auto date/time, navigate to File > Options and Settings > Options, and in the Current File group, select Data Load. In the Time Intelligence section, uncheck Auto Date/Time.
x. Create Quick Measures
xi. Create quick measures
Profit =
SUM('Sales'[Sales]) - SUM('Sales'[Cost])
Profit Margin =
DIVIDE([Profit], SUM('Sales'[Sales]))
xii. Create a many-to-many relationship
xiii. Relate the Targets table
9. Check your knowledge
What does data granularity mean?
The level of detail in your data, meaning that higher granularity means more detailed data.
Data granularity refers to how finely or coarsely your data is divided or aggregated.
What is the difference between a fact table and a dimension table?
Fact tables contain observational data, such as sales orders and shipping dates, while dimension tables contain information about specific entities, such as products, employees, and dates.
Choose the best answer to explain relationship cardinality?
Cardinality is the measure of unique values in a table.
An example of high cardinality would be a Sales table as it has a high number of unique values.
10. Summary
You have learned about modeling data in Power BI, which includes such topics as creating common date tables, learning about and configuring many-to-many relationships, resolving circular relationships, designing star schemas, and much more.
These skills are crucial to the Power BI practitioner's toolkit so that it is easier to build visuals and hand off your report elements to other teams. With this foundation, you now have the ability to explore the many nuances of the semantic model.
Learning objectives
In this module, you will:
Create common date tables
Configure many-to-many relationships
Resolve circular relationships
Design star schemas
VI. Module 6 Add measures to Power BI Desktop models
Course | Microsoft Power BI Data Analyst |
Module 6/17 |
In this module, you'll learn how to work with implicit and explicit measures.
You'll start by creating simple measures, which summarize a single column or table.
Then, you'll create more complex measures based on other measures in the model.
Additionally, you'll learn about the similarities of, and differences between, a calculated column and a measure.
1. Introduction
Implicit measures are automatic behaviors that allow visuals to summarize model column data.
Explicit measures, also known simply as measures, are calculations that you can add to your model.
A column that's shown with the sigma symbol ( ∑ ) indicates two facts:
It's a numeric column.
It will summarize column values when it is used in a visual (when added to a field well that supports summarization).
Implicit measures allow the report author to start with a default summarization technique and let them modify it to suit their visual requirements.
i. Summarize non-numeric columns
Non-numeric columns can be summarized.
However, the sigma symbol does not show next to non-numeric columns in the Fields pane because they don't summarize by default.
Text columns allow the following aggregations:
First (alphabetically), Last (alphabetically), Count (Distinct), and Count.
Date columns allow the following aggregations:
Earliest, Latest, Count (Distinct), and Count.
Boolean columns allow the following aggregations:
Count (Distinct) and Count.
ii. Benefits of implicit measures
Several benefits are associated with implicit measures. Implicit measures are simple concepts to learn and use, and they provide flexibility in the way that report authors visualize model data.
Additionally, they mean less work for you as a data modeler because you don't have to create explicit calculations.
iii. Limitations of implicit measures
The most significant limitation of implicit measures is that they only work for simple scenarios, meaning that they can only summarize column values that use a specific aggregation function. Therefore, in situations when you need to calculate the ratio of each month's sales amount over the yearly sales amount, you'll need to produce an explicit measure by writing a Data Analysis Expressions (DAX) formula to achieve that more sophisticated requirement.
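As a sketch of that month-over-year ratio, an explicit measure could look like the following; the table and column names are assumptions:

```dax
Sales % of Year =
    DIVIDE (
        SUM ( Sales[Sales Amount] ),
        -- Remove the month filter but keep the year filter
        CALCULATE (
            SUM ( Sales[Sales Amount] ),
            ALLEXCEPT ( 'Date', 'Date'[Year] )
        )
    )
```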
Implicit measures don't work when the model is queried by using Multidimensional Expressions (MDX). This language expects explicit measures and can't summarize column data. It's used when a Power BI semantic model is queried by using Analyze in Excel or when a Power BI paginated report uses a query that is generated by the MDX graphical query designer.
2. Create simple measures
A measure formula must return a scalar or single value.
In tabular modeling, no such concept as a calculated measure exists. The word calculated is used to describe calculated tables and calculated columns. It distinguishes them from tables and columns that originate from Power Query, which doesn't have the concept of an explicit measure.
Measures don't store values in the model. Instead, they're used at query time to return summarizations of model data. Additionally, measures can't reference a table or column directly; they must pass the table or column into a function to produce a summarization.
A simple measure is one that aggregates the values of a single column; it does what implicit measures do automatically.
Measures:
Revenue =
SUM(Sales[Sales Amount])
Cost =
SUM(Sales[Total Product Cost])
Profit =
SUM(Sales[Profit Amount])
Quantity =
SUM(Sales[Order Quantity])
Minimum Price =
MIN(Sales[Unit Price])
Maximum Price =
MAX(Sales[Unit Price])
Average Price =
AVERAGE(Sales[Unit Price])
Order Line Count =
COUNT(Sales[SalesOrderLineKey])
Order Count =
DISTINCTCOUNT('Sales Order'[Sales Order])
Order Line Count =
COUNTROWS(Sales)
3. Create compound measures
When a measure references one or more measures, it's known as a compound measure.
Profit =
[Revenue] - [Cost]
By removing this calculated column, you've optimized the semantic model: the model size decreases and data refresh times shorten.
4. Create quick measures
Power BI Desktop provides a feature named Quick measures, which helps you quickly perform common, powerful calculations by generating the DAX expression for you.
5. Compare calculated columns with measures
Regarding similarities between calculated columns and measures, both are:
Calculations that you can add to your semantic model.
Defined by using a DAX formula.
Referenced in DAX formulas by enclosing their names within square brackets.
The areas where calculated columns and measures differ include:
Purpose - Calculated columns extend a table with a new column, while measures define how to summarize model data.
Evaluation - Calculated columns are evaluated by using row context at data refresh time, while measures are evaluated by using filter context at query time. Filter context is introduced in a later module; it's an important topic to understand and master so that you can achieve complex summarizations.
Storage - Calculated columns (in Import storage mode tables) store a value for each row in the table, but a measure never stores values in the model.
Visual use - Calculated columns (like any column) can be used to filter, group, or summarize (as an implicit measure), whereas measures are designed to summarize.
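The contrast can be seen side by side. A sketch using the Sales columns referenced earlier in this module:

```dax
-- Calculated column: evaluated row by row at data refresh, stored per row
Profit Amount = Sales[Sales Amount] - Sales[Total Product Cost]

-- Measure: evaluated in filter context at query time; nothing is stored
Total Profit =
    SUM ( Sales[Sales Amount] ) - SUM ( Sales[Total Product Cost] )
```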
6. Check your knowledge
Which statement about measures is correct?
Measures can reference other measures. It's known as a compound measure.
- Measures don't need to be added to the semantic model. The concept of an implicit measure allows visuals to summarize column values.
Which DAX function can summarize a table?
The COUNTROWS function summarizes a table by returning the number of rows.
- The SUM function can only summarize a column.
Which of the following statements describing similarity of measures and calculated columns in an Import model is true?
They can achieve summarization of model data. A calculated column can be summarized (implicit measure) and a measure always achieves summarization.
- While measures can be quickly created by using the Quick measures feature, no equivalent feature exists to create calculated columns.
7. Exercise - Create DAX Calculations in Power BI Desktop
In this exercise, you’ll create calculated tables, calculated columns, and simple measures using Data Analysis Expressions (DAX).
In this exercise you'll learn how to:
Create calculated tables
Create calculated columns
Create measures
i. Create Calculated Tables
In the formula bar (which opens directly beneath the ribbon when creating or editing calculations), type Salesperson =, press Shift+Enter, type ‘Salesperson (Performance)’, and then press Enter.
Note: Calculated tables are defined by using a DAX formula that returns a table. It’s important to understand that calculated tables increase the size of the data model because they materialize and store values. They’re recomputed whenever formula dependencies are refreshed, as will be the case for this data model when new (future) date values are loaded into tables.
Unlike Power Query-sourced tables, calculated tables can’t be used to load data from external data sources. They can only transform data based on what has already been loaded into the data model.
ii. Create the Salesperson table
Salesperson = 'Salesperson (Performance)'
iii. Create the Date table
Date =
CALENDARAUTO(6)
The CALENDARAUTO() function returns a single-column table consisting of date values. The “auto” behavior scans all data model date columns to determine the earliest and latest date values stored in the data model. It then creates one row for each date within this range, extending the range in either direction to ensure full years of data is stored.
This function can take a single optional argument that is the last month number of a year. When omitted, the value is 12, meaning that December is the last month of the year. In this case, 6 is entered, meaning that June is the last month of the year.
iv. Create calculated columns
In this task, you’ll add more columns to enable filtering and grouping by different time periods. You’ll also create a calculated column to control the sort order of other columns.
Year =
"FY" & YEAR('Date'[Date]) + IF(MONTH('Date'[Date]) > 6, 1)
A calculated column is created by first entering the column name, followed by the equals symbol (=), followed by a DAX formula that returns a single-value result. The column name can’t already exist in the table.
The formula uses the date’s year value but adds one to the year value when the month is after June. That's how fiscal years at Adventure Works are calculated.
Quarter =
'Date'[Year] & " Q"
& IF(
MONTH('Date'[Date]) IN {7,8,9},1,
IF(
MONTH('Date'[Date]) IN {10,11,12},2,
IF(
MONTH('Date'[Date]) IN {1,2,3},3,
IF(
MONTH('Date'[Date]) IN {4,5,6},4
)
)
)
)
A simpler alternative uses the SWITCH() function:
Quarter =
'Date'[Year] & " Q"
& SWITCH(
TRUE(),
MONTH('Date'[Date]) IN {7,8,9}, 1,
MONTH('Date'[Date]) IN {10,11,12}, 2,
MONTH('Date'[Date]) IN {1,2,3}, 3,
MONTH('Date'[Date]) IN {4,5,6}, 4
)
Month =
FORMAT('Date'[Date],"yyyy mmm")
By default, text values sort alphabetically, numbers sort from smallest to largest, and dates sort from earliest to latest. Consequently, the Month column's text labels won't sort chronologically on their own, which is why a numeric sort column is needed.
MonthKey =
(YEAR('Date'[Date]) * 100) + MONTH('Date'[Date])
On the Column Tools contextual ribbon, from inside the Sort group, select Sort by Column, and then select MonthKey.
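The effect of the MonthKey sort column can be mirrored in Python, purely as an illustration (the sample dates are hypothetical):

```python
from datetime import date

dates = [date(2017, 2, 1), date(2017, 4, 1), date(2017, 8, 1)]

# Equivalent of FORMAT('Date'[Date], "yyyy mmm")
labels = [d.strftime("%Y %b") for d in dates]
# Equivalent of MonthKey = (YEAR('Date'[Date]) * 100) + MONTH('Date'[Date])
keys = [d.year * 100 + d.month for d in dates]

# Alphabetical sorting of the text labels puts February last...
alphabetical = sorted(labels)
# ...while sorting by the numeric key restores chronological order.
chronological = [label for _, label in sorted(zip(keys, labels))]

print(alphabetical)
print(chronological)
```

This is exactly what the Sort by Column setting does: the Month column is displayed, but the MonthKey column determines the order.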
v. Complete the Date table
In this task, you’ll complete the design of the Date table by hiding a column and creating a hierarchy. You’ll then create relationships to the Sales and Targets tables.
vi. Mark the Date table
On the Table Tools contextual ribbon, from inside the Calendars group, select Mark as Date Table, and then select Mark as Date Table.
vii. Create simple measures
Visible numeric columns let report authors decide, at report design time, how column values will summarize (or not), which can result in inappropriate reporting. Some data modelers don't like leaving things to chance, and choose to hide these columns and instead expose aggregation logic defined in measures. That's the approach you'll now take in this lab.
Avg Price =
AVERAGE(Sales[Unit Price])
Note:
It’s not possible to modify the aggregation behavior of a measure.
Median Price =
MEDIAN(Sales[Unit Price])
Min Price =
MIN(Sales[Unit Price])
Max Price =
MAX(Sales[Unit Price])
Orders =
DISTINCTCOUNT(Sales[SalesOrderNumber])
Order Lines =
COUNTROWS(Sales)
viii. Create additional measures
The HASONEVALUE() function tests whether a single value in the Salesperson column is filtered. When true, the expression returns the sum of target amounts (for just that salesperson). When false, BLANK is returned.
Target =
IF(
HASONEVALUE('Salesperson (Performance)'[Salesperson]),
SUM(Targets[TargetAmount])
)
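The HASONEVALUE pattern can be sketched in Python as an illustration. The data and function name below are hypothetical, and None stands in for BLANK:

```python
# Hypothetical Targets rows: (salesperson, target_amount)
targets = [("Ann", 100), ("Ben", 150), ("Ann", 50)]

def target(rows, salespeople_in_filter):
    """Return the summed target only when the filter context contains
    exactly one salesperson (like HASONEVALUE); otherwise None (BLANK)."""
    if len(set(salespeople_in_filter)) != 1:
        return None  # HASONEVALUE() is FALSE -> BLANK
    (person,) = set(salespeople_in_filter)
    return sum(amount for sp, amount in rows if sp == person)

print(target(targets, {"Ann"}))         # a single salesperson: 100 + 50
print(target(targets, {"Ann", "Ben"}))  # grand total: None (BLANK)
```

This mirrors why the Target measure shows values per salesperson but stays blank at the grand total, where many salespeople are in filter context.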
8. Summary
In this module, you learned that Power BI measures are either implicit or explicit. Implicit measures are automatic behaviors that are supported by visuals, while explicit measures use DAX formulas that summarize model data.
Explicit measures are important because they allow you to create complex DAX formulas to achieve the precise calculations that your report visuals need. While you learned to create simple and compound measures in this module, in later modules you'll learn to create more powerful measures by using filter modification functions and iterator functions.
Learning objectives
By the end of this module, you'll be able to:
Determine when to use implicit and explicit measures.
Create simple measures.
Create compound measures.
Create quick measures.
Describe similarities of, and differences between, a calculated column and a measure.
VII. Module 7 Add calculated tables and columns to Power BI Desktop models
Course | Microsoft Power BI Data Analyst |
Module 7/17 | Add calculated tables and columns to Power BI Desktop models |
By the end of this module,
you'll be able to add calculated tables and calculated columns to your semantic model.
You'll also be able to describe row context, which is used to evaluate calculated column formulas. Because it's possible to add columns to a table using Power Query,
you'll also learn when it's best to create calculated columns instead of Power Query custom columns.
1. Introduction
You can write a Data Analysis Expressions (DAX) formula to add a calculated table to your model. The formula can duplicate or transform existing model data to produce a new table.
A calculated table can't connect to external data; you must use Power Query to accomplish that task.
A calculated table formula must return a table object. The simplest formula can duplicate an existing model table.
Calculated tables have a cost: they increase the model storage size, and they can prolong the data refresh time, because calculated tables are recalculated whenever the tables they depend on are refreshed.
i. Duplicate a table
Data:
Download and open the Adventure Works DW 2020 M03.pbix
Ship Date = 'Date'
Mark the Ship Date table as a date table by using the Ship Date column.
Calculated tables are useful in scenarios where multiple relationships exist between two tables, as previously described.
They can also be used to add a date table to your model. Date tables are required to apply special time filters known as time intelligence.
ii. Create a date table
Due Date = CALENDARAUTO(6)
When the table rows and distinct values are the same, it means that the column contains unique values. That factor is important for two reasons: It satisfies the requirements to mark a date table, and it allows this column to be used in a model relationship as the one-side.
2. Create calculated columns
Due Fiscal Year =
"FY"
& YEAR('Due Date'[Due Date])
+ IF(
MONTH('Due Date'[Due Date]) > 6,
1
)
The calculated column definition adds the Due Fiscal Year column to the Due Date table.
The following steps describe how Microsoft Power BI evaluates the calculated column formula (note that the addition operator (+) is evaluated before the text concatenation operator (&)):
1. The YEAR DAX function returns the whole number value of the due date year.
2. The IF DAX function returns the value 1 when the due date month number is 7-12 (July to December); otherwise, it returns BLANK. (Because the Adventure Works financial year is July-June, the last six months of the calendar year use the next calendar year as their financial year.)
3. The year value is added to the value returned by the IF function, which is either one or BLANK. If the value is BLANK, it's implicitly converted to zero (0) to allow the addition to produce the fiscal year value.
4. The literal text value "FY" is concatenated with the fiscal year value, which is implicitly converted to text.
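The same evaluation steps, including the implicit BLANK-to-zero conversion, can be mirrored in Python as an illustration (the function name is hypothetical; None stands in for BLANK):

```python
from datetime import date

def due_fiscal_year(d):
    """Mirror of the Due Fiscal Year column logic for one row (row context)."""
    # IF(MONTH > 6, 1) returns 1 for July-December, otherwise BLANK (None).
    one_or_blank = 1 if d.month > 6 else None
    # In the addition, BLANK is implicitly converted to zero.
    fiscal_year = d.year + (one_or_blank or 0)
    # "FY" is concatenated with the number, implicitly converted to text.
    return "FY" + str(fiscal_year)

print(due_fiscal_year(date(2017, 8, 15)))  # August 2017 -> FY2018
print(due_fiscal_year(date(2018, 3, 1)))   # March 2018 -> FY2018
```

Both dates fall in the same July-June financial year, so both produce the same label.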
Due Fiscal Quarter =
'Due Date'[Due Fiscal Year] & " Q"
& SWITCH(
TRUE(),
MONTH('Due Date'[Due Date]) IN {1, 2, 3}, 3,
MONTH('Due Date'[Due Date]) IN {4, 5, 6}, 4,
MONTH('Due Date'[Due Date]) IN {7, 8, 9}, 1,
MONTH('Due Date'[Due Date]) IN {10, 11, 12}, 2
)
The calculated column definition adds the Due Fiscal Quarter column to the Due Date table. The SWITCH function returns the quarter number (Quarter 1 is July-September), and the result is concatenated with the Due Fiscal Year column value and the literal text Q.
Due Month =
FORMAT('Due Date'[Due Date], "yyyy mmm")
The calculated column definition adds the Due Month column to the Due Date table. The FORMAT DAX function converts the Due Date column value to text by using a format string. In this case, the format string produces a label that describes the year and abbreviated month name.
Due Full Date =
FORMAT('Due Date'[Due Date], "yyyy mmm, dd")
MonthKey =
(YEAR('Due Date'[Due Date]) * 100) + MONTH('Due Date'[Due Date])
The MonthKey calculated column multiplies the due date year by the value 100 and then adds the month number of the due date. It produces a numeric value that can be used to sort the Due Month text values in chronological order.
3. Learn about row context
The formula for a calculated column is evaluated for each table row. Furthermore, it's evaluated within row context, which means the current row.
However, row context doesn't extend beyond the table. If your formula needs to reference columns in other tables, you have two options:
When the tables are related, directly or indirectly, you can use the RELATED or RELATEDTABLE DAX functions. The RELATED function retrieves a value from the one-side of the relationship, while the RELATEDTABLE function retrieves values from the many-side and returns a table object.
When the tables aren't related, you can use the LOOKUPVALUE DAX function.
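As an illustration of what LOOKUPVALUE does across unrelated tables, here's a simplified Python sketch. The table and column names are hypothetical, and None stands in for BLANK (the real DAX function also raises an error when multiple rows match, which this sketch ignores):

```python
def lookup_value(result_column, search_column, search_value, table):
    """Simplified analogue of DAX LOOKUPVALUE: return result_column from
    the first row where search_column equals search_value, or None (BLANK)
    when no row matches."""
    for row in table:
        if row[search_column] == search_value:
            return row[result_column]
    return None

# Hypothetical table with no model relationship to the current table
exchange_rates = [
    {"Currency": "EUR", "Rate": 1.1},
    {"Currency": "GBP", "Rate": 1.3},
]

print(lookup_value("Rate", "Currency", "GBP", exchange_rates))  # 1.3
print(lookup_value("Rate", "Currency", "JPY", exchange_rates))  # None
```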
Sales table:
Discount Amount =
(
Sales[Order Quantity]
* RELATED('Product'[List Price])
) - Sales[Sales Amount]
The calculated column definition adds the Discount Amount column to the Sales table. Power BI evaluates the calculated column formula for each row of the Sales table. The values for the Order Quantity and Sales Amount columns are retrieved within row context. However, because the List Price column belongs to the Product table, the RELATED function is required to retrieve the list price value for the sold product.
Row context is used when calculated column formulas are evaluated. It's also used when a class of functions, known as iterator functions, are used. Iterator functions provide you with flexibility to create sophisticated summarizations. Iterator functions are described in a later module.
4. Choose a technique to add a column
There are three techniques that you can use to add columns to a model table:
Add columns at the data source (as persisted columns in a table or view), and then load them with Power Query. This option only makes sense when your data source is a relational database and you have the skills and permissions to do so. It's a good option, however, because it supports ease of maintenance and allows reuse of the column logic in other models or reports.
Add custom columns (using M) to Power Query queries.
Add calculated columns (using DAX) to model tables.
When you need to add a column to a calculated table, make sure that you create a calculated column. Otherwise, we recommend that you only use a calculated column when the calculated column formula:
Depends on summarized model data.
Needs to use specialized modeling functions that are only available in DAX, such as the RELATED and RELATEDTABLE functions. Specialized functions also include the DAX parent and child functions, which are designed to naturalize a recursive relationship into columns; for example, an employee table where each row stores a reference to the row of the manager (who is also an employee).
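The recursive employee-manager shape that the DAX parent-child functions naturalize can be sketched in Python as an illustration (the employee data is hypothetical):

```python
# Hypothetical Employee table: key -> (name, manager_key); the CEO has none.
employees = {
    1: ("Amy", None),
    2: ("Ben", 1),
    3: ("Cat", 2),
}

def hierarchy_levels(key):
    """Walk the recursive employee -> manager references up to the top,
    similar in spirit to what the DAX PATH function naturalizes into
    level columns."""
    names = []
    while key is not None:
        name, manager_key = employees[key]
        names.append(name)
        key = manager_key
    return list(reversed(names))  # top of the organization first

print(hierarchy_levels(3))  # Cat reports to Ben, who reports to Amy
```

Each position in the returned list corresponds to one naturalized level column (executive level, manager level, and so on).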
5. Check your knowledge
Which statement about calculated tables is true?
Calculated tables increase the size of the semantic model.
Calculated tables store data inside the model, and adding them results in a larger model size.
Which statement about calculated columns is true?
Calculated column formulas are evaluated by using row context.
You're developing a Power BI Desktop model that sources data from an Excel workbook. The workbook has an employee table that stores one row for each employee. Each row has a reference to the employee's manager, which is also a row in the employee table. You need to add several columns to the Employee table in your model to analyze payroll data within the organization hierarchy (executive level, manager level, and so on). Which technique will you use to add the columns?
Add calculated columns by using DAX.
You can use the DAX parent-child functions to naturalize the recursive (employee-manager) relationship into columns.
6. Summary
In this module,
you learned how to use DAX to extend your model designs with calculated tables and calculated columns.
You also learned that calculated columns are evaluated within row context, and you learned how to correctly reference columns that belong to different tables.
Finally, you learned that while many ways are available for adding new columns to your model, it's best to create Power Query custom columns whenever possible.
Key points:
A calculated table can't connect to external data; you must use Power Query to accomplish that task.
Calculated tables increase the model storage size and can prolong the data refresh time.
Calculated tables are useful in scenarios where multiple relationships exist between two tables.
When the table rows and distinct values are the same, the column contains unique values, which satisfies the requirements to mark a date table and allows the column to be used as the one-side of a relationship.
The formula for a calculated column is evaluated for each table row, within row context.
Row context doesn't extend beyond the table: use RELATED(), RELATEDTABLE(), or LOOKUPVALUE().
Learning objectives
By the end of this module, you'll be able to:
Create calculated tables.
Create calculated columns.
Identify row context.
Determine when to use a calculated column in place of a Power Query custom column.
Add a date table to your model by using DAX calculations.
VIII. Module 8 Use DAX time intelligence functions in Power BI Desktop models
Course | Microsoft Power BI Data Analyst |
Module 8/17 | Use DAX time intelligence functions in Power BI Desktop models |
By the end of this module,
you'll learn the meaning of time intelligence and
how to add time intelligence DAX calculations to your model.
1. Introduction
Time intelligence relates to calculations over time.
Specifically, it relates to calculations over dates, months, quarters, or years, and possibly time.
In Data Analysis Expressions (DAX) calculations, time intelligence means modifying the filter context for date filters.
For example, at the Adventure Works company, their financial year begins on July 1 and ends on June 30 of the following year. They produce a table visual that displays monthly revenue and year-to-date (YTD) revenue.
The filter context for 2017 August contains each of the 31 dates of August, which are stored in the Date table.
However, the calculated year-to-date revenue for 2017 August applies a different filter context.
It's the first date of the year through to the last date in filter context. In this example, that's July 1, 2017 through to August 31, 2017.
Time intelligence calculations modify date filter contexts. They can help you answer these time-related questions:
What's the accumulation of revenue for the year, quarter, or month?
What revenue was produced for the same period last year?
What growth in revenue has been achieved over the same period last year?
How many new customers made their first order in each month?
What's the inventory stock on-hand value for the company's products?
This module describes how to create time intelligence measures to answer these questions.
2. Use DAX time intelligence functions
i. Date table requirement
To work with time intelligence DAX functions, you need to meet the prerequisite model requirement of having at least one date table in your model. A date table is a table that meets the following requirements:
It must have a column of data type Date (or date/time), known as the date column.
The date column must contain unique values.
The date column must not contain BLANKs.
The date column must not have any missing dates.
The date column must span full years. A year isn't necessarily a calendar year (January-December).
The date table must be indicated as a date table.
Download and open the Adventure Works DW 2020 M07.pbix file.
Add the following measure definition to the Sales table to calculate YTD revenue:
Revenue YTD =
TOTALYTD([Revenue], 'Date'[Date], "6-30")
The year-end date value of "6-30" represents June 30.
ii. Summarizations over time
One group of DAX time intelligence functions is concerned with summarizations over time:
Revenue YTD =
TOTALYTD([Revenue], 'Date'[Date], "6-30")
iii. Comparisons over time
Another group of DAX time intelligence functions is concerned with shifting time periods:
You will add a measure to the Sales table that calculates revenue for the prior year by using the SAMEPERIODLASTYEAR function. Format the measure as currency with two decimal places.
Revenue PY =
VAR RevenuePriorYear = CALCULATE([Revenue], SAMEPERIODLASTYEAR('Date'[Date]))
RETURN
RevenuePriorYear
Next, you will modify the measure by renaming it to Revenue YoY % and then updating the RETURN clause to calculate the change ratio. Be sure to change the format to a percentage with two decimal places.
Revenue YoY % =
VAR RevenuePriorYear = CALCULATE([Revenue], SAMEPERIODLASTYEAR('Date'[Date]))
RETURN
DIVIDE(
[Revenue] - RevenuePriorYear,
RevenuePriorYear
)
3. Additional time intelligence calculations
i. Calculate new occurrences
Another use of time intelligence functions is to count new occurrences.
The following example shows how you can calculate the number of new customers for a time period. A new customer is counted in the time period in which they made their first purchase.
Your first task is to add the following measure to the Sales table that counts the number of distinct customers life-to-date (LTD).
Life-to-date means from the beginning of time until the last date in filter context. Format the measure as a whole number by using the thousands separator.
Customers LTD =
VAR CustomersLTD =
CALCULATE(
DISTINCTCOUNT(Sales[CustomerKey]),
DATESBETWEEN(
'Date'[Date],
BLANK(),
MAX('Date'[Date])
),
'Sales Order'[Channel] = "Internet"
)
RETURN
CustomersLTD
The measure produces the count of distinct customers LTD through the end of each month.
The DATESBETWEEN function returns a table that contains a column of dates that begins with a given start date and continues until a given end date.
When the start date is BLANK, it will use the first date in the date column. (Conversely, when the end date is BLANK, it will use the last date in the date column.) In this case, the end date is determined by the MAX function, which returns the last date in filter context.
Therefore, if the month of August 2017 is in filter context, then the MAX function will return August 31, 2017 and the DATESBETWEEN function will return all dates through to August 31, 2017.
Next, you will modify the measure by renaming it to New Customers and by adding a second variable to store the count of distinct customers before the time period in filter context. The RETURN clause now subtracts this value from LTD customers to produce a result, which is the number of new customers in the time period.
New Customers =
VAR CustomersLTD =
CALCULATE(
DISTINCTCOUNT(Sales[CustomerKey]),
DATESBETWEEN(
'Date'[Date],
BLANK(),
MAX('Date'[Date])
),
'Sales Order'[Channel] = "Internet"
)
VAR CustomersPrior =
CALCULATE(
DISTINCTCOUNT(Sales[CustomerKey]),
DATESBETWEEN(
'Date'[Date],
BLANK(),
MIN('Date'[Date]) - 1
),
'Sales Order'[Channel] = "Internet"
)
RETURN
CustomersLTD - CustomersPrior
For the CustomersPrior variable, notice that the DATESBETWEEN function includes dates until the first date in filter context minus one. Because Microsoft Power BI internally stores dates as numbers, you can add or subtract numbers to shift a date.
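The LTD-minus-prior arithmetic behind the New Customers measure can be mirrored in Python as an illustration (the sales rows and function name are hypothetical):

```python
from datetime import date

# Hypothetical sales rows: (order_date, customer_key)
sales = [
    (date(2017, 7, 5), "Ann"),
    (date(2017, 7, 9), "Ben"),
    (date(2017, 8, 2), "Ann"),   # repeat purchase: Ann isn't new in August
    (date(2017, 8, 20), "Cat"),
]

def new_customers(rows, period_start, period_end):
    """Distinct customers LTD through the period end, minus distinct
    customers LTD before the period start: the difference is the number
    of customers who first appeared in the period."""
    customers_ltd = {c for d, c in rows if d <= period_end}
    customers_prior = {c for d, c in rows if d < period_start}
    return len(customers_ltd) - len(customers_prior)

print(new_customers(sales, date(2017, 8, 1), date(2017, 8, 31)))  # only Cat is new
```

Ann's August purchase doesn't count because she already appears in the prior LTD set, which is exactly how the DAX subtraction isolates first-time customers.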
ii. Snapshot calculations
Occasionally, fact data is stored as snapshots in time. Common examples include inventory stock levels or account balances. A snapshot of values is loaded into the table on a periodic basis.
When summarizing snapshot values (like inventory stock levels), you can summarize values across any dimension except date. Adding stock level counts across product categories produces a meaningful summary, but adding stock level counts across dates does not. Adding yesterday's stock level to today's stock level isn't a useful operation to perform (unless you want to average that result).
When you are summarizing snapshot tables, measure formulas can rely on DAX time intelligence functions to enforce a single date filter.
Now, you'll add a measure to the Inventory table that sums the UnitsBalance value for a single date. The date will be the last date of each time period, which is achieved by using the LASTDATE function. Format the measure as a whole number with the thousands separator.
Stock on Hand =
CALCULATE(
SUM(Inventory[UnitsBalance]),
LASTDATE('Date'[Date])
)
Notice that the measure formula uses the SUM function. An aggregate function must be used (measures don't allow direct references to columns), but given that only one row exists for each product for each date, the SUM function will only operate over a single row.
The measure returns BLANKs for June 2020 because no record exists for the last date in June. According to the data, it hasn't happened yet.
Filtering by the last date in filter context has inherent problems: A recorded date might not exist because it hasn't yet happened, or perhaps because stock balances aren't recorded on weekends.
Your next step is to adjust the measure formula to determine the last date that has a non-BLANK result and then filter by that date. You can achieve this task by using the LASTNONBLANK DAX function.
Stock on Hand =
CALCULATE(
SUM(Inventory[UnitsBalance]),
LASTNONBLANK(
'Date'[Date],
CALCULATE(SUM(Inventory[UnitsBalance]))
)
)
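The last-non-blank selection can be sketched in Python as an illustration. The snapshot rows and function name are hypothetical, and None stands in for BLANK:

```python
from datetime import date

# Hypothetical inventory snapshots: (snapshot_date, product, units_balance).
# No snapshot exists for May 30-31, 2020 (a weekend in this example).
inventory = [
    (date(2020, 5, 28), "Bike", 10),
    (date(2020, 5, 29), "Bike", 12),
    (date(2020, 5, 29), "Helmet", 40),
]

def stock_on_hand(rows, period_end):
    """Find the latest date at or before `period_end` that actually has
    snapshot rows (the last non-blank date), then sum that date's
    balances, sketching the LASTNONBLANK pattern."""
    recorded = [d for d, _, _ in rows if d <= period_end]
    if not recorded:
        return None  # BLANK: no snapshot recorded yet
    last_non_blank = max(recorded)
    return sum(units for d, _, units in rows if d == last_non_blank)

print(stock_on_hand(inventory, date(2020, 5, 31)))  # falls back to May 29
print(stock_on_hand(inventory, date(2020, 5, 1)))   # None: nothing recorded
```

Even though the period ends on May 31, the measure reports the May 29 snapshot, because that's the latest date with recorded balances.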
4. Exercise - Create Advanced DAX Calculations in Power BI Desktop
i. Lab story
In this lab, you’ll create measures with DAX expressions involving filter context manipulation.
In this lab, you'll learn how to:
Use the CALCULATE() function to manipulate filter context
Use Time Intelligence functions
ii. Work with Filter Context
iii. Create a matrix visual
iv. Manipulate filter context
In this task, you'll use the CALCULATE() function to manipulate filter context.
Sales All Region =
CALCULATE(SUM(Sales[Sales]), REMOVEFILTERS(Region))
The CALCULATE() function is a powerful function used to manipulate the filter context. The first argument takes an expression or a measure (a measure is just a named expression). Subsequent arguments allow modifying the filter context.
The REMOVEFILTERS() function removes active filters. It can take either no arguments, or a table, a column, or multiple columns as its argument.
In this formula, the measure evaluates the sum of the Sales column in a modified filter context, which removes any filters applied to the columns of the Region table.
Sales % All Region =
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(Region)
)
)
The DIVIDE() function divides the Sales measure (not modified by filter context) by the Sales measure in a modified context, which removes any filters applied to the Region table.
Sales % Country =
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(Region[Region])
)
)
The difference is that the denominator modifies the filter context by removing filters on the Region column of the Region table, not all columns of the Region table. It means that any filters applied to the group or country columns are preserved. It will achieve a result that represents sales as a percentage of country.
Sales % Country =
IF(
ISINSCOPE(Region[Region]),
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(Region[Region])
)
)
)
The IF() function uses the ISINSCOPE() function to test whether the region column is the level in a hierarchy of levels. When true, the DIVIDE() function is evaluated. When false, a blank value is returned because the region column isn’t in scope.
Sales % Group =
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(
Region[Region],
Region[Country]
)
)
)
To achieve sales as a percentage of group, two filters can be applied to effectively remove the filters on two columns.
v. Work with Time Intelligence
vi. Create a YTD measure
Sales YTD =
TOTALYTD(SUM(Sales[Sales]), 'Date'[Date], "6-30")
The TOTALYTD() function evaluates an expression (in this case, the sum of the Sales column) over a given date column. The date column must belong to a table marked as a date table, as was done in the Create DAX Calculations in Power BI Desktop lab.
The function can also take a third optional argument representing the last date of a year. The absence of this date means that December 31 is the last date of the year. For Adventure Works, June is the last month of their year, and so “6-30” is used.
The TOTALYTD() function performs filter manipulation, specifically time filter manipulation. For example, to compute YTD sales for September 2017 (the third month of the fiscal year), all filters on the Date table are removed and replaced with a new filter of dates commencing at the beginning of the year (July 1, 2017) and extending through to the last date of the in-context date period (September 30, 2017).
Many Time Intelligence functions are available in DAX to support common time filter manipulations.
vii. Create a YoY growth measure
Sales YoY Growth =
VAR SalesPriorYear =
CALCULATE(
SUM(Sales[Sales]),
PARALLELPERIOD(
'Date'[Date],
-12,
MONTH
)
)
RETURN
SalesPriorYear
The SalesPriorYear variable is assigned an expression that calculates the sum of the Sales column in a modified context that uses the PARALLELPERIOD() function to shift 12 months back from each date in filter context.
The measure is then modified so that the RETURN clause calculates the growth ratio:
Sales YoY Growth =
VAR SalesPriorYear =
CALCULATE(
SUM(Sales[Sales]),
PARALLELPERIOD(
'Date'[Date],
-12,
MONTH
)
)
RETURN
DIVIDE(
(SUM(Sales[Sales]) - SalesPriorYear),
SalesPriorYear
)
5. Check your knowledge
In the context of semantic model calculations, which statement best describes time intelligence?
Filter context modifications involving a date table
Time intelligence calculations modify date filter contexts.
You're developing a semantic model in Power BI Desktop. You've just added a date table by using the CALENDARAUTO function. You've extended it with calculated columns, and you've related it to other model tables. What else should you do to ensure that DAX time intelligence calculations work correctly?
Mark as a Date table.
You must mark the date table so that Power BI can correctly filter its dates.
You have a table that stores account balance snapshots for each date, excluding weekends. You need to ensure that your measure formula only filters by a single date. Also, if no record is on the last date of a time period, it should use the latest account balance. Which DAX time intelligence function should you use?
LASTNONBLANK
The LASTNONBLANK function will return the last date in the filter context where a snapshot record exists. This option will help you achieve the objective.
6. Summary
In this module,
you learned that time intelligence calculations are concerned with modifying the filter context for date filters.
You were introduced to many DAX time intelligence functions, which support the creation of calculations, such as year-to-date (YTD) or year-over-year (YoY).
You also learned that life-to-date (LTD) calculations can help you count new occurrences over your fact data, and that snapshot data can be filtered in a way to help guarantee that only a single snapshot value is returned.
Learning objectives
By the end of this module, you'll be able to:
Define time intelligence.
Use common DAX time intelligence functions.
Create useful time intelligence calculations.
IX. Module 9 Optimize a model for performance in Power BI
Course | Microsoft Power BI Data Analyst |
Module 9/17 | Optimize a model for performance in Power BI |
Performance optimization, also known as performance tuning, involves making changes to the current state of the semantic model so that it runs more efficiently. Essentially, when your semantic model is optimized, it performs better.
1. Introduction to performance optimization
As a data analyst, you will spend approximately 90 percent of your time working with your data, and nine times out of ten, poor performance is a direct result of a bad semantic model, bad Data Analysis Expressions (DAX), or the mix of the two.
A smaller semantic model uses fewer resources (memory) and achieves faster data refresh, calculations, and rendering of visuals in reports. Therefore, the performance optimization process involves minimizing the size of the semantic model and making the most efficient use of the data in the model, which includes:
Ensuring that the correct data types are used.
Deleting unnecessary columns and rows.
Avoiding repeated values.
Replacing numeric columns with measures.
Reducing cardinalities.
Analyzing model metadata.
Summarizing data where possible.
In this module, you will be introduced to the steps, processes, and concepts that are necessary to optimize a semantic model for enterprise-level performance.
By the end of this module, you'll be able to:
Review the performance of measures, relationships, and visuals.
Use variables to improve performance and troubleshooting.
Improve performance by reducing cardinality levels.
Optimize DirectQuery models with table level storage.
Create and manage aggregations.
2. Review performance of measures, relationships, and visuals
semantic model with multiple tables, complex relationships, intricate calculations, multiple visuals, or redundant data - a potential exists for poor report performance
i. Identify report performance bottlenecks
optimal performance - create an efficient semantic model with fast-running queries and measures
improve the model - analyze the query plans and dependencies, and then make changes to further optimize performance
a. Analyze performance
Performance analyzer - measures how long a visual takes to refresh when initiated by a user
Performance analyzer - for accurate results, clear the visual cache and the data engine cache first
b. Review results
DAX query - The time it took for the visual to send the query, along with the time it took Analysis Services to return the results.
Visual display - The time it took for the visual to render on the screen, including the time required to retrieve web images or geocoding.
Other - The time it took the visual to prepare queries, wait for other visuals to complete, or perform other background processing tasks. If this category displays a long duration, the only real way to reduce this duration is to optimize DAX queries for other visuals, or reduce the number of visuals in the report.
ii. Resolve issues and optimize performance
a. Visuals
number of visuals on the report page,
use drill-through pages and report page tooltips.
b. DAX query
Performance analyzer highlights potential issues but doesn't tell you what needs to be done to improve them - use DAX Studio to analyze the queries further
try using different DAX functions
c. Semantic model
Performance analyzer - a DAX query with a high duration value (> 120 ms) suggests reviewing the relationships, columns, or metadata in your model, or the status of the Auto date/time option
d. Relationships
relationship cardinality properties are correctly configured.
e. Columns
remove an unnecessary column
try to deal with them at the source when loading data
f. Metadata
Metadata is information about other data.
Power BI metadata contains information on your semantic model, such as
the name,
data type and
format of each of the columns,
the schema of the database,
the report design,
when the file was last modified,
the data refresh rates, and much more.
use Power Query Editor to review: unnecessary columns, unnecessary rows, data types, query names, column details
Column quality - percentage of items in the column are valid
Column distribution - frequency and distribution of the values
Column profile - statistics chart and a column distribution chart
Column profiling is based on the top 1,000 rows by default; change it to Column profiling based on entire data set for complete statistics.
data - compressed and stored to the disk by the VertiPaq storage engine
g. Auto date/time feature
Auto date/time option - enabled globally - automatically creates a hidden calculated table for each date column
Disabling the Auto date/time option - lowers the size of your semantic model and reduces the refresh time
3. Use variables to improve performance and troubleshooting
Use variables in your DAX formulas to help you write less complex and more efficient calculations.
Variables in your semantic model provide:
Improved performance - removes the need to evaluate the same expression multiple times, reducing query processing time
Improved readability
Simplified debugging
Reduced complexity
i. Use variables to improve performance
Without variables (the prior-year expression is evaluated twice):
Sales YoY Growth =
DIVIDE (
    ( [Sales] - CALCULATE ( [Sales], PARALLELPERIOD ( 'Date'[Date], -12, MONTH ) ) ),
    CALCULATE ( [Sales], PARALLELPERIOD ( 'Date'[Date], -12, MONTH ) )
)
With a variable (the expression is evaluated once):
Sales YoY Growth =
VAR SalesPriorYear =
    CALCULATE ( [Sales], PARALLELPERIOD ( 'Date'[Date], -12, MONTH ) )
VAR SalesVariance =
    DIVIDE ( ( [Sales] - SalesPriorYear ), SalesVariance )
RETURN
    SalesVariance
ii. Use variables to improve readability
iii. Use variables to troubleshoot multiple steps
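To sketch the troubleshooting technique: temporarily point RETURN at an intermediate variable to inspect its value, then switch back once that step is verified. This reuses the Sales YoY Growth measure shown earlier.

```dax
Sales YoY Growth =
VAR SalesPriorYear =
    CALCULATE ( [Sales], PARALLELPERIOD ( 'Date'[Date], -12, MONTH ) )
VAR SalesVariance =
    DIVIDE ( ( [Sales] - SalesPriorYear ), SalesPriorYear )
RETURN
    -- While debugging, return an intermediate step instead of the final result:
    SalesPriorYear
    -- Switch back to SalesVariance once the intermediate value looks correct.
```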
4. Reduce cardinality
Cardinality is a term that is used to describe the uniqueness of the values in a column.
Cardinality is also used in the context of the relationships between two tables, where it describes the type of relationship (one-to-one, one-to-many, or many-to-many).
i. Identify cardinality levels in columns
Use Power Query Editor to analyze the metadata; the Column distribution option displays statistics on how many distinct and unique items are in each column.
Distinct values count - The total number of different values found in a given column.
Unique values count - The total number of values that only appear once in a given column.
Low level of cardinality - more optimized performance; a column that has many repeated values in its range (unique count is low).
High level of cardinality - a column that has many unique values in its range (unique count is high).
Optimized performance - have more low-cardinality columns and fewer high-cardinality columns in your semantic model.
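The distinct versus unique distinction can be illustrated with two DAX measures (a minimal sketch, assuming a hypothetical Sales table with a CustomerKey column):

```dax
-- Total number of different values found in the column
Distinct Customers = DISTINCTCOUNT ( Sales[CustomerKey] )

-- Total number of values that appear exactly once in the column
Unique Customers =
COUNTROWS (
    FILTER (
        VALUES ( Sales[CustomerKey] ),
        CALCULATE ( COUNTROWS ( Sales ) ) = 1   -- this key occurs in only one row
    )
)
```

A low-cardinality column has a high distinct-to-unique gap (many repeats); a high-cardinality column, such as a key column, has distinct and unique counts close to the row count.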
ii. Reduce relationship cardinality
iii. Improve performance by reducing cardinality levels
Reducing the data loaded into your model improves the relationship cardinality of the report.
Reduce model size - use a summary table from the data source. A disadvantage - you might lose the ability to drill into data because the detail no longer exists. This tradeoff can be mitigated by using a mixed model design.
each table can have its Storage Mode property set as Import or DirectQuery.
Reduce the model size - set the Storage Mode property for larger fact-type tables to DirectQuery.
Recap
Performance optimization (performance tuning) - making changes to the current state of the semantic model so that it runs more efficiently; an optimized semantic model performs better.
smaller sized semantic model - uses less resources (memory) and achieves faster data refresh, calculations, and rendering of visuals in reports
performance optimization process - Ensuring the correct data types, Deleting unnecessary columns and rows, Avoiding repeated values,
Replacing numeric columns with measures, Reducing cardinalities, Analyzing model metadata,
semantic model - multiple tables, complex relationships, intricate calculations, multiple visuals, or redundant data - a potential exists for poor report performance
Performance analyzer - for accurate results, clear the visual cache and the data engine cache; it highlights potential issues but does not tell you what needs to be done to improve them
Column profiling based on top 1000 rows > Column profiling based on entire data set.
Disabling Auto date/time option - lower the size of your semantic model and reduce the refresh time.
Use variables
Cardinality - uniqueness of the values in a column, direction of the relationship
Low level of cardinality - more optimized performance; column that has a lot of repeated values in its range (unique count is low).
High level of cardinality - column that has a lot of unique values in its range (unique count is high).
Optimized performance - have more low-cardinality columns and fewer high-cardinality columns in your semantic model.
Reduce the model size - mixed model design - set the Storage Mode property for larger fact-type tables to DirectQuery.
5. Optimize DirectQuery models with table level storage
DirectQuery - connect directly to data in its source repository; only the schema is loaded into Power BI, and queries might time out.
i. Implications of using DirectQuery
data changes frequently and near real-time reporting is required
large data can be queried without the need to pre-aggregate
data sovereignty restrictions apply, to comply with legal requirements
the source is a multidimensional data source that contains measures, such as SAP Business Warehouse (BW)
ii. Behavior of DirectQuery connections
iii. Limitations of DirectQuery connections
Performance - depends on the performance of the underlying data source.
Security
Data transformation - transformations on data sourced via DirectQuery are limited
Modeling - modeling capabilities aren't available, or are limited
iv. Optimize performance
examine the queries sent to the underlying source
a. Optimize data in Power BI Desktop
reduce the number of visuals on the report page; reduce the number of fields in each visual; remove unnecessary columns and rows.
b. Optimize the underlying data source (connected database)
Avoid the use of complex calculated columns because the calculation expression will be embedded into the source queries.
It is more efficient to materialize the expression in the source (for example, as a persisted computed column), so it doesn't have to be pushed down with every query. You could also consider adding surrogate key columns to dimension-type tables.
v. Customize the Query reduction options
Power BI Desktop gives you the option to send fewer queries and to disable certain interactions that will result in a poor experience if the resulting queries take a long time to run.
Applying these options prevents queries from continuously hitting the data source, which should improve performance.
6. Create and manage aggregations
aggregations - reduce table sizes in the semantic model, which improves query performance; the cached data is smaller; the refresh process speeds up.
7. Check your knowledge
What benefit do you get from analyzing the metadata?
The benefit of analyzing the metadata is that you can clearly identify data inconsistencies within your semantic model.
What can be achieved by removing unnecessary rows and columns?
Deleting unnecessary rows and columns will reduce the semantic model size; it's good practice to load only necessary data into your semantic model.
Which of the following statements about relationships in Power BI Desktop is true?
A working relationship can be created as long as there is at least one common column between the two tables.
8. Summary
optimize performance and reduce the size of the model.
You started the optimization process by using Performance analyzer and other tools to review the performance of measures, relationships, and visuals, and then made improvements based on the analysis results.
Next, you used variables to write less complex and more efficient calculations. You then took a closer look at the column distribution and reduced the cardinality of your relationships.
At that stage, the semantic model was more optimized.
You considered how the situation would be different if your organization used a DirectQuery model, and then you identified how to optimize performance from Power BI Desktop and the source database.
Finally, you used aggregations to significantly reduce the size of the semantic model.
Learning objectives
By the end of this module, you will be able to:
Review the performance of measures, relationships, and visuals.
Use variables to improve performance and troubleshooting.
Improve performance by reducing cardinality levels.
Optimize DirectQuery models with table level storage.
Create and manage aggregations.
X. Module 10 Design Power BI reports
Course | Microsoft Power BI Data Analyst |
Module 10/17 | Design Power BI reports |
Power BI - 30 core visuals
This module will guide you through selecting the most appropriate visual type to meet your design and report layout requirements.
1. Introduction
i. Report structure
ii. Report pages
2. Design the analytical report layout
3. Design visually appealing reports
4. Report objects
5. Select report visuals
6. Select report visuals to suit the report layout
7. Format and configure visualizations
8. Work with key performance indicators
9. Exercise - Design a report in Power BI Desktop
i. Lab story
In this lab, you’ll create a three-page report. You’ll then publish it to Power BI, where you’ll open and interact with the report.
In this lab you learn how to:
Design a report
Configure visual fields and format properties
ii. Create a Report
iii. Design page 1
In the visual fields pane, in the X-axis well/area, for the Month field, select the down-arrow, and then select Show Items With No Data.
iv. Design page 2
v. Design page 3
simulate the performance of row-level security filters
vi. Publish the report
vii. Explore the report
10. Check your knowledge
At the Contoso Skateboard Company, Brandon is designing a dashboard report to show inventory stock levels over time. What type of report visual should Brandon choose to effectively show the stock levels?
A line chart is probably the most effective way to visualize a time series, such as stock inventory levels over time.
At the Contoso Skateboard Company, Sakura is designing an analytical report. It must include a visual that allows report consumers to explore and discover detailed sales values over time and by store. What type of report visual should Sakura choose to support the report consumer requirement?
A matrix visual allows the report consumer to drill down on the columns and rows to reveal detailed values.
At the Contoso Skateboard Company, James is designing a dashboard report. It must prominently show values of sales revenue, units sold, cost of goods sold, and profit, and each value should be compared to a target value. What type of report visual should James choose to support the report consumer requirement?
A KPI visual can show and compare actual and target values.
11. Summary
Learning objectives
In this module, you will:
Learn about the structure of a Power BI report.
Learn about report objects.
Select the appropriate visual type to use.
XI. Module 11 Configure Power BI report filters
Course | Microsoft Power BI Data Analyst |
Module 11/17 | Configure Power BI report filters |
1. Introduction to designing reports for filtering
Filtering - semantic model (RLS), Report, Page, Visual, Measure.
i. semantic model (RLS)
A Power BI report queries a single semantic model
A semantic model can enforce row-level security (RLS) to restrict access to a subset of data, so different users will see different data
Report filters can't override RLS
ii. Report structure
Power BI report is hierarchical
iii. Measure
measure is a model object that is designed to summarize data
Measure DAX formulas can modify filter context by using the CALCULATE or CALCULATETABLE functions and time intelligence functions; this flexibility to add, remove, or modify filters can override filters applied to the report structure.
A good example of a measure that overrides report filters is a three-month moving average calculation.
To compute the result for March, the filter context for month must expand to encompass January, February, and March.
The CALCULATE function or a time intelligence function can modify the filter context to produce that result.
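The three-month moving average could be sketched as follows. This is a minimal sketch, assuming a marked 'Date' table, an existing [Sales] measure, and evaluation in a month-level filter context; it is not the course's own formula.

```dax
Sales 3M Moving Avg =
DIVIDE (
    -- Expand the filter context to the last three months
    -- (e.g. January-March when evaluated for March)
    CALCULATE (
        [Sales],
        DATESINPERIOD ( 'Date'[Date], MAX ( 'Date'[Date] ), -3, MONTH )
    ),
    3
)
```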
2. Apply filters to the report structure
3. Apply filters with slicers
slicers - by default, filter all other visuals on the page; edit visual interactions to restrict filtering between two visuals
slicer is a visual that propagates filters to other visuals on the same page or (when synced) across other pages.
The slicer layout is responsive to the data type of the field. Field data types are either text, numeric, or date.
By default, a text field will produce a list slicer, a numeric field will produce a numeric range "between" filter, and a date field will produce a date range "between" filter, allowing value selection with calendar controls.
4. Design reports with advanced filtering techniques
Beyond filters and slicers, report authors can employ other filtering techniques, such as:
Visual interactions
Drillthrough
Report tooltip
Bookmarks
Report options
Query reduction options
i. Visual interactions
visual interactions allow disabling cross-filter propagation between visuals.
cross filters propagated
no cross filters propagated
ii. Drillthrough
Add drillthrough pages to allow report consumers to drill from visuals. By default, the drillthrough action propagates all filters that apply to the visual to the drillthrough page.
iii. Report tooltip
By default, the report tooltip receives all filters that apply to the visual.
iv. Bookmarks
Bookmarks capture a specific view of a report, including filters, slicers, the page selection, and the state of visuals
v. Report options
vi. Query reduction options
You can configure report settings to reduce the number of queries that are sent to the semantic model. Fewer queries result in better responsiveness as report consumers update filters or slicers or cross-filter report pages.
Consider enabling the query reduction options when the semantic model uses DirectQuery table storage or when imported data volumes are large and calculations are complex and slow.
5. Consumption-time filtering
In reading view, report consumers can use many different filter techniques when viewing a Power BI report, such as:
Using slicers.
Using filters.
Applying interactive filtering actions.
Determining applied filters.
Working with persistent filters - persistent filters are a feature that saves a report consumer's slicer and filter settings and automatically applies them when the report consumer reopens the report. That way, Power BI remembers previously applied filters. You can revert to the default filters (saved in the report by the report author) by selecting Reset to default.
6. Select report filter techniques
7. Case study - Configure report filters based on feedback
8. Check your knowledge
Which of the following capabilities is unique to the Filters pane?
Top N filtering can only be achieved in the Filters pane.
Which of the following capabilities is unique to slicers?
Horizontal orientation can only be achieved with slicers.
At the Contoso Skateboard Company, John is responsible for developing a report for the sales director. One report page includes a date slicer and many visuals. When the sales director uses the slicer, all but one of the visuals, a Decomposition Tree visual, should receive the slicer filters. How should John configure the requirement?
By editing visual interactions, John can restrict the slicer from filtering the Decomposition Tree visual.
9. Summary
After you have gained an understanding of how filters work in Power BI reports, you'll be on your way to developing reports that allow report consumers to focus on the data that interests them. The main takeaway is that, at design time, you will have two main techniques to apply filters: the Filters pane and slicers. Make sure that you avoid using both techniques in the same report because it can create confusion for your report consumers.
When you're ready to deliver your report, it might be helpful to provide a demonstration (or video) to show how you intend report consumers to use the report. That approach will allow you to show how to apply filters and provide lessons on how to apply interactive filtering actions, such as include/exclude groups, cross filtering, report page drillthrough, and bookmarks.
Filtering - semantic model (RLS), Report, Page, Visual, Measure.
report - can't override RLS
Measure DAX formulas - modify filter context by using the CALCULATE or CALCULATETABLE functions, intelligence functions - flexibility to add, remove, or modify filters - override filters applied to report structure.
slicers filter - by default filter all other visuals on the page, edit visual interactions to restrict filtering between two visuals
Beyond filters and slicers, other filtering techniques - Visual interactions, Drillthrough, Report tooltip, Bookmarks, Report options, Query reduction options
Bookmarks capture a specific view of a report, including filters, slicers, the page selection, and the state of visuals
Learning objectives
In this module, you will:
Design reports for filtering.
Design reports with slicers.
Design reports by using advanced filtering techniques.
Apply consumption-time filtering.
Select appropriate report filtering techniques.
XII. Module 12 Enhance Power BI report designs for the user experience
Course | Microsoft Power BI Data Analyst |
Module 12/17 | Enhance Power BI report designs for the user experience |
1. Design reports to show details
An analytical report layout can integrate a guided analytical experience. That way, the report design is aligned to the automatic and unconscious processes that take place when the report consumer looks at the report.
The guided analytical experience allows navigation between three levels:
High-level metrics
Supporting visuals
Details, when required
You can show details by using four different techniques:
Use drillable visuals
Add tooltips
Add drillthrough
Embed paginated reports
2. Design reports to highlight values
Power BI supports several techniques to highlight values:
Conditional formatting
Overlaid analytics
Anomaly detection
Specialized visuals
3. Design reports that behave like apps
4. Work with bookmarks
Bookmarks capture different state, relating to data, display, and the current page. By default, a new bookmark captures all state types, but you can decide to disable any of them.
5. Design reports for navigation
6. Work with visual headers
7. Design reports with built-in assistance
8. Tune report performance
test your report in the Power BI service to see how it works from a user's perspective
9. Optimize reports for mobile use
10. Exercise - Enhance Power BI reports
i. Lab story
In this lab you learn how to:
Sync slicers
Create a drill through page
Apply conditional formatting
Create and use bookmarks
ii. Get started – Sign in
iii. Get started – Open report
iv. Sync slicers
you’ll sync the Year and Region slicers
v. Configure drill through
you’ll create a new page and configure it as a drill through page
vi. Create a drill through page
you’ll create a new page and configure it as a drill through page
vii. Add Conditional Formatting
viii. Add conditional formatting
you’ll enhance the drill through page with conditional formatting.
ix. Add Bookmarks and Buttons
you’ll enhance the My Performance page with buttons, allowing the report user to select the visual type to display
x. Add bookmarks
you’ll add two bookmarks, one to display each of the monthly sales/targets visuals.
Disabling the Data option means the bookmark won’t use the current filter state. That’s important because otherwise the bookmark would permanently lock in the filter currently applied by the Year slicer
bookmark to ignore filters (Data option off)
xi. Add buttons
you’ll add two buttons, and assign bookmark actions to each.
xii. Publish the report
xiii. Explore the report
11. Check your knowledge
At the Contoso Skateboard Company, Sally is creating a report to show sales over time. Sally is aware that occasional spikes occur in sales revenue that could be attributed to many different reasons, such as a marketing campaign. Sally needs to ensure that supporting visuals provide explanation for the spikes. What type of visual should Sally use?
Adding anomaly detection to a line chart visual will highlight anomalies and provide accompanying explanations as data visuals.
The Q&A visual allows for the entry of natural language questions that are answered by data visuals. It can’t provide explanation for spikes in data.
At the Contoso Skateboard Company, Sanjay is authoring a report that will be distributed to sales managers. The report contains some sensitive data that shouldn’t be exported. Which report design feature can Sanjay configure to ensure that data isn’t exported?
By disabling the More options (…) icon, report consumers can't export data.
Page tooltips allow your report consumers to gain deeper insights quickly and efficiently from a visual. It can’t restrict the exporting of data.
What is the purpose of the Display state in bookmarks?
The Display state captures the visibility of report objects, such as shapes, images, buttons, and text boxes.
12. Summary
Sync slicers
drill through page
conditional formatting
bookmarks
buttons
Learning objectives
In this module, you will:
Design reports to show details.
Design reports to highlight values.
Design reports that behave like apps.
Work with bookmarks.
Design reports for navigation.
Work with visual headers.
Design reports with built-in assistance.
Use specialized visuals.
XIII. Module 13 Perform analytics in Power BI
Course | Microsoft Power BI Data Analyst |
Module 13/17 | Perform analytics in Power BI |
You'll learn how to use Power BI to perform data analytical functions, how to identify outliers in your data, how to group data together, and how to bin data for analysis. You'll also learn how to perform time series analysis.
Finally, you'll work with advanced analytic features of Power BI, such as Quick Insights, AI Insights, and the Analyze feature.
1. Introduction to analytics
Analytics - data mining, big data analytics, machine learning, AI, and predictive analytics.
technical aspects of analytics - use predictive capabilities to solve business problems.
Additionally, analytics can help with fraud detection, image recognition, sentiment analysis
2. Explore statistical summary
statistical summary - quick and simple description of data.
Power BI - statistical analysis - DAX functions, visuals: histograms, bell curves, advanced analytics visuals, statistical programming languages: Python, R.
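A few of the statistical DAX functions can be sketched as simple measures (hypothetical examples, assuming a Sales table with a Sales Amount column):

```dax
Average Sales  = AVERAGE ( Sales[Sales Amount] )   -- arithmetic mean
Median Sales   = MEDIAN ( Sales[Sales Amount] )    -- middle value, robust to outliers
Sales Std Dev  = STDEV.P ( Sales[Sales Amount] )   -- population standard deviation
```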
TOPN DAX function - returns the top N rows of a specified table
Top 10 Products =
SUMX ( TOPN ( 10, Product, Product[Total Sales] ), [Total Sales] )
3. Identify outliers with Power BI visuals
outlier - anomaly in data.
i. Use a visual to identify outliers
scatter chart - shows the relationship between two numerical values, displays patterns in large sets of data, useful for displaying outliers
ii. Use DAX to identify outliers
Outliers =
CALCULATE (
[Order Qty],
FILTER (
VALUES ( Product[Product Name] ),
COUNTROWS ( FILTER ( Sales, [Order Qty] >= [Min Qty] ) ) > 0
)
)
[Min Qty] is assumed to be a measure (for example, driven by a what-if parameter) that defines the outlier threshold.
4. Group and bin data for analysis
Grouping - putting values into groups based on categories of data.
Binning - putting values into equal-sized groups; used for grouping continuous fields, such as numbers and dates.
5. Apply clustering techniques
Clustering - identify a segment (cluster) of data that is similar to each other but dissimilar to the rest of the data
6. Conduct time series analysis
Time series analysis - analyzing a series of data in time order - trends and make predictions
7. Use the Analyze feature
Analyze feature - analyzing why your data distribution looks the way that it does.
8. Create what-if parameters
what-if parameters to run scenarios and scenario-type analysis on your data
what-if parameters - look at historical data to analyze potential outcomes
what-if parameters - look forward, to predict or forecast what could happen in the future.
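As a sketch of the pattern: creating a numeric range what-if parameter (here a hypothetical Discount parameter) generates a disconnected table and a selection measure similar to the following. Discounted Sales is a hypothetical scenario measure, assuming an existing [Sales] measure.

```dax
-- Table generated for the parameter (0% to 50% in 5% steps)
Discount = GENERATESERIES ( 0, 0.50, 0.05 )

-- Measure that reads the slicer selection (falls back to 0 with no selection)
Discount Value = SELECTEDVALUE ( 'Discount'[Discount], 0 )

-- Hypothetical scenario measure that applies the selected value
Discounted Sales = [Sales] * ( 1 - [Discount Value] )
```

Putting the Discount field in a slicer then lets report consumers run the scenario interactively.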
9. Use specialized visuals
specialized visuals - AI visuals - Power BI uses machine learning to discover and display insights from data
The three main AI visuals are:
Key influencers
Decomposition tree
Q&A
i. Key influencers
Key influencers visual - understand the factors that drive a particular metric - will analyze the data, rank the factors that matter, and then present them as key influencers.
ii. Decomposition tree
Decomposition tree - visualize data across multiple dimensions; automatically aggregates data and enables you to drill down into dimensions in any order - useful for ad hoc exploration and conducting root cause analysis
iii. Q&A
Q&A visual - ask questions of their data and receive responses as data visualizations
10. Exercise - Perform Advanced Analytics with AI Visuals
In this lab, you’ll create the Sales Exploration report.
In this lab you learn how to:
Create animated scatter charts
Use a visual to forecast values
i. Lab story
ii. Get started – Sign in
iii. Get started – Create a semantic model
iv. Create the report
v. Create an animated scatter chart
you’ll create a scatter chart that can be animated.
vi. Create a forecast
you’ll create a forecast to determine possible future sales revenue.
When forecasting over a time line, you’ll need at least two cycles (years) of data to produce an accurate and stable forecast.
11. Check your knowledge
What Power BI feature can give an in-depth analysis of the distribution of data?
The Analyze feature gives an in-depth analysis of the distribution of data, allowing a user to understand why the distribution looks the way that it does.
How can you access the time series chart visual?
Time series charts can be imported from AppSource.
What visual should be used to display outliers?
The scatter chart is best for displaying outliers.
12. Summary
Additionally, analytics can help with fraud detection, image recognition, sentiment analysis
outlier - anomaly in data.
scatter chart - shows the relationship between two numerical values, displays patterns in large sets of data, useful for displaying outliers
Grouping - categories of data.
Binning - grouping continuous fields, such as numbers and dates.
Clustering - identify a segment (cluster) of data that is similar to each other but dissimilar to the rest of the data
Time series analysis - analyzing a series of data in time order - trends and make predictions
Analyze feature - analyzing why your data distribution looks the way that it does.
what-if parameters to run scenarios and scenario-type analysis on your data
what-if parameters - look at historical data to analyze potential outcomes
what-if parameters - look forward, to predict or forecast what could happen in the future.
specialized visuals - AI visuals - Power BI uses machine learning to discover and display insights from data - Key influencers, Decomposition tree, Q&A.
Decomposition tree - visualize data across multiple dimensions; automatically aggregates data and enables you to drill down into dimensions in any order - useful for ad hoc exploration and conducting root cause analysis
Q&A visual - ask questions of their data and receive responses as data visualizations
When forecasting over a time line, you’ll need at least two cycles (years) of data to produce an accurate and stable forecast.
Learning objectives
In this module, you'll:
Explore statistical summary.
Identify outliers with Power BI visuals.
Group and bin data for analysis.
Apply clustering techniques.
Conduct time series analysis.
Use the Analyze feature.
Use advanced analytics custom visuals.
Review Quick insights.
Apply AI Insights.
XIV. Module 14 Create and manage workspaces in Power BI
Course | Microsoft Power BI Data Analyst |
Module 14/17 | Create and manage workspaces in Power BI |
Learn how to navigate the Power BI service, create and manage workspaces and related items, and distribute reports to users.
1. Introduction
load and transform data from numerous sources - build visuals - create DAX calculations - publish reports to the Power BI service - share these reports through a workspace
A workspace is a centralized repository in which you can collaborate with colleagues and teams to create collections of reports and dashboards.
workspaces to house reports and dashboards for use by multiple teams
highest level of security is maintained by controlling who can access semantic models, reports, and dashboards.
2. Distribute a report or dashboard
3. Monitor usage and performance
4. Recommend a development life cycle strategy
Deployment pipeline
5. Troubleshoot data by viewing its lineage
Lineage view - relationships between the artifacts in a workspace and their external dependencies.
6. Configure data protection
7. Check your knowledge
How is the Admin workspace role different from other types of workspace roles?
Admin is the only role that can remove any user from a workspace.
Which one of the following options is the best description of a workspace?
A workspace is a centralized location or repository that allows you to collaborate with colleagues and teams to create collections of reports, dashboards, and so on.
What feature in Power BI service can you use to troubleshoot the flow of data from its source to its destination?
Lineage view - allows you to view and troubleshoot the data flow from source to destination.
8. Summary
A workspace is a crucial feature of Power BI that allows you to share reports, build dashboards, and collaborate with your teams.
Power BI workspaces can help increase performance in your reports, ensure that appropriate security requirements are applied, make it easier for you to share content, and more.
With this information, you can add to your toolkit the ability to manage workspaces in Power BI service so that you can build your dashboards in the most efficient way possible.
Learning objectives
In this module, you will:
Create and manage Power BI workspaces and items.
Distribute a report or dashboard.
Monitor usage and performance.
Recommend a development lifecycle strategy.
Troubleshoot data by viewing its lineage.
Configure data protection.
XV. Module 15 Manage semantic models in Power BI
Course | Microsoft Power BI Data Analyst |
Module 15/17 | Manage semantic models in Power BI |
Microsoft Power BI - single semantic model to build many reports.
scheduled semantic model refreshes and resolving connectivity errors.
1. Introduction
one semantic model can be used by multiple users
make the reports more dynamic so that users can filter the data themselves
2. Use a Power BI gateway to connect to on-premises data sources
Two types of on-premises gateways are:
Standard mode - allows multiple users to connect to multiple on-premises data sources and is suitable for complex scenarios.
Personal mode - one user to connect to data sources. This type of gateway can be used only with Power BI and it can't be shared with other users, so it is suitable in situations where you're the only one in your organization who creates reports. You will install the gateway on your local computer, which needs to stay online for the gateway to work.
i. on-premises gateway
When you are working in the cloud and interacting with an element that is connected to an on-premises data source, the following actions occur:
The cloud service creates a query and the encrypted credentials for the on-premises data source. The query and credentials are sent to the gateway queue for processing.
The gateway cloud service analyzes the query and pushes the request to Microsoft Azure Service Bus.
Service Bus sends the pending requests to the gateway.
The gateway gets the query, decrypts the credentials, and then connects to one or more data sources with those credentials.
The gateway sends the query to the data source to be run.
The results are sent from the data source back to the gateway and then to the cloud service. The service then uses the results.
ii. Troubleshoot an on-premises data gateway
how to run a network port test
how to provide proxy information for your gateway
current data center region that you're in
3. Configure a semantic model scheduled refresh
Power BI deactivates your refresh schedule after four consecutive failures or when the service detects an unrecoverable error that requires a configuration update, such as invalid or expired credentials. It is not possible to change the consecutive failures threshold.
If a semantic model displays a small warning icon, you'll know that the semantic model is currently experiencing an issue. Select the warning icon to get more information.
4. Configure incremental refresh settings
Incremental refresh should only be used on data sources and queries that support query folding
Quicker refreshes - only data that has changed gets refreshed
Reduced resource consumption - because only a smaller amount of data is refreshed, the overall consumption of memory and other resources is reduced.
You can define an incremental refresh policy to solve this business problem. This process involves the following steps:
i. Define the filter parameters.
ii. Use the parameters to apply a filter.
iii. Define the incremental refresh policy.
iv. Publish changes to Power BI service.
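Step ii above can be sketched in Power Query M. This is a minimal sketch, assuming a hypothetical FactSales table with an OrderDate datetime column; RangeStart and RangeEnd are the reserved DateTime parameters you define in Power Query, while the server, database, and table names are placeholders:

let
    // Hypothetical source; server, database, and table names are placeholders.
    Source = Sql.Database("MyServer", "MyDatabase"),
    Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    // Filter with the reserved parameters. Use >= on one end and < on the
    // other so that rows on a partition boundary aren't loaded twice.
    FilteredRows = Table.SelectRows(Sales, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    FilteredRows

Because this filter can fold back to the SQL source, only rows inside the refresh window are retrieved; you then define the incremental refresh policy on the table and publish the changes to the Power BI service.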
5. Manage and promote semantic models
Power BI provides two ways to endorse your semantic models:
Promotion - Promote your semantic models when they're ready for broad usage.
Certification - Request certification for a promoted semantic model. Certification can be a highly selective process, so only the truly reliable and authoritative semantic models are used across the organization.
6. Troubleshoot service connectivity
If your data source credentials are not up to date, you'll need to take further action to investigate and resolve the issue.
7. Boost performance with query caching (Premium)
Query Caching is a local caching feature that maintains results on a user and report basis. Query caching reduces load time and increases query speed, especially for semantic models that aren't refreshed often and are accessed frequently.
8. Check your knowledge
Where are semantic model-scheduled refreshes configured?
Power BI service. Scheduled refreshes for semantic models are configured in Power BI service.
What reserved parameters configure the start and end of where Incremental refresh should occur?
RangeStart and RangeEnd. The reserved parameters RangeStart and RangeEnd configure the start and end of where incremental refresh should occur.
What is the difference between Promotion and Certification when you are endorsing a semantic model?
Promotion is for broad usage and doesn't require specific permissions, while Certification requires permission granted in the tenant admin settings. Because Certification can be a highly selective process, only truly reliable and authoritative semantic models are certified.
9. Summary
This module allowed you to take advantage of the following features of Power BI to help you manage your semantic models:
Two data refresh options to help you automate the refresh process and make it more efficient.
Endorsement features for your most critical semantic models to help your users identify the semantic models that they should use.
The on-premises gateway and ideas on how to troubleshoot potential connectivity issues.
These semantic model management techniques will help you keep your semantic models accessible and up to date, and will help you build high-quality reports and dashboards so that your users can make real-time decisions.
Learning objectives
In this module, you will:
Use a Power BI gateway to connect to on-premises data sources.
Configure a scheduled refresh for a semantic model.
Configure incremental refresh settings.
Manage and promote semantic models.
Troubleshoot service connectivity.
Boost performance with query caching (Premium).
XVI. Module 16 Create dashboards in Power BI
Course | Microsoft Power BI Data Analyst |
Module 16/17 | Create dashboards in Power BI |
While a Power BI report uses data from a single semantic model, a Power BI dashboard can contain visuals from different semantic models.
1. Introduction to dashboards
i. Dashboards vs. reports
Dashboards | Reports |
Create only in service | Create in Power BI Desktop (limited in service) |
Single page | Multiple pages |
Static tiles | Interactive visuals |
Read-only tiles | Filters pane |
Both dashboards and reports can be refreshed to show the latest data.
ii. Get started with dashboards
Dashboards are created by pinning report visuals; once pinned to a dashboard, those visuals are called tiles.
2. Configure data alerts
Data alerts can notify you or a user that a specific data point is above, below, or at a specific threshold that you can set.
3. Explore data by asking questions
4. Review Quick insights
5. Add a dashboard theme
6. Pin a live report page to a dashboard
Dashboards are intended to be a collection from various sources, not just as a "launching pad" for reports.
We recommend that you pin at the tile level first and foremost, and if needed, the entire report page can also be pinned. Seeing an entire report page in a dashboard tile can be difficult.
7. Configure a real-time dashboard
With Power BI's real-time streaming semantic models, you can stream data and update dashboards as soon as the data is logged.
i. Stream in Power BI
Data from a streaming data source is stored in a temporary cache, not a semantic model, which means you can't make changes to the semantic model.
To visualize data from a streaming data source, create a tile directly on a dashboard and use a custom streaming data source.
These tiles are optimized because no database exists to pull the data from; they have low latency and are best suited for data that doesn't need additional transformations, such as temperature or humidity.
ii. Visualize real-time data in Power BI
8. Set mobile view
9. Exercise - Create a Power BI dashboard
i. Lab story
In this lab, you’ll create the Sales Monitoring dashboard in the Power BI service using an existing report.
In this lab you learn how to:
Pin visuals to a dashboard
Use Q&A to create dashboard tiles
ii. Get started – Sign in
iii. Get started – Publish the report
iv. Create a dashboard
In this task, you’ll create the Sales Monitoring dashboard. You’ll pin a visual from the report, add a tile based on an image data URI, and use Q&A to create a tile.
v. Edit tile details
In this task, you’ll edit the details of two tiles.
vi. Refresh the semantic model
In this exercise, you’ll first load sales order data for June 2020 into the AdventureWorksDW2020 database. You’ll then open your Power BI Desktop file, perform a data refresh, and then upload the file to your workspace.
vii. Update the lab database
First, set the ship date for rows where it's missing:
UPDATE [dbo].[FactResellerSales] SET [ShipDate] = DATEADD(DAY, 7, [OrderDate]) WHERE [ShipDate] IS NULL;
Load data from csv file:
--Add June 2020 sales
BULK INSERT [dbo].[FactResellerSales] FROM 'F:\Power BI\filesfrom internet\Allfiles\Resources\ResellerSales_202006.csv' WITH
(
DATAFILETYPE = 'widechar'
,FIRSTROW = 2
,FIELDTERMINATOR = ','
,ROWTERMINATOR = '\n'
,TABLOCK
);
GO
viii. Refresh the Power BI Desktop file
ix. Review the dashboard
10. Check your knowledge
What is a dashboard?
A canvas of report elements that can be built in Power BI service. Correct. Dashboards can only be built in Power BI service.
What is one way that reports and dashboards differ?
In reports, you can have multiple pages; in dashboards, you can have only one page. Correct. Dashboards have only one page, but reports can have multiple pages.
Where can you configure and set data alerts?
Data alerts can be set only in Power BI service on specific visuals such as KPI cards, gauges, and cards. Correct. Data alerts can be set only in Power BI service on specific visuals.
11. Summary
In this module, you have learned about dashboards: what they are, why you need them, and what tiles and pinning are in relation to dashboards. You have also learned how to accomplish several tasks around dashboards, such as:
Setting mobile view.
Adding a theme to the visuals in your dashboard.
Configuring data classification.
Adding real-time semantic model visuals to your dashboards.
Pinning a live report page to a dashboard.
With this new knowledge, consider how you can transform the data that you have to create a story. Dashboards can help you visualize that story.
Learning objectives
In this module, you'll:
Set a mobile view.
Add a theme to the visuals in your dashboard.
Add real-time semantic model visuals to your dashboards.
Pin a live report page to a dashboard.
XVII. Module 17 Implement row-level security
Course | Microsoft Power BI Data Analyst |
Module 17/17 | Implement row-level security |
Row-level security (RLS) allows you to create a single or a set of reports that targets data for a specific user. In this module, you'll learn how to implement RLS by using either a static or dynamic method and how Microsoft Power BI simplifies testing RLS in Power BI Desktop and Power BI service.
1. Introduction
Row-level security (RLS) uses a DAX filter as the core logic mechanism.
2. Configure row-level security with the static method
RLS can be configured with a static method (a fixed value in the DAX filter) or a dynamic method (a DAX function); in both cases, a DAX filter is the core logic mechanism.
RLS involves several configuration steps, which should be completed in the following order:
Create a report in Microsoft Power BI Desktop.
Import the data.
Confirm the model relationship between both tables.
Create the report visuals.
Create RLS roles in Power BI Desktop by using DAX.
Test the roles in Power BI Desktop.
Deploy the report to Microsoft Power BI service.
Add members to the role in Power BI service.
Test the roles in Power BI service.
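As a minimal sketch of the static method, the role's DAX filter hard-codes a value; the Region table and its Region column here are hypothetical:

-- Table filter DAX expression on the Region table for a "US Sales" role
[Region] = "United States"

Every user added to this role sees only rows where the filter evaluates to TRUE. Note that each region needs its own role, which is why the static method scales poorly compared to the dynamic method.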
3. Configure row-level security with the dynamic method
The dynamic method uses the USERPRINCIPALNAME() DAX function, which returns the user principal name (typically the email address) of the signed-in user.
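A minimal sketch of a dynamic RLS filter, assuming a hypothetical Salesperson table with a UPN column that stores each salesperson's user principal name (email address):

-- Table filter DAX expression on the Salesperson table
[UPN] = USERPRINCIPALNAME()

A single role can now serve every salesperson, because the filter resolves at query time to the signed-in user's principal name.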
4. Exercise - Enforce row-level security in Power BI
In this exercise, you’ll enforce row-level security to ensure that a salesperson can only analyze sales data for their assigned region(s).
In this exercise you learn how to:
Enforce row-level security
Choose between dynamic and static methods
i. Lab story
ii. Get started
iii. Enforce row-level security
In this task, you’ll enforce row-level security to ensure a salesperson can only see sales made in their assigned region(s).
(Note: this step didn't work in my environment.)
5. Check your knowledge
Which function will tell you the username of the person who is signed in to Power BI service?
USERPRINCIPALNAME() The USERPRINCIPALNAME() function will tell you which user is signed in to view a report.
Where can you test RLS by using different security roles?
Both Power BI Desktop and Power BI service You can use Power BI Desktop and Power BI service to test RLS.
6. Summary
This module described row-level security (RLS), the ability in Power BI to limit what a user sees on a specific report.
RLS targets the data to a specific user, for instance, only allowing a manager to see the salary of their direct reports.
RLS is implemented with a combination of Power BI Desktop and Power BI service. To implement RLS, you create a DAX formula that restricts a user's data access, which makes RLS versatile.
You can use DAX to indicate that someone can only see records in the United States or sales transactions that are below a certain dollar amount.
This programmatic approach means that RLS can be used in a variety of solutions.
After you have created the DAX formula in a specific security role, you can deploy the report and then add users to that role. RLS is a powerful Power BI security feature that is straightforward to implement.
Learning objectives
In this module, you will:
Configure row-level security by using a static method.
Configure row-level security by using a dynamic method.
Conclusion
Learning objectives:
Discover data analysis
Get started building with Power BI
Get data in Power BI
Prepare the data
Model the data
Visualize and analyze the data
Deploy and maintain items