Viva Questions
INT374
Power BI is a collection of software services, apps, and connectors that turn unrelated data sources into coherent, interactive insights. Its primary goal is to bridge the gap between data and decision-making by enabling users to connect, visualize, and share data findings.
The three elements are Power BI Desktop (for development), Power BI Service (SaaS for publishing and sharing), and Power BI Mobile Apps (for consuming reports on iOS/Android).
Power Query is the ETL (Extract, Transform, Load) engine used for cleaning data and uses the M language. Power Pivot is the data modeling engine used for relationships and calculations, utilizing DAX (Data Analysis Expressions).
The Q&A feature allows users to ask questions about their data using plain English, such as 'Total sales by region'. Power BI uses AI to interpret the query and automatically generates a visual representing the answer.
The two methods are the Microsoft Store (Recommended) and Direct Download (.exe). The Microsoft Store version is recommended because it automatically installs the latest monthly updates in the background.
A Free License allows for personal use and authoring, but you cannot securely share reports with peers in the Power BI Service or collaborate in App Workspaces. A Pro License is required for sharing.
The three views are Report View (the canvas for building visuals), Table View (for inspecting raw data rows/columns), and Model View (for managing relationships and the semantic model).
The 'Get Data' function is the first step in the workflow, allowing users to connect to various data sources like Excel, SQL, or the Web to bring information into Power BI.
It is recommended to turn it off because the default setting creates hidden date tables for every date column, which increases file size and memory usage. Professional models should use a custom Date Table instead.
Global Settings apply to every instance of Power BI Desktop installed on the machine (e.g., Cache Management), whereas Current File Settings apply only to the specific .pbix file currently open (e.g., Time Intelligence or Privacy levels).
The 'Back-End' (Power Query Editor) is the 'Kitchen' used for ETL processes using M language. The 'Front-End' (Report/Data View) is the 'Dining Room' used for modeling, DAX calculations, and visualization.
Import Mode loads data into memory, offering fast performance and full DAX support but requires refreshing. DirectQuery leaves data at the source, querying it in real-time for every visual interaction, which is slower but handles massive datasets.
You can use the Add Table Using Examples feature. This allows you to manually type a few examples of the data you want, and Power BI's AI will interpret the CSS selectors to build the table.
The 'Applied Steps' pane records every transformation action chronologically. It acts like a time machine, allowing you to click previous steps to see the data state at that moment, or delete/modify specific steps.
Column Distribution shows a histogram of value frequency, including the count of Distinct Values (total different values) and Unique Values (values that appear only once).
Trim removes leading and trailing whitespace from a text string. Clean removes non-printable characters, such as line breaks or control characters, from the text.
You use it when the source data's region format differs from your system's region (e.g., a UK date '02/05/2023' being read as May 2nd on a US system). It allows you to specify the origin culture so Power BI interprets the data correctly.
Unpivoting transforms a wide table (e.g., months as columns) into a tall table (Attribute/Value pairs), which is ideal for visualization. Pivoting does the reverse, turning unique values in a column into separate column headers.
The column headers in both tables must share the exact same names. Appending stacks tables vertically, so the structure must match.
A Left Outer join keeps all rows from the first (left) table and matches rows from the second (right) table. It is the most common join type, functioning similarly to an Excel VLOOKUP.
The Data Model defines relationships between different tables to create a unified semantic model. It ensures accurate calculations (measures), improves performance via the VertiPaq engine, and simplifies report creation.
While normalization reduces redundancy, it requires complex joins that slow down analytical queries. A Star Schema is partially denormalized, which optimizes read performance for reporting.
A Dimension Table contains descriptive attributes (Who, What, Where) and is 'wide and short'. A Fact Table records transactions or metrics (How much) and is 'narrow and long', containing Foreign Keys.
A Natural Key is derived from data (e.g., Email), while a Surrogate Key is an artificial integer (e.g., 1, 2, 3). Surrogate keys are preferred in Power BI because integers are faster for the engine to process than text strings.
This occurs when neither column in the relationship is unique (e.g., duplicate values in both tables). It is risky because it can produce unexpected results or ambiguity. It is usually better resolved by creating a Bridge Table with unique keys.
A Snowflake Schema normalizes dimension tables, splitting them into sub-dimensions (e.g., Product SubCategory Category). Its drawback is that filters must traverse multiple relationships, which is slower than the direct link in a Star Schema.
Filters flow 'downhill' from the One (1) side (Dimension) to the *Many ()** side (Fact). For example, filtering a Customer dimension automatically filters the Sales fact table.
Bi-Directional filtering allows the Fact table to filter the Dimension table. It should be avoided because it has a high performance impact and can introduce ambiguity or circular logic in the model.
Hierarchies enable the Drill Down feature on charts automatically, allowing users to explore data depth (from broad to detailed) without developers needing to add multiple fields manually to every visual.
Categorizing columns as Country, City, or Postal Code ensures that the Bing Maps engine correctly identifies and plots the locations, preventing ambiguity (e.g., distinguishing 'Georgia' the state from 'Georgia' the country).
DAX stands for Data Analysis Expressions. It is a formula expression language used in Power BI, Analysis Services, and Power Pivot to create custom calculations and return values based on context.
A Calculated Column calculates values immediately and stores them in RAM/Disk (increasing file size). A Measure is not stored; it is calculated on the fly (using CPU) whenever a visual interacts with it.
Explicit measures (written in DAX) are reusable, can handle complex logic (like Time Intelligence), and maintain their formatting anywhere they are used. Implicit measures are limited to simple aggregations and cannot be referenced by other formulas.
DIVIDE function safer than using the standard division operator (/)? The DIVIDE function handles divide-by-zero errors gracefully by returning BLANK (or an alternate specified result) instead of returning 'Infinity' or an error.
CALCULATE function. CALCULATE is the most powerful function in DAX. It evaluates an expression in a context that is modified by specific filters. It can add, remove, or update the filters currently applied to the report.
SUMX) differ from standard aggregators (like SUM)? Standard aggregators (SUM) operate on a single column. Iterators (SUMX) iterate through a table row-by-row, perform a calculation for each row (e.g., Price * Quantity), and then aggregate the results.
RELATED function do, and where is it used? RELATED retrieves a value from another table that has an active relationship. It works from the 'Many' side to the 'One' side and is typically used in Calculated Columns (Row Context).
SWITCH function instead of IF? SWITCH is ideal when there are multiple conditions or values to test. It replaces nested IF statements, making the code cleaner and easier to read (e.g., defining Month Names based on numbers).
DATEDIFF function? DATEDIFF calculates the difference between two dates. You can specify the interval for the result, such as Day, Month, Year, or Hour.
FILTER when used inside CALCULATE? FILTER is an iterator that returns a table. It is used inside CALCULATE to handle complex filtering logic that simple boolean expressions cannot, such as comparing a column value against a measure.
MAD stands for Monitor (high-level KPIs), Analyze (interactive charts for slicing/dicing), and Detail (granular tables/matrices).
The Z-Pattern describes how users scan a page: starting top-left (critical KPIs), moving horizontally, diagonally to bottom-left, and finally across to bottom-right (details). Report layout should follow this flow.
A Waterfall Chart is used to understand how an initial value becomes a final value through a series of positive and negative changes (e.g., tracking Gross Revenue down to Net Income).
A Card Visual is used to display a single, critical number or KPI, such as Total Revenue or Total Sales, without any axis or trends.
Sync Slicers allow a slicer selection on one page to automatically filter visuals on other pages. It can also be used to hide a slicer on subsequent pages while maintaining the active filter.
In the Filters Pane, expand the category field (e.g., Product), change the Filter Type to Top N, enter the number (e.g., 10), drag the ranking metric (e.g., Sales) into the 'By value' field, and click Apply.
The Selection Pane controls the Z-Order (layering) of objects, determining which sit on top of others. It also allows developers to hide/unhide objects, which is essential for creating bookmark logic.
A Table is a standard 2D grid. A Matrix is equivalent to an Excel Pivot Table; it supports Row and Column headers and allows for Drill-Down functionality into hierarchies.
The Play Axis adds a time dimension to the Scatter Plot, allowing the bubbles to animate and move over time to visualize changes in correlation.
The three levels are Visual Level Filter (selected chart only), Page Level Filter (all visuals on the page), and Report Level Filter (every page in the file).
Optimization occurs at The Data Source (SQL queries), The Data Model (Schema/Relationships), and The Visual Layer (DAX/Rendering).
It is a built-in tool that measures exactly how long each visual on the current page takes to load/render, helping identify bottlenecks.
DAX Studio is an external tool used to analyze the execution plan of DAX queries. It provides detailed server timings to show exactly how the engine is calculating a number.
'Pause Visuals' stops visuals from sending queries to the data engine when changes are made. This allows authors to make multiple configuration changes (like adding columns or formatting) without waiting for the visual to reload after every single action.
Query Diagnostics is used within Power Query Editor to troubleshoot slow data transformation steps (ETL) by recording the time taken for each step to execute.
The BPA is a feature within the external tool Tabular Editor. It scans the data model for rule violations, such as missing format strings or inefficient relationships, to improve model quality.
Slow reports lead to frustration. If a report takes too long to load or filter, users are less likely to use it, resulting in low adoption rates regardless of how insightful the data is.
It inspects the memory consumption of columns in the data model. It helps identify which columns are taking up the most RAM so they can be optimized or removed to reduce model size.
It manually triggers a query update for the visuals. It is typically used after 'Pause Visuals' is disabled or when you want to verify the output after making multiple paused configuration changes.
In DirectQuery, every visual interaction sends a query to the original data source. Inefficient queries or visuals can cause significant lag or timeout errors if the source database cannot handle the load quickly.