Reduce Dataflow Validation Time in Microsoft Fabric
Light rainbow flowing down path at night. Unsplash. 2025.
Introduction
Power BI and Fabric Dataflows are powerful tools for data transformation and integration, but users often face slow validation (Gen1 Dataflows) or publishing delays (Gen2 Dataflows). One of the key bottlenecks is the 10-minute timeout issue, which can cause frustration when working with complex or large datasets. This blog post explores why this happens and how to speed up the process using an advanced M code solution.
Understanding the Issue
Why Does Power BI Take Time to Validate or Publish Dataflows?
When you validate or publish a dataflow, Power BI must determine the column structure (names, data types) of each query. To do this efficiently, it applies a "Top 0" filter, which retrieves metadata without actually pulling data. This process helps Power BI understand the expected structure of the dataset without incurring excessive data load.
The "Top 0" Filter and Its Impact
Power BI modifies the query execution by appending a limit of 0 rows (e.g., SELECT TOP 0 * FROM Table
in SQL). This works efficiently when the data source supports query folding, as the filtering happens at the source. However, when query folding is not possible, Power BI might still attempt to execute the query, leading to long validation times or failures due to timeouts.
Common Solution: Limit Row Retrieval During Development
During development, apply a row limit (e.g.,
Table.FirstN(Source, 100)
) to reduce data load assuming query folding is enabled. Or a filter to cut down on the number of rows.Remove the row limit before publishing to avoid affecting the final dataset.
While these methods help, they might not be enough for highly complex queries, or queries that do not support query folding. That’s where an advanced M code solution comes in.
🪄Using Table.View
to Speed Up Validation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | let Source = Function.InvokeAfter( () => #table( type table [ #"Number Column"=number, #"Text Column"=text, #"Date Column"=date ], { {1,"Hello",#date(2016,1,1)}, {2,"World",#date(2017,12,12)} } ) , #duration(0, 0, 11, 1) ), TableTypeToReturn = type table [ #"Number Column"=number, #"Text Column"=text, #"Date Column"=date ], OverrideZeroRowFilter = Table.View( null, [ GetType = () => TableTypeToReturn, GetRows = () => Source, OnTake = (count as number) => if count = 0 then #table( TableTypeToReturn, {} ) else Table.FirstN(Source, count)] ) in OverrideZeroRowFilter |
The Table.View
function (line 27) in Power Query allows you to intercept the "Top 0" request and return an empty table with only metadata, significantly speeding up validation. Line 18 is making a fake delay to simulate a long query.
🧐Understanding GetType
, GetRows
, and OnTake
in Table.View
The Table.View
function in Power Query M lets you define a custom virtual table with specific behaviors when Power BI (or another client) interacts with the data. It allows you to control how the table responds to different operations.
1. GetType
– Defines the Table Schema
GetType
tells Power BI what the table structure (schema) should be. It returns a table type (type table [...]
), specifying column names and data types.
1 | GetType = () => type table [ ID = number, Name = text, CreatedDate = date ] |
Why it's useful:
When Power BI applies a
Top 0
filter (to check the schema without fetching data),GetType
ensures that Power BI understands the correct column names and types.
2. GetRows
– Retrieves the Actual Data
GetRows
is the function that returns the full dataset when Power BI queries the table without any filters (e.g., when loading all data).
1 2 3 4 5 6 7 8 | GetRows = () => #table( type table [ID = number, Name = text, CreatedDate = date], { {1, "Alice", #date(2023,1,1)}, {2, "Bob", #date(2023,2,15)} } ) |
Why it's useful:
This is the default behavior if Power BI doesn’t request a limited number of rows (e.g.,
Table.FirstN
).You can use
GetRows
to control what data is returned, such as dynamically fetching from an API.
3. OnTake
– Controls What Happens When Rows are Requested
OnTake(count)
is triggered when Power BI requests only a limited number of rows (e.g., Table.FirstN(Source, 10)
). It allows you to optimize performance by handling row requests efficiently.
1 2 3 4 5 | OnTake = (count as number) => if count = 0 then table(type table [ID = number, Name = text, CreatedDate = date], {}) // Empty table for schema detection else Table.FirstN(GetRows(), count) // Return actual data if rows are requested |
What This Does:
When Power BI validates the table (
Top 0
), it quickly returns an empty table with column headers instead of fetching real data. ✅If Power BI requests
N
rows (e.g.,Table.FirstN(Source, 10)
), onlyN
rows are returned, optimizing performance. ✅When Power BI loads all data,
GetRows
returns the full dataset. ✅
References:
Webb, C. (2025, March 23). Speed up Dataflow publishing validation times in Power BI and Fabric. Chris Webb's BI Blog. https://blog.crossjoin.co.uk/2025/03/23/speed-up-dataflow-publishing-validation-times-in-power-bi-and-fabric/