I’ve been using Power BI for a couple months now, not as a developer, but as a system architecture. I may not deal with dashboard and report development on a daily basis, however, I, as an end user, use Power BI extensively to monitor Azure and Power BI usage including audit and billing. I would like to learn more about this tool to its nuts and bolts. The intention of this blog series is to document and share what I’ve learned in this journey.
My first area I’d like to explore is performance as it has been most talked within our developer circle and Power BI community. One thing I’ve seen quite often when searching for Power BI performance improvement is to utilize the Query Folding, which basically, pushes query logics and filters to the database. Let’s database do what it does best – extracting and processing data before sending back to Power BI Desktop client. If this is done correctly, this will reduce workload that Power BI has to do on the client side.
There are a couple well-written articles about Query Folding and its benefits already. There is no need for me to repeat it.
- Query Folding in Power Query to Improve Performance by Koen Verbeeck
- Power BI – Checking Query Folding with View Native Query by Devin Knight
- Not Folding; the Back Hole of Power Query Performance by Reza Rad
Part of this exercise here is to learn how to measure expected performance improvement when using Query Folding.
I’m connecting to a 1.3-million-row Oracle database table.
SQL> SELECT COUNT(*) FROM TABLE_ADDRESS_T; COUNT(*) ---------- 1312127
I created a series of transformations with and without supporting Query Folding. The first two – Filtered Rows and Uppercased Text – in the sample below are the ones supporting Query Folding. We can confirm it by seeing that the View Native Query is available if right-clicking from the last one.
Or we can confirm it by viewing query itself to see that the native SQL query is re-written to include those transformations.
For the next one, the Split Column by Delimiter does not support Query Folding. Note that the View Native Query is now disabled.
The plan is to run these transformations so most will be utilizing the Query Folding then take a measurement. Then move the non-supported one up to the top so the rest will not be able to utilize the Query Folding thus transformations will be processed on the client.
If I would run this small set of transformations, we’d probably not see much difference in term of runtime so I added more transformations just by manually modifying the M Query as seen below – in this case, just adding alternative upper- and lower-case transformations.
To measure the total run time, I’m using the Rui Romano’s Power BI Desktop Trace Logs Analyser, which will read and visualize Power BI Desktop’s diagnostic trace files.
Before each run, the cache will be cleared and enable tracing will be enabled. Once done in each run, the tracing will be disabled.
A. 1st Run data refresh utilizing Query Folding
B. 1st Run data refresh without utilizing Query Folding
C. 2nd Run data refresh utilizing Query Folding
D. 2nd Run data refresh without utilizing Query Folding
The outcome clearly confirms our expectation. I think the number difference would be greater if more transformations especially complex ones are added.
In my future post, I’d like to explore further to understand what these Action Details are, though now I’m not confident that I could find any. It seems like this information may not be shared with customers according to the discussion here.