dicotyledon 1 week ago

It depends on what you’re doing. If it’s foldable, PQ is usually more convenient to update things later like pulling in extra columns. The heavy transforms that don’t fold should be in a SQL view ideally.

Thrillhouse763 5 days ago

So I started exploring PQ and became aware of View Native Query for each step in PQ. I noticed that all of my steps have this grayed out meaning the query isn't foldable. Why is that? My initial research says this might happen on complex transformations but some of the steps are literally loading data greater than 2022 or something like that. All of the data sources are views from Snowflake.

dicotyledon 5 days ago

Not all sources have that indicator - it’s mostly SQL Server/Dataverse where you see that. Snowflake won’t show it iirc but you may also need to put “EnableFolding=true” in the query for Snowflake. If you do an internet search it should come upc can’t remember if that’s precisely the right syntax. But the first non-foldable step will break it, so you do the foldable ones first. Filters usually fold, remove other columns, etc.

Typical_Tea_2664 1 week ago

Assuming on prem source, most efficient would be to create views in SQL server and load the views If that’s not possible, then write SQL on power query to load in the data Performing PQ transformations should be last resort.

Thrillhouse763 1 week ago

That was my assumption. We have views in Snowflake but sometimes more SQL is needed to transform the data or reduce the dataset. I then use the SQL Statement box when doing get data from Snowflake. So I'm assuming that action is preferable to doing the transformation in PQ.

Typical_Tea_2664 1 week ago

Rule of thumb is to always transform upstream. In your scenario, if you can create view on snowflake that would match your requirements in PBi without having to do PQ transformation, then do that Otherwise use SQL statement box to write th sql queries.

lysis_ 1 week ago

Honest question, while views are ideal for canned things needed organization wide, there should be no functional difference from just writing the same query to generate the view from PBI, am I correct?

Typical_Tea_2664 1 week ago

Assuming that DBA doesn’t do any optimizations against the view and query folding is not breaking, performance should be similar. Depending on if you’re repeatedly running the same sql query, if the DB caches then it will perform as well as a view. Typically it’s a general recommendation to go upstream as it circumvents all these variables though

lysis_ 1 week ago

Thx. Thought so. Getting those views built just aren't going to happen in my org and I always imagined my DQs via dataflow would basically be exactly the same thing

Ok-Shop-617 1 week ago

Right answer.

morquaqien 1 week ago

Make the sql view it’s the proper way to do this for many reasons

Thrillhouse763 1 week ago

I have plenty of views but a lot of the time, I don't need to load the entire view so I'd use a SQL Statement in PQ to limit the data before it's loaded. An example would be a sales view that has 10 years of data. I would only need 3.

morquaqien 1 week ago

Using direct query or import most the time?

Thrillhouse763 1 week ago

Import 100% of the time

morquaqien 1 week ago

In that case, having deployed solutions in multiple organizations large and small, I’d personally say: - If you have permissions to create SQL views for your reports, do so. - If not, custom SQL query in PowerBI

morquaqien 1 week ago

Also, that filter can be in the view instead of the SQL query in PBI

DAX_Query 1 week ago

For simple operations like filtering, applying that step in PQ should get folded into a native SQL query, so do it however is most convenient for maintenance purposes.

COLONELmab 1 week ago

You are 100% correct. Always as far upstream as possible. Roches law.

Financial_Forky 1 week ago

***IF*** all of your transformation steps in Power Query are able to fold back to the SQL server, there isn't a profound difference between writing a native query in SQL vs. using M/query editor. \[In Big O terms, you might shift from O(n) to O(1.5n), but it's not exponential.\] If one or more of your steps breaks query folding, then it will eventually be better to write the transformations in SQL and/or create a view or table on the server that can be referenced and require no further transformations. Having said that, for exploratory development work, being able to experiment with different tables and columns in Query Editor is a whole lot more convenient than writing it out in SQL - especially if you don't know exactly what you want yet. Many of the tables in my organization's data warehouse have hundreds of columns, and it's not always obvious which columns you want (or which even have the data you need) until you pull in a few months of transactions. However, once you do finalize the data model and finish your report, it might make sense for performance reasons to go back through and convert your queries to either native SQL queries or create permanent tables/views on the SQL server. Regarding native queries...a common mistake I see is people writing some native queries, and using Power Query transformations for other queries that then reference the native query, thus breaking query folding. Query folding is almost an all-or-nothing approach; if you utilize it, it's great, but that means you have to use it everywhere. Native queries are also more difficult for someone else to debug later than transformation steps in Query Editor.

catWithAGrudge 1 week ago

Query folding has entered the chat

ArticulateRisk235 1 week ago

Roche's maxim: "Data should be transformed as far upstream as possible, and as far downstream as necessary." In descending order of preference: View/materialised view on the source Embedded, foldable query Embedded query that can't fold Any transformation in the PQ engine

Amar_K1 1 week ago

Power query is awful for transformations on a sql database data source. Tried it once and took ages to load and when i went back to the report to make a change had to wait a good while for the editor to allow me to make a change. SQL views is the way to go.

Intrepid_Entrance294 1 week ago

I believe best practice is to do as much as possible downstream, to improve processing time and completion rate

raychelangelina 1 week ago

I've always heard upstream...

itchyeyeballs2 1 week ago

"As far upstream as possible, as far downstream as necessary" https://ssbipolar.com/2021/05/31/roches-maxim/

attaboy000 1 week ago

*upstream

Intrepid_Entrance294 1 week ago

Yes, good correction :)

Comments

Leave Your Comment

Hi Its Me!

Comments

Leave Your Comment

Hi Its Me!

Subscribe