• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Loading Power Query Data - sometimes fast, sometimes slow

Hi all,

I often work in Power Query using an ODBC connection directly to our database to create recurring reports and to pull one-time data. There is one table I pull from a lot, and it has almost 2 million rows. So I filter in a number of ways (grouping, use of parameters I set up, or by just filtering while in the editor, among other ways). Sometimes the data will take less than a minute to load (it'll process between 75,000 and 100,000 rows of data per second) and sometimes it will take 5 or 6 minutes (it'll process between 6,500 and 7,000 rows of data per second). Obviously that's a huge difference, so I've tried multiple ways to troubleshoot (change order of steps, remove fewer or more columns, etc.), but there are so many variables, I can't really figure out why it loads fast as opposed to slow, or vice versa.

I don't know if I can upload a report since I'm pulling from our internal database, so I'm hoping this may ring a bell with someone who has figured out why there's a difference and if there's a way do something differently in order to fix it.

Thanks!
YL
 

Chihiro

Excel Ninja
There are few things you can do here. Mostly on DB side.

1. Instead of using native SQL statement to bring in data on the fly, use Stored Proc or View to generate desired table/report for importing.

2. Make sure that the table(s) are properly indexed and appropriate relationships are defined. In my experience, properly designed table(s) with 2 mil rows of data takes less than a min to import as is (along with related columns).

3. If you can minimize number of columns (unpivot) it could potentially speed up things. VertiPaq engine is Column storage engine. Meaning it will compress data very efficiently on larger columns (i.e. # of rows) than larger rows (i.e. # of columns).

4. Check your Query plan on SQL side, make sure that it's optimized.

5. Have a read of below link. It'll give you insight into how to design query folding for better performance.

https://www.mssqltips.com/sqlservertip/3635/query-folding-in-power-query-to-improve-performance/
 
Thanks Chihiro! I need to get more up to speed on what you recommended, but I read the article on query folding and removing a lot of columns from the start has helped tremendously. I'll be continuing to learn more on this and hopefully find the right formula and sequence to make it load faster.

As always, I really appreciate your help!
YL
 
Top