
Does Power BI import mode require more memory than DirectQuery and live connection modes?

Published on 2020-11-29 12:34:05

With respect to Power BI Premium or Embedded licensing, in order to plan the required memory capacity, I want to ask: does Power BI import mode require more memory than DirectQuery or live connection mode? For example, say the dataset size is 8 GB.

I ask because import mode stores this 8 GB of data in memory when a refresh is performed. Or is there a way to store the data on the hard disk instead of in memory?

Whereas DirectQuery fetches the data from the database for every report/visualization and does not store anything in Power BI memory.

Similarly, live connection mode fetches data from SSAS or Azure Analysis Services and does not store anything in Power BI memory.

Asked by: variable
Answered by Jon, 2020-11-30 03:13:39

Importing the data creates a copy of the data from the data source and holds it in memory. The dataset will not have a 1:1 relationship in size with the source, as the VertiPaq engine compresses the data. However, you don't just have to plan for sufficient memory to hold the dataset; remember that memory is also used when querying the data. For example, a FILTER function essentially returns a table, and that intermediate table is held in memory until the results of the measure are computed and returned. Memory is also used while dataflows are being processed, even though they write to blob storage and are not held in memory.
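As a minimal DAX sketch of that point (the Sales table and Amount column here are hypothetical, not from the question):

    -- SUMX over FILTER: the filtered table is materialised in memory
    -- while the measure is evaluated.
    High Value Sales :=
    SUMX (
        FILTER ( Sales, Sales[Amount] > 1000 ),
        Sales[Amount]
    )

    -- A plain column predicate inside CALCULATE is usually leaner, as it
    -- filters a single column rather than materialising the whole table:
    High Value Sales Lean :=
    CALCULATE (
        SUM ( Sales[Amount] ),
        Sales[Amount] > 1000
    )

Both return the same number, but the first builds an intermediate table before aggregating, and that transient memory use is part of what you need to budget for.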

For Premium and Embedded, use the Premium Capacity Metrics app to find out what is going on in terms of memory usage across reports and the capacity as a whole.

For Premium/Pro/Embedded, your dataset/report is stored in blob storage, and when the report is accessed it is loaded into active memory. If you use too much memory, the report will fail, or it may result in dataset evictions, where older datasets are dropped from the capacity's memory in favour of more heavily used or new datasets.

For DirectQuery and live connection, the memory overhead is far lower than for import, as Power BI does not hold the full data model, just the result set generated and returned by the data source. In most cases this will be quite small, but if you are returning detailed, row-level data, it will take up more memory.
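A rough illustration of the difference (hedged: the exact SQL Power BI generates varies by source and visual, and the Sales table is hypothetical):

    -- In a DirectQuery model, a simple measure like this is translated
    -- into an aggregate query at the source, so only a small summarised
    -- result set comes back and is held in memory:
    Total Sales := SUM ( Sales[Amount] )
    -- roughly: SELECT SUM(Amount) FROM Sales, grouped by the visual's columns

    -- A detail-level table visual, by contrast, pulls back individual
    -- rows, and that larger result set is what occupies memory.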

You can't use hard disk space directly to store your dataset; it will still have to be loaded into memory at some point. However, with Premium you can use dataflows with DirectQuery and the 'enhanced compute engine' if you don't want your data source handling the queries.

Premium/Embedded v2 is coming out, which will allow autoscale: when your capacity is managing lots of datasets and queries and approaching its limit, it can add capacity as needed.

For Premium and Embedded, I would recommend looking carefully at your estimated usage and the number of concurrent reports and users, as the 'report refreshes' metric in the capacity specs doesn't quite cover it. I have seen reports fail in Premium because a big data model with inefficient queries hit the 25 GB memory limit. Pro workspaces that aren't on a capacity can generally take quite a hit in memory without you noticing; I've seen reports run in Pro without a hiccup but fail on a Premium P1 node.