Using the Query Dependencies tool – Cleansing, Transforming, and Shaping Data

Using the Query Dependencies tool

For complex data models, it’s very helpful to understand the relationship between different queries, staging queries, and transformations. While the data model view in the main Power BI window provides a nice relationship view of the different queries in a data model, it shows the end-product of the work completed in Power Query. To see the full picture of what is happening with Power Query, the Query Dependencies tool can be used. The Query Dependencies tool gives a graphical picture of all data sources, merge and append queries, AI functions, and other transformations that take place within a data model.

Figure 4.4 – Query Dependencies tool

To access the Query Dependencies tool, click Query Dependencies under the Dependencies section of the View tab on the Power Query ribbon, as shown in Figure 4.4.

R and Python scripts

In addition to all the capabilities of Power Query, there is also an option to utilize Python or R to help transform data in Power Query. Python (https://www.python.org/) is an open source, interpreted, and general-purpose programming language that is often used in data science, data visualization, and ML applications today. R (https://www.r-project.org/) is an open source statistical programming language that is also used in data science and data visualization applications.

This requires the computer running Power BI Desktop/Power Query to also have the R or Python execution environment installed. To execute Python scripts in Power BI, there are required libraries that must be installed in our Python environment: pandas (https://pandas.pydata.org/) and matplotlib (https://matplotlib.org/). pandas is an open source library for data transformation and analysis, while matplotlib is an open source library for data visualization. Installation of these libraries in your Python environment may vary but is usually accomplished by running the following commands:

python –m pip install pandas

python -m pip install matplotlib

Or, run the following commands:

pip3 install pandas

pip3 install matplotlib

Once the required environment and/or libraries are installed, you can configure Power BI to use these on the Options screen under Options and Settings on the File tab of the ribbon. Set the location of the R or Python environment you’d like to use with Power BI by selecting it in the drop-down menu under the appropriate setting.

With the R and Python scripting capabilities, it’s possible to use these additional data transformation and analysis languages to prepare data in Power Query for reporting and visualization with Power BI.

Author: Noah Walker

Leave a Reply

Your email address will not be published. Required fields are marked *