Simon Hurst introduces the capabilities of Power Query and their possible relevance in audit.
The Power Query tools were an optional add-in when they were first introduced in Microsoft Excel in 2010. They then became a built-in option and now they are the default method for linking Excel to all sources of external data, replacing the previous Get External Data options in the Get & Transform Data ribbon tab. As the new default, use of Power Query is likely to spread and many more Excel users will be likely to encounter it.
The potential of Power Query goes far beyond its strengths when acquiring and manipulating large volumes of data from external sources. It provides a quick and easy way to solve many problems within a single Excel workbook or worksheet. As a simple example, converting a column of dates in US format to useable UK dates can be achieved through cell formulae with the use of several Excel text manipulation functions. Alternatively, the same result could be achieved using some simple interface options in Power Query (see screenshot 1).
In screenshot 1, we have used the From Table/Range option in the Get & Transform ribbon tab to read our original table of data into the Power Query editor. We have then clicked on the Data Type icon in the column heading of the Date column and used the Using Locale… option to set the required Data Type to ‘Date’ and the Locale to ‘English (United States)’.
Once you have sorted out your data, you can load it to your Excel worksheet as an Excel Table using the Close & Load command in the Power Query Editor’s Home ribbon tab.
Simple though the Power Query solution may be, there is one significant difference to using a cell-
based formula: where a formula usually updates automatically if the value in a cell it depends on is changed, Power Query Tables need to be refreshed to reflect any changes. This can be done manually, using the Refresh option on the Table’s right-click menu, or using the Refresh command in the Queries & Connections tab of the Excel Data ribbon. Alternatively, the Table properties can be set to refresh the Table at a specified time interval.
Steps, not formulae
The Power Query Editor works by converting interface actions and choices into a series of steps. When the query is refreshed, all those steps are re-applied. Creating a single process in this way can make a spreadsheet that uses Power Query much easier to understand, and to check, than a collection of multiple worksheets, each containing more than 17 billion cells – any one of which could contain a formula.
The step-based approach usually makes it easier to understand the logic behind the process – and to check the effect of each step. Power Query can display a Query Settings pane that lists the steps and allows any step to be selected. As each step is selected, the editor displays the table of data as it was, immediately after the application of that step’s actions (see screenshot 2, below).
In the example in screenshot 2, we have added a description to one of our steps to help document the process as indicated by the information icon. Clicking on this step shows us our table after we have split our Product column using the dash character as a delimiter and before any other steps are applied.
As well as being able to delete a step using the cross to the left of the selected step, or to edit it with the gear icon to the right, we can also insert steps. This allows us to go back in time in our process and change things before they have gone wrong.
Exceptions
Although the eventual result of a query is usually a Table in an Excel worksheet, the Power Query Editor itself can also be useful in identifying issues and exceptions in a table of data. Unlike columns in Excel, each table in the Power Query editor has a specific Data Type, indicated by the icon at the left of each column header that we used when we converted our American dates. The editor can be set up to examine the column contents and apply a Data Type automatically when the data is first read in. Data Types are also automatically applied after certain actions, such as splitting a column into separate columns.
When a Data Type is applied, the editor will ‘profile’ the quality of the data in that column using a small, coloured bar immediately below the column header. If you are going to make use of this feature, it is very important to check the Column profiling setting in the Status Bar. By default, it is set to only look at the top 1,000 rows, which will fail to reveal problems beyond this range. You can click on this setting to change it to use the entire data set, but for very large data sets, this might have a performance effect. In screenshot 3, we can see that a partial red bar reveals an error in the UnitPrice column and a black section at the end of the ExtendedPrice bar indicates the presence of some missing values.
In screenshot 4 (below), we have done a few things. First, we have turned on two options in the Data Preview group of the View ribbon tab to show more information about each of our columns. We have then selected the UnitPrice column and used the Keep Errors option from the Keep Rows dropdown in the Home ribbon tab to filter our table to only show rows where an error has been generated in our column. This will show us all the information for the error rows, but for the UnitPrice itself, we will just see ‘Error’, so we have gone back in time to the step before we changed the column Data Types and used the right-click Duplicate Column option to duplicate our UnitPrice column so that we can still see our original value.
So far, we have just looked at identifying errors within the Power Query editor, but we could easily load our resulting table to an Excel worksheet to serve as an exception report. As it is possible to base new queries on existing queries, this technique could be extended to test for a whole range of possible exceptions such as duplicate values, incorrect data types, missing values, and codes with the incorrect number of characters. Power Query also allows two tables to be joined using a variety of different join types, making it easy to create lists of records that are present in one table, but not in the linked table.
Extending your capabilities
Exploring the Power Query Editor will give some idea of the range of interface operations that can be applied. The capabilities can be further extended by using the hundreds of functions that can be used to create new columns and manipulate data. It is also possible to switch to the Advanced Editor to work directly in the underlying code but, for many users, just using the interface options will be enough to achieve impressive results.
Although this is a brief introduction to the capabilities of Power Query and their possible relevance in audit, understanding what Power Query is capable of could revolutionise your use of Excel when working with external data or even just tables of values in Excel workbooks.
Additional resources to help auditors get more from Excel are available through ICAEW’s Excel Community.
AUTHOR BIO
Simon Hurst provides training and consultancy services to accountants and other professionals. He is a long-term member of ICAEW’s Excel Community
Audit & Beyond
This article was first featured in the July/August 2022 edition of Audit & Beyond.