Adding columns – Cleansing, Transforming, and Shaping Data

Adding columns

There are multiple ways to add columns to a query using Power Query. New columns can be created using typed-in examples in a low-code experience or new columns can be created using Power Query M formula functions (https://docs.microsoft.com/powerquery-m/quick-tour-of-the-power-query-m-formula-language) to derive new columns from existing or generated data. These commands are available under the Add Column tab in the ribbon of Power Query:

  • Column From Example – This allows you to simply start typing the values you’d like to see in the new column, and from there Power Query will generate new column data using column operations in the examples you manually provide. This is useful if you’re not aware of other column transform capabilities because it shows you the detected transform.
  • Custom Column – This is where you can specify the specific M formula to transform data from other columns into the new column. The full syntax reference for the Power Query M formula is available on the Microsoft documentation website (https://docs.microsoft.com/powerquery-m/power-query-m-function-reference).
  • Invoke Custom Function – This is used when you’ve created custom functions either directly from M query code or by generating a function from a query that uses parameters. This provides reusability across custom functions for multiple queries and columns.
  • Conditional Column – This provides the ability to create a new column of data values based on values from another column using IF, ELSE IF, and ELSE logic to determine the values of the new column. For example, this would allow you to make a new column based on unique values in a column and do it using a low-code GUI interface to define the logic.
  • Index Column – Index columns are used when you need to generate a column of increasing numbers for each row. This is often used when generating surrogate keys for a table or query. Values can start at 0 or 1 or be custom-defined along with the incremental value for each row.
  • Duplicate Column – This simply duplicates the selected column.

It is also possible to add a new column using the same text, number, and data and time transforms discussed previously by invoking these under the Add Column tab in the ribbon. Some column transformations are also accessible from the right-click menu when clicking on a column name.

Next, we will discover row transformations. The ability to transform rows when needed is important as it helps ensure data quality and helps us build analytics that can be trustworthy.

Author: Noah Walker

Leave a Reply

Your email address will not be published. Required fields are marked *