As companies produce ever-increasing amounts of data, they are looking to Business Intelligence (BI) tools to harness that data to make better informed decisions. Traditionally, BI has been a function of IT, with a handful of reports being made available for the organization, rather than individual users creating reports for their specific needs.
In recent years, there has been a shift towards self-service BI. Users don’t want to wait for IT to build the report they need. Instead, users want access to tools that allows them to build their own reports. This trend is so prevalent that Gartner now separates the BI market into Today’s modern BI platforms designed to be used by end users with low IT involvement, and traditional Enterprise BI platforms. Microsoft’s Power BI suite has landed it a position as a leader in Gartner’s Magic Quadrant for Business Intelligence and Analytics Platforms, the report that outlines the modern BI market.
I have been developing internal reports using Power BI Desktop to gain insight into our business here at Total Solutions. After building a few reports, I thought I’d share a few takeaways from my experience.
1. Garbage in, Garbage out
Tasked with building a report to display the amount of work remaining on outstanding Purchase Orders, I connected to my data source and then created a couple formulas to manipulate my data as necessary. I checked the output of my newly created report for several engagements, comparing the report to my data source, and the values were calculating as expected. “Well, that wasn’t too difficult,” I thought proudly.
Then, taking a closer look at my report, I saw some negative values where I didn’t expect them.
Negative values where there shouldn’t be.
I double-checked my formulas, thinking something was clearly wrong, but I couldn’t find any errors. I opened the engagements (those displaying negative values) in my source, and it immediately became clear that the source data was not accurate. After cleaning up the data, my now-accurate totals changed by ~20%.
The danger here is that I was still able to create a report that provided an answer to the question I was asking, but it wasn’t accurate.
An alternative view of my report.
In my case, I wasn’t working with a massive data set so I was able to identify and correct the bad data, however, this can be a serious issue if you are working with a larger set of data. Of course, this is true for any BI initiative regardless of the tool being used, but I think it’s worth noting here given that the audience for Power BI may include folks who are new to BI altogether.
2. Calculated Columns and Measures
When you want to write a formula in Power BI, you can create either a Calculated Column or a Measure.
Calculated columns calculate values for each row. The values of the calculated column are calculated as soon as the formula is created, and then stored in the model. This reduces the time to compute calculated columns, however, this also means that calculated columns use memory. Once created, you can use calculated columns just as you would any other column.
I needed to create a field that would display the date for each entry in the format mm/yyyy for a report I was building. Since this is a static value for each row, I created a calculated column using the below formula:
For each row, the new Close Month/Year column is added and a value calculated.
Measures calculate values from multiple rows, rather than for each row. Therefore, measures are useful when you want to aggregate values from multiple rows. Unlike calculated columns, measures are evaluated at query time and will be updated as the current context changes (e.g. a filter is applied).
If I try to create a Measure using the same formula used to create the Calculated Column above, I get an error informing me that a single value for my inputs cannot be determined.
While it’s tempting to jump right in and start writing formulas, taking a few minutes to understand the differences between the two and when each should be used will save you time and reduce confusion.
Check out this post for additional clarification on Calculated Columns and Measures.
While Power BI is designed to be self-service, I’ve found that it’s often not quite as simple as dragging and dropping columns to build a report. For example, I set out to build a chart to show sales each month over several years. I wanted to display Month/Year on the X-axis and Sales on the Y-axis. First, I imported data from a ‘Deals’ table in our CRM that contains [Sale Amount] and [Date Closed] columns. Since I had a date field and a sales amount, I thought creating the chart should be pretty straight forward.
When I added the Date Closed column to my visualization, I was given a couple options with regard to formatting. I could choose to return the entire date as a single value, or choose Date Hierarchy, which gives me options to select Day, Month, Quarter, and Year. After a bit of experimenting, I’ve found that when using the Date Hierarchy, visualizations display data in the context of the broadest date filter. Meaning, when I selected Month and Year, thinking this would result in the visualization I wanted, the resulting chart had a single bar for each year, and no indication of Month.
Selecting Year and Month in the Date Hierarchy shows sales by year only.
If I select Month and not Year, the resulting chart has a single bar for each month that aggregates all years (i.e. July 2014, July 2015, July 2016 all show under a single July bar). This isn’t what I want either.
If I choose to return the entire date, rather than Date Hierarchy, the date is formatted in mm/dd/yyyy. The resulting chart represents each day with its own bar, which again, is not the result I’m looking for.
Selecting the full date doesn’t achieve the desired result either.
After a little more trial and error, I was able to create a new date column to display the date in mm/yyyy format, and create the chart displaying sales by month and year, but it required creating three calculated columns.
The finished report after adding several columns to display the date as desired.
If your data is not formatted exactly as it’s going to be displayed in a report, there will be some manipulation required. While many end users are capable of performing these steps, some end users who expect to simply drag and drop columns to build any report of their choosing may be frustrated by the extra steps required, especially when first using Power BI.
I’ve highlighted a couple of the challenges that I encountered when creating my first reports with Power BI, but overall I’m impressed with the tool. It’s relatively simple to access data from various sources, and once you have your data in the proper format, Power BI provides a wide range of useful visuals to choose from. After building a few more reports, I’m now able to quickly get my data formatted as needed and build useful reports that would have previously taken significant manual effort to produce.
About the Author