Business Intelligence (BI) Blog – Part 3: Power BI mashing up, refreshing, and on premise data
Part 1 of our examination of Dynamics AX Reporting Out of the Box focused on what is available to Dynamics AX users natively. Part 2 focused on a chronology of the elements making up the Microsoft’s Power BI solutions, including some of the incremental features and its strategic value. Part 3 considers data mashup, refreshing and considerations of on premise data.
Power BI complements and extends what is already available for Dynamics AX users. For example, products like Management Reporter and SQL Server Reporting Services will continue to be used as will List Pages and Analysis Services. The rate of development for Power BI is mind boggling with updates occurring rapid fire. Add in other Microsoft developments and acquisitions and now you have a Venn diagram with many circles and possible overlaps.
What is Data Mashup?
Data mashup is the combination, aggregation and visualization of information from multiple sources. It is important because it enables faster, easier and sometimes hidden insight to help draw conclusions and drive decisions.
How has Data Mashup evolved?
In the past, the desired results (better, faster, easier insight and decisions) has driven Data Warehouses/Cubes. The systematic and massive Extraction, Transformation, and Loading of source data from many and often disparate data was the mashup where analytical tools were then applied and visualizations presented to the users. This sequential process (which included the multi-dimensional modeling step – an important but complex step in itself) was often very time-consuming and IT heavy on the front end and led to long lead times, multiple iterations and disappointing results as users lamented the delays and frustration over available data sources. And, with the proliferation and importance of external (web) content, this has amplified the complexity.
Power BI has enabled mashup to be shifted to the backend and enabled users to quickly build and share insightful representations through the always growing visualizations provided by Power BI and the community (this is an entirely different story – the open source visualization community being promoted by Microsoft).
It’s important to note that the expression “mashup” can be a technical, data centric term meant to describe joining tables (inner, outer, cross, natural…). Or, it can be the aggregation and visualization of unique but related data.
Why should you care?
Big data is well… a big opportunity. More data isn’t better. In fact, more data can make it easier to “take one’s eye off the ball.” Finding relevant nuggets of data and then presenting it into valuable and timely (I think the term “actionable” may be a tad overexposed these days) manner is what’s important. So mashing up data which includes accessing and transforming (PowerQuery, OData…) aggregating, and then visualizing is at the center of “Business Intelligence,” and as vital as the analytical tools that often take center stage.
Think of the following example:
Perhaps you’re trying to investigate the marketing impact of different strategies using outbound mail campaigns and social media (Facebook and Twitter) with site visits. A simple example would be to create a bar chart that stacked the events (campaigns and social media) with a line graph depicting the google analytic visits. This is a simple example but one that shows possible correlation between marketing strategy/events and web site visits.
Here’s an example of a simple mashup showing Auto Crime by street location in Seattle. It’d be interesting to also layer in household income by block to further draw insight for these auto-related crimes.
Think of the mosaic that Power BI allows you to create. A landing page made up of tiles representing your company/role specific analytics with a mix of real-time metrics combined with standard, time-based (daily, weekly, quarterly, yearly, year-over-year… you get the idea) to ensure that the time element coincides with your need. That’s the beauty of Power BI – the canvas is completely yours to design and it’s the combination of data/time/dimension/visualization which brings data to life and enables insight, further analysis, and action.
Let’s talk about Refresh…
First off, there’s an excellent piece devoted to this in detail here: https://powerbi.microsoft.com/en-us/documentation/powerbi-refresh-data/.
In the past, using multi-dimensional data warehouses/marts/cubes involved a complex project that took into consideration the refresh strategy of the many data sources along with the extraction, transformation and loading. This was handled on the front end and contributed to the complexity and timing considerations of the data itself.
While Power BI can access multi-dimensional data sources, its power also lies in its ability to access other data sources (internal and external, public and private) where a refresh strategy must be considered. In some cases (Content Packs e.g., Salesforce), the refresh settings are inherited with the dataset and can only be changed by the originator.In other cases, the access is to the real-time data in which case there is no refresh. But there’s also the last group where the user selects the refresh strategy required for the need.
Refresh is an important aspect to any business intelligence tool since users analyzing information often ask, “this intelligence is good as of what date/time?” There’s much to consider given the various data sources, on-premise versus external, performance, security, and the actual analytical need…
Get your Data on
Using Power BI’s “Get Data” allows you to access the data sources (SQL tables, spreadsheets, Wikipedia sites, etc.) that have been setup for that user. Power BI automatically creates the data set which is simply the data uploaded that will be used for the report, visualization, etc. presented to the user.
You will have an opportunity to create a refresh schedule for most datasets that are created. And, while this refresh schedule is important it is trumped by the refresh schedule of the data source itself. It would never make sense to schedule a dataset refresh more frequently than the data source itself was updated. And, just because a data source is updated in near real time doesn’t mean that a user should schedule a refresh more frequently than was absolutely needed. It is really the intersection of the user need with the data source update schedule which should dictate the dataset refresh frequency.
Summary of the data sources and how the update or refresh is managed
There are many other data sources allowed with different access methods with different refresh options. We’ll summarize this but again, the previous link goes into more detail.
- On premise (Live Connectors) with direct query – SQL tables and Analysis Services – no refresh consideration since these are live connections direct to the data sources. These are especially good with dashboards and KPI type charts and graphs where near real-time analysis is important. Care must be taken since the real-time query aspect could have performance consequences.
- Content Packs – online services (e.g., SalesForce) or organizational (created and shared by someone within your company). Online services content packs are updated daily by default but can be altered to align with the refresh need. Organizational content packs are refreshed according to the original creator and are inherited with the dataset.
- External online data sources – as the dataset is created, it will allow you to create and edit a refresh frequency. This should be aligned with the updating of the source itself as has been stated before.
- Other on premise data sources – these sources handle access security using the Power BI Personal Gateway and the Azure Service Bus. To download and configure the Personal Gateway, see: https://powerbi.microsoft.com/en-us/documentation/powerbi-personal-gateway/
- Real-time data updates (Power BI REST API) and real-time analytics (Azure Stream Analytics) to update a Power BI Dashboard. The REST API allows you to use your own application to push data into your Dashboard in real-time. Ellipse worked with a Communication Platform vendor recently that could expose near real-time machine data through a REST API – this would be a relevant application for something like this. Likewise, the Azure Stream Analytics interface allows near real-time updating of streamed data. More can be seen here: https://msdn.microsoft.com/library/mt267603.aspx.
On Premise Data Refresh Considerations
On premise data includes access considerations. This means secured access much like a VPN or a bridge that authenticates the user to the data. In the case of Power BI, a Personal Gateway may be required. This can get confusing since there is also a Data Management Gateway (DMG) that provides similar functionality and was used prior to July 24, 2015 for refreshing Power BI workbooks (however, this was an add-on service to Office 365). The two gateways don’t play nicely together and care should be made to uninstall the DMG if using the Power BI Personal Gateway or else conflicts and errors will occur. Here’s a link to a short video demonstrating how the Personal Gateway is setup and some simple configurations:
Since the Power BI Personal Gateway (and, Azure Service Bus) is essential to refreshing some On-Premise data, consider the following:
- It’s only available on 64-bit Windows operating systems
- Be careful of possible conflicts
- As noted above, the DMG and the Power BI Personal Gateway cannot be installed on the same machine. The Personal Gateway shares code with the DMG and is considered the evolutionary gateway.
- Ditto that for the Analysis Services Connector – this won’t be a problem for most since this connector will be generally installed on the SSAS tabular server.
- It only needs to be installed once. One gateway can support any number of datasets.
- Windows User Account Type matters when the gateway is installed… If installed as an Administrator, the Personal Gateway runs as a service. This means that computer and service must be running but that the user does not have to be logged in. If it is installed not using Administrator privileges, it will run as an application, forcing the user to be logged in to the computer or else refreshes will fail.
Gartner is paying attention. In Gartner’s latest report on Business Intelligence and Analytics Platform just released this month, Microsoft is the clear leader emerging from a fairly crowded Leader’s Quadrant from last year (last year included nine vendors; this year there’s only three). With the confluence of AX7, Azure, and other Microsoft cloud based solutions, it just makes sense for Dynamics AX customers to embrace Power BI for their analytical needs. And, more and more content is being served up for Power BI which will continue to add value and reduce time to value.
It is however, crazy to try to keep up.
Acquisitions like Datazen (mobile/anywhere visualizations), new product development like Cortana Analytics Suite (think of a Personal Assistant on steroids integrating predictive behavior), and big data evolution (Microsoft’s acquisition of Revolution Analytics and the growing legacy of Hadoop to process big data) will continue to drive data insight by providing innovative-anywhere delivery visualizations in near real-time with automation driven by predictive analytics.
Looking for more on Power BI? Check out “Power BI: Focusing on Driving Time to Value”