Unless you've been living under a rock, you've probably heard people talking about big data. Just like "the cloud" or "good BBQ," the exact meaning of this term varies depending on who you're talking to. For our purposes, the Wikipedia definition is a pretty decent start: "data sets so large or complex that traditional data processing applications are inadequate." To that definition, I might suggest adding a second clause: "…and for which cloud-based visualization and analytics tools are becoming increasingly popular and useful."
Microsoft is a pretty significant player in the big data market in a number of ways: they've scaled SQL Server to efficiently handle very large transactional data sets, they're pushing a number of nifty analytics features into the Azure feature list, and they're offering some very slick tools under their Power BI brand. In this short article (and the series of articles that will follow it at irregular intervals), I want to try to demystify a few things about Power BI.
Here at Summit 7, we often talk to customers who say things like "I want more insight into my data"—a very generic opening statement that usually leads quickly into a very detailed discussion of what "more", "insight," and "data" mean in that particular context. For example, one of our customers has a large database of every real estate transaction in their state, going back for decades. They want to be able to see patterns and trends in the data, perform forecasts, and quickly answer questions such as "how many parcels of land have been bought or sold in area X in the last 30 days?" or "what is the aggregate value of all commercial real estate transactions year to date in cities of more than 200,000 population?"
The traditional way to answer such questions was to have a database administrator construct a report, run it, and then massage the results into a readable and useful format. That was a great solution back in the days when reports came out looking like this…
But modern users need more insight faster, and modern administrators have too many other things to do. These two trends are driving rapid adoption of self-service business intelligence tools, which is exactly what Power BI is.
(Brief digression: Microsoft has a ton of material that explains what Power BI is, how it works, and why you might want to use it at their powerbi.com web site. Now might be a good time to go poke around there if you're curious.)
For the sake of argument, let's assume that you want to start using Power BI. How do you get started? Simple. First you need an Office 365 tenant. This is non-negotiable. Although there are desktop and mobile Power BI clients, they're licensed through the Office 365 service. You can download the desktop client, load local Excel or Access files, and work with them completely offline… but you still need an Office 365 subscription.
Next, you need Power BI licenses. Power BI comes in two flavors: free and Pro. The free tier can easily be added to your tenant by doing the following:
- Go to PowerBI.com.
- Type your Office 365 email address into the text field and click the "Use it free" button.
- When prompted, type your Office 365 password.
This adds licenses to your Office 365 subscription. In fact, it adds an infinite number of Power BI licenses. (The "Power BI for Office 365 Individual User Trial" licenses you see listed first are ancient; you can ignore them because that SKU no longer exists.)
Once you have secured these licenses, you then need to assign them to users. The simplest way to start, of course, is by adding a license to your own user account; you can use all of the standard methods to add licenses, including editing individual users' properties, using the bulk edit features in the Office 365 admin portal, or using the Set-MsolUserLicense cmdlet.
After you assign licenses and wait a few minutes, you should see Power BI show up in the Office app launcher for the licensed user. When I added the license to my own account, I had to log out and then back in before I saw Power BI in the waffle—it's easy to spot thanks to the brilliant yellow color of its icon.
Now that all the licensing stuff is done, what can you do with it? I decided to grab a sample data set and experiment a bit, so I went to faa.gov and grabbed their database of all aircraft registered in the United States. Once I unzipped it, I had a nearly 200MB CSV file containing a wealth of data for every registered aircraft known to the FAA.
First I tried uploading it directly by doing the following:
1. I opened Power BI from the Office waffle; that loaded my default workspace page.
2. On the right side of the workspace page, I clicked the button labeled "Get" in the Files section.
3. When the Files page loaded, I clicked "Local File", then navigated to the CSV file's location and selected it.
I expected this to load the file, and I got no error messages or indications that anything was wrong… but I also didn't see the file in my workspace. Suspecting that the FAA had a different understanding of CSV files than the rest of the world, I opened the CSV file in Excel and found that Excel 2016 assumed it was a tab-delimited file. I saved it as an Excel binary workbook and re-uploaded it. This time, Power BI acknowledged that there was a file being processed.
Unfortunately, that didn't work either; I got an error complaining that my Excel workbook didn't have any data. Luckily this article clearly explained the solution: re-open the workbook, define an Excel table to contain the data, and then re-upload it. (Now's a good time to point out that it would have been faster if I'd imported a file stored in a OneDrive or OneDrive for Business folder—and when you do, changes to the file you import are automatically reflected in Power BI, a very useful feature.)
After the file contents made it to the Power BI service, I had to wait a few more minutes for the data to be ingested; the file immediately appeared as a dataset (with a yellow asterisk next to it, indicating that it was new) but there was also a pop-up message on the right side of the window indicating that the import was still in progress.
After I waited a few more minutes, the file was completely imported and I was presented with my new dataset:
This induced a mild feeling of panic, since I had no idea what to do next. That's why this is a multi-part series!
In my next post, I'll cover the process of creating reports and dashboards using the FAA data set, so if you want to prepare for that, you can grab the FAA data and upload it to your own workspace. Until then, you'll find that Microsoft has produced a ton of how-to content, including click-through guidance and demo videos. This getting-started article is pretty good as a place to start by using Microsoft's own canned data set. Power on!