In today’s data-driven age, a large amount of data gets generated daily from various sources such as supply chain and logistics, emails and multi-media, e-commerce websites, healthcare, transaction processing systems, etc. Power BI is a powerful business intelligence tool that provides various data analysis, visualization, and reporting functionalities. The data generated from the above sources can generate meaningful insights into Power BI by building and publishing reports and Power BI dashboards.
Power BI is one of the most popular data visualization, BI Tools, and analytics software products developed by Microsoft. It reports can be easily embedded in web applications or shared as a public link or pdf file. It provides insights from a variety of data sources, such as Dataverse, SQL Server, Azure, Excel, SAP, Snowflake, Teradata database, etc. by telling stories using reports and dashboards from data models.
You can clean, transform, and shape the data from multiple data sources using Power Query Editor so that the Power BI report developed using this data provides better decision-making and streamlined business practices. Its Desktop comes with Power Query Editor, where users can modify and transform the data into the desired shape. You can generate an insightful report with the help of Power BI built-in and custom visuals.
Data modeling and data analytics go hand in hand because a quality data model will provide impactful analytics for businesses. Data modeling is the process of analyzing, creating, and defining data model and their relationship to other data models. Data modeling helps people understand the logical structure of data within a database.
In this guide, I’ll show you how to build and publish a superstore
sales and profit report in Power BI using the US superstore retail transaction data to get valuable insights about sales and profits. Superstore retail transaction data available on Kaggle gives insights into online orders of a superstore in the US.
This article was published as a part of the Data Science Blogathon.
Power BI is one of the most popular data visualization, business intelligence, and analytics software products developed by Microsoft. Power BI provides insights from various data sources, such as Oracle, SAP Databases, SQL Server, Azure, Excel, Impala, Google Big Query, Snowflake, and Teradata databases, etc. by building and publishing reports and dashboards. Power BI consists of the following:
The building blocks of Power BI are:
1. Visualizations: A visualization is a visual representation of data, such as tables, column charts, scatter charts, gradient-colored maps, etc., which can represent your data visually.
2. Datasets: Power BI uses datasets for creating visuals. A dataset can be a single data source or a combination of multiple data sources like Web, Dataverse, CSV, etc. For example, you can create a dataset from three database fields, i.e., one SQL Server table, an Excel table, and online results of a promotional marketing campaign. Power BI has built-in connectors that help users easily connect to services such as MailChimp, Facebook, etc.
3. Reports: Reports are a collection of visualizations related to each other that appear together on a page. The report can be developed either using Power BI Service or Power BI Desktop. A report may contain more than a page. For example, the Power BI report was developed to analyze an organization’s quarterly sales.
4. Dashboards: Dashboards are single-page presentations having one or more visualizations added to them, which you can share with others. The dashboard provides quick insight into the story user is trying to present. Dashboards can be shared with other users or groups, who can then interact with your dashboards in the Power BI service or on their mobile devices. For example, the Sales dashboard can have donut charts, KPI scorecards, area charts, bar charts, maps, etc.
5. Tiles: A tile is a rectangular box with visuals like cards, pie charts, etc.
The get data option is used for loading data for visualization and data analysis. Developers can either load the data first or perform a data transformation before loading the data. The get data option is used to connect to different data sources and is present in the Home tab on the Power BI desktop. Users can import data from Azure Synapse Analytics SQL, Excel, Text/CSV, Web, Amazon RedShift, Oracle, MySQL, Snowflake, SAP databases, Google BigQuery, MariaDB, SharePoint List, etc., in Power BI.
Below are the three dataset connectivity modes available in Power BI:
1. Import mode: Import mode imports the data from the data source and stores it in the disk, thus, has fast performance. By default, the data is imported using import mode in Power BI.
The data must be fully loaded in import mode during the querying or data refreshing process.
2. Direct Query Mode: Direct Query mode directly retrieves data from underlying data sources with the help of native queries. This mode doesn’t import data but directly uses the data from the underlying data source. Direct Query mode is used when the data is nearly real-time and large in volume.
3. Composite Mode: Composite mode can combine imported datasets by Import and Direct Query modes or integrate various Direct Query data. It gives the best performance among the different dataset connectivity modes.
To develop the report, install Power BI Desktop.
Visit https://powerbi.microsoft.com/en-us/downloads/
Select Download from Microsoft Power BI Desktop.
A .exe file will be downloaded. Install the downloaded .exe.
Search for Power BI Desktop and open it. We can see the below screen after opening the Power BI Desktop.
In the Home tab, from the Get data option, we can get data from various data sources such as Excel workbook, Text/CSV, SQL Server, Power BI datasets, Power BI dataflows, Web, OData feed, Analysis Services, Parquet, Sharepoint folder, Azure Blob Storage, Azure Databricks, MariaDB, etc.
We can see the data sources options from the Recent sources option, which we have connected recently to the Power BI report.
We can add visuals to the report page by clicking the New Visual option.
With the Text box option, we can insert a text box in the report page and set the font type and font size for the text visual.
With More visual options, users can insert the visuals created by them or import the visuals from AppSource.
On the Power BI desktop’s left side are icons for the report view, data view, and model view. By default, when we open the Power BI desktop, the report view is selected.
In the Visualizations panel on the right side of the Power BI desktop, visuals such as stacked area charts, pie charts, matrix, waterfall charts, line, and clustered column charts are available. In the same panel, drill-through options like cross-filters are also present.
In the Insert tab, from the Text box option, we can insert a text box in the report page and set the font type and font size for the text visual. Using the Shapes option, we can insert shapes like block arrows pointing to different directions and basic shapes such as circles, ovals, lines, pentagons, hexagons, etc.
Using Paginated Report, Power Apps, and Power Automate options, we can create paginated reports, connect power apps, and power automate flow and apps in the Power BI Report.
We can insert the image and buttons in Power BI Report using the Image and buttons options.
In the Modeling tab, we have the Managing Relationships option to manage the relationships between the tables imported from the various data sources. Using the New Measure and New Column options, we can create measure and calculated columns in the Power BI report.
From the View tab, we can choose the theme for the report. Users can also customize the current theme by using Customize current theme option. Using the Page View option, we can make a report Fit to page, Fit to width, or actual size.
Using Filters and bookmarks options, the user can apply filters and bookmarks to the Power BI Report.
Now, we have completed the walkthrough of the interface of Power BI Desktop.
Users can create a report in Power BI and publish it after signing in with their organizational domain email ID.
Download the Superstore retail transaction data available on Kaggle to find meaningful insights from online orders of a superstore in the US.
From this data, we will analyze the sales and profits of the US superstores from 2014-2018 based on the discount rates of the products, city or states where people shop the most, customer loyalty product discount rates, etc.
Open the Downloaded Excel; we can see that downloaded Excel has an Orders sheet with 9995 rows and 21 columns.
Now we have the data, let’s start developing the sales and profit report in Power BI Desktop.
The get data option is used for loading data for visualization and analysis.
Step 1: Click on Get data in the Home tab, then select Excel workbook, now choose the downloaded excel file and Open it.
Users can import data from Azure Synapse Analytics SQL, Excel, Text/CSV, Web, Amazon RedShift, Oracle, MySQL, Snowflake, SAP databases, Google BigQuery, MariaDB, SharePoint List, etc., in Power BI Desktop based on their requirements.
Step 2: Select Sheet1-> Transform Data.
Perform data transformation to remove unnecessary columns, shape the data, evaluate and change column data types, and clean the data.
A. Shaping & Cleaning the Data
1. Right Click Sheet1 -> Rename. Rename it to Orders.
2. Right Click Order ID-> Remove Empty.
3. Right Click Order Date -> Date filters -> Between -> Filter Rows tab select Advanced radio button -> Select 05-08-2014 in date picker parallel to is after or equal to -> Select 11-05-2016 in date picker parallel to is before or equal to.
4. Right Click Quantity -> Number filters -> Greater than or equal to -> Filter Rows tab select Basic radio button -> Select 2 in drop-down parallel is greater than or equal to -> OK.
B. Evaluating and changing column data types
Now, click on every column and check if the Data Type is correctly selected or not.
If the Data Type is not correctly selected, change the Data Type from the dropdown. For example, the Data Type of Order Date & Ship Date column must be Date.
C. Dropping, Sorting column data & Renaming the columns
1. Check the columns and figure out which columns you don’t want to use in the report or keep in the data model.
2. In the imported data, the Row ID column is unnecessary; we can drop the Row ID column. Right Click Row ID-> Remove.
3. Sort Data based on the Descending Order of Profit column. Right Click Profit-> Sort Descending -> OK.
Change the Applied steps name in the right pane as per your understanding to avoid confusion.
After you have completed all your data transformations, click Close & Apply.
After clicking on close & apply, the report view opens.
Click Rename to rename the report page in Power BI Desktop.
In the Visualizations pane on the right side, from Canvas settings, choose vertical alignment as middle, and from Wallpaper, choose the color as black.
Select the Text box option in the Home tab, insert and write your text inside the text box, and set the alignment as left, font type as Arial, and font size as 32 for the text visual. Set the Height, Width, Horizontal, & Vertical Position of the text box from the General tab on the right side.
Using Image options from the Insert tab, insert the downloaded image and set the Height, Width, Horizontal & Vertical Position of the image from the General tab.
Now, select table visual from the Build visual tab and drag the Product
Name and Sales column to the Fields tab. In the Filters pane, in the Product Name column, select Filter type as Top N, Show items Top 5, By value drag Sales column and click on Apply filter.
Select two card visuals from the build visual. In one card visual, drag the Sales column in Fields as Sum and give the card title as Total Sales. In another card visual, drag the Customer Name column in Fields as Count(Distinct) and give the card title as Total Customers.
Select Slicer Visual, drag the Order Date column in the Fields, provide Slicer Header as Order Date, and select slider color, border color, position, line width, and another formatting for slicer visual using the options in the Format visual tab.
Select the Stacked column chart visual, drag the Segment column in the Axis and Sales column in Values as Sum. From Format visual in the right pane, provide font type, font size, color, and labels for the X-axis and Y-axis of the stacked column chart. Provide title, X-axis, and Y-axis names properly. Manage color, alignment, font type, font size, background, border, tooltips, etc., from the visual and general tab of the Format visual pane. This graph helps users visualize the sales based on the Consumer, Corporate, and Home office segments.
Select the Stacked column chart visual, drag the Ship Mode column in the Axis and Product Name column in Values as Count(Distinct). From Format visual in the right pane, provide font type, font size, color, and labels for the X-axis and Y-axis of the stacked column chart. Provide title, X-axis, and Y-axis names properly. Manage color, alignment, font type, font size, background, border, tooltips, etc., from the visual and general tab of the Format visual pane. We can rename column names in Axis or Values using Rename for this visual option. This graph helps users visualize the product count based on the Standard class, Second class, first class, and Same Day Ship Mode categories.
Similar to the above manner, visualize Sales based on City & State using TreeMap visual. Using a clustered bar chart visual, visualize Sales & Profit based on subcategories like Phones, Accessories, Chairs, etc..
Click (+) to create a new report page and name it Profit Analysis.
Do the similar page formatting as done for the Sales Analysis Page.
Select card visual, drag the Profit column in Fields as Sum and give the card title as Total Profit.
Create Total customers card & Order Date slicer as done for the Sales Analysis Page.
Create a Stacked column chart for displaying Profit based on segments.
Create a Stacked bar chart for displaying the Count of sub-categories by category.
Select the Waterfall chart visual, drag the Region column in the Category and the Profit column in Values as Sum. Manage color, alignment, font type, font size, background, border, tooltips, etc. from the visual and general tab of the Format visual pane.
Use Filled Map visual, to display the Top 10 States based on Profits by applying the Top N filter on State.,
Now, save the developed report.
Sign in to Power BI. To sign in, click here.
In the left pane, select Workspaces -> Create a Workspace.
Give workspace name as PowerBIReport. Click Save.
Now, in the Power BI Desktop, select Publish.
Choose PowerBIReport
workspace -> Select. The report is successfully published to the Power BI
Service.
You can see it using the Open ‘Report_Name’ in the success pop-up.
We have seen how to build and publish a superstore sales and profit report in Power BI using the US superstore retail transaction data to get valuable insights about sales and profits. We understood how reports developed using its support of multiple data sources, are secure, and are easily scalable. We learned how we could build it for a real-time scenario.
Ans. Power BI is a Microsoft-developed software for data visualization, business intelligence, and analytics, enabling users to build insightful reports and dashboards from diverse data sources.
Ans. Data visualization is crucial for simplifying complex data, making patterns and trends easily understandable, and aiding decision-makers in making informed decisions based on the presented insights.
Ans. Tools like Power BI, Tableau, QlikView, Google Data Studio, and Excel (with built-in charting features) are effective for achieving impactful data visualization.
Ans. The article offers a concise guide on visualizing data in a Power BI project, covering steps from importing and transforming data to building and formatting visualizations, and ultimately publishing the report to Power BI Service.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
I really love how Power BI has changed the reporting & visualization space. The learning path is easier and time involved is lesser if you take a guided learning path. Just love how you visualized this case study.
this was really helpful. thank you.