Data Flow Tutorial With Mage.ai | Part 6: Refined Data Presentation
We conclude the data pipeline project by developing a business intelligence dashboard that presents machine data, on a reliable schedule, to a leadership or maintenance team.
Check out the previous articles in this data pipeline series:
- Part 1: The Challenge of Data
- Part 2: Initializing the Software
- Part 3: Connecting to Google Sheets
- Part 4: PostgreSQL Database
- Part 5: Basic Data Dashboards
If you’ve made it this far in the series, you’ve trekked through the murky depths of the data engineering realm. Admittedly, the workflow seen thus far is much more complex than simple Excel analytics. However, the return on the mental and digital gymnastics is much greater than what a simple spreadsheet can provide.
Final Phase of the Data Flow Project
At the conclusion of this process, the Mage.ai pipeline can run on a scheduled cadence with very little manual intervention. The Postgres database is designed to store large volumes of data and to handle frequent writes of new data, a workload that a 'master' Excel file could not sustain for long in the real world.
The missing piece of the puzzle is the visual component that business leaders and operators are most familiar with. As mentioned in other articles, Power BI and Tableau are some of the most common data dashboarding tools in the corporate world. These would be viable options for our use case, offering quick "click, not code" integrations with the Postgres database. However, I will use Grafana OSS, which is free and open source.
Figure 1. Example of Grafana dashboard. Image used courtesy of Grafana
Provisioning Grafana
We will need to update the compose.yaml file that we have been using to provision applications throughout this series. Add the Grafana service, as shown below, into your compose.yaml file:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - ./grafana_data:/var/lib/grafana:rw
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=adminpassword
Entering the command docker compose up -d in a terminal window from the project directory will start the containerized applications defined in the YAML file. The Grafana dashboard can be accessed at http://localhost:3000. You should be greeted with a login UI, where you may enter the GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD values defined in the compose.yaml file. Obviously, my choice of values is weak from a security standpoint and is only appropriate for our introductory demonstration purposes:
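If the login page does not load, it can help to confirm that the container actually started. Docker Compose provides a couple of quick sanity checks, run from the same project directory:

```shell
# List the services defined in compose.yaml and their current state
docker compose ps

# Follow Grafana's startup logs to spot configuration errors
docker compose logs -f grafana
```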
Figure 2. Grafana login page. Image used courtesy of the author
Integrating with Postgres
Upon logging into the platform, our first step is to connect to the Postgres database. In the welcome window, select the following icon:
Figure 3. Integrate a data source widget. Image used courtesy of the author
Search for Postgres in the search bar and select the PostgreSQL service that appears:
Figure 4. PostgreSQL integration widget. Image used courtesy of the author
Upon selecting the widget in Figure 4, you should be directed to configure the connection to the Postgres database. The configuration will mirror the environment variables defined in the .env file of our project directory. These are the same values passed to the Postgres container in the compose.yaml file here:
Figure 5. The environment variables can be found in the .env file. Image used courtesy of the author
We will also need the IP address of our local machine, since Docker uses port forwarding from the host machine to direct traffic to the container instance. On a Windows machine, the IP address can be obtained by running ipconfig in a command prompt window. On a Mac, a variation of the command, ipconfig getifaddr en0, returns the IP address of the wireless network interface:
Figure 6. Obtaining the IP address on a Mac machine. Image used courtesy of the author
With all configuration values in hand, complete the connection information of the widget:
Figure 7. Entering connection info for Grafana integration to PostgreSQL. Image used courtesy of the author
At the bottom of the UI, save and test the connection with the blue button. It should yield the transparent green “Database Connection OK” notification if all goes well.
Figure 8. A successful integration. Image used courtesy of the author
Building a Dashboard
Finally! We can get to designing the visualization. This piece of the solution is by far the most tangible and visible to the leadership stakeholders in our use case. It will provide an at-a-glance summary of the health of the machine you have been tasked to improve. In the same configuration window we left off in Figure 8, scroll back up to the top and select "Build a dashboard" in the top right-hand corner:
Figure 9. Build a dashboard. Image used courtesy of the author
The next guided step is to create a new visualization; Grafana certainly makes the workflow intuitive and easy to follow:
Figure 10. Add a new visualization. Image used courtesy of the author
You’ve now entered a sandbox world where there are many options that can showcase the data. I can’t begin to cover all possible visualization scenarios, so I encourage you to explore the vast set of features. However, I will offer some basic solutions and believe that Grafana offers a great product for customized dashboards.
The first scenario is a high-level faults-per-day count. Within the query tab, enter the following configuration settings. Notice that the components are sourced from the Faults table in the PostgreSQL database:
Figure 11. The Query settings interface. Image used courtesy of the author
Below the query editor, the window shows the SQL query corresponding to the selected settings. Grafana makes things somewhat easier by building the SQL code from the point-and-click UI.
Figure 12. SQL code generated from parameter selection in Figure 11. Image used courtesy of the author
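For reference, a per-day aggregation like the one Grafana generates typically looks something like the sketch below. The exact code will depend on your panel settings, and the timestamp column name (assumed here to be "timestamp") will depend on how the Faults table was loaded in Part 4:

```sql
-- Sketch of a faults-per-day aggregation; "timestamp" is an assumed column name
SELECT
  date_trunc('day', "timestamp") AS "day",
  COUNT(fault) AS "fault count"
FROM "Faults"
GROUP BY 1
ORDER BY 1;
```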
Clicking apply on the visualization will complete the first graph. You will now see a preview of the dashboard and have the option to add additional graphs with the “Add” button below:
Figure 13. Aggregated Faults Per Day graph. Image used courtesy of the author
A second potential visualization could be the frequency of faults over the 72-hour run. In the query builder tab of the newly created visual, we can write the query in SQL code instead of using the UI tools:
Figure 14. Code selection in the query tab. Image used courtesy of the author
We can enter the following query into the code box:
SELECT
  COUNT(fault) AS "fault frequency",
  fault
FROM "Faults"
GROUP BY fault
Figure 15. Entering the SQL query in the UI. Image used courtesy of the author
Clicking on apply will yield the following dashboard:
Figure 16. Fault frequency distribution for the 72-hour run. Image used courtesy of the author
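As a small refinement, the frequency query can be extended with an ORDER BY clause so the most common faults are ranked first, which makes bar-style panels easier to read:

```sql
-- Same aggregation as above, sorted so the most frequent faults appear first
SELECT
  COUNT(fault) AS "fault frequency",
  fault
FROM "Faults"
GROUP BY fault
ORDER BY "fault frequency" DESC;
```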
Data: From Source to Visualization
We hope this article series has provided you with a high-level understanding of more powerful tools that can be used to build resilient, scalable manufacturing operations reports.
While the scope of technologies used may at first seem overwhelming, the only reasonable solution is to be persistent in learning. How else did we learn automation, electronics, and other control technologies? With consistency, these technologies will become more familiar and these powerful tools can help you tap into unrealized opportunities for yourself, your business, or your employer.