Data Flow Tutorial With Mage.ai | Part 6: Refined Data Presentation
We conclude the data pipeline project by developing a business intelligence dashboard that presents machine data, on a reliable schedule, to a leadership or maintenance team.
Check out the previous articles in this data pipeline series:
- Part 1: The Challenge of Data
- Part 2: Initializing the Software
- Part 3: Connecting to Google Sheets
- Part 4: PostgreSQL Database
- Part 5: Basic Data Dashboards
If you’ve made it this far in the series, you’ve trekked through the murky depths of the data engineering realm. Admittedly, the workflow seen thus far is much more complex than simple Excel analytics. However, the return on the mental and digital gymnastics is much greater than what a simple spreadsheet can provide.
Final Phase of the Data Flow Project
At the conclusion of this process, the Mage.ai pipeline can run on a scheduled cadence with very little manual intervention. The Postgres database is designed to store large volumes of data and to handle frequent writes of new data, a workload that a 'master' Excel file could not sustain for long in the real world.
The missing piece of the puzzle is the visual component that business leaders and operators are most familiar with. As mentioned in other articles, Power BI and Tableau are some of the most common data dashboarding tools in the corporate world. These would be viable options for our use case, offering quick "click, not code" integrations with the Postgres database. However, I will use Grafana OSS, which is free and open source.
Figure 1. Example of Grafana dashboard. Image used courtesy of Grafana
Provisioning Grafana
We will need to update the compose.yaml file that we have been using to provision applications throughout this series. Add the Grafana service, as shown below, into your compose.yaml file:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - ./grafana_data:/var/lib/grafana:rw
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=adminpassword
Entering the command docker compose up -d in a terminal window from the project directory will start the containerized applications defined in the YAML file. The Grafana dashboard can be accessed at http://localhost:3000. You should be greeted with a login UI, where you may enter the GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD values defined in the compose.yaml file. Obviously, my choice of values is weak from a security standpoint and is only appropriate for our introductory demonstration purposes:
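If the login page does not load, it can help to confirm that the container actually started. Docker Compose provides a couple of quick sanity checks, run from the same project directory:

```shell
# List the services defined in compose.yaml and their current state
docker compose ps

# Follow Grafana's startup logs to spot configuration errors
docker compose logs -f grafana
```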
Figure 2. Grafana login page. Image used courtesy of the author
Integrating with Postgres
Upon logging into the platform, our first step is to connect to the Postgres database. In the welcome window, select the following icon:
Figure 3. Integrate a data source widget. Image used courtesy of the author
Search for Postgres in the search bar and select the PostgreSQL service that appears:
Figure 4. PostgreSQL integration widget. Image used courtesy of the author
Upon selecting the widget in Figure 4, you should be directed to configure the connection to the Postgres database. The configuration will mirror the environment variables defined in the .env file of our project directory. These are the same values passed to the Postgres container in the compose.yaml file here:
Figure 5. The environment variables can be found in the .env file. Image used courtesy of the author
We will also need the IP address of our local machine, since Docker uses port forwarding from the host machine to direct traffic to the container instance. On a Windows machine, the IP address can be obtained by running ipconfig in a command prompt window. On a Mac, a variation of the command, ipconfig getifaddr en0, returns the IP address of the wireless network interface:
Figure 6. Obtaining the IP address on a Mac machine. Image used courtesy of the author
With all configuration values in hand, complete the connection information of the widget:
Figure 7. Entering connection info for Grafana integration to PostgreSQL. Image used courtesy of the author
At the bottom of the UI, save and test the connection with the blue button. It should yield the transparent green “Database Connection OK” notification if all goes well.
Figure 8. A successful integration. Image used courtesy of the author
Building a Dashboard
Finally! We can get to designing the visualization. This piece of the solution is by far the most tangible and visible to the leadership stakeholders in our use case. It will provide an at-a-glance summary of the health of the machine you have been tasked to improve. In the same configuration window we left off in Figure 8, scroll back up to the top and select "Build a dashboard" in the top right-hand corner:
Figure 9. Build a dashboard. Image used courtesy of the author
The next guided step is to create a new visualization; Grafana certainly makes the workflow intuitive and easy to follow:
Figure 10. Add a new visualization. Image used courtesy of the author
You’ve now entered a sandbox world where there are many options that can showcase the data. I can’t begin to cover all possible visualization scenarios, so I encourage you to explore the vast set of features. However, I will offer some basic solutions and believe that Grafana offers a great product for customized dashboards.
The first scenario is a high-level faults-per-day count. Within the query tab, enter the following configuration settings. Notice that the components are sourced from the Faults table in the PostgreSQL database:
Figure 11. The Query settings interface. Image used courtesy of the author
Below the query editor, the window shows the SQL query corresponding to the selected settings. Grafana makes things somewhat easier by building the SQL code from the point-and-click UI.
Figure 12. SQL code generated from parameter selection in Figure 11. Image used courtesy of the author
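For reference, a per-day aggregation like the one Grafana generates typically looks something like the sketch below. The exact code will depend on your panel settings, and the timestamp column name (assumed here to be "timestamp") will depend on how the Faults table was loaded in Part 4:

```sql
-- Sketch of a faults-per-day aggregation; "timestamp" is an assumed column name
SELECT
  date_trunc('day', "timestamp") AS "day",
  COUNT(fault) AS "fault count"
FROM "Faults"
GROUP BY 1
ORDER BY 1;
```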
Clicking apply on the visualization will complete the first graph. You will now see a preview of the dashboard and have the option to add additional graphs with the “Add” button below:
Figure 13. Aggregated Faults Per Day graph. Image used courtesy of the author
A second potential visualization could be the frequency of faults over the 72-hour run. In the query builder tab of the newly created visual, we can write the query in SQL code instead of using the UI tools:
Figure 14. Code selection in the query tab. Image used courtesy of the author
We can enter the following query into the code box:
SELECT
  COUNT(fault) AS "fault frequency",
  fault
FROM "Faults"
GROUP BY fault
Figure 15. Entering the SQL query in the UI. Image used courtesy of the author
Clicking on apply will yield the following dashboard:
Figure 16. Fault frequency distribution for the 72-hour run. Image used courtesy of the author
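As a small refinement, the frequency query can be extended with an ORDER BY clause so the most common faults are ranked first, which makes bar-style panels easier to read:

```sql
-- Same aggregation as above, sorted so the most frequent faults appear first
SELECT
  COUNT(fault) AS "fault frequency",
  fault
FROM "Faults"
GROUP BY fault
ORDER BY "fault frequency" DESC;
```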
Data: From Source to Visualization
We hope this article series has provided you with a high-level understanding of more powerful tools that can be used to build resilient, scalable manufacturing operations reports.
While the scope of technologies used may at first seem overwhelming, the only reasonable solution is to be persistent in learning. How else did we learn automation, electronics, and other control technologies? With consistency, these technologies will become more familiar and these powerful tools can help you tap into unrealized opportunities for yourself, your business, or your employer.