
Infrastructure From IT to OT: SCADA Platform Example with Ansible

May 21, 2024 by Munir Ahmad

High-level IT/OT integration in a manufacturing environment requires skills in both network communication and field equipment. This example focuses on automating remote connections between computers.

A modern SCADA/DCS environment comprises IT and OT infrastructure, including servers, workstations, network switches and routers, PLCs, RTUs, and IEDs, all of which can be configured and tuned from a central software platform. Ansible is a simple and powerful tool that can be used to deploy SCADA systems. It can automate the deployment of applications across multiple servers or devices, configure databases, set up communication protocols, and start and stop crucial services.

In this example, we will use a simple automation platform to issue commands remotely to a pair of servers. Although limited in scope, this high-level walkthrough can be paired with other tutorials to work directly with PLCs and floor-level equipment to create a more complete SCADA system.


What is a SCADA Automation Platform?

Red Hat Ansible is an open-source, agentless automation platform. Like other platforms of its kind, it allows system users and administrators to automate repetitive tasks such as device configuration, package installation, SCADA application deployment, file copying, and network configuration across multiple nodes simultaneously. Within the Ansible environment, the tasks to be performed on different hosts are written in human-readable files called playbooks, which also make the deployed infrastructure easy to understand.

Ansible is a Python-based program with a rich ecosystem of plugins. The most important type of plugin is the module, which does the actual work on the managed hosts. Other plugin types include inventory, connection, and become plugins.


Infrastructure of host network

Figure 1. The Ansible core package is installed on the control node, which manages inventory hosts consisting of servers and network switches.


A major strength of Ansible is its cross-platform support, which covers Linux, Windows, and both physical and virtual (including cloud) environments. We can manage different hosts (RHEL, Ubuntu, and Windows) with a single playbook.

As described, Ansible playbooks are written in YAML and are easy to read and understand in their plain text format. Every aspect of the IT infrastructure can be described and documented. Another key advantage is support for dynamic host inventories, which means that the list of managed host machines can be updated automatically from an external source. Ansible can also integrate with third-party systems such as Jenkins or Red Hat Satellite.


The Architecture of Ansible, How Ansible Works, and Playbooks

The Ansible core package can be installed on most Linux machines. Windows is not supported natively as a control node, although Ansible can run there under the Windows Subsystem for Linux. Ansible is agentless, which means that there is no need to install a dedicated agent on the managed hosts. Ansible usually connects to Linux hosts using OpenSSH and to Windows hosts using WinRM. Because it relies on these standard remote-management protocols rather than a custom agent, there is less additional software to secure and maintain on each host.


Control Node or Ansible Controller

The computer on which the Ansible package is installed is called the Ansible controller, or control node. From the control node, we can automate servers, workstations, network devices, and even public and private cloud infrastructure. Although the control node itself cannot run natively on Windows, it can automate and manage Windows hosts. No agent needs to be installed on the managed hosts.


Inventory Hosts

The systems managed by Ansible are called inventory hosts or managed hosts. In the inventory file, we list each system by its hostname or IP address and organize the systems into groups based on geographic location, platform, and so on. Ansible uses these names and groups to refer to the systems.
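
As a rough illustration of grouping, the sketch below shows a static inventory written in YAML format. The group name appservers and the hosts server1 and server2 match the example later in this article, but the addresses are placeholders from a documentation range and the file name inventory.yml is hypothetical; the article's own inventory file may use a different format.

```yaml
# inventory.yml (hypothetical file name)
# Static inventory sketch: one group containing two managed hosts.
all:
  children:
    appservers:                      # group of SCADA application servers
      hosts:
        server1:
          ansible_host: 192.0.2.11   # placeholder address, not from the article
        server2:
          ansible_host: 192.0.2.12   # placeholder address, not from the article
```

Running `ansible-inventory -i inventory.yml --graph` on the control node should print the group tree, which is a quick way to confirm that the file parses as intended.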


Ansible Plugins

Numerous plugins are built for Ansible. One of the key plugin types is the module, which is the actual worker that performs a task. Modules are called from an Ansible playbook, which is a YAML file containing a single play or a list of plays. YAML is easy to read and understand, and it is often used to write configuration files. Interestingly, YAML originally stood for “Yet Another Markup Language,” as it was developed in the era of markup languages like HTML and XML; the name was later changed to “YAML Ain't Markup Language,” and its syntax is simpler than either.


Ansible plugins

Figure 2. Ansible includes a rich set of plugin types.


Each ‘play’ targets a set of inventory hosts, so a playbook with more than one play can target different sets of hosts. Each play contains tasks, and each task calls a module to do the real work, such as copying a file, updating software, or installing packages on the remote devices.
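
The relationship between plays, hosts, and tasks can be sketched as follows. This is only an illustrative layout: the group names appservers and dbservers, the file paths, and the task choices are assumptions made for the example, not part of the article's project.

```yaml
---
# Play 1: runs only on hosts in the (assumed) "appservers" group.
- name: Prepare application servers
  hosts: appservers
  tasks:
    - name: Copy a configuration file to each host
      ansible.builtin.copy:
        src: files/app.conf        # hypothetical file on the control node
        dest: /tmp/app.conf        # hypothetical destination on the managed host
        mode: "0644"

# Play 2: targets a different (hypothetical) group in the same playbook.
- name: Update database servers
  hosts: dbservers
  become: true                     # package upgrades need elevated privileges
  tasks:
    - name: Upgrade all installed packages
      ansible.builtin.dnf:
        name: "*"
        state: latest
```

Each task names a module (here ansible.builtin.copy and ansible.builtin.dnf) and passes it arguments; Ansible runs the plays in order, and within each play it runs the tasks in order on every targeted host.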


Inventory Plugin

We can organize hosts into different kinds of inventories. An inventory can be defined statically in a text file or obtained dynamically from external sources.


Connection Plugin

This plugin defines how the Ansible controller connects to the managed hosts. For Linux systems, the SSH connection plugin is used for remote connections, and for Windows systems, the WinRM plugin is used.
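
One common way to select a connection plugin is through inventory variables. The sketch below assumes a YAML-format inventory with one Linux group and one Windows group; the group names, the host winhost1, and the user names are hypothetical, and the WinRM settings would need to match how the Windows host is actually configured.

```yaml
# Inventory sketch: choosing a connection plugin per group.
all:
  children:
    linux_hosts:                       # hypothetical group name
      hosts:
        server1:
      vars:
        ansible_connection: ssh        # SSH connection plugin (the default)
        ansible_user: admin            # hypothetical remote user
    windows_hosts:                     # hypothetical group name
      hosts:
        winhost1:                      # hypothetical Windows host
      vars:
        ansible_connection: winrm      # WinRM connection plugin
        ansible_port: 5986             # WinRM over HTTPS
        ansible_winrm_transport: ntlm  # assumed authentication method
        ansible_user: Administrator    # hypothetical remote user
```

Note that the winrm connection plugin relies on the pywinrm Python library being available on the control node.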


Become Plugin

Once connected to a managed host, should Ansible act as a normal user or as a root/Administrator user in order to install software and start or stop essential services? Become plugins allow Ansible to escalate privileges (for example, through sudo) and ‘become’ another user in order to accomplish such tasks.
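
A minimal sketch of privilege escalation in a play is shown below. The group name appservers and the choice of task are assumptions for illustration; become_method and become_user are written out only to make the defaults visible.

```yaml
---
- name: Restart a service with elevated privileges
  hosts: appservers          # assumed group name
  become: true               # escalate privileges after connecting
  become_method: sudo        # sudo is the default method
  become_user: root          # root is the default target user
  tasks:
    - name: Restart the httpd service
      ansible.builtin.service:
        name: httpd
        state: restarted
```

If sudo on the managed host requires a password, the playbook can be run with the `--ask-become-pass` option so that Ansible prompts for it.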


Example: Write a Playbook to Install HTTPD Package on Managed Hosts

In this example, we will use three machines: one is the control node, and the other two are inventory hosts managed by the control node using Ansible. The playbook will install the httpd software package on two SCADA application servers. For those unfamiliar, httpd is the Apache HTTP Server daemon, a program that runs in the background and serves web pages over HTTP.

  1. Let's name our first and second machines ‘server1’ and ‘server2’ respectively, which will act as our managed nodes, while the third machine named ‘workstation’ will act as the control node.
    hostname    | IP address | OS version       | Ansible package
    server1     |            | RHEL release 9.3 | Not required
    server2     |            | RHEL release 9.3 | Not required
    workstation |            | RHEL release 9.3 | Ansible core 2.14.9
  2. An easy way to find the hostname and OS version of each of the three machines is to run the ‘hostname’ command and to read the contents of the file ‘/etc/redhat-release’.

    Identifying workstation name

    Verifying version

  3. Ansible core version 2.14 is installed on the control node ‘workstation’. The command ‘ansible --version’ reports the installed version.

    Checking Ansible version

  4. Ping both servers to check the communication between the control node and inventory hosts.

    Pinging servers for communication

  5. The project directory hierarchy contains the necessary files to run a playbook.
    • playbook.yml: tasks are written to install httpd webserver software on managed hosts, i.e. server1 and server2.
    • ansible.cfg: Ansible configuration file for the project
    • group_vars/appservers/var.yml: defines variables for a specific group
    • host_vars/server1/vars.yml and host_vars/server2/vars.yml: define variables for each individual server
    • linux: inventory file

    YAML project scope

    Figure 3. The project directory contains the inventory file, playbook.yml, and the group_vars and host_vars directories that define variables.


  6. As shown in Figure 3, the playbook.yml code first ensures that the latest version of httpd is installed, then enables the httpd service and deploys content to the index.html file.
  7. To check for syntax errors before executing the playbook, use the following command: ‘ansible-playbook playbook.yml --syntax-check’

    Syntax error check

  8. If no errors are found, execute the Ansible playbook, which installs the httpd software package on the managed hosts. Ansible executes the tasks step by step as shown below:

    Project execution steps and progress

  9. Now let's check that the changes have been made successfully on the managed hosts. The following screenshots confirm that the httpd package has been installed on each server and that the content has been successfully written to the file.

    Server 1 check

    Server 2 check
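
The playbook described in step 6 appears only as a screenshot, so the following is a minimal sketch of what such a playbook might look like rather than the author's exact file. It assumes the inventory group is named appservers (suggested by the group_vars directory), that the service should be started as well as enabled, and that the page is written to the default Apache document root on RHEL; the page text itself is a placeholder.

```yaml
---
- name: Install and configure httpd on the SCADA application servers
  hosts: appservers                  # assumed group containing server1 and server2
  become: true                       # package and service changes need root
  tasks:
    - name: Ensure the latest httpd package is installed
      ansible.builtin.dnf:
        name: httpd
        state: latest

    - name: Ensure the httpd service is enabled and running
      ansible.builtin.service:
        name: httpd
        enabled: true
        state: started               # assumption: the article only mentions enabling

    - name: Deploy content to index.html
      ansible.builtin.copy:
        content: "Hello from {{ inventory_hostname }}\n"   # placeholder page text
        dest: /var/www/html/index.html
        mode: "0644"
```

Because each module describes a desired state, running the playbook a second time should report no changes if the hosts are already configured this way.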

Starting with this very simple example, we can now install software, copy files, configure services, and set up databases on a single server or even thousands of them using a single playbook.


All images used courtesy of the author