The Evolution of Machine Vision

This article traces the history of machine vision and looks at where the technology is headed.

Technical Article May 22, 2020 by Jeff Kerns

If you don’t know what machine or computer vision is, look no further than your smartphone. Using the camera on your phone to check details or prices on a product, scan a code, or chase Pokémon in an augmented reality app are all examples of machine vision. This technology has been around longer than many might think.

Looking From the Past

The 1950s: Researchers began using neural networks to identify images
The 1970s: First commercial use identifying handwriting from a typed script
The 1980s: Machine vision began appearing in manufacturing, mostly to identify symbols and labels

What is Machine Vision Used For?

Early machine vision was impressive given how little the computing power of these decades could do. For example, the first digital camera was invented around 1975 with a whopping 0.01 MP of resolution. This changed dramatically as computing power increased through the 1990s and digital imaging became affordable to mass markets.

Probably one of the largest drivers of machine vision was the smartphone, which dramatically reduced the cost, size, and power consumption of sensors and cameras. Additionally, smartphones helped drive mobile web technology, so video and images can be sent wirelessly in real time to other devices.

What Does Machine Vision Look Like in Present Day?

Today, machine and computer vision are used in many industries, from advanced industrial robots and self-driving cars and tractors to facial recognition on security cameras or social media. The market for this technology is growing. In 2019, Forbes published an article stating that the market is expected to reach $48.6 billion by 2022.

In 2018, PR Newswire published a report stating that “The global market size of Industrial Machine Vision was 8.44 Billion US$ in 2018, with a CAGR of 6.86% between 2019 and 2025.” The report also cites a compound annual growth rate (CAGR) of 9.75% for 3D imaging alone.
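To get a feel for what a 6.86% CAGR implies, the short sketch below compounds the quoted 2018 figure forward. It assumes the growth rate is applied to the 2018 base for each of the seven years from 2019 through 2025, which is one reading of the quote and not something the report is confirmed to state. The function name `project_market` is made up for this illustration.

```python
def project_market(base_value, cagr, years):
    """Compound base_value forward by `cagr` (as a fraction) for `years` years."""
    return base_value * (1 + cagr) ** years


BASE_2018_BILLION_USD = 8.44   # figure quoted from the report
CAGR = 0.0686                  # 6.86% per year, quoted from the report

# Assumption: growth compounds once per year, starting from the 2018 base.
for year in range(2019, 2026):
    value = project_market(BASE_2018_BILLION_USD, CAGR, year - 2018)
    print(f"{year}: {value:.2f} billion US$")
```

Under that assumption, the 2025 figure comes out a little above 13 billion US$.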

However, it is the technology driving these markets that is truly inspiring. As cameras, processors, and networks advance, engineers are able to make cameras do extraordinary things. Yet some objectives haven’t changed since the beginning of vision technology. Early applications and research invested heavily in finding the edges of objects. Finding edges is still at the root of machine vision, although it now operates at a much higher level of fidelity.
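As a rough illustration of what finding edges means in practice, here is a minimal sketch of one classic approach, the Sobel operator, written in plain Python. It is not the method used by any particular vision system mentioned in this article. The function name, the tiny synthetic image, and the threshold value are all assumptions chosen for the example.

```python
def sobel_edges(image, threshold):
    """Mark pixels whose Sobel gradient magnitude exceeds `threshold`.

    `image` is a list of rows of grayscale values. Border pixels are
    left as 0 because the 3x3 kernel does not fit there.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # vertical gradient kernel
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += kx[j][i] * pixel
                    gy += ky[j][i] * pixel
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges


# Synthetic 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = sobel_edges(image, threshold=200)
for row in edges:
    print(row)
```

The only flagged pixels sit on either side of the dark-to-bright boundary, which is the vertical edge in this toy image. Production systems typically add smoothing, sub-pixel refinement, and far higher resolution on top of this basic idea.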

Cameras today have enough resolution to recognize advanced geometries, identify what parts or products the camera is ‘looking’ at, and even inspect a component against its geometric dimensioning and tolerancing (GD&T) requirements. Recognizing edges and parts and scanning labels also helps the adoption of other technologies, such as augmented reality.


Several vision system technologies including cameras built for harsh industrial environments. Image used courtesy of Cognex.

The software can match edges or scan codes to identify objects and align the real-world image on a screen to a model. Superimposing CAD images or animations over a real-world image can help guide technicians through installations, troubleshooting, and other maintenance needs. These vision tools can reduce training and time in the field by providing step-by-step instructions through animations, blueprints, and more.

Another common technology operating on a higher level today is label scanning. Manufacturing, shipping, and packaging lines are becoming more dynamic and flexible. Customers are more demanding than ever, so products must move quickly and to the right places. Cameras must be able to scan labels at high rates, even when the labels are partially torn, sit at various angles, or are on packages of different sizes, shapes, and heights. This can be a problem for a camera with a fixed focal point.

A liquid lens uses a layer of water, or another conductive liquid, and a layer of oil. Applying a small voltage changes the curvature of the lens. Changes in electrical power on the order of milliwatts can shift the focal point of the lens in tens of thousandths of a second. An early demonstration in 2005 processed 250 images in a second.
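To see why a fixed focal point struggles with packages of different heights, consider the idealized thin-lens equation, 1/f = 1/d_o + 1/d_i. The sketch below is a simplification: it treats the camera as a single thin lens at a fixed distance from the sensor and asks what focal length would keep objects at several distances in focus. The function name and all the distances are hypothetical, and real liquid lens modules are usually combined with other fixed optics.

```python
def required_focal_length_mm(object_distance_mm, image_distance_mm):
    """Thin-lens equation solved for f: 1/f = 1/d_o + 1/d_i."""
    return 1.0 / (1.0 / object_distance_mm + 1.0 / image_distance_mm)


SENSOR_DISTANCE_MM = 20.0  # hypothetical fixed lens-to-sensor distance

# Hypothetical label distances, e.g. tall, medium, and short packages on a line.
for label_distance_mm in (300.0, 500.0, 1000.0):
    f = required_focal_length_mm(label_distance_mm, SENSOR_DISTANCE_MM)
    print(f"label at {label_distance_mm:.0f} mm -> focal length {f:.2f} mm")
```

Each distance calls for a slightly different focal length. A fixed lens can only satisfy one of them exactly, while a lens whose curvature can be changed electrically can move between them without any mechanical motion.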

The Future: Moving Forward With Machine Vision

Moving forward, cameras are advancing and increasing the amount of data they are able to collect. Currently, each pixel is able to record data such as color or temperature, depending on the camera. As software advances, cameras are starting to work as teams. Having multiple cameras can add information such as depth or distance.
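One common way a pair of cameras recovers distance is triangulation: the same point appears shifted between the two images, and that shift (the disparity) shrinks as the point gets farther away. The sketch below assumes an idealized setup with two identical, parallel pinhole cameras and rectified images, where depth = focal length × baseline / disparity. The function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by two parallel, rectified cameras.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 800 px focal length, cameras mounted 10 cm apart.
for disparity in (80.0, 40.0, 20.0):
    depth = depth_from_disparity(800.0, 0.10, disparity)
    print(f"disparity {disparity:.0f} px -> depth {depth:.2f} m")
```

Halving the disparity doubles the estimated depth, which is why distant objects are harder to range precisely: a one-pixel matching error matters much more when the disparity is small.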

While the liquid lens is able to change focus rapidly, multiple cameras can operate closer to the way human eyes work together to give us depth perception. Self-driving applications might be the obvious example, but let's consider something simpler.

A FLIR machine vision product with on-camera deep learning and third-generation Pregius. Image used courtesy of FLIR Systems.

Pick-and-place applications can be simple if a product has a specific location, packaging, shape, weight, etc. However, as production lines become more flexible and handle more complex automated features, robots are being asked to reach into bins filled with various products. Machine vision can identify objects and their orientation, but adding depth to this application could greatly improve accuracy.

Technology such as 3D stereo vision is able to combine multiple images taken from different angles. This technology could continue to be adopted in pick-and-place and self-driving applications. But as machine vision, simulation, and computing power advance, it is hard to tell where machine vision will stop.