IIoT condition monitoring for Australian manufacturing sites is most useful when it sits between the asset and the maintenance team in a way that actually changes what happens on the floor. Fitting sensors and collecting data is the straightforward part. The work that determines whether a programme delivers is selecting the right assets, sampling at a rate the failure mode demands, moving the data cleanly into a system that can act on it, and building alerts the maintenance team trusts. Metromotion Controls is a control systems integrator based in Mount Waverley that builds condition-monitoring and production-data systems for manufacturers across Australia. This guide sets out how to approach a condition-monitoring programme as an engineering discipline rather than a sensor-buying exercise.
The structure follows the way a real programme is built: the standards that frame it, the maintenance strategy and the physics of failure that justify it, the sensing and the data path that carry it, a worked example to make the thresholds concrete, and the Australian context that surrounds it.
This post supports our industrial data and IIoT service, where condition-monitoring data is read from the asset, moved over standard protocols and stored in a historian rather than collected in isolation.
Condition monitoring as a defined discipline
Condition monitoring has a recognised structure, and working to it keeps a programme coherent as it scales. Two standards do most of the framing.
ISO 17359 is the umbrella standard for condition monitoring and diagnostics of machines. It sets out a general procedure: identify the equipment and its critical assets, carry out a reliability and criticality audit, select the measurement parameters that reveal the failure modes you care about, establish the measurement method and data-collection interval, set alarm criteria, and then act on the diagnosis. It is deliberately general so it applies across machine types, and it points to a family of supporting standards for specific techniques such as vibration and thermography.
ISO 13374 defines the data-processing and information-presentation side. It breaks a condition-monitoring system into a sequence of functional blocks: data acquisition, data manipulation, state detection, health assessment, prognostics and advisory generation. The value of that model is architectural. It tells you that raw measurement (data acquisition) is only the first block, and that the steps which turn a vibration reading into a maintenance action (state detection through to advisory) are distinct functions that have to be designed, not assumed. A programme that collects data but never implements state detection or health assessment has built only the first block and will not produce the warning it was bought for.
The ISO 18436 series covers the competence of the people doing the work, including the certification categories for vibration analysts. It matters because interpreting a vibration spectrum is a skilled task, and a programme is only as good as the analysis applied to the data it gathers.
Treating these standards as the backbone of the programme, rather than reaching for them after the fact, is what separates a structured condition-monitoring system from a collection of trend screens nobody reviews.
Reactive, preventive and predictive maintenance
Condition monitoring exists to support a maintenance strategy, so it helps to be clear about which strategy each asset sits under.
- Reactive (run-to-fail). The asset runs until it fails, then it is repaired or replaced. This is the correct strategy for cheap, non-critical, redundant or easily stocked items where the consequence of failure is low. Spending on monitoring here adds cost without changing the decision.
- Preventive (time or cycle based). Components are serviced or replaced on a fixed schedule, set from manufacturer guidance or site history, regardless of measured condition. It trades some wasted component life for predictability and is well suited to assets with a known wear-out pattern. Its weakness is that it intervenes on a calendar rather than on evidence, so it can replace healthy parts early and still miss a fault that develops between services.
- Predictive (condition based). Intervention is triggered by measured condition, such as a rising vibration trend or a temperature climbing above its established baseline. It targets the maintenance effort at the asset that actually needs it and aims to act before functional failure. Condition monitoring is the data layer that makes this possible.
Most Australian sites run all three at once. The engineering judgement is matching each asset to the right strategy based on consequence of failure, not applying predictive monitoring everywhere because the technology allows it.
The P-F curve and what it dictates
The economic case for predictive maintenance rests on the P-F curve, and so does the most important engineering decision in the whole programme: how often to measure.
The curve describes degradation over time. Point P is where a potential failure first becomes detectable by a chosen technique. Point F is functional failure, where the asset can no longer perform its function. The interval between them, the P-F interval, is the warning window the monitoring technique can give you.
The critical point is that the P-F interval depends on the failure mode and the detection technique together. For a rolling-element bearing, an early defect can appear in high-frequency vibration analysis weeks before the bearing fails, then become audible, then show as a temperature rise, then reach functional failure. Vibration detects it early and gives a long P-F interval. Temperature detects it late and gives a short one.
That interval sets the sampling rate. The accepted rule is to set the interval between readings at no more than half the P-F interval, so at least two readings fall inside the warning window before functional failure. If vibration gives a six-week P-F interval, a three-weekly reading is defensible and a continuous online sensor is unnecessary. If a failure mode gives a P-F interval of hours, only continuous monitoring will catch it. Setting a sampling rate without estimating the P-F interval for the failure modes you are trying to catch is the single most common reason a programme collects data yet still misses the failure it was meant to warn about.
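The half-interval rule is simple enough to write down as arithmetic. The sketch below is illustrative only: the function names and the idea of comparing against a "shortest practical route interval" are assumptions made for this example, not part of any standard.

```python
from datetime import timedelta


def max_reading_interval(pf_interval: timedelta,
                         readings_in_window: int = 2) -> timedelta:
    """Longest gap between readings that still places the requested
    number of readings inside the P-F window."""
    if pf_interval <= timedelta(0):
        raise ValueError("P-F interval must be positive")
    if readings_in_window < 1:
        raise ValueError("need at least one reading in the window")
    return pf_interval / readings_in_window


def needs_continuous_monitoring(pf_interval: timedelta,
                                shortest_route_interval: timedelta) -> bool:
    """True when a periodic route cannot be run often enough
    to satisfy the half-interval rule (hypothetical helper)."""
    return max_reading_interval(pf_interval) < shortest_route_interval


# Six-week P-F interval: a reading every three weeks is the longest defensible gap.
print(max_reading_interval(timedelta(weeks=6)))  # 21 days, 0:00:00
```

The estimate of the P-F interval is the input that matters; the arithmetic only makes explicit how directly it drives the choice between a periodic route and an online sensor.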
Sensing: vibration, temperature and motor current
The measurement parameters in ISO 17359 come from a small set of techniques that cover the majority of rotating-equipment faults.
| Sensor type | What it detects | Typical assets | Relative P-F warning |
|---|---|---|---|
| Vibration (accelerometer) | Bearing wear, imbalance, misalignment, looseness | Motors, pumps, fans, gearboxes | Early |
| Temperature (RTD / thermocouple / infrared) | Overheating, cooling failure, friction, lubrication loss | Motors, drives, bearings, compressors | Mid to late |
| Motor current signature (MCSA) | Broken rotor bars, load change, mechanical drag, phase loss | Motors, motor-driven pumps and fans | Mid |
| Acoustic emission | Early-stage bearing defects, cavitation, leaks | Pumps, compressors, valves | Early |
| Oil analysis / particle count | Gear and bearing wear debris, contamination | Gearboxes, hydraulic systems | Mid |
Vibration is the workhorse for rotating equipment because it gives the earliest warning of the most common faults. An accelerometer mounted on a bearing housing produces a signal that can be analysed as an overall level, such as ISO 10816 velocity in millimetres per second, for a simple trend, or as a frequency spectrum for diagnosis. Specific defect frequencies, such as the bearing's outer-race and inner-race pass frequencies, let an analyst identify which component is degrading rather than only that something is wrong.
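As a rough illustration of the two views of a vibration signal, the sketch below computes an overall RMS level and the standard kinematic estimates of the outer-race and inner-race pass frequencies (BPFO and BPFI). It assumes the samples are already in velocity units, the outer ring is stationary and there is no slip. The bearing geometry in the example is illustrative and should be replaced with the manufacturer's data for the actual bearing.

```python
import math


def overall_rms(samples):
    """Overall RMS level of a block of samples (same units as the input)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def bearing_defect_frequencies(shaft_hz, n_elements, element_d_mm,
                               pitch_d_mm, contact_angle_deg=0.0):
    """Return (BPFO, BPFI) in Hz for a bearing with a stationary outer ring."""
    ratio = (element_d_mm / pitch_d_mm) * math.cos(math.radians(contact_angle_deg))
    bpfo = (n_elements / 2) * shaft_hz * (1 - ratio)  # outer-race pass frequency
    bpfi = (n_elements / 2) * shaft_hz * (1 + ratio)  # inner-race pass frequency
    return bpfo, bpfi


# Illustrative geometry only: 9 rolling elements, 7.94 mm element diameter,
# 39.04 mm pitch diameter, shaft turning at 24.75 Hz (1485 rpm).
bpfo, bpfi = bearing_defect_frequencies(24.75, 9, 7.94, 39.04)
print(f"BPFO ~ {bpfo:.1f} Hz, BPFI ~ {bpfi:.1f} Hz")
```

An analyst would look for peaks at or near these frequencies and their harmonics in the spectrum; the overall RMS figure is what feeds a simple trend.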
Temperature is cheap, easy to fit and intuitive, but it generally detects faults later in the P-F curve, once friction or electrical loss has already raised the heat. It is a good confirming and secondary parameter rather than the primary early-warning one.
Motor current signature analysis reads the current the motor already draws and infers mechanical and electrical conditions from it, which means it can be applied without mounting a sensor on the machine itself. It is useful for motor-driven assets where access is difficult.
The selection principle is that more sensors do not automatically mean better outcomes. Choose the parameter that detects the failure mode you care about as early as the P-F interval requires, and add confirming parameters only where they change the diagnosis.
The data path: edge acquisition, MQTT Sparkplug B and OPC UA
A reading has no value until it reaches a system that can store, trend and act on it. The data path is where the IIoT part of the work lives, and it should be built on standard protocols so it remains maintainable.
An edge device sits close to the asset, samples the sensor, and performs the first two ISO 13374 blocks, data acquisition and data manipulation, locally. It might compute an overall vibration level and a spectrum, apply report-by-exception so it only transmits when a value moves by more than a set deadband, and buffer data if the network drops. Keeping that processing at the edge reduces network load and means the asset is not dependent on a continuous link to a central server.
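A minimal sketch of report-by-exception with store-and-forward buffering might look like the following. The class name, the deadband approach and the `publish` callback (assumed to return `True` on success) are all assumptions for illustration; a real edge device or gateway will have its own configuration for this behaviour.

```python
from collections import deque


class ExceptionReporter:
    """Forward a reading only when it moves by at least `deadband`
    from the last forwarded value, buffering while the link is down."""

    def __init__(self, deadband, publish, max_buffer=1000):
        self.deadband = deadband
        self.publish = publish            # callable(timestamp, value) -> bool
        self.last_value = None
        self.buffer = deque(maxlen=max_buffer)  # oldest entries drop when full

    def offer(self, timestamp, value):
        """Submit a new sample; it is queued only if it is an exception."""
        if self.last_value is not None and abs(value - self.last_value) < self.deadband:
            return
        self.last_value = value
        self.buffer.append((timestamp, value))
        self.flush()

    def flush(self):
        """Send buffered readings in order; stop at the first failure."""
        while self.buffer:
            timestamp, value = self.buffer[0]
            if not self.publish(timestamp, value):
                break
            self.buffer.popleft()
```

Because readings keep their original timestamps while buffered, the historian still sees the correct time of change once the link recovers.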
From the edge, two protocols carry the data:
- MQTT is a lightweight publish-subscribe protocol designed for constrained and intermittent networks. Devices publish to topics on a broker, and any consumer subscribes to what it needs, so adding sensors means adding publishers without rebuilding the integration layer each time. The Sparkplug B specification, maintained by the Eclipse Foundation, adds the structure MQTT lacks on its own: a defined topic namespace, a state-management model with birth and death certificates so the system always knows whether a device is online, and a typed payload. That structure is what makes MQTT suitable for an industrial condition-monitoring fleet rather than just a transport. The Ignition platform's MQTT modules consume Sparkplug B payloads directly, as documented by Inductive Automation.
- OPC UA is a strongly typed, secure machine-to-machine protocol well suited to structured on-premises communication between controllers, SCADA and historians. It carries rich information models and built-in security, which suits the link between the control system and the historian.
A common and sound pattern uses MQTT Sparkplug B for the distributed edge sensors reporting by exception, and OPC UA between the PLC, SCADA and historian layer, with both landing in the same historian. The protocols are complementary, and the choice between them for any given link is driven by the number of nodes, the network conditions and how the data is consumed. This integration work belongs in the industrial data and IIoT layer and the OT networks layer, where the network is designed to carry both production and condition data without contention.
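To make the Sparkplug B topic namespace concrete, the sketch below builds topic strings in the `spBv1.0/<group>/<message type>/<edge node>[/<device>]` form. The group, node and device names are hypothetical, and the helper is an illustration rather than part of any library. It does not cover payload encoding or the host application STATE message.

```python
NAMESPACE = "spBv1.0"
NODE_MESSAGES = {"NBIRTH", "NDEATH", "NDATA", "NCMD"}      # edge-node level
DEVICE_MESSAGES = {"DBIRTH", "DDEATH", "DDATA", "DCMD"}    # device level
_FORBIDDEN = set("/+#")  # MQTT separator and wildcard characters


def sparkplug_topic(group_id, message_type, edge_node_id, device_id=None):
    """Build a Sparkplug B topic string for a node or device message."""
    for part in (group_id, edge_node_id, device_id):
        if part is not None and (not part or _FORBIDDEN & set(part)):
            raise ValueError(f"invalid topic element: {part!r}")
    base = f"{NAMESPACE}/{group_id}/{message_type}/{edge_node_id}"
    if message_type in NODE_MESSAGES:
        if device_id is not None:
            raise ValueError("node-level messages carry no device id")
        return base
    if message_type in DEVICE_MESSAGES:
        if device_id is None:
            raise ValueError("device-level messages need a device id")
        return f"{base}/{device_id}"
    raise ValueError(f"unknown message type: {message_type!r}")


# Hypothetical names for a pump bearing sensor behind one edge gateway.
print(sparkplug_topic("PlantA", "DDATA", "Edge01", "Pump7_DE"))
# spBv1.0/PlantA/DDATA/Edge01/Pump7_DE
```

The birth and death message types are what give a consumer an explicit online or offline state for each node and device, instead of inferring it from silence.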
Historian integration and the link to OEE
Once the data arrives, it has to be stored in a way that supports trending and analysis. A historian records each tag with its value and timestamp at the moment it changed, which is what lets an analyst look back over weeks of vibration trend or pull the temperature record around a stoppage. A platform such as Ignition can consume MQTT Sparkplug B and OPC UA directly, store the tags in its historian, and present trends, thresholds and maintenance reports from one place. Building those alarms and screens to suit the maintenance team's workflow is part of the PLC, SCADA and HMI layer.
The connection to OEE is where condition monitoring earns its place in the production-data picture. Unplanned downtime is the largest availability loss on most lines, and condition monitoring attacks it from two directions. It reduces the frequency of unplanned stops by giving warning before a failure forces one, which protects availability. When a stop does happen, the condition record gives a timestamped, machine-sourced reason tied to a specific asset, which makes the downtime classification in the OEE record more defensible than an operator-entered reason recalled at shift end. Because the condition stream and the OEE stream often share the same historian and the same network, the two systems reinforce each other. The limits of OEE as a metric, and why machine-sourced signals matter for it, are covered in our guide on where OEE misleads.
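The availability effect can be shown with simple arithmetic. The sketch below uses the usual definition of availability as run time divided by planned production time; the shift length and stop durations are illustrative figures, not measurements from any site.

```python
def availability(planned_minutes, stop_minutes):
    """OEE availability: run time as a fraction of planned production time."""
    if planned_minutes <= 0:
        raise ValueError("planned production time must be positive")
    if not 0 <= stop_minutes <= planned_minutes:
        raise ValueError("stop time must lie within planned time")
    return (planned_minutes - stop_minutes) / planned_minutes


# Illustrative 480-minute shift with 60 minutes of stops,
# 30 of which came from one unplanned bearing failure.
with_failure = availability(480, 60)
failure_avoided = availability(480, 30)
print(f"{with_failure:.1%} -> {failure_avoided:.1%}")
```

The second benefit, a more defensible downtime reason, does not change the number itself. It changes how much the number can be trusted when someone asks where the lost minutes went.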
A worked example: a bearing vibration alerting scheme
The figures below are illustrative engineering values to show how an alerting scheme is structured. They are not measurements from any specific site or asset, and any real scheme must be set against the asset's own baseline.
Consider a typical 75 kW process pump motor on a food and beverage line, with an accelerometer on the drive-end bearing housing reporting overall vibration velocity in millimetres per second RMS. The relevant general reference for evaluating that level is ISO 10816 (now largely superseded by the ISO 20816 series), which gives velocity-based evaluation zones for machine vibration.
A workable threshold scheme might be set as follows.
| Band | Example velocity (mm/s RMS) | Interpretation | Action |
|---|---|---|---|
| Baseline | up to 2.8 | Normal running, established over the first weeks | Trend only |
| Alert | 2.8 to 4.5 | Condition has changed from baseline | Schedule inspection at next opportunity |
| Alarm | 4.5 to 7.1 | Significant degradation | Plan intervention within the P-F window |
| Trip / urgent | above 7.1 | Approaching functional failure | Stop and inspect before further damage |
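The banding in the table can be expressed as a small lookup. This is a sketch using the illustrative thresholds above; the rule that a reading exactly on a boundary stays in the lower band is an assumption made here, since the table does not specify it.

```python
# Upper limit of each band in mm/s RMS, taken from the illustrative table.
BAND_LIMITS = [
    (2.8, "baseline"),
    (4.5, "alert"),
    (7.1, "alarm"),
]


def classify_band(velocity_mm_s):
    """Map an overall velocity reading to its band name."""
    if velocity_mm_s < 0:
        raise ValueError("RMS velocity cannot be negative")
    for upper_limit, band in BAND_LIMITS:
        if velocity_mm_s <= upper_limit:
            return band
    return "trip"


for reading in (1.8, 3.2, 5.0, 7.5):
    print(reading, classify_band(reading))
```

On a real asset the limits would be set per machine rather than shared, so in practice the table would be configuration loaded for each tag, not a constant in code.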
Two refinements make the scheme defensible rather than a set of fixed numbers.
First, the baseline is established from the asset itself. A motor that normally runs at 1.8 mm/s and climbs to 3.2 mm/s has changed materially, a rise of nearly 80 percent, even though it may still sit inside a generic acceptable zone. A percentage rise from baseline, for example a sustained increase of more than 50 percent, can therefore trigger an alert before an absolute threshold is crossed: for a 1.8 mm/s baseline that is 2.7 mm/s, just below the 2.8 mm/s alert band above.
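A relative check against the asset's own baseline might be sketched as follows. Using the median of early readings as the baseline is an assumption chosen here because it is robust to a stray spike during commissioning; the 50 percent limit is the example figure from the text.

```python
from statistics import median


def baseline_from_history(readings):
    """Estimate the asset's normal level from its early readings."""
    if not readings:
        raise ValueError("need at least one reading to set a baseline")
    return median(readings)


def rise_from_baseline(value, baseline):
    """Fractional change from baseline, e.g. 0.5 means 50 percent higher."""
    return (value - baseline) / baseline


def exceeds_baseline(value, baseline, max_rise=0.5):
    """True when the reading is more than `max_rise` above baseline."""
    return rise_from_baseline(value, baseline) > max_rise


baseline = baseline_from_history([1.7, 1.8, 1.9, 1.8, 6.0])  # one spike ignored
print(baseline, exceeds_baseline(3.2, baseline))
```

The relative check complements the absolute bands rather than replacing them: a quiet machine is caught by the rise, a noisy one by the fixed limit.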
Second, the sampling rate is set from the P-F interval, not from convenience. If spectral analysis on this bearing gives an estimated P-F interval of around six weeks, a reading every two to three weeks places at least two samples inside the warning window. If the pump is critical enough that a continuous online sensor is justified, the same banding applies but the data is sampled continuously and alerts are raised automatically.
The scheme should also guard against nuisance alerts. Requiring a threshold to be exceeded on consecutive readings, or for a short sustained period on a continuous sensor rather than a single spike, prevents a transient from generating an alert that erodes the maintenance team's trust in the system.
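One simple way to implement the consecutive-readings rule is a counter that resets whenever a reading falls back under the threshold. The class below is a sketch; the name and the default of three readings are assumptions, and a continuous sensor would more likely use a time-based persistence window.

```python
class ConsecutiveExceedance:
    """Raise an alert only after `required` readings in a row
    exceed the threshold; any reading at or below it resets the count."""

    def __init__(self, threshold, required=3):
        if required < 1:
            raise ValueError("required must be at least 1")
        self.threshold = threshold
        self.required = required
        self.count = 0

    def update(self, value):
        """Feed one reading; return True while the alert condition holds."""
        if value > self.threshold:
            self.count += 1
        else:
            self.count = 0
        return self.count >= self.required


detector = ConsecutiveExceedance(threshold=2.8, required=3)
readings = [2.0, 5.0, 2.1, 3.0, 3.1, 3.3]   # single spike, then a real rise
print([detector.update(r) for r in readings])
# [False, False, False, False, False, True]
```

The single spike to 5.0 never raises an alert, while the sustained rise does on its third reading. The cost is a delay of a few readings, which needs to be counted against the P-F interval when the sampling rate is set.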
The Australian context
Condition monitoring in Australia runs on the same international standards used elsewhere, with ISO 17359, ISO 13374 and the ISO 10816 / ISO 20816 vibration series available through Standards Australia as adopted standards. Building a programme to those references keeps it aligned with how the wider industry and equipment suppliers work.
Two local factors shape how the work is done. The first is asset criticality driven by location and supply. Many Australian manufacturing sites sit a long way from spares and specialist support, so the lead time to obtain a replacement motor, gearbox or large bearing can be weeks. That lengthens the effective consequence of an unplanned failure and pushes the criticality ranking of major rotating assets higher than it might sit in a market with same-day supply, which in turn justifies monitoring assets that elsewhere might be left to run to failure.
The second is the safety and electrical framework around the installation work. Fitting sensors, edge devices and network cabling on operating plant is electrical and OT work that falls under the general duties of the model Work Health and Safety laws administered through Safe Work Australia, alongside the relevant electrical wiring rules. Where condition data crosses onto the OT network, the connectivity should follow the guidance from the Australian Cyber Security Centre on protecting operational technology, so that adding sensors does not widen the attack surface of the control system. That network-security dimension is covered in our guide on OT network security for manufacturing.
Common mistakes to avoid
A condition-monitoring programme usually fails for a small set of predictable reasons, most of which come from decisions made around the technology rather than from the technology itself.
- Monitoring everything because the technology allows it. Sensors on low-consequence assets generate data and alerts that nobody acts on, which dilutes attention and erodes trust. Rank assets by consequence of failure first and monitor where early warning changes the decision.
- Setting the sampling rate by convenience instead of the P-F interval. A monthly reading on a failure mode with a one-week P-F interval will reliably miss the failure. Match the sampling rate to the physics.
- Using absolute thresholds without a baseline. A generic vibration band ignores that each asset has its own normal signature. Establish a baseline and watch for change from it as well as for a fixed number being crossed.
- Collecting data without implementing state detection or health assessment. Under the ISO 13374 model, data acquisition is only the first block. A programme that stores trends but never converts them into an assessment and an advisory has not been finished.
- Treating it as an IT or software purchase. The value sits in the engineering: choosing parameters, setting thresholds against the asset, and integrating multi-vendor sensors and controllers cleanly. Better software on weak inputs produces a better-looking dashboard built on the same shaky data.
- Building alerts the team will not act on. An alert with no clear owner, no clear action and a history of false positives gets ignored. Define who responds, what they do, and tune the scheme so an alert means something.
What this means
The value of IIoT condition monitoring comes from the maintenance action it enables, not from the volume of data collected. Frame the programme with ISO 17359 and ISO 13374, choose the maintenance strategy each asset deserves, set the sampling rate from the P-F interval rather than from habit, sense the parameter that detects the failure mode early enough, move the data over MQTT Sparkplug B or OPC UA into a historian, and build an alert scheme against the asset's own baseline that the maintenance team trusts. Done in that order, a narrow deployment on the right assets will outperform a broad one that nobody reviews, and the same data will sharpen the OEE picture rather than sit in a separate system.
References