Transmission Delays in Residual Computation

The design of control, estimation or diagnosis algorithms most often assumes that all available process variables represent the system state at the same instant of time. However, this is never true in networked systems, because of the unknown deterministic or stochastic transmission delays introduced by the communication network. During the diagnosis stage, this will often generate false alarms: under nominal operation, the different transmission delays associated with the variables that appear in the computation form produce discrepancies of the residuals from zero. A technique aiming at minimising the resulting false alarm rate, based on the explicit modelling of communication delays and on their estimation, is proposed.


1
Introduction
Owing to the growing complexity and spatial distribution of automated systems, communication networks have become the backbone of most control architectures. As systems are required to be more scalable and flexible, they include additional sensors, actuators and controllers, often referred to as field (intelligent) devices [1, 2]. Networked control systems result from connecting these system components via a communication network such as controller area network (CAN), PROFIBUS or Ethernet.
An increasing amount of research addresses the distributed control of interconnected dynamical processes: stability and control [3-5], decision, coordination and scheduling [6, 7], diagnosis of discrete event systems [8] and fault tolerance [9-11]. However, only a few studies of the impact of the communication network on the diagnosis of continuous systems have recently been published [12-14].
In model-based fault detection and isolation (FDI), a set of residuals is designed that should ideally be zero in the fault-free case and different from zero in the faulty case [15-17]. However, in practice, residuals are different from zero not only because of measurement noise, unknown inputs and modelling uncertainties, but also because of transmission delays. Since no network can communicate instantaneously, the data used in the residual computation do not represent the state of the system at the time of the computation; they represent the state of the system at some (often unknown) time prior to the computation. Moreover, since each variable may be transmitted with a different transmission delay, the whole set of data used in the residual computation may not even be consistent with the system state at any single moment prior to the computation. Therefore, residuals that should theoretically be zero in the non-faulty case might create false alarms as a result of transmission delays.
The false alarm rate can be decreased by increasing the decision threshold, at the cost of reducing the sensitivity to faults. In this paper, a technique aiming at the minimisation of the false alarms caused by transmission delays, without increasing the number of missed detections, is proposed. It relies on the explicit modelling of communication delays and on their most likely estimation.
The paper is organised as follows: Section 2 provides background on fault detection and isolation based on analytical redundancy, with emphasis on the decision-making logic. Section 3 presents the influence of transmission delays. An optimisation technique for the estimation of unknown delays is described in Section 4. Finally, an illustrative example is shown in Section 5.

2
Analytical redundancy for fault detection and isolation

System and faults
Consider the deterministic system modelled by

ẋ(t) = f[x(t), u(t), w(t)]   (1)
y(t) = g[x(t), u(t), w(t)]   (2)

where x(t) ∈ Rⁿ, u(t) ∈ Rᵐ, y(t) ∈ Rᵖ and w(t) ∈ R^q are, respectively, the state, control, output and fault vectors, and f and g are given smooth vector fields.
The system's normal operation on a time window [a, b] is described by

w(t) = 0, ∀t ∈ [a, b]   (3)

while the occurrence of a fault at time γ is associated with

w(t) ≠ 0, ∀t ∈ [γ, λ] ⊆ [a, b]   (4)

Analytical redundancy relations
Analytical redundancy is based on successive differentiations of the output signal (2), which together with the repeated use of (1) produces the system

ȳ(t) = Ḡ[x(t), ū(t), w̄(t)]   (5)

where ȳ(t) [and also ū(t) and w̄(t)] is the vector obtained by expanding y(t) with its derivatives ẏ(t), ÿ(t), and so on. In a second step, (5) is transformed into an equivalent system

ȳ(t) = G₁[x(t), ū(t), w̄(t)]
0 = G₂[ū(t), ȳ(t), w̄(t)]   (6)

where the equations in the subsystem G₂ are the so-called analytical redundancy relations (ARR), which are independent of the state vector x(t). It can be shown that such ARR can always be found, provided the output can be differentiated up to a large enough order [15-17]. The interest of these ARR is obviously that, since the state has been eliminated, they depend only on the inputs, outputs and faults, and thus provide a means to check whether the no-fault hypothesis is consistent with the observed inputs and outputs.

Practical determination of analytical redundancy relations
From a practical point of view, obtaining the set of equations G₂ in (6) from the original set (5) makes use of a projection operator when system (1), (2) is linear (this is the parity space technique, see for example [15, 17]). For more general cases, it rests on elimination theory (see for example [18] for the case where (1), (2) is polynomial). It is also worth noticing that ARR are not uniquely defined: indeed, any linear or nonlinear combination of analytical redundancy relations is also an analytical redundancy relation.

Computation and evaluation forms:
Let the subsystem G₂ be decomposed as

G₂[ū(t), ȳ(t), w̄(t)] = r̄[ū(t), ȳ(t)] − ρ̄[w̄(t)]   (7)

where, for all input/output pairs ū(t), ȳ(t) associated with system (1) and (2), the condition G₂ = 0 can be written as

r(t) = r̄[ū(t), ȳ(t)]   (8)
r(t) = ρ̄[w̄(t)]   (9)

where r(t) is the residual vector; (8) and (9) are, respectively, its computation and evaluation forms. The first one describes how the residual value is obtained from the system inputs and outputs; the latter describes how the resulting value depends on the faults.
According to (8) and (9), the fault detection and isolation procedure is decomposed into two steps. The first one is residual computation, where the residual value is computed from the known variables, using the computation form (8). The second step is residual evaluation, which includes fault detection and fault isolation.
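As a toy illustration of these two forms (a hypothetical example, not from the paper), consider the scalar system ẋ(t) = −a x(t) + u(t) with measured output y(t) = x(t) + w(t), where w is a sensor fault. One differentiation of the output eliminates the state, giving the ARR ẏ + ay − u = ẇ + aw, which splits into:

```latex
\underbrace{r(t) \;=\; \dot y(t) + a\,y(t) - u(t)}_{\text{computation form: known signals only}}
\qquad\qquad
\underbrace{r(t) \;=\; \dot w(t) + a\,w(t)}_{\text{evaluation form: faults only}}
```

In the fault-free case w ≡ 0 and the residual vanishes; any nonzero r(t) indicates a sensor fault.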

Fault detection: Given a time window [a, b], the fault detection problem is defined as follows: given the residual r(t), t ∈ [a, b], select the most likely hypothesis between

H₀ˢʸˢᵗᵉᵐ: w(t) = 0, ∀t ∈ [a, b]
H₁ˢʸˢᵗᵉᵐ: ∃[γ, λ] ⊆ [a, b] such that w(t) ≠ 0, ∀t ∈ [γ, λ]

Using (7) and (9), the simplest implementation of a fault detection procedure is obtained by checking the residual value against zero at each time t:

r(t) = 0 ⇒ H₀ˢʸˢᵗᵉᵐ,  r(t) ≠ 0 ⇒ H₁ˢʸˢᵗᵉᵐ   (10)

(by a slight abuse of notation, the time intervals [a, b] and [γ, λ] are no longer mentioned).
For the sake of simplicity, only perfect deterministic models have been considered so far, for which (10) indeed holds. However, measurement noise, unknown inputs and model uncertainties result in residuals that are never zero, even in normal operation. This can be taken into account in a more realistic procedure which extends (10):

r(t) ∈ N(0) ⇒ H₀ˢʸˢᵗᵉᵐ,  r(t) ∉ N(0) ⇒ H₁ˢʸˢᵗᵉᵐ   (11)

where N(0) is some neighbourhood of zero. Note that false alarms (r(t) ∉ N(0) under H₀ˢʸˢᵗᵉᵐ) and missed detections (r(t) ∈ N(0) under H₁ˢʸˢᵗᵉᵐ) are possible. The design of a set N(0) that guarantees both a low false-alarm rate and a low missed-detection rate is the central problem of statistical decision making [19-21].

Fault isolation:
Fault isolation rests on special properties of the residual evaluation form (directional residuals and structured residuals) that are not developed here (see, for example, [15, 16, 18] for good presentations): it is assumed in the sequel that the residual vector r(t) has satisfactory detection/isolation properties.

3
Influence of transmission delays

Data decomposition in distributed systems
In distributed control systems, the residual computation form is implemented as an algorithm in one node of the network. At each time t, its input data are noted

z(t) = [ūᵀ(t)  ȳᵀ(t)]ᵀ   (12)

According to the overall distributed architecture of the system, z(t) is decomposed into a set of subvectors z_i(t), i ∈ I, and the computation form of the residual vector writes

r(t) = r̄[z_i(t), i ∈ I]   (13)

The subvectors z_i(t) are such that all the variables in z_i(t) are transmitted in one single packet through the communication network. Note that this does not imply that all the variables produced at a given node are transmitted in one single packet. As shown in Fig. 1, z = z₁ ∪ z₂ ∪ z₃, where z₁ is produced and transmitted by node 1 (a smart sensor), while z₂ ∪ z₃ are produced by node 2 (a local controller); the data are decomposed into the packets z₂ and z₃ for their transmission through the communication system.

Incidence of transmission delays
Owing to the transmission delays, the data z_i(t) generated at the production nodes and the data ẑ_i(t) available at the residual computation node must be distinguished. One obviously has

ẑ_i(t) = z_i(t − d_i)   (14)

where d_i ∈ R⁺ is the transmission delay, that is, the data z_i were produced at time t − d_i and received only at time t. The transmission delays d_i may be time dependent and are generally unknown. The normal operation of the communication network on a given time window [a, b] can be described by a very simple deterministic model: the maximum delay D is assumed to be known,

H₀ⁿᵉᵗʷᵒʳᵏ: ‖d‖_∞ ≤ D   (15)

If communication delays are not taken into account, residual computation can be performed as r̂(t) = r̄[ẑ_i(t), i ∈ I] = r̄[z_i(t − d_i), i ∈ I], but using data taken from the system at different time instants would obviously result in false alarms. The exact residual at time t would require

r(t) = r̄[z_i(t), i ∈ I] = r̄[ẑ_i(t + d_i), i ∈ I]   (16)

but taking into account the communication delays by using future values of the arguments, as in (16), is obviously impossible.
Finally, the only possibility to obtain a feasible algorithm is to 'synchronise' the data by using a delay τ, as follows:

r̃(t) = r̄[ẑ_i(t + d_i − τ), i ∈ I]   (17)

where r̃(t) is the residual available at the residual computation node. Note that at time t one in fact computes the value that the residual had at time t − τ:

r̃(t) = r(t − τ)   (18)

The delay τ must obviously satisfy

τ ≥ ‖d‖_∞   (19)
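This synchronisation can be sketched in code (a minimal, hypothetical buffering scheme, not taken from the paper; `PacketBuffer` and the zero-order-hold lookup are illustrative assumptions): each packet stream is buffered with its production timestamps, and the residual is evaluated τ seconds in the past, where every argument is guaranteed to be already available.

```python
import bisect

class PacketBuffer:
    """Stores timestamped samples from one packet stream (hypothetical helper)."""
    def __init__(self):
        self.times, self.values = [], []

    def push(self, t_produced, value):
        # packets arrive in production order; keep (timestamp, value) pairs
        self.times.append(t_produced)
        self.values.append(value)

    def at(self, t):
        # last sample produced at or before t (zero-order hold)
        k = bisect.bisect_right(self.times, t) - 1
        return self.values[k] if k >= 0 else None

def synchronised_residual(buffers, residual_form, t, tau):
    # evaluate the residual at t - tau, where all data are guaranteed available
    args = [b.at(t - tau) for b in buffers]
    return None if any(a is None for a in args) else residual_form(*args)

# toy usage: two streams measuring the same quantity, residual r = z1 - z2
b1, b2 = PacketBuffer(), PacketBuffer()
for k in range(100):
    b1.push(k * 0.01, k * 0.01)   # z1(t) = t
    b2.push(k * 0.01, k * 0.01)   # z2(t) = t
r = synchronised_residual([b1, b2], lambda a, b: a - b, t=0.9, tau=0.1)
```

With fault-free, consistent streams the synchronised residual vanishes, whatever the individual arrival delays, as long as they do not exceed τ.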

Decision procedure under unknown transmission delays
When the vector of transmission delays is perfectly known, the decision procedure associated with (11) can be directly run by choosing

τ = ‖d‖_∞   (20)

however, when r̃(t) ∉ N(0), the fault detection process is delayed by τ.
When transmission delays are unknown, the decision has to be taken in the presence of so-called nuisance parameters d. From (14) and (20), the following decision logic holds:

¬∃d: [‖d‖_∞ ≤ D] ∧ [r̃(t) ∈ N(0)]  ⇒  H₁ˢʸˢᵗᵉᵐ ∨ H₁ⁿᵉᵗʷᵒʳᵏ   (21)

This decision logic expresses that the non-existence of a vector of transmission delays d such that (1) ‖d‖_∞ ≤ D and (2) the residual lies inside N(0), evidences that the system, the network, or both do not operate properly. Even though faults in the network and in the system can both be detected, they cannot be isolated from each other in the absence of extra information.

4
Estimating the transmission delays
Let the system fault detection neighbourhood N(0) be defined by

N(0) = {r : rᵀQr ≤ s}

with Q > 0, where s > 0 is a given decision threshold. Checking the property H₀ˢʸˢᵗᵉᵐ ∧ H₀ⁿᵉᵗʷᵒʳᵏ can be done by solving the following optimisation problem:

d̂ = arg min_{‖d‖_∞ ≤ D} J(t)   (22)

where

J(t) = r̃ᵀ(t) Q r̃(t)   (23)

Then the following interpretation holds: (i) d̂ is the 'most likely' vector of admissible delays, in the sense that the cost J(t) associated with the delayed residual r̃(t) is minimal; (ii) this residual may or may not be compatible with the hypothesis that the system operates in a nominal way, and therefore the decision logic (21) becomes

J(t) > s ⇒ H₁ˢʸˢᵗᵉᵐ ∨ H₁ⁿᵉᵗʷᵒʳᵏ   (24)

Finally, it should be noted that the estimation of d by (22) implements a sufficient-condition-based decision logic. Indeed, if the minimal value of J(t), associated with d̂, does not satisfy J(t) ≤ s, then no other estimate will. However, the set {d : [‖d‖_∞ ≤ D] ∧ [r̃(t) ∈ N(0)]} may contain more than one element.
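The sufficient-condition logic can be illustrated with a minimal sketch (a hypothetical scalar cost standing in for J; the names `decide` and `j_of_d` are assumptions, not from the paper): an alarm is raised only if even the best admissible delay cannot bring the residual inside N(0).

```python
# minimise J over the admissible delays; raise an alarm only if the minimal
# cost still exceeds the decision threshold s
def decide(j_of_d, admissible_delays, s):
    j_min = min(j_of_d(d) for d in admissible_delays)
    return "H1" if j_min > s else "H0"

grid = [k * 0.01 for k in range(11)]        # admissible box 0 <= d <= D = 0.1 s
no_fault = lambda d: (d - 0.05) ** 2        # some admissible d explains the data
fault = lambda d: (d - 0.05) ** 2 + 1.0     # no admissible d explains the data

verdict_ok = decide(no_fault, grid, s=1e-6)
verdict_bad = decide(fault, grid, s=1e-6)
```

In the first case an admissible delay brings the cost below the threshold, so no alarm is fired; in the second case no admissible delay can, which evidences a fault.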

Searching for a minimum
The cost function J[r̃(t)] is in general a nonlinear function of the adjustable parameters d, and its minimum can be found using well-known iterative search methods [24]. However, since the problem is to be solved in real time, it is of interest to study the conditions under which the estimate d̂ can be found quickly and accurately. Note that even when a dynamic feedback control loop is involved, the on-line search for a minimum, being part of the FDI algorithm, can be run at a much lower frequency than that of the control computation, for systems where faults are not critical (should faults be critical, it can be assumed that the network would have been designed so as to make transmission delays negligible).

Persistent excitation condition:
Assuming that all the functions involved are differentiable, the solution of the optimisation problem (22) satisfies the necessary (Kuhn-Tucker) conditions

∂J(t)/∂d_i + μ_i = 0,  μ_i ≥ 0,  μ_i(d_i − τ) = 0,  i ∈ I   (25)

where μ_i is the Kuhn-Tucker multiplier associated with the inequality constraint d_i ≤ τ. This system can be solved for d if its Jacobian is not too ill-conditioned in a neighbourhood of the optimum. From

∂J(t)/∂d_i = 2 r̃ᵀ(t) Q (∂r̄/∂z_i) (d/dt) ẑ_i(t + d_i − τ)   (26)

it is seen that there are system trajectories that produce a rank-deficient Jacobian, namely when z_i(t) is constant for some i ∈ I. Therefore, for delay estimation, it is necessary that a persistent excitation condition be satisfied, such that no transmitted packet of variables is constant over time.
When the persistent excitation condition is not satisfied, the previously estimated value of the delay can be used to compute r̃(t).
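The role of the excitation condition can be illustrated as follows (a hypothetical single-relation example; the signals and the delay grid are assumptions, not from the paper): under a constant packet the cost is flat in the delays, so the minimiser is not identifiable and the previous estimate should be kept.

```python
import math

TAU = 0.1  # synchronisation delay

def cost(d, signal, t):
    # squared synchronised residual for a single redundancy relation r = z1 - z2,
    # where both subvectors carry the same underlying signal
    r = signal(t + d[0] - TAU) - signal(t + d[1] - TAU)
    return r * r

grid = [(a * 0.02, b * 0.02) for a in range(6) for b in range(6)]
rich = lambda s: math.sin(10 * s)   # persistently exciting trajectory
flat = lambda s: 1.0                # constant packet: excitation condition fails

costs_rich = [cost(d, rich, 1.0) for d in grid]
costs_flat = [cost(d, flat, 1.0) for d in grid]
# every admissible delay pair gives the same (zero) cost under the flat
# trajectory, so the cost carries no information about the delays
```

Under the exciting trajectory the cost varies over the admissible box and a minimiser can be singled out; under the constant trajectory it is identically zero.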

Local minima:
The search for a minimum is started from an initial guess d⁽⁰⁾ and in general converges towards a local minimum. Starting from zero at the very beginning, and then taking the last estimate of the transmission delays as the initial guess for the next estimation, is a good approach when the operating conditions of the network do not change at a faster rate than the rate at which residuals are computed. Converging towards a local minimum is a source of false alarms, since the estimate d̂ may lead to r̃(t) ∉ N(0) while the global minimum, say d*, would have provided r̃*(t) ∈ N(0). In special cases, such as linear systems and convex cost functions, global minima will be found, but in more general cases, algorithms that avoid getting trapped in a local minimum are to be used [22].

5
Illustrative example
Fig. 2 depicts the schematic diagram of the position control of a DC motor. The aim of this system is to control the position c(t) of the mechanical load according to the reference position u(t). This system has been taken from [23].
The open-loop transfer function is

C(s)/E_v(s) = K_m / [s(1 + T_m s)]

where e_v(t) is the error between the output and input positions, and K_m and T_m are known parameters of the system. In the time domain this gives T_m c̈(t) + ċ(t) = K_m e_v(t), so let

r(t) = T_m c̈(t) + ċ(t) − K_m e_v(t)

be the residual computation form associated with the system, where c(t) and e_v(t) are assumed to be produced in two different nodes of a distributed system (Fig. 3).
The data available at the residual computation node are ê_v(t) = e_v(t − d₂) and ĉ(t) = c(t − d₁), and the actually computed residual is

r̃(t) = T_m (d²/dt²) ĉ(t + d₁ − τ) + (d/dt) ĉ(t + d₁ − τ) − K_m ê_v(t + d₂ − τ)

where the d_i are the delays through the network and τ is their maximum value (19).
Simulations have been performed with K_m = 5.5 s⁻¹ and T_m = 0.13 s. u(t) has been modelled as a unit step signal and c(t) as its response. Matlab has been used for simulation and optimisation; the function fmincon has been used for delay estimation. This function uses a sequential quadratic programming method [24], and is suitable for finding a minimum of a constrained nonlinear multivariable function. The initial condition d⁽⁰⁾ for the algorithm has been set to zero.
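A rough Python analogue of this setup can be sketched as follows (a hypothetical two-sensor example with a brute-force grid search standing in for fmincon; the signals, true delays and step sizes are assumptions, not the paper's simulation): the admissible delays are estimated by minimising the squared synchronised residual over the box 0 ≤ d_i ≤ D.

```python
import math

D = 0.10                 # known maximum transmission delay, eq. (15)
TAU = D                  # synchronisation delay, tau >= ||d||_inf, eq. (19)
D_TRUE = (0.03, 0.07)    # actual (unknown) network delays

def z(t):
    # both nodes transmit the same sinusoidal signal, so the residual
    # r(t) = z1(t) - z2(t) is zero in the fault-free, delay-free case
    return math.sin(2 * math.pi * t)

def received(i, s):
    # data available at the computation node: z_i delayed by d_i, eq. (14)
    return z(s - D_TRUE[i])

def cost(d, t, window=(0.0, 0.01, 0.02)):
    # J: sum of squared synchronised residuals over a short window, eq. (22)-(23)
    total = 0.0
    for dt in window:
        r = (received(0, t + dt + d[0] - TAU)
             - received(1, t + dt + d[1] - TAU))
        total += r * r
    return total

def estimate_delays(t, step=0.005):
    # brute-force minimisation over the admissible box 0 <= d_i <= D,
    # standing in for the gradient-based search of the paper
    grid = [k * step for k in range(int(round(D / step)) + 1)]
    return min(((d1, d2) for d1 in grid for d2 in grid),
               key=lambda d: cost(d, t))

d_hat = estimate_delays(t=1.0)
```

Note that the minimiser is not unique here: any pair with d̂₂ − d̂₁ equal to the true delay difference makes the residual vanish, which illustrates the remark above that the set of admissible minimisers may contain more than one element.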
In order to study the benefits of delay estimation in residual computation, neither unknown inputs (noise or disturbances) nor uncertainties have been considered. The analysis has been focused on the performance of the method in reducing false alarms in the absence of system faults, and in reducing the missed detection rate in the presence of a sensor fault. It is well known that false alarms can always be avoided (e.g. by never firing any alarm). This can be seen in Fig. 4, which shows that false alarms are avoided in both cases, but at the cost of a much larger decision threshold (±10 instead of ±0.25). This means worse performance when faults are present (a larger missed detection rate, and a larger minimum size for faults to be detectable).
The comparisons of the actual (d₁ and d₂) and estimated (d̂₁ and d̂₂) delays, and their distributions, are depicted in Fig. 4. Owing to the lack of a time reference, it may seem that the estimates are poor. In order to introduce a time reference, the estimation error is computed as the difference between the actual and estimated delays (Fig. 4). The estimation error is close to zero; it increases during the steady-state time window, because the excitation condition is not fulfilled there, but this does not affect the computation of the residual r̃(t), as can be seen in Fig. 4. Fig. 5 depicts the quadratic cost function J[r̃(t)] to be minimised at one time instant; in that case the actual delays were d₁ = 0.044 s and d₂ = 0.055 s, and the dot on the figure is the minimum of the cost function, J[r̃(t)] = 1.92 × 10⁻⁹.

Faulty system situation
Fig. 6 represents the residuals r(t) and r̃(t) in the presence of an additive fault introduced on the sensor e_v(t) as a constant 0.2 bias during the time window [0.3, 0.6] s. The effect of this fault on the residual r(t) is not large enough to exceed the thresholds, causing missed detections during almost the whole fault window. On the other hand, the lower figure shows that r̃(t) allows a proper detection. The estimation error increases significantly during the fault window, but it does not affect the detection of the fault.

6
Conclusion
In model-based FDI, a set of residuals is computed, which should ideally be zero in the fault-free case and different from zero in the faulty case. However, in practice, residuals are different from zero not only because of faults, but also because of measurement noise, unknown inputs, modelling uncertainties and, in distributed systems, transmission delays. The direct use of these residuals will produce false alarms and missed detections.
In this paper, it is shown that when transmission delays are known, it is possible to take them into account in the residual computation, thus introducing a delayed but otherwise unchanged decision and avoiding the false alarms caused by delays. When delays are unknown, it is necessary to estimate them in order to compensate for their effect in the decision procedure. In this case, based on a very rough model of the delays, the paper proposes to address the problem as an optimisation problem: a search algorithm estimates the delays by minimising the residual under the constraints given by the transmission model. When the persistent excitation condition is fulfilled, the delay estimation gives reliable results and avoids false alarms.
The efficiency of the proposed approach is illustrated by a simple example where a set of different dynamics and fault magnitudes have been simulated.Current research addresses more complex transmission models, and more complex sources of errors (measurement noise and transmission errors).

Fig. 1 Data decomposition in a distributed system

d₁ and d₂ are uniformly distributed in the interval [0, 0.1] s. Under H₀ˢʸˢᵗᵉᵐ ∧ H₀ⁿᵉᵗʷᵒʳᵏ, one obviously obtains r̃(t) = 0 when the d_i are replaced by their actual values. When these values are unknown, the admissible delays d_i are estimated by solving the optimisation problem

min r̃²(t), under the constraint d_i ∈ [0, 0.1] s   (27)

Fig. 2 Schematic diagram of the position control of a DC motor

Fig. 3 Block diagram of the system

Fig. 4 Residual in fault-free situation, comparisons of the actual and estimated delays and the estimation error