Analysis of Acoustic Feedback Cancellation Systems based on Direct Closed-Loop Identiﬁcation

Abstract—This work presents, using least squares estimation theory, a theoretical and experimental analysis of the performance of the standard adaptive filtering algorithms when applied to acoustic feedback cancellation. Expressions for the bias and covariance matrix of the acoustic feedback path estimate provided by these algorithms are derived as functions of the signal statistics as well as of derivatives of the cost function. It is demonstrated that, in general, the estimate is biased and presents a large covariance because the closed-loop nature of the system makes the cross-correlation between the loudspeaker and system input signals non-zero. Simulations are carried out to exemplify the results using speech signals, a long acoustic feedback path and the recursive least squares algorithm. The results illustrate that these algorithms converge very slowly to a solution that is not the true acoustic feedback path. The relationship between the performance of the adaptive filtering algorithms and the aforementioned cross-correlation is demonstrated by varying the signal-to-noise ratio and the delay introduced by the forward path.


I. INTRODUCTION
A sound reinforcement (SR) system essentially comprises microphones, amplifiers and loudspeakers. Its fundamental purpose is to pick up, amplify and play back one or more desired sound signals in the same acoustic environment. In many situations, mixing consoles and signal processors are employed to combine and enhance the sound signals.
An SR system may be very complex, including hundreds of microphones and multiple loudspeaker arrays. However, nearly every system covered in the literature is single-channel, i.e., it utilizes one microphone and one loudspeaker [1]. For this reason, only single-channel SR systems are addressed in this work.
The acoustic coupling between loudspeaker and microphone unavoidably causes the loudspeaker signal to be fed back into the microphone. Thus, a closed signal loop is generated, leading to the so-called acoustic feedback problem [1]. The acoustic coupling and the signal processing circuit are referred to as the acoustic feedback path and the forward path, respectively.
The acoustic feedback affects the performance of a SR system in two ways. First and foremost, the closed-loop system can exhibit instability, which may lead to oscillations that are acoustically perceived as howling. This phenomenon is also known as Larsen effect [1]. As a consequence, the achievable amplification is limited [1], [2]. Second, the sound quality is deteriorated by excessive reverberation or even ringing.
Most people have witnessed the howling effect in the public address (PA) systems widely used in presentations, lectures, shows and events in general [1], [2], [3]. But it also occurs in hearing aids (HA) [4], [5], [6], [7]. The howling effect is one of the most frequent complaints from users and the reason why many of them give up using HAs [8]. Industry experts estimate that up to 15% of HAs are returned to the factory within 90 days after manufacture because of feedback problems [9].
In order to control the howling effect and increase the achievable amplification, several methods have been developed over the past decades [1]. The acoustic feedback cancellation (AFC) methods aim at estimating the feedback signal and subtracting it from the microphone signal [1]. The predicted feedback signal is obtained by filtering the loudspeaker signal with a model of the acoustic feedback path. This model is calculated using an adaptive filter that is designed to estimate and track the feedback path by means of some iterative algorithm.
The algorithms of the adaptive filtering theory, which approximate the Wiener solution through the gradient or least squares, identify the feedback path by minimizing the error signal in some deterministic sense [10], [11]. In the target application, the error signal should be defined as the difference between the true and predicted feedback signals, but the microphone signal is actually used as the feedback path output. Besides the feedback signal, the microphone signal also comprises the desired sound signal and possible ambient noise, which act jointly as interference to the adaptive algorithms.
This approach corresponds to direct closed-loop identification because the data used for identification is collected in closed loop but the identification is performed using an open-loop model [12]. In other words, the feedback path is identified using only measurements of its input and output, and no assumptions are made on how they are generated [13].
The cancellation architecture is similar to the acoustic echo cancellation (AEC) commonly used in teleconference systems. Although this scheme is effective when applied to AEC, it is inefficient when applied to AFC. The adaptive filtering algorithms assume that the signals acting as input and interference to the adaptive filter are uncorrelated [10], [11]. In AEC, these signals are independent. In AFC, on the other hand, these signals are strongly correlated, especially when the desired sound signal has a high degree of spectral coloration. Therefore, the standard adaptive filtering algorithms perform poorly when applied to AFC [1], [4], [14].
Most solutions to this problem introduce a decorrelation procedure into the AFC approach with the aim of reducing the cross-correlation between the loudspeaker and interference signals [1], [14]. Some methods aim to change the loudspeaker signal itself, with the disadvantage of possible distortion. In this context, the addition of differently shaped noises to the loudspeaker signal, the insertion of delays, nonlinear processing, such as half-wave rectification, and time-varying processing, such as frequency shifting and phase and delay modulation, in the forward path have been proposed [1], [4], [15], [16]. Other methods aim to create modified versions of the loudspeaker or microphone signals that are used only in updating the adaptive filter. Delaying the loudspeaker signal was proposed first [4]. Subsequent methods are based on the prediction error framework. They assume that the desired sound signal is the output of a filter, the source model, whose input is a zero-mean white noise [12], [17]. The loudspeaker and microphone signals are then prefiltered with the inverse source model in order to be whitened. In [5], [18], the source model is fixed. In [6], an adaptive filter estimates the source model in hearing aids. Short- and long-term prediction filters are combined to estimate the source model in [19].
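As a toy illustration of the prediction-error idea (the AR(1) setup, function names and model order below are ours, not taken from the cited methods), one can fit a low-order source model to a colored signal and filter with its inverse to whiten the signal before it is used in the adaptive filter update:

```python
import numpy as np

# Toy sketch of prediction-error prefiltering: fit a low-order AR source
# model and filter with its inverse A(q) to whiten a colored signal.
def ar_inverse_filter(x, order=1):
    """Fit AR coefficients via the Yule-Walker normal equations and
    return the taps of the inverse (whitening) filter A(q)."""
    r = np.array([x[:len(x) - k] @ x[k:] for k in range(order + 1)]) / len(x)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.r_[1.0, -a]  # e(t) = x(t) - sum_k a_k x(t-k)

rng = np.random.default_rng(0)
n = 100_000
w = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):          # strongly colored AR(1) "speech-like" source
    x[t] = 0.9 * x[t - 1] + w[t]

A = ar_inverse_filter(x)       # approximately [1, -0.9]
e = np.convolve(x, A)[:n]      # prefiltered signal, approximately white
print(abs(np.corrcoef(e[1:], e[:-1])[0, 1]))  # lag-1 correlation near zero
```

After prefiltering, the lag-1 correlation drops from about 0.9 to nearly zero, which is the decorrelation effect the prediction-error methods exploit.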
Another possible solution is not to utilize the standard gradient or least-squares based adaptive filtering algorithms to update the adaptive filter. Following this approach, the works in [2], [20] have demonstrated that the cepstra of the microphone and error signals can be defined as functions of the impulse responses of the feedback path, forward path and adaptive filter. Methods that estimate the feedback path impulse response from the signal cepstra in order to update the adaptive filter in a recursive fashion have then been proposed [2], [20].
The poor performance of the standard adaptive filtering algorithms when applied to AFC has already been theoretically studied in some previous works. In [4], [21], [22], it was analyzed in a statistical framework using the Wiener theory. In [1], [14], it was analyzed in a deterministic framework using the least squares (LS) theory. And, in [5], [23], it was analyzed considering the specific prediction error framework. However, none of these works presented experiments using real speech signals and a long acoustic feedback path of a PA system to exemplify the conclusions drawn from the theoretical analysis.
This work has two goals: first, using the LS estimation theory, to present a theoretical analysis of the poor performance of the standard adaptive filtering algorithms when applied to AFC, compiling in detail the results available in the literature; second, to exemplify the conclusions drawn from the theoretical analysis using speech signals, a long acoustic feedback path of a PA system and the recursive LS (RLS) algorithm. This paper is organized as follows: Section II presents the modelling of both the acoustic feedback problem and acoustic feedback cancellation; Section III presents the theoretical analysis and compiles in detail the results available in the literature; Section IV describes the configuration of the simulated experiments; in Section V, the experimental results are presented and discussed based on the statistical properties of the desired sound signal, the ambient noise and the closed-loop system impulse response; finally, Section VI concludes the paper, emphasizing its main contributions.

II. SYSTEM MODELLING

The acoustic feedback path models the acoustic coupling between loudspeaker and microphone. For simplicity, it also includes the characteristics of the D/A converter, loudspeaker, microphone and A/D converter. The feedback path may present non-linearities, for example due to loudspeaker saturation, but it is generally assumed to be linear. Hence, the feedback path is represented by the $n_F$-order time-varying transfer function

$$F(q, t) = f_0(t) + f_1(t)\,q^{-1} + \dots + f_{n_F}(t)\,q^{-n_F},$$

where $q$ denotes the discrete-time shift operator [24], i.e., $q^{-1}u(t) = u(t-1)$, or by the impulse response

$$\mathbf{f}(t) = \left[f_0(t)\;\, f_1(t)\;\, \dots\;\, f_{n_F}(t)\right]^{T}.$$

The forward path models the amplification system and any other signal processing device inserted in that part of the signal loop. The forward path may present non-linearities, for example because of frequency compression, but it is usually assumed to be linear. Hence, the forward path is represented by the $n_G$-order time-varying transfer function $G(q, t)$ or by the impulse response $\mathbf{g}(t)$. The microphone signal is given by $y(t) = F(q, t)\,u(t) + v(t)$, where the system input signal $v(t)$ comprises the desired sound signal and a possible ambient noise, so that the loudspeaker signal $u(t)$ and $v(t)$ are related by the time-varying closed-loop transfer function of the SR system as follows

$$u(t) = \frac{G(q, t)}{1 - G(q, t)F(q, t)}\, v(t), \quad (5)$$

where the filtering of $v(t)$ with $G(q, t)$ is given by $G(q, t)\,v(t) = \mathbf{g}(t) * v(t)$ and the symbol $*$ denotes the convolution operation.
From control systems theory, the Nyquist stability criterion states that the closed-loop system is unstable if there is at least one frequency $\omega$ for which [1]

$$\left|G(\omega, t)\,F(\omega, t)\right| \geq 1 \quad (6)$$

and

$$\angle\, G(\omega, t)\,F(\omega, t) = 2k\pi, \quad k \in \mathbb{Z}, \quad (7)$$

where $G(\omega, t)$ and $F(\omega, t)$ are the short-term frequency responses of the forward and feedback paths, respectively, and $\omega \in [0, 2\pi)$ is the normalized angular frequency. Therefore, if one frequency component is not attenuated and its phase is shifted by an integer multiple of $2\pi$ by the time-varying open-loop transfer function of the SR system, $G(q, t)F(q, t)$, then this frequency component will never disappear from the system even if there is no more input signal $v(t)$. If it is amplified, its magnitude will increase after each loop through the system, leading to a howling at that frequency.
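The instability condition above can be checked numerically. The sketch below (function, toy parameters and the phase tolerance are our choices) flags the frequencies at which the open-loop response has non-attenuated magnitude and a phase close to a multiple of 2π:

```python
import numpy as np

# Numerical sketch of the Nyquist-style instability check: flag frequencies
# where the open-loop response G(w)F(w) is not attenuated and its phase is
# (approximately) a multiple of 2*pi.
def unstable_frequencies(g, f, n_fft=4096, phase_tol=0.05):
    G = np.fft.rfft(g, n_fft)          # forward path frequency response
    F = np.fft.rfft(f, n_fft)          # feedback path frequency response
    open_loop = G * F
    w = np.linspace(0.0, np.pi, len(open_loop))
    phase_ok = np.abs(np.angle(open_loop)) < phase_tol  # phase ~ 2*k*pi
    gain_ok = np.abs(open_loop) >= 1.0                  # not attenuated
    return w[phase_ok & gain_ok]

# Toy loop: a delayed forward path with broadband gain 1.2 and a single
# strong reflection as feedback path -> howling candidates exist.
g = np.zeros(100); g[99] = 1.2
f = np.zeros(50); f[10] = 0.9
print(unstable_frequencies(g, f).size)  # at least one critical frequency
```

In this toy loop the open-loop gain is 1.08 at every frequency, so every frequency satisfying the phase condition is a howling candidate.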
With the aim of quantifying the achievable amplification in an SR system, it is customary to define a broadband gain of the forward path as [1], [2]

$$K(t) = \frac{1}{2\pi}\int_{0}^{2\pi} \left|G(\omega, t)\right| d\omega$$

and to extract it from $G(q, t)$ as follows

$$G(q, t) = K(t)\, J(q, t),$$

where $J(q, t)$ has unit broadband gain. Assuming that $F(\omega, t)$ is known and $K(t)$ can be varied, the maximum stable gain (MSG) of the SR system is defined as [1]

$$\mathrm{MSG}(t)\ \text{(dB)} = 20\log_{10} K(t),$$

resulting in [1], [2]

$$\mathrm{MSG}(t)\ \text{(dB)} = -20\log_{10}\max_{\omega \in \mathcal{P}(t)} \left|J(\omega, t)\,F(\omega, t)\right|,$$

where $\mathcal{P}(t)$ denotes the set of frequencies at which the phase condition in (7) is satisfied, which are also called the critical frequencies of the SR system.

The cancellation architecture of the AFC approach is depicted in Figure 2 [1], [2]. The adaptive filter, which identifies and tracks the acoustic feedback path $F(q, t)$, is represented by the $n_H$-order time-varying transfer function $\hat{H}(q, t)$ or by the impulse response $\hat{\mathbf{h}}(t)$. Then, the feedback signal $\mathbf{f}(t) * u(t)$ is estimated as $\hat{\mathbf{h}}(t) * u(t)$ and subtracted from the microphone signal $y(t)$, generating the feedback-compensated or error signal

$$e(t) = y(t) - \hat{\mathbf{h}}(t) * u(t), \quad (14)$$

which is effectively the signal fed to the forward path $G(q, t)$. The signals $u(t)$ and $v(t)$ are now related by the closed-loop transfer function of the SR system with an AFC method, hereinafter called the AFC system, as follows

$$u(t) = \frac{G(q, t)}{1 - G(q, t)\left[F(q, t) - \hat{H}(q, t)\right]}\, v(t). \quad (15)$$

According to the Nyquist stability criterion, the closed-loop AFC system is unstable if there is at least one frequency for which

$$\left|G(\omega, t)\left[F(\omega, t) - \hat{H}(\omega, t)\right]\right| \geq 1 \quad\text{and}\quad \angle\, G(\omega, t)\left[F(\omega, t) - \hat{H}(\omega, t)\right] = 2k\pi, \quad (16)$$

where $\hat{H}(\omega, t)$ is the short-term frequency response of the adaptive filter. The MSG of the AFC system is defined as [1], [2]

$$\mathrm{MSG}(t)\ \text{(dB)} = -20\log_{10}\max_{\omega \in \mathcal{P}_H(t)} \left|J(\omega, t)\left[F(\omega, t) - \hat{H}(\omega, t)\right]\right| \quad (17)$$

and the increase in the amount of achievable amplification, the MSG increase, is given by [1], [2]

$$\Delta\mathrm{MSG}(t)\ \text{(dB)} = -20\log_{10}\frac{\max_{\omega \in \mathcal{P}_H(t)} \left|J(\omega, t)\left[F(\omega, t) - \hat{H}(\omega, t)\right]\right|}{\max_{\omega \in \mathcal{P}(t)} \left|J(\omega, t)\,F(\omega, t)\right|}, \quad (18)$$

where $\mathcal{P}_H(t)$ denotes the set of frequencies at which the phase condition in (16) is met, which are the critical frequencies of the AFC system.
It is concluded from (18) that the achievable $\Delta\mathrm{MSG}(t)$ increases as the match between the frequency response magnitudes of the adaptive filter and the feedback path at the critical frequencies of the AFC system improves. If $\hat{H}(\omega, t) = F(\omega, t)$, $\forall\, \omega \in \mathcal{P}_H(t)$, the MSG of the AFC system can be indefinitely high, but some reverberation may still exist in $e(t)$ due to the frequency components that were not perfectly matched. If $\hat{H}(q, t) = F(q, t)$, it follows from (14) and (15) that the acoustic feedback will be totally cancelled and the system will no longer have a closed signal loop, respectively.
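The MSG definition can likewise be sketched numerically. In the toy example below (names, parameters and the phase tolerance are ours), the forward path is a unit-broadband-gain delay and the feedback path a single reflection, so the maximum stable gain is set directly by the reflection gain:

```python
import numpy as np

# Numerical sketch of the MSG definition: the MSG is set by the open-loop
# magnitude |J(w)F(w)| at the critical frequencies, where the open-loop
# phase is (approximately) a multiple of 2*pi.
def msg_db(j, f, n_fft=8192, phase_tol=0.05):
    J = np.fft.rfft(j, n_fft)          # forward path with unit broadband gain
    F = np.fft.rfft(f, n_fft)
    open_loop = J * F
    crit = np.abs(np.angle(open_loop)) < phase_tol  # critical-frequency set
    if not np.any(crit):
        crit[:] = True                 # fallback: consider all frequencies
    return -20.0 * np.log10(np.max(np.abs(open_loop[crit])))

# Toy system: unit-gain delay as forward path and a single reflection of
# gain 0.5 as feedback path -> MSG = -20*log10(0.5), about 6 dB.
j = np.zeros(100); j[99] = 1.0
f = np.zeros(50); f[10] = 0.5
print(round(msg_db(j, f)))  # 6
```

Reducing the reflection gain raises the MSG, which is exactly what a good feedback canceller achieves by shrinking the residual path magnitude at the critical frequencies.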

III. LEAST SQUARES ANALYSIS OF AFC SYSTEMS
This section presents, using the LS estimation theory, a theoretical analysis of the poor performance of standard adaptive filtering algorithms when applied to AFC and compiles in detail the results already presented in the literature. Thereunto, let a data record $\{u(t), y(t)\}_{t=1}^{N}$ of the loudspeaker and microphone signals be available, as well as the initial conditions $\{u(t)\}_{t=1-n_F}^{0}$ of the loudspeaker signal. Let the feedback path be time-invariant ($\mathbf{f}(t) = \mathbf{f}$) so that no data windowing is employed, and let the order of the feedback path estimator $\hat{\mathbf{h}}(t)$ be equal to that of $\mathbf{f}$ ($n_H = n_F$). In the LS approach, an estimate of $\mathbf{f}$ at the sample index $N$, $\hat{\mathbf{h}}(N)$, is obtained by minimizing the cost function or error criterion¹ defined as [25]

$$V(\hat{\mathbf{h}}, N) = \frac{1}{2}\sum_{t=1}^{N}\left[y(t) - \hat{\mathbf{h}}^{T}\mathbf{u}(t)\right]^{2} = \frac{1}{2}\left\|\mathbf{y}(N) - \mathbf{U}(N)\,\hat{\mathbf{h}}\right\|^{2}, \quad (19)$$

where the data matrices and vectors are defined as

$$\mathbf{u}(t) = \left[u(t)\;\, u(t-1)\;\, \dots\;\, u(t-n_F)\right]^{T}, \quad \mathbf{U}(N) = \left[\mathbf{u}(1)\;\, \dots\;\, \mathbf{u}(N)\right]^{T}, \quad \mathbf{y}(N) = \left[y(1)\;\, \dots\;\, y(N)\right]^{T}.$$

The LS cost function, defined in (19), can be written as

$$V(\hat{\mathbf{h}}, N) = \frac{1}{2}\left[\mathbf{y}^{T}\mathbf{y} - 2\,\hat{\mathbf{h}}^{T}\mathbf{U}^{T}\mathbf{y} + \hat{\mathbf{h}}^{T}\mathbf{U}^{T}\mathbf{U}\,\hat{\mathbf{h}}\right],$$

and thus its gradient is given by

$$\nabla V(\hat{\mathbf{h}}, N) = -\mathbf{U}^{T}\mathbf{y} + \mathbf{U}^{T}\mathbf{U}\,\hat{\mathbf{h}}.$$

The LS estimator of the feedback path is then obtained by setting the gradient to zero, resulting in [1]

$$\hat{\mathbf{h}}(N) = \left[\mathbf{U}^{T}(N)\,\mathbf{U}(N)\right]^{-1}\mathbf{U}^{T}(N)\,\mathbf{y}(N). \quad (25)$$

An estimator may be characterized by its bias and variance [1], [25]. In the sequel, the bias and variance of the LS estimator defined in (25) are obtained and discussed separately.
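As a sanity check of the estimator just derived, the sketch below (variable names are ours) builds the data matrix and solves the normal equations in the benign open-loop case, where the interference is independent of the loudspeaker signal and the estimate recovers the true path:

```python
import numpy as np

# Open-loop sanity check of the LS estimator h = (U^T U)^{-1} U^T y.
def ls_feedback_estimate(u, y, order):
    N = len(y)
    # Row t of U holds [u(t), u(t-1), ..., u(t-order)] (zero initial conditions).
    U = np.column_stack([np.r_[np.zeros(k), u[:N - k]] for k in range(order + 1)])
    h, *_ = np.linalg.lstsq(U, y, rcond=None)
    return h

rng = np.random.default_rng(0)
N = 20_000
f_true = np.array([0.5, -0.3, 0.2])
u = rng.standard_normal(N)                 # open-loop excitation
v = 0.1 * rng.standard_normal(N)           # interference independent of u
y = np.convolve(u, f_true)[:N] + v         # "microphone" signal
h = ls_feedback_estimate(u, y, order=2)
print(np.round(h, 2))                      # close to f_true
```

With independent input and interference the estimate lands on the true path up to a small sampling error; the following subsections show why this breaks down in closed loop.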

A. Bias of the LS Estimator
The bias of an estimator is the difference between its expected value and the true parameter value [25]. Hence, at index $N$, the bias of the feedback path estimator is defined as [1]

$$\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\{\hat{\mathbf{h}}(N)\} - \mathbf{f}, \quad (27)$$

¹The error criterion in the LS approach is commonly defined without the factor $1/2$ [25]. In this work, the constant $1/2$ is included only for convenience.
where E{·} denotes the statistical expectation operator. The estimator is desired to be unbiased, i.e., to have zero bias [25]. Replacing (25) in (27), the bias of the LS estimator of the feedback path is given by [1], [14]

$$\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\left[\mathbf{U}^{T}(N)\,\mathbf{U}(N)\right]^{-1}\mathbf{U}^{T}(N)\,\mathbf{v}(N)\right\}, \quad (28)$$

where $\mathbf{v}(N) = [v(1)\;\, \dots\;\, v(N)]^{T}$. In order to draw conclusions on the bias, it is necessary to realize that (28) can be written as

$$\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\hat{\mathbf{R}}_{u}^{-1}(N)\,\hat{\mathbf{p}}_{uv}(N)\right\}, \quad (29)$$

where

$$\hat{\mathbf{R}}_{u}(N) = \frac{1}{N}\sum_{t=1}^{N}\mathbf{u}(t)\,\mathbf{u}^{T}(t) \quad (30)$$

is the $(n_F+1)\times(n_F+1)$ time-average autocorrelation matrix of $u(t)$ [11] and

$$\hat{\mathbf{p}}_{uv}(N) = \frac{1}{N}\sum_{t=1}^{N}\mathbf{u}(t)\,v(t) \quad (31)$$

is the $(n_F+1)\times 1$ time-average cross-correlation vector between $u(t)$ and $v(t)$ [11].
If $u(t)$ and $v(t)$ are at least jointly wide-sense stationary and ergodic processes then, for large $N$, the time averages $\hat{\mathbf{R}}_u(N)$ and $\hat{\mathbf{p}}_{uv}(N)$ are consistent estimates of the respective statistical averages [26], i.e.,

$$\lim_{N\to\infty} \hat{\mathbf{R}}_u(N) = \mathbf{R}_u \quad (32)$$

and

$$\lim_{N\to\infty} \hat{\mathbf{p}}_{uv}(N) = \mathbf{p}_{uv}, \quad (33)$$

where $\mathbf{R}_u$ is the $(n_F+1)\times(n_F+1)$ autocorrelation matrix of $u(t)$ and $\mathbf{p}_{uv}$ is the $(n_F+1)\times 1$ cross-correlation vector between $u(t)$ and $v(t)$ [11]. In this case, the bias becomes

$$\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathbf{R}_u^{-1}\,\mathbf{p}_{uv}. \quad (34)$$
Since autocorrelation matrices are positive definite for practical signals [11], $\mathbf{R}_u^{-1}$ is positive definite and thus the only possibility for the bias defined in (34) to be null is that $\mathbf{p}_{uv} = \mathbf{0}$, i.e., the loudspeaker signal $u(t)$ and the system input signal $v(t)$ must be orthogonal. If at least one of them has zero mean, this is equivalent to saying that they must be uncorrelated. However, as indicated in (5), $u(t)$ and $v(t)$ are related by the closed-loop transfer function of the SR system, which generally introduces linear dependence. As will be discussed later, a counterexample to this correlation introduced by the system closed loop is when $v(t)$ is white Gaussian noise and the open-loop transfer function $G(q,t)F(q,t)$ has at least a 1-sample delay.
Disregarding the assumptions of stationarity and ergodicity of $u(t)$ and $v(t)$, a similar conclusion can be drawn directly from (29). The bias is generally nonzero because the closed-loop nature of the system tends to make $\hat{\mathbf{p}}_{uv}(N) \neq \mathbf{0}$. Therefore, in general, $\mathrm{bias}\{\hat{\mathbf{h}}(N)\} \neq \mathbf{0}$, and it depends on the autocorrelation matrix of the loudspeaker signal $u(t)$ and on the cross-correlation vector between $u(t)$ and the system input signal $v(t)$.
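The nonzero cross-correlation and the bias it induces can be reproduced in a few lines. In the toy loop below (all parameters are our choices), the loudspeaker signal is generated inside the closed loop from a colored input, and the LS estimate lands far from the true feedback path:

```python
import numpy as np

# Toy reproduction of the closed-loop bias: u is generated inside the
# loop, so u and the colored system input v are correlated and the LS
# estimate is pulled away from the true feedback path f.
rng = np.random.default_rng(0)
N = 50_000
f_true = np.array([0.0, 0.4])      # feedback path: one delayed tap
K, d = 0.5, 2                      # forward path: gain K, delay d

v = np.zeros(N)                    # AR(1) "speech-like" system input
w = rng.standard_normal(N)
for t in range(1, N):
    v[t] = 0.9 * v[t - 1] + w[t]

u = np.zeros(N); y = np.zeros(N)
for t in range(N):
    fb = f_true[1] * u[t - 1] if t >= 1 else 0.0
    y[t] = fb + v[t]               # microphone = feedback + system input
    if t >= d:
        u[t] = K * y[t - d]        # loudspeaker = amplified, delayed microphone

U = np.column_stack([np.r_[np.zeros(k), u[:N - k]] for k in range(2)])
h, *_ = np.linalg.lstsq(U, y, rcond=None)
print(np.linalg.norm(h - f_true))  # clearly nonzero: the estimate is biased
```

Even with 50 000 samples the error does not vanish; it is a bias, not a sampling fluctuation, exactly because v(t) is predictable from the loudspeaker signal.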
The existence of bias means that, on average, the feedback path estimator $\hat{\mathbf{h}}(N)$ will not converge to $\mathbf{f}$, the true value of the feedback path, no matter how many process realizations are performed or their time duration [25]. The resulting effect in AFC is twofold: first, the adaptive filter estimates and cancels only part of the feedback signal $\mathbf{f} * u(t)$; second, it also estimates and cancels part of the system input signal $v(t)$. As a consequence, the feedback-compensated signal $e(t)$ is a distorted estimate of the system input signal $v(t)$ [1], [2].
In addition to the definitions in (28), (29) and (34), a different expression for the bias can be obtained by realizing that

$$\mathbf{U}^{T}(N)\,\mathbf{v}(N) = -\nabla V(\mathbf{f}, N) \quad (35)$$

and

$$\mathbf{U}^{T}(N)\,\mathbf{U}(N) = \nabla^{2} V(\hat{\mathbf{h}}, N), \quad (36)$$

where $\nabla^{2} V(\cdot)$ is the second derivative of the error criterion as a function of $\hat{\mathbf{h}}(N)$. Replacing (35) and (36) in (29), the bias can be defined as a function of the error criterion as follows

$$\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = -\,\mathrm{E}\left\{\left[\nabla^{2} V(\hat{\mathbf{h}}, N)\right]^{-1}\nabla V(\mathbf{f}, N)\right\}. \quad (37)$$

B. Variance of the LS Estimator
The variance of the LS estimator can be obtained by considering its covariance matrix, also called the coefficient-error-vector covariance matrix, which is defined as [10], [11]

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\left[\hat{\mathbf{h}}(N) - \mathbf{f}\right]\left[\hat{\mathbf{h}}(N) - \mathbf{f}\right]^{T}\right\}. \quad (38)$$

Replacing (25) in (38), the covariance matrix is given by

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\mathbf{U}^{T}\mathbf{v}\,\mathbf{v}^{T}\mathbf{U}\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\right\}, \quad (39)$$

which using (30) and (31) becomes

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\hat{\mathbf{R}}_{u}^{-1}(N)\,\hat{\mathbf{p}}_{uv}(N)\,\hat{\mathbf{p}}_{uv}^{T}(N)\,\hat{\mathbf{R}}_{u}^{-1}(N)\right\}. \quad (40)$$

Assuming that $u(t)$ and $v(t)$ are at least jointly wide-sense stationary and ergodic processes and thus replacing (32) and (33) in (40), the covariance matrix is defined as

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathbf{R}_{u}^{-1}\,\mathbf{p}_{uv}\,\mathbf{p}_{uv}^{T}\,\mathbf{R}_{u}^{-1} = \mathrm{bias}\{\hat{\mathbf{h}}(N)\}\,\mathrm{bias}^{T}\{\hat{\mathbf{h}}(N)\}. \quad (41)$$

The only possibility for the covariance matrix, defined in (41), to be null is that $\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathbf{0}$, i.e., the estimator $\hat{\mathbf{h}}(N)$ must be unbiased. However, as discussed in the previous subsection, this does not normally occur because the closed-loop transfer function of the SR system makes $\mathbf{p}_{uv} \neq \mathbf{0}$, which results in a biased LS estimator of the feedback path.
Disregarding the assumptions of stationarity and ergodicity of $u(t)$ and $v(t)$, a similar conclusion can be drawn directly from (40). The covariance matrix is generally nonzero because the closed-loop nature of the system tends to make $\hat{\mathbf{p}}_{uv}(N) \neq \mathbf{0}$. Therefore, in general, $\mathrm{cov}\{\hat{\mathbf{h}}(N)\} \neq \mathbf{0}$, and it depends on the autocorrelation matrix of the loudspeaker signal $u(t)$ and on the cross-correlation vector between $u(t)$ and the system input signal $v(t)$.
The resulting effect of the covariance matrix in AFC can be understood by realizing that (39) can be written as

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\mathbf{U}^{T}\hat{\mathbf{R}}_{v}\,\mathbf{U}\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\right\} \quad (42)$$

or [1], [13]

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} \approx \mathrm{E}\left\{\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\mathbf{U}^{T}\mathbf{R}_{v}\,\mathbf{U}\left[\mathbf{U}^{T}\mathbf{U}\right]^{-1}\right\}, \quad (43)$$

where $\mathbf{R}_{v} = \mathrm{E}\{\mathbf{v}(N)\,\mathbf{v}^{T}(N)\}$ is the $N \times N$ autocorrelation matrix of $v(t)$ [10], [11] and $\hat{\mathbf{R}}_{v} = \mathbf{v}(N)\,\mathbf{v}^{T}(N)$ is its instantaneous estimate [10]. The interpretation of (42) and (43) may be related to the double-talk problem in AEC [1]. In AEC, the signals $u(t)$ and $v(t)$, which are respectively called the far-end and near-end speaker signals, are independent. Then, when $u(t)$ is active and $v(t)$ is not, the covariance matrix of the echo path LS estimator is relatively small because $\hat{\mathbf{R}}_{v} \approx \mathbf{R}_{v} \approx \mathbf{0}$, and thus the adaptive filter works properly. But when both signals are active, a situation commonly called double-talk, the covariance matrix can become large because $\hat{\mathbf{R}}_{v} \neq \mathbf{0}$ and $\mathbf{R}_{v} \neq \mathbf{0}$, which consequently leads to a decrease in the convergence speed, or even divergence, of the adaptive filter [1]. This problem becomes more severe when $v(t)$ has a high degree of spectral coloration, as occurs when $v(t)$ is speech, because $\mathbf{R}_{v}$ presents a denser structure in this case [1].
In AFC, on the other hand, the loudspeaker signal $u(t)$ and the system input signal $v(t)$ are not independent because they are related by the system closed loop, as indicated in (5). Therefore, except for possible short time intervals dependent on $G(q,t)$ and $F(q,t)$, the system operates in a continuous double-talk situation. This is further worsened by the aforementioned correlation between $u(t)$ and $v(t)$ [1]. The resulting effect in AFC is that the adaptive filter $\hat{H}(q,t)$ presents a slow convergence speed throughout its operation [1].
In addition to the previous definitions, the covariance matrix can also be defined as a function of the error criterion by replacing (35) and (36) in (40), resulting in

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathrm{E}\left\{\left[\nabla^{2} V(\hat{\mathbf{h}}, N)\right]^{-1}\nabla V(\mathbf{f}, N)\,\nabla^{T} V(\mathbf{f}, N)\left[\nabla^{2} V(\hat{\mathbf{h}}, N)\right]^{-1}\right\}.$$

IV. SIMULATION CONFIGURATION

This section describes the configuration of the two experiments carried out in a simulated environment to corroborate and exemplify the conclusions on the bias and covariance matrix of the LS estimator. In the first, the bias is estimated over time. In the second, the slow convergence of the standard adaptive filtering algorithms when applied to AFC is exemplified using the RLS algorithm. To this end, the following configuration is used.
A. Simulated Environment

1) Feedback Path: The impulse response $\mathbf{f}$ of the feedback path is a measured room impulse response (RIR) available in [27]. The RIR was downsampled to the sampling rate $f_s = 16$ kHz and then truncated so that $n_F = 1000$. The impulse response of the acoustic feedback path is shown in Figure 3.

2) Forward Path: As in [1], [2], the forward path is a time-invariant filter defined as a delay and a gain, i.e.,

$$G(q) = K\, q^{-d}, \quad (47)$$

which leads to $K(t) = K$ and $J(q) = q^{-d}$. The delay $d$ is inherent to any digital signal processing included in that part of the signal loop and plays a key role in the analysis. Because of that, $d = \{1, 100, 400, 800\}$. And $K = 0.7$ so that the initial gain margin, the difference between the MSG of the SR system and the actual broadband gain, is 3 dB as in [1], [2].
3) Closed-loop System: According to (5), the closed-loop transfer function of the SR system is defined as

$$\frac{G(q, t)}{1 - G(q, t)\,F(q, t)}. \quad (48)$$

If $|G(\omega, t)\,F(\omega, t)| < 1$, a sufficient condition to ensure closed-loop stability, then (48) can be written as

$$\frac{G(q, t)}{1 - G(q, t)\,F(q, t)} = G(q, t)\sum_{l=0}^{\infty}\left[G(q, t)\,F(q, t)\right]^{l}. \quad (49)$$

It can be concluded that the closed-loop transfer function is the forward path transfer function multiplied by a power sum of the system open-loop transfer function. Each term in (49) can be interpreted as the transfer function from $v(t)$ to $u(t)$ after $l$ loops through the SR system. By replacing (47) in (49), the closed-loop transfer function of the simulated SR system is given by

$$K\, q^{-d}\sum_{l=0}^{\infty} K^{l}\, q^{-ld}\, F^{l}(q). \quad (50)$$

In the time domain, except for a constant, the first two terms ($l = 0, 1$) in (50) correspond to an impulse at sample index $d$ (the forward path impulse response) and the feedback path impulse response delayed by $2d$ samples, respectively. The other terms ($l > 1$) correspond to successive convolutions of the feedback path impulse response delayed by $(l+1)d$ samples. Thus, as $d$ increases, this linear delay as a function of $l$ causes an increasing dilatation of the system closed-loop impulse response $\mathbf{a}$ along the sample axis. This effect is illustrated in Figure 4 for $d = \{100, 400\}$, where it should be noted that $a(t) = 0$ for $t < d$.

B. Evaluation Metrics

1) Bias energy: In the first experiment, the bias of the LS estimator is measured through its energy, which is defined as

$$\varepsilon(N) = \left\|\mathrm{bias}\{\hat{\mathbf{h}}(N)\}\right\|^{2}, \quad (51)$$

where $\|\cdot\|$ denotes the Euclidean or Frobenius norm.
2) Misalignment: In the second experiment, the performance of the adaptive filter is evaluated through the normalized misalignment (MIS), which is defined as [2]

$$\mathrm{MIS}(t)\ \text{(dB)} = 20\log_{10}\frac{\left\|\mathbf{f} - \hat{\mathbf{h}}(t)\right\|}{\left\|\mathbf{f}\right\|}. \quad (52)$$

3) Maximum stable gain: In the second experiment, the performance of the adaptive filter is also evaluated through the achievable $\Delta\mathrm{MSG}(t)$ defined in (18). It is noteworthy that the metrics $\mathrm{MIS}(t)$ and $\Delta\mathrm{MSG}(t)$ are related because both depend on $|F(\omega, t) - \hat{H}(\omega, t)|$. With respect to this factor, the difference is that $\mathrm{MIS}(t)$ takes all frequencies into account while $\Delta\mathrm{MSG}(t)$ considers only one of the critical frequencies.
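The misalignment metric can be written as a one-liner (a minimal sketch of the definition above; names are ours):

```python
import numpy as np

# Normalized misalignment in dB: 20*log10(||f - h|| / ||f||).
def misalignment_db(f, h):
    return 20.0 * np.log10(np.linalg.norm(f - h) / np.linalg.norm(f))

f = np.array([1.0, 0.5, 0.25])
print(round(misalignment_db(f, 0.9 * f)))      # -20: 10% coefficient error
print(round(misalignment_db(f, np.zeros(3))))  # 0: no better than a zero estimate
```

A misalignment of 0 dB thus means the estimate cancels nothing on average, while each additional -20 dB corresponds to a tenfold reduction of the coefficient error norm.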

C. Speech Signals
The source signals are created from signals of a speech database. Each signal contains one short sentence recorded in a 4 s time slot at a 48 kHz sampling rate and downsampled to 16 kHz. The active power level of each signal is normalized to −26 dBov according to the ITU-T Rec. P.56 algorithm. All sentences are spoken by native speakers of the following nationalities and genders: 4 Americans (2 males and 2 females), 2 British (1 male and 1 female), 2 French (1 male and 1 female) and 2 Germans (1 male and 1 female).
Since long-term signals are required, several signals from the same speaker are concatenated and the silent parts are removed through a voice activity detector (VAD), thereby resulting in 10 speech signals (1 signal per speaker).

V. SIMULATION RESULTS
This section is devoted to reporting the results of the two experiments performed. The simulated environment, evaluation metrics and signals described in Section IV are employed.
A. Experiment 1

The bias energy $\varepsilon(N)$ is measured according to (51). In order to simplify the analysis and discussion of the results, it is assumed that $u(t)$ and $v(t)$ are stationary and ergodic processes and that $N$ is large enough so that $\hat{\mathbf{R}}_u(N) \approx \mathbf{R}_u$ and $\hat{\mathbf{p}}_{uv}(N) \approx \mathbf{p}_{uv}$, thereby making (34) valid. In this case, $\varepsilon(N)$ is bounded from above as follows [28]

$$\varepsilon(N) \leq \left\|\mathbf{R}_u^{-1}\right\|_2^{2}\,\left\|\mathbf{p}_{uv}\right\|_2^{2}. \quad (53)$$

As demonstrated in Appendices A and B, writing the system input as $v(t) = x(t) + r(t)$, where $x(t)$ is the desired sound signal and $r(t)$ is a white ambient noise of power $\sigma_r^2$, $\mathbf{R}_u$ and $\mathbf{p}_{uv}$ can be written as

$$\mathbf{R}_u = \mathbf{R}_{u_x} + \sigma_r^2\,\mathbf{R}_a \quad (54)$$

and

$$\mathbf{p}_{uv} = \sum_{k=d}^{\infty} a(k)\,\mathbf{p}_x(k), \quad (56)$$

where $\mathbf{R}_{u_x}$ is the autocorrelation matrix of the speech component of $u(t)$, $\mathbf{R}_a$ is the sample autocorrelation matrix of the closed-loop impulse response $\mathbf{a}$, and $\mathbf{p}_x(k) = [r_x(k)\;\, r_x(k+1)\;\, \dots\;\, r_x(k+n_F)]^{T}$ collects values of the autocorrelation function $r_x(\cdot)$ of $x(t)$. Therefore, $\mathbf{p}_{uv}$ is actually independent of $r(t)$.
The results for SNR → ∞ and $d = \{1, 100, 400, 800\}$ are shown in Figure 5. It is observed that $\varepsilon(N)$ decreases exponentially over time, becoming nearly constant at the end of the simulation. This occurs because the time averages $\hat{\mathbf{R}}_u(N)$ and $\hat{\mathbf{p}}_{uv}(N)$ become insensitive to changes in the signal statistics for large $N$ [10]. But $\varepsilon(N) \neq 0$ even after 10 s, which illustrates the bias of the acoustic feedback path estimate provided by the standard adaptive filtering algorithms.
It is also noticed that $\varepsilon(N)$ decreases as $d$ increases. This can be explained by noting that $\|\mathbf{p}_{uv}\|_2$ is bounded from above as follows

$$\left\|\mathbf{p}_{uv}\right\|_2 \leq \sum_{k=d}^{\infty} |a(k)|\,\left\|\mathbf{p}_x(k)\right\|_2. \quad (57)$$

As $d$ increases, fewer correlation vectors are included in the right-hand side of (57), decreasing the upper bound of $\|\mathbf{p}_{uv}\|_2$. And, since the autocorrelation function of speech signals usually decays with increasing lag, the higher-magnitude vectors $\mathbf{p}_x(k)$ are removed from this computation and consequently the upper bound of $\|\mathbf{p}_{uv}\|_2$ tends to decrease significantly. On the other hand, as $d$ increases, the autocorrelation function of $u(t)$ presents an increasing dilatation along the lags, caused by the similar dilatation of $\mathbf{a}$ along the sample axis, as discussed in Section IV-A3. This effect on $\mathbf{R}_u$, defined in (54), is not significant because the aforementioned decreasing behavior of the speech autocorrelation function makes the corresponding correlation values approximately zero for large lags. Therefore, as $d$ increases, $\|\mathbf{p}_{uv}\|_2$ tends to decrease while $\|\mathbf{R}_u^{-1}\|_2$ is not significantly affected, leading to a decrease in the upper bound of $\varepsilon(N)$. In practice, although not mathematically guaranteed, the bias energy ultimately decreases, as indicated by the obtained results.
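The dilatation of the closed-loop impulse response with the forward-path delay, discussed here and in Section IV-A3, can be sketched by truncating the power series of the open-loop response (function names and toy parameters are ours):

```python
import numpy as np

# Sketch of the closed-loop impulse response as the truncated power
# series G * sum_l (G F)^l, for a forward path made of a gain and a delay.
def closed_loop_ir(f, gain, delay, length, terms=30):
    g = np.zeros(length); g[delay] = gain       # forward path impulse response
    of = np.convolve(g, f)[:length]             # open-loop response g * f
    term, acc = g.copy(), g.copy()
    for _ in range(terms):                      # one more trip around the loop
        term = np.convolve(term, of)[:length]
        acc += term
    return acc

f = np.zeros(60); f[5] = 0.6                    # single-reflection feedback path
ir_short = closed_loop_ir(f, 0.7, 10, 400)
ir_long = closed_loop_ir(f, 0.7, 100, 400)
# The first nonzero sample sits at the forward-path delay, and the loop
# contributions repeat every (delay + 5) samples: a larger delay spreads
# (dilates) the response along the sample axis.
print(np.flatnonzero(ir_short)[0], np.flatnonzero(ir_long)[0])  # 10 100
```

With the larger delay, all loop contributions are pushed to higher lags, which is the mechanism behind the smaller cross-correlation bound in (57).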
In addition, the results for $d = 400$ and SNR = {∞, 30, 20, 10} dB are shown in Figure 6. It is observed that $\varepsilon(N)$ decreases as the SNR decreases. This is justified by observing that $\mathbf{p}_{uv}$, defined in (56), is not affected by $r(t)$, while $\mathbf{R}_u$, defined in (54), is affected by its power $\sigma_r^2$. From (54), it is noted that $\|\mathbf{R}_u\|_2$ is bounded from above as follows

$$\left\|\mathbf{R}_u\right\|_2 \leq \left\|\mathbf{R}_{u_x}\right\|_2 + \sigma_r^2\left\|\mathbf{R}_a\right\|_2, \quad (58)$$

while $\|\mathbf{R}_u^{-1}\|_2$ has the following lower bound [28]

$$\left\|\mathbf{R}_u^{-1}\right\|_2 \geq \frac{1}{\left\|\mathbf{R}_u\right\|_2}. \quad (59)$$

As $\sigma_r^2$ increases (the SNR decreases), the upper bound of $\|\mathbf{R}_u\|_2$ increases and consequently the lower bound of $\|\mathbf{R}_u^{-1}\|_2$ decreases. Therefore, as the SNR decreases, $\|\mathbf{p}_{uv}\|_2$ is not affected while $\|\mathbf{R}_u^{-1}\|_2$ tends to decrease, leading to a decrease in the upper bound of the bias energy defined in (53). Again, in practice, although not mathematically guaranteed, the bias energy ultimately decreases, as indicated by the obtained results.

Finally, the results for white Gaussian noise (SNR → −∞) and $d = \{1, 400\}$ are shown in Figure 7. It is observed that the values of $\varepsilon(N)$ are practically the same. This occurs because, as can be concluded from (56), the whiteness of $v(t)$ combined with $a(0) = 0$ makes $\mathbf{p}_{uv} = \mathbf{0}$, i.e., the loudspeaker signal $u(t)$ and the system input signal $v(t)$ are uncorrelated. As a consequence, $\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathbf{0}$. It is noteworthy that, although it is much smaller than for speech signals, the bias energy is still nonzero after 10 s because $v(t)$ is in fact a sequence of pseudorandom values drawn from the standard normal distribution and, consequently, $\hat{\mathbf{R}}_u(N)$ and $\hat{\mathbf{p}}_{uv}(N)$ are close but not equal to the statistical $\mathbf{R}_u$ and $\mathbf{p}_{uv}$ even for large $N$.
The presented results exemplify and corroborate the theoretical discussion on the existence of bias in the feedback path estimate provided by the standard adaptive filtering algorithms and its direct relationship with the autocorrelation matrix of the loudspeaker signal $u(t)$ and the cross-correlation between $u(t)$ and the system input signal $v(t)$.

B. Experiment 2
In this experiment, the slow convergence of the standard adaptive filtering algorithms when applied to AFC is exemplified using the RLS algorithm. As in Experiment 1, in order to simplify the analysis and discussion of the results, it is assumed that $u(t)$ and $v(t)$ are stationary and ergodic processes and that $N$ is large enough so that $\hat{\mathbf{R}}_u(N) \approx \mathbf{R}_u$ and $\hat{\mathbf{p}}_{uv}(N) \approx \mathbf{p}_{uv}$, thereby making (41) valid. In this case, the norm of the covariance matrix is given by

$$\left\|\mathrm{cov}\{\hat{\mathbf{h}}(N)\}\right\|_2 = \left\|\mathbf{R}_u^{-1}\,\mathbf{p}_{uv}\right\|_2^{2} = \varepsilon(N),$$

being equal to the bias energy.
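For reference, a textbook exponentially weighted RLS update is sketched below (this is the standard algorithm, not code from the paper's simulations; names and the open-loop test setup are our choices):

```python
import numpy as np

# Textbook exponentially weighted RLS with forgetting factor lam.
def rls(u, y, order, lam=0.999, delta=100.0):
    h = np.zeros(order + 1)
    P = delta * np.eye(order + 1)               # inverse-correlation estimate
    for t in range(len(y)):
        x = np.array([u[t - k] if t >= k else 0.0 for k in range(order + 1)])
        k_gain = P @ x / (lam + x @ P @ x)      # gain vector
        h = h + k_gain * (y[t] - h @ x)         # a priori error update
        P = (P - np.outer(k_gain, x @ P)) / lam
    return h

rng = np.random.default_rng(0)
f_true = np.array([0.4, -0.2])
u = rng.standard_normal(5_000)
y = np.convolve(u, f_true)[:5_000]              # open-loop, noise-free case
print(np.round(rls(u, y, order=1), 2))          # close to f_true
```

In this benign open-loop case the algorithm converges quickly; the experiments below show how the closed loop destroys exactly this behavior.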
The average results for SNR → ∞ and $d = \{1, 100, 400, 800\}$ are shown in Figure 8. It should be observed from Figure 8a that the adaptive filter has not converged even after 10 s in all cases. For $d = 1$, the adaptive filter actually diverged. But, due to the long time required for filter convergence and the computational cost of the RLS algorithm, the discussion on the improvement in convergence speed will be addressed by means of the illustrated transient behavior.
From Figure 8, it is observed that the performance improves as $d$ increases. This occurs because, when $v(t)$ is speech, increasing $d$ decreases the bias energy, as discussed in the previous subsection, and consequently $\|\mathrm{cov}\{\hat{\mathbf{h}}(N)\}\|_2$, leading to an increase in the convergence speed of the adaptive filter. The lack of improvement from $d = 1$ to $d = 100$ in Figure 8b is due to $\Delta\mathrm{MSG}(t)$ taking only one frequency into consideration, as explained in Section IV-B3.
Moreover, the average results for $d = 400$ and SNR = {∞, 30, 20, 10} dB are shown in Figure 9. It is noticed that the performance improves as the SNR decreases. This occurs because the white Gaussian noise $r(t)$ decreases the bias energy, as discussed in the previous subsection, and consequently $\|\mathrm{cov}\{\hat{\mathbf{h}}(N)\}\|_2$, thereby leading to an increase in the convergence speed of the adaptive filter. Note that the results for SNR = {∞, 30} dB are very close since the difference between the bias energies is small, as indicated in Figure 6.
However, in the cases discussed so far, MIS > −3.5 dB and ΔMSG < 4.2 dB after 10 s. These results exemplify the slow convergence of the standard adaptive filtering algorithms when applied to AFC and are explained by the fact that, when $v(t)$ is speech, $\mathbf{R}_v$ is such that $\|\mathrm{cov}\{\hat{\mathbf{h}}(N)\}\|_2$ is large. These results are even more striking given the well-known fast convergence of the RLS algorithm even when the eigenvalue spread of the autocorrelation matrix of the adaptive filter input, $\mathbf{R}_u$ in this case, is large, as occurs for speech signals [10].
Finally, the results for white Gaussian noise (SNR → −∞) and $d = \{1, 400\}$ are shown in Figure 10. It is observed that $\mathrm{MIS}(t)$ and $\Delta\mathrm{MSG}(t)$ are very similar over time in both cases. This occurs because, as discussed in the first experiment, $a(0) = 0$ is, in theory, sufficient to completely decorrelate $u(t)$ and $v(t)$ when the latter is white Gaussian noise, leading to $\mathrm{bias}\{\hat{\mathbf{h}}(N)\} = \mathbf{0}$ and thus $\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \mathbf{0}$ in both cases. Because of that, the performance of the adaptive filter is much better than for speech signals, achieving MIS ≈ −12 dB and ΔMSG ≈ 12 dB after 10 s. However, despite the improvement in convergence speed, it is noteworthy that the adaptive filter has not yet converged because $v(t)$ is in fact a sequence of pseudorandom values drawn from the standard normal distribution. This causes $\mathrm{bias}\{\hat{\mathbf{h}}(N)\} \neq \mathbf{0}$, as can be seen in Figure 7, and thus $\mathrm{cov}\{\hat{\mathbf{h}}(N)\} \neq \mathbf{0}$, thereby limiting the convergence speed of the adaptive filter.

As a matter of fact, when $a(0) = 0$ and $v(t)$ is white Gaussian noise with zero mean and variance $\sigma_v^2$, the signals $u(t)$ and $v(t)$ are independent and thus the covariance matrix, defined in (42), can be written as [10]

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \sigma_v^2\,\mathrm{E}\left\{\left[\mathbf{U}^{T}(N)\,\mathbf{U}(N)\right]^{-1}\right\}, \quad (61)$$

which, by replacing (30), becomes

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \frac{\sigma_v^2}{N}\,\mathrm{E}\left\{\hat{\mathbf{R}}_u^{-1}(N)\right\}. \quad (62)$$

Equations (61) and (62) are the Cramér-Rao lower bound (CRLB) of the covariance matrix for any unbiased estimator of $\mathbf{f}$ [10]. Therefore, $\hat{\mathbf{h}}(N)$ is not only the best linear unbiased estimator (BLUE), in the sense that no other unbiased linear solution generated by any approach has lower variance, but is also the minimum-variance unbiased (MVU) estimator [10]. Considering that $N$ is large enough so that $\hat{\mathbf{R}}_u(N) \approx \mathbf{R}_u$, the covariance matrix defined in (62) can be written as

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \frac{\sigma_v^2}{N}\,\mathbf{R}_u^{-1}, \quad (63)$$

which, by making use of (54), becomes

$$\mathrm{cov}\{\hat{\mathbf{h}}(N)\} = \frac{1}{N}\,\mathbf{R}_a^{-1}. \quad (64)$$

Therefore, in the problem at hand, the CRLB of the covariance matrix depends solely on the sample autocorrelation matrix of the closed-loop impulse response of the SR system. Moreover, from (62) to (64), it can be concluded that the elements of $\mathrm{cov}\{\hat{\mathbf{h}}(N)\}$ decrease as time progresses, becoming null as $N \to \infty$.
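The white-noise counterexample can be reproduced directly. In the toy loop below (parameters are our choices), the system input is white and the loop has a delay, so the loudspeaker signal depends only on past input samples and the LS estimate is essentially unbiased, up to the finite-sample error of the pseudorandom sequence:

```python
import numpy as np

# Toy reproduction of the white-noise counterexample: with a white system
# input and at least one sample of loop delay, u(t) depends only on past
# input samples, so u and v are uncorrelated and the LS estimate is
# essentially unbiased (only finite-sample error remains).
rng = np.random.default_rng(0)
N = 100_000
f_true = np.array([0.0, 0.4])
K, d = 0.5, 2

v = rng.standard_normal(N)         # white input instead of colored speech
u = np.zeros(N); y = np.zeros(N)
for t in range(N):
    fb = f_true[1] * u[t - 1] if t >= 1 else 0.0
    y[t] = fb + v[t]
    if t >= d:
        u[t] = K * y[t - d]

U = np.column_stack([np.r_[np.zeros(k), u[:N - k]] for k in range(2)])
h, *_ = np.linalg.lstsq(U, y, rcond=None)
print(np.linalg.norm(h - f_true))  # small: only finite-sample error remains
```

The residual error shrinks as the record grows, consistent with the covariance elements vanishing as N → ∞.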
This result is in agreement with the discussion of the results shown in Figure 10, because the assumptions made are the same as those that led to a null bias and, consequently, a null covariance of the estimate.
The presented results corroborate the theoretical discussion on the covariance matrix of the acoustic feedback path estimate provided by AFC systems based on direct closed-loop identification. In addition, they exemplify the slow convergence speed of the standard adaptive filtering algorithms and its direct relationship with the autocorrelation matrix of the system input signal. This fact justifies the development of specific methods to estimate the impulse response f of the acoustic feedback path in AFC systems.
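The dependence on the input statistics can be illustrated with a toy closed loop. In the sketch below (a hypothetical three-tap feedback path and a forward path that is a pure gain with one sample of delay, all values illustrative), the lag-zero cross-correlation between the loudspeaker and system input signals is essentially null for white input but clearly non-null for an AR(1), speech-like input; the latter is precisely the correlation that biases direct closed-loop identification.

```python
import numpy as np

def simulate_closed_loop(v, f, g, d):
    """Toy SR loop: microphone y(n) = v(n) + f * x(n) (convolution),
    loudspeaker x(n) = g * y(n - d). All parameters are illustrative."""
    n = len(v)
    x = np.zeros(n)
    y = np.zeros(n)
    for i in range(n):
        fb = sum(f[k] * x[i - k] for k in range(len(f)) if i - k >= 0)
        y[i] = v[i] + fb
        if i + d < n:
            x[i + d] = g * y[i]
    return x

rng = np.random.default_rng(1)
n, g, d = 100_000, 0.5, 1
f = np.array([0.4, -0.2, 0.1])        # toy feedback path (hypothetical)

v_white = rng.standard_normal(n)      # white system input
e = rng.standard_normal(n)
v_ar = np.zeros(n)                    # AR(1) "speech-like" system input
for i in range(1, n):
    v_ar[i] = 0.9 * v_ar[i - 1] + e[i]

x_w = simulate_closed_loop(v_white, f, g, d)
x_c = simulate_closed_loop(v_ar, f, g, d)

# Lag-zero cross-correlation between loudspeaker and system input.
r_white = np.mean(x_w * v_white)      # close to zero
r_col = np.mean(x_c * v_ar)           # clearly non-zero
print(r_white, r_col)
```

The loop gain (|g| times the l1 norm of f) is well below one, so the toy system is stable by construction.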

VI. CONCLUSIONS
This work presented, using least squares estimation theory, a theoretical and experimental analysis of the poor performance of standard adaptive filtering algorithms, which approximate the Wiener solution via gradient or least squares methods, when applied to acoustic feedback cancellation with direct closed-loop identification.
Expressions for the bias and covariance matrix of the acoustic feedback path estimate provided by these algorithms were derived as functions of the signal statistics as well as derivatives of the cost function. Stationary and non-stationary environments were considered. It was demonstrated that, in general, the acoustic feedback path estimate is biased and presents a large covariance because the closed-loop nature of the system makes the cross-correlation between the loudspeaker and system input signals non-zero. As a consequence, the adaptive filter converges very slowly to a solution that is not the true acoustic feedback path.
These problems were exemplified using speech signals, a long acoustic feedback path and the RLS algorithm. It was verified that, after 10 s, the bias energy is usually greater than −15 dB, while the RLS algorithm achieves a normalized misalignment no lower than −3.5 dB and increases the maximum stable gain by no more than 4.2 dB. The relationship between the performance of the adaptive algorithm and the aforementioned cross-correlation was proven by varying the signal-to-noise ratio and the delay introduced by the forward path.

APPENDIX A EXPANSION OF THE CROSS-CORRELATION VECTOR p
The cross-correlation vector p between the loudspeaker signal and the system input signal is defined in (65) [11], where the cross-correlation function between these two signals is given in (66) [11]. By replacing (5) in (66), we obtain an expanded expression. Assuming that the system input signal is white Gaussian noise with zero mean and variance σ², the corresponding cross terms vanish and, in this case, (68) simplifies accordingly. Assuming further that the closed-loop system is causal, so that its impulse response is null for negative time indices, (70) becomes (71). Therefore, by replacing (71) in (65), the cross-correlation vector between the loudspeaker signal and the system input signal can be expanded as in (72), together with the definitions that follow it.
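Under the stated assumptions (white, zero-mean input and a causal closed loop), the expansion of p can be verified numerically: each entry of the sample cross-correlation vector should equal σ² times the corresponding sample of the closed-loop impulse response from the system input to the loudspeaker. The toy system below uses illustrative values, not those of this work.

```python
import numpy as np

def closed_loop(v, f, g, d):
    """Toy SR loop: y(n) = v(n) + f * x(n); x(n) = g * y(n - d).
    All parameters are illustrative."""
    n = len(v)
    x = np.zeros(n)
    y = np.zeros(n)
    for i in range(n):
        fb = sum(f[k] * x[i - k] for k in range(len(f)) if i - k >= 0)
        y[i] = v[i] + fb
        if i + d < n:
            x[i + d] = g * y[i]
    return x

rng = np.random.default_rng(2)
n, g, d, K = 100_000, 0.5, 1, 8
f = np.array([0.4, -0.2, 0.1])            # toy feedback path (hypothetical)

v = rng.standard_normal(n)                # white, zero mean, unit variance
x = closed_loop(v, f, g, d)

# Sample cross-correlation p_k = E{x(n) v(n - k)}, k = 0..K-1.
p_hat = np.array([np.mean(x[k:] * v[:n - k]) for k in range(K)])

# Closed-loop impulse response from v to x (feed a unit impulse).
imp = np.zeros(K)
imp[0] = 1.0
c = closed_loop(imp, f, g, d)

# With sigma_v^2 = 1, p_k should match c_k sample by sample.
print(np.max(np.abs(p_hat - c)))
```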

APPENDIX B EXPANSION OF THE AUTOCORRELATION MATRIX R
The autocorrelation matrix R of the loudspeaker signal is defined in (75) [11], where the autocorrelation function of the loudspeaker signal is given in (76) [11]. By replacing (5) in (75), we obtain an expanded expression for R in which the sample autocorrelation function of a appears [11].
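As a complement, the sketch below estimates an autocorrelation matrix of this kind as a symmetric Toeplitz matrix built from the biased sample autocorrelation function; for unit-variance white noise the result should be close to the identity. The function name and parameters are ours, chosen for illustration.

```python
import numpy as np

def autocorr_matrix(x, order):
    """Sample autocorrelation matrix of x, of size (order+1) x (order+1),
    built as a symmetric Toeplitz matrix from the biased sample
    autocorrelation function r(k) = (1/N) * sum x(n) x(n-k)."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # |i - j| lag index for every matrix entry -> Toeplitz structure
    idx = np.abs(np.subtract.outer(np.arange(order + 1),
                                   np.arange(order + 1)))
    return r[idx]

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
R = autocorr_matrix(x, 3)
# Close to the 4x4 identity for unit-variance white noise.
print(np.round(R, 3))
```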

ACKNOWLEDGMENT
The authors would like to acknowledge the FAPPR (Fundação Araucária), the SETI-PR (Secretaria de Estado da Ciência, Tecnologia e Ensino Superior) and the Government of the State of Paraná for the financial support received. This work was also supported by National Funds from FCT (Fundação para a Ciência e a Tecnologia) through project UIDB/50016/2020.

Wellington Murilo da Silva Nogueira received the B.Sc. degree in electronics engineering from the Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Brazil, in 2018. His research interests are in digital signal processing, especially applied to acoustic feedback cancellation and speech processing.