State of charge classification for lithium-ion batteries using impedance based features

With the ongoing electrification of passenger car drive trains, obtaining precise knowledge about the condition of the on-board batteries is gaining importance. Due to the flat open circuit voltage (OCV) to state of charge (SoC) characteristic of lithium-ion batteries, methods employed in applications with other cell chemistries cannot be adapted. Exploiting the higher significance of the impedance for state estimation with that chemistry, this work proposes new impedance-based features. To evaluate the suitability of these features, simulations have been conducted using a simplified on-board power supply net as excitation source. The simulation outcome has been investigated with regard to the correlation coefficient r_xy and in a polynomial regression scenario. The results of the simulations show a best-case error below 1 % SoC, which is 3 percentage points lower than when using terminal voltage and impedance. When the measurement uncertainty is increased, the difference remains around 2 percentage points.


Introduction
A main challenge for the current generation of engineers is the successful transition from fossil fuels to regenerative energy sources for stationary and automotive applications. In Germany, the electrification of drives is propelled by CO₂ reduction laws (European Parliament, 2014), demanding an average emission of 95 g CO₂ km⁻¹ for all new cars by 2021, as well as by the objective of one million electric vehicles (EVs) by 2020 and six million EVs by 2030 (Second Report by The German National Platform for Electric Mobility, 2011). Due to the electrification of the drive train, the importance of batteries as power storage increases. Lithium-ion batteries are promising candidates for achieving the set goals because of their high energy density compared to other commercially available cells. Condition monitoring of the EV's battery is an important task in the automotive field: the state of charge (SoC) is the "fuel gauge" of the battery. Precise knowledge of the SoC not only provides information about the remaining range, but also enables the system to keep a safety buffer for the operation of safety systems in case of need.
For conventional cell chemistries, the open circuit voltage (OCV) is a useful measure for the SoC (Richardson et al., 2012; Feng et al., 2015; Chang, 2013, Sect. 3.1.1 and sources therein). This does not apply to Li-ion batteries, as the OCV characteristic is very flat, with a slope of only a few mV per 10 % SoC (see Fig. 1, with LiFePO₄ as an extreme example). Furthermore, OCV-curve-based methods require a rest period whose length is difficult to quantify and which is challenging to implement in xEV applications, as clocks and some other loads cannot be turned off entirely (Piller et al., 2001). As an alternative, the impedance has been shown to be a significant SoC-sensitive feature (Piller et al., 2001; Klotz et al., 2011; Schmidt et al., 2011).
The Taylor Fourier transform (TFT) is a method originally used in stationary power line applications to identify faults from abnormal mains frequency properties (de la O Serna and Rodríguez, 2007). As a novelty, this work uses the TFT to extract impedance-based features. Furthermore, the suitability of the features for battery state monitoring is investigated in an exemplary use case. Especially for low noise, the results are promising.
The rest of this work is structured as follows: Sect. 2 describes the underlying theory of the signal model (Sect. 2.1) and the extraction algorithm (Sect. 2.2). In Sect. 3, the simulation setup is presented. Section 4 covers the investigation of the suitability of the new features, specifically by correlation coefficient analysis (Sect. 4.1) and an exemplary use case in a polynomial regression (Sect. 4.2). Finally, Sect. 5 concludes this work with a summary and suggestions regarding future work.

Theory
In this section, the signal model used in the investigations is explained in detail. After that, the Taylor Fourier transform is described as it is used for feature extraction. The signal model and derivation of the TFT follow de la O Serna and Rodríguez (2007).

Signal model
Let $s(n)$ be a real-valued, sampled oscillating signal with sample index $n$, signal frequency $f$, sampling rate $f_s$, amplitude $a(n)$ and phase offset $\phi(n)$:
\[ s(n) = a(n) \cos\left(2\pi \frac{f}{f_s} n + \phi(n)\right). \tag{1} \]
The amplitude $a(n)$ and the phase $\phi(n)$ can be gathered in a complex pointer called the dynamic phasor $p(n)$:
\[ p(n) = a(n)\, e^{\mathrm{j}\phi(n)}. \tag{2} \]
With $z^*$ denoting the complex conjugate of $z \in \mathbb{C}$ and the abbreviation $\omega = 2\pi f / f_s$ for the normalized angular frequency corresponding to $f$ and $f_s$, this yields an equation in which the amplitude and phase shift are gathered in the variable $p(n)$:
\[ s(n) = \frac{1}{2}\left(p(n)\, e^{\mathrm{j}\omega n} + p^*(n)\, e^{-\mathrm{j}\omega n}\right). \tag{3} \]
In the case of the battery, the measured voltage will not oscillate with zero mean, in contrast to the assumptions by de la O Serna and Rodríguez (2007). The measured signal $x(n)$ with bias $c \in \mathbb{R}$ is then
\[ x(n) = s(n) + c. \tag{4} \]
For brevity and simplicity, $s(n)$ is assumed to have no bias $c$ in the following. This can be achieved by subtracting the mean value $c$ from the measured data.
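The relation between the oscillating signal, its dynamic phasor and the bias removal can be checked numerically. The following is a minimal sketch only: the sampling rate, amplitude and phase trajectories and the bias value are hypothetical illustration values, not parameters from this work.

```python
import numpy as np

def dynamic_phasor(a, phi):
    """Gather amplitude a(n) and phase phi(n) in p(n) = a(n) * exp(j * phi(n))."""
    return a * np.exp(1j * phi)

def signal_from_phasor(p, omega, n):
    """Rebuild s(n) = 0.5 * (p(n) e^{j omega n} + p*(n) e^{-j omega n})."""
    rot = p * np.exp(1j * omega * n)
    return (0.5 * (rot + np.conj(rot))).real

# Hypothetical example values (assumptions, not taken from the paper).
f, f_s = 120.0, 10_000.0                 # signal frequency and sampling rate in Hz
omega = 2 * np.pi * f / f_s              # normalized angular frequency
n = np.arange(1000)
a = 1.0 + 0.1 * n / n.size               # slowly varying amplitude
phi = 0.3 + 0.05 * np.sin(2 * np.pi * n / n.size)  # slowly varying phase
c = 3.3                                  # bias, e.g. a mean terminal voltage

s_direct = a * np.cos(omega * n + phi)   # cosine form of the signal
s_phasor = signal_from_phasor(dynamic_phasor(a, phi), omega, n)

x = s_direct + c                         # biased measurement
x_zero_mean = x - x.mean()               # bias removal by mean subtraction
```

Note that subtracting the sample mean of a finite record also removes the small residual mean of the oscillation itself, so the result equals $s(n)$ only approximately unless the record spans an integer number of periods.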

Taylor Fourier transform
The Taylor Fourier transform can be used to extract $p(n)$ and its derivatives. To follow the evolution of these parameters, the time signal is filtered using a sliding window. Let $s_k(n)$ be the $k$th window of the sampled signal $s(n)$ from above with odd window length $N$ and $N_h = \lfloor N/2 \rfloor$ its rounded-down half. Then, $s_k^{(2)}(n)$ is an approximation of $s_k(n)$ based on a second-order truncated Taylor series $p_k^{(2)}(n)$ of the dynamic phasor $p_k(n)$ (Eq. 5). To create a system of linear equations, Eq. (5) is rewritten using the complex conjugate as in Eq. (3). Let the time samples be the vector $\boldsymbol{n}$ and the exponential functions the vector $\boldsymbol{b}$; the matrix $\boldsymbol{D}_k(\boldsymbol{n}, \boldsymbol{b})$ is then a function of both. Furthermore, the vector $\boldsymbol{p}_k$ consists of the factors $p_{i,k}$. Altogether, this yields Eq. (6).
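As a sketch of what the windowed second-order model and the resulting linear system amount to under the definitions above, one plausible formulation is given below. It is an assumption in two respects: the Taylor factors $1/i!$ are taken to be absorbed into the coefficients $p_{i,k}$, and the column ordering of $\boldsymbol{D}_k$ is chosen freely. It follows the general construction of de la O Serna and Rodríguez (2007), but is not necessarily the exact notation used by the authors.

```latex
% Assumptions: centred window n = -N_h, ..., N_h; factors 1/i! absorbed in p_{i,k};
% b = (e^{j omega n})_n elementwise; column order of D_k chosen for illustration.
\begin{align}
  p_k^{(2)}(n) &= p_{0,k} + p_{1,k}\, n + p_{2,k}\, n^2, \\
  s_k^{(2)}(n) &= \tfrac{1}{2}\left( p_k^{(2)}(n)\, e^{\mathrm{j}\omega n}
                 + p_k^{(2)*}(n)\, e^{-\mathrm{j}\omega n} \right), \\
  \boldsymbol{s}_k &\approx \boldsymbol{D}_k(\boldsymbol{n}, \boldsymbol{b})
     \begin{pmatrix} \boldsymbol{p}_k \\ \boldsymbol{p}_k^* \end{pmatrix},
  \qquad
  \boldsymbol{D}_k = \tfrac{1}{2}
     \begin{pmatrix}
       \boldsymbol{b} & \boldsymbol{n} \odot \boldsymbol{b} & \boldsymbol{n}^2 \odot \boldsymbol{b} &
       \boldsymbol{b}^* & \boldsymbol{n} \odot \boldsymbol{b}^* & \boldsymbol{n}^2 \odot \boldsymbol{b}^*
     \end{pmatrix}, \\
  \begin{pmatrix} \hat{\boldsymbol{p}}_k \\ \hat{\boldsymbol{p}}_k^* \end{pmatrix}
     &= \boldsymbol{D}_k^{+}\, \boldsymbol{s}_k
  \quad \text{(least-squares estimate via the pseudo inverse } \boldsymbol{D}_k^{+}\text{)}.
\end{align}
```

Here $\boldsymbol{p}_k = (p_{0,k}, p_{1,k}, p_{2,k})^{\mathsf{T}}$ and $\odot$ denotes the elementwise product.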
As real-valued features for regression, the absolute value abs($p_k$) and the polar angle arg($p_k$) of the complex parameters $p_k$ are used.
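A sliding-window extraction of this kind can be sketched with a least-squares fit per window. The code below is an illustrative implementation under stated assumptions, not the authors' code: the function name `tft_coefficients`, the absorption of the factors 1/i! into the coefficients, the column ordering of the matrix and the sampling rate of 1200 Hz are all hypothetical choices.

```python
import numpy as np

def tft_coefficients(s, omega, N=65, order=2):
    """Least-squares Taylor coefficients p_0..p_order of the dynamic phasor
    for every sliding window of odd length N (one row per window)."""
    if N % 2 == 0:
        raise ValueError("window length N must be odd")
    N_h = N // 2
    n = np.arange(-N_h, N_h + 1)                       # centred sample index
    b = np.exp(1j * omega * n)                         # exponential functions
    B = np.vander(n, order + 1, increasing=True) * b[:, None]  # b, n*b, n^2*b
    D = 0.5 * np.hstack([B, np.conj(B)])               # model matrix, shape (N, 2*(order+1))
    H = np.linalg.pinv(D)[: order + 1]                 # rows mapping a window to p_0..p_order
    windows = np.lib.stride_tricks.sliding_window_view(np.asarray(s, dtype=float), N)
    return windows @ H.T                               # shape (num_windows, order+1)

# Hypothetical test signal: constant amplitude 2 and phase 0.5 rad at 120 Hz.
f, f_s = 120.0, 1200.0                                 # f_s is an assumed value
omega = 2 * np.pi * f / f_s
m = np.arange(200)
s = 2.0 * np.cos(omega * m + 0.5)

P = tft_coefficients(s, omega)
features_abs = np.abs(P)      # abs(p_0), abs(p_1), abs(p_2) per window
features_arg = np.angle(P)    # arg(p_0), arg(p_1), arg(p_2) per window
```

Because each window is indexed relative to its own centre, the phase of $p_0$ advances by $\omega$ per sample of window shift in this sketch. For the stationary test signal, abs($p_0$) stays at the amplitude while the higher-order coefficients vanish.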

Simulation setup
To investigate the correlation of the $p_k$ parameters with the SoC, simulations have been used as data source. The model parameters for the simulation are stated in Table 1. The battery model is a 2RC model for a commercial 18650 lithium iron phosphate cell, parameterized by Vergossen et al. (2011). To achieve an oscillating signal, the battery has been put into a strongly simplified on-board power supply net as depicted in Fig. 2. The setup consists of a modeled battery $u_b$ connected in series with resistor $R_2$, which is in parallel with resistor $R_1$. The connection to $R_1$ is periodically toggled by the switch $b_{sw}$ with switching frequency $f$ = 120 Hz. This setup could represent a pulsed seat heater in parallel with a miscellaneous combination of ohmic loads.
The frequency spectrum of the current resulting from that setup (Fig. 3) shows a dominant component at the switching frequency $f$. Assuming the superposition principle to be valid, this signal can be used to obtain the $p_k^{(2)}$ parameters. The TFT used to extract the parameters was executed on windows with a length of $N$ = 65 samples.
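The dominance of the switching frequency in the load current can be illustrated with a strongly reduced sketch. It replaces the 2RC battery model by an ideal voltage source, and the source voltage, resistor values, duty cycle and sampling rate are hypothetical values chosen only for illustration; only the switching frequency of 120 Hz is taken from the text.

```python
import numpy as np

# Assumed illustration values (not from Table 1 of the paper).
u_b = 3.3            # ideal source voltage in V, stands in for the battery model
R_1, R_2 = 1.0, 10.0 # switched and permanently connected resistor in ohm
f_sw = 120.0         # switching frequency in Hz (from the text)
f_s = 12_000.0       # sampling rate in Hz
duration = 1.0       # record length in s

n = np.arange(int(f_s * duration))
samples_per_period = int(f_s / f_sw)
closed = (n % samples_per_period) < samples_per_period // 2  # 50 % duty cycle

# R_2 always draws current; R_1 adds current while the switch is closed.
i = u_b / R_2 + closed * (u_b / R_1)

# Magnitude spectrum of the zero-mean current and its strongest component.
spectrum = np.abs(np.fft.rfft(i - i.mean()))
freqs = np.fft.rfftfreq(i.size, d=1.0 / f_s)
f_peak = freqs[np.argmax(spectrum)]
```

In this idealized case the current is a square wave, so the remaining spectral content sits at odd multiples of the switching frequency with decreasing magnitude.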

Suitability for SoC estimation studies
To evaluate the fitness of the parameters, first the Bravais-Pearson correlation $r_{xy}$ is investigated. After that, tests with a polynomial fit for SoC detection are conducted.

Correlation coefficient
As an exemplary measure for the usability of parameters for regression, the Bravais-Pearson correlation coefficient $r_{xy}$ between the simulation output and the SoC is investigated. Let $\bar{a}$ be the mean value of an arbitrary signal $a$; the coefficient for signals $x$ and $y$ is then calculated as (Fahrmeir et al., 2016)
\[ r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}. \]
Commonly, $r_{xy} \geq 0.8$ is perceived as a strong correlation, $0.5 \leq r_{xy} < 0.8$ as an intermediate correlation and $r_{xy} < 0.5$ as a weak correlation (Fahrmeir et al., 2016). When interpreting the results, it has to be kept in mind that the correlation coefficient measures linear correlation only. The results can be found in Table 2. Following the definition above, the terminal voltage $\bar{u}$ as well as the parameters abs($p_0$), abs($p_1$) and arg($p_1$) show a strong correlation to the SoC, whereas the remaining parameters arg($p_0$), abs($p_2$) and arg($p_2$) correlate only weakly in the linear sense. Although these results may seem poor, it has to be noted that a battery is a highly nonlinear system. Therefore, polynomial dependency has also been investigated.
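The coefficient and the limitation to linear dependency can be illustrated with a short sketch. The helper names are hypothetical, and applying the thresholds to the magnitude of $r_{xy}$ is an assumption made here so that negative correlations are classified as well.

```python
import numpy as np

def pearson_r(x, y):
    """Bravais-Pearson correlation coefficient of two equally long signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

def correlation_strength(r):
    """Classify r with the thresholds 0.8 and 0.5 (applied to |r|, an assumption)."""
    r = abs(r)
    if r >= 0.8:
        return "strong"
    if r >= 0.5:
        return "intermediate"
    return "weak"

# A feature that depends linearly on x versus one that depends on x quadratically.
x = np.linspace(-1.0, 1.0, 101)
r_linear = pearson_r(x, 2.0 * x + 1.0)   # perfectly linear dependency
r_quadratic = pearson_r(x, x**2)         # deterministic, but not linear
```

The quadratic case is fully determined by `x` and still yields a coefficient near zero, which is why a weak linear correlation alone does not rule out a feature for nonlinear regression.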

Polynomial regression
To get a further idea of the suitability of the parameters for SoC detection beyond linear correlation, a multivariate polynomial has been fitted to the data. As fitness measure, the root mean square error $e_{\mathrm{rms},i}$ is given for a polynomial $g_1$($\bar{u}$, abs($p_0$), abs($p_1$), arg($p_1$), abs($p_2$), arg($p_2$)), which makes use of the voltage measurement and the calculated parameters, and for a polynomial $g_2$($\bar{u}$, abs($p_0$), arg($p_0$)) (voltage and impedance) for comparison. The corresponding errors are $e_{\mathrm{rms},1}$ for $g_1$ and $e_{\mathrm{rms},2}$ for $g_2$. As is common practice in supervised learning, the data set is split into a training set (targets) and a test set. The proportion of training to test data is 80 : 20. The data are assigned to the sets randomly. First, the influence of the order of the polynomial has been investigated. The mean values $\mu_x$ and standard deviations $\sigma_x$ as a function of the polynomial order are presented in Table 3. To avoid an influence of the set assignment process, 51 Monte Carlo runs have been conducted. To avoid inaccuracies due to rounding errors, the measurement and data matrix have been transformed into zero-mean, unit-variance form.
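The evaluation procedure (standardization, random 80 : 20 split, pseudo-inverse fit, repetition over Monte Carlo runs) can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the polynomial contains all monomials up to the given total degree, only the features are standardized (with training-set statistics) so that the error stays in the unit of the target, and the data are synthetic rather than simulation output.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_design_matrix(X, order):
    """All monomials of the feature columns up to the given total degree."""
    cols = [np.ones(X.shape[0])]
    for degree in range(1, order + 1):
        for idx in combinations_with_replacement(range(X.shape[1]), degree):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def monte_carlo_rms(X, y, order, runs=51, train_frac=0.8, seed=0):
    """Mean and standard deviation of the test RMS error over random splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * len(y)))
    errors = []
    for _ in range(runs):
        perm = rng.permutation(len(y))
        tr, te = perm[:n_train], perm[n_train:]
        # Zero-mean, unit-variance scaling based on the training set only.
        mu, sigma = X[tr].mean(axis=0), X[tr].std(axis=0)
        A_tr = poly_design_matrix((X[tr] - mu) / sigma, order)
        A_te = poly_design_matrix((X[te] - mu) / sigma, order)
        w = np.linalg.pinv(A_tr) @ y[tr]          # least-squares coefficients
        errors.append(np.sqrt(np.mean((A_te @ w - y[te]) ** 2)))
    return float(np.mean(errors)), float(np.std(errors))

# Synthetic stand-in data: two features and a quadratic target.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 50.0 + 10.0 * X[:, 0] + 5.0 * X[:, 0] * X[:, 1] + 3.0 * X[:, 1] ** 2

mu_1, sigma_1 = monte_carlo_rms(X, y, order=1)    # underfits the quadratic target
mu_2, sigma_2 = monte_carlo_rms(X, y, order=2)    # matches the target structure
```

The number of columns grows quickly with the number of features and the order, which is the computational cost side of the tradeoff between a richer feature set and a higher polynomial order.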
It can be seen that for $g_1$, the mean root mean square (RMS) error is always lower than for $g_2$. For orders higher than 1, the mean RMS error is lower than 1 % for $g_1$, whereas for $g_2$ this requires a polynomial order of 4. Also, the standard deviation $\sigma_1$ is always lower than $\sigma_2$, declining along with $\mu_1$ with increasing order of the polynomial.
In the ideal case presented here, using the new features results in considerably lower deviations from the ground truth than using only the terminal voltage and the impedance. Whereas $e_{\mathrm{rms},2}$ falls below 1 % only for a polynomial order of 4, the error $e_{\mathrm{rms},1}$ is well below 1 % for all cases with a polynomial of order 2 or higher. The resulting tradeoff is between the high accuracy of $g_1$, where the computational cost is higher because of the pseudo inverse, and the computationally cheaper but less precise solution of $g_2$. Increasing the order of the polynomial also increases the performance of $g_2$, but lifting it to the same level as $g_1$ would require a high number of summands. This would also result in a high computational cost, and success would not be guaranteed.
Finally, the robustness against the influence of noise has been investigated. The noisy simulated measurement data have been used unaltered, i.e. without filtering. As noise, usual values for measurement uncertainty have been assumed as noise variance, e.g. $e_{\mathrm{meas}}$ = {0, 0.5, 1, 3, 5, 10} %. In Fig. 4, $e_{\mathrm{rms},1}$ (blue line) and $e_{\mathrm{rms},2}$ (red line) are depicted over different noise levels. The graphs show the mean values over the 11 Monte Carlo runs, while the box plots display the typical statistical deviations. For $g_1$, the error without noise is below 1 %, varies only slightly around 4 % for measurement uncertainties of 1 to 5 % and is just above 5 % for a high measurement uncertainty of 10 %. For $g_2$, the error without noise is between 3 and 4 % and stays flat at around 7 % for all other values.
For all simulations, $g_1$ scored a significantly lower error. The difference is around 3 percentage points for low noise and falls below 2 percentage points only for a measurement uncertainty of 10 %.

Conclusions and outlook
In this work, the dynamic phasor and its derivatives have been presented as new impedance-based features for SoC classification. Simulations based on a lithium-ion battery model have been conducted. Investigations utilizing the correlation coefficient $r_{xy}$ showed a strong linear correlation for a subset of the features. Regression tests with a multivariate polynomial of order 2 or higher yield an RMS error below 1 % SoC, which is lower than that of a regression featuring only terminal voltage and impedance measurements. In an exemplary experiment with a polynomial of order 2 on data with varying measurement uncertainties, the impedance features yield an estimation RMS error that is 2 to 3 percentage points lower.
In future work, the features could be subjected to more elaborate performance testing using other regression methods from the field of machine learning. In particular, ensembles of different regression types, potentially in combination with other SoC detection algorithms, could be promising for highly precise state estimation. Furthermore, a laboratory study could be conducted to compare measurement results with those obtained in the simulations.
Code availability. The code is part of an unpublished dissertation and will therefore not be published beforehand.
Competing interests. The authors declare that they have no conflict of interest.