CN110176248B - Road voice recognition method, system, computer device and readable storage medium - Google Patents
- Publication number
- CN110176248B (application CN201910436946.5A)
- Authority
- CN
- China
- Prior art keywords
- sound
- sample
- data
- road
- data sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Abstract
The invention discloses a road sound recognition method, system, computer device and readable storage medium. The method comprises the following steps: acquiring data samples and sample categories of road sound; sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples; inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and recognizing and classifying sound data to be processed using the trained model. By taking the logarithmic Mel features of road sound data samples as the input of a convolutional recurrent network model, the method trains a model capable of recognizing multiple sounds in a complex traffic scene, which helps improve the accuracy of road traffic incident detection.
Description
Technical Field
The invention relates to the technical field of sound recognition, and in particular to a road sound recognition method, a road sound recognition system, a computer device and a readable storage medium.
Background
With the development of science and technology, intelligent transportation has gradually become an important means of road monitoring, and traffic incident detection technology has been developed to make road monitoring more intelligent. Traditional road traffic incident detection relies mainly on video detection. Video detection, however, is directional and struggles to identify events correctly under severe weather, poor lighting or lens contamination, so its detection accuracy cannot be guaranteed.
The inventors have found that road sounds carry a great deal of event information: many sounds, such as car horns, engine noise and vehicle collisions, may be present on the road simultaneously, and if road sounds could be recognized effectively, the accuracy of traffic incident detection would be greatly improved. The inventors have also found that most existing sound recognition research is limited to recognizing only the most prominent event at a given moment, such as laughter or applause, losing all other event information; this clearly does not meet the requirements of sound recognition in a complex scene such as a road.
Disclosure of Invention
In view of the above, the present invention provides a road sound recognition method, system, computer device and readable storage medium that train a model capable of recognizing multiple sounds on a road; the invention is suited to sound recognition in complex road traffic scenes and helps improve the accuracy of road traffic incident detection.
In a first aspect, the present invention provides a road sound recognition method, comprising:
acquiring data samples and sample categories of road sound;
sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In a second aspect, the present invention provides a road sound recognition system, comprising:
a sample acquisition module for acquiring data samples and sample categories of road sound;
a feature extraction module for sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
a training module for inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
a classification module for recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In a third aspect, the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring data samples and sample categories of road sound;
sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring data samples and sample categories of road sound;
sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In the road sound recognition method, system, computer device and readable storage medium described above, the method acquires data samples and sample categories of road sound; sequentially performs time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain their logarithmic Mel features; inputs the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and recognizes and classifies sound data to be processed using the trained model. By taking the logarithmic Mel features of road sound data samples as the input of a convolutional recurrent network model, the method trains a model capable of recognizing multiple sounds in a complex traffic scene, which helps improve the accuracy of road traffic incident detection.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below; those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flowchart of a road sound recognition method according to a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a road sound recognition system according to a second embodiment of the present invention;
FIG. 3 is an internal structure diagram of a computer device according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Please refer to FIG. 1, a flowchart of a road sound recognition method according to the first embodiment of the present invention. The first embodiment of the present invention provides a road sound recognition method comprising the following steps:
S1, acquiring data samples and sample categories of road sound;
S2, sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
S3, inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
S4, recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In one embodiment, acquiring the data samples of road sound comprises:
acquiring an original data sample of road sound;
and performing data enhancement on the original data sample by utilizing a random mixing technology to obtain a data sample of road sound.
Data enhancement is performed on the original data samples to increase the number of samples; more samples make the trained model more robust and stable.
In one embodiment, performing data enhancement on the original data samples using the random mixing technique to obtain data samples of road sound specifically comprises:
resampling the original data samples of road sound to obtain a plurality of sound segments; and
selecting two or more sound segments, pairing them in arbitrary ratios, and mixing them to obtain data samples of road sound.
In one specific embodiment, raw data samples of road sound may be obtained from, but are not limited to, the AudioSet dataset and real-time recordings made on the road with a microphone.
In a specific embodiment, the obtained raw data samples of road sound are processed as follows to obtain road sound data samples:
The traffic sound signal is resampled at a sampling rate of 16 kHz, with sample values encoded by 16-bit pulse-code modulation, to obtain a plurality of sound segments.
To accommodate sound segments of different durations, 10-second time windows are used as input: traffic sound signals shorter than 10 s are zero-padded and signals longer than 10 s are truncated, ensuring a duration of exactly 10 s.
The sound segments are labeled with one-hot codes.
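The padding/truncation and one-hot labeling steps above can be sketched as follows; this is a rough illustration (the 16 kHz rate and 10 s window come from this embodiment, and the function names are illustrative, not from the patent):

```python
import numpy as np

SR = 16000          # sampling rate from this embodiment (16 kHz)
CLIP_SECONDS = 10   # fixed 10-second time window

def fix_length(signal, sr=SR, seconds=CLIP_SECONDS):
    """Zero-pad signals shorter than the window, truncate longer ones."""
    target = sr * seconds
    if len(signal) < target:
        return np.pad(signal, (0, target - len(signal)))
    return signal[:target]

def one_hot(label_index, num_classes):
    """One-hot code for a sound segment's class label."""
    code = np.zeros(num_classes, dtype=np.float32)
    code[label_index] = 1.0
    return code
```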
Two sound segments y1 and y2, with labels l1 and l2 respectively, are randomly selected from the processed sound segments. A random ratio r is selected, and y1 and y2 are mixed to obtain the mixed sound mix_r(y1, y2), as given by formulas (1) and (2).
In formulas (1) and (2), G1 and G2 are the sound pressure levels of y1 and y2 respectively. The sound pressure level G of a sound file is obtained by A-weighting: a 0.1 s window is created, the time-series weighted sound levels {g1, g2, ..., gt} are computed, and G = max{g1, g2, ..., gt}. The mixed sound mix_r(y1, y2) is assigned a label derived from l1, l2 and the ratio r.
In this embodiment, the random mixing technique performs data enhancement on the original sample data, expanding the training data and improving the robustness and generalization capability of the model.
In one embodiment, step S2 comprises:
sequentially performing non-recursive filtering, slicing and Hamming-window windowing on the data samples to obtain first processed samples;
converting the frequency information in the first processed samples into Mel frequency information, sequentially inputting the Mel frequency information into a plurality of triangular filters, and outputting second processed samples;
performing a logarithmic operation on the second processed samples and stacking the results along the time axis to obtain third processed samples; and
sequentially performing first-order and second-order derivation on the third processed samples, and taking the third processed samples together with their first-order and second-order derivatives as the logarithmic Mel features of the data samples.
In a specific embodiment, the step S2 includes:
1) A first-order non-recursive filter is used to pre-emphasize the high-frequency components of the sound signal. The expression of the first-order non-recursive filter is:
H(z) = 1 - αz⁻¹ (3)
In formula (3), α is the pre-emphasis coefficient and H(z) is the filter response. Preferably, α = 0.97.
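The pre-emphasis filter of formula (3) amounts to the difference equation y[n] = x[n] − α·x[n−1]; a one-line numpy sketch (passing the first sample through unchanged is a common convention, assumed here):

```python
import numpy as np

def pre_emphasize(x, alpha=0.97):
    """Apply H(z) = 1 - alpha * z^-1, i.e. y[n] = x[n] - alpha * x[n-1];
    the first sample is passed through unchanged."""
    return np.append(x[0], x[1:] - alpha * x[:-1])
```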
2) Local information of a sound segment is extracted with a frame length of 25 ms and a frame shift of 10 ms. To reduce spectral leakage, each frame is windowed with a Hamming window function; the windowed output of each frame is:
O_t = x_t(n) * w(n) (4)
w(n) = 0.54 - 0.46·cos(2πn/(winlen - 1)), 0 ≤ n ≤ winlen - 1 (5)
In formulas (4) and (5), * denotes element-wise multiplication, x_t(n) is the sound signal of the t-th frame, w(n) is the window function, O_t is the windowed output of the t-th frame, and winlen is the window length. A 512-point fast Fourier transform is then applied to each frame to obtain the corresponding spectrum X_n(k), where n is the frame index and k is the frequency.
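The framing, windowing and FFT steps can be sketched as follows, using the 25 ms frame length, 10 ms shift and 512-point FFT from this embodiment (reading the original "frame overlapping of 10 ms" as the frame shift is an assumption):

```python
import numpy as np

def frame_spectra(x, sr=16000, frame_ms=25, shift_ms=10, n_fft=512):
    """Split x into Hamming-windowed frames and return the magnitude
    spectrum |X_n(k)| of each frame via a 512-point FFT."""
    frame_len = int(sr * frame_ms / 1000)   # 400 samples at 16 kHz
    shift = int(sr * shift_ms / 1000)       # 160 samples at 16 kHz
    w = np.hamming(frame_len)
    frames = [x[i:i + frame_len] * w
              for i in range(0, len(x) - frame_len + 1, shift)]
    return np.abs(np.fft.rfft(np.stack(frames), n=n_fft, axis=1))
```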
3) The signal frequency is converted to the Mel frequency; the conversion formula is:
Mel(f) = 2595·lg(1 + f/700) (6)
In formula (6), f is the signal frequency and Mel(f) is the Mel frequency corresponding to f.
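Formula (6) and its inverse in code form (the inverse is the standard relation, used in practice when placing the triangular filters' centre frequencies; it is implied rather than stated in the text):

```python
import math

def hz_to_mel(f):
    """Mel(f) = 2595 * lg(1 + f / 700), formula (6)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of formula (6)."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
```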
4) L triangular filters are configured on the Mel frequency scale; the output of each triangular filter is given by formulas (7) and (8).
In formulas (7) and (8), W_l(k) is the coefficient of the l-th filter, |X_n(k)| is the amplitude spectrum of the n-th frame signal, and h(l), c(l) and o(l) are the upper, center and lower frequencies of the l-th filter, respectively. Preferably, L = 64.
5) A logarithmic operation is applied to the filter outputs Y(l), and log Y(l), l = 1, 2, ..., L, are stacked along the time axis to obtain a static logarithmic Mel two-dimensional time-frequency feature. The first and second derivatives of this static feature are then computed, and the static feature together with its first and second derivatives forms a 3-channel logarithmic Mel feature, used as the input sample of the convolutional recurrent network model.
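Step 5) — stacking the static log-Mel feature with its first and second derivatives into a 3-channel input — can be sketched like this; a simple first-order difference stands in for the derivative (common toolkits instead use a regression window, so this is an approximation):

```python
import numpy as np

def delta(feat):
    """First-order difference along the time axis (axis 0),
    padded so the output keeps the input's shape."""
    return np.diff(feat, axis=0, prepend=feat[:1])

def three_channel_logmel(log_mel):
    """Stack static log-Mel, delta and delta-delta as 3 channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)
    return np.stack([log_mel, d1, d2], axis=-1)   # (time, mel_bins, 3)
```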
For a sound sample, the raw sound data is high-dimensional, expensive to train on and prone to overfitting, so feature extraction is needed; using the extracted features as input samples of the convolutional recurrent network model both improves accuracy and reduces the complexity of upstream data processing. In this embodiment, the logarithmic Mel features of the data samples are extracted. Logarithmic Mel features can be computed frame by frame, capture the instantaneous dynamics of the sound source, and map frequency responses in a way similar to human auditory perception, so they stay close to the original data samples; when the convolutional recurrent network model is trained on them, differences between sample categories are better reflected and road sounds can be recognized and classified more accurately.
In one embodiment, the convolutional recurrent network model comprises a gated convolutional network layer, a recurrent network layer, a time-distributed fully connected layer and a classification output layer arranged in sequence; the gated convolutional network has 4 layers and the recurrent network has 2 layers.
In selecting the number of gated convolutional layers and recurrent layers, the inventors found through repeated tests that a convolutional recurrent network model with 4 gated convolutional layers and 2 recurrent layers performs best on road sound recognition.
In a specific embodiment, the process of building the convolutional recurrent network model comprises:
1) The three-channel logarithmic Mel features are taken as the input samples of the convolutional recurrent network model. The input samples are divided into 10 equally sized subsets, denoted S1, S2, ..., S10; each Si (i = 1, 2, ..., 10) is used in turn as the test set, with the remaining 9 subsets as the training set.
2) The convolutional recurrent network model is built in software, comprising a gated convolutional network layer, a recurrent network layer, a time-distributed fully connected layer and a classification output layer arranged in sequence.
3) The input samples are fed into the convolutional recurrent network model and supervised learning is performed to obtain the parameters of each layer of the trained model. During training, the convolution kernels and weights are randomly initialized using a random distribution function, and the learning rate is adaptively and dynamically adjusted: the initial learning rate is set to 0.01 and the minimum learning rate to 10⁻⁹, and if the test-set accuracy does not change for 20 training epochs, the learning rate is reduced by a factor of 10. The model is trained by back-propagation using a binary cross-entropy loss function and an adaptive moment estimation optimizer, and training stops when there is no change for 50 training epochs or the limit error of the cost function falls below 0.01.
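The learning-rate schedule and stopping rule just described (initial rate 0.01, floor 10⁻⁹, divide by 10 after 20 epochs without test-set improvement, stop after 50) can be sketched framework-independently; the class name and the exact "no improvement" criterion are illustrative assumptions:

```python
class PlateauSchedule:
    """Reduce-on-plateau learning-rate schedule with early stopping,
    following the training procedure described above."""

    def __init__(self, lr=0.01, min_lr=1e-9, reduce_after=20, stop_after=50):
        self.lr = lr
        self.min_lr = min_lr
        self.reduce_after = reduce_after
        self.stop_after = stop_after
        self.best = float("-inf")
        self.stale = 0

    def step(self, test_accuracy):
        """Record one epoch's test accuracy; return True when training
        should stop (no improvement for `stop_after` epochs)."""
        if test_accuracy > self.best:
            self.best = test_accuracy
            self.stale = 0
            return False
        self.stale += 1
        if self.stale % self.reduce_after == 0:
            self.lr = max(self.lr / 10.0, self.min_lr)
        return self.stale >= self.stop_after
```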
4) The convolutional recurrent network model is tested as follows: the test-set samples are input into the trained model, the model's output is compared with the sample categories corresponding to the test-set samples, and the accuracy is calculated to evaluate the model.
Further, the training principle of the convolutional recurrent network model is as follows. The convolutional layers in the gated convolutional network act as feature extractors; the numbers of convolution kernels in the 4 gated convolutional layers are 64, 128 and 128 in sequence, each of size 5×5, and higher-level feature maps corresponding to the number of kernels are computed from the convolution responses and a GLU excitation function. A batch normalization layer is introduced after each convolutional layer to reduce internal covariate shift and accelerate training. Max pooling is used to reduce the dimensionality of the feature data and provide more frequency invariance; to preserve the temporal integrity of sound events, pooling is performed only along the frequency axis, with the first three pooling layers of size 1×2 and the last of size 1×4. The output vectors of the fourth gated convolutional block pass through a time-distributed fully connected layer; the feature maps are stacked along the frequency direction and input to the bidirectional GRU recurrent network layer, whose update-gate and reset-gate units learn the contextual information of the features. The output vectors then pass through a time-distributed fully connected layer with 500 nodes and a rectified linear (ReLU) excitation function into a sigmoid classifier, yielding per-frame recognition results for 9 target events. After weighted averaging, the recognition results for the target events are binarized one by one with a set of thresholds, achieving recognition and classification of road sounds.
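The GLU excitation mentioned above gates one convolution branch with the sigmoid of another; a minimal numpy rendering (the two-branch split is the standard GLU formulation and is assumed here, since the patent does not spell it out):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(linear_branch, gate_branch):
    """Gated linear unit: GLU(a, b) = a * sigmoid(b); the gate branch
    modulates the linear branch element-wise."""
    return linear_branch * sigmoid(gate_branch)
```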
In one embodiment, the sample categories include at least two of: an alarm sound, a horn sound, a vehicle running sound, a braking sound, an explosion sound, a call-for-help sound, a door-closing sound, a collision sound, and a rain sound.
In one embodiment, step S4 is specifically:
recognizing the logarithmic Mel features of the sound data to be processed using the trained convolutional recurrent network model, thereby classifying the sound data to be processed.
In the road sound recognition method, data samples and sample categories of road sound are acquired; time-frequency decomposition, frequency conversion, logarithmic operation and derivation are performed sequentially on the data samples to obtain their logarithmic Mel features; the logarithmic Mel features, together with the sample categories, are input into a convolutional recurrent network model for training until the model meets a preset training end condition; and sound data to be processed is recognized and classified using the trained model. By taking the logarithmic Mel features of road sound data samples as the input of a convolutional recurrent network model, the method trains a model capable of recognizing multiple sounds in a complex traffic scene, which helps improve the accuracy of road traffic incident detection.
Please refer to FIG. 2, a schematic structural diagram of a road sound recognition system according to the second embodiment. The second embodiment of the present invention provides a road sound recognition system, comprising:
the sample acquisition module 1, for acquiring data samples and sample categories of road sound;
the feature extraction module 2, for sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain the logarithmic Mel features of the data samples;
the training module 3, for inputting the logarithmic Mel features, together with the sample categories, into a convolutional recurrent network model for training until the model meets a preset training end condition; and
the classification module 4, for recognizing and classifying sound data to be processed using the trained convolutional recurrent network model.
In one embodiment, the sample acquisition module 1 comprises:
an original sample acquisition unit for acquiring original data samples of road sound; and
a data enhancement unit for performing data enhancement on the original data samples using a random mixing technique to obtain data samples of road sound.
In one embodiment, the data enhancement unit specifically comprises:
a resampling subunit for resampling the original data samples of road sound to obtain a plurality of sound segments; and
a mixing subunit for selecting two or more sound segments, pairing them in arbitrary ratios, and mixing them to obtain data samples of road sound.
In one embodiment, the feature extraction module 2 comprises:
a data processing unit for sequentially performing non-recursive filtering, slicing and Hamming-window windowing on the data samples to obtain first processed samples;
a frequency conversion unit for converting the frequency information in the first processed samples into Mel frequency information, sequentially inputting the Mel frequency information into a plurality of triangular filters, and outputting second processed samples;
a logarithmic operation unit for performing a logarithmic operation on the second processed samples and stacking the results along the time axis to obtain third processed samples; and
a derivation unit for sequentially performing first-order and second-order derivation on the third processed samples, and taking the third processed samples together with their first-order and second-order derivatives as the logarithmic Mel features of the data samples.
In one embodiment, the convolutional recurrent network model comprises a gated convolutional network layer, a recurrent network layer, a time-distributed fully connected layer and a classification output layer arranged in sequence; the gated convolutional network has 4 layers and the recurrent network has 2 layers.
In one embodiment, the sample categories include at least two of: an alarm sound, a horn sound, a vehicle running sound, a braking sound, an explosion sound, a call-for-help sound, a door-closing sound, a collision sound, and a rain sound.
In one embodiment, the classification module 4 is specifically configured to:
recognize the logarithmic Mel features of the sound data to be processed using the trained convolutional recurrent network model, thereby classifying the sound data to be processed.
It should be noted that the road sound recognition system provided in this embodiment of the present invention executes all the method steps of the road sound recognition method of the first embodiment; its working principle and beneficial effects correspond one-to-one with those of the method and are therefore not repeated here.
Please refer to fig. 3, which is an internal structure diagram of a computer device according to a third embodiment. The third embodiment of the present invention provides a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; the processor implements the steps of the road sound recognition method of the first embodiment when executing the computer program.
The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor is configured to provide computing and control capabilities, the memory includes a nonvolatile storage medium and an internal memory, the nonvolatile storage medium stores an operating system, a computer program, and a database, the internal memory provides an environment for the operating system and the computer program in the nonvolatile storage medium to run, and when the computer program is executed by the processor, the road sound recognition method according to the first embodiment is implemented.
Those skilled in the art will appreciate that the architecture shown in fig. 3 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
A fourth embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the steps of the road sound identification method of the first embodiment described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (7)
1. A road voice recognition method, comprising:
acquiring a data sample and a sample category of road sound, and specifically comprising the following steps:
acquiring an original data sample of road sound;
carrying out data enhancement on the original data sample by utilizing a random mixing technology to obtain a data sample of road sound;
the data enhancement of the original data sample by using the random mixing technology to obtain the data sample of the road sound specifically comprises:
resampling the original data sample of the road sound to obtain a plurality of sound segments;
selecting two or more sound segments, pairing them at an arbitrary ratio, and mixing them to obtain a data sample of the road sound and a label of the data sample, wherein the concrete steps of obtaining the data sample of the mixed sound and the label of the data sample of the mixed sound are as follows:
randomly selecting two sound segments y1 and y2 from the processed sound segments, with labels l1 and l2 respectively;
selecting a random ratio r, mixing y1 and y2 to obtain the mixed sound mix_r(y1, y2), and labeling the mixed sound mix_r(y1, y2) with the mixed label; the formula is as follows:
wherein G1 and G2 are the sound pressure levels of the sound segments y1 and y2, respectively;
sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data sample to obtain logarithmic Mel characteristics of the data sample;
inputting the logarithmic Mel features into a convolution cycle network model according to the sample class for training until the convolution cycle network model meets a preset training end condition; the convolution cycle network model comprises a gated convolutional network layer, a cyclic network layer, a time-distributed fully connected layer and a classification output layer arranged in sequence; the gated convolutional network layer uses a GLU function as an excitation function and has 4 layers, the convolution kernels in the 4 gated convolutional network layers are 64, 128 and 128 in sequence with a kernel size of 5×5; the cyclic network layer is a bidirectional GRU network with 2 layers; and the classification output layer adopts a sigmoid classifier;
and identifying and classifying the sound data to be processed by utilizing the trained convolution cycle network model.
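The random-mixing augmentation in the claim can be sketched as follows. The patent's own mixing formula is not reproduced in this text, so the sketch uses the common "between-class" mixing recipe, which blends two clips by a random ratio r after compensating for their sound pressure levels G1 and G2, and mixes the labels with the same ratio; treat the exact weighting below as an assumption:

```python
import numpy as np

def sound_pressure_db(y):
    """RMS sound pressure level of a segment, in dB (assumed definition)."""
    return 20.0 * np.log10(np.sqrt(np.mean(y ** 2)) + 1e-12)

def mix_segments(y1, y2, l1, l2, r):
    """Mix two sound segments at random ratio r (0 < r < 1), compensating
    for their loudness difference, and mix their labels with the same r.
    This is the standard between-class mixing recipe, not the patent's
    verbatim formula."""
    g1, g2 = sound_pressure_db(y1), sound_pressure_db(y2)
    # weight that realises ratio r after loudness compensation
    p = 1.0 / (1.0 + 10.0 ** ((g1 - g2) / 20.0) * (1.0 - r) / r)
    # normalise so the mix keeps roughly unit energy
    mixed = (p * y1 + (1.0 - p) * y2) / np.sqrt(p ** 2 + (1.0 - p) ** 2)
    label = r * np.asarray(l1, dtype=float) + (1.0 - r) * np.asarray(l2, dtype=float)
    return mixed, label

rng = np.random.default_rng(1)
y1, y2 = rng.standard_normal(1000), 0.5 * rng.standard_normal(1000)
mixed, label = mix_segments(y1, y2, [1, 0], [0, 1], 0.3)
```

Each mixed clip thus carries a soft two-class label, which is what allows the sigmoid output layer to be trained on overlapping road-sound events.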
2. The method of claim 1, wherein the sequentially performing time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data samples to obtain logarithmic mel features of the data samples comprises:
sequentially carrying out non-recursive filtering, slicing, and Hamming-window windowing processing on the data sample to obtain a first data processing sample;
converting the frequency information in the first data processing sample into Mel frequency information, sequentially inputting the Mel frequency information into a plurality of triangular filters, and outputting a second data processing sample;
carrying out a logarithmic operation on the second data processing sample, and stacking the results of the logarithmic operation along a time axis to obtain a third data processing sample;
and sequentially carrying out first-order derivation and second-order derivation on the third data processing sample, and taking the third data processing sample, its first-order derivative and its second-order derivative as the logarithmic Mel feature of the data sample.
3. The road sound identification method according to claim 1, wherein the sample categories include at least two of an alarm sound, a whistling sound, a vehicle running sound, a brake sound, an explosion sound, a person calling for help sound, a door closing sound, a collision sound, and a rain sound.
4. The road voice recognition method according to claim 1, wherein the voice data to be processed is recognized and classified by using the trained convolutional recurrent network model, specifically:
and identifying the logarithmic Mel characteristics of the sound data to be processed by using the trained convolution cycle network model so as to realize the classification of the sound data to be processed.
5. A road voice recognition system, comprising:
the sample acquisition module is used for acquiring data samples and sample categories of road sounds, and specifically comprises the following steps:
acquiring an original data sample of road sound;
carrying out data enhancement on the original data sample by utilizing a random mixing technology to obtain a data sample of road sound;
the data enhancement of the original data sample by using the random mixing technology to obtain the data sample of the road sound specifically comprises:
resampling the original data sample of the road sound to obtain a plurality of sound segments;
selecting two or more sound segments, pairing them at an arbitrary ratio, and mixing them to obtain a data sample of the road sound and a label of the data sample, wherein the concrete steps of obtaining the data sample of the mixed sound and the label of the data sample of the mixed sound are as follows:
randomly selecting two sound segments y1 and y2 from the processed sound segments, with labels l1 and l2 respectively;
selecting a random ratio r, mixing y1 and y2 to obtain the mixed sound mix_r(y1, y2), and labeling the mixed sound mix_r(y1, y2) with the mixed label; the formula is as follows:
wherein G1 and G2 are the sound pressure levels of the sound segments y1 and y2, respectively;
the characteristic extraction module is used for sequentially carrying out time-frequency decomposition, frequency conversion, logarithmic operation and derivation on the data sample to obtain logarithmic Mel characteristics of the data sample;
the training module is used for inputting the logarithmic Mel features into a convolution cycle network model according to the sample classes for training until the convolution cycle network model meets a preset training end condition; the convolution cycle network model comprises a gated convolutional network layer, a cyclic network layer, a time-distributed fully connected layer and a classification output layer arranged in sequence; the gated convolutional network layer uses a GLU function as an excitation function and has 4 layers, the convolution kernels in the 4 gated convolutional network layers are 64, 128 and 128 in sequence with a kernel size of 5×5; the cyclic network layer is a bidirectional GRU network with 2 layers; and the classification output layer adopts a sigmoid classifier;
and the classification module is used for identifying and classifying the sound data to be processed by utilizing the trained convolution cycle network model.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 4 are implemented when the computer program is executed by the processor.
7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910436946.5A CN110176248B (en) | 2019-05-23 | 2019-05-23 | Road voice recognition method, system, computer device and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110176248A CN110176248A (en) | 2019-08-27 |
CN110176248B true CN110176248B (en) | 2020-12-22 |
Family
ID=67691960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910436946.5A Active CN110176248B (en) | 2019-05-23 | 2019-05-23 | Road voice recognition method, system, computer device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110176248B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110718235B (en) * | 2019-09-20 | 2022-07-01 | 精锐视觉智能科技(深圳)有限公司 | Abnormal sound detection method, electronic device and storage medium |
CN113362851A (en) * | 2020-03-06 | 2021-09-07 | 上海其高电子科技有限公司 | Traffic scene sound classification method and system based on deep learning |
CN111445926B (en) * | 2020-04-01 | 2023-01-03 | 杭州叙简科技股份有限公司 | Rural road traffic accident warning condition identification method based on sound |
CN111785300B (en) * | 2020-06-12 | 2021-05-25 | 北京快鱼电子股份公司 | Crying detection method and system based on deep neural network |
CN112309405A (en) * | 2020-10-29 | 2021-02-02 | 平安科技(深圳)有限公司 | Method and device for detecting multiple sound events, computer equipment and storage medium |
CN112767961B (en) * | 2021-02-07 | 2022-06-03 | 哈尔滨琦音科技有限公司 | Accent correction method based on cloud computing |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109087655A (en) * | 2018-07-30 | 2018-12-25 | 桂林电子科技大学 | A kind of monitoring of traffic route sound and exceptional sound recognition system |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3895657B2 (en) * | 2002-10-01 | 2007-03-22 | 三菱電機エンジニアリング株式会社 | Accident sound detection circuit |
CN100507971C (en) * | 2007-10-31 | 2009-07-01 | 北京航空航天大学 | Independent component analysis based automobile sound identification method |
CN101980336B (en) * | 2010-10-18 | 2012-01-11 | 福州星网视易信息系统有限公司 | Hidden Markov model-based vehicle sound identification method |
KR101768145B1 (en) * | 2016-04-21 | 2017-08-14 | 현대자동차주식회사 | Method for providing sound detection information, apparatus detecting sound around vehicle, and vehicle including the same |
US10276187B2 (en) * | 2016-10-19 | 2019-04-30 | Ford Global Technologies, Llc | Vehicle ambient audio classification via neural network machine learning |
CN106846803B (en) * | 2017-02-08 | 2023-06-23 | 广西交通科学研究院有限公司 | Traffic event detection device and method based on audio frequency |
CN106910495A (en) * | 2017-04-26 | 2017-06-30 | 中国科学院微电子研究所 | A kind of audio classification system and method for being applied to abnormal sound detection |
CN107545890A (en) * | 2017-08-31 | 2018-01-05 | 桂林电子科技大学 | A kind of sound event recognition method |
US10629081B2 (en) * | 2017-11-02 | 2020-04-21 | Ford Global Technologies, Llc | Accelerometer-based external sound monitoring for backup assistance in a vehicle |
CN108231067A (en) * | 2018-01-13 | 2018-06-29 | 福州大学 | Sound scenery recognition methods based on convolutional neural networks and random forest classification |
CN109087635A (en) * | 2018-08-30 | 2018-12-25 | 湖北工业大学 | A kind of speech-sound intelligent classification method and system |
CN109346103B (en) * | 2018-10-30 | 2023-03-28 | 交通运输部公路科学研究所 | Audio detection method for road tunnel traffic incident |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| CB02 | Change of applicant information | Address after: 530001 No. 6, Gaoxin 2nd Road, Xixiangtang District, Nanning City, Guangxi Zhuang Autonomous Region; Applicant after: Guangxi Jiaoke Group Co.,Ltd. Address before: 530000 No. 6, Gaoxin 2nd Road, Xixiangtang, Nanning, Guangxi Zhuang Autonomous Region; Applicant before: GUANGXI TRANSPORTATION RESEARCH & CONSULTING Co.,Ltd. |
| GR01 | Patent grant | |