WO2023216172A1 - Poultry voiceprint recognition method and system - Google Patents

Poultry voiceprint recognition method and system

Info

Publication number
WO2023216172A1
WO2023216172A1 PCT/CN2022/092354 CN2022092354W WO2023216172A1 WO 2023216172 A1 WO2023216172 A1 WO 2023216172A1 CN 2022092354 W CN2022092354 W CN 2022092354W WO 2023216172 A1 WO2023216172 A1 WO 2023216172A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
poultry
state
recording information
voiceprint identification
Prior art date
Application number
PCT/CN2022/092354
Other languages
French (fr)
Chinese (zh)
Inventor
林玠佑
张光甫
黄醴万
Original Assignee
智逐科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 智逐科技股份有限公司 filed Critical 智逐科技股份有限公司
Priority to PCT/CN2022/092354 priority Critical patent/WO2023216172A1/en
Publication of WO2023216172A1 publication Critical patent/WO2023216172A1/en

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions

Definitions

  • This case relates to a voiceprint identification method and system, and in particular, to a poultry voiceprint identification method and system.
  • This case proposes a poultry voiceprint identification method and system that can mitigate the aforementioned problems.
  • A poultry voiceprint identification method is proposed, including the following steps: receiving recording information of a poultry house over a time period; and analyzing the recording information to determine a sound state of the recording information.
  • The sound state includes a normal poultry sound state or an abnormal poultry sound state.
  • The poultry voiceprint identification method further includes the following step: converting the recording information into several sound features, where the converting step includes: filtering the recording information to generate filtered recording information having a specific frequency range; dividing the filtered recording information into several pieces of sound information; and extracting the several sound features from the several pieces of sound information. The step of analyzing the recording information to determine its sound state then analyzes each of the several sound features to determine the sound state of each feature.
  • the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by bandpass filtering and spectral subtraction.
  • The step of extracting the several sound features from the several pieces of sound information is implemented using openSMILE, wavelet analysis, or the short-time Fourier transform.
  • The poultry voiceprint identification method further includes the following steps: storing the sound state of the recording information in a database through a network; and allowing a user to view, through the network, the sound state in the database.
  • The step of analyzing each of the several sound features to determine the sound state of each feature is performed via an artificial intelligence sound model.
  • The artificial intelligence sound model generates a training set based on the several sound features, and the training set includes an identification condition recording the normal poultry sound state and the abnormal poultry sound state.
  • The artificial intelligence sound model first classifies each of the several sound features as a poultry sound state or a non-poultry sound state via a support vector machine; for features that belong to the poultry sound state, the artificial intelligence sound model then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.
  • A poultry voiceprint recognition system is proposed, installed on a server connected to a network, including: a receiver, disposed in a poultry house, used to receive recording information of the poultry house over a time period; and a feature analysis module, used to analyze the recording information to determine a sound state of the recording information, where the sound state includes a normal poultry sound state or an abnormal poultry sound state.
  • The poultry voiceprint recognition system further includes a feature processing module used to convert the recording information into several sound features. The feature processing module further includes: a filtering unit, used to filter the recording information to generate filtered recording information with a specific frequency range; a segmentation unit, used to divide the filtered recording information into several pieces of sound information; and an extraction unit, used to extract the several sound features from the several pieces of sound information. In the step of analyzing the recording information to determine its sound state, the feature analysis module analyzes each of the several sound features to determine the sound state of each feature.
  • The filtering unit is further configured so that the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by bandpass filtering and spectral subtraction.
  • The extraction unit is further configured to extract the several sound features from the several pieces of sound information using openSMILE, wavelet analysis, or the short-time Fourier transform.
  • The poultry voiceprint recognition system further includes: a database, used to store the sound state of the recording information through the network; and a user interface, used to let a user view, through the network, the sound state in the database.
  • The feature analysis module is further configured to determine the sound state of each of the several sound features via an artificial intelligence sound model; the artificial intelligence sound model generates a training set based on the several sound features and uses the training set to train the feature analysis module.
  • The training set includes an identification condition recording the normal poultry sound state and the abnormal poultry sound state.
  • The artificial intelligence sound model first classifies each of the several sound features as a poultry sound state or a non-poultry sound state via a support vector machine; for features that belong to the poultry sound state, the artificial intelligence sound model then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.
  • In this way, abnormal poultry sounds can be identified among the many sounds present in large-scale, intensively farmed poultry houses.
  • In addition, the occurrence of abnormal poultry sounds can be identified accurately and quickly by means of artificial intelligence, reducing the manpower required and enabling abnormal poultry sounds to be detected early so that countermeasures can be taken as soon as possible.
  • Figure 1 is a schematic diagram of a poultry voiceprint recognition system according to an embodiment of the present invention.
  • Figure 2 is a schematic diagram of a poultry voiceprint identification method according to an embodiment of the present case.
  • Figure 3 is a schematic diagram of a poultry voiceprint identification method according to another embodiment of the present invention.
  • Figure 4A shows a waveform diagram of original recording information of poultry according to an embodiment of the present invention.
  • Figure 4B shows a time-frequency diagram of original recording information of poultry according to an embodiment of the present invention.
  • Figure 5A shows a waveform diagram of the original recording information of poultry after band-pass filtering according to an embodiment of the present invention.
  • Figure 5B shows a time-frequency diagram of the original recording information of poultry after band-pass filtering according to an embodiment of the present invention.
  • Figure 6A shows a waveform diagram of the original recording information of poultry after spectral subtraction according to an embodiment of the present invention.
  • Figure 6B shows a time-frequency diagram of the original poultry recording information after spectral subtraction according to an embodiment of the present case.
  • Figure 7A shows a time-frequency diagram of poultry sounds in the original recording information according to an embodiment of the present invention.
  • FIG. 7B shows a time-frequency diagram of poultry sounds in the filtered recording information after band-pass filtering according to an embodiment of the present invention.
  • Figure 7C shows a time-frequency diagram of poultry sounds in the filtered recording information after spectral subtraction according to an embodiment of the present invention.
  • FIG. 8A shows a waveform diagram of poultry sounds in the filtered recording information after band-pass filtering and spectral subtraction processing according to an embodiment of the present invention.
  • Figure 8B shows a time-frequency diagram of poultry sounds in the filtered recording information after band-pass filtering and spectral subtraction processing according to an embodiment of the present invention.
  • Figure 9 is a schematic diagram of a portion of voice activity intercepted by VAD according to an embodiment of the present invention.
  • Figure 10A shows a sample of normal poultry sounds of native chickens according to an embodiment of the present invention.
  • Figure 10B shows a sample of abnormal poultry sounds of native chickens according to an embodiment of the present invention.
  • Figure 11A shows a sample of normal poultry sounds of laying hens according to an embodiment of the present invention.
  • Figure 11B shows a sample of abnormal poultry sounds of laying hens according to an embodiment of the present invention.
  • Figure 12 is a schematic diagram of a classification diagram of a support vector machine according to an embodiment of this case.
  • Figure 13 shows the temperature and humidity information in the poultry house at each day of age according to an embodiment of the present invention.
  • Figure 14 shows the prediction results of the number of sound data for the whole day at each day of age according to an embodiment of the present case.
  • Figure 15 shows the prediction results of the number of abnormal poultry sounds for the whole day at each day of age according to an embodiment of the present case.
  • Figure 16 shows the proportion of each category in the prediction results for the day according to an embodiment of the present case.
  • Figure 17A shows the prediction results of the number of sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present case.
  • Figure 17B shows the prediction results of the number of sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of the present case.
  • Figure 17C shows the prediction results of the number of sound data from 10 p.m. to 6 a.m. the next day according to an embodiment of the present case.
  • Figure 18A shows the prediction results of the number of abnormal poultry sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present case.
  • Figure 18B shows the prediction results of the number of abnormal poultry sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of the present case.
  • Figure 18C shows the prediction results of the number of abnormal poultry sound data from 10 p.m. to 6 a.m. the next day at each day of age according to an embodiment of the present case.
  • FIG. 1 illustrates a schematic diagram of a poultry voiceprint recognition system 100 according to an embodiment of the present invention.
  • the poultry voiceprint recognition system 100 is installed on a server (not shown) connected to a network (not shown).
  • the poultry voiceprint identification system 100 includes a receiver 110 and a processor 115.
  • the processor 115 includes a feature processing module 120 and a feature analysis module 130 .
  • The receiver 110 is disposed in a poultry house and is used for receiving recording information I_R of the poultry house over a time period.
  • The feature processing module 120 is used to convert the recording information I_R into several sound features C_S.
  • The feature analysis module 130 is used to analyze the recording information I_R to determine the sound state S_S of the recording information I_R, or to analyze each of the several sound features C_S to determine the sound state S_S of each feature.
  • The sound state S_S includes a normal poultry sound state and/or an abnormal poultry sound state.
  • The receiver 110 and the processor 115 may be integrated into a device (not shown; for example, a single-board computer such as a Raspberry Pi) disposed in the poultry house.
  • the feature processing module 120 further includes a filtering unit 121, a segmentation unit 122 and an extraction unit 123.
  • The filtering unit 121 is used to filter the recording information I_R to generate filtered recording information I_FR having a specific frequency range.
  • The segmentation unit 122 is used to divide the filtered recording information I_FR into several pieces of sound information I_S.
  • The extraction unit 123 is used to extract several sound features C_S from the several pieces of sound information I_S.
  • The poultry voiceprint recognition system 100 further includes a database 140 and a user interface 150.
  • The database 140 is used to store, through the network, the sound state S_S of the recording information I_R (or the sound state S_S of each of the several sound features C_S).
  • The user interface 150 is used to let a user view, through the network, the sound state S_S in the database 140.
  • The feature analysis module 130 is further configured to make the determination, in the step of analyzing each of the several sound features C_S to determine the sound state S_S of each feature, via an artificial intelligence sound model 160.
  • The artificial intelligence sound model 160 generates a training set T based on the several sound features C_S and uses the training set T to train the feature analysis module 130.
  • The training set T includes an identification condition that records the normal poultry sound state and the abnormal poultry sound state.
  • The artificial intelligence sound model 160 first classifies each of the several sound features C_S as a poultry sound state or a non-poultry sound state via a support vector machine; for features that belong to the poultry sound state, the artificial intelligence sound model 160 then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.
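The following is a minimal, illustrative sketch of how the stages described above (filtering, segmentation, feature extraction, and classification) could be chained together in software. It is not the patent's implementation: every function name is hypothetical, and each stage is a trivial stand-in for the techniques detailed later in this description.

```python
# Illustrative end-to-end pipeline for the system described above.
# All names are hypothetical; each stage is a trivial stand-in for the real
# band-pass/spectral-subtraction filtering, VAD segmentation, feature
# extraction, and SVM classification discussed further below.
import numpy as np


def filter_recording(x: np.ndarray, sr: int) -> np.ndarray:
    # Stand-in for step S321 (band-pass filtering + spectral subtraction).
    return x - np.mean(x)


def split_recording(x: np.ndarray, sr: int, clip_s: float = 2.0) -> list:
    # Stand-in for step S322 (VAD-based segmentation into short clips).
    n = int(clip_s * sr)
    return [x[i:i + n] for i in range(0, len(x), n) if len(x[i:i + n]) == n]


def extract_features(clip: np.ndarray, sr: int) -> np.ndarray:
    # Stand-in for step S323 (openSMILE / MFCC feature extraction).
    return np.array([clip.mean(), clip.std(), np.abs(np.diff(np.sign(clip))).mean()])


def analyze(features: list, model) -> list:
    # Stand-in for step S340: a trained classifier labels each feature vector.
    return [model.predict(f.reshape(1, -1))[0] for f in features]
```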
  • FIG. 2 illustrates a schematic diagram of a poultry voiceprint identification method 200 according to an embodiment of the present case, which at least includes steps S210 and S220, as detailed below.
  • In step S210, referring also to FIG. 1, the receiver 110 receives recording information I_R of a poultry house over a time period.
  • In step S220, the feature analysis module 130 analyzes the recording information I_R to determine its sound state S_S.
  • FIG. 3 illustrates a schematic diagram of a poultry voiceprint identification method 300 according to another embodiment of the present invention, which at least includes steps S310, S320, S330, S340, S350 and S360, as detailed below.
  • In step S310, the receiver 110 receives recording information I_R of a poultry house over a time period.
  • the receiver 110 is at least one of an omnidirectional microphone and a directional microphone.
  • receiver 110 is preferably a directional microphone.
  • As shown in FIGS. 4A to 4B, the recording information I_R received by the receiver 110 may include a waveform diagram or a time-frequency diagram.
  • As shown in FIG. 4A, because of noise interference, the waveform of the poultry-sound portion of the recording information I_R is not distinct in the waveform diagram.
  • As shown in FIG. 4B, the time-frequency diagram of the recording information I_R includes background noise whose frequency is close to that of the poultry sounds.
  • The background noise includes, but is not limited to, the sound of running fans and the sound of cars and rain outside the poultry house. These background noises cause errors in the subsequent feature-value calculations and affect the classification results, so the recording information I_R needs to be filtered before analysis.
  • In step S320, the feature processing module 120 converts the recording information I_R into several sound features C_S.
  • Step S320 also includes step S321, step S322 and step S323.
  • the feature processing module 120 includes a filtering unit 121, a segmentation unit 122 and an extraction unit 123.
  • In step S321, the filtering unit 121 filters the recording information I_R to generate filtered recording information I_FR having a specific frequency range.
  • In some embodiments, the filtered recording information I_FR has a specific frequency range between about 500 Hz and about 5 kHz.
  • In some embodiments, methods for implementing step S321 include, but are not limited to, filters, spectral subtraction, spectral gating, noise gating, multi-microphone noise reduction, or other methods that can improve the signal-to-noise ratio. The filters include, but are not limited to, adaptive filters, finite impulse response (FIR) filters, or infinite impulse response (IIR) filters, where the FIR or IIR filter is, for example, a high-pass filter, a band-pass filter, a low-pass filter, or a band-stop filter.
  • the filtering unit 121 includes, for example, a Butterworth filter, and performs step S321 with bandpass filtering.
  • The amplitude gain of an N-order Butterworth low-pass filter can be expressed as formula (1) below, where H_a is the transfer function, N is the order of the filter, ω is the angular frequency of the signal, and ω_c is the cut-off frequency at which the amplitude drops by 3 dB.
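Formula (1) itself did not survive this text extraction. Based on the variables defined above, the expression is presumably the standard squared magnitude response of an N-order Butterworth low-pass filter:

```latex
% Assumed form of formula (1): N-order Butterworth low-pass magnitude response
\left| H_a(j\omega) \right|^2 = \frac{1}{1 + \left( \omega / \omega_c \right)^{2N}}
```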
  • The higher the order of the filter, the faster its amplitude attenuates in the stop band and the better the filtering effect. Frequencies below ω_c are passed, while frequencies above ω_c are suppressed.
  • In some embodiments, step S321 is performed by spectral subtraction, as shown in FIGS. 6A to 6B.
  • Spectral subtraction is based on a simple assumption: the noise in the noisy signal is purely additive. By subtracting the spectrum of the noise from the spectrum of the noisy signal, a relatively clean speech spectrum can be obtained.
  • the signal model in the time domain can be expressed as the following formula (2), where y(m) is the noisy signal, x(m) is the additive noise, d(m) is the pure speech signal, and m is time.
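Formula (2) is likewise missing from the extracted text. Using the variable roles given above (y(m) the noisy signal, d(m) the pure speech, x(m) the additive noise), the time-domain model is presumably the standard additive one, and the corresponding magnitude-spectrum subtraction used to recover the clean spectrum is shown alongside it for context:

```latex
% Assumed form of formula (2) and the standard spectral-subtraction estimate
y(m) = d(m) + x(m), \qquad
|\hat{D}(\omega)| = \max\bigl(|Y(\omega)| - |\hat{X}(\omega)|,\ 0\bigr)
```

Here X̂(ω) denotes the noise spectrum estimated from frames judged to contain no poultry sound; the subtraction estimate illustrates the general method rather than the patent's exact formula.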
  • The background noise after spectral subtraction (shown in FIG. 6B) is suppressed to a much greater extent than after band-pass filtering (shown in FIG. 5B). Compared with band-pass filtering, spectral subtraction can therefore effectively remove background noise whose frequency is close to that of the poultry sounds (that is, noise within the passband). However, as FIG. 6B also shows, spectral subtraction does not suppress low-frequency noise as effectively as band-pass filtering. In addition, referring to FIGS. 7A to 7C, when the signal-to-noise ratio is too low, spectral subtraction may distort the sound after filtering (as shown in FIG. 7C).
  • In some embodiments, step S321 is performed using both band-pass filtering and spectral subtraction. Referring to FIGS. 8A to 8B, most of the high- and low-frequency noise is first suppressed by band-pass filtering, and spectral subtraction is then applied to suppress the remaining background noise within the passband; this also reduces the distortion of poultry sounds that spectral subtraction alone may cause (as shown in FIG. 8B).
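As a concrete illustration of this two-stage cleanup (band-pass filtering to roughly the 500 Hz-5 kHz range, then spectral subtraction inside the passband), the following is a minimal SciPy sketch. It assumes a mono signal and estimates the noise spectrum from the first second of the recording; both choices are assumptions of this sketch, not details stated in the patent.

```python
# Minimal sketch: 500 Hz-5 kHz Butterworth band-pass followed by magnitude
# spectral subtraction with the noise spectrum estimated from the first second.
import numpy as np
from scipy.signal import butter, sosfiltfilt, stft, istft


def bandpass(x, sr, lo=500.0, hi=5000.0, order=4):
    sos = butter(order, [lo, hi], btype="bandpass", fs=sr, output="sos")
    return sosfiltfilt(sos, x)


def spectral_subtraction(x, sr, noise_seconds=1.0):
    f, t, Y = stft(x, fs=sr, nperseg=512)
    mag, phase = np.abs(Y), np.angle(Y)
    noise_mag = mag[:, t <= noise_seconds].mean(axis=1, keepdims=True)
    clean_mag = np.maximum(mag - noise_mag, 0.0)          # floor at zero
    _, x_clean = istft(clean_mag * np.exp(1j * phase), fs=sr, nperseg=512)
    return x_clean


def denoise(x, sr):
    # Band-pass first to remove high/low-frequency noise, then subtract the
    # residual in-band background, mirroring the order described above.
    return spectral_subtraction(bandpass(x, sr), sr)
```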
  • In step S322, the segmentation unit 122 divides the filtered recording information I_FR into several pieces of sound information I_S.
  • In some embodiments, the recording information I_R and the filtered recording information I_FR are files with a continuous recording duration of 5 minutes that contain many normal poultry sound clips, abnormal poultry sound clips, non-poultry sound clips, silent clips, and so on. This makes it difficult to assign a single class to such a file and causes the acoustic features subsequently extracted from it to lose their uniformity. Therefore, the filtered recording information I_FR needs to be segmented to reduce the number of poultry sounds contained in a single file.
  • Methods for implementing step S322 include, but are not limited to, voice activity detection (VAD), the autocorrelation function (ACF), or other voice feature recognition methods.
  • step S322 is performed using Voice Activity Detection (VAD).
  • The voice activity detection uses a frame length of 10 milliseconds and takes the energies of six sub-bands (80-250 Hz, 250-500 Hz, 500-1000 Hz, 1-2 kHz, 2-3 kHz, and 3-4 kHz) as features. It runs over the entire filtered recording information I_FR and calculates, for each frame, the probability that the frame is speech and the probability that it is noise.
  • The final criterion is that a frame is judged to contain voice activity when the speech likelihood ratio in any single sub-band, or the total likelihood ratio over the six sub-bands, is greater than 0.9.
  • When voice activity is detected, the program starts recording and continues until the speech likelihood ratio drops below 0.9, that is, until there is no more voice activity; the program then ends the recording and cuts the detected segment into a new, shorter audio file, usually within 2 seconds. Please refer to FIG. 9.
  • Each of these intercepted pieces of sound information I_S contains relatively few poultry sounds. Compared with the original 5-minute recording information I_R and the filtered recording information I_FR, each piece is more homogeneous, so poultry sounds can be classified more accurately.
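The sub-band, likelihood-ratio VAD described above resembles the WebRTC-style detector. The sketch below is a deliberately simplified, self-contained stand-in: it marks 10-millisecond frames as active when their short-term energy exceeds a threshold derived from the quietest frames and merges consecutive active frames into clips of at most about two seconds. The energy threshold is an assumption of this sketch and replaces the 0.9 likelihood-ratio criterion used in the patent.

```python
# Simplified, energy-based stand-in for the sub-band VAD segmentation (step S322).
import numpy as np


def segment_by_activity(x, sr, frame_ms=10, max_clip_s=2.0):
    frame = int(sr * frame_ms / 1000)
    n_frames = len(x) // frame
    frames = x[: n_frames * frame].reshape(n_frames, frame)
    energy = (frames ** 2).mean(axis=1)

    # Heuristic threshold: a multiple of the median energy of the quietest 20% of frames.
    floor = np.median(np.sort(energy)[: max(1, n_frames // 5)])
    active = energy > 4.0 * floor

    clips, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            clips.append(x[start * frame : i * frame])
            start = None
    if start is not None:
        clips.append(x[start * frame :])

    limit = int(max_clip_s * sr)                  # mirror the ~2-second files above
    return [clip[:limit] for clip in clips if len(clip) > 0]
```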
  • In step S323, the extraction unit 123 extracts several sound features C_S from the several pieces of sound information I_S.
  • Short-term analysis is usually the main method here. Because the audio varies greatly and the recording environment of this study is a commercial poultry house with complex audio content, short-term analysis is relatively stable; it usually calculates the feature values of a piece of audio in units of frames about 30 milliseconds long.
  • Step S323 may also include a feature acquisition method, a feature transformation method or a feature extraction method.
  • The feature acquisition method in step S323 includes, but is not limited to, signal transforms (such as wavelet analysis and power spectral density (PSD)), the openSMILE human-emotion feature set, the zero-crossing rate (ZCR), the short-time Fourier transform, the fast Fourier transform (FFT) (for example, energy intensity and fundamental frequency), Mel-frequency cepstral coefficients (MFCC), cepstral peak prominence (CPP), and Welch's method.
  • the feature transformation method in step S323 includes but is not limited to standardization, normalization, binarization or other numerical transformation, scaling, and function transformation methods.
  • The feature extraction methods in step S323 include, but are not limited to, principal component analysis (PCA), linear discriminant analysis (LDA), locally linear embedding (LLE), Laplacian eigenmaps (LE), stochastic neighbor embedding (SNE), t-distributed stochastic neighbor embedding (t-SNE), kernel principal component analysis (KPCA), transfer component analysis (TCA), or other feature dimensionality reduction and feature extraction methods.
  • openSMILE is used to perform step S323.
  • openSMILE is an open-source toolkit for signal processing and audio acoustic feature extraction. This embodiment uses the "The INTERSPEECH 2009 Emotion Challenge" feature set, and a total of 384 acoustic features are extracted for each audio file, derived from 16 low-level descriptors (LLDs) including the root-mean-square signal frame energy, Mel-frequency cepstral coefficients 1-12, the zero-crossing rate of the time signal, the voicing probability computed from the autocorrelation function (ACF), and the fundamental frequency computed from the cepstrum.
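The patent extracts the 384-dimensional INTERSPEECH 2009 Emotion Challenge feature set with openSMILE. As a rough, library-agnostic stand-in for that step, the sketch below computes a few comparable low-level descriptors (MFCC 1-12, zero-crossing rate, RMS energy) with librosa and summarizes each over time with simple statistics; it does not reproduce the exact functionals of the IS09 set.

```python
# Rough stand-in for the openSMILE IS09 extraction: a few comparable LLDs
# (MFCC 1-12, zero-crossing rate, RMS energy) summarized by mean and std.
import numpy as np
import librosa


def extract_features(clip: np.ndarray, sr: int) -> np.ndarray:
    mfcc = librosa.feature.mfcc(y=clip, sr=sr, n_mfcc=12)   # shape (12, frames)
    zcr = librosa.feature.zero_crossing_rate(clip)          # shape (1, frames)
    rms = librosa.feature.rms(y=clip)                       # shape (1, frames)
    llds = np.vstack([mfcc, zcr, rms])
    # Simple functionals over time: per-LLD mean and standard deviation.
    return np.concatenate([llds.mean(axis=1), llds.std(axis=1)])
```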
  • Mel-frequency cepstral coefficients (MFCCs) are the key coefficients that make up the Mel-frequency cepstrum.
  • The Mel-frequency cepstrum is a spectrum used to represent short-term audio. Its principle is based on a nonlinear Mel-scale representation of the logarithmic spectrum and its linear cosine transform. Its main characteristic is that the frequency bands of the Mel cepstrum are evenly spaced on the Mel scale, a representation that is closer to the nonlinear human auditory system. Because this acoustic feature also takes into account the human ear's perception of different frequencies, it is particularly suitable for speech recognition. The transformation between the Mel scale (m) and the actual frequency (f) can be expressed as formula (3) below, where f is the actual frequency value; the reference point defines 1000 Hz as 1000 mel.
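Formula (3) does not appear in the extracted text. The standard mel-scale conversion, which is consistent with the stated reference point of 1000 Hz corresponding to 1000 mel, is presumably:

```latex
% Assumed form of formula (3): mel scale m as a function of frequency f (Hz)
m = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)
```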
  • In step S330, the artificial intelligence sound model 160 generates a training set T according to the several sound features C_S.
  • The artificial intelligence sound model 160 uses the training set T to train the feature analysis module 130.
  • The training set T includes an identification condition that records the normal poultry sound state, the abnormal poultry sound state, and the non-poultry sound state.
  • Methods for implementing step S330 include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
  • Supervised learning includes, but is not limited to, classification and regression.
  • Classifiers include, but are not limited to, random forest, k-nearest neighbors (k-NN), support vector machine (SVM), artificial neural network (ANN), support vector domain description (SVDD), and sparse representation classifier (SRC).
  • Unsupervised learning includes but is not limited to clustering and dimensionality reduction.
  • FIGS. 10A and 10B show normal poultry sounds and abnormal poultry sounds collected in an experimental example.
  • In FIG. 10A, normal sound data of a 20-day-old native chicken is displayed. Practitioners have confirmed that the normal sound frequency of a native chicken is between about 2.5 kHz and about 3.5 kHz, and the duration of a single sound is about 0.1 seconds.
  • In FIG. 10B, abnormal sound data of a 20-day-old native chicken is displayed. The frequency of the abnormal sound is between about 1 kHz and about 1.5 kHz, and the duration of a single sound is between about 0.67 and 0.7 seconds.
  • Practitioners confirmed that this abnormal sound was a symptom of rales.
  • One of the characteristics of rales is a prolonged sound caused by excessive mucus in the trachea and obstruction.
  • Figures 11A to 11B show normal poultry sounds and abnormal poultry sounds collected in another experimental example.
  • In FIG. 11A, normal sound data of a 19-day-old laying hen is displayed. Practitioners have confirmed that its normal sound frequency is between about 2.5 kHz and about 4 kHz, and the duration of a single sound is about 0.15 seconds.
  • In FIG. 11B, abnormal sound data of a 20-day-old laying hen is displayed. The frequency of the abnormal sound is between about 1 kHz and about 1.5 kHz, and the duration of a single sound is about 0.7 seconds. Practitioners confirmed that this abnormal sound is a symptom of rales, one characteristic of which is a prolonged sound caused by excess mucus in, and obstruction of, the trachea.
  • The frequency difference between the normal poultry sounds and the abnormal poultry sounds is thus about 2 kHz, and the difference in sound duration is about 0.5 to 0.6 seconds. Therefore, for example, the identification condition in step S330 may be set according to the frequency difference or duration difference of the sounds.
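To make such an identification condition concrete, a purely illustrative rule based on the frequency and duration gaps reported above (and not the trained model actually used in this case) might look like the following; the exact thresholds are assumptions for illustration.

```python
# Purely illustrative thresholds drawn from the observations above:
# abnormal calls around 1-1.5 kHz lasting roughly 0.7 s, normal calls around
# 2.5-4 kHz lasting roughly 0.1-0.15 s.
def looks_abnormal(dominant_freq_hz: float, duration_s: float) -> bool:
    in_low_band = 1000.0 <= dominant_freq_hz <= 1500.0
    prolonged = duration_s >= 0.5
    return in_low_band and prolonged
```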
  • FIG. 12 illustrates a classification diagram 1200 of a support vector machine according to an embodiment of the present application.
  • The artificial intelligence sound model 160 first classifies each of the several sound features C_S as a poultry sound state or a non-poultry sound state via a support vector machine; for features that belong to the poultry sound state, the artificial intelligence sound model 160 then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.
  • the artificial intelligence sound model 160 uses a total of 150 pieces of normal poultry sound data, 150 pieces of abnormal poultry sound data, and 150 pieces of non-poultry sound data to train the model.
  • The three types of training set data are shown in Table 2.
  • the trained artificial intelligence sound model 160 has a verification accuracy of 84.2% in identifying these three types of data.
  • the verification results of the artificial intelligence sound model 160 are shown in Table 3.
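The two-stage support-vector-machine scheme described above (poultry versus non-poultry first, then normal versus abnormal) can be sketched with scikit-learn as follows. The feature matrix and labels below are random placeholders; the patent's actual training data (150 clips per class) and its reported 84.2% verification accuracy are not reproduced here.

```python
# Sketch of the two-stage SVM classification under assumed placeholder data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: rows are 384-dimensional acoustic feature vectors.
X = np.random.rand(60, 384)
y = np.array(["normal_poultry"] * 20 + ["abnormal_poultry"] * 20 + ["non_poultry"] * 20)

# Stage 1: poultry vs. non-poultry.
stage1 = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
stage1.fit(X, np.where(y == "non_poultry", "non_poultry", "poultry"))

# Stage 2: normal vs. abnormal, trained only on the poultry clips.
poultry = y != "non_poultry"
stage2 = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
stage2.fit(X[poultry], y[poultry])


def classify(features: np.ndarray) -> str:
    f = features.reshape(1, -1)
    if stage1.predict(f)[0] == "non_poultry":
        return "non_poultry"
    return stage2.predict(f)[0]
```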
  • In step S340, the feature analysis module 130 analyzes each of the several sound features C_S to determine the sound state S_S of each feature.
  • The step of analyzing each of the several sound features C_S to determine the sound state S_S of each feature is performed via the artificial intelligence sound model 160 (for example, the model trained in step S330).
  • In step S350, the sound state S_S of the recording information I_R (or the sound state S_S of each of the several sound features C_S) is stored in a database 140 through a network.
  • In some embodiments, the database 140 is a cloud database for storing historical information of the sound state S_S.
  • In step S360, a user is allowed to view the sound state S_S in the database 140 through the network.
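Steps S350 and S360 amount to writing each classification result to a networked database and letting a user query it later, for example as daily counts of abnormal calls. A minimal local sketch using SQLite is shown below; the table layout and column names are invented for illustration and are not specified in the patent.

```python
# Minimal sketch of storing and querying sound states (hypothetical schema).
import sqlite3
from datetime import datetime

conn = sqlite3.connect("poultry_sound_states.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS sound_state ("
    " recorded_at TEXT, house_id TEXT, label TEXT)"
)


def store_state(house_id: str, label: str) -> None:
    conn.execute(
        "INSERT INTO sound_state VALUES (?, ?, ?)",
        (datetime.now().isoformat(), house_id, label),
    )
    conn.commit()


def daily_abnormal_counts(house_id: str):
    # The kind of view a user interface might present: abnormal calls per day.
    return conn.execute(
        "SELECT substr(recorded_at, 1, 10) AS day, COUNT(*)"
        " FROM sound_state"
        " WHERE house_id = ? AND label = 'abnormal_poultry'"
        " GROUP BY day ORDER BY day",
        (house_id,),
    ).fetchall()
```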
  • Figure 13 shows the temperature and humidity information in the poultry house at each day of age according to an embodiment of the present case.
  • Figure 14 shows the prediction results of the number of sound data for the whole day at each day of age according to an embodiment of the present case.
  • Figure 15 shows the prediction results of the number of abnormal poultry sounds for the whole day at each day of age according to an embodiment of the present case.
  • Figure 16 shows the proportion of each category in the prediction results for the day according to an embodiment of the present case.
  • In FIGS. 14 to 16, the manually observed abnormality day D is the date on which a practitioner observed abnormal poultry sounds (such as rales in the poultry). The manually observed abnormality day D also applies to the descriptions of the subsequent figures, so it is explained here first.
  • Figure 17A shows the prediction results of the number of sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present case.
  • Figure 17B shows the prediction results of the number of sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of the present case.
  • Figure 17C shows the prediction results of the number of sound data from 10 p.m. to 6 a.m. the next day according to an embodiment of the present case.
  • Figure 18A shows the prediction results of the number of abnormal poultry sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present case.
  • Figure 18B shows the prediction results of the number of abnormal poultry sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of the present case.
  • Figure 18C shows the prediction results of the number of abnormal poultry sound data from 10 p.m. to 6 a.m. the next day at each day of age according to an embodiment of the present case.
  • the prediction results of abnormal poultry sounds by the system and method proposed in this case can be provided to a user (such as a practitioner) as a basis for assessing the health status of poultry.
  • the system and method proposed in this case can observe abnormal poultry sounds at 18 days of age, which can help users (such as practitioners) take measures faster.
  • In summary, this case discloses a poultry voiceprint recognition system and method that can receive recording information of a poultry house over a time period, convert the recording information into an image (such as a waveform diagram, a time-frequency diagram, or another planar image), and analyze the recording information based on several image indicators of the image (identification conditions such as a frequency gap or a duration gap) to determine the sound state of the recording information.
  • the sound status includes normal poultry sound status and/or abnormal poultry sound status.
  • the above-mentioned step of analyzing the recording information based on several image indicators of the image to determine the sound state of the recording information can also be implemented in conjunction with an artificial intelligence sound model.
  • Through improvements in computer software and hardware combined with artificial intelligence sound models, the poultry voiceprint identification system and method of the embodiments of this case can accurately and quickly identify the occurrence of abnormal poultry sounds, even when the recording information received from the poultry house contains many mixed sounds.

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Housing For Livestock And Birds (AREA)

Abstract

A poultry voiceprint recognition system (100), installed on a server connected to a network. The poultry voiceprint recognition system (100) comprises a receiver (110), a feature processing module (120), a feature analysis module (130), and an artificial intelligence voice model (160). The receiver (110) is arranged in a poultry house and is used for receiving recording information of the poultry house in a period of time. The feature processing module (120) is used for converting the recording information into a plurality of voice features. The feature analysis module (130) is used for analyzing each of the plurality of voice features so as to determine a voice state of each of the plurality of voice features by means of the artificial intelligence voice model (160). The voice state comprises a normal poultry voice state or an abnormal poultry voice state. The artificial intelligence voice model (160) generates a training set according to the plurality of voice features.

Description

Poultry voiceprint identification method and system

Technical field

This case relates to a voiceprint identification method and system, and in particular to a poultry voiceprint identification method and system.

Background technique

Most modern poultry production is large-scale and intensive, which promotes the spread of disease. When poultry in a house become infected, the disease spreads through the house to the entire flock very quickly, causing huge economic losses to the poultry industry every year. The principal viral diseases include avian influenza (AI), Newcastle disease (ND), infectious bronchitis (IB), and infectious laryngotracheitis, among which infectious bronchitis (IB) is one of the most important respiratory diseases in Asia. Generally speaking, infected poultry exhibit tracheal rales, coughing, sneezing, runny nose, reduced egg production, and reduced feed conversion efficiency. In addition, infected poultry usually show changes in their vocalizations before more severe symptoms develop. If changes in poultry sounds, especially the sounds of infected poultry, can be detected early, this provides a valuable early-warning effect.

Technical problem

However, the observation of poultry diseases often relies on manual, experience-based judgment. Moreover, because modern poultry production is mostly large-scale and intensive, many poultry sounds and non-poultry sounds are mixed together, making it even more difficult to identify abnormal poultry sounds among them.

Technical solution

Therefore, this case proposes a poultry voiceprint identification method and system that can mitigate the aforementioned problems.
According to an embodiment of this case, a poultry voiceprint identification method is proposed, including the following steps: receiving recording information of a poultry house over a time period; and analyzing the recording information to determine a sound state of the recording information, the sound state including a normal poultry sound state or an abnormal poultry sound state.

In some embodiments of this case, the poultry voiceprint identification method further includes the following step: converting the recording information into several sound features, where the converting step includes: filtering the recording information to generate filtered recording information having a specific frequency range; dividing the filtered recording information into several pieces of sound information; and extracting the several sound features from the several pieces of sound information. The step of analyzing the recording information to determine its sound state then analyzes each of the several sound features to determine the sound state of each feature.

In some embodiments of this case, the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by bandpass filtering and spectral subtraction.

In some embodiments of this case, the step of extracting the several sound features from the several pieces of sound information is implemented using openSMILE, wavelet analysis, or the short-time Fourier transform.

In some embodiments of this case, the poultry voiceprint identification method further includes the following steps: storing the sound state of the recording information in a database through a network; and allowing a user to view, through the network, the sound state in the database.

In some embodiments of this case, the step of analyzing each of the several sound features to determine the sound state of each feature is performed via an artificial intelligence sound model. The artificial intelligence sound model generates a training set based on the several sound features, and the training set includes an identification condition recording the normal poultry sound state and the abnormal poultry sound state.

In some embodiments of this case, the artificial intelligence sound model first classifies each of the several sound features as a poultry sound state or a non-poultry sound state via a support vector machine; for features classified as the poultry sound state, the artificial intelligence sound model then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.

According to another embodiment of this case, a poultry voiceprint recognition system is proposed, installed on a server connected to a network, including: a receiver, disposed in a poultry house, used to receive recording information of the poultry house over a time period; and a feature analysis module, used to analyze the recording information to determine a sound state of the recording information, the sound state including a normal poultry sound state or an abnormal poultry sound state.

In some embodiments of this case, the poultry voiceprint recognition system further includes a feature processing module used to convert the recording information into several sound features. The feature processing module further includes: a filtering unit, used to filter the recording information to generate filtered recording information with a specific frequency range; a segmentation unit, used to divide the filtered recording information into several pieces of sound information; and an extraction unit, used to extract the several sound features from the several pieces of sound information. In the step of analyzing the recording information to determine its sound state, the feature analysis module analyzes each of the several sound features to determine the sound state of each feature.

In some embodiments of this case, the filtering unit is further configured so that the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by bandpass filtering and spectral subtraction.

In some embodiments of this case, the extraction unit is further configured to extract the several sound features from the several pieces of sound information using openSMILE, wavelet analysis, or the short-time Fourier transform.

In some embodiments of this case, the poultry voiceprint recognition system further includes: a database, used to store the sound state of the recording information through the network; and a user interface, used to let a user view, through the network, the sound state in the database.

In some embodiments of this case, the feature analysis module is further configured to determine the sound state of each of the several sound features via an artificial intelligence sound model; the artificial intelligence sound model generates a training set based on the several sound features and uses the training set to train the feature analysis module, and the training set includes an identification condition recording the normal poultry sound state and the abnormal poultry sound state.

In some embodiments of this case, the artificial intelligence sound model first classifies each of the several sound features as a poultry sound state or a non-poultry sound state via a support vector machine; for features classified as the poultry sound state, the artificial intelligence sound model then classifies each of them as the normal poultry sound state or the abnormal poultry sound state via the support vector machine.

Beneficial effects

In this case, recording information of a poultry house over a time period is received, and the recording information is filtered, divided, extracted, or converted into images to determine its sound state, so that abnormal poultry sounds can be identified among the many sounds present in the large-scale, intensively farmed poultry houses of the modern poultry industry. In addition, the occurrence of abnormal poultry sounds can be identified accurately and quickly by means of artificial intelligence, reducing the manpower required and enabling abnormal poultry sounds to be detected early so that countermeasures can be taken as soon as possible.
Description of the drawings

In order to provide a better understanding of the above and other aspects of this case, embodiments are described in detail below with reference to the accompanying drawings:
Figure 1 is a schematic diagram of a poultry voiceprint recognition system according to an embodiment of this case.

Figure 2 is a schematic diagram of a poultry voiceprint identification method according to an embodiment of this case.

Figure 3 is a schematic diagram of a poultry voiceprint identification method according to another embodiment of this case.

Figure 4A shows a waveform diagram of original poultry recording information according to an embodiment of this case.

Figure 4B shows a time-frequency diagram of original poultry recording information according to an embodiment of this case.

Figure 5A shows a waveform diagram of the original poultry recording information after band-pass filtering according to an embodiment of this case.

Figure 5B shows a time-frequency diagram of the original poultry recording information after band-pass filtering according to an embodiment of this case.

Figure 6A shows a waveform diagram of the original poultry recording information after spectral subtraction according to an embodiment of this case.

Figure 6B shows a time-frequency diagram of the original poultry recording information after spectral subtraction according to an embodiment of this case.

Figure 7A shows a time-frequency diagram of poultry sounds in the original recording information according to an embodiment of this case.

Figure 7B shows a time-frequency diagram of poultry sounds in the filtered recording information after band-pass filtering according to an embodiment of this case.

Figure 7C shows a time-frequency diagram of poultry sounds in the filtered recording information after spectral subtraction according to an embodiment of this case.

Figure 8A shows a waveform diagram of poultry sounds in the filtered recording information after band-pass filtering and spectral subtraction according to an embodiment of this case.

Figure 8B shows a time-frequency diagram of poultry sounds in the filtered recording information after band-pass filtering and spectral subtraction according to an embodiment of this case.

Figure 9 is a schematic diagram of voice-activity segments intercepted by VAD according to an embodiment of this case.

Figure 10A shows a sample of normal poultry sounds of native chickens according to an embodiment of this case.

Figure 10B shows a sample of abnormal poultry sounds of native chickens according to an embodiment of this case.

Figure 11A shows a sample of normal poultry sounds of laying hens according to an embodiment of this case.

Figure 11B shows a sample of abnormal poultry sounds of laying hens according to an embodiment of this case.

Figure 12 is a classification diagram of a support vector machine according to an embodiment of this case.

Figure 13 shows the temperature and humidity information in the poultry house at each day of age according to an embodiment of this case.

Figure 14 shows the prediction results of the number of sound data for the whole day at each day of age according to an embodiment of this case.

Figure 15 shows the prediction results of the number of abnormal poultry sounds for the whole day at each day of age according to an embodiment of this case.

Figure 16 shows the proportion of each category in the prediction results for the day according to an embodiment of this case.

Figure 17A shows the prediction results of the number of sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of this case.

Figure 17B shows the prediction results of the number of sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of this case.

Figure 17C shows the prediction results of the number of sound data from 10 p.m. to 6 a.m. the next day according to an embodiment of this case.

Figure 18A shows the prediction results of the number of abnormal poultry sound data from 6 a.m. to 2 p.m. at each day of age according to an embodiment of this case.

Figure 18B shows the prediction results of the number of abnormal poultry sound data from 2 p.m. to 10 p.m. at each day of age according to an embodiment of this case.

Figure 18C shows the prediction results of the number of abnormal poultry sound data from 10 p.m. to 6 a.m. the next day at each day of age according to an embodiment of this case.
本发明的最佳实施方式Best Mode of Carrying Out the Invention
Referring to FIG. 1, which is a schematic diagram of a poultry voiceprint identification system 100 according to an embodiment of the present disclosure. The poultry voiceprint identification system 100 is installed on a server (not shown) connected to a network (not shown). The poultry voiceprint identification system 100 includes a receiver 110 and a processor 115. The processor 115 includes a feature processing module 120 and a feature analysis module 130. The receiver 110 is disposed in a poultry house and is configured to receive recording information I_R of the poultry house in a time period. The feature processing module 120 is configured to transform the recording information I_R into a number of sound features C_S. The feature analysis module 130 is configured to analyze the recording information I_R to determine the sound state S_S of the recording information I_R, or to analyze each of the sound features C_S to determine the sound state S_S of each of the sound features C_S; the sound state S_S includes a normal poultry sound state and/or an abnormal poultry sound state. In some embodiments, the receiver 110 and the processor 115 may be integrated into a device (not shown; for example, a single-chip microcomputer such as a Raspberry Pi) installed in the poultry house.
The feature processing module 120 further includes a filtering unit 121, a segmentation unit 122, and an extraction unit 123. The filtering unit 121 is configured to filter the recording information I_R to generate filtered recording information I_FR having a specific frequency range. The segmentation unit 122 is configured to divide the filtered recording information I_FR into a number of pieces of sound information I_S. The extraction unit 123 is configured to extract the sound features C_S from the pieces of sound information I_S.
The poultry voiceprint identification system 100 further includes a database 140 and a user interface 150. The database 140 is configured to store, through the network, the sound state S_S of the recording information I_R (or the sound state S_S of each of the sound features C_S). The user interface 150 is configured to allow a user to view, through the network, the sound states S_S stored in the database 140.
In the step of analyzing each of the sound features C_S to determine the sound state S_S of each of the sound features C_S, the feature analysis module 130 makes the determination through an artificial intelligence sound model 160. The artificial intelligence sound model 160 generates a training set T from the sound features C_S and uses the training set T to train the feature analysis module 130. The training set T includes an identification condition that records the normal poultry sound state and the abnormal poultry sound state.
The artificial intelligence sound model 160 determines, through a support vector machine, each of the sound features C_S as being in a poultry sound state or a non-poultry sound state. If a sound feature belongs to the poultry sound state, the artificial intelligence sound model 160 further determines, through the support vector machine, whether that sound feature belongs to the normal poultry sound state or the abnormal poultry sound state.
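The modular structure described above can be pictured as a small processing pipeline. The following Python sketch is an illustration only, not the patented implementation; all class and method names (FeatureProcessor, FeatureAnalyzer, the placeholder filter/segment/extract bodies) are hypothetical stand-ins for modules 120 and 130.

```python
# Illustrative skeleton of the described pipeline (hypothetical names, placeholder logic).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Clip:
    samples: List[float]                                   # one piece of sound information I_S
    features: List[float] = field(default_factory=list)    # sound features C_S
    state: str = "unknown"                                  # sound state S_S

class FeatureProcessor:  # stands in for module 120
    def process(self, recording: List[float]) -> List[Clip]:
        filtered = self._filter(recording)                  # unit 121 (band-pass + spectral subtraction)
        clips = self._segment(filtered)                     # unit 122 (VAD-based segmentation)
        for clip in clips:
            clip.features = self._extract(clip.samples)     # unit 123 (e.g. openSMILE features)
        return clips

    def _filter(self, x):  return x                         # placeholder
    def _segment(self, x): return [Clip(x)]                 # placeholder: one clip
    def _extract(self, x): return [sum(x) / max(len(x), 1)] # placeholder feature

class FeatureAnalyzer:   # stands in for module 130
    def classify(self, clip: Clip) -> str:
        # A trained model (e.g. the cascaded SVM described later) would be called here.
        return "normal" if clip.features and clip.features[0] >= 0 else "abnormal"

if __name__ == "__main__":
    recording = [0.0, 0.1, -0.05, 0.2]                      # stand-in for recording information I_R
    clips = FeatureProcessor().process(recording)
    for clip in clips:
        clip.state = FeatureAnalyzer().classify(clip)
        print(clip.state)
```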
Referring to FIG. 2, which is a schematic diagram of a poultry voiceprint identification method 200 according to an embodiment of the present disclosure, the method includes at least steps S210 and S220, described in detail below.
In step S210, referring also to FIG. 1, the receiver 110 receives recording information I_R of a poultry house in a time period.
In step S220, the feature analysis module 130 analyzes the recording information I_R and determines a sound state S_S of the recording information I_R.
Referring to FIG. 3, which is a schematic diagram of a poultry voiceprint identification method 300 according to another embodiment of the present disclosure, the method includes at least steps S310, S320, S330, S340, S350, and S360, described in detail below.
In step S310, the receiver 110 receives recording information I_R of a poultry house in a time period. In some embodiments, the receiver 110 is at least one of an omnidirectional microphone and a directional microphone; a directional microphone is preferred. As shown in FIG. 4A and FIG. 4B, the recording information I_R received by the receiver 110 may be represented as a waveform diagram or a time-frequency diagram. As shown in FIG. 4A, the waveform of the poultry-sound portion of the recording information I_R is not distinct because of noise interference. As shown in FIG. 4B, the recording information I_R in the time-frequency diagram contains background noise whose frequency is close to that of the poultry sounds; the background noise includes, but is not limited to, the sound of running fans and the sound of vehicles and rain outside the poultry house. Because such background noise introduces errors into the subsequent feature-value calculations and degrades the classification results, the recording information I_R must be filtered before analysis.
In step S320, the feature processing module 120 transforms the recording information I_R into a number of sound features C_S. Step S320 further includes steps S321, S322, and S323. The feature processing module 120 includes a filtering unit 121, a segmentation unit 122, and an extraction unit 123.
In step S321, the filtering unit 121 filters the recording information I_R to generate filtered recording information I_FR having a specific frequency range. In some embodiments, the filtered recording information I_FR has a specific frequency range between about 500 Hz and about 5 kHz. In some embodiments, step S321 is implemented by, but not limited to, a filter, spectral subtraction, spectral gating, noise gating, multi-microphone noise reduction, or another method that improves the signal-to-noise ratio. The filter includes, but is not limited to, an adaptive filter, a finite impulse response (FIR) filter, or an infinite impulse response (IIR) filter; the FIR or IIR filter is, for example, a high-pass filter, a band-pass filter, a low-pass filter, or a band-stop filter.
In some embodiments, the filtering unit 121 includes, for example, a Butterworth filter, and step S321 is performed with band-pass filtering. The amplitude gain of an N-th order Butterworth low-pass filter can be expressed as formula (1) below, where H_a is the transfer function, N is the order of the filter, ω is the angular frequency of the signal, and ω_c is the cut-off frequency at which the amplitude drops by 3 dB. The higher the order of the filter, the faster the amplitude decays in the stop band and the better the filtering effect; frequencies below ω_c are passed with gain, while frequencies above ω_c are suppressed.
|H_a(jω)| = 1 / √(1 + (ω/ω_c)^(2N)), (1)
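As a concrete illustration of the band-pass variant of step S321, the following sketch applies a Butterworth band-pass filter over the roughly 500 Hz to 5 kHz range mentioned above using SciPy. The sampling rate, filter order, and synthetic test signal are assumptions made for the example, not values fixed by the disclosure.

```python
# Band-pass filtering sketch for step S321 (assumed parameters, not the patented settings).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x: np.ndarray, fs: int, low: float = 500.0, high: float = 5000.0, order: int = 4) -> np.ndarray:
    """Zero-phase Butterworth band-pass filter."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

if __name__ == "__main__":
    fs = 16000                                   # assumed sampling rate
    t = np.arange(0, 1.0, 1 / fs)
    # Synthetic test signal: a 3 kHz tone in the poultry band plus 100 Hz hum and white noise.
    x = np.sin(2 * np.pi * 3000 * t) + 0.5 * np.sin(2 * np.pi * 100 * t) + 0.2 * np.random.randn(len(t))
    y = bandpass(x, fs)
    print("RMS before:", np.sqrt(np.mean(x ** 2)), "after:", np.sqrt(np.mean(y ** 2)))
```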
In the received recording information I_R, besides high- and low-frequency background noise, there is also background noise whose frequency is close to that of the poultry sounds, as shown in FIG. 5B. In this case, directly using band-pass filtering to suppress the background noise close in frequency to the poultry sounds would eliminate the poultry sounds as well.
In addition, in other embodiments, step S321 is performed with spectral subtraction, as shown in FIG. 6A and FIG. 6B. Spectral subtraction is based on a simple assumption: the noise in the speech signal is purely additive, so subtracting the spectrum of the noise signal from the spectrum of the noisy signal yields a relatively clean speech spectrum. The signal model in the time domain can be expressed as formula (2) below, where y(m) is the noisy signal, x(m) is the additive noise, d(m) is the clean speech signal, and m is time.
y(m) = x(m) + d(m), which can be rearranged as d(m) = y(m) − x(m), (2)
Over the entire filtered recording information I_FR, the background noise is suppressed to a much greater extent after spectral subtraction (FIG. 6B) than after band-pass filtering (FIG. 5B). Therefore, compared with band-pass filtering, spectral subtraction can effectively remove background noise whose frequency is close to that of the poultry sounds (that is, background noise within the pass band). However, as shown in FIG. 6B, spectral subtraction suppresses low-frequency noise less effectively than band-pass filtering does. In addition, referring to FIG. 7A to FIG. 7C, when the signal-to-noise ratio is too low, spectral subtraction may distort the sound after filtering (FIG. 7C).
Furthermore, in still other embodiments, step S321 is performed with both band-pass filtering and spectral subtraction. Referring to FIG. 8A and FIG. 8B, most of the high- and low-frequency noise is first suppressed by band-pass filtering, and spectral subtraction is then used to suppress the background noise within the pass band; this also reduces the distortion of poultry sounds that spectral subtraction alone may cause (FIG. 8B).
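A minimal sketch of the second stage of this combined scheme is given below: a basic magnitude spectral subtraction built on the short-time Fourier transform, which would be applied to the output of the band-pass filter from the previous sketch. The noise estimate (the first few frames assumed to contain no poultry sound) and the spectral floor are assumptions for illustration, not the patented settings.

```python
# Simple magnitude spectral subtraction for step S321 (illustrative only).
import numpy as np
from scipy.signal import stft, istft

def spectral_subtraction(x: np.ndarray, fs: int, noise_frames: int = 10, floor: float = 0.02) -> np.ndarray:
    """Subtract an average noise magnitude spectrum estimated from the first frames."""
    f, t, X = stft(x, fs=fs, nperseg=512)
    mag, phase = np.abs(X), np.angle(X)
    noise = mag[:, :noise_frames].mean(axis=1, keepdims=True)   # assumed noise-only frames
    clean_mag = np.maximum(mag - noise, floor * mag)            # spectral floor limits musical noise
    _, y = istft(clean_mag * np.exp(1j * phase), fs=fs, nperseg=512)
    return y

if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    x = 0.3 * rng.standard_normal(fs)                           # 1 s of synthetic noise as a stand-in recording
    x[fs // 2:] += np.sin(2 * np.pi * 3000 * np.arange(fs // 2) / fs)  # poultry-band tone in the second half
    # In the combined scheme, the band-pass filter from the previous sketch would run before this step.
    y = spectral_subtraction(x, fs)
    print(y.shape)
```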
In step S322, the segmentation unit 122 divides the filtered recording information I_FR into a number of pieces of sound information I_S. For example, the recording information I_R and the filtered recording information I_FR are files with a continuous recording duration of 5 minutes that contain many normal poultry sound segments, abnormal poultry sound segments, non-poultry sound segments, silent segments, and so on. Such a file is difficult to assign to a single class, and the acoustic features subsequently extracted from it lose their homogeneity; it is therefore necessary to segment the filtered recording information I_FR to reduce the number of poultry sounds contained in a single file. In some embodiments, step S322 is implemented by, but not limited to, voice activity detection (VAD), the autocorrelation function (ACF), or another sound-feature identification method.
In some embodiments, step S322 is performed using voice activity detection (VAD). The voice activity detection operates on frames 10 milliseconds long and uses the energies of six sub-bands (80–250 Hz, 250–500 Hz, 500–1000 Hz, 1–2 kHz, 2–3 kHz, and 3–4 kHz) as features. It scans the entire filtered recording information I_FR and computes, for each frame, the probabilities that the frame is speech and noise; a frame is finally judged to contain voice activity when the total speech likelihood ratio of any sub-band, or of all six sub-bands, is greater than 0.9. When voice activity is detected, the program starts recording and continues until the speech likelihood ratio falls below 0.9, that is, until there is no voice activity; the program then stops recording and cuts the recorded segment out as a new, shorter audio file (typically shorter than 2 seconds). Referring to FIG. 9, each of these extracted pieces of sound information I_S contains relatively few poultry sounds; compared with the original 5-minute recording information I_R or the filtered recording information I_FR, the pieces are more homogeneous, and poultry sounds can be classified more accurately from them.
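The sketch below is a simplified, energy-based stand-in for this segmentation step: it thresholds short-frame energy instead of running the sub-band likelihood-ratio test described above, so the frame length and threshold are assumptions made only to show how active regions are cut out as separate clips.

```python
# Simplified energy-based activity detection for step S322 (not the sub-band likelihood method itself).
import numpy as np

def segment_active(x: np.ndarray, fs: int, frame_ms: int = 10, thresh_ratio: float = 2.0):
    """Yield (start, end) sample indices of regions whose frame energy exceeds a noise-based threshold."""
    frame = int(fs * frame_ms / 1000)
    n_frames = len(x) // frame
    energy = np.array([np.mean(x[i * frame:(i + 1) * frame] ** 2) for i in range(n_frames)])
    threshold = thresh_ratio * np.median(energy)   # assumed noise-floor estimate
    active = energy > threshold
    start = None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i * frame
        elif not flag and start is not None:
            yield start, i * frame
            start = None
    if start is not None:
        yield start, n_frames * frame

if __name__ == "__main__":
    fs = 16000
    x = np.zeros(fs)
    x[4000:8000] = np.sin(2 * np.pi * 3000 * np.arange(4000) / fs)  # one synthetic call
    print(list(segment_active(x, fs)))
```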
In addition, Table 1 below shows that band-pass filtering followed by spectral subtraction gives better sound-interception results than band-pass filtering alone or spectral subtraction alone.
Table 1: (provided as an image in the original publication)
In step S323, the extraction unit 123 extracts the sound features C_S from the pieces of sound information I_S. When analyzing a piece of audio, short-time analysis is usually preferred: the audio varies greatly, and because the recordings in this study were made in a commercial poultry house the audio content is complex, so analyzing short stretches of data is relatively stable. Short-time analysis typically computes feature values over audio frames about 30 milliseconds long. Step S323 may further include a feature acquisition method, a feature transformation method, or a feature extraction method. In some embodiments, the feature acquisition method in step S323 includes, but is not limited to, signal transforms (for example, wavelet analysis or the power spectral density (PSD)), the openSMILE human-emotion feature sets, the zero-crossing rate (ZCR), the short-time Fourier transform, the fast Fourier transform (FFT) (for example, energy intensity or fundamental frequency), Mel-frequency cepstral coefficients (MFCC), cepstral peak prominence (CPP), and the Welch method. In some embodiments, the feature transformation method in step S323 includes, but is not limited to, standardization, normalization, binarization, or another numerical transformation, scaling, or function-transformation approach. In some embodiments, the feature extraction method in step S323 includes, but is not limited to, principal component analysis (PCA), linear discriminant analysis (LDA), locally linear embedding (LLE), Laplacian eigenmaps (LE), stochastic neighbor embedding (SNE), t-distributed stochastic neighbor embedding (t-SNE), kernel principal component analysis (KPCA), transfer component analysis (TCA), or another feature dimensionality-reduction or feature-extraction approach.
In some embodiments, openSMILE is used to perform step S323. openSMILE is an open-source toolkit for signal processing and audio acoustic feature extraction. This embodiment uses the "INTERSPEECH 2009 Emotion Challenge feature set", and a total of 384 acoustic features are extracted from each audio file, including the root-mean-square signal frame energy, Mel-frequency cepstral coefficients 1–12, the zero-crossing rate of the time signal, the voicing probability computed from the ACF, and the fundamental frequency computed from the cepstrum, for a total of 16 low-level descriptors (LLDs).
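One way to run such an extraction from Python is the opensmile package. The sketch below is a hedged illustration: it uses the built-in emobase functional set as a stand-in, since the INTERSPEECH 2009 Emotion Challenge configuration mentioned in the text is normally loaded from an openSMILE config file rather than selected by name here, and the clip file name is an assumption.

```python
# Feature extraction sketch with the 'opensmile' Python package
# (emobase functionals stand in for the IS09 Emotion Challenge set described in the text).
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.emobase,          # stand-in feature set (assumption)
    feature_level=opensmile.FeatureLevel.Functionals,  # one feature vector per clip
)

features = smile.process_file("clip_0001.wav")  # hypothetical segmented clip from step S322
print(features.shape)                           # (1, n_features) DataFrame of functionals
```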
Mel-frequency cepstral coefficients (MFCCs) are a set of key coefficients used to build the Mel cepstrum, and the Mel cepstrum is a spectrum used to represent short-term audio; it is based on the logarithmic spectrum expressed on the nonlinear Mel scale and its linear cosine transform. Its most notable property is that the frequency bands of the Mel cepstrum are evenly distributed on the Mel scale, a representation that is closer to the nonlinear human auditory system. Because this acoustic feature takes into account how sensitively the human ear perceives different frequencies, it is particularly suitable for speech recognition. The transformation between the Mel scale (m) and the actual frequency (f) can be expressed as formula (3) below, where f is the actual frequency value. The reference point defines 1000 Hz as 1000 mel.
m = 2595 · log10(1 + f/700), (3)
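The sketch below evaluates formula (3) directly and also extracts 12 MFCCs with librosa, which is one common way to obtain coefficients of the kind described above; the audio file name, window length, and hop length are assumptions for the example.

```python
# Mel-scale conversion (formula (3)) and a 12-coefficient MFCC extraction with librosa.
import numpy as np
import librosa

def hz_to_mel(f_hz: float) -> float:
    """Formula (3): m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

print(hz_to_mel(1000.0))       # ~1000 mel, the stated reference point

y, sr = librosa.load("clip_0001.wav", sr=None)   # hypothetical segmented clip
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12,
                            n_fft=512, hop_length=160)  # ~32 ms window, 10 ms hop at 16 kHz (assumed)
print(mfcc.shape)              # (12, n_frames)
```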
In step S330, the artificial intelligence sound model 160 generates a training set T from the sound features C_S. The artificial intelligence sound model 160 uses the training set T to train the feature analysis module 130. The training set T includes an identification condition that records the normal poultry sound state, the abnormal poultry sound state, and the non-poultry sound state. In some embodiments, step S330 is implemented by, but not limited to, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning. Supervised learning includes, but is not limited to, classification and regression, where the classifier includes, but is not limited to, a random forest, k-nearest neighbors (k-NN), a support vector machine (SVM), an artificial neural network (ANN), support vector domain description (SVDD), or a sparse representation classifier (SRC). Unsupervised learning includes, but is not limited to, clustering and dimensionality reduction.
For example, referring to FIG. 10A and FIG. 10B, which show normal poultry sounds and abnormal poultry sounds collected in one experimental example. FIG. 10A shows normal sound data of 20-day-old native chickens; as confirmed by practitioners, the frequency of the normal native-chicken sound is between about 2.5 kHz and about 3.5 kHz, and a single vocalization lasts about 0.1 seconds. FIG. 10B shows abnormal sound data of 20-day-old native chickens; the frequency of the abnormal sound is between about 1 kHz and about 1.5 kHz, and a single vocalization lasts about 0.67 to 0.7 seconds. Practitioners confirmed that this abnormal sound is a manifestation of rales, one characteristic of which is a drawn-out sound caused by obstruction from excess mucus in the trachea.
In addition, referring to FIG. 11A and FIG. 11B, which show normal poultry sounds and abnormal poultry sounds collected in another experimental example. FIG. 11A shows normal sound data of 19-day-old native chickens; as confirmed by practitioners, the frequency of the normal native-chicken sound is between about 2.5 kHz and about 4 kHz, and a single vocalization lasts about 0.15 seconds. FIG. 11B shows abnormal sound data of 20-day-old native chickens; the frequency of the abnormal sound is between about 1 kHz and about 1.5 kHz, and a single vocalization lasts about 0.7 seconds. Practitioners confirmed that this abnormal sound is a manifestation of rales, one characteristic of which is a drawn-out sound caused by obstruction from excess mucus in the trachea.
As shown in FIG. 10A to FIG. 11B, normal poultry sounds and abnormal poultry sounds differ in frequency by about 2 kHz and differ in duration by about 0.5 to 0.6 seconds. Therefore, the identification condition in step S330 can be set according to, for example, the frequency difference or the duration difference of the sounds.
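As a concrete reading of such an identification condition, the following sketch classifies a single vocalization from its dominant frequency and duration using the ranges reported in the experimental examples above; the exact thresholds are illustrative and are not limits of the disclosure.

```python
# Rule-based identification condition derived from the reported frequency/duration ranges (illustrative).
def classify_call(dominant_freq_hz: float, duration_s: float) -> str:
    """Illustrative thresholds only, taken from the experimental examples above."""
    if 2500.0 <= dominant_freq_hz <= 4000.0 and duration_s <= 0.2:
        return "normal poultry sound"
    if 1000.0 <= dominant_freq_hz <= 1500.0 and duration_s >= 0.6:
        return "abnormal poultry sound (rales-like)"
    return "non-poultry or undetermined"

print(classify_call(3000.0, 0.1))   # -> normal poultry sound
print(classify_call(1200.0, 0.7))   # -> abnormal poultry sound (rales-like)
```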
Referring to FIG. 12, which is a classification schematic diagram 1200 of a support vector machine according to an embodiment of the present disclosure. The artificial intelligence sound model 160 determines, through a support vector machine, each of the sound features C_S as being in a poultry sound state or a non-poultry sound state; if a sound feature belongs to the poultry sound state, the artificial intelligence sound model 160 further determines, through the support vector machine, whether that sound feature belongs to the normal poultry sound state or the abnormal poultry sound state.
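A minimal two-stage SVM cascade of this kind can be sketched with scikit-learn as follows; the feature matrix, labels, and kernel settings are placeholders, not the trained model 160 itself.

```python
# Two-stage SVM cascade: poultry vs. non-poultry, then normal vs. abnormal (illustrative only).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 384))              # placeholder 384-dim feature vectors
y = rng.integers(0, 3, size=300)                 # 0 = normal, 1 = abnormal, 2 = non-poultry

stage1 = SVC(kernel="rbf").fit(X, (y != 2).astype(int))   # poultry (1) vs. non-poultry (0)
poultry = y != 2
stage2 = SVC(kernel="rbf").fit(X[poultry], y[poultry])    # normal (0) vs. abnormal (1)

def predict(x: np.ndarray) -> str:
    x = x.reshape(1, -1)
    if stage1.predict(x)[0] == 0:
        return "non-poultry sound state"
    return "normal poultry sound state" if stage2.predict(x)[0] == 0 else "abnormal poultry sound state"

print(predict(X[0]))
```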
In some embodiments, the artificial intelligence sound model 160 is trained with a total of 150 pieces of normal poultry sound data, 150 pieces of abnormal poultry sound data, and 150 pieces of non-poultry sound data; the three classes of training-set data are shown in Table 2. Evaluated with a confusion matrix and 10-fold cross-validation, the trained artificial intelligence sound model 160 achieves a validation accuracy of 84.2% in identifying these three classes of data; the validation results of the artificial intelligence sound model 160 are shown in Table 3.
Table 2: (provided as an image in the original publication)
Table 3: (provided as an image in the original publication)
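The evaluation protocol described above (10-fold cross-validation plus a confusion matrix) can be reproduced in outline as follows; the synthetic data stands in for the 450 labeled clips, so the printed numbers will not match the reported 84.2%.

```python
# 10-fold cross-validation and confusion matrix for a 3-class SVM (synthetic stand-in data).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.standard_normal((450, 384))          # stand-in for 150 + 150 + 150 feature vectors
y = np.repeat([0, 1, 2], 150)                # 0 = normal, 1 = abnormal, 2 = non-poultry

pred = cross_val_predict(SVC(kernel="rbf"), X, y, cv=10)
print("validation accuracy:", accuracy_score(y, pred))
print(confusion_matrix(y, pred))
```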
In step S340, the feature analysis module 130 analyzes each of the sound features C_S and determines a sound state S_S of each of the sound features C_S. In some embodiments, in the step of analyzing each of the sound features C_S to determine the sound state S_S of each of the sound features C_S, the determination is made through the artificial intelligence sound model 160 (as described in step S330).
In step S350, the sound state S_S of the recording information I_R (or the sound state S_S of each of the sound features C_S) is stored in a database 140 through a network. In some embodiments, the database 140 is a cloud database used to store the historical information of the sound states S_S.
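Step S350 essentially appends timestamped classification results to persistent storage. The following sketch uses a local SQLite table as a stand-in for the cloud database 140; the table name, columns, and identifiers are assumptions.

```python
# Storing timestamped sound states (step S350), with SQLite standing in for the cloud database 140.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("sound_states.db")     # hypothetical local file
conn.execute(
    "CREATE TABLE IF NOT EXISTS sound_state ("
    "  recorded_at TEXT, house_id TEXT, clip_id TEXT, state TEXT)"
)

def store_state(house_id: str, clip_id: str, state: str) -> None:
    conn.execute(
        "INSERT INTO sound_state VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), house_id, clip_id, state),
    )
    conn.commit()

store_state("house-01", "clip_0001", "abnormal poultry sound state")
print(conn.execute("SELECT COUNT(*) FROM sound_state").fetchone())
```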
In step S360, a user is allowed to view the sound states S_S in the database 140 through the network.
Referring to FIG. 13 to FIG. 16: FIG. 13 shows the temperature and humidity inside the poultry house at each day of age according to an embodiment of the present disclosure; FIG. 14 shows the predicted number of sound data items per day at each day of age; FIG. 15 shows the predicted number of abnormal poultry sounds per day at each day of age; and FIG. 16 shows the proportion of each class in the daily prediction results. In FIG. 14 to FIG. 16, the manually observed abnormality day D denotes the date on which a practitioner observed abnormal poultry sounds (for example, poultry with rales); the same notation also applies to the subsequent figures and is explained here first. As shown in FIG. 14 and FIG. 16, although the practitioner reported finding some poultry with rales at 20 days of age, the number of normal poultry sounds is far greater than the number of abnormal poultry sounds, so the change in the number of abnormal poultry sounds is difficult to observe there. As shown in FIG. 15, when the change in the number of abnormal poultry sounds is compiled separately, the proportion of abnormal poultry sounds begins to increase markedly at 18 days of age and begins to decline at 23 days of age.
Referring to FIG. 17A to FIG. 17C: FIG. 17A shows the predicted number of sound data items from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present disclosure; FIG. 17B shows the predicted number from 2 p.m. to 10 p.m. at each day of age; and FIG. 17C shows the predicted number from 10 p.m. to 6 a.m. of the following day.
Further, compiling the change in the number of abnormal poultry sounds separately, refer to FIG. 18A to FIG. 18C: FIG. 18A shows the predicted number of abnormal poultry sound data items from 6 a.m. to 2 p.m. at each day of age according to an embodiment of the present disclosure; FIG. 18B shows the predicted number from 2 p.m. to 10 p.m. at each day of age; and FIG. 18C shows the predicted number from 10 p.m. to 6 a.m. of the following day. In this embodiment, as confirmed by practitioners, about 12,000 pieces of sound data were collected per day on average over this two-week period, and the accuracy with which sounds that were actually abnormal poultry sounds were identified as abnormal poultry sounds was 99.3%. Therefore, the prediction results for abnormal poultry sounds produced by the system and method proposed herein can be provided to a user (for example, a practitioner) as a basis for assessing the health status of the poultry. In addition, whereas the practitioner observed abnormal poultry sounds at 20 days of age, the system and method proposed herein were able to observe abnormal poultry sounds at 18 days of age, which helps users (for example, practitioners) take action sooner.
According to some embodiments, the present disclosure provides a poultry voiceprint identification system and method that can receive recording information of a poultry house in a time period, convert the recording information into an image (for example, a planar image such as a waveform diagram or a time-frequency diagram), and analyze the recording information according to several image indicators of the image (for example, identification conditions such as a frequency difference or a duration difference) to determine the sound state of the recording information, where the sound state includes a normal poultry sound state and/or an abnormal poultry sound state. According to some embodiments, the step of analyzing the recording information according to the image indicators of the image to determine the sound state of the recording information may also be implemented in combination with an artificial intelligence sound model.
With the poultry voiceprint identification system and method of the embodiments of the present disclosure, even when the recording information received in the poultry house contains many mixed sounds, the improvements to the computer software and hardware combined with the artificial intelligence sound model make it possible to identify the occurrence of abnormal poultry sounds accurately and quickly.
In summary, although the present disclosure has been described above by way of embodiments, these embodiments are not intended to limit the present disclosure. Those skilled in the art may make various changes and modifications without departing from the spirit and scope of the present disclosure. Therefore, the scope of protection of the present disclosure shall be as defined by the appended claims.

Claims (14)

1. A poultry voiceprint identification method, characterized in that the poultry voiceprint identification method comprises the following steps:
receiving recording information of a poultry house in a time period; and
analyzing the recording information to determine a sound state of the recording information, the sound state comprising a normal poultry sound state or an abnormal poultry sound state.
2. The poultry voiceprint identification method according to claim 1, characterized in that the poultry voiceprint identification method further comprises the following steps:
transforming the recording information into a plurality of sound features, wherein the step of transforming the recording information into the plurality of sound features comprises the following steps:
filtering the recording information to generate filtered recording information having a specific frequency range;
dividing the filtered recording information into a plurality of pieces of sound information; and
extracting the plurality of sound features from the plurality of pieces of sound information,
wherein the step of analyzing the recording information to determine the sound state of the recording information is analyzing each of the plurality of sound features to determine the sound state of each of the plurality of sound features.
3. The poultry voiceprint identification method according to claim 2, characterized in that the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by band-pass filtering and spectral subtraction.
4. The poultry voiceprint identification method according to claim 2, characterized in that the step of extracting the plurality of sound features from the plurality of pieces of sound information is implemented by openSMILE, Wavelet, or the short-time Fourier transform.
5. The poultry voiceprint identification method according to claim 1, characterized in that the poultry voiceprint identification method further comprises the following steps:
storing the sound state of the recording information in a database through a network; and
allowing a user to view the sound state in the database through the network.
6. The poultry voiceprint identification method according to claim 2, characterized in that in the step of analyzing each of the plurality of sound features to determine the sound state of each of the plurality of sound features, the determination is made through an artificial intelligence sound model, the artificial intelligence sound model generates a training set according to the plurality of sound features, and the training set comprises an identification condition that records the normal poultry sound state and the abnormal poultry sound state.
7. The poultry voiceprint identification method according to claim 6, characterized in that the artificial intelligence sound model determines, through a support vector machine, each of the plurality of sound features as being in a poultry sound state or a non-poultry sound state; if a sound feature belongs to the poultry sound state, the artificial intelligence sound model further determines, through the support vector machine, each of the plurality of sound features belonging to the poultry sound state as being in the normal poultry sound state or the abnormal poultry sound state.
8. A poultry voiceprint identification system, characterized in that the poultry voiceprint identification system is installed on a server connected to a network, and the poultry voiceprint identification system comprises:
a receiver, disposed in a poultry house, configured to receive recording information of the poultry house in a time period; and
a feature analysis module, configured to analyze the recording information to determine a sound state of the recording information, the sound state comprising a normal poultry sound state or an abnormal poultry sound state.
9. The poultry voiceprint identification system according to claim 8, characterized in that the poultry voiceprint identification system further comprises:
a feature processing module, configured to transform the recording information into a plurality of sound features, wherein the feature processing module further comprises:
a filtering unit, configured to filter the recording information to generate filtered recording information having a specific frequency range;
a segmentation unit, configured to divide the filtered recording information into a plurality of pieces of sound information; and
an extraction unit, configured to extract the plurality of sound features from the plurality of pieces of sound information,
wherein, in the step of analyzing the recording information to determine the sound state of the recording information, the feature analysis module is configured to analyze each of the plurality of sound features to determine the sound state of each of the plurality of sound features.
10. The poultry voiceprint identification system according to claim 9, characterized in that the filtering unit is further configured such that:
the step of filtering the recording information to generate the filtered recording information having the specific frequency range is implemented by band-pass filtering and spectral subtraction.
11. The poultry voiceprint identification system according to claim 9, characterized in that the extraction unit is further configured such that:
the step of extracting the plurality of sound features from the plurality of pieces of sound information is implemented by openSMILE, Wavelet, or the short-time Fourier transform.
12. The poultry voiceprint identification system according to claim 8, characterized in that the poultry voiceprint identification system further comprises:
a database, configured to store the sound state of the recording information through the network; and
a user interface, configured to allow a user to view the sound state in the database through the network.
13. The poultry voiceprint identification system according to claim 9, characterized in that the feature analysis module is further configured such that:
in the step of analyzing each of the plurality of sound features to determine the sound state of each of the plurality of sound features, the determination is made through an artificial intelligence sound model, the artificial intelligence sound model generates a training set according to the plurality of sound features and uses the training set to train the feature analysis module, and the training set comprises an identification condition that records the normal poultry sound state and the abnormal poultry sound state.
14. The poultry voiceprint identification system according to claim 13, characterized in that the artificial intelligence sound model determines, through a support vector machine, each of the plurality of sound features as being in a poultry sound state or a non-poultry sound state; if a sound feature belongs to the poultry sound state, the artificial intelligence sound model further determines, through the support vector machine, each of the plurality of sound features belonging to the poultry sound state as being in the normal poultry sound state or the abnormal poultry sound state.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/092354 WO2023216172A1 (en) 2022-05-12 2022-05-12 Poultry voiceprint recognition method and system


Publications (1)

Publication Number Publication Date
WO2023216172A1 (en)





Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8915215B1 (en) * 2012-06-21 2014-12-23 Scott A. Helgeson Method and apparatus for monitoring poultry in barns
CN106691451A (en) * 2016-07-11 2017-05-24 山东省农业科学院家禽研究所(山东省无特定病原鸡研究中心) Sick poultry and location system and location method for floor-rearing birdhouse
CN109599120A (en) * 2018-12-25 2019-04-09 哈尔滨工程大学 One kind being based on large-scale farming field factory mammal abnormal sound monitoring method
CN113219964A (en) * 2021-03-30 2021-08-06 广州朗国电子科技有限公司 Poultry house environment inspection and regulation method, equipment and medium
CN113539294A (en) * 2021-05-31 2021-10-22 河北工业大学 Method for collecting and identifying sound of abnormal state of live pig


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22941130

Country of ref document: EP

Kind code of ref document: A1