CN114420303A - Auxiliary COVID-19 screening method based on sound features - Google Patents


Info

Publication number
CN114420303A
Authority
CN
China
Prior art keywords
model, data, sound, detection, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111650411.1A
Other languages
Chinese (zh)
Inventor
刘永昌
章恒靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xundalke Suzhou Computer Information Technology Co ltd
Original Assignee
Xundalke Suzhou Computer Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xundalke Suzhou Computer Information Technology Co ltd filed Critical Xundalke Suzhou Computer Information Technology Co ltd
Priority to CN202111650411.1A priority Critical patent/CN114420303A/en
Publication of CN114420303A publication Critical patent/CN114420303A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques characterised by the analysis technique using neural networks


Abstract

The invention relates to the relevant technical field, and in particular to an auxiliary COVID-19 screening method based on sound features, comprising the following steps: step one, an acquisition stage, in which the user's raw data are collected; step two, a processing stage, in which the raw data are preprocessed and feature engineering is performed; step three, a training stage, in which pre-training is carried out on a pre-training model and a sound feature vector is extracted by transfer learning; step four, a detection stage, in which the new feature dimensions and the sound feature vector are passed to a detection model and infected individuals are identified. Through data analysis, data preprocessing, feature engineering, and model detection, the invention detects COVID-19 patients from sound data alone, without complex acquisition equipment; patients can be identified quickly and in a timely manner, greatly improving the timeliness and efficiency of detection.

Description

Auxiliary COVID-19 screening method based on sound features
Technical Field
The invention relates to auxiliary screening methods, and in particular to an auxiliary COVID-19 screening method based on sound features.
Background
Nucleic acid testing detects the virus's nucleic acid: it determines whether a patient is infected with the novel coronavirus by checking whether the nucleic acid of an invading virus is present in a respiratory specimen, blood, or feces. A positive nucleic acid result therefore indicates that the virus is present in the patient's body. However, nucleic acid testing has drawbacks in efficiency and cost: a complete testing process comprises sample collection, storage and transport, nucleic acid extraction, detection, and interpretation of results; the whole process takes at least 2 hours, so results cannot be obtained in real time.
Therefore, auxiliary COVID-19 screening methods need to be improved to address these problems.
Disclosure of Invention
The aim of the invention is to provide an auxiliary COVID-19 screening method based on sound features, which uses sound data, requires no complex acquisition equipment, and detects COVID-19 patients through data analysis, data preprocessing, feature engineering, and model detection. It can identify and detect COVID-19 patients quickly and in real time, greatly improving the timeliness and efficiency of detection.
To achieve this aim, the invention adopts the following main technical scheme:
an auxiliary COVID-19 screening method based on sound features, comprising the following steps:
step one: an acquisition stage, in which the user's raw data are collected;
step two: a processing stage, in which the raw data are preprocessed according to personal experience and expert opinion, feature engineering is performed, and relevant raw data are extracted to identify abnormal behaviour and generate new feature dimensions;
step three: a training stage, in which pre-training is performed on a pre-training model using public data, and a sound feature vector is extracted by transfer learning;
step four: a detection stage, in which the new feature dimensions and the sound feature vector are passed to a detection model and infected individuals are identified.
Under this scheme, the raw data are first studied: data analysis is performed on basic characteristics of the user, such as disease history, current symptoms, and whether a COVID-19 test was positive, in order to understand these basic characteristics. The data are then preprocessed according to related inputs such as personal experience and expert opinion, feature engineering is performed, and data dimensions that are meaningful for identifying abnormal behaviour are extracted or abstracted. A model is pre-trained with a public data set and sound feature vectors are extracted by transfer learning; these dimensions are passed as input to a detection model, which identifies infected individuals. The method uses sound data, needs no complex acquisition equipment, can run quickly and in real time, is easy to deploy and integrate, and can rapidly identify COVID-19 patients.
Preferably, the raw data include:
age and sex;
disease history, a brief description of whether the user currently suffers from a respiratory disease;
symptoms recorded for the user, including dry cough, wet cough, sore throat, headache, fatigue, chest tightness, and shortness of breath;
a cough recording, a segment of audio in which the user coughs;
a deep-breathing recording, a segment of audio in which the user breathes deeply;
and the test result, which marks whether the user is a confirmed patient; this field is the label data.
Preferably, the sampling rate of the cough recording is 22 kHz, and the user is asked to produce 3-5 coughs while recording; the sampling rate of the deep-breathing recording is likewise 22 kHz, and the user is asked to take 3-5 deep breaths through the mouth while recording.
Preferably, the feature engineering processes the raw data as follows:
audio duration: the audio is denoised and trimmed;
note onset: the time point at which a note occurs;
rhythm features: global acoustic rhythm features of the whole audio, measuring how often peaks occur;
pitch period: the period of voiced sound, a parameter describing the excitation source in speech signal processing;
root-mean-square energy: the square root of the mean squared amplitude over a time window, a measure of perceived loudness usable for event detection;
spectral centroid: describes where the "centre of mass" of the spectrum lies, used as a detection feature;
roll-off frequency: a measure of spectral shape, used for audio event detection;
zero-crossing rate: the rate at which the signal changes sign, used as a detection feature;
mel-frequency cepstral coefficients: a compact description of the overall shape of the spectral envelope, used as features for sound signal processing.
Preferably, the pre-training model is a VGGish model and the detection model is a support vector machine (SVM); the algorithm proceeds as follows:
the raw data are cleaned and de-duplicated to obtain preprocessed data;
feature extraction is performed on the cleaned data by the feature engineering above, yielding a feature vector;
the cleaned data are fed into the VGGish model for transfer learning, yielding an embedding vector;
the embedding vector and the feature vector are concatenated and reduced by PCA to a 128-dimensional vector;
the 128-dimensional vector is fed into the SVM model, and a binary classifier is built with a Gaussian (RBF) kernel;
the SVM model is trained and the detection result is output.
Under this scheme, parameters such as the spectral centroid and the mel-frequency cepstral coefficients are time series and cannot be used directly as model inputs, so after feature engineering some summary statistics must be extracted as the features that are finally fed to the model for training.
The invention then introduces transfer learning to train the model, using the VGGish model. VGGish is a convolutional neural network that takes raw audio as input; it was trained on a large-scale YouTube data set and can be used to extract audio features. Here, the VGGish model serves as a feature extractor: the raw audio is fed into VGGish, converted by transfer learning into a 128-dimensional embedding vector, and then passed to the next classification model.
Preferably, the VGGish model is a convolutional neural network taking raw audio as input, trained on a large-scale YouTube data set and used to extract audio features.
Preferably, PCA linearly transforms the raw data into a set of components that are linearly uncorrelated across dimensions, extracting the principal feature components of the data; PCA reduces the dimensionality to 128, and the result is fed into the detection model.
Preferably, the support vector machine (SVM) model finds the optimal separating hyperplane in feature space so that the margin between positive and negative samples on the training set is maximal. The hyperplane is:
w^T x + b = 0, where w = (w_1, w_2, ..., w_d) is the normal vector and b is the bias term, which determines the distance between the hyperplane and the origin;
assuming the hyperplane classifies every sample (x_i, y_i) correctly, then:
w^T x_i + b ≥ +1 for y_i = +1; w^T x_i + b ≤ -1 for y_i = -1;
the vectors closest to the hyperplane make these equalities hold; they are the support vectors, and the margin between two heterogeneous support vectors across the hyperplane is:
γ = 2 / ‖w‖;
the maximum-margin problem is:
max_{w,b} 2 / ‖w‖
s.t. y_i (w^T x_i + b) ≥ 1, i = 1, 2, ..., m.
Under this scheme, the detection model is a simple binary classifier implemented with an SVM, which finds the optimal separating hyperplane in feature space so that the margin between positive and negative samples on the training set is maximal. The SVM is a supervised learning algorithm for binary classification; once the kernel method is introduced it can also solve nonlinear problems, and because the data dimensionality is high, a Gaussian (RBF) kernel is chosen so that the SVM classifies in a high-dimensional space.
Compared with the prior art, the invention needs no thermometer gun or nucleic acid test for identification; based on the user's basic data such as cough, breathing, and symptoms, it extracts features from the user's sound through data analysis and related steps, so that COVID-19 patients can be identified quickly.
The invention has at least the following beneficial effects:
it uses sound data, needs no complex acquisition equipment, and detects COVID-19 patients through data analysis, data preprocessing, feature engineering, and model detection; it can identify and detect COVID-19 patients quickly and in real time, greatly improving the timeliness and efficiency of detection.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a block diagram of the present invention;
FIG. 3 is a diagram of a feature engineering structure of the present invention;
FIG. 4 is a schematic diagram of the algorithm implementation of the present invention;
FIG. 5 is a flow chart of the algorithm implementation of the present invention.
Detailed Description
Embodiments of the present application will be described in detail with reference to the drawings and examples, so that how to implement technical means to solve technical problems and achieve technical effects of the present application can be fully understood and implemented.
As shown in fig. 1 to fig. 5, the auxiliary COVID-19 screening method based on sound features provided by this embodiment comprises the following steps:
step one: an acquisition stage, in which the user's raw data are collected;
step two: a processing stage, in which the raw data are preprocessed according to personal experience and expert opinion, feature engineering is performed, and relevant raw data are extracted to identify abnormal behaviour and generate new feature dimensions;
step three: a training stage, in which pre-training is performed on a pre-training model using public data, and a sound feature vector is extracted by transfer learning;
step four: a detection stage, in which the new feature dimensions and the sound feature vector are passed to a detection model and infected individuals are identified.
The raw data are first studied: data analysis is performed on basic characteristics of the user, such as disease history, current symptoms, and whether a COVID-19 test was positive, to understand these basic characteristics. The data are then preprocessed according to related inputs such as personal experience and expert opinion, feature engineering is performed, and data dimensions that are meaningful for identifying abnormal behaviour are extracted or abstracted. A model is then pre-trained with a public data set, sound feature vectors are extracted by transfer learning, and finally these dimensions are passed as input to a detection model, which identifies infected individuals. The method uses sound data, needs no complex acquisition equipment, can run quickly and in real time, is easy to deploy and integrate, and can rapidly identify COVID-19 patients.
The raw data include:
age and sex;
disease history, a brief description of whether the user currently suffers from a respiratory disease, used to judge whether the user already has a disease other than COVID-19;
symptoms recorded for the user, including dry cough, wet cough, sore throat, headache, fatigue, chest tightness, and shortness of breath;
a cough recording, a segment of audio in which the user coughs;
a deep-breathing recording, a segment of audio in which the user breathes deeply;
and the test result, which marks whether the user is a confirmed patient; this field is the label data.
The sampling rate of the cough recording is 22 kHz, and the user is asked to produce 3-5 coughs while recording; the sampling rate of the deep-breathing recording is likewise 22 kHz, and the user is asked to take 3-5 deep breaths through the mouth while recording.
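The patent does not specify how recordings are checked against these requirements. As an illustrative sketch only (the function names, frame size, and threshold are my own assumptions, not part of the patent), a cough recording could be validated by confirming the 22 kHz rate and counting energy bursts:

```python
import numpy as np

SAMPLE_RATE = 22050  # ~22 kHz, as specified for both recordings

def count_bursts(audio, frame=512, threshold=0.05):
    """Count contiguous high-energy regions (e.g. coughs) in a mono signal."""
    n_frames = len(audio) // frame
    rms = np.sqrt(np.mean(
        audio[:n_frames * frame].reshape(n_frames, frame) ** 2, axis=1))
    active = rms > threshold
    # A burst starts wherever activity switches from False to True.
    starts = np.flatnonzero(active[1:] & ~active[:-1])
    return len(starts) + int(active[0])

def is_valid_cough_recording(audio, rate):
    """Require the specified sampling rate and 3-5 distinct coughs."""
    return rate == SAMPLE_RATE and 3 <= count_bursts(audio) <= 5

# Synthetic example: 4 noise bursts of 0.2 s separated by silence.
rng = np.random.default_rng(0)
audio = np.zeros(SAMPLE_RATE * 4, dtype=float)
for k in range(4):
    start = k * SAMPLE_RATE + SAMPLE_RATE // 2
    audio[start:start + SAMPLE_RATE // 5] = 0.5 * rng.standard_normal(SAMPLE_RATE // 5)

print(count_bursts(audio))                     # 4
print(is_valid_cough_recording(audio, 22050))  # True
```

A real implementation would also denoise before thresholding, as the feature-engineering step below describes.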
The raw data are then converted into types on which feature engineering can be performed, and missing or erroneous records are cleaned. The invention mainly analyses the spectral data of the sound by applying feature engineering to the raw sound data, manually extracting some features from the audio sequence. The feature engineering processes the raw data as follows:
audio duration: the audio is denoised and trimmed to obtain the duration of the real sound segment from beginning to end;
note onset: the time point at which a note occurs, characterised by a sudden increase in energy or a change in the spectral power distribution;
rhythm features: global acoustic rhythm features of the whole audio, measuring how often peaks occur;
pitch period: the sound signal is divided into unvoiced and voiced sound according to the mode of vocal-cord vibration; voiced sound requires periodic vibration of the vocal cords and therefore has marked periodicity, and the pitch period is one of the important parameters describing the excitation source in speech signal processing;
root-mean-square energy: the square root of the mean squared amplitude over a time window, a measure of perceived loudness usable for event detection;
spectral centroid: describes where the "centre of mass" of the spectrum lies, used as a detection feature;
roll-off frequency: a measure of spectral shape, used for audio event detection;
zero-crossing rate: the rate at which the signal changes sign, i.e., the number of times per frame that the speech signal goes from positive to negative or vice versa; this is one of the feature parameters commonly used in speech recognition;
mel-frequency cepstral coefficients: a small set of features that concisely describe the overall shape of the spectral envelope, one of the most important features in sound signal processing.
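Three of the features above have simple closed forms; as an illustrative sketch (not code from the patent; the test signal and function names are my own), root-mean-square energy, zero-crossing rate, and spectral centroid can be computed directly with NumPy:

```python
import numpy as np

RATE = 22050  # sampling rate used for the cough and breathing recordings

def rms_energy(x):
    """Square root of the mean squared amplitude (perceived loudness)."""
    return np.sqrt(np.mean(x ** 2))

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    return np.mean(np.signbit(x[1:]) != np.signbit(x[:-1]))

def spectral_centroid(x, rate):
    """Magnitude-weighted mean frequency: the 'centre of mass' of the spectrum."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / rate)
    return np.sum(freqs * mag) / np.sum(mag)

# A 1 s, 441 Hz sine: an exact number of periods, so the values are clean.
t = np.arange(RATE) / RATE
x = np.sin(2 * np.pi * 441 * t + 0.1)

print(round(rms_energy(x), 3))            # 0.707 (1/sqrt(2))
print(round(zero_crossing_rate(x), 3))    # 0.04  (2 * 441 / 22050)
print(round(spectral_centroid(x, RATE)))  # 441 Hz
```

In practice these would be computed per frame over the recording, giving the time series that the text says must be summarised into statistics before training.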
In this embodiment, as shown in fig. 5, the pre-training model is a VGGish model and the detection model is a support vector machine (SVM); the algorithm proceeds as follows:
the raw data are cleaned and de-duplicated to obtain preprocessed data;
feature extraction is performed on the cleaned data by feature engineering, yielding feature vectors;
the cleaned data are fed into the VGGish model for transfer learning, yielding an embedding vector;
the embedding vector and the feature vector are concatenated and reduced by PCA to a 128-dimensional vector;
the 128-dimensional vector is fed into the SVM model, and a binary classifier is built with a Gaussian (RBF) kernel;
the SVM model is trained and the detection result is output.
Because parameters such as the spectral centroid and the mel-frequency cepstral coefficients are time series and cannot be used directly as model inputs, after feature engineering some summary statistics must be extracted as the features that are finally fed into the model for training.
The invention then introduces transfer learning to train the model, using the VGGish model. VGGish is a convolutional neural network that takes raw audio as input; it was trained on a large-scale YouTube data set and can be used to extract audio features. Here, the VGGish model serves as a feature extractor: the raw audio is fed into VGGish, converted by transfer learning into a 128-dimensional embedding vector, and then passed to the next classification model.
The VGGish model is a convolutional neural network taking raw audio as input, trained on a large-scale YouTube data set and used to extract audio features. Because the VGGish pre-training model outputs a 128-dimensional vector and feature engineering brings the final vector to 433 dimensions, the feature dimensionality must be reduced before training. PCA, a data analysis method, is used for this: it linearly transforms the raw data into a set of components that are linearly uncorrelated across dimensions, can extract the principal feature components of the data, and is commonly used to reduce the dimensionality of high-dimensional data. Finally, PCA reduces the dimensionality to 128, and the result is fed into the detection model.
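A minimal sketch of this reduction step using scikit-learn (the sample count and random placeholder features are mine; the patent only fixes the 433-dimensional input and 128-dimensional output):

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder stand-in for the combined vectors: a 128-d VGGish-style
# embedding concatenated with engineered features, 433 dimensions in total.
rng = np.random.default_rng(42)
embeddings = rng.standard_normal((200, 128))
engineered = rng.standard_normal((200, 305))
combined = np.hstack([embeddings, engineered])   # shape (200, 433)

# Reduce to the 128 dimensions that are fed to the detection model.
pca = PCA(n_components=128)
reduced = pca.fit_transform(combined)

print(combined.shape)  # (200, 433)
print(reduced.shape)   # (200, 128)
```

Note that PCA can yield at most min(n_samples, n_features) components, so at least 128 recordings are needed for this step to produce a 128-dimensional output.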
The SVM model finds the optimal separating hyperplane in feature space so that the margin between positive and negative samples on the training set is maximal. The hyperplane is:
w^T x + b = 0, where w = (w_1, w_2, ..., w_d) is the normal vector and b is the bias term, which determines the distance between the hyperplane and the origin;
assuming the hyperplane classifies every sample (x_i, y_i) correctly, then:
w^T x_i + b ≥ +1 for y_i = +1; w^T x_i + b ≤ -1 for y_i = -1;
the vectors closest to the hyperplane make these equalities hold; they are the support vectors, and the margin between two heterogeneous support vectors across the hyperplane is:
γ = 2 / ‖w‖;
the maximum-margin problem is:
max_{w,b} 2 / ‖w‖
s.t. y_i (w^T x_i + b) ≥ 1, i = 1, 2, ..., m.
The detection model is a simple binary classifier implemented with an SVM, which finds the optimal separating hyperplane in feature space so that the margin between positive and negative samples on the training set is maximal. The SVM is a supervised learning algorithm for binary classification; once the kernel method is introduced it can also solve nonlinear problems, and because the data dimensionality is high, a Gaussian (RBF) kernel is chosen so that the SVM classifies in a high-dimensional space.
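A sketch of this classification step with scikit-learn's `SVC` (the synthetic 128-dimensional clusters below are stand-ins; in the patent these would be the PCA-reduced vectors with confirmed-patient labels):

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for the 128-d PCA output: two shifted Gaussian clusters,
# labelled 1 (confirmed patient) and 0 (not a patient).
rng = np.random.default_rng(7)
positive = rng.standard_normal((60, 128)) + 2.0
negative = rng.standard_normal((60, 128)) - 2.0
X = np.vstack([positive, negative])
y = np.array([1] * 60 + [0] * 60)

# Gaussian (RBF) kernel, as chosen in the text for high-dimensional data.
clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X, y)

print(clf.score(X, y))               # training accuracy on the toy data
print(clf.predict(positive[:1])[0])  # label for a point near the positive cluster
```

On real screening data, accuracy would of course be evaluated on a held-out set rather than the training set.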
Compared with the prior art, the invention needs no thermometer gun or nucleic acid test for identification; based on the user's basic data such as cough, breathing, and symptoms, it extracts features from the user's sound through data analysis and related steps, so that COVID-19 patients can be identified quickly.
As used in the specification and in the claims, certain terms are used to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This specification and the claims do not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms "include" and "comprise" are used in an open-ended fashion and should be interpreted to mean "including, but not limited to". "Substantially" means within an acceptable error range; a person skilled in the art can solve the technical problem within a certain error range and substantially achieve the technical effect.
It is noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a good or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such good or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of additional like elements in the article or system in which the element is included.
The foregoing description shows and describes several preferred embodiments of the invention, but as aforementioned, it is to be understood that the invention is not limited to the forms disclosed herein, but is not to be construed as excluding other embodiments and is capable of use in various other combinations, modifications, and environments and is capable of changes within the scope of the inventive concept as expressed herein, commensurate with the above teachings, or the skill or knowledge of the relevant art. And that modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. An auxiliary COVID-19 screening method based on sound features, characterised by comprising the following steps:
step one: an acquisition stage, in which the user's raw data are collected;
step two: a processing stage, in which the raw data are preprocessed according to personal experience and expert opinion, feature engineering is performed, and relevant raw data are extracted to identify abnormal behaviour and generate new feature dimensions;
step three: a training stage, in which pre-training is performed on a pre-training model using public data, and a sound feature vector is extracted by transfer learning;
step four: a detection stage, in which the new feature dimensions and the sound feature vector are passed to a detection model and infected individuals are identified.
2. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the raw data comprise:
age and sex;
a disease history, briefly describing whether the user currently suffers from a respiratory disease;
symptoms, recording whether the user has dry cough, wet cough, sore throat, headache, fatigue, chest tightness, or shortness of breath;
a cough recording, capturing a sound band in which the user coughs;
a deep-breathing recording, capturing a sound band of the user breathing deeply;
and a detection result, marking whether the user is a confirmed case; this field is the label data.
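As a sketch only, the raw-data fields above could be held in a record such as the following; all field names are hypothetical, not taken from the patent:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical container for the raw-data fields listed in claim 2.

@dataclass
class RawRecord:
    age: int
    sex: str
    disease_history: str            # current respiratory disease, if any
    symptoms: List[str]             # e.g. "dry cough", "chest tightness"
    cough_wav: List[float]          # sound band of 3-5 coughs at 22 kHz
    breath_wav: List[float]         # sound band of 3-5 deep mouth breaths
    label: int = 0                  # detection result: 1 = confirmed case

r = RawRecord(40, "F", "none", ["dry cough"], [0.0] * 22050, [0.0] * 22050)
print(r.label)
```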
3. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 2, wherein the cough recording is sampled at 22 kHz and the user is required to cough 3-5 times during recording, and the deep-breathing recording is sampled at 22 kHz and the user is required to take 3-5 deep breaths through the mouth during recording.
4. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the feature engineering applied to the raw data specifically comprises:
audio duration: denoising and trimming the audio;
note onset: the time point at which a note occurs;
rhythm features: global acoustic-rhythm features of the whole audio, measuring how frequently peaks occur;
pitch period: the period of voiced sound, detected as a parameter describing the excitation source in speech-signal processing;
root-mean-square energy: the square root of the mean squared amplitude over a period of time, used for perceived loudness; this loudness can be used for event detection;
spectral centroid: describes where the "centre of mass" of the sound lies, used as a detection feature;
roll-off frequency: a measure of signal shape, used for audio event detection;
zero-crossing rate: the rate at which the signal changes sign, used as a detection feature;
Mel-frequency cepstral coefficients: the overall shape of the spectral envelope, used as a feature in sound-signal processing.
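Three of the features above (RMS energy, zero-crossing rate, spectral centroid) can be computed directly in NumPy, as a rough sketch; real pipelines typically use an audio library such as librosa, and the 440 Hz test tone is purely illustrative:

```python
import numpy as np

# Minimal NumPy versions of three of the claimed features.

def rms_energy(x):
    # Root-mean-square energy: square root of the mean squared amplitude.
    return float(np.sqrt(np.mean(np.square(x))))

def zero_crossing_rate(x):
    # Fraction of adjacent sample pairs whose signs differ.
    return float(np.mean(np.signbit(x[:-1]) != np.signbit(x[1:])))

def spectral_centroid(x, sr):
    # Magnitude-weighted mean frequency ("centre of mass" of the spectrum).
    mags = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(np.sum(freqs * mags) / np.sum(mags))

sr = 22050                                   # claimed sampling rate, 22 kHz
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)         # 1 s of a 440 Hz test tone
print(round(spectral_centroid(tone, sr)))    # centroid of a pure tone is
                                             # its own frequency: 440
```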
5. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the pre-trained model is a VGGish model, the detection model is a support vector machine (SVM) model, and the algorithm proceeds as follows:
5.1: cleaning and deduplicating the raw data to obtain preprocessed data;
5.2: extracting features from the cleaned preprocessed data through the feature engineering to obtain a feature vector;
5.3: feeding the cleaned preprocessed data into the VGGish model for transfer learning to obtain an embedding vector;
5.4: concatenating the embedding vector and the feature vector, and reducing the dimensionality by PCA to obtain a 128-dimensional vector;
5.5: feeding the 128-dimensional vector into the support vector machine (SVM) model and constructing a binary classifier with a Gaussian kernel function;
5.6: training the SVM model and outputting the detection result.
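Steps 5.4-5.6 can be sketched with scikit-learn on synthetic stand-in data: the concatenated embedding-plus-feature vector is reduced to 128 dimensions by PCA and fed to an SVM with a Gaussian (RBF) kernel. The data, the 160-dimensional input size, and the hyperparameters are illustrative assumptions, not the patent's:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in for the concatenated VGGish embedding + hand-crafted features.
X, y = make_classification(n_samples=400, n_features=160, n_informative=40,
                           n_redundant=10, class_sep=2.0, random_state=0)

# Step 5.4: PCA to 128 dims; steps 5.5-5.6: Gaussian-kernel binary SVM.
model = make_pipeline(PCA(n_components=128), SVC(kernel="rbf"))
model.fit(X[:300], y[:300])
print(model.score(X[300:], y[300:]))         # held-out accuracy
```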
6. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the VGGish model is a convolutional neural network that takes raw audio as input, is trained on a large-scale YouTube dataset, and is used to extract corpus features.
7. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the PCA transforms the original data by a linear transformation into a set of representations that are linearly uncorrelated across dimensions, used to extract the principal feature components of the data; the dimensionality is reduced to 128 by the PCA before the data are fed into the detection model.
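A minimal sketch of the claimed PCA reduction, implemented here via SVD of the centred data; the 256-dimensional random input is only a stand-in for the combined feature matrix:

```python
import numpy as np

# PCA as a linear transform to linearly uncorrelated components: centre the
# data, take the SVD, and project onto the top-k right singular vectors.

def pca_reduce(X, k=128):
    Xc = X - X.mean(axis=0)                 # centre each dimension
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                    # project onto top-k components

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 256))         # stand-in combined feature matrix
Z = pca_reduce(X, 128)
print(Z.shape)                              # (200, 128)
```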
8. The sound-feature-based COVID-19 auxiliary screening method as claimed in claim 1, wherein the SVM model finds the optimal separating hyperplane in the feature space that maximizes the margin between positive and negative samples in the training set, the hyperplane being given by:
w^T x + b = 0, where w = (w_1, w_2, ..., w_d) is the normal vector and b is the displacement term, determining the distance between the hyperplane and the origin;
assuming the hyperplane classifies every sample (x_i, y_i) correctly, then:
w^T x_i + b ≥ +1 for y_i = +1; w^T x_i + b ≤ -1 for y_i = -1;
the vectors closest to the hyperplane make the equalities hold; these are the support vectors, and the margin between two heterogeneous support vectors across the hyperplane is:
γ = 2 / ||w||;
the maximum-margin problem is:
max_{w,b} 2 / ||w||,
s.t. y_i (w^T x_i + b) ≥ 1, i = 1, 2, ..., m.
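The margin formula can be checked numerically on a toy hyperplane; the weights and points below are illustrative, not data from the patent:

```python
import numpy as np

# For the hyperplane w^T x + b = 0 with support vectors satisfying
# y_i (w^T x_i + b) = +/-1, the gap between the two classes is 2 / ||w||.

w = np.array([2.0, 0.0])                    # normal vector
b = -1.0                                    # displacement term
sv_pos = np.array([1.0, 0.0])               # support vector with y = +1
sv_neg = np.array([0.0, 0.0])               # support vector with y = -1

assert w @ sv_pos + b == 1.0                # lies on w^T x + b = +1
assert w @ sv_neg + b == -1.0               # lies on w^T x + b = -1

# Each support vector is 1/||w|| from the hyperplane, so the full margin is:
margin = 2.0 / np.linalg.norm(w)
print(margin)                               # 1.0 here, since ||w|| = 2
```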
CN202111650411.1A 2021-12-29 2021-12-29 Novel new crown auxiliary screening method based on sound characteristics Withdrawn CN114420303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111650411.1A CN114420303A (en) 2021-12-29 2021-12-29 Novel new crown auxiliary screening method based on sound characteristics


Publications (1)

Publication Number Publication Date
CN114420303A true CN114420303A (en) 2022-04-29

Family

ID=81269407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111650411.1A Withdrawn CN114420303A (en) 2021-12-29 2021-12-29 Novel new crown auxiliary screening method based on sound characteristics

Country Status (1)

Country Link
CN (1) CN114420303A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115497502A (en) * 2022-11-07 2022-12-20 图灵人工智能研究院(南京)有限公司 Method and system for distinguishing new crown infection based on human body representation


Similar Documents

Publication Publication Date Title
Mouawad et al. Robust detection of COVID-19 in cough sounds: using recurrence dynamics and variable Markov model
Deng et al. Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps
CN103280220B (en) A kind of real-time recognition method for baby cry
Chen et al. Voice disorder identification by using Hilbert-Huang transform (HHT) and K nearest neighbor (KNN)
Panek et al. Acoustic analysis assessment in speech pathology detection
CN105895078A (en) Speech recognition method used for dynamically selecting speech model and device
Liu et al. Infant cry signal detection, pattern extraction and recognition
WO2007102505A1 (en) Infant emotion judging method, and device and program therefor
CN111329494A (en) Depression detection method based on voice keyword retrieval and voice emotion recognition
You et al. Cough detection by ensembling multiple frequency subband features
CN115410711B (en) White feather broiler health monitoring method based on sound signal characteristics and random forest
Bhagatpatil et al. An automatic infant’s cry detection using linear frequency cepstrum coefficients (LFCC)
Hamidi et al. COVID-19 assessment using HMM cough recognition system
CN114373452A (en) Voice abnormity identification and evaluation method and system based on deep learning
CN114420303A (en) Novel new crown auxiliary screening method based on sound characteristics
JP2023018658A (en) Difficult airway evaluation method and device based on machine learning voice technology
Yamashita Acoustic HMMs to detect abnormal respiration with limited training data
Hu et al. Auditory receptive field net based automatic snore detection for wearable devices
CN111862991A (en) Method and system for identifying baby crying
Liu et al. Classifying respiratory sounds using electronic stethoscope
Yamashita Classification between normal and abnormal respiration using ergodic HMM for intermittent abnormal sounds
Patel et al. Multi Feature fusion for COPD Classification using Deep learning algorithms
Fathan et al. An Ensemble Approach for the Diagnosis of COVID-19 from Speech and Cough Sounds
CN115116600A (en) Automatic classification and recognition system for children cough
CN113160967A (en) Method and system for identifying attention deficit hyperactivity disorder subtype

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220429