CN113096691A - Detection method, device, equipment and computer storage medium - Google Patents


Info

Publication number
CN113096691A
CN113096691A (application CN202110303977.0A)
Authority
CN
China
Prior art keywords
neural network
network model
audio data
preset
cough
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110303977.0A
Other languages
Chinese (zh)
Inventor
高贵锋
于力伟
崔丹
王瑞强
何杨
饶青超
陈文福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ambulanc Shenzhen Tech Co Ltd
Original Assignee
Ambulanc Shenzhen Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ambulanc Shenzhen Tech Co Ltd filed Critical Ambulanc Shenzhen Tech Co Ltd
Priority to CN202110303977.0A
Publication of CN113096691A


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/66: Speech or voice analysis techniques specially adapted for extracting parameters related to health condition
    • G10L25/27: Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30: Speech or voice analysis techniques characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a detection method, apparatus, device, and computer storage medium. The detection method comprises the following steps: acquiring audio data of a person to be detected; inputting the audio data into a preset neural network model, where the preset neural network model is obtained according to preset cough audio data; and obtaining the detection result output by the preset neural network model. This solves the prior-art problem that real-time detection cannot be achieved because the detection process is too complex, and improves detection efficiency.

Description

Detection method, device, equipment and computer storage medium
Technical Field
The present invention relates to the field of medical technology, and in particular, to a detection method, apparatus, device, and computer storage medium.
Background
For example, detection of the novel coronavirus relies on laboratory nucleic acid testing for detection and screening, through which confirmed cases, suspected cases, and asymptomatic carriers are identified. Most existing disease detection methods use nasopharyngeal or anal swabs, but these procedures are cumbersome, the detection cycle is long, and real-time detection cannot be achieved when large numbers of subjects must be tested.
Disclosure of Invention
The invention mainly aims to provide a detection method, apparatus, device, and computer storage medium, to solve the problem that real-time detection cannot be achieved because the detection process is cumbersome.
In order to achieve the above object, the present invention provides a detection method, which is applied in a detection device; in one embodiment, the detection method comprises the following steps:
acquiring audio data of a person to be detected;
inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained according to preset cough audio data;
and obtaining a detection result output by the preset neural network model.
In an embodiment, the step of inputting the audio data into the preset neural network model is preceded by:
acquiring preset cough audio data, and building an initial preset neural network model;
generating a cough audio data set according to the preset cough audio data;
and constructing a preset neural network model according to the initial neural network model and the cough audio data set.
In an embodiment, the step of generating a cough audio data set according to the preset cough audio data comprises:
preprocessing the preset cough audio data to obtain a cough audio data set, wherein the preprocessing process sequentially comprises the following steps: pre-emphasis processing, framing processing, windowing processing, Fourier transform, Mel filtering, logarithmic energy calculation, and discrete cosine transform.
In an embodiment, the initial neural network model comprises a first neural network model and a second neural network model; the step of constructing a preset neural network model from the initial neural network model and the cough audio data set comprises:
obtaining audio features according to the cough audio data set and the first neural network model;
and training according to the audio features and the second neural network model to obtain a preset neural network model.
In one embodiment, the audio features include at least: vocal emotional characteristics, respiratory tract biological characteristics, and vocal cord biological characteristics.
In an embodiment, the step of constructing a preset neural network model according to the initial neural network model and the cough audio data set further includes:
dividing the cough audio data set into a cough audio training set and a cough audio testing set;
training the initial neural network model by adopting the cough audio training set, and testing the trained initial neural network model by adopting the cough audio testing set;
and when the accuracy of the test result reaches a preset threshold value, obtaining a preset neural network model.
In an embodiment, the step of obtaining the detection result output by the preset neural network model includes:
obtaining a detection result according to a cross-entropy loss function and the preset neural network model, wherein the cross-entropy loss function is:

L = -(1/N) Σ_{n=1}^{N} [ y_n log(ŷ_n) + (1 - y_n) log(1 - ŷ_n) ]

where N is the number of audio features, y_n is the true detection result, and ŷ_n is the predicted detection result.
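As an illustration, the loss above (taken here as the standard binary cross-entropy, an assumption since the original formula figure is not reproduced in the text) can be computed as follows:

```python
import numpy as np

def cross_entropy_loss(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy over N audio features:
    L = -(1/N) * sum_n [y_n*log(p_n) + (1 - y_n)*log(1 - p_n)]."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1.0 - y_true) * np.log(1.0 - y_pred))
```

Confident correct predictions give a loss near zero, while confident wrong ones are heavily penalized, which suits the two-class (healthy/unhealthy) output.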
In order to achieve the above object, the present invention also provides a detection apparatus, comprising:
the acquisition module is used for acquiring the audio data of a person to be detected;
the input module is used for inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained by training according to preset cough audio data;
and the output module is used for acquiring the detection result output by the preset neural network model.
To achieve the above object, the present invention further provides a detection apparatus, which includes a memory, a processor, and a detection program stored in the memory and executable on the processor, wherein the detection program, when executed by the processor, implements the steps of the detection method as described above.
To achieve the above object, the present invention also provides a computer storage medium storing a detection program, which when executed by a processor, implements the steps of the detection method as described above.
According to the detection method, apparatus, device, and computer storage medium, the audio data of the person to be detected is acquired and preprocessed, and the preprocessed audio data is input into the preset neural network model to obtain the detection result it outputs. By using the preset neural network model to process the audio data of the person to be detected, the prior-art problem that a cumbersome detection process prevents real-time detection is solved, and detection efficiency is improved.
Drawings
FIG. 1 is a schematic structural diagram of a detection apparatus according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the detection method of the present invention;
FIG. 3 is a schematic flow chart of a second embodiment of the detection method of the present invention;
FIG. 4 is a schematic flow chart of a third embodiment of the detection method of the present invention;
FIG. 5 is a schematic flow chart of a fourth embodiment of the detection method of the present invention;
FIG. 6 is a schematic flow chart of a fifth embodiment of the detection method of the present invention;
FIG. 7 is a schematic structural diagram of the detecting device of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
To solve the prior-art problem that a cumbersome detection process prevents real-time detection, this application adopts the technical scheme of acquiring the audio data of a person to be detected, inputting the audio data into a preset neural network model, where the preset neural network model is obtained according to preset cough audio data, and obtaining the detection result output by the preset neural network model. This achieves the purpose of quickly obtaining a detection result from the preset neural network model and improves detection efficiency.
For a better understanding of the above technical solutions, exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, fig. 1 is a schematic structural diagram of a hardware operating environment according to an embodiment of the present invention.
It should be noted that fig. 1 is an architectural diagram of a hardware operating environment of the detection device.
As shown in fig. 1, the detection apparatus may include: a processor 1001, such as a CPU, a memory 1005, a user interface 1003, a network interface 1004, and a communication bus 1002. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. The memory 1005 may optionally be a storage device separate from the processor 1001. The detection apparatus further comprises a microphone and an audio processor.
It will be understood by those skilled in the art that the detection device configuration shown in FIG. 1 does not constitute a limitation of the detection device, and that the detection device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a detection program. The operating system is a program that manages and controls the hardware and software resources of the detection apparatus and supports the operation of the detection program and other software or programs.
In the detection apparatus shown in fig. 1, the user interface 1003 is mainly used for connecting a terminal and performing data communication with the terminal; the network interface 1004 is mainly used for connecting to a background server and performing data communication with it; and the processor 1001 may be used to invoke the detection program stored in the memory 1005.
In this embodiment, the detection apparatus includes: a memory 1005, a processor 1001 and a detection program stored on the memory and executable on the processor, wherein:
in the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
acquiring audio data of a person to be detected;
inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained according to preset cough audio data;
and obtaining a detection result output by the preset neural network model.
In the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
acquiring preset cough audio data, and building an initial preset neural network model;
generating a cough audio data set according to the preset cough audio data;
and constructing a preset neural network model according to the initial neural network model and the cough audio data set.
In the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
preprocessing the preset cough audio data to obtain a cough audio data set, wherein the preprocessing process sequentially comprises the following steps: pre-emphasis processing, framing processing, windowing processing, Fourier transform, Mel filtering, logarithmic energy calculation, and discrete cosine transform.
In the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
obtaining audio features according to the cough audio data set and the first neural network model;
and training according to the audio features and the second neural network model to obtain a preset neural network model.
In the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
dividing the cough audio data set into a cough audio training set and a cough audio testing set;
training the initial neural network model by adopting the cough audio training set, and testing the trained initial neural network model by adopting the cough audio testing set;
and when the accuracy of the test result reaches a preset threshold value, obtaining a preset neural network model.
In the embodiment of the present application, the processor 1001 may be configured to call the detection program stored in the memory 1005 and perform the following operations:
obtaining a detection result according to a cross-entropy loss function and the preset neural network model, wherein the cross-entropy loss function is:

L = -(1/N) Σ_{n=1}^{N} [ y_n log(ŷ_n) + (1 - y_n) log(1 - ŷ_n) ]

where N is the number of audio features, y_n is the true detection result, and ŷ_n is the predicted detection result.
Since the detection device provided in the embodiment of the present application implements the method of the embodiment of the present application, a person skilled in the art can understand the specific structure and variations of the detection device based on the method described herein, so details are not repeated here. All detection devices used in the method of the embodiment of the present application fall within the protection scope of the present application. The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
For a software implementation, the techniques described in this disclosure may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described in this disclosure. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Based on the above structure, an embodiment of the present invention is proposed.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the detection method of the present invention, which includes the following steps:
step S110, audio data of the person to be detected is obtained.
And step S120, inputting the audio data into a preset neural network model.
And step S130, obtaining a detection result output by the preset neural network model.
In this embodiment, the person to be detected may belong to any category: healthy people, confirmed cases of the novel coronavirus, suspected cases, or asymptomatic carriers; the category is uncertain beforehand. In the application scenario of this application, a pickup (for example a microphone) in the detection device collects audio data from the person to be detected and generates an audio file, which is sent to an audio processor in the detection device for preprocessing, yielding cough audio data that can be input into the preset neural network model. The audio data is the cough audio of the person to be detected, but it may be mixed with other noises; therefore, to ensure the quality of the extracted cough audio data, the sound of the person to be detected should be collected in an enclosed, low-noise space, such as a soundproof room.
In this embodiment, to improve the accuracy of the detection result, the audio data of the person to be detected may be collected according to the actual situation. In the application scenario of this application, the duration of each recording, the number of recordings, and the collection interval may all be configured; for example, audio may be collected 3 times, with each recording generating its own audio file and lasting 6 seconds, and collection repeated every 7 days. The recordings are compared, and the cough audio data with the highest accuracy is selected as the input to the preset neural network model; when the collected audio does not meet the requirements, an error prompt reminds the user to collect the cough audio data again.
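The collection policy just described can be sketched as follows; the selection criterion used here (a crude signal-to-noise proxy) is an assumption, since the text only states that the recording with the highest accuracy is kept:

```python
import numpy as np

# Illustrative collection policy from the text: 3 recordings of 6 s each,
# repeated every 7 days. These constants mirror the example in the text.
RECORDINGS_PER_SESSION = 3
SECONDS_PER_RECORDING = 6
DAYS_BETWEEN_SESSIONS = 7

def pick_best_recording(recordings):
    """Keep the recording with the best quality among repeated takes.
    The RMS-level-over-noise-floor criterion is a hypothetical stand-in
    for the patent's unspecified 'highest accuracy' selection."""
    def snr_proxy(x):
        x = np.asarray(x, dtype=float)
        noise = np.percentile(np.abs(x), 10) + 1e-12  # crude noise-floor estimate
        return np.sqrt(np.mean(x ** 2)) / noise
    return max(recordings, key=snr_proxy)
```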
In this embodiment, the audio data of the person to be detected is input into a preset neural network model, which is obtained according to preset cough audio data and can determine the detection result of the person to be detected. When the audio data of the person to be detected is collected, it is input into the preset neural network model, and the model's output is the detection result of the person to be detected; the result takes only two values: healthy or unhealthy.
In the technical scheme of this embodiment, the audio data of the person to be detected is acquired and preprocessed to obtain cough audio data, which is input into the preset neural network model; the model outputs the detection result of the person to be detected. This solves the prior-art problems of a cumbersome and time-consuming detection process, shortening the detection time and improving detection efficiency.
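The three steps of this embodiment can be sketched as a single pipeline; `record_audio`, `preprocess`, and `model` are hypothetical stand-ins for the pickup, the audio processor, and the trained preset neural network model:

```python
def detect(record_audio, preprocess, model, threshold=0.5):
    """Minimal sketch of the claimed method. The callables and the 0.5
    decision threshold are illustrative assumptions, not values from the
    patent."""
    audio = record_audio()        # step S110: acquire audio data
    features = preprocess(audio)  # audio processor: extract cough features
    score = model(features)       # step S120: run the preset model
    # Step S130: the output takes only two values, healthy or unhealthy.
    return "healthy" if score < threshold else "unhealthy"
```

For example, `detect(lambda: [0.0] * 4, lambda a: a, lambda f: 0.9)` classifies the (dummy) input as unhealthy because the model score exceeds the threshold.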
Referring to fig. 3, fig. 3 is a schematic flow chart of a second embodiment of the detection method of the present invention. Steps S220-S240 of the second embodiment precede step S120 of the first embodiment, and the second embodiment includes the following steps:
step S210, obtaining audio data of the person to be detected.
And step S220, acquiring preset cough audio data and building an initial preset neural network model.
And step S230, generating a cough audio data set according to the preset cough audio data.
Step S240, constructing a preset neural network model according to the initial neural network model and the cough audio data set.
In this embodiment, before the audio data of the person to be detected is input into the preset neural network model, the preset neural network model must be constructed from the preset cough audio data and the initial neural network model; the initial neural network model uses parallel ResNet50 networks. The preset cough audio data comes from a number of testers who are classified in advance into categories: healthy people, confirmed cases of the novel coronavirus, suspected cases, or asymptomatic carriers. After the testers' categories are determined, preset cough audio data is collected separately for each category, integrated into category-specific cough audio data sets, and input into the initial neural network model for training, yielding the preset neural network model.
In this embodiment, different testers have different audio features, so their cough audio data sets also differ. The cough audio data set is obtained by preprocessing the preset cough audio data, which sequentially comprises: pre-emphasis, framing, windowing, Fourier transform, Mel filtering, logarithmic energy calculation, and discrete cosine transform. Specifically, pre-emphasis passes the preset cough audio data through a high-pass filter to boost its high-frequency part, flattening the spectrum so that the same signal-to-noise ratio can be used over the whole band from low to high frequency. Framing groups N samples of the preset cough audio data into a unit called a frame; to avoid excessive change between two adjacent frames, an overlap region containing M samples is kept between them. Windowing multiplies each frame by a Hamming window to increase the continuity between the left and right ends of the frame. The Fourier transform converts the time-domain signal into an energy distribution in the frequency domain, since characteristics that are hard to observe in the time domain become visible there and different energy distributions represent different audio characteristics; after the Hamming window is applied, a fast Fourier transform is performed on each windowed frame to obtain its spectrum, and the power spectrum of the preset cough audio data is obtained by taking the squared modulus of the spectrum. Mel filtering passes the energy spectrum through a set of triangular filter banks on the Mel scale to smooth the spectrum, so that the Mel-frequency cepstral coefficients (MFCCs) are not affected by differences in pitch of the input audio. The logarithmic energy calculation computes the log energy output by each filter bank, which corresponds to the volume of a frame. The discrete cosine transform derives the MFCC coefficients from the log energies output by the filter banks; the MFCC coefficients reflect only static audio features of the preset cough audio data, while dynamic audio features can be obtained from the differential spectrum of the static features. Through this series of operations, the audio signal is converted into a digital feature representation.
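The preprocessing chain described above is essentially MFCC extraction. A compact sketch follows; the sample rate, frame length, hop, filter count, coefficient count, and pre-emphasis coefficient are conventional defaults (16 kHz, 25 ms frames, 10 ms hop, 26 Mel filters, 13 coefficients, 0.97), assumed here since the patent gives no concrete values:

```python
import numpy as np

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_mels=26, n_ceps=13,
         alpha=0.97):
    """Sketch of the preprocessing chain: pre-emphasis, framing, windowing,
    FFT, Mel filtering, log energy, and DCT."""
    # 1. Pre-emphasis: first-order high-pass, y[n] = x[n] - alpha*x[n-1].
    x = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2. Framing with an overlap region between adjacent frames.
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    # 3. Windowing: multiply each frame by a Hamming window.
    frames = frames * np.hamming(frame_len)
    # 4. FFT, then power spectrum (squared modulus of the spectrum).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / frame_len
    # 5. Mel filter bank: triangular filters spaced on the Mel scale.
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0.0), mel(sr / 2.0), n_mels + 2))
    bins = np.floor((frame_len + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_mels, power.shape[1]))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # 6. Log energy of each filter-bank output.
    log_e = np.log(power @ fbank.T + 1e-10)
    # 7. DCT of the log energies yields the static MFCC coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2.0 * n_mels)))
    return log_e @ dct.T
```

Dynamic features, as noted above, would then be obtained by differencing these static coefficients across frames.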
In this embodiment, to improve the detection accuracy of the preset neural network model, the duration and number of recordings collected from each tester may be set; for example, audio may be collected 5 times, with each recording generating its own audio file and lasting 10 seconds. The recordings of each tester are compared, and the cough audio data with the highest accuracy is selected as the preset cough audio data. The preset cough audio data of testers of different categories is then integrated into category-specific preset cough audio data sets, which are input into the initial preset neural network model for training to obtain the preset neural network model.
Step S250, inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained according to preset cough audio data.
And step S260, obtaining a detection result output by the preset neural network model.
In the technical scheme of this embodiment, before the audio data of the person to be detected is input into the preset neural network model, preset cough audio data is acquired and an initial preset neural network model is built. The preset cough audio data undergoes a series of preprocessing operations to produce a cough audio data set, which is input into the initial neural network model for training, thereby constructing the preset neural network model. The audio data of the person to be detected can then be input into the preset neural network model to quickly obtain a detection result, improving detection efficiency.
Referring to fig. 4, fig. 4 is a schematic flow chart of a third embodiment of the detection method of the present invention. Steps S241-S242 of the third embodiment are the detailed steps of step S240 in the second embodiment, and the third embodiment includes the following steps:
and step S241, obtaining audio characteristics according to the cough audio data set and the first neural network model.
And step S242, training according to the audio features and the second neural network model to obtain a preset neural network model.
In this embodiment, the initial neural network model comprises a first neural network model and a second neural network model, both based on the ResNet50 architecture; the first neural network model uses several parallel ResNet50 networks. The cough audio data set is input into the first neural network model for training to extract the audio features, which at least include vocal emotional characteristics, respiratory tract biological characteristics, and vocal cord biological characteristics, and may include other audio features as well. The audio features are then input into the second neural network model for training to obtain the preset neural network model, which determines the detection result by detecting whether these three kinds of audio features are present in the cough audio data set. During training, the second neural network model aggregates its outputs through a global average pooling layer, followed by an activation function layer and a sigmoid layer that produce the output detection result.
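The output head described above (global average pooling, then a dense layer and a sigmoid) can be sketched as follows. The ResNet50 backbone is represented only by the feature maps it would emit, and the weight names `w` and `b` are placeholders for trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def detection_head(feature_maps, w, b):
    """Sketch of the model's output head. `feature_maps` has shape
    (channels, height, width), as a convolutional backbone would produce;
    the result is a probability for the unhealthy class."""
    pooled = feature_maps.mean(axis=(1, 2))  # global average pooling
    logit = pooled @ w + b                   # dense layer on pooled features
    return sigmoid(logit)                    # probability in (0, 1)
```

With zero weights, `detection_head(np.ones((2048, 7, 7)), np.zeros(2048), 0.0)` returns 0.5, the sigmoid's midpoint.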
In the technical scheme of this embodiment, the cough audio data set is input into the first neural network model to obtain the audio features, and the audio features are input into the second neural network model for training to obtain the preset neural network model, thereby completing its construction.
Referring to fig. 5, fig. 5 is a schematic flow chart of a fourth embodiment of the detection method of the present invention. Steps S341-S343 of the fourth embodiment are the detailed steps of step S240 in the second embodiment, and the fourth embodiment includes the following steps:
in step S341, the cough audio data set is divided into a cough audio training set and a cough audio testing set.
Step S342, training the initial neural network model by using the cough audio training set, and testing the trained initial neural network model by using the cough audio testing set.
Step S343, when the accuracy of the test result reaches a preset threshold, a preset neural network model is obtained.
In this embodiment, since the parameters of the preset neural network model are initially unknown, the initial neural network model must be trained on a known cough audio data set in order to obtain optimal parameters. The cough audio data set is divided into a cough audio training set and a cough audio test set; it may also be divided into a cough audio training set, a cough audio test set and a cough audio verification set, with the data proportions of the three parts chosen according to the actual training process, for example 8:1:1. Usually, the cough audio training set is used to train the initial neural network model, i.e. to minimize the cost function, and the test set is then substituted into the cost function to evaluate the effect of the trained initial neural network model.
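The 8:1:1 division mentioned above can be sketched as follows (a minimal Python sketch; the function name, seed and ratio default are illustrative, not from the patent):

```python
import random

def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle samples and split them into training/test/verification subsets.

    ratios follows the 8:1:1 example from the text; all names are illustrative.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    total = sum(ratios)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_test = n * ratios[1] // total
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val

train, test, val = split_dataset(list(range(1000)))
print(len(train), len(test), len(val))  # 800 100 100
```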
In this embodiment, the cough audio training set is used to estimate the model, the cough audio verification set is used to determine the network structure or the parameters controlling the complexity of the model, and the cough audio test set is used to assess the finally selected optimal model. When the accuracy of the test result reaches a preset threshold, a mature preset neural network model is obtained. All three parts are randomly extracted from the preset cough audio data. When the total number of samples is small, usually only a small part is held out as the cough audio test set, and a K-fold cross-validation method is applied to the remaining N samples: the remaining data is divided evenly into K parts, K-1 parts are selected in turn for training while the remaining part is used for validation, the sum of squared prediction errors is computed for each fold, and finally the K sums of squared prediction errors are averaged and used as the basis for selecting the optimal model structure.
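The K-fold procedure described above can be sketched as follows (a hypothetical helper, not the patent's implementation; `fit` and `predict` are caller-supplied stand-ins, and the toy mean predictor below is purely illustrative):

```python
def k_fold_indices(n_samples, k):
    """Partition sample indices into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(samples, targets, k, fit, predict):
    """Average the per-fold sum of squared prediction errors, as described above."""
    folds = k_fold_indices(len(samples), k)
    errors = []
    for i, val_idx in enumerate(folds):
        # Train on the other k-1 folds, validate on fold i.
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        model = fit([samples[j] for j in train_idx], [targets[j] for j in train_idx])
        sse = sum((predict(model, samples[j]) - targets[j]) ** 2 for j in val_idx)
        errors.append(sse)
    return sum(errors) / k

# Toy usage: the "model" is just the mean of the training targets.
xs = list(range(10))
ys = [float(x % 2) for x in xs]
fit = lambda X, Y: sum(Y) / len(Y)
predict = lambda m, x: m
score = cross_validate(xs, ys, 5, fit, predict)
print(score)  # 0.5
```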
In the technical scheme of this embodiment, the cough audio data set is divided into a cough audio training set and a cough audio testing set, the cough audio training set is adopted to train the initial neural network model, the trained initial neural network model is tested by the cough audio testing set, and when the accuracy of a test result reaches a preset threshold value, a preset neural network model is obtained, so that the precision of the preset neural network model is continuously improved.
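The train-until-threshold loop of steps S341 to S343 can be sketched as follows (an illustrative sketch; the threshold value, epoch cap and toy training step are assumptions, not values from the patent):

```python
def train_until_threshold(model, train_step, evaluate, threshold=0.85, max_epochs=100):
    """Keep training until test-set accuracy reaches the preset threshold.

    train_step and evaluate are caller-supplied stand-ins for one epoch of
    training and for testing on the cough audio testing set.
    """
    acc = evaluate(model)
    for epoch in range(1, max_epochs + 1):
        model = train_step(model)
        acc = evaluate(model)
        if acc >= threshold:
            return model, epoch, acc
    return model, max_epochs, acc

# Toy stand-in: each "epoch" improves accuracy by 0.1.
model, epoch, acc = train_until_threshold(0.0, lambda m: m + 0.1, lambda m: m)
print(epoch)  # reaches the 0.85 threshold after 9 epochs
```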
Referring to fig. 6, fig. 6 is a schematic flow chart of a fifth embodiment of the detection method of the present invention, and step S131 in the fifth embodiment is a refinement step of step S130 in the first embodiment, and the fifth embodiment includes the following steps:
step S110, audio data of the person to be detected is obtained.
Step S120, inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained according to preset cough audio data.
Step S131, obtaining a detection result according to a cross entropy loss function and the preset neural network model.
In this embodiment, because of the particularity of the detection type, the detection result only needs to indicate healthy or unhealthy, and no other determination is needed; the process is therefore a binary classification problem, and the present application adopts a cross entropy loss function for determining the detection result. The cross entropy loss function is:
L = -(1/N) Σ_{n=1}^{N} [ y_n·log(ŷ_n) + (1 − y_n)·log(1 − ŷ_n) ]
where N is the number of audio features, y_n is the true detection result, and ŷ_n is the predicted detection result. The true detection result takes the value 0 or 1, where 1 represents healthy and 0 represents unhealthy; the loss is large when the predicted detection result differs from the true detection result and approaches 0 when they agree. By applying the cross entropy loss function in the preset neural network model, the detection result can be obtained.
In the technical scheme of the embodiment, the audio data of the person to be detected is acquired and input into the preset neural network model, and because the detection result only has two conditions, the embodiment adopts the cross entropy loss function to judge in the preset neural network model and obtains the detection result, thereby solving the problem of complex detection process in the prior art and improving the detection efficiency.
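As a concrete illustration of the binary cross entropy loss used above, a minimal numpy sketch (the clipping epsilon and function name are illustrative additions):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross entropy averaged over samples.

    y_true: array of 0/1 labels (1 = healthy, 0 = unhealthy, per the text).
    y_pred: predicted probabilities in (0, 1); clipped to avoid log(0).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

# The loss is near 0 when predictions match labels, large when they disagree.
print(round(binary_cross_entropy([1, 0], [0.99, 0.01]), 4))  # 0.0101
print(round(binary_cross_entropy([1, 0], [0.01, 0.99]), 4))  # 4.6052
```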
Based on the same inventive concept, the present invention further provides a detection apparatus. As shown in fig. 7, fig. 7 is a schematic structural diagram of the detection apparatus of the present invention. The detection apparatus includes an acquisition module 10, an input module 20 and an output module 30, each of which is described below:
the acquisition module 10 is used for acquiring the audio data of the person to be detected.
Before the input module 20 operates, the following steps are also performed: acquiring preset cough audio data and building an initial neural network model; generating a cough audio data set according to the preset cough audio data; and constructing a preset neural network model according to the initial neural network model and the cough audio data set. Specifically, generating the cough audio data set according to the preset cough audio data includes: preprocessing the preset cough audio data to generate audio features, and obtaining the cough audio data set according to the audio features. The preprocessing comprises, in sequence: pre-emphasis processing, framing processing, windowing processing, Fourier transform, Mel filtering, logarithmic energy computation, and discrete cosine transform. The audio features include at least: vocal emotional characteristics, respiratory tract biological characteristics, and vocal cord biological characteristics. Constructing the preset neural network model from the initial neural network model and the cough audio data set comprises: obtaining audio features according to the cough audio data set and the first neural network model, and training according to the audio features and the second neural network model to obtain the preset neural network model.
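The preprocessing chain listed above (pre-emphasis, framing, windowing, Fourier transform, then Mel filtering, logarithmic energy computation and discrete cosine transform) corresponds to standard MFCC extraction. Its first four stages can be sketched in numpy as follows, with the frame length, hop and pre-emphasis coefficient being common defaults rather than values from the patent:

```python
import numpy as np

def preprocess(signal, sample_rate=16000, alpha=0.97, frame_ms=25, hop_ms=10):
    """First stages of the preprocessing chain described above:
    pre-emphasis -> framing -> windowing -> Fourier transform (power spectrum).
    Assumes len(signal) >= one frame; parameter values are common defaults.
    """
    # Pre-emphasis: boost high frequencies, y[t] = x[t] - alpha * x[t-1].
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])

    # Framing: split into overlapping frames.
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])

    # Windowing: apply a Hamming window to each frame.
    frames = frames * np.hamming(frame_len)

    # Fourier transform: magnitude-squared spectrum of each frame.
    n_fft = 512
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    return power  # shape: (n_frames, n_fft // 2 + 1)

# One second of a 440 Hz tone at 16 kHz.
spec = preprocess(np.sin(2 * np.pi * 440 * np.arange(16000) / 16000))
print(spec.shape)  # (98, 257)
```

The remaining stages would apply a Mel filter bank to this power spectrum, take the logarithm of the filter energies, and apply a discrete cosine transform to obtain the final cepstral features.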
An input module 20, configured to input the audio data into a preset neural network model, where the preset neural network model is obtained according to preset cough audio data.
An output module 30, configured to obtain the detection result output by the preset neural network model; specifically, the detection result is obtained according to a cross entropy loss function and the preset neural network model, where the cross entropy loss function is:
L = -(1/N) Σ_{n=1}^{N} [ y_n·log(ŷ_n) + (1 − y_n)·log(1 − ŷ_n) ]
where N is the number of audio features, y_n is the true detection result, and ŷ_n is the predicted detection result.
In this technical scheme, the detection apparatus implementing the above detection method solves the problem in the prior art that the detection process is complicated and cannot be performed in real time, and improves the detection efficiency.
Based on the same inventive concept, the embodiment of the present application further provides a computer storage medium, where the computer storage medium stores a detection program, and the detection program, when executed by a processor, implements the steps of the detection method described above, and can achieve the same technical effects, and is not described herein again to avoid repetition.
Since the computer storage medium provided in the embodiments of the present application is a computer storage medium used for implementing the method in the embodiments of the present application, based on the method described in the embodiments of the present application, a person skilled in the art can understand a specific structure and a modification of the computer storage medium, and thus details are not described here. Computer storage media used in the methods of embodiments of the present application are all intended to be protected by the present application.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable computer storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, a special purpose computer, an embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second, third, etcetera does not indicate any ordering; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A detection method is characterized in that the detection method is applied to a detection device; the method comprises the following steps:
acquiring audio data of a person to be detected;
inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained according to preset cough audio data;
and obtaining a detection result output by the preset neural network model.
2. The detection method of claim 1, wherein the step of inputting the audio data into a predetermined neural network model is preceded by the steps of:
acquiring preset cough audio data, and building an initial neural network model;
generating a cough audio data set according to the preset cough audio data;
and constructing a preset neural network model according to the initial neural network model and the cough audio data set.
3. The detection method as claimed in claim 2, wherein the step of generating a cough audio data set from the preset cough audio data comprises:
preprocessing the preset cough audio data to obtain a cough audio data set, wherein the preprocessing process sequentially comprises the following steps: pre-emphasis processing, framing processing, windowing processing, fourier transform, mel filtering, logarithmic energy computation, and discrete cosine transform.
4. The detection method of claim 2, wherein the initial neural network model comprises a first neural network model and a second neural network model; the step of constructing a preset neural network model from the initial neural network model and the cough audio data set comprises:
obtaining audio features according to the cough audio data set and the first neural network model;
and training according to the audio features and the second neural network model to obtain a preset neural network model.
5. The detection method according to claim 4, wherein the audio features comprise at least: vocal emotional characteristics, respiratory tract biological characteristics, and vocal cord biological characteristics.
6. The detection method of claim 2, wherein the step of constructing a preset neural network model from the initial neural network model and the cough audio data set further comprises:
dividing the cough audio data set into a cough audio training set and a cough audio testing set;
training the initial neural network model by adopting the cough audio training set, and testing the trained initial neural network model by adopting the cough audio testing set;
and when the accuracy of the test result reaches a preset threshold value, obtaining a preset neural network model.
7. The detection method of claim 1, wherein the step of obtaining the detection result output by the preset neural network model comprises:
obtaining a detection result according to a cross entropy loss function and the preset neural network model, wherein the cross entropy loss function is as follows:
L = -(1/N) Σ_{n=1}^{N} [ y_n·log(ŷ_n) + (1 − y_n)·log(1 − ŷ_n) ]
where N is the number of audio features, y_n is the true detection result, and ŷ_n is the predicted detection result.
8. A detection device, the device comprising:
the acquisition module is used for acquiring the audio data of a person to be detected;
the input module is used for inputting the audio data into a preset neural network model, wherein the preset neural network model is obtained by training according to preset cough audio data;
and the output module is used for acquiring the detection result output by the preset neural network model.
9. A detection device, characterized in that the detection device comprises a memory, a processor and a detection program stored in the memory and executable on the processor, the detection program, when executed by the processor, implementing the steps of the detection method according to any one of claims 1-7.
10. A computer storage medium, characterized in that the computer storage medium stores a detection program which, when executed by a processor, implements the steps of the detection method according to any one of claims 1 to 7.
CN202110303977.0A 2021-03-22 2021-03-22 Detection method, device, equipment and computer storage medium Pending CN113096691A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110303977.0A CN113096691A (en) 2021-03-22 2021-03-22 Detection method, device, equipment and computer storage medium


Publications (1)

Publication Number Publication Date
CN113096691A true CN113096691A (en) 2021-07-09

Family

ID=76669022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110303977.0A Pending CN113096691A (en) 2021-03-22 2021-03-22 Detection method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113096691A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113409825A (en) * 2021-08-19 2021-09-17 南京裕隆生物医学发展有限公司 Intelligent health detection method and device, electronic equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109431507A (en) * 2018-10-26 2019-03-08 平安科技(深圳)有限公司 Cough disease identification method and device based on deep learning
CN109493874A (en) * 2018-11-23 2019-03-19 东北农业大学 A kind of live pig cough sound recognition methods based on convolutional neural networks
CN109602421A (en) * 2019-01-04 2019-04-12 平安科技(深圳)有限公司 Health monitor method, device and computer readable storage medium
CN112472065A (en) * 2020-11-18 2021-03-12 天机医用机器人技术(清远)有限公司 Disease detection method based on cough sound recognition and related equipment thereof




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210709