CN112185364A - Method and device for detecting baby crying - Google Patents
Method and device for detecting baby crying
- Publication number
- CN112185364A
- Authority
- CN
- China
- Prior art keywords
- infant crying
- infant
- crying sound
- sound
- confidence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B21/00—Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
- G08B21/02—Alarms for ensuring the safety of persons
- G08B21/0202—Child monitoring systems using a transmitter-receiver system carried by the parent and the child
- G08B21/028—Communication between parent and child units via remote transmission means, e.g. satellite network
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Child & Adolescent Psychology (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Business, Economics & Management (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Emergency Management (AREA)
- Emergency Alarm Devices (AREA)
Abstract
The invention discloses a method for detecting infant crying, comprising: in response to a received audio signal, outputting a confidence level for the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; determining whether the confidence level of the infant crying sound is less than a preset confidence threshold; and, if the confidence level is not less than the preset confidence threshold, outputting an infant crying detection success signal. Enhancing the infant crying sound with a microphone array addresses far-field recognition of infant crying, and training the classifier on a deep learning model with multiple types of infant crying sounds as the target sounds improves its recognition performance, so that the confidence level it outputs is more accurate.
Description
Technical Field
The invention belongs to the technical field of voice recognition, and particularly relates to a method and a device for detecting baby crying.
Background
Crying is an instinctive response of infants, especially those younger than two years of age. Because they cannot yet express themselves in speech, crying is their main way of expressing feelings and responding to external stimuli, so when an infant cries, a caretaker needs to attend to it promptly. In practice, however, a caretaker cannot be present at all times. In particular, while the infant is asleep, the caretaker is often busy with other tasks such as housework or watching television. If the infant cries at such a moment, the caretaker, especially an elderly one, may not hear it and respond in time, so the infant may be accidentally injured, causing distress to the whole family.
At present, some techniques for detecting infant crying are available. Their main principle is to exploit the higher volume and higher pitch of an infant's cry: they collect statistics on the features of ambient audio over a period of time and use them to judge whether an infant is crying.
In the process of implementing the present application, the inventors found that the prior art has at least the following problems: when the infant's cry is quiet or the infant is somewhat far away, the recognition rate drops severely; and normal speech also contains sounds that resemble an infant's cry, which makes misrecognition more serious.
Disclosure of Invention
Embodiments of the invention provide a method and a device for detecting infant crying, which are used to solve at least one of the above technical problems.
In a first aspect, an embodiment of the present invention provides a method for detecting infant crying, including: in response to a received audio signal, outputting a confidence level for the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; determining whether the confidence level of the infant crying sound is less than a preset confidence threshold; and, if the confidence level is not less than the preset confidence threshold, outputting an infant crying detection success signal.
In a second aspect, an embodiment of the present invention provides an apparatus for detecting infant crying, including: a first output module configured to output, in response to a received audio signal, a confidence level for the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; a determining module configured to determine whether the confidence level of the infant crying sound is less than a preset confidence threshold; and a second output module configured to output an infant crying detection success signal if the confidence level is not less than the preset confidence threshold.
In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method for detecting baby crying according to any of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention also provide a computer program product including a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to perform the steps of the infant crying detection method of any of the embodiments of the present invention.
According to the scheme provided by the method and the device, enhancing the infant crying sound with a microphone array addresses far-field recognition of infant crying, and the deep learning model can be trained on a large volume of infant crying sounds and of sounds that resemble infant crying, which further improves the recognition performance of the infant crying model.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for detecting crying of an infant according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a method for detecting crying of an infant according to an embodiment of the present invention;
Fig. 3 is a block diagram of an apparatus for detecting baby crying according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", etc. in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but only insofar as a person skilled in the art can realize the combination; when a combination of technical solutions is contradictory or cannot be realized, it should be considered not to exist and is not within the protection scope of the present invention.
Referring to Fig. 1, which shows a flowchart of an embodiment of the infant crying detection method of the present application: the method of this embodiment may be applied to terminals with a language model or a real-time voice conversation function, such as smart voice televisions, smart speakers, and other existing smart terminals that support intelligent voice recognition.
As shown in Fig. 1, in step 101, in response to a received audio signal, a confidence level for the infant crying sound is output via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound;
in step 102, it is determined whether the confidence level of the infant crying sound is less than a preset confidence threshold;
in step 103, if the confidence level of the infant crying sound is not less than the preset confidence threshold, a signal indicating successful infant crying detection is output.
In this embodiment, for step 101, the infant crying detection apparatus, in response to the received audio signal, outputs the confidence level of the infant crying sound via the infant crying sound classifier, which is obtained by training a deep learning model on at least one infant crying sound. Then, for step 102, the apparatus determines whether the confidence level is less than a preset confidence threshold. The confidence level may be a value normalized to the range 0 to 1; the higher the value, the more likely the audio is an infant crying sound. Finally, for step 103, if the confidence level is not less than the preset confidence threshold, the apparatus outputs a signal indicating successful infant crying detection.
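The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `detect_cry` and `DEFAULT_THRESHOLD` and the value 0.8 are hypothetical, and the classifier is passed in as an arbitrary callable because the patent does not specify the model's interface.

```python
from typing import Callable, Sequence

# Hypothetical value: the patent only says the threshold is "preset".
DEFAULT_THRESHOLD = 0.8


def detect_cry(
    audio: Sequence[float],
    classifier: Callable[[Sequence[float]], float],
    threshold: float = DEFAULT_THRESHOLD,
) -> bool:
    """Return True when a detection success signal should be output."""
    # Step 101: the classifier maps the audio signal to a confidence level.
    confidence = classifier(audio)
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be normalized to the range 0 to 1")
    # Step 102: determine whether the confidence is less than the threshold.
    below_threshold = confidence < threshold
    # Step 103: success is signalled when the confidence is NOT less than it.
    return not below_threshold
```

Note that a confidence exactly equal to the threshold counts as a detection, which follows from the "not less than" wording of step 103.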
According to the scheme of this embodiment, the infant crying sound classifier is trained on a deep learning model with multiple types of infant crying sounds as the target sounds, which improves the classifier's recognition performance and makes the confidence level it outputs more accurate.
Further, the infant crying sound classifier is trained, based on the deep learning model, using at least one sound similar to infant crying as a counterexample. Sounds similar to infant crying can thus be analyzed, and a large amount of similar audio can be collected and added to model training as counterexamples, which reduces the model's misrecognition of sounds that resemble infant crying.
Specifically, a counterexample may be an animal call, such as a cat's cry, or a background music melody whose frequency spectrum is similar to that of infant crying.
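One way to picture this training setup is as a binary labeled dataset in which infant cries are positives and similar-sounding clips are negatives. The sketch below assumes that framing; `LabeledClip`, `build_training_set`, and the file names are hypothetical, and the patent does not describe its data format or loss function.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LabeledClip:
    path: str   # location of the audio clip (hypothetical file names)
    label: int  # 1 = infant crying (target sound), 0 = counterexample


def build_training_set(
    cry_paths: Iterable[str], similar_paths: Iterable[str]
) -> List[LabeledClip]:
    """Label infant cries as positives and similar sounds as counterexamples."""
    positives = [LabeledClip(path, 1) for path in cry_paths]
    negatives = [LabeledClip(path, 0) for path in similar_paths]
    return positives + negatives


dataset = build_training_set(
    ["cry_hungry.wav", "cry_tired.wav"],
    ["cat_meow.wav", "background_melody.wav"],
)
```

Labeling the similar sounds explicitly as negatives is what lets the model learn the boundary between them and genuine cries, rather than only learning what a cry sounds like.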
In a preferred embodiment, the infant crying detection apparatus may perform signal enhancement on the audio signal based on the microphone array in response to the audio signal acquired in real time.
The microphone array in this embodiment may be an array formed by multiple microphones. Compared with a single microphone, a microphone array can capture the spatial information of sound and perform spatial filtering, which suppresses directional noise and signals from undesired directions while retaining the target sound, thereby enhancing the signal.
The microphone array may be a dual-microphone array, a linear four-microphone array, a circular four-microphone array, or a circular six-microphone array.
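The patent does not name the enhancement algorithm. As an illustration of spatial filtering in general, the sketch below uses delay-and-sum beamforming, a standard technique swapped in here as an assumption: each channel is shifted by the delay with which the target sound reaches that microphone, and the aligned channels are averaged. The function name and the use of integer sample delays are simplifications for this example.

```python
from typing import List, Sequence


def delay_and_sum(
    channels: Sequence[Sequence[float]], delays: Sequence[int]
) -> List[float]:
    """Align each channel by its integer sample delay, then average them.

    delays[k] is how many samples later the target sound arrives at
    microphone k than at the earliest microphone (assumed to be known).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Number of samples available in every channel after shifting.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        return []
    count = len(channels)
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / count
        for i in range(usable)
    ]
```

Sound from the target direction adds coherently after alignment, while sound from other directions is misaligned and partly cancels in the average, which is the suppression effect the paragraph above describes.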
Specifically, in response to the received infant crying detection success signal, the system sends detection success information to the user. Sending the detection success information comprises sending a text message or making a voice announcement.
In one specific application scenario, a parent is cooking in the kitchen while a story machine keeps a child company in another room. When the child cries, the story machine receives the infant crying detection success signal and announces it by voice, alerting the parent in the kitchen that the child is crying.
In another specific application scenario, a parent has gone out for a short time and a child in a room wakes up and cries. After the smart speaker receives the infant crying detection success signal, it sends a message to the parent's mobile phone over the network to tell the parent to return home as soon as possible.
In one embodiment, if the confidence level of the baby crying sound is not less than the preset confidence threshold, the baby crying detection device stops collecting the audio signal. Therefore, the problem that the detection success information is repeatedly sent to the user due to the fact that the baby cry sound is repeatedly collected can be solved.
Further, when infant crying needs to be detected again, the user can wake up the infant crying detection device again.
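The stop-and-rewake behavior amounts to a small two-state machine. The following sketch shows one possible arrangement; the class `CryMonitor`, its method names, and the default threshold are hypothetical and not taken from the patent.

```python
from typing import Callable, Sequence


class CryMonitor:
    """Stops collecting audio after a detection until explicitly woken."""

    def __init__(
        self,
        classifier: Callable[[Sequence[float]], float],
        threshold: float = 0.8,  # hypothetical preset value
    ) -> None:
        self.classifier = classifier
        self.threshold = threshold
        self.listening = True

    def feed(self, audio: Sequence[float]) -> bool:
        """Return True exactly once per detection; ignore audio while stopped."""
        if not self.listening:
            return False
        if self.classifier(audio) >= self.threshold:
            # Stop collecting so the same crying episode is not reported again.
            self.listening = False
            return True
        return False

    def wake(self) -> None:
        """Resume collection when the user wants detection again."""
        self.listening = True
```

Because `feed` returns False while the monitor is stopped, a continuing cry produces a single notification rather than a stream of repeated ones.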
It should be noted that the above method steps are not intended to limit the execution order of the steps, and in fact, some steps may be executed simultaneously or in the reverse order of the steps, which is not limited herein.
The following description is provided to enable those skilled in the art to better understand the present disclosure by describing some of the problems encountered by the inventors in implementing the present disclosure and by describing one particular embodiment of the finally identified solution.
In the process of implementing the present application, the inventors found that the defects in the prior art are mainly caused by the following: when the infant's cry is quiet or the infant is somewhat far away, the recognition rate drops severely; and normal speech also contains sounds similar to infant crying, so robustness to speech interference is low and misrecognition is serious.
The inventors also found that solving far-field recognition and improving recognition accuracy requires not only a deep understanding of the characteristics of infant crying sounds, but also knowledge of microphone-array signal enhancement, deep learning, and related fields. Practitioners in the industry rarely have all of this knowledge at once.
The scheme of this application is designed and optimized mainly in the following respects, in order to improve recognition accuracy and to solve the problems that far-field recognition is not possible and that the recognition rate drops as distance increases:
(1) The infant crying sound is enhanced through the microphone array, which addresses far-field recognition of infant crying.
(2) The deep learning model is trained on a large volume of infant crying sounds and of sounds similar to infant crying, which further improves the recognition performance of the infant crying model.
Please refer to fig. 2, which shows a flow chart of the baby cry detection method of the present application.
As shown in Fig. 2, the first step: audio is collected by a multi-microphone array;
the second step: the audio collected by the multi-microphone array undergoes signal processing, so that the sound of infant crying is enhanced;
the third step: the enhanced audio is input into a deep-learning-based infant crying sound classifier model, and the model outputs a confidence level that the segment of sound is infant crying;
the fourth step: it is determined whether the confidence level output by the model is not less than a preset threshold, where the confidence level may be a value normalized to the range 0 to 1 and a higher value indicates a higher probability of infant crying. If the confidence level is greater than or equal to the threshold, infant crying has been successfully detected and the system sends a detection success signal; otherwise, the microphone array continues to collect audio.
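The four steps form a loop that can be sketched as follows. Everything here is an assumption made for illustration: `run_pipeline` is a hypothetical name, and `average_channels` and `peak_score` are trivial stand-ins for the microphone-array enhancement and the trained classifier, neither of which the patent specifies in detail.

```python
from typing import Callable, Iterable, List, Optional, Sequence

Frame = Sequence[Sequence[float]]  # one block of multi-channel audio


def average_channels(frame: Frame) -> List[float]:
    """Stand-in for step 2: average the channels sample by sample."""
    length = min(len(ch) for ch in frame)
    return [sum(ch[i] for ch in frame) / len(frame) for i in range(length)]


def peak_score(samples: Sequence[float]) -> float:
    """Stand-in for step 3: peak amplitude clipped to the range 0 to 1."""
    return min(1.0, max((abs(s) for s in samples), default=0.0))


def run_pipeline(
    frames: Iterable[Frame],
    enhance: Callable[[Frame], List[float]],
    classify: Callable[[Sequence[float]], float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the first frame detected as crying, else None."""
    for index, frame in enumerate(frames):   # step 1: collect audio
        enhanced = enhance(frame)            # step 2: signal enhancement
        confidence = classify(enhanced)      # step 3: confidence level
        if confidence >= threshold:          # step 4: threshold decision
            return index                     # detection success
        # Otherwise fall through and keep collecting audio.
    return None
```

Returning from the loop on the first detection mirrors the flow in Fig. 2, where collection continues only while the confidence stays below the threshold.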
The inventors have also adopted the following alternatives in the course of carrying out the present application and summarized the advantages and disadvantages of the alternatives.
Beta version: in the initial version, the deep learning model was trained on a large number of infant cries in various states. Sounds similar to infant crying were also detected as crying, so the misrecognition rate was high; the version was usable, but the final performance of the system was affected to some extent.
Referring to fig. 3, a block diagram of an apparatus for detecting baby crying according to an embodiment of the present invention is shown.
As shown in fig. 3, the apparatus 200 for detecting baby crying includes a first output module 210, a determining module 220 and a second output module 230.
The first output module 210 is configured to output, in response to a received audio signal, a confidence level for the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; the determining module 220 is configured to determine whether the confidence level of the infant crying sound is less than a preset confidence threshold; and the second output module 230 is configured to output an infant crying detection success signal if the confidence level is not less than the preset confidence threshold.
It should be understood that the modules depicted in fig. 3 correspond to various steps in the method described with reference to fig. 1. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 3, and are not described again here.
It is to be noted that the modules in the embodiments of the present disclosure are not intended to limit the aspects of the present disclosure, for example, the first output module may be described as a module that outputs the confidence of the infant crying sound via the infant crying sound classifier in response to the received audio signal. In addition, the related function module may also be implemented by a hardware processor, for example, the determining module may also be implemented by a processor, which is not described herein again.
In other embodiments, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions for performing the method for detecting baby crying in any of the above method embodiments;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
outputting a confidence level of the infant crying sound via an infant crying sound classifier in response to the received audio signal, wherein the infant crying sound classifier is obtained by training at least one infant crying sound based on a deep learning model;
judging whether the confidence coefficient of the infant crying sound is smaller than a preset confidence coefficient threshold value, wherein the confidence coefficient is a numerical value which is normalized to 0-1, and the higher the numerical value is, the higher the confidence coefficient is, the higher the probability of the infant crying sound is;
and if the confidence coefficient of the baby crying sound is not less than the preset confidence coefficient threshold value, outputting a successful baby crying detection signal.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of the infant crying detection apparatus, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory remotely located from the processor, which may be connected to the infant crying detection device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform any one of the above methods of detecting baby crying.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device includes one or more processors 310 and a memory 320; one processor 310 is taken as an example in Fig. 4. The device for the infant crying detection method may further include an input device 330 and an output device 340. The processor 310, the memory 320, the input device 330, and the output device 340 may be connected by a bus or by other means; a bus connection is taken as an example in Fig. 4. The memory 320 is a non-volatile computer-readable storage medium as described above. The processor 310 executes various functional applications and data processing of the server by running the non-volatile software programs, instructions, and modules stored in the memory 320, thereby implementing the infant crying detection method of the above method embodiment. The input device 330 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the infant crying detection apparatus. The output device 340 may include a display device such as a display screen.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
As an embodiment, the electronic device is applied to an infant crying detection apparatus, and is used for a client, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:
outputting a confidence level of the infant crying sound via an infant crying sound classifier in response to the received audio signal, wherein the infant crying sound classifier is obtained by training at least one infant crying sound based on a deep learning model;
judging whether the confidence coefficient of the infant crying sound is smaller than a preset confidence coefficient threshold value, wherein the confidence coefficient is a numerical value which is normalized to 0-1, and the higher the numerical value is, the higher the confidence coefficient is, the higher the probability of the infant crying sound is;
and if the confidence coefficient of the baby crying sound is not less than the preset confidence coefficient threshold value, outputting a successful baby crying detection signal.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices are characterized by mobile communication capabilities and are primarily aimed at providing voice and data communication. Such terminals include smart phones (e.g., the iPhone), multimedia phones, feature phones, and low-end phones.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players, handheld game consoles, e-book readers, smart toys, and portable in-vehicle navigation devices.
(4) Servers: their architecture is similar to that of a general-purpose computer, but because they must provide highly reliable services, they have higher requirements for processing capability, stability, reliability, security, scalability, manageability, and the like.
(5) Other electronic devices with data interaction functions.
The apparatus embodiments described above are merely illustrative. Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without inventive effort.
From the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the above technical solutions may be embodied in the form of a software product that is stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and that includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for detecting infant crying, comprising:
in response to a received audio signal, outputting a confidence of an infant crying sound via an infant crying sound classifier, wherein the infant crying sound classifier is obtained by training on at least one infant crying sound based on a deep learning model;
determining whether the confidence of the infant crying sound is less than a preset confidence threshold; and
if the confidence of the infant crying sound is not less than the preset confidence threshold, outputting an infant crying detection success signal.
2. The method of claim 1, wherein the infant crying sound classifier is trained, based on the deep learning model, using at least one sound similar to an infant crying sound as a negative example.
3. The method of claim 1, wherein after outputting the infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold, the method further comprises:
in response to the received infant crying detection success signal, sending, by a system, detection success information to a user.
4. The method of claim 1, wherein after determining whether the confidence of the infant crying sound is less than the preset confidence threshold, the method further comprises:
if the confidence of the infant crying sound is not less than the preset confidence threshold, stopping collection of the audio signal.
5. The method of any one of claims 1-4, wherein before outputting, via the infant crying sound classifier, the confidence of the infant crying sound in response to the received audio signal, the method further comprises:
in response to an audio signal acquired in real time, performing signal enhancement on the audio signal based on a microphone array.
6. The method of claim 5, wherein the microphone array comprises: a dual-microphone array, a linear four-microphone array, an annular four-microphone array, and an annular six-microphone array.
7. An infant crying detection device, comprising:
a first output module configured to output, in response to a received audio signal, a confidence of an infant crying sound via an infant crying sound classifier, wherein the infant crying sound classifier is obtained by training on at least one infant crying sound based on a deep learning model;
a judging module configured to determine whether the confidence of the infant crying sound is less than a preset confidence threshold; and
a second output module configured to output an infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold.
8. The device of claim 7, wherein the infant crying sound classifier is trained, based on the deep learning model, using at least one sound similar to an infant crying sound as a negative example.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 6.
10. A storage medium having a computer program stored thereon, wherein the program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
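As a non-limiting illustration, the decision flow recited in claims 1, 3 and 4 could be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the names `detect_infant_crying`, `DetectionResult`, `toy_classifier` and `notify`, the default threshold of 0.5, and the frame representation are all hypothetical and do not appear in the application. In particular, `toy_classifier` is a placeholder that scores a frame by its mean absolute amplitude; it merely stands in for the trained deep learning classifier, whose architecture the claims do not specify.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Sequence

Frame = Sequence[float]  # one chunk of audio samples (hypothetical representation)


@dataclass
class DetectionResult:
    detected: bool               # True if a detection success signal was output
    confidence: Optional[float]  # confidence that triggered detection, if any
    frames_consumed: int         # number of frames read before stopping


def detect_infant_crying(
    frames: Iterable[Frame],
    classifier: Callable[[Frame], float],
    threshold: float = 0.5,  # assumed value; the claims only say "preset"
    notify: Optional[Callable[[float], None]] = None,
) -> DetectionResult:
    """Run the threshold test on each incoming frame until detection succeeds."""
    consumed = 0
    for frame in frames:
        consumed += 1
        # Claim 1: the classifier outputs a confidence for the received audio.
        confidence = classifier(frame)
        # Claim 1: judge whether the confidence is less than the threshold.
        if confidence < threshold:
            continue  # below threshold: keep collecting audio
        # Claim 3: on success, send detection success information to the user.
        if notify is not None:
            notify(confidence)
        # Claim 4: returning here stops collection of further audio.
        return DetectionResult(True, confidence, consumed)
    return DetectionResult(False, None, consumed)


def toy_classifier(frame: Frame) -> float:
    """Placeholder score (NOT the patented model): mean |amplitude|, capped at 1."""
    if not frame:
        return 0.0
    return min(1.0, sum(abs(s) for s in frame) / len(frame))


# Usage: the second frame is loud enough to reach the threshold, so the
# third frame is never read.
messages = []
frames = [[0.1, -0.1], [0.9, -0.8], [0.2, 0.2]]
result = detect_infant_crying(frames, toy_classifier, threshold=0.7,
                              notify=messages.append)
```

The comparison is written as `confidence < threshold` followed by `continue`, so that a confidence exactly equal to the threshold counts as a successful detection, matching the "not less than" wording of the claims.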
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011039588.3A CN112185364A (en) | 2020-09-28 | 2020-09-28 | Method and device for detecting baby crying |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112185364A (en) | 2021-01-05 |
Family
ID=73944610
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011039588.3A Withdrawn CN112185364A (en) | 2020-09-28 | 2020-09-28 | Method and device for detecting baby crying |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112185364A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105139869A (en) * | 2015-07-27 | 2015-12-09 | 安徽清新互联信息科技有限公司 | Baby crying detection method based on interval difference features |
US20160364963A1 (en) * | 2015-06-12 | 2016-12-15 | Google Inc. | Method and System for Detecting an Audio Event for Smart Home Devices |
CN107818779A (en) * | 2017-09-15 | 2018-03-20 | 北京理工大学 | A kind of infant's crying sound detection method, apparatus, equipment and medium |
US10418957B1 (en) * | 2018-06-29 | 2019-09-17 | Amazon Technologies, Inc. | Audio event detection |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province; Applicant after: Sipic Technology Co.,Ltd.; Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province; Applicant before: AI SPEECH Co.,Ltd. |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20210105 |