CN112185364A - Method and device for detecting baby crying - Google Patents

Method and device for detecting baby crying

Info

Publication number
CN112185364A
Authority
CN
China
Prior art keywords
infant crying
infant
crying sound
sound
confidence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011039588.3A
Other languages
Chinese (zh)
Inventor
徐俊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AI Speech Ltd
Original Assignee
AI Speech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AI Speech Ltd
Priority to CN202011039588.3A
Publication of CN112185364A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02 Alarms for ensuring the safety of persons
    • G08B21/0202 Child monitoring systems using a transmitter-receiver system carried by the parent and the child
    • G08B21/028 Communication between parent and child units via remote transmission means, e.g. satellite network
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Child & Adolescent Psychology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Emergency Management (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

The invention discloses a method for detecting baby crying, comprising the following steps: in response to a received audio signal, outputting a confidence of the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; determining whether the confidence of the infant crying sound is less than a preset confidence threshold; and, if the confidence is not less than the preset confidence threshold, outputting an infant crying detection success signal. Enhancing the infant crying sound with a microphone array addresses far-field recognition of infant crying, and training the classifier, based on the deep learning model, with multiple types of infant crying sounds as target sounds improves its recognition performance, so that the confidence it outputs is more accurate.

Description

Method and device for detecting baby crying
Technical Field
The invention belongs to the technical field of voice recognition, and particularly relates to a method and a device for detecting baby crying.
Background
Crying is an instinctive response of infants, especially those younger than two years of age. Because they cannot yet express themselves in speech, crying is their main way of expressing feelings and responding to external stimuli, so a caretaker needs to attend to a crying infant promptly. In practice, however, a caretaker cannot be present at all times. When the baby is asleep in particular, the caretaker is often busy with other tasks such as housework or watching television. If the baby cries at that moment, the caretaker, especially an elderly one, often cannot hear it and respond in time, so the baby may be accidentally injured, bringing grief to the whole family.
At present, some techniques for detecting infant crying are available. They rely mainly on the fact that an infant's cry is relatively loud and high in frequency: statistics of the ambient audio are gathered over a period of time to judge whether an infant is crying.
In the process of implementing the present application, the inventor found that the prior-art solutions have at least the following problems: when the baby's cry is quiet or comes from some distance away, the recognition rate drops sharply; and normal speech also contains sounds that resemble a baby crying, which makes misrecognition more serious.
Disclosure of Invention
The embodiment of the invention provides a method and a device for detecting baby crying, which are used for solving at least one of the technical problems.
In a first aspect, an embodiment of the present invention provides a method for detecting baby crying, including: in response to a received audio signal, outputting a confidence of the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; determining whether the confidence of the infant crying sound is less than a preset confidence threshold; and, if the confidence is not less than the preset confidence threshold, outputting an infant crying detection success signal.
In a second aspect, an embodiment of the present invention provides an apparatus for detecting baby crying, including: a first output module configured to output, in response to a received audio signal, a confidence of the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; a determining module configured to determine whether the confidence of the infant crying sound is less than a preset confidence threshold; and a second output module configured to output an infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold.
In a third aspect, an electronic device is provided, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method for detecting baby crying according to any of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention also provide a computer program product including a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions which, when executed by a computer, cause the computer to perform the steps of the infant crying detection method of any of the embodiments of the present invention.
According to the scheme provided by the method and the device, enhancing the infant crying sound with a microphone array addresses far-field recognition of infant crying, and training a deep learning model on a large volume of infant crying sounds and of sounds that resemble infant crying further improves the recognition performance of the infant crying model.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for detecting crying of an infant according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a method for detecting crying of an infant according to an embodiment of the present invention;
Fig. 3 is a block diagram of an apparatus for detecting baby crying according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Please refer to fig. 1, which shows a flowchart of an embodiment of the baby crying detection method according to the present application, and the baby crying detection method according to the present embodiment may be applied to terminals with a language model or a real-time voice conversation function, such as an intelligent voice television, an intelligent sound box, and other existing intelligent terminals supporting intelligent voice recognition.
As shown in fig. 1, in step 101, in response to the received audio signal, a confidence level of the infant crying sound is output via an infant crying sound classifier, wherein the infant crying sound classifier is obtained by training at least one infant crying sound based on a deep learning model;
in step 102, judging whether the confidence of the baby crying sound is smaller than a preset confidence threshold;
in step 103, if the confidence of the baby cry is not less than the preset confidence threshold, a signal indicating that the baby cry is successfully detected is output.
In this embodiment, in step 101 the infant crying detection apparatus, in response to the received audio signal, outputs the confidence of the infant crying sound via the infant crying sound classifier, which is obtained by training a deep learning model on at least one infant crying sound. In step 102, the apparatus determines whether the confidence is less than a preset confidence threshold. The confidence may be a value normalized to the range 0 to 1; the higher the value, the higher the probability that the sound is infant crying. In step 103, if the confidence is not less than the preset confidence threshold, the apparatus outputs a signal indicating that infant crying has been detected successfully.
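The decision in steps 102 and 103 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `Detection` and `decide` and the default threshold of 0.5 are assumptions introduced here, since the source gives no threshold value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    confidence: float  # classifier output, normalized to [0, 1]
    success: bool      # True when the detection success signal should be output


def decide(confidence: float, threshold: float = 0.5) -> Detection:
    """Steps 102 and 103: succeed when confidence is not less than threshold.

    The default threshold of 0.5 is an assumed placeholder, not a value
    taken from the source.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be normalized to [0, 1]")
    return Detection(confidence=confidence, success=confidence >= threshold)
```

Note that a confidence exactly equal to the threshold counts as a success, which matches the "not less than" wording of step 103.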
According to the scheme provided by this embodiment, the infant crying sound classifier is trained, based on the deep learning model, with multiple types of infant crying sounds as target sounds, which improves the recognition performance of the classifier and makes the confidence it outputs more accurate.
Further, the infant crying sound classifier is also trained, based on the deep learning model, using at least one sound that resembles infant crying as a counterexample. Sounds similar to infant crying can thus be analyzed, and a large number of such audio clips collected and added to model training as counterexamples, which reduces the model's false recognition of sounds that resemble infant crying.
Specifically, a counterexample may be an animal call, such as a cat's meow, or a background music melody whose frequency spectrum is similar to that of infant crying.
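The role of counterexamples in training can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two-dimensional feature vectors are invented, and a logistic regression trained by gradient descent stands in for the deep learning model, which the source does not describe in detail. The point shown is only the labeling: crying is labeled 1 and similar-sounding audio is labeled 0.

```python
import math

# Toy feature vectors (pitch_score, harmonic_score); the values are invented
# for illustration and are not measurements from the source.
CRY_EXAMPLES = [(0.90, 0.90), (0.80, 0.85), (0.85, 0.80)]
SIMILAR_SOUNDS = [(0.80, 0.30), (0.90, 0.20), (0.70, 0.35)]  # e.g. cat meow, melody


def build_dataset(positives, counterexamples):
    """Label crying as 1 and similar-sounding counterexamples as 0."""
    return [(x, 1) for x in positives] + [(x, 0) for x in counterexamples]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(dataset, steps=5000, learning_rate=1.0):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, label in dataset:
            z = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(z) - label
            grad_w = [g + error * f for g, f in zip(grad_w, features)]
            grad_b += error
        weights = [w - learning_rate * g / len(dataset)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(dataset)
    return weights, bias


def confidence(model, features):
    """Confidence in [0, 1] that the features belong to infant crying."""
    weights, bias = model
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


model = train(build_dataset(CRY_EXAMPLES, SIMILAR_SOUNDS))
```

In this toy data the similar sounds share a high first feature with crying, so only the explicit negative labels let the model learn to separate them on the second feature.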
In a preferred embodiment, the infant crying detection apparatus may perform signal enhancement on the audio signal based on the microphone array in response to the audio signal acquired in real time.
The microphone array in the scheme provided by this embodiment may be an array formed by a plurality of microphones. Compared with a single microphone, a microphone array captures the spatial information of the sound and can therefore perform spatial filtering: directional noise and signals from undesired directions are suppressed while the target sound is retained, which enhances the signal.
The microphone array includes: a dual-microphone array, a linear four-microphone array, an annular four-microphone array, and an annular six-microphone array.
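The source does not name a specific enhancement algorithm. As one common example of spatial filtering, the sketch below shows delay-and-sum beamforming with integer sample delays. The function name `delay_and_sum`, the known per-microphone delays, and the toy signals are all assumptions for illustration; a real system would estimate the delays from the array geometry or from the signals.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of sample lists, one per microphone, at the same rate.
    delays:   number of samples by which the target sound lags in each
              channel (assumed known here).
    """
    if not channels or len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Only keep the span where every aligned channel has a sample.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


# Toy example: the target reaches the second microphone one sample later,
# and each microphone carries an offset of opposite sign standing in for
# interference that does not line up across the array.
target = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
mic_a = [s + 0.5 for s in target]
mic_b = [0.0] + [s - 0.5 for s in target]
enhanced = delay_and_sum([mic_a, mic_b], delays=[0, 1])
```

After alignment the target adds coherently while the opposing offsets cancel, which is the effect the paragraph above describes as retaining the target sound and suppressing other signals.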
Specifically, in response to the received infant crying detection success signal, the system sends detection success information to the user. The detection success information may be sent as a short message or as a voice broadcast.
In one specific application scenario, the parents are cooking in the kitchen while a story machine keeps the child company in another room. When the child cries, the story machine receives the signal indicating that infant crying has been detected and announces it by voice, alerting the parents in the kitchen that the child is crying.
In another specific application scenario, a parent goes out for a short time, and the child wakes up in the room and cries. After the smart speaker receives the signal indicating that infant crying has been detected, it sends a message over the network to the parent's mobile phone, prompting the parent to return home as soon as possible.
In one embodiment, if the confidence of the infant crying sound is not less than the preset confidence threshold, the infant crying detection device stops collecting the audio signal. This prevents detection success information from being sent to the user repeatedly because the crying sound is picked up again and again.
Further, when infant crying needs to be detected again, the user can wake the infant crying detection device up again.
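The stop-and-wake behaviour can be sketched as a small state holder. The class name `CryMonitor`, its methods, and the default threshold are hypothetical; the sketch only shows that once a detection fires, further classifier outputs are ignored until the user wakes the device.

```python
class CryMonitor:
    """Emits at most one success signal per wake-up (illustrative sketch)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold   # assumed placeholder value
        self.collecting = True       # whether audio is currently collected
        self.notifications = 0       # success signals emitted so far

    def on_confidence(self, confidence):
        """Handle one classifier output; return True if a signal is emitted."""
        if not self.collecting:
            return False             # collection stopped: nothing is processed
        if confidence >= self.threshold:
            self.collecting = False  # stop collecting after a detection
            self.notifications += 1
            return True
        return False

    def wake(self):
        """Called when the user wakes the device to detect crying again."""
        self.collecting = True
```

For example, feeding the confidences 0.2, 0.9, 0.95 to a monitor with threshold 0.7 emits a signal only for 0.9; the 0.95 that follows is ignored until `wake()` is called.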
It should be noted that the above method steps are not intended to limit the execution order of the steps, and in fact, some steps may be executed simultaneously or in the reverse order of the steps, which is not limited herein.
The following description is provided to enable those skilled in the art to better understand the present disclosure by describing some of the problems encountered by the inventors in implementing the present disclosure and by describing one particular embodiment of the finally identified solution.
In the process of implementing the application, the inventor found that the defects in the prior art are mainly caused by the following: when the baby's cry is quiet or comes from some distance away, the recognition rate drops sharply; and normal speech also contains sounds similar to a baby crying, so resistance to speech interference is low and false recognition is serious.
The inventor also found that solving the far-field recognition problem and improving recognition accuracy requires not only a deep understanding of the characteristics of infant crying, but also knowledge of microphone-array signal enhancement, deep learning, and related fields. Practitioners in the industry rarely have all of this knowledge at once.
The scheme of this application is designed and optimized mainly in the following respects, to improve recognition accuracy and to solve the problem that far-field recognition is not possible and the recognition rate falls as distance increases:
(1) The infant crying sound is enhanced by a microphone array, which addresses far-field recognition of infant crying.
(2) A deep learning model is trained on a large volume of infant crying sounds and of sounds that resemble infant crying, which further improves the recognition performance of the infant crying model.
Please refer to fig. 2, which shows a flow chart of the baby cry detection method of the present application.
As shown in Fig. 2, the first step: audio is collected by a multi-microphone array;
the second step: the audio collected by the multi-microphone array undergoes signal processing to enhance the sound of the baby crying;
the third step: the enhanced sound is input into a deep-learning-based infant crying sound classifier model, and the model outputs a confidence that the segment of sound is an infant crying sound;
the fourth step: it is determined whether the confidence output by the model is not less than a preset threshold, where the confidence may be a value normalized to 0 to 1, and a higher value indicates a higher probability that the sound is infant crying. If the confidence is greater than or equal to the threshold, infant crying has been successfully detected, and the system sends a detection success signal; otherwise, the microphone array continues to collect audio.
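The four steps can be strung together as a simple loop. This is a sketch under stated assumptions: `detect`, `average_channels`, and `loudness_score` are hypothetical names, and the two helper functions are deliberately trivial stand-ins for the real array enhancement and the deep learning classifier, neither of which the source specifies.

```python
def average_channels(frame):
    """Toy enhancement stand-in: average the channels sample by sample."""
    return [sum(samples) / len(samples) for samples in zip(*frame)]


def loudness_score(samples):
    """Toy classifier stand-in: mean absolute amplitude clipped to [0, 1]."""
    return min(1.0, sum(abs(s) for s in samples) / len(samples))


def detect(frames, enhance, classify, threshold):
    """Run steps one to four; return the index of the first detected frame.

    Returns None if no frame reaches the threshold, in which case a live
    system would simply keep collecting audio.
    """
    for index, frame in enumerate(frames):   # step 1: collected audio
        enhanced = enhance(frame)            # step 2: signal enhancement
        confidence = classify(enhanced)      # step 3: classifier confidence
        if confidence >= threshold:          # step 4: threshold comparison
            return index
    return None
```

Passing the enhancement and classification stages as arguments keeps the loop independent of how each stage is implemented, so the stand-ins can be swapped for real components without changing the control flow.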
The inventors also tried the following alternative in the course of carrying out the present application and summarized its advantages and disadvantages.
Beta version: in the initial version, the deep learning model was trained only on a large number of infant cries in various states. For sounds that resemble infant crying, this version had a high false recognition rate; although it was usable, the final performance of the system was affected to some extent.
Referring to fig. 3, a block diagram of an apparatus for detecting baby crying according to an embodiment of the present invention is shown.
As shown in fig. 3, the apparatus 200 for detecting baby crying includes a first output module 210, a determining module 220 and a second output module 230.
The first output module 210 is configured to output, in response to a received audio signal, a confidence of the infant crying sound via an infant crying sound classifier, wherein the classifier is obtained by training a deep learning model on at least one infant crying sound; the determining module 220 is configured to determine whether the confidence of the infant crying sound is less than a preset confidence threshold; and the second output module 230 is configured to output an infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold.
It should be understood that the modules depicted in fig. 3 correspond to various steps in the method described with reference to fig. 1. Thus, the operations and features described above for the method and the corresponding technical effects are also applicable to the modules in fig. 3, and are not described again here.
It is to be noted that the modules in the embodiments of the present disclosure are not intended to limit the aspects of the present disclosure, for example, the first output module may be described as a module that outputs the confidence of the infant crying sound via the infant crying sound classifier in response to the received audio signal. In addition, the related function module may also be implemented by a hardware processor, for example, the determining module may also be implemented by a processor, which is not described herein again.
In other embodiments, the present invention further provides a non-volatile computer storage medium storing computer-executable instructions for performing the method for detecting baby crying in any of the above method embodiments;
as one embodiment, a non-volatile computer storage medium of the present invention stores computer-executable instructions configured to:
outputting a confidence level of the infant crying sound via an infant crying sound classifier in response to the received audio signal, wherein the infant crying sound classifier is obtained by training at least one infant crying sound based on a deep learning model;
judging whether the confidence coefficient of the infant crying sound is smaller than a preset confidence coefficient threshold value, wherein the confidence coefficient is a numerical value which is normalized to 0-1, and the higher the numerical value is, the higher the confidence coefficient is, the higher the probability of the infant crying sound is;
and if the confidence coefficient of the baby crying sound is not less than the preset confidence coefficient threshold value, outputting a successful baby crying detection signal.
The non-volatile computer-readable storage medium may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of the infant crying detection apparatus, and the like. Further, the non-volatile computer-readable storage medium may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the non-transitory computer readable storage medium optionally includes memory remotely located from the processor, which may be connected to the infant crying detection device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform any one of the above methods of detecting baby crying.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 4, the electronic device includes: one or more processors 310 and a memory 320, one processor 310 being illustrated in fig. 4. The apparatus of the baby crying detection method may further include: an input device 330 and an output device 340. The processor 310, the memory 320, the input device 330, and the output device 340 may be connected by a bus or other means, such as the bus connection in fig. 4. The memory 320 is a non-volatile computer-readable storage medium as described above. The processor 310 executes various functional applications and data processing of the server by executing the nonvolatile software programs, instructions and modules stored in the memory 320, so as to implement the baby cry detection method of the above-mentioned method embodiment. The input device 330 may receive input numerical or character information and generate key signal inputs related to user settings and function control of the infant crying detection apparatus. The output device 340 may include a display device such as a display screen.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
As an embodiment, the electronic device is applied to an infant crying detection apparatus, and is used for a client, and includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to:
outputting a confidence level of the infant crying sound via an infant crying sound classifier in response to the received audio signal, wherein the infant crying sound classifier is obtained by training at least one infant crying sound based on a deep learning model;
judging whether the confidence coefficient of the infant crying sound is smaller than a preset confidence coefficient threshold value, wherein the confidence coefficient is a numerical value which is normalized to 0-1, and the higher the numerical value is, the higher the confidence coefficient is, the higher the probability of the infant crying sound is;
and if the confidence coefficient of the baby crying sound is not less than the preset confidence coefficient threshold value, outputting a successful baby crying detection signal.
The electronic device of the embodiments of the present application exists in various forms, including but not limited to:
(1) Mobile communication devices: such devices have mobile communication capabilities and are primarily aimed at providing voice and data communications. Such terminals include smart phones (e.g., iPhones), multimedia phones, feature phones, and low-end phones, among others.
(2) Ultra-mobile personal computer devices: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include PDA, MID, and UMPC devices, etc.
(3) Portable entertainment devices: such devices can display and play multimedia content. They include audio and video players, handheld game consoles, e-book readers, smart toys, and portable in-vehicle navigation devices.
(4) Servers: a server is similar in architecture to a general-purpose computer, but has higher requirements for processing capability, stability, reliability, security, scalability, manageability, and the like, because it must provide highly reliable services.
(5) Other electronic devices with data interaction functions.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for detecting infant crying, comprising:
outputting, in response to a received audio signal, a confidence of an infant crying sound via an infant crying sound classifier, wherein the infant crying sound classifier is obtained by training a deep learning model on at least one infant crying sound;
determining whether the confidence of the infant crying sound is less than a preset confidence threshold; and
if the confidence of the infant crying sound is not less than the preset confidence threshold, outputting an infant crying detection success signal.
2. The method of claim 1, wherein the infant crying sound classifier is trained, based on the deep learning model, using at least one infant-cry-like sound as a negative example.
3. The method of claim 1, wherein after outputting the infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold, the method further comprises:
sending, by the system, detection success information to the user in response to the received infant crying detection success signal.
4. The method of claim 1, wherein after determining whether the confidence of the infant crying sound is less than the preset confidence threshold, the method further comprises:
if the confidence of the infant crying sound is not less than the preset confidence threshold, stopping collection of the audio signal.
5. The method of any one of claims 1-4, wherein before outputting, in response to the received audio signal, the confidence of the infant crying sound via the infant crying sound classifier, the method further comprises:
performing, in response to an audio signal acquired in real time, signal enhancement on the audio signal based on a microphone array.
6. The method of claim 5, wherein the microphone array comprises: a dual-microphone array, a linear four-microphone array, an annular four-microphone array, and an annular six-microphone array.
7. An infant crying detection device, comprising:
a first output module configured to output, in response to a received audio signal, a confidence of an infant crying sound via an infant crying sound classifier, wherein the infant crying sound classifier is obtained by training a deep learning model on at least one infant crying sound;
a judging module configured to determine whether the confidence of the infant crying sound is less than a preset confidence threshold; and
a second output module configured to output an infant crying detection success signal if the confidence of the infant crying sound is not less than the preset confidence threshold.
8. The device of claim 7, wherein the infant crying sound classifier is trained, based on the deep learning model, using at least one infant-cry-like sound as a negative example.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the method of any one of claims 1 to 6.
10. A storage medium having stored thereon a computer program, characterized in that the program, when being executed by a processor, is adapted to carry out the steps of the method of any one of claims 1 to 6.
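The claims recite signal enhancement based on a microphone array but do not name a particular algorithm. Purely as an illustration, the sketch below uses delay-and-sum beamforming, a common array-enhancement technique chosen here as an assumption and not taken from the source. The function name `delay_and_sum` is hypothetical, and the per-microphone delays (in whole samples) are assumed to be already known from the array geometry and the direction of the sound source.

```python
from typing import List, Sequence


def delay_and_sum(
    channels: Sequence[Sequence[float]],
    delays: Sequence[int],
) -> List[float]:
    """Align each channel by its integer sample delay and average them.

    channels[i] is the signal recorded at microphone i; delays[i] is how many
    samples later the target sound reaches microphone i than the earliest one.
    Both the integer delays and the equal weighting are simplifying assumptions.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    if not channels:
        return []
    # Keep only the span in which every shifted channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        return []
    n = len(channels)
    # After alignment the target adds coherently, while uncorrelated noise
    # is attenuated by the averaging.
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / n
        for t in range(length)
    ]
```

For a dual-microphone array in which the second microphone hears the source one sample late, `delay_and_sum([mic0, mic1], [0, 1])` re-aligns the two recordings before averaging. The same function accepts four or six channels; only the delay list changes with the array layout.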
CN202011039588.3A 2020-09-28 2020-09-28 Method and device for detecting baby crying Withdrawn CN112185364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011039588.3A CN112185364A (en) 2020-09-28 2020-09-28 Method and device for detecting baby crying

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011039588.3A CN112185364A (en) 2020-09-28 2020-09-28 Method and device for detecting baby crying

Publications (1)

Publication Number Publication Date
CN112185364A true CN112185364A (en) 2021-01-05

Family

ID=73944610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011039588.3A Withdrawn CN112185364A (en) 2020-09-28 2020-09-28 Method and device for detecting baby crying

Country Status (1)

Country Link
CN (1) CN112185364A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139869A (en) * 2015-07-27 2015-12-09 安徽清新互联信息科技有限公司 Baby crying detection method based on interval difference features
US20160364963A1 (en) * 2015-06-12 2016-12-15 Google Inc. Method and System for Detecting an Audio Event for Smart Home Devices
CN107818779A (en) * 2017-09-15 2018-03-20 北京理工大学 A kind of infant's crying sound detection method, apparatus, equipment and medium
US10418957B1 (en) * 2018-06-29 2019-09-17 Amazon Technologies, Inc. Audio event detection

Similar Documents

Publication Publication Date Title
CN112119454B (en) Automatic assistant adapted to multiple age groups and/or vocabulary levels
CN108538298B (en) Voice wake-up method and device
CN109947984A (en) A kind of content delivery method and driving means for children
WO2020253128A1 (en) Voice recognition-based communication service method, apparatus, computer device, and storage medium
EP3923198A1 (en) Method and apparatus for processing emotion information
CN110910885B (en) Voice wake-up method and device based on decoding network
US11862153B1 (en) System for recognizing and responding to environmental noises
JP6892426B2 (en) Learning device, detection device, learning method, learning program, detection method, and detection program
CN107146605B (en) Voice recognition method and device and electronic equipment
CN112786029A (en) Method and apparatus for training VAD using weakly supervised data
CN111312222A (en) Awakening and voice recognition model training method and device
CN111627423A (en) VAD tail point detection method, device, server and computer readable medium
CN111243604B (en) Training method for speaker recognition neural network model supporting multiple awakening words, speaker recognition method and system
KR20220078614A (en) Systems and Methods for Prediction and Recommendation Using Collaborative Filtering
CN108806699B (en) Voice feedback method and device, storage medium and electronic equipment
CN111063375A (en) Music playing control system, method, equipment and medium
CN114220423A (en) Voice wake-up, method of customizing wake-up model, electronic device, and storage medium
CN113205809A (en) Voice wake-up method and device
CN111339881A (en) Baby growth monitoring method and system based on emotion recognition
CN112185364A (en) Method and device for detecting baby crying
KR20180012192A (en) Infant Learning Apparatus and Method Using The Same
CN116825105A (en) Speech recognition method based on artificial intelligence
CN111536660A (en) Control method and device of air conditioner, storage medium and air conditioner
CN111063338B (en) Audio signal identification method, device, equipment, system and storage medium
JP6306447B2 (en) Terminal, program, and system for reproducing response sentence using a plurality of different dialogue control units simultaneously

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant after: Sipic Technology Co.,Ltd.

Address before: 215123 building 14, Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou City, Jiangsu Province

Applicant before: AI SPEECH Co.,Ltd.

WW01 Invention patent application withdrawn after publication

Application publication date: 2021-01-05