WO2023138660A1 - Audio detection method and electronic device - Google Patents

Audio detection method and electronic device

Info

Publication number
WO2023138660A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
signal
electronic device
user
action
Application number
PCT/CN2023/073167
Inventor
方玉锋
李靖
黄洁静
陈文娟
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2023138660A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/08 Detecting, measuring or recording devices for evaluating the respiratory organs
    • A61B5/0823 Detecting or evaluating cough events
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/08 Detecting, measuring or recording devices for evaluating the respiratory organs
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/103 Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11 Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb

Definitions

  • The present application relates to the technical field of electronic devices, and in particular to an audio detection method and an electronic device.
  • Respiratory diseases are common and occur frequently, and some of them are contagious. Simple and efficient detection of certain respiratory diseases is therefore needed.
  • Conventional methods for detecting respiratory diseases mainly include doctor consultation, chest imaging, and sputum culture. These methods generally require manual intervention, are costly, and take a long time, so detection efficiency is low. As a result, research has emerged on detecting respiratory diseases (such as asthma, COPD, and COVID-19) from cough and breath sounds. Because most respiratory diseases share symptoms such as cough, fever, and dyspnea, an intelligent algorithm can analyze these symptom characteristics to determine whether a respiratory disease is present.
  • Current methods of detecting respiratory diseases from cough sounds are mainly applied to wearable devices.
  • By detecting a user's cough sounds in real time, a wearable device can quickly analyze the user's risk of respiratory disease.
  • However, while analyzing disease risk from cough sounds detected in real time, the wearable device may mistake another person's cough for that of the wearer. Such a false detection makes the detection result for the wearer inaccurate.
  • The present application provides an audio detection method and an electronic device, which are used to determine whether audio collected by the electronic device matches other signals, and then to perform subsequent processing according to the matching relationship.
  • In a first aspect, the present application provides an audio detection method applied to an electronic device that includes one or more sensors. The method comprises: collecting a first audio and identifying the type of the first audio; collecting a first signal through the one or more sensors; when the first audio is identified as a set type, determining a first time interval according to the first audio, where the first time interval includes a start time and an end time; determining a second signal according to the first time interval and the first signal; and displaying first information if the second signal matches the first audio.
  • the electronic device can perform matching between the audio and other signals during the audio generation period for the audio of the set type, and then determine the matching relationship between the audio and other signals, and can perform corresponding subsequent processing when the audio matches other signals.
  • the method can realize matching between specific audio and other signals, and then can perform subsequent processing according to the matching relationship between audio and other signals, for example, further realize matching between audio and objects corresponding to other signals.
  • This method can realize the matching between the audio and the user by matching between the audio and other signals, and then can clarify the corresponding relationship between the audio and the user, so as to perform subsequent processing according to the corresponding relationship, thereby improving the accuracy of the subsequent processing.
  • the user's disease risk can be detected based on the audio, thereby improving detection accuracy and reducing errors.
  • the method further includes: if the second signal does not match the first audio, displaying second information.
  • the electronic device performs different processing when the audio and the signal match and when the audio and the signal do not match. Therefore, by matching the audio and the signal, the electronic device can perform corresponding processing based on the determined matching relationship, so as to ensure the smooth execution of the subsequent processing process, and improve the accuracy and flexibility of the subsequent processing process to a certain extent.
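  • The first-aspect steps described above can be sketched as follows. This is only an illustrative outline, not the applicant's implementation: the names `Sample`, `slice_signal`, `detect`, and the `classify_action` callback are hypothetical, and the audio type, audio action, and time interval are assumed to have been produced by earlier stages.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Sample:
    t: float      # timestamp in seconds
    value: float  # sensor reading, e.g. acceleration magnitude (assumed)


def slice_signal(first_signal: List[Sample], interval: Tuple[float, float]) -> List[Sample]:
    """Cut the 'second signal' out of the first signal using [start, end]."""
    start, end = interval
    return [s for s in first_signal if start <= s.t <= end]


def detect(audio_type: str,
           audio_action: str,
           interval: Tuple[float, float],
           first_signal: List[Sample],
           classify_action: Callable[[List[Sample]], str],
           set_types: Sequence[str] = ("cough",)) -> Optional[str]:
    """Return which information to display, or None if no matching is attempted."""
    if audio_type not in set_types:
        return None  # audio is not of a set type
    second_signal = slice_signal(first_signal, interval)
    # Matching here means: the action inferred from the signal has the same
    # type as the action inferred from the audio.
    if classify_action(second_signal) == audio_action:
        return "first information"
    return "second information"
```

  • In this sketch the match test is a plain equality of action types; any real classifier would be passed in through `classify_action`.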
  • Before determining the first time interval according to the first audio, the method further includes: determining a first risk parameter according to the first audio, and determining that the first risk parameter is greater than or equal to a set threshold, where the first risk parameter indicates the disease risk corresponding to the first audio.
  • The electronic device can determine the corresponding risk parameter based on the audio.
  • When the risk parameter is greater than or equal to the set threshold, the electronic device can continue with the subsequent matching process.
  • When the risk parameter is below the set threshold, no risk response is needed, so the electronic device may skip the subsequent matching process and start the next round of audio collection and processing. In this way, the electronic device can flexibly switch its processing mode according to the value of the risk parameter, improving the accuracy of subsequent processing.
  • the first information is used to indicate that: the first audio is from the first user, and/or that the first user has a risk of disease, wherein the first user is a user wearing an electronic device; the second information is used to indicate that: the first audio is not from the first user, and/or that the environment where the first user is located has a risk of infection.
  • the second signal may be a signal collected for the first user.
  • When the first audio matches the second signal, it may be determined that the first audio also matches the first user; when the first audio does not match the second signal, it may be determined that the first audio does not match the first user either.
  • When the first risk parameter is large: if the first audio matches the second signal, it can be determined that the first user has a high risk of disease; if the first audio does not match the second signal, it can be determined that there is an infection risk in the environment where the first user is located. Based on this, the electronic device can provide more comprehensive and accurate risk warnings for the first user, so that the first user can actively take measures to reduce risk, thereby improving user experience.
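  • The risk gating and the choice between the first and second information can be illustrated with a short sketch. The function name `select_prompt`, the example threshold, and the message wording are assumptions made for illustration; the source does not specify them.

```python
from typing import Optional


def select_prompt(risk: float, risk_threshold: float, matched: bool) -> Optional[str]:
    """Pick the prompt to display, or None when the risk is below the threshold."""
    if risk < risk_threshold:
        # Low risk: skip matching and move on to the next audio capture.
        return None
    if matched:
        # First information: the audio came from the wearer, who may be at risk.
        return "The sound came from you; you may be at risk of a respiratory disease."
    # Second information: the audio came from someone else nearby.
    return "The sound did not come from you; your surroundings may carry an infection risk."
```

  • For example, `select_prompt(0.2, 0.5, True)` returns `None`, because the risk parameter is below the assumed threshold of 0.5.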
  • Determining the first risk parameter according to the first audio includes: extracting feature data from the first audio, where the feature data characterizes the time-domain features and/or frequency-domain features of the first audio; and determining the first risk parameter according to a set disease screening model and the feature data, where the disease screening model represents the relationship between the time-domain and/or frequency-domain features of audio and the risk parameter corresponding to that audio.
  • Analyzing both the time-domain and frequency-domain features of the audio gives a relatively comprehensive analysis, and using a model improves its accuracy, so the risk parameter corresponding to the first audio can be determined more accurately.
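  • As a rough illustration of this step, the sketch below computes two time-domain features (RMS energy and zero-crossing rate) and one frequency-domain feature (spectral centroid), then maps them to a risk parameter. The choice of features and the logistic form of the model are assumptions; the source only states that a disease screening model relates such features to a risk parameter. The weights are placeholders, not trained values.

```python
import cmath
import math
from typing import Dict, Sequence


def extract_features(frame: Sequence[float], sample_rate: int) -> Dict[str, float]:
    """Compute illustrative features; the frame must hold at least 2 samples."""
    n = len(frame)
    # Time-domain features.
    rms = math.sqrt(sum(x * x for x in frame) / n)
    sign_changes = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zcr = sign_changes / (n - 1)
    # Frequency-domain feature: spectral centroid from a naive O(n^2) DFT.
    magnitudes = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]
    total = sum(magnitudes)
    centroid = (sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total
                if total else 0.0)
    return {"rms": rms, "zcr": zcr, "centroid_hz": centroid}


def risk_parameter(features: Dict[str, float], weights: Dict[str, float], bias: float) -> float:
    """Hypothetical screening model: logistic function of a weighted feature sum."""
    z = bias + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))
```

  • A real system would more likely use a fast Fourier transform and a trained model; the naive DFT keeps the example dependency-free.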
  • the method before displaying the first information if the second signal matches the first audio, the method further includes: according to the first audio, determining a first action corresponding to the first audio, and, according to the second signal, determining a second action corresponding to the second signal; matching the second signal to the first audio includes: the type of the second action corresponding to the second signal is the same as the type of the first action corresponding to the first audio.
  • Audio and other signals are different types of information, but if they originate from the same object, both can reflect some characteristics of that object. Therefore, whether the audio matches the other signals can be determined based on whether they correspond to the same feature.
  • the same feature may be the type of action, and the matching between the audio and the signal may be realized by comparing whether the types of actions corresponding to the audio and the signal are the same. For example, when both the audio and the signal are from the user, the action may be an action performed by the user, and when the types of actions corresponding to the audio and the signal are the same, it may be determined that the audio and the signal are from the same user.
  • Determining the second action corresponding to the second signal according to the second signal includes: determining the second action and a confidence level of the second action according to a set action recognition model and the second signal, where the action recognition model represents the relationship between the second signal, the second action, and the confidence level, and the confidence level represents the recognition accuracy of the action recognition model; and either determining that the confidence level is greater than or equal to a set first threshold, or determining that the confidence level is greater than or equal to a set second threshold and less than or equal to a set third threshold, displaying first indication information that asks for confirmation of whether the second action was performed, and receiving second indication information that confirms the second action was performed.
  • the accuracy of action recognition can be improved by using the model, and then the action and its type corresponding to the second signal can be more accurately determined, thereby improving the accuracy of matching.
  • the first indication information may be sent to the user, and the second indication information may be from the user.
  • the electronic device can flexibly switch the subsequent processing process based on the accuracy parameter of the action recognition model, that is, the confidence level, to ensure the accuracy of the final matching result.
  • When the confidence level is small, the first audio can be considered not to match the second signal, so the electronic device may skip the subsequent matching process and start the next round of audio collection and processing.
  • When the confidence level is intermediate, an additional confirmation can be performed, and whether to continue with the subsequent matching process is decided according to the confirmation result, ensuring the accuracy of the processing.
  • the electronic device can flexibly switch the processing mode according to the value of the confidence level, thereby improving the accuracy of the subsequent matching process.
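  • The three-way handling of the confidence level can be sketched as below. The function name `accept_action`, the `ask_user` callback, and the default threshold values are hypothetical; the source names a first, second, and third threshold but gives no values.

```python
from typing import Callable


def accept_action(confidence: float,
                  ask_user: Callable[[], bool],
                  first_threshold: float = 0.8,
                  second_threshold: float = 0.5,
                  third_threshold: float = 0.8) -> bool:
    """Decide whether the recognized action is accepted for matching."""
    if confidence >= first_threshold:
        return True  # high confidence: accept without asking
    if second_threshold <= confidence <= third_threshold:
        # Intermediate confidence: show the first indication information and
        # use the user's reply (the second indication information).
        return ask_user()
    return False  # low confidence: treat as no match


# Example: an intermediate confidence defers to the user's answer.
confirmed = accept_action(0.6, ask_user=lambda: True)
```

  • With these assumed defaults the first and third thresholds coincide, so a confidence exactly at 0.8 is accepted by the first check without prompting.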
  • The second signal includes at least one of the following: a signal collected by an acceleration sensor; a signal collected by a gyroscope sensor; a signal collected by a photoelectric sensor.
  • the signals measured by the acceleration sensor or the gyroscope can represent the motion and posture characteristics of the user, and the signals measured by the photoelectric sensor can represent the physiological characteristics of the user. These signals can reflect the relevant characteristics when the user performs an action. Therefore, based on these signals, the action and its type corresponding to the second signal can be accurately determined, thereby improving the accuracy of matching with the audio.
  • The set type includes at least one of the following: cough sound, breath sound, sneeze sound, and joint snapping sound.
  • the user's disease risk can be predicted based on the types of audio such as cough sounds and breath sounds emitted by the user. Therefore, by matching this type of audio with other signals, the matching result and the prediction result of disease risk can be combined to more accurately determine the objects that may have the risk of disease or infection, so as to further respond to prompts and improve user experience.
  • In a second aspect, the present application provides an audio detection method applied to a first electronic device, the method comprising: collecting a first audio and identifying the type of the first audio; when the first audio is identified as a set type, determining a first time interval according to the first audio, where the first time interval includes a start time and an end time; sending request information to a second electronic device, where the request information requests a signal corresponding to the first time interval; receiving a first signal from the second electronic device, where the first signal is collected by the second electronic device through one or more sensors that the second electronic device includes; and displaying first information if the first signal matches the first audio.
  • the method further includes: if the first signal does not match the first audio, displaying second information.
  • Before determining the first time interval according to the first audio, the method further includes: determining a first risk parameter according to the first audio, and determining that the first risk parameter is greater than or equal to a set threshold, where the first risk parameter indicates the disease risk corresponding to the first audio.
  • the first signal is a signal collected by the second electronic device within the first time interval.
  • the first information is used to indicate that: the first audio is from the first user, and/or that the first user has a risk of disease, wherein the first user is a user wearing the second electronic device; the second information is used to indicate that: the first audio is not from the first user, and/or that the environment where the first user is located has a risk of infection.
  • determining the first risk parameter according to the first audio includes: extracting feature data from the first audio; wherein the feature data is used to characterize the time domain feature and/or frequency domain feature of the first audio; according to the set disease screening model and feature data, determine the first risk parameter; wherein the disease screening model is used to represent the relationship between the time domain feature and/or frequency domain feature of the audio and the risk parameter corresponding to the audio.
  • the method before displaying the first information if the first signal matches the first audio, the method further includes: according to the first audio, determining a first action corresponding to the first audio, and, according to the first signal, determining a second action corresponding to the first signal; matching the first signal to the first audio includes: the type of the second action corresponding to the first signal is the same as the type of the first action corresponding to the first audio.
  • Determining the second action corresponding to the first signal according to the first signal includes: determining the second action and a confidence level of the second action according to a set action recognition model and the first signal, where the action recognition model represents the relationship between the first signal, the second action, and the confidence level, and the confidence level represents the recognition accuracy of the action recognition model; and either determining that the confidence level is greater than or equal to a set first threshold, or determining that the confidence level is greater than or equal to a set second threshold and less than or equal to a set third threshold, displaying first indication information that asks for confirmation of whether the second action was performed, and receiving second indication information that confirms the second action was performed.
  • the first signal includes at least one of the following: a signal collected by an acceleration sensor; a signal collected by a gyroscope sensor; a signal collected by a photoelectric sensor.
  • The set type includes at least one of the following: cough sound, breath sound, sneeze sound, and joint snapping sound.
  • In a third aspect, the present application provides an audio detection method applied to a system composed of a first electronic device and a second electronic device.
  • The method includes: the first electronic device collects a first audio and identifies the type of the first audio; the second electronic device collects a first signal through one or more sensors that the second electronic device includes; when the first electronic device recognizes that the first audio is of a set type, it determines a first time interval according to the first audio, where the first time interval includes a start time and an end time; the first electronic device sends request information to the second electronic device, where the request information requests a signal corresponding to the first time interval; the second electronic device determines the first time interval according to the received request information, and determines a second signal according to the first time interval and the first signal; the second electronic device sends the second signal to the first electronic device; and if the first electronic device determines that the received second signal matches the first audio, it displays first information.
  • the method further includes: if the first electronic device determines that the second signal does not match the first audio, displaying second information.
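  • The exchange between the two devices can be outlined as follows. This is a simplified in-process sketch: the class names `FirstDevice` and `SecondDevice`, the method names, and the `matches` callback are hypothetical, and the real request and response would travel over a wireless link rather than a direct method call.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, sensor value)


class SecondDevice:
    """Wearable that buffers sensor samples (the 'first signal')."""

    def __init__(self) -> None:
        self.first_signal: List[Sample] = []

    def record(self, t: float, value: float) -> None:
        self.first_signal.append((t, value))

    def handle_request(self, start: float, end: float) -> List[Sample]:
        # Recover the time interval from the request and cut out the
        # 'second signal' from the buffered first signal.
        return [(t, v) for t, v in self.first_signal if start <= t <= end]


class FirstDevice:
    """Device that captured the audio and decides what to display."""

    def __init__(self, peer: SecondDevice) -> None:
        self.peer = peer

    def on_set_type_audio(self, start: float, end: float,
                          matches: Callable[[List[Sample]], bool]) -> str:
        second_signal = self.peer.handle_request(start, end)  # request + response
        return "first information" if matches(second_signal) else "second information"
```

  • Both devices are assumed to share a common clock, since the request carries only a start time and an end time.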
  • the method before the first electronic device determines the first time interval according to the first audio, the method further includes: the first electronic device determines a first risk parameter according to the first audio, where the first risk parameter is greater than or equal to a set threshold, where the first risk parameter is used to indicate the disease risk corresponding to the first audio.
  • the first information is used to indicate that: the first audio is from the first user, and/or that the first user has a risk of disease, wherein the first user is a user wearing the second electronic device; the second information is used to indicate that: the first audio is not from the first user, and/or that the environment where the first user is located has a risk of infection.
  • the first electronic device determines the first risk parameter according to the first audio, including: the first electronic device extracts feature data from the first audio; wherein the feature data is used to characterize the time domain feature and/or frequency domain feature of the first audio; the first electronic device determines the first risk parameter according to the set disease screening model and feature data; wherein the disease screening model is used to represent the relationship between the time domain feature and/or frequency domain feature of the audio and the risk parameter corresponding to the audio.
  • the method before the first electronic device displays the first information if it determines that the second signal matches the first audio, the method further includes: the first electronic device determines, according to the first audio, a first action corresponding to the first audio, and, according to the second signal, determines a second action corresponding to the second signal; matching the second signal to the first audio includes: the type of the second action corresponding to the second signal is the same as the type of the first action corresponding to the first audio.
  • The first electronic device determining the second action corresponding to the second signal according to the second signal includes: the first electronic device determines the second action and a confidence level of the second action according to a set action recognition model and the second signal, where the action recognition model represents the relationship between the second signal, the second action, and the confidence level, and the confidence level represents the recognition accuracy of the action recognition model; and the first electronic device either determines that the confidence level is greater than or equal to a set first threshold, or determines that the confidence level is greater than or equal to a set second threshold and less than or equal to a set third threshold, displays first indication information that asks for confirmation of whether the second action was performed, and receives second indication information that confirms the second action was performed.
  • The second signal includes at least one of the following: a signal collected by an acceleration sensor; a signal collected by a gyroscope sensor; a signal collected by a photoelectric sensor.
  • The set type includes at least one of the following: cough sound, breath sound, sneeze sound, and joint snapping sound.
  • the present application provides a system, which includes the first electronic device and the second electronic device described in the third aspect.
  • The present application provides an electronic device, which includes a display screen, a memory, and one or more processors, where the memory is used to store computer program code that includes computer instructions; when the computer instructions are executed by the one or more processors, the electronic device is caused to perform the method of the foregoing aspects.
  • the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program.
  • the computer program executes the method described in the above-mentioned first aspect or any possible design of the first aspect, or executes the method described in the above-mentioned second aspect or any possible design of the second aspect, or executes the method performed by the first electronic device or the second electronic device in the above-mentioned third aspect or any possible design of the third aspect.
  • the present application provides a computer program product.
  • the computer program product includes a computer program or an instruction.
  • the computer executes the method described in the above-mentioned first aspect or any possible design of the first aspect, or executes the method described in the above-mentioned second aspect or any possible design of the second aspect, or executes the method performed by the first electronic device or the second electronic device in the above-mentioned third aspect or any possible design of the third aspect.
  • FIG. 1 is a schematic diagram of a hardware architecture of an electronic device provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a software architecture of an electronic device provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an audio detection method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an interface of a mobile phone displaying prompt information provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an interface of another mobile phone displaying prompt information provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an interface of another mobile phone displaying prompt information provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an interface of another mobile phone displaying prompt information provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a method for detecting the risk of a user's respiratory tract infection provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a function control interface of a smart watch provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an interface of a smart watch displaying disease risk prompt information provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an interface of another smart watch displaying disease risk prompt information provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a prompt interface of a smart watch provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an audio detection method provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of an audio detection method provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may be a device with a wireless connection function.
  • the electronic device may also have audio detection (sound detection) and/or sensing functions.
  • audio may also be referred to as sound.
  • The electronic device may be a portable device, such as a mobile phone, a tablet computer, a wearable device with a wireless communication function (for example, a watch, a bracelet, a helmet, or a headset), a vehicle-mounted terminal device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart home device (such as a smart TV or a smart speaker), a smart robot, workshop equipment, a wireless terminal in self-driving, remote medical surgery, smart grid, transportation safety, smart city, or smart home scenarios, or a flying device (such as a smart robot, a hot air balloon, a drone, or an airplane).
  • the wearable device is a portable device that can be directly worn by the user or integrated into the user's clothes or accessories.
  • the wearable device in the embodiment of the present application may be a portable device with a sensing function and an audio detection function.
  • The electronic device may also be a portable terminal device that includes other functions, such as personal digital assistant and/or music player functions.
  • Portable terminal devices include, but are not limited to, portable terminal devices running ... or other operating systems.
  • the above-mentioned portable terminal device may also be other portable terminal devices, such as a laptop computer (Laptop) with a touch-sensitive surface (such as a touch panel).
  • the above-mentioned electronic device may not be a portable terminal device, but a desktop computer with a touch-sensitive surface (such as a touch panel).
  • In the embodiments of the present application, “at least one” refers to one or more, and “multiple” refers to two or more.
  • “And/or” describes the association relationship of associated objects and indicates that three relationships are possible. For example, “A and/or B” may indicate: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural.
  • The character “/” generally indicates that the objects before and after it are in an “or” relationship.
  • “At least one (item) of the following” or similar expressions refer to any combination of these items, including any combination of single items or plural items.
  • For example, at least one of a, b, or c can represent: a, b, c, a and b, a and c, b and c, or a, b and c, where each of a, b, and c can be single or multiple.
  • an embodiment of the present application provides an audio detection method and electronic equipment, which can more accurately identify the object that emits the audio and improve the accuracy of identifying the audio source.
  • the method provided by the embodiment of the present application can be used in a wearable device.
  • the wearable device can accurately identify whether the detected cough sound belongs to the user wearing the wearable device, and then can perform more accurate disease risk detection for the user based on the cough sound belonging to the user.
  • The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a SIM card interface 195, and the like.
  • the sensor module 180 may include a gyroscope sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, a temperature sensor, a pressure sensor, a distance sensor, a magnetic sensor, an ambient light sensor, an air pressure sensor, a bone conduction sensor, and the like.
  • the electronic device 100 shown in FIG. 1 is only an example and does not constitute a limitation to the electronic device, and the electronic device may have more or fewer components than those shown in the figure, may combine two or more components, or may have different component configurations.
  • the various components shown in FIG. 1 may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated in one or more processors. The controller may be the nerve center and command center of the electronic device 100. The controller can generate operation control signals according to the instruction opcode and timing signals, and thereby control instruction fetching and execution.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • the memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
  • the audio detection method provided by the embodiment of the present application can be executed under the control of the processor 110 or by calling other components. For example, the processor 110 may call the processing program of the embodiment of the present application stored in the internal memory 121, or call, through the external memory interface 120, the processing program stored in a third-party device, to control the wireless communication module 160 to perform data communication with other devices. This improves the intelligence and convenience of the electronic device 100 and improves user experience.
  • the processor 110 can include different devices. For example, when a CPU and a GPU are integrated, the CPU and the GPU can cooperate to execute the audio detection method provided by the embodiment of the present application. For example, in the audio detection method, part of the algorithm is executed by the CPU, and another part of the algorithm is executed by the GPU to obtain faster processing efficiency.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini LED, Micro LED, Micro OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the display screen 194 can be used to display information input by the user or information provided to the user and various graphical user interfaces (graphical user interface, GUI).
  • the display screen 194 can display photos, videos, web pages, or files, etc.
  • the display screen 194 may be an integral flexible display screen, or a spliced display screen composed of two rigid screens and a flexible screen between the two rigid screens.
  • a camera 193 (either a front camera or a rear camera, or a camera that can function as both a front camera and a rear camera) is used to capture still images or video.
  • the camera 193 may include a photosensitive element such as a lens group and an image sensor, wherein the lens group includes a plurality of lenses (convex lens or concave lens) for collecting light signals reflected by objects to be photographed, and transmitting the collected light signals to the image sensor.
  • the image sensor generates an original image of the object to be photographed according to the light signal.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include a program storage area and a data storage area. The program storage area can store the code of the operating system and of application programs (such as an application providing the audio detection function).
  • the storage data area can store data created during the use of the electronic device 100 and the like.
  • the internal memory 121 may also store one or more computer programs corresponding to the audio detection algorithm provided in the embodiment of the present application.
  • the one or more computer programs are stored in the internal memory 121 and configured to be executed by the one or more processors 110, the one or more computer programs include instructions, and the above instructions can be used to execute various steps in the following embodiments.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the code of the audio detection algorithm provided in the embodiment of the present application may also be stored in an external memory.
  • the processor 110 may run the code of the audio detection algorithm stored in the external memory through the external memory interface 120 .
  • the sensor module 180 may include a gyro sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, and the like.
  • the touch sensor is also known as a "touch panel".
  • the touch sensor can be arranged on the display screen 194, and the touch sensor and the display screen 194 form a touch display screen, also called a "touch screen".
  • the touch sensor is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor can also be disposed on the surface of the electronic device 100 at a position different from that of the display screen 194.
  • the display screen 194 of the electronic device 100 displays a main interface, and the main interface includes icons of multiple applications (such as a camera application, a WeChat application, etc.).
  • the display screen 194 displays an interface of the camera application, such as a viewfinder interface.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1 , the antenna 2 , the mobile communication module 150 , the wireless communication module 160 , a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves and radiate them through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device. In the embodiment of the present application, the mobile communication module 150 may also be used for information interaction with other devices.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent from the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the wireless communication module 160 is configured to establish a connection with other electronic devices for data interaction.
  • the wireless communication module 160 can be used to connect to access point devices, send control commands to other electronic devices, or receive data from other electronic devices.
  • the electronic device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
  • the electronic device 100 may receive an input from the button 190 and generate a key signal input related to user settings and function control of the electronic device 100.
  • the electronic device 100 can use the motor 191 to generate a vibration prompt (such as a vibration prompt for an incoming call).
  • the indicator 192 in the electronic device 100 can be an indicator light, which can be used to indicate the charging status, the change of the battery capacity, and can also be used to indicate messages, missed calls, notifications and the like.
  • the SIM card interface 195 in the electronic device 100 is used for connecting a SIM card.
  • the SIM card can be connected and separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the electronic device 100 may include more or fewer components than those shown in FIG. 1 , which is not limited in this embodiment of the present application.
  • the illustrated electronic device 100 is only one example, and the electronic device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different configuration of components.
  • the various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • the Android system with layered architecture is taken as an example to illustrate the software structure of the electronic device.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces. As shown in Figure 2, the software architecture can be divided into four layers, from top to bottom are the application program layer, the application program framework layer (framework, FWK), Android runtime and system library, and the Linux kernel layer.
  • the application layer is the top layer of the operating system, including native applications of the operating system, such as camera, gallery, calendar, Bluetooth, music, video, information, and so on.
  • the application program involved in the embodiment of the present application is referred to as an application (application, APP), which is a software program capable of realizing one or more specific functions.
  • multiple applications can be installed in an electronic device.
  • the application mentioned below may be a system application installed on the electronic device when it leaves the factory, or a third-party application downloaded from the Internet or obtained from other electronic devices by the user during the use of the electronic device.
  • the application program can be developed using the Java language and completed by calling the application programming interface (application programming interface, API) provided by the application program framework layer.
  • the application framework layer provides API and programming framework for applications in the application layer.
  • the application framework layer can include some predefined functions.
  • Application framework layers can include window managers, content providers, view systems, telephony managers, resource managers, notification managers, and more.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • Data may include files (such as documents, videos, images, audios), text and other information.
  • the view system includes visual controls, such as controls that display text, pictures, documents, etc.
  • the view system can be used to build applications.
  • the interface displayed in the window can be composed of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of electronic devices.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
  • the Android runtime includes core libraries and a virtual machine.
  • the Android runtime is responsible for the scheduling and management of the Android system.
  • the core library includes two parts: one part is the functions that the Java language needs to call, and the other part is the core library of the Android system.
  • the application layer and the application framework layer run in virtual machines. Taking Java as an example, the virtual machine executes the Java files of the application program layer and the application program framework layer as binary files. The virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • a system library can include multiple function modules. For example: surface manager, media library, 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
  • the media library can support a variety of audio, video, and image encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing, etc.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel (Kernel) layer provides the core system services of the operating system, such as security, memory management, process management, network protocol stack and driver model, etc. are all implemented based on the kernel layer.
  • the kernel layer also acts as an abstraction layer between the hardware and software stacks. There are many drivers related to electronic devices in this layer. The main drivers are: display driver; keyboard driver as an input device; Flash driver based on memory technology devices; camera driver; audio driver; Bluetooth driver; WiFi driver, etc.
  • the audio detection method provided by the embodiment of the present application includes:
  • S301 The electronic device determines that the collected audio contains target audio, and the type of the target audio is a set type.
  • the target user may be a user wearing or carrying an electronic device, or the target user may also be a preset user associated with the electronic device.
  • the above-mentioned target audio may be cough sound, breath sound, sneeze sound, joint snapping (that is, the sound produced by the joints of the human body) and the like.
  • the electronic device may be a device with an audio detection function, and the electronic device may collect and detect the audio (or sound) in the environment in real time, thereby obtaining the target audio signal (ie, the audio signal of the target audio).
  • the electronic device may also be a device with a wireless connection function, and then the electronic device may receive target audio signals collected by other audio detection devices.
  • the electronic device may also be a device having both the audio detection function and the wireless connection function, and the electronic device may acquire the target audio signal in any of the above methods.
  • the target audio may be a sound made by any user in the environment where the electronic device is located.
  • the electronic device can receive and recognize the audio that appears in the environment in real time.
  • when the electronic device recognizes that the received audio is of a set type, such as a cough sound, it can determine that a user in the environment is coughing. The electronic device can therefore capture the audio signal during the user's cough and obtain the corresponding cough sound signal.
  • the electronic device may identify the target action corresponding to the target audio according to the target audio.
  • the specific recognition can be realized with the help of the trained network model.
  • after the electronic device acquires the target audio signal, it may first detect, according to the target audio signal, whether the user who produced the target audio has a disease risk. Specifically, the electronic device may first preprocess the target audio signal, for example by denoising, pre-emphasis, framing, and windowing. Feature data of the target audio can then be extracted from the preprocessed target audio signal, and whether the user to whom the target audio belongs has a disease risk is detected according to the extracted feature data.
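  • the preprocessing steps named above (pre-emphasis, framing, windowing) can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the function names, the pre-emphasis coefficient 0.97, the Hamming window, and the frame and hop lengths are all assumptions chosen for the example, and denoising is omitted.

```python
import math
from typing import List


def pre_emphasis(signal: List[float], alpha: float = 0.97) -> List[float]:
    """Apply y[n] = x[n] - alpha * x[n-1] to boost high frequencies (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_signal(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split the signal into overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n: int) -> List[float]:
    """Hamming window of length n (one common window choice, assumed here)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preprocess(signal: List[float], frame_len: int = 400, hop: int = 160,
               alpha: float = 0.97) -> List[List[float]]:
    """Pre-emphasize, frame, and window a signal; returns a list of windowed frames."""
    emphasized = pre_emphasis(signal, alpha)
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(emphasized, frame_len, hop)]
```

  • at an assumed sampling rate of 16 kHz, the default values correspond to 25 ms frames with a 10 ms hop; feature extraction would then operate on each windowed frame.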
  • the feature data of the target audio may include but not limited to at least one of the following:
  • the time domain feature may be a zero crossing rate (zero crossing rate, ZCR) of the audio signal.
  • the zero-crossing rate refers to the number of times the signal passes through the zero point (from positive to negative or from negative to positive) in each frame of audio signal.
  • the time-domain features may also include attack time, autocorrelation parameters, or waveform features of the audio signal. Among them, the attack time refers to the duration of the audio energy in the rising stage.
  • the autocorrelation parameter refers to the similarity of an audio signal to its time-shifted signal.
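  • the two time-domain features defined above can be illustrated with a short sketch. The function names are hypothetical, a sample equal to zero is treated as non-negative, and the autocorrelation is normalized by the frame energy; these conventions are assumptions for the example and are not specified in the source.

```python
from typing import List


def zero_crossings(frame: List[float]) -> int:
    """Count sign changes (positive to negative or negative to positive) within one frame."""
    count = 0
    for prev, cur in zip(frame, frame[1:]):
        # A sample equal to zero is treated as non-negative (an assumed convention).
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def autocorrelation(frame: List[float], lag: int) -> float:
    """Similarity between the frame and a copy of itself shifted by `lag` samples, normalized by the frame energy."""
    energy = sum(x * x for x in frame)
    if energy == 0 or lag >= len(frame):
        return 0.0
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag)) / energy
```

  • dividing the crossing count by the frame length gives a rate that can be compared across frames of different sizes.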
  • the frequency domain feature may be Mel-frequency cepstral coefficients (MFCC), power spectral density, spectral centroid, spectral flatness, spectral flux, and the like.
  • MFCC is the cepstral coefficient extracted in the Mel-scale frequency domain, and the Mel-scale describes the nonlinear characteristics of the human ear's perception of frequency.
  • the power spectral density refers to the power (mean square value) in the unit frequency band of the signal.
  • the spectral centroid refers to the concentrated point of energy in the signal spectrum, which is used to describe the brightness of the signal tone.
  • Spectral flatness quantifies the similarity between the signal and noise.
  • Spectral flux quantifies the degree of change between adjacent frames of the signal.
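  • a few of the frequency-domain features above can be computed from a magnitude spectrum as sketched below. The formulas are common textbook definitions (magnitude-weighted mean frequency for the centroid, geometric mean over arithmetic mean of the power spectrum for flatness, Euclidean distance between successive spectra for flux) and are assumed here; the source does not give formulas. The naive DFT and all function names are for illustration only.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (adequate for a short demo frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags: List[float], sample_rate: float, frame_len: int) -> float:
    """Magnitude-weighted mean frequency in Hz; bin k maps to k * sample_rate / frame_len."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mags)) / total


def spectral_flatness(mags: List[float], eps: float = 1e-12) -> float:
    """Geometric mean over arithmetic mean of the power spectrum; near 1 for noise-like spectra."""
    power = [m * m + eps for m in mags]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


def spectral_flux(prev_mags: List[float], cur_mags: List[float]) -> float:
    """Euclidean distance between the magnitude spectra of two adjacent frames."""
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev_mags, cur_mags)))
```

  • a pure tone concentrates its energy in one bin and yields a flatness close to 0, whereas a flat spectrum yields a flatness close to 1.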
  • the energy feature may be root mean square energy or the like. Root mean square energy refers to the energy average of a signal within a certain time range.
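  • as a small illustration of the energy feature, the root mean square energy of a frame is the square root of the mean of the squared samples. The function name below is hypothetical.

```python
import math
from typing import List


def rms_energy(frame: List[float]) -> float:
    """Root mean square of the samples in one frame; returns 0.0 for an empty frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))
```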
  • the music theory feature may be pitch frequency, inharmonicity, and the like.
  • the pitch frequency refers to the frequency of the pitch of the sound.
  • the inharmonicity refers to the degree of deviation between the overtone frequencies of the signal and integer multiples of the pitch frequency.
  • the perceptual feature may be sound loudness (intensity), sharpness, and the like.
  • the loudness refers to the strength of a signal perceived by the human ear (for example, the size of a sound, etc.).
  • Sharpness is used to indicate the energy of the high-frequency part of the audio signal. The greater the energy of the high-frequency part, the higher the sharpness, and the sharper the sound perceived by the human ear.
  • the electronic device may use model recognition to detect whether there is a disease risk according to the extracted feature data of the target audio.
  • the extracted characteristic data can be input into the trained disease screening model to obtain the disease risk parameter output by the disease screening model, which can be used to indicate the level classification of the disease risk (such as high risk, medium risk, low risk, etc.), or the parameter can be used to indicate a specific value representing the size of the disease risk.
  • the disease risk parameter can also be used to indicate the disease type.
  • the disease screening model adopted by the electronic device can be obtained after training a network model using algorithms such as logistic regression algorithm (logistic regression, LR) and extreme gradient boosting (eXtreme Gradient Boosting, XGBoost) algorithm.
  • when the electronic device detects from the target audio signal that the user who emitted the target audio has a disease risk or a high disease risk (for example, when the disease risk parameter output by the disease screening model is greater than or equal to the set risk threshold), the following step S302 is performed to obtain the physiological data of the target user, and whether the target audio was produced by the target user is determined according to that physiological data.
  • when the electronic device detects according to the target audio signal that the user who emitted the target audio has no disease risk or a low disease risk (for example, when the disease risk parameter output by the disease screening model is less than the set risk threshold), no additional processing needs to be performed; for example, the electronic device only needs to continue the original audio monitoring.
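  • the threshold decision described above can be sketched as follows. A logistic regression score stands in for the trained disease screening model; the weights, bias, feature vector, and function names are hypothetical placeholders, not values from the embodiment.

```python
import math
from typing import Sequence


def disease_risk(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic-regression style risk score in (0, 1); weights and bias are hypothetical."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))


def next_step(risk: float, risk_threshold: float) -> str:
    """Proceed to step S302 when the risk reaches the set threshold; otherwise keep monitoring audio."""
    return "S302" if risk >= risk_threshold else "continue_monitoring"
```

  • the comparison uses greater than or equal to, matching the wording of the threshold condition in the text.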
  • S302 The electronic device acquires physiological data of the target user, where the physiological data is used to characterize the physiological characteristics of the target user.
  • the user usually performs some actions during the process of making a sound.
  • the user when the user makes a cough sound, the user performs the coughing action, and when the user makes the breath sound, the user performs the breathing action. Therefore, there is a corresponding relationship between the sound made by the user and the action performed by the user, that is, the type of sound made by the user corresponds to the type of action performed by the user, and it can be considered that the sound made by the user is generated during the execution of the action. Therefore, in this embodiment of the present application, the target audio may be the sound made by the user to which the target audio belongs during the execution of the target action, wherein the target audio is of a set type, and the type of the target action corresponds to the set type. For example, when the target audio is the above-mentioned coughing sound, breathing sound, sneezing sound, and joint snapping, the corresponding actions are coughing, breathing, sneezing, and joint movement respectively.
  • the electronic device can identify the action performed by the target user based on the physiological characteristics of the target user, and then determine the relationship between the target audio detected by the electronic device and the target user according to the relationship between the action performed by the target user and the action corresponding to the audio detected by the electronic device.
  • the electronic device may be a device with a sensing function or a sensor, and the electronic device may detect the physiological characteristics of the user in real time, and collect physiological parameters used to characterize the physiological characteristics of the user.
  • the electronic device can also be a device with a wireless connection function, and then the electronic device can receive physiological data collected by other physiological monitoring devices.
  • the electronic device may also be a device having both the sensing function and the wireless connection function, and the electronic device may obtain the physiological data of the user in any of the above methods.
  • the physiological data collected within a period of time closest to the current time may be saved.
  • the duration of this period of time may be a set duration.
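  • keeping only the physiological data from the most recent set duration can be sketched with a time-stamped queue. The class name, the dictionary sample format, and the assumption that samples arrive in non-decreasing timestamp order are all hypothetical choices for this example.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class RecentPhysioBuffer:
    """Keeps only samples whose timestamp lies within `retention_s` seconds of the newest sample."""

    def __init__(self, retention_s: float) -> None:
        self.retention_s = retention_s
        self._items: Deque[Tuple[float, Dict[str, float]]] = deque()

    def add(self, timestamp: float, sample: Dict[str, float]) -> None:
        """Append a sample (timestamps assumed non-decreasing) and evict samples older than the retention window."""
        self._items.append((timestamp, sample))
        cutoff = timestamp - self.retention_s
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def snapshot(self) -> List[Tuple[float, Dict[str, float]]]:
        """Return the retained samples, oldest first."""
        return list(self._items)
```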
  • the electronic device may be a smart watch or a mobile phone.
  • when the electronic device is a smart watch, the target user is the user wearing the smart watch, and the smart watch itself can monitor the physiological characteristics of the target user, and collect and save the user's physiological data in real time.
  • when the electronic device is a mobile phone, the target user may be a user holding the mobile phone or a registered owner of the mobile phone.
  • the target user can wear a wearable device for monitoring physiological characteristics, such as a wristband. The wristband can monitor the physiological characteristics of the target user in real time and send the collected physiological data to the mobile phone, and the mobile phone can save the physiological data from the wristband.
  • alternatively, the wristband may not report the physiological data to the electronic device immediately after collecting it. Instead, the wristband first saves the collected physiological data itself, and then reports the data according to the instructions of the electronic device when the electronic device needs it.
  • when the electronic device executes step S302, it can obtain the required physiological data from its stored physiological data, or can instruct the physiological monitoring device to report the physiological data required by the electronic device.
  • the physiological data of the target user acquired by the electronic device is the physiological data of the user collected within a target time period.
  • the target time period is the time period during which the electronic device or the audio detection device collects the target audio signal, which is also the time period when the target audio is produced. This ensures that the physiological data acquired by the electronic device was generated at the same time as the target audio, so that the audio signal and the physiological data from the same time period can be matched, which ensures matching accuracy.
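  • selecting the physiological data that falls within the target time period can be sketched as a simple timestamp filter. The sample representation as (timestamp, value) pairs, the inclusive interval bounds, and the function name are assumptions for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, measured value); an assumed representation


def select_target_period(samples: List[Sample], audio_start: float, audio_end: float) -> List[Sample]:
    """Return the samples collected while the target audio was being recorded (bounds inclusive)."""
    if audio_end < audio_start:
        raise ValueError("audio_end must not precede audio_start")
    return [(t, v) for t, v in samples if audio_start <= t <= audio_end]
```

  • the same start and end timestamps would be taken from the target audio signal, so that both data streams describe the same interval.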
  • the physiological data acquired by the electronic device at least includes motion posture parameters and/or heart rate parameters.
  • the motion posture parameter is used to indicate the motion or posture characteristics of the measured user, and may be measured by devices such as an acceleration sensor (ACC) and a gyroscope.
  • the heart rate parameter is used to indicate the heart rate and pulse characteristics of the measured user, and can be measured by a photoelectric sensor using a photoplethysmography (PPG) method.
  • the physiological data may also include but not limited to at least one of the following: respiratory rate parameters, blood oxygen parameters, blood pressure parameters, pulse parameters, etc. These parameters can be obtained by analyzing and calculating data measured by sensors such as ACC, gyroscope, and photoelectric sensor.
  • the above physiological data can be used to detect the action performed by the corresponding user, and then determine whether the target audio belongs to the user according to whether the action performed by the user is consistent with the action corresponding to the target audio. For example, when the motion posture parameters measured by the ACC show a large peak value or obvious fluctuation over time, and the heart rate parameter shows a relatively obvious increase, it can be determined that the user has performed a coughing action. If the target audio signal is a cough sound signal, it can be determined that the target audio belongs to the user.
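  • the cough example above can be expressed as a simple rule, sketched below under stated assumptions. The use of the acceleration range (maximum minus minimum) as the measure of peak or fluctuation, the heart rate rise relative to the first sample, both thresholds, and the function names are hypothetical; the embodiment itself describes a trained action recognition model for this step.

```python
from typing import Sequence


def performed_cough_action(acc: Sequence[float], heart_rate: Sequence[float],
                           acc_swing_threshold: float, hr_rise_threshold: float) -> bool:
    """Heuristic: a large swing in acceleration together with a clear heart rate rise suggests a cough."""
    if not acc or not heart_rate:
        return False
    acc_swing = max(acc) - min(acc)            # peak or fluctuation within the target time period
    hr_rise = max(heart_rate) - heart_rate[0]  # increase relative to the start of the period
    return acc_swing >= acc_swing_threshold and hr_rise >= hr_rise_threshold


def cough_belongs_to_user(audio_type: str, acc: Sequence[float], heart_rate: Sequence[float],
                          acc_swing_threshold: float, hr_rise_threshold: float) -> bool:
    """The cough sound is attributed to the user only if the user also performed a coughing action."""
    return audio_type == "cough" and performed_cough_action(
        acc, heart_rate, acc_swing_threshold, hr_rise_threshold)
```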
  • S303 The electronic device determines whether the target audio is from the target user according to the physiological data.
  • After the electronic device acquires the physiological data of the target user, it may determine whether the target audio originates from (belongs to) the target user according to the physiological data. Specifically, the electronic device may use a pre-trained action recognition model to identify, from the physiological data of the target user, whether the target user performed the target action corresponding to the target audio, and then determine whether the target audio matches the target user according to whether the target user performed the target action.
  • the type of the target action corresponds to the type of the target audio, and the target action can be determined according to the target audio.
  • the action recognition model may be obtained by training a network model using algorithms such as support vector machines (SVM) and LR.
  • physiological sample parameters of users performing various actions may be collected to form a model database of various physiological parameters.
  • the action recognition model can match the physiological data input into the model with the physiological parameters in the model database. If the physiological parameters of a certain type of action have the highest matching degree with the physiological data input into the action recognition model or the matching degree is greater than the set matching degree threshold, then this type of action is determined to be the action corresponding to the physiological data input into the action recognition model. If this type of action is the same as the target action, it can be determined that the user to whom the physiological data belongs, that is, the target user, has performed the target action, and then it can be determined that the target audio is the sound produced by the target user during the execution of the target action. Conversely, if the type of action is different from the target action, a determination result may be obtained that the target audio is not the sound made by the target user during the execution of the target action.
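The matching against the model database described above can be sketched as follows. The description does not specify how the matching degree is computed, so cosine similarity between feature vectors is an assumption here, as are the function names, the one-template-per-action database layout, and the threshold value. The sketch requires the best match to also exceed the threshold, which is a stricter combination of the two conditions the description offers as alternatives.

```python
import math


def cosine_similarity(a, b):
    """Matching degree between two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize_action(sample, model_database, match_threshold=0.9):
    """Return (action, score) for the best-matching action type, or (None, score)
    if even the best match stays below the hypothetical threshold."""
    best_action, best_score = None, 0.0
    for action, template in model_database.items():
        score = cosine_similarity(sample, template)
        if score > best_score:
            best_action, best_score = action, score
    if best_score < match_threshold:
        return None, best_score
    return best_action, best_score


def audio_from_target_user(sample, model_database, target_action,
                           match_threshold=0.9):
    """True if the recognized action type equals the target action."""
    action, _ = recognize_action(sample, model_database, match_threshold)
    return action == target_action
```

For example, with a database containing a "cough" template and a "walk" template, physiological data close to the "cough" template yields True for the target action "cough", and data close to "walk" yields False.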
  • when performing source detection for a specific type of audio, for example audio produced when a target action is performed, the model training stage may instead collect only physiological sample parameters of different users performing the target action, forming a model database for that type of physiological parameter.
  • the action recognition model can match the physiological data input into the model with the physiological parameters in the model database. If the matching degree between the input physiological data and more than a set number of physiological parameters in the model database is greater than the set matching degree threshold, it can be determined that the target audio is the sound produced by the target user during the execution of the target action; otherwise, it is determined that the target audio is not the sound produced by the target user during the execution of the target action.
  • the action recognition model can be used to identify the user state or user action gesture corresponding to the input physiological data. When it is determined that the action performed by the user corresponding to the physiological data is the target action, it can also be determined that the target user to whom the physiological data belongs has performed the target action; the physiological data may then be considered to match the target audio signal, and the target audio is determined to be the sound made by the target user while performing the target action.
  • the collected physiological data of the user performing different actions and the corresponding action indication labels can be used to form a data pair, which can be used as training data to train the network model.
  • the input data of the action recognition model may be the physiological data obtained by the electronic device
  • the output data of the action recognition model may be the judgment result of whether the target user performs the target action, or the confidence level of the action recognition model, which is used to indicate the possibility of the user performing the target action. The higher the confidence, the greater the possibility of the user performing the target action.
  • the electronic device inputs the acquired physiological data into the action recognition model, and then can obtain the recognition result or the corresponding confidence level of whether the user performs the target action output by the action recognition model, and then judge whether the target audio is the sound of the target user.
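  • when the confidence is greater than or equal to a set first confidence threshold, the electronic device may determine that the target audio is from the target user.
  • when the confidence is less than or equal to a set second confidence threshold, the electronic device may determine that the target audio does not originate from the target user.
  • when the confidence lies between the second confidence threshold and the first confidence threshold, the electronic device may further determine the relationship between the target audio and the target user by prompting the target user for confirmation.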
  • the first confidence threshold is greater than the second confidence threshold.
  • the electronic device may also determine that the target audio is from the target user when determining that the confidence of the above model is greater than or equal to a set third confidence threshold, and determine that the target audio is not from the target user when determining that the confidence of the model is less than the third confidence threshold.
  • the electronic device may display prompt information to ask whether the user has performed the target action, and determine whether the user has performed the target action according to information fed back by the user, and if so, finally determine that the target audio is from the target user, otherwise, determine that the target audio is not from the target user.
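The confidence-based decision described above can be sketched as a small three-way classifier. The enum, the function names, and the threshold values 0.8 and 0.4 are assumptions for illustration; only the ordering (first threshold greater than second) and the boundary handling (greater than or equal to the first, less than or equal to the second, as in steps S810 and S812) come from the description.

```python
from enum import Enum


class Source(Enum):
    TARGET_USER = "target_user"
    OTHER_USER = "other_user"
    NEEDS_CONFIRMATION = "needs_confirmation"


def classify_source(confidence, first_threshold=0.8, second_threshold=0.4):
    """Map the model confidence to a source decision.

    The threshold values are hypothetical; the description only requires
    first_threshold > second_threshold.
    """
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than second threshold")
    if confidence >= first_threshold:
        return Source.TARGET_USER
    if confidence <= second_threshold:
        return Source.OTHER_USER
    return Source.NEEDS_CONFIRMATION


def resolve_source(confidence, ask_user, first_threshold=0.8,
                   second_threshold=0.4):
    """Resolve the intermediate case by asking the user.

    ask_user is a caller-supplied callable that shows the prompt and returns
    True if the user confirms having performed the target action.
    """
    decision = classify_source(confidence, first_threshold, second_threshold)
    if decision is Source.NEEDS_CONFIRMATION:
        return Source.TARGET_USER if ask_user() else Source.OTHER_USER
    return decision
```

The single-threshold variant mentioned above corresponds to skipping the confirmation branch and comparing the confidence against a third threshold only.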
  • when the electronic device is a smart watch and the target action is coughing, the electronic device may, when prompting the target user for confirmation, draw the target user's attention to its display screen by means such as vibration, and display a prompt information interface on the display screen.
  • the interface includes prompt information such as "did you cough just now?" Then the target user can click the corresponding option according to the actual situation of coughing or not.
  • the electronic device can determine the state of the target user.
  • when the electronic device determines that the target audio is not from the target user, it determines that the target audio is from another user in the environment where the target user is located.
  • when the electronic device determines that the target audio is from the target user, it can further determine the target user's disease risk (including the type of disease and the corresponding risk); when the electronic device determines that the target audio comes from other users in the environment where the target user is located, it can also further determine the disease risk of those other users.
  • relevant reminders can be given to the target user, so that the target user can respond according to the prompts, thereby avoiding or reducing the possibility of contracting diseases from the environment.
  • the following only uses the electronic device to determine the disease risk of the target user as an example.
  • the method used by the electronic device to determine the disease risk of other users is the same as the method used by the electronic device to determine the disease risk of the target user, and will not be repeated below.
  • the disease risk parameter detected according to the feature data of the target audio in the above step S301 may be used as the disease risk parameter of the target user, so as to obtain the disease risk of the target user.
  • the electronic device may identify and analyze the disease risk of the target user in combination with more information.
  • the electronic device can use a trained disease screening model to identify corresponding disease risk parameters according to at least one of target audio feature data, target user's personal information, and target user's physiological data, so as to obtain the disease risk of the target user.
  • the feature data of the target audio may include at least one of the features provided in the foregoing embodiments.
  • the personal information of the target user may include information such as the target user's gender, age, height, weight, etc., which may be entered into the electronic device by the user in advance.
  • the electronic device may select a corresponding risk level from preset correspondences between different features and risk levels according to at least one characteristic data, and use the selected risk level as the disease risk level of the target user, so as to obtain the disease risk of the target user.
  • the characteristic data may include the above-mentioned various characteristic data, and may also include at least one of the following characteristic parameters: parameters used to indicate the intensity of the target audio; the number of times the electronic device or the audio detection device detects the target audio signal within a preset period of time; the risk level output by the above-mentioned disease screening model; among the target audio signals detected by the electronic device or the audio detection device multiple times, the duration of the longest target audio signal, etc.
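Selecting a risk level from a preset correspondence can be sketched as a table lookup. The description does not give the actual correspondence, so the feature used here (the number of target audio detections within a preset period), the boundary values, and the level names are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical correspondence between cough count and risk level:
# fewer than 5 -> "low", 5 to 19 -> "medium", 20 or more -> "high".
COUGH_COUNT_BOUNDS = [5, 20]
RISK_LEVELS = ["low", "medium", "high"]


def risk_level_from_cough_count(count):
    """Look up the risk level for a cough count within the preset period."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # bisect_right returns how many boundaries are <= count, which is
    # exactly the index of the matching level.
    return RISK_LEVELS[bisect_right(COUGH_COUNT_BOUNDS, count)]
```

Other characteristic data, such as the audio intensity or the duration of the longest target audio signal, could be mapped with tables of the same shape.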
  • at least one parameter can also be used as feature data of the audio and used in the above-mentioned process of using the disease screening model to identify the disease risk.
  • the electronic device may prompt the target user that the target user has a disease risk, so that the target user can further seek medical treatment in a timely manner.
  • the electronic device may prompt the target user that there is a disease risk by means of a notification message prompt or the like.
  • the electronic device may display the interface shown in FIG. 4 .
  • the electronic device may display the interface shown in FIG. 5 .
  • the electronic device can display risk warning information ("possible risk of respiratory infection” as shown in Figure 4 or "high risk of respiratory infection (suspected pneumonia)" as shown in Figure 5), and can also display information such as date and time and corresponding recommendations.
  • the electronic device can also display the following advice: According to your recent detection data, you may be at risk of respiratory infection, please take active measurements or seek medical attention.
  • the electronic device can also display the following suggestion information: According to your recent test data, you may have a high risk of respiratory infection, please seek medical treatment in time.
  • the electronic device may also display prompt information such as "this test is not used as a basis for professional clinical diagnosis of respiratory health".
  • the electronic device may also periodically display notification information to prompt the user's health status.
  • the electronic device can periodically display the interface shown in Figure 6 to remind the user of the current risk of illness, and can also display corresponding suggestions and prompts such as "Analyzing your recent measurement data, your risk of respiratory tract infection is low and within the healthy range, and continuing to maintain good living habits will bring you continuous health", "This test is not used as a professional clinical diagnosis basis for respiratory health".
  • when the electronic device determines that other users have a risk of disease or a high risk of disease, and the type of disease that the other users may suffer from is an infectious disease, it may prompt the target user that there may be a risk of infection in the surrounding environment, and remind the target user to take protective measures.
  • the electronic device may prompt the target user that there may be a risk of respiratory system infection in the current environment by means of a notification message prompt or the like.
  • the electronic device may display the interface shown in FIG. 7, and the interface may include information for prompting the risk level of respiratory system infection in the environment.
  • the interface may also display the current location of the target user, detected relevant characteristic parameters (such as the number of cough sounds detected in the environment), corresponding suggestions, etc.
  • the interface may also include prompt information such as "You are currently located in X place, and the number of coughs in the surrounding environment within XX time is XXX. It is recommended that you leave this place or take protective measures", "This test is not used as a professional clinical diagnosis basis for respiratory system health".
  • the above-mentioned electronic device may also display information such as the waveform of the detected cough sound on the display interface, which is not specifically limited in this embodiment of the present application.
  • the electronic device may determine the relationship between the target audio and the target user according to the physiological parameters of the target user when the target audio is detected, so as to distinguish the audio of the target user from the audio of other users.
  • the target audio may also be processed according to the source of the target audio during subsequent processing. For example, if it is determined that the source of the target audio is the target user, the target audio has high reference significance for detecting the target user's disease risk, and the detection result based on the target audio signal also has high reliability, so the target user's own illness can be detected based on the target audio.
  • if it is determined that the source of the target audio is not the target user, the target audio has low reference significance for detecting the target user's disease risk. Identifying this case avoids mistakenly using the target audio as the target user's own audio for disease risk detection, thereby improving the accuracy of related detection.
  • the solution provided by the embodiment of the present application is introduced below by taking the scenario where the solution provided by the embodiment of the present application is applied to detecting the risk of a user's respiratory tract infection as an example.
  • the following uses an electronic device as a wearable device worn by a target user, and the target user uses the wearable device to detect whether there is a risk of respiratory tract infection as an example.
  • the target audio is cough sound.
  • Wearable devices have an audio detection function, which can detect cough sounds in the environment and detect the risk of respiratory tract infection based on the detected cough sounds.
  • the wearable device has a physiological monitoring function, which can monitor the physiological characteristics of the target user in real time.
  • the flow of the method for detecting the risk of a user's respiratory tract infection includes:
  • the wearable device collects and detects the audio generated in the environment in real time, and detects and saves the physiological data of the target user in real time.
  • the function of the wearable device to detect the user's risk of respiratory tract infection can be controlled and realized through the health monitoring application installed in the wearable device.
  • the wearable device can execute the method provided in this example to detect the risk of respiratory tract infection and other processing.
  • the health monitoring application of the wearable device may provide a control switch of the airway monitoring function.
  • By operating the control switch, the user may turn the airway health monitoring function on or off.
  • the wearable device can execute the method provided in the embodiment of the present application to monitor the respiratory health of the user wearing the wearable device.
  • the wearable device can run the function in the background, and the interface displayed in the foreground can be updated with the user's operation.
  • the wearable device may use a trained audio recognition model to identify whether the collected audio is a cough sound.
  • S803 The wearable device extracts feature data of the cough sound signal.
  • S804 The wearable device inputs the feature data of the cough sound signal into the trained disease screening model.
  • the wearable device uses the disease screening model to determine whether the cough sound corresponds to a risk of respiratory tract infection; if yes, execute step S806; otherwise, execute step S801.
  • S806 The wearable device determines a time period in which the cough sound occurs.
  • the time period includes at least the start time and stop time of the cough sound, and may also include the duration of the cough sound.
  • S807 The wearable device acquires the stored physiological data detected within the time period.
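Steps S806 and S807 amount to selecting the stored physiological samples whose timestamps fall within the cough's time period. A minimal sketch, assuming samples are stored in timestamp order; the `Sample` fields and the function name are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple


class Sample(NamedTuple):
    timestamp: float      # seconds, same clock as the audio timestamps
    heart_rate: float     # bpm, e.g. derived from PPG
    acc_magnitude: float  # acceleration magnitude from the ACC


def samples_in_period(samples: List[Sample], start: float,
                      stop: float) -> List[Sample]:
    """Return stored samples with start <= timestamp <= stop.

    Assumes `samples` is sorted by timestamp, as it would be if the
    device appends readings as they are measured.
    """
    if stop < start:
        raise ValueError("stop must not be earlier than start")
    times = [s.timestamp for s in samples]
    lo = bisect_left(times, start)   # first sample at or after start
    hi = bisect_right(times, stop)   # one past the last sample at or before stop
    return samples[lo:hi]
```

The returned slice is what would be passed to the action recognition model in step S808.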
  • S808 The wearable device inputs the physiological data into the trained action recognition model.
  • the wearable device determines whether the physiological data corresponds to the execution of the target action according to the action recognition model, and determines the corresponding confidence level.
  • For the specific implementation of this step, reference may be made to the method for recognizing physiological data using an action recognition model described in step S303 above, which will not be repeated here.
  • the wearable device determines whether the confidence is greater than or equal to a set first confidence threshold; if yes, execute step S811, otherwise, execute step S812.
  • the wearable device determines that the cough sound belongs to the target user, and displays prompt information to prompt the target user that there is a risk of respiratory tract infection.
  • the wearable device may display the interface shown in FIG. 10 , which includes prompt information prompting the target user to have a risk of respiratory tract infection.
  • the wearable device determines whether the confidence is less than or equal to a set second confidence threshold; if yes, execute step S813, otherwise, execute step S814.
  • the second confidence threshold is smaller than the first confidence threshold.
  • the wearable device determines that the cough sound does not belong to the target user, and displays prompt information to prompt the target user that there is a risk of respiratory tract infection in an environment where the target user is located.
  • the wearable device may display the interface shown in FIG. 11 , which includes prompt information prompting the target user that there is a risk of respiratory tract infection in the environment where the target user is located.
  • S814 The wearable device displays prompt information asking the target user whether they coughed, and then executes step S815.
  • the wearable device may display the interface shown in FIG. 12 , which includes prompt information asking the user whether to cough.
  • the wearable device determines whether the target user coughs according to the information fed back by the user; if yes, execute step S811, otherwise, execute step S813.
  • by detecting the source of the coughing sound, the wearable device can ensure that the coughing sound used for respiratory disease risk prediction comes from the target user, which reduces the interference of wrongly collected audio signals on the prediction process and thereby improves prediction accuracy.
  • the target user's own respiratory tract infection risk and the respiratory tract infection risk in the environment can each be predicted in a targeted manner. On the one hand, this can improve the collection and prediction of the target user's own cough sound.
  • on the other hand, by judging the infection source and risk existing in the environment, the solution can promptly remind the target user to take personal protection and other countermeasures to reduce the risk of infection.
  • the solution can quickly identify the source of coughing sounds, and further predict and warn the risk of respiratory tract infection.
  • the solution offers good real-time performance at a low implementation cost, and the user experience can be greatly improved.
  • this solution has great application value in public scenes such as indoors, and is highly practicable.
  • the embodiment of the present application also provides a method for audio detection, as shown in FIG. 13 , the method includes:
  • the electronic device collects first audio, and identifies the type of the first audio.
  • the electronic device may be the electronic device described in the above embodiment ( FIG. 3 ), or may be the wearable device described in the above embodiment ( FIG. 8 ).
  • the electronic device collects a first signal through one or more sensors, where the electronic device includes one or more sensors.
  • the first signal is a signal collected for the first user.
  • the first user may be the target user described in the foregoing embodiments.
  • the first signal may be a signal corresponding to the physiological data described in the foregoing embodiments, and the electronic device may determine the physiological data described in the foregoing embodiments according to the collected first signal.
  • the first audio of the set type may be the target audio described in the above embodiments
  • the first time interval is the time interval when the first audio is generated, which may be the target time period described in the above embodiments.
  • S1304 The electronic device determines a second signal according to the first time interval and the first signal.
  • the second signal may be a signal corresponding to the physiological data within the target time period described in the above embodiments, and the electronic device may determine the physiological data according to the collected second signal.
  • the electronic device displays second information.
  • the first information may be the notification information described in the above embodiments for notifying the target user that there is a disease risk, such as the information shown in FIG. 6 or FIG. 10 .
  • the second information may be the notification information described in the above embodiments for notifying the target user that there is an infection risk in the environment, such as the information shown in FIG. 7 or FIG. 11 .
  • the embodiment of the present application also provides an audio detection method, which is applied to a system including a first electronic device and a second electronic device, as shown in FIG. 14 , the method includes:
  • the first electronic device collects a first audio, and identifies the type of the first audio.
  • the first electronic device may be the electronic device described in the above embodiment (FIG. 3).
  • the first electronic device may be a mobile terminal device such as a mobile phone.
  • the second electronic device collects the first signal through one or more sensors, where the second electronic device includes one or more sensors.
  • the second electronic device may be the physiological monitoring device described in the foregoing embodiments.
  • the second electronic device may be a wearable device such as a watch or a bracelet.
  • the first signal is a signal collected for the first user.
  • the first user may be the target user described in the foregoing embodiments.
  • the first signal may be a signal corresponding to the physiological data described in the foregoing embodiments.
  • step S1401 may be executed earlier than step S1402, may be executed later than step S1402, or be executed simultaneously (or synchronously) with step S1402.
  • the first audio of the set type may be the target audio described in the above embodiments
  • the first time interval is the time interval when the first audio is generated, which may be the target time period described in the above embodiments.
  • the first electronic device sends request information to the second electronic device, where the request information is used to request to obtain a signal corresponding to the first time interval.
  • the second electronic device determines a first time interval according to the received request information, and determines a second signal according to the first time interval and the first signal.
  • the second signal may be a signal corresponding to the physiological data within the target time period described in the above embodiments.
  • S1406 The second electronic device sends a second signal to the first electronic device.
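The exchange between the two devices (request information carrying the first time interval, then the second signal sent back) can be sketched with two in-process objects. The class names, the message fields, and the direct method call standing in for the wireless link are all assumptions; a real system would serialize the request and send it over a transport such as Bluetooth.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SignalRequest:
    """Request information: the first time interval (hypothetical layout)."""
    start: float
    end: float


class SensorDevice:
    """Stands in for the second electronic device (e.g. a watch)."""

    def __init__(self) -> None:
        self._first_signal: List[Tuple[float, float]] = []

    def record(self, timestamp: float, value: float) -> None:
        # The first signal is collected continuously by the sensors.
        self._first_signal.append((timestamp, value))

    def handle_request(self, request: SignalRequest) -> List[Tuple[float, float]]:
        # Determine the second signal: the part of the first signal that
        # falls within the requested time interval.
        return [(t, v) for t, v in self._first_signal
                if request.start <= t <= request.end]


class AudioDevice:
    """Stands in for the first electronic device (e.g. a phone)."""

    def __init__(self, sensor_device: SensorDevice) -> None:
        self._sensor_device = sensor_device

    def fetch_second_signal(self, start: float,
                            end: float) -> List[Tuple[float, float]]:
        # Build and "send" the request, then receive the second signal.
        return self._sensor_device.handle_request(SignalRequest(start, end))
```

Once the first device holds the second signal, it can run the same matching against the first audio as in the single-device method.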
  • the electronic device displays second information.
  • the first information may be the notification information described in the above embodiments for notifying the target user that there is a disease risk, such as the information shown in FIG. 6 or FIG. 10 .
  • the second information may be the notification information described in the above embodiments for notifying the target user that there is an infection risk in the environment, such as the information shown in FIG. 7 or FIG. 11 .
  • an electronic device 1500 may include: a display screen 1501 , a memory 1502 , one or more processors 1503 , and one or more computer programs (not shown in the figure).
  • the various components described above may be coupled by one or more communication buses 1504 .
  • the display screen 1501 is used for displaying images, videos, application interfaces and other related user interfaces.
  • One or more computer programs are stored in the memory 1502, and one or more computer programs include computer instructions; one or more processors 1503 call the computer instructions stored in the memory 1502, so that the electronic device 1500 performs the audio detection method provided by the embodiment of the present application.
  • the memory 1502 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices or other non-volatile solid-state storage devices.
  • the memory 1502 may store an operating system (hereinafter referred to as the system), such as embedded operating systems such as ANDROID, IOS, WINDOWS, or LINUX.
  • the memory 1502 can be used to store the implementation program of the embodiment of the present application.
  • the memory 1502 can also store a network communication program, which can be used to communicate with one or more additional devices, one or more user devices, and one or more network devices.
  • the one or more processors 1503 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the present application.
  • FIG. 15 is only an implementation of the electronic device 1500 provided in the embodiment of the present application. In actual applications, the electronic device 1500 may also include more or fewer components, which is not limited here.
  • an embodiment of the present application further provides a system, where the system includes the above-mentioned first electronic device and the second electronic device.
  • the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program runs on the computer, the computer executes the method provided by the above-mentioned embodiments.
  • the embodiments of the present application also provide a computer program product, the computer program product includes computer programs or instructions, and when the computer programs or instructions are run on a computer, the computer is made to execute the method provided by the above embodiments.
  • the methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • a computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present invention are produced in whole or in part.
  • a computer can be a general purpose computer, special purpose computer, computer network, network equipment, user equipment, or other programmable apparatus.
  • Computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • A computer-readable storage medium may be a data storage device, such as a server or data center, that integrates multiple available media. Available media can be magnetic media (for example, floppy disks, hard disks, tapes), optical media (for example, digital video discs (DVD)), or semiconductor media (for example, SSD).

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Pulmonology (AREA)
  • Physics & Mathematics (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Dentistry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

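An audio detection method and an electronic device (100), the method being applied to the electronic device (100). The method includes: collecting first audio and identifying the type of the first audio (S1301); collecting a first signal through one or more sensors, where the electronic device (100) includes the one or more sensors (S1302); when the first audio is identified as being of a set type, determining a first time interval according to the first audio, the first time interval including a start time and an end time (S1303); determining a second signal according to the first time interval and the first signal (S1304); and, if the second signal matches the first audio, displaying first information (S1305). The method can detect the matching relationship between audio collected by the electronic device (100) and other signals, so that subsequent processing can be performed according to that matching relationship.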

Description

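Audio detection method and electronic device
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202210073455.0, entitled "Audio detection method and electronic device", filed with the Intellectual Property Office of the People's Republic of China on January 21, 2022, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of electronic devices, and in particular to an audio detection method and an electronic device.
Background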
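Respiratory diseases are common and frequently occurring, and some of them are contagious, so simple and efficient detection of certain respiratory diseases is necessary.
Conventional methods for detecting respiratory diseases currently include physician consultation, chest imaging, and sputum culture. These methods generally require manual intervention, are costly, and take a long time, so detection efficiency is low. This has prompted more research into detecting respiratory diseases (for example asthma, chronic obstructive pulmonary disease, and COVID-19) based on cough sounds, breath sounds, and the like. Because most respiratory diseases share similar symptoms such as coughing, fever, and difficulty breathing, intelligent algorithms can analyze these symptom features to detect whether a respiratory disease has occurred.
Current methods for detecting respiratory diseases based on cough sounds are mainly applied in wearable devices. By detecting the user's cough sounds in real time, a wearable device can analyze the user's risk of respiratory disease relatively quickly. However, when multiple users are present in the environment of the wearable device (for example, when the wearer is in a public environment), the wearable device, while analyzing disease risk from cough sounds detected in real time, may mistake another user's cough for that of the wearer. This causes false detections and introduces errors into the wearer's detection result.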
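Summary
This application provides an audio detection method and an electronic device for detecting the matching relationship between audio collected by the electronic device and other signals, so that subsequent processing can be performed according to that matching relationship.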
第一方面,本申请提供一种音频检测的方法,应用于电子设备,该方法包括:采集第一音频,并识别第一音频的类型;通过一个或多个传感器采集第一信号,其中,电子设备包括一个或多个传感器;当识别到第一音频为设定类型时,根据第一音频确定第一时间区间,第一时间区间包括起始时间和结束时间;根据第一时间区间及第一信号,确定第二信号;若第二信号与第一音频匹配,显示第一信息。
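In this method, for audio of the set type, the electronic device can match the audio against other signals captured while the audio was being produced, determine the matching relationship between the audio and those signals, and perform corresponding subsequent processing when they match. The method thus matches specific audio to other signals and allows subsequent processing based on that relationship, for example matching the audio to the object that the other signals correspond to. For example, when the other signals are measured for a user, matching the audio to those signals also matches the audio to the user, which establishes the correspondence between the audio and the user; subsequent processing based on that correspondence is then more accurate. For instance, once the audio is determined to match the user, the user's disease risk can be detected from that audio, which improves detection accuracy and reduces error.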
在一种可能的设计中,该方法还包括:若第二信号与第一音频不匹配,显示第二信息。
在该方法中,音频和信号匹配的情况下与音频和信号不匹配的情况下,电子设备执行的处理不同。因此,通过进行音频和信号的匹配,电子设备可以基于确定的匹配关系进行相应的处理,保证后续处理过程的顺利执行,一定程度上能够提高后续处理过程的准确性和灵活性。
在一种可能的设计中,在根据第一音频确定第一时间区间之前,该方法还包括:根据第一音频确定第一风险参数,第一风险参数大于或等于设定阈值,其中,第一风险参数用于指示第一音频对应的疾病风险。
在该方法中,电子设备可以基于音频确定对应的风险参数,当风险参数较大时,有必要进行风险应对处理,因此电子设备可以继续执行后续的匹配过程。当风险参数较小时,可以不进行风险应对处理,因此电子设备可以不进行后续的匹配过程,并可以开始下一次的音频采集和处理过程。基于此,电子设备可以根据风险参数的取值情况灵活切换处理方式,并提高后续处理过程的准确度。
在一种可能的设计中,第一信息用于指示:第一音频来源于第一用户,和/或,第一用户存在患病风险,其中,第一用户为佩戴电子设备的用户;第二信息用于指示:第一音频不是来源于第一用户,和/或,第一用户所在的环境存在感染风险。
在该方法中,第二信号可以是针对第一用户采集的信号。在第一音频与第二信号匹配时,可以确定第一音频也与第一用户匹配,在第一音频与第二信号不匹配时,可以确定第一音频与第一用户也不匹配。进一步的,在第一风险参数较大的情况下,若第一音频与第二信号匹配,则可以确定第一用户的患病风险较大,若第一音频与第二信号不匹配,则可以确定第一用户所在的环境中存在感染风险。基于此,电子设备可以针对第一用户进行更加全面准确的风险提示,以便第一用户主动采取相应措施来降低风险,进而提高用户使用体验。
在一种可能的设计中,根据第一音频确定第一风险参数,包括:从第一音频中提取特征数据;其中,特征数据用于表征第一音频的时域特征和/或频域特征;根据设定的疾病筛查模型和特征数据,确定第一风险参数;其中,疾病筛查模型用于表示音频的时域特征和/或频域特征与音频对应的风险参数之间的关系。
在该方法中,基于音频的时域特征、频域特征等特征可以进行较为全面的音频分析,而利用模型可以提高进行音频分析的准确度,进而更加准确的确定第一音频对应的风险参数,提高进行音频分析的准确度。
在一种可能的设计中,在若第二信号与第一音频匹配,显示第一信息之前,该方法还包括:根据第一音频,确定第一音频对应的第一动作,以及,根据第二信号,确定第二信号对应的第二动作;第二信号与第一音频匹配,包括:第二信号对应的第二动作的类型与第一音频对应的第一动作的类型相同。
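In this method, audio and other signals are different kinds of information, but if they originate from the same object, both reflect certain characteristics of that object. Whether the audio matches the other signals can therefore be determined from whether they correspond to the same characteristic. Here the shared characteristic can be the action type: comparing the action type corresponding to the audio with the action type corresponding to the signal matches the audio to the signal. For example, when the audio and the signal both come from users, the action can be one performed by a user, and when the corresponding action types are the same, the audio and the signal can be determined to come from the same user.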
在一种可能的设计中,根据第二信号,确定第二信号对应的第二动作,包括:根据设定的动作识别模型和第二信号,确定第二动作及第二动作的置信度,其中,动作识别模型用于表示第二信号与第二动作及置信度之间的关系,置信度用于表征动作识别模型的识别准确度;确定置信度大于或等于设定的第一阈值;或者,确定置信度大于或等于设定的第二阈值且小于或等于设定的第三阈值;显示第一指示信息,第一指示信息用于指示:确认是否执行第二动作;接收第二指示信息,第二指示信息用于指示:确认执行第二动作。
在该方法中,一方面,利用模型可以提高进行动作识别的准确度,进而更加准确的确定第二信号对应的动作及其类型,进而提高匹配时的准确度。另一方面,第一指示信息可以是发送给用户,第二指示信息可以是来自于用户。电子设备可以基于动作识别模型的准确度参数即置信度,灵活切换后续的处理过程来保证最终匹配结果的准确度。当置信度较小时,可以认为第一音频与第二信号不匹配,因此电子设备可以不进行后续的匹配过程,并可以开始下一次的音频采集和处理过程。而当置信度处于一个中间取值的范围内时,可以额外进行确认,从而根据确认结果确定是否进行后续的匹配过程,保证处理的准确度。基于此,电子设备可以根据置信度的取值情况灵活切换处理方式,进而提高后续匹配过程的准确度。
在一种可能的设计中,第二信号包括以下至少一项:通过加速度传感器采集的信号;通过陀螺仪传感器采集的信号;通过光电传感器采集的信号。
在该方法中,加速度传感器或陀螺仪测得的信号可以表征用户的运动和姿态特征,光电传感器测得的信号可以表征用户的生理特征,这些信号均可以反映用户执行动作时的相关特征,因此,基于这些信号能够准确的确定第二信号对应的动作及其类型,进而提高与音频匹配时的准确度。
在一种可能的设计中,设定类型包括以下至少一项:咳嗽音,呼吸音,喷嚏音,关节弹响。
在该方法中,基于用户发出的咳嗽音、呼吸音等类型的音频,可以对用户的患病风险进行预测。因此,通过将该种类型的音频与其它信号进行匹配,可以结合匹配结果和患病风险的预测结果,更加准确的确定可能存在患病或感染风险的对象,从而进一步进行响应提示,提高用户使用体验。
第二方面,本申请提供一种音频检测的方法,应用于第一电子设备,该方法包括:采集第一音频,并识别第一音频的类型;当识别到第一音频为设定类型时,根据第一音频确定第一时间区间,第一时间区间包括起始时间和结束时间;向第二电子设备发送请求信息,请求信息用于请求获取第一时间区间对应的信号;接收来自第二电子设备的第一信号,第一信号是第二电子设备通过一个或多个传感器采集得到的,其中,第二电子设备包括一个或多个传感器;若第一信号与第一音频匹配,显示第一信息。
在一种可能的设计中,该方法还包括:若第一信号与第一音频不匹配,显示第二信息。
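In a possible design, before the first time interval is determined according to the first audio, the method further includes: determining a first risk parameter according to the first audio, the first risk parameter being greater than or equal to a set threshold, where the first risk parameter indicates the disease risk corresponding to the first audio.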
在一种可能的设计中,第一信号为第二电子设备在第一时间区间内采集到的信号。
在一种可能的设计中,第一信息用于指示:第一音频来源于第一用户,和/或,第一用户存在患病风险,其中,第一用户为佩戴第二电子设备的用户;第二信息用于指示:第一音频不是来源于第一用户,和/或,第一用户所在的环境存在感染风险。
在一种可能的设计中,根据第一音频确定第一风险参数,包括:从第一音频中提取特征数据;其中,特征数据用于表征第一音频的时域特征和/或频域特征;根据设定的疾病筛查模型和特征数据,确定第一风险参数;其中,疾病筛查模型用于表示音频的时域特征和/或频域特征与音频对应的风险参数之间的关系。
在一种可能的设计中,在若第一信号与第一音频匹配,显示第一信息之前,该方法还包括:根据第一音频,确定第一音频对应的第一动作,以及,根据第一信号,确定第一信号对应的第二动作;第一信号与第一音频匹配,包括:第一信号对应的第二动作的类型与第一音频对应的第一动作的类型相同。
在一种可能的设计中,根据第一信号,确定第一信号对应的第二动作,包括:根据设定的动作识别模型和第一信号,确定第二动作及第二动作的置信度,其中,动作识别模型用于表示第一信号与第二动作及置信度之间的关系,置信度用于表征动作识别模型的识别准确度;确定置信度大于或等于设定的第一阈值;或者,确定置信度大于或等于设定的第二阈值且小于或等于设定的第三阈值;显示第一指示信息,第一指示信息用于指示:确认是否执行第二动作;接收第二指示信息,第二指示信息用于指示:确认执行第二动作。
在一种可能的设计中,第一信号包括以下至少一项:通过加速度传感器采集的信号;通过陀螺仪传感器采集的信号;通过光电传感器采集的信号。
在一种可能的设计中,设定类型包括以下至少一项:咳嗽音,呼吸音,喷嚏音,关节弹响。
第三方面,本申请提供一种音频检测的方法,应用于第一电子设备与第二电子设备组成的系统,该方法包括:第一电子设备采集第一音频,并识别第一音频的类型;以及,第二电子设备通过一个或多个传感器采集第一信号,其中,第二电子设备包括一个或多个传感器;第一电子设备当识别到第一音频为设定类型时,根据第一音频确定第一时间区间,第一时间区间包括起始时间和结束时间;第一电子设备向第二电子设备发送请求信息,请求信息用于请求获取第一时间区间对应的信号;第二电子设备根据接收到的请求信息确定第一时间区间,并根据第一时间区间及第一信号,确定第二信号;第二电子设备向第一电子设备发送第二信号;第一电子设备若确定接收到的第二信号与第一音频匹配,显示第一信息。
在一种可能的设计中,该方法还包括:第一电子设备若确定第二信号与第一音频不匹配,显示第二信息。
在一种可能的设计中,在第一电子设备根据第一音频确定第一时间区间之前,该方法还包括:第一电子设备根据第一音频确定第一风险参数,第一风险参数大于或等于设定阈值,其中,第一风险参数用于指示第一音频对应的疾病风险。
在一种可能的设计中,第一信息用于指示:第一音频来源于第一用户,和/或,第一用户存在患病风险,其中,第一用户为佩戴第二电子设备的用户;第二信息用于指示:第一音频不是来源于第一用户,和/或,第一用户所在的环境存在感染风险。
在一种可能的设计中,第一电子设备根据第一音频确定第一风险参数,包括:第一电子设备从第一音频中提取特征数据;其中,特征数据用于表征第一音频的时域特征和/或频域特征;第一电子设备根据设定的疾病筛查模型和特征数据,确定第一风险参数;其中,疾病筛查模型用于表示音频的时域特征和/或频域特征与音频对应的风险参数之间的关系。
在一种可能的设计中,在第一电子设备若确定第二信号与第一音频匹配,显示第一信息之前,该方法还包括:第一电子设备根据第一音频,确定第一音频对应的第一动作,以及,根据第二信号,确定第二信号对应的第二动作;第二信号与第一音频匹配,包括:第二信号对应的第二动作的类型与第一音频对应的第一动作的类型相同。
在一种可能的设计中,第一电子设备根据第二信号,确定第二信号对应的第二动作,包括:第一电子设备根据设定的动作识别模型和第二信号,确定第二动作及第二动作的置信度,其中,动作识别模型用于表示第二信号与第二动作及置信度之间的关系,置信度用于表征动作识别模型的识别准确度;第一电子设备确定置信度大于或等于设定的第一阈值;或者,第一电子设备确定置信度大于或等于设定的第二阈值且小于或等于设定的第三阈值;显示第一指示信息,第一指示信息用于指示:确认是否执行第二动作;接收第二指示信息,第二指示信息用于指示:确认执行第二动作。
在一种可能的设计中,第二信号包括以下至少一项:通过加速度传感器采集的信号;通过陀螺仪传感器采集的信号;通过光电传感器采集的信号。
在一种可能的设计中,设定类型包括以下至少一项:咳嗽音,呼吸音,喷嚏音,关节弹响。
第四方面,本申请提供一种系统,该系统包括上述第三方面所述的第一电子设备和第二电子设备。
第五方面,本申请提供一种电子设备,该电子设备包括显示屏,存储器和一个或多个处理器;其中,存储器用于存储计算机程序代码,计算机程序代码包括计算机指令;当计算机指令被一个或多个处理器执行时,使得电子设备执行上述第一方面或第一方面的任一可能的设计所描述的方法,或者执行上述第二方面或第二方面的任一可能的设计所描述的方法,或者执行上述第三方面或第三方面的任一可能的设计中由第一电子设备或第二电子设备所执行的方法。
第六方面,本申请提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,当计算机程序在计算机上运行时,使得计算机执行上述第一方面或第一方面的任一可能的设计所描述的方法,或者执行上述第二方面或第二方面的任一可能的设计所描述的方法,或者执行上述第三方面或第三方面的任一可能的设计中由第一电子设备或第二电子设备所执行的方法。
第七方面,本申请提供一种计算机程序产品,该计算机程序产品包括计算机程序或指令,当计算机程序或指令在计算机上运行时,使得计算机执行上述第一方面或第一方面的任一可能的设计所描述的方法,或者执行上述第二方面或第二方面的任一可能的设计所描述的方法,或者执行上述第三方面或第三方面的任一可能的设计中由第一电子设备或第二电子设备所执行的方法。
上述第二方面到第七方面的有益效果,请参见上述第一方面的有益效果的描述,这里不再重复赘述。
附图说明
图1为本申请实施例提供的一种电子设备的硬件架构示意图;
图2为本申请实施例提供的一种电子设备的软件架构示意图;
图3为本申请实施例提供的一种音频检测的方法的示意图;
图4为本申请实施例提供的一种手机显示提示信息的界面示意图;
图5为本申请实施例提供的另一种手机显示提示信息的界面示意图;
图6为本申请实施例提供的又一种手机显示提示信息的界面示意图;
图7为本申请实施例提供的再一种手机显示提示信息的界面示意图;
图8为本申请实施例提供的一种检测用户呼吸道感染风险的方法的流程示意图;
图9为本申请实施例提供的一种智能手表的功能控制界面的示意图;
图10为本申请实施例提供的一种智能手表显示患病风险提示信息的界面示意图;
图11为本申请实施例提供的另一种智能手表显示患病风险提示信息的界面示意图;
图12为本申请实施例提供的一种智能手表的提示界面的界面示意图;
图13为本申请实施例提供的一种音频检测的方法的示意图;
图14为本申请实施例提供的一种音频检测的方法的示意图;
图15为本申请实施例提供的一种电子设备的结构示意图。
具体实施方式
为了使本申请实施例的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施例作进一步地详细描述。其中,在本申请实施例的描述中,以下,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。
为了便于理解,示例性的给出了与本申请相关概念的说明以供参考。
电子设备,可以为具有无线连接功能的设备。本申请一些实施例中,电子设备还可以具备音频检测(声音检测)和/或传感功能。本申请实施例中,音频也可以称为声音。
本申请一些实施例中电子设备可以是便携式设备,诸如手机、平板电脑、具备无线通讯功能的可穿戴设备(例如,手表、手环、头盔、耳机等)、车载终端设备、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal digital assistant,PDA)、智能家居设备(例如,智能电视、智能音箱等)、智能机器人、车间设备、无人驾驶(Self Driving)中的无线终端、远程手术(Remote Medical Surgery)中的无线终端、智能电网(Smart Grid)中的无线终端、运输安全(Transportation Safety)中的无线终端、智慧城市(Smart City)中的无线终端,或智慧家庭(Smart Home)中的无线终端、飞行设备(例如,智能机器人、热气球、无人机、飞机)等。
其中,可穿戴设备为用户可以直接穿戴在身上或者整合到用户的衣服或配件上的一种便携式设备。本申请实施例中的可穿戴设备可以为具备传感功能和音频检测功能的便携式设备。
在本申请一些实施例中,电子设备还可以是还包含其它功能诸如个人数字助理和/或音 乐播放器功能的便携式终端设备。便携式终端设备的示例性实施例包括但不限于搭载或者其它操作系统的便携式终端设备。上述便携式终端设备也可以是其它便携式终端设备,诸如具有触敏表面(例如触控面板)的膝上型计算机(Laptop)等。还应当理解的是,在本申请其它一些实施例中,上述电子设备也可以不是便携式终端设备,而是具有触敏表面(例如触控面板)的台式计算机。
应理解,本申请实施例中“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A、B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一(项)个”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a、b或c中的至少一项(个),可以表示:a,b,c,a和b,a和c,b和c,或a、b和c,其中a、b、c可以是单个,也可以是多个。
针对当前可穿戴设备基于咳嗽音进行呼吸系统疾病检测的方法存在的难以识别咳嗽音来源的问题,本申请实施例提供了一种音频检测的方法及电子设备,该方案可以较为准确的识别发出音频的对象,提高识别音频来源的准确度。
示例性的,本申请实施例提供的方法可以用于可穿戴设备中,当本申请实施例提供的方法应用于可穿戴设备中时,可穿戴设备可以准确的识别检测到的咳嗽音是否属于佩戴该可穿戴设备的用户,进而可以根据属于该用户的咳嗽音,针对该用户进行更为准确的患病风险检测。
下面参阅图1,对本申请实施例提供的方法适用的电子设备的结构进行介绍。
如图1所示,电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,USB接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及SIM卡接口195等。
其中传感器模块180可以包括陀螺仪传感器、加速度传感器、接近光传感器、指纹传感器、触摸传感器、温度传感器、压力传感器、距离传感器、磁传感器、环境光传感器、气压传感器、骨传导传感器等。
可以理解的是,图1所示的电子设备100仅仅是一个范例,并不构成对电子设备的限定,并且电子设备可以具有比图中所示出的更多的或者更少的部件,可以组合两个或更多的部件,或者可以具有不同的部件配置。图1中所示出的各种部件可以在包括一个或多个信号处理和/或专用集成电路在内的硬件、软件、或硬件和软件的组合中实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(Neural-network Processing Unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。其中,控制器可以是电子设备100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
本申请实施例提供的音频检测的方法的执行可以由处理器110来控制或调用其他部件来完成,比如调用内部存储器121中存储的本申请实施例的处理程序,或者通过外部存储器接口120调用第三方设备中存储的本申请实施例的处理程序,来控制无线通信模块160向其它设备进行数据通信,提高电子设备100的智能化、便捷化程度,提升用户的体验。处理器110可以包括不同的器件,比如集成CPU和GPU时,CPU和GPU可以配合执行本申请实施例提供的音频检测的方法,比如音频检测的方法中部分算法由CPU执行,另一部分算法由GPU执行,以得到较快的处理效率。
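The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (flex light-emitting diode, FLED), Mini LED, Micro LED, Micro OLED, quantum dot light-emitting diodes (QLED), and so on. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1. The display screen 194 can display information entered by the user or provided to the user, as well as various graphical user interfaces (GUI). For example, the display screen 194 can display photos, videos, web pages, or files.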
在本申请实施例中,显示屏194可以是一个一体的柔性显示屏,也可以采用两个刚性屏以及位于两个刚性屏之间的一个柔性屏组成的拼接显示屏。
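The camera 193 (a front camera or a rear camera, or one camera that can serve as both) captures still images or video. Typically, the camera 193 includes a lens group and a photosensitive element such as an image sensor. The lens group includes multiple lenses (convex or concave) that collect the light reflected by the object being photographed and pass it to the image sensor. The image sensor generates the original image of the object from the light signal.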
内部存储器121可以用于存储计算机可执行程序代码,可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行电子设备100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,应用程序(比如音频检测的功能等)的代码等。存储数据区可存储电子设备100使用过程中所创建的数据等。
内部存储器121还可以存储本申请实施例提供的音频检测的算法对应的一个或多个计算机程序。该一个或多个计算机程序被存储在上述内部存储器121中并被配置为被一个或多个处理器110执行,该一个或多个计算机程序包括指令,上述指令可以用于执行以下实施例中的各个步骤。
此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
当然,本申请实施例提供的音频检测的算法的代码还可以存储在外部存储器中。这种情况下,处理器110可以通过外部存储器接口120运行存储在外部存储器中的音频检测的算法的代码。
传感器模块180可以包括陀螺仪传感器、加速度传感器、接近光传感器、指纹传感器、触摸传感器等。
触摸传感器,也称“触控面板”。触摸传感器可以设置于显示屏194,由触摸传感器与显示屏194组成触摸显示屏,也称“触控屏”。触摸传感器用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
示例性的,电子设备100的显示屏194显示主界面,主界面中包括多个应用(比如相机应用、微信应用等)的图标。用户通过触摸传感器点击主界面中相机应用的图标,触发处理器110启动相机应用,打开摄像头193。显示屏194显示相机应用的界面,例如取景界面。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。在本申请实施例中,移动通信模块150还可以用于与其它设备进行信息交互。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频装置(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。
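The wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via antenna 2, frequency-modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 110. The wireless communication module 160 can also receive a signal to be sent from the processor 110, frequency-modulate and amplify it, and radiate it as electromagnetic waves via antenna 2. In the embodiments of this application, the wireless communication module 160 is used to establish connections with other electronic devices and exchange data. Alternatively, the wireless communication module 160 can be used to access an access point device, send control instructions to other electronic devices, or receive data sent by other electronic devices.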
另外,电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。电子设备100可以接收按键190输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。电子设备100可以利用马达191产生振动提示(比如来电振动提示)。电子设备100中的指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。电子设备100中的SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备100的接触和分离。
应理解,在实际应用中,电子设备100可以包括比图1所示的更多或更少的部件,本申请实施例不作限定。图示电子设备100仅是一个范例,并且电子设备100可以具有比图中所示出的更多的或者更少的部件,可以组合两个或更多的部件,或者可以具有不同的部件配置。图中所示出的各种部件可以在包括一个或多个信号处理和/或专用集成电路在内的硬件、软件、或硬件和软件的组合中实现。
电子设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的Android系统为例,示例性说明电子设备的软件结构。
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。如图2所示,该软件架构可以分为四层,从上至下分别为应用程序层,应用程序框架层(framework,FWK),安卓运行时和系统库,以及Linux内核层。
应用程序层是操作系统的最上一层,包括操作系统的原生应用程序,例如相机、图库、日历、蓝牙、音乐、视频、信息等等。本申请实施例涉及的应用程序简称应用(application,APP),为能够实现某项或多项特定功能的软件程序。通常,电子设备中可以安装多个应用。比如,相机应用、邮箱应用、智能家居控制应用等。下文中提到的应用,可以是电子设备出厂时已安装的系统应用,也可以是用户在使用电子设备的过程中从网络下载或从其他电子设备获取的第三方应用。
当然,对于开发者来说,开发者可以编写应用程序并安装到该层。一种可能的实现方式中,应用程序可以使用Java语言开发,通过调用应用程序框架层所提供的应用程序编程接口(application programming interface,API)来完成,开发者可以通过应用程序框架来与操作系统的底层(例如内核层等)进行交互,开发自己的应用程序。
应用程序框架层为应用程序层的应用程序提供API和编程框架。应用程序框架层可以包括一些预先定义的函数。应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。数据可以包括文件(例如文档、视频、图像、音频),文本等信息。
视图系统包括可视控件,例如显示文字、图片、文档等内容的控件等。视图系统可用于构建应用程序。显示窗口中的界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备的通信功能。通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。
安卓运行时包括核心库和虚拟机。安卓运行时负责安卓系统的调度和管理。
安卓系统的核心库包含两部分:一部分是Java语言需要调用的功能函数,另一部分是安卓系统的核心库。应用程序层和应用程序框架层运行在虚拟机中。以Java举例,虚拟机将应用程序层和应用程序框架层的Java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
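The system library can include multiple functional modules, for example: a surface manager, media libraries, a 3D graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL). The surface manager manages the display subsystem and provides fusion of 2D and 3D layers for multiple applications. The media libraries support playback and recording of many common audio and video formats, as well as still image files. They can support multiple audio and video encoding formats, for example MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG. The 3D graphics processing library implements 3D graphics drawing, image rendering, compositing, and layer processing. The 2D graphics engine is the drawing engine for 2D drawing.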
内核(Kernel)层提供操作系统的核心系统服务,如安全性、内存管理、进程管理、网络协议栈和驱动模型等都基于内核层实现。内核层同时也作为硬件和软件栈之间的抽象层。该层有许多与电子设备相关的驱动程序,主要的驱动有:显示驱动;作为输入设备的键盘驱动;基于内存技术设备的Flash驱动;照相机驱动;音频驱动;蓝牙驱动;WiFi驱动等。
需要理解的是,如上的功能服务只是一种示例,在实际应用中,电子设备也可以按照其他因素来划分为更多或更少的功能服务,或者可以按照其他方式来划分各个服务的功能,或者也可以不划分功能服务,而是按照整体来工作。
下面结合具体实施例,对本申请提供的方案进行详细说明。
参阅图3,本申请实施例提供的音频检测的方法包括:
S301:电子设备确定采集到的音频中包含目标音频,该目标音频的类型为设定类型。
在本申请一些实施例中,目标用户可以为佩戴或携带电子设备的用户,或者,目标用户也可以为预先设定的、与电子设备所关联的用户。
示例性的,上述目标音频(或设定类型)可以为咳嗽音、呼吸音、喷嚏音、关节弹响(即人体的关节在运动时发出的声音)等。
在本申请一些实施例中,电子设备可以为具备音频检测功能的设备,则电子设备可以对所在环境中的音频(或声音)进行实时的采集和检测,从而获得目标音频信号(即目标音频的音频信号)。电子设备也可以是具备无线连接功能的设备,则电子设备可以接收其它音频检测设备所采集的目标音频信号。当然,电子设备也可以是上述音频检测功能和无线连接功能均具备的设备,则电子设备可以采用上述任一种方式获取目标音频信号。
示例性的,目标音频可以是电子设备所在环境中的任一用户发出的声音。例如,电子设备与目标用户处于同一环境中时,电子设备可以对该环境中出现的音频进行实时的接收和识别,当电子设备识别到接收的音频为设定类型例如咳嗽音时,可以确定该环境中存在一个用户正在咳嗽,因此,电子设备可以获得该用户咳嗽发生期间的音频信号,得到对应的咳嗽音信号。
在本申请一些实施例中,电子设备采集到目标音频后,可以根据目标音频识别目标音频对应的目标动作。具体识别时可以借助已训练的网络模型实现。
在本申请一些实施例中,电子设备在获取目标音频信号后,可以先根据目标音频信号检测发出目标音频的用户是否存在患病风险。具体的,电子设备获取到目标音频信号后,可以先对目标音频信号进行预处理,例如去噪、预加重、分帧、加窗等处理。然后可以从预处理后的目标音频信号中提取目标音频的特征数据,并根据提取到的特征数据检测目标音频所属的用户是否存在患病风险。
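The preprocessing steps named above (pre-emphasis, framing, windowing) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the pre-emphasis coefficient 0.97, the frame length and hop size, and the choice of a Hamming window are all assumptions, and denoising is left out.

```python
import math

def pre_emphasis(samples, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; alpha is an assumed, commonly used value."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - alpha * samples[i - 1] for i in range(1, len(samples))
    ]

def split_frames(samples, frame_len, hop):
    """Cut the signal into overlapping frames; an incomplete tail is dropped."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]

def hamming(n):
    """Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def preprocess(samples, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasize, frame, and window a mono signal given as a list of floats."""
    window = hamming(frame_len)
    frames = split_frames(pre_emphasis(samples, alpha), frame_len, hop)
    return [[x * w for x, w in zip(frame, window)] for frame in frames]
```

With a 16 kHz sampling rate, the assumed defaults correspond to 25 ms frames with a 10 ms hop; other rates would need different values.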
其中,目标音频的特征数据可以包括但不限于以下至少一项:
1)时域特征。
示例性的,时域特征可以为音频信号的过零率(zero crossing rate,ZCR)。过零率是指在每帧音频信号中,信号通过零点(从正变为负或从负变为正)的次数。时域特征还可以包括起音时间、自相关参数或音频信号的波形特征等。其中,起音时间指音频能量在上升阶段的时长。自相关参数指音频信号与其沿时间位移后的信号的相似度。
2)频域特征。
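For example, the frequency-domain features can be Mel-frequency cepstral coefficients (MFCC), power spectral density, spectral centroid, spectral flatness, spectral flux, and so on. MFCCs are cepstral coefficients extracted in the Mel-scale frequency domain; the Mel scale describes the nonlinear way the human ear perceives frequency. Power spectral density is the power (mean square value) of the signal per unit frequency band. The spectral centroid is the point where the energy in the signal's spectrum is concentrated, and describes the brightness of the timbre. Spectral flatness quantifies how similar the signal is to noise. Spectral flux quantifies how much the signal changes between adjacent frames.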
3)能量特征。
示例性的,能量特征可以为均方根能量等。均方根能量指信号在一定时间范围内的能量均值。
4)乐理特征。
示例性的,乐理特征可以为基音频率、失谐度等。其中,基音频率指声音的音高的频率。失谐度指信号泛音频率与基音频率的整数倍间偏离程度。
5)感知特征。
示例性的,感知特征可以为声音响度(强度)、尖锐度等。其中,响度指人耳感受到的信号强弱(例如声音的大小等)。尖锐度用于表示音频信号中高频部分的能量大小,高频部分能量越大,尖锐度越高,人耳感受到的声音越尖锐。
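A few of the features listed above can be computed per frame as in the sketch below. It covers the zero crossing count, the root mean square energy, the spectral centroid, and the spectral flux; MFCC and the perceptual features are omitted. All function names are hypothetical, the spectrum is computed with a naive DFT to keep the example dependency-free, and the source does not specify any of these formulas beyond the verbal definitions.

```python
import cmath
import math

def zero_crossing_count(frame):
    """Time domain: number of sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def rms_energy(frame):
    """Energy: root mean square of the samples in the frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed naively in O(N^2)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(mags, sample_rate, n_fft):
    """Frequency domain: magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n_fft * m for k, m in enumerate(mags)) / total

def spectral_flux(prev_mags, mags):
    """Frequency domain: Euclidean distance between adjacent frames' spectra."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(prev_mags, mags)))
```

In practice a fast Fourier transform would replace the naive DFT; the quadratic version is only meant to make the definitions concrete.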
在本申请一些实施例中,电子设备可以采用模型识别的方式,根据提取到的目标音频的特征数据检测是否存在患病风险。具体的,电子设备提取得到目标音频的特征数据后,可以将提取的特征数据输入已训练的疾病筛查模型中,得到疾病筛查模型输出的患病风险参数,该参数可以用于指示患病风险的等级分类(如高风险、中风险、低风险等),或者该参数可以用于指示表征患病风险大小的具体值。可选的,患病风险参数还可以用于指示疾病类型。
其中,电子设备采用的疾病筛查模型可以是对采用逻辑回归算法(logistic regression,LR)、极致梯度提升(eXtreme Gradient Boosting,XGBoost)算法等算法的网络模型进行训练后得到的。
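In some embodiments of this application, when the electronic device detects from the target audio signal that the user who produced the target audio has a disease risk or a relatively high disease risk (for example, when the disease risk parameter output by the disease screening model is greater than or equal to a set risk threshold), it can perform step S302 below: obtain the target user's physiological data and use it to determine whether the target audio was produced by the target user. Optionally, when the electronic device detects that the user who produced the target audio has no disease risk or a relatively low one (for example, when the disease risk parameter output by the disease screening model is less than the set risk threshold), it need not perform additional processing and can, for example, simply continue its existing audio monitoring.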
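The risk gate described here can be illustrated with a logistic-regression style score followed by a threshold check. This is a sketch under assumptions: the weights, bias, and threshold value are placeholders, a trained disease screening model would supply the real parameters, and the text equally allows other model types such as XGBoost.

```python
import math

def risk_parameter(features, weights, bias):
    """Logistic score in (0, 1); weights and bias stand in for a trained model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def should_check_source(risk, risk_threshold=0.5):
    """Proceed to step S302 only when the risk reaches the set threshold."""
    return risk >= risk_threshold
```

When `should_check_source` returns `False`, the device would simply keep monitoring audio, as the paragraph above describes.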
S302:电子设备获取目标用户的生理数据,该生理数据用于表征目标用户的生理特征。
在实际场景中,一方面,用户在发出声音的过程中一般都伴随一些动作的执行,例如用户发出咳嗽音时伴随用户执行咳嗽的动作,用户发出呼吸音时伴随呼吸的动作,因此,用户发出的声音与用户执行的动作之间具有对应关系,即用户发出的声音的类型与用户执行的动作的类型是相对应的,可以认为用户发出的声音是在执行动作过程中产生的。因此,本申请实施例中,目标音频可以为目标音频所属的用户在执行目标动作过程中发出的声音,其中,目标音频为设定类型,目标动作的类型与设定类型相对应。例如,目标音频为上述的咳嗽音、呼吸音、喷嚏音、关节弹响时,对应的动作分别为咳嗽、呼吸、打喷嚏、关节运动。
另一方面,用户在执行不同动作的过程中,其生理特征会发生不同的变化,因此,用户执行不同动作时的生理特征一般不同,基于用户的生理特征可以识别用户所执行的对应的动作。进一步的,可以基于用户执行的动作将用户发出的声音和用户的生理数据关联起来,进而将用户发出的声音与用户关联起来。因此,本申请实施例中,电子设备可以基于目标用户的生理特征对目标用户所执行的动作进行识别,再根据目标用户执行的动作与电子设备检测到的音频对应的动作之间的关系,确定电子设备检测到的目标音频与目标用户的关系。
在本申请一些实施例中,电子设备可以为具备传感功能或传感器的设备,则电子设备可以对用户的生理特征进行实时检测,并可以采集用于表征用户生理特征的生理参数。电子设备也可以是具备无线连接功能的设备,则电子设备可以接收其它生理监测设备所采集的生理数据。当然,电子设备也可以是上述传感功能和无线连接功能均具备的设备,则电子设备可以采用上述任一种方式获取用户的生理数据。
在本申请一些实施例中,电子设备采集用户的生理数据或者接收生理监测设备的生理数据过程中,可以对距当前时间最近的一段时间内采集的生理数据进行保存。该段时间的时长可以为设定时长。
示例性的,电子设备可以为智能手表或手机。当电子设备为智能手表时,目标用户则为佩戴该智能手表的用户,智能手表自身可以对目标用户的生理特征进行监测,并实时采集和保存用户的生理数据。当电子设备为手机时,目标用户可以是手持该手机的用户或者已注册的该手机的所有者。目标用户可以佩戴用于进行生理特征监测的可穿戴设备例如手环,则手环可以实时监测目标用户的生理特征并将采集的生理数据发送至手机,手机可以保存来自手环的生理数据。可选的,手环采集生理数据后也可以暂时不上报给电子设备,而是手环自身先保存采集的生理数据,待电子设备需要时,再根据电子设备的指示进行生理数据的上报。
基于以上方法,电子设备在执行步骤S302时,可以从自身已保存的生理数据中获取所需的生理数据,或者可以指示生理监测设备上报电子设备所需的生理数据。
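In some embodiments of this application, the physiological data of the target user obtained by the electronic device is the data collected for that user during a target time period. The target time period is the period in which the electronic device or the audio detection device collected the target audio signal, which is also the period in which the target audio was produced. This ensures that the physiological data and the target audio were produced at the same time, so that the audio signal and the physiological data from the same period can be matched, which safeguards matching accuracy.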
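One way to picture the target time period is to derive a start and end time from the audio and then cut the same interval out of a rolling buffer of sensor samples. The sketch below assumes an energy threshold for locating the sound and a fixed retention length for the buffer; `active_interval` and `SensorBuffer` are hypothetical names, and the source does not state how the interval is actually found.

```python
from collections import deque

def active_interval(frame_energies, hop_s, frame_s, threshold):
    """Return (start, end) in seconds spanning the frames at or above threshold."""
    active = [i for i, e in enumerate(frame_energies) if e >= threshold]
    if not active:
        return None
    return active[0] * hop_s, active[-1] * hop_s + frame_s

class SensorBuffer:
    """Keeps only the most recent `retention_s` seconds of (timestamp, value) samples."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self._samples = deque()

    def append(self, timestamp, value):
        self._samples.append((timestamp, value))
        # Drop samples that are older than the retention window.
        while self._samples and self._samples[0][0] < timestamp - self.retention_s:
            self._samples.popleft()

    def window(self, start, end):
        """Samples whose timestamps fall inside [start, end]."""
        return [(t, v) for t, v in self._samples if start <= t <= end]
```

Calling `buffer.window(start, end)` with the interval returned by `active_interval` yields the sensor data that overlaps the detected sound, provided both use the same clock.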
电子设备或生理监测设备中可以配置各种传感器来检测得到生理数据。在本申请一些实施例中,电子设备获取的生理数据至少包括运动姿态参数和/或心率参数。其中,运动姿态参数用于指示被测用户的运动或姿态特征,可以由加速度传感器(acceleration transducer,ACC)、陀螺仪等器件测得。心率参数用于指示被测用户的心率、脉搏特征,可以由光电传感器采用光电容积描记(photo plethysmography,PPG)法测得。可选的,生理数据还可以包括但不限于以下至少一项:呼吸率参数、血氧参数、血压参数、脉搏参数等,这些参数可以通过对ACC、陀螺仪、光电传感器等传感器测得的数据进行分析计算后获得。
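The physiological data can be used to detect the action performed by the corresponding user; whether that action is consistent with the action corresponding to the target audio then determines whether the target audio belongs to that user. For example, when the motion and posture parameters measured by the ACC show a large peak or pronounced fluctuation over time, and the heart rate parameter rises noticeably, the user can be determined to have performed a coughing action; if the target audio signal is a cough signal, the target audio can be determined to belong to that user.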
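The cough example above (an acceleration peak together with a heart rate rise) can be written as a simple rule. This is only an illustrative heuristic: the function name and both threshold values are assumptions, and the embodiments that follow rely on a trained action recognition model rather than fixed rules.

```python
def looks_like_cough(acc_magnitudes, heart_rates,
                     acc_peak_threshold=1.5, hr_rise_threshold=5.0):
    """True when acceleration deviates sharply from its mean and heart rate rises.

    acc_magnitudes: acceleration magnitudes sampled in the target time period.
    heart_rates: heart rate readings (bpm) in the same period, oldest first.
    """
    if not acc_magnitudes or len(heart_rates) < 2:
        return False
    baseline = sum(acc_magnitudes) / len(acc_magnitudes)
    peak = max(abs(a - baseline) for a in acc_magnitudes)
    hr_rise = max(heart_rates) - heart_rates[0]
    return peak >= acc_peak_threshold and hr_rise >= hr_rise_threshold
```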
S303:电子设备根据该生理数据,确定目标音频是否来源于目标用户。
电子设备获取到目标用户的生理数据后,可以根据该生理数据确定目标音频是否来源于(属于)目标用户。具体的,电子设备可以利用预先训练得到的动作识别模型,根据目标用户的生理数据,识别目标用户是否执行目标音频对应的目标动作,再根据目标用户是否执行目标动作确定目标音频和目标用户是否匹配,其中,目标动作的类型与目标音频的类型相对应,目标动作可以根据目标音频确定。在本申请一些实施例中,动作识别模型可以是对采用支持向量机(support vector machines,SVM)、LR等算法的网络模型进行训练后得到的。
作为一种可选的实施方式,在模型训练阶段可以采集用户执行各类动作时的生理样本参数,构成各类生理参数的模型数据库。在匹配阶段,动作识别模型可以将输入模型的生理数据与模型数据库中的生理参数进行匹配,如果某类动作的生理参数与输入动作识别模型的生理数据的匹配度最高或者匹配度大于设定的匹配度阈值,则确定该类动作为输入动作识别模型的生理数据对应的动作。若该类动作与目标动作相同,则可以确定该生理数据所属的用户即目标用户执行了目标动作,进而可以确定目标音频是目标用户在执行目标动作过程中发出的声音。反之,若该类动作与目标动作不同,则可以得到目标音频不是目标用户在执行目标动作过程中发出的声音的判断结果。
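The database-matching variant can be sketched as a nearest-template search. The sketch makes several assumptions that the source leaves open: physiological data is reduced to a fixed-length feature vector, the matching degree is cosine similarity, and a match must be both the best one and above the threshold (the text allows either condition). All names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Matching degree between two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize_action(sample, templates, match_threshold=0.9):
    """Return (action, score) for the best template, or (None, score) if too weak.

    templates maps an action name to the feature vectors stored for that action.
    """
    best_action, best_score = None, -1.0
    for action, vectors in templates.items():
        for vector in vectors:
            score = cosine_similarity(sample, vector)
            if score > best_score:
                best_action, best_score = action, score
    if best_score < match_threshold:
        return None, best_score
    return best_action, best_score

def audio_from_target_user(sample, templates, target_action, match_threshold=0.9):
    """The audio is attributed to the target user only if the actions agree."""
    action, _ = recognize_action(sample, templates, match_threshold)
    return action == target_action
```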
作为另一种可选的实施方式,在针对特定类型的音频进行来源检测时,例如针对目标动作被执行时的音频进行来源检测时,在模型训练阶段也可以仅获取不同用户执行目标动作时的生理样本参数,构成该类生理参数对应的模型数据库。在匹配阶段,动作识别模型可以将输入模型的生理参数与模型数据库中的生理参数进行匹配,若输入的生理数据与模型数据库中超过设定数量的生理参数的匹配度均大于设定的匹配度阈值,则可以确定目标音频是目标用户在执行目标动作过程中发出的声音,反之,确定目标音频不是目标用户在执行目标动作过程中发出的声音。
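As yet another optional implementation, the action recognition model can be used to recognize the user state or the user's action and posture corresponding to the input physiological data. When the action performed by the user corresponding to the physiological data is determined to be the target action, the target user to whom the physiological data belongs can also be determined to have performed the target action; the physiological data can then be considered to match the target audio signal, and the target audio is determined to be a sound produced by the target user while performing the target action. In this approach, during model training, physiological data collected while users perform different actions can be paired with the corresponding action labels and used as training data for the network model.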
在本申请一些实施例中,动作识别模型的输入数据可以为电子设备获取到的生理数据,动作识别模型的输出数据可以为目标用户是否执行目标动作的判断结果,或者为该动作识别模型的置信度,该置信度用于表示用户执行目标动作的可能性。置信度越高,说明用户执行目标动作的可能性越大。电子设备将获取到的生理数据输入到动作识别模型,就可以得到动作识别模型输出的用户是否执行目标动作的识别结果或者对应的置信度,进而据此判断目标音频是否是目标用户发出的声音。
具体的,当输出结果为目标用户执行了目标动作或者模型的置信度大于或等于设定的第一置信度阈值时,电子设备可以确定目标音频来源于目标用户。当输出结果为目标用户未执行目标动作或者模型的置信度小于或等于设定的第二置信度阈值时,电子设备可以确定目标音频不是来源于目标用户。当输出结果为模型的置信度小于第一置信度阈值且大于第二置信度阈值时,电子设备可以通过提示目标用户确认的方式,来进一步判断目标音频与目标用户关系。其中,第一置信度阈值大于第二置信度阈值。
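The three-way decision on the confidence value can be expressed compactly. In this sketch the enum, the function name, and the numeric thresholds are illustrative assumptions; only the ordering of the comparisons and the requirement that the first threshold exceed the second come from the text.

```python
from enum import Enum

class Source(Enum):
    TARGET_USER = "target_user"   # audio attributed to the target user
    OTHER_USER = "other_user"     # audio attributed to someone else nearby
    ASK_USER = "ask_user"         # ambiguous: prompt the target user to confirm

def classify_source(confidence, first_threshold=0.8, second_threshold=0.4):
    """Map the model confidence to a decision using two thresholds."""
    if not second_threshold < first_threshold:
        raise ValueError("first_threshold must be greater than second_threshold")
    if confidence >= first_threshold:
        return Source.TARGET_USER
    if confidence <= second_threshold:
        return Source.OTHER_USER
    return Source.ASK_USER
```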
可选的,电子设备也可以在确定上述模型的置信度大于或等于设定的第三置信度阈值时,确定目标音频来源于目标用户,在确定模型的置信度小于第三置信度阈值时,确定目标音频不是来源于目标用户。
在提示目标用户确认时,电子设备可以显示提示信息以询问用户是否执行了目标动作,并根据用户反馈的信息确定用户是否执行了目标动作,若是,则最终确定目标音频来源于目标用户,否则,确定目标音频不是来源于目标用户。
例如,当电子设备为智能手表、目标动作为咳嗽时,电子设备在提示目标用户确认时,可以通过振动等方式提示目标用户查看该电子设备的显示屏,并在显示屏中显示提示信息的界面,该界面中包括提示信息例如“刚刚您是否咳嗽了?”,还包括可供用户操作的控制选项例如“是”和“否”的选项。则目标用户可以根据是否咳嗽的实际情况点击对应的选项。电子设备接收用户的操作后可以确定目标用户的状态。
上述电子设备在确定目标音频不是来源于目标用户时,确定目标音频来源于目标用户所在环境中存在的其它用户。
在本申请一些实施例中,电子设备在确定目标音频来源于目标用户时,可以进一步确定目标用户的患病风险(包括所患疾病的类型和对应的风险大小);电子设备在确定目标音频来源于目标用户所在环境中的其它用户时,也可以进一步确定其它用户的患病风险,当确定其它用户可能患传染性疾病时,可以对目标用户进行相关提示,以使目标用户根据提示进行应对,从而避免或降低从环境中感染疾病的可能。
以下仅以电子设备确定目标用户的患病风险为例进行说明,电子设备确定其它用户的患病风险时采用的方式与电子设备确定目标用户的患病风险时采用的方法相同,下文中不再重复介绍。
作为一种可选的实施方式,电子设备确定目标用户的患病风险时,可以将上述步骤S301中根据目标音频的特征数据检测到的患病风险参数作为目标用户的患病风险参数,从而得到目标用户的患病风险。
作为另一种可选的实施方式,电子设备可以结合更多方面的信息对目标用户的患病风险进行识别分析。例如,电子设备可以根据目标音频的特征数据、目标用户的个人信息、目标用户的生理数据中的至少一种信息,利用已训练的疾病筛查模型识别对应的患病风险参数,从而得到目标用户的患病风险。
示例性的,目标音频的特征数据可以包括上述实施例提供的特征中的至少一种特征。目标用户的个人信息可以包括目标用户的性别、年龄、身高、体重等信息,这些信息可以是用户预先录入电子设备的。
作为又一种可选的实施方式,电子设备可以根据至少一种特征数据,在预设的不同特征与风险级别的对应关系中选择对应的风险级别,并将选择的风险级别作为目标用户患病的风险级别,从而得到目标用户的患病风险。
示例性的,特征数据可以包括上述的多种特征数据,还可以包括以下至少一种特征参数:用于指示目标音频强弱特征的参数;在预设时长内电子设备或音频检测设备检测到目标音频信号的次数;上述疾病筛查模型输出的风险等级;电子设备或音频检测设备多次检测到的目标音频信号中,持续时间最长的目标音频信号的持续时长等。当然,至少一种参数也可以作为音频的特征数据,用于上述利用疾病筛查模型识别患病风险的过程中。
在一种可能的情况中,当上述电子设备确定目标用户存在患病风险或患病风险较高时,可以提示目标用户存在患病风险,以使目标用户进一步及时就诊等。
示例性的,当电子设备为手机、目标动作为咳嗽时,电子设备可以通过通知消息提示等方式来提示目标用户存在患病风险。例如,当目标用户患病风险较低时,电子设备可以显示图4中所示的界面。当目标用户患病风险较高时,电子设备可以显示图5中所示的界面。在图4或图5所示的界面中,电子设备可以显示风险提示信息(如图4中所示的“可能存在呼吸感染风险”或图5中所示的“呼吸感染高风险(疑似肺炎)”),还可以显示日期时间和相对应的建议等信息。例如,在确定用户可能存在呼吸感染风险时,电子设备还可以显示以下建议信息:根据您近段时间的检测数据,您可能存在呼吸道感染风险,请进行主动测量或就医检测等。在确定用户存在呼吸感染高风险时,电子设备还可以显示以下建议信息:根据您近段时间的检测数据,您可能存在较高的呼吸道感染风险,请及时就医。可选的,电子设备还可以显示“本次检测不作为呼吸系统健康的专业临床诊断依据”等提示信息。
可选的,当上述电子设备根据目标音频信号检测到发出目标音频的用户不存在患病风险或患病风险很低,电子设备也可以定时显示通知信息来提示用户的健康状况。例如,电子设备可以定时显示图6所示的界面,以提示用户当前患病风险,还可以显示“通过分析您的近期测量数据,您的呼吸道感染风险较低,属于健康的范围,继续保持良好的生活习惯,将为您带来持续的健康”、“本次检测不作为呼吸系统健康的专业临床诊断依据”等相应的建议和提示信息。
在另一种可能的情况中,当上述电子设备确定其它用户存在患病风险或患病风险较高,且其它用户可能患病的疾病类型为传染性疾病时,可以提示目标用户周围环境中可能存在疾病感染风险,并提示目标用户进行防护等。
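For example, when the electronic device is a mobile phone and the target action is coughing, the electronic device can use a notification message or similar means to warn the target user that the current environment may carry a risk of respiratory infection. For example, the electronic device can display the interface shown in Figure 7, which can include information indicating the level of respiratory infection risk in the environment. Optionally, the interface can also display the target user's current location, the relevant detected feature parameters (for example, the number of coughs detected in the environment), and corresponding advice; for example, the interface can also include prompts such as “You are currently at location X; XXX coughs were detected in the surrounding environment within XX; we recommend that you leave this place or take protective measures” and “This detection is not a basis for professional clinical diagnosis of respiratory health”.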
可选的,上述电子设备在显示界面中还可以显示检测到的咳嗽音的波形等信息,本申请实施例中不做具体限定。
上述实施例中,电子设备可以根据检测到目标音频时目标用户的生理参数确定目标音频与目标用户关系,从而对目标用户的音频与其它用户的音频加以区分。进一步的,在后续处理过程中也可以根据目标音频的来源对目标音频进行处理。例如,若确定目标音频的来源是目标用户,则目标音频对于检测目标用户的患病风险具有较高的参考意义,基于目标音频信号检测到的结果也具有较高的可信度,因此可以基于目标音频检测目标用户自身的患病情况。若确定目标音频的来源不是目标用户,则目标音频对于检测目标用户的患病风险具有较低的参考意义,则可以避免误将该目标音频作为目标用户的音频进行患病风险检测的情况,从而提高相关检测的准确度。
下面以本申请实施例提供的方案应用于检测用户呼吸道感染风险的场景中为例,对本申请实施例提供的方案进行介绍。
以下以电子设备为目标用户佩戴的可穿戴设备、目标用户通过可穿戴设备检测是否存在呼吸道感染风险为例进行说明。其中,目标音频为咳嗽音。可穿戴设备具备音频检测功能,可以检测所在环境中存在的咳嗽音并根据检测到的咳嗽音检测呼吸道感染风险。并且,可穿戴设备具备生理监测功能,可以实时监测目标用户的生理特征。
参照图8,本申请实施例提供的检测用户呼吸道感染风险的方法的流程包括:
S801:可穿戴设备实时采集和检测环境中产生的音频,以及,实时检测并保存目标用户的生理数据。
其中,可穿戴设备检测用户呼吸道感染风险的功能可以通过可穿戴设备中安装的健康监测应用进行控制和实现。当可穿戴设备的健康监测应用中的呼吸道监测的功能被打开后,可穿戴设备可以执行本实例提供的方法进行呼吸道感染风险的检测等处理,当可穿戴设备的健康监测应用中的呼吸道监测的功能被关闭后,可穿戴设备可以停止执行本实例提供的方法进行呼吸道感染风险的检测等处理。
示例性的,如图9中所示,可穿戴设备的健康监测应用中可以提供呼吸道监测功能的控制开关,用户通过对该控制开关进行操作,可以控制打开或关闭对呼吸道健康进行监测的功能。在呼吸道健康监测的功能打开后,可穿戴设备可以执行本申请实施例提供的方法,对佩戴该可穿戴设备的用户的呼吸道健康进行监测。
可选的,用户打开呼吸道健康监测的功能后,可穿戴设备可后台运行该功能,其前台显示的界面可随用户操作进行更新。
S802:可穿戴设备通过识别环境中的音频,确定存在咳嗽音时,确定对应的咳嗽音信号。
可选的,可穿戴设备可以利用已训练的音频识别模型识别采集到的音频是否为咳嗽音。
S803:可穿戴设备提取咳嗽音信号的特征数据。
S804:可穿戴设备将咳嗽音信号的特征数据输入至已训练的疾病筛查模型。
S805:可穿戴设备利用疾病筛查模型判断咳嗽音是否对应存在呼吸道感染风险;若是,执行步骤S806,否则,执行步骤S801。
上述步骤的具体实施方式可参照上述步骤S301中所述的利用疾病筛查模型检测是否存在患病风险的方法,此处不再赘述。
S806:可穿戴设备确定咳嗽音发生的时间段。
其中,该时间段至少包括咳嗽音发生的起始时间和停止时间,还可以包括咳嗽音的时长。
S807:可穿戴设备获取已保存的、该时间段内检测到的生理数据。
该步骤的具体实施方式可参照上述步骤S302中相关的方法,此处不再赘述。
S808:可穿戴设备将生理数据输入至已训练的动作识别模型。
S809:可穿戴设备根据动作识别模型确定生理数据是否对应执行目标动作,并确定对应的置信度。
该步骤的具体实施方式可参照上述步骤S303中所述的利用动作识别模型识别生理数据的方法,此处不再赘述。
S810:可穿戴设备确定置信度是否大于或等于设定的第一置信度阈值;若是,执行步骤S811,否则,执行步骤S812。
S811:可穿戴设备确定咳嗽音属于目标用户,并显示提示信息,以提示目标用户存在呼吸道感染风险。
示例性的,可穿戴设备可以显示图10中所示的界面,该界面中包含提示目标用户存在呼吸道感染风险的提示信息。
S812:可穿戴设备确定置信度是否小于或等于设定的第二置信度阈值;若是,执行步骤S813,否则,执行步骤S814。
其中,第二置信度阈值小于第一置信度阈值。
S813:可穿戴设备确定咳嗽音不属于目标用户,并显示提示信息,以提示目标用户其所在环境中存在呼吸道感染风险。
示例性的,可穿戴设备可以显示图11中所示的界面,该界面中包含提示目标用户其所在环境中存在呼吸道感染风险的提示信息。
S814:可穿戴设备显示提示信息,以询问目标用户是否咳嗽。并执行步骤S815。
示例性的,可穿戴设备可以显示图12所示的界面,该界面中包含询问用户是否咳嗽的提示信息。
S815:可穿戴设备根据用户反馈的信息确定目标用户是否咳嗽;若是,执行步骤S811,否则,执行步骤S813。
上述步骤的具体执行可参照上述实施例中的相关介绍,本实例中不再赘述。
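The branching in steps S805 to S815 can be summarized as a single decision function. This is a sketch, not the applicant's code: the return labels, the default thresholds, and the `ask_user_coughed` callback (standing in for the prompt of Figure 12) are assumptions, and the model calls that produce `risk` and `confidence` are left outside the function.

```python
def respiratory_check(risk, confidence, ask_user_coughed,
                      risk_threshold=0.5,
                      first_threshold=0.8,
                      second_threshold=0.4):
    """Return which prompt the wearable would show for one detected cough.

    risk: output of the disease screening model (S804/S805).
    confidence: output of the action recognition model (S809).
    ask_user_coughed: zero-argument callable returning True if the user
        confirms that they coughed (S814/S815).
    """
    if risk < risk_threshold:
        return "continue_monitoring"      # S805: no risk, go back to S801
    if confidence >= first_threshold:     # S810
        return "user_at_risk"             # S811
    if confidence <= second_threshold:    # S812
        return "environment_at_risk"      # S813
    # S814/S815: ambiguous confidence, so the user's answer decides.
    return "user_at_risk" if ask_user_coughed() else "environment_at_risk"
```

Passing the confirmation step as a callable keeps the user prompt out of the decision logic, so the branch structure can be exercised without a display.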
需要说明的是,上述实例提供的具体实施流程,仅是对本申请实施例适用方法流程的举例说明,其中各步骤的执行顺序可根据实际需求进行相应调整,还可以增加其它步骤,或减少部分步骤。
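In the above embodiment, by detecting the source of the cough sound, the wearable device can ensure that the cough sound used for respiratory disease risk prediction comes from the target user, which reduces the interference of wrongly collected audio signals with the prediction and improves prediction accuracy. In addition, by distinguishing whether the cough sound comes from the target user wearing the device or from other users in the environment, the device can separately predict the target user's own respiratory infection risk and the respiratory infection risk in the environment. On one hand this improves the collection and prediction of the target user's own cough sounds; on the other hand, by identifying infection sources and risks in the environment, the device can promptly remind the target user to take personal protection and other countermeasures, reducing the risk of infection. In summary, the scheme can quickly identify the source of a cough sound and go on to predict and warn of respiratory infection risk, with good real-time performance and low implementation cost, and it can greatly improve the user experience. The scheme is also of considerable value in public settings such as indoor spaces and is highly practicable.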
基于以上实施例及相同构思,本申请实施例还提供一种音频检测的方法,如图13中所示,该方法包括:
S1301:电子设备采集第一音频,并识别第一音频的类型。
示例性的,电子设备可以为上述实施例(图3)中所述的电子设备,也可以为上述实施例(图8)中所述的可穿戴设备。
S1302:电子设备通过一个或多个传感器采集第一信号,其中,电子设备包括一个或多个传感器。
其中,第一信号为针对第一用户采集的信号。示例性的,第一用户可以为上述实施例中所述的目标用户。第一信号可以为上述实施例中所述的生理数据对应的信号,电子设备可以根据采集到的第一信号确定上述实施例中所述的生理数据。
S1303:电子设备当识别到第一音频为设定类型时,根据第一音频确定第一时间区间,第一时间区间包括起始时间和结束时间。
示例性的,设定类型的第一音频可以是上述实施例中所述的目标音频,第一时间区间为第一音频产生的时间区间,可以为上述实施例中所述的目标时间段。
S1304:电子设备根据第一时间区间及第一信号,确定第二信号。
示例性的,第二信号可以为上述实施例中所述的目标时间段内的生理数据对应的信号,电子设备可以根据采集到的第二信号确定该生理数据。
S1305:若第二信号与第一音频匹配,电子设备显示第一信息。
可选的,若第二信号与第一音频不匹配,电子设备显示第二信息。
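For example, the first information can be the notification described in the above embodiments that informs the target user of a disease risk, such as the information shown in Figure 6 or Figure 10. The second information can be the notification described in the above embodiments that informs the target user of an infection risk in the surrounding environment, such as the information shown in Figure 7 or Figure 11.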
具体的,该方法中电子设备所执行的具体步骤可参阅前述实施例中的相关介绍,在此不再过多赘述。
基于以上实施例及相同构思,本申请实施例还提供一种音频检测的方法,应用于包括第一电子设备与第二电子设备的系统,如图14中所示,该方法包括:
S1401:第一电子设备采集第一音频,并识别第一音频的类型。
示例性的,第一电子设备可以为上述实施例(图3)中所述的电子设备。例如,第一电子设备可以为手机等移动终端设备。
S1402:第二电子设备通过一个或多个传感器采集第一信号,其中,第二电子设备包括一个或多个传感器。
示例性的,第二电子设备可以为上述实施例中所述的生理监测设备。例如,第二电子设备可以为手表、手环等可穿戴设备。
第一信号为针对第一用户采集的信号。示例性的,第一用户可以为上述实施例中所述的目标用户。第一信号可以为上述实施例中所述的生理数据对应的信号。
需要说明的是,上述步骤S1401与步骤S1402的执行并无严格的时序限制,例如,步骤S1401可以早于步骤S1402执行,也可以晚于步骤S1402执行,或者与步骤S1402同时(或同步)执行。
S1403:第一电子设备当识别到第一音频为设定类型时,根据第一音频确定第一时间区间,第一时间区间包括起始时间和结束时间。
示例性的,设定类型的第一音频可以是上述实施例中所述的目标音频,第一时间区间为第一音频产生的时间区间,可以为上述实施例中所述的目标时间段。
S1404:第一电子设备向第二电子设备发送请求信息,请求信息用于请求获取第一时间区间对应的信号。
S1405:第二电子设备根据接收到的请求信息确定第一时间区间,并根据第一时间区间及第一信号,确定第二信号。
示例性的,第二信号可以为上述实施例中所述的目标时间段内的生理数据对应的信号。
S1406:第二电子设备向第一电子设备发送第二信号。
S1407:第一电子设备若确定接收到的第二信号与第一音频匹配,显示第一信息。
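Optionally, if the second signal does not match the first audio, the first electronic device displays second information.
For example, the first information can be the notification described in the above embodiments that informs the target user of a disease risk, such as the information shown in Figure 6 or Figure 10. The second information can be the notification described in the above embodiments that informs the target user of an infection risk in the surrounding environment, such as the information shown in Figure 7 or Figure 11.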
具体的,该方法中第一电子设备或第二电子设备所执行的具体步骤可参阅前述实施例中的相关介绍,在此不再过多赘述。
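The exchange between the two devices in steps S1403 to S1407 can be modelled with a small request and response pair. Everything in this sketch is hypothetical: the class names, the in-memory call that stands in for the wireless link, and the `matches_audio` callable that stands in for the action recognition and comparison step.

```python
from dataclasses import dataclass

@dataclass
class SignalRequest:
    """Request for the sensor signal inside the first time interval (S1404)."""
    start: float
    end: float

class SecondDevice:
    """For example a watch or band that records sensor samples (S1402)."""

    def __init__(self):
        self._first_signal = []

    def record(self, timestamp, value):
        self._first_signal.append((timestamp, value))

    def handle_request(self, request):
        # S1405: cut the second signal out of the first signal.
        return [
            (t, v) for t, v in self._first_signal
            if request.start <= t <= request.end
        ]

class FirstDevice:
    """For example a phone that detects audio and decides what to display."""

    def __init__(self, peer, matches_audio):
        self.peer = peer
        self.matches_audio = matches_audio

    def on_set_type_audio(self, start, end):
        # S1404 and S1406: request and receive the second signal.
        second_signal = self.peer.handle_request(SignalRequest(start, end))
        # S1407: choose the information to display.
        if self.matches_audio(second_signal):
            return "first information"
        return "second information"
```

In a real system the request would travel over Bluetooth or a similar link, and both devices would need synchronized clocks for the interval to line up.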
基于以上实施例及相同构思,本申请实施例还提供一种电子设备,该电子设备用于实现本申请实施例提供的音频检测的方法。如图15中所示,电子设备1500可以包括:显示屏1501,存储器1502,一个或多个处理器1503,以及一个或多个计算机程序(图中未示出)。上述各器件可以通过一个或多个通信总线1504耦合。
其中,显示屏1501用于显示图像、视频、应用界面等相关用户界面。存储器1502中存储有一个或多个计算机程序(代码),一个或多个计算机程序包括计算机指令;一个或多个处理器1503调用存储器1502中存储的计算机指令,使得电子设备1500执行本申请实施例提供的音频检测的方法。
具体实现中,存储器1502可包括高速随机存取的存储器,并且也可包括非易失性存储器,例如一个或多个磁盘存储设备、闪存设备或其他非易失性固态存储设备。存储器1502可以存储操作系统(下述简称系统),例如ANDROID,IOS,WINDOWS,或者LINUX等嵌入式操作系统。存储器1502可用于存储本申请实施例的实现程序。存储器1502还可以存储网络通信程序,该网络通信程序可用于与一个或多个附加设备,一个或多个用户设备,一个或多个网络设备进行通信。一个或多个处理器1503可以是一个通用中央处理器(Central Processing Unit,CPU),微处理器,特定应用集成电路(Application-Specific Integrated Circuit,ASIC),或一个或多个用于控制本申请方案程序执行的集成电路。
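Note that Figure 15 is only one implementation of the electronic device 1500 provided in the embodiments of this application; in practice, the electronic device 1500 may include more or fewer components, which is not limited here.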
基于以上实施例及相同构思,本申请实施例还提供一种系统,该系统包括上述的第一电子设备和第二电子设备。
基于以上实施例及相同构思,本申请实施例还提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,当计算机程序在计算机上运行时,使得计算机执行上述实施例提供的方法。
基于以上实施例及相同构思,本申请实施例还提供一种计算机程序产品,该计算机程序产品包括计算机程序或指令,当计算机程序或指令在计算机上运行时,使得计算机执行上述实施例提供的方法。
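The methods provided in the embodiments of this application can be implemented wholly or partly in software, hardware, firmware, or any combination of them. When implemented in software, they can be implemented wholly or partly as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they can be transmitted from one website, computer, server, or data center to another by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).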
显然,本领域的技术人员可以对本申请进行各种改动和变型而不脱离本申请的范围。这样,倘若本申请的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。

Claims (21)

  1. 一种音频检测的方法,应用于电子设备,其特征在于,所述方法包括:
    采集第一音频,并识别所述第一音频的类型;
    通过一个或多个传感器采集第一信号,其中,所述电子设备包括所述一个或多个传感器;
    当识别到所述第一音频为设定类型时,根据所述第一音频确定第一时间区间,所述第一时间区间包括起始时间和结束时间;
    根据所述第一时间区间及所述第一信号,确定第二信号;
    若所述第二信号与所述第一音频匹配,显示第一信息。
  2. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    若所述第二信号与所述第一音频不匹配,显示第二信息。
  3. 如权利要求2所述的方法,其特征在于,在根据所述第一音频确定第一时间区间之前,所述方法还包括:
    根据所述第一音频确定第一风险参数,所述第一风险参数大于或等于设定阈值,其中,所述第一风险参数用于指示所述第一音频对应的疾病风险。
  4. 如权利要求3所述的方法,其特征在于,
    所述第一信息用于指示:所述第一音频来源于第一用户,和/或,所述第一用户存在患病风险,其中,所述第一用户为佩戴所述电子设备的用户;
    所述第二信息用于指示:所述第一音频不是来源于所述第一用户,和/或,所述第一用户所在的环境存在感染风险。
  5. 如权利要求3或4所述的方法,其特征在于,所述根据所述第一音频确定第一风险参数,包括:
    从所述第一音频中提取特征数据;其中,所述特征数据用于表征所述第一音频的时域特征和/或频域特征;
    根据设定的疾病筛查模型和所述特征数据,确定所述第一风险参数;其中,所述疾病筛查模型用于表示音频的时域特征和/或频域特征与所述音频对应的风险参数之间的关系。
  6. 如权利要求1~5任一所述的方法,其特征在于,在若所述第二信号与所述第一音频匹配,显示第一信息之前,所述方法还包括:根据所述第一音频,确定所述第一音频对应的第一动作,以及,根据所述第二信号,确定所述第二信号对应的第二动作;
    所述第二信号与所述第一音频匹配,包括:所述第二信号对应的所述第二动作的类型与所述第一音频对应的所述第一动作的类型相同。
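  7. The method according to claim 6, wherein determining, according to the second signal, the second action corresponding to the second signal comprises:
    determining the second action and a confidence of the second action according to a set action recognition model and the second signal, wherein the action recognition model represents the relationship between the second signal and the second action and the confidence, and the confidence characterizes the recognition accuracy of the action recognition model;
    determining that the confidence is greater than or equal to a set first threshold; or
    determining that the confidence is greater than or equal to a set second threshold and less than or equal to a set third threshold; displaying first indication information, the first indication information indicating: confirm whether the second action was performed; and receiving second indication information, the second indication information indicating: confirmation that the second action was performed.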
  8. 如权利要求1~7任一所述的方法,其特征在于,所述第二信号包括以下至少一项:
    通过加速度传感器采集的信号;
    通过陀螺仪传感器采集的信号;
    通过光电传感器采集的信号。
  9. 如权利要求1~8任一所述的方法,其特征在于,所述设定类型包括以下至少一项:
    咳嗽音,呼吸音,喷嚏音,关节弹响。
  10. 一种音频检测的方法,应用于第一电子设备,其特征在于,所述方法包括:
    采集第一音频,并识别所述第一音频的类型;
    当识别到所述第一音频为设定类型时,根据所述第一音频确定第一时间区间,所述第一时间区间包括起始时间和结束时间;
    向第二电子设备发送请求信息,所述请求信息用于请求获取所述第一时间区间对应的信号;
    接收来自所述第二电子设备的第一信号,所述第一信号是所述第二电子设备通过一个或多个传感器采集得到的,其中,所述第二电子设备包括所述一个或多个传感器;
    若所述第一信号与所述第一音频匹配,显示第一信息。
  11. 如权利要求10所述的方法,其特征在于,所述方法还包括:
    若所述第一信号与所述第一音频不匹配,显示第二信息。
  12. 如权利要求11所述的方法,其特征在于,在根据所述第一音频确定第一时间区间之前,所述方法还包括:
    根据所述第一音频确定第一风险参数,所述第一风险参数大于或等于设定阈值,其中,所述第一风险参数用于指示所述第一音频对应的疾病风险。
  13. 如权利要求12所述的方法,其特征在于,
    所述第一信息用于指示:所述第一音频来源于第一用户,和/或,所述第一用户存在患病风险,其中,所述第一用户为佩戴所述第二电子设备的用户;
    所述第二信息用于指示:所述第一音频不是来源于所述第一用户,和/或,所述第一用户所在的环境存在感染风险。
  14. 如权利要求12或13所述的方法,其特征在于,根据所述第一音频确定第一风险参数,包括:
    从所述第一音频中提取特征数据;其中,所述特征数据用于表征所述第一音频的时域特征和/或频域特征;
    根据设定的疾病筛查模型和所述特征数据,确定所述第一风险参数;其中,所述疾病筛查模型用于表示音频的时域特征和/或频域特征与所述音频对应的风险参数之间的关系。
  15. 如权利要求10~14任一所述的方法,其特征在于,在若所述第一信号与所述第一音频匹配,显示第一信息之前,所述方法还包括:根据所述第一音频,确定所述第一音频对应的第一动作,以及,根据所述第一信号,确定所述第一信号对应的第二动作;
    所述第一信号与所述第一音频匹配,包括:所述第一信号对应的所述第二动作的类型与所述第一音频对应的所述第一动作的类型相同。
  16. 如权利要求15所述的方法,其特征在于,根据所述第一信号,确定所述第一信号对应的第二动作,包括:
    根据设定的动作识别模型和所述第一信号,确定所述第二动作及所述第二动作的置信度,其中,所述动作识别模型用于表示所述第一信号与所述第二动作及所述置信度之间的关系,所述置信度用于表征所述动作识别模型的识别准确度;
    确定所述置信度大于或等于设定的第一阈值;或者
    确定所述置信度大于或等于设定的第二阈值且小于或等于设定的第三阈值;显示第一指示信息,所述第一指示信息用于指示:确认是否执行所述第二动作;接收第二指示信息,所述第二指示信息用于指示:确认执行所述第二动作。
  17. 如权利要求10~16任一所述的方法,其特征在于,所述第一信号包括以下至少一项:
    通过加速度传感器采集的信号;
    通过陀螺仪传感器采集的信号;
    通过光电传感器采集的信号。
  18. 如权利要求10~17任一所述的方法,其特征在于,所述设定类型包括以下至少一项:
    咳嗽音,呼吸音,喷嚏音,关节弹响。
  19. 一种电子设备,其特征在于,所述电子设备包括显示屏,存储器和一个或多个处理器;
    其中,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令;当所述计算机指令被所述一个或多个处理器执行时,使得所述电子设备执行如权利要求1~9任一所述的方法,或者执行如权利要求10~18任一所述的方法。
  20. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机程序,当所述计算机程序在计算机上运行时,使得所述计算机执行如权利要求1~9任一所述的方法,或者执行如权利要求10~18任一所述的方法。
  21. 一种计算机程序产品,其特征在于,所述计算机程序产品包括计算机程序或指令,当所述计算机程序或指令在计算机上运行时,使得所述计算机执行如权利要求1~9任一所述的方法,或者执行如权利要求10~18任一所述的方法。
PCT/CN2023/073167 2022-01-21 2023-01-19 一种音频检测的方法及电子设备 WO2023138660A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210073455.0A CN116509371A (zh) 2022-01-21 2022-01-21 一种音频检测的方法及电子设备
CN202210073455.0 2022-01-21

Publications (1)

Publication Number Publication Date
WO2023138660A1 true WO2023138660A1 (zh) 2023-07-27

Family

ID=87347891

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/073167 WO2023138660A1 (zh) 2022-01-21 2023-01-19 一种音频检测的方法及电子设备

Country Status (2)

Country Link
CN (1) CN116509371A (zh)
WO (1) WO2023138660A1 (zh)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080312547A1 (en) * 2005-10-05 2008-12-18 Yasunori Wada Cough Detecting Apparatus and Cough Detecting Method
CN204744109U (zh) * 2015-06-09 2015-11-11 同济大学 一种基于加速度计的咳嗽判别系统
CN108024760A (zh) * 2015-09-14 2018-05-11 保健之源股份有限公司 可穿戴式呼吸疾病监测设备
CN111653273A (zh) * 2020-06-09 2020-09-11 杭州叙简科技股份有限公司 一种基于智能手机的院外肺炎初步识别方法
CN112120700A (zh) * 2019-06-25 2020-12-25 松下电器(美国)知识产权公司 咳嗽检测装置、咳嗽检测方法以及记录介质
CN112233700A (zh) * 2020-10-09 2021-01-15 平安科技(深圳)有限公司 基于音频的用户状态识别方法、装置及存储介质
CN112804941A (zh) * 2018-06-14 2021-05-14 斯特拉多斯实验室公司 用于检测生理事件的装置和方法
US20220167917A1 (en) * 2015-09-14 2022-06-02 Health Care Originals, Inc. Respiratory disease monitoring wearable apparatus
WO2022195327A1 (en) * 2021-03-18 2022-09-22 Ortho Biomed Inc. Health monitoring system and method

Also Published As

Publication number Publication date
CN116509371A (zh) 2023-08-01

Similar Documents

Publication Publication Date Title
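CN108292317B (zh) Question and answer processing method and electronic device supporting the same
KR102446811B1 (ko) Method for integrating and providing data collected from multiple devices and electronic device implementing the same
KR102363794B1 (ko) Information providing method and electronic device supporting the same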
US20180151036A1 (en) Method for producing haptic signal and electronic device supporting the same
US9444998B2 (en) Method for control of camera module based on physiological signal
US11350885B2 (en) System and method for continuous privacy-preserved audio collection
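WO2021114224A1 (zh) Voice detection method, prediction model training method, apparatus, device, and medium
WO2021238995A1 (zh) Interaction method for an electronic device for skin detection, and electronic device
KR102401932B1 (ko) Electronic device for measuring biometric information and operating method thereof
KR102348758B1 (ko) Method for operating speech recognition service and electronic device supporting the same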
US20180315426A1 (en) Electronic device for providing speech recognition service and method thereof
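KR102412523B1 (ko) Method for operating speech recognition service, and electronic device and server supporting the same
KR102561572B1 (ko) Method for utilizing sensor and electronic device implementing the same
KR102356889B1 (ko) Method for performing speech recognition and electronic device using the same
KR102431817B1 (ko) Electronic device and server for processing user utterances
KR102391298B1 (ko) Electronic device providing speech recognition service and method thereof
CN107784268B (zh) Method and electronic device for measuring heart rate based on an infrared sensor
CN108288471A (zh) Electronic device for recognizing speech
KR102358849B1 (ko) Electronic device for providing information about a smart watch and operating method thereof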
US11741986B2 (en) System and method for passive subject specific monitoring
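WO2023273834A1 (zh) Detection method and related device
CN113519022B (zh) Electronic device and control method thereof
CN108652594A (zh) Electronic device and method for measuring biometric information
KR102457247B1 (ko) Electronic device for processing image and control method thereof
KR102282704B1 (ko) Electronic device and method for reproducing image data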

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23742975

Country of ref document: EP

Kind code of ref document: A1