CN111883146A - Cross-platform distributed nearby wake-up method and device - Google Patents

Info

Publication number
CN111883146A
CN111883146A CN202010742734.2A CN202010742734A CN111883146A CN 111883146 A CN111883146 A CN 111883146A CN 202010742734 A CN202010742734 A CN 202010742734A CN 111883146 A CN111883146 A CN 111883146A
Authority
CN
China
Prior art keywords
arbitration
request
wake
voice energy
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010742734.2A
Other languages
Chinese (zh)
Inventor
陈晓松
李旭滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maosheng Intelligent Technology Co ltd
Original Assignee
Shanghai Maosheng Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maosheng Intelligent Technology Co ltd
Priority to CN202010742734.2A
Publication of CN111883146A

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/22: Interactive procedures; Man-machine interfaces
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78: Detection of presence or absence of voice signals
    • G10L2025/783: Detection of presence or absence of voice signals based on threshold decision

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application relates to a cross-platform distributed nearby wake-up method and device. The method comprises the following steps: when a wake-up instruction is detected, an arbitration node in the network receives arbitration requests sent by the other request devices in the network, where the request devices comprise a plurality of devices of at least one type, each arbitration request carries the normalized voice energy value of its request device, and the normalized voice energy value is the product of the voice energy value that the request device detected for the wake-up instruction and the normalization coefficient of that request device; the arbitration node responds to the arbitration requests, selects the target device with the maximum normalized voice energy value from the request devices, and sends a notification message to the target device to notify it to respond to the wake-up instruction. The application addresses the problems in the related art that distributed nearby wake-up cannot be used across platforms and that nearby wake-up accuracy is low; it enables cross-platform use of distributed nearby wake-up and improves its accuracy.

Description

Cross-platform distributed nearby wake-up method and device
Technical Field
The present application relates to the field of internet of things, and in particular, to a cross-platform distributed nearby wake-up method, apparatus, computer device, and computer-readable storage medium.
Background
Distributed nearby wake-up technology addresses the problem that multiple voice interaction devices in the same space that support the same wake-up word all respond at once when they receive the same wake-up instruction. The voice interaction devices in the same space form a network and communicate over a local area network; after receiving a wake-up instruction, they execute arbitration logic to select the device closest to the user, and that device is responsible for completing the subsequent interaction with the user. The basic principle of the arbitration logic is as follows: after the voice interaction devices complete networking, an arbitration node is elected according to a fixed strategy (for example, the devices' MAC addresses are sorted as character strings and the device with the smallest address becomes the arbitration node); each voice interaction device calculates the voice energy value of the wake-up speech and sends it to the arbitration node; because voice energy attenuates gradually with propagation distance, the arbitration node selects the device with the highest voice energy value, treats it as the device closest to the user, and notifies it to respond to the user.
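The election example in the paragraph above (sort the device MAC addresses as character strings and take the smallest) can be sketched as follows. This is a minimal illustration, not the implementation described in the application: the function name `elect_arbitration_node` and the case-insensitive comparison are assumptions made here so that every device reaches the same result regardless of how its MAC string is written.

```python
def elect_arbitration_node(mac_addresses):
    """Return the MAC address that sorts first as a character string.

    Assumption: addresses are compared case-insensitively so that every
    device in the network elects the same node.
    """
    if not mac_addresses:
        raise ValueError("the network has no devices")
    return min(mac_addresses, key=str.lower)


# Hypothetical addresses, for illustration only.
devices = ["AC:DE:48:00:11:22", "ac:de:48:00:11:05", "AC:DE:48:00:11:9F"]
arbiter = elect_arbitration_node(devices)
```

Because the rule is deterministic and depends only on the set of addresses, each device can run it locally without a separate election round.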
In a smart home scenario, multiple types of voice interaction devices, such as an Android central-control smart speaker, a Linux smart panel, and various household appliances, may exist in the same space. These devices may belong to the same manufacturer or brand and therefore support the same wake-up word. The distributed nearby wake-up technology in the related art has the following problems:
1. It is designed and implemented only for a single type of device and cannot be used across hardware and operating systems;
2. Different types of devices have different microphone pickup gains, so their voice energy values are not comparable and cannot serve as a basis for distance comparison during arbitration, which makes nearby wake-up less accurate.
At present, no effective solution is provided for the problems that the distributed nearby wake-up technology in the related art cannot be used across platforms and the accuracy of nearby wake-up is low.
Disclosure of Invention
The embodiment of the application provides a cross-platform distributed nearby awakening method, a cross-platform distributed nearby awakening device, computer equipment and a computer readable storage medium, so as to at least solve the problems that a distributed nearby awakening technology in the related technology cannot be used in a cross-platform mode and the nearby awakening accuracy is low.
In a first aspect, an embodiment of the present application provides a cross-platform distributed nearby wake-up method, including:
when a wake-up instruction is detected, an arbitration node in a networking receives arbitration requests sent by other request devices in the networking, wherein the request devices comprise a plurality of devices of at least one type, the arbitration requests carry normalized voice energy values of the request devices, and the normalized voice energy values are products of the voice energy values detected by the request devices for the wake-up instruction and normalized coefficients of the request devices;
and the arbitration node responds to the arbitration request, selects the target equipment with the maximum normalized voice energy value from the request equipment, and sends a notification message to the target equipment, wherein the notification message is used for notifying the target equipment to respond to the awakening instruction.
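The selection step of the first aspect can be sketched as below, assuming each arbitration request has already been reduced to a device identifier and a normalized voice energy value. The `ArbitrationRequest` type, the device names, and the numbers are hypothetical; the source does not say how ties are broken, and in this sketch the first request received wins a tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArbitrationRequest:
    device_id: str             # hypothetical identifier of the request device
    normalized_energy: float   # raw voice energy x normalization coefficient


def select_target(requests):
    """Pick the request device with the largest normalized voice energy."""
    if not requests:
        raise ValueError("no arbitration requests received")
    return max(requests, key=lambda r: r.normalized_energy).device_id


requests = [
    ArbitrationRequest("android-speaker", 0.42),
    ArbitrationRequest("linux-panel", 0.57),
    ArbitrationRequest("air-conditioner", 0.31),
]
# The arbitration node would send the notification message to this device.
target = select_target(requests)
```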
In some of these embodiments, the normalization coefficient of the request device is determined by:
feeding the recordings of a target audio made by the request device and by a reference device, respectively, into a voice energy detection tool to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio;
and determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
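As a worked illustration of the coefficient defined above (second voice energy value divided by first voice energy value), the following sketch uses made-up numbers. The function name and the positivity check are assumptions added here, not part of the source.

```python
def normalization_coefficient(first_energy, second_energy):
    """Coefficient = reference device energy / request device energy.

    first_energy:  energy the request device measured for the target audio
    second_energy: energy the reference device measured for the same audio
    """
    if first_energy <= 0:
        # Assumed guard: a non-positive reading cannot be scaled meaningfully.
        raise ValueError("request device energy must be positive")
    return second_energy / first_energy


# Illustrative numbers only (not from the source).
coeff = normalization_coefficient(first_energy=50.0, second_energy=100.0)
normalized = 30.0 * coeff  # a later raw reading of 30.0 on the request device
```

With these numbers the request device picks up the target audio at half the level of the reference device, so its readings are scaled by 2.0 before being compared with those of other devices.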
In some of these embodiments, before detecting the wake-up instruction, the method further comprises:
compiling and packaging interfaces that can be called from different operating system platforms, wherein the interfaces at least comprise:
an initialization interface, used for establishing the network and electing the arbitration node;
a start nearby wake-up interface, used for creating an arbitration processing thread for the arbitration node, wherein the arbitration processing thread performs arbitration to select the target device when the arbitration request is received;
a close nearby wake-up interface, used for ending the arbitration processing thread;
and an arbitration request sending interface, used by the request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
In some embodiments, when a wake-up instruction is detected, the network is established and the arbitration node is elected by calling the initialization interface; the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface; the arbitration node receives the arbitration request sent by the request device by calling the start nearby wake-up interface, and responds to the arbitration request by selecting the target device from the request devices; the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction; and the arbitration node finishes responding to the arbitration request by calling the close nearby wake-up interface.
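The four interfaces and their call order can be sketched in a single process as follows. This is a simplified stand-in under several assumptions: all names are hypothetical, there is no real networking (a queue replaces the local area network), the arbitration thread waits for a fixed number of requests rather than a time window, and the result is read back when the thread is closed instead of being returned through the sending interface.

```python
import queue
import threading


class NearbyWakeup:
    """In-process stand-in for the four packaged interfaces (hypothetical)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = None
        self.arbiter = None
        self.target = None

    def init(self, mac_addresses):
        # Initialization interface: form the network and elect the arbiter.
        # Smallest MAC string wins, as in the Background example.
        self.arbiter = min(mac_addresses)
        return self.arbiter

    def start_nearby_wakeup(self, expected_requests):
        # Start interface: create the arbitration processing thread.
        def arbitrate():
            batch = [self._requests.get() for _ in range(expected_requests)]
            # Each request is a (device_id, normalized_energy) pair.
            self.target = max(batch, key=lambda req: req[1])[0]

        self._thread = threading.Thread(target=arbitrate, daemon=True)
        self._thread.start()

    def send_arbitration_request(self, device_id, normalized_energy):
        # Sending interface: deliver one request to the arbitration node.
        self._requests.put((device_id, normalized_energy))

    def stop_nearby_wakeup(self, timeout=1.0):
        # Close interface: end the arbitration thread and return its result.
        if self._thread is not None:
            self._thread.join(timeout)
        return self.target


wakeup = NearbyWakeup()
wakeup.init(["02:00:00:00:00:0b", "02:00:00:00:00:0a"])
wakeup.start_nearby_wakeup(expected_requests=2)
wakeup.send_arbitration_request("android-speaker", 0.42)
wakeup.send_arbitration_request("linux-panel", 0.57)
target = wakeup.stop_nearby_wakeup()
```

Keeping the platform-specific parts (transport, threading primitives) behind these four calls is what would let the same arbitration logic be packaged for Android and Linux devices alike.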
In a second aspect, an embodiment of the present application provides a cross-platform distributed nearby wake-up apparatus, including:
a receiving request unit, configured to receive, by an arbitration node in a networking system, an arbitration request sent by another request device in the networking system when a wake-up instruction is detected, where the request device includes multiple devices of at least one type, and the arbitration request carries a normalized voice energy value of the request device, where the normalized voice energy value is a product of a voice energy value detected by the request device for the wake-up instruction and a normalization coefficient of the request device;
and the arbitration response unit is used for responding the arbitration request by the arbitration node, selecting the target equipment with the maximum normalized voice energy value from the request equipment, and sending a notification message to the target equipment, wherein the notification message is used for notifying the target equipment to respond to the awakening instruction.
In some of these embodiments, the normalization coefficient of the request device is determined by:
feeding the recordings of a target audio made by the request device and by a reference device, respectively, into a voice energy detection tool to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio;
and determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
In some of these embodiments, the apparatus further comprises:
a compiling and packaging unit, configured to compile and package interfaces that can be called from different operating system platforms before the wake-up instruction is detected, where the interfaces at least include:
an initialization interface, used for establishing the network and electing the arbitration node;
a start nearby wake-up interface, used for creating an arbitration processing thread for the arbitration node, wherein the arbitration processing thread performs arbitration to select the target device when the arbitration request is received;
a close nearby wake-up interface, used for ending the arbitration processing thread;
and an arbitration request sending interface, used by the request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
In some of these embodiments, the apparatus is configured to: when a wake-up instruction is detected, establish the network and elect the arbitration node by calling the initialization interface; the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface; the arbitration node receives the arbitration request sent by the request device by calling the start nearby wake-up interface, and responds to the arbitration request by selecting the target device from the request devices; the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction; and the arbitration node finishes responding to the arbitration request by calling the close nearby wake-up interface.
In a third aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the cross-platform distributed nearby wake-up method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the cross-platform distributed nearby wake-up method according to the first aspect.
Compared with the related art, in the cross-platform distributed nearby wake-up method provided by the embodiments of the application, when a wake-up instruction is detected, the arbitration node in the network responds to the arbitration requests sent by the request devices and selects, from the request devices, the target device with the maximum normalized voice energy value to respond to the wake-up instruction. Because the voice energy value of each request device is preprocessed with its normalization coefficient, differences in microphone pickup gain between different types of request devices can be eliminated, so that the voice energy values of different types of request devices become comparable. This solves the problems in the related art that distributed nearby wake-up cannot be used across platforms and that nearby wake-up accuracy is low, enables cross-platform use of distributed nearby wake-up, and improves its accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a block diagram of a mobile terminal according to an embodiment of the present application;
FIG. 2 is a flow diagram of a cross-platform distributed nearby wake-up method according to an embodiment of the present application;
FIG. 3 is a flow chart of a cross-platform distributed nearby wake-up method according to a preferred embodiment of the present application;
FIG. 4 is a block diagram of a cross-platform distributed nearby wake-up apparatus according to an embodiment of the present application;
FIG. 5 is a hardware structure diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. The words "a," "an," "the," and similar words used throughout this application are not to be construed as limiting in number and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in this application, are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "And/or" describes an association relationship between associated objects and means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
This embodiment provides a mobile terminal. FIG. 1 is a block diagram of a mobile terminal according to an embodiment of the present application. As shown in FIG. 1, the mobile terminal includes: a Radio Frequency (RF) circuit 110, a memory 120, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a wireless fidelity (WiFi) module 170, a processor 180, and a power supply 190. Those skilled in the art will appreciate that the structure shown in FIG. 1 does not limit the mobile terminal, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
The following describes each constituent element of the mobile terminal in detail with reference to FIG. 1:
The RF circuit 110 may be used for receiving and transmitting signals during information transmission and reception or during a call; in particular, it receives downlink information from a base station and passes it to the processor 180 for processing, and it transmits uplink data to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 110 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.
The memory 120 may be used to store software programs and modules, and the processor 180 executes various functional applications and data processing of the mobile terminal by operating the software programs and modules stored in the memory 120. The memory 120 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile terminal, and the like. Further, the memory 120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 130 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the input unit 130 may include a touch panel 131 and other input devices 132. The touch panel 131, also referred to as a touch screen, may collect touch operations of a user on or near the touch panel 131 (e.g., operations of the user on or near the touch panel 131 using any suitable object or accessory such as a finger or a stylus pen), and drive the corresponding connection device according to a preset program. Optionally, the touch panel 131 may include two parts, i.e., a touch detection device and a touch controller. The touch detection device detects the touch position of a user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch panel 131 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 130 may include other input devices 132 in addition to the touch panel 131. In particular, the other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by a user or information provided to the user and various menus of the mobile terminal. The Display unit 140 may include a Display panel 141, and optionally, the Display panel 141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 131 can cover the display panel 141, and when the touch panel 131 detects a touch operation on or near the touch panel 131, the touch operation is transmitted to the processor 180 to determine the type of the touch event, and then the processor 180 provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although the touch panel 131 and the display panel 141 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 131 and the display panel 141 may be integrated to implement the input and output functions of the mobile terminal.
The mobile terminal may also include at least one sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 141 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 141 and/or the backlight when the mobile terminal is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of the mobile terminal, and related functions (such as pedometer and tapping) for vibration recognition; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile terminal, further description is omitted here.
A speaker 161 and a microphone 162 in the audio circuit 160 may provide an audio interface between the user and the mobile terminal. The audio circuit 160 may transmit the electrical signal converted from the received audio data to the speaker 161, and convert the electrical signal into a sound signal for output by the speaker 161; on the other hand, the microphone 162 converts the collected sound signal into an electric signal, converts the electric signal into audio data after being received by the audio circuit 160, and then outputs the audio data to the processor 180 for processing, and then transmits the audio data to, for example, another mobile terminal via the RF circuit 110, or outputs the audio data to the memory 120 for further processing.
WiFi belongs to a short-distance wireless transmission technology, and the mobile terminal can help a user to send and receive e-mails, browse webpages, access streaming media and the like through the WiFi module 170, and provides wireless broadband internet access for the user. Although fig. 1 shows the WiFi module 170, it is understood that it does not belong to the essential components of the mobile terminal, and it can be omitted or replaced with other short-range wireless transmission modules, such as Zigbee module or WAPI module, etc., as required within the scope not changing the essence of the invention.
The processor 180 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 120 and calling data stored in the memory 120, thereby performing overall monitoring of the mobile terminal. Alternatively, processor 180 may include one or more processing units; preferably, the processor 180 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 180.
The mobile terminal also includes a power supply 190 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 180 via a power management system that may be configured to manage charging, discharging, and power consumption.
Although not shown, the mobile terminal may further include a camera, a bluetooth module, and the like, which will not be described herein.
In this embodiment, the processor 180 is configured to: when a wake-up instruction is detected, an arbitration node in a networking receives arbitration requests sent by other request devices in the networking, wherein the request devices comprise a plurality of devices of at least one type, the arbitration requests carry normalized voice energy values of the request devices, and the normalized voice energy values are products of the voice energy values detected by the request devices for the wake-up instruction and normalized coefficients of the request devices; and the arbitration node responds to the arbitration request, selects the target equipment with the maximum normalized voice energy value from the request equipment, and sends a notification message to the target equipment, wherein the notification message is used for notifying the target equipment to respond to the awakening instruction.
In some of these embodiments, the processor 180 is further configured to: feed the recordings of a target audio made by the request device and by a reference device, respectively, into a voice energy detection tool to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio; and determine the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
In some of these embodiments, the processor 180 is further configured to: before a wake-up instruction is detected, compile and package interfaces that can be called from different operating system platforms, wherein the interfaces at least comprise: an initialization interface, used for establishing the network and electing the arbitration node; a start nearby wake-up interface, used for creating an arbitration processing thread for the arbitration node, wherein the arbitration processing thread performs arbitration to select the target device when the arbitration request is received; a close nearby wake-up interface, used for ending the arbitration processing thread; and an arbitration request sending interface, used by the request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
In some of these embodiments, the processor 180 is further configured to: when a wake-up instruction is detected, establish the network and elect the arbitration node by calling the initialization interface; the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface; the arbitration node receives the arbitration request sent by the request device by calling the start nearby wake-up interface, and responds to the arbitration request by selecting the target device from the request devices; the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction; and the arbitration node finishes responding to the arbitration request by calling the close nearby wake-up interface.
The embodiment also provides a cross-platform distributed nearby wake-up method. Fig. 2 is a flowchart of a cross-platform distributed nearby wake-up method according to an embodiment of the present application, where, as shown in fig. 2, the flowchart includes the following steps:
step S202, when a wake-up instruction is detected, an arbitration node in a network receives arbitration requests sent by the other request devices in the network, wherein the request devices comprise a plurality of devices of at least one type, each arbitration request carries the normalized voice energy value of the request device that sent it, and the normalized voice energy value is the product of the voice energy value that the request device detected for the wake-up instruction and the normalization coefficient of that request device;
step S204, the arbitration node, in response to the arbitration requests, selects the target device with the largest normalized voice energy value from among the request devices and sends a notification message to the target device, wherein the notification message notifies the target device to respond to the wake-up instruction.
Through the above steps, when the wake-up instruction is detected, the arbitration node in the network responds to the arbitration requests received from the request devices and selects, from among those request devices, the target device with the largest normalized voice energy value to respond to the wake-up instruction.
In step S202, the network may include multiple devices. One of the devices serves as the arbitration node, and the devices other than the arbitration node serve as request devices. There may be multiple request devices, and they may be of at least one type, where the type refers to the operating system platform used by the request device, such as Linux, Android, or iOS.
Different types of request devices have different pickup gains, so directly comparing the voice energy values that different types of request devices detect for the wake-up instruction, as the prior art does, gives inaccurate comparison results and in turn reduces the accuracy with which the arbitration node selects the target device that responds to the wake-up instruction. To overcome this defect, the embodiment of the present application performs voice energy normalization in advance for each type of request device in the network. The specific process may include: first, selecting a reference device according to actual requirements, where the reference device may be the device with the best recording consistency, the device with the largest shipment volume, the device most frequently used by users, and so on; then, playing a target audio while the request device and the reference device record at the same time, and feeding the two recordings into a voice energy detection tool to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio; and finally, determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device. The normalization coefficient of each type of request device can be determined through this process, and once the voice energy value of each type of request device has been multiplied by its normalization coefficient, the voice energy values of different types of request devices become comparable.
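As an illustration only, the per-type coefficient calculation described above could be sketched in Python as follows. The function name `coefficient_table`, the device-type labels, and the measured energy values are hypothetical and do not come from the application; the only rule taken from the text is that a device's coefficient is the reference device's energy divided by that device's energy for the same target audio.

```python
from typing import Dict


def coefficient_table(energies: Dict[str, float], reference: str) -> Dict[str, float]:
    """Return a normalization coefficient for every device type.

    energies maps a device type to the voice energy value it measured for
    the same target audio; reference names the chosen reference device.
    """
    reference_energy = energies[reference]
    table = {}
    for device_type, energy in energies.items():
        if energy <= 0:
            raise ValueError(f"energy for {device_type} must be positive")
        # coefficient = second (reference) energy / first (device) energy
        table[device_type] = reference_energy / energy
    return table


# Hypothetical calibration measurements for three device types.
measured = {"linux-speaker": 3.0, "android-tv": 2.0, "ios-phone": 6.0}
table = coefficient_table(measured, reference="linux-speaker")
```

Under these assumed measurements the reference device gets a coefficient of 1, a device that records more quietly than the reference gets a coefficient above 1, and a device that records more loudly gets one below 1.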
When the user utters the wake-up voice, both the arbitration node and the request devices in the network can detect the wake-up instruction corresponding to that voice. When the arbitration node detects the wake-up instruction, it creates an arbitration processing thread and waits for the request devices to send arbitration requests. When a request device detects the wake-up instruction, it first measures the voice energy value of the wake-up instruction, then multiplies that value by its own normalization coefficient and takes the product as the normalized voice energy value, sends an arbitration request containing the normalized voice energy value to the arbitration node, and waits for the arbitration result returned by the arbitration node.
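The request-device side of this exchange might be sketched as below. This is a minimal Python sketch under stated assumptions: the `ArbitrationRequest` type, its field names, and the `build_request` helper are hypothetical, and the application does not specify the message format or transport.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArbitrationRequest:
    """Hypothetical message a request device sends to the arbitration node."""
    device_id: str
    normalized_energy: float


def build_request(device_id: str, raw_energy: float, coefficient: float) -> ArbitrationRequest:
    """Multiply the detected voice energy by the device's own coefficient."""
    return ArbitrationRequest(device_id, raw_energy * coefficient)


# Hypothetical values: detected energy 2.0, coefficient 1.5 from calibration.
request = build_request("speaker-1", raw_energy=2.0, coefficient=1.5)
```

Doing the multiplication on the request device means the arbitration node only ever sees values that are already comparable and does not need to know each device's coefficient.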
In step S204, after receiving the arbitration request of each request device, the arbitration node responds to the arbitration requests. The specific response process may include: based on the normalized voice energy value contained in the arbitration request sent by each request device, selecting the request device with the largest normalized voice energy value as the target device; sending a notification message to the target device to notify it to respond to the wake-up instruction; and at the same time sending an arbitration result to the other request devices to notify them not to respond to the wake-up instruction.
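The selection rule in this step can be illustrated with a short Python sketch. The function name `arbitrate`, the device identifiers, and the boolean result format are assumptions made for the example; the application does not say how ties are broken, so this sketch simply keeps the first device that reached the maximum.

```python
from typing import Dict, Mapping


def arbitrate(requests: Mapping[str, float]) -> Dict[str, bool]:
    """Map each device ID to True (respond) or False (do not respond).

    requests maps a device ID to the normalized voice energy value carried
    in its arbitration request.
    """
    if not requests:
        return {}
    # The device with the largest normalized energy becomes the target.
    target = max(requests, key=requests.get)
    return {device_id: device_id == target for device_id in requests}


# Hypothetical normalized energies reported by three request devices.
results = arbitrate({"tv": 2.4, "speaker": 3.0, "phone": 1.1})
```

Each entry of the returned mapping corresponds to one message: the notification sent to the target device, or the arbitration result telling another device to stay silent.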
In some embodiments, to provide a unified API across platforms, the embodiments of the present application package and compile the interfaces separately for the common operating system platforms, such as Linux, Android, and iOS, where the interfaces at least include:
an initialization interface, used to establish the network and elect the arbitration node;
a start-nearby-wake-up interface, used by the arbitration node to create an arbitration processing thread, where the arbitration processing thread performs arbitration to select the target device when arbitration requests are received;
a stop-nearby-wake-up interface, used to end the arbitration processing thread;
and an arbitration request sending interface, used by a request device to send an arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
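The four interfaces listed above could be modeled as an abstract interface that each platform binding implements. The Python sketch below is illustrative only: the class and method names (`NearbyWakeup`, `init`, `start`, `stop`, `send_request`) are hypothetical, and `SingleDeviceWakeup` is a trivial stand-in used to exercise the call order, not a description of the actual library.

```python
from abc import ABC, abstractmethod


class NearbyWakeup(ABC):
    """Hypothetical platform-neutral view of the four interfaces."""

    @abstractmethod
    def init(self) -> None:
        """Establish the network and elect the arbitration node."""

    @abstractmethod
    def start(self) -> None:
        """Create the arbitration processing thread on the arbitration node."""

    @abstractmethod
    def stop(self) -> None:
        """End the arbitration processing thread."""

    @abstractmethod
    def send_request(self, normalized_energy: float) -> bool:
        """Send an arbitration request; return True if this device should respond."""


class SingleDeviceWakeup(NearbyWakeup):
    """Stand-in with one device that is its own arbitration node."""

    def __init__(self) -> None:
        self.is_arbiter = False
        self.running = False

    def init(self) -> None:
        self.is_arbiter = True  # the only device wins the election

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def send_request(self, normalized_energy: float) -> bool:
        if not self.running:
            raise RuntimeError("arbitration is not running")
        return True  # no competitors, so this device always responds


wakeup = SingleDeviceWakeup()
wakeup.init()
wakeup.start()
should_respond = wakeup.send_request(3.0)
wakeup.stop()
```

Keeping the caller-facing surface this small is what allows the same calling code to sit on top of a native Linux or iOS build or an Android JNI wrapper.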
Based on the above interfaces, the cross-platform distributed nearby wake-up method of the embodiment of the present application may be implemented by calling the corresponding interfaces, and specifically includes:
when a wake-up instruction is detected, establishing the network and electing the arbitration node by calling the initialization interface;
the request device sending the arbitration request to the arbitration node by calling the arbitration request sending interface;
the arbitration node receiving the arbitration request sent by the request device by calling the start-nearby-wake-up interface, and in response to the arbitration request selecting the target device from among the request devices;
the arbitration node sending a notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction;
and the arbitration node ending its response to the arbitration request by calling the stop-nearby-wake-up interface.
In the embodiment of the present application, interfaces that can be called from different operating system platforms are compiled and packaged in advance, and distributed nearby wake-up is achieved by calling these interfaces. This overcomes the defect that distributed nearby wake-up in the related art is designed and implemented for only a single type of device, and achieves genuine cross-platform use, that is, use across different hardware and systems.
The embodiments of the present application are described and illustrated below by means of preferred embodiments.
The preferred embodiment proposes a cross-platform implementation of nearby wake-up, which is divided into two parts: "implementing a cross-platform API" and "proposing a method for normalizing wake-up voice energy".
1. Implementing a cross-platform API that supports calls from the Linux, Android, and iOS platforms.
The basic nearby wake-up logic is implemented in C and is shown in Fig. 3, which is a flowchart of a cross-platform distributed nearby wake-up method according to the preferred embodiment of the present application. The method includes the following steps:
step S301, networking;
step S302, electing an arbitration node;
step S303, judging whether the heartbeat of the arbitration node is lost, if so, executing step S302, otherwise, executing step S304;
step S304, determining whether a wake-up command is detected, if so, performing step S305, otherwise, performing step S303;
step S305, determining whether the device is an arbitration node, if yes, executing steps S306 to S307, and returning to execute step S303, otherwise, executing steps S308 to S309, and returning to execute step S303;
step S306, processing the arbitration request;
step S307, issuing an arbitration result;
step S308, sending an arbitration request;
step S309, processing the arbitration result.
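The branching in steps S303 to S309 can be traced with a small Python sketch. It is only an illustration of the control flow as listed above: the function name `steps_for_cycle` and its boolean inputs are hypothetical, and the real logic is described as being implemented in C.

```python
from typing import List


def steps_for_cycle(heartbeat_lost: bool, wake_detected: bool, is_arbiter: bool) -> List[str]:
    """Return the step labels executed in one pass that starts at S303."""
    steps = ["S303"]
    if heartbeat_lost:
        # Arbitration node heartbeat lost: go back and elect a new node.
        steps.append("S302")
        return steps
    steps.append("S304")
    if not wake_detected:
        # No wake-up instruction: return to the heartbeat check.
        return steps
    steps.append("S305")
    if is_arbiter:
        steps += ["S306", "S307"]  # process requests, issue the result
    else:
        steps += ["S308", "S309"]  # send a request, process the result
    return steps


arbiter_pass = steps_for_cycle(heartbeat_lost=False, wake_detected=True, is_arbiter=True)
requester_pass = steps_for_cycle(heartbeat_lost=False, wake_detected=True, is_arbiter=False)
```

Every pass ends by returning to S303, so a device keeps checking the arbitration node's heartbeat between wake-up events.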
The above logic is encapsulated, and the interfaces shown in Table 1 below are exposed externally:
TABLE 1 external interface
(Table 1 appears only as images in the original publication: Figure BDA0002607277480000121 and Figure BDA0002607277480000131.)
The interfaces are compiled for the Linux and iOS platforms.
The interfaces are wrapped with Android JNI.
2. Proposing a method for normalizing wake-up voice energy
Voice energy normalization means introducing a normalization coefficient so that the recording energy of different devices becomes comparable: the recording data of a device is multiplied by its coefficient to compensate for the differences in recording gain between different hardware or systems. For example, if the gain of device A is measured to be 3 and the gain of device B is measured to be 2, then, taking device A as the reference, the normalization coefficients of A and B may be set to 1 and 1.5 (that is, 3/2), respectively.
The specific process may include the following steps:
(1) Select one class of device as the reference device.
The device with the best recording consistency is usually selected; the choice can also follow project requirements (for example, the device with the largest shipment volume or the device most frequently used by customers).
(2) Develop a voice energy detection tool for normalization debugging.
The tool takes a segment of audio as input and computes its average energy.
(3) Perform normalization debugging on the device.
Place the reference device R and the device T to be debugged together, play a segment of audio, and start recording on both devices at the same time; feed the two recordings into the voice energy detection tool to obtain the average energies Re and Te of the two devices. The normalization coefficient of the device T to be debugged is the value obtained by dividing Re by Te.
(4) Use the normalization coefficient.
During actual operation of the device, the recorded data is multiplied by the normalization coefficient and then passed to the Start interface in Table 1 for energy calculation and nearby arbitration.
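Steps (2) to (4) can be walked through end to end with a short Python sketch. Everything in it is an assumption for illustration: the function names and sample values are hypothetical, and the tool is assumed to report the mean absolute amplitude of a recording, because the text only says "average energy". That choice matters: with a mean-square measure, multiplying the samples by a coefficient c would scale the reported value by c squared rather than by c.

```python
from typing import List, Sequence


def average_level(samples: Sequence[float]) -> float:
    """Step (2): mean absolute amplitude of a recording (assumed measure)."""
    if not samples:
        raise ValueError("recording is empty")
    return sum(abs(s) for s in samples) / len(samples)


def calibrate(reference_rec: Sequence[float], device_rec: Sequence[float]) -> float:
    """Step (3): the coefficient of device T is Re divided by Te."""
    te = average_level(device_rec)
    if te == 0:
        raise ValueError("device recording is silent")
    return average_level(reference_rec) / te


def apply_coefficient(samples: Sequence[float], coefficient: float) -> List[float]:
    """Step (4): scale the recorded data before energy calculation."""
    return [s * coefficient for s in samples]


# Hypothetical recordings of the same audio on devices R and T.
reference_rec = [0.6, -0.6, 0.6, -0.6]
device_rec = [0.4, -0.4, 0.4, -0.4]
coefficient = calibrate(reference_rec, device_rec)
scaled = apply_coefficient(device_rec, coefficient)
```

After scaling, the level measured on device T's recording matches the level measured on the reference device, which is the property the arbitration step relies on.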
The method and the device provide a unified cross-platform API: the design does not depend on specific hardware or systems, and the interfaces are packaged and compiled separately for common Linux, Android, and iOS devices. They also address the differences in recording gain between devices: an end-to-end debugging procedure is designed to determine the recording normalization coefficient of each device, so that the recording energy of different devices becomes comparable.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The embodiment also provides a cross-platform distributed nearby wake-up apparatus, which is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated here. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of a cross-platform distributed nearby wake-up apparatus according to an embodiment of the present application, and as shown in fig. 4, the apparatus includes:
a receiving request unit 42, configured to enable an arbitration node in a network to receive, when a wake-up instruction is detected, arbitration requests sent by the other request devices in the network, where the request devices include multiple devices of at least one type, each arbitration request carries the normalized voice energy value of the request device that sent it, and the normalized voice energy value is the product of the voice energy value that the request device detected for the wake-up instruction and the normalization coefficient of that request device;
an arbitration responding unit 44, configured to enable the arbitration node, in response to the arbitration requests, to select the target device with the largest normalized voice energy value from among the request devices and to send a notification message to the target device, where the notification message notifies the target device to respond to the wake-up instruction.
In some of these embodiments, the normalization coefficient of the request device is determined by:
feeding the recordings that the request device and a reference device made of a target audio into a voice energy detection tool, to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio;
and determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
In some of these embodiments, the apparatus further comprises:
a compiling and packaging unit, configured to compile and package, before the wake-up instruction is detected, interfaces that can be called from different operating system platforms, where the interfaces at least include:
an initialization interface, used to establish the network and elect the arbitration node;
a start-nearby-wake-up interface, used by the arbitration node to create an arbitration processing thread, where the arbitration processing thread performs arbitration to select the target device when arbitration requests are received;
a stop-nearby-wake-up interface, used to end the arbitration processing thread;
and an arbitration request sending interface, used by a request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
In some of these embodiments, the apparatus is configured to perform the following: when a wake-up instruction is detected, the network is established and the arbitration node is elected by calling the initialization interface; the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface; the arbitration node receives the arbitration request sent by the request device by calling the start-nearby-wake-up interface, and in response to the arbitration request selects the target device from among the request devices; the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction; and the arbitration node ends its response to the arbitration request by calling the stop-nearby-wake-up interface.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The embodiment of the present application also provides a computer device, by which the cross-platform distributed nearby wake-up method of the embodiments of the present application can be implemented. Fig. 5 is a hardware structure diagram of a computer device according to an embodiment of the present application.
The computer device may comprise a processor 51 and a memory 52 in which computer program instructions are stored.
Specifically, the processor 51 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
Memory 52 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 52 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 52 may include removable or non-removable (or fixed) media, where appropriate. The memory 52 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 52 is a non-volatile memory. In particular embodiments, memory 52 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random Access Memory (FPMDRAM), an Extended Data Output Dynamic Random Access Memory (EDODRAM), a Synchronous Dynamic Random Access Memory (SDRAM), and the like.
The memory 52 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 51.
The processor 51 implements any of the above embodiments of the cross-platform distributed nearby wake-up method by reading and executing computer program instructions stored in the memory 52.
In some of these embodiments, the computer device may also include a communication interface 53 and a bus 50. As shown in fig. 5, the processor 51, the memory 52, and the communication interface 53 are connected via the bus 50 to complete mutual communication.
The communication interface 53 is used to implement communication between the modules, apparatuses, units, and/or devices in the embodiments of the present application. The communication interface 53 may also carry out data communication with other components, such as external devices, image/data acquisition devices, databases, external storage, and image/data processing workstations.
Bus 50 comprises hardware, software, or both, coupling the components of the computer device to each other. Bus 50 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, bus 50 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 50 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
In addition, in combination with the cross-platform distributed nearby wake-up method in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement the cross-platform distributed nearby wake-up method of any of the above embodiments.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A cross-platform distributed nearby wake-up method, comprising:
when a wake-up instruction is detected, an arbitration node in a network receives arbitration requests sent by the other request devices in the network, wherein the request devices comprise a plurality of devices of at least one type, each arbitration request carries the normalized voice energy value of the request device that sent it, and the normalized voice energy value is the product of the voice energy value that the request device detected for the wake-up instruction and the normalization coefficient of that request device;
and the arbitration node, in response to the arbitration requests, selects the target device with the largest normalized voice energy value from among the request devices and sends a notification message to the target device, wherein the notification message notifies the target device to respond to the wake-up instruction.
2. The cross-platform distributed nearby wake-up method according to claim 1, characterized in that the normalization coefficient of the request device is determined by the following procedure:
feeding the recordings that the request device and a reference device made of a target audio into a voice energy detection tool, to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio;
and determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
3. The cross-platform distributed nearby wake-up method according to claim 1, characterized in that before the wake-up instruction is detected, the method further comprises:
compiling and packaging interfaces that can be called from different operating system platforms, wherein the interfaces at least comprise:
an initialization interface, used to establish the network and elect the arbitration node;
a start-nearby-wake-up interface, used by the arbitration node to create an arbitration processing thread, wherein the arbitration processing thread performs arbitration to select the target device when arbitration requests are received;
a stop-nearby-wake-up interface, used to end the arbitration processing thread;
and an arbitration request sending interface, used by a request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
4. The cross-platform distributed nearby wake-up method according to claim 3, characterized in that:
when a wake-up instruction is detected, the network is established and the arbitration node is elected by calling the initialization interface;
the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface;
the arbitration node receives the arbitration request sent by the request device by calling the start-nearby-wake-up interface, and in response to the arbitration request selects the target device from among the request devices;
the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction;
and the arbitration node ends its response to the arbitration request by calling the stop-nearby-wake-up interface.
5. A cross-platform distributed nearby wake-up apparatus, comprising:
a receiving request unit, configured to enable an arbitration node in a network to receive, when a wake-up instruction is detected, arbitration requests sent by the other request devices in the network, wherein the request devices comprise a plurality of devices of at least one type, each arbitration request carries the normalized voice energy value of the request device that sent it, and the normalized voice energy value is the product of the voice energy value that the request device detected for the wake-up instruction and the normalization coefficient of that request device;
and an arbitration response unit, configured to enable the arbitration node, in response to the arbitration requests, to select the target device with the largest normalized voice energy value from among the request devices and to send a notification message to the target device, wherein the notification message notifies the target device to respond to the wake-up instruction.
6. The cross-platform distributed nearby wake-up apparatus according to claim 5, characterized in that the normalization coefficient of the request device is determined by:
feeding the recordings that the request device and a reference device made of a target audio into a voice energy detection tool, to obtain a first voice energy value of the request device for the target audio and a second voice energy value of the reference device for the target audio;
and determining the value obtained by dividing the second voice energy value by the first voice energy value as the normalization coefficient of the request device.
7. The cross-platform distributed nearby wake-up apparatus according to claim 5, characterized in that the apparatus further comprises:
a compiling and packaging unit, configured to compile and package, before the wake-up instruction is detected, interfaces that can be called from different operating system platforms, wherein the interfaces at least comprise:
an initialization interface, used to establish the network and elect the arbitration node;
a start-nearby-wake-up interface, used by the arbitration node to create an arbitration processing thread, wherein the arbitration processing thread performs arbitration to select the target device when arbitration requests are received;
a stop-nearby-wake-up interface, used to end the arbitration processing thread;
and an arbitration request sending interface, used by a request device to send the arbitration request to the arbitration node and to receive the arbitration result returned by the arbitration node.
8. The cross-platform distributed nearby wake-up apparatus according to claim 7, characterized in that the apparatus is configured such that:
when a wake-up instruction is detected, the network is established and the arbitration node is elected by calling the initialization interface;
the request device sends the arbitration request to the arbitration node by calling the arbitration request sending interface;
the arbitration node receives the arbitration request sent by the request device by calling the start-nearby-wake-up interface, and in response to the arbitration request selects the target device from among the request devices;
the arbitration node sends the notification message to the target device by calling the arbitration request sending interface, so as to instruct the target device to respond to the wake-up instruction;
and the arbitration node ends its response to the arbitration request by calling the stop-nearby-wake-up interface.
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the cross-platform distributed nearby wake-up method of any of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements a cross-platform distributed nearby wake-up method according to any one of claims 1 to 4.
CN202010742734.2A 2020-07-29 2020-07-29 Cross-platform distributed nearby wake-up method and device Pending CN111883146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010742734.2A CN111883146A (en) 2020-07-29 2020-07-29 Cross-platform distributed nearby wake-up method and device

Publications (1)

Publication Number Publication Date
CN111883146A true CN111883146A (en) 2020-11-03

Family

ID=73200322

Country Status (1)

Country Link
CN (1) CN111883146A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112164398A (en) * 2020-11-05 2021-01-01 佛山市顺德区美的电子科技有限公司 Voice equipment and awakening method and device thereof and storage medium
CN113055827A (en) * 2021-03-12 2021-06-29 云知声智能科技股份有限公司 Method, device and system for realizing nearby awakening of distributed equipment based on AC + AP network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110223684A (en) * 2019-05-16 2019-09-10 华为技术有限公司 Voice awakening method and equipment
CN110322878A (en) * 2019-07-01 2019-10-11 华为技术有限公司 Sound control method, electronic equipment and system
CN111179931A (en) * 2020-01-03 2020-05-19 青岛海尔科技有限公司 Method and device for voice interaction and household appliance

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112164398A (en) * 2020-11-05 2021-01-01 佛山市顺德区美的电子科技有限公司 Voice equipment and awakening method and device thereof and storage medium
CN112164398B (en) * 2020-11-05 2023-08-15 佛山市顺德区美的电子科技有限公司 Voice equipment, wake-up method and device thereof and storage medium
CN113055827A (en) * 2021-03-12 2021-06-29 云知声智能科技股份有限公司 Method, device and system for realizing nearby awakening of distributed equipment based on AC + AP network
CN113055827B (en) * 2021-03-12 2022-06-17 云知声智能科技股份有限公司 Method, device and system for realizing nearby awakening of distributed equipment based on AC + AP network

Similar Documents

Publication Publication Date Title
JP6467526B2 (en) Communication message transmission method and wearable device
CN106847298B (en) Pickup method and device based on diffuse type voice interaction
US11016860B2 (en) Method for information processing and related device
CN107231159B (en) Radio frequency interference processing method, device, storage medium and terminal
US10657347B2 (en) Method for capturing fingerprint and associated products
WO2018103441A1 (en) Network positioning method and terminal device
CN106371964B (en) Method and device for prompting message
WO2018103443A1 (en) Network positioning method and terminal device
US20160360332A1 (en) Electronic device and method for controlling input and output by electronic device
CN108920220B (en) Function calling method, device and terminal
WO2018161540A1 (en) Fingerprint registration method and related product
CN108322602B (en) Method, terminal and computer readable storage medium for processing application no response
CN111883146A (en) Cross-platform distributed nearby wake-up method and device
CN111273955B (en) Thermal restoration plug-in optimization method and device, storage medium and electronic equipment
CN107911777B (en) Processing method and device for return-to-ear function and mobile terminal
CN106919458B (en) Method and device for Hook target kernel function
CN113286335B (en) Frequency point switching method and device, storage medium and access point
CN105278942B (en) Component management method and device
CN110209434B (en) Memory management method and device and computer readable storage medium
CN107659996B (en) Channel access method and equipment
WO2018103440A1 (en) Network positioning method and terminal device
CN115113950A (en) Method and device for outputting prompt information of application product
CN105635379B (en) Noise suppression method and device
CN112130928A (en) Automatic searching method, device, equipment and storage medium for Linux system sound card
CN113591006A (en) Web extension method and device based on WebSocket

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination