CN115985323B - Voice wakeup method and device, electronic equipment and readable storage medium

Info

Publication number: CN115985323B
Application number: CN202310273455.XA
Authority: CN
Inventors: 鲁勇; 丁萌; 刘波
Original assignee: Beijing Intengine Technology Co Ltd
Current assignee: Beijing Intengine Technology Co Ltd
Priority date: 2023-03-21
Filing date: 2023-03-21
Publication date: 2023-06-16
Anticipated expiration: 2043-03-21
Also published as: CN115985323A

Abstract

The application discloses a voice wake-up method and device, an electronic device, and a readable storage medium. The voice wake-up method comprises: collecting a sample signal within a preset time period; counting the signal value corresponding to each sample frame in the sample signal; calculating a background signal value corresponding to the sample signal based on the counted signal values; and, when a voice wake-up operation triggered for a target device is detected, waking up the target device according to the background signal value. The scheme does not require the device to spend a large amount of computing power on long-duration standby wake-up, and thus avoids device heating that would shorten the service life of the device.

Description

Voice wakeup method and device, electronic equipment and readable storage medium

Technical Field

The present invention relates to the field of communications, and in particular, to a voice wake-up method, apparatus, electronic device, and readable storage medium.

Background

With the advent of the mobile internet and the era of artificial intelligence, voice interaction has grown at an unprecedented rate in recent years. Voice wake-up technology, a special kind of voice recognition technology, has become an important component of interaction between users and machines. The goal of a voice wake-up system is to wake up a device without manual operation.

Current voice wake-up schemes generally rely on an inertial filter or on a neural network model. Under both schemes, however, long-duration standby wake-up consumes a large amount of the device's computing power, which causes the device to heat up and shortens its service life.

Disclosure of Invention

In view of the above technical problems, the application provides a voice wake-up method, a voice wake-up device, an electronic device, and a readable storage medium, which do not require the device to spend a large amount of computing power on long-duration standby wake-up, and which avoid device heating that would shorten the service life of the device.

In order to solve the above technical problems, the present application provides a voice wake-up method, including:

collecting sample signals within a preset time period;

counting signal values corresponding to each frame of sample frames in the sample signals;

calculating a background signal value corresponding to the sample signal based on the counted signal value;

and when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the background signal value.

Optionally, in some embodiments of the present application, the calculating a background signal value corresponding to the sample signal based on the counted signal value includes:

acquiring a historical signal value in historical time;

and calculating a background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the counted signal value.

Optionally, in some embodiments of the present application, the calculating a background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the counted signal value includes:

determining an initial signal value from the counted signal values;

calculating a difference between the initial signal value and the historical signal value;

adjusting the historical signal value according to the difference value to obtain an adjusted signal value;

and adjusting the adjusted signal value according to the fluctuation among other signal values except the initial signal value to obtain a background signal value corresponding to the sample signal.

Optionally, in some embodiments of the present application, the adjusting the historical signal value according to the difference value to obtain an adjusted signal value includes:

when the difference value is detected to be larger than a threshold value, calculating the sum of the historical signal value and a preset value to obtain an adjusted signal value;

and when the difference value is detected to be smaller than the threshold value, calculating the difference between the historical signal value and the preset value to obtain an adjusted signal value.

Optionally, in some embodiments of the present application, when a voice wake-up operation triggered for a target device is detected, waking up the target device according to the background signal value includes:

when a voice wake-up operation triggered for the target device is detected, acquiring an operation signal value corresponding to the voice wake-up operation;

detecting whether the operation signal value is larger than the background signal value;

and waking up the target device when the operation signal value is detected to be larger than the background signal value.

Optionally, in some embodiments of the present application, the method further includes:

periodically updating the background signal value to obtain an updated background signal value;

in which case the waking up the target device according to the background signal value when a voice wake-up operation triggered for the target device is detected includes: when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the updated background signal value.

Correspondingly, the application also provides a voice awakening device, which comprises:

the acquisition module is used for acquiring sample signals within a preset time length;

the statistics module is used for counting signal values corresponding to each frame of sample frame in the sample signal;

the calculating module is used for calculating a background signal value corresponding to the sample signal based on the counted signal value;

and the wake-up module is used for waking up the target equipment according to the background signal value when the voice wake-up operation triggered by the target equipment is detected.

Optionally, in some embodiments of the present application, the computing module includes:

an acquisition unit for acquiring a history signal value in a history time;

and the calculating unit is used for calculating the background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the statistical signal value.

The application also provides an electronic device comprising a memory storing a computer program and a processor implementing the steps of the method as described above when executing the computer program.

The present application also provides a computer storage medium storing a computer program which, when executed by a processor, implements the steps of the method as described above.

As described above, the present application provides a voice wake-up method, a voice wake-up device, an electronic device, and a readable storage medium. The voice wake-up method includes: collecting a sample signal within a preset time period; counting the signal value corresponding to each sample frame in the sample signal; calculating a background signal value corresponding to the sample signal based on the counted signal values; and, when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the background signal value. In this scheme, the background signal value is calculated from the signal value of each sample frame in the sample signal, and the target device is woken up using the calculated background signal value. Voice wake-up therefore does not need an inertial filter or a neural network model, long-duration standby wake-up does not consume a large amount of the device's computing power, and device heating that would shorten the service life of the device is avoided.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.

Fig. 1 is a schematic structural diagram of a voice wake-up system provided in an embodiment of the present application;

Fig. 2 is a flow chart of a voice wake-up method according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a voice wake-up device according to an embodiment of the present application;

Fig. 4 is another schematic structural diagram of a voice wake-up device according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of an intelligent terminal provided in an embodiment of the present application.

The realization, functional characteristics and advantages of the present application will be further described with reference to the embodiments, referring to the attached drawings. Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element. Furthermore, elements having the same name in different embodiments of the present application may have the same meaning or different meanings; the particular meaning is determined by its interpretation in the particular embodiment or by the context of that embodiment.

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

In the following description, suffixes such as "module", "component", or "unit" for representing elements are used only for facilitating the description of the present application, and are not of specific significance per se. Thus, "module," "component," or "unit" may be used in combination.

The embodiments related to the present application are specifically described below, and it should be noted that the order of description of the embodiments in the present application is not limited to the priority order of the embodiments.

The embodiments of the application provide a voice wake-up method, a voice wake-up device, a storage medium, and an electronic device. Specifically, the voice wake-up method may be executed by an electronic device, which may be a terminal such as a smart phone, a tablet computer, a notebook computer, a touch screen, a game console, a personal computer (PC), or a personal digital assistant (PDA). The electronic device may further include a client, which may be a voice wake-up client or another client. The electronic device may be connected to a server in a wired or wireless manner. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms.

For example, when the voice wake-up method runs on the electronic device, the electronic device collects a sample signal within a preset time period and counts the signal value corresponding to each sample frame in the sample signal. The electronic device then calculates a background signal value corresponding to the sample signal based on the counted signal values and, when it detects a voice wake-up operation triggered for the target device, wakes up the target device according to the background signal value.

Referring to Fig. 1, Fig. 1 is a schematic diagram of a voice wake-up system according to an embodiment of the present application. The system may include at least one electronic device 1000 and at least one server or personal computer 2000. The electronic device 1000 held by the user may be connected to different servers or personal computers through a network. The electronic device 1000 may be an electronic device having computing hardware capable of supporting and executing software products corresponding to multimedia. The electronic device 1000 may also have one or more multi-touch-sensitive screens for sensing and obtaining user input through touch or slide operations performed at multiple points of the screens. The network connecting the electronic device 1000 to the server or personal computer 2000 may be a wireless network or a wired network, such as a wireless local area network (WLAN), a local area network (LAN), or a cellular network (2G, 3G, 4G, 5G, etc.). Different electronic devices 1000 may also connect to other embedded platforms, servers, or personal computers using their own Bluetooth or hotspot networks. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms.

The embodiment of the application provides a voice awakening method which can be executed by electronic equipment. The electronic equipment comprises a touch display screen and a processor, wherein the touch display screen is used for presenting a graphical user interface and receiving an operation instruction generated by a user acting on the graphical user interface. When a user operates the graphical user interface through the touch display screen, the graphical user interface can control local content of the electronic equipment by responding to a received operation instruction, and can also control content of a server side by responding to the received operation instruction. For example, the user-generated operational instructions acting on the graphical user interface include instructions for processing the initial audio data, and the processor is configured to launch a corresponding application upon receiving the user-provided instructions. Further, the processor is configured to render and draw a graphical user interface associated with the application on the touch-sensitive display screen. A touch display screen is a multi-touch-sensitive screen capable of sensing touch or slide operations performed simultaneously by a plurality of points on the screen. The user performs touch operation on the graphical user interface by using a finger, and when the graphical user interface detects the touch operation, the graphical user interface controls the graphical user interface of the application to display the corresponding operation.

In the voice wake-up scheme of the present application, the background signal value corresponding to the sample signal is calculated from the signal value of each sample frame in the sample signal, and the target device is woken up using the calculated background signal value. Voice wake-up therefore does not need an inertial filter or a neural network model, long-duration standby wake-up does not consume a large amount of the device's computing power, and device heating that would shorten the service life of the device is avoided.

A detailed description follows. It should be noted that the order in which the embodiments are described does not limit their order of priority.

A voice wakeup method, comprising: collecting sample signals within a preset time period; counting signal values corresponding to each frame of sample frames in the sample signals; calculating a background signal value corresponding to the sample signal based on the counted signal value; and when the voice wake-up operation triggered by the target equipment is detected, waking up the target equipment according to the background signal value.

Referring to Fig. 2, Fig. 2 is a flow chart of a voice wake-up method according to an embodiment of the present application. The specific flow of the voice wake-up method may be as follows:

101. Collect a sample signal within a preset time period.

The sample signal is an audio signal collected within the preset time period. It may include a human voice signal, an environmental sound signal, and other types of sound signals, and may be collected by a sound sensor (such as a microphone) of the electronic device. The preset time period may be 10 minutes, 20 minutes, or 100 minutes, or it may be 50 seconds, 120 seconds, or 300 seconds; it is set according to the actual situation and is not described further here.

102. Count the signal value corresponding to each sample frame in the sample signal.

For example, the sample signal may be divided into frames to obtain a plurality of sample frames. The sample signal may contain a human voice signal (i.e., a speech signal), which is non-stationary over long spans but approximately stationary over short ones: within 10 ms to 30 ms the speech signal can be regarded as roughly unchanged. To facilitate subsequent voice wake-up, the sample signal is therefore divided into short segments, each of which is a sample frame in the present application, and within each of which the characteristics of the speech signal can be regarded as stable. The framing principle is that a frame must be short enough for the signal within it to be stationary, so the length of a frame should be less than the length of one phoneme; at a normal speech rate a phoneme lasts about 50 ms. In addition, a frame to be subjected to Fourier analysis must contain enough vibration periods. Considering that a male voice is about 100 Hz and a female voice about 200 Hz, the corresponding periods are 10 ms and 5 ms. The length of each sample frame is therefore 10 ms to 40 ms, and may be selected according to the actual situation.
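The framing step can be illustrated with a minimal sketch. The function name `split_into_frames`, the parameters `sample_rate_hz` and `frame_ms`, and the choice of non-overlapping frames are assumptions made for illustration; the text only requires frames of roughly 10 ms to 40 ms.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20):
    """Split a sequence of audio samples into non-overlapping frames.

    frame_ms is a hypothetical parameter; the description only asks for a
    frame length of roughly 10 ms to 40 ms.
    """
    if not 10 <= frame_ms <= 40:
        raise ValueError("frame length should be between 10 ms and 40 ms")
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    # Drop the trailing partial frame so every frame has the same length.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# One second of dummy audio at 16 kHz, split into 20 ms frames.
frames = split_into_frames(list(range(16000)), sample_rate_hz=16000, frame_ms=20)
```

At 16 kHz a 20 ms frame holds 320 samples, so one second of audio yields 50 frames in this sketch.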

Further, a discrete Fourier transform (applied frame by frame, this is also called a short-time discrete Fourier transform) may be applied to each frame to obtain the frequency-energy distribution of the signal within that frame. Each frame's spectrum has frequency on the horizontal axis and amplitude on the vertical axis; splicing the spectra of all frames together yields a spectrogram of the sample signal. In the present application, the amplitude is taken as the signal value corresponding to the sample frame.
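As a rough sketch of this step, the following computes the discrete Fourier transform magnitude of one frame directly from the definition and reduces it to a single signal value. The text does not say how the per-frame amplitude is reduced to one number; taking the peak magnitude is an assumption made here, and the function names are hypothetical.

```python
import cmath
import math


def frame_spectrum(frame):
    """Return the DFT magnitude of one frame for bins 0 .. N/2."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(frame))
        magnitudes.append(abs(acc))
    return magnitudes


def frame_signal_value(frame):
    """Reduce a frame to one signal value: here, the peak spectral amplitude.

    Assumption: the description only says the amplitude is used as the
    signal value; the peak magnitude is one illustrative choice.
    """
    return max(frame_spectrum(frame))


# A 64-sample frame holding a sine of amplitude 0.5 at frequency bin 4.
frame = [0.5 * math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
spectrum = frame_spectrum(frame)
value = frame_signal_value(frame)
```

The direct sum is quadratic in the frame length, which is acceptable for a short illustration; a real implementation would normally use a fast Fourier transform.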

103. Calculate a background signal value corresponding to the sample signal based on the counted signal values.

For example, a reference value may be obtained, the fluctuation among the counted signal values determined, and the background signal value corresponding to the sample signal calculated based on the difference between the fluctuation and the reference value. That is, optionally, in some embodiments, the step of "calculating a background signal value corresponding to the sample signal based on the counted signal value" may specifically include:

(11) Acquiring a historical signal value in historical time;

(12) Calculating a background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the counted signal value.

For example, an initial signal value may be determined from the counted signal values (e.g., the signal value of the first sample frame of the sample signal), the difference between the initial signal value and the historical signal value calculated, and the background signal value corresponding to the sample signal calculated based on that difference and the fluctuation among the counted signal values. That is, optionally, in some embodiments, the step of "calculating a background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the counted signal value" may specifically include:

(21) Determining an initial signal value from the counted signal values;

(22) Calculating a difference between the initial signal value and the historical signal value;

(23) Adjusting the historical signal value according to the difference value to obtain an adjusted signal value;

(24) Adjusting the adjusted signal value according to the fluctuation among the signal values other than the initial signal value to obtain a background signal value corresponding to the sample signal.

For example, after the signal value of the first sample frame of the sample signal is determined as the initial signal value, a historical signal value is acquired. The historical signal value may be the average of the signal values acquired over a historical time period, and may be used as the environmental background noise. The difference between the initial signal value and the historical signal value is then calculated, and the historical signal value is adjusted based on that difference. Next, the difference between the signal value of the frame following the initial frame and the adjusted historical signal value is calculated, and the adjusted historical signal value is adjusted again based on that difference. This repeats until all sample frames in the sample signal have been processed. Finally, the average of all the adjusted historical signal values is calculated to obtain the background signal value corresponding to the sample signal.

It should be noted that the present application adjusts the historical signal value by simple addition and subtraction: when the difference is greater than the set value, the historical signal value is increased by 1; when it equals the set value, no adjustment is made; and when it is less than the set value, the historical signal value is decreased by 1. That is, optionally, in some embodiments, the step of "adjusting the historical signal value according to the difference value to obtain an adjusted signal value" may specifically include:

(31) When the difference value is detected to be larger than the threshold value, calculating the sum of the historical signal value and the preset value to obtain an adjusted signal value;

(32) When the difference value is detected to be smaller than the threshold value, calculating the difference between the historical signal value and the preset value to obtain an adjusted signal value.
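Steps (21) to (24) and (31) to (32) can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: the difference is taken as the current frame's signal value minus the running historical value, and the defaults `threshold=0.0` and `step=1.0` (the preset value) as well as the function names are hypothetical.

```python
def adjust(historical, current, threshold=0.0, step=1.0):
    """Nudge the historical value toward the current frame's signal value.

    threshold and step are hypothetical defaults. A difference above the
    threshold adds the preset value, a difference below it subtracts the
    preset value, and an equal difference leaves the value unchanged.
    """
    diff = current - historical
    if diff > threshold:
        return historical + step
    if diff < threshold:
        return historical - step
    return historical


def background_value(signal_values, historical, threshold=0.0, step=1.0):
    """Process the frames one by one and average the adjusted values."""
    adjusted_values = []
    level = historical
    for value in signal_values:
        level = adjust(level, value, threshold, step)
        adjusted_values.append(level)
    # With no frames, fall back to the historical value.
    if not adjusted_values:
        return historical
    return sum(adjusted_values) / len(adjusted_values)


# Historical noise level 10; the frames hover around 12 to 13.
bg = background_value([13, 13, 12, 13, 12], historical=10)
```

Because each frame moves the running value by at most one preset step, the per-frame work is a comparison and an addition, which is consistent with the low-computation goal stated in the text.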

104. When a voice wake-up operation triggered for the target device is detected, wake up the target device according to the background signal value.

For example, when a voice wake-up operation triggered for the target device is detected, an operation signal value corresponding to the voice wake-up operation is obtained, and whether to trigger the target device to enter the wake-up mode is decided by comparing the operation signal value with the background signal value. That is, optionally, in some embodiments, the step of "when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the background signal value" may specifically include:

(41) When a voice wake-up operation triggered for the target device is detected, acquiring an operation signal value corresponding to the voice wake-up operation;

(42) Detecting whether the operation signal value is larger than the background signal value;

(43) When the operation signal value is detected to be larger than the background signal value, the target device is awakened.
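Steps (41) to (43) reduce to a single comparison, sketched below. The function name `should_wake` and the example values are hypothetical.

```python
def should_wake(operation_value, background):
    """Return True when the operation signal value exceeds the background."""
    return operation_value > background


# Three candidate operation signal values checked against a background of 12.0.
decisions = [should_wake(v, 12.0) for v in (9.5, 12.0, 18.2)]
```

The comparison is strict, following the wording "larger than the background signal value": a value equal to the background does not wake the device in this sketch.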

For example, when the chip of the target device is powered on, a sound signal is collected and an average is calculated through long-term statistics; this average is taken as the environmental background noise (i.e., the historical signal value) of the application scenario. The target device then continuously collects sample signals, counts the signal value corresponding to each sample frame, calculates the difference between that signal value and the historical signal value, and continuously updates the background signal value. When a voice wake-up operation triggered for the target device is detected, the target device is woken up according to the background signal value. That is, in some embodiments, the voice wake-up method further includes periodically updating the background signal value to obtain an updated background signal value, and the step of "when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the background signal value" specifically includes: when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the updated background signal value.
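The periodic update can be sketched as a small stateful tracker. The class name `BackgroundTracker`, the update period expressed as a frame count (`update_every`), and the unit step are assumptions for illustration; the text does not specify how the update period is defined.

```python
class BackgroundTracker:
    """Keeps a background signal value and refreshes it periodically."""

    def __init__(self, initial_background, update_every=4, step=1.0):
        self.background = initial_background
        self.update_every = update_every  # frames per update period (hypothetical)
        self.step = step                  # preset adjustment value (hypothetical)
        self._pending = []

    def observe(self, signal_value):
        """Buffer one frame's signal value; update once a period is full."""
        self._pending.append(signal_value)
        if len(self._pending) < self.update_every:
            return
        level = self.background
        levels = []
        for value in self._pending:
            if value > level:
                level += self.step
            elif value < level:
                level -= self.step
            levels.append(level)
        # The updated background is the average of the adjusted values.
        self.background = sum(levels) / len(levels)
        self._pending = []

    def should_wake(self, operation_value):
        """Compare an operation signal value with the updated background."""
        return operation_value > self.background


tracker = BackgroundTracker(initial_background=10.0)
for v in [14, 14, 14, 14]:
    tracker.observe(v)
```

After the four frames above, the running value steps through 11, 12, 13, and 14, so the updated background in this sketch is their average, 12.5.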

The voice wake-up flow of the application is completed.

As can be seen from the foregoing, the present application provides a voice wake-up method that collects a sample signal within a preset time period, counts the signal value corresponding to each sample frame in the sample signal, calculates a background signal value corresponding to the sample signal based on the counted signal values, and, when a voice wake-up operation triggered for the target device is detected, wakes up the target device according to the background signal value. Because the background signal value is calculated from the signal value of each sample frame and the target device is woken up using that value, voice wake-up does not need an inertial filter or a neural network model, long-duration standby wake-up does not consume a large amount of the device's computing power, and device heating that would shorten the service life of the device is avoided.

To facilitate better implementation of the voice wake-up method, the application also provides a voice wake-up device. The terms used have the same meanings as in the voice wake-up method above, and specific implementation details can be found in the description of the method embodiments.

Referring to fig. 3, fig. 3 is a schematic structural diagram of a voice wake-up device provided in the present application, where the voice wake-up device may include an acquisition module 201, a statistics module 202, a calculation module 203, and a wake-up module 204, and may specifically be as follows:

The acquisition module 201 is configured to collect a sample signal within a preset time period.

The statistics module 202 is configured to count signal values corresponding to each frame of sample frames in the sample signal.

The calculating module 203 is configured to calculate a background signal value corresponding to the sample signal based on the counted signal value.

For example, a reference value may be obtained, the fluctuation among the counted signal values determined, and the background signal value corresponding to the sample signal calculated based on the difference between the fluctuation and the reference value. That is, optionally, in some embodiments, the calculating module 203 may specifically include:

an acquisition unit for acquiring a historical signal value in historical time;

and a calculating unit for calculating the background signal value corresponding to the sample signal according to the fluctuation between the historical signal value and the counted signal value.

Optionally, in some embodiments, the calculating unit may specifically include:

a determining subunit, configured to determine an initial signal value from the counted signal values;

a calculating subunit for calculating a difference between the initial signal value and the historical signal value;

the adjusting subunit is used for adjusting the historical signal value according to the difference value to obtain an adjusted signal value;

and the adjusting subunit is used for adjusting the adjusted signal value according to the fluctuation among other signal values except the initial signal value to obtain a background signal value corresponding to the sample signal.

Alternatively, in some embodiments, the adjustment subunit may be specifically configured to: when the difference value is detected to be larger than the threshold value, calculating the sum of the historical signal value and the preset value to obtain an adjusted signal value; and when the detected difference value is smaller than the threshold value, calculating the difference between the historical signal value and the preset value to obtain an adjusted signal value.

And the wake-up module 204 is configured to wake-up the target device according to the background signal value when a voice wake-up operation triggered for the target device is detected.

For example, when a voice wake-up operation triggered for the target device is detected, an operation signal value corresponding to the voice wake-up operation is obtained, and whether to wake up the target device is decided by comparing the operation signal value with the background signal value. That is, optionally, in some embodiments, the wake-up module 204 may specifically be configured to: when a voice wake-up operation triggered for the target device is detected, acquire an operation signal value corresponding to the voice wake-up operation; detect whether the operation signal value is greater than the background signal value; and when the operation signal value is detected to be greater than the background signal value, wake up the target device.
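The comparison performed by the wake-up module can be sketched as a single predicate. The function name `should_wake` is hypothetical, and the sketch assumes the operation signal value and the background signal value are expressed in the same unit.

```python
def should_wake(operation_signal_value: float, background_signal_value: float) -> bool:
    """Wake only when the operation signal value exceeds the background value."""
    return operation_signal_value > background_signal_value


if __name__ == "__main__":
    # A trigger louder than the background wakes the device; an equal one does not.
    print(should_wake(0.8, 0.5), should_wake(0.5, 0.5))
```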

Optionally, in some embodiments, referring to fig. 4, the voice wake-up apparatus of the present application may further include an updating module 205, which may be specifically configured to periodically update the background signal value to obtain an updated background signal value.

Optionally, in some embodiments, the wake-up module 204 may be further specifically configured to: when a voice wake-up operation triggered for the target device is detected, wake up the target device according to the updated background signal value.
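The description does not say how the periodic update is scheduled. One possible arrangement is sketched below, assuming a fixed update period in seconds and a caller-supplied function that recomputes the background signal value from freshly collected samples. The class `BackgroundTracker` and all of its members are hypothetical.

```python
from typing import Callable


class BackgroundTracker:
    """Holds the current background signal value and refreshes it periodically."""

    def __init__(self, initial_value: float, period_s: float, start_s: float = 0.0) -> None:
        self.value = initial_value
        self.period_s = period_s
        self._last_update_s = start_s

    def maybe_update(self, now_s: float, recompute: Callable[[], float]) -> bool:
        """Recompute the background value if at least one period has elapsed."""
        if now_s - self._last_update_s >= self.period_s:
            self.value = recompute()
            self._last_update_s = now_s
            return True
        return False

    def should_wake(self, operation_signal_value: float) -> bool:
        """Compare a trigger against the most recently updated background value."""
        return operation_signal_value > self.value


if __name__ == "__main__":
    tracker = BackgroundTracker(initial_value=5.0, period_s=10.0)
    tracker.maybe_update(now_s=10.0, recompute=lambda: 9.0)
    print(tracker.value, tracker.should_wake(9.5))
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.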

The voice wake-up flow of the application is completed.

As can be seen from the foregoing, the present application provides a voice wake-up device in which the collection module 201 collects sample signals within a preset duration, the statistics module 202 counts the signal value corresponding to each sample frame in the sample signals, the calculation module 203 calculates the background signal value corresponding to the sample signals based on the counted signal values, and the wake-up module 204 wakes up the target device according to the background signal value when a voice wake-up operation triggered for the target device is detected. In the voice wake-up scheme provided by the present application, the background signal value is calculated from the signal values of the sample frames, and the target device is woken up using the calculated background signal value. Because voice wake-up does not rely on an inertial filter or a scheme based on a neural network model, long-time standby wake-up does not consume a large amount of the device's computing power, which avoids device heating that would shorten the service life of the device.

Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.

The embodiment of the present invention further provides an electronic device 500, as shown in fig. 5, where the electronic device 500 may integrate the above-mentioned voice wake-up device, and may further include a Radio Frequency (RF) circuit 501, a memory 502 including one or more computer readable storage media, an input unit 503, a display unit 504, a sensor 505, an audio circuit 506, a wireless fidelity (WiFi, wireless Fidelity) module 507, a processor 508 including one or more processing cores, and a power supply 509. Those skilled in the art will appreciate that the electronic device 500 structure shown in fig. 5 is not limiting of the electronic device 500 and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components. Wherein:

The RF circuit 501 may be configured to receive and send information, or to receive and send signals during a call. In particular, after downlink information of a base station is received, it is handed to one or more processors 508 for processing; in addition, uplink data is transmitted to the base station. Typically, the RF circuit 501 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 501 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including, but not limited to, Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.

The memory 502 may be used to store software programs and modules, and the processor 508 executes the software programs and modules stored in the memory 502 to perform various functional applications and information processing. The memory 502 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, a target data playing function, etc.), and the like; the storage data area may store data created according to the use of the electronic device 500 (such as audio data, phonebooks, etc.), and the like. In addition, memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide access to the memory 502 by the processor 508 and the input unit 503.

The input unit 503 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, the input unit 503 may include a touch-sensitive surface, as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations thereon or thereabout by a user (e.g., operations thereon or thereabout by a user using any suitable object or accessory such as a finger, stylus, etc.), and actuate the corresponding connection means according to a predetermined program. Alternatively, the touch-sensitive surface may comprise two parts, a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 508, and can receive commands from the processor 508 and execute them. In addition, touch sensitive surfaces may be implemented in a variety of types, such as resistive, capacitive, infrared, and surface acoustic waves. The input unit 503 may comprise other input devices besides a touch-sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc.

The display unit 504 may be used to display information entered by a user or provided to a user, as well as various graphical user interfaces of the electronic device 500, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 504 may include a display panel, which may optionally be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. Further, the touch-sensitive surface may overlay the display panel; upon detecting a touch operation on or near it, the touch-sensitive surface passes the operation to the processor 508 to determine the type of touch event, and the processor 508 then provides a corresponding visual output on the display panel based on the type of touch event. Although in fig. 5 the touch-sensitive surface and the display panel are implemented as two separate components for input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement the input and output functions.

The electronic device 500 may also include at least one sensor 505, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor, which may turn off the display panel and/or backlight when the electronic device 500 is moved to the ear. As one kind of motion sensor, a gravitational acceleration sensor may detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it may be used for applications that recognize the attitude of a mobile phone (such as horizontal/vertical screen switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer and tap detection). Other sensors that may also be configured in the electronic device 500, such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, are not described herein.

The audio circuit 506, a speaker, and a microphone may provide an audio interface between the user and the electronic device 500. On the one hand, the audio circuit 506 may transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; on the other hand, the microphone converts the collected sound signal into an electrical signal, which is received by the audio circuit 506 and converted into audio data. The audio data is output to the processor 508 for processing and then sent via the RF circuit 501 to, for example, another electronic device 500, or is output to the memory 502 for further processing. The audio circuit 506 may also include an earphone jack to provide communication between a peripheral earphone and the electronic device 500.

WiFi belongs to a short-distance wireless transmission technology, and the electronic equipment 500 can help a user to send and receive emails, browse webpages, access streaming media and the like through the WiFi module 507, so that wireless broadband Internet access is provided for the user. Although fig. 5 shows a WiFi module 507, it is understood that it does not belong to the necessary constitution of the electronic device 500, and may be omitted entirely as needed within a range that does not change the essence of the invention.

The processor 508 is the control center of the electronic device 500. It connects the various parts of the entire electronic device 500 using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or executing software programs and/or modules stored in the memory 502 and invoking data stored in the memory 502, thereby monitoring the electronic device 500 as a whole. Optionally, the processor 508 may include one or more processing cores; preferably, the processor 508 may integrate an application processor, which primarily handles the operating system, user interfaces, applications, and the like, with a modem processor, which primarily handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 508.

The electronic device 500 also includes a power supply 509 (e.g., a battery) for powering the various components, which may be logically connected to the processor 508 via a power management system that performs functions such as managing charge, discharge, and power consumption. The power supply 509 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power data indicator, and the like.

Although not shown, the electronic device 500 may further include a camera, a bluetooth module, etc., which will not be described herein. In particular, in this embodiment, the processor 508 in the electronic device 500 loads executable files corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and the processor 508 executes the application programs stored in the memory 502, so as to implement various functions:

collecting sample signals within a preset duration; counting signal values corresponding to each sample frame in the sample signals; calculating a background signal value corresponding to the sample signals based on the counted signal values; and, when a voice wake-up operation triggered for the target device is detected, waking up the target device according to the background signal value.

In the foregoing embodiments, each embodiment is described with its own emphasis; for portions of an embodiment that are not described in detail, reference may be made to the detailed description of the voice wake-up method above, which is not repeated herein.

As can be seen from the above, the electronic device 500 according to the embodiment of the present invention calculates the background signal value corresponding to the sample signals from the signal value of each sample frame, and wakes up the target device using the calculated background signal value. Because voice wake-up does not rely on an inertial filter or a scheme based on a neural network model, long-time standby wake-up does not consume a large amount of the device's computing power, which avoids device heating that would shorten the service life of the device.

To this end, embodiments of the present application further provide a storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor to perform the steps in the voice wakeup method described above.

The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.

The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.

The instructions stored in the storage medium can execute the steps of any voice wake-up method provided by the embodiments of the present invention, and can therefore achieve the beneficial effects of any such method; for details, see the foregoing embodiments, which are not repeated here.

The voice wake-up method, device, system and storage medium provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method and core idea of the present invention. Meanwhile, those skilled in the art may, in light of the ideas of the present invention, make changes to the specific embodiments and the scope of application; accordingly, the content of this description should not be construed as limiting the present invention.

Claims

1. A voice wake-up method, comprising:

collecting sample signals within a preset time period;

acquiring a historical signal value in historical time;

determining an initial signal value from the counted signal values;

calculating a first difference between the initial signal value and the historical signal value;

adjusting the historical signal value according to the first difference value to obtain an adjusted signal value;

sequentially calculating second difference values between other signal values except the initial signal value and the adjusted signal value, and further adjusting the adjusted signal value based on the second difference values until all sample frames in the sample signal are processed;

calculating the average value of all the adjusted signal values to obtain a background signal value corresponding to the sample signal;

2. The method of claim 1, wherein adjusting the historical signal value based on the first difference value results in an adjusted signal value, comprising:

when the first difference value is detected to be larger than a threshold value, calculating the sum of the historical signal value and a preset value to obtain an adjusted signal value;

and when the first difference value is detected to be smaller than a threshold value, calculating the difference between the historical signal value and a preset value to obtain an adjusted signal value.

3. The method according to claim 1 or 2, wherein when a voice wake-up operation triggered for a target device is detected, waking up the target device according to the background signal value comprises:

4. The method according to claim 1 or 2, further comprising:

5. A voice wakeup apparatus, comprising:

the calculation module is used for acquiring historical signal values in the historical time and determining initial signal values in the counted signal values; calculating a first difference between the initial signal value and the historical signal value; adjusting the historical signal value according to the first difference value to obtain an adjusted signal value; sequentially calculating second difference values between other signal values except the initial signal value and the adjusted signal value, and further adjusting the adjusted signal value based on the second difference values until all sample frames in the sample signal are processed; calculating the average value of all the adjusted signal values to obtain a background signal value corresponding to the sample signal;

6. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the voice wake-up method of any of claims 1 to 4 when the computer program is executed.

7. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the voice wake-up method according to any of claims 1 to 4.