CN110046045B - Voice wake-up data packet processing method and device

Info

Publication number: CN110046045B
Application number: CN201910268017.8A
Authority: CN
Inventors: 贺学焱; 陈建哲; 王兴
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Apollo Zhilian Beijing Technology Co Ltd
Priority date: 2019-04-03
Filing date: 2019-04-03
Publication date: 2021-07-30
Anticipated expiration: 2039-04-03
Also published as: CN110046045A

Abstract

The application provides a voice wake-up data packet processing method and device. The method comprises: acquiring the audio time length and the predicted processing time of the current voice data packet; generating a wake-up real-time rate of the wake-up engine according to the audio time length and the predicted processing time; detecting whether the wake-up real-time rate meets a preset data packet processing condition; if the wake-up real-time rate meets the preset data packet processing condition, adding the wake-up real-time rate to the current accumulated value to obtain a new current accumulated value; judging whether the new current accumulated value is greater than or equal to a preset processing threshold; and if the new current accumulated value is greater than or equal to the preset processing threshold, deleting the voice data packets in the current system from the current system. In this way, the wake-up engine actively deletes voice data packets based on wake-up real-time rate detection, which increases the utilization rate of the CPU, prevents wake-up stutter, and improves the user experience.

Description

Voice wake-up data packet processing method and device

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing a voice-awakened data packet.

Background

Generally, in some in-vehicle scenarios, hardware limitations leave processor resources extremely scarce, while the voice wake-up function has to run in the background for long periods. It is unavoidable that a user who is listening to music and navigating to a destination with a map will also want to use voice interaction. If other CPU-intensive functions are running on a platform with a limited processor, the wake-up engine cannot obtain processor resources and can only wait until they become idle and are allocated to it. After the user speaks the wake-up word, the engine has not yet started processing, so the system gives the user no feedback, and the user often repeats the wake-up word one or more times. When CPU resources become idle, the wake-up engine starts to process the accumulated voice data. Because the accumulated data can produce several wake-up results, the voice interaction system gives several responses once the CPU is idle, which causes voice wake-up stutter.

Content of application

The present application is directed to solving, at least to some extent, one of the technical problems in the related art described above.

Therefore, a first objective of the present application is to provide a voice wake-up data packet processing method that solves the technical problem in the prior art whereby the way voice wake-up data packets are processed causes wake-up stutter. With this method, the wake-up engine actively deletes voice data packets based on wake-up real-time rate detection, which increases the utilization rate of the CPU, prevents wake-up stutter, and improves the user experience.

A second objective of the present application is to provide a voice-awakening packet processing apparatus.

A third object of the present application is to propose a computer device.

A fourth object of the present application is to propose a non-transitory computer-readable storage medium.

To achieve the above object, a first aspect of the present application provides a voice-awakening packet processing method, including: acquiring the audio time length and the predicted processing time of the current voice data packet; generating a real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time; detecting whether the real-time awakening rate meets a preset data packet processing condition; if the awakening real-time rate meets the preset data packet processing condition, increasing the current accumulated value by the awakening real-time rate to be used as a new current accumulated value; judging whether the new current accumulated value is greater than or equal to a preset processing threshold value; and if the new current accumulated value is larger than or equal to the preset processing threshold value, deleting the voice data packet in the current system from the current system.

In addition, the voice wake-up data packet processing method in the embodiment of the present application further has the following additional technical features:

Optionally, the detecting whether the real-time wakeup rate meets a preset packet processing condition includes: setting a preset threshold value; and judging whether the real-time awakening rate is greater than the preset threshold value.

Optionally, after detecting whether the real-time wakeup rate meets a preset packet processing condition, the method further includes: if the awakening real-time rate does not meet the preset data packet processing condition, identifying the received voice data packet and clearing the current accumulated value.

Optionally, the generating a real-time wake-up rate of the wake-up engine according to the audio time length and the expected processing time includes: calculating a ratio of the estimated processing time to the audio time length; and generating the real-time awakening rate according to the ratio of the predicted processing time to the audio time length.

Optionally, the deleting the voice data packet in the current system from the current system includes: deleting all voice data packets in the current system from the current system; or sorting all voice data packets by receiving time and deleting the first N voice data packets in the sorted order.

To achieve the above object, a second aspect of the present application provides a voice-awakening packet processing apparatus, including: the acquisition module is used for acquiring the audio time length and the predicted processing time of the current voice data packet; the generating module is used for generating the real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time; the detection module is used for detecting whether the real-time awakening rate meets a preset data packet processing condition or not; the statistical module is used for increasing the current accumulated value by the awakening real-time rate to be used as a new current accumulated value if the awakening real-time rate meets the preset data packet processing condition; the judging module is used for judging whether the new current accumulated value is greater than or equal to a preset processing threshold value or not; and the processing module is used for deleting the voice data packet in the current system from the current system if the new current accumulated value is greater than or equal to a preset processing threshold value.

In addition, the voice awakening data packet processing device according to the embodiment of the present application further has the following additional technical features:

Optionally, the detection module is specifically configured to: set a preset threshold value; and judge whether the real-time awakening rate is greater than the preset threshold value.

Optionally, the apparatus further includes: an identification module, used for identifying the received voice data packet and clearing the current accumulated value if the awakening real-time rate does not meet the preset data packet processing condition.

Optionally, the generating module is specifically configured to: calculating a ratio of the estimated processing time to the audio time length; and generating the real-time awakening rate according to the ratio of the predicted processing time to the audio time length.

Optionally, the processing module is specifically configured to: delete all voice data packets in the current system from the current system; or sort all voice data packets by receiving time and delete the first N voice data packets in the sorted order.

To achieve the above object, a third aspect of the present application provides a computer device, including: a processor and a memory; the processor reads the executable program code stored in the memory to run a program corresponding to the executable program code, so as to implement the voice wake-up packet processing method according to the embodiment of the first aspect.

To achieve the above object, a fourth aspect of the present application provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the voice-awakened packet processing method according to the first aspect of the present application.

To achieve the above object, a fifth aspect of the present application provides a computer program product, where instructions of the computer program product, when executed by a processor, implement the voice-wake packet processing method according to the first aspect.

The technical scheme provided by the embodiment of the application can have the following beneficial effects:

The audio time length and the predicted processing time of the current voice data packet are obtained, and the wake-up real-time rate of the wake-up engine is generated according to the audio time length and the predicted processing time. Whether the wake-up real-time rate meets the preset data packet processing condition is detected; when it does, the wake-up real-time rate is added to the current accumulated value to obtain a new current accumulated value. Whether the new current accumulated value is greater than or equal to a preset processing threshold is then judged, and when it is, the voice data packets in the current system are deleted from the current system. In this way, the wake-up engine actively deletes voice data packets based on wake-up real-time rate detection, which increases the utilization rate of the CPU, prevents wake-up stutter, and improves the user experience.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flow chart of a packet processing method for voice wakeup according to the prior art of the present application;

FIG. 2 is a flow diagram of a method for voice wake packet processing according to one embodiment of the present application;

FIG. 3 is an exemplary diagram of voice wake-up packet processing according to one embodiment of the present application;

FIG. 4 is a block diagram of a voice-awakened packet processing device according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a voice-awakened packet processing apparatus according to another embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

The following describes a voice wake-up packet processing method and apparatus according to an embodiment of the present application with reference to the drawings.

Regarding the technical problem described in the background: when several CPU-intensive applications use CPU resources, their total processor usage on the same platform may reach 90%, while the wake-up engine needs more than 10% of the CPU to process voice data packets. In this case, resources are allocated to the wake-up engine for computation only after the CPU becomes idle. The user often tries to wake the system again when the first attempt fails, so the wake-up result is returned several times once the CPU is idle.

As shown in FIG. 1, when the user says the wake-up word "Xiaodu" while the system is under high load, the wake-up engine has to wait for the CPU to allocate resources; that is, it cannot process the "Xiaodu" spoken by the user. Because the system gives no feedback, the user says "Xiaodu" again. The CPU then starts to allocate resources to the wake-up engine, which starts processing, recognizes both utterances, and outputs two wake-up results, "I'm here" and "I'm here".

To solve these problems, the application provides a voice wake-up data packet processing method. The method calculates a wake-up real-time rate and detects whether the wake-up real-time rate meets a preset data packet processing condition. If it does, the wake-up real-time rate is added to the current accumulated value to obtain a new current accumulated value, and it is judged whether the new current accumulated value is greater than or equal to a preset processing threshold. If the new current accumulated value is greater than or equal to the preset processing threshold, the voice data packets in the current system are deleted from the current system.

Specifically, fig. 2 is a flowchart of a voice wake-up packet processing method according to an embodiment of the present application, and as shown in fig. 2, the method includes:

Step 101, obtaining the audio time length and the predicted processing time of the current voice data packet.

Step 102, generating a real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time.

Specifically, in the embodiment of the present application, whether the current system is in a state of high load of the CPU is determined by the wake real-time rate, that is, whether the current wake engine can apply for a CPU resource is determined by the wake real-time rate. The wake-up real-time rate is a value that is commonly used to measure the decoding speed of an automatic speech recognition system.

First, the audio time length and the predicted processing time of the current voice data packet are obtained, where the audio time length is the time needed to play the current voice data packet at normal speed, and the predicted processing time is the time needed to process the current voice data packet.

Further, the wake-up real-time rate of the wake-up engine is generated according to the audio time length and the predicted processing time. It can be understood that the wake-up real-time rate may be calculated directly as the ratio of the predicted processing time to the audio time length, as the ratio of the audio time length to the predicted processing time, or as the ratio of the difference between the predicted processing time and the audio time length to the audio time length, among other options.

For example, if processing a current voice data packet with an audio time length of a is expected to take b, the wake-up real-time rate is b/a. For instance, if processing a current voice data packet with an audio time length of 2 hours is expected to take 8 hours, the wake-up real-time rate is 8/2 = 4.
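The ratio in this example can be written out as a minimal Python sketch. The function name `wake_real_time_rate` and its parameter names are hypothetical and chosen only for illustration; the sketch covers just the first variant (predicted processing time divided by audio time length).

```python
def wake_real_time_rate(audio_length: float, processing_time: float) -> float:
    """Return b / a: predicted processing time divided by audio time length.

    Both arguments must use the same time unit. The name and signature are
    illustrative assumptions, not taken from the application.
    """
    if audio_length <= 0:
        raise ValueError("audio_length must be positive")
    return processing_time / audio_length


# The example from the text: audio length 2, predicted processing time 8.
print(wake_real_time_rate(audio_length=2.0, processing_time=8.0))  # 4.0
```

A value above 1 means the packet takes longer to process than to play back, so the engine is falling behind real time.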

Step 103, detecting whether the awakening real-time rate meets a preset data packet processing condition.

Step 104, if the awakening real-time rate meets the preset data packet processing condition, adding the awakening real-time rate to the current accumulated value to obtain a new current accumulated value.

Specifically, depending on how the wake-up real-time rate is calculated, it is checked against a correspondingly different preset data packet processing condition. As one possible implementation, the ratio of the predicted processing time to the audio time length is calculated, the wake-up real-time rate is generated from this ratio, a preset threshold is set, and it is judged whether the wake-up real-time rate is greater than the preset threshold.

For example, when the wake-up real-time rate is less than or equal to a preset threshold of 1, processing is in real time. When the wake-up real-time rate is greater than the preset threshold of 1, the wake-up engine is considered likely to suffer data congestion, either because the amount of computation is too large or because it cannot obtain CPU resources. Data packet processing is therefore required, and the wake-up real-time rate is added to the current accumulated value to obtain a new current accumulated value.

The current accumulated value may be 0 or some other value. It can be understood that a new wake-up real-time rate is generated each time a new voice data packet is received, and this rate is added to the current accumulated value to obtain the new current accumulated value. The new current accumulated value therefore keeps growing as more voice data packets are received.

It should be noted that, if the wake-up real-time rate does not satisfy the preset data packet processing condition, for example, when the wake-up real-time rate is less than or equal to the preset threshold 1, the recognition processing is real-time, that is, enough CPU resources can be applied to perform recognition processing on the voice data packet, so that the received voice data packet is recognized and the current accumulated value is cleared.
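The accumulate-or-clear behaviour of steps 103 and 104 can be sketched as follows. This is an illustrative sketch only: `update_accumulator` and `RATE_THRESHOLD` are hypothetical names, and the threshold of 1 is the example value used in the text.

```python
from typing import Tuple

RATE_THRESHOLD = 1.0  # example preset threshold from the text


def update_accumulator(accumulated: float, rate: float,
                       rate_threshold: float = RATE_THRESHOLD) -> Tuple[float, bool]:
    """Return (new accumulated value, recognize_now).

    recognize_now is True when the packet can be recognized in real time.
    """
    if rate > rate_threshold:
        # The engine is falling behind: add this packet's rate to the total.
        return accumulated + rate, False
    # The engine keeps up: recognize the packet and clear the total.
    return 0.0, True


print(update_accumulator(0.0, 4.0))   # (4.0, False)
print(update_accumulator(4.0, 2.5))   # (6.5, False)
print(update_accumulator(6.5, 0.8))   # (0.0, True)
```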

Step 105, judging whether the new current accumulated value is greater than or equal to a preset processing threshold value.

Step 106, if the new current accumulated value is greater than or equal to the preset processing threshold value, deleting the voice data packet in the current system from the current system.

It can be understood that, as time goes by, the number of voice data packets in the current system gradually increases, and so does the new current accumulated value. A preset processing threshold can be set to indicate that the voice data packets are congested to a degree that requires handling. It is then judged whether the new current accumulated value is greater than or equal to the preset processing threshold, and when it is, the voice data packets in the current system are deleted from the current system.

The voice data packets in the current system can be deleted in several ways. As one possible implementation, all voice data packets in the current system are deleted from the current system. As another possible implementation, all voice data packets are sorted by receiving time and the first N voice data packets in the sorted order are deleted; that is, a preset number of the earliest-received voice data packets may be deleted.

It is understood that the current accumulated value is 0 after all voice packets in the current system are deleted from the current system.
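The two deletion modes described above can be sketched in Python. The `VoicePacket` type, its `received_at` field, and the `drop_packets` function are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoicePacket:
    received_at: float  # receive timestamp (hypothetical field)
    payload: bytes


def drop_packets(packets: List[VoicePacket],
                 n: Optional[int] = None) -> List[VoicePacket]:
    """Return the packets that remain after deletion.

    n=None deletes every packet; otherwise the n earliest-received
    packets are deleted and the rest are kept in receive order.
    """
    if n is None:
        return []
    if n < 0:
        raise ValueError("n must be non-negative")
    ordered = sorted(packets, key=lambda p: p.received_at)
    return ordered[n:]


queue = [VoicePacket(3.0, b"c"), VoicePacket(1.0, b"a"), VoicePacket(2.0, b"b")]
print(drop_packets(queue))                              # []
print([p.payload for p in drop_packets(queue, n=2)])    # [b'c']
```

Deleting only the earliest N packets keeps the most recent audio, at the cost of leaving some backlog in the system; the text does not state which mode is preferred.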

Specifically, when the wake-up engine has waited for CPU resource allocation for longer than a certain time, the CPU of the current system is considered to be in a high-load, unresponsive state. If the wake-up engine keeps waiting, the voice data packets will inevitably become congested. By the time the wake-up engine has enough CPU resources, the voice data packets it holds are no longer real-time packets but packets that date back to the earlier period of high CPU load. The wake-up engine would then start processing from that earlier period even though the user has already started a new wake-up attempt, which inevitably causes wake-up stutter.

Therefore, when the wake-up engine cannot obtain processor resources, it actively discards voice data packets in real time. Once the wake-up engine can obtain CPU resources, it starts processing voice data packets in real time from the moment the resources are granted, which prevents wake-up stutter. As a scenario example, as shown in FIG. 3: in step 201, the wake-up engine is turned on and starts to receive the voice data packets input by the user, that is, audio data with a size of 512 bytes; in step 202, the current wake-up Real Time Factor (RTF) is calculated; in step 203, when the current wake-up real-time factor is greater than the preset threshold of 1, the engine waits for the CPU to allocate resources, and the real-time factor is added to the current accumulated value to obtain a new current accumulated value; in step 204, when the new current accumulated value is greater than or equal to the preset processing threshold, for example 30, all voice data packets in the system are deleted from the current system. The voice data packets are thus deleted actively, so that the utilization rate of the CPU is increased.

To sum up, in the voice wake-up data packet processing method according to the embodiment of the present application, the audio time length and the predicted processing time of the current voice data packet are obtained, and the wake-up real-time rate of the wake-up engine is generated according to the audio time length and the predicted processing time. Whether the wake-up real-time rate meets the preset data packet processing condition is detected; when it does, the wake-up real-time rate is added to the current accumulated value to obtain a new current accumulated value. Whether the new current accumulated value is greater than or equal to the preset processing threshold is then judged, and when it is, the voice data packets in the current system are deleted from the current system. In this way, the wake-up engine actively deletes voice data packets based on wake-up real-time rate detection, which increases the utilization rate of the CPU, prevents wake-up stutter, and improves the user experience.
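The scenario of steps 201 to 204 can be simulated with a short Python sketch. It is an illustration under stated assumptions, not the implementation of the application: `run_wake_loop` is a hypothetical name, each packet is represented only by its audio length and predicted processing time, and the backlog counter is assumed to reset once recognition proceeds in real time. The thresholds 1 and 30 are the example values from the text.

```python
from typing import Iterable, List, Tuple

RATE_THRESHOLD = 1.0         # preset threshold used in the scenario
PROCESSING_THRESHOLD = 30.0  # preset processing threshold used in the scenario


def run_wake_loop(packets: Iterable[Tuple[float, float]]) -> List[str]:
    """Process (audio_length, predicted_processing_time) pairs; log each action."""
    accumulated = 0.0
    backlog = 0  # packets received but not yet processed
    log: List[str] = []
    for audio_length, processing_time in packets:
        backlog += 1                                 # step 201: packet received
        rate = processing_time / audio_length        # step 202: real-time factor
        if rate > RATE_THRESHOLD:                    # step 203: engine is behind
            accumulated += rate
            if accumulated >= PROCESSING_THRESHOLD:  # step 204: discard backlog
                log.append(f"drop {backlog}")
                backlog = 0
                accumulated = 0.0
            else:
                log.append("wait")
        else:
            # Assumption: the backlog counter resets once the engine keeps up.
            log.append("recognize")
            backlog = 0
            accumulated = 0.0
    return log


print(run_wake_loop([(1.0, 10.0)] * 3 + [(1.0, 0.5)]))
# ['wait', 'wait', 'drop 3', 'recognize']
```

In this run, three slow packets each contribute a rate of 10, the accumulated value reaches 30 on the third packet, the backlog is discarded, and the next packet is recognized in real time.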

In order to implement the above embodiments, the present application further provides a voice wake-up packet processing apparatus. Fig. 4 is a schematic structural diagram of a voice-awakened packet processing apparatus according to an embodiment of the present application, and as shown in fig. 4, the voice-awakened packet processing apparatus includes: an acquisition module 10, a generation module 20, a detection module 30, a statistics module 40, a judgment module 50 and a processing module 60, wherein,

The acquisition module 10 is used for obtaining the audio time length and the predicted processing time of the current voice data packet.

The generating module 20 is configured to generate a real-time wake-up rate of the wake-up engine according to the audio time length and the expected processing time.

The detecting module 30 is configured to detect whether the wake-up real-time rate meets a preset data packet processing condition.

The statistical module 40 is configured to increase the current accumulated value by the wakeup real-time rate to serve as a new current accumulated value if the wakeup real-time rate meets a preset data packet processing condition.

The judging module 50 is used for judging whether the new current accumulated value is greater than or equal to a preset processing threshold value.

The processing module 60 is configured to delete the voice data packet in the current system from the current system if the new current accumulated value is greater than or equal to a preset processing threshold value.

In an embodiment of the present application, the detection module 30 is specifically configured to: setting a preset threshold value; and judging whether the awakening real-time rate is greater than a preset threshold value.

In an embodiment of the present application, as shown in FIG. 5, on the basis of FIG. 4, the apparatus further includes an identification module 70.

The identification module 70 is configured to, if the wakeup real-time rate does not meet the preset data packet processing condition, perform identification processing on the received voice data packet, and clear the current accumulated value.

In an embodiment of the present application, the generating module 20 is specifically configured to: calculating the ratio of the estimated processing time to the audio time length; and generating the real-time awakening rate according to the ratio of the predicted processing time to the audio time length.

In an embodiment of the present application, the processing module 60 is specifically configured to: delete all voice data packets in the current system from the current system; or sort all voice data packets by receiving time and delete the first N voice data packets in the sorted order.

It should be noted that the foregoing explanation of the voice awakening packet processing method embodiment is also applicable to the voice awakening packet processing apparatus of this embodiment, and details are not described here.
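One way to picture how modules 10 to 70 cooperate is the following Python sketch, in which each module is mapped to one method of a single class. The class, method, and field names are hypothetical, the default thresholds are the example values from the method embodiment, and only the "delete all packets" mode is shown. What happens to packets that are still buffered when a later packet is recognized in real time is not specified in the text; this sketch leaves them untouched.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Packet:
    audio_length: float     # playback duration of the packet
    processing_time: float  # predicted time needed to process it


@dataclass
class WakePacketProcessor:
    rate_threshold: float = 1.0
    processing_threshold: float = 30.0
    accumulated: float = 0.0
    buffered: List[Packet] = field(default_factory=list)
    recognized: List[Packet] = field(default_factory=list)

    def acquire(self, packet: Packet) -> Tuple[float, float]:  # module 10
        return packet.audio_length, packet.processing_time

    def generate_rate(self, audio_length: float,
                      processing_time: float) -> float:        # module 20
        return processing_time / audio_length

    def detect(self, rate: float) -> bool:                     # module 30
        return rate > self.rate_threshold

    def accumulate(self, rate: float) -> None:                 # module 40
        self.accumulated += rate

    def judge(self) -> bool:                                   # module 50
        return self.accumulated >= self.processing_threshold

    def delete_all(self) -> None:                              # module 60
        self.buffered.clear()
        self.accumulated = 0.0

    def identify(self, packet: Packet) -> None:                # module 70
        self.recognized.append(packet)
        self.accumulated = 0.0

    def handle(self, packet: Packet) -> None:
        """Run one packet through the modules in order."""
        rate = self.generate_rate(*self.acquire(packet))
        if self.detect(rate):
            self.buffered.append(packet)
            self.accumulate(rate)
            if self.judge():
                self.delete_all()
        else:
            self.identify(packet)
```

For example, with `processing_threshold=5.0`, two packets of rate 3 are buffered and then discarded together when the accumulated value reaches 6.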

To sum up, the voice wake-up data packet processing apparatus according to the embodiment of the present application obtains the audio time length and the predicted processing time of the current voice data packet and generates the wake-up real-time rate of the wake-up engine according to the audio time length and the predicted processing time. It detects whether the wake-up real-time rate meets the preset data packet processing condition; when it does, the wake-up real-time rate is added to the current accumulated value to obtain a new current accumulated value. The apparatus then judges whether the new current accumulated value is greater than or equal to the preset processing threshold, and when it is, deletes the voice data packets in the current system from the current system. In this way, the wake-up engine actively deletes voice data packets based on wake-up real-time rate detection, which increases the utilization rate of the CPU, prevents wake-up stutter, and improves the user experience.

In order to implement the foregoing embodiments, the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method for processing packets of voice wakeup as described in the foregoing embodiments is implemented.

In order to implement the above embodiments, the present application also proposes a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the voice-awakened packet processing method as described in the foregoing method embodiments.

In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

Those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a program that instructs the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A voice wake-up data packet processing method, characterized by comprising the following steps:

acquiring the audio time length and the predicted processing time of the current voice data packet;

generating a real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time;

detecting whether the real-time awakening rate meets a preset data packet processing condition;

if the real-time awakening rate meets the preset data packet processing condition, adding the real-time awakening rate to the current accumulated value to obtain a new current accumulated value;

judging whether the new current accumulated value is greater than or equal to a preset processing threshold value;

and if the new current accumulated value is greater than or equal to the preset processing threshold value, deleting the voice data packet in the current system from the current system.
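
As a reading aid only, and not part of the claim language, the steps of claim 1 might be sketched as follows. The class name `WakeupMonitor`, the parameter names, the default threshold values, and the use of milliseconds are hypothetical. The sketch also assumes that the preset condition is "rate greater than a threshold" (the option recited in claim 2) and that the rate equals the ratio recited in claim 4.

```python
from collections import deque


class WakeupMonitor:
    """Hypothetical sketch of the check described in claim 1."""

    def __init__(self, rate_threshold=1.0, processing_threshold=3.0):
        # Both defaults are assumed example values, not taken from the patent.
        self.rate_threshold = rate_threshold
        self.processing_threshold = processing_threshold
        self.accumulated = 0.0   # the "current accumulated value"
        self.queue = deque()     # voice data packets currently in the system

    def on_packet(self, packet, audio_ms, predicted_ms):
        """Queue one packet; return True if the queue was emptied."""
        self.queue.append(packet)
        # Real-time rate: predicted processing time relative to audio length.
        rate = predicted_ms / audio_ms
        if rate > self.rate_threshold:
            # Condition met: add the rate to the accumulated value.
            self.accumulated += rate
            if self.accumulated >= self.processing_threshold:
                # Delete the voice data packets held in the system.
                # Claim 1 does not say whether the accumulated value is
                # reset here; this sketch leaves it unchanged.
                self.queue.clear()
                return True
        return False
```

With the assumed defaults, two consecutive packets of 160 ms audio that are each predicted to take 240 ms to process give a rate of 1.5 each, so the accumulated value reaches 3.0 on the second packet and the queue is emptied.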

2. The method of claim 1, wherein the detecting whether the real-time awakening rate meets a preset data packet processing condition comprises:

setting a preset threshold value;

and judging whether the real-time awakening rate is greater than the preset threshold value.

3. The method of claim 1, wherein after the detecting whether the real-time awakening rate meets a preset data packet processing condition, the method further comprises:

if the real-time awakening rate does not meet the preset data packet processing condition, recognizing the received voice data packet and clearing the current accumulated value.
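
The branch added by claim 3 might look like the following sketch, which is illustrative and not part of the claim. The function name `handle_packet`, the `recognize` callback, and the default threshold are hypothetical, and the condition is again assumed to be "rate greater than a threshold".

```python
def handle_packet(packet, rate, accumulated, recognize, rate_threshold=1.0):
    """Return the updated accumulated value for one packet (hypothetical)."""
    if rate > rate_threshold:
        # Condition met: keep accumulating, as in claim 1. What else is
        # done with the packet in this branch is outside this sketch.
        return accumulated + rate
    # Condition not met: recognize the packet and clear the accumulator.
    recognize(packet)
    return 0.0
```

Passing the accumulated value in and out keeps the helper free of hidden state, so the reset behavior is easy to check in isolation.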

4. The method of claim 1, wherein the generating a real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time comprises:

calculating a ratio of the predicted processing time to the audio time length;

and generating the real-time awakening rate according to the ratio of the predicted processing time to the audio time length.
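
As an illustrative note on claim 4 (not part of the claim), the ratio can be written as below. The symbols R, T_proc, and T_audio and the numeric values are assumptions chosen for the example. The claim leaves open how the rate is derived from the ratio; this sketch simply takes the rate to equal the ratio.

```latex
\[
R = \frac{T_{\mathrm{proc}}}{T_{\mathrm{audio}}},
\qquad
\text{e.g. } T_{\mathrm{audio}} = 160\ \mathrm{ms},\;
T_{\mathrm{proc}} = 240\ \mathrm{ms}
\;\Rightarrow\; R = \frac{240}{160} = 1.5 .
\]
```

Under this reading, R > 1 means the engine is predicted to need longer to process the packet than the audio in the packet lasts.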

5. The method of claim 1, wherein the deleting the voice data packet in the current system from the current system comprises:

deleting all voice data packets in the current system from the current system; or

sorting all the voice data packets by receiving time, and deleting the first N voice data packets in the sorted order.
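
The two deletion options of claim 5 might be sketched as follows (illustrative only, not part of the claim). Representing a packet as a dict with a `received_at` field is hypothetical, and sorting in ascending order, so that the earliest packets are deleted first, is an assumption; the claim does not state the sort direction.

```python
def delete_all(packets):
    """Option 1: delete every voice data packet in the system."""
    return []


def delete_first_n(packets, n):
    """Option 2: sort by receiving time and delete the first n packets."""
    # Ascending order (earliest first) is an assumption of this sketch.
    ordered = sorted(packets, key=lambda p: p["received_at"])
    return ordered[n:]
```

Both helpers return the packets that remain rather than mutating the input, which keeps the two options directly comparable.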

6. A voice wake-up data packet processing device, characterized by comprising:

the acquisition module is used for acquiring the audio time length and the predicted processing time of the current voice data packet;

the generating module is used for generating the real-time awakening rate of the awakening engine according to the audio time length and the predicted processing time;

the detection module is used for detecting whether the real-time awakening rate meets a preset data packet processing condition or not;

the statistical module is used for adding the real-time awakening rate to the current accumulated value to obtain a new current accumulated value if the real-time awakening rate meets the preset data packet processing condition;

the judging module is used for judging whether the new current accumulated value is greater than or equal to a preset processing threshold value or not;

and the processing module is used for deleting the voice data packet in the current system from the current system if the new current accumulated value is greater than or equal to a preset processing threshold value.

7. The apparatus of claim 6, wherein the detection module is specifically configured to:

setting a preset threshold value;

and judging whether the real-time awakening rate is greater than the preset threshold value.

8. The apparatus of claim 6, further comprising:

the recognition module is used for recognizing the received voice data packet and clearing the current accumulated value if the real-time awakening rate does not meet the preset data packet processing condition.

9. The apparatus of claim 6, wherein the generation module is specifically configured to:

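calculating a ratio of the predicted processing time to the audio time length;

and generating the real-time awakening rate according to the ratio of the predicted processing time to the audio time length.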

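10. The apparatus of claim 6, wherein the processing module is specifically configured to:

deleting all voice data packets in the current system from the current system; or

sorting all the voice data packets by receiving time, and deleting the first N voice data packets in the sorted order.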

11. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the voice wake-up data packet processing method according to any one of claims 1 to 5.

12. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the voice wake-up data packet processing method according to any one of claims 1 to 5.