CN111464644A

CN111464644A - Data transmission method and electronic equipment

Info

Publication number: CN111464644A
Application number: CN202010250839.6A
Authority: CN
Inventors: 袁路路; 李智勇; 常乐
Original assignee: Beijing SoundAI Technology Co Ltd
Current assignee: Beijing SoundAI Technology Co Ltd
Priority date: 2020-04-01
Filing date: 2020-04-01
Publication date: 2020-07-28
Anticipated expiration: 2040-04-01
Also published as: CN111464644B

Abstract

The invention relates to the technical field of communication and provides a data transmission method and an electronic device, to solve the problem that data is easily lost during data upload. The method comprises: detecting whether the user has finished the voice input of the current round; and, when it is detected that the user has finished the voice input of the current round, uploading the first group of audio data of the current round to a server if the electronic device meets a preset uploading condition. That is, once the voice input of the current round has finished, the first group of audio data of the current round can be uploaded to the server as long as at least one of the following holds: the electronic device is in a state to be awakened, the uplink bandwidth of the electronic device is idle, there is audio data that has not been uploaded, and the upload service for uploading data is active and idle. Because the electronic device can store only a limited amount of voice data and clears the earliest stored data once the stored amount exceeds the preset storable amount, uploading in this way reduces such data loss and improves the integrity of the uploaded data.

Description

Data transmission method and electronic equipment

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a data transmission method and an electronic device.

Background

With the continuous development of intelligent technology, a wide variety of smart products have emerged, and their functions have become increasingly powerful, bringing great convenience to users' life and work. For example, a user can carry out voice interaction through an electronic device, and the electronic device can upload the collected voice data input by the user to a server.

However, so as not to affect normal interaction, the collected voice data is first stored and then uploaded at a preset time. This places high demands on the memory and other hardware of the electronic device. The amount of voice data the electronic device can store is limited, and when the data volume exceeds the preset storable amount, the earliest stored data is cleared and can no longer be uploaded to the server. In other words, data is easily lost during data upload.

Disclosure of Invention

The embodiment of the invention provides a data transmission method and electronic equipment, and aims to solve the problem that data is easy to lose in the existing data uploading process.

In order to solve the technical problem, the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a data transmission method, where the method includes:

detecting whether the user finishes the voice input of the current round;

under the condition that the user finishes the voice input of the current round, if the electronic equipment meets a preset uploading condition, uploading a first group of audio data of the current round to a server;

wherein the preset uploading condition comprises at least one of the following items:

the electronic equipment is in a state to be awakened;

the uplink bandwidth is idle;

there is audio data that is not uploaded;

the upload service for uploading data is in an active and idle state.

In a second aspect, an embodiment of the present invention further provides an electronic device, including:

the detection module is used for detecting whether the user finishes the voice input of the current round;

the uploading module is used for uploading a first group of audio data of the current round to a server if the electronic equipment meets a preset uploading condition, under the condition that it is detected that the user has finished the voice input of the current round;

wherein the preset uploading condition comprises at least one of the following items:

the electronic equipment is in a state to be awakened;

the uplink bandwidth is idle;

there is audio data that is not uploaded;

the upload service for uploading data is in an active and idle state.

In a third aspect, an embodiment of the present invention further provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the data transmission method as described above when executing the computer program.

In a fourth aspect, the embodiment of the present invention further provides a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the data transmission method as described above.

In the data transmission method of this embodiment, when it is detected that the user has finished the voice input of the current round, the first group of audio data of the current round is uploaded to the server if the electronic device meets the preset uploading condition. That is, once the voice input of the current round has finished, the first group of audio data of the current round can be uploaded to the server as long as at least one of the following holds: the electronic device is in a state to be awakened, the uplink bandwidth of the electronic device is idle, there is audio data that has not been uploaded, and the upload service for uploading data is active and idle. There is no need to wait for a preset time and upload all stored data at once. Because the electronic device can store only a limited amount of voice data and clears the earliest stored data once the preset storable amount is exceeded, this reduces such data loss and improves the integrity of the data uploaded to the server.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.

Fig. 1 is a flowchart of a data transmission method according to an embodiment of the present invention;

fig. 2 is a schematic diagram of data storage in the data transmission method according to the embodiment of the present invention;

fig. 3 is a second flowchart of a data transmission method according to an embodiment of the present invention;

fig. 4 is a schematic block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, fig. 1 is a flowchart of a data transmission method provided by an embodiment of the present invention, where the method is applicable to an electronic device, and as shown in fig. 1, the method includes the following steps:

Step 101: detect whether the user has finished the voice input of the current round.

The user can input voice at the electronic device and can carry out multiple rounds of voice input with it; a corresponding first group of audio data is kept after each round of voice input. It should be noted that, in each round of voice input, a wake-up voice is first input to the electronic device, for example a wake-up voice containing a preset keyword (such as the keyword "XX"), so as to wake up the electronic device. After being woken up, the electronic device may output a prompt message to tell the user that it has been woken up, for example an "on wording" prompt message. After the electronic device is woken up, the user inputs voice data to it, and once that input is finished, the voice input of the current round has ended. For example, the user inputs the voice "play music a", and the voice input of this round ends after that voice has been input. The electronic device applies noise reduction to the voice data and sends it to the server; the server recognizes the noise-reduced voice data, responds according to the recognition result, and returns the corresponding response information to the electronic device, which outputs the response information after receiving it, thereby realizing voice interaction. For example, the voice "play music a" input by the user is sent to the server after noise reduction; in response, the server sends the source data of music a to the electronic device, which plays music a.

Step 102: under the condition that the user finishes the voice input of the current round, if the electronic equipment meets the preset uploading condition, the first group of audio data of the current round is uploaded to the server.

When it is detected that the user has finished the voice input of the current round, it is necessary to determine whether the electronic device meets the preset uploading condition; if it does, the first group of audio data generated during the voice input of the current round is uploaded to the server. The preset uploading condition comprises at least one of the following items: the electronic device is in a state to be awakened; the uplink bandwidth is idle; there is audio data that is not uploaded; the upload service for uploading data is in an active and idle state.

The electronic device being in a state to be awakened can be understood as the electronic device not having been woken up: in this state, even if a user inputs voice, the electronic device can collect the voice but does not give a response. The state to be awakened is the opposite of the awake state. In the awake state, the electronic device can not only collect the voice input by the user but also respond to it, specifically by outputting the response information returned by the server. If the preset uploading condition includes that the electronic device is in a state to be awakened, data can be uploaded only while the device is in that state. Voice interaction takes place in the awake state, so uploading data only in the state to be awakened does not affect normal voice interaction. When the uplink bandwidth of the electronic device is idle, normal uploading of data can be ensured.

Regarding the existence of audio data that is not uploaded, audio data is uploaded one segment at a time. For the audio data generated during the voice input of the current round, at the beginning the first group of audio data of the current round has not been uploaded, which means that audio data that is not uploaded exists; at this point, the audio data that is not uploaded includes the first group of audio data of the current round. If the first group of audio data of the current round includes multiple segments, for example a segment collected before wake-up and a segment collected after wake-up, the segments are uploaded one by one. For example, the audio data collected before wake-up is uploaded first; after that upload is completed, audio data that is not uploaded still exists, namely the audio data collected after wake-up, which is then uploaded next. Regarding the upload service being in an active and idle state, the upload service is the service used for uploading data, specifically the service used for uploading the first group of audio data; requiring it to be active and idle improves the smoothness of data uploading.
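The condition check described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `UploadCondition`, `DeviceState` and `meets_upload_condition` are hypothetical, and the sketch assumes that whichever subset of the four items is configured as the preset uploading condition must hold in full before uploading.

```python
from dataclasses import dataclass
from enum import Flag, auto


class UploadCondition(Flag):
    """The four candidate conditions (illustrative names)."""
    NONE = 0
    DEVICE_WAITING_FOR_WAKE = auto()   # device is in the state to be awakened
    UPLINK_IDLE = auto()               # uplink bandwidth is idle
    HAS_PENDING_AUDIO = auto()         # some audio data is not yet uploaded
    SERVICE_ACTIVE_AND_IDLE = auto()   # upload service is active and idle


@dataclass
class DeviceState:
    """Hypothetical snapshot of the device at the moment of the check."""
    awake: bool
    uplink_busy: bool
    pending_groups: int
    service_active: bool
    service_busy: bool


def satisfied_conditions(state: DeviceState) -> UploadCondition:
    """Return the set of conditions that currently hold."""
    result = UploadCondition.NONE
    if not state.awake:
        result |= UploadCondition.DEVICE_WAITING_FOR_WAKE
    if not state.uplink_busy:
        result |= UploadCondition.UPLINK_IDLE
    if state.pending_groups > 0:
        result |= UploadCondition.HAS_PENDING_AUDIO
    if state.service_active and not state.service_busy:
        result |= UploadCondition.SERVICE_ACTIVE_AND_IDLE
    return result


def meets_upload_condition(state: DeviceState, required: UploadCondition) -> bool:
    """True when every condition in `required` holds for `state`."""
    return (satisfied_conditions(state) & required) == required
```

Representing the conditions as flags keeps the configured subset in a single value, so a deployment that only cares about, say, the uplink being idle can pass just that flag.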

In one example, after uploading the first set of audio data of the current round to the server, the method may further include: deleting the first set of audio data of the current round. This prevents audio data that has already been uploaded from occupying the storage space of the electronic device, and thus saves storage space.

In one example, when it is detected that the user has finished the voice input of the current round, if the electronic device also holds audio data of a target round that is not uploaded, the non-uploaded audio data of the target round and the first group of audio data of the current round are uploaded to the server when the electronic device meets the preset uploading condition. The target round is a round before the current round. That the electronic device also holds non-uploaded audio data of the target round means that some audio data in the group of audio data of the target round has not been uploaded. As one example, the non-uploaded audio data of the target round and the first group of audio data of the current round are uploaded to the server in chronological order of acquisition.
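As a rough illustration of the two examples above (deleting a group once it is uploaded, and uploading leftover data from an earlier round before the current round in chronological order), consider the following sketch. The `AudioGroup` type and the `can_upload` and `send` callbacks are assumptions introduced here, not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(eq=False)
class AudioGroup:
    """One stored group of audio data (hypothetical structure)."""
    dialog_id: str
    captured_at: float   # acquisition time in seconds
    payload: bytes


def upload_pending(groups: List[AudioGroup],
                   can_upload: Callable[[], bool],
                   send: Callable[[AudioGroup], bool]) -> List[str]:
    """Upload stored groups oldest first and delete each one once sent.

    Stops as soon as the upload condition no longer holds or a send fails,
    leaving the remaining groups in `groups` for a later attempt.
    """
    uploaded = []
    for group in sorted(groups, key=lambda g: g.captured_at):
        if not can_upload() or not send(group):
            break
        groups.remove(group)          # free storage right after the upload
        uploaded.append(group.dialog_id)
    return uploaded
```

Removing each group immediately after a successful send means that an interruption part-way through leaves only the data that still needs uploading.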

In one embodiment, the first set of audio data comprises at least one of first audio data and second audio data;

the first audio data is audio data collected by the electronic device within a first preset duration before a first moment; the first moment is the moment at which the electronic device switches to the awake state based on a first wake-up voice input by the user; the first audio data includes the first wake-up voice; the second audio data includes audio data collected from the first moment to a second moment; and the second moment is the moment at which the user finishes the voice input.

The first audio data may be understood as first original audio data, that is, a segment of audio input by the user and collected before wake-up; the second audio data may be understood as second original audio data, that is, a segment of audio input by the user and collected after wake-up. When it is detected that the user has finished the voice input of the current round, if the electronic device meets the preset uploading condition, at least one of the first audio data and the second audio data can be uploaded to the server, so that the server can, among other things, optimize its voice recognition method based on the uploaded data. For example, during wake-up the server recognizes the wake-up voice; if the recognition result indicates that the preset keyword has been recognized, the server sends a wake-up instruction to the electronic device, and the electronic device switches to the awake state after receiving the instruction. Therefore, after the server optimizes the voice recognition method according to at least one of the first audio data and the second audio data, false wake-ups can be reduced and recognition accuracy improved. As one example, the audio data collected between the first moment and the second moment includes second voice data input by the user between the first moment and the second moment.

In one embodiment, the first set of audio data includes first audio data and second audio data;

if the electronic equipment meets the preset uploading condition, uploading the first group of audio data of the current round to a server, and the method comprises the following steps: and if the electronic equipment meets the preset uploading condition, uploading the first audio data and the second audio data to the server according to the sequence of the acquisition time.

In other words, in this embodiment, the first audio data and the second audio data are uploaded one after the other in order of acquisition time, which reduces errors in uploading the first audio data and the second audio data and improves transmission accuracy.

In one embodiment, if the electronic device meets a preset uploading condition, uploading the first audio data and the second audio data to the server according to the sequence of the acquisition time, including:

uploading the first audio data to a server under the condition that the electronic equipment meets a preset uploading condition;

after the first audio data are uploaded, the second audio data are uploaded to the server under the condition that the electronic equipment meets the preset uploading condition.

In this embodiment, when the electronic device meets the preset uploading condition, the first audio data is uploaded to the server, and after the uploading of the first audio data is completed, the second audio data is uploaded to the server if the electronic device still meets the preset uploading condition. That is, the preset uploading condition must be satisfied before each data transmission, so as to avoid affecting normal voice interaction and/or to improve the accuracy of data uploading.

In one embodiment, before the step of uploading the first group of audio data of the current round to the server if the electronic device meets the preset uploading condition when it is detected that the user has finished the voice input, the method further includes: when the electronic device is in the state to be awakened and receives a first wake-up voice input by the user, switching to the awake state based on the first wake-up voice; and, when the electronic device is in the awake state, if no voice data input by the user is received within a second preset duration after second voice data input by the user has been received, determining that the user has finished the voice input, wherein the second audio data includes the second voice data.

That is, in this embodiment, whether the user finishes the voice input is determined by detecting whether the voice data input by the user is received within a second preset time period after the second voice data input by the user is received in the awake state, and if the voice data input by the user is not received within the second preset time period after the second voice data input by the user is received, it indicates that the user finishes the voice input of the current round.

It should be noted that the moment at which it is detected that the user has finished the voice input can be understood as the moment at which the user is determined to have finished the voice input. As an example, when the electronic device is in the awake state, if second voice data input by the user is received within a third preset duration after the first moment, and no further voice data input by the user is received within the second preset duration after that, it is determined that the user has finished the voice input, where the third preset duration is less than the duration between the first moment and the second moment. As one example, the second preset duration may range from 300 milliseconds to 500 milliseconds.
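The end-of-input decision described above amounts to a silence timeout. The following sketch shows one way it might look, assuming that some voice activity detector already labels each audio frame as containing voice or not; the class name `EndOfSpeechDetector`, the `has_voice` flag and the 0.4 s default (chosen from within the 300 to 500 millisecond range mentioned above) are illustrative assumptions.

```python
from typing import Optional


class EndOfSpeechDetector:
    """Declares the round finished after a quiet gap that follows speech."""

    def __init__(self, silence_timeout_s: float = 0.4) -> None:
        # Plays the role of the "second preset duration".
        self.silence_timeout_s = silence_timeout_s
        self._last_voice_at: Optional[float] = None

    def on_frame(self, timestamp_s: float, has_voice: bool) -> bool:
        """Feed one frame; return True once the user has finished speaking."""
        if has_voice:
            self._last_voice_at = timestamp_s
            return False
        if self._last_voice_at is None:
            return False   # no voice data received yet in this round
        return timestamp_s - self._last_voice_at >= self.silence_timeout_s
```

A caller would create one detector when the device switches to the awake state and feed it frames until `on_frame` returns True, at which point the device can switch back to the state to be awakened.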

In an example, after determining that the user has finished the voice input because no voice data input by the user was received within the second preset duration after the second voice data was received in the awake state, the method may further include: switching the electronic device to the state to be awakened. This allows the first set of audio data to be uploaded subsequently.

The process of the above data transmission method is described below with a specific embodiment.

The electronic device of the embodiment of the invention may be a smart speaker and has two upload processes, that is, it includes two upload services. The first upload service supports voice interaction. When the electronic device is in the state to be awakened, it collects the wake-up voice input by the user, applies noise reduction, and uploads the result to the server. The server recognizes the noise-reduced wake-up voice and returns corresponding response information, which the electronic device outputs after receiving it. If the server recognizes the preset keyword in the wake-up voice, the response information sent to the electronic device includes a prompt message and a wake-up instruction, where the wake-up instruction is used to wake up the electronic device; after receiving the response information, the electronic device plays the prompt message and switches to the awake state. In the awake state, the electronic device receives second voice data input by the user, applies noise reduction, and uploads the result to the server; the server recognizes the noise-reduced second voice data and returns corresponding response information, which the electronic device outputs after receiving it. For example, the user inputs the second voice data "play music a", which is uploaded to the server after noise reduction; the response information given by the server may include the source data of music a, and the electronic device plays the source data after receiving it, that is, music a is played. This process realizes the user's normal voice interaction.

In addition to the voice interaction process, the electronic device includes another upload service, namely the upload service referred to in the preset uploading condition of the data transmission method. When the preset uploading condition is satisfied, the first group of audio data can be uploaded to the server through this upload service, which realizes the data uploading of the data transmission method in each embodiment. The electronic device can collect the wake-up voice input by the user in the state to be awakened and upload the first audio data, which includes the wake-up voice, to the server without noise reduction, so that the server can, for example, optimize its voice recognition method based on the first audio data. In the awake state, the electronic device receives second voice data input by the user and uploads the second audio data, which includes the second voice data, to the server without noise reduction; the second audio data can likewise be used by the server to optimize its voice recognition method. In the embodiment of the present invention, if the preset uploading condition includes that the electronic device is in the state to be awakened, data transmission is performed only when it is detected that the user has finished the voice input of the current round and the electronic device is in the state to be awakened. This prevents the transmission of the first audio data and the second audio data from taking place in the awake state and affecting normal voice interaction.

As shown in fig. 2, while the electronic device is idle, the latest 1 s of audio data is retained. If it is detected that the electronic device is woken up (corresponding to WakeUp in fig. 2), a unique identifier (dialogId) is generated to identify the voice input of the current round, and the latest 1 s of audio data is saved as the original wake-up data (that is, the first audio data, which includes the wake-up voice input by the user). In fig. 2, "the user starts to awaken" means that the user inputs the wake-up voice to wake up the electronic device. After wake-up, the subsequent audio data is stored as the original ASR data (that is, the second audio data) until VAD end is detected (that is, until it is detected that the user has finished the voice of the current round). In fig. 2, the user starts speaking at VAD start after wake-up. By default, at most 5 s of audio data is stored; if the user is still speaking once 5 s has been reached, the earliest segment of the original ASR data of the current round is deleted so that the stored data keeps being updated. The original wake-up data and the original ASR data of the current round are stored as one group of audio data (that is, the first group of audio data) corresponding to the identifier, and the upload service is notified to upload the data after the user's voice input has finished.
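The buffering scheme of fig. 2 can be approximated with two bounded queues. The sketch below is only one possible reading: the class `RoundRecorder`, its method names, the 20 ms frame size and the use of a random UUID as the dialogId are assumptions made for illustration, while the 1 s and 5 s defaults come from the description above.

```python
import uuid
from collections import deque
from typing import Deque, Optional


class RoundRecorder:
    """Keeps the latest pre-wake audio and a capped post-wake buffer."""

    def __init__(self, frame_ms: int = 20, wake_ms: int = 1000,
                 asr_ms: int = 5000) -> None:
        # A deque with maxlen drops its oldest frame when a new one arrives.
        self._pre_wake: Deque[bytes] = deque(maxlen=wake_ms // frame_ms)
        self._asr: Deque[bytes] = deque(maxlen=asr_ms // frame_ms)
        self.dialog_id: Optional[str] = None
        self.wake_audio: bytes = b""

    def on_frame(self, frame: bytes) -> None:
        if self.dialog_id is None:
            self._pre_wake.append(frame)   # idle: retain only the latest audio
        else:
            self._asr.append(frame)        # awake: earliest frame drops at the cap

    def on_wake(self) -> str:
        """Snapshot the pre-wake audio as the original wake-up data."""
        self.dialog_id = uuid.uuid4().hex
        self.wake_audio = b"".join(self._pre_wake)
        self._asr.clear()
        return self.dialog_id

    def on_vad_end(self) -> dict:
        """Close the round and return the group to hand to the upload service."""
        group = {"dialogId": self.dialog_id,
                 "wake": self.wake_audio,
                 "asr": b"".join(self._asr)}
        self.dialog_id = None
        self._pre_wake.clear()
        return group
```

Using a bounded queue for both buffers means the "retain the latest 1 s" and "delete the earliest segment after 5 s" rules need no explicit trimming code.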

As shown in fig. 3, the data transmission method of the present embodiment includes the following steps:

The upload service is initially in a waiting state. When it receives a notification to upload data, it checks whether data can be uploaded at that moment. In one example, when it is detected that the user has finished the voice input of the current round, a notification to upload data may be delivered to the upload service. Alternatively, after the original wake-up data of the current round has been uploaded, a notification to upload data can also be delivered to the upload service, because the original ASR data still remains to be uploaded.

The data can be uploaded only when the following preset uploading conditions (namely the following four conditions) are simultaneously met, otherwise, the data continues to wait until the preset uploading conditions are met:

(1) there is currently audio data that has not yet been uploaded;

(2) the electronic equipment is currently in an un-awakened state;

(3) the uplink bandwidth is idle;

(4) the upload service is in an active state and is in an idle state.

When the upload service is in an active state, its upload process has been started. If the four conditions are met, the original wake-up audio is uploaded first. After that upload is finished, the conditions are checked again: if they are still met, the original ASR audio is uploaded; otherwise the service continues to wait and resumes uploading after the next round of voice input has finished, that is, it waits for the next upload notification. The upload service can be implemented with libcurl, WebSocket, or the like.
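The notification-driven flow of fig. 3 could be sketched as below. This is a simplified, single-threaded illustration under stated assumptions: `UploadService`, `enqueue_round` and `notify` are hypothetical names, the `conditions_met` callback stands in for conditions (2) to (4), condition (1) is represented by the pending queue being non-empty, and `send` abstracts away the actual transport.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# (dialogId, kind, payload), where kind is "wake" or "asr"
Segment = Tuple[str, str, bytes]


class UploadService:
    """Waits for notifications and uploads one segment at a time."""

    def __init__(self, conditions_met: Callable[[], bool],
                 send: Callable[[Segment], None]) -> None:
        self._conditions_met = conditions_met
        self._send = send
        self._pending: Deque[Segment] = deque()

    def enqueue_round(self, dialog_id: str, wake: bytes, asr: bytes) -> None:
        # The original wake-up data is queued ahead of the original ASR data.
        self._pending.append((dialog_id, "wake", wake))
        self._pending.append((dialog_id, "asr", asr))

    def notify(self) -> int:
        """Handle one upload notification; return how many segments were sent."""
        sent = 0
        # The conditions are re-checked before every segment.
        while self._pending and self._conditions_met():
            self._send(self._pending[0])
            self._pending.popleft()   # drop only after the send returned
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

If the user wakes the device between the two segments, `notify` returns early and the remaining ASR segment stays queued until the next notification.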

Whether uploading of the original audio (including the original wake-up data and the original ASR audio) is enabled, the storable length of the original wake-up data, the storable length of the original ASR data, and the maximum number of stored groups can all be configured dynamically through an external configuration file, which the electronic device reads after power-on, so the program does not need to be modified.
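A configuration file of this kind might be read as follows. The description does not specify a file format or key names, so the JSON format, the `UploadConfig` type and every key in it are assumptions; the 1 s and 5 s defaults mirror fig. 2, and the default of one stored group is a guess made for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UploadConfig:
    """Hypothetical settings read once after power-on."""
    upload_enabled: bool = True        # whether original audio is uploaded
    wake_audio_seconds: float = 1.0    # storable length of original wake-up data
    asr_audio_seconds: float = 5.0     # storable length of original ASR data
    max_groups: int = 1                # maximum number of stored groups


def load_config(text: str) -> UploadConfig:
    """Parse JSON text, ignoring unknown keys and keeping defaults otherwise."""
    raw = json.loads(text)
    known = {f.name for f in fields(UploadConfig)}
    return UploadConfig(**{k: v for k, v in raw.items() if k in known})
```

For example, `load_config('{"asr_audio_seconds": 8}')` would lengthen only the ASR buffer and leave the other settings at their defaults.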

In the data transmission process, each group of audio data has a corresponding identifier and the corresponding original audio data is uploaded in time, so the original audio data for a given round can be obtained directly for analysis, which streamlines the debugging flow. Through the data transmission process of this embodiment, the collected original wake-up data and original ASR audio are conveniently uploaded to the server, which addresses the difficulty of analyzing the causes of false wake-ups and low recognition accuracy when the data is missing. The original audio is uploaded when the uplink bandwidth is idle, so the interactive experience is not affected. In most cases only one group of data needs to be stored, and it is deleted promptly after being uploaded, so little memory and hard disk space is occupied.

Referring to fig. 4, fig. 4 is a schematic block diagram of an electronic device 400 according to an embodiment of the present invention, and as shown in fig. 4, the electronic device 400 includes:

a detection module 401, configured to detect whether a user ends the current round of voice input;

the uploading module 402 is configured to, when it is detected that the user has finished the voice input of the current round, upload the first group of audio data of the current round to the server if the electronic device meets a preset uploading condition;

wherein the preset uploading condition comprises at least one of the following items:

the electronic equipment is in a state to be awakened;

the uplink bandwidth is idle;

there is audio data that is not uploaded;

the upload service for uploading data is in an active and idle state.

In one embodiment, the uploading module 402 includes:

and the data uploading module is used for uploading the first audio data and the second audio data to the server according to the sequence of the acquisition time if the electronic equipment meets the preset uploading condition.

In one embodiment, the data upload module includes:

the first data uploading sub-module is used for uploading the first audio data to the server under the condition that the electronic equipment meets the preset uploading condition;

and the second data uploading submodule is used for uploading the second audio data to the server under the condition that the electronic equipment meets the preset uploading condition after the first audio data is uploaded.

In one embodiment, the electronic device 400 further comprises:

the first switching module is used for switching to an awakening state based on a first awakening voice under the condition that the electronic equipment is in the to-be-awakened state and receives the first awakening voice input by a user;

the determining module is configured to determine that the user ends the voice input if the voice data input by the user is not received within a second preset time period after the second voice data input by the user is received when the electronic device is in the wake-up state, where the second audio data includes the second voice data.

The electronic device 400 can implement the processes implemented by the method in the foregoing method embodiments, and details are not repeated here to avoid repetition.

In an embodiment, an embodiment of the present invention further provides an electronic device, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process in the data transmission method embodiment, and can achieve the same technical effect, and details are not repeated here to avoid repetition.

The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the data transmission method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling an electronic device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method of data transmission, the method comprising:

detecting whether the user finishes the voice input of the current round;

and under the condition that the user finishes the voice input of the current round, if the electronic equipment meets a preset uploading condition, uploading the first group of audio data of the current round to a server, wherein the preset uploading condition comprises at least one of the following:

the electronic equipment is in a state to be awakened;

the uplink bandwidth is idle;

there is audio data that is not uploaded;

the upload service for uploading data is in an active and idle state.

2. The method of claim 1, wherein the first set of audio data comprises at least one of first audio data and second audio data;

the first audio data is audio data collected by the electronic equipment within a first preset time before a first moment, the first moment is the moment when the electronic equipment switches to an awakening state based on a first awakening voice input by a user, the first audio data comprises the first awakening voice, the second audio data comprises the audio data collected from the first moment to a second moment, and the second moment is the moment when the user finishes the voice input.

3. The method of claim 2, wherein the first set of audio data comprises first audio data and second audio data;

the uploading the first group of audio data of the current round to a server if the electronic equipment meets the preset uploading condition comprises:

and if the electronic equipment meets the preset uploading condition, uploading the first audio data and the second audio data to the server according to the sequence of acquisition time.

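4. The method of claim 3, wherein the uploading the first audio data and the second audio data to the server according to the sequence of acquisition time if the electronic equipment meets the preset uploading condition comprises:

uploading the first audio data to the server under the condition that the electronic equipment meets the preset uploading condition;

and after the first audio data is uploaded, uploading the second audio data to the server under the condition that the electronic equipment meets the preset uploading condition.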

5. The method of claim 2, wherein before the uploading the first group of audio data of the current round to a server if the electronic equipment meets the preset uploading condition under the condition that the user finishes the voice input of the current round, the method further comprises:

when the electronic equipment is in a state to be awakened and receives the first awakening voice input by a user, switching to the awakening state based on the first awakening voice;

and when the electronic equipment is in the awakening state, if the voice data input by the user is not received within a second preset time after the second voice data input by the user is received, determining that the user finishes the voice input, wherein the second audio data comprises the second voice data.

6. An electronic device, comprising:

the electronic equipment is in a state to be awakened;

the uplink bandwidth is idle;

there is audio data that is not uploaded;

the upload service for uploading data is in an active and idle state.

7. The electronic device of claim 6, wherein the first set of audio data includes at least one of first audio data and second audio data;

the first audio data is audio data collected by the electronic equipment within a first preset time before a first moment, the first moment is the moment when the electronic equipment switches to an awakening state based on a first awakening voice input by a user, the first audio data comprises the first awakening voice, the second audio data comprises the audio data collected from the first moment to a second moment, and the second moment is the moment when the user finishes the voice input.

8. The electronic device of claim 7, wherein the first set of audio data comprises first audio data and second audio data;

the upload module comprises:

and the data uploading module is used for uploading the first audio data and the second audio data to the server according to the sequence of acquisition time if the electronic equipment meets the preset uploading condition.

9. The electronic device of claim 8, wherein the data upload module comprises:

the first data uploading submodule is used for uploading the first audio data to the server under the condition that the electronic equipment meets the preset uploading condition;

and the second data uploading submodule is used for uploading the second audio data to the server under the condition that the electronic equipment meets the preset uploading condition after the first audio data is uploaded.

10. The electronic device of claim 7, further comprising:

the first switching module is used for switching to the awakening state based on the first awakening voice under the condition that the electronic equipment is in the to-be-awakened state and receives the first awakening voice input by a user;

and the determining module is used for determining, when the electronic equipment is in the awakening state, that the user has finished the voice input if no voice data input by the user is received within a second preset time after second voice data input by the user is received, where the second audio data includes the second voice data.

11. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps in the data transmission method according to any one of claims 1 to 5.

12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data transmission method according to any one of claims 1 to 5.