CN112562683A - Instruction obtaining method, device, sending end and storage medium - Google Patents

Instruction obtaining method, device, sending end and storage medium

Info

Publication number
CN112562683A
Authority
CN
China
Prior art keywords
instruction
voice
sub
cloud
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011421332.9A
Other languages
Chinese (zh)
Inventor
曾德智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd
Priority to CN202011421332.9A
Publication of CN112562683A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses an instruction obtaining method, which comprises the following steps: receiving a voice instruction; cutting the voice instruction to obtain a plurality of sub-voice instructions; determining a selected voice instruction from the plurality of sub-voice instructions; acquiring the utilization rate of a cloud processor; and when the utilization rate of the cloud processor is greater than a preset threshold value, sending the selected voice instruction to the cloud end so that the cloud end obtains a first text instruction based on the selected voice instruction. The invention also discloses an instruction obtaining device, a sending end and a storage medium. When the instruction obtaining method of the invention is used, the text instruction is obtained faster.

Description

Instruction obtaining method, device, sending end and storage medium
Technical Field
The present invention relates to the field of instruction obtaining, and in particular, to an instruction obtaining method, an apparatus, a sending end, and a computer-readable storage medium.
Background
With the development of smart home technology, voice interaction has become an essential basic interaction function for smart homes. The related art discloses an instruction obtaining method in which a sending end receives a voice instruction and sends it to a cloud end, so that the cloud end recognizes and converts the voice instruction into a text instruction; the cloud end then sends the text instruction to a receiving end, and the receiving end executes it.
However, when the existing instruction obtaining method is adopted to obtain the text instruction, the obtaining speed of the text instruction is slow.
Disclosure of Invention
The invention mainly aims to provide an instruction obtaining method, an instruction obtaining device, a sending end and a computer readable storage medium, and aims to solve the technical problem that when a text instruction is obtained by adopting the existing instruction obtaining method in the prior art, the text instruction obtaining speed is low.
In order to achieve the above object, the present invention provides an instruction obtaining method, which is characterized by comprising the following steps:
receiving a voice instruction;
cutting the voice instruction to obtain a plurality of sub-voice instructions;
acquiring the utilization rate of a cloud processor;
when the utilization rate of the cloud processor is greater than a preset threshold value, determining a selected voice instruction from the plurality of sub voice instructions;
and sending the selected voice instruction to the cloud end so that the cloud end obtains a first text instruction based on the selected voice instruction.
Optionally, the step of cutting the voice instruction to obtain a plurality of sub-voice instructions includes:
obtaining a sound waveform of the voice instruction based on the voice instruction;
obtaining a plurality of sub-voice instructions based on the sound waveform and the voice instruction.
Optionally, the step of obtaining a plurality of sub-voice instructions based on the sound waveform and the voice instruction includes:
determining an area of the sound waveform whose amplitude is smaller than a preset amplitude value as a silent area;
screening, from the silent areas, a selected silent area whose duration is longer than a preset duration;
determining an area in the voice instruction corresponding to the selected silent area as a cutting area;
determining a cutting point in the cutting area;
and cutting the voice instruction at the cutting point to obtain a plurality of sub-voice instructions.
Optionally, when the usage rate of the cloud processor is greater than a preset threshold, the step of determining a selected voice instruction from the plurality of sub-voice instructions includes:
when the utilization rate of the cloud processor is greater than a preset threshold value, screening a result silent area with the maximum duration from the selected silent area;
determining a first sub-voice instruction after the resulting silence area as the selected voice instruction.
Optionally, the method further includes:
when the utilization rate of the cloud processor is smaller than or equal to the preset threshold value, the sub voice instructions are sent to the cloud end, so that the cloud end obtains a second text instruction based on the sub voice instructions.
Optionally, the plurality of sub-voice instructions have a preset sequence; the step of sending the sub-voice commands to the cloud comprises:
and sending the sub voice instructions to the cloud according to the preset sequence.
Optionally, before the step of receiving the voice instruction, the method further includes:
receiving a wake-up instruction when audio information is played;
based on the wake-up instruction, pausing the audio information;
the step of receiving voice instructions comprises:
and receiving a voice instruction when the audio information is in a pause state.
In addition, to achieve the above object, the present invention also provides an instruction obtaining apparatus, including:
the receiving module is used for receiving a voice instruction;
the cutting module is used for cutting the voice instruction to obtain a plurality of sub-voice instructions;
the acquisition module is used for acquiring the utilization rate of the cloud processor;
the determining module is used for determining a selected voice instruction in the sub-voice instructions when the utilization rate of the cloud processor is greater than a preset threshold value;
the sending module is used for sending the selected voice instruction to the cloud end so that the cloud end can obtain a first text instruction based on the selected voice instruction.
In addition, to achieve the above object, the present invention further provides a transmitting end, where the transmitting end includes: a memory, a processor and an instruction obtaining program stored on the memory and running on the processor, the instruction obtaining program when executed by the processor implementing the steps of the instruction obtaining method as claimed in any one of the above.
Furthermore, to achieve the above object, the present invention also proposes a computer-readable storage medium having stored thereon an instruction obtaining program that, when executed by a processor, realizes the steps of the instruction obtaining method according to any one of the above.
The technical scheme of the invention provides an instruction obtaining method, which comprises the steps of receiving a voice instruction; cutting the voice instruction to obtain a plurality of sub-voice instructions; determining a selected voice instruction from the plurality of sub-voice instructions; acquiring the utilization rate of a cloud processor; and when the utilization rate of the cloud processor is greater than a preset threshold value, sending the selected voice instruction to the cloud end so that the cloud end obtains a first text instruction based on the selected voice instruction. When the utilization rate of the cloud processor is greater than the preset threshold value, the sending end sends only the selected voice instruction to the cloud end. Because the selected voice instruction is obtained by cutting the voice instruction and is only one part of it, it is transmitted faster than the whole voice instruction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a sending end according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of an instruction obtaining method according to the present invention;
FIG. 3 is a block diagram of a first embodiment of an instruction obtaining apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic diagram of a transmitting end structure according to an embodiment of the present invention.
The transmitting end may be a User Equipment (UE) such as a Mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a Personal Digital Assistant (PDA), a tablet computer (PAD), a handheld device, a vehicle-mounted device, a wearable device, a computing device or other processing device connected to a wireless modem, a Mobile Station (MS), or the like. The transmitting end may be referred to as a user terminal, a portable terminal, a desktop terminal, etc.
In general, a transmitting end includes: at least one processor 301, a memory 302, and an instruction obtaining program stored on the memory and executable on the processor, the instruction obtaining program being configured to implement the steps of the instruction obtaining method as described before.
The processor 301 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 301 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 301 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a Central Processing Unit (CPU), and the coprocessor is a low-power processor for processing data in a standby state.
In some embodiments, the processor 301 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. The processor 301 may further include an AI (Artificial Intelligence) processor for processing relevant instruction obtaining method operations so that the instruction obtaining method model can be trained autonomously, improving efficiency and accuracy.
Memory 302 may include one or more computer-readable storage media, which may be non-transitory. Memory 302 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 302 is used to store at least one instruction for execution by processor 301 to implement the instruction acquisition method provided by the method embodiments herein.
In some embodiments, the terminal may further include: a communication interface 303 and at least one peripheral device. The processor 301, the memory 302 and the communication interface 303 may be connected by a bus or signal lines. Various peripheral devices may be connected to communication interface 303 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 304, a display screen 305, and a power source 306.
The communication interface 303 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 301 and the memory 302. In some embodiments, processor 301, memory 302, and communication interface 303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 301, the memory 302 and the communication interface 303 may be implemented on a single chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 304 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 304 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 304 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 304 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 304 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 304 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 305 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 305 is a touch display screen, the display screen 305 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 301 as a control signal for processing. In this case, the display screen 305 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 305, disposed on the front panel of the electronic device; in other embodiments, there may be at least two display screens 305, respectively disposed on different surfaces of the electronic device or in a folded design; in still other embodiments, the display screen 305 may be a flexible display screen disposed on a curved surface or a folded surface of the electronic device. The display screen 305 may even be arranged in a non-rectangular irregular shape, i.e. a shaped screen. The display screen 305 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The power supply 306 is used to power various components in the electronic device. The power source 306 may be alternating current, direct current, disposable or rechargeable. When the power source 306 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology. Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the instruction obtaining apparatus and may include more or less components than those shown, or some components in combination, or a different arrangement of components.
Furthermore, an embodiment of the present invention also provides a computer-readable storage medium on which an instruction obtaining program is stored; when the instruction obtaining program is executed by a processor, the steps of the instruction obtaining method described above are implemented, so a detailed description is omitted here. The beneficial effects, which are the same as those of the method, are likewise not described again. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, reference is made to the description of the method embodiments of the present application. As an example, the program instructions may be deployed to execute on one sending end or one cloud end, on multiple cloud ends or multiple sending ends at one site, or on multiple sending ends and multiple cloud ends distributed over multiple sites and interconnected by a communication network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The computer-readable storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Based on the above hardware structure, an embodiment of the instruction obtaining method of the present invention is provided.
Referring to FIG. 2, FIG. 2 is a flowchart illustrating a first embodiment of an instruction obtaining method according to the present invention. The method is used for a sending end and includes the following steps:
Step S11: receiving a voice instruction.
It should be noted that the execution main body of the present invention is a sending end, and the sending end may be any one of the sending ends described above, which is not described herein again. The voice command is usually sent by the user through a voice input module of the sending end, and may also be sent by the user through other electronic equipment connected to the sending end.
Further, before step S11, the method further includes: receiving a wake-up instruction when audio information is played; based on the wake-up instruction, pausing the audio information; accordingly, step S11 includes: and receiving a voice instruction when the audio information is in a pause state.
It should be noted that, users usually use the sending end to play audio information (or video information) to meet the entertainment requirement; meanwhile, the main body of the method (the sending end of the instruction obtaining program) of this embodiment is not always in a state of receiving the voice instruction, and usually requires the user to send a wake-up instruction to wake up the instruction obtaining program of this embodiment; the wake-up instruction may be a voice instruction, a key instruction, a gesture instruction, or the like, which is not limited in the present invention.
When a wake-up instruction is received and the sending end outputs audio information (or video information), the sending end pauses playing the audio information (or the video information) to receive a voice instruction; meanwhile, when the voice instruction reception is finished, the audio information (or video information) can be continuously played.
In addition, when the wake-up instruction is received while no audio information (or video information) is being played, the sending end directly receives the voice instruction.
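The wake-up behaviour described above can be sketched as a small state holder. This is only an illustrative sketch: the class name `SendingEndState`, its attributes, and its method names are hypothetical and do not come from the patent text.

```python
class SendingEndState:
    """Hypothetical model of the wake-up / pause / resume behaviour."""

    def __init__(self, playing: bool = False) -> None:
        self.playing = playing          # audio (or video) currently playing
        self.paused_by_wake = False     # playback was paused by a wake-up
        self.listening = False          # ready to receive a voice instruction

    def on_wake_up(self) -> None:
        # If something is playing, pause it before listening.
        if self.playing:
            self.playing = False
            self.paused_by_wake = True
        # With or without playback, the sending end now receives the instruction.
        self.listening = True

    def on_voice_instruction_received(self) -> None:
        # Reception is finished; resume playback only if the wake-up paused it.
        self.listening = False
        if self.paused_by_wake:
            self.playing = True
            self.paused_by_wake = False
```

Resuming playback after reception is optional in the text ("can be continuously played"); the sketch simply assumes it always resumes.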
Step S12: cutting the voice instruction to obtain a plurality of sub-voice instructions.
A voice instruction is usually a relatively long piece of voice information with a large amount of data, so it is transmitted to the cloud end slowly over the network, and the cloud end consumes more resources and time to convert it. It can be understood that, in the same network environment, the selected voice instruction (one of the plurality of sub-voice instructions obtained by cutting the voice instruction) contains less data than the whole voice instruction and is transmitted faster. Likewise, at the same processor utilization rate, the cloud end processes part of a voice instruction faster than the whole voice instruction; and for the same voice instruction, the cloud end processes it faster when the utilization rate of the cloud processor is lower and more slowly when the utilization rate is higher.
Depending on the utilization rate of the cloud processor, the sending end sends different content to the cloud end: when the utilization rate is high, it sends the selected voice instruction; when the utilization rate is low, it sends all the sub-voice instructions. Regardless of the utilization rate, the sending end cuts the voice instruction into a plurality of sub-voice instructions in advance. Then, when the utilization rate of the cloud processor is greater than a preset threshold value, the sending end determines and sends a selected voice instruction; when the utilization rate is not greater than the preset threshold value, it sends the plurality of sub-voice instructions to the cloud end directly, without determining a selected voice instruction.
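To make the data-volume argument concrete, the following sketch compares a whole voice instruction with one sub-voice instruction. Every figure is an assumption chosen for illustration and is not stated in the patent: uncompressed 16 kHz, 16-bit mono PCM audio, a 1 Mbit/s uplink, and the hypothetical helper names `pcm_bytes` and `transfer_seconds`.

```python
def pcm_bytes(duration_s: float, sample_rate_hz: int = 16_000,
              bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Size of uncompressed PCM audio of the given duration (assumed format)."""
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


def transfer_seconds(num_bytes: int, uplink_bits_per_s: int = 1_000_000) -> float:
    """Idealised transfer time on an assumed uplink, ignoring latency and overhead."""
    return num_bytes * 8 / uplink_bits_per_s


whole = pcm_bytes(12.0)  # a 12 s voice instruction
part = pcm_bytes(3.0)    # a 3 s sub-voice instruction cut from it
print(whole, part)
print(transfer_seconds(whole), transfer_seconds(part))
```

Under these assumptions the sub-voice instruction carries a quarter of the data of the whole instruction, so its idealised transfer time is a quarter as long. Real figures depend on the codec and the network.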
The cloud end may be an Internet of Things cloud, a server, or the like. It is in communication connection with the sending end and the receiving end respectively and serves as an intermediary between them, providing data transmission and data processing functions. The sending end is usually a smart phone, a tablet computer, a personal computer or the like. The receiving end is the execution end that receives the text instruction and is usually a smart home device, such as a smart refrigerator, a smart air conditioner or a smart television; its main structure is similar to that of the sending end and is not described here again.
Generally, before the sending end and the receiving end are connected to the cloud end, they need to register at the cloud end, and the cloud end stores the device information and the like registered by the sending end and the receiving end. When instructions or data information are transmitted between the sending end and the receiving end through the cloud end, they may include device information (of the sending end and of the receiving end), device states (the operating state of the sending end and of the receiving end), and the like.
In addition, the usage rate of the cloud processor generally refers to the usage rate of the processor at the current time, that is, the usage rate of the real-time cloud processor obtained from the cloud when the sending end executes the method of this embodiment, and generally, the usage rate of the cloud processor is presented in percentage.
Further, step S12 includes: obtaining a sound waveform of the voice instruction based on the voice instruction; based on the sound waveform and the voice instruction, a plurality of sub voice instructions are obtained.
Specifically, based on the voice command, obtaining a sound waveform of the voice command; determining a region with amplitude smaller than a preset amplitude value in the sound waveform as a silent region; screening a selected silent area with the duration longer than a preset duration in the silent area; determining an area in the voice instruction corresponding to the selected silence area as a cutting area; determining a cutting point in the cutting area; and cutting the voice command by using the cutting point to obtain a plurality of sub voice commands.
The sound waveform may be represented as a waveform diagram or in another form. Generally, the voice instruction is not filled entirely with the user's voice: it may contain vacuum segments (intervals between the user's utterances, which usually contain no voice, are invalid intervals, and correspond to silent areas in the sound waveform). Generally, the part of the voice instruction that follows a longer vacuum segment includes important instruction information.
Because a vacuum segment contains no user voice, the amplitude of the corresponding sound waveform is small (even if some noise is included, the amplitude does not change greatly) and is smaller than a preset amplitude value. The area in the sound waveform corresponding to the vacuum segment is therefore determined as a silent area. The preset amplitude value can be set by the user as needed; it can be understood that, for the same voice instruction, sound waveforms obtained by different algorithms have different amplitudes, so the corresponding preset amplitude value also differs and needs to be determined according to actual conditions.
In addition, there is always a very short interval between the characters in the user's voice information (corresponding to a silent area of short duration in the sound waveform). Such a short interval is not a pause by the user and needs to be excluded: a silent area whose duration is less than or equal to the preset duration is not selected, and a silent area whose duration is greater than the preset duration is a selected silent area. The preset duration can be set by the user as needed and is not limited by the invention; generally, it does not exceed 1 s.
Any point in the selected silent area can be used as a cutting point, including the starting point (start time) or the ending point (end time) of the selected silent area. For example, if the voice instruction is 12 s long and the selected silent area spans 5.50 s to 7.00 s, the cutting point may be at 5.50 s, at 7.00 s, or at any time between them. For accuracy, cutting points are preferably expressed in seconds to two decimal places.
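The cutting procedure described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the waveform is available as a list of amplitude samples, the function names `find_selected_silent_areas` and `cut_at_midpoints` are hypothetical, and the midpoint of each selected silent area is used as the cutting point (the text allows any point in the area).

```python
from typing import List, Sequence, Tuple

Span = Tuple[float, float]  # (start_s, end_s)


def find_selected_silent_areas(samples: Sequence[float], sample_rate: int,
                               preset_amplitude: float,
                               preset_duration: float) -> List[Span]:
    """Silent areas (|amplitude| < preset_amplitude) longer than preset_duration."""
    areas = []
    start = None
    for i, value in enumerate(samples):
        if abs(value) < preset_amplitude:
            if start is None:
                start = i                  # a silent run begins
        elif start is not None:
            areas.append((start, i))       # the silent run ends
            start = None
    if start is not None:                  # silence runs to the end of the audio
        areas.append((start, len(samples)))
    # Keep only runs strictly longer than the preset duration.
    return [(s / sample_rate, e / sample_rate) for s, e in areas
            if (e - s) / sample_rate > preset_duration]


def cut_at_midpoints(total_duration: float, selected: List[Span]) -> List[Span]:
    """Cut the instruction at the midpoint of every selected silent area."""
    cut_points = [(s + e) / 2 for s, e in selected]
    bounds = [0.0] + cut_points + [total_duration]
    return list(zip(bounds[:-1], bounds[1:]))


# Toy waveform at 10 samples/s: speech, a 1 s pause, speech, a 0.2 s gap, speech.
samples = [0.5] * 20 + [0.0] * 10 + [0.5] * 10 + [0.02] * 2 + [0.5] * 10
selected = find_selected_silent_areas(samples, 10, preset_amplitude=0.1,
                                      preset_duration=0.5)
segments = cut_at_midpoints(len(samples) / 10, selected)
```

In this toy input the 0.2 s gap falls below the preset duration and is ignored, so only the 1 s pause produces a cutting point and the instruction is cut into two sub-voice instructions.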
Step S13: acquiring the utilization rate of the cloud processor.
Step S14: when the utilization rate of the cloud processor is greater than a preset threshold value, determining a selected voice instruction from the plurality of sub-voice instructions.
It should be noted that the preset threshold value may be a percentage threshold for the utilization rate of the cloud processor and may be set by the user; it is preferably, but not limited to, 60%. Generally, when the utilization rate of the cloud processor is greater than the preset threshold value, the load of the cloud processor is high and its processing speed is low. The selected voice instruction contains less data and is transmitted to the cloud end quickly, so in this case the sending end only needs to send the selected voice instruction, which reduces both the transmission time and the conversion time of the voice instruction.
It should be noted that after obtaining a plurality of sub voice commands, because of different utilization rates of the cloud processor, the voice command sent to the cloud may be one voice command (i.e., a selected voice command) or all sub voice commands, and generally, when sending all sub voice commands, the selected voice command does not need to be determined, and only when sending one sub voice command, the selected voice command needs to be determined.
Further, step S14 includes: when the utilization rate of the cloud processor is greater than a preset threshold value, screening a result silent area with the maximum duration from the selected silent areas; and determining the first sub-voice instruction after the result silent area as the selected voice instruction.
Generally speaking, the part of a user's voice instruction that follows a long pause includes the important instruction content. Therefore, among the selected silent areas, the one with the longest duration is taken as the result silent area, and the first sub-voice instruction after it is taken as the selected voice instruction. The selected voice instruction is the part of the voice instruction sent to the cloud end and serves as the basis for the instruction executed by the receiving end.
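A minimal sketch of this selection rule is given below. It assumes sub-voice instructions and silent areas are represented as `(start_s, end_s)` tuples; the function name `select_voice_instruction` is hypothetical, and returning `None` when there is no selected silent area is an assumption, since the text does not cover that case.

```python
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start_s, end_s)


def select_voice_instruction(sub_instructions: List[Span],
                             selected_silent_areas: List[Span]) -> Optional[Span]:
    """Return the first sub-voice instruction after the longest selected silent area."""
    if not selected_silent_areas:
        return None  # assumption: nothing to select without a long pause
    # Result silent area: the selected silent area with the maximum duration.
    result_area = max(selected_silent_areas, key=lambda a: a[1] - a[0])
    # Cutting points lie inside silent areas, so the sub-voice instruction that
    # follows the result area is the first one starting at or after its start.
    for sub in sorted(sub_instructions):
        if sub[0] >= result_area[0]:
            return sub
    return None


# 12 s instruction, silent areas at 5.50-7.00 s and 9.00-9.60 s, cut at area starts.
subs = [(0.0, 5.5), (5.5, 9.0), (9.0, 12.0)]
areas = [(5.5, 7.0), (9.0, 9.6)]
chosen = select_voice_instruction(subs, areas)
```

Here the 5.50-7.00 s pause is the longest, so the sub-voice instruction beginning at 5.50 s is chosen.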
Step S15: sending the selected voice instruction to the cloud end so that the cloud end obtains a first text instruction based on the selected voice instruction.
It should be noted that the selected voice instruction is in a voice format and cannot be executed by the receiving end; the cloud end is required to convert it into a text instruction that the receiving end can execute.
It can be understood that, when the utilization rate of the cloud processor is greater than the preset threshold value, only the selected voice instruction needs to be sent, which reduces the transmission time of the voice instruction and the time the cloud end takes to convert it, and improves the speed and efficiency of instruction obtaining.
Further, the method further comprises: when the utilization rate of the cloud processor is smaller than or equal to the preset threshold value, the sub voice instructions are sent to the cloud end, so that the cloud end obtains a second text instruction based on the sub voice instructions.
That is, when the utilization rate of the cloud processor is less than or equal to the preset threshold, all of the sub voice instructions are sent to the cloud, so that the cloud obtains the second text instruction based on the plurality of sub voice instructions.
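The two branches described above (utilization rate greater than the threshold, or less than or equal to it) can be sketched as a single decision function. This is a hedged illustration only: `choose_payload`, `PRESET_THRESHOLD` and the representation of sub voice instructions as `bytes` are assumptions made for the example, and the utilization rate is expressed as a fraction between 0 and 1.

```python
from typing import List, Sequence

# The 60% value the text names as preferred; it is configurable, not fixed.
PRESET_THRESHOLD = 0.60


def choose_payload(
    utilization: float,
    subs: Sequence[bytes],
    selected_index: int,
    threshold: float = PRESET_THRESHOLD,
) -> List[bytes]:
    """Decide which sub voice instructions are sent to the cloud."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    if utilization > threshold:
        # High cloud load: send only the selected voice instruction.
        return [subs[selected_index]]
    # Utilization less than or equal to the threshold: send all of them.
    return list(subs)
```

Note that a utilization rate exactly equal to the threshold falls into the second branch, matching the "smaller than or equal to" wording of the text.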
Further, the plurality of sub voice instructions have a preset order, and the step of sending the sub voice instructions to the cloud comprises: sending the sub voice instructions to the cloud in the preset order.
The preset order of the sub voice instructions is their order in the original voice instruction, and the sub voice instructions are sent in that same order. For example, a 10 s voice instruction is cut into three sub voice instructions covering 0.00 s-3.00 s, 3.00 s-6.00 s and 6.00 s-10.00 s; the first sub voice instruction is 0.00 s-3.00 s, the second is 3.00 s-6.00 s and the third is 6.00 s-10.00 s. This order is both the preset order and the sending order.
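The 10 s example can be reproduced with a short sketch. The helper names `split_by_cut_points` and `send_in_order` are hypothetical, segments are represented only by their start and end times in seconds, and the `send` callback stands in for whatever transport the sending end actually uses.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def split_by_cut_points(total: float, cut_points: Sequence[float]) -> List[Segment]:
    """Cut a voice instruction of length `total` at the given cutting points."""
    bounds = [0.0] + sorted(cut_points) + [total]
    return list(zip(bounds[:-1], bounds[1:]))


def send_in_order(
    segments: Sequence[Segment], send: Callable[[int, Segment], None]
) -> None:
    """Send segments in their original (preset) order, numbered from 1."""
    for index, segment in enumerate(sorted(segments), start=1):
        send(index, segment)


segments = split_by_cut_points(10.0, [3.0, 6.0])
sent: List[Tuple[int, Segment]] = []
send_in_order(segments, lambda i, seg: sent.append((i, seg)))
```

Sorting by start time before sending means the sending order follows the order in the original voice instruction even if the segments were produced out of order.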
The technical scheme of the invention provides an instruction obtaining method, comprising: receiving a voice instruction; cutting the voice instruction to obtain a plurality of sub voice instructions; determining a selected voice instruction from the plurality of sub voice instructions; acquiring the utilization rate of a cloud processor; and, when the utilization rate of the cloud processor is greater than a preset threshold, sending the selected voice instruction to the cloud so that the cloud obtains a first text instruction based on the selected voice instruction. When the utilization rate of the cloud processor is greater than the preset threshold, the sending end sends only the selected voice instruction to the cloud. Because the selected voice instruction is obtained by cutting the voice instruction and is only a part of it, it is transmitted faster than the complete voice instruction.
Referring to fig. 3, fig. 3 is a block diagram of a first embodiment of the instruction obtaining apparatus according to the present invention. The apparatus is applied to a sending end and includes:
the receiving module 11 is used for receiving a voice instruction;
the cutting module 12 is configured to cut the voice instruction to obtain a plurality of sub-voice instructions;
the acquisition module 13 is configured to acquire a utilization rate of the cloud processor;
a determining module 14, configured to determine a selected voice instruction from the multiple sub-voice instructions when the usage rate of the cloud processor is greater than a preset threshold;
the sending module 15 is configured to send the selected voice instruction to the cloud, so that the cloud obtains a first text instruction based on the selected voice instruction.
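As a rough illustration of how the five modules could be composed, the sketch below wires them together as plain callables. It is an assumption-laden example rather than the apparatus itself: the class name `InstructionObtainingApparatus`, the `bytes` audio type, the 0.60 default threshold and the `run` method are all hypothetical, and only the branch covered by modules 11 to 15 (utilization rate greater than the threshold) is modelled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Audio = bytes  # hypothetical stand-in for voice data


@dataclass
class InstructionObtainingApparatus:
    receive: Callable[[], Audio]               # receiving module 11
    cut: Callable[[Audio], List[Audio]]        # cutting module 12
    get_utilization: Callable[[], float]       # acquisition module 13
    determine: Callable[[List[Audio]], Audio]  # determining module 14
    send: Callable[[Audio], str]               # sending module 15
    threshold: float = 0.60                    # assumed default (60%)

    def run(self) -> Optional[str]:
        """Run the modules in order; return the first text instruction, if any."""
        voice = self.receive()
        subs = self.cut(voice)
        if self.get_utilization() > self.threshold:
            return self.send(self.determine(subs))
        # The other branch is outside the five modules listed here.
        return None


apparatus = InstructionObtainingApparatus(
    receive=lambda: b"please|play music",
    cut=lambda voice: voice.split(b"|"),
    get_utilization=lambda: 0.8,
    determine=lambda subs: subs[-1],
    send=lambda audio: audio.decode(),  # stands in for the cloud conversion
)
text_instruction = apparatus.run()
```

Passing the modules in as callables keeps the sketch independent of any particular audio library or cloud interface, none of which the text specifies.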
The above description is only an alternative embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An instruction obtaining method, characterized in that the method comprises the steps of:
receiving a voice instruction;
cutting the voice instruction to obtain a plurality of sub voice instructions;
acquiring the utilization rate of a cloud processor;
when the utilization rate of the cloud processor is greater than a preset threshold value, determining a selected voice instruction from the plurality of sub voice instructions;
and sending the selected voice instruction to the cloud end so that the cloud end obtains a first text instruction based on the selected voice instruction.
2. The instruction obtaining method according to claim 1, wherein the step of cutting the voice instruction to obtain a plurality of sub voice instructions comprises:
obtaining a sound waveform of the voice instruction based on the voice instruction;
based on the sound waveform and the voice instruction, a plurality of sub voice instructions are obtained.
3. The instruction obtaining method according to claim 2, wherein the step of obtaining a plurality of sub voice instructions based on the sound waveform and the voice instruction comprises:
determining a region with amplitude smaller than a preset amplitude value in the sound waveform as a silent region;
screening a selected silent area with the duration longer than a preset duration in the silent area;
determining an area in the voice instruction corresponding to the selected silence area as a cutting area;
determining a cutting point in the cutting area;
and cutting the voice instruction at the cutting point to obtain a plurality of sub voice instructions.
4. The instruction obtaining method according to claim 3, wherein the step of determining a selected voice instruction from the plurality of sub voice instructions when the utilization rate of the cloud processor is greater than a preset threshold value comprises:
when the utilization rate of the cloud processor is greater than a preset threshold value, screening a result silent area with the maximum duration from the selected silent area;
determining a first sub-voice instruction after the resulting silence area as the selected voice instruction.
5. The instruction obtaining method of claim 4, wherein the method further comprises:
when the utilization rate of the cloud processor is smaller than or equal to the preset threshold value, the sub voice instructions are sent to the cloud end, so that the cloud end obtains a second text instruction based on the sub voice instructions.
6. The instruction obtaining method according to claim 5, wherein the plurality of sub-voice instructions have a preset order; the step of sending the sub-voice commands to the cloud comprises:
and sending the sub voice instructions to the cloud according to the preset sequence.
7. The instruction obtaining method according to any one of claims 1-6, wherein, before the step of receiving a voice instruction, the method further comprises:
receiving a wake-up instruction when audio information is played;
based on the wake-up instruction, pausing the audio information;
the step of receiving voice instructions comprises:
and receiving a voice instruction when the audio information is in a pause state.
8. An instruction obtaining apparatus, characterized in that the apparatus comprises:
the receiving module is used for receiving a voice instruction;
the cutting module is used for cutting the voice instruction to obtain a plurality of sub voice instructions;
the acquisition module is used for acquiring the utilization rate of the cloud processor;
the determining module is used for determining a selected voice instruction in the sub-voice instructions when the utilization rate of the cloud processor is greater than a preset threshold value;
the sending module is used for sending the selected voice instruction to the cloud end so that the cloud end can obtain a first text instruction based on the selected voice instruction.
9. A sending end, characterized in that the sending end comprises: a memory, a processor, and an instruction obtaining program stored on the memory and executable on the processor, the instruction obtaining program, when executed by the processor, implementing the steps of the instruction obtaining method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon an instruction obtaining program, which when executed by a processor, implements the steps of the instruction obtaining method according to any one of claims 1 to 7.
CN202011421332.9A 2020-12-07 2020-12-07 Instruction obtaining method, device, sending end and storage medium Pending CN112562683A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011421332.9A CN112562683A (en) 2020-12-07 2020-12-07 Instruction obtaining method, device, sending end and storage medium

Publications (1)

Publication Number Publication Date
CN112562683A (en) 2021-03-26

Family

ID=75059395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011421332.9A Pending CN112562683A (en) 2020-12-07 2020-12-07 Instruction obtaining method, device, sending end and storage medium

Country Status (1)

Country Link
CN (1) CN112562683A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination