CN110933500A - Voice triggering method, device, equipment and computer storage medium - Google Patents

Info

Publication number
CN110933500A
Authority
CN
China
Prior art keywords
voice
input method
state
trigger event
preset
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911424389.1A
Other languages
Chinese (zh)
Other versions
CN110933500B (en)
Inventor
杨倩倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a voice triggering method, device, equipment and computer storage medium. The voice triggering method comprises the following steps: if a voice trigger event is detected, acquiring an input method state of a preset voice input method; if the input method state is a pop-up state, starting a first recording function of the voice input method; and if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant. The invention reduces the dependence of the smart television on the controller, and improves the voice triggering speed and the running performance of the smart television.

Description

Voice triggering method, device, equipment and computer storage medium
Technical Field
The present invention relates to the field of voice triggering technologies, and in particular, to a voice triggering method, apparatus, device, and computer storage medium.
Background
Voice interaction is an important function of smart televisions. At present, voice interaction between a user and a smart television takes two main forms: a voice assistant and a voice input method. The traditional scheme for deciding which of the two owns the recording function relies on a controller: the controller is accessed to broadcast the state of the voice input method, and the attribution of the recording function is determined from that broadcast.
However, this scheme has a significant technical defect. Because the broadcast from the remote controller can only be received through the controller, the smart television depends too heavily on the controller; the controller also complicates the voice interaction process and slows the response to voice triggers.
Disclosure of Invention
The main purpose of the invention is to provide a voice triggering method, device, equipment and computer storage medium that reduce the dependence of a smart television on a controller and improve the voice triggering speed and the running performance of the smart television.
In order to achieve the above object, an embodiment of the present invention provides a voice triggering method, where the voice triggering method includes:
if a voice trigger event is detected, acquiring an input method state of a preset voice input method;
if the input method state is a pop-up state, starting a first recording function of the voice input method;
and if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant.
Optionally, if the input method state is a non-pop-up state, the step of generating a voice static broadcast according to the voice trigger event includes:
and if the input method state is a non-pop-up state, generating a voice assistant exclusive symbol based on the voice trigger event, and generating a voice static broadcast according to the voice assistant exclusive symbol and the voice trigger event.
Optionally, if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant further includes:
and if the second recording function does not acquire any voice interaction event within the preset time length, releasing the memory resource occupied by the voice assistant.
Optionally, if the input method state is a non-pop-up state, the step of generating a voice static broadcast according to the voice trigger event includes:
if the input method state is a non-pop-up state, performing pop-up state anomaly detection on the voice input method to obtain a detection result;
if the detection result is that the pop-up state is abnormal, outputting abnormal early warning information;
and if the detection result is that the pop-up state is normal, generating a voice static broadcast according to the voice trigger event.
Optionally, the step of acquiring the input method state of the preset voice input method if the voice trigger event is detected includes:
if a voice trigger event is detected, judging whether a voice wake-up word exists in the voice trigger event;
if yes, determining the wake-up attribute of the voice wake-up word;
if the wake-up attribute is a voice input method attribute, starting a first recording function of the voice input method;
if the wake-up attribute is a voice assistant attribute, starting a second recording function of the voice assistant;
if not, acquiring the input method state of the preset voice input method.
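The wake-up word branch above can be sketched as follows. This is a minimal illustration and not the patented implementation: the event is modelled as a plain dictionary, the `speech` field, the phrase table and the function names are assumptions made for the example, and the fallback simply routes by whether the input method interface has popped up.

```python
from typing import Optional

# Hypothetical phrase table mapping a wake-up word to its wake-up attribute.
WAKE_WORDS = {
    "trigger voice input method": "voice_input_method",
    "trigger voice assistant": "voice_assistant",
}


def wake_attribute(event: dict) -> Optional[str]:
    """Return the wake-up attribute if the event carries a known wake-up word."""
    text = event.get("speech", "").lower()
    for phrase, attribute in WAKE_WORDS.items():
        if phrase in text:
            return attribute
    return None


def route_trigger(event: dict, input_method_popped_up: bool) -> str:
    """Pick the owner of the recording function for one voice trigger event."""
    attribute = wake_attribute(event)
    if attribute is not None:
        # A wake-up word names the target directly.
        return attribute
    # No wake-up word: fall back to the input method state.
    return "voice_input_method" if input_method_popped_up else "voice_assistant"
```

A key press carries no speech, so it always takes the fallback path, while a spoken wake-up word overrides the input method state.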
Optionally, the step of obtaining the input method state of the preset voice input method includes:
acquiring a device serial number and a trigger timestamp in the voice trigger event;
if the device serial number is consistent with a preset serial number and the trigger timestamp is consistent with the current timestamp, determining that the voice trigger event is a legal trigger event;
and if the voice trigger event is a legal trigger event, acquiring the input method state of the preset voice input method.
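The legality check above might look like the following sketch. The text does not say how closely the trigger timestamp must match the current timestamp, so the tolerance `MAX_SKEW_S`, the serial number value and the dictionary field names are all assumptions for illustration.

```python
PRESET_SERIAL = "TV-0001"  # hypothetical preset serial number
MAX_SKEW_S = 2.0           # assumed tolerance for "consistent" timestamps


def is_legal_trigger(event: dict, now: float,
                     preset_serial: str = PRESET_SERIAL,
                     max_skew_s: float = MAX_SKEW_S) -> bool:
    """Accept the event only if its serial number and timestamp both match."""
    serial_ok = event.get("serial") == preset_serial
    timestamp = event.get("timestamp")
    # A missing timestamp is treated as a failed check.
    time_ok = timestamp is not None and abs(now - timestamp) <= max_skew_s
    return serial_ok and time_ok
```

Passing `now` explicitly keeps the check deterministic; a caller would normally supply the current system time.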
Optionally, the voice triggering method further includes:
uploading a voice interaction instruction obtained according to the first recording function or the second recording function to a preset cloud platform;
acquiring a voice control instruction fed back by the preset cloud platform based on the voice interaction instruction, and determining a target application of the voice control instruction;
and sending the voice control instruction to the target application.
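The upload and dispatch steps above can be sketched as below. The cloud platform is replaced by a local stand-in function, and the response fields `target_app` and `command`, the application names and the handler table are hypothetical; the text does not specify the platform's interface.

```python
from typing import Callable, Dict


def fake_cloud_platform(voice_instruction: str) -> dict:
    """Stand-in for the preset cloud platform (assumed response format)."""
    if "volume" in voice_instruction:
        return {"target_app": "settings", "command": "volume_up"}
    return {"target_app": "launcher", "command": "search", "query": voice_instruction}


def dispatch_voice_instruction(voice_instruction: str,
                               cloud: Callable[[str], dict],
                               apps: Dict[str, Callable[[dict], None]]) -> str:
    """Upload the instruction, read the control instruction, send it to its target."""
    control = cloud(voice_instruction)  # upload and receive the fed-back instruction
    target = control["target_app"]      # determine the target application
    apps[target](control)               # send the control instruction to it
    return target
```

Here `apps` maps each application name to a callable that accepts the control instruction, so the same dispatch function serves instructions recorded by either recording function.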
The invention also provides a voice trigger device, comprising:
the state module is used for acquiring the input method state of a preset voice input method if a voice trigger event is detected;
the pop-up module is used for starting a first recording function of the voice input method if the input method state is the pop-up state;
and the non-pop-up module is used for generating a voice static broadcast according to the voice trigger event if the input method state is a non-pop-up state, and triggering a preset voice assistant based on the voice static broadcast so as to start a second recording function of the voice assistant.
Optionally, the non-pop-up module comprises:
and the non-pop-up unit is used for generating a voice assistant exclusive symbol based on the voice trigger event if the input method state is a non-pop-up state, and generating a voice static broadcast according to the voice assistant exclusive symbol and the voice trigger event.
Optionally, the voice trigger device further includes:
and the release module is used for releasing the memory resource occupied by the voice assistant if the second recording function does not acquire any voice interaction event within the preset time length.
Optionally, the non-pop-up module comprises:
the anomaly detection unit is used for performing pop-up state anomaly detection on the voice input method and obtaining a detection result if the input method state is the non-pop-up state;
the abnormal unit is used for outputting abnormal early warning information if the detection result is that the pop-up state is abnormal;
and the normal unit is used for generating the voice static broadcast according to the voice trigger event if the detection result is that the pop-up state is normal.
Optionally, the state module comprises:
the judging unit is used for judging whether a voice wake-up word exists in the voice trigger event if the voice trigger event is detected;
the determining unit is used for determining the wake-up attribute of the voice wake-up word if the voice wake-up word exists;
the voice input method unit is used for starting a first recording function of the voice input method if the wake-up attribute is a voice input method attribute;
the voice assistant unit is used for starting a second recording function of the voice assistant if the wake-up attribute is a voice assistant attribute;
and the state unit is used for acquiring the input method state of the preset voice input method if no voice wake-up word exists.
Optionally, the state module further comprises:
the acquisition unit is used for acquiring the device serial number and the trigger timestamp in the voice trigger event;
the confirming unit is used for confirming that the voice trigger event is a legal trigger event if the device serial number is consistent with a preset serial number and the trigger timestamp is consistent with the current timestamp;
and the legal unit is used for acquiring the input method state of the preset voice input method if the voice trigger event is a legal trigger event.
Optionally, the voice trigger device further includes:
the uploading module is used for uploading the voice interaction instruction obtained according to the first recording function or the second recording function to a preset cloud platform;
the application module is used for acquiring a voice control instruction fed back by the preset cloud platform based on the voice interaction instruction and determining a target application of the voice control instruction;
and the sending module is used for sending the voice control instruction to the target application.
Further, to achieve the above object, the present invention also provides an apparatus comprising: a memory, a processor, and a voice trigger program stored on the memory and executable on the processor, wherein:
the voice trigger program when executed by the processor implements the steps of the voice trigger method as described above.
In addition, to achieve the above object, the present invention also provides a computer storage medium;
the computer storage medium has stored thereon a voice trigger program, which when executed by a processor implements the steps of the voice trigger method as described above.
In the invention, when a voice trigger event is detected, it is determined whether the preset voice input method has called up its input method interface. If so, the first recording function of the voice input method is started; if not, a voice static broadcast is generated according to the voice trigger event, and the preset voice assistant is triggered based on the voice static broadcast to start the second recording function of the voice assistant. By removing the controller's intermediate broadcasting step, the invention reduces the smart television's dependence on the controller, improves its stability, simplifies the voice interaction process, and speeds up the response to voice triggers. In addition, because the voice assistant is woken on demand, it does not need to stay resident in memory, which lowers the smart television's memory consumption, reduces its operating load, and improves its running performance.
Drawings
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a voice triggering method according to an embodiment of the present invention.
The objects, features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main idea of the embodiment scheme of the invention is as follows: when a voice trigger event is detected, it is determined whether the preset voice input method has called up its input method interface. If so, the first recording function of the voice input method is started; if not, a voice static broadcast is generated according to the voice trigger event, and the preset voice assistant is triggered based on the voice static broadcast to start the second recording function of the voice assistant. By removing the controller's intermediate broadcasting step, the invention reduces the smart television's dependence on the controller, improves its stability, simplifies the voice interaction process, and speeds up the response to voice triggers. In addition, because the voice assistant is woken on demand, it does not need to stay resident in memory, which lowers the smart television's memory consumption, reduces its operating load, and improves its running performance.
The embodiment of the invention takes into account that a traditional smart television relies on a controller to distinguish between the recording functions of the voice assistant and the voice input method. Because the broadcast from the remote controller can only be received through the controller, the smart television depends too heavily on the controller; the controller also complicates the voice interaction process and slows the response to voice triggers. Meanwhile, both the voice assistant and the voice input method need to stay resident in memory, which consumes a large amount of terminal operating resources, burdens the running memory of the smart television, and degrades its running performance.
The invention provides a solution that reduces the smart television's dependence on the controller, improves its stability, simplifies the voice interaction process, and speeds up the response to voice triggers. In addition, because the voice assistant is woken on demand and does not need to stay resident in memory, the solution lowers the smart television's memory consumption, reduces its operating load, and improves its running performance.
Fig. 1 is a schematic structural diagram of the device in the hardware operating environment according to an embodiment of the present invention.
The device of the embodiment of the invention can be a PC or a server device.
As shown in fig. 1, the device may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the structure shown in fig. 1 does not limit the device, which may include more or fewer components than shown, combine some components, or arrange the components differently.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a voice trigger program.
In the device shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and processor 1001 may be configured to invoke a voice trigger program stored in memory 1005 and perform the operations in the various embodiments of the voice trigger method described below.
Based on the hardware structure, the embodiment of the voice triggering method is provided.
The invention provides a voice triggering method. In a first embodiment of the voice triggering method, referring to fig. 2, the voice triggering method includes:
step S10, if a voice trigger event is detected, acquiring the input method state of a preset voice input method;
step S20, if the input method state is a pop-up state, starting a first recording function of the voice input method;
step S30, if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant.
The voice triggering method is mainly applied to the smart television and comprises the following specific contents:
step S10, if a voice trigger event is detected, acquiring the input method state of a preset voice input method;
The voice trigger event refers to an Android standard voice key event triggered by the user on a remote controller; for example, the voice trigger event is triggered when the user presses the voice key on the remote controller. To invoke the voice interaction function, the smart television needs to start a recording function. Generally, both the voice assistant and the voice input method in the smart television have a recording function. It will be appreciated that in a practical application scenario the voice input method has priority over the voice assistant, so a voice trigger event normally triggers the voice input method first. In this embodiment, the voice input method runs at the bottom layer of the smart television's Android system and can receive all key events in the system, so it can achieve global voice triggering in the system; that is, once a voice trigger event is detected, the voice input method can be triggered. In this embodiment, the input method state is used as the criterion for determining whether the voice input method is triggered.
Step S20, if the input method state is a pop-up state, starting a first recording function of the voice input method;
If the input method state of the voice input method in the smart television is the pop-up state, the voice input method has currently called up its input method interface; the voice trigger event then triggers the voice input method, and the voice interaction process is carried out through the functions of the voice input method. The called-up input method interface, such as a voiceprint interface, is displayed on the display screen of the smart television, so when the input method state is the pop-up state the user can see the interface directly. Because the voice input method is invoked directly rather than through the traditional controller scheme, the trigger response is faster. Once it is determined that the voice interaction process is to be handled by the voice input method, the first recording function of the voice input method is started, and the user's voice interaction instruction is analyzed and processed through the first recording function. This process no longer needs a controller for control and analysis, which reduces the controller's intervention, and the ownership of the voice interaction function is determined from the input method state of the voice input method.
Step S30, if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant.
If the input method state is the non-pop-up state, the voice input method has not currently called up an input method interface and does not process the voice trigger event. The voice trigger event is therefore not consumed, and it returns to the framework layer of the smart television system. After receiving the returned voice trigger event, the framework layer generates from it a broadcast event representing the voice trigger. This broadcast event is the voice static broadcast, and it is used to trigger the voice assistant in the framework layer. It is understood that the voice assistant is a preset voice service process in the smart television and that this process is normally in a closed state; the voice static broadcast acts as the trigger that directly starts the voice assistant and activates its second recording function.
This embodiment is no longer controlled through the traditional controller scheme, so the trigger response of the method is faster. Once it is determined that the voice interaction process is to be handled by the voice input method, the first recording function of the voice input method is started, and the user's voice interaction instruction is analyzed and processed through the first recording function. The process no longer needs a controller for control and analysis, which reduces the controller's intervention, and the ownership of the voice interaction function is determined from the input method state of the voice input method.
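Steps S10 to S30 can be sketched as a small dispatch routine. This is a simplified model rather than Android code: the classes, the dictionary-shaped broadcast and the action name `VOICE_TRIGGER` are assumptions made for illustration, and the voice assistant is represented by an object that is only marked as running once the broadcast reaches it.

```python
from dataclasses import dataclass
from enum import Enum


class InputMethodState(Enum):
    POP_UP = "pop-up"
    NON_POP_UP = "non-pop-up"


@dataclass
class VoiceInputMethod:
    state: InputMethodState = InputMethodState.NON_POP_UP
    recording: bool = False

    def start_first_recording(self) -> None:
        self.recording = True


@dataclass
class VoiceAssistant:
    running: bool = False
    recording: bool = False

    def on_static_broadcast(self, broadcast: dict) -> None:
        # The assistant is started on demand by the broadcast, then records.
        self.running = True
        self.recording = True


def handle_voice_trigger(event: dict, ime: VoiceInputMethod,
                         assistant: VoiceAssistant) -> str:
    """Route the recording function according to the input method state."""
    state = ime.state                      # S10: acquire the input method state
    if state is InputMethodState.POP_UP:   # S20: the input method owns the recording
        ime.start_first_recording()
        return "voice_input_method"
    # S30: build the voice static broadcast and let it start the assistant
    broadcast = {"action": "VOICE_TRIGGER", "event": event}
    assistant.on_static_broadcast(broadcast)
    return "voice_assistant"
```

In this model no controller sits between the key event and the two recorders; the only input to the decision is the input method state.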
Further, if the input method state is a non-pop-up state, the step of generating a voice static broadcast according to the voice trigger event includes:
step A1, if the input method state is a non-pop-up state, generating a voice assistant specific character based on the voice trigger event, and generating a voice static broadcast according to the voice assistant specific character and the voice trigger event.
In this embodiment, other special processes may exist in the framework layer. These may be malicious monitoring processes or unknown processes with a higher priority than the voice assistant, and they may intercept the voice trigger event so that the smart television cannot respond to it normally and the voice interaction function fails. To ensure that the voice trigger event triggers the voice assistant normally and is not intercepted and consumed by any other process, this embodiment generates a voice assistant specific character based on the voice trigger event. The voice assistant specific character marks the trigger object and the consumption object of the voice trigger event; that is, it specifies that the only trigger object and consumption object of the voice trigger event is the voice assistant, so the event cannot be intercepted and consumed by other objects. The voice static broadcast is then generated according to the voice assistant specific character and the voice trigger event. The voice assistant specific character thus ensures that the voice trigger event can only be consumed by the voice assistant, which safeguards the integrity of the scheme of the embodiment of the invention.
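One way to picture the voice assistant specific character is as a target field carried inside the broadcast. The sketch below is an illustrative model only: the field name `target`, the identifier value and the receiver list are hypothetical, and the delivery loop merely imitates priority-ordered delivery to show that a higher-priority process never sees the event.

```python
from typing import Callable, List, Tuple

VOICE_ASSISTANT_ID = "voice_assistant"  # hypothetical specific character

Receiver = Tuple[int, str, Callable[[dict], None]]  # (priority, name, callback)


def build_static_broadcast(event: dict) -> dict:
    """Attach the specific character so only the voice assistant may consume it."""
    return {"action": "VOICE_TRIGGER", "target": VOICE_ASSISTANT_ID, "event": event}


def deliver(broadcast: dict, receivers: List[Receiver]) -> List[str]:
    """Walk receivers by descending priority, skipping all but the named target."""
    consumed_by = []
    for _priority, name, callback in sorted(receivers, key=lambda r: -r[0]):
        if name != broadcast["target"]:
            continue  # other processes cannot intercept the event
        callback(broadcast)
        consumed_by.append(name)
    return consumed_by
```

Even when an unknown process registers with a higher priority than the voice assistant, the target check means it is skipped and the voice assistant remains the sole consumer.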
Further, if the input method state is a non-pop-up state, the step of generating a voice static broadcast according to the voice trigger event includes:
step A2, if the input method state is a non-pop-up state, performing pop-up state anomaly detection on the voice input method to obtain a detection result;
Specifically, a voice input method that fails to pop up its input method interface normally could cause the voice assistant to be triggered by mistake. To avoid this, a detection mechanism is added when the input method state is determined to be the non-pop-up state, to detect whether the pop-up state of the voice input method is abnormal. That is, pop-up state anomaly detection is performed on the voice input method, and a detection result is obtained.
Step A3, if the detection result is that the pop-up state is abnormal, outputting abnormal early warning information;
and step A4, if the detection result is that the pop-up state is normal, generating a voice static broadcast according to the voice trigger event.
If the detection result is that the pop-up state is abnormal, the voice input method may have malfunctioned while calling up the input method interface. For example, the voice input method actually detected the voice trigger event, but the input method interface failed to pop up because the call timed out or the call parameters were wrong. In this case, to ensure normal operation of the embodiment of the present invention, the smart television outputs abnormality warning information prompting the user to troubleshoot the voice input method, for example by shutting down and restarting or by initializing data.
If the detection result is that the pop-up state is normal, the current voice trigger event has not triggered the voice input method, and the smart television generates the voice static broadcast according to the voice trigger event.
This embodiment provides a pop-up state anomaly detection mechanism, which effectively prevents the trigger object from being changed by mistake because of a pop-up anomaly of the input method interface, and improves the fault tolerance of the smart television.
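Steps A2 to A4 can be sketched as a guard in front of the broadcast. How the anomaly is actually detected is not specified in the text, so this example assumes two hypothetical inputs: whether the voice input method received the trigger, and an error string recorded if its interface failed to pop up.

```python
from typing import Optional


def check_pop_up_anomaly(trigger_received: bool, pop_up_error: Optional[str]) -> bool:
    """True if the input method saw the trigger but failed to show its interface."""
    return trigger_received and pop_up_error is not None


def handle_non_pop_up(event: dict, trigger_received: bool,
                      pop_up_error: Optional[str]) -> dict:
    # A2: run the anomaly detection before touching the voice assistant
    if check_pop_up_anomaly(trigger_received, pop_up_error):
        # A3: warn instead of triggering the assistant by mistake
        return {"type": "warning",
                "message": f"voice input method pop-up failed: {pop_up_error}"}
    # A4: the non-pop-up state is genuine, so generate the voice static broadcast
    return {"type": "static_broadcast", "action": "VOICE_TRIGGER", "event": event}
```

The warning branch returns without producing a broadcast, which is what keeps a failed pop-up from silently handing the recording function to the voice assistant.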
Further, if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant further includes:
step a, if the second recording function does not acquire any voice interaction event within a preset time length, releasing the memory resource occupied by the voice assistant.
Assuming that the voice-triggering event of the present invention triggers the voice assistant, the recording will be performed by the second recording function of the voice assistant. In reality, there may be a situation where the voice trigger event is a false touch, and this phenomenon may cause the voice assistant to not receive any voice interaction instruction for a long time, i.e. not have any voice input. It can be understood that the invention uses two interactive modes of voice input method and voice assistant, and the voice input method is used as the global application of the system and usually resides in the memory. And the voice assistant is not applied globally by the system and is applied less frequently than the voice input method. Therefore, in order to avoid the influence on the operation performance of the smart television caused by the fact that the memory resource is always occupied, the preset duration is set for the second recording function. And if the second recording function does not acquire any voice interaction event within the preset time length, automatically releasing the memory resources occupied by the current voice assistant. The voice interaction event refers to a voice instruction sent by a user, and if the voice interaction event is not received within a preset time length, the fact that the current scene is not a voice interaction scene is proved, and the voice trigger event may be a false touch. At this moment, the voice assistant is closed, and the memory resource occupied by the voice assistant is released, so that the memory resource of the smart television can be prevented from being occupied by the voice assistant, the waste of terminal operation resources is avoided, the operation memory is optimized, and the operation performance of the smart television is improved.
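The idle release described above can be sketched with an explicit clock. The class name, the `tick` polling method and the choice to restart the countdown after each voice interaction are assumptions for illustration; the text only requires that the voice assistant be released when no voice interaction event arrives within the preset duration.

```python
class VoiceAssistantSession:
    """Tracks one activation of the voice assistant's second recording function."""

    def __init__(self, timeout_s: float, started_at: float):
        self.timeout_s = timeout_s        # the preset duration
        self.last_activity = started_at
        self.running = True

    def on_voice_interaction(self, now: float) -> None:
        # Assumption: each voice interaction event restarts the countdown.
        if self.running:
            self.last_activity = now

    def tick(self, now: float) -> bool:
        """Release the assistant once it has been idle for the preset duration."""
        if self.running and now - self.last_activity >= self.timeout_s:
            # Stands in for closing the process and freeing its memory.
            self.running = False
        return self.running
```

Timestamps are passed in rather than read from the system clock so the behaviour is easy to verify; a real service would call `tick` from a timer.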
In the invention, if a voice trigger event is detected, it is judged whether the input method interface of the preset voice input method has been invoked; if so, the first recording function of the voice input method is started; if not, a voice static broadcast is generated according to the voice trigger event, and the preset voice assistant is triggered based on the voice static broadcast to start the second recording function of the voice assistant. By removing the intermediate broadcasting step of the controller, the invention reduces the dependence of the smart television on the controller, improves the stability of the smart television, simplifies the voice interaction process, and greatly improves the response speed of voice triggering. In addition, with this wake-up mechanism the voice assistant does not need to reside in memory, which greatly reduces the memory consumption of the smart television, lightens its operating load, and improves its operating performance.
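The branching summarized above can be illustrated with a minimal Python sketch. The enum `InputMethodState`, the function `handle_voice_trigger`, and the dictionary layout of the broadcast are assumptions made for illustration; the disclosure does not specify a data format for the voice static broadcast.

```python
from enum import Enum


class InputMethodState(Enum):
    POP_UP = "pop-up"
    NON_POP_UP = "non-pop-up"


def handle_voice_trigger(state, event):
    """Choose the recording function according to the input method state."""
    if state is InputMethodState.POP_UP:
        # Input method interface is showing: record with the voice input method.
        return {"target": "voice_input_method", "action": "start_first_recording"}
    # Otherwise wrap the event in a (hypothetical) static broadcast for the assistant.
    broadcast = {"type": "voice_static_broadcast", "event": event}
    return {
        "target": "voice_assistant",
        "action": "start_second_recording",
        "broadcast": broadcast,
    }
```

No controller sits between the trigger and the chosen target in this sketch, which mirrors the simplification the invention describes.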
Further, based on the first embodiment, a second embodiment of the voice trigger method of the present invention is provided, in which if a voice trigger event is detected, the step of acquiring an input method state of a preset voice input method includes:
step B1, if a voice trigger event is detected, judging whether a voice awakening word exists in the voice trigger event;
Specifically, the voice trigger event may be a key trigger performed through the remote controller of the smart television, or a voice trigger received by a voice receiver. In the case of a voice trigger, the user may wake up the corresponding trigger object by including a voice awakening word in the speech, such as "trigger voice input method" or "trigger voice assistant". Therefore, voice awakening word detection needs to be performed on the voice trigger event.
Step B2, if yes, determining the awakening attribute of the voice awakening word;
step B3, if the voice awakening attribute is a voice input method attribute, starting a first recording function of the voice input method;
step B4, if the voice awakening attribute is a voice assistant attribute, starting a second recording function of the voice assistant;
If a voice awakening word exists, the user has already specified the trigger object, and the corresponding recording function is started according to the voice awakening word. The key is to determine the awakening attribute of the voice awakening word: if the voice awakening attribute is the voice input method attribute, the first recording function corresponding to the voice input method is started; if the voice awakening attribute is the voice assistant attribute, the second recording function corresponding to the voice assistant is started.
And step B5, if not, acquiring the input method state of the preset voice input method.
If no voice awakening word exists, the input method state of the voice input method is acquired directly, and the voice trigger object is determined according to the input method state.
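Steps B1 to B5 can be sketched as a small routing function. This is only an illustrative sketch: the table `WAKE_WORDS`, the function `route_trigger`, and the simple substring matching are assumptions, and the two phrases are the examples given in this embodiment. A key trigger from the remote controller is modelled as having no utterance.

```python
# Hypothetical mapping from voice awakening word to awakening attribute.
WAKE_WORDS = {
    "trigger voice input method": "voice_input_method",
    "trigger voice assistant": "voice_assistant",
}


def route_trigger(utterance, input_method_popped_up):
    """Return which recording function to start for a voice trigger event."""
    text = (utterance or "").lower()
    # Steps B1/B2: look for a voice awakening word and determine its attribute.
    for phrase, attribute in WAKE_WORDS.items():
        if phrase in text:
            # Steps B3/B4: the awakening attribute selects the recording function.
            if attribute == "voice_input_method":
                return "first_recording"
            return "second_recording"
    # Step B5: no awakening word, so fall back to the input method state.
    return "first_recording" if input_method_popped_up else "second_recording"
```

When an awakening word is present it takes precedence over the input method state, because the user has already named the trigger object.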
Further, based on the first embodiment, a third embodiment of the voice trigger method of the present invention is provided, in which the step of acquiring the input method state of the preset voice input method includes:
step C1, acquiring the device serial number and the trigger timestamp in the voice trigger event;
step C2, if the device serial number is consistent with the preset serial number and the trigger timestamp is consistent with the current timestamp, determining that the voice trigger event is a legal trigger event;
and step C3, if the voice trigger event is a legal trigger event, acquiring the input method state of the preset voice input method.
The voice trigger event may be a forged, illegal event. To ensure the data security of the smart television, this embodiment performs validity verification by acquiring the device serial number and the trigger timestamp in the voice trigger event. The device serial number is the device authentication number of the device from which the voice trigger event originates; since a smart television system is usually produced by a single manufacturer, the device serial number is an important factor in legality authentication. The trigger timestamp proves the generation time of the current voice trigger event and is kept secret by an encryption algorithm, which avoids interference and confusion from other, malicious timestamps. Correspondingly, the smart television stores a preset serial number and a current timestamp. If the device serial number and the trigger timestamp are legal, they correspond to the preset serial number and the current timestamp in the smart television respectively; the verification passes, the voice trigger event is a legal trigger event, and the smart television acquires the input method state of the preset voice input method. Otherwise, the smart television cannot match the device serial number and the trigger timestamp, the verification fails, and the voice trigger event is an illegal trigger event and is not processed.
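The legality check of steps C1 to C3 might look like the following Python sketch. The function name `is_legal_trigger`, the event field names, and the tolerance window used to decide that two timestamps are "consistent" are all assumptions; the disclosure does not define them, and the encryption of the timestamp mentioned above is omitted here.

```python
import time


def is_legal_trigger(event, preset_serial, now=None, tolerance_s=2.0):
    """Check the device serial number and trigger timestamp of a trigger event."""
    now = time.time() if now is None else now
    # Step C1: read the (hypothetically named) fields from the event.
    serial = event.get("device_serial")
    trigger_ts = event.get("trigger_timestamp")
    # Step C2: both the serial number and the timestamp must match.
    serial_ok = serial == preset_serial
    # Assumption: "consistent" means within a small tolerance of the current time.
    timestamp_ok = trigger_ts is not None and abs(now - trigger_ts) <= tolerance_s
    return serial_ok and timestamp_ok
```

Only when this check returns true would the input method state be acquired (step C3); otherwise the event is ignored.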
Further, based on the first embodiment, a fourth embodiment of the voice triggering method of the present invention is provided, in which the voice triggering method further includes:
step b, uploading a voice interaction instruction obtained according to the first recording function or the second recording function to a preset cloud platform;
step c, acquiring a voice control instruction fed back by the preset cloud platform based on the voice interaction instruction, and determining a target application of the voice control instruction;
and d, sending the voice control instruction to the target application.
In this embodiment, the voice interaction instruction obtained by the voice input method or the voice assistant is uploaded to the cloud platform, and the voice control instruction fed back by the cloud platform is obtained; the voice control instruction corresponds to a target application to be controlled. In a conventional voice triggering scheme, the voice control instruction must first be parsed and filtered by a controller, and only the resulting final voice instruction is sent to the corresponding target application for processing by the target application's process. This embodiment has no controller link, so the voice control instruction can be sent directly to the target application; the target application obtains the voice control instruction sooner and therefore gives voice feedback and responds more quickly.
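Steps b to d can be illustrated by the following sketch, in which the cloud platform is represented by an injected callable so that no real network interface is implied. The function `dispatch_voice_instruction`, the `cloud_resolve` callable, and the `target_app` and `command` keys are hypothetical names chosen for illustration.

```python
def dispatch_voice_instruction(interaction, cloud_resolve, applications):
    """Upload an interaction, then send the returned control instruction to its target."""
    # Step b: upload the voice interaction instruction (cloud_resolve is a stand-in).
    control = cloud_resolve(interaction)
    # Step c: determine the target application of the voice control instruction.
    app_name = control["target_app"]
    handler = applications.get(app_name)
    if handler is None:
        raise KeyError(f"no target application registered for {app_name!r}")
    # Step d: send the instruction directly, with no controller in between.
    handler(control["command"])
    return app_name
```

Here `applications` maps application names to callables that accept a command, which keeps the dispatch step a single direct call.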
In addition, an embodiment of the present invention further provides a voice trigger device, where the voice trigger device includes:
the state module is used for acquiring the input method state of a preset voice input method if a voice trigger event is detected;
the pop-up module is used for starting a first recording function of the voice input method if the input method state is the pop-up state;
and the non-pop-up module is used for generating a voice static broadcast according to the voice trigger event if the input method state is a non-pop-up state, and triggering a preset voice assistant based on the voice static broadcast so as to start a second recording function of the voice assistant.
Optionally, the non-pop-up module comprises:
and the non-pop-up unit is used for generating a voice assistant exclusive symbol based on the voice trigger event if the input method state is a non-pop-up state, and generating a voice static broadcast according to the voice assistant exclusive symbol and the voice trigger event.
Optionally, the voice trigger device further includes:
and the release module is used for releasing the memory resource occupied by the voice assistant if the second recording function does not acquire any voice interaction event within the preset time length.
Optionally, the non-pop-up module comprises:
the abnormal detection unit is used for detecting the abnormal pop-up state of the voice input method and obtaining a detection result if the input method state is the non-pop-up state;
the abnormal unit is used for outputting abnormal early warning information if the detection result is that the pop-up state is abnormal;
and the normal unit is used for generating the voice static broadcast according to the voice trigger event if the detection result is that the pop-up state is normal.
Optionally, the state module comprises:
the judging unit is used for judging whether a voice awakening word exists in the voice triggering event or not if the voice triggering event is detected;
the determining unit is used for determining the awakening attribute of the voice awakening word if the voice awakening word exists;
the voice input method unit is used for starting a first recording function of the voice input method if the voice awakening attribute is the attribute of the voice input method;
the voice assistant unit is used for starting a second recording function of the voice assistant if the voice awakening attribute is the voice assistant attribute;
and the state unit is used for acquiring the input method state of the preset voice input method if the voice awakening word does not exist.
Optionally, the state module further comprises:
the acquisition unit is used for acquiring the device serial number and the trigger timestamp in the voice trigger event;
the confirming unit is used for confirming that the voice trigger event is a legal trigger event if the device serial number is consistent with a preset serial number and the trigger timestamp is consistent with the current timestamp;
and the legal unit is used for acquiring the input method state of the preset voice input method if the voice trigger event is a legal trigger event.
Optionally, the voice trigger device further includes:
the uploading module is used for uploading the voice interaction instruction obtained according to the first recording function or the second recording function to a preset cloud platform;
the application module is used for acquiring a voice control instruction fed back by the preset cloud platform based on the voice interaction instruction and determining a target application of the voice control instruction;
and the sending module is used for sending the voice control instruction to the target application.
In addition, an embodiment of the present invention further provides an apparatus, where the apparatus includes: a memory 109, a processor 110, and a voice trigger program stored on the memory 109 and executable on the processor 110, the voice trigger program, when executed by the processor 110, implementing the steps of the embodiments of the voice trigger method described above.
In addition, the present invention also provides a computer storage medium, which stores a voice trigger program, and the voice trigger program can be further executed by a processor for implementing the steps of the embodiments of the voice trigger method.
The specific implementation of the apparatus and the computer storage medium of the present invention is basically the same as that of the embodiments of the voice trigger method, and is not described here again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) as described above, and includes instructions for enabling a device (such as a mobile phone, a computer, a smart tv, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A voice triggering method, characterized in that the voice triggering method comprises:
if a voice trigger event is detected, acquiring an input method state of a preset voice input method;
if the input method state is a pop-up state, starting a first recording function of the voice input method;
and if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event, and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant.
2. The voice trigger method according to claim 1, wherein if the input method state is a non-pop-up state, the step of generating a voice static broadcast according to the voice trigger event comprises:
and if the input method state is a non-pop-up state, generating a voice assistant exclusive symbol based on the voice trigger event, and generating a voice static broadcast according to the voice assistant exclusive symbol and the voice trigger event.
3. The voice trigger method according to claim 1, wherein after the step of, if the input method state is a non-pop-up state, generating a voice static broadcast according to the voice trigger event and triggering a preset voice assistant based on the voice static broadcast to start a second recording function of the voice assistant, the method further comprises:
and if the second recording function does not acquire any voice interaction event within the preset time length, releasing the memory resource occupied by the voice assistant.
4. The voice trigger method according to claim 1, wherein the step of generating a voice static broadcast according to the voice trigger event if the input method state is a non-pop-up state comprises:
if the input method state is a non-pop-up state, performing pop-up state anomaly detection on the voice input method to obtain a detection result;
if the detection result is that the pop-up state is abnormal, outputting abnormal early warning information;
and if the detection result is that the pop-up state is normal, generating a voice static broadcast according to the voice trigger event.
5. The voice trigger method according to claim 1, wherein the step of obtaining the input method status of the preset voice input method if the voice trigger event is detected comprises:
if a voice trigger event is detected, judging whether a voice awakening word exists in the voice trigger event;
if yes, determining the awakening attribute of the voice awakening word;
if the voice awakening attribute is the attribute of the voice input method, starting a first recording function of the voice input method;
if the voice awakening attribute is a voice assistant attribute, starting a second recording function of the voice assistant;
if not, acquiring the input method state of the preset voice input method.
6. The voice trigger method of claim 1, wherein the step of acquiring the input method state of the preset voice input method comprises:
acquiring a device serial number and a trigger timestamp in the voice trigger event;
if the device serial number is consistent with a preset serial number and the trigger timestamp is consistent with the current timestamp, determining that the voice trigger event is a legal trigger event;
and if the voice trigger event is a legal trigger event, acquiring the input method state of the preset voice input method.
7. The voice trigger method of any one of claims 1-6, wherein the voice trigger method further comprises:
uploading a voice interaction instruction obtained according to the first recording function or the second recording function to a preset cloud platform;
acquiring a voice control instruction fed back by the preset cloud platform based on the voice interaction instruction, and determining a target application of the voice control instruction;
and sending the voice control instruction to the target application.
8. A voice trigger device, the voice trigger device comprising:
the state module is used for acquiring the input method state of a preset voice input method if a voice trigger event is detected;
the pop-up module is used for starting a first recording function of the voice input method if the input method state is the pop-up state;
and the non-pop-up module is used for generating a voice static broadcast according to the voice trigger event if the input method state is a non-pop-up state, and triggering a preset voice assistant based on the voice static broadcast so as to start a second recording function of the voice assistant.
9. An apparatus, characterized in that the apparatus comprises: memory, a processor and a voice trigger program stored on the memory and executable on the processor, the voice trigger program when executed by the processor implementing the steps of the voice trigger method according to any one of claims 1 to 7.
10. A computer storage medium having stored thereon a voice trigger program which, when executed by a processor, implements the steps of the voice trigger method of any one of claims 1 to 7.
CN201911424389.1A 2019-12-30 2019-12-30 Voice triggering method, device, equipment and computer storage medium Active CN110933500B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911424389.1A CN110933500B (en) 2019-12-30 2019-12-30 Voice triggering method, device, equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911424389.1A CN110933500B (en) 2019-12-30 2019-12-30 Voice triggering method, device, equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN110933500A true CN110933500A (en) 2020-03-27
CN110933500B CN110933500B (en) 2022-07-29

Family

ID=69854624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911424389.1A Active CN110933500B (en) 2019-12-30 2019-12-30 Voice triggering method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN110933500B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882394A (en) * 2021-01-12 2021-06-01 北京小米松果电子有限公司 Device control method, control apparatus, and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104346127A (en) * 2013-08-02 2015-02-11 腾讯科技(深圳)有限公司 Realization method, realization device and terminal for voice input
US20150363165A1 (en) * 2014-06-11 2015-12-17 Huawei Technologies Co., Ltd. Method For Quickly Starting Application Service, and Terminal
CN107919123A (en) * 2017-12-07 2018-04-17 北京小米移动软件有限公司 More voice assistant control method, device and computer-readable recording medium

Also Published As

Publication number Publication date
CN110933500B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
US8200962B1 (en) Web browser extensions
US10409635B2 (en) Switching method, switching system and terminal for system and/or application program
KR101990598B1 (en) Method and device for recommending solution based on user operation behavior
CN106201574B (en) Application interface starting method and device
US20140156277A1 (en) Information processing device and content retrieval method
US20100235846A1 (en) Information processing apparatus and data output managing system
CN112256225A (en) Screen projection method, server, terminal device and computer readable storage medium
CN109766725B (en) Data processing method, device, intelligent terminal and computer readable medium
CN106940674B (en) Method and device for triggering target event in mobile terminal
CN105975320B (en) Method and device for forbidding installation of third-party application and terminal
CN110362288B (en) Same-screen control method, device, equipment and storage medium
CN108668241B (en) Information reminding method and device, storage medium and electronic equipment
CN110933500B (en) Voice triggering method, device, equipment and computer storage medium
CN110727941A (en) Private data protection method and device, terminal equipment and storage medium
US20170325003A1 (en) A video signal caption system and method for advertising
CN109213442B (en) File copying method, terminal device and computer readable storage medium
CN113596593A (en) Multi-terminal interaction method, television and computer readable storage medium
CN106569851B (en) Application program processing method and device
US20190065998A1 (en) Flash Control Method and Apparatus
CN109976790B (en) Application updating method, device, terminal and storage medium
CN109840113B (en) Application data processing method and equipment, storage medium and terminal thereof
CN110659082A (en) Application program interface display method and device, terminal and storage medium
EP3239880B1 (en) Legal installation package acquiring method and apparatus, computer program and recording medium
CN107968799B (en) Information acquisition method, terminal equipment and system
CN109819330B (en) Live broadcast room jumping method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant