CN113393839A - Intelligent terminal control method, storage medium and intelligent terminal - Google Patents


Info

Publication number
CN113393839A
Authority
CN
China
Prior art keywords
screen
intelligent terminal
state
user
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110937087.5A
Other languages
Chinese (zh)
Other versions
CN113393839B (en)
Inventor
帅丹 (Shuai Dan)
王鑫 (Wang Xin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Jimi Technology Co Ltd
Chengdu XGIMI Technology Co Ltd
Original Assignee
Chengdu Jimi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Jimi Technology Co Ltd filed Critical Chengdu Jimi Technology Co Ltd
Priority to CN202110937087.5A
Publication of CN113393839A
Application granted
Publication of CN113393839B
Active legal status
Anticipated expiration legal status

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/12 Picture reproducers
    • H04N9/31 Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3141 Constructional details thereof
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses an intelligent terminal control method, a storage medium and an intelligent terminal. The method comprises the following steps: acquiring a voice instruction; performing semantic recognition on the voice data carried by the voice instruction; when the intelligent terminal is in the screen-off state, if it is determined from the semantic recognition result that the screen does not need to be turned on to respond to the voice instruction, responding to the voice instruction in the screen-off state; and if it is determined from the semantic recognition result that the screen does need to be turned on, determining, according to a target parameter, whether to switch the intelligent terminal from the screen-off state to the screen-on state; if so, switching to the screen-on state and responding to the voice instruction, and if not, responding in the screen-off state. By implementing this scheme, when the intelligent terminal is operating in the screen-off state, the screen can be turned on in a single step to meet the user's needs, improving the user experience. At the same time, because turning on the screen is explicitly confirmed, the user experience is protected from factors such as misrecognition.

Description

Intelligent terminal control method, storage medium and intelligent terminal
Technical Field
The invention relates to the technical field of voice interaction, in particular to an intelligent terminal control method, a storage medium and an intelligent terminal.
Background
With continued technological progress, intelligent terminals carry more and more functions. Taking a projector as an example: a traditional projector can only project in the screen-on state, and its screen-off state is simply the powered-off state; a current projector, by contrast, projects in the screen-on state and enters a speaker (audio-only) mode in the screen-off state, i.e. its working states include both a screen-on state and a screen-off state. At the same time, people pursue an ever more intelligent terminal control experience, and voice control is a common control means. Combining voice control with the multi-functionality of the intelligent terminal is therefore a current research direction.
The information disclosed above is provided only to aid understanding of the background of the invention, and may therefore contain information that does not form part of the prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of this, embodiments of the present invention provide an intelligent terminal control method, a storage medium and an intelligent terminal, which can turn on the screen in a single step to meet the user's needs when the intelligent terminal is operating in the screen-off state.
In a first aspect, an embodiment of the present invention provides an intelligent terminal control method, where a working state of an intelligent terminal includes a screen-on state and a screen-off state, and the method includes:
acquiring a voice instruction;
performing semantic recognition on voice data carried by the voice instruction;
when the intelligent terminal is in the screen-off state, if it is determined from the semantic recognition result that the screen does not need to be turned on to respond to the voice instruction, responding to the voice instruction in the screen-off state; and if it is determined from the semantic recognition result that the screen does need to be turned on to respond to the voice instruction, determining, according to a target parameter, whether to switch the intelligent terminal from the screen-off state to the screen-on state; if so, switching the intelligent terminal to the screen-on state and responding to the voice instruction, and if not, responding to the voice instruction in the screen-off state.
In one possible implementation, the target parameter includes any one or more of the following: the semantic recognition result, a voiceprint recognition result of the voice data, the system time, an image of the projection area, the ambient light brightness, and user behavior data of the intelligent terminal.
In a possible implementation manner, the responding to the voice command in the screen-off state includes voice broadcasting, and the wake-up-free state is maintained during the voice broadcasting.
In one possible implementation, when the wake-up-free state is maintained during the voice broadcast, the conditions for making a further voice response include:
the continuing speaker is the user who is currently speaking, and the semantics of the continuing speaker's input speech are related to the current semantic scene.
In a possible implementation manner, the determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the target parameter includes at least one of the following:
if the semantic recognition result is an explicit screen-on semantic, confirming that the intelligent terminal is to be switched from the screen-off state to the screen-on state;
if the user who input the voice instruction is determined to be a child based on the voiceprint recognition result and the screen-on child lock is enabled on the intelligent terminal, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state;
if the projection area of the intelligent terminal is determined to be occupied based on the image of the projection area, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state;
if the current time is determined to be night time based on the system time and the ambient light brightness is below a brightness threshold, determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal.
In a possible implementation manner, the determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the target parameter includes:
judging whether the semantic recognition result is an explicit screen-on semantic, and if so, confirming that the intelligent terminal is to be switched from the screen-off state to the screen-on state;
performing voiceprint recognition on the voice data, and if the user who input the voice instruction is determined to be a child based on the voiceprint recognition result and the screen-on child lock is enabled on the intelligent terminal, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state and prompting that a child cannot turn on the screen for use at the current time;
judging whether the projection area of the intelligent terminal is occupied based on the image of the projection area, and if so, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state and giving a prompt;
judging whether the current time is night time and the ambient light brightness is below the brightness threshold, and if both conditions hold, determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal;
prompting the user whether to switch the intelligent terminal from the screen-off state to the screen-on state in order to respond; if the user confirms the switch, confirming that the intelligent terminal is to be switched from the screen-off state to the screen-on state, and otherwise confirming that it is not.
In a possible implementation manner, determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal includes:
generating a user screen-on event table according to the user's current voice behavior data, where the user screen-on event table includes a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, the time period, the semantic domain, the current state, and whether the screen needs to be turned on;
inputting the data in the user screen-on event table into a screen-on model trained with the historical user behavior data of the intelligent terminal, and determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the output of the screen-on model.
In a possible implementation manner, the method for training the screen-on model includes:
generating a first user event table from collected logs of the user's voice-related behavior during night time, where each log includes a user ID and any one or more of the following fields: the time of voice use, the semantic domain, the current state, whether the screen needs to be turned on, and whether the screen was turned on; the first user event table includes a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, the time period, the semantic domain, the current state, whether the screen needs to be turned on, and whether the screen was turned on;
counting, from the first user event table, the user's voice usage during preset night-time periods, and generating a user voice usage event table that includes a user ID field and any one or more of the following fields: the number of voice uses, the number of screen-on uses, the number of times a screen-on instruction issued in the screen-off state finally led to the screen being turned on, the same count for cases where the next day is a holiday, and the same count for cases where the next day is not a holiday;
retaining only the rows of the first user event table whose current state is screen-off and whose screen needs to be turned on, and generating a second user event table containing the same fields as the first user event table;
training the screen-on model using the user voice usage event table and the second user event table.
In a second aspect, an embodiment of the present invention provides an intelligent terminal control device, where a working state of an intelligent terminal includes a screen-on state and a screen-off state, and the intelligent terminal control device includes:
the voice instruction acquisition module is used for acquiring a voice instruction;
the semantic recognition module is used for carrying out semantic recognition on the voice data carried by the voice command;
the screen-on module is used for: when the intelligent terminal is in the screen-off state, if it is determined from the semantic recognition result that the screen does not need to be turned on, responding to the voice instruction in the screen-off state; and if it is determined from the semantic recognition result that the screen needs to be turned on to respond to the voice instruction, determining, according to the target parameter, whether to switch the intelligent terminal from the screen-off state to the screen-on state; if so, switching the intelligent terminal to the screen-on state and responding to the voice instruction, and if not, responding to the voice instruction in the screen-off state.
In a third aspect, an embodiment of the present invention provides an intelligent terminal, where the intelligent terminal includes a processor and a memory, where the memory stores instructions executable by the processor, and the instructions are loaded and executed by the processor, so as to implement the intelligent terminal control method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the intelligent terminal control method according to the first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer program product, which includes instructions that, when executed by at least one processor, cause the at least one processor to execute the intelligent terminal control method according to the first aspect.
It should be noted that the apparatus of the second aspect, the intelligent terminal of the third aspect, the storage medium of the fourth aspect and the computer program product of the fifth aspect are all used to execute the method provided by the first aspect, and therefore achieve the same beneficial effects as that method; the details are not repeated in the embodiments of the present invention.
By implementing the scheme of the invention, when the intelligent terminal is operating in the screen-off state, the screen can be turned on in a single step to meet the user's needs, improving the user experience. At the same time, because turning on the screen is explicitly confirmed, the user experience is protected from factors such as misrecognition.
Drawings
The invention will now be described by way of example and with reference to the accompanying drawings in which:
fig. 1 is a flowchart of an intelligent terminal control method according to an embodiment of the present invention;
fig. 2 is a flowchart of a method for confirming whether to open a screen according to an embodiment of the present invention.
Detailed Description
To make the technical solutions of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present invention are described below clearly and completely. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present invention. Moreover, although the present disclosure is described in terms of one or more exemplary embodiments, each aspect of the disclosure can be implemented separately. The embodiments described below and the features of the embodiments can be combined with each other provided there is no conflict.
In the embodiments of the present invention, words such as "exemplary" and "for example" mean serving as an example, instance or illustration. Any embodiment or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments or designs; rather, these words are intended to present concepts in a concrete fashion.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The use of "first," "second," and similar terms in the present application do not denote any order, quantity, or importance, but rather the terms are used to distinguish one element from another, and may or may not be identical in meaning. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The term "and/or" is meant to encompass any and all possible combinations of one or more of the associated listed items. The character "/" generally indicates that the preceding and following associated objects are in an "or" relationship.
In the embodiment of the present invention, the intelligent terminal may be a television or a projector, and the working state of the intelligent terminal includes a screen-on state and a screen-off state.
Fig. 1 is a flowchart of an intelligent terminal control method according to an embodiment of the present invention. As shown in fig. 1, the intelligent terminal control method includes the following steps:
s101, acquiring a voice command.
The intelligent terminal can obtain a voice instruction through a voice receiving device (such as a remote controller, a far-field microphone array, and the like), for example: the microphone of the remote controller transmits data to the intelligent terminal through Bluetooth, and the far-field microphone array acquires voice instructions and the like.
And S102, performing semantic recognition on voice data carried by the voice command.
Speech recognition is performed on the voice data carried by the voice instruction to convert it into text data, and semantic recognition is then performed on the text data. If the semantic recognition result could fall into two semantic domains, the response is biased toward one domain according to the screen-on/screen-off state of the intelligent terminal. For example, if the user inputs "love dream" (a title that exists both as a song and as a film), the terminal responds in the music domain in the screen-off state and in the film domain in the screen-on state.
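The domain-biasing rule above can be sketched as follows. This is only an illustrative stand-in: the function name, the domain labels and the tie-breaking policy are assumptions, not part of the patent.

```python
# Hypothetical sketch of biasing an ambiguous semantic-recognition result by
# screen state: audio-only domains win when the screen is off, visual domains
# win when it is on. Domain labels are illustrative assumptions.
def resolve_domain(matched_domains, screen_on):
    """Pick one domain when semantic recognition matches several."""
    if len(matched_domains) == 1:
        return matched_domains[0]
    preferred = "film" if screen_on else "music"
    return preferred if preferred in matched_domains else matched_domains[0]
```

For the "love dream" example, `resolve_domain(["music", "film"], screen_on=False)` yields the music domain.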
S103, determining whether the intelligent terminal is in the screen-on state.
If the intelligent terminal is in the screen-on state, the instruction is responded to directly according to the recognized semantics. If the intelligent terminal is in the screen-off state, whether the screen needs to be turned on to respond to the voice instruction is judged from the semantic recognition result. Specifically, if the response can be completed without relying on a User Interface (UI) display, it is judged that the screen does not need to be turned on; for example, voice instructions for querying the weather, listening to songs or adjusting the volume need no screen. If the response must rely on a UI display, it is judged that the screen needs to be turned on; for example, voice instructions for viewing pictures or videos require a screen-on response. If it is determined from the semantic recognition result that the screen does not need to be turned on, the voice instruction is responded to in the screen-off state. If it is determined that the screen does need to be turned on, whether to switch the intelligent terminal from the screen-off state to the screen-on state is determined according to the target parameter; if so, the intelligent terminal is switched to the screen-on state and responds to the voice instruction, and if not, the voice instruction is responded to in the screen-off state. The voice instruction may be of any type, for example: query-type instructions (e.g. querying the weather), tool-type instructions (e.g. setting an alarm), control-type instructions (e.g. adjusting the volume), and so on, which are not specifically limited in the embodiments of the present invention.
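The branching in step S103 can be summarised in a short sketch. The domain sets and return strings are illustrative assumptions; `should_open_screen` stands in for the target-parameter confirmation, and the patent only gives weather/songs/volume versus pictures/videos as instances of the two classes.

```python
# Illustrative sketch of step S103. Anything not in NO_UI_DOMAINS is treated
# as UI-dependent; UI_DOMAINS is listed only for clarity.
NO_UI_DOMAINS = {"weather", "music", "volume", "alarm"}  # answerable by voice
UI_DOMAINS = {"pictures", "video"}                       # need the display

def handle_command(domain, screen_on, should_open_screen):
    """should_open_screen() implements the target-parameter confirmation."""
    if screen_on:
        return "respond on screen"          # respond directly per semantics
    if domain in NO_UI_DOMAINS:
        return "respond in screen-off state"
    if should_open_screen():                # UI needed: confirm first
        return "open screen and respond"
    return "respond in screen-off state"
```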
The target parameter described above may include any one or more of the following: the semantic recognition result, a voiceprint recognition result of the voice data, the system time, an image of the projection area, the ambient light brightness, and user behavior data of the intelligent terminal.
For example, when the voice instruction is a query-type instruction, responding to it usually requires a voice broadcast. For instance, for the query "what is the weather like in city XX today", the typical response is: query today's weather for city XX, convert the query result from text data into voice data, and broadcast the voice data to the user; this is a complete TTS (Text To Speech) broadcast process.
In an embodiment of the present invention, responding to the voice instruction in the screen-off state includes a voice broadcast, and the wake-up-free state is maintained during the voice broadcast. When responding in the screen-off state, the intelligent terminal can drop the UI display and expand the semantics into the TTS, no longer displaying pictures and text. For example, for the query "what is the weather like this week": in the screen-on state, to avoid duplicating the UI display, the TTS is kept relatively short, broadcasting only today's weather while the UI displays today's weather, the next 7 days and related conditions; in the screen-off state, since no UI is displayed, all of this content can be moved into the voice broadcast (such as the TTS broadcast). In the screen-off state the TTS broadcast may therefore be long; for as long as the broadcast has not finished, the wake-up-free state is maintained, and the user can speak directly without saying the wake-up word (whereas in the screen-on state the wake-up word is required except during multi-turn dialogue). The conditions for the next response in the wake-up-free state include:
(a) The continuing speaker is the user who is currently speaking (speaker recognition techniques may be employed).
(b) The semantics of the continuing speaker's input speech are related to the current semantic scene (NLU (Natural Language Understanding) techniques may be employed). For example, if the current semantics are "what is the weather like this week" and the same user continues with "what about the weather in Beijing", "say no more", or "any clothing suggestions", the continued speech is associated with the current semantic scene of "what is the weather like this week" and can be responded to directly; whereas an unrelated input such as "what movies are on tomorrow" is not responded to.
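Conditions (a) and (b) can be combined in a small sketch. The keyword-overlap check is a crude stand-in for a real NLU relevance model and is purely an assumption made for illustration.

```python
# Hedged sketch of the wake-up-free continuation check: same speaker plus
# semantic relatedness. Keyword overlap stands in for NLU scene relevance.
def related_to_scene(scene_keywords, utterance):
    """Crude relevance test: the utterance shares a keyword with the scene."""
    return any(k in utterance for k in scene_keywords)

def may_respond_without_wakeword(same_speaker, scene_keywords, utterance):
    # Both conditions (a) and (b) must hold for a wake-word-free response.
    return same_speaker and related_to_scene(scene_keywords, utterance)
```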
In some embodiments, confirming whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the target parameter includes at least one of:
if the semantic recognition result is an explicit screen-on semantic, confirming that the intelligent terminal is to be switched from the screen-off state to the screen-on state;
if the user who input the voice instruction is determined to be a child based on the voiceprint recognition result and the screen-on child lock is enabled on the intelligent terminal, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state;
if the projection area of the intelligent terminal is determined to be occupied based on the image of the projection area, confirming that the intelligent terminal is not to be switched from the screen-off state to the screen-on state;
if the current time is determined to be night time based on the system time and the ambient light brightness is below a brightness threshold (the threshold can be chosen between 80 LUX and 400 LUX, e.g. 100 LUX), determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal.
Exemplarily, as shown in fig. 2, determining whether to switch the intelligent terminal from the screen-off state to the screen-on state according to the target parameter includes:
S3001, judging whether the semantic recognition result is an explicit screen-on semantic. For example, "open the screen", "turn on the optical engine" and the like are explicit screen-on semantics. If so, the screen is confirmed to be opened, i.e. the intelligent terminal is confirmed to be switched from the screen-off state to the screen-on state; otherwise, proceed to the next step.
S3002, identifying the age bracket of the current speaker using voiceprint recognition. If the speaker is a child and the screen-on child lock is enabled on the intelligent terminal (the user can set a time period during which the child lock is active, such as 0:00-6:00), the screen is not opened, i.e. the intelligent terminal is not switched from the screen-off state to the screen-on state, and a prompt is given that a child cannot turn on the screen for use at the current time; otherwise, proceed to the next step.
S3003, judging whether the projection area of the intelligent terminal is occupied. If the intelligent terminal is a projector, an infrared sensor, a camera on the projector or the like can be triggered to take a snapshot, and infrared or image recognition technology can be used to judge whether anyone is in the current projection area. If the projection area of the intelligent terminal is occupied, the screen is confirmed not to be opened and a prompt is given; otherwise, proceed to the next step. It should be noted that this step is optional, and the embodiment of the present invention is not limited in this respect.
S3004, judging, based on the system time, whether the current time is night time (for example 0:00-6:00, which can be set according to actual conditions) and whether the ambient light is dark. Darkness can be identified from a snapshot picture, i.e. by checking whether the ambient light brightness is below a preset value, for example below 100 LUX. If the current time is night time and the ambient light is dark, whether to open the screen is determined according to the user behavior data of the intelligent terminal; otherwise, proceed to the next step.
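The checks S3001-S3004 fall through in order and end with the user prompt; a minimal sketch follows, assuming all predicate inputs come from upstream modules (semantic recogniser, voiceprint model, occupancy detector, light sensor).

```python
# Sketch of the ordered checks S3001-S3004 plus the final user prompt.
# The thresholds echo the examples given in the description
# (night time 0:00-6:00, 100 LUX); both are configurable in the patent.
NIGHT_HOURS = range(0, 6)
LUX_THRESHOLD = 100

def decide_open_screen(explicit_open, speaker_is_child, child_lock_on,
                       area_occupied, hour, ambient_lux,
                       model_says_open, user_confirms):
    if explicit_open:                        # S3001: explicit screen-on semantic
        return True
    if speaker_is_child and child_lock_on:   # S3002: child lock active
        return False
    if area_occupied:                        # S3003: projection area occupied
        return False
    if hour in NIGHT_HOURS and ambient_lux < LUX_THRESHOLD:
        return model_says_open               # S3004: defer to behavior model
    return user_confirms                     # otherwise ask the user
```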
In some embodiments, determining whether to open the screen based on the user's behavior data includes training a screen-on model and predicting with the trained model.
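The patent does not name a model family, so the prediction step is sketched here with a deliberately simple frequency heuristic; any trained binary classifier over the event-table fields would fill the same slot.

```python
# Stand-in for the trained screen-on model: the per-user fraction of past
# night-time screen-needed events that finally led to opening the screen.
# The 0.5 threshold and the cold-start default are illustrative assumptions.
def predict_open(history_opens, history_total, threshold=0.5):
    if history_total == 0:
        return False        # no history: default to not opening (assumption)
    return history_opens / history_total >= threshold
```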
(1) Training the screen-on model.
S6001, collect logs of the user's voice-related behavior during night time, each log comprising a user ID and any one or more of the following fields: the time of voice use, the semantic domain, the current state, whether the screen needs to be turned on, and whether the screen was turned on. Whether the user finally turned on the screen is judged from the user's historical behavior. Illustratively, the collected logs are shown in Table 1.
TABLE 1
User ID | Voice time          | Semantic domain | Current state | Screen needed? | Screen turned on?
userA   | 2020-07-01 00:30:00 | Video           | On            | No             | Yes
userA   | 2020-07-01 05:40:00 | Video           | Off           | Yes            | No
userB   | 2020-07-02 01:40:00 | Music           | Off           | No             | No
userB   | 2020-07-02 03:50:00 | Video           | Off           | Yes            | No
S6002, generate a first user event table from the collected logs of the users' night-time voice-related behavior, the table including a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, the time period, the semantic domain, the current state, whether the screen needs to be turned on, and whether the screen was turned on. Illustratively, the first user event table is shown in Table 2.
TABLE 2
User ID | Next day holiday/weekend? | Time period | Semantic domain | Current state | Screen needed? | Screen turned on?
userA   | No  | 0 | Video | On  | No  | Yes
userA   | No  | 5 | Video | Off | Yes | No
userB   | Yes | 1 | Music | Off | No  | No
userB   | Yes | 3 | Video | Off | Yes | No
The next-day holiday/weekend flag and the time period are derived from the time-of-use field in Table 1. The time period is the hour of the timestamp: for example, system times 00:00:00-00:59:59 map to period 0, and system times 05:00:00-05:59:59 map to period 5. The next-day holiday/weekend field in Table 2 is optional, which is not limited in the embodiment of the present invention.
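The derivation of the two fields in S6002 can be sketched as follows; the holiday list is a hypothetical placeholder (a real system would query a calendar service):

```python
from datetime import datetime, timedelta

# Hypothetical holiday set for illustration only.
HOLIDAYS = {"2020-10-01"}


def time_period(use_time: str) -> int:
    """Map a timestamp to its hour bucket: 00:00:00-00:59:59 -> 0, 05:00:00-05:59:59 -> 5."""
    return datetime.strptime(use_time, "%Y-%m-%d %H:%M:%S").hour


def next_day_is_holiday_or_weekend(use_time: str) -> bool:
    """True if the day after the voice request is a holiday or falls on a weekend."""
    nxt = datetime.strptime(use_time, "%Y-%m-%d %H:%M:%S") + timedelta(days=1)
    return nxt.strftime("%Y-%m-%d") in HOLIDAYS or nxt.weekday() >= 5
```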
S6003, counting, according to the first user event table, the number of times each user used voice in preset time periods during the night, for example counting voice usage over a recent period (such as one month) within the defined night window (such as 00:00-06:00), and generating a user voice usage event table. The user voice usage event table comprises a user ID field and any one or more of the following fields: number of voice uses; number of uses with the screen on; number of times the screen finally needed to be opened from the screen-off state; number of times the screen finally needed to be opened from the screen-on state; number of times the screen finally needed to be opened from the screen-off state when the next day is a holiday; and number of times the screen finally needed to be opened from the screen-off state when the next day is not a holiday.
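A stand-alone sketch of the S6003 aggregation over first-user-event rows; the row keys and counter names are assumptions reflecting one reading of the field list above:

```python
from collections import defaultdict


def build_usage_table(events):
    """Aggregate per-user counts from first-user-event rows.

    Each event is a dict with keys: user_id, screen_state ("on"/"off"),
    opened_screen (bool), next_day_off (holiday/weekend flag, bool).
    """
    stats = defaultdict(lambda: {
        "voice_uses": 0,
        "screen_on_uses": 0,
        "opened_when_off": 0,          # finally opened the screen from the off state
        "opened_when_on": 0,
        "opened_when_off_holiday": 0,  # off state, next day is a holiday/weekend
        "opened_when_off_workday": 0,  # off state, next day is a workday
    })
    for e in events:
        s = stats[e["user_id"]]
        s["voice_uses"] += 1
        if e["screen_state"] == "on":
            s["screen_on_uses"] += 1
        if e["opened_screen"]:
            if e["screen_state"] == "off":
                s["opened_when_off"] += 1
                key = "opened_when_off_holiday" if e["next_day_off"] else "opened_when_off_workday"
                s[key] += 1
            else:
                s["opened_when_on"] += 1
    return dict(stats)
```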
S6004, retaining only the rows of the first user event table whose current state is screen-off and which needed the screen to be opened, and generating a second user event table, wherein the fields of the second user event table are the same as those of the first user event table. Illustratively, the second user event table is shown in Table 3.
TABLE 3

| User ID | Next day is holiday/weekend | Time period | Semantic domain | Current state | Screen needs to be opened | Screen finally opened |
| --- | --- | --- | --- | --- | --- | --- |
| userA | No | 5 | Video | Screen off | Yes | No |
| userB | Yes | 3 | Video | Screen off | Yes | No |
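The S6004 filtering step reduces to a single predicate over the rows of the first user event table (row keys are illustrative assumptions):

```python
def second_event_table(first_table):
    """Keep only rows where the screen was off and the request needed the screen."""
    return [row for row in first_table
            if row["screen_state"] == "off" and row["needs_screen"]]
```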
S6005, training the screen-opening model using the user voice usage event table and the second user event table. Illustratively, the data in both tables are preprocessed: the numeric counts in the user voice usage event table are standardized and normalized, and the screen-state-related fields in the second user event table are feature-encoded. The training label is whether the screen was opened, with 0 marking not opened and 1 marking opened; the feature vector is the concatenation of the processed user voice usage data and the processed second-user-event data. The model is retrained periodically at regular intervals with a logistic regression algorithm to obtain the latest model file. In the embodiment of the invention the screen-opening model is a logistic regression model, but the invention does not limit the model type; other models such as neural networks may also be used.
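A minimal pure-Python logistic-regression trainer, as a stand-in for the training step; the patent does not disclose hyperparameters, so the learning rate, epoch count, and function names are assumptions:

```python
import math


def train_logreg(X, y, lr=0.1, epochs=500):
    """Train a logistic-regression model by stochastic gradient descent.

    X: list of feature vectors (already standardized / feature-encoded),
    y: list of 0/1 labels (1 = the user opened the screen).
    Returns (weights, bias).
    """
    n = len(X[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - yi                       # gradient of log-loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Return 1 (open the screen) if the predicted probability is at least 0.5."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0
```

In a production system this would typically be replaced by a library implementation retrained on a schedule, as the text describes.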
(2) Predicting with the trained screen-opening model.
a. Processing the user's voice behavior into a user behavior feature set, in the same way as during model training.
b. Generating a user screen-opening event table from the user's current voice behavior data, wherein the user screen-opening event table comprises a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, time period, semantic domain, current state, and whether the screen needs to be opened. For example, the table is generated from the user's current semantics, time, screen state, and so on; an illustrative user screen-opening event table is shown in Table 4.
TABLE 4

| User ID | Next day is holiday/weekend | Time period | Semantic domain | Current state | Screen needs to be opened |
| --- | --- | --- | --- | --- | --- |
| userA | No | 5 | Video | Screen off | Yes |
c. Inputting the data of the user screen-opening event table into the trained screen-opening model and determining whether to open the screen according to the model output. The data in the user voice usage event table and the user screen-opening event table are concatenated and processed exactly as during model training, and the trained logistic regression model then makes the prediction: if it returns 1, opening the screen is confirmed; if it returns 0, not opening the screen is confirmed.
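The feature-vector construction for prediction (concatenating the per-user usage counts with the encoded current event) can be sketched as follows; the encoding is a simplified stand-in for the feature coding described in the text, and all key names are assumptions:

```python
def build_feature_vector(usage_stats, event_row):
    """Concatenate per-user usage counts with the encoded current event.

    usage_stats: dict of numeric counts for this user (already normalized);
    event_row: dict with 'next_day_off' (bool), 'time_period' (int hour),
    'screen_state' ('on'/'off'), and 'needs_screen' (bool).
    """
    features = [float(v) for v in usage_stats.values()]
    features += [
        1.0 if event_row["next_day_off"] else 0.0,
        float(event_row["time_period"]) / 23.0,           # scale the hour to [0, 1]
        1.0 if event_row["screen_state"] == "off" else 0.0,
        1.0 if event_row["needs_screen"] else 0.0,
    ]
    return features
```

The resulting vector is what would be fed to the trained model; returning 1 confirms opening the screen, 0 confirms not opening it.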
S3005, performing a secondary confirmation of whether to open the screen for the response, for example by a TTS prompt asking the user whether the screen should be opened. If the user's answer semantically confirms opening the screen, opening is confirmed, that is, the intelligent terminal is adjusted from the screen-off state to the screen-on state; otherwise, not opening is confirmed, that is, the intelligent terminal is not adjusted from the screen-off state to the screen-on state.
Based on the same inventive concept as the above intelligent terminal control method, an embodiment of the present invention further provides an intelligent terminal control apparatus, comprising:
the voice instruction acquisition module is used for acquiring a voice instruction;
the semantic recognition module is used for performing semantic recognition on the voice data carried by the voice instruction;
the screen-opening module is used for responding to the voice instruction in the screen-off state if the intelligent terminal is in the screen-off state and it is determined according to the semantic recognition result that the screen does not need to be opened for the response; and, if it is determined according to the semantic recognition result that the screen needs to be opened to respond to the voice instruction, confirming according to target parameters whether to adjust the intelligent terminal from the screen-off state to the screen-on state, and if so, adjusting the intelligent terminal to the screen-on state and responding to the voice instruction, and if not, responding to the voice instruction in the screen-off state.
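The control flow implemented by the screen-opening module can be sketched as follows; this is a simplified illustration, and the function and parameter names are assumptions:

```python
def handle_voice_command(screen_is_off: bool, needs_screen: bool, should_open) -> str:
    """Simplified decision flow of the screen-opening module.

    should_open: a callable evaluating the target parameters (semantics,
    voiceprint, system time, ambient light, projection area, user behavior).
    Returns a label for the path the terminal takes.
    """
    if not screen_is_off:
        return "respond normally"          # screen already on
    if not needs_screen:
        return "respond with screen off"   # e.g. answer by voice only
    if should_open():
        return "open screen and respond"
    return "respond with screen off"
```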
An embodiment of the present invention further provides an intelligent terminal comprising a processor and a memory, wherein the memory stores instructions executable by the processor; the instructions, when loaded and executed by the processor, implement the intelligent terminal control method of the above embodiments.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the intelligent terminal control method according to the above-described embodiment.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, some or all of the steps may be executed in parallel or executed sequentially, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented.
The modules described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more modules are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, or the part thereof that essentially contributes to the prior art, can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or a terminal, etc.) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. An intelligent terminal control method, characterized by comprising the following steps:
acquiring a voice instruction;
performing semantic recognition on voice data carried by the voice instruction;
when the intelligent terminal is in a screen-off state, if it is determined according to the semantic recognition result that the screen does not need to be opened to respond to the voice instruction, responding to the voice instruction in the screen-off state; and if it is determined according to the semantic recognition result that the screen needs to be opened to respond to the voice instruction, confirming according to target parameters whether to adjust the intelligent terminal from the screen-off state to a screen-on state, and if so, adjusting the intelligent terminal to the screen-on state and responding to the voice instruction, and if not, responding to the voice instruction in the screen-off state.
2. The intelligent terminal control method according to claim 1, wherein the target parameters comprise any one or more of the following: the semantic recognition result, a voiceprint recognition result of the voice data, system time, an image of a projection area, ambient light brightness, and user behavior data of the intelligent terminal.
3. The intelligent terminal control method according to claim 1, wherein responding to the voice instruction in the screen-off state comprises voice broadcasting, and a wake-up-free state is maintained during the voice broadcast.
4. The intelligent terminal control method according to claim 3, wherein, when the wake-up-free state is maintained during the voice broadcast, the conditions for performing a voice response comprise:
the follow-up speaker is the currently interacting user, and the semantics of the follow-up speaker's input speech are related to the current semantic scene.
5. The intelligent terminal control method according to claim 2, wherein confirming whether to adjust the intelligent terminal from the screen-off state to the screen-on state according to the target parameters comprises at least one of:
if the semantic recognition result is an explicit screen-opening semantic, confirming to adjust the intelligent terminal from the screen-off state to the screen-on state;
if the input user of the voice instruction is determined to be a child based on the voiceprint recognition result and the screen-opening child lock is enabled on the intelligent terminal, confirming not to adjust the intelligent terminal from the screen-off state to the screen-on state;
if it is determined based on the image of the projection area that the projection area of the intelligent terminal is occupied, confirming not to adjust the intelligent terminal from the screen-off state to the screen-on state;
and if it is determined based on the system time that the current time is night time and the ambient light brightness is lower than a brightness threshold, determining whether to adjust the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal.
6. The intelligent terminal control method according to claim 2, wherein determining whether to adjust the intelligent terminal from the screen-off state to the screen-on state according to the target parameters comprises:
judging whether the semantic recognition result is an explicit screen-opening semantic, and if so, confirming to adjust the intelligent terminal from the screen-off state to the screen-on state;
performing voiceprint recognition on the voice data, and if the input user of the voice instruction is determined to be a child based on the voiceprint recognition result and the screen-opening child lock is enabled on the intelligent terminal, confirming not to adjust the intelligent terminal from the screen-off state to the screen-on state, and prompting that the screen cannot be opened for child use at the current time;
judging, based on the image of the projection area, whether the projection area of the intelligent terminal is occupied, and if so, confirming not to adjust the intelligent terminal from the screen-off state to the screen-on state, and issuing a prompt;
judging whether the current time is night time and the ambient light brightness is lower than a brightness threshold, and if so, determining whether to adjust the intelligent terminal from the screen-off state to the screen-on state according to the user behavior data of the intelligent terminal;
and prompting the user whether to adjust the intelligent terminal from the screen-off state to the screen-on state for the response, and if the user confirms, confirming to adjust the intelligent terminal from the screen-off state to the screen-on state, and otherwise, confirming not to adjust the intelligent terminal from the screen-off state to the screen-on state.
7. The intelligent terminal control method according to claim 5 or 6, wherein determining whether to adjust the intelligent terminal from the off-screen state to the on-screen state according to the user behavior data of the intelligent terminal comprises:
generating a user screen-opening event table according to the user's current voice behavior data, wherein the user screen-opening event table comprises a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, time period, semantic domain, current state, and whether the screen needs to be opened;
and inputting the data of the user screen-opening event table into a screen-opening model trained with the user historical behavior data of the intelligent terminal, and determining whether to adjust the intelligent terminal from the screen-off state to the screen-on state according to the output of the screen-opening model.
8. The intelligent terminal control method according to claim 7, wherein the training method of the screen-opening model comprises the following steps:
generating a first user event table according to collected voice-related behavior logs of users during night time, wherein each log comprises a user ID and any one or more of the following fields: time of using voice, semantic domain, current state, whether the screen needs to be opened, and whether the screen was opened; and the first user event table comprises a user ID field and any one or more of the following fields: whether the next day is a holiday or weekend, time period, semantic domain, current state, and whether the screen needs to be opened;
counting, according to the first user event table, the number of voice uses of each user in preset time periods during the night time to generate a user voice usage event table, wherein the user voice usage event table comprises a user ID field and any one or more of the following fields: number of voice uses, number of uses with the screen on, number of times the screen was finally opened from the screen-off state, number of times the screen was finally opened from the screen-off state when the next day is a holiday, and number of times the screen was finally opened from the screen-off state when the next day is not a holiday;
retaining only the data of the first user event table whose current state is screen-off and which needed the screen to be opened, and generating a second user event table, wherein the fields of the second user event table are the same as those of the first user event table;
and training the screen-opening model using the user voice usage event table and the second user event table.
9. An intelligent terminal, characterized in that the intelligent terminal comprises a processor and a memory, wherein the memory stores instructions executable by the processor, and the instructions are loaded and executed by the processor to implement the intelligent terminal control method according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the intelligent terminal control method of any one of claims 1-8.
CN202110937087.5A 2021-08-16 2021-08-16 Intelligent terminal control method, storage medium and intelligent terminal Active CN113393839B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110937087.5A CN113393839B (en) 2021-08-16 2021-08-16 Intelligent terminal control method, storage medium and intelligent terminal


Publications (2)

Publication Number Publication Date
CN113393839A true CN113393839A (en) 2021-09-14
CN113393839B CN113393839B (en) 2021-11-12

Family

ID=77622535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110937087.5A Active CN113393839B (en) 2021-08-16 2021-08-16 Intelligent terminal control method, storage medium and intelligent terminal

Country Status (1)

Country Link
CN (1) CN113393839B (en)

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130234928A1 (en) * 2012-03-09 2013-09-12 Electronics And Telecommunications Research Institute Apparatus and method for controlling screen
US8706827B1 (en) * 2012-06-21 2014-04-22 Amazon Technologies, Inc. Customized speech generation
CN103957366A (en) * 2014-03-31 2014-07-30 福建省科正智能科技有限公司 Method for intelligently controlling use of television by children and system
CN105389077A (en) * 2014-09-01 2016-03-09 三星电子株式会社 Displaying method of electronic device and electronic device thereof
CN106549833A (en) * 2015-09-21 2017-03-29 阿里巴巴集团控股有限公司 A kind of control method and device of intelligent home device
CN107566869A (en) * 2017-10-23 2018-01-09 四川长虹电器股份有限公司 A kind of advertisement TV display brightness dynamic adjusting system and method
CN108470034A (en) * 2018-02-01 2018-08-31 百度在线网络技术(北京)有限公司 A kind of smart machine service providing method and system
CN108540856A (en) * 2018-04-28 2018-09-14 上海与德科技有限公司 Smart television child lock unlocking method, device, storage medium and smart television
CN108919952A (en) * 2018-06-28 2018-11-30 郑州云海信息技术有限公司 A kind of control method, device, equipment and the storage medium of intelligent terminal screen
CN109243431A (en) * 2017-07-04 2019-01-18 阿里巴巴集团控股有限公司 A kind of processing method, control method, recognition methods and its device and electronic equipment
CN109688474A (en) * 2018-12-28 2019-04-26 南京创维信息技术研究院有限公司 TV speech control method, device and computer readable storage medium
US20190147862A1 (en) * 2017-11-16 2019-05-16 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for providing voice service
CN109788360A (en) * 2018-12-12 2019-05-21 百度在线网络技术(北京)有限公司 Voice-based TV control method and device
CN111199728A (en) * 2018-10-31 2020-05-26 阿里巴巴集团控股有限公司 Training data acquisition method and device, intelligent sound box and intelligent television
CN111312241A (en) * 2020-02-10 2020-06-19 深圳创维-Rgb电子有限公司 Unmanned shopping guide method, terminal and storage medium
CN111932296A (en) * 2020-07-20 2020-11-13 中国建设银行股份有限公司 Product recommendation method and device, server and storage medium
CN112017652A (en) * 2019-05-31 2020-12-01 华为技术有限公司 Interaction method and terminal equipment
CN112118431A (en) * 2019-06-20 2020-12-22 青岛海信激光显示股份有限公司 Projection method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
S. M. S. Saranga Senarathna et al., "Intelligent Robot Companion Capable of Controlling Environment Ambiance of Smart Houses by Observing User's Behavior", 2018 2nd International Conference on Electrical Engineering. *
Zhang Jin, "Research and Implementation of Voice Control for Smart Home", China Master's Theses Full-text Database, Engineering Science and Technology II. *

Also Published As

Publication number Publication date
CN113393839B (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN110100447B (en) Information processing method and device, multimedia device and storage medium
JP7322076B2 (en) Dynamic and/or context-specific hotwords to launch automated assistants
CN111683263B (en) Live broadcast guiding method, device, equipment and computer readable storage medium
CN106791921B (en) Processing method and device for live video and storage medium
CN109243431A (en) A kind of processing method, control method, recognition methods and its device and electronic equipment
CN106941619A (en) Program prompting method, device and system based on artificial intelligence
CN110995929A (en) Terminal control method, device, terminal and storage medium
CN110046486B (en) Intelligent interaction equipment control method, system, controller and medium
CN107453980A (en) Problem response method and device in instant messaging
CN112185389A (en) Voice generation method and device, storage medium and electronic equipment
WO2024160041A1 (en) Multi-modal conversation method and apparatus, and device and storage medium
CN115312068A (en) Voice control method, device and storage medium
CN111090733A (en) Human-computer interaction method, device, equipment and readable storage medium
CN111951787A (en) Voice output method, device, storage medium and electronic equipment
CN116109866A (en) Fine tuning model construction method, image classification processing device and electronic equipment
CN113596604B (en) Event reminding processing method and device based on television, intelligent terminal and medium
EP3407096B1 (en) Method and device for determining descriptive information of precipitation trend, and readable storage medium
CN113393839B (en) Intelligent terminal control method, storage medium and intelligent terminal
CN109658924B (en) Session message processing method and device and intelligent equipment
CN114885189A (en) Control method, device and equipment for opening fragrance and storage medium
CN114217841A (en) Application program control method and device, electronic equipment and readable storage medium
CN112906923A (en) Intelligent cinema implementation method and device, computer equipment and storage medium
DE112021003164T5 (en) Systems and methods for recognizing voice commands to create a peer-to-peer communication link
CN113254611A (en) Question recommendation method and device, electronic equipment and storage medium
CN113314115A (en) Voice processing method of terminal equipment, terminal equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant