CN116434746A - Television control method and device, electronic equipment and storage medium - Google Patents

Television control method and device, electronic equipment and storage medium

Info

Publication number
CN116434746A
Authority
CN
China
Prior art keywords
instruction
target
matching degree
instruction type
type
Prior art date
Legal status
Pending
Application number
CN202310196057.2A
Other languages
Chinese (zh)
Inventor
夏帅
黄伟琦
江鹏
唐睿坚
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310196057.2A priority Critical patent/CN116434746A/en
Publication of CN116434746A publication Critical patent/CN116434746A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 - Structure of client; Structure of client peripherals
    • H04N21/422 - Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 - Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS], sound input device, e.g. microphone
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 - Execution procedure of a spoken command
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The disclosure provides a control method and device for a television, an electronic device and a storage medium, relating to the field of computer technology, and in particular to artificial intelligence technologies such as voice recognition and natural language processing. The specific implementation scheme is as follows: acquiring a voice control instruction; recognizing the voice control instruction to obtain a voice recognition result; parsing the voice recognition result based on a preset instruction configuration list to obtain a target instruction type and target keywords corresponding to the voice control instruction; and controlling the television based on the target instruction type and the target keywords. In this way, the target instruction type and the corresponding target keywords of the voice control instruction can be accurately determined based on the preset instruction configuration list, and the television can then be accurately controlled to perform the corresponding operation based on the target instruction type and the corresponding target keywords, thereby improving the accuracy of controlling the television based on voice control instructions.

Description

Television control method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of computers, in particular to the technical field of artificial intelligence such as voice recognition and natural language processing, and specifically relates to a control method and device of a television, electronic equipment and a storage medium.
Background
Currently, as technology in the consumer electronics market is updated ever more quickly, television functions are becoming increasingly powerful, and televisions are becoming increasingly intelligent and user-friendly. With the continuous development of voice technology, voice recognition and voice control technologies have risen rapidly in recent years, and intelligent terminals such as televisions can be made to perform corresponding operations through voice control. Therefore, how to accurately control a television with a voice control instruction is an important research direction.
Disclosure of Invention
The disclosure provides a television control method, a television control device, electronic equipment and a storage medium.
According to a first aspect of the present disclosure, there is provided a control method of a television, including:
acquiring a voice control instruction;
recognizing the voice control instruction to obtain a voice recognition result;
analyzing the voice recognition result based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction;
and controlling the television based on the target instruction type and the target keyword.
According to a second aspect of the present disclosure, there is provided a control apparatus for a television, including:
the first acquisition module is used for acquiring a voice control instruction;
the second acquisition module is used for recognizing the voice control instruction so as to acquire a voice recognition result;
the third acquisition module is used for analyzing the voice recognition result based on a preset instruction configuration list so as to acquire a target instruction type and a target keyword corresponding to the voice control instruction;
and the control module is used for controlling the television based on the target instruction type and the target keyword.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of controlling a television as described in the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the control method of a television according to the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method of controlling a television as described in the first aspect.
The television control method, the television control device, the electronic equipment and the storage medium have the following beneficial effects:
in the embodiment of the disclosure, a voice control instruction may be received first, then the voice control instruction is identified to obtain a voice recognition result, the voice recognition result is parsed based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction, and finally a television is controlled based on the target instruction type and the target keyword. Therefore, the target instruction type and the corresponding target keywords corresponding to the voice control instruction can be accurately determined based on the preset instruction configuration list, and further, the television can be accurately controlled to perform corresponding operation based on the target instruction type and the corresponding target keywords, so that the accuracy of controlling the television based on the voice control instruction is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a flowchart of a control method of a television according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a control method of a television according to still another embodiment of the present disclosure;
fig. 3 is a flowchart of a control method of a television according to still another embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a control device of a television according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing a control method of a television according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The embodiment of the disclosure relates to the technical field of artificial intelligence such as voice recognition, natural language processing and the like.
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence.
Natural language processing is the processing, understanding and use by computers of human languages (such as Chinese and English). It is an interdisciplinary field of computer science and linguistics and is often referred to as computational linguistics. Natural language is the fundamental mark that distinguishes humans from other animals, and without language there is no human thought to speak of; natural language processing therefore embodies the highest task and frontier of artificial intelligence, that is, machines achieve true intelligence only when computers have the ability to process natural language.
Speech recognition, also known as automatic speech recognition, aims at converting the content of human language into computer readable inputs, such as keys, binary codes or character sequences, etc., so that human interaction with a machine through natural language is possible.
Control methods, apparatuses, electronic devices, and storage media of televisions according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
It should be noted that, the main execution body of the control method of the television in this embodiment is a control device of the television, and the device may be implemented in a software and/or hardware manner, and the device may be configured in an electronic device, where the electronic device may include, but is not limited to, a terminal, a server, and the like.
Fig. 1 is a flowchart of a control method of a television according to an embodiment of the present disclosure.
As shown in fig. 1, the control method of the television includes:
s101: and acquiring a voice control instruction.
It can be understood that when the user needs to perform a corresponding operation through the voice control television, the voice control instruction of the user needs to be acquired first. Optionally, the voice control instruction may be obtained by a voice collecting device, where the voice collecting device may be a microphone, a remote controller with a voice receiving and transmitting function, or a smart phone. The specific method for acquiring the voice control instruction is not limited in the application.
Alternatively, the voice control instruction may be a voice containing content such as "play XX satellite TV channel", "play yesterday's XX program", "turn up the sound", "open XX application", and the like.
S102: and recognizing the voice control instruction to obtain a voice recognition result.
The voice recognition result contains text information corresponding to the voice control instruction.
Specifically, a voice recognition technology may be used to recognize the voice control command to obtain a voice recognition result.
Optionally, after the voice recognition result is obtained, text correction may be further performed on the voice recognition result to obtain a corrected voice recognition result.
The text correction of the voice recognition result may include at least one of the following operations (a minimal sketch follows the list):
replacing words in the voice recognition result based on a preset word stock;
performing hot-word replacement on the voice recognition result;
deleting the mood words in the voice recognition result;
replacing sensitive words contained in the voice recognition result with preset symbols;
converting non-Arabic numerals contained in the voice recognition result into Arabic numerals;
converting the letters contained in the voice recognition result into a specified format, for example, uniformly converting them into lower-case letters or uniformly converting them into upper-case letters.
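As an illustration only, the following Python sketch chains these corrections together; the word stock, hot-word table, mood-word list, sensitive-word list and the choice of lower case are hypothetical placeholders, not values specified by this disclosure.

```python
import re

# Hypothetical correction resources; a real deployment would load these from
# configuration data rather than hard-coding them.
REPLACEMENT_LEXICON = {"中央1台": "中央一台"}   # preset word-stock replacement
HOT_WORDS = {"cctv一": "CCTV1"}                  # hot-word replacement
MOOD_WORDS = ["呢", "吧", "啊"]                  # mood (modal) words to delete
SENSITIVE_WORDS = ["badword"]                    # sensitive words to mask
CN_DIGITS = {"零": "0", "一": "1", "二": "2", "三": "3", "四": "4",
             "五": "5", "六": "6", "七": "7", "八": "8", "九": "9"}

def correct_text(text: str) -> str:
    """Apply the optional text corrections to a raw voice recognition result."""
    for old, new in REPLACEMENT_LEXICON.items():   # preset word-stock replacement
        text = text.replace(old, new)
    for old, new in HOT_WORDS.items():             # hot-word replacement
        text = re.sub(old, new, text, flags=re.IGNORECASE)
    for w in MOOD_WORDS:                           # delete mood words
        text = text.replace(w, "")
    for w in SENSITIVE_WORDS:                      # mask sensitive words
        text = text.replace(w, "*" * len(w))
    for cn, ar in CN_DIGITS.items():               # non-Arabic numerals -> Arabic
        text = text.replace(cn, ar)
    return text.lower()                            # letters to a single, specified case

print(correct_text("打开CCTV一吧"))  # -> "打开cctv1"
```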
S103: and analyzing the voice recognition result based on a preset instruction configuration list to acquire a target instruction type and a target keyword corresponding to the voice control instruction.
The preset instruction configuration list may include word lists corresponding to each instruction type. The various instruction types may include an on-demand type, a live type, a review type, an intelligent application type, a system control type, a weather query type, and the like.
For example, the vocabulary corresponding to the live broadcast type may include television stations that can be played, for example, a first central station and a second central station.
For example, the vocabulary corresponding to the on-demand type may include movie A, television series B, and so on.
For example, the vocabulary corresponding to the review type may include keywords of the television-station class, such as the first central station and the second central station, and keywords of the time class, such as yesterday, the day before yesterday, XX year, XX month, XX day, and so on. The present disclosure is not limited in this regard.
The vocabulary corresponding to the system control type may include, for example, volume, size, level, fast forward, fast rewind, and the like.
The target instruction type may be any one of the on-demand type, the live broadcast type, the review type, the intelligent application type and the system control type.
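For illustration, such an instruction configuration list could be held in memory as a simple mapping from instruction type to vocabulary; the type names and words below are placeholders rather than a configuration taken from the disclosure.

```python
# A minimal, hypothetical in-memory form of the instruction configuration list:
# one vocabulary (word list) per instruction type.
INSTRUCTION_CONFIG = {
    "on_demand":      {"movie A", "TV series B", "comedy", "movie", "feature program"},
    "live":           {"CCTV1", "Central One", "Central Five"},
    "review":         {"Central One", "Central Five", "yesterday", "the day before yesterday"},
    "smart_app":      {"open", "application"},
    "system_control": {"volume", "up", "down", "fast forward", "fast rewind"},
}
```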
Optionally, keyword extraction may be performed on the voice recognition result based on the vocabulary corresponding to each instruction type, so as to obtain, for each instruction type, the number of keywords in the voice recognition result that are identical to words in that type's vocabulary; the instruction type corresponding to the maximum number is then determined as the target instruction type, and the identical keywords are determined as the target keywords.
For example, if the voice recognition result is "I want to watch yesterday's Central One", the vocabulary corresponding to the review type contains the two keywords "Central One" and "yesterday", while the vocabulary corresponding to the live broadcast type contains only the keyword "Central One"; therefore the target instruction type is the review type, and the target keywords are "Central One" and "yesterday".
Optionally, an interface for automatic batch training of media assets may be provided, through which each operator may upload media asset content by means of files. The media asset content includes the content that all instructions need to use, such as live television stations, on-demand movies or television series, review television programs, and so on. The data uploaded in a file may include normalized values and an original value; for example, the original value "Central One" in the live broadcast category may also be referred to by the normalized values "CCTV1", "CCTV General", and so on. The normalized value occupies one line and the original value occupies one line, and the line containing the original value may be preceded by a '#' character as a marker.
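Assuming the file layout described above (an original-value line marked with a leading '#', followed by its normalized values one per line), a minimal parsing sketch might look as follows; the exact layout is an assumption inferred from this paragraph.

```python
def parse_media_asset_file(path: str) -> dict[str, list[str]]:
    """Parse an uploaded media-asset file into {original value: [normalized values]}.

    Assumed layout (inferred from the description above):
        #中央一套        <- original value, marked with '#'
        CCTV1           <- normalized value
        CCTV综合         <- another normalized value
    """
    assets: dict[str, list[str]] = {}
    current = None
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            if not line:
                continue
            if line.startswith("#"):       # a new original value starts here
                current = line[1:].strip()
                assets[current] = []
            elif current is not None:      # normalized values for the current entry
                assets[current].append(line)
    return assets
```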
The configured data needs to be provided by the operator, and the provided data needs to be cleansed according to the given data format. The cleansing may include deleting spaces and punctuation, converting between traditional and simplified characters, and so on, so that the data is in a form that can be extracted from the voice recognition result.
Optionally, the uploading of media assets needs to be performed during idle time. The whole process comprises media asset uploading, media asset training, training state checking, and synchronization to production after training is complete. Media assets are uploaded by calling the upload interface; during training, the uploaded media assets need to be stored and are analyzed based on natural language processing technology (that is, the instruction type corresponding to each media asset is determined, for example, CCTV1 belongs to the live broadcast type); state checking (checking whether training is finished) is performed continuously during training; and after training is finished, the result is synchronized to the production environment by gray release, so that the update of the production environment can be completed as soon as possible and interference is reduced.
S104: and controlling the television based on the target instruction type and the target keywords.
Optionally, under the condition that the target instruction type belongs to the search type, searching is performed in a database corresponding to the target instruction type based on the target keyword to obtain a corresponding search result, and then the search result is displayed in a display screen of the television.
Among the instruction types contained in the instruction configuration list, all types except the system control type belong to the retrieval type, for example, the on-demand type, the live broadcast type, the review type, the intelligent application type, and the like.
If the target instruction type belongs to the retrieval type, a search is performed in the database corresponding to the target instruction type based on the target keyword to obtain the content matching the voice control instruction, and the content is displayed on the television.
For example, if the target instruction type is the on-demand type and the target keywords are "comedy" and "movie", a search is performed in the on-demand database based on "comedy" and "movie" to obtain the corresponding movie recommendations, and the recommended content is displayed on the display screen of the television for the user to select.
For example, if the target instruction type is the live broadcast type and the target keyword is "Central Five", a search is performed in the live broadcast database based on "Central Five" to obtain the live link corresponding to Central Five, and based on the live link the program content currently being broadcast on Central Five is played on the display screen of the television.
Optionally, in the case that the target instruction type belongs to a non-search type, the television is subjected to system control based on the target keyword. Therefore, the system control can be directly carried out on the television for the voice control instruction which does not need to be searched, and the efficiency and the accuracy of the control on the television system are improved.
The system control type is a non-search type.
For example, if the target command type is a system control type and the corresponding target keyword is "up", "volume", the playing volume of the television is directly increased.
In the embodiment of the disclosure, the voice control instruction may be received first, then the voice control instruction is identified to obtain a voice recognition result, the voice recognition result is parsed based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction, and finally the television is controlled based on the target instruction type and the target keyword. Therefore, the target instruction type and the corresponding target keywords corresponding to the voice control instruction can be accurately determined based on the preset instruction configuration list, and further, the television can be accurately controlled to perform corresponding operation based on the target instruction type and the corresponding target keywords, so that the accuracy of controlling the television based on the voice control instruction is improved.
Fig. 2 is a flowchart of a control method of a television according to still another embodiment of the present disclosure;
as shown in fig. 2, the control method of the television includes:
s201: and acquiring a voice control instruction.
S202: and recognizing the voice control instruction to obtain a voice recognition result.
The specific implementation manner of step S201 and step S202 may refer to the detailed description of other embodiments of the present disclosure, and will not be described in detail herein.
S203: and extracting keywords from the voice recognition result to obtain a keyword set contained in the voice recognition result.
The keyword set is a set composed of keywords contained in the voice recognition result. The keyword set may include one keyword or a plurality of keywords. The present disclosure is not limited in this regard.
Optionally, a vocabulary corresponding to each instruction type in the instruction configuration list may be used to extract keywords from the voice recognition result, so as to obtain a keyword set.
Alternatively, a trained keyword extraction model may be used to extract keywords from the speech recognition result to obtain a keyword set.
S204: and determining a first matching degree between the keyword set and a word list corresponding to each instruction type in the instruction configuration list.
Optionally, the number of words, which are included in the vocabulary corresponding to each instruction type and are the same as the keywords in the keyword set, may be determined first, and then the ratio between each number and the total number of the keywords in the keyword set is determined as the first matching degree between the keyword set and the corresponding instruction type. Therefore, the first matching degree between the keyword set and the word list corresponding to each instruction type in the instruction configuration list can be accurately determined based on the number of words which are contained in the word list corresponding to each instruction type and are identical to the keywords in the keyword set.
For example, suppose the voice control instruction is "I want to watch yesterday's feature program on Central One", so that the keyword set contains the three keywords "yesterday", "Central One" and "feature program". The word in the on-demand vocabulary identical to a keyword in the set is "feature program", the word in the live broadcast vocabulary identical to a keyword in the set is "Central One", and the words in the review vocabulary identical to keywords in the set are "yesterday" and "Central One". The first matching degree corresponding to the on-demand type is therefore 0.33, the first matching degree corresponding to the live broadcast type is 0.33, and the first matching degree corresponding to the review type is 0.67.
Alternatively, the number of words, which are contained in the vocabulary corresponding to each instruction type and are the same as the keywords in the keyword set, may be directly used as the first matching degree between the keyword set and each instruction type.
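Under the ratio definition above, the first matching degree could be computed as in the following sketch; the demo vocabulary mirrors the example in the preceding paragraphs and is illustrative only. The count-based alternative of the previous paragraph would simply omit the division.

```python
def first_matching_degrees(keywords: set[str],
                           config: dict[str, set[str]]) -> dict[str, float]:
    """First matching degree: matched-word count divided by the keyword-set size."""
    total = len(keywords)
    if total == 0:
        return {t: 0.0 for t in config}
    return {t: len(keywords & vocab) / total for t, vocab in config.items()}

demo_config = {
    "on_demand": {"feature program"},
    "live":      {"Central One"},
    "review":    {"Central One", "yesterday"},
}
keywords = {"yesterday", "Central One", "feature program"}
print(first_matching_degrees(keywords, demo_config))
# on_demand and live -> about 0.33, review -> about 0.67, matching the example above
```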
S205: and determining a second matching degree between the voice recognition result and a sample instruction set corresponding to each instruction type in the instruction configuration list.
The instruction configuration list may further include a sample instruction set corresponding to each instruction type. For example, the sample instruction set of the live broadcast type may include "I want to watch CCTV1", "turn on CCTV1", and so on. The present disclosure is not limited in this regard.
Optionally, a third matching degree between the voice recognition result and each sample instruction in the sample instruction set corresponding to each instruction type may be determined first, and then a maximum third matching degree corresponding to each instruction type is determined as the second matching degree corresponding to each instruction type, so that the second matching degree between the voice recognition result and the sample instruction set corresponding to each instruction type in the instruction configuration list may be accurately determined based on the maximum third matching degree in the sample instruction set corresponding to each instruction type.
It will be appreciated that the sample instruction set corresponding to each instruction type contains many sample instructions, and therefore, it is necessary to find the sample instruction with the largest third matching degree from the sample instruction set corresponding to each instruction type, and take the largest third matching degree as the second matching degree between each instruction type and the speech recognition result.
The third matching degree between the voice recognition result and each sample instruction in the sample instruction set corresponding to each instruction type can be determined through calculation methods such as cosine similarity or Euclidean distance. The present disclosure is not limited in this regard.
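One possible realization of the third matching degree is a character-level bag-of-words cosine similarity, with the second matching degree taken as the maximum over the sample set, as sketched below; this is only one implementation choice and is not mandated by the disclosure.

```python
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Character-level cosine similarity, used here as the third matching degree."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def second_matching_degree(result: str, samples: list[str]) -> float:
    """Second matching degree: the largest third matching degree over the sample set."""
    return max((cosine_similarity(result, s) for s in samples), default=0.0)
```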
For example, if the voice recognition result is "I want to watch CCTV1" and the keyword set is "CCTV1", then "CCTV1" is included both in the word stock corresponding to the live broadcast type and in the word stock corresponding to the review type; that is, the first matching degrees of the keyword set with the live broadcast type and the review type are the same, and it is difficult to determine the target instruction type from the first matching degree alone. In this case, the second matching degree between the voice recognition result and the sample instruction set corresponding to each instruction type in the instruction configuration list may be further determined, and the target instruction type is determined by combining the second matching degree.
S206: and determining the target instruction type corresponding to the voice control instruction according to the first matching degree and the second matching degree.
Optionally, a first weight corresponding to the first matching degree and a second weight corresponding to the second matching degree are obtained, then a target matching degree corresponding to each instruction type is determined based on the first weight and the second weight and the first matching degree and the second matching degree corresponding to each instruction type, and finally the instruction type with the highest target matching degree is determined to be the target instruction type corresponding to the voice control instruction. Thus, the target instruction type corresponding to the voice control instruction can be accurately determined.
The first weight and the second weight may be the same or different. The present disclosure is not limited in this regard. Optionally, the sum of the first weight and the second weight is 1.
Wherein: target matching degree = first matching degree × first weight + second matching degree × second weight.
The higher the target matching degree is, the more the voice control instruction is matched with the target instruction type corresponding to the target matching degree, so that the instruction type with the highest target matching degree is determined to be the target instruction type corresponding to the voice control instruction.
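Combining the two scores, the weighted selection of the target instruction type could be sketched as follows; the default weights of 0.5 and 0.5 are illustrative only (the disclosure notes merely that the weights may be equal or different and may sum to 1).

```python
def select_target_type(first: dict[str, float],
                       second: dict[str, float],
                       w1: float = 0.5, w2: float = 0.5) -> str:
    """Target matching degree = first matching degree * w1 + second matching degree * w2;
    the instruction type with the highest target matching degree is returned."""
    target = {t: first.get(t, 0.0) * w1 + second.get(t, 0.0) * w2 for t in first}
    return max(target, key=target.get)
```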
S207: and determining the keywords which are contained in the keyword set and are positioned in the word list corresponding to the target instruction type as target keywords.
That is, keywords in a vocabulary belonging to the target instruction type contained in the keyword set are determined, and then the television can be controlled based on the target keywords corresponding to the target instruction type.
S208: and controlling the television based on the target instruction type and the target keywords.
In the embodiment of the disclosure, a voice control instruction can be received first, then the voice control instruction is identified to obtain a voice identification result, keyword extraction is performed on the voice identification result to obtain a keyword set contained in the voice identification result, then a first matching degree between the keyword set and a vocabulary corresponding to each instruction type in an instruction configuration list is determined, a second matching degree between the voice identification result and a sample instruction set corresponding to each instruction type in the instruction configuration list is determined, and then a target instruction type corresponding to the voice control instruction is determined according to the first matching degree and the second matching degree; and under the condition that any keyword in the keyword set is the same as any word in the word list corresponding to the target instruction type, determining any keyword as the target keyword, and finally controlling the television based on the target instruction type and the target keyword. Therefore, the target instruction type and the corresponding target keyword corresponding to the voice control instruction can be further and accurately determined based on the vocabulary and the sample instruction set contained in the preset instruction configuration list, so that the television can be further and accurately controlled to perform corresponding operation, and the accuracy of controlling the television based on the voice control instruction is further improved.
Fig. 3 is a flowchart of a control method of a television according to still another embodiment of the present disclosure;
as shown in fig. 3, the control method of the television includes:
s301: and acquiring a voice control instruction.
S302: and recognizing the voice control instruction to obtain a voice recognition result.
S303: and analyzing the voice recognition result based on a preset instruction configuration list to acquire a target instruction type and a target keyword corresponding to the voice control instruction.
The specific implementation manner of step S301 to step S303 may refer to the detailed description of other embodiments of the present disclosure, and will not be described in detail herein.
S304: and under the condition that the target instruction type belongs to the retrieval type, acquiring a target region corresponding to the voice control instruction.
Optionally, if the television has completed the network configuration, an access request for accessing the address of the network device (device for television network configuration) is sent to the network device, and a target region (provincial level) fed back by the network device based on the access request is received.
For example, if the television is connected to the network through WiFi, once the WiFi device has joined the network the target region can be obtained simply by accessing the corresponding link.
Therefore, when the television has completed network configuration, the target region corresponding to the television can be obtained conveniently and accurately by sending an access request to the network device of the television.
Optionally, after the target region corresponding to the voice control instruction is obtained, a mapping word list corresponding to the target region may be further obtained, and the mapping is performed on the target keyword based on the mapping word list, so as to obtain the mapped target keyword.
The form of the keyword used for searching may differ from region to region. For example, for the same channel, province 1 and province 2 may respectively use "CCTV1" and "Central One"; or province A may use simplified characters while province B uses traditional characters, and so on. Therefore, the target keyword can be mapped based on the mapping vocabulary corresponding to the target region to obtain the mapped target keyword, and the mapped keyword is then used to search the database corresponding to the target instruction type under the target region to obtain the corresponding search result.
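Such region-specific mapping could be realized with a per-region mapping table, for example as in the sketch below; the region identifiers and mappings are hypothetical.

```python
# Hypothetical per-region mapping vocabularies: keyword form used elsewhere -> local form.
REGION_KEYWORD_MAP = {
    "province_1": {"Central One": "CCTV1"},
    "province_2": {"CCTV1": "Central One"},
}

def map_keywords(keywords: list[str], region: str) -> list[str]:
    """Map target keywords to the form used by the target region's database."""
    mapping = REGION_KEYWORD_MAP.get(region, {})
    return [mapping.get(k, k) for k in keywords]

print(map_keywords(["Central One", "yesterday"], "province_1"))  # ['CCTV1', 'yesterday']
```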
S305: and searching in a database corresponding to the target instruction type under the target region based on the target keyword to obtain a corresponding search result.
It will be appreciated that the program content contained in the database of each region is different. For example, programs in the minority language of province A that are contained in the database of province A may not be present in the database of province B. Therefore, after the target region corresponding to the voice control instruction is determined, the target keyword can be searched in the database corresponding to the target instruction type under the target region to obtain the corresponding search result, which not only returns programs supported by the database of that province but also narrows the search range and improves search efficiency.
S306: and displaying the search result in a display screen of the television.
In the embodiments of the present disclosure, handling of garbage (invalid) input may also be included. Specifically, if the voice recognition result contains a sensitive word, a sensitive-word prompt is returned; if the content of the voice recognition result is too short, the target instruction type and the target keyword are not parsed, and a prompt that the voice control instruction is too short is returned; if the content of the voice recognition result is so long that the target instruction type and the target keyword cannot be parsed, a prompt that the voice control instruction is too long is returned; and if the instruction parsing service encounters an error, including but not limited to a service exception shutdown, a network exception, and the like, an instruction parsing exception is returned.
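These checks could be organized as a guard that runs before instruction parsing, as in the sketch below; the length thresholds, sensitive-word list and response strings are assumptions made for illustration.

```python
SENSITIVE_WORDS = ["badword"]    # hypothetical sensitive-word list
MIN_LEN, MAX_LEN = 2, 50         # hypothetical length thresholds

def pre_parse_guard(result: str) -> str | None:
    """Return a fallback response for invalid input, or None if parsing may proceed."""
    if any(w in result for w in SENSITIVE_WORDS):
        return "The instruction contains sensitive words."
    if len(result) < MIN_LEN:
        return "The voice control instruction is too short."
    if len(result) > MAX_LEN:
        return "The voice control instruction is too long."
    return None

def parse_with_fallback(result: str, parse_fn) -> str:
    """Wrap normal parsing with the guard and an exception fallback."""
    guard = pre_parse_guard(result)
    if guard:
        return guard
    try:
        return parse_fn(result)      # normal instruction parsing
    except Exception:                # service exception, network exception, etc.
        return "Instruction parsing exception."
```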
In the embodiment of the disclosure, a voice control instruction may be received first, then the voice control instruction is identified to obtain a voice recognition result, and keyword extraction is performed on the voice recognition result to obtain a keyword set contained in the voice recognition result, then the voice recognition result is parsed based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction, and then, in the case that the target instruction type belongs to a search type, search is performed in a database corresponding to the target instruction type based on the target keyword to obtain a corresponding search result, and the search result is displayed on a display screen of a television. Therefore, after the target region corresponding to the voice control instruction is determined, the target keyword can be searched in the database corresponding to the target instruction type under the target region to obtain the corresponding search result, so that not only can the programs supported by the database of the province be obtained, but also the search range can be reduced, and the search efficiency is improved.
Fig. 4 is a schematic structural diagram of a control device of a television according to an embodiment of the present disclosure;
as shown in fig. 4, the control device 400 of the television includes:
a first obtaining module 410, configured to obtain a voice control instruction;
a second obtaining module 420, configured to identify the voice control instruction, so as to obtain a voice recognition result;
the third obtaining module 430 is configured to parse the voice recognition result based on a preset instruction configuration list, so as to obtain a target instruction type and a target keyword corresponding to the voice control instruction;
the control module 440 is configured to control the television based on the target instruction type and the target keyword.
Optionally, the third obtaining module 430 includes:
the acquisition unit is used for extracting keywords from the voice recognition result so as to acquire a keyword set contained in the voice recognition result;
the first determining unit is used for determining a first matching degree between the keyword set and a word list corresponding to each instruction type in the instruction configuration list;
a second determining unit, configured to determine a second matching degree between the speech recognition result and a sample instruction set corresponding to each instruction type in the instruction configuration list;
the third determining unit is used for determining a target instruction type corresponding to the voice control instruction according to the first matching degree and the second matching degree;
And a fourth determining unit, configured to determine, as the target keyword, a keyword that is included in the keyword set and is located in a vocabulary corresponding to the target instruction type.
Optionally, the first determining unit is specifically configured to:
determining the number of words which are contained in the word list corresponding to each instruction type and are the same as the keywords in the keyword set;
and determining the ratio between each number and the total number of the keywords in the keyword set as a first matching degree between the keyword set and the corresponding instruction type.
Optionally, the second determining unit is specifically configured to:
determining a third matching degree between the voice recognition result and each sample instruction in the sample instruction set corresponding to each instruction type;
and determining the maximum third matching degree corresponding to each instruction type as the second matching degree corresponding to each instruction type.
Optionally, the third determining unit is specifically configured to:
acquiring a first weight corresponding to the first matching degree and a second weight corresponding to the second matching degree;
determining a target matching degree corresponding to each instruction type based on the first weight, the second weight and the first matching degree and the second matching degree corresponding to each instruction type;
and determining the instruction type with the highest target matching degree as the target instruction type corresponding to the voice control instruction.
Optionally, the control module 440 is specifically configured to:
under the condition that the target instruction type belongs to the retrieval type, acquiring a target region corresponding to the voice control instruction;
searching in a database corresponding to the target instruction type under the target region based on the target keyword to obtain a corresponding search result;
and displaying the search result in a display screen of the television.
Optionally, the control module 440 is specifically configured to:
and under the condition that the target instruction type belongs to a non-search type, performing system control on the television based on the target keyword.
It should be noted that the explanation of the control method of the television is also applicable to the control device of the television in this embodiment, and will not be repeated here.
In the embodiment of the disclosure, the voice control instruction may be received first, then the voice control instruction is identified to obtain a voice recognition result, the voice recognition result is parsed based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction, and finally the television is controlled based on the target instruction type and the target keyword. Therefore, the target instruction type and the corresponding target keywords corresponding to the voice control instruction can be accurately determined based on the preset instruction configuration list, and further, the television can be accurately controlled to perform corresponding operation based on the target instruction type and the corresponding target keywords, so that the accuracy of controlling the television based on the voice control instruction is improved.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the apparatus 500 includes a computing unit 501 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the respective methods and processes described above, for example, a control method of a television. For example, in some embodiments, the control method of a television may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When a computer program is loaded into RAM 503 and executed by computing unit 501, one or more steps of the control method of a television described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the control method of the television in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service ("Virtual Private Server" or simply "VPS") are overcome. The server may also be a server of a distributed system or a server that incorporates a blockchain.
In this embodiment, the voice control instruction may be received first, then the voice control instruction is identified to obtain a voice recognition result, the voice recognition result is parsed based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction, and finally the television is controlled based on the target instruction type and the target keyword. Therefore, the target instruction type and the corresponding target keywords corresponding to the voice control instruction can be accurately determined based on the preset instruction configuration list, and further, the television can be accurately controlled to perform corresponding operation based on the target instruction type and the corresponding target keywords, so that the accuracy of controlling the television based on the voice control instruction is improved.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
Furthermore, the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality" means at least two, such as two, three, and so on, unless explicitly specified otherwise. In the description of the present disclosure, the word "if" may be interpreted as "when ...", "upon ...", "in response to a determination" or "in the case of ...".
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (17)

1. A control method of a television, comprising:
acquiring a voice control instruction;
recognizing the voice control instruction to obtain a voice recognition result;
analyzing the voice recognition result based on a preset instruction configuration list to obtain a target instruction type and a target keyword corresponding to the voice control instruction;
and controlling the television based on the target instruction type and the target keyword.
2. The method of claim 1, wherein the parsing the voice recognition result based on the preset instruction configuration list to obtain the target instruction type and the target keyword corresponding to the voice control instruction includes:
extracting keywords from the voice recognition result to obtain a keyword set contained in the voice recognition result;
Determining a first matching degree between the keyword set and a vocabulary corresponding to each instruction type in the instruction configuration list;
determining a second matching degree between the voice recognition result and a sample instruction set corresponding to each instruction type in the instruction configuration list;
determining the target instruction type corresponding to the voice control instruction according to the first matching degree and the second matching degree;
and determining the keywords which are contained in the keyword set and are positioned in the word list corresponding to the target instruction type as the target keywords.
3. The method of claim 2, wherein the determining a first degree of matching between the set of keywords and a vocabulary corresponding to each instruction type in the instruction configuration list comprises:
determining the number of words which are contained in the word list corresponding to each instruction type and are the same as the keywords in the keyword set;
and determining the ratio of each number to the total number of keywords in the keyword set as a first matching degree between the keyword set and the corresponding instruction type.
4. The method of claim 2, wherein the determining a second degree of match between the speech recognition result and a sample instruction set corresponding to each instruction type in the instruction configuration list comprises:
Determining a third matching degree between the voice recognition result and each sample instruction in the sample instruction set corresponding to each instruction type;
and determining the maximum third matching degree corresponding to each instruction type as the second matching degree corresponding to each instruction type.
5. The method of claim 2, wherein the determining the target instruction type corresponding to the voice control instruction according to the first matching degree and the second matching degree comprises:
acquiring a first weight corresponding to the first matching degree and a second weight corresponding to the second matching degree;
determining a target matching degree corresponding to each instruction type based on the first weight, the second weight and the first matching degree and the second matching degree corresponding to each instruction type;
and determining the instruction type with the highest target matching degree as the target instruction type corresponding to the voice control instruction.
6. The method of claim 1, wherein the controlling the television based on the target instruction type and the target keyword comprises:
under the condition that the target instruction type belongs to a search type, acquiring a target region corresponding to the voice control instruction;
searching in a database corresponding to the target instruction type under the target region based on the target keyword to obtain a corresponding search result;
and displaying the search result on a display screen of the television.
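For search-type instructions, claim 6 resolves a target region, queries a region-specific database with the target keywords, and displays the results. The helpers below (get_target_region, query_database) are hypothetical stand-ins, since the patent does not name these interfaces:

```python
def get_target_region(voice_request: dict) -> str:
    """Hypothetical: derive the target region for the request, e.g. from
    account settings or device locale."""
    return voice_request.get("region", "default")

def query_database(instruction_type: str, region: str, keywords: set[str]) -> list[str]:
    """Hypothetical stand-in for the region-specific content database lookup."""
    return [f"{region}/{instruction_type}: result for {' '.join(sorted(keywords))}"]

def handle_search_instruction(instruction_type: str, keywords: set[str],
                              voice_request: dict) -> list[str]:
    """Claim 6 outline: search the database for the target region and return
    results to be shown on the television's display screen."""
    region = get_target_region(voice_request)
    return query_database(instruction_type, region, keywords)
```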
7. The method of claim 1, wherein the controlling the television based on the target instruction type and the target keyword comprises:
and under the condition that the target instruction type belongs to a non-search type, performing system control on the television based on the target keyword.
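For non-search instructions, claim 7 maps the target keywords directly to a system control action on the television. A minimal hypothetical dispatch (the keyword-to-action table is illustrative only):

```python
def handle_system_control(keywords: set[str]) -> str:
    """Claim 7 outline: map target keywords to a TV system action."""
    actions = {
        "mute": "set volume to 0",
        "volume": "adjust volume",
        "brightness": "adjust brightness",
        "off": "power off",
    }
    for word, action in actions.items():
        if word in keywords:
            return action
    return "no-op"
```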
8. A control device for a television, comprising:
the first acquisition module is used for acquiring a voice control instruction;
the second acquisition module is used for identifying the voice control instruction so as to acquire a voice identification result;
the third acquisition module is used for analyzing the voice recognition result based on a preset instruction configuration list so as to acquire a target instruction type and a target keyword corresponding to the voice control instruction;
and the control module is used for controlling the television based on the target instruction type and the target keyword.
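Claim 8 restates the method as an apparatus with four modules. A hypothetical class skeleton mirroring that decomposition (method bodies omitted; names are illustrative):

```python
class TelevisionControlDevice:
    """Hypothetical skeleton mirroring the four modules of claim 8."""

    def acquire_instruction(self) -> bytes:
        """First acquisition module: capture the voice control instruction as audio."""
        raise NotImplementedError

    def recognize(self, audio: bytes) -> str:
        """Second acquisition module: recognize the instruction to obtain text."""
        raise NotImplementedError

    def parse(self, text: str) -> tuple[str, set[str]]:
        """Third acquisition module: resolve the target instruction type and
        target keywords against the preset instruction configuration list."""
        raise NotImplementedError

    def control(self, instruction_type: str, keywords: set[str]) -> None:
        """Control module: control the television based on type and keywords."""
        raise NotImplementedError
```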
9. The apparatus of claim 8, wherein the third acquisition module comprises:
the acquisition unit is used for extracting keywords from the voice recognition result so as to acquire a keyword set contained in the voice recognition result;
the first determining unit is used for determining a first matching degree between the keyword set and a vocabulary corresponding to each instruction type in the instruction configuration list;
the second determining unit is used for determining a second matching degree between the voice recognition result and a sample instruction set corresponding to each instruction type in the instruction configuration list;
the third determining unit is used for determining the target instruction type corresponding to the voice control instruction according to the first matching degree and the second matching degree;
and the fourth determining unit is used for determining, as the target keywords, the keywords which are contained in the keyword set and are located in the vocabulary corresponding to the target instruction type.
10. The apparatus of claim 9, wherein the first determining unit is specifically configured to:
determining the number of words which are contained in the vocabulary corresponding to each instruction type and are the same as the keywords in the keyword set;
and determining the ratio of each number to the total number of keywords in the keyword set as the first matching degree between the keyword set and the corresponding instruction type.
11. The apparatus of claim 9, wherein the second determining unit is specifically configured to:
determining a third matching degree between the voice recognition result and each sample instruction in the sample instruction set corresponding to each instruction type;
and determining the maximum third matching degree corresponding to each instruction type as the second matching degree corresponding to each instruction type.
12. The apparatus of claim 9, wherein the third determining unit is specifically configured to:
acquiring a first weight corresponding to the first matching degree and a second weight corresponding to the second matching degree;
determining a target matching degree corresponding to each instruction type based on the first weight, the second weight and the first matching degree and the second matching degree corresponding to each instruction type;
and determining the instruction type with the highest target matching degree as the target instruction type corresponding to the voice control instruction.
13. The apparatus of claim 8, wherein the control module is specifically configured to:
under the condition that the target instruction type belongs to a search type, acquiring a target region corresponding to the voice control instruction;
searching in a database corresponding to the target instruction type under the target region based on the target keyword to obtain a corresponding search result;
and displaying the search result on a display screen of the television.
14. The apparatus of claim 8, wherein the control module is specifically configured to:
and under the condition that the target instruction type belongs to a non-search type, performing system control on the television based on the target keyword.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-7.
17. A computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-7.
CN202310196057.2A 2023-02-28 2023-02-28 Television control method and device, electronic equipment and storage medium Pending CN116434746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310196057.2A CN116434746A (en) 2023-02-28 2023-02-28 Television control method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310196057.2A CN116434746A (en) 2023-02-28 2023-02-28 Television control method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116434746A true CN116434746A (en) 2023-07-14

Family

ID=87091478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310196057.2A Pending CN116434746A (en) 2023-02-28 2023-02-28 Television control method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116434746A (en)

Similar Documents

Publication Publication Date Title
CN114549874B (en) Training method of multi-target image-text matching model, image-text retrieval method and device
CN112559800B (en) Method, apparatus, electronic device, medium and product for processing video
EP3916579A1 (en) Method for resource sorting, method for training sorting model and corresponding apparatuses
CN112507706B (en) Training method and device for knowledge pre-training model and electronic equipment
CN114861889B (en) Deep learning model training method, target object detection method and device
CN108768824B (en) Information processing method and device
CN113836925B (en) Training method and device for pre-training language model, electronic equipment and storage medium
CN113836314B (en) Knowledge graph construction method, device, equipment and storage medium
CN112988753B (en) Data searching method and device
CN110532404B (en) Source multimedia determining method, device, equipment and storage medium
CN116955856A (en) Information display method, device, electronic equipment and storage medium
CN117171296A (en) Information acquisition method and device and electronic equipment
CN111783433A (en) Text retrieval error correction method and device
CN116049370A (en) Information query method and training method and device of information generation model
CN113590852B (en) Training method of multi-modal recognition model, multi-modal recognition method and device
CN112784600B (en) Information ordering method, device, electronic equipment and storage medium
CN114417862A (en) Text matching method, and training method and device of text matching model
CN116434746A (en) Television control method and device, electronic equipment and storage medium
CN115828915B (en) Entity disambiguation method, device, electronic equipment and storage medium
CN114462364B (en) Method and device for inputting information
CN113377922B (en) Method, device, electronic equipment and medium for matching information
CN113268987B (en) Entity name recognition method and device, electronic equipment and storage medium
CN113377921B (en) Method, device, electronic equipment and medium for matching information
CN114491318B (en) Determination method, device, equipment and storage medium of target information
CN113361249B (en) Document weight judging method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination