CN111741369B - Smart television set top box based on voice recognition

Publication number
CN111741369B
CN111741369B CN202010662605.2A CN202010662605A CN111741369B CN 111741369 B CN111741369 B CN 111741369B CN 202010662605 A CN202010662605 A CN 202010662605A CN 111741369 B CN111741369 B CN 111741369B
Authority
CN
China
Prior art keywords
user
voice
module
information
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010662605.2A
Other languages
Chinese (zh)
Other versions
CN111741369A (en)
Inventor
王利平
李重
李瑞生
汤永哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Xinzhi Technology Co ltd
Original Assignee
Anhui Xinzhi Technology Co ltd
Application filed by Anhui Xinzhi Technology Co ltd
Priority to CN202010662605.2A
Publication of CN111741369A
Application granted
Publication of CN111741369B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/254 Management at additional data server, e.g. shopping server, rights management server
    • H04N21/2541 Rights Management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667 Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668 Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention discloses a smart television set top box based on voice recognition, which comprises a recognition analysis module, a database, an automatic recommendation module, a voice recording module, a user registration module, a camera unit, a processor, an alarm module, an external acquisition module and a control module. The recognition analysis module is used for recognizing the voice of a user. The processor is used for comparing the real-time sound pressure, recognized voice and real-time gesture analyzed by the recognition analysis module with the data in the database. The user registration module is used by users to submit, through a mobile phone terminal, the number of users and each user's data information for registration, and to send the number of successfully registered users and their data information to the database for storage. The set top box is highly intelligent, can greatly reduce the waiting time of a user, and its intelligent control makes it more convenient to use and brings convenience to the life of the user.

Description

Smart television set top box based on voice recognition
Technical Field
The invention relates to the field of intelligent television set-top boxes, in particular to an intelligent television set-top box based on voice recognition.
Background
As an important piece of audio and video equipment, the television set top box currently meets the entertainment and daily-life needs of most households and hotels, but it is limited by the complexity of remote controller operation. With voice control, commonly used functions such as searching and changing channels can be performed directly, and the set top box can also serve as the central controller of a smart home, controlling smart household equipment such as lights and air conditioners, so that users free both hands and complete the operation they want with a single voice instruction. At present, mainstream smart home central control schemes provide these services through separate external central control devices. Integrating the whole function into the set top box can provide better service, because the living room is the area where family members spend the most time each day, and it can also improve the utilization rate of the equipment.
Application No. CN201410482292.7 discloses a set-top box system, which includes: a first tuner, a second tuner and a third tuner for receiving satellite signals; a fourth tuner for receiving terrestrial signals; three first demodulation chips for demodulating the intermediate frequency signals output by the first tuner, the second tuner and the third tuner respectively; a second demodulation chip for demodulating the intermediate frequency signal output by the fourth tuner; and a central processing unit for processing the transport streams output by the three first demodulation chips and the second demodulation chip.
However, that patent cannot identify the user by voice, keywords or gestures, and it offers little protection of user privacy.
Disclosure of Invention
The invention aims to provide an intelligent television set top box based on voice recognition;
the purpose of the invention can be realized by the following technical scheme:
an intelligent television set top box based on voice recognition comprises a recognition analysis module, a database, an automatic recommendation module, a voice recording module, a user registration module, a camera unit, a processor, an alarm module, an external acquisition module and a control module;
the external acquisition module is used for acquiring the sound pressure and the recognition voice of the current user in real time and marking the sound pressure as real-time sound pressure, and is also used for acquiring the real-time gesture input by the user when in use, and the external acquisition module is used for transmitting the real-time sound pressure, the recognition voice and the real-time gesture to the recognition analysis module;
the recognition analysis module receives the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module, and performs identity verification processing on them in combination with the sound pressure information, keywords and verification information in the database; the identity verification processing comprises the following specific steps:
step one: acquiring the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module;
step two: comparing the recognized voice with the verification information stored in the corresponding database, wherein the specific comparison process comprises the following steps:
s1: converting the recognized voice into text information through a voice-to-text technology, and marking the text information as a word to be verified;
s2: matching the words to be verified with all the keywords, and finding out the user information corresponding to the keywords consistent with the words to be verified; if no corresponding user information exists, generating an authentication failure signal;
step three: acquiring corresponding user information, and acquiring verification gesture and sound pressure information corresponding to the user information;
step four: acquiring the real-time sound pressure and the real-time gesture; verifying the real-time gesture and assigning a verification potential value according to the verification result, wherein the specific processing steps are as follows:
s01: comparing the real-time gesture with the corresponding verification gesture; if they are consistent, setting the verification potential value Yz to 1, otherwise setting Yz to 0;
s02: acquiring real-time sound pressure of a user, marking the real-time sound pressure as Sy, and marking corresponding sound pressure information as Y;
s03: obtaining a similarity value Xs by using a formula; the concrete formula is as follows:
Xs={|Sy-Y|/Y}×Yz;
in the formula, |Sy-Y| is the absolute value of the difference between Sy and Y;
s04: obtaining a similarity value Xs;
step five: when Xs is lower than X1, generating a kernel communication signal and marking the corresponding user information as the current user; here X1 is a preset value;
the processor is used for comparing the real-time sound pressure, the recognized voice and the real-time gesture transmitted by the external acquisition module with the sound pressure information, the key words and the verification information in the database;
SS 1: if the processor receives the authentication failure signal, generating an alarm signal and sending the alarm signal to the alarm module;
SS 2: if the verification potential value Yz is 1, generating an identification information accurate signal and sending it to the automatic recommendation module;
if the verification potential value Yz is 0, generating an alarm signal and sending the alarm signal to the alarm module;
SS 3: if the processor receives the kernel communication signal, generating an identification information accurate signal and sending it to the automatic recommendation module; otherwise, generating an alarm signal and sending the alarm signal to the alarm module;
SS 4: after the identification information accurate signal is generated, managing the use of the user in combination with the authority of the user in the database, wherein the specific management steps are as follows:
m1: if the age of the user is less than the set threshold N1, the processor generates a time limit signal and sends the signal to the control unit;
m2: if the age of the user is greater than the set threshold N2, the processor generates a reduced brightness signal and sends the signal to the control unit.
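The identity verification steps one to five above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `RegisteredUser` record, the string encoding of gestures and the signal names returned as strings are all assumptions made for illustration. Only the keyword match, the assignment of Yz and the formula Xs = (|Sy - Y| / Y) × Yz come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RegisteredUser:
    # Hypothetical record layout; the text only says these items are stored.
    name: str
    keyword: str           # user-defined word used for verification
    gesture: str           # label of the registered verification gesture
    sound_pressure: float  # registered sound pressure information Y


def similarity_value(sy: float, y: float, yz: int) -> float:
    """Xs = (|Sy - Y| / Y) * Yz, as given in step S03."""
    return abs(sy - y) / y * yz


def verify(users: List[RegisteredUser], word_to_verify: str,
           real_time_gesture: str, sy: float,
           x1: float) -> Tuple[str, Optional[RegisteredUser]]:
    """Return a (signal, user) pair; signals are plain strings for illustration."""
    # Step two: match the word to be verified against all keywords.
    user = next((u for u in users if u.keyword == word_to_verify), None)
    if user is None:
        return "authentication_failure", None
    # Step four, S01: the gesture comparison yields Yz.
    yz = 1 if real_time_gesture == user.gesture else 0
    if yz == 0:
        # Assumption: stop here, following rule SS2 (Yz = 0 raises an alarm).
        return "alarm", None
    # Step four, S02 to S04: similarity value from the sound pressures.
    xs = similarity_value(sy, user.sound_pressure, yz)
    # Step five: Xs below the preset X1 yields the kernel communication signal.
    if xs < x1:
        return "kernel_communication", user
    return "alarm", None
```

The early return for Yz = 0 is a design choice of this sketch: with Yz = 0 the formula gives Xs = 0, which would fall below any positive X1, so the gesture result is checked before the threshold comparison.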
Further, the automatic recommendation module is used for recommending suitable program types for the user; after the processor has compared the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module with the sound pressure information, keywords and verification information in the database and the user has been identified accurately, the processor generates voice recognition accurate information and sends it to the automatic recommendation module; the specific automatic recommendation steps are as follows:
p1: acquiring the number of times the user currently using the television has collected (favorited) programs within ten days, and marking it as Gu;
p2: acquiring the number of times the user currently using the television has watched programs within ten days, and marking it as Wu;
p3: acquiring the time the user currently using the television has spent browsing programs within ten days, and marking it as Ru;
p4: acquiring the number of times the user currently using the television has commented on programs within ten days, and marking it as Yu;
p5: obtaining the automatic recommendation coefficient Xu by using the formula
[formula image: Figure BDA0002579167250000041]
where mu is an error correction factor with the value 0.95682; d7, d8, d9 and d10 are all preset proportionality coefficients, with d7 + d8 + d9 + d10 = 1 and d7 > d8 > d9 > d10; and T1 and T2 are time proportionality coefficients;
p6: after the automatic recommendation coefficient Xu is calculated, the system automatically selects the three programs with the highest recommendation coefficients for recommendation.
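Step p6 and the constraints on the proportionality coefficients can be sketched in Python as follows. The formula for Xu itself is only available as an image and is not reproduced here, so this sketch assumes the coefficient Xu has already been computed for each program; the function names and the dictionary layout are hypothetical.

```python
import math
from typing import Dict, List


def check_weights(d7: float, d8: float, d9: float, d10: float) -> bool:
    """Check the stated constraints: the weights sum to 1 and d7 > d8 > d9 > d10."""
    return math.isclose(d7 + d8 + d9 + d10, 1.0) and d7 > d8 > d9 > d10


def top_three(coefficients: Dict[str, float]) -> List[str]:
    """Return the three program names with the highest coefficient Xu (step p6)."""
    ranked = sorted(coefficients.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:3]]
```

For example, `top_three({"news": 0.8, "drama": 1.2, "sports": 0.5, "film": 0.9})` selects drama, film and news, in that order.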
Further, the user registration module is used by users to submit, through a mobile phone terminal, the number of users and each user's data information for registration, and to send the number of successfully registered users and their data information to the database for storage, wherein the data information of a user comprises name, age and photo; the user registration module transmits an end signal to the database after the user registration succeeds;
after receiving the end signal transmitted by the user registration module, the database generates a voice recording start signal and sends it to the voice recording module; the voice recording module is used for binding each user with voice key information to form registered user information and storing the registered user information in the database; on receiving the voice recording start signal transmitted by the database, the voice recording module prompts the user to record their voice and then records the voice key information of each user, wherein the voice key information comprises the sound pressure information and the keyword of the user's voice; the keyword is a word defined by the user and is used for verifying the identity of the user;
the camera unit is used for collecting the verification gesture of the user during registration, the verification gesture being a gesture preset by the user and used for further verifying the identity of the user; the camera unit matches the verification gesture with the corresponding user to obtain verification information, and transmits the verification information to the database for storage.
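The registration flow above (user data first, then voice key information, then the verification gesture) can be pictured as a record that is filled in stages. The following Python sketch is an illustration under assumptions: the class names, the use of the user's name as the database key, the string encoding of gestures and the returned signal string are all hypothetical and do not appear in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UserRecord:
    # Data information submitted from the mobile phone terminal.
    name: str
    age: int
    photo: bytes
    # Voice key information recorded by the voice recording module.
    keyword: Optional[str] = None
    sound_pressure: Optional[float] = None
    # Verification information collected by the camera unit.
    verification_gesture: Optional[str] = None

    def is_complete(self) -> bool:
        """True once voice key information and the gesture have been stored."""
        fields = (self.keyword, self.sound_pressure, self.verification_gesture)
        return all(value is not None for value in fields)


class Database:
    def __init__(self) -> None:
        self.users: Dict[str, UserRecord] = {}  # keyed by name (assumption)

    def register(self, name: str, age: int, photo: bytes) -> str:
        """Store the user data, then emit the voice recording start signal."""
        self.users[name] = UserRecord(name, age, photo)
        return "voice_recording_start"

    def record_voice(self, name: str, keyword: str, sound_pressure: float) -> None:
        record = self.users[name]
        record.keyword = keyword
        record.sound_pressure = sound_pressure

    def record_gesture(self, name: str, gesture: str) -> None:
        self.users[name].verification_gesture = gesture
```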
Furthermore, the control module is used for controlling the operation of the television by voice; after the processor completes the comparison, voice recognition accurate information is generated and sent to the control module; the control module comprises a television control unit, a home control unit, an entertainment unit, a weather query unit and a stock query unit, and the specific working processes of these units are as follows:
a. the television control unit can operate the television by voice:
a1, the user can turn on the television, play live broadcasts and change channels by voice operation;
a2, in any scene, the user can control the sound of the television by voice operation, increasing or decreasing the volume through the remote controller;
b. the home control unit takes the set-top box as a central control entrance to control all networked intelligent devices in a user family:
b1, the user can directly control the temperature, mode, wind speed and other functional settings of the air conditioner through the set-top box;
b2, the user can directly turn on or off the lamp through the set-top box;
b3, the user can directly operate the curtain through the set-top box;
c. the entertainment unit may conduct entertainment through the set-top box:
c1, the user can listen to favorite news programs through the set-top box;
c2, the user can listen to favorite music through the set-top box, and can skip, pause and resume tracks by voice;
c3, the user can play Tang poems through the set-top box;
c4, the user can communicate in a man-machine chatting mode through the set top box;
d. the weather query unit can query the weather of the current city directly from the user's voice, can also query the weather of other cities and other times, and after a successful query reads the result to the user through TTS (text-to-speech);
e. the stock inquiring unit may inquire the quotation of stocks for the user.
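The division of the control module into five units can be sketched as a simple dispatch table in Python. This is a hypothetical illustration: the text does not say how a spoken command is mapped to a unit, so the unit is passed in explicitly here, and the handlers only return a description string instead of driving real devices.

```python
from typing import Callable, Dict

# Hypothetical handlers, one per unit named in the text.
UNIT_HANDLERS: Dict[str, Callable[[str], str]] = {
    "television": lambda command: f"television control unit: {command}",
    "home": lambda command: f"home control unit: {command}",
    "entertainment": lambda command: f"entertainment unit: {command}",
    "weather": lambda command: f"weather query unit: {command}",
    "stock": lambda command: f"stock query unit: {command}",
}


def dispatch(unit: str, command: str, recognition_accurate: bool) -> str:
    """Forward a spoken command to one unit of the control module."""
    if not recognition_accurate:
        # The control module acts only after the processor reports
        # that voice recognition was accurate.
        raise PermissionError("voice recognition was not confirmed as accurate")
    if unit not in UNIT_HANDLERS:
        raise ValueError(f"unknown unit: {unit}")
    return UNIT_HANDLERS[unit](command)
```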
Compared with the prior art, the invention has the beneficial effects that:
1. the recognition analysis module receives the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module and performs identity verification processing on them in combination with the sound pressure information, keywords and verification information in the database; the processor compares the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module with the sound pressure information, keywords and verification information in the database; if the processor receives the authentication failure signal, it generates an alarm signal and sends it to the alarm module; if the verification potential value Yz is 1, it generates an identification information accurate signal and sends it to the automatic recommendation module; if the verification potential value Yz is 0, it generates an alarm signal and sends it to the alarm module; if the processor receives the kernel communication signal, it generates an identification information accurate signal and sends it to the automatic recommendation module, and otherwise it generates an alarm signal and sends it to the alarm module; after the identification information accurate signal is generated, the use of the user is managed in combination with the authority of the user in the database; because the system identifies the user through the user's real-time sound pressure, voice and real-time gesture, users are prevented from using the system at will, user information is prevented from leaking, and the security of the set top box is improved;
2. after comparing the real-time sound pressure, recognized voice and real-time gesture, the processor generates voice recognition accurate information and sends it to the automatic recommendation module; the number of times the user currently using the television has collected programs, watched programs and commented on programs within ten days, and the time that user has spent browsing programs within ten days, are acquired; an automatic recommendation coefficient is obtained by using a formula, and the system then automatically selects the three programs with the highest recommendation coefficients for recommendation, so that videos can be recommended to the user and the time the user spends searching is reduced;
3. the control module is used for controlling the operation of the television by voice; after the processor completes the comparison, voice recognition accurate information is generated and sent to the control module; the control module is divided into a television control unit, a home control unit, an entertainment unit, a weather query unit and a stock query unit; by voice, the user can turn on the television, play live broadcasts, change channels and control the volume of the television through the remote controller; the home control unit takes the set top box as the central control entrance to control all networked intelligent devices in the user's home; the weather query unit can query the weather of the current city directly from the user's voice, can also query the weather of other cities and other times, and after a successful query reads the result to the user through TTS; the stock query unit can query stock quotations for the user; the invention is highly intelligent, can greatly reduce the waiting time of the user, and its intelligent control makes it more convenient to use and brings convenience to the life of the user.
Drawings
In order to facilitate understanding for those skilled in the art, the present invention will be further described with reference to the accompanying drawings.
Fig. 1 is a schematic block diagram of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a smart television set-top box based on voice recognition includes a recognition analysis module, a database, an automatic recommendation module, a voice recording module, a user registration module, a camera unit, a processor, an alarm module, an external collection module, and a control module;
the external acquisition module is used for acquiring the sound pressure and the recognition voice of the current user in real time and marking the sound pressure as real-time sound pressure, and is also used for acquiring the real-time gesture input by the user when in use, and the external acquisition module is used for transmitting the real-time sound pressure, the recognition voice and the real-time gesture to the recognition analysis module;
the recognition analysis module receives the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module, and performs identity verification processing on them in combination with the sound pressure information, keywords and verification information in the database; the identity verification processing comprises the following specific steps:
step one: acquiring the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module;
step two: comparing the recognized voice with the verification information stored in the corresponding database, wherein the specific comparison process comprises the following steps:
s1: converting the recognized voice into text information through a voice-to-text technology, and marking the text information as a word to be verified;
s2: matching the words to be verified with all the keywords, and finding out the user information corresponding to the keywords consistent with the words to be verified; if no corresponding user information exists, generating an authentication failure signal;
step three: acquiring corresponding user information, and acquiring verification gesture and sound pressure information corresponding to the user information;
step four: acquiring the real-time sound pressure and the real-time gesture; verifying the real-time gesture and assigning a verification potential value according to the verification result, wherein the specific processing steps are as follows:
s01: comparing the real-time gesture with the corresponding verification gesture; if they are consistent, setting the verification potential value Yz to 1, otherwise setting Yz to 0;
s02: acquiring real-time sound pressure of a user, marking the real-time sound pressure as Sy, and marking corresponding sound pressure information as Y;
s03: obtaining a similarity value Xs by using a formula; the concrete formula is as follows:
Xs={|Sy-Y|/Y}×Yz;
in the formula, |Sy-Y| is the absolute value of the difference between Sy and Y;
s04: obtaining a similarity value Xs;
step five: when Xs is lower than X1, generating a kernel communication signal and marking the corresponding user information as the current user; here X1 is a preset value;
the processor is used for comparing the real-time sound pressure, the recognized voice and the real-time gesture transmitted by the external acquisition module with the sound pressure information, the key words and the verification information in the database;
SS 1: if the processor receives the authentication failure signal, generating an alarm signal and sending the alarm signal to the alarm module;
SS 2: if the verification potential value Yz is 1, generating an identification information accurate signal and sending it to the automatic recommendation module;
if the verification potential value Yz is 0, generating an alarm signal and sending the alarm signal to the alarm module;
SS 3: if the processor receives the kernel communication signal, generating an identification information accurate signal and sending it to the automatic recommendation module; otherwise, generating an alarm signal and sending the alarm signal to the alarm module;
SS 4: after the identification information accurate signal is generated, managing the use of the user in combination with the authority of the user in the database, wherein the specific management steps are as follows:
m1: if the age of the user is less than the set threshold N1, the processor generates a time limit signal and sends the signal to the control unit;
m2: if the age of the user is greater than the set threshold N2, the processor generates a reduced brightness signal and sends the signal to the control unit.
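The processor rules SS1 to SS4 above can be summarized in a short Python sketch. Two points are assumptions of this sketch and are not stated in the text: rules SS2 and SS3 are read together, so that the accurate signal is produced only when Yz is 1 and the kernel communication signal was received; and the thresholds N1 and N2 are passed in as parameters, because the text gives no values for them. The signal names are hypothetical strings.

```python
from typing import List


def processor_signal(auth_failed: bool, yz: int, kernel_signal: bool) -> str:
    """Combine rules SS1 to SS3 into a single outcome."""
    if auth_failed:          # SS1: no user matched the spoken keyword
        return "alarm"
    if yz == 0:              # SS2: gesture did not match
        return "alarm"
    if not kernel_signal:    # SS3: sound pressure check did not pass
        return "alarm"
    return "identification_accurate"


def manage_usage(age: int, n1: int, n2: int) -> List[str]:
    """Rules M1 and M2: signals sent to the control unit based on the user's age."""
    signals = []
    if age < n1:
        signals.append("time_limit")         # M1: limit viewing time
    if age > n2:
        signals.append("reduce_brightness")  # M2: lower screen brightness
    return signals
```

With hypothetical thresholds N1 = 12 and N2 = 60, a user aged 8 receives the time limit signal, a user aged 70 receives the reduced brightness signal, and a user aged 30 receives neither.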
The automatic recommendation module is used for recommending suitable program types for the user; after the processor has compared the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module with the sound pressure information, keywords and verification information in the database and the user has been identified accurately, the processor generates voice recognition accurate information and sends it to the automatic recommendation module; the specific automatic recommendation steps are as follows:
p1: acquiring the number of times the user currently using the television has collected (favorited) programs within ten days, and marking it as Gu;
p2: acquiring the number of times the user currently using the television has watched programs within ten days, and marking it as Wu;
p3: acquiring the time the user currently using the television has spent browsing programs within ten days, and marking it as Ru;
p4: acquiring the number of times the user currently using the television has commented on programs within ten days, and marking it as Yu;
p5: obtaining the automatic recommendation coefficient Xu by using the formula
[formula image: Figure BDA0002579167250000101]
where mu is an error correction factor with the value 0.95682; d7, d8, d9 and d10 are all preset proportionality coefficients, with d7 + d8 + d9 + d10 = 1 and d7 > d8 > d9 > d10; and T1 and T2 are time proportionality coefficients;
p6: after the automatic recommendation coefficient Xu is calculated, the system automatically selects the three programs with the highest recommendation coefficients for recommendation.
The user registration module is used by users to submit, through a mobile phone terminal, the number of users and the data information of each user for registration, and to send the number of successfully registered users and their data information to the database for storage, wherein the data information of a user comprises name, age and photo; the user registration module transmits an end signal to the database after the user registration is successful;
after receiving the end signal transmitted by the user registration module, the database generates a voice recording start signal and sends it to the voice recording module; the voice recording module is used for binding each user with voice key information to form registered user information and storing the registered user information in the database; on receiving the voice recording start signal transmitted by the database, the voice recording module prompts the user to record his or her voice and then records the voice key information of each user, wherein the voice key information comprises the sound pressure information and the keyword of the user's voice; the keyword is a word defined by the user and is used for verifying the identity of the user;
the camera unit is used for collecting a verification gesture of the user during registration, the verification gesture being a gesture preset by the user for further verifying the identity of the user; the camera unit matches the verification gesture with the corresponding user to obtain verification information, and transmits the verification information to the database for storage.
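The registration flow above (user data first, then voice key information, then the verification gesture) can be sketched with a small in-memory record store. Everything here is an assumption made for illustration: the class and method names, keying records by user name, and representing the photo as bytes and the gesture as a label string.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RegisteredUser:
    # Data information submitted through the mobile phone terminal
    name: str
    age: int
    photo: bytes
    # Voice key information recorded by the voice recording module
    sound_pressure: Optional[float] = None
    keyword: Optional[str] = None
    # Verification gesture collected by the camera unit
    verification_gesture: Optional[str] = None


class Database:
    """Hypothetical in-memory stand-in for the database module."""

    def __init__(self) -> None:
        self.users: Dict[str, RegisteredUser] = {}

    def register(self, name: str, age: int, photo: bytes) -> None:
        self.users[name] = RegisteredUser(name, age, photo)

    def record_voice(self, name: str, sound_pressure: float, keyword: str) -> None:
        user = self.users[name]
        user.sound_pressure = sound_pressure
        user.keyword = keyword

    def record_gesture(self, name: str, gesture: str) -> None:
        self.users[name].verification_gesture = gesture

    def find_by_keyword(self, keyword: str) -> Optional[RegisteredUser]:
        # Return the first user whose custom keyword matches, if any
        for user in self.users.values():
            if user.keyword == keyword:
                return user
        return None


if __name__ == "__main__":
    db = Database()
    db.register("user_a", 30, b"")
    db.record_voice("user_a", 60.0, "open sesame")
    db.record_gesture("user_a", "wave")
    print(db.find_by_keyword("open sesame"))
```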
The control module is used for controlling the operation of the television by voice. After the processor has completed the comparison and discrimination, it generates accurate voice recognition information and sends it to the control module. The control module is divided into a television control unit, a home control unit, an entertainment unit, a weather query unit and a stock query unit, whose specific working processes are as follows:
a. the television control unit allows the television to be operated by voice:
a1, the user can turn on the television, play live broadcasts and change channels by voice;
a2, in any scene, the user can control the television sound by voice through the remote controller, raising or lowering the volume;
b. the home control unit takes the set-top box as a central control entrance to control all networked intelligent devices in a user family:
b1, the user can directly control the temperature, mode, wind speed and other functional settings of the air conditioner through the set-top box;
b2, the user can directly turn on or off the lamp through the set-top box;
b3, the user can directly operate the curtain through the set-top box;
c. the entertainment unit provides entertainment through the set-top box:
c1, the user can listen to favorite news programs through the set-top box;
c2, the user can listen to favorite music through the set-top box and, by voice, skip tracks, pause and resume playback;
c3, the user can play Tang poems through the set-top box;
c4, the user can chat with the set-top box in a human-machine dialogue mode;
d. the weather query unit lets the user query the weather of the current city directly by voice, as well as weather information for other cities and other times; after a successful query, the result is played to the user through TTS;
e. the stock query unit may query stock quotations for the user.
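The division of the control module into five units can be sketched as a simple router from recognized text to a unit. The text does not describe how commands are routed, so the trigger words, the naive substring matching and the function name `route_command` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical trigger words for each unit named in the text
UNIT_TRIGGERS: Dict[str, Tuple[str, ...]] = {
    "television control unit": ("channel", "volume", "live"),
    "home control unit": ("air conditioner", "lamp", "curtain"),
    "entertainment unit": ("news", "music", "poem", "chat"),
    "weather query unit": ("weather",),
    "stock query unit": ("stock",),
}


def route_command(text: str) -> Optional[str]:
    """Return the name of the unit that should handle the recognized text."""
    lowered = text.lower()
    for unit, triggers in UNIT_TRIGGERS.items():
        # First unit with a matching trigger word wins
        if any(trigger in lowered for trigger in triggers):
            return unit
    return None


if __name__ == "__main__":
    for command in ("Turn up the volume", "Close the curtain", "Hello"):
        print(command, "->", route_command(command))
```

Substring matching is the simplest possible choice and will misroute some utterances; a real system would rely on the recognizer's intent output instead.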
The working principle of the invention is as follows:
In operation of the smart television set-top box based on voice recognition, the recognition analysis module receives the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module, and performs identity verification on them in combination with the sound pressure information, keywords and verification information in the database. The identity verification proceeds as follows: the real-time sound pressure, recognized voice and real-time gesture transmitted by the corresponding external acquisition module are acquired; the recognized voice is compared with the verification information stored in the database by converting the recognized voice into text through a voice-to-text technology, marking the text as the word to be verified, matching the word to be verified against all keywords, and finding the user information corresponding to the keyword consistent with the word to be verified; if no corresponding user information exists, an authentication failure signal is generated; otherwise the corresponding user information is acquired together with its verification gesture and sound pressure information. The real-time gesture is then compared with the corresponding verification gesture: if they are consistent, the verification potential value Yz is set to 1, otherwise Yz is set to 0. The real-time sound pressure of the user is marked as Sy and the corresponding sound pressure information as Y, and a similarity value Xs is obtained by using a formula. When Xs is lower than X1, a kernel communication signal is generated and the corresponding user information is marked as the user; here X1 is a preset value.
The processor compares the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module with the sound pressure information, keywords and verification information in the database. If the processor receives the authentication failure signal, it generates an alarm signal and sends it to the alarm module; if the verification potential value Yz is 1, it generates an accurate identification information signal and sends it to the automatic recommendation module; if Yz is 0, it generates an alarm signal and sends it to the alarm module; if the processor receives the kernel communication signal, it generates an accurate identification information signal and sends it to the automatic recommendation module; otherwise, it generates an alarm signal and sends it to the alarm module. After the accurate identification information signal has been generated, the user's use of the set-top box is managed according to the user's authority in the database: if the age of the user is less than the set threshold N1, the processor generates a time limit signal and sends it to the control unit; if the age of the user is greater than the set threshold N2, the processor generates a brightness reducing signal and sends it to the control unit. By identifying the user through the real-time sound pressure, voice and real-time gesture, the system prevents other people from using it at will, prevents the user's information from being leaked, and improves the security of the set-top box.
After comparing the real-time sound pressure, recognized voice and real-time gesture, the processor generates accurate voice recognition information and sends it to the automatic recommendation module. The number of times the current user has added programs to favorites, the number of times the user has watched programs, the time spent browsing programs and the number of times the user has commented on programs within ten days are obtained, an automatic recommendation coefficient is obtained by using a formula, and the system then automatically selects the three programs with the highest recommendation coefficients for recommendation, so that videos can be recommended to the user and the time the user spends searching is reduced.
The control module is used for controlling the operation of the television by voice. After the processor has completed the comparison and discrimination, it generates accurate voice recognition information and sends it to the control module. The control module is divided into a television control unit, a home control unit, an entertainment unit, a weather query unit and a stock query unit. By voice, the user can turn on the television, play live broadcasts, switch programs and control the television volume through the remote controller; the home control unit uses the set-top box as a central control entrance to control all networked smart devices in the user's home; the weather query unit lets the user query the weather of the current city directly by voice, as well as weather information for other cities and other times, and plays the result to the user through TTS after a successful query; the stock query unit can query stock quotations for the user. The invention is highly intelligent and can greatly reduce the user's waiting time; at the same time, the intelligent control makes the set-top box more convenient to use and brings convenience to the user's life.
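The identity verification flow described above can be sketched as follows. Note the main assumption: the formula that combines the relative sound pressure deviation |Sy - Y| / Y with Yz is not legible in this text, so the sketch treats the gesture check (Yz) and the sound pressure check (deviation below X1) as two separate gates rather than reproducing the formula. The class, function and return-value names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredUser:
    name: str
    keyword: str           # custom keyword set at registration
    gesture: str           # verification gesture (label, for illustration)
    sound_pressure: float  # stored sound pressure information Y (non-zero)


def authenticate(word: str, real_time_gesture: str, sy: float,
                 users: List[StoredUser], x1: float) -> str:
    """Return "IDENTIFIED" or "ALARM" for one verification attempt."""
    # Match the word to be verified against all stored keywords
    user = next((u for u in users if u.keyword == word), None)
    if user is None:
        return "ALARM"  # authentication failure signal

    # Gesture comparison gives the verification potential value Yz
    yz = 1 if real_time_gesture == user.gesture else 0
    if yz == 0:
        return "ALARM"

    # Relative deviation of real-time sound pressure Sy from stored Y
    deviation = abs(sy - user.sound_pressure) / user.sound_pressure
    if deviation < x1:
        return "IDENTIFIED"  # accurate identification information signal
    return "ALARM"


if __name__ == "__main__":
    stored = [StoredUser("user_a", "open sesame", "wave", 60.0)]
    print(authenticate("open sesame", "wave", 61.0, stored, x1=0.05))
```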
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (3)

1. A smart television set-top box based on voice recognition, characterized by comprising a recognition analysis module, a database, an automatic recommendation module, a voice recording module, a user registration module, a camera unit, a processor, an alarm module, an external acquisition module and a control module;
the external acquisition module is used for acquiring the sound pressure and the recognition voice of the current user in real time and marking the sound pressure as real-time sound pressure, and is also used for acquiring the real-time gesture input by the user when in use, and the external acquisition module is used for transmitting the real-time sound pressure, the recognition voice and the real-time gesture to the recognition analysis module;
the recognition analysis module receives the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module, and is used for carrying out identity verification processing on the real-time sound pressure, recognized voice and real-time gesture in combination with the sound pressure information, keywords and verification information in the database, the identity verification processing comprising the following specific steps:
step one: acquiring the real-time sound pressure, recognized voice and real-time gesture transmitted by the corresponding external acquisition module;
step two: comparing the recognized voice with the verification information stored in the corresponding database, wherein the specific comparison process comprises the following steps:
s1: converting the recognized voice into text information through a voice-to-text technology, and marking the text information as a word to be verified;
s2: matching the words to be verified with all the keywords, and finding out the user information corresponding to the keywords consistent with the words to be verified; if no corresponding user information exists, generating an authentication failure signal;
step three: acquiring corresponding user information, and acquiring verification gesture and sound pressure information corresponding to the user information;
step four: acquiring the real-time sound pressure and the real-time gesture; verifying the real-time gesture and assigning a verification potential value according to the verification result, the specific processing steps being as follows:
s01: comparing the real-time gesture with the corresponding verification gesture, and if the real-time gesture is consistent with the corresponding verification gesture, giving a verification potential value Yz to be 1, otherwise giving a verification potential value Yz to be 0;
s02: acquiring real-time sound pressure of a user, marking the real-time sound pressure as Sy, and marking corresponding sound pressure information as Y;
s03: obtaining a similarity value Xs by using a formula; the specific formula is as follows:
Xs = {|Sy - Y| / Y} [operator image not reproduced] Yz;
in the formula, |Sy - Y| denotes the absolute value of the difference between Sy and Y;
s04: obtaining a similarity value Xs;
step five: when the Xs is lower than X1, generating a kernel communication signal, and marking corresponding user information as a user; here X1 is a preset value;
the processor is used for comparing the real-time sound pressure, the recognized voice and the real-time gesture transmitted by the external acquisition module with the sound pressure information, the key words and the verification information in the database;
SS 1: if the processor receives the authentication failure signal, generating an alarm signal and sending the alarm signal to an alarm module;
SS 2: if the verification potential value Yz is given as 1, generating an accurate identification information signal and sending the accurate identification information signal to an automatic recommendation module;
if the verification potential value Yz is given as 0, generating an alarm signal and sending the alarm signal to an alarm module;
SS 3: if the processor receives the kernel communication signal, generating an accurate identification information signal and sending the accurate identification information signal to the automatic recommendation module; otherwise, generating an alarm signal and sending the alarm signal to the alarm module;
SS 4: after the accurate identification information signal has been generated, managing the user's use of the set-top box according to the user's authority stored in the database, wherein the specific management steps are as follows:
m1: if the age of the user is less than the set threshold value N1, the processor generates a time limit signal and sends the signal to the control unit;
m2: if the age of the user is larger than a set threshold value N2, the processor generates a brightness reducing signal and sends the signal to the control unit;
the automatic recommendation module is used for recommending suitable program types to users; the processor compares the real-time sound pressure, recognized voice and real-time gesture transmitted by the external acquisition module with the sound pressure information, keywords and verification information in the database and, once the user has been identified, generates accurate voice recognition information and sends it to the automatic recommendation module; the specific automatic recommendation steps are as follows:
p1: acquiring the number of times the user currently using the television has added programs to favorites within ten days, and marking it as Gu;
p2: acquiring the number of times the user currently using the television has watched programs within ten days, and marking it as Wu;
p3: acquiring the time the user currently using the television has spent browsing programs within ten days, and marking it as Ru;
p4: acquiring the number of times the user currently using the television has commented on programs within ten days, and marking it as Yu;
p5: using the formula
[formula image not reproduced]
to obtain an automatic recommendation coefficient Xu, wherein μ is an error correction factor with the value 0.95682; d7, d8, d9 and d10 are all preset proportionality coefficients, with d7 + d8 + d9 + d10 = 1 and d7 < d8 < d9 < d10; and T1 and T2 are time proportionality coefficients;
p6: after the automatic recommendation coefficient Xu has been calculated, the system automatically selects the three programs with the highest recommendation coefficients for recommendation.
2. The smart television set-top box based on voice recognition as claimed in claim 1, wherein the user registration module is used by users to submit, through a mobile phone terminal, the number of users and the data information of each user for registration, and to send the number of successfully registered users and their data information to the database for storage, the data information of a user comprising name, age and photo; the user registration module transmits an end signal to the database after the user registration is successful;
after receiving the end signal transmitted by the user registration module, the database generates a voice recording start signal and sends it to the voice recording module; the voice recording module is used for binding each user with voice key information to form registered user information and storing the registered user information in the database; on receiving the voice recording start signal transmitted by the database, the voice recording module prompts the user to record his or her voice and then records the voice key information of each user, wherein the voice key information comprises the sound pressure information and the keyword of the user's voice;
the camera unit is used for collecting a verification gesture of the user during registration, the verification gesture being a gesture preset by the user for further verifying the identity of the user; the camera unit matches the verification gesture with the corresponding user to obtain verification information, and transmits the verification information to the database for storage.
3. The smart television set-top box based on voice recognition as claimed in claim 1, wherein the control module is used for controlling the operation of the television by voice; after the processor has completed the comparison and discrimination, it generates accurate voice recognition information and sends it to the control module; the control module is divided into a television control unit, a home control unit, an entertainment unit, a weather query unit and a stock query unit, whose specific working processes are as follows:
a. the television control unit allows the television to be operated by voice:
a1, the user can turn on the television, play live broadcasts and change channels by voice;
a2, in any scene, the user can control the television sound by voice through the remote controller, raising or lowering the volume;
b. the home control unit takes the set-top box as a central control entrance to control all networked intelligent devices in a user family:
b1, the user can directly control the temperature, mode and wind speed function setting of the air conditioner through the set-top box;
b2, the user can directly turn on or off the lamp through the set-top box;
b3, the user can directly operate the curtain through the set-top box;
c. the entertainment unit provides entertainment through the set-top box:
c1, the user can listen to favorite news programs through the set-top box;
c2, the user can listen to favorite music through the set-top box and, by voice, skip tracks, pause and resume playback;
c3, the user can play Tang poems through the set-top box;
c4, the user can chat with the set-top box in a human-machine dialogue mode;
d. the weather query unit lets the user query the weather of the current city directly by voice, as well as weather information for other cities and other times; after a successful query, the result is played to the user through TTS;
e. the stock query unit may query stock quotations for the user.
CN202010662605.2A 2020-07-10 2020-07-10 Smart television set top box based on voice recognition Active CN111741369B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010662605.2A CN111741369B (en) 2020-07-10 2020-07-10 Smart television set top box based on voice recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010662605.2A CN111741369B (en) 2020-07-10 2020-07-10 Smart television set top box based on voice recognition

Publications (2)

Publication Number Publication Date
CN111741369A CN111741369A (en) 2020-10-02
CN111741369B true CN111741369B (en) 2021-11-16

Family

ID=72654157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010662605.2A Active CN111741369B (en) 2020-07-10 2020-07-10 Smart television set top box based on voice recognition

Country Status (1)

Country Link
CN (1) CN111741369B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112327789B (en) * 2020-11-26 2023-04-28 江西台德智慧科技有限公司 Voice interaction system and method applied to intelligent voice assistant
CN112530167A (en) * 2020-12-07 2021-03-19 宜辰光电科技(安徽)有限公司 Control system of vehicle-mounted screen panel based on cloud platform
CN112735390B (en) * 2020-12-25 2023-02-28 江西台德智慧科技有限公司 Intelligent voice terminal equipment with voice recognition function
CN114339342A (en) * 2021-12-23 2022-04-12 歌尔科技有限公司 Remote controller control method, remote controller, control device and medium
CN115544362B (en) * 2022-10-11 2023-06-13 读书郎教育科技有限公司 Content recommendation system based on AI

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1977293A (en) * 2004-06-29 2007-06-06 皇家飞利浦电子股份有限公司 Personal gesture signature
CN101420543A (en) * 2008-12-05 2009-04-29 天津三星电子显示器有限公司 Method for voice controlling television and television therewith
CN105338390A (en) * 2015-12-09 2016-02-17 陈国铭 Intelligent television control system
CN105959806A (en) * 2016-05-25 2016-09-21 乐视控股(北京)有限公司 Program recommendation method and device
CN106060596A (en) * 2016-06-29 2016-10-26 江苏省公用信息有限公司 User grouping system and method of internet protocol television
CN108460329A (en) * 2018-01-15 2018-08-28 任俊芬 A kind of face gesture cooperation verification method based on deep learning detection
CN109195014A (en) * 2018-09-27 2019-01-11 江苏银河数字技术有限公司 Set-top-box system and its application method with user's identification and program push function
CN109660833A (en) * 2018-12-19 2019-04-19 四川省有线广播电视网络股份有限公司 Intelligent sound television system terminal portal design method
CN110363639A (en) * 2019-07-08 2019-10-22 广东工贸职业技术学院 A kind of financial management system based on artificial intelligence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101154011B1 (en) * 2010-06-07 2012-06-08 주식회사 서비전자 System and method of Multi model adaptive and voice recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of a User Identity Verification App Based on Computer Vision; 李丽慧 et al.; 《信息与电脑(理论版)》; 2017-09-08 (No. 17); full text *

Also Published As

Publication number Publication date
CN111741369A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111741369B (en) Smart television set top box based on voice recognition
KR101289081B1 (en) IPTV system and service using voice interface
US20140195230A1 (en) Display apparatus and method for controlling the same
US9520133B2 (en) Display apparatus and method for controlling the display apparatus
US20230409634A1 (en) Apparatus, systems and methods for media content searching
US20190333515A1 (en) Display apparatus, method for controlling the display apparatus, server and method for controlling the server
US11146841B2 (en) Voice-based television control method and intelligent terminal
CN102196207B (en) Method, device and system for controlling television by using voice
WO2016169329A1 (en) Voice-controlled electronic program method and device, and storage medium
WO2016206494A1 (en) Voice control method, device and mobile terminal
EP2919472A1 (en) Display apparatus, method for controlling display apparatus, and interactive system
WO2020133946A1 (en) Device control method, device, apparatus and medium
WO2015033523A1 (en) Voice interaction control method
CN102833582B (en) Method for searching audio and video resources via voice
CN105161106A (en) Voice control method of intelligent terminal, voice control device and television system
US9230559B2 (en) Server and method of controlling the same
CN103491411A (en) Method and device based on language recommending channels
CN110517686A (en) Intelligent sound box end voice opens the method and system of application
WO2020177687A1 (en) Mode setting method and device, electronic apparatus, and storage medium
CN114553349A (en) Method and system for collecting weather forecast information
CN112735403B (en) Intelligent home control system based on intelligent sound equipment
US10826961B2 (en) Multimedia player device automatically performs an operation triggered by a portable electronic device
US20220109914A1 (en) Electronic apparatus having notification function, and control method for electronic apparatus
KR101828715B1 (en) Bidirectional date service expansion system using shortening cord
CN108363770A (en) A kind of set-top box supports multipath extraction keyword and the method and system of search

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant