US20190378494A1 - Method and apparatus for outputting information - Google Patents

Method and apparatus for outputting information

Info

Publication number
US20190378494A1
Authority
US
United States
Prior art keywords
user
identity information
operation option
determining
voiceprint characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/298,714
Other languages
English (en)
Inventor
Zaipeng Hou
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd filed Critical Baidu Online Network Technology Beijing Co Ltd
Assigned to BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. reassignment BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HOU, ZAIPENG
Publication of US20190378494A1 publication Critical patent/US20190378494A1/en
Assigned to BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., SHANGHAI XIAODU TECHNOLOGY CO. LTD. reassignment BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Abandoned legal-status Critical Current

Classifications

    • G10L 15/07: Speech recognition; adaptation to the speaker
    • G10L 15/04: Speech recognition; segmentation; word boundary detection
    • G10L 15/083: Speech recognition; recognition networks
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 17/00: Speaker identification or verification techniques
    • G10L 2015/223: Execution procedure of a spoken command
    • H04N 21/42203: Client peripherals; sound input device, e.g. microphone
    • H04N 21/4415: Acquiring end-user identification using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • H04N 21/4753: End-user interface for inputting end-user data for user identification, e.g. by entering a PIN or password
    • H04N 21/8106: Monomedia components involving special audio data, e.g. different tracks for different languages
    • H04N 21/8133: Additional data specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program

Definitions

  • the present disclosure relates to the field of smart television technology, and specifically to a method and apparatus for outputting information.
  • Smart televisions have become widespread in our daily lives.
  • Smart televisions are no longer limited to traditional television program viewing functions.
  • The popular television application market provides thousands of television applications for users, covering live television streaming, video-on-demand, stocks and finance, healthy living, system optimization tools, etc.
  • However, smart televisions have numerous functions and present the same complicated operation interface to all user groups.
  • Embodiments of the present disclosure provide a method and apparatus for outputting information.
  • the embodiments of the present disclosure provide a method for outputting information.
  • the method includes: receiving a message of requesting to enter a target user mode, the message being inputted by a first user; determining identity information of the first user; determining whether the target user mode matches the identity information of the first user; and selecting, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set, to output the operation option page.
  • the method further includes: selecting, if the target user mode does not match the identity information, an operation option page matching a user mode matching the identity information of the first user from the preset operation option page set, to output the operation option page.
  • the determining identity information of the first user includes: in response to receiving first voice of the first user, generating a first voiceprint characteristic vector based on the first voice; and inputting the first voiceprint characteristic vector into a pre-trained voiceprint recognition model to obtain the identity information of the first user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user.
  • the determining identity information of the first user includes: outputting a question for verifying user identity information; determining, in response to receiving reply information inputted by the first user, whether an answer matching the reply information is included in a predetermined answer set, the answer corresponding to the user identity information; and determining, if the answer is included in the predetermined answer set, the user identity information corresponding to the answer matching the reply information as the identity information of the first user.
  • the generating a first voiceprint characteristic vector based on the first voice includes: importing the first voice into a pre-trained universal background model to perform mapping to obtain a first voiceprint characteristic super-vector, the universal background model being used to represent a corresponding relationship between voice and the voiceprint characteristic super-vector; and performing a dimension reduction on the first voiceprint characteristic super-vector to obtain the first voiceprint characteristic vector.
  • the method further includes: recording, in response to determining the first user belonging to a predetermined population group according to the identity information of the first user, a time point of determining the identity information of the first user as a viewing start time of the first user; and outputting time prompting information or performing a turnoff operation, in response to determining at least one of a difference between a current time and the viewing start time of the first user being greater than a viewing duration threshold of the predetermined population group or the current time being within a predetermined time interval.
  • the identity information includes at least one of: gender, age or family member identifier.
  • the method further includes: in response to receiving second voice of a second user, generating a second voiceprint characteristic vector based on the second voice; inputting the second voiceprint characteristic vector into a voiceprint recognition model to obtain identity information of the second user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user; and determining a younger user from the first user and the second user, and selecting an operation option page matching a user mode corresponding to the younger user from the preset operation option page set to output the operation option page.
  • the embodiments of the present disclosure provide an apparatus for outputting information.
  • the apparatus includes: a receiving unit, configured to receive a message of requesting to enter a target user mode, the message being inputted by a first user; a determining unit, configured to determine identity information of the first user; a matching unit, configured to determine whether the target user mode matches the identity information of the first user; and an outputting unit, configured to select, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set, to output the operation option page.
  • the outputting unit is further configured to: select, if the target user mode does not match the identity information, an operation option page matching a user mode matching the identity information of the first user from the preset operation option page set, to output the operation option page.
  • the determining unit is further configured to: generate, in response to receiving first voice of the first user, a first voiceprint characteristic vector based on the first voice; and input the first voiceprint characteristic vector into a pre-trained voiceprint recognition model to obtain the identity information of the first user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user.
  • determining the identity information of the first user includes: outputting a question for verifying user identity information; determining, in response to receiving reply information inputted by the first user, whether an answer matching the reply information is included in a predetermined answer set, the answer corresponding to the user identity information; and determining, if the answer is included in the predetermined answer set, the user identity information corresponding to the answer matching the reply information as the identity information of the first user.
  • generating the first voiceprint characteristic vector based on the first voice includes: importing the first voice into a pre-trained universal background model to perform mapping to obtain a first voiceprint characteristic super-vector, the universal background model being used to represent a corresponding relationship between voice and the voiceprint characteristic super-vector; and performing a dimension reduction on the first voiceprint characteristic super-vector to obtain the first voiceprint characteristic vector.
  • the apparatus further includes a prompting unit.
  • the prompting unit is configured to: record, in response to determining the first user belonging to a predetermined population group according to the identity information of the first user, a time point of determining the identity information of the first user as a viewing start time of the first user; and output time prompting information or perform a turnoff operation, in response to determining at least one of a difference between a current time and the viewing start time of the first user being greater than a viewing duration threshold of the predetermined population group or the current time being within a predetermined time interval.
  • the identity information includes at least one of: gender, age or family member identifier.
  • the apparatus further includes a switching unit.
  • the switching unit is configured to: generate, in response to receiving second voice of a second user, a second voiceprint characteristic vector based on the second voice; input the second voiceprint characteristic vector into a voiceprint recognition model to obtain identity information of the second user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user; and determine a younger user from the first user and the second user, and select an operation option page matching a user mode corresponding to the younger user from the preset operation option page set to output the operation option page.
  • the embodiments of the present disclosure provide an electronic device.
  • the electronic device includes: one or more processors; and a storage device, configured to store one or more programs.
  • the one or more programs when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation in the first aspect.
  • the embodiments of the present disclosure provide a computer readable medium storing a computer program.
  • the program when executed by a processor, implements the method described in any implementation in the first aspect.
  • the operation option page is selected according to the target user mode and then outputted. Accordingly, a personalized operation option page can be provided for different types of smart television users.
  • FIG. 1 is a diagram of an exemplary system architecture in which an embodiment of the present disclosure may be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for outputting information according to the present disclosure;
  • FIG. 3 is a schematic diagram of an application scenario of the method for outputting information according to the present disclosure;
  • FIG. 4 is a flowchart of another embodiment of the method for outputting information according to the present disclosure.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for outputting information according to the present disclosure.
  • FIG. 6 is a schematic structural diagram of a computer system adapted to implement an electronic device according to the embodiments of the present disclosure.
  • FIG. 1 shows an exemplary system architecture 100 in which an embodiment of a method for outputting information or an apparatus for outputting information according to the present disclosure may be applied.
  • the system architecture 100 may include a smart television 101 and a remote control 102 .
  • the smart television 101 is provided with a microphone 103 to collect the sound of a viewer.
  • the remote control 102 is used to remotely control the smart television 101 .
  • functions such as switching the channel of the smart television and playing a television program may be realized.
  • the smart television 101 can provide various entertainment, information and learning resources, such as web browsing, full high definition 3D motion-sensing games, video calls and online education.
  • the smart television 101 may be extended without limit, and can support tens of thousands of utility applications developed independently and shared by organizations, individuals, and professional and amateur software enthusiasts.
  • the smart television may realize various application services such as network search, network television, video-on-demand, digital music, online news, and network video calls.
  • a user may search for television channels and websites, record television programs, and play satellite programs, cable television programs and online videos.
  • the smart television 101 has a fully open platform and is equipped with an operating system, on which the user may install and uninstall programs provided by third party service providers, such as software or games. Through such programs, the user may continuously extend the functions of the television, and surf the Internet through a network cable or a wireless network.
  • the smart television 101 may collect the sound of the viewer through the microphone 103, and then recognize the identity of the viewer. Further, the smart television 101 provides different operating interfaces and different content for different identities.
  • the method for outputting information provided in the embodiments of the present disclosure is generally performed by the smart television 101 .
  • the apparatus for outputting information is generally provided in the smart television 101 .
  • FIG. 2 illustrates a flow 200 of an embodiment of a method for outputting information according to the present disclosure.
  • the method for outputting information includes the following steps 201 to 205.
  • Step 201 includes receiving a message of requesting to enter a target user mode, the message being inputted by a first user.
  • a performing subject of the method for outputting information (e.g., the smart television shown in FIG. 1 ) may receive, through a microphone, the voice verbally inputted by the user to request entering the target user mode, for example, “entering the child mode.”
  • the performing subject may receive the message of requesting to enter the target user mode sent by the user through a remote control.
  • a user mode may be a mode of an operation option page distinguished according to the age of a user, for example, an elderly mode, the child mode, and an adult mode.
  • the target user mode may be a user mode requested by the user, for example, one of the elderly mode, the child mode, and the adult mode.
  • the operation option page is a page for operating the smart television, the page being displayed on the home page of the smart television.
  • the operation option page of the elderly mode may omit some options such as a game option.
  • the operation option page of the elderly mode may also include some specific options such as a Chinese opera channel and a square dance channel.
  • the elderly mode may also enlarge the font of the operation option page, thereby facilitating the elderly to watch.
  • the child mode may filter out some programs that are not suitable for children to watch, set an eye protection mode, and control a volume and a viewing time.
  • the child mode may also display phonetic symbols on the operation option page to facilitate use by a child who is unable to read words.
  • the child mode may also add some cartoon images to the page to facilitate the child recognizing the operation options.
  • Step 202 includes determining identity information of the first user.
  • the identity information of the user may be determined by voice recognition or by the user inputting an identity identifier through the remote control.
  • the identity information may include family member identifiers such as father, mother, grandfather, grandmother, and daughter.
  • the identity information may also include categories such as child, adult, and elderly. This step is used to determine the identity information of the user requesting to enter the target user mode. An adult may help a child to request to enter the child mode, but the child cannot select the adult mode on his or her own.
  • the determining identity information of the first user may include the following steps 202A1 and 202A2.
  • Step 202A1 includes: in response to receiving first voice of the first user, generating a first voiceprint characteristic vector based on the first voice.
  • the voice inputted by the first user is referred to as the first voice.
  • the voice inputted by the second user is referred to as second voice.
  • the processing process for the first voice and the processing process for the second voice are the same.
  • voice is uniformly used to represent the first voice and the second voice.
  • the voice inputted verbally by the user may be received through the microphone.
  • the voice may include a remote control instruction (e.g., “turning on”) or an instruction other than a remote control instruction.
  • a voiceprint is an acoustic wave spectrum carrying verbal information and displayed by an electro-acoustic instrument.
  • the voiceprint characteristic vector may be a vector identifying a characteristic of the acoustic wave spectrum of the user. If a piece of audio includes sounds of a plurality of people, a plurality of voiceprint characteristic vectors may be extracted. It should be noted that generating the voiceprint characteristic vector based on the voice is a publicly known technique widely studied and applied at present, which will not be repeatedly described herein.
  • the generating the voiceprint characteristic vector based on the voice may be implemented by extracting a typical feature in the voice.
  • features of the sound such as a wavelength, a frequency, an intensity, and a rhythm can reflect the characteristics of the sound of the user. Therefore, when the voiceprint characteristic extraction is performed on the voice, the features in the sound such as the wavelength, the frequency, the intensity, and the rhythm may be extracted, and the feature values of the features such as the wavelength, the frequency, the intensity, and the rhythm in the voice may be determined.
  • the feature values of the features such as the wavelength, the frequency, the intensity, and the rhythm in the voice are used as elements in the voiceprint characteristic vector.
  • the generating the voiceprint characteristic vector based on the voice may also be implemented by extracting an acoustic feature in the voice, for example, a Mel-frequency cepstral coefficient.
  • the Mel-frequency cepstral coefficient is used as an element in the voiceprint characteristic vector.
  • the process of extracting the Mel-frequency cepstral coefficients from the voice may include pre-emphasis, framing, windowing, a fast Fourier transform, Mel filtering, a logarithmic transformation, and a discrete cosine transform, as sketched below.
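  • As an illustration only, the following minimal NumPy sketch walks through that pipeline; the parameter values (0.97 pre-emphasis coefficient, 26 filters, 13 coefficients, 25 ms frames) are common defaults rather than values specified in the disclosure.

    import numpy as np
    from scipy.fftpack import dct

    def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.01,
             n_filters=26, n_ceps=13, nfft=512):
        # Pre-emphasis: boost high frequencies.
        emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
        # Framing: split the (sufficiently long) signal into overlapping frames.
        flen, fstep = int(frame_len * sample_rate), int(frame_step * sample_rate)
        n_frames = 1 + max(0, (len(emphasized) - flen) // fstep)
        frames = np.stack([emphasized[i * fstep:i * fstep + flen]
                           for i in range(n_frames)])
        # Windowing: apply a Hamming window to each frame.
        frames = frames * np.hamming(flen)
        # Fast Fourier transform -> power spectrum.
        power = (np.abs(np.fft.rfft(frames, nfft)) ** 2) / nfft
        # Mel filtering: triangular filter bank spaced evenly on the mel scale.
        high_mel = 2595 * np.log10(1 + (sample_rate / 2) / 700)
        hz_points = 700 * (10 ** (np.linspace(0, high_mel, n_filters + 2) / 2595) - 1)
        bins = np.floor((nfft + 1) * hz_points / sample_rate).astype(int)
        fbank = np.zeros((n_filters, nfft // 2 + 1))
        for m in range(1, n_filters + 1):
            l, c, r = bins[m - 1], bins[m], bins[m + 1]
            fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
            fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
        filtered = np.maximum(power @ fbank.T, np.finfo(float).eps)
        # Logarithmic transformation, then discrete cosine transform.
        return dct(np.log(filtered), type=2, axis=1, norm='ortho')[:, :n_ceps]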
  • before inputting the voice, the user may mute the smart television through the remote control, so that the collected voice inputted by the user does not include the sound of a television program.
  • the smart television may also be muted by a predetermined voice command. For example, the user may verbally input “silent” to make the smart television muted.
  • an electronic device may import the voice into a pre-trained universal background model (UBM) to perform mapping to obtain a voiceprint characteristic super-vector (i.e., a Gaussian super-vector).
  • the universal background model is also referred to as a global background model for representing a general background characteristic.
  • the universal background model is obtained by training on the voices of a large number of impostors using the EM (Expectation-Maximization) algorithm.
  • the UBM is trained on voices from a large number of different speakers. It is assumed that the trained universal background model contains a plurality of Gaussian distributions.
  • the voiceprint characteristic super-vector of the speaker may be calculated.
  • the difference between the acoustic characteristic of the speaker and the universal background model is reflected. That is, the unique individuality in the pronunciation of the speaker is reflected.
  • the voice of the user, which has an indefinite length, may thus finally be mapped onto a voiceprint characteristic super-vector having a fixed length that reflects the vocalization characteristics of the user.
  • Such high-dimensional voiceprint characteristic super-vector not only includes an individual difference in pronunciation, but may also include a difference caused by a channel. Therefore, a dimension reduction is also required to be performed on the super-vector through some supervised dimension reduction algorithms, to map the super-vector onto a lower-dimensional vector.
  • the dimension reduction may be performed on the voiceprint characteristic super-vector through a Joint Factor Analysis (JFA) method to obtain the voiceprint characteristic vector.
  • the Joint Factor Analysis method is an effective algorithm for channel compensation in voiceprint authentication algorithms, which estimates a channel factor by assuming that a speaker space and a channel space are independent, and the speaker space and the channel space may be described by two low-dimensional factor spaces respectively.
  • the dimension reduction may be performed on the voiceprint characteristic super-vector through a probabilistic linear discriminant analysis (PLDA) algorithm to obtain the voiceprint characteristic vector.
  • the probabilistic linear discriminant analysis algorithm is also a channel compensation algorithm, which is a linear discriminant analysis (LDA) algorithm in a probabilistic form.
  • the dimension reduction may alternatively be performed on the voiceprint characteristic super-vector using an identity vector (i-vector) approach to obtain the voiceprint characteristic vector.
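  • The disclosure names JFA, PLDA and i-vectors for this channel-compensation step. Purely as a simplified stand-in that conveys the same idea (a supervised projection that separates speakers while suppressing within-speaker variability), the following sketch reduces hypothetical super-vectors with plain linear discriminant analysis; all data here is synthetic.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Hypothetical training data: one high-dimensional voiceprint characteristic
    # super-vector per utterance, labeled with the speaker who produced it.
    rng = np.random.default_rng(0)
    supervectors = rng.normal(size=(200, 1024))   # 200 utterances, 1024 dims
    speaker_ids = rng.integers(0, 10, size=200)   # 10 enrolled speakers

    # Supervised dimension reduction: project the super-vectors onto a space
    # that separates speakers, suppressing within-speaker (channel) variation.
    lda = LinearDiscriminantAnalysis(n_components=9)  # at most n_classes - 1
    low_dim = lda.fit_transform(supervectors, speaker_ids)

    # A new utterance's super-vector is mapped into the same low-dim space.
    voiceprint_vector = lda.transform(supervectors[:1])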
  • a plurality of pieces of voice generally need to be provided for training the universal background model.
  • a plurality of voiceprint characteristic vectors of the above voice are extracted and obtained.
  • the voiceprint characteristic vector of the user may be stored, and voiceprint characteristic vectors of a plurality of users constitute a voiceprint library.
  • a Gaussian mixture model may be trained through the Expectation Maximization algorithm.
  • This model describes a probability distribution of voice characterization data of many people, which may be understood as the commonality of all the speakers.
  • the model serves as a prior model for the voiceprint model of a specific speaker. Therefore, this Gaussian mixture model is also referred to as the UBM model.
  • the universal background model may also be constructed through a deep neural network.
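  • A compact sketch of how such an EM-trained Gaussian mixture UBM and the Gaussian super-vector might be realized is shown below. The relevance-MAP mean adaptation is the classical GMM-UBM recipe rather than a procedure spelled out in the disclosure, and the frame data and parameter values are illustrative.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Hypothetical background data: acoustic feature frames pooled from many speakers.
    rng = np.random.default_rng(0)
    background_frames = rng.normal(size=(5000, 13))   # 5000 frames of 13-dim MFCCs

    # Train the universal background model with the EM algorithm.
    ubm = GaussianMixture(n_components=64, covariance_type='diag', max_iter=100)
    ubm.fit(background_frames)

    def supervector(ubm, frames, relevance=16.0):
        """MAP-adapt the UBM means to one user's frames and concatenate them."""
        resp = ubm.predict_proba(frames)            # (T, K) responsibilities
        n_k = resp.sum(axis=0)                      # soft frame counts per mixture
        f_k = resp.T @ frames                       # (K, D) first-order statistics
        alpha = (n_k / (n_k + relevance))[:, None]  # adaptation weights
        means = alpha * (f_k / np.maximum(n_k[:, None], 1e-8)) \
                + (1 - alpha) * ubm.means_
        return means.ravel()                        # (K * D,) Gaussian super-vector

    user_frames = rng.normal(size=(300, 13))
    sv = supervector(ubm, user_frames)   # fixed length, regardless of utterance length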
  • the voice may be processed to filter out a noise.
  • the noise in the voice is filtered out through a singular value decomposition algorithm or a filter algorithm.
  • the noise herein may include a discordant sound having a confusing change in pitch and intensity.
  • the noise herein may also include a sound that interferes with the recognition for a target sound, for example, background music.
  • the singular value decomposition (SVD) is an important matrix factorization in linear algebra, and generalizes the unitary diagonalization of a normal matrix in matrix analysis. The SVD has important applications in the fields of signal processing and statistics.
  • the SVD-based de-noising technique is one of the subspace algorithms.
  • a noisy signal vector space is decomposed into two subspaces respectively dominated by a pure signal and a noise signal. Then, the pure signal is estimated by simply removing the noisy signal vector component in the “noise space.”
  • the noise in an audio file may also be filtered out through an adaptive filter method or a Kalman filter method.
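  • As one way the SVD-based subspace de-noising described above could look in practice, the sketch below stacks signal frames into a matrix, discards the singular components dominated by noise, and reconstructs the estimated pure signal. The frame length and retained rank are illustrative choices; real systems usually pick the rank from the singular-value spectrum.

    import numpy as np

    def svd_denoise(signal, frame=256, rank=8):
        """Subspace de-noising: keep the components dominated by the pure signal."""
        n = (len(signal) // frame) * frame
        X = signal[:n].reshape(-1, frame)              # frames as matrix rows
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s[rank:] = 0.0                                 # remove the "noise space"
        return ((U * s) @ Vt).ravel()                  # estimated pure signal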
  • the voice is usually framed at intervals of 20-50 ms, and each frame of voice may then be mapped to a fixed-length acoustic characteristic sequence by feature extraction algorithms (mainly performing a conversion from the time domain to the frequency domain).
  • Step 202A2 includes inputting the first voiceprint characteristic vector into a pre-trained voiceprint recognition model to obtain the identity information of the first user.
  • the voiceprint recognition model is used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user.
  • the identity information of the user may include at least one of: gender, age or family member identifier.
  • the age may be a certain age range, for example, 4-8 years old, and 20-30 years old. Gender and age may be combined to determine the specific identity of the user. For example, child, elderly, adult female, and adult male may be recognized.
  • the family member identifier may be used to identify a pre-registered family member, for example, mother, father, daughter, and grandmother. If only one family member has a given age range and gender, the family member may be directly determined using the age and the gender of the user.
  • for example, if the family members include a mother, a father, a daughter and a grandmother, it may be determined that the female aged between 50 and 60 is the grandmother, and the female aged between 4 and 8 is the daughter.
  • the voiceprint recognition model may include a classifier, which can map a voiceprint characteristic vector in the voiceprint characteristic vector library to a certain one of given categories of the user, and thus the model may be applied to the prediction for the category of the user.
  • the classification may be performed based on the age, the gender, or a combination of the age and the gender, for example, girl, male adult, and female elderly. That is, the category of the user may be outputted by inputting the voiceprint characteristic vector into the classifier.
  • the classifier used in this embodiment may include a decision tree, a logistic regression, a naive Bayes, a neural network, etc. Based on a simple probability model, the classifier uses the largest probability value to perform a classification prediction on the data.
  • the classifier is trained in advance. The classifier may be trained by extracting a voiceprint characteristic vector from a large number of sound samples.
  • the configuration and the implementation for the classifier may include: 1) selecting samples (including a positive sample and a negative sample), all the samples being divided into a training sample and a test sample; 2) performing a classifier algorithm based on the training sample, to generate the classifier; 3) inputting the test sample into the classifier to generate a prediction result; and 4) calculating a necessary evaluation index according to the prediction result, to evaluate a performance of the classifier.
  • sounds of a large number of children are collected as the positive sample
  • sounds of a large number of adults are collected as the negative sample.
  • the classifier algorithm is performed to generate the classifier.
  • the positive sample and the negative sample are respectively inputted into the classifier, to generate the prediction result to verify whether the result is child.
  • the performance of the classifier is evaluated according to the prediction result.
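  • The four-step configuration above maps directly onto a standard train/test loop. A minimal sketch with scikit-learn follows; logistic regression is just one of the classifiers the disclosure lists, and the voiceprint vectors and labels here are synthetic placeholders.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    # Hypothetical samples: voiceprint characteristic vectors with labels,
    # 1 = child (positive sample), 0 = adult (negative sample).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 64))
    y = rng.integers(0, 2, size=1000)

    # 1) Divide all samples into training samples and test samples.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    # 2) Run the classifier algorithm on the training samples.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # 3) Input the test samples into the classifier to generate predictions.
    y_pred = clf.predict(X_test)

    # 4) Calculate evaluation indexes to evaluate classifier performance.
    print(accuracy_score(y_test, y_pred),
          precision_score(y_test, y_pred),
          recall_score(y_test, y_pred))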
  • the voiceprint recognition model may further include a family member mapping table.
  • the family member mapping table records a corresponding relationship between the family member identifier, the gender, and the age.
  • the family member identifier may be determined by retrieving the classification result of the classifier from the family member mapping table. For example, if the result outputted by the classifier is a female aged between 50 and 60, the family member identifier of this user is determined as the grandmother through the family member mapping table.
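  • A family member mapping table of this kind can be as simple as a lookup from the classifier's (gender, age range) output to a member identifier; the entries below are hypothetical.

    # Hypothetical family member mapping table: (gender, age range) -> identifier.
    FAMILY_MEMBER_TABLE = {
        ('female', (50, 60)): 'grandmother',
        ('female', (30, 40)): 'mother',
        ('male',   (30, 40)): 'father',
        ('female', (4, 8)):   'daughter',
    }

    def lookup_family_member(gender, age):
        """Retrieve the classifier's gender/age result from the mapping table."""
        for (g, (lo, hi)), member in FAMILY_MEMBER_TABLE.items():
            if g == gender and lo <= age <= hi:
                return member
        return None   # no registered family member matches

    print(lookup_family_member('female', 55))   # -> 'grandmother'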
  • the voiceprint recognition model may be the voiceprint library.
  • the voiceprint library is used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information.
  • the voiceprint characteristic vector is inputted into a predetermined voiceprint library to perform matching, and a first predetermined number of pieces of identity information are selected based on a descending order of matching degrees and outputted.
  • the voiceprint characteristic vector of the user may be constructed through step 201 , and then, the corresponding relationship between the voiceprint characteristic vector and the identity information is established.
  • the voiceprint library is constructed by registering corresponding relationships between the voiceprint characteristic vectors of a plurality of users and the identity information of the users.
  • the matching degree between the above voiceprint characteristic vector and the vectors in the above voiceprint library may be calculated using a Manhattan distance, a Minkowski distance, or a cosine similarity.
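  • For instance, matching against the voiceprint library with cosine similarity and returning the first predetermined number of identities in descending order of matching degree might look like this sketch (the library contents are hypothetical):

    import numpy as np

    def match_identity(vector, voiceprint_library, top_n=1):
        """Rank stored identities by cosine similarity to the query vector."""
        scored = []
        for identity, stored in voiceprint_library.items():
            cos = np.dot(vector, stored) / (np.linalg.norm(vector)
                                            * np.linalg.norm(stored))
            scored.append((cos, identity))
        scored.sort(reverse=True)                 # descending matching degree
        return [identity for _, identity in scored[:top_n]]

    library = {'grandmother': np.ones(64), 'daughter': -np.ones(64)}
    print(match_identity(np.ones(64), library))   # -> ['grandmother']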
  • determining the identity information of the first user may include the following steps 202B1 to 202B3.
  • Step 202B1 includes outputting a question for verifying user identity information.
  • the question is mainly used to prevent a child from pretending to be an adult. Therefore, the question may be set as a question difficult for the child to answer, for example, “please input the mode switching password” is displayed on the television screen, or “please input the mode switching password” is prompted by voice.
  • the question may alternatively be randomly generated. For example, an English question, a mathematic question, or an ancient poetry question may be set to ask the user to give an answer. The user may select or directly input an answer via the remote control, or answer by voice.
  • Step 202B2 includes determining, in response to receiving reply information inputted by the first user, whether a predetermined answer set includes an answer matching the reply information.
  • the answer corresponds to the user identity information.
  • each password corresponds to a kind of user identity information.
  • the smart television may determine the user identity information according to the reply information inputted by the user. For example, the adult password is preset to “adult,” and the child password is preset to “child.” If the smart television receives “adult,” the user may be determined as an adult. For questions with fixed answers, the reply information inputted by the user may be compared with the fixed answers. For convenience of answering, a multiple-choice question may be provided, so that the user only needs to select A, B, C, or D.
  • Step 202B3 includes determining, if the answer is included in the predetermined answer set, the user identity information corresponding to the answer matching the reply information as the identity information of the first user.
  • the answer corresponds to the user identity information. Different answers correspond to different identity information. If the question is a password question, each password corresponds to a kind of user identity information. The corresponding user identity may be found according to the password answered by the user. If the question is not a password question, whether the answer is correct may be determined according to the reply information inputted by the user. If there is no answer matching the reply information in the predetermined answer set, the answer is incorrect, and the identity information of the user cannot be identified. If there is an answer matching the reply information in the predetermined answer set, the answer is correct, and the identity information of the user is determined according to the corresponding relationship between the answer and the user identity information.
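  • Steps 202B2 and 202B3 amount to a lookup from answers to identity information. A minimal sketch, with hypothetical passwords standing in for the predetermined answer set:

    # Hypothetical predetermined answer set: each answer corresponds to a kind
    # of user identity information (here, mode-switching passwords).
    ANSWER_SET = {
        'adult': 'adult',   # adult password
        'child': 'child',   # child password
    }

    def verify_identity(reply):
        """Return the identity information matching the reply, or None if the
        predetermined answer set contains no matching answer."""
        return ANSWER_SET.get(reply.strip().lower())

    print(verify_identity('Adult'))    # -> 'adult'
    print(verify_identity('guess'))    # -> None (identity cannot be identified)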
  • Step 203 includes matching the target user mode with the identity information of the first user.
  • each kind of identity information matches at least one user mode.
  • the adult may match the child mode, the elderly mode, and the adult mode.
  • the elderly may match the child mode and the elderly mode.
  • the child only matches the child mode. If the determined identity information is the child, and the target user mode requested by the user is the adult mode, the target user mode does not match the identity information. If the determined identity information is the child, and the target user mode requested by the user is the child mode, the target user mode matches the identity information.
  • the adult may help the child or the elderly to select the target user mode. Only with the help of an adult can the child enter the adult mode, so that the child uses the adult mode under the supervision of the adult. Without adult supervision, the child can only enter the child mode.
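  • The matching rules in this passage can be captured by a small table of allowed modes per identity, with a fallback to the mode matching the identity (steps 204 and 205 below). A sketch under those assumptions:

    # Each kind of identity information matches at least one user mode.
    ALLOWED_MODES = {
        'adult':   {'child mode', 'elderly mode', 'adult mode'},
        'elderly': {'child mode', 'elderly mode'},
        'child':   {'child mode'},
    }

    def resolve_mode(target_mode, identity):
        """Enter the target mode only if it matches the identity; otherwise
        fall back to the user mode matching the identity."""
        if target_mode in ALLOWED_MODES.get(identity, set()):
            return target_mode
        return identity + ' mode'

    print(resolve_mode('adult mode', 'child'))   # -> 'child mode'
    print(resolve_mode('child mode', 'child'))   # -> 'child mode'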
  • Step 204 includes selecting, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set to output the operation option page.
  • different user modes correspond to different operation option pages. If the target user mode matches the identity information, the target user mode requested by the user may be directly entered.
  • the operation option page may include the home page of the smart television.
  • the operation option page may alternatively include operation options in a menu form.
  • the operation options may include a channel option, a sound option, an image option, etc.
  • the operation option pages in the preset operation option page set are different from each other. For example, the font on the operation option page for the elderly mode is thick and large, and the number of operation options on the page is small, to prevent an overly complicated operation from hindering use by the elderly.
  • Some channel options may be removed from the operation option page for the child mode, and the phonetic alphabet may be displayed for an under-age child to read.
  • the operation option page for the adult mode may show all the functions supported by the smart television.
  • Step 205 includes selecting, if the target user mode does not match the identity information, an operation option page matching a user mode matching the identity information of the first user from the preset operation option page set, to output the operation option page.
  • the target user mode requested by the user is not entered, and the user mode matching the identity information of the user is entered.
  • the identity information of the user is the child.
  • the user requests to enter the adult mode, but since the requested user mode does not match the actual identity of the user, the user is only allowed to enter the child mode.
  • the user may enter a predetermined guest mode. Specific permissions are set for a guest; for example, the guest cannot watch a paid program. Alternatively, the child mode is used for the guest by default.
  • the above method may further include the following steps 2051 and 2052.
  • Step 2051 includes recording, in response to determining the first user belonging to a predetermined population group according to the identity information of the first user, a time point of determining the identity information of the first user as a viewing start time of the first user.
  • the predetermined population group may be the elderly or the children. For the health of the elderly or the children, their viewing duration needs to be controlled. Therefore, the time when the user starts viewing the television is recorded as the viewing start time of the user.
  • the viewing start time may be recorded after the identity information of the first user is determined in step 202. Not only the viewing duration, but also the specific time of day may be monitored.
  • for example, the elderly or the child may not be allowed to watch the television after 12 o'clock at night.
  • Step 2052 includes outputting, in response to determining a difference between a current time and the viewing start time of the first user being greater than a viewing duration threshold of the predetermined population group and/or the current time being within a predetermined time interval, time prompting information and/or performing a turnoff operation.
  • the difference between the current time and the viewing start time of the user may be used as the viewing duration of the user.
  • the television program is no longer played or the television is turned off.
  • the user may be notified of an upcoming timeout in advance in a form of text or voice.
  • the predetermined time interval in which the predetermined population group is prohibited from watching the television may further be set, for example, the time interval from 12:00 midnight to 6:00 am.
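  • Steps 2051 and 2052 combine a duration threshold with a forbidden time interval. A sketch of that check, with hypothetical thresholds and the midnight-to-6:00 am interval used above:

    from datetime import datetime, time, timedelta

    # Hypothetical viewing duration thresholds per predetermined population group.
    VIEWING_LIMITS = {'child': timedelta(hours=2), 'elderly': timedelta(hours=3)}
    FORBIDDEN_START, FORBIDDEN_END = time(0, 0), time(6, 0)   # midnight-6:00 am

    def check_viewing(group, viewing_start, now=None):
        """Return True when time prompting information should be outputted or a
        turnoff operation performed: the viewing duration exceeds the group's
        threshold, or the current time falls inside the forbidden interval."""
        now = now or datetime.now()
        over_duration = now - viewing_start > VIEWING_LIMITS[group]
        in_forbidden = FORBIDDEN_START <= now.time() < FORBIDDEN_END
        return over_duration or in_forbidden

    start = datetime(2019, 3, 11, 19, 0)
    print(check_viewing('child', start, now=datetime(2019, 3, 11, 21, 30)))  # True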
  • FIG. 3 is a schematic diagram of an application scenario of the method for outputting information according to this embodiment.
  • the child user inputs the voice of entering the target user mode “entering the child mode” into the smart television through the microphone.
  • the smart television extracts the voiceprint characteristic vector according to the voice “entering the child mode,” and then determines the identity information of the user as the child through the pre-trained voiceprint recognition model.
  • the target user mode (child mode) matches the identity information of the user “child.”
  • the operation option page corresponding to the child mode is selected from the preset set of respective operation options for the child, the adult, and the elderly, to be outputted.
  • the operation option page in the child mode adds information such as the phonetic alphabet and cartoon characters for use by the child. In addition, other functions not suitable for the child are disabled.
  • the physical and mental health of the specific population group may be protected while a personalized operation option page is provided for a smart television user of a different type.
  • FIG. 4 illustrates a flow 400 of another embodiment of the method for outputting information.
  • the flow 400 of the method for outputting information includes the following steps 401 to 408 .
  • Step 401 includes: receiving a message of requesting to enter a target user mode, the message being inputted by a first user.
  • Step 402 includes determining identity information of the first user.
  • Step 403 includes matching the target user mode with the identity information of the first user.
  • Step 404 includes selecting, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set to output the operation option page.
  • Step 405 includes: selecting, if the target user mode does not match the identity information, an operation option page matching a user mode matching the identity information of the first user from the preset operation option page set, to output the operation option page.
  • Steps 401-405 are substantially the same as steps 201-205, which will not be repeatedly described.
  • Step 406 includes: in response to receiving second voice of a second user, generating a second voiceprint characteristic vector based on the second voice.
  • the second voiceprint characteristic vector may be generated based on the second voice.
  • the specific process is substantially the same as the process of generating the first voiceprint characteristic vector based on the first voice, which will not be repeatedly described.
  • Step 407 includes: inputting the second voiceprint characteristic vector into a voiceprint recognition model to obtain identity information of the second user.
  • the voiceprint recognition model is used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user.
  • as in step 202A2, the specific process is substantially the same as the process of inputting the first voiceprint characteristic vector into the voiceprint recognition model to obtain the identity information of the first user, which will not be repeatedly described.
  • Step 408 includes: determining a younger user from the first user and the second user, and selecting an operation option page matching a user mode corresponding to the younger user from the preset operation option page set to output the operation option page.
  • the voiceprint recognition model may recognize a general age of the user.
  • the operation option page matching the user mode corresponding to the younger user is selected from the preset operation option page set to be outputted. For example, if the first user is a child, even if the second user is an adult, the output is performed according to the operation option page corresponding to the child mode; the original user mode is kept, and the operation option page does not need to be switched. If the first user is an adult and the current mode is the adult mode, the mode needs to be switched to the child mode when the second user is a child.
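  • The switching rule in step 408 reduces to choosing the mode of the youngest recognized viewer. A sketch under the assumption that identities are the coarse age categories used above:

    def mode_for_viewers(first_identity, second_identity):
        """Pick the user mode for the younger of the two recognized viewers."""
        rank = {'child': 0, 'adult': 1, 'elderly': 2}   # ordered youngest first
        younger = min(first_identity, second_identity, key=rank.__getitem__)
        return younger + ' mode'

    print(mode_for_viewers('adult', 'child'))   # -> 'child mode' (switch needed)
    print(mode_for_viewers('child', 'adult'))   # -> 'child mode' (mode kept)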
  • the flow 400 of the method for outputting information in this embodiment emphasizes the step of switching the user mode. Accordingly, the solution described in this embodiment may introduce a protection for the younger user when different users are watching the television at the same time. Thus, the comprehensiveness of the protection for the child is improved.
  • the present disclosure provides an embodiment of an apparatus for outputting information.
  • the embodiment of the apparatus corresponds to the embodiment of the method shown in FIG. 2 , and the apparatus may be applied in various electronic devices.
  • the apparatus 500 for outputting information in this embodiment includes: a receiving unit 501, a determining unit 502, a matching unit 503, and an outputting unit 504.
  • the receiving unit 501 is configured to receive a message of requesting to enter a target user mode, the message being inputted by a first user.
  • the determining unit 502 is configured to determine identity information of the first user.
  • the matching unit 503 is configured to determine whether the target user mode matches the identity information of the first user.
  • the outputting unit 504 is configured to select, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set, to output the operation option page.
  • for specific processes of the receiving unit 501, the determining unit 502, the matching unit 503, and the outputting unit 504 in the apparatus 500 for outputting information, reference may be made to steps 201, 202, 203, and 204 in the corresponding embodiment of FIG. 2.
  • the outputting unit 504 is further configured to: select, if the target user mode does not match the identity information, an operation option page matching a user mode matching the identity information of the first user from the preset operation option page set, to output the operation option page.
  • the determining unit 502 is further configured to: generate, in response to receiving first voice of the first user, a first voiceprint characteristic vector based on the first voice; and input the first voiceprint characteristic vector into a pre-trained voiceprint recognition model to obtain the identity information of the first user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user.
  • the determining unit 502 is further configured to: output a question for verifying user identity information; determine, in response to receiving reply information inputted by the first user, whether a predetermined answer set includes an answer matching the reply information, the answer corresponding to the user identity information; and determine, if the answer is included in the predetermined answer set, the user identity information corresponding to the answer matching the reply information as the identity information of the first user.
  • the determining unit 502 is further configured to: import the first voice into a pre-trained universal background model to perform mapping to obtain a first voiceprint characteristic super-vector, the universal background model being used to represent a corresponding relationship between the voice and the voiceprint characteristic super-vector; and perform a dimension reduction on the first voiceprint characteristic super-vector to obtain the first voiceprint characteristic vector.
  • the apparatus 500 further includes a prompting unit (not shown).
  • the prompting unit is configured to: record, in response to determining the first user belonging to a predetermined population group according to the identity information of the first user, a time point of determining the identity information of the first user as a viewing start time of the first user; and output time prompting information and/or perform a turnoff operation, in response to determining a difference between a current time and the viewing start time of the first user being greater than a viewing duration threshold of the predetermined population group and/or the current time being within a predetermined time interval.
  • the identity information includes at least one of: gender, age or family member identifier.
  • the apparatus 500 further includes a switching unit.
  • the switching unit is configured to: generate, in response to receiving second voice of a second user, a second voiceprint characteristic vector based on the second voice; input the second voiceprint characteristic vector into a voiceprint recognition model to obtain identity information of the second user, the recognition model being used to represent a corresponding relationship between the voiceprint characteristic vector and the identity information of the user; and determine a younger user from the first user and the second user, and select an operation option page matching a user mode corresponding to the younger user from the preset operation option page set to output the operation option page.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 adapted to implement an electronic device (e.g., the smart television shown in FIG. 1 ) of the embodiments of the present disclosure.
  • the electronic device shown in FIG. 6 is merely an example and should not impose any restriction on the function and scope of use of the embodiments of the present disclosure.
  • the computer system 600 includes a central processing unit (CPU) 601, which may execute various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded into a random access memory (RAM) 603 from a storage portion 608.
  • the RAM 603 further stores various programs and data required by operations of the system 600.
  • the CPU 601, the ROM 602 and the RAM 603 are connected to each other via a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, etc.; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display device (LCD), a speaker, etc.; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card, for example, a LAN card and a modem.
  • the communication portion 609 performs communication processes via a network such as the Internet.
  • a driver 610 is also connected to the I/O interface 605 as required.
  • a removable medium 611, for example, a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, may be installed on the driver 610, to facilitate the installation of a computer program from the removable medium 611 onto the storage portion 608 as needed.
  • an embodiment of the present disclosure includes a computer program product, including a computer program hosted on a computer readable medium, the computer program including program codes for performing the method as illustrated in the flowchart.
  • the computer program may be downloaded and installed from a network via the communication portion 609 , and/or may be installed from the removable medium 611 .
  • the computer program when executed by the central processing unit (CPU) 601 , implements the above mentioned functionalities as defined by the method of the present disclosure.
  • the computer readable medium in the present disclosure may be a computer readable signal medium, a computer readable storage medium, or any combination of the two.
  • the computer readable storage medium may be, but is not limited to: an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or element, or any combination of the above.
  • a more specific example of the computer readable storage medium may include, but is not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read only memory (ROM), an erasable programmable read only memory (EPROM or flash memory), an optical fiber, a portable compact disk read only memory (CD-ROM), an optical memory, a magnetic memory, or any suitable combination of the above.
  • the computer readable storage medium may be any physical medium containing or storing programs, which may be used by a command execution system, apparatus or element or incorporated thereto.
  • the computer readable signal medium may include a data signal that is propagated in a baseband or as a part of a carrier wave, which carries computer readable program codes. Such propagated data signal may be in various forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above.
  • the computer readable signal medium may also be any computer readable medium other than the computer readable storage medium.
  • the computer readable medium is capable of transmitting, propagating or transferring programs for use by, or used in combination with, a command execution system, apparatus or element.
  • the program codes contained on the computer readable medium may be transmitted with any suitable medium including, but not limited to, wireless, wired, optical cable, RF medium, or any suitable combination of the above.
  • A computer program code for executing the operations according to the present disclosure may be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also conventional procedural programming languages such as the “C” language or similar programming languages.
  • The program codes may be executed entirely on a user computer, partially on the user computer, as a standalone package, partially on the user computer and partially on a remote computer, or entirely on the remote computer or a server.
  • The remote computer may be connected to the user computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet provided by an Internet service provider).
  • Each of the blocks in the flowcharts or block diagrams may represent a module, a program segment, or a code portion, the module, the program segment, or the code portion comprising one or more executable instructions for implementing specified logic functions.
  • The functions denoted by the blocks may occur in a sequence different from the sequences shown in the figures. For example, any two blocks presented in succession may be executed substantially in parallel, or may sometimes be executed in a reverse sequence, depending on the function involved.
  • Each block in the block diagrams and/or flowcharts, as well as a combination of blocks, may be implemented using a dedicated hardware-based system executing specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments of the present disclosure may be implemented by means of software or hardware.
  • The described units may also be provided in a processor, for example, described as: a processor comprising a receiving unit, a determining unit, a matching unit, and an outputting unit.
  • The names of these units do not in some cases constitute a limitation to the units themselves.
  • For example, the receiving unit may also be described as “a unit for receiving a message of requesting to enter a target user mode, the message being inputted by a first user.”
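  • As a software illustration of such a unit structure, the sketch below groups the four named units into a processor object; the class names and method signatures are hypothetical, since the disclosure defines the units by function rather than by any particular code layout.

```python
# Hypothetical grouping of the described units into a processor object;
# only the unit roles come from the disclosure, the signatures are assumed.

class ReceivingUnit:
    def receive(self):
        """A unit for receiving a message of requesting to enter a target
        user mode, the message being inputted by a first user."""
        raise NotImplementedError

class DeterminingUnit:
    def determine_identity(self, user):
        """A unit for determining identity information of the first user."""
        raise NotImplementedError

class MatchingUnit:
    def matches(self, target_mode, identity):
        """A unit for determining whether the target user mode matches
        the identity information of the first user."""
        raise NotImplementedError

class OutputtingUnit:
    def output(self, page):
        """A unit for outputting the selected operation option page."""
        raise NotImplementedError

class Processor:
    """A processor comprising a receiving unit, a determining unit,
    a matching unit, and an outputting unit."""

    def __init__(self):
        self.receiving_unit = ReceivingUnit()
        self.determining_unit = DeterminingUnit()
        self.matching_unit = MatchingUnit()
        self.outputting_unit = OutputtingUnit()
```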
  • The present disclosure further provides a computer readable medium.
  • The computer readable medium may be the computer readable medium included in the apparatus described in the above embodiments, or a stand-alone computer readable medium not assembled into the apparatus.
  • The computer readable medium stores one or more programs.
  • The one or more programs, when executed by the apparatus, cause the apparatus to: receive a message of requesting to enter a target user mode, the message being inputted by a first user; determine identity information of the first user; determine whether the target user mode matches the identity information of the first user; and select, if the target user mode matches the identity information, an operation option page matching the target user mode from a preset operation option page set, to output the operation option page.
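  • Read end to end, those steps amount to the control flow sketched below; determine_identity, mode_matches, and the PAGE_SET layout are hypothetical stand-ins for the steps the stored program performs.

```python
# Hypothetical end-to-end flow of the stored program; the helpers and the
# page-set layout are illustrative assumptions.

PAGE_SET = {
    "child mode": "operation option page for child mode",
    "adult mode": "operation option page for adult mode",
}

def determine_identity(user):
    """Stand-in: a real device might recognize the user by voiceprint."""
    return {"name": user, "allowed_modes": {"adult mode"}}

def mode_matches(target_mode, identity):
    """The target user mode matches when the identity permits that mode."""
    return target_mode in identity["allowed_modes"]

def handle_mode_request(message):
    # Receive a message of requesting to enter a target user mode,
    # the message being inputted by a first user.
    first_user = message["user"]
    target_mode = message["target_user_mode"]

    # Determine identity information of the first user.
    identity = determine_identity(first_user)

    # If the target user mode matches the identity information, select
    # the matching operation option page from the preset set and output it.
    if mode_matches(target_mode, identity):
        return PAGE_SET[target_mode]
    return None

# Example: an adult requesting adult mode gets the adult-mode page.
print(handle_mode_request({"user": "first user",
                           "target_user_mode": "adult mode"}))
```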

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
US16/298,714 2018-06-08 2019-03-11 Method and apparatus for outputting information Abandoned US20190378494A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810589033.2A CN108882032A (zh) 2018-06-08 2018-06-08 Method and apparatus for outputting information
CN201810589033.2 2018-06-08

Publications (1)

Publication Number Publication Date
US20190378494A1 true US20190378494A1 (en) 2019-12-12

Family

ID=64337534

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/298,714 Abandoned US20190378494A1 (en) 2018-06-08 2019-03-11 Method and apparatus for outputting information

Country Status (3)

Country Link
US (1) US20190378494A1 (ja)
JP (1) JP2019212288A (ja)
CN (1) CN108882032A (ja)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109671438A (zh) * 2019-01-28 2019-04-23 武汉恩特拉信息技术有限公司 Apparatus and method for providing auxiliary services by voice
CN110134022B (zh) * 2019-05-10 2022-03-18 平安科技(深圳)有限公司 Sound control method and apparatus for smart home devices, and electronic apparatus
CN110689886B (zh) * 2019-09-18 2021-11-23 深圳云知声信息技术有限公司 Device control method and apparatus
CN111081249A (zh) * 2019-12-30 2020-04-28 腾讯科技(深圳)有限公司 Mode selection method and apparatus, and computer readable storage medium
CN113553105A (zh) * 2020-04-23 2021-10-26 百度在线网络技术(北京)有限公司 Method and apparatus for generating a guide page
CN111600782B (zh) * 2020-04-28 2021-05-18 百度在线网络技术(北京)有限公司 Control method and apparatus for a smart voice device, electronic device, and storage medium
CN111787387A (zh) * 2020-06-30 2020-10-16 百度在线网络技术(北京)有限公司 Content display method, apparatus, device, and storage medium
CN114079806B (zh) * 2020-08-06 2024-06-04 深圳Tcl新技术有限公司 Personalized page display method and related device
CN112000726A (zh) * 2020-09-03 2020-11-27 未来穿戴技术有限公司 Storage method for massage operation modes, electronic device, and storage medium
CN112423069A (zh) * 2020-11-20 2021-02-26 广州欢网科技有限责任公司 Mode switching method, apparatus and device, and smart playback system
CN114999472A (zh) * 2022-04-27 2022-09-02 青岛海尔空调器有限总公司 Air conditioner control method and apparatus, and air conditioner
CN114885218A (zh) * 2022-06-16 2022-08-09 深圳创维-Rgb电子有限公司 Method for automatically selecting a viewing mode, television, device, and storage medium
CN116055818A (zh) * 2022-12-22 2023-05-02 北京奇艺世纪科技有限公司 Video playback method, apparatus, electronic device, and storage medium

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1079615A3 (en) * 1999-08-26 2002-09-25 Matsushita Electric Industrial Co., Ltd. System for identifying and adapting a TV-user profile by means of speech technology
JP4292646B2 (ja) * 1999-09-16 2009-07-08 株式会社デンソー User interface device, navigation system, information processing device, and recording medium
US7046139B2 (en) * 2004-04-26 2006-05-16 Matsushita Electric Industrial Co., Ltd. Method and parental control and monitoring of usage of devices connected to home network
JP2006238391A (ja) * 2005-02-28 2006-09-07 Funai Electric Co Ltd Remote control device
KR100664943B1 (ko) * 2005-08-10 2007-01-04 삼성전자주식회사 Mode-based access control method and apparatus
JP2009139390A (ja) * 2007-12-03 2009-06-25 Nec Corp Information processing system, processing method, and program
US10460085B2 (en) * 2008-03-13 2019-10-29 Mattel, Inc. Tablet computer
RU2493613C2 (ru) * 2008-08-22 2013-09-20 Сони Корпорейшн Image reproduction apparatus and control method
KR101289081B1 (ko) * 2009-09-10 2013-07-22 한국전자통신연구원 IPTV system and service method using a voice interface
JP5510069B2 (ja) * 2010-05-25 2014-06-04 富士通モバイルコミュニケーションズ株式会社 Translation device
JP2013152610A (ja) * 2012-01-25 2013-08-08 Mitsubishi Motors Corp Vehicle information presentation device
US9665922B2 (en) * 2012-11-30 2017-05-30 Hitachi Maxell, Ltd. Picture display device, and setting modification method and setting modification program therefor
CN103914127B (zh) * 2012-12-31 2019-06-25 联想(北京)有限公司 Control method for an electronic device, and electronic device
US9100694B1 (en) * 2013-03-14 2015-08-04 Google Inc. TV mode change in accordance with number of viewers present
CN104065989B (zh) * 2013-03-21 2018-07-06 国民技术股份有限公司 Playback terminal and voice control method thereof
WO2014199596A1 (ja) * 2013-06-10 2014-12-18 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Speaker identification method, speaker identification device, and speaker identification system
CN104008320A (zh) * 2014-05-19 2014-08-27 惠州Tcl移动通信有限公司 Usage permission and user mode control method and system based on face recognition
CN106156575A (zh) * 2015-04-16 2016-11-23 中兴通讯股份有限公司 User interface control method and terminal
JP6693111B2 (ja) * 2015-12-14 2020-05-13 カシオ計算機株式会社 Dialogue device, robot, dialogue method, and program
JP6600561B2 (ja) * 2016-01-06 2019-10-30 マクセル株式会社 Display device
JP6738150B2 (ja) * 2016-01-14 2020-08-12 株式会社ナビタイムジャパン Navigation application program, information processing device, and information processing method
CN105791935A (zh) * 2016-05-03 2016-07-20 乐视控股(北京)有限公司 Television control method and apparatus
CN105959806A (zh) * 2016-05-25 2016-09-21 乐视控股(北京)有限公司 Program recommendation method and apparatus
CN106128467A (zh) * 2016-06-06 2016-11-16 北京云知声信息技术有限公司 Voice processing method and apparatus
CN106454515A (zh) * 2016-10-31 2017-02-22 四川长虹电器股份有限公司 Smart television playback control system and method
CN107623614B (zh) * 2017-09-19 2020-12-08 百度在线网络技术(北京)有限公司 Method and apparatus for pushing information

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111010481A (zh) * 2019-12-16 2020-04-14 北京小米移动软件有限公司 Incoming call monitoring method, incoming call monitoring apparatus, and computer storage medium
CN112333550A (zh) * 2020-06-19 2021-02-05 深圳Tcl新技术有限公司 Program query method, apparatus, device, and computer storage medium
CN111899717A (zh) * 2020-07-29 2020-11-06 北京如影智能科技有限公司 Voice reply method and apparatus
WO2022228135A1 (zh) * 2021-04-26 2022-11-03 北京有竹居网络技术有限公司 Multimedia content display method, apparatus, device, and storage medium
US20240004531A1 (en) * 2021-04-26 2024-01-04 Beijing Youzhuju Network Technology Co. Ltd. Method, apparatus, and device for displaying multi-media content, and storage medium
CN113676394A (zh) * 2021-08-19 2021-11-19 维沃移动通信(杭州)有限公司 Information processing method and information processing apparatus

Also Published As

Publication number Publication date
CN108882032A (zh) 2018-11-23
JP2019212288A (ja) 2019-12-12

Similar Documents

Publication Publication Date Title
US20190378494A1 (en) Method and apparatus for outputting information
US11006179B2 (en) Method and apparatus for outputting information
US20200126566A1 (en) Method and apparatus for voice interaction
KR102237539B1 (ko) System and method for determining dementia and cognitive ability based on voice conversation analysis
US10824874B2 (en) Method and apparatus for processing video
CN107886949B (zh) Content recommendation method and apparatus
CN111415677B (zh) Method, apparatus, device, and medium for generating a video
US11475897B2 (en) Method and apparatus for response using voice matching user category
CN107481720B (zh) Explicit voiceprint recognition method and apparatus
US20200043502A1 (en) Information processing method and device, multimedia device and storage medium
CN108924218B (zh) Method and apparatus for pushing information
US11127399B2 (en) Method and apparatus for pushing information
CN109582825B (zh) Method and apparatus for generating information
Fok et al. Towards more robust speech interactions for deaf and hard of hearing users
CN113205793B (zh) Audio generation method and apparatus, storage medium, and electronic device
CN114143479B (zh) Method, apparatus, device, and storage medium for generating a video summary
Njaka et al. Voice controlled smart mirror with multifactor authentication
CN111640434A (zh) Method and apparatus for controlling a voice device
Erro et al. Personalized synthetic voices for speaking impaired: website and app.
CN108322770A (zh) Video program identification method, related apparatus, device, and system
CN113903338A (zh) Face-to-face signing method and apparatus, electronic device, and storage medium
WO2021169825A1 (zh) Speech synthesis method, apparatus, device, and storage medium
CN111400463A (zh) Dialogue response method, apparatus, device, and medium
CN111654752A (zh) Multimedia information playing method and apparatus, and related device
CN113836273A (zh) Legal consultation method based on complex context, and related device

Legal Events

Date Code Title Description
AS Assignment

Owner name: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HOU, ZAIPENG;REEL/FRAME:048564/0921

Effective date: 20180620

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

AS Assignment

Owner name: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.;REEL/FRAME:056811/0772

Effective date: 20210527

Owner name: SHANGHAI XIAODU TECHNOLOGY CO. LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.;REEL/FRAME:056811/0772

Effective date: 20210527

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION