US10685649B2 - Method and apparatus for providing voice service - Google Patents

Method and apparatus for providing voice service

Info

Publication number
US10685649B2
Authority
US
United States
Prior art keywords
request information
user demand
voice request
user
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US15/858,428
Other languages
English (en)
Other versions
US20190147862A1 (en)
Inventor
Guang Lu
Xiajun LUO
Shiquan YE
Jue HUANG
Miaochang ZHANG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd
Assigned to BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. reassignment BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HUANG, JUE, LU, GUANG, LUO, XIAJUN, YE, SHIQUAN, ZHANG, MIAOCHANG
Publication of US20190147862A1
Application granted
Publication of US10685649B2
Assigned to BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., SHANGHAI XIAODU TECHNOLOGY CO. LTD. reassignment BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/55 Push-based network services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Definitions

  • This disclosure relates to the field of computer technology, specifically to the field of artificial intelligence technology, and more specifically to a method and apparatus for providing a voice service.
  • a user may execute operations, such as multimedia resource playing and information inquiry.
  • an existing intelligent voice service platform may initiate relevant functions and provide established operation interfaces. For example, when playing music, interfaces for operations such as “play the next,” “pause,” “continue to play” and “add to favourites” are provided.
  • same operation interfaces are provided for the same type of voice service.
  • implicit demands may be different when users send different voice requests requesting the same type of voice service.
  • if a user wants to execute another operation for which no interface is configured in the playing interface, the user needs to spend time on a multilevel lookup in the application interface.
  • An embodiment of this disclosure provides a method and apparatus for providing a voice service.
  • an embodiment of this disclosure provides a method for providing a voice service, including: analyzing, in response to receiving first voice request information sent by an intelligent voice device containing a display, the first voice request information to determine a user demand; determining an alternative operation associated with the user demand based on a configured optional operation set; generating prompt information for guiding a user to execute the alternative operation; and pushing the prompt information to the intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display.
  • the analyzing the first voice request information to determine a user demand includes: ascertaining whether the first voice request information contains a keyword requesting for playing a multimedia resource; and if yes, identifying a preset multimedia tag in the first voice request information, and determining the user demand being a demand for playing a first multimedia resource containing the preset multimedia tag.
  • the determining an alternative operation associated with the user demand based on a configured optional operation set includes: selecting a playing operation corresponding to the user demand and an optional operation associated with the playing operation from the configured optional operation set as the alternative operation; the optional operation associated with the playing operation including at least one of the following operations: an operation of selecting a to-be-played multimedia resource, an operation of switching a play mode, and an operation of feeding back preferences for a played multimedia resource.
  • the analyzing the first voice request information to determine a user demand further includes: obtaining user descriptor data, scenario data and to-be-recommended multimedia resource data in response to determining the first voice request information containing a keyword requesting for playing the multimedia resource and the first voice request information not containing a preset multimedia tag; and determining the user demand being a demand for selecting a second multimedia resource matching the user descriptor data and/or the scenario data from the to-be-recommended multimedia resource data.
  • the determining an alternative operation associated with the user demand based on a configured optional operation set includes: selecting a recommending operation corresponding to the user demand from the configured optional operation set as the alternative operation, where a recommended object of the recommending operation includes the second multimedia resource.
  • the analyzing the first voice request information to determine a user demand further includes: searching, in response to determining the first voice request information not containing a keyword requesting for playing a multimedia resource, network data using a result of the analyzing the first voice request information as a search expression, and determining the user demand based on a result of the searching.
  • the determining an alternative operation associated with the user demand based on a configured optional operation set includes: selecting an optional operation matching the result of the searching from the configured optional operation set as the alternative operation.
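  • As a rough illustration, the three analysis branches described above (a play keyword with a preset tag, a play keyword without a tag, and no play keyword) might be dispatched as in the following Python sketch. The keyword list, the tag set and the returned demand labels are hypothetical placeholders and do not come from the disclosure; simple substring checks stand in for real semantic analysis.

```python
from typing import Optional

# Hypothetical keyword and tag sets; a real system would configure these.
PLAY_KEYWORDS = ("listen to", "play")
MULTIMEDIA_TAGS = ("Jay Chou", "rock and roll", "Cantonese")


def find_tag(text: str) -> Optional[str]:
    """Return the first preset multimedia tag found in the text, if any."""
    lowered = text.lower()
    for tag in MULTIMEDIA_TAGS:
        if tag.lower() in lowered:
            return tag
    return None


def classify_demand(text: str) -> dict:
    """Pick one of the three branches for a transcribed voice request."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in PLAY_KEYWORDS):
        tag = find_tag(text)
        if tag is not None:
            # Play keyword and preset tag: play a resource carrying the tag.
            return {"demand": "play_tagged", "tag": tag}
        # Play keyword but no tag: recommend a resource to play.
        return {"demand": "recommend"}
    # No play keyword: use the analyzed text as a search expression.
    return {"demand": "search", "query": text}
```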
  • the analyzing the first voice request information to determine a user demand includes: obtaining second voice request information being received in a preset period before receiving the first voice request information; and analyzing the first voice request information based on the second voice request information to determine the user demand.
  • the method further comprises: monitoring user behavior data of executing the alternative operation based on the prompt information; and adjusting a parameter of a correlation between the alternative operation in the configured optional operation set and the user demand based on the behavior data.
  • the determination unit is further used for determining an alternative operation associated with the user demand as follows: selecting a playing operation corresponding to the user demand and an optional operation associated with the playing operation from the configured optional operation set as the alternative operation; the optional operation associated with the playing operation including at least one of the following operations: an operation of selecting a to-be-played multimedia resource, an operation of switching a play mode, and an operation of feeding back preferences for a played multimedia resource.
  • the analysis unit is further used for analyzing the first voice request information to determine a user demand as follows: obtaining second voice request information being received in a preset period before receiving the first voice request information; and analyzing the first voice request information based on the second voice request information to determine the user demand.
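  • A minimal sketch of using an earlier request as context, assuming a 60-second preset period and a toy rule in which a bare follow-up such as "another one" inherits the earlier request. Note that the "second" voice request in the text is the one received earlier. The period length, the history format and the follow-up rule are all illustrative assumptions.

```python
from typing import List, Optional, Tuple

PRESET_PERIOD_SECONDS = 60.0  # assumed length of the preset period


def previous_request(history: List[Tuple[float, str]], now: float,
                     period: float = PRESET_PERIOD_SECONDS) -> Optional[str]:
    """Return the most recent earlier request received within the period.

    history holds (timestamp in seconds, request text) pairs.
    """
    recent = [(t, text) for t, text in history if 0 <= now - t <= period]
    return max(recent)[1] if recent else None


def analyze_with_context(first_request: str,
                         second_request: Optional[str]) -> str:
    """Toy rule: a bare follow-up inherits the earlier request's content."""
    if second_request and first_request.lower().startswith("another"):
        return second_request
    return first_request
```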
  • the apparatus further includes a feedback unit for: monitoring user behavior data of executing an alternative operation based on prompt information; and adjusting a parameter of a correlation between the alternative operation in a configured optional operation set and a user demand based on the behavior data.
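  • The feedback step could, for instance, keep one correlation weight per pair of user demand and optional operation and nudge it according to whether the user executed the prompted operation. The sketch below assumes a simple exponential-style update; the table layout, the learning rate and the update rule are hypothetical, since the disclosure does not specify how the parameter is adjusted.

```python
# Hypothetical correlation table: (user demand, optional operation) -> weight.
correlations = {
    ("play_music", "play the next"): 0.5,
    ("play_music", "add to favorites"): 0.5,
}


def adjust_correlation(table, demand, operation, executed, rate=0.1):
    """Move the weight toward 1.0 if the user executed the prompted
    operation, and toward 0.0 otherwise (an assumed update rule)."""
    key = (demand, operation)
    target = 1.0 if executed else 0.0
    table[key] += rate * (target - table[key])
    return table[key]
```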
  • FIG. 2 is a schematic flowchart of a method for providing a voice service according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a method for providing a voice service according to another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an application scenario of the method for providing a voice service according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a further application scenario of the method for providing a voice service according to an embodiment of the present application.
  • FIG. 7 is a structural schematic diagram of an apparatus for providing a voice service according to an embodiment of the present application.
  • FIG. 8 is a structural schematic diagram of a computer system adapted to implement a server of the embodiments of the present application.
  • FIG. 1 shows an illustrative architecture of a system 100 which may be used by a method for providing a voice service or an apparatus for providing a voice service according to the embodiments of the present application.
  • the user 110 may use the terminal devices 101 and 102 to interact with the server 104 through the network 103, in order to transmit or receive messages, etc.
  • the terminal devices 101 and 102 may be electronic devices containing an audio input interface, an audio output interface and a display, and supporting network communications, such as smart speakers having a microphone and a display, smart phones, tablet computers, laptop computers, and smart wearable devices.
  • An application capable of interacting with the server 104, such as a voice service client, may be installed on the terminal device 101 or 102.
  • the server 104 may be a server providing various services, such as a voice server that controls a voice output operation executed by the terminal devices 101 and 102.
  • the voice server may process a voice service request sent by the user 110 through the terminal devices 101 and 102, and send the results of the processing (such as audio data and control instructions of the audio output interface) to the terminal devices 101 and 102.
  • the terminal devices 101 and 102 may receive audio data and control instructions sent by the server 104 through the network 103, and execute the corresponding voice output operation and display operation, thereby completing the voice service using the terminal devices 101 and 102.
  • the method for providing a voice service is generally executed by the server 104. Accordingly, an apparatus for providing a voice service is generally installed on the server 104.
  • the numbers of the terminal devices, the networks and the servers in FIG. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided based on the actual requirements.
  • the method for providing a voice service includes the following steps:
  • the first voice request may be analyzed to extract user demand information therefrom. Specifically, the first voice request information may first be decoded to obtain the voice request content. Semantic analysis of the voice request content may then be performed by first segmenting the content into words using a language model, then extracting core words and keywords, and finally determining the user demand included in the voice request content using a topic model.
  • a keyword set may be set for each function of the intelligent voice device. If a keyword or a keyword combination of a certain function is analyzed from the content of a first voice request, then a user demand corresponding to the first voice request may be determined to be a demand that can be satisfied by the function.
  • an alarm clock function may contain a keyword of “alarm clock,” and an information pushing function may include keywords of: “broadcast and news,” “weather,” etc.
  • the user demand may be determined to be a demand for obtaining pushed news.
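  • The per-function keyword sets described above can be sketched as a small lookup table. The function names are hypothetical, and treating "broadcast" plus "news" as a keyword combination is an assumption about the example in the text; plain substring checks stand in for proper word segmentation.

```python
from typing import Optional

# Keyword sets per device function. Each entry is a tuple of words that must
# all appear in the request (a single keyword or a keyword combination).
FUNCTION_KEYWORDS = {
    "alarm_clock": [("alarm clock",)],
    "information_push": [("broadcast", "news"), ("weather",)],
}


def match_function(request_text: str) -> Optional[str]:
    """Return the first function whose keyword (combination) is in the text."""
    lowered = request_text.lower()
    for function, combos in FUNCTION_KEYWORDS.items():
        for combo in combos:
            if all(word in lowered for word in combo):
                return function
    return None
```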
  • first voice request information may be analyzed using a machine learning method. Specifically, a user intention may be identified using a trained intention identification model, where a training sample of the intention identification model may be a manually labeled sample. Parameters of the intention identification model are repeatedly adjusted in the training process so that its predicted value approaches the labeled value. Adjustment of the model parameters stops when the error between the predicted value and the labeled value satisfies a convergence condition, yielding a trained intention identification model.
  • voice request content of the first voice request information may be inputted into the intention identification model, which may output a user intention, and then a result of analyzing a user demand is obtained.
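  • To make the training loop concrete, the sketch below trains a tiny bag-of-words logistic regression on a handful of manually labeled requests and stops once the mean error falls under a tolerance. The disclosure does not name a model type, so logistic regression, the sample sentences, the learning rate and the tolerance are all stand-in assumptions.

```python
import math

# Manually labeled samples (hypothetical): 1 = "play music" intention, 0 = other.
SAMPLES = [
    ("play some music", 1),
    ("listen to a song", 1),
    ("play a song", 1),
    ("what is the weather", 0),
    ("set an alarm clock", 0),
    ("broadcast the news", 0),
]

VOCAB = sorted({word for text, _ in SAMPLES for word in text.split()})


def featurize(text):
    """Bag-of-words vector over the training vocabulary."""
    words = set(text.split())
    return [1.0 if word in words else 0.0 for word in VOCAB]


def predict(weights, bias, text):
    """Predicted probability that the request is a play-music intention."""
    z = bias + sum(w * x for w, x in zip(weights, featurize(text)))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, rate=0.5, tolerance=0.05, max_epochs=10000):
    """Adjust parameters until the mean absolute error between predicted
    and labeled values falls below the tolerance (the convergence test)."""
    weights, bias = [0.0] * len(VOCAB), 0.0
    for _ in range(max_epochs):
        total_error = 0.0
        for text, label in samples:
            error = predict(weights, bias, text) - label
            total_error += abs(error)
            features = featurize(text)
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
        if total_error / len(samples) < tolerance:
            break
    return weights, bias
```

  • After training, `predict(weights, bias, "play a song")` returns a probability that can be thresholded (for example at 0.5) to output the user intention.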
  • an optional operation satisfying the user demand may be selected from the configured optional operation set as the alternative operation.
  • the configured optional operation set may include a plurality of optional operations, and a user demand associated with each optional operation is configured.
  • the configured optional operation set may be pre-configured based on experience, where the user demand associated with each optional operation may be a user demand associated with the voice interaction. Different user demands may be associated with different optional operations.
  • an intelligent voice device is designed based on a specific application scenario, such as a vehicle-mounted information device, a kitchen voice assistant and a robot.
  • Voice services that can be provided by intelligent voice devices are also related to their application scenarios.
  • a vehicle-mounted information device may provide functions such as playing music, route inquiry, news broadcast, weather inquiry, reminder, and web search.
  • a user demand that may be satisfied by an intelligent voice device may be set according to its specific application scenario.
  • a vehicle-mounted information device may meet user demands such as playing music, route inquiry, news broadcast, weather inquiry, reminder and web search.
  • An optional operation that may be satisfied by an intelligent voice device and corresponds to each user demand may also be customized, and the optional operations are associated with the user demands that may be satisfied by the intelligent voice device.
  • customizable optional operations associated with the demand for playing music include: pause playing, continue to play, switch to the next, switch to the previous, add to favorites, thumb up, switch to single loop/list loop mode, and so on. These optional operations may be added to an optional operation set, and an associated relation between these optional operations and user demands is configured.
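  • One simple way to represent such a configured optional operation set is a mapping from each user demand to its associated operations, as in this sketch. The demand labels are hypothetical; the music operations are the ones listed above, and the entry for route inquiry is left empty because the text lists no operations for it.

```python
# Hypothetical configured optional operation set for a vehicle-mounted device:
# each user demand maps to the optional operations associated with it.
OPTIONAL_OPERATION_SET = {
    "play_music": [
        "pause playing", "continue to play", "switch to the next",
        "switch to the previous", "add to favorites", "thumb up",
        "switch to single loop/list loop mode",
    ],
    "route_inquiry": [],  # left empty: the source lists no operations here
}


def alternative_operations(user_demand):
    """Return the optional operations associated with the user demand."""
    return list(OPTIONAL_OPERATION_SET.get(user_demand, []))
```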
  • Step 203: generating prompt information for guiding a user to execute the alternative operation.
  • Step 204: pushing the prompt information to the intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display.
  • the electronic device for providing a voice service may send the prompt information to an intelligent voice device through a network.
  • the intelligent voice device may show the prompt information on the display after receiving the prompt information.
  • the user may obtain the prompt information through the display of the intelligent voice device, thereby executing corresponding operations under the guidance of the prompt information.
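  • Put together, the flow of analyzing the request, selecting alternative operations, generating prompt information and pushing it to the display might look like the following sketch. Every function here is a simplified stand-in: the keyword test replaces real semantic analysis, the "You can say" wording is invented for illustration, and a Python list plays the role of the device display instead of a network push.

```python
def analyze_demand(request_text):
    """Toy analysis step: keyword lookup stands in for semantic analysis."""
    return "play_music" if "music" in request_text.lower() else "unknown"


def select_operations(demand, operation_set):
    """Select the alternative operations configured for the demand."""
    return operation_set.get(demand, [])


def generate_prompt(operations):
    """Build prompt text that guides the user toward each operation."""
    return ['You can say "{}"'.format(op) for op in operations]


def push_prompt(display_lines, prompt):
    """Stand-in for the network push: append prompt lines to the display."""
    display_lines.extend(prompt)
    return display_lines


def handle_request(request_text, operation_set, display_lines):
    """Run the four steps in order and return what the display shows."""
    demand = analyze_demand(request_text)
    operations = select_operations(demand, operation_set)
    return push_prompt(display_lines, generate_prompt(operations))
```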
  • FIG. 3 shows a schematic flowchart of another embodiment of a method for providing a voice service according to the disclosure.
  • Step 301: ascertaining, in response to receiving first voice request information sent by an intelligent voice device containing a display, whether the first voice request information contains a keyword requesting for playing a multimedia resource.
  • the received first voice request information may be converted from voice to text.
  • Content of the first voice request is converted to text, the text is segmented into words, and whether the result of the segmentation contains a keyword requesting for playing a multimedia resource is ascertained.
  • the keyword requesting for playing a multimedia resource may be preset. It may be a single wording, such as “listen to music” or “listen to songs,” or a combination of a plurality of wordings, such as a combination of “play” and “news,” or a combination of “play some” and “music.”
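  • The check for single wordings and wording combinations can be sketched as follows, using the example wordings from the text. Whole-word regular expression search is used here as a stand-in for word segmentation, which is an assumption; the function names are hypothetical.

```python
import re

# Preset play keywords: single wordings and combinations of wordings.
SINGLE_WORDINGS = ["listen to music", "listen to songs"]
WORDING_COMBINATIONS = [("play", "news"), ("play some", "music")]


def has_wording(text, wording):
    """Whole-word search, so that "display" does not count as "play"."""
    return re.search(r"\b" + re.escape(wording) + r"\b", text) is not None


def contains_play_keyword(request_text):
    """True if the text has a single wording or every part of a combination."""
    lowered = request_text.lower()
    if any(has_wording(lowered, w) for w in SINGLE_WORDINGS):
        return True
    return any(all(has_wording(lowered, part) for part in combo)
               for combo in WORDING_COMBINATIONS)
```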
  • step 302 will be executed, i.e., identifying a preset multimedia tag in the first voice request information.
  • the user demand may be determined to be playing the multimedia resource.
  • characteristic attributes of the multimedia resource expected by a user to be played may be further determined based on the first voice request information, including an author, a type, a style, a language of the multimedia resource or an identifier for a set of the multimedia resources, etc.
  • a specific way of implementing this may be to identify the preset multimedia tag in the first voice request information.
  • the preset multimedia tag may be a tag for characterizing characteristic attributes of the multimedia resource, and the multimedia resource in a multimedia resource library may be configured with the tag.
  • a music tag may include a tag for indicating a song name, a singer, a composer, an album name, a music type, a music style or a language, etc. of music.
  • the music type tag may, for example, include rock and roll, rap, folk, pop, bel canto or symphony, etc.
  • the music style tag may include a cheerful, relaxed, sad or encouraging tag, etc.
  • the language tag may include Mandarin Chinese, Cantonese, English, Korean or Japanese, etc.
  • a multimedia tag set may be established based on tags of the multimedia resources in a multimedia resource library.
  • a preset multimedia tag in the first voice request information may be identified by matching a multimedia tag set, and a successfully matched tag is an identified preset multimedia tag.
  • a multimedia tag may be matched using an exact matching method or a fuzzy matching method.
  • the fuzzy matching may be choosing a tag similar to the first voice request information from the multimedia tags. For example, “Balixiang” in the first voice request information may be determined to successfully match a multimedia tag of “Qilixiang,” so that a user demand may also be successfully identified when the user sends a fuzzy request.
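  • A minimal sketch of exact matching with a fuzzy fallback, using Python's standard `difflib` as one possible similarity measure (the disclosure does not say which measure is used). It assumes the candidate phrase has already been extracted from the request; the tag set and the 0.6 cutoff are illustrative.

```python
import difflib
from typing import Optional

# Hypothetical tag set built from a multimedia resource library.
TAG_SET = ["Qilixiang", "Jay Chou", "rock and roll", "Cantonese"]


def match_tag(candidate: str, tags=TAG_SET, cutoff: float = 0.6) -> Optional[str]:
    """Try an exact match first, then fall back to the closest similar tag."""
    by_lower = {tag.lower(): tag for tag in tags}
    key = candidate.lower()
    if key in by_lower:
        return by_lower[key]
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[close[0]] if close else None
```

  • With this cutoff, a candidate of "Balixiang" resolves to the tag "Qilixiang", mirroring the example in the text.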
  • step 303 will be executed, i.e., determining the user demand being a demand for playing a first multimedia resource containing the preset multimedia tag.
  • if the first voice request information contains a tag matching a multimedia tag set, then it is determined that the user expects to play a multimedia resource having the tag, thus realizing accurate identification of a user demand.
  • in step 304, a playing operation corresponding to a user demand and an optional operation associated with the playing operation are selected from a configured optional operation set as alternative operations.
  • a playing operation satisfying the user demand and an associated optional operation that may be required to be executed by the user in the playing operation may be selected from a configured optional operation set as alternative operations.
  • the optional operation associated with the playing operation includes at least one of the following operations: an operation of selecting a to-be-played multimedia resource, an operation of switching a play mode, and an operation of feeding back preferences for a played multimedia resource.
  • the operation of selecting a to-be-played multimedia resource may be an operation of switching a currently played resource, such as selecting “the next;” the operation of switching a play mode may be an operation of selecting a mode, such as “single loop,” “list loop,” or “shuffle play;” and the operation of feeding back preferences for a played multimedia resource may be, for example, an operation of selecting “like this song,” “add this song to favorites” or “dislike this song.”
  • the developer of the intelligent voice device may configure a plurality of optional operations associated with a playing operation for a multimedia resource playing function of the intelligent voice device. Then, when a user demand is determined to be a demand for playing a first multimedia resource containing the preset multimedia tag, the optional operations configured by the developer and associated with the playing operation may be used as alternative operations. Thus, after subsequently generated prompt information for guiding a user to execute an alternative operation is pushed to a display of the intelligent voice device, the user may be aware of optional operations associated with the playing operation of the intelligent voice device, thereby enabling the user to understand the service ability of the intelligent voice device, and helping the user to obtain a richer and more efficient intelligent voice service.
  • in step 310, prompt information for guiding a user to execute an alternative operation is generated, and then, in step 311, the prompt information is pushed to an intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display.
  • Step 310 and step 311 are the same as steps 203 and 204 in the foregoing embodiments, and are not repeated here.
  • characteristic attributes of the multimedia resource expected by a user to be played can be accurately identified to determine the user demand being a demand for playing the multimedia resource containing the preset multimedia tag, thereby determining a playing operation and an optional operation associated with the playing operation being operations that may be expected by the user to be executed in the playing process and pushing prompt information of these operations that may be expected to be executed to the user.
  • the prompt information of the operations matching the user demand can be accurately pushed, thereby saving the time of searching for relevant operation functions for the user, and contributing to improving the efficiency of the voice service.
  • FIG. 4 shows a schematic diagram of an application scenario of a method for providing a voice service according to the disclosure.
  • a voice server C may extract a combination of keywords “listen to” and “song” therein, determine the user demand being a demand for playing a song, and identify a tag of the song “Jay Chou”. Then the user demand may be further determined to be a demand for playing Jay Chou's song.
  • a playing operation and optional operations associated with the playing operation such as “play the next,” “add this song to favorites” and “switch a play mode,” may be determined to be alternative operations, and prompt information of the alternative operations is generated and pushed to the intelligent voice device B, which may show the user A the prompt information containing the alternative operations on the display.
  • step 305 will be executed, i.e., obtaining user descriptor data, scenario data and to-be-recommended multimedia resource data.
  • if a user demand is determined to be a demand requesting for playing a multimedia resource based on a result of the ascertaining in step 301, but the user is determined not to have a specific demand regarding the multimedia resource to be played based on the result of the identifying in step 302, then the user demand corresponding to the first voice request information may be determined to be a broad demand for multimedia resources, i.e., the user expects the intelligent voice device to select and play some multimedia resources. In this case, user descriptor data, scenario data and to-be-recommended multimedia resource data may be obtained to select the multimedia resources to be played based on the obtained data.
  • the user descriptor data may include the time and frequency of using a multimedia resource playing function of an intelligent voice device by a user, basic attributes of the user (including gender, character, career, etc.), historical multimedia resource playing records of the user, etc.
  • the scenario data may include the current time and environmental data.
  • the environmental data may be determined based on the geographic location information, and may also be determined based on the detected environmental sound.
  • the scenario data may include, e.g., early morning, evening, night, a living room, an office, etc.
  • the to-be-recommended multimedia resource data may be newly published multimedia resources (e.g., new albums) and multimedia resources with high popularity on the network.
  • step 306: determining the user demand being a demand for selecting a second multimedia resource matching the user descriptor data and/or the scenario data from the to-be-recommended multimedia resource data.
  • the user demand may be determined to be a demand for recommending and playing some multimedia resources that may be favored.
  • a multimedia resource may be selected from the to-be-recommended multimedia resources based on the user descriptor data and/or the scenario data, and then be recommended.
  • the second multimedia resource matching the user descriptor data may be selected from the to-be-recommended multimedia resource data as a recommended object.
  • the to-be-recommended multimedia resource may have a tag for indicating its characteristic attributes.
  • a similarity between the user descriptor data and the tag of the to-be-recommended multimedia resource may be calculated, and if the similarity is greater than a threshold value, then the matching is successful.
  • the similarity may also be weighted using popularity data of the multimedia resource. The higher the popularity is, the higher the weight is.
  • the popularity data may be calculated based on the publication time, views, search volume and the like of the multimedia resource.
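
The tag-matching rule described above can be sketched as follows. This is only an illustrative sketch: the disclosure does not name a similarity measure, so Jaccard overlap between tag sets is assumed here, and the names `Resource`, `match_resources`, `threshold` and `popularity_boost`, as well as the way popularity scales the score, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str
    tags: frozenset      # characteristic-attribute tags of the resource
    popularity: float    # assumed to be pre-normalised to [0, 1]


def jaccard(a, b):
    """Overlap of two tag sets; an assumed stand-in for the similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_resources(user_tags, candidates, threshold=0.3, popularity_boost=0.5):
    """Return candidates whose popularity-weighted similarity exceeds the threshold."""
    scored = []
    for res in candidates:
        # Higher popularity gives a higher weight, as the text describes.
        weight = 1.0 + popularity_boost * res.popularity
        score = jaccard(user_tags, res.tags) * weight
        if score > threshold:
            scored.append((score, res))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [res for _, res in scored]
```

In this sketch the popularity value would itself be derived from publication time, views and search volume; how those are combined is left open by the text and is not modelled here.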
  • the second multimedia resource matching the scenario data may be selected from the to-be-recommended multimedia resource data as a recommended object.
  • the scenario data may be the current time and environmental data. Views and search volume of the to-be-recommended multimedia resources in each period and each environment may be collected, and the to-be-recommended multimedia resource with the highest views and/or search volume in the period containing the current time is selected as a recommended object. For example, if the current time is 8:00 a.m., then fresh and cheerful music with the highest views in the early morning may be selected as a recommended to-be-played object.
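
A minimal sketch of the time-based selection above, assuming views have already been collected per period. The period names, the hour boundaries in `period_of`, and the function `pick_for_scenario` are hypothetical and not taken from the disclosure.

```python
def period_of(hour):
    """Map an hour (0-23) to a coarse period; the boundaries are assumed."""
    if 5 <= hour < 9:
        return "early_morning"
    if 9 <= hour < 18:
        return "daytime"
    if 18 <= hour < 22:
        return "evening"
    return "night"


def pick_for_scenario(views_by_period, hour):
    """Pick the title with the most views in the period containing `hour`.

    `views_by_period` maps a title to a dict of {period name: view count}.
    """
    if not views_by_period:
        return None
    period = period_of(hour)
    return max(views_by_period,
               key=lambda title: views_by_period[title].get(period, 0))
```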
  • the second multimedia resource as a recommended object may be determined based on the user descriptor data and the scenario data.
  • a comprehensive similarity for each to-be-recommended multimedia resource may be calculated as a weighted sum of the similarity between the to-be-recommended multimedia resource and the user descriptor data and the association degree between the to-be-recommended multimedia resource and the scenario data. Then the to-be-recommended multimedia resource with a high comprehensive similarity is determined to be the selected second multimedia resource.
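
The weighted sum can be illustrated with a short sketch. The weights `w_user` and `w_scene` and the function names are assumptions; the disclosure does not specify weight values, and picking the single highest-scoring candidate is one possible reading of "a high comprehensive similarity".

```python
def comprehensive_similarity(user_sim, scenario_assoc, w_user=0.6, w_scene=0.4):
    """Weighted sum of user-descriptor similarity and scenario association degree."""
    return w_user * user_sim + w_scene * scenario_assoc


def select_second_resource(candidates, w_user=0.6, w_scene=0.4):
    """Return the title with the highest comprehensive similarity.

    `candidates` maps a title to a (user_sim, scenario_assoc) pair.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda title: comprehensive_similarity(*candidates[title],
                                                   w_user=w_user, w_scene=w_scene),
    )
```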
  • a recommending operation corresponding to the user demand is selected from a configured optional operation set as an alternative operation.
  • a corresponding relation between the optional operations and the user demands may be configured in the configured optional operation set, where an optional operation corresponding to a demand for selecting a recommended object from to-be-recommended multimedia resource data is used as a recommending operation.
  • the electronic device for providing a voice service may determine an alternative operation associated with the user demand obtained in step 306 to be a recommending operation based on the corresponding relation.
  • a recommended object of the recommending operation may be determined to include the second multimedia resource. Then, when prompt information is subsequently generated, prompt information guiding a user to play the selected and recommended second multimedia resource may be generated.
  • the prompt information guiding a user to play the selected and recommended second multimedia resource may be prompt information guiding the user to send a further voice request, thus guiding the user, through the prompt information, to carry out multiple rounds of dialogue with the intelligent voice device, so that the user demand is determined more accurately.
  • the step 310 and the step 311 may be executed.
  • the step 310 and the step 311 are consistent with the step 203 and the step 204 in the foregoing embodiments, and are not repeated here.
  • Referring to FIG. 5, a schematic diagram of an application scenario of a method for providing a voice service according to the disclosure is shown.
  • a user A sends a request “play some music” to an intelligent voice device B having a display, and the intelligent voice device B sends the request to a voice server C.
  • the voice server C may extract a combination of keywords “play”+“music,” and determine that the user demand is a demand for playing a song.
  • If the voice server C fails to identify any tag indicating song characteristics, the user demand may be further determined to be a demand for recommending a song. Early morning music matching the current scenario, a recently popular song by City of Rock consistent with the user's favourite style, or a song in a new song list may be selected as a song recommended to the user, and prompt information guiding the user to play the recommended song is generated. The prompt information is then pushed to the intelligent voice device B, which may show the user A the corresponding prompt information “listen to City of Rock's song,” “play some early morning music” and “play a new song list” on the display. Such prompt information may serve as user guidance for the next round of dialogue.
  • the user A may send a request, such as “I wouldn't like a new song list” or “play some early morning music.”
  • the voice server C may further revise the user demand analysis structure based on the request sent by the user, and adjust prompt information of the provided optional operations.
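
The branching used in this scenario (a play keyword with a tag, a play keyword without a tag, or no play keyword at all) can be sketched as below. This is a simplified illustration only: plain word and substring matching stands in for the semantic analysis, and `PLAY_KEYWORDS`, `classify_demand` and the returned labels are hypothetical names.

```python
PLAY_KEYWORDS = ("play", "listen")  # assumed keyword list


def classify_demand(request, preset_tags):
    """Classify a transcribed request into one of three demand types.

    Returns a (label, tags) pair where label is "play_specific",
    "recommend" or "search".
    """
    lowered = request.lower()
    words = lowered.split()
    # No keyword requesting playback: fall back to the search-based branch.
    if not any(keyword in words for keyword in PLAY_KEYWORDS):
        return ("search", [])
    # Playback requested: look for preset multimedia tags in the request.
    tags = [tag for tag in preset_tags if tag.lower() in lowered]
    if tags:
        return ("play_specific", tags)
    # Playback requested without a tag: recommend something to play.
    return ("recommend", [])
```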
  • the first voice request may be analyzed, a search expression may be generated based on a result of the analyzing, network data may be searched using the search expression, and a result of the searching may then be analyzed to determine a user demand.
  • Referring to FIG. 6, a schematic diagram of an application scenario of a method for providing a voice service according to the disclosure is shown.
  • a user A sends a request “check Eason Chan's Concert” to an intelligent voice device B having a display, and the intelligent voice device B sends the request to a voice server C.
  • the voice server C may analyze the request, determine that the request does not contain a keyword for playing a multimedia resource, search using “check Eason Chan's Concert” as a search expression, extract an operation of “book tickets” and an operation of “agenda reminder” from the search result page, and determine that the user demand is “book tickets” or “agenda reminder.” The voice server C may then ascertain whether the operations of “book tickets” and “agenda reminder” are included in a configured optional operation set. If so, it may generate prompt information guiding the user to execute the operations of “book tickets” and “agenda reminder,” and push the prompt information to the intelligent voice device B. The intelligent voice device B may show the user A the corresponding prompt information of “book tickets” and “agenda reminder” on the display.
  • network-sourced big data may be used to improve user demand identification accuracy, and may provide relevant user operation behavior data.
  • Alternative operations may be selected based on these operation behavior data to prompt a user to execute these alternative operations, thereby realizing diversified user demand identifications and operation prompts.
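
The check against the configured optional operation set can be sketched as follows. The search and page-parsing steps are not modelled: the operations extracted from the search result are assumed to be given as a list, and the function names and the prompt wording are hypothetical.

```python
def select_alternative_operations(extracted_ops, configured_ops):
    """Keep operations found in the search result that are also configured.

    Order of first appearance is preserved and duplicates are dropped.
    """
    seen = set()
    selected = []
    for op in extracted_ops:
        if op in configured_ops and op not in seen:
            seen.add(op)
            selected.append(op)
    return selected


def build_prompts(operations):
    """Turn each selected operation into a short on-screen prompt (wording assumed)."""
    return [f'You can say "{op}"' for op in operations]
```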
  • the analyzing the first voice request information to determine a user demand may include: obtaining second voice request information being received in a preset period before receiving the first voice request information; and analyzing the first voice request information based on the second voice request information to determine the user demand.
  • the preset period may be a manually set period, such as 5 minutes.
  • the electronic device for providing a voice service may further accurately determine the user demand based on a plurality of rounds of dialogue between an intelligent device and a user.
  • the received second voice request information may be used as an additional condition, an analysis result satisfying the additional condition is selected from a result of the analyzing the first voice request information, and then the user demand is determined.
  • the first voice request information and the second voice request information may be combined and analyzed simultaneously.
  • the first voice request information and the second voice request information may be combined into one piece of voice request information, which is inputted into a machine learning based user demand identification model to identify the user demand.
  • a plurality of pieces of voice request information received in a relatively short time may be combined to analyze the user demand, thereby enhancing the accuracy of user demand analysis results, and improving the pertinence of the voice service.
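
One way to combine requests received within the preset period is sketched below. It assumes the requests are already transcribed to text and timestamped; `combine_requests` and its `window` parameter are hypothetical, and simple concatenation stands in for whatever input format a demand identification model would expect.

```python
from datetime import datetime, timedelta


def combine_requests(first, earlier, window=timedelta(minutes=5)):
    """Merge the first request with earlier requests inside the preset period.

    `first` is a (timestamp, text) pair; `earlier` is a list of such pairs.
    Earlier requests older than `window` are ignored.
    """
    first_time, first_text = first
    recent = [
        (ts, text)
        for ts, text in earlier
        if timedelta(0) <= first_time - ts <= window
    ]
    recent.sort(key=lambda item: item[0])  # oldest context first
    return " ".join([text for _, text in recent] + [first_text])
```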
  • the method for providing a voice service may further include: monitoring user behavior data of executing an alternative operation based on the prompt information; and adjusting a parameter of a correlation between the alternative operation in a configured optional operation set and the user demand based on the behavior data.
  • the intelligent voice device may record whether a user executes an alternative operation prompted by prompt information, and the number of times and frequency of executing each alternative operation, and report the data to an electronic device for providing a voice service.
  • the electronic device for providing a voice service may adjust the parameter of a correlation between the alternative operations and the user demands.
  • If a user fails to execute a corresponding alternative operation, or executes the alternative operation only a few times, after seeing prompt information, then the user may be considered to be less interested in the alternative operation.
  • the parameter of a correlation between the user demand and the alternative operation determined in the step 201 may be reduced, weakening the correlation between the user demand and the alternative operation, so as to reduce the probability of prompting the user to execute the alternative operation and decrease the occurrence of the prompt information of the alternative operation under a given user demand later.
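  • Conversely, if the user executes the corresponding alternative operation many times after seeing the prompt information, the user may be considered to be more interested in the alternative operation, or the alternative operation provides the user with strong assistance.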
  • the parameter of a correlation between the user demand and the alternative operation determined in the step 201 may be increased, strengthening the correlation between the user demand and the alternative operation, so as to increase the probability of prompting the user to execute the alternative operation and increase the occurrence of the prompt information of the alternative operation under a given user demand later. Therefore, the correlation between optional operations in the optional operation set and the user demands may be dynamically updated by collecting the user's operation behavior data, to further enhance the matching rate between the generated prompt information and the user demand.
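
The feedback adjustment can be sketched as a simple update rule. The disclosure does not give a formula, so the execution-rate thresholds `low` and `high`, the fixed `step`, and the clamping of the parameter to [0, 1] are all assumptions made for illustration.

```python
def adjust_correlation(weight, times_shown, times_executed,
                       low=0.1, high=0.5, step=0.05):
    """Raise or lower a demand-to-operation correlation parameter.

    The parameter is lowered when the prompted operation is rarely executed
    and raised when it is executed often; otherwise it is left unchanged.
    """
    if times_shown == 0:
        return weight  # no behavior data yet
    rate = times_executed / times_shown
    if rate < low:
        weight -= step
    elif rate > high:
        weight += step
    # Keep the parameter in an assumed [0, 1] range.
    return min(1.0, max(0.0, weight))
```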
  • the disclosure provides an embodiment of an apparatus for providing a voice service. The embodiment of the apparatus corresponds to the embodiments of the methods shown in FIG. 2 and in FIG. 3, and the apparatus may be specifically applied in a variety of electronic devices.
  • an apparatus for providing a voice service 700 includes: an analysis unit 701 , a determination unit 702 , a generation unit 703 and a push unit 704 , where the analysis unit 701 is used for analyzing, in response to receiving first voice request information sent by an intelligent voice device containing a display, the first voice request information to determine a user demand; the determination unit 702 is used for determining an alternative operation associated with the user demand based on a configured optional operation set; the generation unit 703 is used for generating prompt information for guiding a user to execute the alternative operation; and the push unit 704 is used for pushing the prompt information to the intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display.
  • the analysis unit 701 may decode, after detecting receipt of first voice request information sent by an intelligent voice device containing a display, the first voice request information to obtain the voice request content, and then implement semantic analysis of the voice request content by firstly lexing using a language model, then extracting core words and keywords, and finally determining the user demand included in the voice request content using a topic model.
  • the determination unit 702 may select an optional operation associated with a user demand obtained by the analysis unit 701 through analysis from an optional operation set configured with a user demand associated with each optional operation as an alternative operation.
  • the generation unit 703 may generate prompt information including relevant information of an alternative operation determined by the determination unit 702 , to prompt a user to execute the alternative operation.
  • the relevant information of the alternative operation may include the name and operation objects of the alternative operation, etc.
  • the push unit 704 may push, through a network, the prompt information generated by the generation unit 703 to the intelligent voice device that sent the first voice request information and contains the display.
  • the intelligent voice device may show the prompt information on the display, to guide the user to execute a corresponding operation based on the prompt information.
  • the analysis unit 701 may be further used for analyzing first voice request information to determine a user demand as follows: ascertaining whether the first voice request information contains a keyword requesting for playing a multimedia resource; and if yes, identifying a preset multimedia tag in the first voice request information, and determining the user demand being a demand for playing a first multimedia resource containing the preset multimedia tag.
  • the determination unit 702 may be further used for determining an alternative operation associated with the user demand as follows: selecting a playing operation corresponding to the user demand and an optional operation associated with the playing operation from a configured optional operation set as the alternative operation; the optional operation associated with the playing operation including at least one of the following operations: an operation of selecting a to-be-played multimedia resource, an operation of switching a play mode, and an operation of feeding back preferences for a played multimedia resource.
  • the analysis unit 701 may be further used for analyzing first voice request information to determine a user demand as follows: obtaining user descriptor data, scenario data and to-be-recommended multimedia resource data in response to determining the first voice request information containing a keyword requesting for playing the multimedia resource and the first voice request information not containing a preset multimedia tag; and determining the user demand being a demand for selecting a second multimedia resource matching the user descriptor data and/or the scenario data from the to-be-recommended multimedia resource data.
  • the determination unit 702 may be further used for determining an alternative operation associated with the user demand as follows: selecting a recommending operation corresponding to the user demand from a configured optional operation set as the alternative operation, wherein a recommended object of the recommending operation includes the second multimedia resource.
  • the analysis unit 701 may be further used for analyzing the first voice request information to determine a user demand as follows: searching, in response to determining the first voice request information not containing a keyword requesting for playing a multimedia resource, network data using a result of the analyzing the first voice request information as a search expression, and determining the user demand based on a result of the searching.
  • the determination unit 702 may be further used for determining an alternative operation associated with the user demand as follows: selecting an optional operation matching the result of the searching from a configured optional operation set as the alternative operation.
  • the analysis unit 701 may be further used for analyzing first voice request information to determine a user demand as follows: obtaining second voice request information being received in a preset period before receiving the first voice request information; and analyzing the first voice request information based on the second voice request information to determine the user demand.
  • the apparatus 700 may further include a feedback unit for: monitoring user behavior data of executing an alternative operation based on prompt information; and adjusting a parameter of a correlation between the alternative operation in a configured optional operation set and a user demand based on the behavior data.
  • the units recorded in the apparatus 700 correspond to the steps in the methods described in FIG. 2 and FIG. 3 .
  • the foregoing operations and characteristics described for the methods are also applicable to the apparatus 700 and units included therein, and are not repeated here.
  • the apparatus 700 for providing a voice service analyzes, by an analysis unit in response to receiving first voice request information sent by an intelligent voice device containing a display, the first voice request information to determine a user demand; determines, by a determination unit, an alternative operation associated with the user demand based on a configured optional operation set; generates, by a generation unit, prompt information for guiding a user to execute the alternative operation, and pushes, by a push unit, the prompt information to the intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display, thereby realizing a user demand based differential operation prompt, providing different operation prompt information for different potential user demands, improving the speed of obtaining relevant operation information by a user, and contributing to improving the efficiency of the voice service.
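
The division of work among the four units can be illustrated with a small sketch. It is not the claimed implementation: the class name, the toy keyword rule in `analyze`, the demand label `play_multimedia`, the prompt wording, and the `send` callback that stands in for the network push are all hypothetical.

```python
class VoiceServiceApparatus:
    """Toy pipeline mirroring the analysis, determination, generation and push units."""

    def __init__(self, optional_operations, send):
        # optional_operations: dict mapping a user demand to its optional operations
        # send: callable(device_id, prompts) standing in for the network push
        self.optional_operations = optional_operations
        self.send = send

    def analyze(self, request_text):
        """Analysis unit: a keyword rule stands in for semantic analysis."""
        words = request_text.lower().split()
        return "play_multimedia" if "play" in words else "other"

    def determine(self, demand):
        """Determination unit: look up alternative operations for the demand."""
        return list(self.optional_operations.get(demand, []))

    def generate(self, operations):
        """Generation unit: build prompt information for each operation."""
        return [f'You can say "{op}"' for op in operations]

    def push(self, device_id, prompts):
        """Push unit: deliver the prompts to the device that sent the request."""
        self.send(device_id, prompts)

    def handle(self, device_id, request_text):
        demand = self.analyze(request_text)
        prompts = self.generate(self.determine(demand))
        self.push(device_id, prompts)
        return prompts
```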
  • Referring to FIG. 8, a structural diagram of a computer system 800 of a server applicable for implementing embodiments of the disclosure is shown.
  • the server shown in FIG. 8 is only an example, and shall not limit the functions and serviceable range of embodiments of the disclosure in any way.
  • the computer system 800 includes a central processing unit (CPU) 801 , which may execute various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 802 or a program loaded into a random access memory (RAM) 803 from a storage portion 808 .
  • the RAM 803 also stores various programs and data required by operations of the system 800 .
  • the CPU 801 , the ROM 802 and the RAM 803 are connected to each other through a bus 804 .
  • An input/output (I/O) interface 805 is also connected to the bus 804 .
  • the following components are connected to the I/O interface 805 : an input portion 806 including a keyboard, a mouse etc.; an output portion 807 comprising a cathode ray tube (CRT), a liquid crystal display device (LCD), a speaker etc.; a storage portion 808 including a hard disk and the like; and a communication portion 809 comprising a network interface card, such as a LAN card and a modem.
  • the communication portion 809 performs communication processes via a network, such as the Internet.
  • a drive 810 is also connected to the I/O interface 805 as required.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, and a semiconductor memory, may be installed on the drive 810 , to facilitate the retrieval of a computer program from the removable medium 811 , and the installation thereof on the storage portion 808 as needed.
  • an embodiment of the present disclosure includes a computer program product, which comprises a computer program that is tangibly embedded in a machine-readable medium.
  • the computer program comprises program codes for executing the method as illustrated in the flow chart.
  • the computer program may be downloaded and installed from a network via the communication portion 809 , and/or may be installed from the removable media 811 .
  • the computer program when executed by the central processing unit (CPU) 801 , implements the above mentioned functionalities as defined by the methods of the present disclosure.
  • the computer readable medium in the present disclosure may be computer readable storage medium.
  • An example of the computer readable storage medium may include, but is not limited to: semiconductor systems, apparatus, elements, or a combination of any of the above.
  • a more specific example of the computer readable storage medium may include, but is not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read only memory (ROM), an erasable programmable read only memory (EPROM or flash memory), a fibre, a portable compact disk read only memory (CD-ROM), an optical memory, a magnetic memory or any suitable combination of the above.
  • the computer readable storage medium may be any physical medium containing or storing programs which can be used by a command execution system, apparatus or element or incorporated thereto.
  • the computer readable medium may be any computer readable medium except for the computer readable storage medium.
  • the computer readable medium is capable of transmitting, propagating or transferring programs for use by, or used in combination with, a command execution system, apparatus or element.
  • the program codes contained on the computer readable medium may be transmitted with any suitable medium including but not limited to: wireless, wired, optical cable, RF medium etc., or any suitable combination of the above.
  • computer program code for executing operations in the disclosure may be written in one or more programming languages or combinations thereof.
  • the programming languages include object-oriented programming languages, such as Java, Smalltalk or C++, and also include conventional procedural programming languages, such as “C” language or similar programming languages.
  • the program code may be completely executed on a user's computer, partially executed on a user's computer, executed as a separate software package, partially executed on a user's computer and partially executed on a remote computer, or completely executed on a remote computer or server.
  • the remote computer may be connected to a user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each of the blocks in the flow charts or block diagrams may represent a module, a program segment, or a code portion, said module, program segment, or code portion comprising one or more executable instructions for implementing specified logic functions.
  • the functions denoted by the blocks may occur in a sequence different from the sequences shown in the figures. For example, any two blocks presented in succession may in fact be executed substantially in parallel, or they may sometimes be executed in a reverse sequence, depending on the functions involved.
  • each block in the block diagrams and/or flow charts as well as a combination of blocks may be implemented using a dedicated hardware-based system executing specified functions or operations, or by a combination of a dedicated hardware and computer instructions.
  • the units or modules involved in the embodiments of the present application may be implemented by means of software or hardware.
  • the described units or modules may also be provided in a processor, for example, described as: a processor, comprising an analysis unit, a determination unit, a generation unit, and a push unit, where the names of these units or modules do not in some cases constitute a limitation to such units or modules themselves.
  • the analysis unit may also be described as “a unit for analyzing, in response to receiving first voice request information sent by an intelligent voice device containing a display, the first voice request information.”
  • the present application further provides a computer-readable medium.
  • the computer-readable medium may be the computer-readable medium included in the apparatus in the above described embodiments, or a stand-alone computer-readable medium not assembled into the apparatus.
  • the computer-readable medium stores one or more programs.
  • the one or more programs when executed by a device, cause the device to: analyze, in response to receiving first voice request information sent by an intelligent voice device containing a display, the first voice request information to determine a user demand; determine an alternative operation associated with the user demand based on a configured optional operation set; generate prompt information for guiding a user to execute the alternative operation; and push the prompt information to the intelligent voice device containing the display to enable the intelligent voice device to show the prompt information on the display.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)
US15/858,428 2017-11-16 2017-12-29 Method and apparatus for providing voice service Active 2038-06-29 US10685649B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201711136981 2017-11-16
CN201711136981.2 2017-11-16
CN201711136981.2A CN107833574B (zh) 2017-11-16 2017-11-16 Method and apparatus for providing voice service

Publications (2)

Publication Number Publication Date
US20190147862A1 US20190147862A1 (en) 2019-05-16
US10685649B2 true US10685649B2 (en) 2020-06-16

Family

ID=61651724

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/858,428 Active 2038-06-29 US10685649B2 (en) 2017-11-16 2017-12-29 Method and apparatus for providing voice service

Country Status (3)

Country Link
US (1) US10685649B2 (zh)
JP (1) JP7335062B2 (zh)
CN (1) CN107833574B (zh)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11264021B2 (en) * 2018-03-08 2022-03-01 Samsung Electronics Co., Ltd. Method for intent-based interactive response and electronic device thereof
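CN108471542B (zh) * 2018-03-27 2020-11-06 南京创维信息技术研究院有限公司 Smart speaker-based film and television resource playing method, smart speaker and storage medium
CN108492826B (zh) * 2018-03-30 2021-05-04 北京金山安全软件有限公司 Audio processing method and apparatus, intelligent device and medium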
WO2019223536A1 (en) * 2018-05-21 2019-11-28 Qingdao Hisense Electronics Co., Ltd. Display apparatus with intelligent user interface
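CN110544473B (zh) 2018-05-28 2022-11-08 百度在线网络技术(北京)有限公司 Voice interaction method and apparatus
CN108959380B (zh) * 2018-05-29 2020-11-17 腾讯大地通途(北京)科技有限公司 Information pushing method, apparatus and client
CN108932946B (zh) 2018-06-29 2020-03-13 百度在线网络技术(北京)有限公司 Voice interaction method and apparatus for customer demand service
CN108920657A (zh) * 2018-07-03 2018-11-30 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN111339348A (zh) * 2018-12-19 2020-06-26 北京京东尚科信息技术有限公司 Information service method, apparatus and system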
US10817246B2 (en) * 2018-12-28 2020-10-27 Baidu Usa Llc Deactivating a display of a smart display device based on a sound-based mechanism
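CN109492169A (zh) * 2019-01-10 2019-03-19 自驾旅行网(上海)信息科技有限公司 Self-driving trip multimedia recommendation method based on AI voice algorithm and application system thereof
CN109919657A (zh) * 2019-01-24 2019-06-21 珠海格力电器股份有限公司 Method and apparatus for obtaining user demand information, storage medium and voice device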
US11003419B2 (en) 2019-03-19 2021-05-11 Spotify Ab Refinement of voice query interpretation
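CN111724773A (zh) * 2019-03-22 2020-09-29 北京京东尚科信息技术有限公司 Application launching method and apparatus, computer system and medium
CN110176229A (zh) * 2019-05-28 2019-08-27 北京增强智能科技有限公司 Voice processing method and apparatus
CN112052377B (zh) * 2019-06-06 2023-09-15 百度在线网络技术(北京)有限公司 Resource recommendation method, apparatus, server and storage medium
CN110297940A (zh) * 2019-06-13 2019-10-01 百度在线网络技术(北京)有限公司 Playback processing method, apparatus, device and storage medium
CN112256947B (zh) * 2019-07-05 2024-01-26 北京猎户星空科技有限公司 Method, apparatus, system, device and medium for determining recommendation information
CN110414582B (zh) * 2019-07-21 2022-03-08 珠海格力电器股份有限公司 Model training method and apparatus, computing device and storage medium
CN112398890A (zh) * 2019-08-16 2021-02-23 北京搜狗科技发展有限公司 Information pushing method and apparatus, and apparatus for pushing information
CN110472095B (zh) * 2019-08-16 2023-03-10 百度在线网络技术(北京)有限公司 Voice guidance method, apparatus, device and medium
CN110717337A (zh) * 2019-09-29 2020-01-21 北京声智科技有限公司 Information processing method and apparatus, computing device and storage medium
CN111048084B (zh) * 2019-12-18 2022-05-31 上海智勘科技有限公司 Method and system for pushing information during intelligent voice interaction
CN111310009A (zh) * 2020-01-16 2020-06-19 珠海格力电器股份有限公司 User classification method and apparatus, storage medium and computer device
CN111540355B (zh) * 2020-04-17 2024-05-24 广州三星通信技术研究有限公司 Voice assistant-based personalized setting method and device
CN112004131A (zh) * 2020-08-12 2020-11-27 海信电子科技(武汉)有限公司 Display system
CN111930919B (zh) * 2020-09-30 2021-01-05 知学云(北京)科技有限公司 Implementation method for voice interaction of enterprise online education app
CN112233660A (zh) * 2020-10-14 2021-01-15 广州欢网科技有限责任公司 User profile expansion method and apparatus, controller and user profile acquisition system
CN112565396B (zh) * 2020-12-02 2023-04-07 深圳优地科技有限公司 Information pushing method and apparatus, robot and storage medium
CN113051471A (zh) * 2021-03-15 2021-06-29 北京线点科技有限公司 Data recommendation method, apparatus and system
CN113393839B (zh) * 2021-08-16 2021-11-12 成都极米科技股份有限公司 Intelligent terminal control method, storage medium and intelligent terminal
CN116016009A (zh) * 2023-01-04 2023-04-25 杭州好上好电子有限公司 AI intelligent butler system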

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030216960A1 (en) * 2002-05-16 2003-11-20 Richard Postrel System and method for offering geocentric-based incentives and executing a commercial transaction via a wireless device
JP2012123492A (ja) 2010-12-06 2012-06-28 Fujitsu Ten Ltd Information providing system and information providing apparatus
US20120311723A1 (en) * 2011-05-06 2012-12-06 Google Inc. Physical Confirmation For Network-Provided Content
US20150207766A1 (en) * 2014-01-22 2015-07-23 Qualcomm Incorporated Dynamic Invites With Automatically Adjusting Displays
US20170256260A1 (en) * 2014-09-05 2017-09-07 Lg Electronics Inc. Display device and operating method therefor
US9986095B1 (en) * 2017-02-13 2018-05-29 West Corporation Multimode service communication configuration for performing transactions

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6122361A (en) * 1997-09-12 2000-09-19 Nortel Networks Corporation Automated directory assistance system utilizing priori advisor for predicting the most likely requested locality
US8498871B2 (en) * 2001-11-27 2013-07-30 Advanced Voice Recognition Systems, Inc. Dynamic speech recognition and transcription among users having heterogeneous protocols
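JP2005332404A (ja) * 2002-09-24 2005-12-02 Motoi Soken:Kk Content providing system
JP2006065860A (ja) * 2005-08-12 2006-03-09 Csk Holdings Corp Distribution information system, distribution information processing apparatus, information terminal apparatus, information distribution method, and program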
US8359020B2 (en) * 2010-08-06 2013-01-22 Google Inc. Automatically monitoring for voice input based on context
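KR20130078486A (ko) * 2011-12-30 2013-07-10 삼성전자주식회사 Electronic apparatus and control method thereof
KR102022318B1 (ko) * 2012-01-11 2019-09-18 삼성전자 주식회사 Method and apparatus for performing a user function using voice recognition
CN104380374A (zh) * 2012-06-19 2015-02-25 株式会社Ntt都科摩 Function execution instruction system, function execution instruction method, and function execution instruction program
JP2014072586A (ja) * 2012-09-27 2014-04-21 Sharp Corp Display device, display method, television receiver, program, and recording medium
JP2014203207A (ja) * 2013-04-03 2014-10-27 ソニー株式会社 Information processing apparatus, information processing method, and computer program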
US9215510B2 (en) * 2013-12-06 2015-12-15 Rovi Guides, Inc. Systems and methods for automatically tagging a media asset based on verbal input and playback adjustments
US9338493B2 (en) * 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
EP3166023A4 (en) * 2014-07-04 2018-01-24 Clarion Co., Ltd. In-vehicle interactive system and in-vehicle information appliance
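JP6671379B2 (ja) * 2014-10-01 2020-03-25 エクスブレイン・インコーポレーテッド Voice and connection platform
CN105069050A (zh) * 2015-07-23 2015-11-18 小米科技有限责任公司 Search response method, apparatus, and system
CN105206266B (zh) * 2015-09-01 2018-09-11 重庆长安汽车股份有限公司 In-vehicle voice control system and method based on user intent guessing
CN105260080A (zh) * 2015-09-22 2016-01-20 广东欧珀移动通信有限公司 Method and apparatus for implementing voice-controlled operation on a mobile terminal display screen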
US10446143B2 (en) * 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
CN106297785B (zh) * 2016-08-09 2020-01-14 董文亮 Intelligent service system based on the Internet of Vehicles

Also Published As

Publication number Publication date
JP2019091417A (ja) 2019-06-13
JP7335062B2 (ja) 2023-08-29
CN107833574A (zh) 2018-03-23
CN107833574B (zh) 2021-08-24
US20190147862A1 (en) 2019-05-16

Similar Documents

Publication Publication Date Title
US10685649B2 (en) Method and apparatus for providing voice service
CN107844586B (zh) News recommendation method and apparatus
US10643610B2 (en) Voice interaction based method and apparatus for generating multimedia playlist
US10795939B2 (en) Query method and apparatus
US10936645B2 (en) Method and apparatus for generating to-be-played multimedia content
US20190147058A1 (en) Method and apparatus for pushing multimedia content
CN109036417B (zh) Method and apparatus for processing voice requests
US20190258660A1 (en) System and method for summarizing a multimedia content item
EP3543998B1 (en) Method and apparatus for playing multimedia content
US11127399B2 (en) Method and apparatus for pushing information
TW201214173A (en) Methods and apparatus for displaying content
US8666749B1 (en) System and method for audio snippet generation from a subset of music tracks
CN109036397B (zh) Method and apparatus for presenting content
CN110717337A (zh) Information processing method and apparatus, computing device, and storage medium
US20200322570A1 (en) Method and apparatus for aligning paragraph and video
US10872108B2 (en) Method and apparatus for updating multimedia playlist
CN109597996B (zh) Semantic parsing method, apparatus, device, and medium
CN111753126A (zh) Method and apparatus for adding background music to video
US20210004406A1 (en) Method and apparatus for storing media files and for retrieving media files
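CN112309387A (zh) Method and apparatus for processing information
CN114580790A (zh) Life cycle stage prediction and model training method, apparatus, medium, and device
CN111883126A (zh) Method and apparatus for selecting a data processing mode, and electronic device
CN112148848A (zh) Question-and-answer processing method and apparatus
CN111131354A (zh) Method and apparatus for generating information
CN111753080B (zh) Method and apparatus for outputting information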

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LU, GUANG;LUO, XIAJUN;YE, SHIQUAN;AND OTHERS;REEL/FRAME:045248/0843

Effective date: 20180103

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.;REEL/FRAME:056811/0772

Effective date: 20210527

Owner name: SHANGHAI XIAODU TECHNOLOGY CO. LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.;REEL/FRAME:056811/0772

Effective date: 20210527

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4