CN113938755A - Server, terminal device and resource recommendation method - Google Patents

Server, terminal device and resource recommendation method

Publication number
CN113938755A
Authority
CN
China
Prior art keywords
recommended
resource
user
language
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111104739.3A
Other languages
Chinese (zh)
Inventor
邵星阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd
Priority to CN202111104739.3A
Publication of CN113938755A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668 Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/252 Processing of multiple end-users' preferences to derive collaborative data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/44222 Analytics of user selections, e.g. selection of programs or purchase activity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667 Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections

Abstract

The present disclosure relates to a server, a terminal device and a resource recommendation method in the technical field of communications. The server includes: a communicator configured to receive input information transmitted by a terminal device; and a controller configured to: determine at least one recommended resource according to the input information received by the communicator, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource into a target language generation model, obtain a first recommended language output by the target language generation model, and send the at least one recommended resource and the first recommended language to the terminal device through the communicator. The target language generation model is obtained by training a preset language generation model according to sample information, and the sample information includes identification information of a plurality of recommended resources and a standard recommended language corresponding to each piece of identification information.

Description

Server, terminal device and resource recommendation method
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a server, a terminal device, and a resource recommendation method.
Background
At present, intelligent devices generally provide a function for interacting with a user. Taking a smart television as an example, the user can issue a search request for a movie by voice; after recognizing the voice content, the smart television displays resource information for the movie and outputs a recommended language for it. This recommended language is usually a fixed setting for the movie and is stored in a database. Because recommended languages are stored in advance and storage capacity is limited, the same recommended language may be set for multiple resources. As a result, the recommended language currently output by intelligent devices is monotonous and fixed.
Disclosure of Invention
To solve the above technical problem, or at least partially solve it, the present disclosure provides a server, a terminal device, and a resource recommendation method, which address the problem that the recommended language output by current intelligent devices is monotonous and fixed.
The technical solutions provided by the embodiments of the present disclosure are as follows:
In a first aspect, a server is provided, including: a communicator configured to receive input information transmitted by a terminal device; and
a controller configured to:
determine at least one recommended resource according to the input information received by the communicator, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource into a target language generation model, obtain a first recommended language output by the target language generation model, and send the at least one recommended resource and the first recommended language to the terminal device through the communicator, wherein the target language generation model is a model obtained by training a preset language generation model according to sample information, and the sample information includes: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each piece of identification information.
In a second aspect, a terminal device is provided, which includes:
a user interface configured to receive an input of a user;
an output interface configured to output user interaction information;
a communicator for communicating with a server;
a controller configured to: in response to an input received by the user interface, control the communicator to send input information corresponding to the input to the server, wherein the server is configured to, in response to receiving the input information, determine at least one recommended resource according to the input information, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource into a target language generation model, and obtain a first recommended language output by the target language generation model; receive, through the communicator, the at least one recommended resource and the first recommended language sent by the server; and control the output interface to output the at least one recommended resource and the first recommended language, wherein the target language generation model is a model obtained by training a preset language generation model according to sample information, and the sample information includes: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each piece of identification information.
In a third aspect, a resource recommendation method is provided, including:
receiving input information sent by terminal equipment;
determining at least one recommended resource according to the input information; inputting identification information of a target recommended resource into a target language generation model, and acquiring a first recommended language output by the target language generation model; and sending the at least one recommended resource and the first recommended language to the terminal device, wherein the target recommended resource is one of the at least one recommended resource.
In a fourth aspect, a resource recommendation method is provided, including: in response to an input of a user, sending input information corresponding to the input to a server, wherein the server is configured to, in response to receiving the input information, determine at least one recommended resource corresponding to the input information, input identification information of a target recommended resource into a target language generation model, and acquire a first recommended language output by the target language generation model; and
receiving the at least one recommended resource and the first recommended language sent by the server, and controlling an output interface to output the at least one recommended resource and the first recommended language, wherein the target recommended resource is one of the at least one recommended resource.
In a fifth aspect, a resource recommendation system is provided, which includes: a server as in the first aspect and a terminal device as in the second aspect.
In a sixth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the resource recommendation method of the third aspect or the fourth aspect.
In the technical solution provided by the embodiments of the present disclosure, when a terminal device receives an input from a user, the terminal device may send input information corresponding to the input to a server. The server may determine at least one recommended resource according to the input information, generate a recommended language for a target recommended resource among the at least one recommended resource, and send that recommended language to the terminal device. In the embodiments of the present disclosure, tag information of the target recommended resource is input into a target language generation model, and the recommended language is generated in real time. Because the target language generation model is trained according to identification information of a plurality of recommended resources and a standard recommended language corresponding to each piece of identification information, the generated recommended language is specific to the current recommended resource. Compared with the prior art, which uses fixed, pre-stored recommended languages, this generation manner yields recommended languages that are more flexible and richer.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 illustrates a scene diagram in some embodiments;
FIG. 2 shows a block diagram of a hardware configuration of a terminal device in some embodiments;
FIG. 3 shows a block diagram of a hardware configuration of a server in some embodiments;
FIG. 4 illustrates a diagram of software configuration in a terminal device in some embodiments;
FIG. 5 is a schematic diagram illustrating a resource recommendation method provided by an embodiment of the present disclosure;
FIG. 6 illustrates a schematic diagram of multiple trigger opportunities provided by embodiments of the present disclosure;
FIG. 7 is a diagram illustrating generation of a recommended language through a target language generation model in an embodiment of the present disclosure;
FIG. 8 shows an implementation diagram of a recommendation process in an embodiment of the disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
FIG. 1 is a schematic diagram of a scenario in some embodiments. As shown in FIG. 1, a user may operate the terminal device 200 through the smart device 300 or the control apparatus 100, and the terminal device 200 performs data communication with the server 400.
In some embodiments, the control apparatus 100 may be a remote controller. The remote controller communicates with the terminal device through infrared protocol communication, Bluetooth protocol communication, or other short-distance communication methods, and controls the terminal device 200 in a wireless or wired manner. The user can input a user instruction through keys on the remote controller, voice input, control panel input, or the like to control the terminal device 200.
In some embodiments, the smart device 300 (e.g., mobile terminal, tablet, computer, laptop, etc.) may also be used to control the terminal device 200. The terminal device 200 is controlled using, for example, an application program running on the smart device.
In some embodiments, the terminal device 200 may also receive the user's control through touch or gesture, etc., instead of receiving the instruction using the smart device or the control device described above.
In some embodiments, the terminal device 200 may also be controlled in a manner other than through the control apparatus 100 and the smart device 300. For example, a voice instruction of the user may be received directly by a voice-acquisition module configured inside the terminal device 200, or by a voice control device provided outside the terminal device 200.
In some embodiments, terminal device 200 may be allowed to communicatively connect through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 can provide various contents and interactions to the terminal apparatus 200. The server 400 may be a cluster or a plurality of clusters, and may include one or more types of servers.
FIG. 2 is a block diagram of a hardware configuration of the terminal device in some embodiments. As shown in FIG. 2, the terminal device 200 includes at least one of a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a user interface 280, a memory, and a power supply, wherein the display 260 and the audio output interface 270 are output interfaces.
In some embodiments, the controller 250 includes a processor, a video processor, an audio processor, a graphics processor, a RAM, a ROM, and first to nth interfaces for input/output.
The display 260 includes a display screen component for presenting pictures and a driving component for driving image display. It receives image signals output by the controller and displays video content, image content, a menu manipulation interface, and a user manipulation UI.
The display 260 may be a liquid crystal display, an OLED display, or a projection display, and may also be a projection device and a projection screen.
The communicator 220 is a component for communicating with an external device or a server according to various communication protocol types. For example, the communicator may include at least one of a Wi-Fi module, a Bluetooth module, a wired Ethernet module, another network communication protocol chip or near field communication protocol chip, and an infrared receiver. The terminal device 200 may transmit and receive control signals and data signals to and from the external control apparatus 100 or the server 400 through the communicator 220.
The user interface may be used to receive a control signal input by a user through the control device 100 (e.g., an infrared remote control, etc.) or by touch or gesture, etc.
The detector 230 is used to collect signals of the external environment or interaction with the outside. For example, detector 230 includes a light receiver, a sensor for collecting ambient light intensity; alternatively, the detector 230 includes an image collector, such as a camera, which may be used to collect external environment scenes, attributes of the user, or user interaction gestures, or the detector 230 includes a sound collector, such as a microphone, which is used to receive external sounds.
The external device interface 240 may include, but is not limited to, any one or more of the following: a High-Definition Multimedia Interface (HDMI), an analog or data high-definition component input interface (Component), a composite video input interface (CVBS), a USB input interface (USB), an RGB port, and the like. It may also be a composite input/output interface formed by several of these interfaces.
The tuner demodulator 210 receives broadcast television signals in a wired or wireless manner, and demodulates audio/video signals, as well as data signals such as EPG data, from a plurality of wireless or wired broadcast television signals.
In some embodiments, the controller 250 and the tuner demodulator 210 may be located in separate devices; that is, the tuner demodulator 210 may be located in a device external to the main device where the controller 250 is located, such as an external set-top box.
The controller 250 controls the operation of the terminal device and responds to user operations through various software control programs stored in the memory. The controller 250 controls the overall operation of the terminal device 200. For example, in response to receiving a user command for selecting a UI object displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, the controller includes at least one of a Central Processing Unit (CPU), a video processor, an audio processor, a Graphics Processing Unit (GPU), a Random Access Memory (RAM), a Read-Only Memory (ROM), first to nth interfaces for input/output, a communication bus (Bus), and the like.
A user may input a user command on a Graphical User Interface (GUI) displayed on the display 260, and the user interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user interface receives the user input command by recognizing the sound or gesture through the sensor.
A "user interface" is a media interface for interaction and information exchange between an application or operating system and a user that enables the conversion of the internal form of information to a form acceptable to the user. A commonly used presentation form of the User Interface is a Graphical User Interface (GUI), which refers to a User Interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, a window, a control, etc. displayed in the display screen of the electronic device, where the control may include a visual interface element such as an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc.
In some embodiments, the terminal device may be a display device, that is, a terminal device having a display function, such as a television, a mobile phone, a computer, or a learning machine. In the display device:
a user interface 280 configured to receive an input of a user; the input may be a voice input, a touch input, and the like, which is not limited in the embodiment of the disclosure.
An output interface (display 260, and/or audio output interface 270) configured to output user interaction information;
a communicator 220 for communicating with the server 400;
a controller 250 configured to: in response to an input received by the user interface 280, control the communicator 220 to send input information corresponding to the input to the server 400, wherein the server 400 is configured to, in response to receiving the input information, determine at least one recommended resource according to the input information, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource into a target language generation model, and obtain a first recommended language output by the target language generation model; receive, through the communicator 220, the at least one recommended resource and the first recommended language sent by the server 400; and control the output interface to output the at least one recommended resource and the first recommended language, wherein the target language generation model is a model obtained by training a preset language generation model according to sample information, and the sample information includes: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each piece of identification information.
The control of the output interface to output the at least one resource to be recommended and the first recommendation language means that the display 260 and/or the audio output interface 270 are used to output the at least one resource to be recommended and the first recommendation language.
In some embodiments, the controller 250 is further configured to:
receiving, by the communicator 220, ranking information of at least one recommended resource transmitted by the server 400; the ranking information is determined according to user preference information of the user.
In some embodiments, the controller 250 is further configured to:
receive, through the communicator 220, random resources other than the at least one recommended resource, sent by the server 400.
In some embodiments, the controller 250 is further configured to:
receiving, by communicator 220, an output mode indication sent by server 400; the output mode indication is used for indicating that at least one recommended resource is output through an interface display mode and indicating that a first recommended language is output through the interface display mode and/or a voice mode;
the output interface includes: display 260 and/or audio output interface 270;
the controller 250 is specifically configured to:
controlling the display 260 to output at least one recommended resource and a first recommended word in an interface display mode;
and/or,
controls the display 260 to output the at least one recommended resource in an interface display manner, and controls the audio output interface 270 to output the first recommended language in a voice manner.
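The handling of the output mode indication on the terminal side can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `PhraseOutput` flag and the `plan_output` function are hypothetical names, and the returned dictionary merely stands in for calls to the display 260 and the audio output interface 270.

```python
from enum import Flag, auto
from typing import Dict, List


class PhraseOutput(Flag):
    """Hypothetical encoding of how the first recommended language is output."""
    DISPLAY = auto()  # show the recommended language on the interface
    VOICE = auto()    # speak the recommended language


def plan_output(resources: List[str], phrase: str, mode: PhraseOutput) -> Dict[str, List[str]]:
    """Decide what goes to the display and what goes to the audio output.

    Recommended resources are always shown on the display; the recommended
    language is routed according to the output mode indication.
    """
    plan: Dict[str, List[str]] = {"display": list(resources), "audio": []}
    if PhraseOutput.DISPLAY in mode:
        plan["display"].append(phrase)
    if PhraseOutput.VOICE in mode:
        plan["audio"].append(phrase)
    return plan


# Example: resources on screen, recommended language spoken only.
voice_only = plan_output(["Movie A", "Movie B"], "Try Movie A tonight.", PhraseOutput.VOICE)
# Example: recommended language both shown and spoken.
both = plan_output(["Movie A"], "Try Movie A tonight.", PhraseOutput.DISPLAY | PhraseOutput.VOICE)
```

Using a flag type keeps the "interface display mode and/or voice mode" wording of the indication as a single value that can carry either or both options.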
Fig. 3 is a block diagram of a hardware configuration of a server in some embodiments; in some embodiments, the server 400 may include:
a communicator 410 configured to receive input information transmitted by the terminal apparatus 200;
a controller 420 configured to:
determine at least one recommended resource according to the input information received by the communicator 410, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource into a target language generation model, obtain a first recommended language output by the target language generation model, and send the at least one recommended resource and the first recommended language to the terminal device through the communicator 410, wherein the target language generation model is a model obtained by training a preset language generation model according to sample information, and the sample information includes: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each piece of identification information.
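The controller flow described above can be sketched as a short pipeline. This is only an illustration under stated assumptions: the `Resource` type, the `recommend` function, the in-memory `CATALOG`, and the three helper callables are all hypothetical, and `stub_model` is a placeholder for the trained target language generation model rather than a real model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Resource:
    resource_id: str  # identification information of the resource
    title: str


def recommend(
    input_info: str,
    find_resources: Callable[[str], List[Resource]],
    pick_target: Callable[[List[Resource]], Resource],
    generate_language: Callable[[str], str],
) -> Tuple[List[Resource], str]:
    """Return (recommended resources, first recommended language)."""
    resources = find_resources(input_info)             # at least one recommended resource
    if not resources:
        raise ValueError("no recommended resource for this input")
    target = pick_target(resources)                    # target recommended resource
    phrase = generate_language(target.resource_id)     # first recommended language
    return resources, phrase


# Hypothetical stand-ins for the search, selection, and model steps.
CATALOG = [Resource("m1", "Space Drama"), Resource("m2", "Space Comedy")]


def find_by_keyword(text: str) -> List[Resource]:
    return [r for r in CATALOG if text.lower() in r.title.lower()]


def first_item(resources: List[Resource]) -> Resource:
    return resources[0]


def stub_model(resource_id: str) -> str:
    return f"You might enjoy {resource_id}."


resources, phrase = recommend("space", find_by_keyword, first_item, stub_model)
```

Both `resources` and `phrase` would then be sent to the terminal device through the communicator 410.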
The controller 420 may include a processor, a video processor, an audio processor, a graphic processor, a RAM, a ROM, and the like.
In some embodiments, the controller 420 is further configured to:
obtaining recommendation related factors, wherein the recommendation related factors comprise at least one of the following:
a conversation scene where the current user is located;
a user emotion corresponding to the input information;
a user portrait of the user;
preference information from the preceding context of the current conversation scene;
a knowledge graph of the target recommended resource;
the controller 420 is specifically configured to: input the recommendation related factors and the identification information of the target recommended resource into the target language generation model, and acquire a first recommended language of the target recommended resource output by the target language generation model.
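One way to combine the recommendation related factors with the identification information is to flatten them into a single model input. The sketch below assumes a text-conditioned model; the `RecommendationFactors` type, its field names, and the `key=value` serialization are illustrative assumptions, since the disclosure does not specify an input format.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RecommendationFactors:
    """Optional factors; any subset may be present ("at least one of")."""
    conversation_scene: Optional[str] = None
    user_emotion: Optional[str] = None
    user_portrait: Optional[str] = None
    context_preference: Optional[str] = None
    knowledge_graph_facts: Optional[str] = None


def build_model_input(resource_id: str, factors: RecommendationFactors) -> str:
    """Serialize the identification information plus the factors that are set."""
    parts = [f"resource={resource_id}"]
    for name, value in asdict(factors).items():
        if value is not None:  # skip factors that were not obtained
            parts.append(f"{name}={value}")
    return " | ".join(parts)


model_input = build_model_input(
    "m1", RecommendationFactors(user_emotion="happy", user_portrait="sci-fi fan")
)
```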
In some embodiments, the controller 420 is further configured to:
acquiring identification information of a plurality of recommended resources and a standard recommended language corresponding to each identification information;
inputting the identification information of a first recommended resource among the plurality of recommended resources into the preset language generation model, obtaining a recommended language of the first recommended resource output by the preset language generation model, and updating the preset language generation model according to the recommended language of the first recommended resource and a first standard recommended language to obtain the target language generation model, wherein the first standard recommended language is the standard recommended language corresponding to the identification information of the first recommended resource.
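The training procedure above follows the usual supervised pattern: generate, compare with the standard recommended language, update. The sketch below shows only that control flow. `ToyLanguageModel` is a deliberately trivial stand-in that memorizes phrases; a real language generation model would compute a loss and adjust parameters instead. All names are hypothetical.

```python
from typing import Dict, List, Tuple


class ToyLanguageModel:
    """Stand-in for the preset language generation model (memorizes phrases)."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def generate(self, resource_id: str) -> str:
        return self.table.get(resource_id, "")

    def update(self, resource_id: str, standard_phrase: str) -> None:
        # A real model would take a gradient step here; the toy just stores the target.
        self.table[resource_id] = standard_phrase


def train(model: ToyLanguageModel, samples: List[Tuple[str, str]], max_epochs: int = 3) -> int:
    """Update the model until its output matches every standard phrase.

    `samples` pairs identification information with its standard recommended
    language. Returns the number of updates performed.
    """
    updates = 0
    for _ in range(max_epochs):
        changed = False
        for resource_id, standard_phrase in samples:
            if model.generate(resource_id) != standard_phrase:
                model.update(resource_id, standard_phrase)
                updates += 1
                changed = True
        if not changed:  # every output already matches its standard phrase
            break
    return updates


model = ToyLanguageModel()
n_updates = train(model, [("m1", "A gripping space drama."), ("m2", "A light comedy.")])
```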
In some embodiments, the controller 420 is specifically configured to:
determining a correct resource name which is stored corresponding to the input information according to the input information, and determining at least one recommended resource based on the correct resource name;
and/or,
the input information comprises voiceprint information, a user portrait matched with the voiceprint information is determined, and at least one recommended resource is determined based on the user portrait;
and/or,
according to the input information and the user information of the input information, a target entity is determined, and the at least one recommended resource associated with the target entity is determined based on a knowledge graph.
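Two of the strategies above, name correction and knowledge-graph lookup, can be sketched with plain dictionaries. The stored mapping `CORRECT_NAMES`, the adjacency-list `KNOWLEDGE_GRAPH`, and the placeholder titles are all hypothetical; the voiceprint strategy is omitted because it would need an audio model.

```python
from typing import Dict, List

# Hypothetical stored data: misrecognized input -> correct resource name.
CORRECT_NAMES: Dict[str, str] = {"exmaple movie": "Example Movie"}

# Hypothetical knowledge graph as an adjacency list: entity -> related resources.
KNOWLEDGE_GRAPH: Dict[str, List[str]] = {
    "Example Movie": ["Example Movie 2", "Example Director Collection"],
}


def correct_resource_name(input_text: str, corrections: Dict[str, str]) -> str:
    """Look up the stored correct name; fall back to the cleaned input."""
    cleaned = input_text.strip()
    return corrections.get(cleaned.lower(), cleaned)


def related_resources(entity: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return resources linked to the target entity in the graph."""
    return list(graph.get(entity, []))


def recommend_from_input(input_text: str) -> List[str]:
    """Combine both strategies: the corrected name plus its graph neighbors."""
    name = correct_resource_name(input_text, CORRECT_NAMES)
    return [name] + related_resources(name, KNOWLEDGE_GRAPH)


candidates = recommend_from_input("  Exmaple Movie ")
```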
In some embodiments, the at least one recommended resource satisfies at least one of the following conditions:
the search index is greater than or equal to a preset index;
the resource evaluation parameter is greater than or equal to a preset evaluation parameter;
the online time is in a preset time range;
associated with a target zone;
it has not been played by the user;
it has not been recommended to the user.
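The conditions above can be expressed as a list of boolean checks, with a resource qualifying when at least one of them holds. The field names, the `Thresholds` type, and the sample values below are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    title: str
    search_index: float
    rating: float            # resource evaluation parameter
    online_date: date
    region: str
    played_by_user: bool
    already_recommended: bool


@dataclass
class Thresholds:
    min_search_index: float
    min_rating: float
    online_from: date
    online_to: date
    target_region: str


def qualifies(c: Candidate, t: Thresholds) -> bool:
    """True if the candidate satisfies at least one of the listed conditions."""
    checks = [
        c.search_index >= t.min_search_index,
        c.rating >= t.min_rating,
        t.online_from <= c.online_date <= t.online_to,
        c.region == t.target_region,
        not c.played_by_user,
        not c.already_recommended,
    ]
    return any(checks)


limits = Thresholds(100.0, 8.0, date(2021, 1, 1), date(2021, 12, 31), "region-a")
stale = Candidate("Old Show", 10.0, 5.0, date(2019, 5, 1), "region-b", True, True)
unseen = Candidate("New To User", 10.0, 5.0, date(2019, 5, 1), "region-b", False, True)
```

A stricter deployment could replace `any` with `all`, or with a required subset of checks; the text only requires that at least one condition be met.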
In some embodiments, the controller 420 is further configured to:
predicting the click probability of the at least one recommended resource;
ranking the at least one recommended resource according to the click probability to obtain ranking information of the at least one recommended resource; and sending the ranking information of the at least one recommended resource to the terminal device through the communicator 410.
In some embodiments, the controller 420 is further configured to: send random resources other than the at least one recommended resource to the terminal device through the communicator 410.
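Ranking by predicted click probability reduces to a descending sort. In the sketch below, `predict` is a hypothetical callable standing in for whatever click-prediction model the server uses, and the probabilities are made-up sample values.

```python
from typing import Callable, Dict, List


def rank_by_click_probability(
    resource_ids: List[str], predict: Callable[[str], float]
) -> List[str]:
    """Return resource ids ordered from most to least likely to be clicked."""
    scored = [(predict(rid), rid) for rid in resource_ids]
    # Sort on the probability only; ties keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rid for _, rid in scored]


# Hypothetical predicted probabilities for three resources.
SAMPLE_PROBABILITIES: Dict[str, float] = {"a": 0.2, "b": 0.9, "c": 0.5}
ranking = rank_by_click_probability(["a", "b", "c"], SAMPLE_PROBABILITIES.__getitem__)
```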
In some embodiments, the controller 420 is specifically configured to:
acquiring a second recommended language output by the target language generation model, wherein the second recommended language corresponds to the target recommended resource;
if the fluency parameter of the second recommended language is detected to be smaller than the preset fluency parameter, determining a third recommended language from the stored recommended languages as the first recommended language, wherein the similarity parameter between the third recommended language and the second recommended language is greater than the preset similarity parameter;
and if the fluency parameter of the second recommended language is detected to be greater than or equal to the preset fluency parameter, taking the second recommended language as the first recommended language.
In some embodiments, the controller 420 is further configured to:
sending an output mode indication to the terminal device through the communicator 410; the output mode indication is used for indicating that at least one recommended resource is output through an interface display mode and indicating that the first recommended language is output through the interface display mode and/or a voice mode.
In some embodiments, the terminal device may be a terminal device having an operating system, and the operating system of the terminal device may be an android operating system, an iOS operating system, and the like, which is described below by taking the android operating system as an example.
Referring to fig. 4, as a software configuration diagram in the terminal device in some embodiments, the terminal device may divide the operating system into four layers, which are, from top to bottom, an Application (Applications) layer (referred to as an "Application layer"), an Application Framework (Application Framework) layer (referred to as a "Framework layer"), an Android runtime (Android runtime) and system library layer (referred to as a "system runtime library layer"), and a kernel layer.
In some embodiments, at least one application program runs in the application program layer, and the application programs may be Window programs carried by the operating system, system setting programs, clock programs, or the like, or may be applications developed by third-party developers. In particular implementations, the application packages in the application layer are not limited to the above examples.
The framework layer provides an Application Programming Interface (API) and a programming framework for the application. The application framework layer includes a number of predefined functions. The application framework layer acts as a processing center that decides to let the applications in the application layer act. The application program can access the resources in the system and obtain the services of the system in execution through the API interface.
As shown in fig. 4, in the embodiment of the present application, the application framework layer includes a manager (Managers), a Content Provider (Content Provider), and the like, where the manager includes at least one of the following modules: an Activity Manager (Activity Manager) is used for interacting with all activities running in the system; the Location Manager (Location Manager) is used for providing the system service or application with the access of the system Location service; a Package Manager (Package Manager) for retrieving various information related to an application Package currently installed on the device; a Notification Manager (Notification Manager) for controlling display and clearing of Notification messages; a Window Manager (Window Manager) is used to manage the icons, windows, toolbars, wallpapers, and desktop components on a user interface.
In some embodiments, the activity manager is used to manage the lifecycle of the various applications as well as general navigational fallback functions, such as controlling exit, opening, fallback, etc. of the applications. The window manager is used for managing all window programs, such as obtaining the size of a display screen, judging whether a status bar exists, locking the screen, intercepting the screen, controlling the change of the display window (for example, reducing the display window, displaying a shake, displaying a distortion deformation, and the like), and the like.
In some embodiments, the system runtime layer provides support for the upper layer, i.e., the framework layer, and when the framework layer is used, the android operating system runs the C/C++ library included in the system runtime layer to implement the functions to be implemented by the framework layer.
In some embodiments, the kernel layer is a layer between hardware and software. As shown in fig. 5, the kernel layer includes at least one of the following drivers: audio driver, display driver, Bluetooth driver, camera driver, Wi-Fi driver, USB driver, HDMI driver, sensor drivers (such as fingerprint sensor, temperature sensor, pressure sensor, etc.), power driver, and the like.
The resource recommendation method provided by the embodiment of the disclosure can be realized based on the interaction between the terminal device and the server.
Fig. 5 is a schematic diagram illustrating a resource recommendation method according to an embodiment of the disclosure; the resource recommendation method comprises the following steps:
501. The terminal device receives an input from a user.
The input is a voice input, a touch input, and the like, and the embodiment of the invention is not limited.
502. The terminal equipment responds to the input of the user and sends input information corresponding to the input to the server.
In some embodiments, the input of the user is a voice input, the terminal device may obtain input information corresponding to the voice input in response to the voice information, and the input information corresponding to the voice input may include voiceprint information corresponding to the voice input, content information of the voice input, and the like. The terminal device can recognize the voice input to obtain content information of the voice input and send the content information to the server.
In other embodiments, the input of the user may be a touch input, and the terminal device may obtain input content corresponding to the touch input in response to the touch input, and send the input content to the server.
In some embodiments, after the terminal device obtains the input information corresponding to the input of the user, the terminal device may perform semantic parsing through a semantic engine to obtain the user intention.
In other embodiments, after acquiring the input information corresponding to the user's input, the terminal device may send the input information to the server, and the server performs semantic parsing through a semantic engine to obtain the user intention.
In some embodiments, throughout the entire voice interaction process, the embodiment of the present disclosure may trigger the resource recommendation process in multiple scenarios, that is, there are multiple trigger occasions to trigger execution of subsequent 503 to 507, so as to improve human-computer interaction experience.
Fig. 6 is a schematic diagram of multiple trigger timings according to an embodiment of the disclosure;
after the input information corresponding to the user input is acquired, the analysis of the input information may be triggered, as shown in fig. 6, 4 analysis results may exist, and a subsequent resource recommendation process may be triggered under any one of the following analysis results:
(1) the user has no explicit intent;
in some embodiments, a chit-chat query or a meaningless query from the user may be resolved as the user having no explicit intent.
For example, chit-chat queries such as "I am bored" and "I am in a bad mood", as shown in fig. 6, may be resolved as the user having no explicit intent.
For example, a meaningless query such as "that", as shown in fig. 6, would also be resolved as the user having no explicit intent.
(2) The user intention cannot be analyzed;
in some embodiments, the user does not state an intention, so the semantic engine cannot resolve the user intention when parsing the input information corresponding to the user's voice input. For example, as shown in fig. 6, for the query "i am about", the semantic engine cannot resolve the user intention.
In some embodiments, the user states the intention, but the semantic engine cannot resolve it because of positioning problems, error correction problems, labeling problems, recognition problems, and the like that exist in existing semantic engines.
For example, regarding the positioning problem, some user expressions are new intentions or new phrasings outside the existing domain and intention system and cannot be classified into it, so the user's input information cannot be positioned and the user intention cannot be resolved. For example, for the query "epidemic data" in 2020, the semantic engine at that time did not cover the newly emerging epidemic intention, so the user intention could not be resolved.
For example, for error correction, the user's input information may not include accurate resource description information, and the user's intention cannot be resolved. For example, for the query "bala magic fairy", what the user wants to search for is the relevant resource of "bala magic fairy", but since the query content is "bala magic fairy", and the semantic engine cannot recognize the "bala magic fairy" as "bala magic fairy" through error correction, the user intention cannot be resolved.
For example, regarding the labeling problem, the input information of the user is a resource name, but the resource name and its resource type are not labeled in the database corresponding to the semantic engine, so the semantic engine cannot resolve the user intention. For example, when the query is "rolling the east-dead water of Yangtze river", the user wants to search for the song "rolling the east-dead water of Yangtze river", but since the resource and the resource type of the field "rolling the east-dead water of Yangtze river" are not labeled in the database corresponding to the semantic engine, the user intention cannot be resolved.
For example, regarding the recognition problem, the semantic engine may be unable to resolve the user intention because the content information is recognized incorrectly when the speech content is recognized. For example, the query is "encyclopedia of ameliota" and the user means the encyclopedia information of the person ameliota three, but the query is recognized as "encyclopedia of ameliota 300" during content recognition, so the semantic engine may be unable to resolve the user intention.
(3) The user has a clear intention, but does not define a specific resource name;
for example, for queries such as "I want to watch a movie", "I want to listen to a song", "game", and "what good-looking movie", the user has stated an intention but has not specified a specific resource name.
In some embodiments, when the subsequent recommendation process is triggered for the cases (1), (2) and (3), resource recommendation may be performed for the user based on a user profile formed from the user's historical behaviors, the popularity of the resource, the freshness of the resource, and the like. For a specific recommendation strategy, reference may be made to the following description of any recommendation strategy, which is not described herein again.
Further, reply content may also be output to the user based on the recommended resources. For example, in cases (1) and (2) above, the voice broadcast may be "Although I do not yet know what you said, I guess you will like these contents"; in case (3), when the user wants to search for movie resources, the voice broadcast may be "Although I do not know which movie you specifically want to watch, I guess you will like these movies".
(4) The user defines a specific resource name.
There may be two cases for the above case (4) for the query result of the resource name:
(4.1) no resources;
for example, for the query "I want to listen to night piano music five", the user explicitly states the resource name "night piano music five" and the semantic engine can parse the user intention, but if the resource is not stored in the music resource library, the query result is no resource.
In some embodiments, when the subsequent recommendation process is triggered for the above case (4.1), the same type of resource as that of "night piano music five" or related resources may be recommended, for example, other piano music is recommended to the user, and during the recommendation process, resource recommendation may also be performed for the user based on the knowledge graph, the user profile formed by the user historical behaviors, the popularity of the resource, the novelty of the resource, and the like. For a specific recommendation strategy, reference may be made to the following description of any recommendation strategy, which is not described herein again.
(4.2) resources are available.
For example, for the query "I want to listen to night piano music five", the user states the resource name "night piano music five" and the semantic engine can parse the user intention; if the resource is stored in the music resource library, the query result is that the resource is available.
For this situation, in the subsequent recommendation process, the resource of "night piano music five" that the user inquired about may be directly recommended to the user. Further, while the resource of 'night piano music five' is presented to the user, the same type of resource or related resources can be recommended to the user.
In the embodiment of the disclosure, the resource recommendation process can be triggered for multiple trigger occasions, so that resource recommendation can run through the whole user human-computer interaction process, and user experience is improved.
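The four analysis results and the follow-up action for each can be summarized in a small Python sketch. The enum members, the function name, and the returned action labels are all hypothetical names chosen for illustration; the disclosure only describes the behavior in prose.

```python
from enum import Enum, auto
from typing import Optional


class Analysis(Enum):
    NO_CLEAR_INTENT = auto()      # case (1)
    INTENT_UNRESOLVED = auto()    # case (2)
    INTENT_WITHOUT_NAME = auto()  # case (3)
    NAMED_RESOURCE = auto()       # case (4)


def recommendation_plan(result: Analysis,
                        resource_found: Optional[bool] = None) -> str:
    """Map an analysis result to the kind of recommendation to trigger."""
    if result is not Analysis.NAMED_RESOURCE:
        # Cases (1)-(3): fall back to profile, popularity and freshness.
        return "recommend_from_profile_popularity_freshness"
    if resource_found:
        # Case (4.2): present the resource, plus same-type or related ones.
        return "return_resource_and_related"
    # Case (4.1): the named resource is missing from the library.
    return "recommend_same_type_or_related"
```

Every branch ends in some recommendation, which reflects the point that recommendation can run through the whole interaction.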
503. The server determines at least one recommended resource according to the input information.
In the embodiment of the disclosure, the server may determine at least one recommended resource based on one or more recommendation strategies according to the input information.
In some embodiments, the recommendation strategies may include, but are not limited to, at least one of the following strategies (A) to (F):
(A) a recommendation strategy based on query rewrite;
in some embodiments, a correct resource name saved in correspondence with the input information is determined according to the input information, and the at least one recommended resource is determined based on the correct resource name.
In some embodiments, mapping pairs of correct queries and incorrect queries may be pre-stored. Illustratively, Table 1 shows mapping pairs of correct queries and incorrect queries, assuming that the incorrect query input by the user is "piglet" or "pig pecky".
TABLE 1 (mapping pairs of incorrect queries and correct queries; table image BDA0003270476530000111)
For the case that the input information is an erroneous query, a correct query (i.e., the correct resource name) corresponding to the erroneous query may be matched in the mapping pair, and at least one recommended resource may be determined based on the correct query.
In some embodiments, the mapping pairs of correct queries and incorrect queries may be obtained from user log data through big-data statistical analysis. Because of speech recognition errors, incorrect use of the voice function, non-standard user phrasing, and the like, a user's query is often classified into the wrong service domain, so generalization is poor and the user's true intention cannot be correctly understood. Query rewriting broadly collects user expressions through data mining and mines similar sentences that users utter under similar intentions, so that non-standard or incorrect sentences are rewritten into sentences that are easy to understand and clear in meaning.
Wherein, according to the user log information, the process of collecting the common associated user utterance sets comprises the following steps:
firstly, the common queries of a large number of users are analyzed based on big-data statistical analysis, and the queries that occur most frequently across users are taken as the correct query set. Since these queries are issued by many different users, they can be considered correct queries.
Then, the queries of the same user, arranged in time order, may be divided into sessions according to a time interval threshold. The threshold can be set according to actual conditions; for example, it can be set to a fixed value of 30 minutes, so that associated sentences within 30 minutes can be recalled.
Furthermore, for each session, it can be checked whether the last query of the session belongs to the correct query set. If so, that query is the correct query of the session, the similarity to the correct query is calculated for all other queries in the session, and the queries whose similarity to the correct query is greater than or equal to a preset value can be used as the candidate set of incorrect queries mapped to that correct query. In determining the incorrect query set, the final mapping from multiple incorrect queries to one correct query can be determined through some manual screening.
The similarity between queries can be calculated using measures such as Euclidean distance, cosine similarity, Jaccard distance, or edit distance. The resulting query mapping data can be stored as mappings from incorrect queries to correct queries and used as an error-correction recall database, from which the corresponding correct query is recalled for the original user query.
The benefit of the above embodiment is that the recalled queries are derived from real user queries, and the database can be periodically extended with newly popular queries.
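The session splitting and candidate mining described above might be sketched in Python as follows. This is only an illustration under stated assumptions: the log is a list of (timestamp in seconds, query) pairs for one user, the correct query set is supplied by the caller, similarity is a character-level Jaccard similarity (one minus the Jaccard distance named in the text), and the 0.6 threshold, function names, and sample queries are hypothetical.

```python
from typing import Dict, List, Set, Tuple

SESSION_GAP_SECONDS = 30 * 60  # 30-minute interval mentioned in the text
MIN_SIMILARITY = 0.6           # assumed preset value


def split_sessions(log: List[Tuple[float, str]]) -> List[List[str]]:
    """Split one user's time-ordered (timestamp, query) log into sessions."""
    sessions: List[List[str]] = []
    previous_ts = None
    for ts, query in log:
        if previous_ts is None or ts - previous_ts > SESSION_GAP_SECONDS:
            sessions.append([])  # gap exceeded: start a new session
        sessions[-1].append(query)
        previous_ts = ts
    return sessions


def jaccard_similarity(a: str, b: str) -> float:
    """Character-set Jaccard similarity, i.e. 1 - Jaccard distance."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def mine_candidate_mapping(sessions: List[List[str]],
                           correct_queries: Set[str]) -> Dict[str, str]:
    """Map earlier queries in a session to its final, correct query."""
    mapping: Dict[str, str] = {}
    for session in sessions:
        last = session[-1]
        if last not in correct_queries:
            continue  # only sessions ending in a known-correct query count
        for query in session[:-1]:
            if query != last and jaccard_similarity(query, last) >= MIN_SIMILARITY:
                mapping[query] = last
    return mapping


log = [(0.0, "blue and white porcelin"),
       (40.0, "blue and white porcelain"),
       (4000.0, "weather")]
sessions = split_sessions(log)
mapping = mine_candidate_mapping(sessions, {"blue and white porcelain"})
```

In practice the candidate mapping would still pass through the manual screening step before being stored in the recall database.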
(B) A recommendation based on the voiceprint user representation;
the user portrait can be a one-level or multi-level user portrait, and it can reflect the preferences of the user.
In some embodiments, the user representation may include two levels of representation labels: a topic of interest, and an interest tag. For example, the interest topic may include: video, movies, television shows, documentaries, songs, etc. The interest tags may include: science fiction, love, comedy, war, military affairs, police gangster, etc.
In some embodiments, the input information includes voiceprint information, a user representation matching the voiceprint information may be determined, and at least one recommended resource may be determined based on the user representation. The user representation may be determined based on the user's historical resource-playing behavior. Specifically, it may be determined from the user's recent playing behavior or from the user's playing behavior over a long historical period; the embodiment of the present disclosure does not limit the reference time period of the historical behavior. For example, the user representation may be determined from the resources the user played within the last month, or from the resources the user played within the last year.
In some embodiments, the user representation may be based on a user representation of one or more users of the terminal device.
In some embodiments, when matching the voiceprint information to the user representation, a user representation of the user making the voice input to the terminal device may be determined.
Some terminal devices are shared by a plurality of users whose interests differ. For example, family members may share the same television, but their interests vary. For such a shared terminal device, if a mixed user portrait of multiple users is used to recommend resources, the recommended resources may not match the preferences of the user currently using the terminal device, and the accuracy of resource recommendation is poor. For example, one family member likes Yi Yangqianxi and a male family member does not, but when the male family member watches television, videos of Yi Yangqianxi are recommended to him, so he may have a poor experience with resource recommendation.
Based on different voiceprint information of different users, in order to realize accurate recommendation, a user portrait based on personal voiceprint information can be established for each family member based on a voiceprint recognition technology, in the recommendation process, the voiceprint information of the current user is recognized, the corresponding user portrait is matched, and recommendation is carried out based on the matched user portrait.
Illustratively, Table 2 shows user portraits for different family members. The family has 2 members, one with ID 001 corresponding to voiceprint information 1 and the other with ID 002 corresponding to voiceprint information 2; they correspond to different user portraits, and each user portrait includes a weight for each type of content the user is interested in.
TABLE 2 (user portraits of different family members; table image BDA0003270476530000131)
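A minimal Python sketch of voiceprint-based portrait matching is given below. The portrait weights, voiceprint keys, catalog entries, and function name are invented for illustration and are not the values of Table 2; voiceprint recognition itself is assumed to have already produced an identifier.

```python
from typing import Dict, List

# Hypothetical per-member portraits: interest tag -> weight.
PORTRAITS: Dict[str, Dict[str, float]] = {
    "voiceprint_1": {"science fiction": 0.7, "comedy": 0.3},  # member 001
    "voiceprint_2": {"war": 0.6, "romance": 0.4},             # member 002
}


def recommend_for_voiceprint(voiceprint_id: str,
                             catalog: Dict[str, str],
                             top_n: int = 2) -> List[str]:
    """Rank catalog items (name -> tag) by the matched portrait's weights."""
    portrait = PORTRAITS.get(voiceprint_id, {})
    scored = [(portrait.get(tag, 0.0), name) for name, tag in catalog.items()]
    # Keep only items whose tag the matched user is interested in.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:top_n]]


catalog = {"film_a": "science fiction", "film_b": "war", "film_c": "comedy"}
for_member_1 = recommend_for_voiceprint("voiceprint_1", catalog)
for_member_2 = recommend_for_voiceprint("voiceprint_2", catalog)
```

Because each voiceprint resolves to its own portrait, the two members receive different lists from the same catalog, which is the effect the per-member portrait is meant to achieve.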
(C) Knowledge-graph association recommendation based on the preceding context information and the entities in the current information;
in some embodiments, a target entity may be determined based on the input information and user context information of the input information, and at least one recommended resource associated with the target entity may be determined based on the knowledge-graph.
The knowledge graph is a semantic network that reveals the relations between entities; a knowledge graph is formed by establishing relations between entities on the basis of information. By effectively processing and integrating intricate information, the knowledge graph converts it into simple and clear triples of entity, relation, and entity.
Briefly, in the above embodiments, the entity (which may be referred to as a target entity) involved in the input information and the preceding context information of the user may be determined, and at least one recommended resource related to the entity may be determined from a pre-formed knowledge graph.
In some embodiments, an example of a knowledge-graph association recommendation based on the preceding context information and the entities in the current information is provided, assuming that the human-machine dialog is as follows:
Human: Who are the lead actors of "Ruyi's Royal Love in the Palace"?
Machine: The lead actors of "Ruyi's Royal Love in the Palace" are Zhou Xun and Huo Jianhua.
Human: I like Zhou Xun.
Machine: Zhou Xun is not only an excellent actress but also a good singer.
Human: I have not heard her songs.
Machine: wo, No root wild grass, and Wo He listen to the Wo Xun singing in Zhou Xun! .
As can be seen from the man-machine conversation, the preceding context of the conversation with the user refers to the entity Zhou Xun, so song resources related to the entity Zhou Xun can be obtained from the knowledge graph and recommended to the user. For example, the song "wave" sung by Zhou Xun is recommended to the user, as illustrated in the man-machine conversation above.
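The entity lookup in this strategy can be sketched in Python with a knowledge graph stored as (entity, relation, entity) triples. The triples, entity names, relation names, and both function names are placeholders invented for this sketch, and entity detection is reduced to a simple substring match, which is an assumption rather than the method of the disclosure.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Hypothetical knowledge graph.
KNOWLEDGE_GRAPH: List[Triple] = [
    ("series_x", "lead_actor", "actor_a"),
    ("series_x", "lead_actor", "actor_b"),
    ("actor_a", "sings", "song_1"),
    ("actor_a", "sings", "song_2"),
]


def find_target_entity(context: List[str],
                       known_entities: List[str]) -> Optional[str]:
    """Return the most recently mentioned known entity in the dialog."""
    for utterance in reversed(context):
        for entity in known_entities:
            if entity in utterance:
                return entity
    return None


def related_resources(entity: str, relation: str) -> List[str]:
    """Collect the tail entities linked to `entity` by `relation`."""
    return [tail for head, rel, tail in KNOWLEDGE_GRAPH
            if head == entity and rel == relation]


context = ["who stars in series_x", "I like actor_a",
           "I have not heard her songs"]
target = find_target_entity(context, ["series_x", "actor_a", "actor_b"])
songs = related_resources(target, "sings") if target else []
```

The last utterance names no entity, so the target is taken from the preceding context, mirroring how the dialog above resolves "her songs" to the previously mentioned person.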
(D) Recommendation based on new and popular resources (release time, search index, rating);
in some embodiments, resources with a search index greater than or equal to a preset index may be recommended, that is, the at least one recommended resource needs to satisfy the search index greater than or equal to the preset index.
In some embodiments, a resource whose resource evaluation parameter is greater than or equal to a preset evaluation parameter may be recommended, that is, the at least one recommended resource needs to satisfy that the resource evaluation parameter is greater than or equal to the preset evaluation parameter. The resource evaluation parameter may be an evaluation parameter given by a large number of users who have played the resource.
In some embodiments, resources with a search index greater than or equal to a preset index and a resource evaluation parameter greater than or equal to a preset evaluation parameter may also be recommended.
(E) A region-based recommendation;
in some embodiments, resources associated with the target region may be recommended. The target region may be the region where the user is located, or a region the user has paid attention to. For example, for a user located in "Beijing", a movie associated with the region information "Beijing" may be recommended.
(F) Time-based recommendations.
In some embodiments, resources whose online time is within a preset time range may be recommended, that is, the at least one recommended resource needs to satisfy that its online time is within the preset time range. When the preset time range is the period closest to the current time, newly launched resources are recommended. For example, if the preset time range is within one month of the current time, resources that went online within the last month may be recommended.
Further, in addition to the recommendation strategies mentioned in the above embodiments, resources that the user has not played and/or resources that have not yet been recommended to the user may be recommended to the user.
It should be noted that the above recommendation strategies may be used alone or in any combination, which is not limited in the embodiments of the present invention. By fusing multiple recommendation strategies, resources can be recommended accurately and the content of the conversation is taken into account during recommendation, which improves recommendation quality and precision.
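As one illustration of combining strategies, the Python sketch below applies the search-index, evaluation-parameter, online-time, not-played, and not-recommended conditions together. The `Resource` fields, thresholds, and sample data are hypothetical; the disclosure allows any subset of the conditions, so requiring all of them at once is only one possible combination.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Set


@dataclass
class Resource:
    name: str
    search_index: float
    rating: float          # stands in for the resource evaluation parameter
    online_time: datetime


def filter_candidates(resources: Iterable[Resource], now: datetime,
                      min_index: float, min_rating: float,
                      max_age: timedelta,
                      played: Set[str], recommended: Set[str]) -> List[str]:
    """Keep resources that satisfy every condition in this sketch."""
    kept: List[str] = []
    for res in resources:
        if res.search_index < min_index or res.rating < min_rating:
            continue  # not popular or not rated highly enough
        if not (now - max_age <= res.online_time <= now):
            continue  # outside the preset time range
        if res.name in played or res.name in recommended:
            continue  # already played by or recommended to the user
        kept.append(res.name)
    return kept


now = datetime(2021, 9, 1)
pool = [
    Resource("a", 90, 8.5, datetime(2021, 8, 20)),
    Resource("b", 95, 9.0, datetime(2020, 1, 1)),   # too old
    Resource("c", 40, 9.0, datetime(2021, 8, 22)),  # low search index
    Resource("d", 92, 8.8, datetime(2021, 8, 25)),  # already played
]
selected = filter_candidates(pool, now, 80, 8.0, timedelta(days=30),
                             played={"d"}, recommended=set())
```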
504. The server determines a target recommended resource from the at least one recommended resource.
505. The server inputs the identification information of the target recommended resource into the target language generation model, and obtains a first recommended language output by the target language generation model.
The target language generation model is obtained by training a preset language generation model according to sample information, and the sample information comprises: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each identification information.
The identification information of the target recommended resource may be a name, related producer information, a type, a brief introduction, comment information, and the like of the target recommended resource. For example, the song is named "Dong Feng Po", and the singer is Jay Chou (Zhou Jielun).
In some embodiments, the identification information of the target recommended resource may be included in the knowledge-graph information of the target recommended resource.
Before the server executes the above 505, recommendation related factors may also be obtained, and the recommendation related factors include, but are not limited to, at least one of the following:
1) a conversation scene where the current user is located;
2) a user emotion corresponding to the input information;
3) a user representation of a user;
4) preference information from the preceding context in the current conversation scene;
5) a knowledge graph of the target recommended resource;
in some embodiments, the server may input both the recommendation related factor and the identification information of the target recommended resource to the target language generation model, and obtain the first recommended language of the target recommended resource output by the target language generation model.
In some embodiments, the identification information of a plurality of recommended resources and the standard recommended language corresponding to each piece of identification information can be obtained from the sample information, and the preset language generation model is trained one or more times according to them to obtain the target language generation model. Specifically, the following steps a to c may be performed one or more times to obtain the target language generation model:
a. inputting identification information of a first recommended resource in the plurality of recommended resources into the preset language generation model;
b. acquiring a recommended language of the first recommended resource output by the preset language generation model;
c. updating the preset language generation model according to the recommended language of the first recommended resource and a first standard recommended language.
In some embodiments, inputting the identification information of the target recommended resource into the target language generation model and obtaining the first recommended language output by the target language generation model may include: inputting the identification information of the target recommended resource into the target language generation model, and acquiring a second recommended language output by the target language generation model, wherein the second recommended language corresponds to the target recommended resource; if the fluency parameter of the second recommended language is detected to be smaller than the preset fluency parameter, determining a third recommended language from the stored recommended languages as the first recommended language, wherein the similarity parameter between the third recommended language and the second recommended language is greater than the preset similarity parameter; and if the fluency parameter of the second recommended language is detected to be greater than or equal to the preset fluency parameter, taking the second recommended language as the first recommended language.
That is to say, when the fluency parameter of the second recommended language is detected, a language model service may be invoked for the generated second recommended language, and a score of the second recommended language may be obtained to evaluate the fluency of the generated sentence, where the language model service is used to evaluate the fluency of the sentence. If the score of the second recommended language is larger than or equal to a preset threshold (namely a preset fluency parameter), the fluency of the second recommended language can be considered to reach the standard and is used as the first recommended language, and if the fluency is not reached, the recommended language with the highest similarity with the second recommended language can be determined from the template recommended languages in the database and is used as the first recommended language.
In some embodiments, the doc2vec algorithm may be used to compute the text vector of the generated sentence and its cosine similarity to the template recommended languages in the database. If a template recommended language whose similarity parameter (namely cosine similarity) is greater than the preset similarity parameter is retrieved, it replaces the generated recommended language as the final first recommended language; if no such template is found, one template may be randomly selected, according to the resource type, from the template recommended languages in the database as the final first recommended language.
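The fluency check and template fallback can be sketched in Python as below. Several parts are assumptions made to keep the sketch self-contained: the fluency scorer is passed in as a callable standing in for the language model service, a bag-of-words cosine similarity replaces the doc2vec text vectors named in the text, the random fallback ignores the resource type, and all names and thresholds are hypothetical.

```python
import math
import random
from collections import Counter
from typing import Callable, List


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for doc2vec vectors)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[word] * vb[word] for word in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def choose_recommendation(generated: str,
                          fluency: Callable[[str], float],
                          templates: List[str],
                          min_fluency: float,
                          min_similarity: float,
                          rng: random.Random) -> str:
    """Pick the first recommended language from the generated one or a template."""
    if fluency(generated) >= min_fluency:
        return generated  # fluent enough: use the generated sentence
    best = max(templates, key=lambda t: cosine(generated, t))
    if cosine(generated, best) > min_similarity:
        return best       # most similar stored template
    return rng.choice(templates)  # nothing similar enough: random template


templates = ["try this classic song tonight", "a film you may enjoy"]
fluent = choose_recommendation("enjoy this song", lambda s: 0.9,
                               templates, 0.5, 0.5, random.Random(0))
fallback = choose_recommendation("song this try classic classic",
                                 lambda s: 0.1,
                                 templates, 0.5, 0.5, random.Random(0))
```

Passing the random generator in explicitly keeps the fallback reproducible when testing.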
In order to more clearly describe the process of training the target language generation model and the process of generating the recommended language based on the target language generation model in the embodiments of the present disclosure, the following description is made with reference to fig. 7, which is a schematic diagram of generating the recommended language through the target language generation model in the embodiments of the present disclosure. In the training process, the target language generation model can be obtained through end-to-end training by combining the conversation scene, the user emotion, the recommendation basis, the knowledge graph information of the recommended resources, and the like.
For example, in the process of obtaining the recommended language through the target language generation model, the input information may include the following:
Identification information of the target recommended resource: the name of the target resource. For example, fig. 7 shows "blue and white porcelain".
Conversation scene: personal or public. Whether the current conversation scene is a public conversation scene or a personal conversation scene can be determined through voiceprint information. If voice input corresponding to only one piece of voiceprint information is received within a period of time, the conversation scene is a personal conversation scene; if voice inputs corresponding to two or more pieces of voiceprint information are received within a period of time, the conversation scene is a public conversation scene. The conversation scene is not currently considered in fig. 7, and this input may be added in practical applications.
User emotion: user emotion takes several basic forms, and the emotion of the user can be determined by identifying the intonation and the speech rate of the user's voice input, the input content, and the like. For example, the user emotion identified in fig. 7 is "sadness".
Recommendation basis: this includes the user portrait of the user, which can specifically be an interest tag in the user portrait, and the preceding-context preference information in the current conversation scene, for example, the preference entities and preference attribute information mentioned earlier by the user in the current conversation scene. The recommendation basis in fig. 7 is "from worry".
Knowledge graph information of the recommended resource: structured data related to the resource. Illustratively, the knowledge graph for a music resource may include, but is not limited to, one or more of the following: music title, singer, composition, release time, music style, climax lyrics, and music appreciation. The knowledge graph information of the recommended resource in fig. 7 may include: the music title "blue and white porcelain", the singer "zhou jilun", the composition "chinese hill", the release time "2007", the music style "chinese wind", the climax lyrics "sky blue and so on smoke and rain but i waits you", and the music appreciation "as if the blue olive can slowly get back to the taste in the mouth".
The input information can be arranged into a structured information format required by the model, and is input into the trained seq2seq model (i.e. the target language generation model in the embodiment of the present disclosure) to obtain the generated recommended language.
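As an illustration of arranging these inputs into a structured format, the sketch below collects them in a small data class and serializes them into a single source sequence. The field names, the `key=value` layout, and the `|` separator are all assumptions made for this example; the disclosure does not specify the format the model requires. The helper `conversation_scene` follows the voiceprint rule described above, and treating an empty window as a personal scene is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def conversation_scene(voiceprint_ids: List[str]) -> str:
    """Classify the scene from the voiceprints heard in a time window.

    Two or more distinct voiceprints mean a public scene; otherwise personal.
    An empty window is treated as personal (assumption).
    """
    return "public" if len(set(voiceprint_ids)) >= 2 else "personal"


@dataclass
class GenerationInput:
    """Hypothetical container for the inputs of the language generation model."""
    resource_name: str
    scene: Optional[str] = None
    emotion: Optional[str] = None
    recommendation_basis: Optional[str] = None
    knowledge_graph: Dict[str, str] = field(default_factory=dict)

    def to_source_sequence(self) -> str:
        """Flatten the fields into one string; optional fields are skipped if unset."""
        parts = [f"name={self.resource_name}"]
        if self.scene:
            parts.append(f"scene={self.scene}")
        if self.emotion:
            parts.append(f"emotion={self.emotion}")
        if self.recommendation_basis:
            parts.append(f"basis={self.recommendation_basis}")
        for key, value in self.knowledge_graph.items():
            parts.append(f"kg.{key}={value}")
        return " | ".join(parts)
```

For the fig. 7 example, an instance might be built with `resource_name="blue and white porcelain"`, `emotion="sadness"`, and a `knowledge_graph` dictionary holding the singer, release time, and music style.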
The seq2seq model is one kind of encoder-decoder structure, in which an encoder encodes the input information and a decoder decodes the encoded information to generate the recommended language. An attention mechanism is also used to enhance the generation effect of the model. Further, in order to ensure that rare words appearing in names such as movie names, person names, and song names can be correctly output, a copy mechanism can be used to copy the rare words directly into the generated recommended language, that is, the rare words bypass semantic generation and are copied verbatim.
In fig. 7, the target language generation model generates the recommended language autoregressively, one word after another. After the words "listen" and "first" have been generated, the target language generation model may compute a weighted sum of the probability scores of the words in the attention distribution and the probability scores of the words in the word list distribution to obtain a final probability distribution, and take the word corresponding to the maximum probability value in the final probability distribution as the word generated at the current step. A context vector can be obtained from the attention distribution, and the weight of the probability score of a word in the attention distribution and the weight of the probability score of a word in the word list distribution are obtained according to the context vector; in fig. 7, the weight of the probability score of a word in the attention distribution is (1-p), and the weight of the probability score of a word in the word list distribution is p. As shown in fig. 7, the word "chinese wind" corresponds to the maximum probability in the final probability distribution, so the word generated at the current step is "chinese wind". After words are generated one by one in this manner, the finally generated recommended language is "listen to the chinese wind song blue and white porcelain; its taste can be slowly savored, as if a blue olive were in the mouth."
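The weighted sum at one decoding step can be sketched as below. This is a simplified illustration under stated assumptions: the distributions are plain dictionaries and lists rather than tensors, the weight p is passed in directly instead of being computed from the context vector, and the function names and example numbers are hypothetical. Attention mass is added to the source token it points at, which is how a rare word outside the word list can still be produced by copying.

```python
from typing import Dict, List


def final_distribution(vocab_dist: Dict[str, float],
                       attention: List[float],
                       source_tokens: List[str],
                       p: float) -> Dict[str, float]:
    """Mix the word list distribution (weight p) with the attention
    distribution over source tokens (weight 1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("one attention weight is needed per source token")

    # Generation part: scale every word list probability by p.
    final = {word: p * prob for word, prob in vocab_dist.items()}

    # Copy part: add (1 - p) * attention to the attended source token.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p) * weight
    return final


def greedy_pick(dist: Dict[str, float]) -> str:
    """Return the word with the maximum probability."""
    return max(dist, key=dist.get)
```

If both input distributions sum to 1, the mixed distribution also sums to 1, because the weights p and (1-p) sum to 1.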
506. The server sends at least one recommended resource and the first recommended language to the terminal equipment.
In some embodiments, the server can predict the click probability of the at least one recommended resource, sort the at least one recommended resource according to the click probability to obtain ranking information of the at least one recommended resource, and send the ranking information of the at least one recommended resource to the terminal device.
The click probability of the at least one recommended resource can be estimated through at least one of a Deep Interest Network (DIN), a regression model, and the popularity of the resource. In some embodiments, the server may also send random resources other than the at least one recommended resource to the terminal device.
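A minimal sketch of producing the ranking information follows. The click probability predictor is passed in as a callable, since the actual estimator (DIN, a regression model, or popularity) is outside the scope of this example; the function name and the `(rank, resource_id, probability)` tuple layout are assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def rank_by_click_probability(
    resource_ids: Sequence[str],
    predict_click: Callable[[str], float],
) -> List[Tuple[int, str, float]]:
    """Sort resources by predicted click probability, highest first.

    Returns (rank, resource_id, probability) tuples with ranks starting at 1.
    """
    scored = [(rid, predict_click(rid)) for rid in resource_ids]
    # Python's sort is stable, so ties keep their original relative order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(rank, rid, prob) for rank, (rid, prob) in enumerate(scored, start=1)]
```

A popularity table can serve as the simplest predictor, for example `rank_by_click_probability(ids, popularity.__getitem__)` with a hypothetical `popularity` dictionary.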
507. The terminal equipment outputs at least one recommended resource and a first recommended language.
In the resource recommendation method provided by the embodiments of the present disclosure, when the terminal device receives an input of the user, it can send input information corresponding to the input to the server. The server can determine at least one recommended resource according to the input information, generate a recommended language for a target recommended resource among the at least one recommended resource, and send the recommended language of the target recommended resource to the terminal device. In the embodiments of the present disclosure, the recommended language is generated in real time by inputting the identification information of the target recommended resource into the target language generation model, and the target language generation model is obtained by training according to the identification information of a plurality of recommended resources and the standard recommended language corresponding to each piece of identification information. The generated recommended language can therefore be unique to the current recommended resource and, compared with the prior-art approach of using fixed, stored recommended languages, can be more flexible and richer.
In some embodiments, the server sends an output mode indication to the terminal device through the communicator. The output mode indication is used for indicating that the at least one recommended resource is output through an interface display mode and that the first recommended language is output through the interface display mode and/or a voice mode. Correspondingly, after the terminal device receives the output mode indication sent by the server, the at least one recommended resource and the first recommended language can be output according to the output mode indication.
The output mode is crucial to a recommendation system. Traditional recommendation display is based on text and pictures and is not vivid enough. For example, after a prompt such as "the following movie and television contents are recommended for you", posters of the media assets are densely packed in the interface, so that the user experience is poor. The output mode in the embodiments of the present disclosure can display the recommendation result through a dialogue interaction mode and an interface display mode and provide a recommendation reason, which explains why the resources are recommended to the user, so that the user's trust can be gained and the user can accept the recommendation result more easily.
One way of interpreting the recommendation is to explain the basis of the recommendation to the user based on a preset template, such as explaining the related tag information of the user portrait on which the recommendation is based to the user.
Optionally, the recommendation interpretation may be included in the generated first recommendation, or may be independent of the voice content and/or text content of the first recommendation. The recommended explanation can be displayed by means of displaying text on the interface and/or played by means of voice.
In order to more clearly illustrate the resource recommendation process in the embodiment of the present disclosure, as shown in fig. 8, an implementation diagram of the recommendation process is provided in the embodiment of the present disclosure. As can be seen from fig. 8, the recommendation process is divided into four phases: recalling, filtering, sorting and post-processing.
During the recall phase, relevant resources may be retrieved from a resource library based on the multi-recall strategy and the current scenario. The multi-recall strategy consists of the above recommendation strategies (a) to (E): query-rewrite based recommendations, voiceprint user portrait based recommendations, knowledge graph based recommendations, new hot resource based recommendations, geography based recommendations, and time based recommendations. The recommendation based on query rewrite can be realized based on mapping pairs of erroneous queries and correct queries stored in the data warehouse in the computation and storage center in fig. 8. The mapping pairs can be mined from user log data in a data source, and the mining process can be realized based on a Spark computing platform.
In the filtering stage, a large number of resources recalled in the recalling stage can be screened, and a proper number of resources can be selected. For example, resources that have been viewed by the user, as well as resources that have been recommended to the user, may be filtered out.
In the sorting phase, the sorting strategies include but are not limited to ranking by the user's estimated click probability according to the Deep Interest Network (DIN), a regression model, and the popularity of the resource. The models used in the sorting phase, such as DIN and regression models, can be trained based on a TensorFlow training platform. Sample data for training may be stored in the data warehouse in fig. 8 and may be obtained by manually labeling data in the data source shown in fig. 8.
In the post-processing stage, the sorted resources may be used as the resources in a resource library (which may be the resource library of the basic data portion in fig. 8) to be recommended to the user. The top-ranked recommended resources may be presented, and at least one new hot resource is added. These new hot resources do not necessarily have to conform to the user's interest preferences, and random resources outside the preferences may also be added to avoid limiting the user's view and to increase the diversity and novelty of the recommendations. Furthermore, a corresponding presentation form can be set for each resource, and various presentation forms can be adopted. For example, the recommended language may be presented to the user in a dialog form through an interface display form combined with voice interaction. The recommended language may be generated by a recommendation interpretation model (i.e., the target language generation model in the above embodiments). The recommendation interpretation model may be trained based on a TensorFlow training platform in the computation and storage center, and the sample data required as input in the training process may be obtained from the data warehouse in the computation and storage center.
The basic data in fig. 8 may provide data support for the various phases described above. The user behavior data is historical and preceding-context behavior data, from which user preferences, i.e., the user portrait, are modeled. Since a terminal device (such as a television) may be shared by a plurality of members, and the interest portrait of each member is different, the members need to be distinguished by means of voiceprints, and the preference of each member, namely the voiceprint user portrait, is modeled and stored in the voiceprint portrait information. The resources that have been recommended to the user may be recorded in the historical recommended resources, so that they are not recommended repeatedly.
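The four phases (recall, filter, sort, post-process) can be sketched end to end as below. This is an illustrative outline only: the recall strategies and the scoring function are passed in as hypothetical callables, resources are represented by plain string identifiers, and the exact rules for adding a new hot resource and a random resource are assumptions made for the example.

```python
import random
from typing import Callable, Iterable, List, Sequence, Set


def recommend(recall_strategies: Sequence[Callable[[], Iterable[str]]],
              watched: Set[str],
              already_recommended: Set[str],
              score: Callable[[str], float],
              new_hot: Sequence[str],
              catalog: Sequence[str],
              top_k: int = 3,
              rng=random) -> List[str]:
    # Recall: merge the outputs of all strategies, dropping duplicates.
    recalled: List[str] = []
    seen: Set[str] = set()
    for strategy in recall_strategies:
        for rid in strategy():
            if rid not in seen:
                seen.add(rid)
                recalled.append(rid)

    # Filter: remove resources already watched or already recommended.
    candidates = [r for r in recalled
                  if r not in watched and r not in already_recommended]

    # Sort: highest score (e.g. predicted click probability) first.
    ranked = sorted(candidates, key=score, reverse=True)

    # Post-process: keep the top results, then add one new hot resource
    # and one random resource that no recall strategy returned.
    result = ranked[:top_k]
    hot = next((r for r in new_hot if r not in result), None)
    if hot is not None:
        result.append(hot)
    outside = [r for r in catalog if r not in seen and r not in result]
    if outside:
        result.append(rng.choice(outside))
    return result
```

Keeping each phase as a separate step makes it straightforward to swap in a different recall strategy or ranking model without touching the rest of the flow.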
In some embodiments, in the process of generating the recommended words, not only the identification information of the target recommended resources but also information such as a conversation scene, a recommendation basis, user emotion, a knowledge graph and the like are considered, so that the generated recommended words can meet scene requirements, and are rich, vivid and more intelligent.
In some embodiments, the embodiments of the present disclosure further provide a resource recommendation system, which includes the server in the embodiments of the present disclosure and the terminal device in the embodiments of the present disclosure.
In some embodiments, the disclosed embodiments also provide a computer-readable storage medium comprising: the computer-readable storage medium stores thereon a computer program which, when executed by a processor, implements a resource recommendation method performed by a server.
In some embodiments, the disclosed embodiments also provide a computer-readable storage medium comprising: the computer-readable storage medium stores thereon a computer program which, when executed by a processor, implements a resource recommendation method performed by a terminal device.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

1. A server, comprising:
a communicator configured to receive input information transmitted by a terminal device;
a controller configured to:
determining at least one recommended resource according to the input information received by the communicator, determining a target recommended resource from the at least one recommended resource, inputting identification information of the target recommended resource to a target language generation model, acquiring a first recommended language output by the target language generation model, and sending the at least one recommended resource and the first recommended language to the terminal equipment through the communicator; the target language generation model is obtained by training a preset language generation model according to sample information, wherein the sample information comprises: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each identification information.
2. The server of claim 1, wherein the controller is further configured to:
obtaining recommendation related factors, wherein the recommendation related factors comprise at least one of the following:
a conversation scene where the current user is located;
a user emotion corresponding to the input information;
a user representation of the user;
the above preference information in the current conversation scene;
a knowledge graph of the target recommended resource;
the controller is configured to: input the recommendation related factors and the identification information of the target recommended resource into the target language generation model, and acquire the first recommended language output by the target language generation model.
3. The server according to claim 1 or 2, wherein the controller is configured to:
determining a correct resource name which is stored corresponding to the input information according to the input information, and determining the at least one recommended resource based on the correct resource name;
and/or,
the input information comprises voiceprint information, a user profile matching the voiceprint information is determined, and the at least one recommended resource is determined based on the user profile;
and/or,
according to the input information and the user's information, determining a target entity, and determining the at least one recommended resource associated with the target entity based on a knowledge graph.
4. The server according to claim 1 or 2, wherein the at least one recommended resource satisfies at least one of the following conditions:
the search index is greater than or equal to a preset index;
the resource evaluation parameter is greater than or equal to a preset evaluation parameter;
the online time is in a preset time range;
associated with a target zone;
the user has not played;
not recommended to the user.
5. The server according to claim 1 or 2, wherein the controller is further configured to:
predicting the click probability of the at least one recommended resource;
sequencing the at least one recommended resource according to the click probability to obtain sequencing information of the at least one recommended resource; and sending the sequencing information of the at least one recommended resource to the terminal equipment through the communicator.
6. The server according to claim 1 or 2, wherein the controller is further configured to: sending, by the communicator, a random resource other than the at least one recommended resource to the terminal device.
7. The server according to claim 1 or 2, wherein the controller is configured to:
acquiring a second recommended language output by the target language generation model, wherein the second recommended language corresponds to the target recommended resource;
if the fluency parameter of the second recommended language is smaller than a preset fluency parameter, determining a third recommended language as the first recommended language from the stored recommended languages, wherein the third recommended language is a stored recommended language whose similarity parameter with the second recommended language is larger than a preset similarity parameter;
and if the fluency parameter of the second recommended language is detected to be larger than or equal to the preset fluency parameter, taking the second recommended language as the first recommended language.
8. The server according to claim 1 or 2, wherein the controller is further configured to:
sending an output mode indication to the terminal equipment through the communicator; the output mode indication is used for indicating that the at least one recommended resource is output through an interface display mode and indicating that the first recommended language is output through an interface display mode and/or a voice mode.
9. A terminal device, comprising:
a user interface configured to receive an input of a user;
an output interface configured to output user interaction information;
a communicator for communicating with a server;
a controller configured to: in response to the input received by the user interface, controlling the communicator to send input information corresponding to the input to a server, the server is configured to determine at least one recommended resource according to the input information received by the communicator, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource to a target language generation model, and acquire a first recommended language output by the target language generation model, receiving the at least one recommended resource and the first recommended language sent by the server through the communicator, controlling the output interface to output the at least one resource to be recommended and the first recommended language, the target language generation model is obtained by training a preset language generation model according to sample information, wherein the sample information comprises: identification information of a plurality of recommended resources, and a standard recommended language corresponding to each identification information.
10. The terminal device of claim 9, wherein the controller is further configured to:
receiving, by the communicator, ranking information of the at least one recommended resource sent by the server; the ranking information is determined according to the user preference information of the user.
11. The terminal device of claim 9, further configured to:
receiving, by the communicator, a random resource other than the at least one recommended resource sent by the server.
12. The terminal device of claim 9, wherein the controller is further configured to:
receiving an output mode indication sent by the server through the communicator; the output mode indication is used for indicating that the at least one recommended resource is output in an interface display mode and indicating that the first recommended language is output in an interface display mode and/or a voice mode;
the output interface includes: a display and/or an audio output interface;
the controller is configured to:
controlling the display to output the at least one recommended resource and the first recommended language in an interface display mode;
and/or,
and controlling the display to output the at least one recommended resource in an interface display mode, and controlling the audio output interface to output the first recommended language in a voice mode.
13. A resource recommendation method, comprising:
receiving input information sent by terminal equipment;
determining at least one recommended resource according to the input information; determining a target recommended resource from the at least one recommended resource, inputting identification information of the target recommended resource into a target language generation model, and acquiring a first recommended language output by the target language generation model; and sending the at least one recommended resource and the first recommended language to the terminal equipment.
14. A resource recommendation method, comprising:
responding to input of a user, sending input information corresponding to the input to a server, wherein the server is configured to determine at least one corresponding recommended resource according to the input information, determine a target recommended resource from the at least one recommended resource, input identification information of the target recommended resource to a target language generation model, and acquire a first recommended language output by the target language generation model, the target language generation model is a model obtained by training a preset language generation model according to sample information, and the sample information includes: identification information of a plurality of recommended resources and a standard recommended language corresponding to each identification information;
and receiving the at least one recommended resource and the first recommended language sent by the server, and controlling the output interface to output the at least one resource to be recommended and the first recommended language.
15. A resource recommendation system, comprising: a server according to any one of claims 1 to 8 and a terminal device according to any one of claims 9 to 12.
16. A computer-readable storage medium, comprising: the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the resource recommendation method according to claim 13 or 14.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111104739.3A CN113938755A (en) 2021-09-18 2021-09-18 Server, terminal device and resource recommendation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111104739.3A CN113938755A (en) 2021-09-18 2021-09-18 Server, terminal device and resource recommendation method

Publications (1)

Publication Number Publication Date
CN113938755A true CN113938755A (en) 2022-01-14

Family

ID=79276203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111104739.3A Pending CN113938755A (en) 2021-09-18 2021-09-18 Server, terminal device and resource recommendation method

Country Status (1)

Country Link
CN (1) CN113938755A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114999611A (en) * 2022-07-29 2022-09-02 支付宝(杭州)信息技术有限公司 Model training and information recommendation method and device
CN114999611B (en) * 2022-07-29 2022-12-20 支付宝(杭州)信息技术有限公司 Model training and information recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination