CN113574525A - Media content recommendation method and equipment - Google Patents

Media content recommendation method and equipment

Info

Publication number
CN113574525A
Authority
CN
China
Prior art keywords
information
user
evaluation
media content
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980094051.6A
Other languages
Chinese (zh)
Inventor
曹秋枫
丁送星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN113574525A publication Critical patent/CN113574525A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40Data acquisition and logging

Abstract

Embodiments of the present application provide a method and a device for recommending media content. The method includes: while media content is being played, acquiring reaction state information of a user, where the reaction state information includes at least one of the following types of information: image information of the user acquired by an image acquisition device, or sound information of the user acquired by a sound acquisition device; and obtaining, according to the reaction state information, the user's evaluation information for the media content, where the evaluation information serves as a basis for recommending other media content to the user. With this embodiment, whether the user is interested in the currently played media content can be determined accurately from the user's image information or sound information, so programs the user is interested in can be recommended to the user, which improves the accuracy of media content recommendation.

Description

Media content recommendation method and equipment
Technical Field
The present application relates to the field of multimedia technologies, and in particular, to a method and an apparatus for recommending media content.
Background
With the development of computer technology, intelligent multimedia playback devices offer increasingly rich functions, more and more content providers serve these devices, and the amount of playable content keeps growing.
In the prior art, when a multimedia device recommends media content to a user, the recommendation is usually based on the user's playback history. Specifically, the user logs in to the multimedia device with an account, the multimedia device obtains the playback history associated with that account, finds similar media content according to the history, and recommends it to the user. After the user watches the content, the multimedia device adds it to the playback history, so that the next recommendation is made from the updated history.
However, recommending programs from the playback history alone cannot recommend media content to the user accurately.
Disclosure of Invention
The embodiment of the application provides a media content recommendation method and device, so that media content can be accurately recommended to a user.
In a first aspect, an embodiment of the present application provides a media content recommendation method: while media content is being played, reaction state information of a user is acquired, where the reaction state information includes at least one of the following types of information: image information of the user acquired by an image acquisition device, or sound information of the user acquired by a sound acquisition device; then, the user's evaluation information for the media content is obtained according to the reaction state information, where the evaluation information serves as a basis for recommending other media content to the user.
In this process, whether the user is interested in the currently played media content can be determined accurately from the user's image information or sound information, and the obtained evaluation information is then used as the basis for recommending other media content, so the other media content recommended to the user is content the user is interested in, which improves the accuracy of media content recommendation.
In a possible implementation, the image information includes one or more images acquired by the image acquisition device within a preset time period, either at irregular intervals, continuously and without interruption, or at the first image acquisition frequency, where the preset time period is the period during which preset content of the media content is played.
By providing several optional acquisition modes within the preset time period, the acquired images can be used in a variety of different scenarios, which improves the applicability of the image acquisition device.
In a possible implementation, the image information further includes one or more images acquired by the image acquisition device based on the second image acquisition frequency in other time periods than the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
Acquiring images at the second, lower image acquisition frequency outside the preset time period improves processing efficiency and saves storage space on the multimedia device.
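For illustration only, the following Python sketch shows one way the two-frequency capture policy described above could be expressed. It is not taken from the patent; the period boundaries and the two interval values are assumptions.

```python
# Hypothetical two-frequency image capture policy: capture more often while
# preset content of the media item is playing, less often otherwise.
PRESET_PERIODS = [(120.0, 150.0), (610.0, 640.0)]  # assumed preset-content periods (seconds)
FIRST_CAPTURE_INTERVAL = 0.5    # assumed high-frequency interval inside a preset period
SECOND_CAPTURE_INTERVAL = 5.0   # assumed low-frequency interval in other time periods


def in_preset_period(position_s: float) -> bool:
    """Return True if the playback position falls inside a preset-content period."""
    return any(start <= position_s <= end for start, end in PRESET_PERIODS)


def capture_interval(position_s: float) -> float:
    """First (higher) acquisition frequency inside preset periods, second (lower) elsewhere."""
    return FIRST_CAPTURE_INTERVAL if in_preset_period(position_s) else SECOND_CAPTURE_INTERVAL
```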
In a possible implementation manner, the sound information includes one or more pieces of sound collected by the sound collection device in a preset time period, the preset time period is a time period for playing preset content in the media content, and the frequency of collecting the sound in the preset time period includes any one of the following: continuous collection, collection according to a preset frequency or collection at irregular intervals.
By providing several optional collection frequencies within the preset time period for acquiring the one or more sound segments, the applicability of the sound acquisition device is improved.
In a possible implementation, before the reaction state information of the user is acquired, authorization information with which the user indicates that the corresponding acquisition device may be turned on is obtained first, specifically:
if the reaction state information includes image information, turning on the image acquisition device according to first authorization information of the user, where the first authorization information indicates that the image acquisition device may be turned on;
if the reaction state information includes sound information, turning on the sound acquisition device according to second authorization information of the user, where the second authorization information indicates that the sound acquisition device may be turned on;
and if the reaction state information includes both image information and sound information, turning on the image acquisition device according to the first authorization information and turning on the sound acquisition device according to the second authorization information.
In this process, the image acquisition device and/or the sound acquisition device is turned on according to the user's authorization information, which ensures that the user's reaction state information is collected only with the user's consent, protects the user's privacy, and improves user experience; moreover, which devices need to be turned on is determined from the reaction state information itself, which saves resources and avoids waste.
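A minimal sketch of this authorization gating is shown below; the class and function names are illustrative assumptions, not the patent's implementation.

```python
from dataclasses import dataclass


@dataclass
class Authorization:
    camera_allowed: bool = False      # first authorization information (image acquisition device)
    microphone_allowed: bool = False  # second authorization information (sound acquisition device)


def devices_to_enable(needs_image: bool, needs_sound: bool, auth: Authorization) -> list:
    """Turn on only the capture devices that are both needed and authorized."""
    enabled = []
    if needs_image and auth.camera_allowed:
        enabled.append("image_acquisition_device")
    if needs_sound and auth.microphone_allowed:
        enabled.append("sound_acquisition_device")
    return enabled  # nothing is enabled without the user's consent
```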
In one possible implementation manner, obtaining the user rating information of the media content according to the reaction state information includes:
if the reaction state information includes image information, obtaining the user's evaluation information for the media content according to the user's facial expression information;
if the reaction state information includes sound information, obtaining the user's evaluation information for the media content according to the user's sound emotion information;
if the reaction state information includes both image information and sound information, obtaining the user's evaluation information for the media content according to the user's facial expression information and sound emotion information;
where the facial expression information is facial expression information of the user while viewing the media content, obtained from the image information, and the sound emotion information is sound emotion information of the user while viewing the preset content, obtained from the sound information.
In the process, the facial expression information of the user and the sound emotion information of the user can intuitively and accurately reflect the state of the user when the user watches the media content, so that the reliability of the acquired evaluation information can be ensured.
In one possible implementation, obtaining user rating information of media content according to facial expression information of a user includes:
if the facial expression information of the user is acquired in a preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to preset content;
if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user on the media content is the score reduction evaluation.
Comparing the user's facial expression information with the standard facial expression information is simple to implement; whether the user's evaluation is a bonus evaluation or a score-reduction evaluation is determined by this comparison, and this bonus/score-reduction scheme effectively improves the efficiency of obtaining evaluation information.
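The comparison-based scoring can be sketched as follows (illustrative only; the expression labels, period identifiers, and score step are assumptions):

```python
# Standard facial expressions predefined for each preset-content period (assumed labels).
STANDARD_EXPRESSION = {"comedy_climax": "happy", "sad_scene": "sad"}


def score_delta(period_id: str, recognized_expression: str, step: int = 1) -> int:
    """+step when the user's expression matches the standard expression for the
    period (bonus evaluation), -step otherwise (score-reduction evaluation)."""
    return step if recognized_expression == STANDARD_EXPRESSION.get(period_id) else -step
```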
In one possible implementation, obtaining user rating information of media content according to facial expression information of a user includes:
if the facial expression information of the user is acquired in other time periods, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
and acquiring the evaluation information of the user on the media content according to the facial expression information and the evaluation mapping table of the user.
In this process, when the user's facial expression information is captured outside the preset time period, the user's evaluation of the media content is obtained in real time from the facial expression information and the evaluation mapping table, so the user's feedback on the program can be determined quickly and effectively, ensuring the accuracy of the evaluation information.
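One possible shape of such an evaluation mapping table is sketched below; the expression labels and score values are assumptions made only for illustration.

```python
# Hypothetical evaluation mapping table: facial expression -> evaluation score.
EVALUATION_MAP = {
    "happy": 2,
    "surprised": 1,
    "neutral": 0,
    "bored": -1,
    "asleep": -2,
}


def evaluate_expression(expression: str) -> int:
    """Look up the evaluation corresponding to a recognized facial expression."""
    return EVALUATION_MAP.get(expression, 0)  # unknown expressions treated as neutral
```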
In one possible implementation manner, obtaining the evaluation information of the media content by the user according to the sound emotion information of the user includes:
acquiring standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content;
if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user on the media content is the score reduction evaluation.
By comparing the user's sound emotion information with the standard sound emotion information, the user's evaluation of the currently played media content is determined, and the user's feedback on the program content within the preset time period is captured accurately, which improves the authenticity and validity of the user's evaluation information.
In a possible implementation manner, after obtaining the reaction state information of the user, the method further includes:
acquiring an identity of at least one user according to the image information or the sound information;
after obtaining the evaluation information of the user on the media content according to the reaction state information, the method further comprises:
and associating the evaluation information of each user on the media content with the identity of the user.
In this process, the evaluation information for the media content is associated with the user's identity, so that subsequent recommendations to that user can be made from the associated information, which improves both the utilization of the evaluation information and the recommendation accuracy.
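A simple sketch of this association is shown below; keeping the data in an in-memory dictionary is an assumption made only for illustration.

```python
from collections import defaultdict

# identity -> {content_id -> accumulated evaluation score}
evaluations_by_user = defaultdict(dict)


def associate_evaluation(user_id: str, content_id: str, score: int) -> None:
    """Accumulate the evaluation of content_id under the user's identity."""
    current = evaluations_by_user[user_id].get(content_id, 0)
    evaluations_by_user[user_id][content_id] = current + score
```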
In one possible implementation, obtaining user rating information of media content according to facial expression information of a user and sound emotion information of the user comprises:
acquiring an identity of at least one user according to the image information;
acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information;
and obtaining, according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, the evaluation information for the media content of the user corresponding to the target identity, where the standard sound emotion information is information predefined according to the preset content.
In this process, the facial expression information is matched against the sound emotion information, so the evaluation information of each user can be determined per identity from that user's own facial expression and sound emotion information, which makes the collection of user evaluation information both comprehensive and targeted.
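The attribution step can be sketched as follows; the emotion labels and the voice-to-expression consistency map are assumptions, not values from the patent.

```python
from typing import Dict, Optional

# Assumed mapping from a detected voice emotion to the facial expression consistent with it.
CONSISTENT_EXPRESSION = {"laughing": "happy", "crying": "sad"}


def attribute_voice_emotion(voice_emotion: str,
                            expressions_by_user: Dict[str, str]) -> Optional[str]:
    """Return the target identity whose facial expression is consistent with the
    detected voice emotion, or None if no recognized viewer matches."""
    expected = CONSISTENT_EXPRESSION.get(voice_emotion)
    if expected is None:
        return None
    for user_id, expression in expressions_by_user.items():
        if expression == expected:
            return user_id
    return None
```

For example, if laughter is heard while one viewer looks happy and another looks neutral, the bonus evaluation is attributed to the viewer who looks happy.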
In one possible implementation, before the media content is played, the method further includes:
acquiring a corresponding relation between the identity of the user and the user characteristics according to the identity and the user characteristics input by the user, wherein the user characteristics comprise one of human faces or voice;
acquiring the identity of at least one user according to the image information or the sound information, wherein the method comprises the following steps:
acquiring the identity of at least one user according to the correspondence between the user's identity and the face, and the face contained in the image information; or
acquiring the identity of at least one user according to the correspondence between the user's identity and the voice, and the voice contained in the sound information.
Because the user's identity and features are entered in advance, matching can be done directly during subsequent media content recommendation without real-time collection and matching, which improves operating efficiency.
In a possible implementation manner, after obtaining the evaluation information of the media content by the user according to the reaction state information, the method further includes:
if the evaluation information is bonus information, updating the user's evaluation information for the media content to obtain updated evaluation information;
and if the evaluation information is score-reduction information, determining whether to update the user's evaluation information for the media content according to reaction state information acquired afterwards.
When the user's evaluation of the media content is a score-reduction evaluation, whether the evaluation information needs to be updated is decided together with the state information acquired after the current state information, which avoids evaluation errors caused by false detections and improves the accuracy of the evaluation information.
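The update rule can be sketched as below; requiring one further negative sample as confirmation is an assumption chosen only to illustrate the "confirm before deducting" idea.

```python
from typing import List


def apply_evaluation(current_total: int, delta: int, later_deltas: List[int]) -> int:
    """Apply a bonus immediately; apply a deduction only if reaction state
    captured afterwards also points to a negative evaluation."""
    if delta > 0:
        return current_total + delta
    confirmed = any(d < 0 for d in later_deltas)
    return current_total + delta if confirmed else current_total
```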
In one possible implementation, before the media content is played, the method further includes:
and identifying the identity of the user, and if the historical evaluation information of the user is acquired according to the identity of the user, determining the media content recommended to the user according to the historical evaluation information.
In the implementation manner, if the historical evaluation information can be acquired according to the identity of the user, the media content recommended to the user is determined according to the historical evaluation information, and the recommendation efficiency and the usability of the historical evaluation information can be effectively improved.
In one possible implementation, after the playing of the media content is finished, the method further includes:
determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises media contents evaluated by the user;
and sending the evaluation list to the media source platform, and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises the media content to be recommended to the user.
In this implementation, by sending the evaluation list determined from the user's evaluation information to the media source platform, the media content to be recommended to the user is content the user is interested in, which improves the accuracy of media content recommendation.
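A hedged sketch of this exchange is given below. The endpoint URL, the payload shape, and the use of the `requests` library are assumptions; the patent does not specify a protocol.

```python
import requests


def exchange_with_media_source_platform(evaluation_list):
    """Send the evaluation list to the media source platform and return the
    recommendation list it sends back (hypothetical REST-style exchange)."""
    response = requests.post(
        "https://media-source.example.com/recommendations",  # hypothetical endpoint
        json={"evaluations": evaluation_list},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("recommendation_list", [])
```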
In one possible implementation, determining the rating list according to the rating information of the user includes:
if a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media content meeting the first preset condition.
In the implementation manner, when there is only one user to be recommended, the media program meeting the first preset condition is recommended, wherein the first preset condition only considers the preference of the current single user, so that the efficiency and accuracy of media program recommendation are improved.
In one possible implementation, determining the rating list according to the rating information of the user includes:
if at least two users to be recommended are identified, determining the media content of which the evaluation information respectively associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media contents which all meet the second preset condition.
When a plurality of users watch media contents simultaneously, the media contents are recommended according to the evaluation information of each user and a second preset condition, wherein the second preset condition considers the common preference of the plurality of users, so that the media contents suitable for the plurality of users can be recommended to improve the user experience.
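The two conditions can be sketched together as follows; the threshold used as the "preset score" is an assumed value.

```python
from typing import Dict, List

PRESET_SCORE = 3  # assumed threshold for "evaluation score higher than the preset score"


def build_evaluation_list(viewers: List[str],
                          evaluations_by_user: Dict[str, Dict[str, int]]) -> List[str]:
    """Single viewer: that viewer's highly rated content (first preset condition).
    Several viewers: only content rated highly by every viewer (second preset condition)."""
    if not viewers:
        return []
    if len(viewers) == 1:
        ratings = evaluations_by_user.get(viewers[0], {})
        return [c for c, s in ratings.items() if s > PRESET_SCORE]
    shared = None
    for user_id in viewers:
        liked = {c for c, s in evaluations_by_user.get(user_id, {}).items() if s > PRESET_SCORE}
        shared = liked if shared is None else shared & liked
    return sorted(shared or [])
```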
In a second aspect, an embodiment of the present application provides a media content recommendation apparatus, which includes an input module and a processing module. The input module is configured to acquire reaction state information of a user while media content is played, where the reaction state information includes at least one of the following types of information: image information of the user acquired by an image acquisition device, or sound information of the user acquired by a sound acquisition device;
and the processing module is used for acquiring the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media contents to the user.
In a possible implementation, the image information includes one or more images acquired by the image acquisition device within a preset time period, either at irregular intervals, continuously and without interruption, or at the first image acquisition frequency, where the preset time period is the period during which preset content of the media content is played.
In a possible implementation, the image information further includes one or more images acquired by the image acquisition device based on the second image acquisition frequency in other time periods than the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
In a possible implementation manner, the sound information includes one or more pieces of sound collected by the sound collection device in a preset time period, the preset time period is a time period for playing preset content in the media content, and the frequency of collecting the sound in the preset time period includes any one of the following: continuous collection, collection according to a preset frequency or collection at irregular intervals.
In a possible implementation, before obtaining the reaction state information of the user, the processing module is further configured to:
if the response state information comprises image information, starting the image acquisition equipment according to first authorization information of the user, wherein the first authorization information is used for indicating the starting of the image acquisition equipment;
if the response state information comprises sound information, starting the sound acquisition equipment according to second authorization information of the user, wherein the second authorization information is used for indicating the starting of the sound acquisition equipment;
and if the reaction state information comprises image information and sound information, starting the image acquisition equipment according to the first authorization information and starting the sound acquisition equipment according to the second authorization information.
In a possible implementation, the processing module is specifically configured to:
if the response state information comprises image information, acquiring evaluation information of the user on the media content according to facial expression information of the user;
if the response state information comprises sound information, acquiring evaluation information of the user on the media content according to the sound emotion information of the user;
if the response state information comprises image information and sound information, acquiring evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user;
the facial expression information is acquired according to the image information, the facial emotion information of the user when the user watches the media content, and the sound emotion information is acquired according to the sound information, and the sound emotion information of the user when the user watches the preset content.
In a possible implementation, the processing module is specifically configured to:
if the facial expression information of the user is acquired in a preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to preset content;
if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In a possible implementation, the processing module is specifically configured to:
if the facial expression information of the user is acquired in other time periods, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
and acquiring the evaluation information of the user on the media content according to the facial expression information and the evaluation mapping table of the user.
In a possible implementation, the processing module is specifically configured to:
acquiring standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content;
if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In a possible implementation manner, after obtaining the reaction state information of the user, the processing module is further configured to:
acquiring an identity of at least one user according to the image information or the sound information;
and according to the reaction state information, after obtaining the evaluation information of the user on the media content, associating the evaluation information of each user on the media content with the identity of the user.
In a possible implementation, the processing module is specifically configured to:
acquiring an identity of at least one user according to the image information;
acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information;
and obtaining the evaluation information of the user corresponding to the target identity mark on the media content according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, wherein the standard sound emotion information is information predefined according to the preset content.
In one possible embodiment, before the media content is played, the processing module is further configured to:
acquiring a corresponding relation between the identity of the user and the user characteristics according to the identity and the user characteristics input by the user, wherein the user characteristics comprise one of human faces or voice;
acquiring the identity of at least one user according to the correspondence between the user's identity and the face, and the face contained in the image information; or
acquiring the identity of at least one user according to the correspondence between the user's identity and the voice, and the voice contained in the sound information.
In one possible implementation, the processing module is further configured to:
after obtaining the evaluation information of the user on the media content according to the reaction state information, if the evaluation information is bonus information, updating the evaluation information of the user on the media content to obtain updated evaluation information;
and if the evaluation information is score-reduction information, determining whether to update the user's evaluation information for the media content according to reaction state information acquired afterwards.
In one possible implementation, the processing module is further configured to:
before the media content is played, the identity of the user is identified, and if the historical evaluation information of the user is obtained according to the identity of the user, the media content recommended to the user is determined according to the historical evaluation information.
In a possible implementation, the apparatus further includes: an output module;
the processing module is further configured to: after the media content is played, determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises the media content evaluated by the user;
the output module is used for: sending an evaluation list to a media source platform;
the input module is further configured to: and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises media contents to be recommended to the user.
In a possible implementation, the processing module is specifically configured to:
if a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media content meeting the first preset condition.
In a possible implementation, the processing module is specifically configured to:
if at least two users to be recommended are identified, determining the media content of which the evaluation information respectively associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media contents which all meet the second preset condition.
In a third aspect, an embodiment of the present application provides a media content recommendation device, including: a processor and a memory;
wherein the processor is configured to invoke a computer program stored in the memory, and perform the following operations:
when media content is played, reaction state information of a user is acquired, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through image acquisition equipment or the sound information of the user is acquired through sound acquisition equipment;
and acquiring the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media contents to the user.
In a possible implementation, the memory is further configured to store the image information of the user acquired by the image acquisition device and/or the sound information acquired by the sound acquisition device.
In a possible implementation, the image information includes one or more images acquired by the image acquisition device within a preset time period, either at irregular intervals, continuously and without interruption, or at the first image acquisition frequency, where the preset time period is the period during which preset content of the media content is played.
In a possible implementation manner, the image information further includes one or more images acquired by the image acquisition device based on the second image acquisition frequency in other time periods except the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
In a possible implementation manner, the sound information includes one or more pieces of sound collected by the sound collection device in a preset time period, the preset time period is a time period for playing preset content in the media content, and the frequency for collecting the sound in the preset time period includes any one of the following: continuous collection, collection according to a preset frequency or collection at irregular intervals.
In one possible implementation, the processor is further configured to:
before acquiring the reaction state information of the user, if the reaction state information comprises image information, starting the image acquisition equipment according to first authorization information of the user, wherein the first authorization information is used for indicating the starting of the image acquisition equipment;
if the response state information comprises sound information, starting the sound acquisition equipment according to second authorization information of the user, wherein the second authorization information is used for indicating the starting of the sound acquisition equipment;
and if the reaction state information comprises image information and sound information, starting the image acquisition equipment according to the first authorization information and starting the sound acquisition equipment according to the second authorization information.
In one possible implementation, the processor is specifically configured to:
if the response state information comprises image information, acquiring evaluation information of the user on the media content according to facial expression information of the user;
if the response state information comprises sound information, acquiring evaluation information of the user on the media content according to the sound emotion information of the user;
if the response state information comprises image information and sound information, acquiring evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user;
the facial expression information is acquired according to the image information, the facial emotion information of the user when the user watches the media content, and the sound emotion information is acquired according to the sound information, and the sound emotion information of the user when the user watches the preset content.
In one possible implementation, the processor is specifically configured to:
if the facial expression information of the user is acquired in a preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to preset content;
if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In one possible implementation, the processor is specifically configured to:
if the facial expression information of the user is acquired in other time periods, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
and acquiring the evaluation information of the user on the media content according to the facial expression information and the evaluation mapping table of the user.
In one possible implementation, the processor is specifically configured to:
acquiring standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content;
if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In one possible implementation, the processor is further configured to:
after the reaction state information of the user is obtained, the identity of at least one user is obtained according to the image information or the sound information;
after obtaining the evaluation information of the user on the media content according to the reaction state information, the method further comprises the following steps:
and associating the evaluation information of each user on the media content with the identity of the user.
In one possible implementation, the processor is specifically configured to:
acquiring an identity of at least one user according to the image information;
acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information;
and obtaining the evaluation information of the user corresponding to the target identity mark on the media content according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, wherein the standard sound emotion information is information predefined according to the preset content.
In one possible implementation, the processor is further configured to:
before media content is played, acquiring a corresponding relation between an identity of a user and a user characteristic according to the identity and the user characteristic input by the user, wherein the user characteristic comprises one of a face or a voice;
acquiring the identity of at least one user according to the image information or the sound information, wherein the method comprises the following steps:
acquiring the identity of at least one user according to the correspondence between the user's identity and the face, and the face contained in the image information; or
acquiring the identity of at least one user according to the correspondence between the user's identity and the voice, and the voice contained in the sound information.
In one possible implementation, the processor is further configured to:
after obtaining the evaluation information of the user on the media content according to the reaction state information, if the evaluation information is bonus information, updating the evaluation information of the user on the media content to obtain updated evaluation information;
and if the evaluation information is score-reduction information, determining whether to update the user's evaluation information for the media content according to reaction state information acquired afterwards.
In one possible implementation, the processor is further configured to:
before the media content is played, the identity of the user is identified, and if the historical evaluation information of the user is obtained according to the identity of the user, the media content recommended to the user is determined according to the historical evaluation information.
In a possible implementation, the device further includes: a communication module;
the processor is further configured to: after the media content is played, determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises the media content evaluated by the user;
the communication module is configured to: and sending the evaluation list to the media source platform, and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises the media content to be recommended to the user.
In one possible implementation, the processor is specifically configured to:
if a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media content meeting the first preset condition.
In one possible implementation, the processor is specifically configured to:
if at least two users to be recommended are identified, determining the media content of which the evaluation information respectively associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media contents which all meet the second preset condition.
In a fourth aspect, an embodiment of the present application provides a terminal device, including: a media content recommender, a camera, and/or a microphone.
The media content recommender is configured to perform the method according to any one of the first aspect and its various possible implementations.
In a fifth aspect, an embodiment of the present application provides a storage medium for storing a computer program, where the computer program, when executed by a computer or a processor, implements the media content recommendation method according to any one of the first aspect.
The media content recommendation method and device provided by the embodiments of the present application include: obtaining a correspondence between the user's identity and user features according to the identity and user features entered by the user, where the user features include one of a face or a voice; acquiring reaction state information of the user while media content is played, where the reaction state information includes at least one of the following types of information: image information of the user acquired by an image acquisition device, or sound information of the user acquired by a sound acquisition device; obtaining the user's evaluation information for the media content according to the reaction state information; and associating each user's evaluation information for the media content with that user's identity. By obtaining the correspondence between identity and features and associating each user's evaluation with the corresponding identity, personalized media content recommendation can subsequently be made for users with different identities, improving recommendation accuracy. Moreover, because the evaluation information is obtained from the user's facial expression information or sound emotion information, it can be obtained in real time from the user's feedback on the media content, which guarantees its authenticity and accuracy.
Drawings
FIG. 1A is a first schematic diagram of a media content recommendation system according to an embodiment of the present application;
FIG. 1B is a second schematic diagram of a media content recommendation system according to an embodiment of the present application;
FIG. 2 is a first flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 3 is a second flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 4 is a third flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 5 is a fourth flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 6 is a fifth flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 7 is a sixth flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 8 is a first schematic flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 9 is a second schematic flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 10 is a first signaling flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 11 is a second signaling flowchart of a media content recommendation method according to an embodiment of the present application;
FIG. 12 is a first schematic structural diagram of a media content recommendation device according to an embodiment of the present application;
FIG. 13 is a second schematic structural diagram of a media content recommendation device according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a hardware structure of a media content recommendation device according to an embodiment of the present application.
Detailed Description
The system architecture and service scenarios described in the embodiments of the present application are intended to illustrate the technical solutions of the embodiments more clearly and do not constitute a limitation on them. A person of ordinary skill in the art will appreciate that, as the system architecture evolves and new service scenarios emerge, the technical solutions provided in the embodiments of the present application remain applicable to similar technical problems.
Fig. 1A is a schematic diagram of a media content recommendation system according to an embodiment of the present application. As shown in fig. 1A, the recommendation system includes: the multimedia device 101 and the media source platform 102, wherein the multimedia device 101 includes an image capturing device 1011 and a sound capturing device 1012.
Specifically, the multimedia device 101 may include, but is not limited to, a Digital Television (DTV), a mobile device, a laptop computer, a peripheral advertisement device, a tablet device, a Personal Digital Assistant (PDA), a smart terminal, a handheld device or a vehicle-mounted device with a wireless connection function, and other portable devices.
In this embodiment, an image capturing device 1011 is disposed on the multimedia device 101, where the image capturing device 1011 may be, for example, a camera, and may also be, for example, a network camera device, and the network camera device may include: a lens, an image sensor, a microprocessor, an image processor, a memory, etc., and the specific implementation of the image capturing device 1011 is not limited herein.
In addition, a sound collection device 1012 is further disposed on the multimedia device. The sound collection device 1012 may include, but is not limited to, a far-field microphone, a digital broadcasting terminal, or a personal digital assistant whose main function is to collect sound. The sound collection device 1012 is equipped with a microphone that can collect sound signals from the surrounding environment; its specific form is not limited in this embodiment.
The media source platform 102 is a platform for providing media content to the multimedia device 101, where the media source platform may be, for example, a platform provided by a different operator, a platform provided by a different video provider, and may also be, for example, a platform storing local video, and the like, which is not limited herein. The media source platforms may provide multimedia files such as audio, video, etc.
The multimedia device 101 interacts with the media source platform 102, where the interaction may be through a wired network, for example, the wired network may include a coaxial cable, a twisted pair, an optical fiber, and the like, and the interaction may also be through a Wireless network, for example, the Wireless network may be a 2G network, a 3G network, a 4G network, or a 5G network, a Wireless Fidelity (WIFI) network, and the like. The specific type or specific form of interaction is not limited herein, and the function of the multimedia device 101 interacting with the media source platform 102 may be implemented.
In another possible implementation, the image capturing device 1011 and the sound capturing device 1012 described above may instead be independent external devices. This is described below with reference to FIG. 1B, which is a second schematic diagram of a media content recommendation system according to an embodiment of the present application. As shown in FIG. 1B, the recommendation system includes: a multimedia device 101, a media source platform 102, an image capturing device 103, and a sound capturing device 104.
The image capturing device 103 and the sound capturing device 104 may be two independent devices, or may also be a coupled device, and are externally connected to the multimedia device 101, where a specific implementation manner of the external connection may be, for example, wired connection, such as connection through cables such as a coaxial cable, a twisted pair, and an optical fiber, or may also be, for example, wireless connection, such as connection through bluetooth, a wireless network, and the like, which is not limited in this embodiment.
To address the prior-art problem that recommending programs from playback history alone cannot accurately recommend media content to a user, an embodiment of the present application provides a media content recommendation method, which is described in detail below with reference to the system shown in FIG. 1A and the flowchart shown in FIG. 2.
Fig. 2 is a first flowchart of a media content recommendation method according to an embodiment of the present application. The execution entity of the present embodiment may be, for example, a multimedia device in the recommendation system. As shown in fig. 2, the method includes:
s201, when the media content is played, obtaining reaction state information of a user, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through the image acquisition device or the sound information of the user is acquired through the sound acquisition device.
Specifically, when the user plays the media content through the multimedia device, the reaction state information of the user can be acquired. The media content may be audio or video, and the implementation manner of the multimedia content is not particularly limited in this embodiment.
The multimedia device may control operation of at least one of the image capture device or the sound capture device while the media content is being played. For example, the image pickup device is controlled to pick up image information of the user. Or, the sound collection device is controlled to collect the sound information of the user. Or, controlling the image acquisition device and the sound acquisition device to work simultaneously to acquire the image information and the sound information of the user.
In this embodiment, the image information may be, for example, picture information in units of frames, or may be, for example, one piece of video information or the like. When the user is located near the multimedia device, the image information includes an image of the user, and the sound information includes a sound of the user.
In a possible implementation manner, for example, the reaction state information of the user may be obtained according to a preset period, and also, for example, the reaction state information of the user may be obtained in real time, that is, the reaction state information of the user is obtained when the action change of the user is detected or the sound information of the user is monitored.
S202, obtaining the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media content to the user.
In this embodiment, the user's rating information for the media content may be, for example, a rating score, such as a value between 1 and 100 indicating the degree of user satisfaction. Alternatively, the rating information may be a preset degree indicator, such as "not interested" or "very interested", used as the user's evaluation information. The specific form of the evaluation information can be chosen according to actual requirements and is not limited here.
If the response state information includes image information or sound information, the evaluation information of the user on the media content can be obtained according to the image information or the sound information.
For image information, feedback on the currently played media content can be derived from the user's image, for example by judging from the image whether the user's face is turned toward the multimedia device or whether the user has fallen asleep; whether the user is interested in the media content is then determined from this feedback, and a bonus or score-reduction evaluation is made accordingly. For example, the media content receives a score-reduction evaluation when the user falls asleep and a bonus evaluation when the user watches it attentively.
For the sound information, the feedback information of the user to the currently played media content may be obtained according to the sound information of the user, for example, whether the user is still watching a program is judged through the sound information, or whether the user has a corresponding sound reaction to the currently played media content is judged through the sound information, such as whether the user makes laughter when the media content is played to a laugh point, and whether the user cries when the media content is played to a tear point. For example, if the user utters laughter at a laugh point, the media content is subjected to score-added evaluation, and if the user does not utter laughter at the laugh point, the media content is subjected to score-subtracted evaluation.
In a specific implementation process, the evaluation information may be acquired only from the image information, or may be acquired only from the sound information. The evaluation information may also be acquired in combination with the image information and the sound information. When the evaluation information is obtained by combining the image information and the sound information, the evaluations of the two may be superimposed, or a weighted manner may be adopted to obtain a comprehensive evaluation.
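By way of a non-limiting illustration only, the following Python sketch shows one way the superposition or weighting mentioned above could be realized; the function name, the weights and the score deltas are assumptions for the example and are not part of the claimed method.

```python
def combine_evaluations(image_delta, sound_delta,
                        image_weight=0.6, sound_weight=0.4, weighted=True):
    """Merge the score adjustments derived from image and sound information.

    image_delta / sound_delta: positive for a bonus evaluation, negative for a
    score-reduction evaluation, None if that modality was not collected.
    """
    if image_delta is None:
        return sound_delta if sound_delta is not None else 0
    if sound_delta is None:
        return image_delta
    if weighted:
        return image_weight * image_delta + sound_weight * sound_delta
    return image_delta + sound_delta      # simple superposition of the two evaluations


# e.g. a smile in the image (+2) combined with laughter in the sound (+3)
print(combine_evaluations(2, 3))                   # weighted comprehensive evaluation
print(combine_evaluations(2, 3, weighted=False))   # 5, superposed evaluation
```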
In the embodiment of the present application, the rating information is used as a basis for recommending other media content to the user, for example, media content with high rating is obtained according to the rating information, and related or same type programs of the media content with high rating are recommended to the user.
The media content recommendation method provided by the embodiment of the application comprises the following steps: when the media content is played, acquiring reaction state information of a user, wherein the reaction state information comprises at least one type of information as follows: the method comprises the steps that image information of a user is acquired through image acquisition equipment or sound information of the user is acquired through sound acquisition equipment; and acquiring the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media content to the user. Whether the user is interested in the currently played media content can be accurately determined through the image information or the sound information of the user, and then the obtained evaluation information is used as a basis for recommending other media content to the user, so that the other media content recommended to the user can be the content which is interested by the user, and the accuracy of recommending the media content is improved.
On the basis of the foregoing embodiment, the following describes in further detail a media content recommendation method provided by the present application with reference to specific embodiments, and first of all, with reference to fig. 3, fig. 3 is a flowchart of a media content recommendation method provided by an embodiment of the present application, and as shown in fig. 3, the method includes:
S301, acquiring a corresponding relation between the identity of the user and the user characteristics according to the identity and the user characteristics input by the user, wherein the user characteristics comprise one of a human face or a voice.
The identity is used to distinguish different users. To ensure that, when the evaluation information of a user is obtained, it can be associated with that user's identity, so that personalized content recommendation can later be performed for each user, the identity may be stored in advance.
Specifically, the identity input by the user may be, for example, the name of the user, an account number of the user, a nickname, and the like. Those skilled in the art can understand that the identity is only required to distinguish different users, and the specific setting manner of the identity is not particularly limited in this embodiment. When a program is recommended to the user, it is recommended under the identity input by the user, so that the recommendation feels more personal to the user, thereby improving the user experience.
The user characteristics include one of a face or a voice. For example, a user inputs the identity "user 1" and performs facial information entry, so that the facial user characteristic of user 1 is obtained; as another example, a user inputs the identity "Zhang San" and performs voice information entry, so that the voice user characteristic of Zhang San is obtained. The user characteristics may also include both the face and the voice.
In another possible implementation manner, the user may not input the identity and the user characteristic, but the multimedia device obtains the identity of the user according to the image information and/or the sound information of the user. For example, for image information, a face may be obtained through a face recognition technology or the like, and then an identity "user a" is generated from the face, and the face is associated with "user a".
For the sound information, a sound may be acquired through sound recognition and then associated with "user B". Those skilled in the art will appreciate that when there is a user in the image information, a user identifier may be generated from the face and voice, and the face and voice may be associated with "user C".
S302, when the media content is played, reaction state information of the user is obtained, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through the image acquisition device or the sound information of the user is acquired through the sound acquisition device.
In this embodiment of the application, different preset time periods are set for different media contents, where the preset time period is a time period in which preset content in a media program is played. The preset content in the media content may be, for example, a smile-point content in the media content, or, for example, a tear-point content in the media content, or another preset "stem" (a comedic or emotional bit). As will be understood by those skilled in the art, the preset content in the media content may be content capable of producing a program effect, and the specific selection may be set according to the content of the actual media content, which is not limited herein.
The preset time period is the time period in which the preset content in the media content is played and corresponds to the playing period of that preset content; the media content may include at least one preset time period. In a specific implementation process, the time period of the preset content may be specified in the attribute information of the media content, and the multimedia device may determine the preset time period according to the attribute information; the complete duration of the media program may thus correspond to at least one preset time period and to other time periods except the preset time period.
In the embodiment of the application, in order to improve the processing efficiency and save the storage space of the multimedia device, the acquisition frequency in the preset time period is different from that in other time periods, and then the acquisition of the image information and the sound information may not be real-time, but the acquisition is performed according to a certain acquisition frequency.
In an alternative implementation manner, the image information includes one or more images acquired by the image acquisition device at irregular intervals within a preset time period, where the irregular intervals may be, for example, randomly generated time intervals, or preset irregular intervals, and the like, which is not limited herein. Or the image information includes one or more images that the image capturing device continuously captures in a preset time period, specifically, the image capturing device continuously captures the image information of the user as long as the image capturing device is in an open state. Still alternatively, the image information includes one or more images acquired by the image acquisition device in a preset time period based on a first image acquisition frequency, where the first image acquisition frequency is a frequency selected according to an actual requirement, and this embodiment is not limited thereto.
In another optional implementation manner, the image information further includes one or more images acquired by the image acquisition device based on a second image acquisition frequency in other time periods than the preset time period, where the second image acquisition frequency is a frequency selected according to an actual requirement, and this embodiment is not limited in this respect.
As can be understood by those skilled in the art, in order to ensure that the user's response to the program effect corresponding to the preset content can be accurately captured within a preset time period, the first image acquisition frequency is set to be higher than the second image acquisition frequency. For example, a certain comedy clip is currently played, a smile-point content exists from 2 minutes 15 seconds to 2 minutes 45 seconds, and another smile-point content exists from 3 minutes 35 seconds to 4 minutes 5 seconds, so there are two preset time periods. The image acquisition frequency may be set to one image every 1 second within these two preset time periods, and to one image every 5 seconds within other time periods.
Alternatively, the first image capturing frequency may be set to be smaller than the second image capturing frequency, and both of the first image capturing frequency and the second image capturing frequency may be set according to actual requirements.
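As a non-limiting illustration of selecting the acquisition frequency, the following Python sketch uses the two preset time periods and the 1-second/5-second intervals from the comedy-clip example above; all names and constants are illustrative assumptions.

```python
PRESET_PERIODS = [(135, 165), (215, 245)]   # 2:15-2:45 and 3:35-4:05, in seconds
FIRST_INTERVAL_S = 1                        # first image acquisition frequency: every 1 s
SECOND_INTERVAL_S = 5                       # second image acquisition frequency: every 5 s

def in_preset_period(position_s):
    return any(start <= position_s <= end for start, end in PRESET_PERIODS)

def capture_interval(position_s):
    """Seconds between image captures at the given playback position."""
    return FIRST_INTERVAL_S if in_preset_period(position_s) else SECOND_INTERVAL_S

print(capture_interval(140))   # 1 -> inside the first smile-point period
print(capture_interval(300))   # 5 -> outside any preset period
```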
In an optional embodiment, the sound information includes one or more pieces of sound collected by the sound collection device in a preset time period, where the frequency of collecting the sound in the preset time period includes any one of: the continuous collection, the collection according to the preset frequency or the collection at irregular intervals, the specific implementation manner is similar to the collection of the image information, and the details are not repeated here.
In the embodiment of the application, in order to avoid invading the privacy of the user, the response state information of the user can be acquired after the opening authority of the user for authorizing the image acquisition device and/or the sound acquisition device is acquired.
In an optional implementation manner, if the reaction state information only includes image information, the image acquisition device is started according to first authorization information of a user, where the first authorization information is used to indicate the start of the image acquisition device;
specifically, first authorization information of a user is acquired, the first authorization information is used for indicating the opening of the image acquisition device, and then the image acquisition device is opened according to the first authorization information.
For example, when the user turns on the multimedia device, or before the media content is played, a prompt message for turning on the image capture device may be displayed to the user through the multimedia device. And then receiving user operation input by a user to acquire first authorization information of the user, and then acquiring the opening authority of the image acquisition equipment according to the first authorization information to open the image acquisition equipment.
In another optional implementation manner, if the reaction state information includes sound information, the sound collection device is turned on according to second authorization information of the user, where the second authorization information is used to indicate turning on of the sound collection device. Or, if the reaction state information includes image information and sound information, the image capturing device is turned on according to the first authorization information and the sound capturing device is turned on according to the second authorization information, which is implemented in a manner similar to that of turning on the image capturing device.
For example, the opening authority of the image acquisition device is obtained according to the first authorization information, and the opening authority of the sound acquisition device is obtained according to the second authorization information, so that the image acquisition device is turned on, and the sound acquisition device is turned on N seconds before a preset time period, where N is an integer greater than or equal to 0.
The image acquisition device and/or the sound acquisition device are/is started according to the authorization information of the user, so that the reaction state information of the user can be ensured to be acquired under the condition of user authorization, the privacy of the user is prevented from being invaded, and the user experience is improved.
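A non-limiting sketch of the authorization-gated activation described above is given below; the device objects, the flag names and the value of N are assumptions made only for illustration.

```python
class Device:
    """Stand-in for the image or sound acquisition device (illustrative only)."""
    def __init__(self, name):
        self.name, self.is_on = name, False

    def turn_on(self):
        self.is_on = True
        print(f"{self.name} turned on")


N_SECONDS_AHEAD = 3          # assumed value of N (N >= 0)

def update_devices(position_s, first_authorized, second_authorized,
                   camera, microphone, preset_periods):
    # First authorization information -> image acquisition device
    if first_authorized and not camera.is_on:
        camera.turn_on()
    # Second authorization information -> sound acquisition device,
    # opened N seconds before each preset time period
    near_preset = any(start - N_SECONDS_AHEAD <= position_s <= end
                      for start, end in preset_periods)
    if second_authorized and near_preset and not microphone.is_on:
        microphone.turn_on()


camera, microphone = Device("camera"), Device("far-field microphone")
update_devices(133, True, True, camera, microphone, [(135, 165)])   # mic opens 2 s early
```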
S303, identifying the identity of at least one user according to the image information or the sound information.
In this embodiment, a correspondence between the user identification and the user characteristics is obtained in advance, and when the user identification is identified according to the image information, the identification of at least one user is obtained according to the correspondence between the user identification and the face included in the image information.
Specifically, the image information includes a face of at least one user, and the face recognition is performed on the at least one user in the image information, where a specific implementation manner of the face recognition may refer to the prior art, which is not described herein again, and then the face is obtained according to the face recognition, and the identity of the at least one user is obtained according to a corresponding relationship between the face included in the image information and the identity of the user.
When the identity of the user is identified according to the sound information, the identity of at least one user is obtained according to the corresponding relation between the identity of the user and the sound included in the sound information.
When the identity of at least one user is identified according to the sound information, sound processing and feature analysis, such as frequency band analysis, timbre analysis and the like, are performed on the sound contained in the sound information, so as to obtain the sound contained in the sound information, and then the identity of at least one user is obtained according to the corresponding relationship between the sound and the identity of the user.
If the correspondence between the user identity and the user characteristics has not been obtained in advance, that is, the user has not entered the corresponding settings and the face or voice of the current user appears for the first time, a user identity can be generated from the face included in the image information or the voice included in the sound information, for example set as user A, user B, user C and so on, so that users can be distinguished; the correspondence between the user identity and the user characteristics is then established according to the face or the voice.
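For illustration only, the correspondence between user characteristics and identities could be kept in a simple registry that auto-generates identities such as "user A" when a face or voice appears for the first time. The sketch below abstracts real face/voice recognition into opaque feature fingerprints; all names are hypothetical.

```python
class IdentityRegistry:
    def __init__(self):
        self.by_feature = {}                                   # feature fingerprint -> identity
        self._auto_names = ("user " + c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

    def register(self, identity, feature):
        """Store an identity entered by the user together with a face/voice feature."""
        self.by_feature[feature] = identity

    def identify(self, feature):
        """Return the identity for a feature, generating 'user A', 'user B', ... if unseen."""
        if feature not in self.by_feature:
            self.by_feature[feature] = next(self._auto_names)
        return self.by_feature[feature]


registry = IdentityRegistry()
registry.register("Zhang San", "face:af93")      # characteristic entered during enrolment
print(registry.identify("face:af93"))            # Zhang San
print(registry.identify("voice:1c2b"))           # user A  (first-seen voice feature)
```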
In this embodiment, the operation of identifying the id of the user may be performed once, and subsequently, the evaluation information corresponding to the id of the user may be directly updated according to the id of the user, and the id does not need to be obtained again, thereby simplifying the operation.
And S304, acquiring the evaluation information of the user on the media content according to the reaction state information.
And if the reaction state information comprises image information, acquiring the evaluation information of the user on the media content according to the facial expression information of the user.
Specifically, the image information of at least one user in the image information is analyzed, so as to obtain facial expression information of the user, where the facial expression information is facial emotion information of the user when the user views the media content, the facial expression information may be, for example, an overall facial expression of the user, or may be, for example, a partial facial muscle state of the user, such as a mouth angle state, an eye state, and the like, and the facial expression information is not limited herein.
The facial expression information of the user watching the media content can reflect the effect feedback of the user on the program when watching the media content, so that the satisfaction degree of the user on the currently played media content can be accurately acquired by acquiring the facial expression information.
And if the response state information comprises sound information, acquiring the evaluation information of the user on the media content according to the sound emotion information of the user.
Specifically, the sound information of at least one user in the sound information is analyzed and processed, so as to obtain the sound emotion information of the user, where the sound emotion information may include, for example, an emotion state of the sound, or a decibel of the sound, and this is not limited here.
The sound emotion information of the user watching the media content can also reflect the effect feedback of the user on the program when watching the media content, and the implementation mode is similar to that obtained according to the facial expression information, and is not repeated here.
And if the response state information comprises image information and sound information, acquiring the evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user.
When the evaluation information is acquired by combining the facial expression information and the sound emotion information, the evaluations of the facial expression information and the sound emotion information may be superimposed, or a weighted manner may be adopted to acquire the comprehensive evaluation.
S305, associating the evaluation information of each user on the media content with the identity of the user.
Further, after the identity of the user has been identified in the above steps and the evaluation information of the media content by at least one user has been determined, the evaluation information of each user on the media content can be associated with that user's identity, so that media content can subsequently be recommended to the user according to the associated information.
The media content recommendation method provided by the embodiment of the application comprises the following steps: and acquiring the corresponding relation between the identity of the user and the user characteristics according to the identity and the user characteristics input by the user, wherein the user characteristics comprise one of human faces or voice. When the media content is played, acquiring reaction state information of a user, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through the image acquisition device or the sound information of the user is acquired through the sound acquisition device. And acquiring the evaluation information of the user on the media content according to the reaction state information. And associating the evaluation information of each user on the media content with the identity of the user. The method comprises the steps of obtaining the corresponding relation between the identity identification of a user and the characteristics of the user, associating the evaluation information of each user on the media content with the corresponding identity identification, and facilitating the follow-up personalized media content recommendation aiming at the users with different identity identifications so as to improve the accuracy of the media content recommendation, wherein the evaluation information of the user on the media content is obtained through the facial expression information of the user or the sound emotion information of the user, the evaluation information can be obtained in real time based on the feedback of the user on the media content, and the authenticity and the accuracy of the evaluation information are guaranteed.
On the basis of the above embodiment, the media content recommendation method provided by the present application may obtain the evaluation information of the user on the media content according to the facial expression information or the voice emotion information of the user alone, and may also obtain the evaluation information of the user on the media content according to the facial expression information and the voice emotion information of the user together, and first, a description is given below, with reference to fig. 4, on an implementation manner of obtaining the evaluation information of the user on the media content according to the facial expression information of the user.
Fig. 4 is a third flowchart of a media content recommendation method according to an embodiment of the present application, and as shown in fig. 4, the method includes:
S401, judging whether the facial expression information of the user is acquired in a preset time period, if so, executing S402, and if not, executing S406.
In this embodiment, the media content includes a preset time period and other time periods except the preset time period, wherein the image capturing frequency is different between the preset time period and the other time periods, so it is first determined whether the obtaining time period corresponding to the facial expression information of the user is the preset time period.
Specifically, when facial expression information of a user is acquired according to image information, the acquired time node is correspondingly associated with each piece of facial expression information, and then the acquired time node is compared with a starting time point corresponding to a preset time period, so that whether the facial expression of the user is acquired in the preset time period is judged.
S402, obtaining standard facial expression information corresponding to a preset time period, wherein the standard facial expression information is expression information predefined according to preset content.
If the facial expression information of the user is acquired within the preset time period, it indicates that the program content of the media content has a corresponding program effect, and therefore it is required to detect whether the user has feedback information for the corresponding media content within the preset time period.
Specifically, the standard facial expression information corresponding to the preset time period is expression information preset according to the program content of the preset time period. For example, when the program content corresponding to the preset time period is a laugh-point content, the corresponding standard facial expression information is laughing; when the program content corresponding to the preset time period is a tear-point content, the corresponding standard facial expression information is crying.
Those skilled in the art will understand that the standard facial expression information corresponding to each preset time period is specifically set according to the program content, and this embodiment is not limited thereto.
And S403, judging whether the facial expression information of the user is consistent with the standard facial expression information, if so, executing S404, and if not, executing S405.
Further, whether the facial expression information of the user meets the standard facial expression information is judged. For example, when the standard facial expression information is laughing, grinning, upturned mouth corners and the like may be considered consistent with the standard facial expression information; when the standard facial expression information is crying, tearing, wiping the eye corners, downturned mouth corners and the like may be determined to be consistent with the standard facial expression information.
The specific implementation manner of the determination may be, for example, to perform feature extraction and analysis according to facial expression information of the user, then perform the determination according to a result of the feature analysis, further, for example, to extract a shape point from a face included in image information of the user, and compare the extracted shape point with a shape point of preset standard facial expression information to perform the determination, and the specific implementation manner of the determination is not particularly limited in this embodiment.
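As a non-limiting illustration of the consistency judgment in S403, the sketch below maps each standard facial expression to the facial states regarded as consistent with it, assuming an upstream expression-recognition step that outputs such state labels; the labels themselves are illustrative.

```python
# Hypothetical mapping from a standard facial expression to consistent facial states.
CONSISTENT_STATES = {
    "laughing": {"laughing", "smiling", "grinning"},
    "crying":   {"crying", "tearing", "wiping eye corners", "downturned mouth corners"},
}

def matches_standard(user_expression, standard_expression):
    """True if the recognized user expression is consistent with the standard one."""
    return user_expression in CONSISTENT_STATES.get(standard_expression,
                                                    {standard_expression})

print(matches_standard("grinning", "laughing"))   # True  -> bonus evaluation (S404)
print(matches_standard("yawning", "laughing"))    # False -> score-reduction evaluation (S405)
```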
S404, determining that the evaluation of the user on the media content is the bonus evaluation.
If the facial expression information of the user is consistent with the standard facial expression information, the feedback information of the user to the current program content can be determined to be positive feedback, so that the evaluation of the user to the media content is determined to be bonus evaluation, and if the image information is not the first image, the evaluation information of the user to the media content can be updated, and the updated evaluation information can be obtained.
Specifically, for example, when the evaluation information is a score, the score corresponding to the bonus evaluation may be directly added to the score of the program. If the evaluation information of the media content before the update is 88 points and, when the media content plays a smile-point content, the facial expression information of the user is obtained as a smile, where, for example, the score corresponding to the smile is 2 points, the updated evaluation information is 90 points.
Alternatively, continuing the above example with evaluation information in the form of degree indexes, the weight value of the "interest" degree index may be increased, and updated evaluation information is obtained according to the updated weight value of each degree index.
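A non-limiting sketch of both update forms described above (numeric score and degree-index weights) follows; the clamping range, the step size and the index names are illustrative assumptions rather than part of the claimed method.

```python
def update_score(current_score, expression_bonus, is_bonus):
    """Numeric scores: e.g. 88 + 2 = 90 when a smile is detected at a smile point."""
    delta = expression_bonus if is_bonus else -expression_bonus
    return max(1, min(100, current_score + delta))      # keep within an assumed 1-100 range

def update_degree_index(weights, is_bonus, step=0.1):
    """Degree indexes: shift weight toward 'great interest' or 'no interest'."""
    key = "great interest" if is_bonus else "no interest"
    weights[key] = weights.get(key, 0.0) + step
    return max(weights, key=weights.get)                 # current overall degree index

print(update_score(88, 2, is_bonus=True))                                        # 90
print(update_degree_index({"great interest": 0.3, "no interest": 0.3}, True))    # great interest
```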
S405, determining that the evaluation of the user on the media content is the score reduction evaluation.
If the facial expression information of the user is inconsistent with the standard facial expression information, it can be determined that the obtained feedback information of the user to the current program content is negative feedback, so that the evaluation of the user to the media content is determined to be a score reduction evaluation.
And S406, determining whether to update the evaluation information of the user on the media content according to the state information acquired after the state information.
Specifically, when the evaluation of the media content is a score-reduction evaluation, false detection or an incorrectly captured time node should be avoided. For example, the user may tear up at a tear-point content, but the tears are not detected in the facial expression information of the user at the current time node; or, for a smile-point content, because the user reacts slowly, the user only smiles a few seconds after the preset time period corresponding to the smile-point content. Therefore, for a score-reduction evaluation, it is necessary to continue to acquire state information after the current state information.
According to the state information acquired after the state information, whether the evaluation information of the user needs to be updated is comprehensively determined, for example, if the current evaluation on the media content is a score reduction evaluation, whether the evaluation information needs to be updated can be determined according to the facial expression information of the user corresponding to the preset number of image information after the current facial expression information of the user.
For example, assuming that, among the 10 pieces of image information after the current image information, the facial expression information of the user in 6 pieces corresponds to a score-reduction evaluation and in 4 pieces corresponds to a score-increase evaluation, the score reduction corresponding to the current score-reduction evaluation is carried out.
For another example, the evaluation information may be updated according to weighted values of the score-increase information and the score-reduction information of the 10 pieces of image information after the current facial expression information of the user. Those skilled in the art can understand that, when the evaluation of the user on the media content is a score-reduction evaluation, the manner of updating the evaluation information of the user on the media content may be selected as required; determining whether to update the evaluation information according to the state information acquired afterwards improves the accuracy of the evaluation information of the user.
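For illustration, the decision of S406 could be made over the subsequent images by majority vote or by weighting, as in the 6-versus-4 example above; the sketch below assumes per-image "bonus"/"reduction" labels produced by the preceding steps, and the weights are hypothetical.

```python
def confirm_reduction(subsequent_evaluations, use_weights=False,
                      bonus_weight=1.0, reduction_weight=1.0):
    """subsequent_evaluations: 'bonus' / 'reduction' labels for the later images."""
    bonus = sum(1 for e in subsequent_evaluations if e == "bonus")
    reduction = len(subsequent_evaluations) - bonus
    if use_weights:
        return reduction * reduction_weight >= bonus * bonus_weight
    return reduction >= bonus          # simple majority, as in the 6-vs-4 example

later = ["reduction"] * 6 + ["bonus"] * 4
print(confirm_reduction(later))        # True -> keep the score-reduction evaluation
```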
S407, obtaining an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information.
If the facial expression information of the user is not acquired in the preset time period, namely is acquired in other time periods except the preset time period, it is indicated that no preset program effect exists at the moment, and evaluation information is directly acquired according to the facial expression information of the user and an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial states.
For example, the evaluation mapping table stores facial states such as yawning, the face not facing the multimedia device, and distraction, where each facial state corresponds to its own evaluation information, for example 5 points for yawning and 10 points for the face not facing the multimedia device, or a degree index of "no interest" corresponding to the face not facing the multimedia device. The specific evaluation mapping table may be set according to actual requirements, which is not limited herein.
And S408, acquiring the evaluation information of the user on the media content according to the facial expression information and the evaluation mapping table of the user.
According to the facial expression information of the user and the evaluation mapping table, the facial state matching the facial expression information of the current user is looked up in the evaluation mapping table, and the evaluation information of the user on the media content is then obtained according to the evaluation information corresponding to the matched facial state.
In this embodiment, the evaluation information corresponding to the face information stored in the evaluation mapping table may also be divided into an additional score evaluation or a subtractive score evaluation, and then the evaluation information of the media content by the user is updated according to the additional score evaluation or the subtractive score evaluation.
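A non-limiting sketch of such an evaluation mapping table follows; the facial-state labels and point values mirror the examples above, while treating them as deductions (and adding a couple of extra states) is an assumption made only for the illustration.

```python
# Hypothetical evaluation mapping table used outside the preset time periods.
EVALUATION_MAP = {
    "yawning": -5,                         # treated here as a 5-point deduction
    "face not facing device": -10,         # treated here as a 10-point deduction
    "distracted": -3,                      # extra illustrative entries
    "focused": +2,
}

def evaluate_outside_preset_period(facial_state, current_score):
    delta = EVALUATION_MAP.get(facial_state, 0)   # unknown states leave the score unchanged
    return max(1, min(100, current_score + delta))

print(evaluate_outside_preset_period("yawning", 80))   # 75
```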
The media content recommendation method provided by the embodiment of the application comprises the following steps: judging whether the facial expression information of the user is acquired in a preset time period or not, if so, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to preset content. And judging whether the facial expression information of the user is consistent with the standard facial expression information or not, and if so, determining that the evaluation of the user on the media content is a bonus evaluation. And if not, determining that the evaluation of the user on the media content is the score reduction evaluation. And determining whether to update the evaluation information of the user on the media content according to the state information acquired after the state information. And if the facial expression information of the user is not acquired within a preset time period, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information. And acquiring the evaluation information of the user on the media content according to the facial expression information and the evaluation mapping table of the user. The evaluation of the user to the media content is determined to be score-subtracting evaluation or score-adding evaluation according to the facial expression information and the standard facial expression information of the user within the preset time period, so that the feedback effect of the user to the program can be quickly and effectively determined, and then when the evaluation of the user to the media content is score-subtracting evaluation, whether the evaluation information of the user needs to be updated or not is comprehensively determined according to the state information after the current state information, so that the evaluation information error caused by false detection is avoided, and the accuracy of the evaluation information is improved. And secondly, acquiring the evaluation information of the user through the facial expression information of the user and the evaluation mapping table in other time periods, so that whether the user is interested in the currently played media content can be determined in real time according to the facial expression of the user, and the authenticity of the evaluation information of the user is ensured.
On the basis of the above embodiment, an implementation manner of acquiring the evaluation information of the user on the media content according to the sound information of the user is described next with reference to fig. 5.
Fig. 5 is a fourth flowchart of a media content recommendation method according to an embodiment of the present application, as shown in fig. 5, the method includes:
S501, obtaining standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content.
Specifically, the program content in the preset time period has a corresponding program effect, and the standard sound emotion information corresponding to the preset time period is sound information preset according to that program content; it may include, for example, a sound emotion state, a sound decibel level, and the like.
For example, if the program content in the preset time period corresponds to a laugh-point content, the emotional state of the corresponding standard sound emotion information is laughing, where the sound decibel threshold may be, for example, 1 decibel; if the program content in the preset time period corresponds to a tear-point content, the emotional state of the corresponding standard sound emotion information is crying.
Those skilled in the art can understand that the standard sound emotion information corresponding to each preset time period is specifically set according to the program content, and this embodiment does not limit this.
S502, judging whether the voice emotion information of the user is consistent with the standard voice emotion information, if so, executing S503, and if not, executing S504.
Comparing the sound emotion information of the user with the standard sound emotion information, for example, performing feature analysis on the sound information of the user to obtain the sound emotion information of the user, and also obtaining the sound decibel of the user, and then judging whether the sound emotion information of the user is consistent with the standard sound emotion information.
For example, when, at a tear-point content, the emotional state of the user's sound emotion information is detected to be crying and the sound decibel is greater than 1 decibel, the sound information of the user can be determined to be consistent with the standard sound emotion information. Conversely, when the sound decibel of the user's sound emotion information is detected to be less than 1 decibel at the tear-point content, or the emotional state is detected not to be crying, the sound emotion information of the user can be determined to be inconsistent with the standard sound emotion information.
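For illustration, the consistency judgment of S502 can be expressed as a comparison of the recognized emotional state plus a decibel threshold, as in the examples above; the emotion labels and the 1-decibel threshold are illustrative assumptions.

```python
def matches_standard_sound(user_emotion, user_decibel, standard_emotion, min_decibel=1.0):
    """True -> bonus evaluation (S503); False -> score-reduction evaluation (S504)."""
    return user_emotion == standard_emotion and user_decibel > min_decibel

print(matches_standard_sound("crying", 3.0, "crying"))      # True at a tear-point content
print(matches_standard_sound("laughing", 0.5, "laughing"))  # False: below the threshold
```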
S503, determining the evaluation of the user to the media content as the bonus evaluation.
S504, determining that the evaluation of the user on the media content is the score reduction evaluation.
The implementation manners of S503 and S504 are similar to those of S404 and S405, and specific contents may refer to the descriptions of the above embodiments, and are not described herein again.
The media content recommendation method provided by the embodiment of the application comprises the following steps: and acquiring standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content. And judging whether the sound emotion information of the user is consistent with the standard sound emotion information, if so, determining that the evaluation of the user on the media content is a bonus evaluation. And if not, determining that the evaluation of the user on the media content is the score reduction evaluation. The evaluation information of the user on the currently played media content is determined by comparing the sound emotion information of the user with the standard sound emotion information, and the feedback of the user on the program content in the preset time period can be accurately acquired, so that the authenticity and the effectiveness of the evaluation information of the user are improved.
On the basis of the above embodiment, when the sound information and the image information correspond to the identity of the same user, the evaluation information for the image information in fig. 4 and the evaluation information for the sound information in fig. 5 may be superimposed or weighted to obtain the comprehensive score of the user for the media content.
For example, if the current sound information and the identification corresponding to the image information are both the user 1, it indicates that the image information and the sound information of the user 1 are obtained, and the evaluation information of the user 1 on the currently played media content can be determined according to the sound information and the image information of the user 1.
In one possible implementation manner, for example, if it is determined that the user 1 makes a smile feedback on the program content at the preset smile point according to the image information of the user 1, and it is determined that the user 1 makes a laugh on the program content at the preset smile point according to the sound information of the user 1, a score corresponding to the smile and a score corresponding to the laugh are added to the evaluation information of the user 1 on the media content at the same time.
In another possible implementation manner, for example, 20 pieces of image information of smile of the user are acquired according to the image information of the user within a preset time period, and the laughter of the user is acquired according to the sound information of the user, but the laughter of the user is lower than a preset decibel, the evaluation information of the user can be updated according to the weighting corresponding to the smile degree in each picture and the weighting corresponding to the laughter lower than the preset decibel.
On the basis of the above embodiment, in order to reduce the processing load on the multimedia device and simplify the processing procedure, the voice recognition process may be weakened, that is, the association between the voice and the identity of the user does not need to be established in order to obtain the evaluation information of the user. This is described in detail below with reference to fig. 6.
Fig. 6 is a fifth flowchart of a media content recommendation method according to an embodiment of the present application, and as shown in fig. 6, the method includes:
S601, obtaining the identity of at least one user according to the image information.
The implementation of S601 is similar to S303, and is not described herein again.
S602, acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information.
In this embodiment, if there is a matching relationship between the sound emotion information and the facial expression information, for example, the currently detected sound emotion information is laughter, the facial expression information whose facial expression is "laugh" is determined as the matching target facial expression information according to the facial expression information of at least one user.
If facial expressions of a plurality of users are smiling at present, facial expression information matched with the sound emotion information can be acquired according to the corresponding relation between the sound decibel and the smiling degree in the sound emotion information, wherein the mode of acquiring the target facial expression information matched with the sound emotion information can be selected according to actual requirements, and the mode is not limited herein.
The user identity corresponding to the target facial expression information is determined according to the user features corresponding to at least one user identity, so that the target identity corresponding to the target facial expression is obtained.
The target facial expression information corresponds to the emotion corresponding to the sound emotion information; for example, when the expression indicated by the target facial expression is laughing, the sound indicated by the corresponding sound emotion information is laughter. The emotion may be set to, for example, sad, anxious, pleased, and the like, and the specific corresponding emotion is not limited in this embodiment.
S603, obtaining the evaluation information of the user corresponding to the target identity mark on the media content according to the sound emotion information and the standard sound emotion information corresponding to the preset time period.
Whether the sound emotion information is consistent with standard sound emotion information corresponding to a preset time period or not is judged, if yes, the evaluation of the user on the media content is determined to be score-added evaluation, and if not, the evaluation of the user on the media content is determined to be score-subtracted evaluation, so that the evaluation information of the user on the media content corresponding to the target identity is obtained.
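A non-limiting sketch of S602-S603 follows: the detected sound emotion is attributed to the user whose facial expression expresses the same emotion, and that user's evaluation is updated against the standard sound emotion of the preset time period. The data, the default score and the score delta are assumptions for the example.

```python
def attribute_and_evaluate(sound_emotion, faces, standard_emotion, scores, delta=2):
    """faces: {identity: facial_emotion}; scores: {identity: current evaluation score}."""
    targets = [who for who, emotion in faces.items() if emotion == sound_emotion]
    for who in targets:                                   # target identity (or identities)
        change = delta if sound_emotion == standard_emotion else -delta
        scores[who] = scores.get(who, 60) + change
    return targets, scores

faces = {"user A": "laughing", "user B": "neutral"}
print(attribute_and_evaluate("laughing", faces, "laughing", {"user A": 88, "user B": 75}))
# (['user A'], {'user A': 90, 'user B': 75})
```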
The media content recommendation method provided by the embodiment of the application comprises the following steps: and acquiring the identity of at least one user according to the image information. And acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information. And acquiring the evaluation information of the user corresponding to the target identity mark on the media content according to the sound emotion information and the standard sound emotion information corresponding to the preset time period. By matching the facial expression information with the voice emotion information and then acquiring the identity corresponding to the target facial expression information, the evaluation information of each user can be determined according to the respective facial expression information and the voice emotion information aiming at the identity of each user, so that the comprehensiveness and pertinence of the user evaluation information acquisition are improved.
On the basis of the above embodiment, before the media content is played, the identity of the user can be identified, and if the historical evaluation information of the user is obtained according to the identity of the user, the media content recommended to the user is determined according to the historical evaluation information.
Specifically, the identity of the user is first obtained. It is then detected whether historical evaluation information of the user can be obtained according to the identity; if so, this indicates that the user corresponding to the identity has previously watched media content on the multimedia device, and the media content recommended to the user is then determined according to the historical evaluation information of the user. The recommended media content may be, for example, media content of the same type as the media content whose evaluation information in the user's historical evaluation information is higher than a preset score, or the recommended media content may also be content recommended by other users for the media content whose evaluation information in the user's historical evaluation information is higher than the preset score. The specific implementation manner of the recommended media content may be set according to requirements and is not limited herein.
On the basis of the above embodiment, the media content recommendation method provided by the present application may also perform differential recommendation according to different numbers of users to be recommended after obtaining the evaluation information of the media content by the user according to the state information, and the following description is provided with reference to a specific embodiment and is introduced with reference to fig. 7.
Fig. 7 is a sixth flowchart of a media content recommendation method according to an embodiment of the present application, as shown in fig. 7, the method includes:
S701, the number of the identified users to be recommended is obtained.
In this embodiment, before media content recommendation is performed, the users to be recommended need to be identified first, and the number of the current users to be recommended is then determined. For example, if only Zhang San is currently watching the media content, it is only necessary to recommend media content that Zhang San is interested in according to Zhang San's evaluation information; or, if Zhang San and Li Si are currently watching the media content together, media content in which both Zhang San and Li Si are interested needs to be recommended.
In a possible implementation manner, the number of users to be recommended may be obtained by performing recognition through the image information, or before the media content starts to play, the identifier or the number of the users to be recommended input by the user may be received, and the implementation manner of the number of the users to be recommended is not limited in this embodiment.
S702, if a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than a preset score or the evaluation rank is higher than a preset rank.
If a user to be recommended is identified, the media content that the user is interested in is recommended directly, in this embodiment, each user is associated with evaluation information for different media contents, where the association relationship may be as shown in table 1, for example:
TABLE 1

Identity label    Media content        Evaluation information
Zhang San         Speed and Passion    97
Zhang San         Sea King             80
Li Si             Speed and Passion    34
Li Si             Sea King             77
Specifically, the media content whose evaluation information associated with the user to be recommended meets a first preset condition is determined, where the first preset condition is a condition that takes the user's evaluation as the measurement index and is specifically one of the following: the evaluation score is higher than a preset score, or the evaluation rank is higher than a preset rank. Alternatively, the first preset condition may further include a condition input by the user in advance, for example, the user sets that horror films are not to be recommended. The specific setting mode of the first preset condition may be set according to implementation requirements, which is not limited herein.
Taking the identities and evaluation information of the users in Table 1 as an example, if Zhang San is currently the user to be recommended and the first preset condition is that the evaluation information is higher than 90 points, it may be determined, according to the evaluation information of the different media contents associated with Zhang San in Table 1, that the media content whose evaluation meets the first preset condition is "Speed and Passion".
And S703, determining an evaluation list according to the media content meeting the first preset condition.
In a possible implementation manner, the program types of the media contents meeting the first preset condition may be acquired, then all the media contents of the same type are acquired, and the rating list is determined according to at least one media content ranked before the preset number, or the media contents similar to the media contents meeting the first preset condition are determined according to recommendation information of other users who have viewed the media contents meeting the first preset condition, so as to determine the rating list.
S704, if at least two users to be recommended are identified, determining the media content of which the evaluation information associated with each user of the at least two users to be recommended meets a second preset condition; the second preset condition is specifically one of the following conditions: the evaluation score is higher than a preset score or the evaluation rank is higher than a preset rank.
Specifically, firstly, obtaining evaluation information associated with each of the at least two users to be recommended, and determining media content of which the evaluation information associated with each user meets a second preset condition, wherein the second preset condition is a condition that the evaluation of each of the plurality of users is used as a measurement index, and specifically is one of the following conditions: the evaluation score is higher than the preset score or the evaluation rank is higher than the preset rank, and the second preset condition may further include a condition input by the user in advance, and the like, which is not limited herein.
Taking the identities and evaluation information of the users in Table 1 as an example, if Zhang San and Li Si are currently the users to be recommended, media content in which both users are interested needs to be recommended. Where the second preset condition is, for example, that the evaluation score is higher than 70 points, it may be determined, according to the evaluation information of the different media contents associated with the two users in Table 1, that the media content whose evaluation information meets the second preset condition for both users is "Sea King".
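Using the data of Table 1, the filtering of S702 and S704 can be illustrated as follows; the thresholds of 90 and 70 points follow the examples in the text, and the function name is hypothetical.

```python
TABLE_1 = {
    "Zhang San": {"Speed and Passion": 97, "Sea King": 80},
    "Li Si":     {"Speed and Passion": 34, "Sea King": 77},
}

def media_meeting_condition(users, threshold):
    """Media content whose evaluation is above the threshold for every listed user."""
    candidates = set.intersection(*(set(TABLE_1[u]) for u in users))
    return [m for m in candidates if all(TABLE_1[u][m] > threshold for u in users)]

print(media_meeting_condition(["Zhang San"], 90))           # ['Speed and Passion']
print(media_meeting_condition(["Zhang San", "Li Si"], 70))  # ['Sea King']
```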
S705, determining an evaluation list according to the media contents which all meet the second preset condition.
And secondly, determining an evaluation list according to the media contents of which the evaluation information meets the second preset condition, wherein the specific implementation mode is similar to that of determining the evaluation list according to the first preset condition, and the details are not repeated here.
S706, sending the evaluation list to the media source platform, and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises the media content to be recommended to the user.
Further, the evaluation list is sent to the media source platform, then the media source platform returns the content of the media content to the multimedia device, finally the media content to be recommended to the user returned by the media source platform is obtained, and recommendation of the media content is carried out.
The media content recommendation method provided by the embodiment of the application comprises the following steps: and acquiring the number of the identified users to be recommended. If a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than a preset score or the evaluation rank is higher than a preset rank. And determining an evaluation list according to the media content meeting the first preset condition. If at least two users to be recommended are identified, determining the media content of which the evaluation information respectively associated with each of the at least two users to be recommended meets a second preset condition; the second preset condition is specifically one of the following conditions: the evaluation score is higher than a preset score or the evaluation rank is higher than a preset rank. And determining an evaluation list according to the media contents which all meet the second preset condition. And sending the evaluation list to the media source platform, and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises the media content to be recommended to the user. The method comprises the steps of recommending the media content through the evaluation information of a user, recommending the program only according to the evaluation information of the user and a first preset condition when only a single user exists, and accordingly improving the accuracy of program recommendation.
The media content recommendation method provided by the present application is described below by taking an example in a specific scenario, and is introduced with reference to fig. 8 and fig. 9, fig. 8 is a schematic flowchart of a media content recommendation method provided in an embodiment of the present application, and fig. 9 is a schematic flowchart of a media content recommendation method provided in an embodiment of the present application.
As shown in fig. 8, assuming that an emotional drama is played at this time, when the media content is played, the camera records the photo of the viewer every 30 seconds, where 30 seconds is the first image capturing frequency, and the far-field microphone is turned on in advance according to a preset "stem", where the preset "stem" is the program content corresponding to the preset time period.
During the preset time period corresponding to the preset "stem", the far-field microphone collects the voice of the user, and/or the camera collects the image information of the user according to a second image acquisition frequency, for example recording a photo of the viewer every 5 seconds; the collected photos and voice are then input into a state model of the viewer.
When the state model of the viewer is specifically implemented, the evaluation information may be obtained only according to the photo of the viewer, may also be obtained only according to the sound of the viewer, and may also be obtained by combining the photo and the sound of the viewer, where specific operation steps of the three implementation manners may refer to the embodiments of fig. 4, fig. 5, and fig. 6.
Then, the evaluation information of the user on the media content is obtained according to the analysis result of the state model. For example, it is automatically determined that husband A does not like the film according to analysis results such as husband A not being in front of the television, husband A having his eyes closed, or husband A's eyes not being directed at the television, and it is automatically determined that wife B likes the film according to analysis results such as wife B wiping her eyes, facing the television, and laughing.
Specifically, when a user starts to watch a film, the camera first detects the identity of the current viewer(s). If only wife B is watching, a similar emotional drama is recommended for wife B; if it is detected that both husband A and wife B are watching, then, because the previous evaluation of the emotional drama indicates that the husband is not interested in this kind of film, a film that both of them like needs to be recommended, thereby improving the user experience.
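Purely as an illustration of how such automatic evaluation from the analysis results of the state model might look, the following Python sketch maps observed cues to a like/dislike judgement; the cue names, weights, and threshold are hypothetical and not taken from the application.

```python
# Hypothetical mapping from state-model analysis results to a like/dislike
# judgement. Cue names, weights and the threshold are illustrative only.

CUE_WEIGHTS = {
    "not_in_front_of_tv": -2, "eyes_closed": -1, "not_looking_at_tv": -1,
    "facing_tv": +1, "laughing": +2, "wiping_tears": +2,
}

def likes_content(observed_cues, threshold=1):
    score = sum(CUE_WEIGHTS.get(cue, 0) for cue in observed_cues)
    return score >= threshold

print(likes_content({"not_in_front_of_tv", "eyes_closed", "not_looking_at_tv"}))  # False
print(likes_content({"facing_tv", "laughing", "wiping_tears"}))                   # True
```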
Referring to fig. 9, a specific implementation process of the state model of the viewer is described. As shown in fig. 9, the multimedia device first enters a viewing interface; then, with the authorization of the user, the television camera is turned on, image information of the user is obtained through the camera to identify the current viewer(s), and the number of users to be recommended is determined according to the identification result, where the current viewer is, for example, husband A and/or wife B.
If the current viewer is only wife B, similar programs can be recommended directly according to the media content liked by wife B; if the current viewers are husband A and wife B, similar programs are recommended according to the media content liked by both husband A and wife B. For example, if it is determined from the evaluation information of a plurality of media contents that both husband A and wife B like funny programs, similar funny programs are recommended to both of them.
Next, husband A and wife B start to watch the program, and the camera takes photos of husband A and wife B at intervals according to the first image acquisition frequency, so as to obtain the viewing states of husband A and wife B, and it is judged in real time whether the playing time point of the current media content has reached the time point of a preset smiling stem.
If the time point of the preset smiling stem has not been reached, that is, the current playing time node of the media content falls in another time period, whether each viewer is immersed in the viewing experience is judged according to the viewing state of each viewer; the judgment may be made, for example, by checking whether the viewer has left the seat, whether the face is directed at the television, whether the eyes are open, and the like. If it is determined that the viewer is immersed in the viewing experience, the viewing effect is good and the viewer's evaluation of the media content is increased; if it is determined that the viewer is not immersed in the viewing experience, the viewing effect is poor and the viewer's evaluation of the media content is reduced.
If the time point of the preset smiling stem has been reached, the far-field microphone is started first (or it may be started before the time point of the preset smiling stem). The viewing state of each viewer is then detected and judged separately. First, whether the viewer smiles within the smiling stem time period is judged; if no smile is detected, it can be determined that the smiling stem did not produce the intended effect, and the viewer's evaluation of the media content is reduced. When no smile is detected, the evaluation information is updated directly, without further judging whether laughing sound is detected, which simplifies the judgment process and improves system efficiency.
If the viewer is smiling, it can be preliminarily determined that the smiling stem content has produced a certain program effect. Next, whether laughing sound is detected within the smiling stem time period is judged. If no laughing sound is detected, the viewer's feedback on the smiling stem content is only moderate, and the score corresponding to the smile in the image information is added to the viewer's evaluation of the media content. If laughing sound is detected, the user's feedback on the program effect produced by the smiling stem content is very good, and the scores corresponding to both the smile in the image information and the laughing sound are added to the viewer's evaluation of the media content.
For different viewers, according to the time sequence of the media content playing, the evaluation of each viewer on the currently played media content is updated in real time both in the preset time period corresponding to the preset smiling stem and in the other time periods outside the preset smiling stem, so that the evaluations of husband A and wife B on the program are obtained respectively.
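The real-time update described above can be illustrated with the following Python sketch; the score increments, field names, and immersion criterion are assumptions made for the example, not values defined by the application.

```python
# Illustrative per-sample update of a viewer's evaluation, combining the
# immersion check outside preset stems with the smile/laughter check inside
# them. All constants and field names are assumptions for the example.

IMMERSION_DELTA = 0.1   # added or subtracted outside the preset stem periods
SMILE_SCORE = 0.5       # score corresponding to a smile in the image information
LAUGH_SCORE = 0.5       # score corresponding to detected laughing sound

def update_evaluation(score, frame, in_stem_period, laugh_detected=False):
    """frame: dict with boolean keys 'present', 'facing_tv', 'eyes_open', 'smiling'."""
    if not in_stem_period:
        immersed = frame["present"] and frame["facing_tv"] and frame["eyes_open"]
        return score + IMMERSION_DELTA if immersed else score - IMMERSION_DELTA
    if not frame["smiling"]:
        # No smile during the stem: reduce the evaluation directly, without
        # further checking whether laughing sound was detected.
        return score - SMILE_SCORE
    # Smile detected: add the image score, plus the sound score if laughter heard.
    return score + SMILE_SCORE + (LAUGH_SCORE if laugh_detected else 0.0)

score = 3.0
score = update_evaluation(score, {"present": True, "facing_tv": True,
                                  "eyes_open": True, "smiling": False},
                          in_stem_period=False)                      # -> 3.1
score = update_evaluation(score, {"present": True, "facing_tv": True,
                                  "eyes_open": True, "smiling": True},
                          in_stem_period=True, laugh_detected=True)  # -> 4.1
print(round(score, 2))
```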
According to the media content recommendation method, the evaluation information of each viewer on the media content is determined according to the image information and/or the sound information of the viewer, and then the media content is recommended according to the evaluation information of the viewer, so that the accuracy of media content recommendation can be improved.
Fig. 10 is a first signaling flowchart of a media content recommendation method according to an embodiment of the present application, and fig. 11 is a second signaling flowchart of the media content recommendation method according to the embodiment of the present application.
As shown in fig. 10, the multimedia device first receives an instruction sent by the viewer to enter the viewing application. The multimedia device then pops up an authorization page, where the authorization page is used to obtain the permission to turn on the image capture device and the sound capture device. Next, according to the authorization operation of the viewer, the image capture device and/or the sound capture device are/is turned on; alternatively, the sound capture device may be turned on only when the preset time period arrives, so as to save resources.
The image capture device collects picture information of the viewer, and the identity information of the viewer is identified according to the picture information; for example, the recognition result of the picture information may be compared with prestored picture information, so that the identity information of the viewer is obtained. A list of preferred media content of the viewer is then obtained according to the identity information of the viewer; if there are multiple viewers at this moment, the obtained media content is the media content preferred by all of the viewers.
The media content list is sent to the media source platform, and the media source platform obtains the specific media content according to the list information; the multimedia device then receives the media content sent by the media source platform, recommends it to the user as similar media content, and at the same time plays the media content preferred by the user.
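For illustration only, the exchange of fig. 10 might be sketched as follows in Python; the face matching, the preference store, and the platform call are stand-ins introduced for the example rather than a real interface.

```python
# Hypothetical sketch of the exchange in fig. 10: identify the viewers from a
# picture, look up their preferred media content, send the list to the media
# source platform and receive recommended content. The camera, face matcher
# and platform are stand-ins, not a real API.

def recognize_viewers(picture, enrolled_faces):
    """Compare the picture with prestored face features and return identities."""
    return [user for user, face in enrolled_faces.items() if face in picture]

def preferred_list(viewers, preferences):
    """For several viewers, keep only content preferred by all of them."""
    sets = [set(preferences.get(v, [])) for v in viewers]
    return sorted(set.intersection(*sets)) if sets else []

def request_recommendations(platform, media_list):
    """Stand-in for sending the list to the media source platform."""
    return platform(media_list)

enrolled = {"A": "face_A", "B": "face_B"}
prefs = {"A": ["comedy1", "news1"], "B": ["comedy1", "drama1"]}
fake_platform = lambda items: [m + "_similar" for m in items]

viewers = recognize_viewers({"face_A", "face_B"}, enrolled)
print(request_recommendations(fake_platform, preferred_list(viewers, prefs)))
# ['comedy1_similar']
```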
As shown in fig. 11, the viewer starts viewing, and the multimedia device starts the image capture device to collect picture information of the viewer; the multimedia device then receives the viewing-state picture of the viewer sent by the image capture device, identifies according to the viewing-state picture whether the viewer is immersed in the program content of the media content, and adds to or subtracts from the viewer's evaluation of the media content according to the identification result.
Specifically, the media source platform pre-processes the media content and sets the time periods of the preset stems in the playing information of the media content. When the time node corresponding to a preset stem has not been reached, the operation of adding to or subtracting from the evaluation according to the viewing-state picture of the viewer is executed cyclically, for example according to the first image acquisition frequency, until the start time node corresponding to the preset stem is reached.
Then, in the first N seconds before the preset stem arrives, the image capture device is controlled to start a high-speed continuous shooting mode, where the frequency corresponding to the high-speed continuous shooting mode is the second image acquisition frequency, and the multimedia device controls the sound capture device to start, in preparation for collecting the sound information of the viewer, where N is an integer greater than or equal to 0.
Specifically, within the preset time period corresponding to the preset stem, the image capture device collects picture information of the viewer according to the second image acquisition frequency and sends it to the multimedia device; the multimedia device then identifies, according to the picture information, the viewer's response to the preset stem, and adds to or subtracts from the evaluation according to the identification result.
Next, the sound capture device collects the sound of the viewer within the preset time period corresponding to the preset stem and sends the collection result to the multimedia device; the multimedia device identifies the viewer's response to the preset stem according to the sound information collected by the sound capture device, and adds to or subtracts from the corresponding evaluation.
Within the preset time period corresponding to the preset stem, the adding and subtracting performed according to the picture information and the sound information is also executed cyclically, for example according to the second image acquisition frequency, until the end time node corresponding to the preset stem is reached.
After the program content corresponding to the preset stem ends, the image capture device is controlled to restore the low-speed shooting mode and the sound capture device is controlled to be turned off, thereby avoiding a waste of resources.
In a specific implementation process, a plurality of preset stems may exist in the media content. Specifically, the evaluation information of each user on the media content is updated in real time, according to the playing sequence of the specific content of the current media content, both in the preset time periods corresponding to the preset stems and in the other time periods outside the preset stems, so that the evaluation information of each user on the media content is obtained when the playing of the media content ends, and the authenticity and validity of the evaluation information are ensured.
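As an informal sketch of the capture scheduling of fig. 11 (the concrete intervals, the lead time N, and the function name are assumptions for the example):

```python
# Illustrative scheduler for the capture behaviour in fig. 11: low-speed capture
# outside preset stems, switch to high-speed continuous shooting N seconds
# before a stem starts and turn on the sound capture device, then restore
# low-speed capture and turn the microphone off when the stem ends.

LOW_SPEED_PERIOD_S = 30    # first image acquisition frequency in the example
HIGH_SPEED_PERIOD_S = 5    # second image acquisition frequency in the example
N_SECONDS_EARLY = 3        # assumed lead time before the preset stem

def capture_plan(play_time_s, stems):
    """stems: list of (start_s, end_s) intervals of preset stems."""
    in_stem = any(start - N_SECONDS_EARLY <= play_time_s <= end
                  for start, end in stems)
    return {
        "image_period_s": HIGH_SPEED_PERIOD_S if in_stem else LOW_SPEED_PERIOD_S,
        "microphone_on": in_stem,
    }

stems = [(120, 135), (400, 420)]
print(capture_plan(60, stems))    # {'image_period_s': 30, 'microphone_on': False}
print(capture_plan(118, stems))   # {'image_period_s': 5, 'microphone_on': True}
print(capture_plan(436, stems))   # {'image_period_s': 30, 'microphone_on': False}
```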
Fig. 12 is a first schematic structural diagram of a media content recommendation device according to an embodiment of the present application. As shown in fig. 12, the apparatus 120 includes: an input module 1201 and a processing module 1202.
An input module 1201, configured to acquire reaction state information of a user when media content is played, where the reaction state information includes at least one of the following types of information: the method comprises the steps that image information of a user is acquired through image acquisition equipment or sound information of the user is acquired through sound acquisition equipment;
the processing module 1202 is configured to obtain, according to the response state information, evaluation information of the user on the media content, where the evaluation information is used as a basis for recommending other media content to the user.
In one possible design, the image information includes one or more images captured by the image capturing device at irregular intervals or continuously captured or captured based on the first image capturing frequency in a preset time period, where the preset time period is a time period for playing preset content in the media content.
In one possible design, the image information further includes one or more images acquired by the image acquisition device based on the second image acquisition frequency in other time periods than the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
In one possible design, the sound information includes one or more pieces of sound collected by the sound collection device in a preset time period, the preset time period is a time period for playing preset content in the media content, and the frequency for collecting the sound in the preset time period includes any one of the following: continuous collection, collection according to a preset frequency or collection at irregular intervals.
In one possible design, before obtaining the reaction state information of the user, the processing module 1202 is further configured to:
if the response state information comprises image information, starting the image acquisition equipment according to first authorization information of the user, wherein the first authorization information is used for indicating the starting of the image acquisition equipment;
if the response state information comprises sound information, starting the sound acquisition equipment according to second authorization information of the user, wherein the second authorization information is used for indicating the starting of the sound acquisition equipment;
and if the reaction state information comprises image information and sound information, starting the image acquisition equipment according to the first authorization information and starting the sound acquisition equipment according to the second authorization information.
In one possible design, the processing module 1202 is specifically configured to:
if the response state information comprises image information, acquiring evaluation information of the user on the media content according to facial expression information of the user;
if the response state information comprises sound information, acquiring evaluation information of the user on the media content according to the sound emotion information of the user;
if the response state information comprises image information and sound information, acquiring evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user;
the facial expression information is acquired according to the image information, the facial emotion information of the user when the user watches the media content, and the sound emotion information is acquired according to the sound information, and the sound emotion information of the user when the user watches the preset content.
In one possible design, the processing module 1202 is specifically configured to:
if the facial expression information of the user is acquired in a preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to preset content;
if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In one possible design, the processing module 1202 is specifically configured to:
if the facial expression information of the user is acquired in other time periods, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
and acquiring the evaluation information of the user on the media content according to the facial expression information of the user and the evaluation mapping table.
In one possible design, the processing module 1202 is specifically configured to:
acquiring standard sound emotion information corresponding to a preset time period, wherein the standard sound emotion information is sound information predefined according to preset content;
if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a bonus evaluation;
and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user on the media content is the score reduction evaluation.
In one possible design, after obtaining the reaction status information of the user, the processing module 1202 is further configured to:
acquiring an identity of at least one user according to the image information or the sound information;
and after the evaluation information of the user on the media content is obtained according to the reaction state information, associating the evaluation information of each user on the media content with the identity of the user.
In one possible design, a sketch of which is given below, the processing module 1202 is specifically configured to:
acquiring an identity of at least one user according to the image information;
acquiring target facial expression information matched with the voice emotion information, and acquiring a target identity corresponding to the target facial expression information from at least one user identity, wherein the target facial expression information is consistent with the emotion corresponding to the voice emotion information;
and obtaining the evaluation information of the user corresponding to the target identity mark on the media content according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, wherein the standard sound emotion information is information predefined according to the preset content.
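A minimal Python sketch of this design is given below, assuming a simple one-to-one matching between sound emotions and facial expressions; the labels, the matching rule, and the score increment are illustrative assumptions rather than the claimed implementation.

```python
# Hypothetical sketch: attribute detected sound emotion (e.g. laughter) to the
# user whose facial expression matches it, then update only that user's
# evaluation. Labels, scores and the matching rule are illustrative.

def attribute_sound_emotion(sound_emotion, expressions_by_user):
    """expressions_by_user: dict of user id -> facial expression label.

    Returns the identity whose target facial expression is consistent with the
    emotion of the sound, or None if no expression matches."""
    matching = {"laughter": "smiling", "sobbing": "crying"}
    target_expression = matching.get(sound_emotion)
    for user, expression in expressions_by_user.items():
        if expression == target_expression:
            return user
    return None

def update_for_sound(evaluations, sound_emotion, standard_sound_emotion,
                     expressions_by_user, delta=0.5):
    target = attribute_sound_emotion(sound_emotion, expressions_by_user)
    if target is None:
        return evaluations
    # Bonus if the detected sound emotion agrees with the standard sound
    # emotion predefined for the preset content, deduction otherwise.
    evaluations[target] += delta if sound_emotion == standard_sound_emotion else -delta
    return evaluations

evals = {"A": 3.0, "B": 3.0}
print(update_for_sound(evals, "laughter", "laughter",
                       {"A": "neutral", "B": "smiling"}))  # {'A': 3.0, 'B': 3.5}
```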
In one possible design, prior to the playing of the media content, the processing module 1202 is further configured to:
acquiring a corresponding relation between the identity of a user and user characteristics according to the identity and the user characteristics input by the user, wherein the user characteristics comprise one of human faces or voice;
acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the face contained in the image information; or
And acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the sound included in the sound information.
In one possible design, a sketch of which is given below, the processing module 1202 is further configured to:
after obtaining the evaluation information of the user on the media content according to the reaction state information, if the evaluation information is bonus information, updating the evaluation information of the user on the media content to obtain updated evaluation information;
and if the evaluation information is score-reduction information, determining, according to the reaction state information acquired subsequently, whether to update the evaluation information of the user on the media content.
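As an informal sketch of this update policy (the confirmation rule and the window size are assumptions introduced for the example, not the claimed behaviour):

```python
# Hypothetical sketch of the update policy above: a bonus is applied to the
# evaluation immediately, while a deduction is only applied if the reaction
# state information acquired afterwards confirms it.

def apply_updates(initial_score, deltas, confirm_window=2):
    """deltas: chronological list of per-sample score changes (+/-)."""
    score = initial_score
    for i, delta in enumerate(deltas):
        if delta >= 0:
            score += delta                      # bonus: update immediately
        else:
            # deduction: look at the next few samples before deciding
            following = deltas[i + 1:i + 1 + confirm_window]
            if not any(d > 0 for d in following):
                score += delta                  # confirmed: apply the deduction
    return score

print(apply_updates(3.0, [+0.5, -0.5, +0.5]))   # deduction discarded -> 4.0
print(apply_updates(3.0, [+0.5, -0.5, -0.5]))   # deductions applied  -> 2.5
```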
In one possible design, the processing module 1202 is further configured to:
before the media content is played, the identity of the user is identified, and if the historical evaluation information of the user is obtained according to the identity of the user, the media content recommended to the user is determined according to the historical evaluation information.
The apparatus provided in this embodiment may be used to implement the technical solutions of the above method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
Fig. 13 is a schematic structural diagram of a media content recommendation device according to an embodiment of the present application. As shown in fig. 13, the present embodiment further includes, on the basis of the embodiment of fig. 12: and an output module 1303.
In one possible design, the processing module 1302 is further configured to: after the media content is played, determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises the media content evaluated by the user;
the output module 1303 is configured to: sending an evaluation list to a media source platform;
the input module 1301 is further configured to: and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises media contents to be recommended to the user.
In one possible design, the processing module 1302 is specifically configured to:
if a user to be recommended is identified, determining the media content of which the evaluation information associated with the user to be recommended meets a first preset condition; the first preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media content meeting the first preset condition.
In one possible design, the processing module 1302 is specifically configured to:
if at least two users to be recommended are identified, determining the media content of which the evaluation information respectively associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is specifically one of the following conditions: the evaluation score is higher than the preset score or the evaluation ranking is higher than the preset ranking;
and determining an evaluation list according to the media contents which all meet the second preset condition.
The apparatus provided in this embodiment may be used to implement the technical solutions of the above method embodiments, and the implementation principles and technical effects are similar, which are not described herein again.
Fig. 14 is a schematic hardware structure diagram of a media content recommendation device according to an embodiment of the present application. As shown in fig. 14:
the media content recommendation device 1401 may communicate wirelessly via NFC related protocols, such as wirelessly communicating with a media source platform, or communicating with a third party device such as an image capture device, a sound capture device, etc.
Illustratively, the media content recommendation device 1401 may be connected (e.g., in a wired or wireless manner) to the electronic device to be communicated with via one or more communication networks. The communication network may be a local area network (LAN) or a wide area network (WAN), such as the Internet. The communication network may be implemented using any known network communication protocol, which may be any of various wired or wireless communication protocols, such as Ethernet, universal serial bus (USB), FireWire, global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time division-synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), Bluetooth, wireless fidelity (Wi-Fi), NFC, voice over Internet protocol (VoIP), or any other suitable communication protocol.
Illustratively, the media content recommendation device 1401 may establish a connection with the image capture device via Wi-Fi or Bluetooth. Also illustratively, the media content recommendation device 1401 establishes a connection not only with the image capture device via bluetooth, but also with the media source platform via a wide area network.
The media content recommendation device 1401 may be a mobile terminal or a user equipment, such as a mobile phone, a tablet computer, a television, a peripheral advertisement device, or a vehicle-mounted processing device, or a computer with mobility, such as a laptop computer, a pocket computer, or a handheld computer.
By way of example, FIG. 14 illustrates a schematic diagram of a media content recommendation device 1401. The media content recommendation device 1401 may include a processor 1410, an external memory interface 1420, an internal memory 1421, a Universal Serial Bus (USB) interface 1430, a charging management module 1440, a power management module 1441, a battery 1442, an antenna 1, an antenna 2, a mobile communication module 1450, a wireless communication module 1460, an audio module 1470, a speaker 1470A, a receiver 1470B, a microphone 1470C, an earphone interface 1470D, a sensor 1480, buttons 1490, a motor 1491, an indicator 1492, a camera 1493, and a display 1494.
It is to be understood that the illustrated configuration of the present embodiment does not constitute a specific limitation of the media content recommendation device 1401. In other embodiments of the present application, the media content recommendation device 1401 may include more or fewer components than illustrated, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Among other things, processor 1410 may include one or more processing units, such as: the processor 1410 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors. In some embodiments, the media content recommendation device 1401 may also include one or more processors 1410. Among the controllers may be the neural hub and command center of the media content recommendation device 1401. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution. A memory may also be provided in processor 1410 for storing instructions and data.
In some embodiments, the memory in the processor 1410 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 1410. If the processor 1410 needs to use the instruction or data again, it can be called directly from the memory. This avoids repeated accesses, reduces the latency of the processor 1410, and thus improves the processing efficiency of the media content recommendation device 1401.
In some embodiments, processor 1410 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The USB interface 1430 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 1430 may be used to connect a charger to charge the media content recommendation device 1401, and may also be used to transfer data between the media content recommendation device 1401 and a peripheral device. It may also be used to connect an earphone and play audio through the earphone.
It is to be understood that the interfacing relationship between the modules illustrated in this embodiment is only an illustrative description and does not constitute a structural limitation of the media content recommendation device 1401. In other embodiments of the present application, the media content recommendation device 1401 may also adopt an interfacing manner different from those described in the foregoing embodiments, or a combination of multiple interfacing manners.
The charging management module 1440 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 1440 may receive charging input from a wired charger via the USB interface 1430. In some wireless charging embodiments, the charging management module 1440 may receive wireless charging input through a wireless charging coil of the media content recommendation device 1401. The media content recommendation device 1401 may also be powered by the power management module 1441 while the charge management module 1440 charges the battery 1442.
The power management module 1441 is used to connect the battery 1442, the charging management module 1440 and the processor 1410. The power management module 1441 receives input from the battery 1442 and/or the charging management module 1440, and provides power to the processor 1410, the internal memory 1421, the display 1494, the camera 1493, and the wireless communication module 1460. The power management module 1441 may also be used to monitor parameters such as battery capacity, battery cycle number, battery state of health (leakage, impedance), etc. In other embodiments, a power management module 1441 may also be disposed in the processor 1410. In other embodiments, the power management module 1441 and the charging management module 1440 may be disposed in the same device.
The wireless communication functionality of the media content recommendation device 1401 may be implemented via the antenna 1, antenna 2, mobile communication module 1450, wireless communication module 1460, modem processor, baseband processor, and the like. The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the media content recommendation device 1401 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 1450 may provide a solution for wireless communication applied to the media content recommendation device 1401, including 2G/3G/4G/5G. The mobile communication module 1450 may include at least one filter, switch, power amplifier, low noise amplifier, and the like. The mobile communication module 1450 may receive electromagnetic waves through the antenna 1, perform filtering, amplification, and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 1450 may also amplify a signal modulated by the modem processor and convert it into an electromagnetic wave through the antenna 1 for radiation. In some embodiments, at least some of the functional modules of the mobile communication module 1450 may be disposed in the processor 1410. In some embodiments, at least some of the functional modules of the mobile communication module 1450 may be disposed in the same device as at least some of the modules of the processor 1410.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 1470A, the receiver 1470B, etc.) or displays an image or video through the display 1494. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be separate from the processor 1410, and may be located in the same device as the mobile communication module 1450 or other functional modules.
The wireless communication module 1460 may provide solutions for wireless communication applied to the media content recommendation device 1401, including wireless local area networks (WLAN), Bluetooth, global navigation satellite system (GNSS), frequency modulation (FM), NFC, infrared (IR), and the like. The wireless communication module 1460 may be one or more devices integrating at least one communication processing module. The wireless communication module 1460 receives electromagnetic waves through the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 1410. The wireless communication module 1460 may also receive a signal to be transmitted from the processor 1410, perform frequency modulation and amplification on it, and convert it into an electromagnetic wave through the antenna 2 for radiation.
In some embodiments, the antenna 1 of the media content recommendation device 1401 is coupled to the mobile communication module 1450 and the antenna 2 is coupled to the wireless communication module 1460 so that the media content recommendation device 1401 can communicate with networks and other devices via wireless communication techniques. The wireless communication technologies may include GSM, GPRS, CDMA, WCDMA, TD-SCDMA, LTE, GNSS, WLAN, NFC, FM, and/or IR technologies, among others. The GNSS may include a Global Positioning System (GPS), a global navigation satellite system (GLONASS), a beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a Satellite Based Augmentation System (SBAS).
The media content recommendation device 1401 may implement display functionality via the GPU, the display 1494, and the application processor, among other things. The GPU is a microprocessor for image processing, connected to the display 1494 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 1410 may include one or more GPUs that execute instructions to generate or change display information.
The display screen 1494 is used to display images, videos, and the like. The display screen 1494 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the media content recommendation device 1401 may include 1 or N display screens 1494, where N is a positive integer greater than 1.
The media content recommendation device 1401 may implement a capture function via the ISP, one or more cameras 1493, a video codec, a GPU, one or more displays 1494, an application processor, etc.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. Applications such as intelligent awareness of the media content recommendation device 1401 may be implemented by the NPU, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 1420 may be used to connect an external memory card, such as a Micro SD card, to enable extending the storage capabilities of the media content recommendation device 1401. The external memory card communicates with the processor 1410 through an external memory interface 1420 to implement data storage functions. For example, data files such as music, photos, videos, and the like are saved in the external memory card.
The internal memory 1421 may be used to store one or more computer programs comprising instructions. The processor 1410 may cause the media content recommendation device 1401 to perform the media content recommendation method provided in some embodiments of the present application, as well as various functional applications, data processing, and the like, by executing the above instructions stored in the internal memory 1421. The internal memory 1421 may include a program storage area and a data storage area. The program storage area may store an operating system; the program storage area may also store one or more application programs (such as user features, user voice information, etc.) and the like. The data storage area may store data created during use of the media content recommendation device 1401, such as the user's historical viewing records. In addition, the internal memory 1421 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS). In some embodiments, the processor 1410 may cause the media content recommendation device 1401 to perform the media content recommendation method provided in the embodiments of the present application, as well as various functional applications and data processing, by executing instructions stored in the internal memory 1421 and/or instructions stored in a memory disposed in the processor 1410.
The media content recommendation device 1401 may implement audio functions through the audio module 1470, the speaker 1470A, the receiver 1470B, the microphone 1470C, the earphone interface 1470D, the application processor, and the like, for example sound playback of media content, music playback, and the like. The audio module 1470 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal. The audio module 1470 may also be used to encode and decode audio signals. In some embodiments, the audio module 1470 may be disposed in the processor 1410, or some functional modules of the audio module 1470 may be disposed in the processor 1410.
The speaker 1470A, also called a "horn", is used to convert an audio electrical signal into a sound signal. The media content recommendation device 1401 may listen to music, a broadcast program, or the like through the speaker 1470A. The receiver 1470B, also called an "earpiece", is used to convert an audio electrical signal into a sound signal.
When the media content recommendation device 1401 acquires the voice information of the user, the acquisition may be performed through the microphone 1470C. The microphone 1470C, also called a "mike" or "mouthpiece", is used to convert a sound signal into an electrical signal. When the voice information of the user is acquired, the voice signal of the user may be collected by the microphone 1470C, which may be, for example, a far-field microphone. The media content recommendation device 1401 may be provided with at least one microphone 1470C. In other embodiments, the media content recommendation device 1401 may be provided with two microphones 1470C, which may implement a noise reduction function in addition to collecting sound signals. In other embodiments, the media content recommendation device 1401 may also be provided with three, four, or more microphones 1470C to collect sound signals, reduce noise, identify sound information, perform directional recording, and the like.
The earphone interface 1470D is used to connect a wired earphone. The earphone interface 1470D may be the USB interface 1430, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The sensors 1480 may include pressure sensors, gyroscope sensors, air pressure sensors, magnetic sensors, acceleration sensors, distance sensors, proximity light sensors, fingerprint sensors, temperature sensors, touch sensors, ambient light sensors, bone conduction sensors, and the like.
The pressure sensor is used for sensing a pressure signal and converting the pressure signal into an electric signal. In some embodiments, the pressure sensor may be disposed on the display screen. There are many types of pressure sensors, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, and the like. The capacitive pressure sensor may be a sensor comprising at least two parallel plates having an electrically conductive material. The capacitance between the pressure sensor electrodes changes when a force acts thereon. The media content recommendation device 1401 determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display screen, the media content recommendation apparatus 1401 detects the intensity of the touch operation according to the pressure sensor. The media content recommendation device may also calculate the location of the touch based on the detection signal of the pressure sensor. In some embodiments, the touch operations that are applied to the same touch position but different touch operation intensities may correspond to different operation instructions. For example: and when the touch operation with the touch operation intensity smaller than the first pressure threshold value acts on the short message application icon, executing an instruction for viewing the short message. And when the touch operation with the touch operation intensity larger than or equal to the first pressure threshold value acts on the short message application icon, executing an instruction of newly building the short message.
The gyro sensor may be used to determine the motion posture of the media content recommendation device 1401 (for example, when the device is a tablet computer or a mobile phone). In some embodiments, the angular velocity of the media content recommendation device 1401 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor. The gyro sensor may be used for image stabilization during photographing. Illustratively, when the shutter is pressed, the gyro sensor detects the shaking angle of the media content recommendation device 1401, calculates, according to the angle, the distance that the lens module needs to compensate, and allows the lens to counteract the shaking of the media content recommendation device 1401 through reverse movement, thereby realizing image stabilization. The gyro sensor may also be used for navigation, motion-sensing game scenarios, and the like.
The acceleration sensor may detect the magnitude of acceleration of the media content recommendation device 1401 in various directions, typically three axes. The magnitude and direction of gravity may be detected when the media content recommendation device 1401 is stationary. The method can also be used for recognizing the posture of the electronic equipment, and is applied to horizontal and vertical screen switching, pedometers and other applications.
The distance sensor is used for measuring a distance. The media content recommendation device 1401 may measure the distance by infrared or laser. In some embodiments, when photographing a scene, the media content recommendation device 1401 may use the distance sensor to measure the distance, so as to achieve fast focusing.
The proximity light sensor may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The media content recommendation device 1401 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from a nearby object. When sufficient reflected light is detected, it may be determined that there is an object near the media content recommendation device 1401; when insufficient reflected light is detected, the media content recommendation device 1401 may determine that there is no object nearby. The proximity light sensor may also be used to automatically unlock and lock the screen.
The ambient light sensor is used for sensing the ambient light brightness. The media content recommendation device 1401 may adaptively adjust the brightness of the display 1494 according to the perceived ambient light brightness. The ambient light sensor may also be used to automatically adjust the white balance when the image information of the user is acquired. The ambient light sensor may also be used in conjunction with the proximity light sensor to detect whether the media content recommendation device 1401 is operating properly, for purposes such as invalid-operation detection.
A fingerprint sensor (also known as a fingerprint recognizer) is used to collect a fingerprint. The media content recommendation device 1401 may utilize the collected fingerprint characteristics to implement fingerprint unlocking, access an application lock, obtain user rights, and the like. Further description of fingerprint sensors may be found in international patent application PCT/CN2017/082773 entitled "method and electronic device for handling notifications", which is incorporated herein by reference in its entirety.
A touch sensor may also be referred to as a touch panel or a touch-sensitive surface. The touch sensor may be disposed on the display screen 1494, and the touch sensor and the display screen 1494 form a touch screen, also called a "touch control screen". The touch sensor is used to detect a touch operation applied on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the type of the touch event. Visual output related to the touch operation may be provided through the display 1494. In other embodiments, the touch sensor may also be disposed on the surface of the media content recommendation device 1401 at a position different from that of the display 1494.
The bone conduction sensor 1480M may acquire a vibration signal. In some embodiments, the bone conduction sensor 1480M may acquire a vibration signal of the human vocal part vibrating a bone mass. The bone conduction sensor 1480M may also contact the body pulse to receive a blood pressure pulse signal. In some embodiments, a bone conduction sensor 1480M may also be provided in the headset, integrated into a bone conduction headset. The audio module 1470 may analyze a voice signal based on the vibration signal of the bone mass vibrated by the sound part obtained by the bone conduction sensor 1480M, so as to implement a voice function. The application processor can analyze heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 1480M, so as to realize the heart rate detection function.
The keys 1490 include a power-on key, a volume key, etc. The keys 1490 may be mechanical keys or touch keys. The media content recommendation device 1401 may receive a key input, and generate a key signal input related to user settings and function control of the media content recommendation device 1401.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, the implementation may be wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the embodiments of the present application are wholly or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.

Claims (35)

  1. A method for recommending media contents, comprising:
    when media content is played, reaction state information of a user is acquired, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through image acquisition equipment or the sound information of the user is acquired through sound acquisition equipment;
    and acquiring the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media contents to the user.
  2. The method according to claim 1, wherein the image information comprises one or more images captured by the image capturing device at irregular intervals or continuously captured uninterruptedly or captured based on a first image capturing frequency for a preset time period, wherein the preset time period is a time period for playing a preset content of the media content.
  3. The method of claim 2, wherein the image information further comprises one or more images acquired by the image acquisition device based on a second image acquisition frequency during a time period other than the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
  4. The method according to any one of claims 1 to 3, wherein the sound information includes one or more pieces of sound collected by the sound collection device during a preset time period, the preset time period is a time period for playing preset content in the media content, and the frequency of collecting sound during the preset time period includes any one of: continuous collection, collection according to a preset frequency or collection at irregular intervals.
  5. The method according to any one of claims 1 to 4, wherein before the obtaining of the reaction state information of the user, the method further comprises:
    if the response state information comprises the image information, starting the image acquisition equipment according to first authorization information of the user, wherein the first authorization information is used for indicating the starting of the image acquisition equipment;
    if the response state information comprises the sound information, starting the sound acquisition equipment according to second authorization information of the user, wherein the second authorization information is used for indicating the starting of the sound acquisition equipment;
    and if the response state information comprises the image information and the sound information, starting the image acquisition equipment according to the first authorization information and starting the sound acquisition equipment according to the second authorization information.
  6. The method according to any one of claims 2 to 4, wherein the obtaining of the evaluation information of the user on the media content according to the reaction state information comprises:
    if the response state information comprises the image information, obtaining evaluation information of the user on the media content according to facial expression information of the user;
    if the response state information comprises the sound information, acquiring the evaluation information of the user on the media content according to the sound emotion information of the user;
    if the response state information comprises the image information and the sound information, acquiring evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user;
    the facial expression information is facial expression information of the user when watching the media content, acquired according to the image information, and the sound emotion information is sound emotion information of the user when watching the preset content, acquired according to the sound information.
  7. The method of claim 6, wherein the obtaining of the evaluation information of the user on the media content according to the facial expression information of the user comprises:
    if the facial expression information of the user is acquired in the preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to the preset content;
    if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a bonus evaluation;
    and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user to the media content is a score reduction evaluation.
  8. The method of claim 6, wherein the obtaining of the evaluation information of the user on the media content according to the facial expression information of the user comprises:
    if the facial expression information of the user is acquired in other time periods, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
    and acquiring the evaluation information of the user on the media content according to the facial expression information of the user and the evaluation mapping table.
  9. The method of claim 6, wherein the obtaining of the evaluation information of the user on the media content according to the sound emotion information of the user comprises:
    acquiring standard sound emotion information corresponding to the preset time period, wherein the standard sound emotion information is sound information predefined according to the preset content;
    if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a bonus evaluation;
    and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user to the media content is a score reduction evaluation.
  10. The method according to any one of claims 1 to 9, wherein after the obtaining of the reaction state information of the user, the method further comprises:
    acquiring an identity of at least one user according to the image information or the sound information;
    after obtaining the evaluation information of the user on the media content according to the reaction state information, the method further includes:
    and associating the evaluation information of each user on the media content with the identity of the user.
  11. The method of claim 6, wherein the obtaining of the evaluation information of the user on the media content according to the facial expression information of the user and the sound emotion information of the user comprises:
    acquiring an identity of at least one user according to the image information;
    acquiring target facial expression information matched with the sound emotion information, and acquiring a target identity corresponding to the target facial expression information from the identity of the at least one user, wherein the target facial expression information is consistent with the emotion corresponding to the sound emotion information;
    and acquiring, according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, the evaluation information of the user corresponding to the target identity on the media content, wherein the standard sound emotion information is information predefined according to the preset content.
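Claim 11 attributes a captured vocal reaction to the viewer whose facial expression conveys the same emotion, so the evaluation is recorded only for that viewer. A sketch, with an assumed emotion-compatibility table:

```python
# Assumed compatibility between a detected vocal emotion and facial expressions.
COMPATIBLE = {
    "laughter": {"laughing", "smiling"},
    "gasp":     {"surprised"},
    "sobbing":  {"sad"},
}

def attribute_voice_emotion(voice_emotion: str, faces: dict[str, str]) -> str | None:
    """Return the identity whose facial expression matches the vocal emotion.

    `faces` maps identity -> facial expression detected from the image information.
    """
    matching = COMPATIBLE.get(voice_emotion, set())
    for identity, expression in faces.items():
        if expression in matching:
            return identity  # target identity: the viewer who produced the sound
    return None  # no match, so the vocal reaction is not attributed to anyone
```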
  12. The method according to claim 10 or 11, wherein prior to the playing of the media content, the method further comprises:
    acquiring a corresponding relation between the identity of the user and the user characteristic according to the identity and the user characteristic input by the user, wherein the user characteristic comprises one of a face or a voice;
    the obtaining of the identity of at least one user according to the image information or the sound information includes:
    acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the face included in the image information; or
    acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the sound included in the sound information.
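Claim 12 relies on an enrollment step that maps an identity to a user feature (a face or a voice) and later resolves identities from the captured image or sound. In the sketch below, real face/voice matching is stood in for by a simple equality check, which is an assumption:

```python
# Assumed enrollment tables: identity -> stored user feature.
enrolled_faces = {"alice": "face_template_A", "bob": "face_template_B"}
enrolled_voices = {"alice": "voice_print_A", "bob": "voice_print_B"}

def identify(captured_feature: str, enrollment: dict[str, str]) -> str | None:
    """Resolve an identity from a captured feature via the stored correspondence."""
    for identity, feature in enrollment.items():
        if feature == captured_feature:  # stand-in for real face/voice matching
            return identity
    return None

# Usage: identify("voice_print_B", enrolled_voices) -> "bob"
```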
  13. The method according to any one of claims 1 to 12, wherein after obtaining the evaluation information of the user on the media content according to the reaction state information, the method further comprises:
    if the evaluation information is score-adding information, updating the evaluation information of the user on the media content to obtain updated evaluation information;
    and if the evaluation information is score-deducting information, determining, according to reaction state information acquired subsequently, whether to update the evaluation information of the user on the media content.
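Claim 13 applies score-adding evaluations immediately but defers score-deducting ones until later reaction state information is examined. The confirmation rule in the sketch below (apply the deduction only if a later reaction is also negative) is an assumption:

```python
def apply_evaluation(current: int, new_eval: int, later_eval: int | None) -> int:
    """Update a stored evaluation according to the sign of the new evaluation."""
    if new_eval > 0:
        return current + new_eval   # score-adding: update immediately
    if later_eval is not None and later_eval < 0:
        return current + new_eval   # deduction confirmed by a later reaction
    return current                  # deduction not confirmed: keep the stored value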
  14. The method of any of claims 10-12, wherein prior to the playing of the media content, the method further comprises:
    identifying the identity of the user, and if historical evaluation information of the user is acquired according to the identity of the user, determining the media content to be recommended to the user according to the historical evaluation information.
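Claim 14 reuses historical evaluation information at start-up to pick content for a recognized user. A sketch with assumed data shapes:

```python
def recommend_on_startup(identity: str,
                         history: dict[str, dict[str, int]],
                         candidates: list[str]) -> list[str]:
    """Order candidate content by the recognized user's historical evaluations."""
    scores = history.get(identity, {})
    return sorted(candidates, key=lambda cid: scores.get(cid, 0), reverse=True)

# recommend_on_startup("alice", {"alice": {"movie_123": 5}}, ["movie_456", "movie_123"])
# -> ["movie_123", "movie_456"]
```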
  15. The method of any of claims 1-14, wherein after the playing of the media content ends, the method further comprises:
    determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises the media content evaluated by the user;
    and sending the evaluation list to a media source platform, and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises media contents to be recommended to the user.
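Claim 15 describes an exchange with a media source platform: the device uploads an evaluation list and receives a recommendation list. The endpoint URL and JSON field names in the sketch are assumptions, not part of the claim:

```python
import json
from urllib import request

def fetch_recommendations(evaluation_list: list[dict]) -> list[dict]:
    """Upload the evaluation list and return the platform's recommendation list."""
    body = json.dumps({"evaluations": evaluation_list}).encode("utf-8")
    req = request.Request(
        "https://media-platform.example.com/recommend",  # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read()).get("recommendations", [])
```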
  16. The method according to claim 14 or 15, wherein the determining of an evaluation list according to the evaluation information of the user comprises:
    if a user to be recommended is identified, determining media content of which the evaluation information associated with the user to be recommended meets a first preset condition, wherein the first preset condition is one of the following: the evaluation score is higher than a preset score, or the evaluation ranking is higher than a preset ranking;
    and determining an evaluation list according to the media content meeting the first preset condition.
  17. The method according to claim 14 or 15, wherein the determining of an evaluation list according to the evaluation information of the user comprises:
    if at least two users to be recommended are identified, determining media content of which the evaluation information associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is one of the following: the evaluation score is higher than a preset score, or the evaluation ranking is higher than a preset ranking;
    and determining an evaluation list according to the media content that meets the second preset condition for all of the at least two users to be recommended.
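Claims 16 and 17 filter the evaluated content by a preset condition, for a single recognized viewer and for several viewers respectively. A sketch using an assumed score threshold:

```python
PRESET_SCORE = 3  # assumed threshold

def evaluation_list_single(evals: dict[str, int]) -> list[str]:
    """Claim 16: content the single recognized viewer rated above the preset score."""
    return [cid for cid, score in evals.items() if score > PRESET_SCORE]

def evaluation_list_multi(evals_per_user: list[dict[str, int]]) -> list[str]:
    """Claim 17: content rated above the preset score by every recognized viewer."""
    if not evals_per_user:
        return []
    candidates = set.intersection(*(set(e) for e in evals_per_user))
    return [cid for cid in candidates
            if all(e.get(cid, 0) > PRESET_SCORE for e in evals_per_user)]
```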
  18. A media content recommendation apparatus, comprising:
    the input module is used for acquiring reaction state information of a user when media content is played, wherein the reaction state information comprises at least one type of information as follows: the image information of the user is acquired through image acquisition equipment or the sound information of the user is acquired through sound acquisition equipment;
    and the processing module is used for acquiring the evaluation information of the user on the media content according to the reaction state information, wherein the evaluation information is used as a basis for recommending other media content to the user.
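Claim 18 splits the apparatus into an input module that captures reaction state information during playback and a processing module that turns it into evaluation information. A structural sketch with assumed class and method names:

```python
from dataclasses import dataclass, field

@dataclass
class ReactionState:
    image_info: list[bytes] = field(default_factory=list)  # frames from the camera
    sound_info: list[bytes] = field(default_factory=list)  # clips from the microphone

class InputModule:
    def capture(self) -> ReactionState:
        """Collect image and/or sound information while the media content plays."""
        return ReactionState()  # placeholder for real camera/microphone capture

class ProcessingModule:
    def evaluate(self, state: ReactionState) -> int:
        """Derive an evaluation score from the captured reaction state information."""
        score = 0
        if state.image_info:
            score += 1  # placeholder for facial-expression analysis
        if state.sound_info:
            score += 1  # placeholder for vocal-emotion analysis
        return score
```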
  19. The apparatus of claim 18, wherein the image information comprises one or more images collected by the image acquisition device at irregular intervals, continuously, or based on a first image acquisition frequency during a preset time period, and the preset time period is a time period during which preset content in the media content is played.
  20. The apparatus of claim 19, wherein the image information further comprises one or more images acquired by the image acquisition device based on a second image acquisition frequency during a time period other than the preset time period; wherein the first image acquisition frequency is higher than the second image acquisition frequency.
  21. The apparatus according to any one of claims 18 to 20, wherein the sound information includes one or more pieces of sound collected by the sound collection device during a preset time period, the preset time period is a time period for playing preset content in the media content, and a frequency of collecting sound during the preset time period includes any one of: continuous collection, collection according to a preset frequency or collection at irregular intervals.
  22. The apparatus according to any of claims 18-21, wherein prior to the obtaining of the reaction state information of the user, the processing module is further configured to:
    if the reaction state information comprises the image information, starting the image acquisition equipment according to first authorization information of the user, wherein the first authorization information is used for indicating the starting of the image acquisition equipment;
    if the reaction state information comprises the sound information, starting the sound acquisition equipment according to second authorization information of the user, wherein the second authorization information is used for indicating the starting of the sound acquisition equipment;
    and if the reaction state information comprises the image information and the sound information, starting the image acquisition equipment according to the first authorization information and starting the sound acquisition equipment according to the second authorization information.
  23. The apparatus according to any one of claims 19 to 21, wherein the processing module is specifically configured to:
    if the reaction state information comprises the image information, obtaining evaluation information of the user on the media content according to facial expression information of the user;
    if the reaction state information comprises the sound information, acquiring the evaluation information of the user on the media content according to the sound emotion information of the user;
    if the reaction state information comprises the image information and the sound information, acquiring evaluation information of the user on the media content according to the facial expression information and the sound emotion information of the user;
    wherein the facial expression information is acquired according to the image information and the sound emotion information is acquired according to the sound information, the facial expression information being acquired while the user watches the media content and the sound emotion information being acquired while the user watches the preset content.
  24. The apparatus of claim 23, wherein the processing module is specifically configured to:
    if the facial expression information of the user is acquired in the preset time period, acquiring standard facial expression information corresponding to the preset time period, wherein the standard facial expression information is expression information predefined according to the preset content;
    if the facial expression information of the user is consistent with the standard facial expression information, determining that the evaluation of the user on the media content is a score-adding evaluation;
    and if the facial expression information of the user is inconsistent with the standard facial expression information, determining that the evaluation of the user on the media content is a score-deducting evaluation.
  25. The apparatus of claim 23, wherein the processing module is specifically configured to:
    if the facial expression information of the user is acquired in a time period other than the preset time period, acquiring an evaluation mapping table, wherein the evaluation mapping table is used for indicating evaluation information corresponding to different facial expression information;
    and acquiring the evaluation information of the user on the media content according to the facial expression information of the user and the evaluation mapping table.
  26. The apparatus of claim 23, wherein the processing module is specifically configured to:
    acquiring standard sound emotion information corresponding to the preset time period, wherein the standard sound emotion information is sound emotion information predefined according to the preset content;
    if the sound emotion information of the user is consistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a score-adding evaluation;
    and if the sound emotion information of the user is inconsistent with the standard sound emotion information, determining that the evaluation of the user on the media content is a score-deducting evaluation.
  27. The apparatus according to any of claims 18-26, wherein after obtaining the reaction state information of the user, the processing module is further configured to:
    acquiring an identity of at least one user according to the image information or the sound information;
    and associating the evaluation information of each user on the media content with the identity of the user after acquiring the evaluation information of the user on the media content according to the reaction state information.
  28. The apparatus of claim 23, wherein the processing module is specifically configured to:
    acquiring an identity of at least one user according to the image information;
    acquiring target facial expression information matched with the sound emotion information, and acquiring a target identity corresponding to the target facial expression information from the identity of the at least one user, wherein the target facial expression information is consistent with the emotion corresponding to the sound emotion information;
    and acquiring, according to the sound emotion information and the standard sound emotion information corresponding to the preset time period, the evaluation information of the user corresponding to the target identity on the media content, wherein the standard sound emotion information is information predefined according to the preset content.
  29. The apparatus according to claim 27 or 28, wherein prior to the playing of the media content, the processing module is further configured to:
    acquiring a corresponding relation between the identity of the user and the user characteristic according to the identity and the user characteristic input by the user, wherein the user characteristic comprises one of a face or a voice;
    acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the face included in the image information; or
    acquiring the identity of at least one user according to the corresponding relation between the identity of the user and the sound included in the sound information.
  30. The apparatus of any one of claims 18-29, wherein the processing module is further configured to:
    after obtaining the evaluation information of the user on the media content according to the reaction state information, if the evaluation information is score-adding information, updating the evaluation information of the user on the media content to obtain updated evaluation information;
    and if the evaluation information is score-deducting information, determining, according to reaction state information acquired subsequently, whether to update the evaluation information of the user on the media content.
  31. The apparatus of any one of claims 27-29, wherein the processing module is further configured to:
    before the media content is played, identifying the identity of the user, and if historical evaluation information of the user is acquired according to the identity of the user, determining the media content to be recommended to the user according to the historical evaluation information.
  32. The apparatus of any one of claims 18-31, further comprising: an output module;
    the processing module is further configured to: after the media content is played, determining an evaluation list according to the evaluation information of the user, wherein the evaluation list comprises the media content evaluated by the user;
    the output module is used for: sending the evaluation list to a media source platform;
    the input module is further configured to: and acquiring a recommendation list returned by the media source platform, wherein the recommendation list comprises media contents to be recommended to the user.
  33. The apparatus according to claim 31 or 32, wherein the processing module is specifically configured to:
    if a user to be recommended is identified, determining media content of which the evaluation information associated with the user to be recommended meets a first preset condition, wherein the first preset condition is one of the following: the evaluation score is higher than a preset score, or the evaluation ranking is higher than a preset ranking;
    and determining an evaluation list according to the media content meeting the first preset condition.
  34. The apparatus according to claim 31 or 32, wherein the processing module is specifically configured to:
    if at least two users to be recommended are identified, determining media content of which the evaluation information associated with each of the at least two users to be recommended meets a second preset condition, wherein the second preset condition is one of the following: the evaluation score is higher than a preset score, or the evaluation ranking is higher than a preset ranking;
    and determining an evaluation list according to the media content that meets the second preset condition for all of the at least two users to be recommended.
  35. A terminal device, characterized in that it comprises a camera and/or a microphone and the media content recommendation apparatus according to any one of claims 18-34.
CN201980094051.6A 2019-04-29 2019-04-29 Media content recommendation method and equipment Pending CN113574525A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/084922 WO2020220180A1 (en) 2019-04-29 2019-04-29 Media content recommendation method and device

Publications (1)

Publication Number Publication Date
CN113574525A (en) 2021-10-29

Family

ID=73029587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980094051.6A Pending CN113574525A (en) 2019-04-29 2019-04-29 Media content recommendation method and equipment

Country Status (2)

Country Link
CN (1) CN113574525A (en)
WO (1) WO2020220180A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112492390A (en) * 2020-11-20 2021-03-12 海信视像科技股份有限公司 Display device and content recommendation method
CN112632369B (en) * 2020-12-05 2023-03-24 武汉风行在线技术有限公司 Short video recommendation system and method for identifying laughter
DE102022003320A1 (en) 2021-09-15 2023-03-16 Mercedes-Benz Group AG System for predicting user ratings and method thereof
CN114638517B (en) * 2022-03-24 2023-01-17 中咨数据有限公司 Data evaluation analysis method and device based on multiple dimensions and computer storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045115B (en) * 2015-05-29 2018-08-07 四川长虹电器股份有限公司 A kind of control method and smart home device
CN104917896A (en) * 2015-06-12 2015-09-16 努比亚技术有限公司 Data pushing method and terminal equipment
US10664500B2 (en) * 2015-12-29 2020-05-26 Futurewei Technologies, Inc. System and method for user-behavior based content recommendations
CN107972028B (en) * 2017-07-28 2020-10-23 北京物灵智能科技有限公司 Man-machine interaction method and device and electronic equipment

Also Published As

Publication number Publication date
WO2020220180A1 (en) 2020-11-05

Similar Documents

Publication Publication Date Title
CN110445978B (en) Shooting method and equipment
EP3923634B1 (en) Method for identifying specific position on specific route and electronic device
WO2020211701A1 (en) Model training method, emotion recognition method, related apparatus and device
CN113542839B (en) Screen projection method of electronic equipment and electronic equipment
KR101906827B1 (en) Apparatus and method for taking a picture continously
CN113574525A (en) Media content recommendation method and equipment
CN111061912A (en) Method for processing video file and electronic equipment
CN114710640B (en) Video call method, device and terminal based on virtual image
US20230421900A1 (en) Target User Focus Tracking Photographing Method, Electronic Device, and Storage Medium
CN111917980B (en) Photographing control method and device, storage medium and electronic equipment
US20230275955A1 (en) Data sharing method, apparatus, and system, and electronic device
CN112272191B (en) Data transfer method and related device
CN111027374B (en) Image recognition method and electronic equipment
CN112148899A (en) Multimedia recommendation method, device, equipment and storage medium
CN114339429A (en) Audio and video playing control method, electronic equipment and storage medium
CN114090986A (en) Method for identifying user on public equipment and electronic equipment
CN113890745A (en) Service connection decision method and device and electronic equipment
WO2020051852A1 (en) Method for recording and displaying information in communication process, and terminals
US20230402150A1 (en) Adaptive Action Evaluation Method, Electronic Device, and Storage Medium
WO2022214004A1 (en) Target user determination method, electronic device and computer-readable storage medium
CN114120950B (en) Human voice shielding method and electronic equipment
CN114120987B (en) Voice wake-up method, electronic equipment and chip system
CN113838478B (en) Abnormal event detection method and device and electronic equipment
CN114302063B (en) Shooting method and equipment
CN115700847A (en) Drawing book reading method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination