CN111698564A - Information recommendation method, device, equipment and storage medium


Info

Publication number
CN111698564A
Authority
CN
China
Prior art keywords
interest
video
user
elements
user account
Prior art date
Legal status
Granted
Application number
CN202010731307.4A
Other languages
Chinese (zh)
Other versions
CN111698564B (en)
Inventor
林洁娴
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010731307.4A
Publication of CN111698564A
Application granted
Publication of CN111698564B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/475End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H04N21/4756End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data for rating content, e.g. scoring a recommended movie

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses an information recommendation method, device, equipment and storage medium, relating to the field of human-computer interaction. The method includes the following steps: acquiring a user interest area of a target client while a video picture of a first video is playing, wherein the first video is a video published by a first user account; extracting interest elements from the user interest area, wherein the interest elements include at least one of person elements and object elements; extracting interest sub-elements from the interest elements; and recommending information of a second user account to the target client in response to an interest sub-element being associated with the second user account. This simplifies the steps a user takes to find related videos of interest and improves human-computer interaction efficiency.

Description

Information recommendation method, device, equipment and storage medium
Technical Field
The present application relates to the field of human-computer interaction, and in particular, to an information recommendation method, apparatus, device, and storage medium.
Background
A short video application can recommend short videos that a user is likely to enjoy, based on the types of videos the user frequently watches and related algorithms.
Take a short video in which multiple people appear. Illustratively, the short video belongs to the dance category, and the user is interested in the dancer in the middle of the frame and wants to view related videos about that dancer. However, if the dancer is not the publisher of the short video, the user typically browses the comments of the short video, hoping to find the dancer's user account in the short video application from the comments, and then searches the application for that dancer's video content.
In this technical scheme, the user has to browse the comments of the short video to find information about the person of interest, which involves cumbersome steps; moreover, the relevant information may not appear in the comments at all, in which case the user cannot watch the videos of interest.
Disclosure of Invention
Embodiments of the present application provide an information recommendation method, device, equipment and storage medium, which obtain a user's interest elements through eye tracking technology so as to recommend videos matching those interest elements to the user. The technical scheme is as follows:
according to an aspect of the present application, there is provided an information recommendation method, the method including:
acquiring a user interest area of a target client when a video picture of a first video is played, wherein the first video is a video issued by a first user account;
extracting an interest element from the user interest area, wherein the interest element comprises at least one of a character element and an object element;
extracting interest sub-elements from the interest elements;
and responding to the association of the interest sub-element and a second user account, and recommending the information of the second user account to a target client.
According to another aspect of the present application, there is provided an information recommendation method, the method including:
displaying a video picture while a first video is playing, wherein the first video is a video published by a first user account;
collecting at least one of eyeball tracking data and screen operation data;
and displaying information of a second user account on the video picture, wherein the second user account is associated with an interest sub-element of an interest element, the interest element lies within a user interest area in the video picture, and the user interest area is obtained from at least one of the eyeball tracking data and the screen operation data.
According to another aspect of the present application, there is provided an information recommendation apparatus, the apparatus including:
an acquisition module, configured to acquire a user interest area of a target client while a video picture of a first video is playing, wherein the first video is a video published by a first user account;
an element extraction module, configured to extract an interest element from the user interest area, wherein the interest element includes at least one of a person element and an object element;
the element extraction module being further configured to extract an interest sub-element from the interest element;
and an information recommendation module, configured to recommend information of a second user account to the target client in response to the interest sub-element being associated with the second user account.
According to another aspect of the present application, there is provided an information recommendation apparatus, the apparatus including:
a display module, configured to display a video picture while a first video is playing, wherein the first video is a video published by a first user account;
a collection module, configured to collect at least one of eyeball tracking data and screen operation data;
the display module being further configured to display information of a second user account on the video picture, wherein the second user account is associated with an interest sub-element of an interest element, the interest element lies within a user interest area in the video picture, and the user interest area is obtained from at least one of the eyeball tracking data and the screen operation data.
According to another aspect of the present application, there is provided a computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the information recommendation method according to the above aspect.
According to another aspect of the present application, there is provided a computer-readable storage medium having stored therein at least one instruction, at least one program, code set, or set of instructions that is loaded and executed by a processor to implement the information recommendation method according to the above aspect.
According to another aspect of the application, a computer program product or computer program is provided, comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer readable storage medium by a processor of a computer device, and the processor executes the computer instructions to cause the computer device to execute the information recommendation method as described above.
The beneficial effects brought by the technical solutions provided in the embodiments of the present application include at least the following:
The user interest area of a user watching a video is acquired and used to determine which part of the video content the user is interested in. Through fine division of interest elements, the video content of interest can be identified accurately, so that a second user account associated with that content is recommended to the target client (user). This simplifies the steps a user takes to find related videos of interest and improves human-computer interaction efficiency.
Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings required for the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic interface diagram of an information recommendation method provided in an exemplary embodiment of the present application;
FIG. 2 is a block diagram of a computer system provided in an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a method for information recommendation provided by an exemplary embodiment of the present application;
FIG. 4 is a flow chart of a method of information recommendation provided by another exemplary embodiment of the present application;
FIG. 5 is an interface schematic diagram of an information recommendation method provided in another exemplary embodiment of the present application;
FIG. 6 is a flow chart of a method of information recommendation provided by another exemplary embodiment of the present application;
FIG. 7 is an interface diagram of an information recommendation method provided in another exemplary embodiment of the present application;
FIG. 8 is a flow chart of a method of information recommendation provided by another exemplary embodiment of the present application;
FIG. 9 is an interface diagram of an information recommendation method provided in another exemplary embodiment of the present application;
FIG. 10 is a flow chart of a method of information recommendation provided by another exemplary embodiment of the present application;
FIG. 11 is a schematic illustration of a home page interface for a user account as provided by an exemplary embodiment of the present application;
FIG. 12 is a schematic diagram of a video picture of a second video provided by an exemplary embodiment of the present application;
FIG. 13 is a flow chart of a method of information recommendation provided by another exemplary embodiment of the present application;
FIG. 14 is a block diagram of an information recommendation apparatus according to an exemplary embodiment of the present application;
FIG. 15 is a block diagram of an information recommendation apparatus according to another exemplary embodiment of the present application;
FIG. 16 is a block diagram of a server according to an exemplary embodiment of the present application;
FIG. 17 is a schematic structural diagram of a computer device according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, terms related to embodiments of the present application will be described.
Eye tracking technology (Eye Tracking): a technology that acquires the eye movement information of a user of a terminal and controls the terminal to perform the operation corresponding to that eye movement information, without requiring the user to operate the terminal manually. For example, the terminal acquires the user's eye movement track, determines that the user wants to switch the page being viewed to the next page, and switches the page accordingly.
Eye tracking can be implemented in three ways: first, tracking based on characteristic changes of the eyeball and the area around it; second, tracking based on changes in the iris angle; third, projecting a light beam such as infrared onto the iris and extracting features.
The sampling ability of the human eye limits how visual information can be extracted from the surroundings. Because visual acuity drops rapidly once the line of sight moves out of the central region of the visual field, the position where the line of sight falls can be obtained from eye movement.
Eye movement behavior includes fixation behavior and saccade behavior. Fixation behavior refers to holding the line of sight on a target for a certain length of time to obtain visual information; the duration of a fixation typically varies between 80 and 600 milliseconds, and during a fixation the user's eyeball is not completely still (micro-saccades, ocular tremor and similar behaviors occur). Saccade behavior refers to the rapid movement of the eye from one point to another; in contrast to fixation behavior, almost no effective visual information is obtained during a saccade, whose duration typically varies between 20 and 40 milliseconds.
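The fixation and saccade durations quoted above suggest a simple duration-based classifier for eye movement samples. The sketch below is illustrative only; the `EyeMovement` type, the field names, and the placement of the 80 ms cut-off are assumptions for the example, not part of the patent.

```python
from dataclasses import dataclass


@dataclass
class EyeMovement:
    x: float            # gaze point, normalized screen coordinates
    y: float
    duration_ms: float  # how long the gaze stayed near this point


def classify_movement(event: EyeMovement) -> str:
    """Label an eye-movement sample as fixation or saccade.

    Thresholds follow the durations quoted in the text: fixations
    typically last 80-600 ms, saccades roughly 20-40 ms.
    """
    if event.duration_ms >= 80:
        return "fixation"  # long enough to extract visual information
    return "saccade"       # rapid jump between points; little information


print(classify_movement(EyeMovement(0.4, 0.5, 230)))  # fixation
print(classify_movement(EyeMovement(0.7, 0.2, 30)))   # saccade
```

A real system would also need to group raw gaze samples into events before classifying them; that step is omitted here.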
AI (Artificial Intelligence) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive discipline of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, video processing, natural language processing, and machine learning/deep learning.
With the research and progress of artificial intelligence technology, it has been developed and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, autonomous driving, unmanned aerial vehicles, robots, smart healthcare, smart customer service, face recognition, object recognition, and the like.
The following description takes obtaining an interest element from eyeball tracking data as an example. As shown in FIG. 1, an embodiment of the present application provides an information recommendation method, applied to a video recommendation scenario. The method is applied to the computer system shown in FIG. 2 and includes the following steps:
1. The terminal collects eyeball tracking data.
An application program supporting a video playing function runs on the terminal 210. A user watches a video 11 published by a first user account in the application; the terminal 210 starts the camera and the eye tracking system to track the movement track of the user's eyeballs and the fixation point focused on the video. The terminal 210 then sends the collected eyeball tracking data to the background server of the application.
2. The background server determines, according to the eyeball tracking data, the video information in which the user is interested.
The background server invokes an interest analysis system, which determines the video region in which the user is interested. Illustratively, the interest analysis system scores, according to the eyeball tracking data, the video regions where the user's line of sight falls while watching the video, and determines the region of interest according to the scores.
First, the interest analysis system identifies the people or objects appearing in the video. As shown in FIG. 1, two dancers appear in the video. Illustratively, the dancer on the left corresponds to a first user, who corresponds to a first user account in the video application; the dancer on the right corresponds to a second user, who corresponds to a second user account in the video application. The first user corresponds to region 101 in the video and the second user corresponds to region 102. The video being watched is the video published by the first user account.
Then, the interest analysis system determines the user's region of interest in the video according to the proportion of the region where the user's gaze point falls relative to the entire video area. Illustratively, if the gaze point falls in region 101 and region 101 accounts for 60% of the entire video area, the interest analysis system determines region 101 to be the region of interest; that is, the user is more interested in the first user in the video.
In some embodiments, the interest analysis system may determine the user's region of interest by combining a plurality of judgment conditions. Illustratively, the duration for which the gaze point stays on a region of the video is measured, and when that duration exceeds a duration threshold, the region is determined to be the user interest area. Illustratively, the interest analysis system judges from the eyeball tracking data whether the user's eye movement belongs to saccade behavior or fixation behavior, and when it belongs to fixation behavior, the watched region is judged to be the user interest area. The region of interest, which contains the interest element, is finally determined through a weighted calculation over these judgment conditions.
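One way to realize the weighted calculation over judgment conditions described above is a linear score per candidate region, combining area proportion, dwell time, and whether the behavior was a fixation. The weights, the dwell-time normalization cap, and the data layout below are illustrative assumptions, not values given in the patent.

```python
def score_region(area_ratio, dwell_ms, is_fixation,
                 w_area=0.4, w_dwell=0.4, w_fixation=0.2,
                 dwell_cap_ms=2000.0):
    """Weighted combination of the judgment conditions.

    area_ratio:  fraction of the video frame occupied by the region where
                 the gaze point falls (e.g. 0.6 for region 101).
    dwell_ms:    how long the gaze point stayed in the region.
    is_fixation: whether the eye movement there was fixation behavior.
    """
    dwell_score = min(dwell_ms / dwell_cap_ms, 1.0)  # normalize to [0, 1]
    fixation_score = 1.0 if is_fixation else 0.0
    return w_area * area_ratio + w_dwell * dwell_score + w_fixation * fixation_score


def pick_interest_region(regions):
    """Return the id of the highest-scoring candidate region."""
    return max(
        regions,
        key=lambda r: score_region(r["area_ratio"], r["dwell_ms"], r["is_fixation"]),
    )["id"]


candidates = [
    {"id": 101, "area_ratio": 0.6, "dwell_ms": 1800, "is_fixation": True},
    {"id": 102, "area_ratio": 0.3, "dwell_ms": 200, "is_fixation": False},
]
print(pick_interest_region(candidates))  # 101
```

In the FIG. 1 example, region 101 wins on all three conditions, matching the text's conclusion that the user is more interested in the first user.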
The interest analysis system performs fine segmentation on region 101. As shown in FIG. 1, region 101 is divided into region 103, region 104 and region 105, where region 103 contains the head element of the first user, region 104 contains the upper-body element of the first user, and region 105 contains the lower-body element of the first user. The interest analysis system further determines the user's interest sub-element according to the region where the gaze point falls: for example, if the gaze point falls in region 104, the interest analysis system determines that region 104 is the region of interest, and the upper-body element contained in region 104 is the user's interest sub-element.
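The sub-element lookup just described, in which a gaze point falling in region 104 selects the upper-body element, amounts to a point-in-rectangle test over the finely segmented sub-regions. The coordinates, box layout, and function name below are hypothetical, chosen only to illustrate the idea.

```python
def locate_sub_element(gaze_x, gaze_y, sub_regions):
    """Return the name of the sub-region whose bounding box contains the gaze point.

    sub_regions maps a sub-element name to an (x0, y0, x1, y1) box in
    normalized video coordinates; None is returned if the gaze point
    falls outside every box.
    """
    for name, (x0, y0, x1, y1) in sub_regions.items():
        if x0 <= gaze_x <= x1 and y0 <= gaze_y <= y1:
            return name
    return None


# Hypothetical layout of the fine segmentation of region 101:
# region 103 (head), region 104 (upper body), region 105 (lower body).
sub_regions = {
    "head": (0.10, 0.00, 0.40, 0.30),
    "upper_body": (0.10, 0.30, 0.40, 0.65),
    "lower_body": (0.10, 0.65, 0.40, 1.00),
}
print(locate_sub_element(0.25, 0.45, sub_regions))  # upper_body
```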
3. The background server sends the video information in which the user is interested to the terminal.
The background server invokes the portrait/object recognition system to recognize region 104. Illustratively, region 104 contains the upper-body element of the first user; the background server invokes the object recognition system, recognizes that the region contains ballet clothes, and determines the user accounts that have published videos related to ballet clothes. If user account A has published related videos such as "how beginners purchase ballet clothes" and "recommended ballet clothes", and the background server determines that user account A is not the first user account, the background server sends the related video information 16 of user account A to the terminal 210.
The user views the video information 16 associated with user account A on the terminal 210 and can tap the video information 16 to watch the videos published by user account A.
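The account-matching step above, recognizing an object, finding accounts that published related videos, and excluding the first user account, can be sketched as a filter over a tag index. The index structure, tag strings, and account names below are illustrative assumptions, not the patent's data model.

```python
def recommend_accounts(recognized_tag, video_index, first_account):
    """Accounts that published videos matching the recognized element,
    excluding the account that published the video being watched."""
    matches = {
        video["account"]
        for video in video_index
        if recognized_tag in video["tags"] and video["account"] != first_account
    }
    return sorted(matches)


video_index = [
    {"account": "user_account_a", "tags": ["ballet clothes", "tutorial"]},
    {"account": "first_user", "tags": ["ballet clothes", "dance"]},
    {"account": "user_account_b", "tags": ["street dance"]},
]
print(recommend_accounts("ballet clothes", video_index, "first_user"))
# ['user_account_a']
```

A production system would rank the matching accounts rather than return them all; the exclusion of the first user account mirrors the check described in the example above.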
With the method provided in this embodiment of the present application, videos the user wants to watch can be recommended based on eyeball tracking data without the user manually searching for them, which simplifies the steps of querying videos of interest and improves the efficiency of video information recommendation.
The information recommendation method provided in the embodiments of the present application can be applied to a computer device with strong data processing capability. In a possible implementation, the method can be applied to a personal computer, a workstation or a server; that is, information recommendation may be performed by a personal computer, a workstation or a server.
Illustratively, the information recommendation function is implemented as a part of an application program, and the application program is installed in the terminal, so that the terminal has a function of recommending information to the target client; or the information recommendation function is set in a background server of the application program, so that the terminal installed with the application program recommends information to the target client by means of the background server.
Referring to FIG. 2, a block diagram of a computer system according to an exemplary embodiment of the present application is shown. The computer system 200 includes a terminal 210 and a server 220, wherein the terminal 210 and the server 220 perform data communication through a communication network, optionally, the communication network may be a wired network or a wireless network, and the communication network may be at least one of a local area network, a metropolitan area network, and a wide area network.
The terminal 210 runs an application program that supports a video playing function. The application program may be a news application, a social application, a short video application, a live streaming application, a music application, a shopping application, a virtual reality application, or the like; the type of application program is not limited in this embodiment.
In some embodiments, the terminal 210 may be a mobile terminal such as a smart phone, a tablet computer, a laptop portable notebook computer, or a terminal such as a desktop computer, a projection computer, and the like, and the type of the terminal is not limited in the embodiments of the present application.
The server 220 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like. In one possible implementation, server 220 is a backend server for applications in terminal 210.
As shown in FIG. 2, in this embodiment of the present application, a user watches a video 11 published by a first user account on the terminal 210. Using eyeball tracking technology, the terminal 210 collects the user's eyeball tracking data 12 while the video is being watched and sends the eyeball tracking data 12 to the server 220. The server 220 invokes the interest analysis system 13 to analyze the eyeball tracking data 12 and obtain the user interest area 14, which represents the area corresponding to the person or object in the video in which the user is interested. The server 220 then invokes the portrait recognition system/object recognition system 15 to identify the person or object in the user interest area 14 and obtains the related information 16 of the second user account. In some embodiments, the terminal 210 obtains the user interest area 14 by receiving the user's operations on the display screen: for example, when the user zooms in on a person displayed in the video, the terminal sends the zoom operation to the server 220 as interest point data, and the server 220 determines, according to the zoom operation, that the video area corresponding to the enlarged person is the user interest area 14.
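The screen-operation path just described, treating a pinch-to-zoom gesture as an interest signal, can be sketched as mapping the zoom center onto the recognized regions of the frame. This is an illustrative sketch, not the patent's implementation; the region table, the `min_scale` threshold, and the function name are assumptions made for the example.

```python
def region_from_zoom(zoom_center, zoom_scale, regions, min_scale=1.2):
    """Map a pinch-to-zoom gesture to a candidate user interest area.

    zoom_center: (x, y) of the gesture in normalized video coordinates.
    zoom_scale:  resulting zoom factor; small zooms are ignored.
    regions:     {region_id: (x0, y0, x1, y1)} bounding boxes of the
                 people or objects recognized in the frame.
    """
    if zoom_scale < min_scale:
        return None  # too small a zoom to count as an interest signal
    x, y = zoom_center
    for region_id, (x0, y0, x1, y1) in regions.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return region_id
    return None


# Hypothetical frame with the two dancers of FIG. 1 side by side.
frame_regions = {101: (0.0, 0.0, 0.5, 1.0), 102: (0.5, 0.0, 1.0, 1.0)}
print(region_from_zoom((0.25, 0.5), 1.5, frame_regions))  # 101
```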
Illustratively, the first user account publishes a video about dancing, in which the first user corresponding to the first user account and a second user corresponding to a second user account dance together. The eyeball tracking data collected by the terminal 210 indicate that the user is more interested in the second user, so the server 220 sends the related information 16 of the second user account to the terminal 210; the video 11 published by the first user account is displayed on the terminal, and the related information 16 of the second user account is displayed superimposed on the video 11. The user may view videos published by the second user account by clicking the related information 16.
In other possible embodiments, the interest analysis system 13 and the person recognition system/object recognition system 15 may be disposed in the terminal, and the terminal 210 may determine the person or object the user is interested in by combining at least one of the collected eyeball tracking data and screen operation data, without the aid of the server 220.
Fig. 3 is a flowchart illustrating an information recommendation method according to an exemplary embodiment of the present application. The embodiment is described by taking the method as an example for being used in the server 220 in the computer system 200 shown in fig. 2, and the method comprises the following steps:
step 301, obtaining a user interest area of a target client when playing a video screen of a first video, where the first video is a video published by a first user account.
The target client is a client used by a user viewing the first video. Illustratively, the user views the first video in an application program, where the application program includes at least one of a video playback application, a music playback application, a live streaming application, and a short video application. In some embodiments, a user needs to log in to a user account to publish a video in the application program, but does not need to log in to a user account to watch videos. The embodiment of the present application is described by taking, as an example, a user watching a video published by the first user account in an application program.
The user interest area refers to the video area corresponding to an element the user is interested in when watching a video. The user interest area is obtained from at least one of eyeball tracking data and screen operation data. The elements in the user interest area include at least one of a character element and an object element. In the embodiment of the present application, the user interest area is a partial area of the video area.
Illustratively, the terminal collects at least one of eyeball tracking data and screen operation data and sends the data to a server of the application program, and the server judges the user interest area in the first video according to the data.
Step 302, extracting an interest element from the user interest area, wherein the interest element comprises at least one of a character element and an object element.
Illustratively, the server includes an interest analysis system for analyzing the interest element in the first video based on at least one of eye tracking data and screen manipulation data. Illustratively, the interest analysis system is constructed based on an interest analysis model, the interest analysis model is a machine learning model with an interest element identification capability, and the interest elements in the first video are obtained by inputting eyeball tracking data of a user watching the first video into the interest analysis model; or, inputting screen operation data of the user watching the first video into the interest analysis model to obtain the interest elements in the first video.
In some embodiments, the interest analysis model first divides the regions corresponding to the elements in the video, for example, dividing the video region corresponding to a character element from the other video regions, or dividing the video region corresponding to an object element from the other video regions. Interest elements are then determined from the divided video, and the divided video region is the user interest area. In other embodiments, the elements in the video are divided by a pre-trained element division model, and the element-divided video is then input into the interest analysis model, so as to obtain the interest elements from the plurality of divided elements.
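As a hedged sketch of this segment-then-select idea only: the region boxes, labels, and gaze points below are hypothetical illustrations, and the application itself uses a machine learning model whose internals are not specified. The sketch picks the divided region that contains the most gaze points as the user interest area.

```python
# Minimal sketch: pick the divided video region that contains the most
# gaze points as the user interest area, and return its element label.
# Region boxes, labels, and gaze points are hypothetical illustrations.

def in_box(point, box):
    """box = (x0, y0, x1, y1); point = (x, y)."""
    x, y = point
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1

def select_interest_region(regions, gaze_points):
    """regions: {label: box}; returns the label whose box holds most gaze points."""
    counts = {label: sum(in_box(p, box) for p in gaze_points)
              for label, box in regions.items()}
    return max(counts, key=counts.get)

regions = {"character": (0, 0, 50, 100), "object": (50, 0, 100, 100)}
gaze = [(10, 20), (15, 30), (40, 80), (70, 10)]
print(select_interest_region(regions, gaze))  # most gaze points fall in "character"
```

In a real system the region boxes would come from the element division model and the gaze points from the eyeball tracking data.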
The interest element refers to the video content in which the user is interested, and includes at least one of a character element and an object element in the video. In some embodiments, the interest elements also include scene elements; for example, for a video shot in a coffee shop, the scene element is an indoor element; for a video shot on a tennis court, the scene element is an outdoor element.
In some embodiments, the interest analysis model is trained by:
obtaining sample eyeball tracking data, where the sample eyeball tracking data is annotated with calibrated interest elements;
inputting the sample eyeball tracking data into an interest analysis model to obtain predicted interest elements of the sample eyeball tracking data;
and training the interest analysis model according to the calibrated interest elements and the predicted interest elements to obtain the trained interest analysis model.
In some embodiments, the interest analysis model is trained by:
obtaining sample screen operation data, where the sample screen operation data is annotated with calibrated interest elements;
inputting the sample screen operation data into an interest analysis model to obtain predicted interest elements of the sample screen operation data;
and training the interest analysis model according to the calibrated interest elements and the predicted interest elements to obtain the trained interest analysis model.
It is to be understood that the sample data for training the interest analysis model may be mixed sample data, which includes both eyeball tracking data and screen operation data.
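As a hedged illustration only, the calibrate-predict-train loop above can be imitated with a toy nearest-neighbour classifier over mixed samples. The feature vectors and labels here are invented for illustration; the application's interest analysis model is a machine learning model whose architecture is not specified.

```python
# Toy stand-in for the interest analysis model: a 1-nearest-neighbour
# classifier "trained" on mixed samples (eye tracking + screen operation),
# each calibrated with the interest element it corresponds to.

def train(samples):
    """samples: list of (feature_vector, calibrated_interest_element)."""
    return list(samples)  # a 1-NN "model" simply memorizes the samples

def predict(model, features):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(model, key=lambda s: dist(s[0], features))
    return label

# Hypothetical mixed sample data: (gaze_duration_s, zoom_ratio) -> element
mixed_samples = [
    ((10.0, 1.0), "character"),  # long gaze, no zoom
    ((1.0, 2.0), "object"),      # short gaze, strong zoom-in
]
model = train(mixed_samples)
print(predict(model, (9.0, 1.1)))  # closer to the "character" sample
```

The calibrated labels play the role of the "calibrated interest elements", and comparing predictions against them is the training signal described above.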
Step 303, extracting interest sub-elements from the interest elements.
In response to the interest element belonging to the character element, the interest sub-element includes at least one of a head element, an upper body element, and a lower body element of the character element.
It will be appreciated that the interest sub-elements may be further divided; for example, the head element includes facial feature elements and hair elements; the upper body element includes arm elements, chest elements, and back elements; the lower body element includes leg elements and foot elements. The embodiment of the present application is described by taking, as an example, the interest sub-elements including a head element, an upper body element, and a lower body element.
In response to the interest element belonging to the object element, the interest sub-element includes at least one of a mark element and a structural element of the object element.

The mark element refers to a characteristic element capable of identifying an article. Illustratively, the mark element is the logo of the brand to which the article belongs, such as the car logo of a car displayed in a video, which represents the characteristics of the car.
Structural elements refer to elements that constitute an article. Illustratively, a table is shown in the video, and the structural elements of the table include a table top element and a table leg element; the video shows a car whose structural elements include body elements and wheel elements.
Illustratively, the server includes at least one of a person recognition system and an object recognition system. The person recognition system is constructed based on a face recognition model, and the object recognition system is constructed based on an object recognition model.
When the interest element belongs to the character element, the server calls the face recognition model to recognize the interest element, and obtains at least one of the head element, the upper body element, and the lower body element of the character element.

When the interest element belongs to the object element, the server calls the object recognition model to recognize the interest element, and obtains the mark element and the structural element of the object element.
Illustratively, the interest element is input into the face recognition model or the object recognition model, and the interest sub-element is output; or, the eyeball tracking data of the user watching the video is input into the interest analysis model, and the interest sub-element is output.
And step 304, in response to the interest sub-element being associated with the second user account, recommending information of the second user account to the target client.
Illustratively, the server stores the association relationship between the interest sub-element and the second user account in advance. When the server identifies the interest sub-element through the above steps 302 and 303, it determines the second user account corresponding to the interest sub-element according to the association relationship, and thus recommends the information of the second user account to the target client.
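A minimal sketch of this pre-stored association lookup; all mapping keys, account names, and the returned structure are hypothetical.

```python
# Pre-stored association between interest sub-elements and user accounts.
# The server looks up the second user account for a recognized sub-element
# and recommends its information to the target client. Entries are invented.

ASSOCIATIONS = {
    "head_element_user_b": "second_user_account_b",
    "logo_element_brand_x": "second_user_account_x",
}

def recommend(interest_sub_element):
    account = ASSOCIATIONS.get(interest_sub_element)
    if account is None:
        return None  # no association stored; nothing to recommend
    return {"account": account, "info": f"profile of {account}"}

print(recommend("head_element_user_b")["account"])  # second_user_account_b
```

In practice the association table would live in the server's storage and the recommendation would be pushed to the target client rather than returned.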
The target client is a client used by a user viewing the first video. The information of the second user account comprises at least one of the second user account, a nickname corresponding to the second user account, the gender of the second user account, a head portrait corresponding to the second user account and a personal signature of the second user account. In some embodiments, the server also recommends the video of the second user account published on the application to the target client.
It can be understood that the server may also directly associate the interest element in the video with the second user account, and recommend the second user account to the target client in response to the server detecting the interest element associated with the second user account.
In summary, in the method provided by this embodiment, the user interest area of the user when watching the video is acquired and used to determine which part of the video content the user is interested in. By finely dividing the interest elements, the video content the user is interested in can be accurately determined, so that the second user account associated with that content is recommended to the target client (user). This simplifies the steps a user takes to search for related videos of interest and improves human-computer interaction efficiency.
Fig. 4 shows a flowchart of an information recommendation method according to another exemplary embodiment of the present application. The embodiment is described by taking the method as an example for being used in the server 220 in the computer system 200 shown in fig. 2, and the method comprises the following steps:
step 401, obtaining a user interest area of a target client when playing a video screen of a first video, where the first video is a video published by a first user account.
The mode of acquiring the interest area of the user comprises at least one of the following modes:
acquiring eyeball tracking data; acquiring a first user interest area from a video picture according to eyeball tracking data; or acquiring screen operation data; and acquiring a second user interest area from the video picture according to the screen operation data.
The eyeball tracking data refers to data collected by an eyeball tracking technology that records the change of the line of sight focused on the video. Illustratively, the eyeball tracking data is collected through a camera of a terminal (such as a smartphone or a notebook computer), and the terminal sends the collected eyeball tracking data to the server; or, the eyeball tracking data is collected through professional collection equipment (such as an infrared device capable of emitting infrared rays, or an eye tracker in the form of glasses) connected to the server or to the terminal, and the collected tracking data is sent to the server by the professional collection equipment or by the terminal. This is not limited in the embodiments of the present application.
The measurement index of the eyeball tracking data comprises at least one of an eye movement track diagram, eye movement time, direction and distance of eye movement and pupil diameter.
The eye movement track map is a route map formed by superimposing the eyeball motion information on the viewed image, showing the fixation points and the movement between fixation points. The eye movement track map intuitively reflects the characteristics of the line-of-sight movement.
The eye movement time refers to the time information corresponding to various eyeball actions, and includes the fixation time (fixation dwell time), the eye jump time, the regression time, the smooth pursuit time, and the micro-movement time during fixation (including the spontaneous high-frequency micro-tremor, slow drift, and micro-saccade time of the eyeball). Illustratively, the eye movement time may be further decomposed; for example, the gaze time may be decomposed into a single gaze time, a first gaze time, a total gaze time, and the like.
The direction and distance of eye movement refer to the orientation and distance moved by the eyeball as it moves. The speed of eyeball movement (eye movement speed) can be calculated according to the direction and distance of eye movement.
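The relation between movement distance, time, and speed stated above can be sketched directly; the coordinate system and the pixels-per-second unit are assumptions for illustration.

```python
import math

def eye_movement_speed(start, end, duration_s):
    """Speed of an eye movement from the (x, y) start point to the (x, y)
    end point over duration_s seconds, in pixels per second (assumed units)."""
    distance = math.dist(start, end)  # Euclidean distance moved by the gaze
    return distance / duration_s

print(eye_movement_speed((0, 0), (30, 40), 0.5))  # 100.0 px/s
```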
The pupil diameter can reflect the psychological activity of a person; for example, when a person is frightened, the pupil diameter becomes larger. A change in pupil diameter can also indicate that the person has received new information.
Illustratively, a user opens a short video application on the terminal, and the terminal starts a camera to collect a sight line change of the user when watching a short video (a first video issued by a first user account), that is, to collect eyeball tracking data. The terminal uploads eyeball tracking data of the user to a background server of the short video application program, so that the background server obtains the eyeball tracking data.
The screen operation data is data generated when the display screen of the terminal receives a touch operation. The touch operation includes at least one of a two-finger zoom-in operation, a double-click operation (a single-finger double click), a continuous-click operation (more than two single-finger clicks in succession), and a two-finger zoom-out operation. In some embodiments, the terminal is a terminal with an external input device, such as a desktop computer connected to a mouse. Illustratively, when the user zooms in on content displayed on the display screen by sliding the mouse wheel, the desktop computer collects the screen operation data.
Illustratively, a user opens a short video application on the terminal to view the first video, and when the user enlarges elements in a video picture of the first video through an enlarging operation, the terminal collects screen operation data according to the user's operation. And the terminal uploads the screen operation data to a background server of the short video application program, so that the background server obtains the screen operation data.
Step 402, extracting an interest element from a user interest area, wherein the interest element comprises at least one of a character element and an object element.
Illustratively, the eyeball tracking data includes gazing point data and eye movement behavior data. The background server of the short video application determines the interest elements in the first video according to the gazing point data and the eye movement behavior data, and extracts the interest elements from the first user interest area or from the second user interest area.
In some embodiments, the server includes an interest analysis system constructed based on an interest analysis model; the acquired eyeball tracking data is input into the interest analysis model, and the interest analysis model outputs the elements the user is interested in in the short video (the first video) based on the acquired gazing point data and eye movement behavior data. For example, the interest analysis model outputs that the user is interested in the character elements in the short video.
In some embodiments, the server includes at least one of a face recognition model and an object recognition model, and the server first calls the interest analysis model to identify a region of interest of the user in the first video, and then calls the face recognition model and the object recognition model to identify elements contained in the region of interest, that is, identifies whether the video contains at least one of a character element and an object element through the face recognition model and the object recognition model.
As shown in the left diagram in fig. 5, the server invokes the interest analysis model to identify a region of interest of the user in the first video, where the region of interest is the region 101, and then invokes the face recognition model to identify elements in the region 101, so as to obtain that the region 101 includes the person elements, that is, the user is interested in the person elements in the region 101.
And step 403, in response to the interest element belonging to the character element, dividing the interest element to obtain a head element of the character element, an upper body element of the character element and a lower body element of the character element.
Illustratively, in response to that the interest element belongs to the character element, the server calls the interest analysis model and further divides the interest element according to the gazing point data and the eye movement behavior data to obtain the body part element of the character element.
As shown in the right diagram of fig. 5, the interest analysis model further divides the character element the user is interested in into an area 103 corresponding to the head element, an area 104 corresponding to the upper body element, and an area 105 corresponding to the lower body element.
At step 404, an interest sub-element is determined from the head element, the upper body element, and the lower body element.
Illustratively, the gaze point data comprises gaze point locations and the eye movement behavior data comprises gaze behaviors. The gaze point position refers to a position where the user's line of sight is focused in the video while watching the video. The eye movement behavior refers to the eyeball rotation behavior of a user when watching a video, and comprises the fixation behavior and the saccade behavior. Step 404 may be replaced with steps 4041 and 4042 as follows:
step 4041, a first interest value corresponding to the gaze point data and a second interest value corresponding to the eye movement behavior data are obtained.
The server obtains a first interest value corresponding to the gazing point data and a second interest value corresponding to the eye movement behavior data through the following steps:
step 420, acquiring the area ratio of a first region where the gazing point position is located in the first video region, and determining a first interest value of the first region according to the area ratio; or obtaining the watching time length of the watching point position in the first area, and determining the first interest value of the first area according to the watching time length.
Illustratively, the server stores a first correspondence between the area ratio and the first interest value in advance, and determines the first interest value according to the first correspondence and the area ratio. As shown in the right diagram in fig. 5, if the first region where the gazing point position is located is the region 103 and the area ratio of the region 103 to the first video region is 20%, the server determines that the first interest value of the first region is 20 according to the first correspondence and the area ratio. Illustratively, the area ratio and the first interest value have a positive correlation; that is, the larger the area ratio, the larger the corresponding first interest value, and the more interested the user is in the elements in the region.
Illustratively, the server stores a second correspondence between the gazing duration and the first interest value in advance, and determines the first interest value according to the second correspondence and the gazing duration. As shown in the right diagram of fig. 5, if the gazing point position has a gazing duration of 10 seconds in the region 103, the server determines that the first interest value of the first region is 20 according to the second correspondence and the gazing duration. Illustratively, the gazing duration and the first interest value have a positive correlation; that is, the longer the gazing duration, the larger the corresponding first interest value, and the more interested the user is in the elements in the region.
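A sketch of the two positively correlated mappings. The anchor points reproduce the examples in this embodiment (a 20% area ratio or a 10-second gaze both map to a first interest value of 20); the linear form between and beyond those points is an assumption, since the stored correspondences are not given.

```python
# First interest value from either the area ratio of the gazed region or
# the gaze duration. Both mappings are positively correlated; the anchor
# points (20% -> 20, 10 s -> 20) come from this embodiment, and the
# linear form itself is an illustrative assumption.

def first_interest_from_area_ratio(area_ratio):
    return area_ratio * 100  # 0.20 -> 20

def first_interest_from_gaze_duration(duration_s):
    return duration_s * 2.0  # 10 s -> 20

print(first_interest_from_area_ratio(0.20))   # 20.0
print(first_interest_from_gaze_duration(10))  # 20.0
```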
Step 440, in response to the eye movement behavior belonging to the gazing behavior and the gazing behavior resulting in the first area, determining a second interest value for the first area.
Illustratively, the server stores in advance a third correspondence between the eye movement behavior and a second interest value, the second interest value corresponding to the fixation behavior being higher than the second interest value corresponding to the glance behavior.
As shown in the right diagram of fig. 5, the server determines that the eye movement behavior of the user is the gaze behavior and the region gazed by the user is the region 103 according to the eyeball tracking data, and the server determines that the second interest value of the first region is 60.
Step 4042, an interest sub-element is determined from the head element, the upper body element, and the lower body element according to the first interest value and the second interest value.
Illustratively, the interest sub-element is determined by combining the weights of the first interest value and the second interest value. The server calculates the interest value corresponding to each element according to the above steps, and determines the element with the highest interest value as the interest sub-element. For example, the weight of the first interest value is 0.4, and the weight of the second interest value is 0.6. The server calculates the interest values of the area 103, the area 104, and the area 105 according to the above steps, giving the first interest value, the second interest value, and the composite interest value of each of the three areas.
Table 1

(The table image is not reproduced here; it lists the first interest value, the second interest value, and the composite interest value of the area 103, the area 104, and the area 105.)
As can be seen from Table 1, the interest value of the area 103 is the highest, so the server determines the area 103 as the area the user is interested in; the area 103 includes a head element, and the server therefore determines the head element as the interest sub-element.
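Using the weights of 0.4 and 0.6 given above, the composite interest value of the area 103 follows from the values in this embodiment (first interest value 20, second interest value 60); the values shown for the areas 104 and 105 are hypothetical placeholders.

```python
# Composite interest value = 0.4 * first_interest + 0.6 * second_interest.
# Area 103 uses the values from this embodiment (first = 20, second = 60);
# the values for areas 104 and 105 are hypothetical placeholders.

W_FIRST, W_SECOND = 0.4, 0.6

def composite(first, second):
    return W_FIRST * first + W_SECOND * second

areas = {
    "area_103": (20, 60),
    "area_104": (10, 0),   # hypothetical
    "area_105": (5, 0),    # hypothetical
}
scores = {name: composite(f, s) for name, (f, s) in areas.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # area_103 44.0
```

With these weights, the area 103 scores 0.4 × 20 + 0.6 × 60 = 44 and is selected, so its head element becomes the interest sub-element.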
The server acquires a second video matched with the interest sub-element, and the method comprises the following three conditions:
in step 405a, in response to the interest sub-element belonging to the head element, a second video matching the head element is obtained.
The second video matched with the head element includes: a video containing an element similar to the facial contour of the head element, a video containing an element similar to the gender characteristic represented by the head element, and a video containing an element having an association relationship with the identity represented by the head element.
Illustratively, the server calls a face recognition model to recognize the head element and identifies a user a whose facial contour is similar to that of the head element, where the user a may be the user who published the first video, or a user whose appearance is similar to that of the first user who published the first video. The server then acquires videos containing the user a, or videos published by the user a.
Illustratively, the server calls a face recognition model to recognize the head element, the gender feature represented by the head element is recognized to be male, and the server acquires a video containing a male user.
Illustratively, the server calls a face recognition model to recognize the head element and recognizes that it is the head element of a user a. The user a follows a second user account (the user account of a user b) with the first user account on the video application program, that is, the user a and the user b have a friend relationship on the video application program; the server then acquires videos containing the user b or videos published by the user b.
It is understood that the above manners of obtaining the second video matched with the character element may be implemented individually, in combination of two, or all together.
And step 406a, recommending the publisher information of the second video to the target client as the information of the second user account.
The server recommends the acquired publisher information of the second video to the target client. In some embodiments, the server determines that the second video matched with the head element contains the user b and that the video was published by a user c; the server then recommends the user account corresponding to the user c to the target client. As shown in fig. 5, the related information 16 of the second user account about dancing is displayed on the video picture.
And step 405b, responding to the interest sub-element belonging to the upper body element, and acquiring a second video matched with the upper body element.
Similar to the process of identifying the head element, the server acquires the second video according to the information corresponding to the upper body element. For example, the server searches for a video containing a body posture that is the same as or similar to that of the upper body element (e.g., stooping, twisting, etc.).
And step 406b, recommending the publisher information of the second video to the target client as the information of the second user account.
Refer to the embodiment of step 406a; details are not repeated herein.
Step 405c, in response to the interest sub-element belonging to the lower body element, acquiring a second video matching the lower body element.
Similar to the process of recognizing the upper body element, the server acquires the second video according to the information corresponding to the lower body element. For example, the server searches for a video containing a posture similar to that of the lower body element (e.g., running, kicking, the splits, etc.).
And step 406c, recommending the publisher information of the second video to the target client as the information of the second user account.
Refer to the embodiment of step 406a; details are not repeated herein.
It is understood that the above three cases can be implemented individually, or in combination of two or all of them.
In an alternative embodiment based on fig. 4, when the character element includes a decoration element, the manner of obtaining the video matching with the interest sub-element by the server further includes the following steps:
step 405d, responding to the decoration element corresponding to the character element included in the interest sub-element, and acquiring a third video matched with the decoration element.
Illustratively, the server invokes a machine learning model to identify the decoration elements corresponding to the character elements.
When the interest sub-element belongs to the head element, the decoration element includes at least one of a hair style element (e.g., hair color, straight hair, curly hair, long hair, short hair, etc.), a makeup element (e.g., lipstick color number, foundation color number, eye shadow color number, etc.), and an ornament element (e.g., earrings, hairpins, hair bands, etc.).

In one example, the server invokes the machine learning model and identifies that the decoration element corresponding to the head element is a makeup element; the server then acquires a third video matched with the makeup element, where the third video is a video containing the makeup element, such as a video in which a person uses a lipstick of the same or similar color number as the lipstick in the first video, or a lipstick sales video in which a lipstick of the same or similar color number is sold.

When the interest sub-element belongs to the upper body element, the decoration element includes at least one of a clothing element, an accessory element (e.g., a hat, a scarf, a backpack, a necklace, a bracelet, etc.), and a skin marking element (e.g., a birthmark, a tattoo, etc.).

In one example, the server invokes the machine learning model and identifies that the decoration element corresponding to the upper body element is an accessory element; the server then acquires a third video matched with the accessory element, where the third video is a video containing the accessory element, such as a video in which a person wears a scarf of the same or similar style as the scarf in the first video, or a scarf sales video in which a scarf of the same or similar style is sold.

When the interest sub-element belongs to the lower body element, the decoration element includes at least one of a clothing element, an accessory element (e.g., an anklet, etc.), and a skin marking element.

In one example, the server invokes the machine learning model and identifies that the decoration element corresponding to the lower body element is a clothing element; the server then acquires a third video matched with the clothing element, where the third video is a video containing the clothing element, such as a video in which a person wears shoes of the same or similar style as the shoes in the first video, or a shoe sales video in which shoes of the same or similar style are sold.
And step 406d, recommending the publisher information of the third video to the target client as the information of the second user account.
Refer to the embodiment of step 406a; details are not repeated herein.
In summary, in the method provided by this embodiment, the user interest area of the user when watching the video is acquired and used to determine which part of the video content the user is interested in. By finely dividing the interest elements, the video content the user is interested in can be accurately determined, so that the second user account associated with that content is recommended to the target client (user). This simplifies the steps a user takes to search for related videos of interest, provides the user with multiple search modes, and improves human-computer interaction efficiency.
By finely dividing the interest elements, the server can further determine videos in which the user is interested according to different interest sub-elements, and the determination result of the interest videos is more accurate, so that the server can accurately recommend information of a second user account to the client according to the interest videos.
By acquiring the second video corresponding to the type of the interest element, the information of the second user account for publishing the video is determined, so that the server can accurately recommend the video which is interested by the user to the client, and the man-machine interaction efficiency is improved.
And when the interest sub-elements comprise decoration elements for decorating body parts, the video matched with the decoration elements is used as the video interested by the user, so that the user account for publishing the video is recommended to the target client as a second user account. The video types recommended to the user by the server are richer and more diverse, and are not limited to a single type.
And calculating a first interest value and a second interest value of the sight of the user in the area corresponding to each interest sub-element through the eyeball tracking data, so that the server can accurately recommend information of a second user account to the target client according to the interest values.
The interest sub-elements of the first video in which the user is interested are judged comprehensively through multiple judgment conditions, so that the judgment result of the server is more accurate and the server can accurately recommend the information of the second user account to the target client.
In an alternative embodiment based on fig. 4, the server may further extract the interest sub-element from the user interest area through the screen operation data, and the method further includes the following steps, as shown in fig. 6:
in some embodiments, the screen operation data includes zoom-in operation data.
And step 410a, acquiring the amplification scale corresponding to the amplification operation data.
The terminal receives the amplification operation, acquires amplification operation data, and sends the amplification operation data to the server, and the server acquires a corresponding amplification ratio according to the amplification operation data, wherein the amplification ratio is 1:2, for example.
And step 430a, in response to the amplification scale being larger than the scale threshold, determining the element corresponding to the amplification operation data as the interest sub-element.
Illustratively, the scale threshold is 60%; when the magnification ratio of an element in the first video exceeds 60%, the element is determined as the interest sub-element.
As shown in the left diagram of fig. 7, the user watches the video 11 published by the first user account, the user applies a two-finger zoom-in operation on the terminal, the display screen of the terminal displays a picture as shown in the right diagram of fig. 7, in which the area 103 is displayed in a zoomed-in manner, and the zoom-in scale exceeds the scale threshold, then the server determines the head element in the area 103 as the interest sub-element.
And step 410b, acquiring the duration of the amplification area in the amplification state, wherein the amplification area corresponds to the amplification operation data.
And step 430b, in response to the duration being greater than the time threshold, determining the element corresponding to the amplification operation data as the interest sub-element.
Illustratively, the time threshold is 10 seconds; when an element in the first video remains in the magnified state for more than 10 seconds, the element is determined as the interest sub-element.
And step 410c, acquiring the amplification area corresponding to the amplification operation data.
And step 430c, in response to the amplification area being larger than the area threshold, determining the element corresponding to the amplification operation data as the interest sub-element.
Illustratively, taking a smart phone as the terminal: the display screen of the smart phone is 6.0 inches and the area threshold is 4.0 inches; when the video area of the enlarged element of the first video is larger than 4.0 inches, the element is determined as the interest sub-element.
In some embodiments, in response to the terminal receiving a zoom-in operation on the first video followed by a zoom-out operation on the first video, the duration for which the enlarged portion of the element is in the enlarged state is obtained; in response to that duration being less than another time threshold, the enlarged element is determined to be an element of no interest, the enlarged portion being a sub-element of the element. For example, if the other time threshold is 5 seconds and the head element of a character element in the first video stays in the enlarged state for less than 5 seconds after being enlarged, the server determines the head element to be an element that the user is not interested in.
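The zoom-gesture heuristics of steps 410a to 430c, together with the quick-zoom-out dismissal above, can be sketched as follows. This is a minimal illustration, not the patent's actual implementation; all function names are hypothetical, and the threshold values are taken from the illustrative examples in the text.

```python
# Hypothetical sketch of the zoom-gesture interest heuristics; names and
# values are illustrative only.
SCALE_THRESHOLD = 0.60     # magnification ratio threshold ("60%")
TIME_THRESHOLD = 10.0      # seconds an element must stay enlarged
DISMISS_THRESHOLD = 5.0    # quickly zoomed out again -> not interested
AREA_THRESHOLD = 4.0       # inches the enlarged video area must exceed

def classify_zoomed_element(scale, enlarged_seconds, enlarged_area_inches):
    """Return 'interest', 'not_interest', or 'undecided' for one zoom gesture."""
    if enlarged_seconds < DISMISS_THRESHOLD:
        # enlarged then shrunk almost immediately: element of no interest
        return "not_interest"
    if (scale > SCALE_THRESHOLD
            or enlarged_seconds > TIME_THRESHOLD
            or enlarged_area_inches > AREA_THRESHOLD):
        # any single condition marks the element as an interest sub-element;
        # the conditions may also be combined for a stricter judgment
        return "interest"
    return "undecided"
```

As the embodiment notes, the three conditions may be applied independently or in combination.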
It will be appreciated that the above-described manner of determining the interest sub-element through the screen operation data can be equally applied to the manner of determining the interest element by the server.
It is understood that the above-mentioned methods for extracting interest elements or extracting interest sub-elements according to the eye tracking data and the screen operation data may be implemented independently or in combination.
In summary, the method provided by this embodiment acquires screen operation data from the user's gesture operations while watching a video and uses it to determine which part of the video content interests the user. By finely dividing the interest elements, the video content the user is interested in can be acquired accurately, so the second user account associated with that content is recommended to the target client (user), simplifying the user's steps in searching for related videos of interest and improving human-computer interaction efficiency.
In an alternative embodiment based on fig. 4, the server may further identify the object element in the video, where the identification of the object element includes the following steps, as shown in fig. 8:
Step 801, in response to the interest element belonging to the object element, dividing the interest element to obtain a mark element and a structural element of the object element.
Similar to the dividing mode of the character elements, schematically, in response to the interest elements belonging to the object elements, the server calls the interest analysis model to further divide the interest elements according to the gazing point data and the eye movement behavior data to obtain the mark elements and the structural elements of the object elements.
As shown in the left diagram of fig. 9, the interest analysis model determines, according to the eyeball tracking data, that the region in which the user is interested is a region 201, elements included in the region 201 are interest elements, the interest elements belong to object interest elements (automobiles), and the interest analysis model further divides the object elements in which the user is interested to obtain a region 202 corresponding to the mark element and a region 203 corresponding to the structural element, as shown in the right diagram of fig. 9.
In step 802, an interest sub-element is determined from the flag element and the structural element.
Illustratively, the gaze point data comprises gaze point locations and the eye movement behavior data comprises gaze behaviors. Step 802 can be replaced by step 8021 and step 8022 as follows:
step 8021, a third interest value corresponding to the gaze point data and a fourth interest value corresponding to the eye movement behavior data are obtained.
The server acquires a third interest value corresponding to the fixation point data and a fourth interest value corresponding to the eye movement behavior data through the following steps:
step 820, acquiring the area ratio of a second area where the gazing point position is located in the video area of the first video, and determining a third interest value of the second area according to the area ratio; or, the watching time length of the focus point position in the second area is obtained, and a third interest value of the second area is determined according to the watching time length.
Illustratively, the server stores a fifth corresponding relationship between the gazing duration and the third interest value in advance, and determines the third interest value according to the fifth corresponding relationship and the gazing duration. As shown in fig. 9, if the gazing point position in the area 202 has a gazing duration of 10 seconds, the server determines that the third interest value of the second area is 20 according to the fifth correspondence and the gazing duration. Illustratively, the gazing time duration and the third interest value have a positive correlation, that is, the longer the gazing time is, the larger the corresponding third interest value is, the more the user is interested in the elements in the area.
In some embodiments, the server stores a fourth corresponding relationship between the area ratio and the third interest value in advance, and determines the third interest value according to the fourth corresponding relationship and the area ratio. As shown in fig. 9, the second region where the gazing point position is located is a region 202, and the area ratio of the region 202 in the first video region is 20%, the server determines that the third interest value of the second region is 20 according to the fourth correspondence and the area ratio. Illustratively, the area ratio and the third interest value have a positive correlation, that is, the larger the area ratio is, the larger the corresponding third interest value is, the more the user is interested in the elements in the area.
Step 840, in response to the eye movement behavior belonging to the gazing behavior and the gazing behavior occurring in the second region, determining a fourth interest value of the second region.
Illustratively, the server stores in advance a sixth correspondence between the eye movement behavior and a fourth interest value, the fourth interest value corresponding to the gazing behavior being higher than the fourth interest value corresponding to the saccadic behavior.
As shown in fig. 9, the server determines that the eye movement behavior of the user is the gazing behavior according to the eyeball tracking data, the region gazed by the user is the region 202, and the server determines that the fourth interest value of the second region is 60.
Step 8022, determining interest sub-elements from the mark elements and the structure elements according to the third interest value and the fourth interest value.
Illustratively, the interest sub-element is determined by combining the weights of the third interest value and the fourth interest value. The server calculates the interest value corresponding to each element according to the above steps, and determines the element with the highest interest value as the interest sub-element. For example, the third interest value is weighted 0.3 and the fourth interest value is weighted 0.7. The server calculates the interest values of the area 202 and the area 203 according to the above steps; Table 2 lists the third interest value, the fourth interest value, and the comprehensive interest value of each area.
Table 2
(Table 2 is presented as an image in the original publication. Per the example above, the area 202 has a third interest value of 20 and a fourth interest value of 60, giving a comprehensive interest value of 0.3×20 + 0.7×60 = 48.)
As can be seen from Table 2, the interest value of the area 202 is the highest; the server determines the area 202 as the area in which the user is interested. The area 202 contains a mark element, so the server determines the mark element as the interest sub-element.
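The weighted combination in step 8022 can be sketched as below. This is an illustrative reading of the embodiment, not its actual implementation: the weights (0.3 / 0.7) and the values for the area 202 (20 and 60) come from the example above, while the values for the other region are placeholder numbers, and all names are hypothetical.

```python
# Sketch of combining the third and fourth interest values (step 8022);
# weights and region-202 values are from the text, other values are
# illustrative placeholders.
W_GAZE_POINT = 0.3   # weight of the third interest value (gaze-point data)
W_EYE_MOVE = 0.7     # weight of the fourth interest value (eye-movement data)

def combined_interest(third_value, fourth_value):
    return W_GAZE_POINT * third_value + W_EYE_MOVE * fourth_value

def pick_interest_sub_element(regions):
    """regions: {element_name: (third_value, fourth_value)} -> highest scorer."""
    return max(regions, key=lambda name: combined_interest(*regions[name]))

regions = {
    "mark_element": (20, 60),       # area 202, per the example: 0.3*20 + 0.7*60 = 48
    "structural_element": (10, 30), # area 203, placeholder values
}
```

With these numbers, `pick_interest_sub_element(regions)` selects the mark element, matching the Table 2 conclusion.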
The server acquires a second video matched with the interest sub-element in the following two cases:
and step 803a, in response to the interest sub-element belonging to the mark element, acquiring a second video matched with the mark element.
The second video matched with the mark element includes: a video containing the same element as the mark element, a video containing an element similar to the mark element, and a video containing a mark element of the same type as the mark element.
Illustratively, the server calls the object recognition model to recognize the mark element, recognizes a video containing the same element as the mark element, and acquires the video containing the brand a if the mark element is the brand a.
Illustratively, the server calls the object recognition model to recognize the mark elements, recognizes the video containing the elements similar to the mark elements, and if the outline of the mark elements is an equilateral triangle, the server obtains the video containing the mark elements of any triangle.
Illustratively, the server calls the object recognition model to recognize the mark elements, recognizes the video containing the mark elements belonging to the same type as the mark elements, for example, the outline of the mark element is a triangle, and the server obtains the video containing the mark elements of a pentagon.
It is understood that the above manners of obtaining the second video matched with the mark element may be implemented individually, in a combination of two, or all together.
Step 804a, recommending the publisher information of the second video to the target client as the information of the second user account.
And the server recommends the acquired publisher information of the second video to the target client. In some embodiments, the server determines that the second video matched with the flag element contains the object a, and the video is a video issued by the user c, and then the server recommends the user account corresponding to the user c to the target client. As shown in the right diagram of fig. 9, information 26 about the second user account of the automobile is displayed on the video screen.
And step 803b, responding to the interest sub-element belonging to the structural element, and acquiring a second video matched with the structural element.
Similar to the process of identifying the mark element by the server, the server acquires the second video according to the information corresponding to the structural element. For example, the server searches for a video containing a structural element that is the same as or similar to the structural element based on the color, texture, structure, etc. of the structural element.
And step 804b, recommending the publisher information of the second video to the target client as the information of the second user account.
For details, refer to the embodiment of step 804a; it is not repeated here.
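The feature-based matching of a structural element against candidate videos (by color, texture, and structure, as in step 803b) can be sketched with a simple similarity comparison. Feature extraction is abstracted away here; the vectors, names, and threshold are hypothetical, and the embodiment does not specify a particular similarity measure.

```python
# Hedged sketch: match a structural element's feature vector against
# candidate videos by cosine similarity. All names are hypothetical.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_videos(query_features, candidates, threshold=0.9):
    """candidates: {video_id: feature_vector}; return ids similar enough
    to the query structural element's features."""
    return [vid for vid, feats in candidates.items()
            if cosine_similarity(query_features, feats) >= threshold]
```

In practice the features would come from a trained object-recognition model rather than hand-built vectors.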
It is understood that, when the interest elements are identified and divided in the above embodiments, a calibrated region is not displayed on the video picture (i.e. a dotted line is not displayed), and the illustration is only schematic in the drawings.
It is to be understood that, when the interest element belongs to the object element, the above-described method of determining the interest sub-element through the eyeball tracking data may be replaced with a method of determining it through the screen operation data. For example, when the user enlarges a logo element of the vehicle and the enlargement ratio of the logo element exceeds the scale threshold, the logo element is an interest sub-element, indicating that the user is interested in the brand of the vehicle in the video. For a specific embodiment, refer to steps 410a to 430c shown in fig. 6, which are not repeated here.
In summary, according to the method provided by this embodiment, the second video corresponding to the type to which the interest element belongs is acquired, so that the information of the second user account publishing the video is determined, the server can accurately recommend the video in which the user is interested to the client, and the human-computer interaction efficiency is improved.
And calculating a third interest value and a fourth interest value of the sight of the user in the area corresponding to each interest sub-element through the eyeball tracking data, so that the server can accurately recommend information of the second user account to the target client according to the interest values.
The interest sub-elements can be determined through the screen operation data, so that the mode of determining the interest sub-elements by the server is more diversified.
The interest sub-elements of the first video in which the user is interested are judged comprehensively through multiple judgment conditions, so that the judgment result of the server is more accurate and the server can accurately recommend the information of the second user account to the target client.
In some embodiments, the server also recommends videos published by the second user account to the target client: the server sorts the videos published by the second user account by video popularity; the server extracts the videos ranked in the top N by popularity, where N is a positive integer; and the server recommends the top N videos to the target client.
Illustratively, N is 2, and as shown in the right diagram of fig. 5, the right diagram of fig. 7, and the right diagram of fig. 9, the video with the top 2 of the video popularity ranking in the video published by the second user account is displayed.
Illustratively, the number of likes is taken as the standard for measuring video popularity. A like is given by triggering the heart-shaped control on the video screen, such as the one in the lower right corner of the user interface in fig. 5, 7 and 9. As can be seen from the numbers below the control, the video shown in fig. 9 is more popular than that shown in fig. 5.
In some embodiments, the video popularity may also be measured by the number of forwarding, the number of comments, the playing amount, and the like, which is not limited in this application.
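The top-N recommendation step above can be sketched as follows. The data shape and names are hypothetical; likes are used as the popularity measure here, but as the embodiment notes, forwards, comments, or play count would work equally.

```python
# Sketch of sorting the second user account's videos by popularity and
# keeping the top N. Data shape and values are illustrative only.
def top_n_videos(videos, n=2, key="likes"):
    """videos: list of dicts like {"id": ..., "likes": ...} -> top-n list."""
    return sorted(videos, key=lambda v: v[key], reverse=True)[:n]

published = [
    {"id": "v1", "likes": 120},
    {"id": "v2", "likes": 980},
    {"id": "v3", "likes": 450},
]
```

With N = 2, as in the example, the two most-liked videos are displayed to the target client.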
Fig. 10 shows an information recommendation method provided in another exemplary embodiment of the present application, which is described by taking the method as an example for the terminal 210 in the computer system 200 shown in fig. 2, and the method includes the following steps:
step 1001, displaying a video frame when a first video is played, where the first video is a video published by a first user account.
Illustratively, the terminal used by the user displays a video picture of the first video, as shown in the left diagrams of fig. 5 and fig. 9. Illustratively, when the first video includes at least two character elements, the video may or may not include the user who published it: for example, the dancer on the left side in fig. 5 is the first user corresponding to the first user account, which published the video; alternatively, the dancers in the video do not include the first user, and the video was still published by the first user account.
Step 1002, at least one of eye tracking data and screen operation data is collected.
When the terminal detects that the video application program is opened and the video is in a playing state, the terminal starts the camera to collect eyeball tracking data. The terminal uploads the eyeball tracking data to a background server of the video application program, and the background server determines interest elements which are interested by the user according to the eyeball tracking data and further determines interest sub-elements, as shown in the right diagram of fig. 5 and the right diagram of fig. 9.
When it is detected that the video application program is opened and a video is in the playing state, the display screen receives operations and the terminal collects screen operation data accordingly. The terminal uploads the screen operation data to the background server of the video application program, and the background server determines the interest element in which the user is interested according to the screen operation data and further determines the interest sub-element, as shown in the right diagram of fig. 7.
Step 1003, displaying information of a second user account on the video picture, wherein the second user account is associated with an interest sub-element in the interest elements, the interest elements are in a user interest area in the video picture, and the user interest area is obtained through at least one of eyeball tracking data and screen operation data.
After the server determines the video related to the interest sub-element, the server recommends the user account that published the video to the client, and prompt information is displayed on the client, as shown in the right diagrams of fig. 5, fig. 7, and fig. 9, where the prompt information is displayed in the upper right corner of the video picture. It can be understood that the prompt information may also be displayed in the upper left corner of the video picture, or at any other position on it, which is not limited in the embodiments of the present application.
In summary, in the method of the embodiment, the information of the second user account recommended by the server is displayed on the video picture, so that the user can more intuitively see the information of the second user account recommended by the server, the search process of the user is simplified, and the human-computer interaction efficiency is improved.
In an alternative embodiment based on fig. 10, after displaying the information of the second user account on the video screen, the method further includes:
step 1004a, in response to receiving the triggering operation on the information of the second user account, displaying a home page interface of the second user account, where the home page interface includes a second video issued by the second user account.
Illustratively, when the terminal is a terminal including a touch display screen, such as a smart phone or a tablet computer, the trigger operation includes a single-click operation, a double-click operation, a long-press operation, a sliding operation, a dragging operation, a hovering operation, and a combination thereof; when the terminal is a terminal connected to an external input device, such as a desktop computer or a notebook computer, the triggering operation is an operation performed through the external input device (such as a keyboard, a mouse, etc.).
In one example, the terminal used by the user is a smart phone, and the user clicks the prompt message 16 shown in fig. 5 to display the home page interface 901 of the second user account shown in fig. 11. The home page interface 901 includes all videos published by the second user account and the account's personal information (such as sex, nickname, number of fans, number of accounts followed, and number of video likes received). In some embodiments, the home page interface 901 further includes the number of videos the second user has liked using the second user account. In some embodiments, when the home page interface 901 is displayed, the user may autonomously choose to view the videos published by the second user account.
Or the like, or, alternatively,
and 1004b, responding to the received trigger operation on the information of the second user account, and displaying a video picture of a second video issued by the second user account, wherein the second video is the video with the highest video heat degree in the videos issued by the second user account.
Illustratively, the trigger operation is the same as that described in step 1004a.
In an example, the terminal used by the user is a smart phone. The user clicks the prompt message 16 shown in fig. 5, and the video picture shown in fig. 12 is displayed, showing a video of a dancer 902 dancing. The video was published by the second user account; the second user corresponding to that account may or may not be the dancer 902. The video is the one with the highest popularity among the videos published by the second user account.
In summary, in the method of this embodiment, by operating the information of the second user account by the user, the user can directly view a homepage interface (personal data information of the second user account) of the second user account on the video application program, or view a video with the highest video popularity published by the second user account, and recommend the video content of interest to the user in an intuitive manner, so that the human-computer interaction efficiency is improved.
Fig. 13 shows an information recommendation method provided in another exemplary embodiment of the present application, which is described as an example of the method used in the computer system 200 shown in fig. 2, and the method includes the following steps:
step 1301, start.
In step 1302, the user opens a video application to view the video.
Illustratively, the video application is a short video application.
And step 1303a, the terminal starts a camera, and calls an eye movement tracking system to track the eye movement behavior and the fixation point of the user watching a single video.
When a user watches a video, the sight line changes, the eye movement tracking system records the sight line changes of the user and converts the sight line changes into eye movement tracking data, and the eye movement tracking data comprises eye movement behavior data and gazing point data.
And step 1303b, receiving screen operation data by a display screen of the terminal.
Illustratively, a user amplifies elements in a first video through a two-finger amplification operation, a display screen of a terminal receives the two-finger amplification operation, an operation acquisition system acquires operation data corresponding to the two-finger amplification operation, and the operation data is sent to a background server of the short video application program.
Step 1304, the server calls an interest analysis system to judge interest sub-elements of the user in the video according to the eye movement behavior and the fixation point of the user; or judging the interest sub-element according to the screen operation data.
The terminal uploads the eye tracking data to a background server of the short video application program, and the server calls an interest analysis system to analyze the eye tracking data to obtain a user interest area when the user watches the short video. The interest analysis system can further analyze the eye tracking data to obtain interest sub-elements in the interest elements. The interest analysis system transmits the interest sub-element to the recognition system.
The terminal uploads the screen operation data to a background server of the short video application program, and the server calls an interest analysis system to analyze the screen operation data to obtain a user interest area when the user watches the video. The interest analysis system may further analyze the screen operation data, and illustratively, the server obtains an enlargement ratio corresponding to the enlargement operation data, and determines an element corresponding to the enlargement operation data as an interest sub-element in response to the enlargement ratio being greater than a ratio threshold. The interest analysis system transmits the interest sub-element to the recognition system.
Illustratively, similar to the way the server determines the interest sub-element in the video, the server determines the interest element in the video from the eye tracking data as follows:
the server acquires a fifth interest value corresponding to the gazing point data and a sixth interest value corresponding to the eye movement behavior data; the server determines the interest element in the first video according to the fifth interest value and the sixth interest value.
Illustratively, the gaze point data comprises gaze point locations and the eye movement behavior data comprises gaze behaviors. The server acquiring the fifth interest value and the sixth interest value comprises the following modes:
the server acquires the area proportion of a third area where the watching point position is located in a video area of the first video, and determines a fifth interest value of the third area according to the area proportion; or the server acquires the watching time length of the watching point position in the third area, and determines a fifth interest value of the third area according to the watching time length; in response to the eye movement behavior belonging to the gaze behavior and the gaze behavior resulting in the third region, the server determines a sixth interest value for the third region.
Step 1305, the server calls an identification system to identify the person element/object element corresponding to the interest sub-element.
And the server calls an identification system to identify the interest sub-elements. Schematically, the server calls an object identification system to identify the object element corresponding to the interest sub-element; and the server calls a figure recognition system to recognize the human body elements corresponding to the interest sub-elements.
In step 1306, the server records the interest sub-element and marks the portrait information/object information corresponding to the interest sub-element.
The recognition system transmits the recognition data to the interest sub-element tagging unit. The interest sub-element marking unit marks the interest sub-elements according to the types of the interest sub-elements.
In step 1307, the server determines whether the interest sub-element is associated with the video publisher.
Illustratively, when the interest sub-element belongs to the character element, the server determines whether the character element is a character element corresponding to the video publisher, that is, determines whether the character in the video that is interested by the user is the video publisher. If the person interested by the user is the video publisher, go to step 1313; if the person of interest to the user is not the video publisher, then step 1308 is entered.
In step 1308, the server determines whether there is video content associated with the interest sub-element.
When the person of interest to the user is not the video publisher, the server continues to determine whether there is video content associated with the sub-element of interest in the video. If there is video content associated with the interest sub-element in the video, go to step 1309; if there is no video content associated with the sub-element of interest in the video, then step 1313 is entered.
Step 1309, the server calls out the information of the user account.
And when the video content associated with the interest sub-element exists in the video, calling out the information of the user account by the server, and sending the video content to the terminal.
In step 1310, the terminal displays the prompt message of the user account.
In step 1311, the user clicks on the home page of the user account.
And the user clicks the prompt message of the user account displayed on the terminal, opens the homepage of the user account, and displays the message of the user account and the video published by the user account on the homepage interface.
At step 1312, the user continues to watch the video.
Illustratively, as the user continues watching, the server recommends an associated video to the user, wherein the associated video is recommended based on the detected interest element or interest sub-element in which the user is interested.
In step 1313, the terminal does not display the content.
When the server determines that the interest sub-element corresponds to the video publisher, the retrieved information of the user account is not sent to the terminal, and the terminal does not display related content.
When the server determines that no video content associated with the interest sub-element exists, the retrieved information of the user account is likewise not sent to the terminal, and the terminal does not display related content.
And step 1314, ending.
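The decision flow of steps 1307 to 1313 can be sketched as follows. This is a minimal illustration: the function names, the dictionary shape of the sub-element, and the callback signatures are assumptions, not taken from the patent.

```python
def recommend_for_sub_element(sub_element, publisher_id,
                              find_associated_content, lookup_account_info):
    """Sketch of steps 1307-1313: recommend account information only when
    the person of interest is not the publisher and associated content exists."""
    # Step 1307: is the person of interest the video publisher?
    if sub_element.get("kind") == "person" and sub_element.get("person_id") == publisher_id:
        return None  # step 1313: the terminal displays nothing
    # Step 1308: is there video content associated with the sub-element?
    content = find_associated_content(sub_element)
    if content is None:
        return None  # step 1313
    # Step 1309: retrieve the account information to send to the terminal
    return lookup_account_info(content)
```

When the return value is `None`, the terminal displays no related content; otherwise the returned account information is displayed as the prompt message of step 1310.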
In summary, in the method of this embodiment, the interest elements that the user is interested in within the video are determined through eye tracking, and the related user account and content are recommended directly, which helps the user find the content of greatest interest more quickly and improves human-computer interaction efficiency.
It is understood that steps 1304a to 1309 can be executed repeatedly while the video is being watched, where steps 1304a and 1304b can be executed separately, in combination, or as alternatives.
In some implementations, videos that the user is not interested in can be filtered out using the information recommendation method provided in the above embodiments. For example, when the user watches the video shown in fig. 5, the server determines from the eye tracking data that the user is not interested in the dancer on the right of the video, and no longer recommends videos related to that dancer to the user.
Fig. 14 is a block diagram of an information recommendation apparatus according to an exemplary embodiment of the present application, where the apparatus includes:
the obtaining module 1410 is configured to obtain a user interest area of a target client when a video picture of a first video is played, where the first video is a video published by a first user account;
an element extraction module 1420, configured to extract an interest element from the user interest area, where the interest element includes at least one of a person element and an object element;
the element extraction module 1420 is configured to extract an interest sub-element from the interest element;
and the information recommending module 1430, configured to recommend the information of the second user account to the target client in response to the association between the interest sub-element and the second user account.
In an optional embodiment, the element extraction module 1420 is configured to, in response to the interest element belonging to a character element, divide the interest element into a head element, an upper body element, and a lower body element of the character element, and determine the interest sub-element from among the head element, the upper body element, and the lower body element.
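A minimal sketch of this division, assuming the character element is represented by a bounding box `(x, y, w, h)`; the 1:3:3 height split is an illustrative assumption, not a ratio stated in the patent.

```python
def split_person_element(x, y, w, h):
    """Divide a person bounding box into head, upper-body and lower-body
    sub-regions (the split ratio is an assumption for illustration)."""
    head_h = h // 7
    upper_h = 3 * h // 7
    return {
        "head": (x, y, w, head_h),
        "upper_body": (x, y + head_h, w, upper_h),
        "lower_body": (x, y + head_h + upper_h, w, h - head_h - upper_h),
    }
```

Each sub-region can then be scored independently so that the interest sub-element is the sub-region with the highest interest value.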
In an alternative embodiment, the obtaining module 1410 is configured to, in response to the interest sub-element belonging to the head element, obtain a second video matching the head element, and the information recommending module 1430 is configured to recommend the publisher information of the second video to the target client as the information of the second user account; or, the obtaining module 1410 is configured to, in response to the interest sub-element belonging to the upper body element, obtain a second video matching the upper body element, and the information recommending module 1430 is configured to recommend the publisher information of the second video to the target client as the information of the second user account; or, the obtaining module 1410 is configured to, in response to the interest sub-element belonging to the lower body element, obtain a second video matching the lower body element, and the information recommending module 1430 is configured to recommend the publisher information of the second video to the target client as the information of the second user account.
In an optional embodiment, the obtaining module 1410 is configured to, in response to a decoration element corresponding to a character element included in the interest sub-elements, obtain a third video matching the decoration element; the information recommending module 1430 is configured to recommend the publisher information of the third video to the target client as the information of the second user account.
In an alternative embodiment, the obtaining module 1410 is configured to obtain eye tracking data and acquire a first user interest area from the video picture according to the eye tracking data; or obtain screen operation data and acquire a second user interest area from the video picture according to the screen operation data.
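Mapping a gaze point or a touch point to the on-screen region that contains it could look like the following sketch; the rectangle representation and region names are assumptions for illustration.

```python
def region_containing(point, regions):
    """Return the name of the first region whose rectangle (x, y, w, h)
    contains the given (px, py) point, or None if no region matches."""
    px, py = point
    for name, (x, y, w, h) in regions.items():
        if x <= px < x + w and y <= py < y + h:
            return name
    return None
```

The same lookup serves both signals: the point may come from eye tracking (yielding the first user interest area) or from a tap or zoom gesture in the screen operation data (yielding the second user interest area).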
In an alternative embodiment, the eye tracking data comprises gaze point data and eye movement behavior data;
the obtaining module 1410 is configured to obtain a first interest value corresponding to the gaze point data and a second interest value corresponding to the eye movement behavior data; the element extraction module 1420 is configured to determine interest sub-elements from the head element, the upper body element, and the lower body element according to the first interest value and the second interest value.
In an alternative embodiment, the gaze point data comprises a gaze point position and the eye movement behavior data comprises a gaze behavior;
the obtaining module 1410 is configured to obtain the area ratio of a first region, where the gaze point position is located, within the video area of the first video, and determine a first interest value of the first region according to the area ratio; or obtain the gaze duration of the gaze point position in the first region, and determine the first interest value of the first region according to the gaze duration; and, in response to the eye movement behavior belonging to the gaze behavior and the gaze behavior occurring in the first region, determine a second interest value of the first region.
In an optional embodiment, the element extracting module 1420 is configured to, in response to the interest element belonging to an object element, divide the interest element to obtain a mark element and a structure element of the object element, and determine the interest sub-element from the mark element and the structure element.
In an alternative embodiment, the eye tracking data comprises gaze point data and eye movement behavior data;
the obtaining module 1410 is configured to obtain a third interest value corresponding to the gaze point data and a fourth interest value corresponding to the eye movement behavior data; the element extracting module 1420 is configured to determine an interest sub-element from the mark element and the structural element according to the third interest value and the fourth interest value.
In an alternative embodiment, the gaze point data comprises a gaze point position and the eye movement behavior data comprises a gaze behavior;
the obtaining module 1410 is configured to obtain the area ratio of a second region, where the gaze point position is located, within the video area of the first video, and determine a third interest value of the second region according to the area ratio; or obtain the gaze duration of the gaze point position in the second region, and determine the third interest value of the second region according to the gaze duration; and, in response to the eye movement behavior belonging to the gaze behavior and the gaze behavior occurring in the second region, determine a fourth interest value of the second region.
In an alternative embodiment, the eye tracking data comprises gaze point data and eye movement behavior data;
the obtaining module 1410 is configured to obtain a fifth interest value corresponding to the gazing point data and a sixth interest value corresponding to the eye movement behavior data; the element extracting module 1420 is configured to determine an interest element in the first video according to the fifth interest value and the sixth interest value.
In an alternative embodiment, the gaze point data comprises a gaze point position and the eye movement behavior data comprises a gaze behavior;
the obtaining module 1410 is configured to obtain the area ratio of a third region, where the gaze point position is located, within the video area of the first video, and determine a fifth interest value of the third region according to the area ratio; or obtain the gaze duration of the gaze point position in the third region, and determine the fifth interest value of the third region according to the gaze duration; and, in response to the eye movement behavior belonging to the gaze behavior and the gaze behavior occurring in the third region, determine a sixth interest value of the third region.
In an optional embodiment, the obtaining module 1410 is configured to obtain an amplification scale corresponding to the amplification operation data; the obtaining module 1410 is configured to determine, in response to the amplification ratio being greater than the ratio threshold, an element corresponding to the amplification operation data as the interest sub-element.
In an optional embodiment, the obtaining module 1410 is configured to obtain a duration of an amplification region in an amplification state, where the amplification region corresponds to the amplification operation data; the element extracting module 1420 is configured to determine, in response to the duration being greater than the time threshold, an element corresponding to the amplification operation data as the interest sub-element.
In an optional embodiment, the obtaining module 1410 is configured to obtain an enlarged area corresponding to the enlarged operation data; the element extracting module 1420 is configured to determine, in response to the enlargement area being greater than the area threshold, an element corresponding to the enlargement operation data as an interest sub-element.
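The three zoom-based tests of the preceding embodiments (zoom scale vs. a ratio threshold, time spent zoomed in vs. a time threshold, zoomed area vs. an area threshold) could be combined as below; all threshold values are assumptions for illustration.

```python
def is_interest_by_zoom(scale=None, seconds_zoomed=None, zoomed_area=None,
                        scale_threshold=2.0, time_threshold=3.0,
                        area_threshold=100_000):
    """Mark the zoomed element as an interest sub-element when any one of
    the three conditions exceeds its (assumed) threshold."""
    if scale is not None and scale > scale_threshold:
        return True
    if seconds_zoomed is not None and seconds_zoomed > time_threshold:
        return True
    if zoomed_area is not None and zoomed_area > area_threshold:
        return True
    return False
```

Passing only the signals that were actually observed keeps the three embodiments usable separately or together.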
In an optional embodiment, the information recommending module 1430 is configured to sort videos issued by the second user account according to the video popularity; extracting videos with the video popularity ranking of the top N from the videos according to the sequence, wherein N is a positive integer; and recommending the video with the top N to the target client.
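The popularity ranking can be sketched in a few lines; the `popularity` field name is an assumption about how the second account's videos are represented.

```python
def top_n_by_popularity(videos, n):
    """Sort the second user account's videos by popularity (descending)
    and keep the top N for recommendation."""
    return sorted(videos, key=lambda v: v["popularity"], reverse=True)[:n]
```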
In summary, the apparatus provided in this embodiment obtains the user's eye tracking data while the user watches a video, uses the eye tracking data to determine which part of the video content the user is interested in, and can accurately identify the video content of interest by finely dividing the interest elements, so that the second user account associated with that content is recommended to the target client (user). This simplifies the steps the user takes to search for related videos of interest and improves human-computer interaction efficiency.
By finely dividing the interest elements, the server can further determine videos in which the user is interested according to different interest sub-elements, and the determination result of the interest videos is more accurate, so that the server can accurately recommend information of a second user account to the client according to the interest videos.
By acquiring the second video corresponding to the type of the interest element, the information of the second user account for publishing the video is determined, so that the server can accurately recommend the video which is interested by the user to the client, and the man-machine interaction efficiency is improved.
When the interest sub-elements include a decoration element that decorates a body part, the video matching the decoration element is treated as a video the user is interested in, so the user account that published the video is recommended to the target client as the second user account. The video types recommended to the user by the server are thus richer and more diverse, rather than being limited to a single type.
By calculating, from the eye tracking data, a first interest value and a second interest value for the user's gaze in the region corresponding to each interest sub-element, the server can accurately recommend the information of the second user account to the target client according to the interest values.
The interesting sub-elements of the first video, which are interesting to the user, are comprehensively judged through multiple judgment conditions, so that the judgment result of the server is more accurate, and the server can accurately recommend the information of the second user account to the target client.
By calculating, from the eye tracking data, a third interest value and a fourth interest value for the user's gaze in the region corresponding to each interest sub-element, the server can accurately recommend the information of the second user account to the target client according to the interest values.
Fig. 15 is a block diagram of an information recommendation apparatus according to another exemplary embodiment of the present application, where the apparatus includes:
a display module 1510, configured to display a video frame when a first video is played, where the first video is a video published by a first user account;
an acquisition module 1520, configured to acquire at least one of eye tracking data and screen operation data;
the display module 1510 is configured to display information of a second user account on the video screen, where the second user account is associated with an interest sub-element in the interest element, and the interest element is in a user interest area in the video screen, where the user interest area is obtained through at least one of eyeball tracking data and screen operation data.
In an optional embodiment, the display module 1510 is configured to, in response to receiving a trigger operation on the information of the second user account, display a home interface of the second user account, where the home interface includes a second video published by the second user account; or responding to the received trigger operation on the information of the second user account, and displaying a video picture of a second video issued by the second user account, wherein the second video is the video with the highest video heat degree in the videos issued by the second user account.
In summary, the device provided in this embodiment displays the information of the second user account recommended by the server on the video screen, so that the user can more intuitively see the information of the second user account recommended by the server, the search process of the user is simplified, and the human-computer interaction efficiency is improved.
By operating the information of the second user account by the user, the user can directly check homepage data of the second user account on the video application program or check the video with the highest video popularity issued by the second user account, and recommend the video content interested by the user to the user in a visual mode, so that the man-machine interaction efficiency is improved.
It should be noted that: the information recommendation device provided in the above embodiment is only illustrated by dividing the functional modules, and in practical applications, the functions may be allocated to different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the information recommendation device and the information recommendation method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
Fig. 16 shows a schematic structural diagram of a server provided in an exemplary embodiment of the present application. The server may be server 220 in computer system 200 shown in fig. 2.
The server 1600 includes a Central Processing Unit (CPU) 1601, a system memory 1604 including a Random Access Memory (RAM) 1602 and a Read-Only Memory (ROM) 1603, and a system bus 1605 connecting the system memory 1604 and the central processing unit 1601. The server 1600 also includes a basic Input/Output (I/O) system 1606, which facilitates information transfer between devices within the computer, and a mass storage device 1607 for storing an operating system 1613, application programs 1614, and other program modules 1615.
The basic input/output system 1606 includes a display 1608 for displaying information and an input device 1609, such as a mouse or keyboard, for the user to input information. The display 1608 and the input device 1609 are both connected to the central processing unit 1601 through an input/output controller 1610 connected to the system bus 1605. The basic input/output system 1606 may also include the input/output controller 1610 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 1610 may also provide output to a display screen, a printer, or other types of output devices.
The mass storage device 1607 is connected to the central processing unit 1601 by a mass storage controller (not shown) connected to the system bus 1605. The mass storage device 1607 and its associated computer-readable media provide non-volatile storage for the server 1600. That is, the mass storage device 1607 may include a computer-readable medium (not shown) such as a hard disk or Compact disk Read Only Memory (CD-ROM) drive.
Computer-readable media may include computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid-state memory technology, CD-ROM, Digital Versatile Discs (DVD), Solid State Drives (SSD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. The random access memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). Of course, those skilled in the art will appreciate that computer storage media are not limited to the foregoing. The system memory 1604 and the mass storage device 1607 described above may be collectively referred to as memory.
According to various embodiments of the application, the server 1600 may also operate with remote computers connected to a network, such as the Internet. That is, the server 1600 may be connected to the network 1612 through the network interface unit 1611 that is coupled to the system bus 1605, or the network interface unit 1611 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
In an alternative embodiment, a computer device is provided that includes a processor and a memory having at least one instruction, at least one program, set of codes, or set of instructions stored therein, the at least one instruction, at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement the information recommendation method as described above.
In an alternative embodiment, a computer-readable storage medium is provided that has at least one instruction, at least one program, set of codes, or set of instructions stored therein, which is loaded and executed by a processor to implement the information recommendation method as described above.
Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). The above-mentioned serial numbers of the embodiments of the present application are for description only and do not represent the merits of the embodiments.
Referring to fig. 17, a block diagram of a computer device 1700 according to an exemplary embodiment of the present application is shown. The computer device 1700 may be a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), or an MP4 player (Moving Picture Experts Group Audio Layer IV). The computer device 1700 may also be referred to by other names, such as user equipment or portable terminal.
Generally, computer device 1700 includes: a processor 1701 and a memory 1702.
The processor 1701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1701 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1701 may also include a main processor and a coprocessor. The main processor is a processor for processing data in the awake state, also called a Central Processing Unit (CPU); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering the content that the display screen needs to display. In some embodiments, the processor 1701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1702 may include one or more computer-readable storage media, which may be tangible and non-transitory. The memory 1702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 1702 is used to store at least one instruction for execution by the processor 1701 to implement the information recommendation method provided in embodiments of the present application.
In some embodiments, computer device 1700 may also optionally include: a peripheral interface 1703 and at least one peripheral. Specifically, the peripheral device includes: at least one of a radio frequency circuit 1704, a touch display screen 1705, a camera 1706, an audio circuit 1707, a positioning component 1708, and a power source 1709.
The peripheral interface 1703 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1701 and the memory 1702. In some embodiments, the processor 1701, memory 1702, and peripheral interface 1703 are integrated on the same chip or circuit board; in some other embodiments, any one or both of the processor 1701, the memory 1702, and the peripheral interface 1703 may be implemented on separate chips or circuit boards, which are not limited in this embodiment.
The Radio Frequency circuit 1704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1704 communicates with a communication network and other communication devices via electromagnetic signals. The rf circuit 1704 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, etc. The radio frequency circuit 1704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 1704 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The touch display screen 1705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. The touch display screen 1705 also has the ability to capture touch signals on or above the surface of the touch display screen 1705. The touch signal may be input as a control signal to the processor 1701 for processing. The touch screen 1705 is used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the touch display screen 1705 may be one, providing the front panel of the computer device 1700; in other embodiments, the touch screen display 1705 may be at least two, each disposed on a different surface of the computer device 1700 or in a folded design; in still other embodiments, the touch display 1705 may be a flexible display, disposed on a curved surface or on a folded surface of the computer device 1700. Even more, the touch screen 1705 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The touch screen 1705 may be made of LCD (Liquid Crystal Display), OLED (organic light-Emitting Diode), or the like.
The camera assembly 1706 is used to capture images or video. Optionally, camera assembly 1706 includes a front camera and a rear camera. Generally, a front camera is used for realizing video call or self-shooting, and a rear camera is used for realizing shooting of pictures or videos. In the embodiment of the application, the front camera is used for collecting eyeball tracking data of a user when the user watches videos. In some embodiments, the number of the rear cameras is at least two, and each of the rear cameras is any one of a main camera, a depth-of-field camera and a wide-angle camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize a panoramic shooting function and a VR (Virtual Reality) shooting function. In some embodiments, camera assembly 1706 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuitry 1707 is used to provide an audio interface between a user and computer device 1700. The audio circuit 1707 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, inputting the electric signals into the processor 1701 for processing, or inputting the electric signals into the radio frequency circuit 1704 for voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided, each at a different location on the computer device 1700. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1701 or the radio frequency circuit 1704 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 1707 may also include a headphone jack.
Location component 1708 is used to locate the current geographic Location of computer device 1700 for navigation or LBS (Location Based Service). The Positioning component 1708 may be based on a GPS (Global Positioning System) in the united states, a beidou System in china, or a galileo System in russia.
Power supply 1709 is used to power the various components in computer device 1700. The power supply 1709 may be ac, dc, disposable or rechargeable. When the power supply 1709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, computer device 1700 also includes one or more sensors 1710. The one or more sensors 1710 include, but are not limited to: acceleration sensor 1711, gyro sensor 1712, pressure sensor 1713, fingerprint sensor 1714, optical sensor 1715, and proximity sensor 1716.
The acceleration sensor 1711 can detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the computer apparatus 1700. For example, the acceleration sensor 1711 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 1701 may control the touch display screen 1705 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1711. The acceleration sensor 1711 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 1712 may detect the body direction and rotation angle of the computer device 1700, and the gyro sensor 1712 may cooperate with the acceleration sensor 1711 to acquire the 3D motion of the user on the computer device 1700. Based on the data collected by the gyro sensor 1712, the processor 1701 may perform the following functions: motion sensing (such as changing the UI according to a tilting operation of the user), image stabilization during photographing, game control, and inertial navigation.
Pressure sensors 1713 may be disposed on the side bezel of computer device 1700 and/or underlying touch display screen 1705. When the pressure sensor 1713 is disposed on the side frame of the computer apparatus 1700, a user's grip signal for the computer apparatus 1700 can be detected, and left-right hand recognition or shortcut operation can be performed based on the grip signal. When the pressure sensor 1713 is disposed at the lower layer of the touch display screen 1705, the control of the operability control on the UI interface can be realized according to the pressure operation of the user on the touch display screen 1705. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 1714 is used to collect a fingerprint of the user to identify the identity of the user based on the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 1701 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying for and changing settings, etc. Fingerprint sensor 1714 may be disposed on the front, back, or side of computer device 1700. When a physical key or vendor Logo is provided on computer device 1700, fingerprint sensor 1714 may be integrated with the physical key or vendor Logo.
The optical sensor 1715 is used to collect the ambient light intensity. In one embodiment, the processor 1701 may control the display brightness of the touch display screen 1705 based on the ambient light intensity collected by the optical sensor 1715. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1705 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 1705 is decreased. In another embodiment, the processor 1701 may also dynamically adjust the shooting parameters of the camera assembly 1706 according to the ambient light intensity collected by the optical sensor 1715.
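A brightness control of this kind might look like the sketch below. The clamping range and the linear mapping are assumptions for illustration; the patent only specifies the monotonic relationship.

```python
def display_brightness(ambient_lux: float,
                       min_brightness: float = 0.2,
                       max_brightness: float = 1.0,
                       max_lux: float = 1000.0) -> float:
    """Map ambient light intensity to a display brightness level:
    stronger ambient light yields a brighter screen, weaker light a dimmer one."""
    ratio = min(max(ambient_lux / max_lux, 0.0), 1.0)  # clamp to [0, 1]
    return min_brightness + (max_brightness - min_brightness) * ratio
```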
The proximity sensor 1716, also known as a distance sensor, is typically disposed on the front panel of the computer device 1700. The proximity sensor 1716 is used to capture the distance between the user and the front of the computer device 1700. In one embodiment, when the proximity sensor 1716 detects that the distance between the user and the front of the computer device 1700 is gradually decreasing, the processor 1701 controls the touch display screen 1705 to switch from the screen-on state to the screen-off state; when the proximity sensor 1716 detects that the distance is gradually increasing, the processor 1701 controls the touch display screen 1705 to switch from the screen-off state to the screen-on state.
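The proximity-driven screen switching reduces to a small state machine; the sketch below is an illustrative reading, with state names and the distance-trend test chosen by the editor rather than taken from the patent.

```python
def next_screen_state(prev_distance: float, curr_distance: float,
                      state: str) -> str:
    """Switch between 'on' and 'off' screen states from the trend of the
    user-to-front-panel distance reported by a proximity sensor."""
    if curr_distance < prev_distance and state == "on":
        return "off"   # user approaching the screen: turn it off
    if curr_distance > prev_distance and state == "off":
        return "on"    # user moving away: light the screen again
    return state       # no relevant trend: keep the current state
```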
Those skilled in the art will appreciate that the structure shown in FIG. 17 does not constitute a limitation on the computer device 1700, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the information recommendation method described above.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The above description is intended to be exemplary only, and not to limit the present application, and any modifications, equivalents, improvements, etc. made within the spirit and scope of the present application are intended to be included therein.

Claims (15)

1. An information recommendation method, characterized in that the method comprises:
acquiring a user interest area of a target client when a video picture of a first video is played, wherein the first video is a video published by a first user account;
extracting an interest element from the user interest area, wherein the interest element comprises at least one of a character element and an object element;
extracting interest sub-elements from the interest elements;
and in response to the interest sub-element being associated with a second user account, recommending the information of the second user account to the target client.
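The flow of claim 1 can be paraphrased as the sketch below. All names and data shapes are hypothetical; the claim does not prescribe an implementation.

```python
def recommend(interest_region, account_index):
    """interest_region: iterable of (element_type, sub_elements) pairs
    extracted from the user interest area; account_index: mapping from an
    interest sub-element to the associated second user account's information."""
    recommendations = []
    for element_type, sub_elements in interest_region:
        if element_type not in ("character", "object"):
            continue  # the claim covers character and object elements only
        for sub in sub_elements:
            if sub in account_index:  # sub-element associated with an account
                recommendations.append(account_index[sub])
    return recommendations
```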
2. The method of claim 1, wherein the extracting the interest sub-element from the interest element comprises:
in response to the interest element belonging to the character element, dividing the interest element to obtain a head element, an upper body element, and a lower body element of the character element;
determining the interest sub-element from the head element, the upper body element, and the lower body element.
3. The method of claim 2, wherein recommending information of the second user account to the target client in response to the interest sub-element being associated with the second user account comprises:
in response to the interest sub-element belonging to the head element, acquiring a second video matching the head element, and recommending the publisher information of the second video to the target client as the information of the second user account;
or, in response to the interest sub-element belonging to the upper body element, acquiring a second video matching the upper body element, and recommending the publisher information of the second video to the target client as the information of the second user account;
or, in response to the interest sub-element belonging to the lower body element, acquiring a second video matching the lower body element, and recommending the publisher information of the second video to the target client as the information of the second user account.
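Claim 3's three parallel branches reduce to one lookup keyed by the body-part element. The sketch below uses hypothetical names and a dictionary in place of whatever matching service the implementation would use.

```python
def recommend_by_body_part(sub_element: str, video_index: dict):
    """video_index maps a body-part element ('head', 'upper_body',
    'lower_body') to a (second_video, publisher_info) pair. Returns the
    publisher info to recommend, or None when no matching video exists."""
    if sub_element in ("head", "upper_body", "lower_body"):
        match = video_index.get(sub_element)
        if match is not None:
            _video, publisher = match
            return publisher
    return None
```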
4. The method of claim 2, further comprising:
in response to the interest sub-element comprising a decoration element corresponding to the character element, acquiring a third video matching the decoration element;
and recommending the publisher information of the third video to the target client as the information of the second user account.
5. The method according to any one of claims 1 to 4, wherein the acquiring a user interest area of a target client when a video picture of a first video is played comprises:
acquiring eyeball tracking data; acquiring a first user interest area from the video picture according to the eyeball tracking data;
or,
acquiring screen operation data; and acquiring a second user interest area from the video picture according to the screen operation data.
6. The method of any one of claims 2 to 4, wherein the eyeball tracking data comprises gaze point data and eye movement behavior data;
said determining said interest sub-element from said head element, said upper body element and said lower body element, comprising:
acquiring a first interest value corresponding to the gaze point data and a second interest value corresponding to the eye movement behavior data;
determining the interest sub-element from the head element, the upper body element, and the lower body element according to the first interest value and the second interest value.
7. The method of claim 6, wherein the gaze point data comprises a gaze point location and the eye movement behavior data comprises gaze behavior;
the acquiring a first interest value corresponding to the gaze point data and a second interest value corresponding to the eye movement behavior data comprises:
acquiring the area proportion, in the video region of the first video, of a first region where the gaze point position is located, and determining the first interest value of the first region according to the area proportion; or, acquiring the gazing duration of the gaze point position in the first region, and determining the first interest value of the first region according to the gazing duration;
determining the second interest value of the first region in response to the eye movement behavior belonging to the gaze behavior and the gaze behavior occurring in the first region.
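Claims 6 and 7 combine two interest values; one possible reading is sketched below. The bonus value and the simple-sum combination rule are assumptions, since the claims leave both open.

```python
def first_interest_value(region_area: float, video_area: float) -> float:
    """First interest value from the area proportion of the region where
    the gaze point lies within the video region (claim 7, first branch)."""
    return region_area / video_area

def second_interest_value(is_gaze_behavior: bool, bonus: float = 1.0) -> float:
    """Second interest value, granted when the eye-movement behavior is a
    gaze behavior occurring in the region (the bonus value is assumed)."""
    return bonus if is_gaze_behavior else 0.0

def combined_interest(region_area: float, video_area: float,
                      is_gaze_behavior: bool) -> float:
    """Combine both values; a plain sum is used here for illustration."""
    return (first_interest_value(region_area, video_area)
            + second_interest_value(is_gaze_behavior))
```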
8. The method of claim 1, wherein extracting the interest sub-element from the interest element further comprises:
in response to the interest element belonging to the object element, dividing the interest element to obtain a flag element and a structural element of the object element;
determining the interest sub-element from the flag element and the structural element.
9. The method of claim 5, wherein the screen operation data comprises zoom-in operation data; the method further comprises the following steps:
acquiring a zoom-in ratio corresponding to the zoom-in operation data;
and in response to the zoom-in ratio being greater than a ratio threshold, determining the element corresponding to the zoom-in operation data as the interest sub-element.
10. The method of claim 5, wherein the screen operation data comprises zoom-in operation data; the method further comprises the following steps:
acquiring the duration for which a zoomed-in area remains in the zoomed-in state, wherein the zoomed-in area corresponds to the zoom-in operation data;
and in response to the duration being greater than a time threshold, determining the element corresponding to the zoom-in operation data as the interest sub-element.
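Claims 9 and 10 gate the zoomed-in element on a magnification-ratio threshold and a dwell-time threshold respectively; both reduce to simple comparisons. The default threshold values below are illustrative assumptions, not values from the patent.

```python
def is_interest_by_zoom_ratio(zoom_ratio: float,
                              ratio_threshold: float = 2.0) -> bool:
    """Claim 9: the zoomed-in element becomes an interest sub-element when
    the zoom-in ratio exceeds the ratio threshold."""
    return zoom_ratio > ratio_threshold

def is_interest_by_zoom_duration(duration_s: float,
                                 time_threshold: float = 3.0) -> bool:
    """Claim 10: the zoomed-in element becomes an interest sub-element when
    the area stays zoomed in longer than the time threshold."""
    return duration_s > time_threshold
```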
11. An information recommendation method, characterized in that the method comprises:
displaying a video picture when a first video is played, wherein the first video is a video published by a first user account;
collecting at least one of eyeball tracking data and screen operation data;
displaying information of a second user account on the video picture, wherein the second user account is associated with an interest sub-element of an interest element, the interest element is located in a user interest area in the video picture, and the user interest area is obtained through at least one of the eyeball tracking data and the screen operation data.
12. An information recommendation apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a user interest area of a target client when a video picture of a first video is played, wherein the first video is a video published by a first user account;
an element extraction module, configured to extract an interest element from the user interest area, wherein the interest element comprises at least one of a character element and an object element;
the element extraction module being further configured to extract an interest sub-element from the interest element;
and an information recommendation module, configured to recommend information of a second user account to the target client in response to the interest sub-element being associated with the second user account.
13. An information recommendation apparatus, characterized in that the apparatus comprises:
a display module, configured to display a video picture when a first video is played, wherein the first video is a video published by a first user account;
an acquisition module, configured to collect at least one of eyeball tracking data and screen operation data;
the display module being further configured to display information of a second user account on the video picture, wherein the second user account is associated with an interest sub-element of an interest element, the interest element is located in a user interest area in the video picture, and the user interest area is obtained through at least one of the eyeball tracking data and the screen operation data.
14. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the information recommendation method of any one of claims 1 to 11.
15. A computer-readable storage medium, having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the information recommendation method of any one of claims 1 to 11.
CN202010731307.4A 2020-07-27 2020-07-27 Information recommendation method, device, equipment and storage medium Active CN111698564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010731307.4A CN111698564B (en) 2020-07-27 2020-07-27 Information recommendation method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111698564A true CN111698564A (en) 2020-09-22
CN111698564B CN111698564B (en) 2021-12-21

Family

ID=72486750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010731307.4A Active CN111698564B (en) 2020-07-27 2020-07-27 Information recommendation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111698564B (en)


Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090271826A1 (en) * 2008-04-24 2009-10-29 Samsung Electronics Co., Ltd. Method of recommending broadcasting contents and recommending apparatus therefor
CN103186595A (en) * 2011-12-29 2013-07-03 盛乐信息技术(上海)有限公司 Method and system for recommending audios/videos
US20170109585A1 (en) * 2015-10-20 2017-04-20 Gopro, Inc. System and method of providing recommendations of moments of interest within video clips post capture
CN106651501A (en) * 2016-10-13 2017-05-10 雷升庆 Clothing matching method and system based on human body feature information
CN107105333A (en) * 2017-04-26 2017-08-29 电子科技大学 A kind of VR net casts exchange method and device based on Eye Tracking Technique
CN108229324A (en) * 2017-11-30 2018-06-29 北京市商汤科技开发有限公司 Gesture method for tracing and device, electronic equipment, computer storage media
CN108259973A (en) * 2017-12-20 2018-07-06 青岛海信电器股份有限公司 The display methods of the graphic user interface of smart television and television image sectional drawing
CN110309428A (en) * 2019-04-28 2019-10-08 上海掌门科技有限公司 A kind of method and apparatus for recommending social object
CN110737859A (en) * 2019-09-09 2020-01-31 苏宁云计算有限公司 UP main matching method and device
CN111274925A (en) * 2020-01-17 2020-06-12 腾讯科技(深圳)有限公司 Method and device for generating recommended video, electronic equipment and computer storage medium
CN111309940A (en) * 2020-02-14 2020-06-19 北京达佳互联信息技术有限公司 Information display method, system, device, electronic equipment and storage medium
CN111400605A (en) * 2020-04-26 2020-07-10 Oppo广东移动通信有限公司 Recommendation method and device based on eyeball tracking


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN Liang et al.: "Mobile Video Recommendation Strategy Based on DNN Algorithm", Chinese Journal of Computers *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364202A (en) * 2020-11-06 2021-02-12 上海众源网络有限公司 Video recommendation method and device and electronic equipment
CN112364202B (en) * 2020-11-06 2023-11-14 上海众源网络有限公司 Video recommendation method and device and electronic equipment
CN113011919A (en) * 2021-03-10 2021-06-22 腾讯科技(深圳)有限公司 Method and device for identifying interest object, recommendation method, medium and electronic equipment
CN113011919B (en) * 2021-03-10 2024-02-02 腾讯科技(深圳)有限公司 Method and device for identifying object of interest, recommendation method, medium and electronic equipment
TWI828139B (en) * 2021-05-17 2024-01-01 韓商連加股份有限公司 Method, computer device, and computer program to recommend account
CN113283348A (en) * 2021-05-28 2021-08-20 青岛海尔科技有限公司 Method and device for determining interest value, storage medium and electronic device
CN114442986A (en) * 2022-01-30 2022-05-06 深圳创维-Rgb电子有限公司 Information transmission method and device, screen projector and storage medium

Also Published As

Publication number Publication date
CN111698564B (en) 2021-12-21

Similar Documents

Publication Publication Date Title
CN111698564B (en) Information recommendation method, device, equipment and storage medium
CN111652678B (en) Method, device, terminal, server and readable storage medium for displaying article information
CN108304441B (en) Network resource recommendation method and device, electronic equipment, server and storage medium
US10881348B2 (en) System and method for gathering and analyzing biometric user feedback for use in social media and advertising applications
CN105027033B (en) Method, device and computer-readable media for selecting Augmented Reality object
CN111726536A (en) Video generation method and device, storage medium and computer equipment
JP5527423B2 (en) Image processing system, image processing method, and storage medium storing image processing program
US20180253196A1 (en) Method for providing application, and electronic device therefor
CN110163066B (en) Multimedia data recommendation method, device and storage medium
KR20160037074A (en) Image display method of a apparatus with a switchable mirror and the apparatus
CN108270794B (en) Content distribution method, device and readable medium
CN111291200B (en) Multimedia resource display method and device, computer equipment and storage medium
CN111897996A (en) Topic label recommendation method, device, equipment and storage medium
CN111506758A (en) Method and device for determining article name, computer equipment and storage medium
CN111339938A (en) Information interaction method, device, equipment and storage medium
CN112148404A (en) Head portrait generation method, apparatus, device and storage medium
CN113469779A (en) Information display method and device
CN110213307B (en) Multimedia data pushing method and device, storage medium and equipment
CN109829067B (en) Audio data processing method and device, electronic equipment and storage medium
US11328187B2 (en) Information processing apparatus and information processing method
CN111754272A (en) Advertisement recommendation method, recommended advertisement display method, device and equipment
US20220189128A1 (en) Temporal segmentation
CN112070586B (en) Item recommendation method and device based on semantic recognition, computer equipment and medium
CN114385854A (en) Resource recommendation method and device, electronic equipment and storage medium
CN115905374A (en) Application function display method and device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40028941

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant