CN109874054B - Advertisement recommendation method and device - Google Patents


Info

Publication number
CN109874054B
Authority
CN
China
Prior art keywords: face, human, audience, image, eye
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910114138.7A
Other languages
Chinese (zh)
Other versions
CN109874054A (en)
Inventor
陈海波
Current Assignee
Deep Blue Technology Shanghai Co Ltd
Original Assignee
Deep Blue Technology Shanghai Co Ltd
Priority date
Filing date
Publication date
Application filed by Deep Blue Technology Shanghai Co Ltd
Priority to CN201910114138.7A
Publication of CN109874054A
Application granted
Publication of CN109874054B
Legal status: Active


Abstract

The invention discloses an advertisement recommendation method and device, which are used for solving the problem of low recommendation efficiency of the existing advertisement recommendation method. The advertisement recommendation method comprises the following steps: collecting video frame images of audiences watching advertisements in a set advertisement time period; acquiring a face image in each video frame image; determining the state information of the corresponding audience in the video frame image according to the face image; determining the audience type interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience; automatically recommending the advertisement to the type of viewer during the advertisement time period.

Description

Advertisement recommendation method and device
Technical Field
The invention relates to the technical field of information recommendation, in particular to an advertisement recommendation method and device.
Background
With the development of internet television, which, compared with traditional advertising, has the advantages of wide coverage, a broad audience, high cost-effectiveness, and good interactivity, internet television has become a main channel through which more and more merchants promote their products. However, because manpower and material resources are limited and the amount of data is huge, it is difficult for advertisers to obtain information about the audience of their delivered advertisements and thereby verify whether the advertisements are effective.
Existing advertisement recommendation systems use characteristics such as a user's needs and interests as conditions for filtering information, and recommend product information the user may be interested in. The information sources of such a system strongly influence its recommendation effect, yet traditional sources generally judge a user's preference for advertisements from commodity sales records, questionnaires, and the like, so recommendation efficiency is low.
Disclosure of Invention
In order to solve the problem of low recommendation efficiency of the existing advertisement recommendation mode, the embodiment of the invention provides an advertisement recommendation method and device.
In a first aspect, an embodiment of the present invention provides an advertisement recommendation method, including:
collecting video frame images of audiences watching advertisements in a set advertisement time period;
acquiring a face image in each video frame image;
determining the state information of the corresponding audience in the video frame image according to the face image;
determining the audience type interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience;
automatically recommending the advertisement to the type of viewer during the advertisement time period.
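Once per-frame state has been extracted, the steps above reduce to a simple aggregation over frames. A minimal sketch in Python (all names, field choices, and thresholds here are illustrative placeholders, not taken from the patent):

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical per-frame observation of one viewer. viewer_id is the result
# of matching the same face across frames; the other fields come from the
# per-frame state determination step.
@dataclass(frozen=True)
class Observation:
    viewer_id: int
    gender: str
    age_group: str
    attentive: bool   # open eyes + frontal face
    smiling: bool

def interested_types(observations, fps=1.0,
                     min_seconds=5.0, min_good_frames=3):
    """Return the (gender, age_group) types of viewers deemed interested:
    watched longer than min_seconds, with more than min_good_frames frames
    that are both attentive and smiling."""
    frames = Counter(o.viewer_id for o in observations)
    good = Counter(o.viewer_id for o in observations
                   if o.attentive and o.smiling)
    types = set()
    for o in observations:
        if (frames[o.viewer_id] / fps > min_seconds
                and good[o.viewer_id] > min_good_frames):
            types.add((o.gender, o.age_group))
    return types
```

The advertisement would then be targeted at the returned (gender, age group) pairs in later advertisement time periods.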
According to the advertisement recommendation method provided by the embodiment of the invention, a server collects video frame images of the audience watching an advertisement within a set advertisement time period and acquires the face image in each video frame image. The state information of the corresponding viewer in each video frame image is determined from the acquired face image; the audience type interested in the advertisement is then determined from the state information of each viewer in each video frame image within the set advertisement time period and from each viewer's watching duration, and the advertisement is automatically recommended to that audience type in later advertisement time periods. Compared with the prior art, the viewers' state information is determined from face image information extracted from video frames of the audience watching the advertisement, and the group interested in the advertisement is obtained from that state information. The advertisement backend can therefore automatically and accurately push, in real time, advertisements that users are interested in. This saves advertising cost, improves recommendation efficiency, and increases placement accuracy; suggestions for advertisement placement strategies can also be generated automatically, saving the manpower and time cost of compiling advertisement statistics manually.
Preferably, the obtaining of the face image in the video frame image specifically includes:
extracting face position coordinates in the video frame image according to a preset face detection model, wherein the face position coordinates comprise face key point position coordinates;
and intercepting a face image according to the face position coordinates.
In the above preferred embodiment, the face position coordinates in the video frame image may be extracted according to a preset face detection model, where the face position coordinates include face key point position coordinates, and the face image may be captured according to the face position coordinates.
Preferably, the status information includes gender, age, expression, opening and closing status of human eyes and rotation angle of human face;
determining the state information of the corresponding audience in the video frame image according to the face image, which specifically comprises:
inputting the face image into a preset gender classification model, and acquiring the gender of the audience corresponding to the face image; and
inputting the face image into a preset age classification model, and acquiring the age of the audience corresponding to the face image; and
inputting the facial image into a preset expression classification model, and acquiring the expression of the audience corresponding to the facial image; and
extracting the human eye position coordinates from the human face key point position coordinates, extracting human eye images according to the human eye position coordinates, inputting the human eye images into a preset human eye classification model, and acquiring the human eye opening and closing state of the audience corresponding to the human face images; and
and determining the face rotation angle according to the position coordinates of the face key points and the position coordinates of the face key points of a preset standard frontal face.
In the above preferred embodiment, the state information of the viewers corresponding to the facial image may include gender, age, expression, eye opening and closing states, and a face rotation angle, the gender, age, expression, eye opening and closing states of the viewers may be obtained according to respective corresponding preset neural network classification models, and the face rotation angle may be determined according to the position coordinates of the key point of the face and the position coordinates of the key point of the face of the preset standard frontal face. And the method for acquiring the state information of the audience by using each pre-trained neural network classification model improves the classification accuracy and the calculation efficiency.
Optionally, after determining the face rotation angle according to the face key point position coordinates and the face key point position coordinates of the preset standard front face, the method further includes:
and judging whether the attention of the audience corresponding to the face image is concentrated or not according to the opening and closing state of the human eyes and the face rotation angle.
Preferably, the determining whether the attention of the audience corresponding to the face image is concentrated according to the opening and closing states of the human eyes and the rotation angle of the face includes:
if the eye opening and closing state of the human eyes is eye opening and the rotation angle of the human face is within a preset angle range, judging that the attention of the audience corresponding to the human face image is concentrated;
and if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated.
In the above preferred embodiment, whether the attention of the viewer is focused or not is determined according to the opening/closing state of the human eyes and the rotation angle of the human face, and whether the attention is focused or not can be used as one of the indexes for determining whether the viewer is interested in the advertisement or not.
Preferably, the determining the type of the audience interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience specifically includes:
counting the viewers in the video frame images whose watching duration is longer than a preset duration and for whom the number of video frames showing concentrated attention and a smiling expression is greater than a preset number;
determining the counted age and gender of the audience as the type of audience interested in the advertisement.
In the above preferred embodiment, the type of the viewer interested in the advertisement is determined according to the three indicators of the viewing duration, the attention concentration and the smiling state of the viewer, so that the positioning of the viewer interested in the advertisement is more accurate.
In a second aspect, an embodiment of the present invention provides an advertisement recommendation apparatus, including:
the acquisition unit is used for acquiring video frame images of audiences watching advertisements in a set advertisement time period;
the acquisition unit is used for acquiring a face image in each video frame image;
the first determining unit is used for determining the state information of the corresponding audience in the video frame image according to the face image;
the second determining unit is used for determining the type of the audience interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience;
and the recommending unit is used for automatically recommending the advertisement to the type of audiences in the advertisement time period.
Preferably, the obtaining unit is specifically configured to extract face position coordinates in the video frame image according to a preset face detection model, where the face position coordinates include face key point position coordinates; and intercepting a face image according to the face position coordinates.
Preferably, the status information includes gender, age, expression, opening and closing status of human eyes and rotation angle of human face;
the first determining unit is specifically configured to input the face image into a preset gender classification model, and acquire the gender of the audience corresponding to the face image; inputting the face image into a preset age classification model to obtain the age of the audience corresponding to the face image; inputting the face image into a preset expression classification model, and acquiring the expression of the audience corresponding to the face image; extracting the human eye position coordinates from the human face key point position coordinates, extracting human eye images according to the human eye position coordinates, inputting the human eye images into a preset human eye classification model, and acquiring the human eye opening and closing state of the audience corresponding to the human face images; and determining the face rotation angle according to the position coordinates of the face key points and the position coordinates of the face key points of a preset standard frontal face.
Optionally, the apparatus further comprises:
and the judging unit is used for judging whether the attention of the audience corresponding to the face image is concentrated according to the opening and closing state of the human eyes and the face rotation angle after determining the face rotation angle according to the position coordinates of the key points of the human face and the position coordinates of the key points of the human face of a preset standard frontal face.
Preferably, the determining unit is specifically configured to determine that the attention of the viewer corresponding to the face image is concentrated if the eye opening/closing state of the human eyes is eye opening and the rotation angle of the face is within a preset angle range; and if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated.
Preferably, the second determining unit is specifically configured to count the viewers in the video frame images whose watching duration is longer than a preset duration and for whom the number of video frames showing concentrated attention and a smiling expression is greater than a preset number; and to determine the age and gender of the counted viewers as the audience type interested in the advertisement.
The technical effects of the advertisement recommendation device provided by the present invention can be seen in the technical effects of the first aspect or the implementation manners of the first aspect, which are not described herein again.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the advertisement recommendation method according to the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps in the advertisement recommendation method according to the present invention.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 is a schematic flow chart of an implementation of an advertisement recommendation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an implementation flow of acquiring a face image in a video frame image according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating five key points of a standard front face according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the distribution of key points of an actual face and key points of a standard frontal face when the face rotates left and right in the embodiment of the present invention;
FIG. 5 is a flow chart illustrating the determination of viewer types of interest to an advertisement in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an advertisement recommendation apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to solve the problem of low recommendation efficiency of the existing advertisement recommendation mode, the embodiment of the invention provides an advertisement recommendation method and device.
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are merely for illustrating and explaining the present invention, and are not intended to limit the present invention, and that the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
In this context, it is to be understood that, in the technical terms referred to in the present invention:
MTCNN (Multi-task Convolutional Neural Networks): MTCNN consists of 3 CNNs (Convolutional Neural Networks) in a cascaded manner (P-Net, R-Net, O-Net).
Proposal Network (P-Net): this network mainly obtains candidate windows and bounding-box regression vectors for face regions. The regression vectors are used to calibrate the candidate windows, and highly overlapping candidates are then merged by Non-maximum suppression (NMS).
Refine Network (R-Net): this network further removes false-positive regions by bounding-box regression and NMS. Unlike P-Net, it adds a fully connected layer, which suppresses false positives more effectively.
Output Network (O-Net): this network has one more convolutional layer than R-Net, so its processing result is finer; its role is otherwise the same as R-Net's. It supervises the face region more strictly and additionally outputs 5 facial landmarks.
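The NMS step used by P-Net and R-Net can be sketched in a few lines. The greedy IoU-threshold form below is a standard implementation; the (x1, y1, x2, y2) box format and the 0.5 threshold are assumptions, not values from the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, threshold=0.5):
    """Greedily keep the highest-scoring boxes, dropping any box whose IoU
    with an already-kept box exceeds the threshold. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep
```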
As shown in fig. 1, which is a schematic implementation flow diagram of an advertisement recommendation method provided by an embodiment of the present invention, the method may include the following steps:
and S11, collecting video frame images of audiences watching the advertisements in the set advertisement time period.
In specific implementation, the server collects video frame images of audiences watching the advertisements in a set advertisement time period, wherein the set advertisement time period can be a certain set advertisement playing time period.
Specifically, a video frame image of a viewer who is watching the advertisement before the network television can be captured by using a camera device installed on the network television.
And S12, acquiring the face image in the video frame image aiming at each video frame image.
In specific implementation, the server acquires a face image in each acquired video frame image.
Specifically, the step of acquiring the face image in the video frame image by the steps shown in fig. 2 includes the following steps:
and S121, extracting face position coordinates in the video frame image according to a preset face detection model.
In this step, the face position coordinates include face key point position coordinates. The preset face detection model is a pre-trained neural network model for face detection, and in the training process, the neural network may be, but is not limited to, MTCNN, which is not limited in the embodiment of the present invention.
Taking MTCNN as an example, the classification training process is as follows: a large number of face and non-face sample images are used for training to obtain a binary classifier, i.e., one that judges whether a region is a face. The goal of face detection is to find the positions of all faces in the image; the output of the algorithm is the coordinates of each face's circumscribed (bounding) rectangle in the image, and it may also include face key point information, i.e., the position coordinates of the face key points. The face key points comprise at least five points: the two eyes, the nose, and the two mouth corners, as shown in fig. 3, a schematic diagram of the five key points of a standard frontal face.
The training process is as follows:
training data set: the image detection method comprises a face image and annotation information thereof, wherein the annotation information comprises coordinates of an upper left corner point and coordinates of a lower right corner point of an image detection frame and coordinates of key points of the face.
Training: and inputting the labeled face picture and the corresponding labeling information into a neural network to obtain a face detection model.
And S122, intercepting a face image according to the face position coordinates.
The face image is cropped according to the acquired face position coordinates, i.e., the coordinates of the face's circumscribed rectangle in the video frame image.
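Cropping by the circumscribed-rectangle coordinates is a simple array slice. A minimal sketch, assuming the box comes as (x1, y1, x2, y2) pixel coordinates (this convention is an assumption) and clipping to the frame so out-of-range boxes stay valid:

```python
import numpy as np

def crop_face(frame, box):
    """Return the face region of a frame (H x W x C array) given a
    bounding rectangle (x1, y1, x2, y2), clipped to the frame bounds."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    return frame[y1:y2, x1:x2]
```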
And S13, determining the state information of the corresponding audience in the video frame image according to the face image.
In specific implementation, the state information includes gender, age, expression, opening and closing states of human eyes, and a rotation angle of a human face. Specifically, the gender, age, expression, human eye opening and closing state and human face rotation angle of the corresponding audience in the video frame image are determined according to the human face image.
Specifically, the gender of the corresponding viewer in a video frame image can be obtained as follows: input the face image into a preset gender classification model and acquire the gender of the viewer corresponding to the face image. The preset gender classification model is a pre-trained neural network model for gender detection; during training, the neural network used may be, but is not limited to, GoogLeNet, VGGNet (Visual Geometry Group Network), AlexNet, or the like, which is not limited in the embodiment of the present invention.
Taking GoogLeNet as an example, the training process of the gender classification model is as follows: the data set consists of a large number of face images and their annotations, where the annotation is male (0) or female (1). The annotated face images are input into the GoogLeNet network, which outputs a predicted value; the difference between the predicted value and the ground-truth annotation gives an error value, which is back-propagated through the network by gradient descent to correct its parameters until the network's predictions approach the ground truth, yielding the gender classification model.
When the gender classification model is specifically implemented, the face image is input into the gender classification model, and the predicted gender can be obtained.
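The inference step can be sketched as follows. The trained network itself is not shown, so raw two-class scores (logits) stand in for its output layer; the 0/1 label coding follows the annotation scheme above, while the function names are illustrative:

```python
import numpy as np

GENDER_LABELS = {0: "male", 1: "female"}  # label coding from the annotations

def softmax(logits):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def predict_gender(logits):
    """Map the two raw class scores a trained classifier would emit for one
    face image to a label and a confidence."""
    probs = softmax(np.asarray(logits, dtype=float))
    idx = int(np.argmax(probs))
    return GENDER_LABELS[idx], float(probs[idx])
```

Age and expression prediction follow the same pattern with their own label sets.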
Similarly, the age of the corresponding viewer in the video frame image can be obtained in a similar manner: input the face image into a preset age classification model and acquire the age of the viewer corresponding to the face image. The preset age classification model is a pre-trained neural network model for age detection; during training, the neural network used may be, but is not limited to, GoogLeNet, VGGNet (Visual Geometry Group Network), AlexNet, or the like, which is not limited in the embodiment of the present invention.
The data set adopted in the training process of the age classification model is a face image and the labeled information thereof, the labeled information is age numbers, the training process is similar to the training process of the gender classification model, and the detailed description is omitted here.
When the method is specifically implemented, the face image is input into the age classification model, and a predicted age value can be obtained.
Similarly, the expression of the corresponding viewer in the video frame image can be obtained in a similar manner: input the face image into a preset expression classification model and acquire the expression of the viewer corresponding to the face image. The preset expression classification model is a pre-trained neural network model for expression detection; during training, the neural network used may be, but is not limited to, GoogLeNet, VGGNet (Visual Geometry Group Network), AlexNet, or the like, which is not limited in the embodiment of the present invention.
The data set used in training the expression classification model consists of face images and their annotations, where the annotation is no expression (0) or smile (1); the training process is similar to that of the gender classification model and is not repeated here.
When the facial image is specifically implemented, the facial image is input into the expression classification model, and a predicted expression, namely smiling or no expression, can be obtained.
Similarly, the open/closed state of the eyes of the corresponding viewer in the video frame image can be obtained in a similar manner: extract the eye position coordinates from the face key point position coordinates, extract eye images according to those coordinates, input the eye images into a preset eye classification model, and acquire the eye open/closed state of the viewer corresponding to the face image. The preset eye classification model is a pre-trained neural network model for eye state detection; during training, the neural network used may be, but is not limited to, GoogLeNet, VGGNet (Visual Geometry Group Network), AlexNet, or the like, which is not limited in the embodiment of the present invention.
The data set adopted in the training process of the human eye classification model is a human face image and the labeled information thereof, the labeled information is eye opening (0) and eye closing (1), the training process is similar to the training process of the gender classification model, and the detailed description is omitted here.
In specific implementation, the eye image is input into the eye classification model to obtain the predicted eye state, i.e., open or closed.
In addition, the rotation angle of the face of the corresponding viewer in the video frame image can be obtained by the following method: and determining the face rotation angle according to the position coordinates of the face key points and the position coordinates of the face key points of a preset standard frontal face.
In specific implementation, take the left-right rotation angle of the face (i.e., the rotation angle about the X axis) as an example. As shown in fig. 4, a schematic distribution diagram of the key points of an actual face and of a standard frontal face when the face rotates left and right, points 131, 132, 133, 134, and 135 are the five key points of the standard frontal face, representing the left eye, right eye, nose, left mouth corner, and right mouth corner respectively; points 137 and 138 mark the positions of maximum left and right rotation, i.e., rotation by 90 degrees; and point 136 is the position of the corresponding key point (the nose) of the actual face. L denotes the horizontal distance between the left eye 131 of the standard frontal face and the actual nose 136, R denotes the horizontal distance between the right eye 132 of the standard frontal face and the actual nose 136, and S denotes the X-axis spacing between the left and right eyes of the standard frontal face. If L < R, the face is turned to the left; if L > R, the face is turned to the right. The left-right face rotation angle can be calculated by the following formula:
θ = ((L − R) / S) × 90°
(a negative θ indicates a turn to the left, a positive θ a turn to the right)
the calculation of the rotation angles of the upper and lower faces (i.e., the rotation angle in the Y-axis direction) is analogized, and details thereof are not described herein.
When the rotation angle of the human face in the X-axis direction and the rotation angle of the human face in the Y-axis direction are both within a preset angle range, determining the human face as a front face, and when at least one of the rotation angle of the human face in the X-axis direction and the rotation angle of the human face in the Y-axis direction is not within the preset angle range, determining the human face as a side face. For example, the preset angle range may be [ -30 °, 30 ° ]. In a specific implementation process, the preset angle range may be set according to an empirical value, which is not limited in the embodiment of the present invention.
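Under the L/R/S construction described above, the rotation-angle computation and the frontal/side decision might look like the following sketch; the exact formula is a reconstruction from the text, not quoted from the patent:

```python
def rotation_angle_x(left_eye, right_eye, nose):
    """Left-right (X-axis) face rotation in degrees from three key points,
    following the L/R/S construction: L and R are the horizontal distances
    from the actual nose to the standard left and right eyes, S is the eye
    spacing. Negative means turned left, positive turned right."""
    L = nose[0] - left_eye[0]
    R = right_eye[0] - nose[0]
    S = right_eye[0] - left_eye[0]
    return (L - R) / S * 90.0

def is_frontal(angle_x, angle_y, limit=30.0):
    """Frontal face if both rotation angles fall within [-limit, limit];
    the +/-30 degree range is the example value given in the text."""
    return abs(angle_x) <= limit and abs(angle_y) <= limit
```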
It should be noted that the human face may have different viewing angles and poses in the image, so that the human face needs to be aligned, and the alignment method can perform affine transformation according to five key points of the standard human face to correct the human face.
Further, whether the attention of the audience corresponding to the face image is concentrated is judged according to the opening and closing state of human eyes and the rotation angle of the face.
Specifically, if the eye opening and closing state of the human eyes is eye opening and the rotation angle of the human face is within a preset angle range, the attention of the audience corresponding to the human face image is judged to be concentrated; and if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated.
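This attention rule is a pair of boolean checks. A minimal sketch, assuming the ±30° range given above as the preset angle range:

```python
def is_attentive(eyes_open, angle_x, angle_y, angle_limit=30.0):
    """A viewer counts as attentive when the eyes are open and both face
    rotation angles lie within the preset range."""
    return (eyes_open
            and abs(angle_x) <= angle_limit
            and abs(angle_y) <= angle_limit)
```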
S14: Determine the viewer type interested in the advertisement according to the state information of each viewer in each video frame image and the viewing duration of each viewer.
In particular, the viewer type interested in the advertisement may be determined according to the process shown in fig. 5, which includes the following steps:
S141: Count the viewers whose viewing duration is longer than a preset duration and for whom the number of video frames showing focused attention and a smiling expression is larger than a preset number.
In a specific implementation, the faces in the video frame images first need to be matched, and the number of video frames containing the same viewer is counted to determine that viewer's viewing duration. Specifically, face features are extracted from the face images in different video frame images to obtain face feature vectors, and face feature matching is performed on the vectors extracted from different frames. A face feature vector can be extracted as follows: input the face image into a preset face feature extraction and classification network to obtain the corresponding face feature vector. Matching can then be performed by computing the Euclidean distance between each pair of face feature vectors: the smaller the Euclidean distance, the higher the similarity; the larger the distance, the lower the similarity. Faces in different video frame images are matched on this basis.
For example, face feature vector 1: T1 = (x11, x12, ..., x1i);
face feature vector 2: T2 = (x21, x22, ..., x2i).
The Euclidean distance between face feature vector 1 and face feature vector 2 is:
d(T1, T2) = sqrt((x11 - x21)^2 + (x12 - x22)^2 + ... + (x1i - x2i)^2)
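The distance computation and the matching decision can be sketched as below; the threshold value is illustrative only, not taken from the patent:

```python
import numpy as np

def face_distance(t1, t2):
    """Euclidean distance between two face feature vectors."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    return float(np.sqrt(((t1 - t2) ** 2).sum()))

def same_viewer(t1, t2, threshold=0.6):
    """Smaller distance means higher similarity; two face images are
    treated as the same viewer when the distance falls below a threshold
    (the value 0.6 is a hypothetical choice for illustration)."""
    return face_distance(t1, t2) < threshold
```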
Further, among the viewers in the video frame images, count those whose viewing duration is longer than the preset duration and for whom the number of frames showing focused attention and a smiling expression is larger than the preset number. The preset duration and the preset number may be set according to empirical values, which is not limited in the embodiments of the present invention.
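The counting step can be sketched as a pass over per-frame records. The record layout and field names below are illustrative assumptions, not taken from the patent:

```python
from collections import defaultdict

def interested_viewers(frames, min_duration, min_frames):
    """Select viewer ids whose viewing duration exceeds min_duration and
    whose count of frames with focused attention and a smiling expression
    exceeds min_frames. `frames` is a list of per-frame records such as
    {"viewer": 1, "duration": 12.0, "focused": True, "expression": "smile"}."""
    qualifying = defaultdict(int)  # frames with focus + smile, per viewer
    duration = {}                  # viewing duration per viewer
    for rec in frames:
        v = rec["viewer"]
        duration[v] = rec["duration"]
        if rec["focused"] and rec["expression"] == "smile":
            qualifying[v] += 1
    return sorted(v for v in duration
                  if duration[v] > min_duration and qualifying[v] > min_frames)
```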
S142: Determine the age and gender of the counted viewers as the viewer type interested in the advertisement.
S15: Automatically recommend the advertisement to viewers of that type during the advertisement time period.
According to the advertisement recommendation method provided by the embodiments of the present invention, a server collects video frame images of viewers watching an advertisement during a set advertisement time period, acquires the face image in each video frame image, and determines the state information of the corresponding viewer from the acquired face image. It then determines the viewer type interested in the advertisement according to the state information of each viewer in each video frame image during the set time period and each viewer's viewing duration, and automatically recommends the advertisement to viewers of that type in later advertisement time periods. Compared with the prior art, the state information of the viewers is determined from face image information extracted from video frames of viewers watching the advertisement, and the group interested in the advertisement is derived from that state information, so that the advertising backend can automatically and accurately push advertisements of interest to users in real time. This saves advertising cost, improves recommendation efficiency, and increases placement accuracy; suggestions for advertisement placement strategies can also be generated automatically, saving the labor and time cost of compiling advertisement statistics manually.
Based on the same inventive concept, an embodiment of the present invention further provides an advertisement recommendation apparatus. Since the principle by which the apparatus solves the problem is similar to that of the advertisement recommendation method, the implementation of the apparatus may refer to the implementation of the method, and repeated parts are not described again.
Fig. 6 is a schematic structural diagram of an advertisement recommendation apparatus provided in an embodiment of the present invention. The advertisement recommendation apparatus may include:
the acquisition unit 21 is used for acquiring video frame images of audiences watching advertisements in a set advertisement time period;
an acquiring unit 22, configured to acquire, for each video frame image, a face image in the video frame image;
the first determining unit 23 is configured to determine, according to the face image, state information of a corresponding viewer in the video frame image;
a second determining unit 24, configured to determine a type of the viewer interested in the advertisement according to the state information of each viewer in each video frame image and the viewing duration of each viewer;
a recommending unit 25 for automatically recommending the advertisement to the type of viewer in the advertisement time period.
Preferably, the obtaining unit 22 is specifically configured to extract a face position coordinate in the video frame image according to a preset face detection model, where the face position coordinate includes a face key point position coordinate; and intercepting a face image according to the face position coordinates.
Preferably, the status information includes gender, age, expression, opening and closing status of human eyes and rotation angle of human face;
the first determining unit 23 is specifically configured to input the face image into a preset gender classification model, and obtain the gender of the audience corresponding to the face image; inputting the face image into a preset age classification model to obtain the age of the audience corresponding to the face image; inputting the face image into a preset expression classification model, and acquiring the expression of the audience corresponding to the face image; extracting the human eye position coordinates from the human face key point position coordinates, extracting human eye images according to the human eye position coordinates, inputting the human eye images into a preset human eye classification model, and acquiring the human eye opening and closing state of the audience corresponding to the human face images; and determining the face rotation angle according to the position coordinates of the face key points and the position coordinates of the face key points of a preset standard frontal face.
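The first determining unit's use of the preset classification models can be sketched as a single assembly step. The `models` dict of callables below is a hypothetical stand-in for the preset gender, age, expression, and eye classification networks described above:

```python
def viewer_state(face_image, eye_image, models):
    """Assemble a viewer's state information by running the preset
    classification models on the face image (and the eye image cropped
    from the face key-point coordinates)."""
    return {
        "gender": models["gender"](face_image),
        "age": models["age"](face_image),
        "expression": models["expression"](face_image),
        "eyes": models["eye"](eye_image),
    }
```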
Optionally, the apparatus further comprises:
and the judging unit is used for judging whether the attention of the audience corresponding to the face image is concentrated according to the opening and closing state of the human eyes and the face rotation angle after determining the face rotation angle according to the position coordinates of the key points of the human face and the position coordinates of the key points of the human face of a preset standard frontal face.
Preferably, the determining unit is specifically configured to determine that the attention of the viewer corresponding to the face image is concentrated if the eye opening/closing state of the human eyes is eye opening and the rotation angle of the face is within a preset angle range; and if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated.
Preferably, the second determining unit 24 is specifically configured to count viewers in the video frame images, for which the watching duration is longer than a preset duration, and the number of frames of the video frame images with concentrated attention and smile expression is greater than a preset number; determining the counted age and gender of the audience as the type of audience interested in the advertisement.
Based on the same technical concept, an embodiment of the present invention further provides an electronic device 300. Referring to fig. 7, the electronic device 300 is configured to implement the advertisement recommendation method described in the foregoing method embodiments, and may include: a memory 301, a processor 302, and a computer program, such as an advertisement recommendation program, stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps in the above embodiments of the advertisement recommendation method, such as step S11 shown in fig. 1. Alternatively, when executing the computer program, the processor implements the functions of each module/unit in the above-described apparatus embodiments, for example, the unit 21.
The embodiment of the present invention does not limit the specific connection medium between the memory 301 and the processor 302. In the embodiment of the present application, the memory 301 and the processor 302 are connected by a bus 303, represented by a thick line in fig. 7; the manner of connection between the other components is merely illustrative and is not limiting. The bus 303 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 7, but this does not mean that there is only one bus or one type of bus.
The memory 301 may be a volatile memory, such as a random-access memory (RAM); the memory 301 may also be a non-volatile memory, such as, but not limited to, a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 301 may also be a combination of the above memories.
The processor 302 is configured to implement the advertisement recommendation method shown in fig. 1 by calling the computer program stored in the memory 301 to execute steps S11 to S15 shown in fig. 1.
An embodiment of the present application further provides a computer-readable storage medium storing the computer-executable instructions required by the above processor, including the program to be executed by the processor.
In some possible embodiments, aspects of the advertisement recommendation method provided by the present invention may also be implemented in a form of a program product, which includes program code for causing an electronic device to perform the steps in the advertisement recommendation method according to various exemplary embodiments of the present invention described above in this specification when the program product runs on the electronic device, for example, the electronic device may perform steps S11 to S15 as shown in fig. 1.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for advertisement recommendation of embodiments of the present invention may employ a portable compact disk read only memory (CD-ROM) and include program code, and may be run on a computing device. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device over any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., over the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the units described above may be embodied in one unit, according to embodiments of the invention. Conversely, the features and functions of one unit described above may be further divided into embodiments by a plurality of units.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. An advertisement recommendation method, comprising:
collecting video frame images of audiences watching advertisements in a set advertisement time period;
acquiring a face image in each video frame image;
determining state information of corresponding audiences in the video frame image according to the face image, wherein the state information comprises gender, age, expression, opening and closing states of human eyes and a face rotation angle; determining the rotation angle of the face of the corresponding audience in the video frame image by the following method: determining the face rotation angle according to the position coordinates of the key points of the face and the position coordinates of the key points of the face of a preset standard frontal face, wherein the position coordinates of the key points at least comprise: a left eye position coordinate, a right eye position coordinate, and a nose position coordinate;
determining the face rotation angle according to the position coordinates of the key points of the face and the position coordinates of the key points of the face of a preset standard frontal face, which specifically comprises the following steps: determining the vertical distance between the left eye of the standard frontal face and the nose of the human face according to the position coordinates of the nose of the human face and the position coordinates of the left eye of the standard frontal face; determining the vertical distance between the right eye of the standard frontal face and the nose of the human face according to the position coordinates of the nose of the human face and the position coordinates of the right eye of the standard frontal face; determining the distance between the left eye of the standard frontal face and the right eye of the standard frontal face in the X-axis direction; determining the rotation direction of the face according to the vertical distance between the left eye of the standard frontal face and the nose of the face and the vertical distance between the right eye of the standard frontal face and the nose of the face, and determining the rotation angle of the face according to the vertical distance between the left eye of the standard frontal face and the nose of the face and the distance between the left eye of the standard frontal face and the right eye of the standard frontal face in the X-axis direction;
judging whether the attention of the audience corresponding to the face image is concentrated according to the opening and closing state of the human eyes and the face rotation angle, specifically comprising the following steps: if the eye opening and closing state of the human eyes is eye opening and the rotation angle of the human face is within a preset angle range, judging that the attention of the audience corresponding to the human face image is concentrated; if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated;
determining the audience type interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience, specifically comprising: counting audiences of which the watching time length is longer than the preset time length, the attention is concentrated, and the number of frames of the video frame images with smile expressions is larger than the preset number in the video frame images; determining the counted age and gender of the audience as the type of audience interested in the advertisement;
automatically recommending the advertisement to the type of viewer during the advertisement time period.
2. The method of claim 1, wherein obtaining the face image in the video frame image specifically comprises:
extracting face position coordinates in the video frame image according to a preset face detection model, wherein the face position coordinates comprise face key point position coordinates;
and intercepting a face image according to the face position coordinates.
3. The method of claim 2, wherein determining the state information of the corresponding viewer in the video frame image according to the face image comprises:
inputting the face image into a preset gender classification model, and acquiring the gender of the audience corresponding to the face image; and
inputting the face image into a preset age classification model, and acquiring the age of the audience corresponding to the face image; and
inputting the facial image into a preset expression classification model, and acquiring the expression of the audience corresponding to the facial image; and
extracting the human eye position coordinates from the human face key point position coordinates, extracting human eye images according to the human eye position coordinates, inputting the human eye images into a preset human eye classification model, and acquiring the human eye opening and closing state of the audience corresponding to the human face images.
4. An advertisement recommendation apparatus, comprising:
the acquisition unit is used for acquiring video frame images of audiences watching advertisements in a set advertisement time period;
the acquisition unit is used for acquiring a face image in each video frame image;
the first determining unit is used for determining the state information of the corresponding audience in the video frame image according to the face image, wherein the state information comprises gender, age, expression, opening and closing states of human eyes and a face rotating angle;
the first determining unit is specifically configured to determine a face rotation angle according to the coordinates of the face key point position and the coordinates of the face key point position of a preset standard frontal face;
the first determining unit is specifically configured to determine a vertical distance between a left eye of the standard frontal face and a nose of the human face according to the nose position coordinate of the human face and the left eye position coordinate of the standard frontal face; determining the vertical distance between the right eye of the standard frontal face and the nose of the human face according to the position coordinates of the nose of the human face and the position coordinates of the right eye of the standard frontal face; determining the distance between the left eye of the standard frontal face and the right eye of the standard frontal face in the X-axis direction; determining the rotation direction of the face according to the vertical distance between the left eye of the standard frontal face and the nose of the face and the vertical distance between the right eye of the standard frontal face and the nose of the face, and determining the rotation angle of the face according to the vertical distance between the left eye of the standard frontal face and the nose of the face and the distance between the left eye of the standard frontal face and the right eye of the standard frontal face in the X-axis direction;
the judging unit is used for judging whether the attention of the audience corresponding to the human face image is concentrated or not according to the opening and closing state of the human eyes and the human face rotating angle;
the judging unit is specifically configured to judge that the attention of the audience corresponding to the face image is concentrated if the eye opening/closing state of the human eyes is eye opening and the rotation angle of the face is within a preset angle range; if the eye opening and closing state of the human eyes is eye closing or the human face rotation angle is not within the preset angle range, judging that the attention of the audience corresponding to the human face image is not concentrated;
the second determining unit is used for determining the type of the audience interested in the advertisement according to the state information of each audience in each video frame image and the watching duration of each audience;
the second determining unit is specifically configured to count audiences of which the watching duration is longer than a preset duration and the number of frames of the video frame images with concentrated attention and smile expressions is larger than a preset number in the video frame images; determining the counted age and gender of the audience as the type of audience interested in the advertisement;
and the recommending unit is used for automatically recommending the advertisement to the type of audiences in the advertisement time period.
5. The apparatus of claim 4,
the acquiring unit is specifically configured to extract face position coordinates in the video frame image according to a preset face detection model, where the face position coordinates include face key point position coordinates; and intercepting a face image according to the face position coordinates.
6. The apparatus of claim 5, wherein the status information includes gender, age, expression, opening and closing status of human eyes, and rotation angle of human face;
the first determining unit is specifically configured to input the face image into a preset gender classification model, and acquire the gender of the audience corresponding to the face image; inputting the face image into a preset age classification model to obtain the age of the audience corresponding to the face image; inputting the face image into a preset expression classification model, and acquiring the expression of the audience corresponding to the face image; and extracting the human eye position coordinates from the human face key point position coordinates, extracting human eye images according to the human eye position coordinates, inputting the human eye images into a preset human eye classification model, and acquiring the human eye opening and closing state of the audience corresponding to the human face images.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the advertisement recommendation method of any one of claims 1-3 when executing the program.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the advertisement recommendation method according to any one of claims 1 to 3.
CN201910114138.7A 2019-02-14 2019-02-14 Advertisement recommendation method and device Active CN109874054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910114138.7A CN109874054B (en) 2019-02-14 2019-02-14 Advertisement recommendation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910114138.7A CN109874054B (en) 2019-02-14 2019-02-14 Advertisement recommendation method and device

Publications (2)

Publication Number Publication Date
CN109874054A CN109874054A (en) 2019-06-11
CN109874054B true CN109874054B (en) 2021-06-29

Family

ID=66918751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910114138.7A Active CN109874054B (en) 2019-02-14 2019-02-14 Advertisement recommendation method and device

Country Status (1)

Country Link
CN (1) CN109874054B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321969B (en) * 2019-07-11 2023-06-30 山东领能电子科技有限公司 MTCNN-based face alignment method
CN110543813B (en) * 2019-07-22 2022-03-15 深思考人工智能机器人科技(北京)有限公司 Face image and gaze counting method and system based on scene
CN112492389B (en) * 2019-09-12 2022-07-19 上海哔哩哔哩科技有限公司 Video pushing method, video playing method, computer device and storage medium
CN110880125A (en) * 2019-10-11 2020-03-13 京东数字科技控股有限公司 Virtual asset verification and cancellation method, device, server and storage medium
CN111353461B (en) * 2020-03-11 2024-01-16 京东科技控股股份有限公司 Attention detection method, device and system of advertising screen and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112209A (en) * 2013-04-16 2014-10-22 苏州和积信息科技有限公司 Audience statistics method and system for display terminals
CN104298682A (en) * 2013-07-18 2015-01-21 广州华久信息科技有限公司 Information recommendation effect evaluation method and mobile phone based on facial expression images
CN104346503A (en) * 2013-07-23 2015-02-11 广州华久信息科技有限公司 Emotional health monitoring method and mobile phone based on face images
CN104732413A (en) * 2013-12-20 2015-06-24 中国科学院声学研究所 Intelligent personalized video advertisement push method and system
CN106339680A (en) * 2016-08-25 2017-01-18 北京小米移动软件有限公司 Face key point positioning method and device
CN106971317A (en) * 2017-03-09 2017-07-21 杨伊迪 Advertisement delivery effect evaluation and intelligent push decision-making method based on face recognition and big data analysis
CN107169473A (en) * 2017-06-10 2017-09-15 广东聚宜购家居网络科技有限公司 Face recognition control system
CN107392159A (en) * 2017-07-27 2017-11-24 竹间智能科技(上海)有限公司 Facial focus detection system and method

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN101593352A (en) * 2009-06-12 2009-12-02 浙江大学 Driving safety monitoring system based on face orientation and visual focus
US10740802B2 (en) * 2014-08-18 2020-08-11 Fuji Xerox Co., Ltd. Systems and methods for gaining knowledge about aspects of social life of a person using visual content associated with that person
CN107194381A (en) * 2017-07-06 2017-09-22 重庆邮电大学 Driver status monitoring system based on Kinect


Also Published As

Publication number Publication date
CN109874054A (en) 2019-06-11

Similar Documents

Publication Publication Date Title
CN109874054B (en) Advertisement recommendation method and device
US20180122114A1 (en) Method and apparatus for processing video image and electronic device
Jacobs et al. Cosaliency: Where people look when comparing images
US10939165B2 (en) Facilitating television based interaction with social networking tools
CN111242097B (en) Face recognition method and device, computer readable medium and electronic equipment
Ma et al. Stage-wise salient object detection in 360 omnidirectional image via object-level semantical saliency ranking
US9600760B2 (en) System and method for utilizing motion fields to predict evolution in dynamic scenes
US20170161591A1 (en) System and method for deep-learning based object tracking
Saba et al. Analysis of vision based systems to detect real time goal events in soccer videos
US20210027081A1 (en) Method and device for liveness detection, and storage medium
TW201834463A (en) Recommendation method and apparatus for video data
CN110246160B (en) Video target detection method, device, equipment and medium
CN111754267B (en) Data processing method and system based on block chain
US20220172476A1 (en) Video similarity detection method, apparatus, and device
CN111836118B (en) Video processing method, device, server and storage medium
US20120249880A1 (en) Method and apparatus for detecting camera motion type in video
Rondón et al. A unified evaluation framework for head motion prediction methods in 360 videos
Kim et al. End-to-end lip synchronisation based on pattern classification
US20210295016A1 (en) Living body recognition detection method, medium and electronic device
US20210279372A1 (en) Fabric detecting and recording method and apparatus
CN114663871A (en) Image recognition method, training method, device, system and storage medium
CN111062374A (en) Identification method, device, system, equipment and readable medium of identity card information
CN116452886A (en) Image recognition method, device, equipment and storage medium
CN112991419B (en) Parallax data generation method, parallax data generation device, computer equipment and storage medium
Lin et al. Unreliable-to-reliable instance translation for semi-supervised pedestrian detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant