CN105303998A

CN105303998A - Method, device and equipment for playing advertisements based on inter-audience relevance information

Info

Publication number: CN105303998A
Application number: CN201410356198.7A
Authority: CN
Inventors: 陈岩; 高艳君; 邱亚钦; 严超; 王强; 黄英; 熊君君
Original assignee: Beijing Samsung Telecommunications Technology Research Co Ltd; Samsung Electronics Co Ltd
Current assignee: Beijing Samsung Telecom R&D Center; Beijing Samsung Telecommunications Technology Research Co Ltd; Samsung Electronics Co Ltd
Priority date: 2014-07-24
Filing date: 2014-07-24
Publication date: 2016-02-03
Also published as: KR20160012902A

Abstract

An embodiment of the invention provides a method for playing advertisements on a display based on inter-audience relevance information, comprising: first, collecting data of at least two human bodies present in front of the display, and extracting relevance information between the human bodies from the data; second, automatically selecting advertisement information corresponding to the relevance information; third, playing the corresponding advertisement information on the display. Because the extracted relevance information reflects characteristics shared by the at least two human bodies in front of the display, the selected advertisement information matches all of them closely and suits multiple people at once. Advertisements are therefore pushed to the at least two audience members in a targeted manner, which improves the advertisement pushing effect as well as the viewing experience of the viewers.

Description

Method, device and equipment for playing advertisements based on relevance information between audience members

Technical field

The present invention relates to the field of multimedia technology, and in particular to a method, device and equipment for playing advertisements based on relevance information between audience members.

Background art

Digital signage is a new media concept. It refers to a professional multimedia audiovisual system that publishes business, financial and entertainment information through large-screen terminal display devices in large shopping malls, supermarkets, restaurants, cinemas and other public places where crowds gather. It aims to deliver information to a specific group of people at a specific physical location during a specific time period, so as to achieve the desired effect. In recent years, with the development of new human-computer interaction technologies, computer vision and artificial intelligence have begun to play an increasingly important role in information acquisition, collection and monitoring, human-machine interface design and related areas. By combining facial feature recognition with digital signage media, different advertisement content can be played for different users, making the delivered advertisements more targeted. The facial feature information of the audience watching an advertisement can also be counted and analyzed, and the advertisements being played can be adjusted according to the statistical analysis results, so that the played content is more distinctive and better targeted.

In the prior art, intelligent advertising systems mainly customize and deliver personalized advertisements according to the personal information of a viewer, such as age, gender and expression. Advertisement information is thus delivered on the basis of an individual viewer's personal information. The main problems of the prior art include:

1) Existing intelligent advertisement delivery systems focus mainly on extracting and recognizing information for a single viewer. They ignore the recognition and analysis of the relationships and commonalities among multiple viewers, and do not push advertisements suited to multiple viewers according to such analysis results. When advertisements are delivered only according to the individual information of a single viewer, the form and content of the advertisement cannot be fully personalized, so the pushing effect of the advertisement is limited.

2) Existing intelligent advertisement delivery systems perform only simple information extraction for a single viewer. They fail to mine in depth the intrinsic associations between viewer information (such as gaze dwell time, social class and accent) and merchandise information (such as product content, price and sales volume). As a result, the advertisement information selected for delivery matches the viewer poorly, which harms the pushing effect. In addition, existing systems focus only on pushing advertisements to the viewer and neglect to obtain the viewer's feedback, so advertisement pushing cannot be properly targeted.

3) Existing intelligent advertisement delivery systems merely merge the viewer with the advertisement information in a simple way, as in a magic fitting mirror. The presentation form of the advertisement is quite limited, and the viewer's interest in and acceptance of the advertisement are low, which further harms the pushing effect.

Summary of the invention

The object of the present invention is to solve at least one of the above technical defects, in particular the defect that existing intelligent advertisement delivery systems extract and recognize information only for a single person, and can therefore select only advertisement information that matches a single person.

The invention provides a method for playing advertisements on a display screen based on relevance information, comprising:

collecting data of at least two human bodies, and extracting relevance information between the human bodies from the data;

automatically selecting, according to the relevance information, advertisement information corresponding to the relevance information;

playing the corresponding advertisement information on the display screen.

The invention also provides a device for playing advertisements on a display screen based on relevance information, comprising a first extraction module, a selection module and a playing module:

the first extraction module is configured to collect data of at least two human bodies and extract relevance information between the human bodies from the data;

the selection module is configured to automatically select, according to the relevance information, advertisement information corresponding to the relevance information;

the playing module is configured to play the corresponding advertisement information on the display screen.

In the scheme of this embodiment, data of at least two human bodies present in front of the display screen is collected, and relevance information between the human bodies is extracted from the data. Advertisement information corresponding to the relevance information is then selected automatically according to the relevance information, and finally the corresponding advertisement information is played on the display screen. Because the extracted relevance information reflects characteristics shared by the at least two human bodies in front of the display screen, the selected advertisement information matches all of them closely and suits multiple people at once. Advertisement information is therefore pushed to the at least two human bodies in a more targeted way, which achieves targeted advertisement pushing to multiple people and improves the pushing effect. The viewing experience of the viewers watching the advertisement information is improved as well. The scheme proposed by the invention requires very little change to existing systems, does not affect system compatibility, and is simple and efficient to implement.

Additional aspects and advantages of the invention will be given in part in the following description, and will become apparent from that description or be learned through practice of the invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flowchart of a method for playing advertisements based on relevance information between audience members according to an embodiment of the invention;

Fig. 2 is an example diagram of speech detection;

Fig. 3 is an example diagram of face window detection;

Fig. 4 is an example diagram of locating facial organ feature information;

Fig. 5 is an example diagram of recognizing facial expression information, age information, gender information and skin color information;

Fig. 6 is an example diagram of calculating the eye viewpoint position;

Fig. 7 is an example diagram of face accessory detection;

Fig. 8 is an example diagram of human body window detection;

Fig. 9 is an example diagram of body part location recognition;

Fig. 10 is an example diagram of human action and behavior information;

Fig. 11 is an example diagram of recognizing hairstyle information, clothing information and body shape information;

Fig. 12 is an example diagram of recognizing information on items carried by a human body;

Fig. 13 is an example diagram of body temperature information recognition;

Fig. 14 is an example diagram of voice information recognition;

Fig. 15 is an example diagram of voice source localization;

Fig. 16 is an example diagram of one specific embodiment of the invention;

Fig. 17 is an example diagram of another specific embodiment of the invention;

Fig. 18 is an example diagram of identification information recognition;

Fig. 19 is a functional schematic diagram of equipment for playing advertisements based on relevance information between audience members according to an embodiment of the invention.

Detailed description of the embodiments

Embodiments of the invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary. They serve only to explain the invention and are not to be construed as limiting it.

Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "the" and "said" used herein may also include the plural forms. It should be further understood that the word "comprising" used in this specification means that the stated features, integers, steps, operations, elements and/or components are present, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present. In addition, "connected" or "coupled" as used herein may include a wireless connection or wireless coupling. The term "and/or" as used herein includes all or any units and all combinations of one or more of the associated listed items.

Those skilled in the art will appreciate that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.

Those skilled in the art will appreciate that "terminal" and "terminal device" as used herein include both devices having only a wireless signal receiver without transmitting capability, and devices having receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or suitable and/or configured to operate locally and/or in distributed form at any other location on the earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, a network access terminal or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with a music/video playback function, or a device such as a smart TV or set-top box.

In the present invention, there may be multiple "terminal devices". Advertisements may be played based on relevance information between audience members by the multiple terminal devices as a whole, by some of them, or even by a specific device installed in one or more terminal devices, such as the device for playing advertisements based on relevance information between audience members. Unless otherwise specified, "terminal device" is used herein interchangeably with the device for playing advertisements based on relevance information between audience members.

Fig. 1 is a flowchart of a first embodiment of the method of the invention for playing advertisements based on relevance information between audience members. As shown in Fig. 1, the workflow of the method in this embodiment comprises the following steps:

In step S110, the terminal device collects data of at least two human bodies and extracts relevance information between the human bodies from the data.

Step S110 includes step S111 (not shown). In step S111, the terminal device collects, through a camera or a microphone, data of at least two human bodies present in front of the display screen. The camera includes any one or more of the following devices: a visible-light camera, a depth camera and an infrared camera.

In an embodiment of the invention, the above data includes image information and/or voice information.

Specifically, in step S111, the terminal device collecting, through the camera, image information of at least two people present in front of the display screen includes at least one of the following:

collecting visible-light image information of the at least two people through the visible-light camera;

collecting image depth information of the at least two people through the depth camera;

collecting infrared image information of the at least two people through the infrared camera.

Specifically, step S111 includes step S1111 (not shown), step S1112 (not shown) and step S1113 (not shown). In step S1111, the terminal device collects sound information in front of the display screen through the microphone. In step S1112, speech detection is performed on the sound information to determine whether it contains voice information. In step S1113, when the sound information is determined to contain voice information, the voice information is extracted from it by filtering out background noise.

In one example, the specific processing is shown in Fig. 2. After the terminal device collects the sound information in front of the display screen through the microphone, the obtained sound signal is framed and smoothed. For example, the sound information is first divided into frames of 10 ms each, the extracted frames are then smoothed from frame to frame, and neighborhood differences are used to calculate the maximum absolute difference within an N-neighborhood. Noise filtering is then applied to the sound information based on a fixed background noise model; for example, an FFT is applied to the sound signal using a predetermined noise coefficient N, and the power spectrum of the transformed signal is calculated. An adaptive decision threshold is obtained by the least-squares method. If the result is greater than this threshold, the sound information is determined to be voice information; otherwise it is determined to be a noise signal. When the sound information is determined to contain voice information, the voice information is extracted from it by filtering out background noise.
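The framing, smoothing and threshold decision described above can be illustrated with a minimal sketch. It is not the patented procedure: the per-frame power is computed in the time domain (which by Parseval's theorem equals the total of the power spectrum up to scaling) instead of through an explicit FFT, and the adaptive least-squares threshold is replaced by a fixed multiple of an assumed known noise power. The function names and the `factor` parameter are hypothetical.

```python
import math

def frame_signal(samples, sample_rate, frame_ms=10):
    """Split samples into non-overlapping frames of frame_ms milliseconds."""
    n = int(sample_rate * frame_ms / 1000)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def smooth(values, radius=1):
    """Moving average over neighbouring frames (simple inter-frame smoothing)."""
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def detect_speech(samples, sample_rate, noise_power, factor=4.0):
    """Return one boolean per 10 ms frame: True where the smoothed power
    exceeds the threshold. The threshold here is an assumed fixed multiple
    of the background noise power, not a least-squares adaptive threshold."""
    powers = smooth([frame_power(f) for f in frame_signal(samples, sample_rate)])
    threshold = noise_power * factor
    return [p > threshold for p in powers]

# Synthetic input: 0.1 s of faint hum followed by 0.1 s of a loud tone.
rate = 8000
quiet = [0.01 * math.sin(2 * math.pi * 200 * t / rate) for t in range(800)]
loud = [1.0 * math.sin(2 * math.pi * 200 * t / rate) for t in range(800)]
flags = detect_speech(quiet + loud, rate, noise_power=0.00005)
```

Because of the smoothing, frames adjacent to the onset of the loud segment may also be flagged as speech, which is usually acceptable for a coarse detector.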

Step S110 includes step S112 (not shown) and step S113 (not shown). In step S112, the terminal device extracts feature information of the at least two people from the data. In step S113, the terminal device determines the relevance information between the human bodies according to the feature information. The feature information includes one or more of the following: human body feature information; voice feature information; physiological feature information.

Specifically, in step S112, the terminal device extracts human body feature information of the at least two people from the image information.

The human body feature information of the at least two people includes: distance information of the at least two people; face-related information of the at least two people; body-related information of the at least two people.

Step S112 includes step S1121 (not shown). In step S1121, the terminal device determines the distance information of the at least two people by distance calculation according to the image information.

The distance information includes a front-back distance and/or a left-right distance.

Specifically, according to the image depth information, the front-back distance of the at least two people is determined by calculating the difference between their respective distances to the display screen.

Specifically, according to the visible-light image information and the image depth information, human body window detection is performed in the manner shown in Fig. 8 to obtain human body detection window information. The left-right distance of the at least two people is determined by calculating the spacing between their human body detection windows. The specific calculation is as follows: based on a predetermined ratio between image pixels and an actual unit of distance, the actual left-right distance between two people is calculated from the pixel spacing between their human body detection boxes in the visible-light image information.
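As a minimal sketch of the two distance calculations above, assume each person's depth (distance to the screen) is already known in centimetres and each detection window is given by its left and right pixel edges. The function names, the window representation and the `cm_per_pixel` scale are illustrative assumptions, not taken from the source.

```python
def front_back_distance(depth_a_cm, depth_b_cm):
    """Front-back distance: difference of the two people's distances to the screen."""
    return abs(depth_a_cm - depth_b_cm)

def left_right_distance(window_a, window_b, cm_per_pixel):
    """Left-right distance: horizontal pixel gap between two detection windows,
    converted with a predetermined pixel-to-distance ratio.
    Each window is a (x_left, x_right) tuple in pixels; overlapping windows
    give a distance of 0."""
    a_left, a_right = window_a
    b_left, b_right = window_b
    gap_px = max(0, max(a_left, b_left) - min(a_right, b_right))
    return gap_px * cm_per_pixel

# Example with assumed values: windows 60 px apart at 0.5 cm per pixel.
lr = left_right_distance((100, 200), (260, 340), cm_per_pixel=0.5)
fb = front_back_distance(180.0, 150.0)
```

Measuring the gap between the nearest window edges is one possible reading of "spacing between detection windows"; a centre-to-centre distance would be an equally plausible choice.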

Step S112 includes step S1122 (not shown). In step S1122, the terminal device extracts the face-related information of the at least two people by face detection processing according to the image information.

The face-related information of the at least two people includes: number of faces; facial organ feature information; facial expression information; facial skin color information; age information; gender information; eye viewpoint position information; face accessory information.

Specifically, the terminal device performs face window detection according to the visible-light image information and the image depth information to obtain face detection window information, and determines the number of faces from the number of face detection windows. In one embodiment, face window detection can be performed in real time from the visible-light image information and the image depth information using RLAB (Random Local Assemble Blocks) features combined with Adaboost. The specific flow is shown in Fig. 3. First, background that does not belong to a face is filtered out according to the image depth information. Then, the visible-light image information is down-sampled by a predetermined ratio and number of levels; for example, the image is down-sampled with ratio = 1.259 to build a 24-level image pyramid, RLAB features are calculated for each level, and the feature map of each level is scanned with a fixed window of size 24*24 (window traversal). Next, the response of each window image to a cascade filter is calculated and compared with a first training threshold: if the result is greater than the first training threshold, a face is determined to be recognized, and if it is smaller than the first training threshold, no face is recognized. According to the detection window size, background samples outside the detection window region and face samples inside the detection window are collected, and the Haar-like features of the background samples and face samples are calculated. A Bayes classifier is used for real-time online learning, and the online model obtained by learning is used for tracking, so as to obtain the output face detection window information. Finally, the number of faces is determined from the number of face detection windows.
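The pyramid construction and window traversal in the example above can be sketched as follows. The sketch only enumerates pyramid level sizes and window positions; it does not compute RLAB features or cascade responses. The scan `stride` and the rule of stopping once a level is smaller than the window are assumptions, since the source does not specify them.

```python
def pyramid_sizes(width, height, ratio=1.259, levels=24, window=24):
    """Image sizes of a down-sampling pyramid: each level is the previous one
    shrunk by `ratio`, up to `levels` levels, stopping early once the image
    can no longer hold a window x window patch."""
    sizes = []
    w, h = float(width), float(height)
    for _ in range(levels):
        iw, ih = int(w), int(h)
        if iw < window or ih < window:
            break
        sizes.append((iw, ih))
        w /= ratio
        h /= ratio
    return sizes

def window_positions(width, height, window=24, stride=4):
    """Top-left corners of every fixed-size window scanned over one level.
    The stride of 4 pixels is an assumed value."""
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            yield x, y

sizes = pyramid_sizes(640, 480)
total_windows = sum(len(list(window_positions(w, h))) for w, h in sizes)
```

Scanning a fixed 24*24 window over progressively smaller levels is what lets a single detector size respond to faces of many apparent sizes in the original image.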

Preferably, the terminal device performs facial organ localization according to the face detection window information to extract the facial organ feature information. Facial organ localization mainly comprises locating feature points of the eyes, eyebrows, mouth, nose and ears. The process is shown in Fig. 4. First, according to the face detection window information, the face detection box is normalized to the size of an average face model, for example an average-shape face size of 64*64. Then, HOG features of the average-shape face are extracted within the face detection box, a trained SDM (Supervised Descent Method) model is applied iteratively, and the facial organ feature point positions are updated continually until the feature point positions of the facial organs are obtained.

More preferably, the terminal device extracts face texture feature information through face normalization and illumination normalization according to the face detection window information and the facial organ feature information. According to the face texture feature information, recognition based on a machine learning algorithm determines the facial expression information and/or the facial skin color information and/or the age information and/or the gender information. The recognition process for facial expression information, age information, gender information and facial skin color information is shown in Fig. 5. First, face normalization and illumination normalization are performed according to the face detection window information obtained by the face window detection shown in Fig. 3 and the facial organ feature information extracted by the facial organ feature point localization shown in Fig. 4, and face texture features are extracted, including but not limited to Gabor, SIFT, LBP and HOG. Then, according to the face texture feature information, training is carried out with machine learning algorithms such as SVM, deep learning or linear regression to recognize the facial expression information, age information, gender information and facial skin color information.

More preferably, after extracting the facial organ feature information, the terminal device performs viewpoint position calculation on the eyes to determine the eye viewpoint position information. In one example, a pre-established average-face 3D model is first mapped by affine projection onto the face detection window obtained by face window detection, to determine affine 2D points. Then the difference between the feature point positions of the facial organs and the affine 2D points is calculated, the head pose angle is determined by gradient descent, and the eye viewpoint position information is calculated from the pose angle and the distance between the eyes and the screen in the manner shown in Fig. 6. In the right triangle shown in Fig. 6, the head rotation angle θ is known and one leg, the distance from the person to the screen, is known; the other leg, which is the distance between the viewpoint on the screen and the screen center point, is calculated from them. The head rotation angle θ is divided into a horizontal angle and a vertical angle. From these two angles, the offset from the screen center in the x direction and the offset in the y direction are each calculated with the above right triangle, and the two offsets together determine a single point, which is the viewpoint. The distance from the person to the screen can be calculated as follows: faces of a number of people are photographed at different preset distances, such as 25cm, 50cm and 1500cm, and the average face size at each distance is calculated. When a person is watching an advertisement, the distance from the person to the screen is calculated from the current face size and the relationship between average face size and distance.
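The right-triangle calculation above reduces to one tangent per axis. The sketch below assumes the head rotation angles are measured from the screen normal (so an angle of 0 means looking straight at the screen center) and that the known leg is the perpendicular distance to the screen. The function name and the sign conventions are illustrative assumptions.

```python
import math

def gaze_point(distance_cm, yaw_deg, pitch_deg):
    """Viewpoint offset from the screen center, in centimetres.
    distance_cm: known leg, the person's distance to the screen.
    yaw_deg / pitch_deg: horizontal and vertical head rotation angles,
    assumed to be measured from the screen normal.
    The unknown leg of each right triangle is distance * tan(angle)."""
    x = distance_cm * math.tan(math.radians(yaw_deg))
    y = distance_cm * math.tan(math.radians(pitch_deg))
    return x, y

# A person 100 cm away, head turned 45 degrees horizontally and level
# vertically, looks about 100 cm to the side of the screen center.
x_off, y_off = gaze_point(100.0, 45.0, 0.0)
```

Whether the resulting point actually lies on the display can then be checked against the physical half-width and half-height of the screen.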

More preferably, after extracting the facial organ feature information, the terminal device performs face accessory detection on the facial organ feature information to determine the face accessory information. In one example, glasses detection is performed on the eye region, earring detection on the ear region, face mask detection on the mouth region, and so on. The face accessory detection process is shown in Fig. 7. Image samples of regions near facial organs with accessories and image samples of regions near facial organs without accessories are collected, texture features are extracted from them and fed into a machine learning framework to train a model, and the learned model is used for accessory detection. If an accessory is present, the contour of the worn accessory is located to determine the face accessory information.

Step S112 includes step S1123 (not shown). In step S1123, the terminal device extracts the body-related information of the at least two people by human body detection processing according to the image information.

The body-related information of the at least two people includes one or more of the following: number of human bodies; body part feature information; action and behavior information of the human body; hairstyle information; clothing information; body shape information; carried item information.

Specifically, the terminal device performs human body window detection according to the visible-light image information and the image depth information to obtain human body detection window information, and determines the number of human bodies from the number of human body detection windows. In one example, human body detection can be performed in real time from the visible-light image information and the image depth information using HOG (Histogram of Gradient) features combined with DPM (Deformable Part Model). The specific detection process is shown in Fig. 8. First, background that does not belong to a human body is filtered out using the image depth information, and generic object detection is used to filter out objects without contour boundaries. Then, a HOG feature map is computed for the filtered image. A pyramid of search windows is built at a certain ratio and each is searched over the HOG feature map; the response between the DPM model and the HOG feature map within the window is calculated, and when the result is greater than a second training threshold, human body detection window information is output according to the DPM model type. The number of human bodies is determined from the number of human body detection windows.

Preferably, the terminal device performs body part localization according to the human body detection window information to extract the body part feature information. In one example, the process of locating the body parts of a human body from the image information is shown in Fig. 9. First, the approximate positions of body parts such as the head, shoulders and torso are detected according to the DPM (Deformable Part Model) model. Then, a trained SDM (Supervised Descent Method) model is applied iteratively to update and accurately locate the position of each body part, yielding the body part positions of the human body. The SDM model for each body part is trained with normalized average shapes of different sizes; for example, the head is 32*32 and the lower leg is 60*10.

More preferably, the terminal device recognizes the action and behavior information of the human body through action and behavior recognition according to the body part feature information and the image depth information. Specifically, the process of recognizing human actions and behaviors is shown in Fig. 10. According to the body part feature information determined by the body part localization shown in Fig. 9 and the image depth information, an action and behavior recognition model recognizes the actions and behaviors between human bodies, including holding hands, putting an arm around the shoulder, hugging and so on.

More preferably, the terminal device performs recognition on hairstyle and/or clothing and/or body shape according to the body part feature information and the facial skin color information, to determine the hairstyle information and/or clothing information and/or body shape information. The specific recognition process is shown in Fig. 11. According to the body part feature information and using a skin color model, the located parts are segmented by the GraphicCut segmentation technique, texture information and shape information are extracted from the segmented regions, and a model determined by machine learning is used for recognition to determine the hairstyle information and/or clothing information and/or body shape information.

More preferably, the terminal device performs carried item detection on a predetermined region near the human body detection window according to the human body detection window information and the body part feature information, to determine the carried item information. Recognition of carried items includes but is not limited to the detection and recognition of items such as pets, bags, books and mobile communication devices. The detection and recognition process is shown in Fig. 12. First, the DPM algorithm is applied around the human body detection window, for example in the region near the hands, to classify carried items preliminarily into types such as pet, bag, mobile phone and tablet. Then machine learning algorithms such as deep learning are used to determine the specific carried item information for the hand, such as the type of pet or the color of the bag.

In step S112, the terminal device extracts the physiological feature information of the at least two people from the image information. The physiological feature information of the at least two people includes body temperature information.

Step S112 comprises step S1124 (not shown). In step S1124, the terminal device determines the face skin color area and the body skin color area according to the human body characteristic information; extracts the infrared image grayscale information of the skin color areas according to the infrared image information, in combination with the face skin color area and the body skin color area; and determines the body temperature information from the infrared image grayscale information by linear regression. Specifically, the way the body temperature information is determined is shown in Figure 13: starting from the face detection window information determined by face detection and the human detection window information determined by human detection, a skin color model, combined with the detection results for glasses and masks among the face wear items, is used to detect the exposed face skin color area and body skin color area; the corresponding region of the input infrared image is looked up; the infrared image grayscale information of the skin color areas is extracted; and the body temperature information of the skin color areas is calculated by linear regression. In the infrared image, a darker color represents a higher temperature and a lighter color a lower one. For example, a red area generally represents human body temperature, at about 37 degrees; a yellow area is about 20-28 degrees; and a blue area is about 5-19 degrees. The linear regression is a statistical model computed over the temperature values corresponding to all color values in the skin color areas; it determines statistically in which range the temperature values are mainly distributed, and the body temperature value is determined from that main range. Preferably, whether a person has symptoms such as fever or influenza can also be judged from the shape of the mouth organ and the voice information.
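By way of illustration only, the body temperature estimate described above can be sketched as follows. The sketch assumes a least-squares line from grayscale to temperature fitted on hypothetical calibration pairs, and takes the "main distribution range" to be the most populated histogram bin; the function names, the calibration values, and the 0.5-degree bin width are all assumptions, not details given in this description.

```python
import numpy as np

def fit_gray_to_temp(gray_samples, temp_samples):
    """Fit temp = a * gray + b by least squares (calibration pairs are hypothetical)."""
    a, b = np.polyfit(np.asarray(gray_samples, dtype=float),
                      np.asarray(temp_samples, dtype=float), 1)
    return float(a), float(b)

def body_temperature(ir_gray, skin_mask, a, b, bin_width=0.5):
    """Return the centre of the most populated temperature bin over the skin pixels."""
    temps = a * ir_gray[skin_mask].astype(float) + b
    if temps.size == 0:
        raise ValueError("skin mask selects no pixels")
    # Build bin edges aligned to multiples of bin_width that cover all values.
    lo = np.floor(temps.min() / bin_width) * bin_width
    n_bins = int(np.floor((temps.max() - lo) / bin_width)) + 1
    edges = lo + bin_width * np.arange(n_bins + 1)
    counts, _ = np.histogram(temps, bins=edges)
    k = int(np.argmax(counts))
    return float(edges[k] + bin_width / 2)

# Darker (lower) gray values are calibrated to higher temperatures, as in the text.
a, b = fit_gray_to_temp([50, 200], [35.0, 20.0])
ir = np.full((4, 4), 34, dtype=np.uint8)   # most skin pixels map to about 36.6 degrees
ir[0, :2] = 150                            # two cooler outlier pixels
mask = np.ones((4, 4), dtype=bool)
mask[3, :] = False                         # bottom row is not skin
estimate = body_temperature(ir, mask, a, b)
```

Taking the dominant bin rather than the mean keeps a few cool pixels, such as those at the edge of the skin area, from pulling the estimate down.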

In step S112, the terminal device extracts the voice characteristic information of the at least two people from the image information. The voice characteristic information of the at least two people comprises language type, voice content, and voice source.

Step S112 comprises step S1126 (not shown). In step S1126, the terminal device extracts acoustic feature information and spectral feature information from the voice information, and performs recognition processing on them with a machine learning algorithm to determine the first-level language type to which the voice information belongs. Step S112 also comprises step S1127 (not shown). After the first-level language type has been determined, in step S1127 the terminal device performs secondary classification of the voice information based on the first-level language type, to determine the second-level language type to which the voice information belongs; the second-level language type is a subtype of the first-level language type. Specifically, the process of recognizing the language type used in communication between the human bodies is shown in Figure 14: the acoustic features and spectral features of the voice information are extracted; a GMM (Gaussian mixture model) is used to normalize the feature length; and a machine learning algorithm (SVM, deep learning, etc.) is used to recognize the language type. The recognized language type is then classified further: if the language type is English, it is classified as British English or American English; if the language type is Chinese, it is classified as Mandarin, a dialect, and so on.

Step S112 comprises step S1128 (not shown). In step S1128, the terminal device recognizes the voice content in the voice information by speech recognition technology. Specifically, speech recognition techniques such as HMM or deep learning are used to recognize the content of the voice information and to extract the key information of the voice content.

Step S112 comprises step S1129 (not shown). In step S1129, the terminal device extracts facial organ characteristic information from the image information and, according to the language type and the voice content, in combination with the mouth shape characteristic information contained in the facial organ characteristic information, performs mouth shape matching to locate the voice source of the voice information. Specifically, the way the voice source is located is shown in Figure 15: the mouth shape is determined by the facial mouth organ feature point localization shown in Figure 4; at the same time, the language type and voice content are recognized by deep learning using the voice recognition shown in Figure 14; the language type and voice content are then combined with the mouth shape, and mouth shape matching is performed to locate the voice source.

In step S113, the terminal device matches one or more of the human body characteristic information, the voice characteristic information, and the physiological characteristic information against a characteristic relation correspondence list, to determine the related information between the human bodies. The related information comprises social relationship information and person common information. The social relationship information comprises family relationships, friend relationships, and peer relationships. Family relationships comprise two-generation relations or skip-generation relations; friend relationships comprise lovers' relations or ordinary friend relations; peer relationships comprise same-level relations or superior-subordinate relations. The person common information comprises gender, age, skin color, hairstyle, clothing, build, face wear, and body belongings. In an embodiment of the present invention, the characteristic relation correspondence list contains the related information between human bodies that corresponds to each combination of one or more items of the human body characteristic information, voice characteristic information, and physiological characteristic information. For example, if the two people are a man and a woman, both aged between 20 and 30, the left-right distance between them is less than the predetermined left-right distance threshold of 100 cm, and their action behavior is holding hands, then the two people correspond to a lovers' relation. As another example, if the ages and genders of the two people indicate a middle-aged woman and a young girl, and their action behavior is holding hands, then the two people correspond to a mother-daughter relation. As a further example, if the ages and genders of the two people indicate an elderly man and a boy, and their action behavior is holding hands, then the two people correspond to a grandparent-grandchild relation.
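By way of illustration only, the three example entries of the characteristic relation correspondence list can be sketched as a small rule function. The `Person` type, the label strings, and in particular the age bands for "young", "middle-aged", and "elderly" are assumptions; the description gives numeric limits only for the lovers' relation example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Person:
    gender: str  # "male" or "female"
    age: int

# Assumed age bands; the text only says "young", "middle-aged" and "elderly".
def is_child(p: Person) -> bool:
    return p.age <= 12

def is_middle_aged(p: Person) -> bool:
    return 40 <= p.age < 60

def is_elderly(p: Person) -> bool:
    return p.age >= 60

def match_relation(a: Person, b: Person, lr_distance_cm: float,
                   action: str) -> Optional[str]:
    """Return the relation matching a pair of people, or None if no entry matches."""
    if action != "holding_hands":
        return None  # all three example entries require holding hands
    older, younger = (a, b) if a.age >= b.age else (b, a)
    if (a.gender != b.gender and 20 <= a.age <= 30 and 20 <= b.age <= 30
            and lr_distance_cm < 100):
        return "lovers"
    if (older.gender == "female" and younger.gender == "female"
            and is_middle_aged(older) and is_child(younger)):
        return "mother_daughter"
    if (older.gender == "male" and younger.gender == "male"
            and is_elderly(older) and is_child(younger)):
        return "grandparent_grandchild"
    return None
```

For instance, `match_relation(Person("male", 25), Person("female", 24), 60, "holding_hands")` yields `"lovers"` under these assumed bands.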

In one example, as shown in Figure 16, the positions of the face detection windows and human detection windows, the number of faces, and the number of human bodies are first obtained from the image information by the face window detection shown in Figure 3 and the human window detection shown in Figure 8; the face detection windows and human detection windows of adjacent people are then selected in pairs, and the front-back distance and left-right distance are calculated for each pair. Specifically, the front-back distance is calculated as follows: the difference between the distances from the two people to the display screen is calculated according to the image depth information, and this difference is the front-back distance between the two people. The left-right distance is calculated as follows: based on a predetermined ratio between image pixels and centimeters, the pixel separation between the human detection frames of the two people in the visible light image information is converted into the actual left-right distance between the two people in centimeters. If the front-back distance between the two people obtained from the image information is 80 cm, which is less than the predetermined front-back distance threshold of 100 cm, and the left-right distance is 70 cm, which is less than the predetermined left-right distance threshold of 100 cm, it is determined that the two people have an incidence relation and that the social relationship between them is a close relationship. Further, the body part localization shown in Figure 9 is performed on the two people to determine their body part position information; the body part position information of the two people is then processed by the action behavior recognition model in the manner shown in Figure 10 to determine the action behavior information between the two people, such as an arm around the shoulder; and this is combined with the facial expression information, age information, gender information, and skin color information obtained in the manner shown in Figure 5 to determine the social relationship of the two people. For example, if the gender information of the two people is one male and one female, the age information is greater than 10 and less than 40, and the action behavior is embracing, then the social relationship information of the two people is determined to be a lovers' relation.
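By way of illustration only, the two distance calculations can be sketched as follows. The `Detection` type, the use of window centres for the pixel separation, and the pixel-to-centimeter ratio of 0.5 are assumptions; the 100 cm thresholds and the 80 cm and 70 cm values come from the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Detection:
    x_center_px: float  # horizontal centre of the human detection window (pixels)
    depth_cm: float     # distance to the display screen, from the depth image

def front_back_distance(a: Detection, b: Detection) -> float:
    """Difference between the two people's distances to the screen."""
    return abs(a.depth_cm - b.depth_cm)

def left_right_distance(a: Detection, b: Detection, cm_per_pixel: float) -> float:
    """Pixel separation converted to centimeters with a predetermined ratio."""
    return abs(a.x_center_px - b.x_center_px) * cm_per_pixel

def is_close(a: Detection, b: Detection, cm_per_pixel: float,
             fb_threshold_cm: float = 100.0, lr_threshold_cm: float = 100.0) -> bool:
    """Both distances must be below their predetermined thresholds."""
    return (front_back_distance(a, b) < fb_threshold_cm
            and left_right_distance(a, b, cm_per_pixel) < lr_threshold_cm)

# Reproduces the example: 80 cm front-back and 70 cm left-right.
p1 = Detection(x_center_px=300, depth_cm=250)
p2 = Detection(x_center_px=440, depth_cm=330)
close = is_close(p1, p2, cm_per_pixel=0.5)
```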

Embodiments of the invention determine the related information between the human bodies from one or more of the human body characteristic information, the voice characteristic information, and the physiological characteristic information; that is, the related information between multiple people is determined from multiple angles. This greatly improves the accuracy with which the related information is determined, and provides a strong guarantee for pushing advertising messages suited to multiple people.

Preferably, after the incidence relations between pairs of people have been determined, if there are three or more adjacent viewers, transitive merging is performed on the incidence relations to determine a multi-person relationship. This applies when one person is identified as having an incidence relation with each of two adjacent people. For example, if a child is identified as having a father-child relation and a mother-child relation with an adjacent adult male and an adjacent adult female respectively, it can be determined that the two adults on either side of the child are husband and wife, and the father-child and mother-child relations can be merged into a three-person family relationship.
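By way of illustration only, the merging of a father-child relation and a mother-child relation into one family can be sketched as follows. The tuple layout and the label strings are assumptions chosen for the sketch.

```python
def merge_family(pairs):
    """Merge pairwise parent-child relations that share the same child.

    pairs: iterable of (child, adult, relation), where relation is
    "father_child" or "mother_child" (labels are hypothetical).
    Returns one record per child who has both a father and a mother.
    """
    parents = {}
    for child, adult, relation in pairs:
        parents.setdefault(child, {})[relation] = adult

    families = []
    for child, rel in parents.items():
        if "father_child" in rel and "mother_child" in rel:
            father, mother = rel["father_child"], rel["mother_child"]
            families.append({
                "child": child,
                "father": father,
                "mother": mother,
                # The husband-wife relation is inferred, not observed directly.
                "inferred": ("spouses", father, mother),
            })
    return families

families = merge_family([
    ("child", "man", "father_child"),
    ("child", "woman", "mother_child"),
    ("other child", "other man", "father_child"),  # only one parent: not merged
])
```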

The method of the present invention also comprises step S140 (not shown). In step S140, the terminal device performs social class recognition processing according to the human body characteristic information, the voice characteristic information, and the physiological characteristic information, to determine the social class to which the at least two people belong. Specifically, the terminal device performs social class recognition processing according to the determined facial organ information, the wear information near the facial organs, the clothing information, the human body belongings information, the voice information, and so on, to determine the social class to which the at least two people belong. The social classes mainly comprise blue collar, white collar, and gold collar.

The method of the present invention also comprises step S150 (not shown). When multiple groups of incidence relations are extracted, in step S150 the terminal device chooses a preferred incidence relation based on predetermined selection rules. The preferred incidence relation is chosen based on one or more of the following predetermined selection rules:

Social relationship information is preferentially chosen among the multiple groups of incidence relations;

The incidence relation comprising the largest number of people is preferentially chosen among the multiple groups of incidence relations;

The incidence relation of the at least two people who belong to a predetermined social class is preferentially chosen among the multiple groups of incidence relations;

The incidence relation of the at least two people nearest to the display screen is preferentially chosen among the multiple groups of incidence relations.

Optionally, the above predetermined selection rules may have different selection weights, in which case the rule with the highest weight is applied as the selection rule.

For example, when the multiple groups of incidence relations extracted for two people are the social relationship information "lovers' relation" and the person common information "medium build, both aged between 20 and 30", the lovers' relation is preferentially chosen as the incidence relation of the two people. When there is one group of "lovers' relation" in front of the screen, comprising 2 people in total, and three groups of "family relationship", comprising 6 people in total, the incidence relation comprising the largest number of people, namely "family relationship", is preferentially chosen. When there are 10 people in front of the screen, and the two of them who have a lovers' relation are nearest to the display screen, the lovers' relation is preferentially chosen. When there are two groups of "lovers' relation" and two groups of "family relationship", both comprising 4 people, the "family relationship" whose 4 people belong to the gold collar social class is selected.
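By way of illustration only, the weighted selection rules can be sketched as follows. The `Relation` fields, the rule names, and the weight values are assumptions; the sketch applies only the single rule with the highest weight, which is one reading of the weighting described above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str
    is_social: bool     # social relationship (True) or person common information (False)
    people: int         # number of people covered by this relation
    nearest_cm: float   # distance from the nearest member to the display screen
    social_class: str

def choose_relation(relations, weights, target_class="gold collar"):
    """Apply the selection rule with the highest weight and return the best relation."""
    scorers = {
        "social_first": lambda r: 1 if r.is_social else 0,
        "most_people": lambda r: r.people,
        "target_class": lambda r: 1 if r.social_class == target_class else 0,
        "nearest": lambda r: -r.nearest_cm,  # smaller distance scores higher
    }
    rule = max(weights, key=weights.get)
    return max(relations, key=scorers[rule])

lovers = Relation("lovers", True, 2, 100.0, "white collar")
family = Relation("family", True, 6, 300.0, "gold collar")
common = Relation("aged 20-30", False, 2, 100.0, "white collar")
```

With `{"most_people": 0.9, "nearest": 0.1}` the family relation is chosen; swapping the two weights selects the lovers' relation, which is nearer to the screen.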

In step S120, the terminal device automatically selects the advertising message corresponding to the related information. Specifically, the terminal device performs a matching query in a relation correspondence list according to the related information, to determine the advertisement type corresponding to the related information; it then extracts an advertising message from the advertisement library according to the advertisement type. In one example, the terminal device performs a matching query in the relation correspondence list according to "lovers' relation" and determines that the corresponding advertisement type is wedding advertisements; it then extracts a honeymoon trip advertising message from the advertisement library according to the wedding type. In another example, the terminal device performs a matching query in the relation correspondence list according to "mother and baby's relation" and determines that the corresponding advertisement type is mother-and-baby advertisements; it then extracts a diaper advertisement from the advertisement library according to the mother-and-baby type. In another example, the terminal device performs a matching query in the relation correspondence list according to the person common information "female" and determines that the corresponding advertisement type includes toiletries advertisements; it then extracts a facial mask advertisement from the advertisement library according to the toiletries type.
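By way of illustration only, the two-stage query (related information to advertisement type, then advertisement type to advertising message) can be sketched with two dictionaries. The dictionary names and keys are assumptions; the entries are the three examples above.

```python
from typing import Optional

# Relation correspondence list: related information -> advertisement type.
RELATION_TO_AD_TYPE = {
    "lovers": "wedding",
    "mother_baby": "mother_and_baby",
    "female": "toiletries",
}

# Advertisement library: advertisement type -> available advertising messages.
AD_LIBRARY = {
    "wedding": ["honeymoon trip"],
    "mother_and_baby": ["diapers"],
    "toiletries": ["facial mask"],
}

def select_ad(related_info: str) -> Optional[str]:
    """Return the first advertising message for the matched type, or None."""
    ad_type = RELATION_TO_AD_TYPE.get(related_info)
    if ad_type is None:
        return None
    ads = AD_LIBRARY.get(ad_type, [])
    return ads[0] if ads else None
```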

Preferably, step S120 comprises step S121. In step S121, the terminal device selects the advertising message corresponding to the related information according to the related information in combination with current time information. In one example, the current time is a predetermined meal time, such as 12 noon; the terminal device then selects western restaurant advertisements according to "lovers' relation", and parent-child themed restaurant advertisements according to "family relationship".
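By way of illustration only, combining the related information with the current time can be sketched as follows. The meal window of 11:30 to 13:00 and all names are assumptions; the description gives only 12 noon as an example of a predetermined meal time.

```python
from datetime import time
from typing import Optional

# Hypothetical predetermined meal windows (start, end), inclusive.
MEAL_WINDOWS = [(time(11, 30), time(13, 0))]

# Meal-time advertisement types per relation, from the example in the text.
MEALTIME_ADS = {
    "lovers": "western restaurant",
    "family": "parent-child themed restaurant",
}

def select_mealtime_ad(related_info: str, now: time) -> Optional[str]:
    """Return a meal-time advertisement type, or None outside meal times."""
    in_meal_time = any(start <= now <= end for start, end in MEAL_WINDOWS)
    if not in_meal_time:
        return None  # caller falls back to the time-independent selection
    return MEALTIME_ADS.get(related_info)
```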

The method of the present invention also comprises step S160 (not shown). After the advertising message corresponding to the related information has been automatically selected, in step S160 the terminal device performs role matching in the selected advertising message according to the related information, to determine the roles of the at least two human bodies in the selected advertising message; the selected advertising message and the roles are then fused to obtain a fused advertising message.

Step S160 comprises step S161 (not shown), step S162 (not shown), and step S163 (not shown). In step S161, the terminal device builds face and body 3D models of the at least two human bodies by three-dimensional modeling. In step S162, the tone information of the at least two human bodies is extracted from the voice information, and the reconstructed voice information for the selected advertising message is synthesized by speech synthesis. The specific implementation comprises: extracting facial organ characteristic information from the image information; performing mouth shape matching, according to the language type and the voice content and in combination with the mouth shape characteristic information contained in the facial organ characteristic information, to locate the voice source of the voice information; detecting the frequency, timbre, and so on of the voice in the voice information of the located person, and matching a voice type against a pre-trained model; and then using this voice type to simulate the voice of that person. In step S163, the face and body 3D models and the reconstructed voice information are fused with the selected advertising message to obtain the fused advertising message.

Specifically, based on the face recognition and facial organ localization and on the human body recognition and body part localization, background segmentation is performed by the graphcut algorithm, and head pose estimation and the RBF transformation technique are used for 3D image modeling. For example, RLAB (Random Local Assemble Blocks) combined with Adaboost is used for real-time face region detection, SDM is used for facial feature localization, and head pose estimation and the RBF method are used to build a model from a single face image. Subsequently, HOG (Histogram of Gradient) and DPM (Deformable Part Model) are used to detect the human body frame, and the graphcut algorithm is then used for body segmentation; the body area image is mapped onto a three-dimensional model. The three-dimensional human body model is a unified preset model, so that mapping the texture of the segmented body area image onto the three-dimensional model is enough to achieve the simulation result. The tone information of the at least two human bodies is extracted from the voice information, and the reconstructed voice information for the selected advertising message is synthesized by speech synthesis. At the same time, based on the content and scene of the selected advertising message, the built models, the reconstructed voice information, and the advertisement scene are fused, to obtain the advertising message with the persons fused in.

In one example, as shown in Figure 17, the characteristic information of two human bodies present in front of the display screen is collected, such as the action behavior information, gender information, and age information of the human bodies, and the related information between the two people is determined from this characteristic information to be a lovers' relation. According to this lovers' relation, in combination with the social class of the two people and based on a predetermined advertisement selection strategy, the advertisement selected for playing is the travel product advertisement "Maldives Romantic Trip". Roles in this advertisement are selected according to the lovers' relation; the voice information of the characters in the advertisement is generated in imitation of the tones of the two people by speech recognition technology; and, by face modeling and human body modeling, the virtual models of the two people and the characters' voice information are placed into the advertisement video to generate a fused advertisement video.

Embodiments of the invention fuse the models of the at least two people in front of the display screen with the advertisement by video fusion technology, obtaining an immersive video effect; this provides a sound guarantee that playing the fused advertisement will achieve a good advertisement pushing effect. Further, after watching the fused advertisement the viewer has the experience of being personally on the scene: the viewer's perspective changes from that of a third-party observer to that of a participant in the advertisement content, which effectively improves the viewer's acceptance of the advertising message and ultimately improves the advertisement pushing effect.

In step S130, the corresponding advertising message is played on the display screen.

Preferably, the steps of the present invention also comprise step S170 (not shown), step S180 (not shown), and step S190 (not shown). After the corresponding advertising message has been played on the display screen, in step S170 the terminal device obtains post-viewing information of the at least two human bodies regarding the played advertising message; in step S180, the terminal device determines the satisfaction of the at least two human bodies with the advertising message from the post-viewing information, based on a predetermined satisfaction calculation method; in step S190, the terminal device compares the satisfaction with a predetermined satisfaction threshold and, when the satisfaction is lower than the predetermined satisfaction threshold, changes the advertising message. The post-viewing information comprises eye viewpoint location information, facial expression information, and voice content information. The satisfaction calculation method weights three classes of factors: focus point, facial expression, and attention time. The focus point identifies the name of the product the person is looking at, the attention time is the length of viewing time, and the facial expression is the expression while viewing that product. A satisfaction lookup table is preset over these three classes of factors. For example, for product: milk powder, attention time: 10s-12s, and facial expression: smile, a query of the satisfaction lookup table gives the corresponding satisfaction of 0.5; with a preset satisfaction threshold of 0.7, it can be determined that the person is not satisfied with this milk powder advertisement, and the advertising message needs to be changed.
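By way of illustration only, the satisfaction lookup can be sketched as a table keyed by the three factors. The table holds only the one entry given above; the bucket names other than "10s-12s", the default score of 0.0, and the function names are assumptions.

```python
# Satisfaction lookup table keyed by (product, attention-time bucket, expression).
SATISFACTION_TABLE = {
    ("milk powder", "10s-12s", "smile"): 0.5,  # the entry given in the text
}
SATISFACTION_THRESHOLD = 0.7

def time_bucket(seconds: float) -> str:
    """Map an attention time to a table bucket (only "10s-12s" comes from the text)."""
    if 10 <= seconds <= 12:
        return "10s-12s"
    return "under 10s" if seconds < 10 else "over 12s"

def satisfaction(product: str, attention_seconds: float, expression: str,
                 default: float = 0.0) -> float:
    """Look up the satisfaction; unknown combinations fall back to a default."""
    key = (product, time_bucket(attention_seconds), expression)
    return SATISFACTION_TABLE.get(key, default)

def needs_replacement(score: float,
                      threshold: float = SATISFACTION_THRESHOLD) -> bool:
    """The advertising message is changed when satisfaction is below the threshold."""
    return score < threshold

score = satisfaction("milk powder", 11, "smile")
```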

Specifically, the eye viewpoint information and viewpoint dwell time of the at least two people are calculated in the manner shown in Figure 6, the expression information is obtained in the manner shown in Figure 5, and the voice information is obtained in the manner shown in Figure 14; from these, the satisfaction of the viewers with the advertising message is determined based on the predetermined satisfaction calculation method. When the satisfaction is lower than the predetermined satisfaction threshold, the advertising message is changed. The replacement may be another advertising message of the same advertisement type corresponding to the same related information, an advertising message of a different advertisement type corresponding to the same related information, or an advertising message corresponding to different related information.

In one example, terminal device, according to " mother and baby's relation ", selects a milk powder advertisement.According to being obtained the expression information of beholder by mode as shown in Figure 5 for smiling, calculated by mode as shown in Figure 6 simultaneously and determine that the view information of eyes is the milk powder name information of watching attentively in milk powder advertisement, and the viewpoint duration rested on screen is 12 seconds, be greater than predetermined stay time 10 seconds, determine that beholder is 0.8 to the satisfaction of advertising message, higher than predetermined satisfaction threshold value 0.7.

More preferably, the terminal device repeats step S170, step S180, and step S190 until the satisfaction is greater than the predetermined satisfaction threshold.

More preferably, the steps of the present invention also comprise step S210 (not shown) and step S220 (not shown). In step S210, when the terminal device judges that the number of times the advertising message has been changed is greater than a predetermined change threshold, it extracts the related information of the at least two human bodies again; in step S220, it reselects the advertising message corresponding to the related information according to the re-extracted related information.

By continually adjusting and changing the advertising message according to the satisfaction, the advertising message that best matches the viewers' interests is played to them, and a better advertisement pushing effect is obtained.

In one example, terminal device, according to " mother and baby's relation ", selects a milk powder advertisement.Determine that the view information of eyes is any position place that viewpoint does not concentrate on advertisement according to being calculated by mode as shown in Figure 6, and viewpoint rests on duration on screen is 3 seconds, be less than predetermined stay time 10 seconds, calculate and determine that beholder is 0.1 to the satisfaction of advertising message, lower than predetermined satisfaction threshold value 0.7, then change the toy advertisement that another matches with " mother and baby's relation ".

The steps of the present invention also comprise step S230 (not shown), step S240 (not shown), and step S250 (not shown). In step S230, the terminal device recognizes the identity identification information of the at least two human bodies. In step S240, the terminal device queries the history play record information according to the identity identification information, and determines the history satisfaction of any one of the at least two human bodies with the advertisement type to which the currently played advertising message belongs. When the history satisfaction is judged to be lower than the predetermined satisfaction threshold, the advertising message is changed. The process of recognizing the identity identification information is shown in Figure 18. First, facial organ localization is performed on the at least two human bodies, and texture information is extracted from the pupil and iris, the area around the eyes, and the whole face image. The extracted texture information is then matched against the face texture information with identity IDs stored in an information library. If the match fails, it is determined that the person does not yet have an identity ID; an identity ID is assigned, and the identity ID is recorded in the information library together with the texture information. If the match succeeds, the history play record information is queried according to the identity ID, and the history satisfaction of any one of the at least two human bodies with the advertisement type to which the currently played advertising message belongs is determined; when the history satisfaction is judged to be lower than the predetermined satisfaction threshold, the advertising message is changed. The history play record information comprises the identification information, the advertisement types of the advertising messages previously played to that person, and the corresponding history satisfaction.

Preferably, the steps of the present invention also comprise step S260 (not shown). In step S260, the terminal device updates the history play record information. Specifically, the identification information of any one of the at least two human bodies, the advertisement type to which the advertising message currently played to that person belongs, and that person's satisfaction with the advertising message are written into the history play record information as one data record.
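By way of illustration only, the identity matching and the history play record can be sketched as follows. The sketch assumes that texture information is a numeric feature vector compared by Euclidean distance, with a hypothetical match distance of 0.5; the class and method names are likewise assumptions.

```python
import numpy as np

class ViewerRegistry:
    """Information library of face textures plus a history play record."""

    def __init__(self, max_distance=0.5):
        self.max_distance = max_distance  # hypothetical match threshold
        self.textures = {}                # identity ID -> texture feature vector
        self.history = {}                 # identity ID -> {ad type: satisfaction}
        self._next_id = 1

    def identify(self, texture):
        """Return the matching identity ID, assigning a new one if matching fails."""
        texture = np.asarray(texture, dtype=float)
        best_id, best_dist = None, float("inf")
        for viewer_id, stored in self.textures.items():
            dist = float(np.linalg.norm(stored - texture))
            if dist < best_dist:
                best_id, best_dist = viewer_id, dist
        if best_id is not None and best_dist <= self.max_distance:
            return best_id
        new_id = self._next_id
        self._next_id += 1
        self.textures[new_id] = texture
        self.history[new_id] = {}
        return new_id

    def should_change(self, viewer_id, ad_type, threshold=0.7):
        """True when the recorded satisfaction for this ad type is below the threshold."""
        past = self.history[viewer_id].get(ad_type)
        return past is not None and past < threshold

    def record(self, viewer_id, ad_type, satisfaction):
        """Write one record into the history play record (step S260)."""
        self.history[viewer_id][ad_type] = satisfaction

registry = ViewerRegistry()
first = registry.identify([0.10, 0.20, 0.30])
registry.record(first, "mother_and_baby", 0.1)
again = registry.identify([0.11, 0.19, 0.30])   # close to the stored texture
stranger = registry.identify([5.0, 5.0, 5.0])   # far away: new identity ID
```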

Figure 19 is a functional schematic diagram of a device for playing advertisements based on the related information between audience members according to an embodiment of the present invention. The device for playing advertisements based on the related information between audience members comprises a first extraction module 110, a selection module 120, and a playing module 130. This device is arranged in a terminal device 100.

As shown in Figure 19, the first extraction module 110 collects the data of at least two human bodies present in front of the display screen, and extracts the related information between the human bodies from the data.

The first extraction module 110 comprises an acquisition module (not shown). The acquisition module collects the data of the at least two human bodies present in front of the display screen through a camera or a microphone; the camera comprises any one or more of the following devices: a visible light camera, a depth camera, and an infrared camera.

In an embodiment of the present invention, the data comprise image information and/or voice information.

Specifically, the acquisition module comprises a visible light acquisition module (not shown), a depth acquisition module (not shown), and an infrared acquisition module (not shown). The visible light acquisition module collects the visible light image information of the at least two people through the visible light camera; the depth acquisition module collects the image depth information of the at least two people through the depth camera; and the infrared acquisition module collects the infrared image information of the at least two people through the infrared camera.

Specifically, the acquisition module comprises a voice acquisition module (not shown). The voice acquisition module collects the sound information in front of the display screen through the microphone; performs speech detection on the sound information to judge whether it contains voice information; and, when voice information is judged to be present, extracts the voice information contained in the sound information by filtering out the background noise.

In one example, the specific processing is shown in Figure 2. After the voice acquisition module collects the sound information in front of the display screen through the microphone, the input sound is first divided into frames of 10 ms each; each frame is then smoothed, and the maximum absolute difference within an N-neighborhood is calculated. Noise filtering is performed on the sound information based on a fixed background noise model; for example, an FFT is applied to the sound signal using a predetermined noise coefficient N, and the power spectrum of the transformed signal is calculated. An adaptive decision threshold is obtained by least squares: if the value is greater than this threshold, the sound information is determined to be voice information; otherwise, it is determined to be a noise signal. When voice information is judged to be present, the voice information contained in the sound information is extracted by filtering out the background noise.
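By way of illustration only, the framing and power-spectrum decision can be sketched as follows. This is a simplification: in place of the least-squares adaptive threshold it uses a fixed multiple of the power measured over the first few frames, which are assumed to contain only background noise. The function names, the factor of 10, and the 16 kHz sample rate are assumptions.

```python
import numpy as np

def frame_power(signal, sample_rate, frame_ms=10):
    """Split the signal into 10 ms frames and return the mean FFT power per frame."""
    n = int(sample_rate * frame_ms / 1000)
    usable = len(signal) // n * n            # drop the incomplete last frame
    frames = np.asarray(signal[:usable], dtype=float).reshape(-1, n)
    spectrum = np.fft.rfft(frames, axis=1)
    return np.mean(np.abs(spectrum) ** 2, axis=1) / n

def detect_voice(signal, sample_rate, noise_frames=10, factor=10.0):
    """Flag frames whose power exceeds a multiple of the estimated noise power."""
    power = frame_power(signal, sample_rate)
    threshold = factor * power[:noise_frames].mean()
    return power > threshold

# Synthetic input: half a second of background noise, then a tone on top of it.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
noise = 0.01 * rng.standard_normal(sample_rate)
tone = np.concatenate([np.zeros(sample_rate // 2),
                       0.5 * np.sin(2 * np.pi * 440 * t)])
flags = detect_voice(noise + tone, sample_rate)
```

Here `flags` holds one boolean per 10 ms frame; the frames covering the tone are flagged and the noise-only frames are not.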

The first extraction module 110 comprises a second extraction module (not shown) and an association determination module (not shown). The second extraction module extracts the characteristic information of the at least two people from the data; the association determination module determines the related information between the human bodies according to the characteristic information. The characteristic information comprises one or more of the following: human body characteristic information; voice characteristic information; physiological characteristic information.

The second extraction module comprises a human body characteristic extraction module (not shown), a voice characteristic extraction module (not shown), and a physiological characteristic extraction module (not shown).

The human body characteristic extraction module extracts the human body characteristic information of the at least two people from the image information.

The human body characteristic extraction module comprises a distance determination module (not shown). The distance determination module determines the distance information of the at least two people by distance calculation according to the image information.

The human body characteristic extraction module comprises a face information extraction module (not shown). The face information extraction module extracts the face-related information of the at least two people from the image information by face detection processing.

Specifically, the face information extraction module performs face window detection according to the visible light image information and the image depth information to obtain face detection window information, and determines the number of faces from the number of face detection windows. In one embodiment, RLAB (Random Local Assemble Blocks) combined with Adaboost can be used to perform face window detection in real time on the visible light image information and image depth information. The specific flow is shown in Figure 3. First, background that does not belong to a face is filtered out according to the image depth information. Then the visible light image information is down-sampled by a predetermined ratio and number of levels; for example, the image is down-sampled with ratio = 1.259 to build a 24-level image pyramid, RLAB features are calculated for each level, and the feature map of each level is scanned with a fixed window of size 24*24, that is, window traversal. Next, the response of each window image to the cascade filter is calculated and compared with a first training threshold: if the result is greater than the first training threshold, a face is determined to be recognized; if it is less than the first training threshold, no face is recognized. According to the detection window size, background samples outside the detection window area and face samples inside the detection window are collected, and the Haar-like features of the background samples and face samples are calculated. A Bayes classifier is used for real-time online learning, and the online model obtained by learning is used for tracking, to obtain the output face detection window information. Finally, the number of faces is determined from the number of face detection windows.
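By way of illustration only, the image pyramid and fixed-window traversal can be sketched as follows. The ratio 1.259, the 24 levels, and the 24*24 window come from the example above; stopping early once a level is smaller than the window, the truncation to whole pixels, the scan stride of one pixel, and the function names are assumptions. Feature computation and the cascade filter are not shown.

```python
def pyramid_sizes(width, height, ratio=1.259, levels=24, window=24):
    """Return the (width, height) of each pyramid level that still fits the window."""
    sizes = []
    for level in range(levels):
        scale = ratio ** level
        w, h = int(width / scale), int(height / scale)
        if min(w, h) < window:
            break  # the scan window no longer fits; deeper levels are skipped
        sizes.append((w, h))
    return sizes

def window_count(width, height, window=24, stride=1):
    """Number of window positions visited when traversing one level."""
    if width < window or height < window:
        return 0
    cols = (width - window) // stride + 1
    rows = (height - window) // stride + 1
    return cols * rows

sizes = pyramid_sizes(640, 480)
total_windows = sum(window_count(w, h) for w, h in sizes)
```

Because the window size is fixed, faces of different sizes are found at different pyramid levels: a large face in the original image fits the 24*24 window only after several down-sampling steps.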

Preferably, the face information extraction module performs facial structure localization according to the face detection window information to extract facial organ feature information. Facial structure localization mainly comprises locating the feature points of the eyes, eyebrows, mouth, nose and ears. The process is shown in Figure 4. First, according to the face detection window information, the face detection box is normalized to the size of the average face model, for example an average shape face size of 64*64. Then, the HOG features of the average shape face are extracted within the face detection box, a trained SDM (Supervised Descent Method) model is iterated, and the feature point positions of the facial organs are updated continuously until the final feature point positions of the facial organs are obtained.

More preferably, the face information extraction module extracts facial texture feature information from the face detection window information and the facial organ feature information through face normalization and illumination normalization; based on the facial texture feature information, it then identifies facial expression information and/or facial skin color information and/or age information and/or gender information using a machine learning algorithm. The identification process for facial expression information, age information, gender information and facial skin color information is shown in Figure 5. First, using the face detection window information obtained by the face window detection of Figure 3 and the facial organ feature information extracted by the facial organ feature point localization of Figure 4, face normalization and illumination normalization are performed and facial texture features are extracted, including but not limited to Gabor, SIFT, LBP and HOG features. Then, based on the facial texture feature information, machine learning algorithms such as SVM, deep learning or linear regression are trained to identify the facial expression information, age information, gender information and facial skin color information.

More preferably, after extracting the facial organ feature information, the face information extraction module performs viewpoint position calculation on the eyes to determine eye viewpoint location information. In one example, a pre-established average face 3D model is first projected by simulation onto the face detection window obtained by face window detection to determine affine 2D points. Next, the difference between the feature point positions of the facial organs and the affine 2D points is computed, and the head pose angle is calculated by gradient descent. The eye viewpoint location information is then calculated from the pose angle and the distance between the eyes and the screen, in the manner shown in Figure 6.
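Figure 6 is not reproduced in this text, so the exact viewpoint calculation is not known here. As a purely illustrative sketch, the following assumes a simple geometric model in which the gaze direction equals the head pose (yaw and pitch), the screen is a plane perpendicular to the zero-pose direction, and the viewpoint is the intersection of the gaze ray with that plane. The function `gaze_point` and all of its parameters are hypothetical.

```python
import math


def gaze_point(yaw_deg, pitch_deg, distance_cm, eye_x_cm=0.0, eye_y_cm=0.0):
    """Estimate where the gaze ray meets the screen plane.

    Assumed model (not from the source): the gaze follows the head pose,
    and the eyes sit distance_cm in front of the screen at
    (eye_x_cm, eye_y_cm) in screen coordinates.
    """
    dx = distance_cm * math.tan(math.radians(yaw_deg))
    dy = distance_cm * math.tan(math.radians(pitch_deg))
    return eye_x_cm + dx, eye_y_cm + dy


if __name__ == "__main__":
    # A viewer 200 cm from the screen, head turned 10 degrees sideways.
    print(gaze_point(10.0, 0.0, 200.0))
```

A real system would also need to check whether the resulting point falls inside the screen bounds before treating it as attention to the advertisement.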

More preferably, after extracting the facial organ feature information, the face information extraction module performs face wear detection on the facial organ feature information to determine face wear information. In one example, glasses detection is performed on the eye region, ear stud detection on the ear region, and face mask detection on the mouth region. The face wear detection process is shown in Figure 7. Image samples of regions near facial organs with worn items and image samples of regions near facial organs without worn items are collected, and their texture features are extracted and fed into a machine learning framework to learn a model. The learned model is then used for wear detection; if a worn item is present, the contour of the worn accessory is located to determine the face wear information.

The human body feature extraction module comprises a human body information extraction module (not shown). The human body information extraction module extracts the body-related information of the at least two people from the image information by human body detection processing.

Specifically, the human body information extraction module performs human body window detection on the visible light image information and the image depth information to obtain human body detection window information, and determines the number of human bodies from the number of human body detection windows. In one example, human body detection can be performed in real time on the visible light image information and the image depth information using HOG (Histogram of Oriented Gradients) features combined with DPM (Deformable Part Model). The detailed detection flow is shown in Figure 8. First, background image regions that do not belong to a human body are filtered out using the image depth information, and generic object detection is used to filter out objects that lack a contour boundary. Then, a HOG feature map is computed for the filtered image. A search window pyramid is built at a given ratio and each level of the HOG feature map is searched; the response between the DPM model and the HOG feature map in each window is computed, and when the result is greater than a second training threshold, human body detection window information is output according to the DPM model type. Finally, the number of human bodies is determined from the number of human body detection windows.

Preferably, the human body information extraction module performs body part localization according to the human body detection window information to extract body part feature information. In one example, the process of locating body parts from the image information is shown in Figure 9. First, the approximate positions of body parts such as the head, shoulders and torso are detected with the DPM model. Then, a trained SDM (Supervised Descent Method) model is iterated to update and precisely locate each body part, yielding the body part positions of the human body.

More preferably, the human body information extraction module identifies the action behavior information of the human bodies from the body part feature information and the image depth information by action behavior recognition. Specifically, the recognition process is shown in Figure 10: using the body part feature information determined by the body part localization of Figure 9 together with the image depth information, an action behavior recognition model identifies the actions between human bodies, including holding hands, putting an arm around the shoulder, embracing, and so on.

More preferably, the human body information extraction module performs recognition of hair style and/or clothing and/or build according to the body part feature information and the facial skin color information, to determine hair style information and/or clothing information and/or build information. The detailed recognition process is shown in Figure 11: according to the body part feature information, and using a skin color model, the located parts are segmented with the Graph Cut segmentation technique; texture information and shape information are extracted from the segmented regions; and a model determined by machine learning is used to identify the hair style information and/or clothing information and/or build information.

More preferably, the human body information extraction module performs carried item detection in a predetermined region near the human body detection window, according to the human body detection window information and the body part feature information, to determine human body carried item information. The recognition of carried items includes but is not limited to the detection and recognition of pets, bags, books, mobile communication devices and similar items. The detection and recognition process is shown in Figure 12. First, the DPM algorithm is applied around the human body detection window, for example in the region near the hands, to perform a preliminary classification of the carried item into types such as pet, bag, mobile phone or tablet. Then, machine learning algorithms such as deep learning are used to determine the specific carried item information for the hand, such as the type of pet or the color of the bag.

The physiological feature extraction module extracts the physiological feature information of the at least two people from the image information. The physiological feature information of the at least two people comprises body temperature information.

The physiological feature extraction module comprises a body temperature extraction module (not shown). The body temperature extraction module determines the facial skin color region and the body skin color region according to the human body feature information; extracts the infrared image gray-level information of the skin color regions according to the infrared image information in combination with the facial skin color region and the body skin color region; and determines the body temperature information from the infrared image gray-level information by linear regression. Specifically, the way the body temperature information is determined is shown in Figure 13. Starting from the face detection window information determined by face detection and the human body detection window information determined by human body detection, a skin color model is applied, together with the glasses and face mask detection results from face wear detection, to detect the exposed facial skin color region and the body skin color region. The corresponding regions of the input infrared image are then looked up, the infrared image gray-level information of the skin color regions is extracted, and linear regression is used to calculate the body temperature information of the skin color regions. Preferably, whether a person shows symptoms such as fever or influenza can also be judged from the shape of the mouth organ and the voice information.
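The linear regression step can be sketched as follows. The source does not give the regression coefficients or the calibration data, so the calibration pairs below (gray level against reference temperature) are invented for illustration only, and the helper names `fit_line` and `estimate_temperature` are hypothetical. The sketch fits a straight line by ordinary least squares and applies it to the mean gray level of a skin region.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_temperature(skin_gray_values, slope, intercept):
    """Map the mean infrared gray level of a skin region to a temperature."""
    mean_gray = sum(skin_gray_values) / len(skin_gray_values)
    return slope * mean_gray + intercept


if __name__ == "__main__":
    # Hypothetical calibration pairs: (gray level, temperature in Celsius).
    grays = [100, 150, 200]
    temps = [35.0, 36.5, 38.0]
    slope, intercept = fit_line(grays, temps)
    print(estimate_temperature([148, 152, 150], slope, intercept))
```

In practice the calibration would have to be done per sensor, and only pixels inside the exposed skin regions (excluding glasses and masks) should contribute to the mean.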

The voice extraction module extracts the voice feature information of the at least two people from the voice information. The voice feature information of the at least two people comprises language type, voice content and voice source.

The voice extraction module comprises a language type determination module (not shown). The language type determination module extracts acoustic feature information and spectral feature information from the voice information, and performs recognition on the acoustic feature information and the spectral feature information with a machine learning algorithm to determine the first-level language type to which the voice information belongs. After the first-level language type is determined, the language type determination module performs a second-stage classification of the voice information based on the first-level language type, to determine the second-level language type to which the voice information belongs; the second-level language type is a subtype of the first-level language type. Specifically, the process of identifying the language used in conversation between the people is shown in Figure 14: the acoustic features and spectral features of the voice information are extracted, a GMM (Gaussian mixture model) is used to normalize the feature length, and a machine learning algorithm (SVM, deep learning, etc.) is used to identify the language type. The identified language type is then classified further: if the language type is English, it is classified as British English or American English; if the language type is Chinese, it is classified as Mandarin or a dialect, and so on.

The voice extraction module comprises a voice content recognition module (not shown). The voice content recognition module recognizes the voice content in the voice information by speech recognition technology. Specifically, speech recognition techniques such as HMM or deep learning are used to recognize the content of the voice information and extract the key information of the voice content.

The voice extraction module comprises a voice source extraction module (not shown). The voice source extraction module extracts facial organ feature information from the image information and, according to the language type and voice content and in combination with the mouth shape feature information contained in the facial organ feature information, performs mouth shape matching to locate the voice source of the voice information. Specifically, the way the voice source is located is shown in Figure 15: the mouth shape is determined by the facial organ feature point localization of Figure 4; at the same time, the language type and voice content are identified by deep learning using the voice recognition of Figure 14; the language type and voice content are then combined with the mouth shape, and mouth shape matching is performed to locate the voice source.

The association determination module determines the association information between the human bodies by matching one or more of the human body feature information, the voice feature information and the physiological feature information against a feature-relationship correspondence list. The association information comprises social relationship information and shared personal attribute information. The social relationship information comprises family relationships, friend relationships and colleague relationships. Family relationships comprise two-generation relationships and cross-generation relationships; friend relationships comprise lovers and ordinary friends; colleague relationships comprise peer relationships and superior-subordinate relationships. The shared personal attribute information comprises gender, age, skin color, hair style, clothing, build, face wear and carried items. In an embodiment of the present invention, the feature-relationship correspondence list holds the association information between human bodies that corresponds to one or more combinations of the human body feature information, the voice feature information and the physiological feature information. For example, if two people are one man and one woman, both aged between 20 and 30, the lateral distance between them is less than a predetermined lateral distance of 100 cm, and their action behavior is holding hands, the two people are determined to be lovers. As another example, if the age and gender of two people indicate a middle-aged woman and a young girl, and their action behavior is holding hands, the two people are determined to be mother and daughter. As a further example, if the age and gender of two people indicate an elderly man and a boy, and their action behavior is holding hands, the two people are determined to be grandparent and grandchild.
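The three example rules in the preceding paragraph can be written as a small rule-matching sketch. The 20 to 30 age range, the 100 cm distance and the holding-hands action come from the text; the numeric boundaries for "middle-aged", "elderly" and "child" are not given in the source and are assumptions chosen only so the example runs. The names `Person` and `infer_relationship` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DISTANCE_CM = 100  # predetermined lateral distance from the text


@dataclass
class Person:
    gender: str  # "male" or "female"
    age: int


def infer_relationship(a: Person, b: Person, distance_cm: float,
                       action: str) -> Optional[str]:
    """Match two people against the example rules; None if no rule applies."""
    if action != "holding hands":
        return None
    older, younger = (a, b) if a.age >= b.age else (b, a)
    # Rule 1: one man and one woman, both 20 to 30, closer than 100 cm.
    if ({a.gender, b.gender} == {"male", "female"}
            and 20 <= a.age <= 30 and 20 <= b.age <= 30
            and distance_cm < MAX_DISTANCE_CM):
        return "lovers"
    # Rule 2: middle-aged woman (assumed 35 to 55) with a girl (assumed < 13).
    if (older.gender == "female" and 35 <= older.age <= 55
            and younger.gender == "female" and younger.age < 13):
        return "mother and daughter"
    # Rule 3: elderly man (assumed >= 60) with a boy (assumed < 13).
    if (older.gender == "male" and older.age >= 60
            and younger.gender == "male" and younger.age < 13):
        return "grandparent and grandchild"
    return None


if __name__ == "__main__":
    print(infer_relationship(Person("male", 25), Person("female", 27),
                             60, "holding hands"))
```

A table-driven variant would store each rule as a row in the correspondence list rather than as code, which is closer to the lookup described in the text.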

The device of the present invention further comprises a social class determination module (not shown). The social class determination module performs social class recognition according to the human body feature information, the voice feature information and the physiological feature information, and determines the social class to which the at least two people belong. Specifically, the social class determination module performs social class recognition according to the determined facial organ information, the wear information near the facial organs, the clothing information, the carried item information, the voice information and so on, and determines the social class to which the at least two people belong. The social classes mainly comprise blue collar, white collar and gold collar.

The device of the present invention further comprises a preferred relationship selection module (not shown). When multiple groups of association relationships are extracted, the preferred relationship selection module selects a preferred association relationship based on one or more of the following predetermined selection rules:

The selection module 120 automatically selects the advertisement information corresponding to the association information according to the association information. Specifically, the selection module 120 performs a matching query in a relationship correspondence list according to the association information to determine the advertisement type corresponding to the association information, and then extracts advertisement information from the advertisement library according to the advertisement type. In one example, the selection module 120 performs a matching query in the relationship correspondence list for "lovers" and determines that the corresponding advertisement type is wedding advertisements; it then extracts a honeymoon travel advertisement from the advertisement library according to the wedding type. In another example, the selection module 120 performs a matching query for "mother and baby" and determines that the corresponding advertisement type is mother-and-baby advertisements; it then extracts a diaper advertisement from the advertisement library according to the mother-and-baby type. In another example, the selection module 120 performs a matching query for the shared attribute "female" and determines that the corresponding advertisement types include cosmetics advertisements; it then extracts a facial mask advertisement from the advertisement library according to the cosmetics type.
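The two-step lookup described above (association information to advertisement type, then advertisement type to advertisement) can be sketched with two dictionaries. The entries mirror the three examples in the text; the dictionary names, the key spellings and the function `select_ads` are hypothetical, and a real advertisement library would hold many advertisements per type.

```python
# Relationship correspondence list: association information -> ad type.
RELATION_TO_AD_TYPE = {
    "lovers": "wedding",
    "mother and baby": "mother and baby",
    "female": "cosmetics",
}

# Advertisement library: ad type -> advertisements of that type.
AD_LIBRARY = {
    "wedding": ["honeymoon travel"],
    "mother and baby": ["diapers"],
    "cosmetics": ["facial mask"],
}


def select_ads(association):
    """Return the advertisements matching the association, or an empty list."""
    ad_type = RELATION_TO_AD_TYPE.get(association)
    if ad_type is None:
        return []
    return AD_LIBRARY.get(ad_type, [])


if __name__ == "__main__":
    print(select_ads("lovers"))
```

Keeping the two tables separate means the advertisement library can change without touching the relationship rules.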

Preferably, the selection module 120 selects the advertisement information corresponding to the association information according to the association information in combination with current time information. In one example, when the current time is a predetermined meal time, such as 12 noon, the selection module 120 selects Western restaurant advertisements for "lovers" and parent-child themed restaurant advertisements for "family relationship".
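A minimal sketch of the time-aware selection follows. The text only gives 12 noon as an example of a predetermined meal time; the meal windows below, the fallback behavior outside meal times, and the name `select_ad_type` are assumptions for illustration.

```python
from datetime import time

# Hypothetical meal windows; the source only mentions 12 noon as an example.
MEAL_WINDOWS = [(time(11, 30), time(13, 30)), (time(17, 30), time(19, 30))]

# Meal-time overrides taken from the examples in the text.
MEALTIME_AD_TYPE = {
    "lovers": "Western restaurant",
    "family relationship": "parent-child themed restaurant",
}


def select_ad_type(association, now, default_ad_type):
    """Prefer a restaurant ad type at meal times, else keep the default."""
    is_meal_time = any(start <= now <= end for start, end in MEAL_WINDOWS)
    if is_meal_time:
        return MEALTIME_AD_TYPE.get(association, default_ad_type)
    return default_ad_type


if __name__ == "__main__":
    print(select_ad_type("lovers", time(12, 0), "wedding"))
```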

The device of the present invention further comprises a fusion module (not shown). After the advertisement information corresponding to the association information has been selected automatically, the fusion module performs role matching in the selected advertisement information according to the association information, determines the roles of the at least two human bodies in the selected advertisement information, and fuses the selected advertisement information with the roles to obtain fused advertisement information.

The fusion module comprises a model building module (not shown), a voice synthesis module (not shown) and an advertisement acquisition module (not shown). The model building module builds face and body 3D models of the at least two human bodies by three-dimensional modeling. The voice synthesis module extracts the tone information of the at least two human bodies from the voice information and synthesizes reconstructed voice information for the selected advertisement information by voice synthesis. The advertisement acquisition module fuses the face and body 3D models and the reconstructed voice information with the selected advertisement information to obtain the fused advertisement information.

Specifically, based on face recognition and facial structure localization and on human body recognition and body part localization, background segmentation is performed with the graph cut algorithm, and image 3D modeling is performed using head pose estimation and an RBF transformation technique. The tone information of the at least two human bodies is extracted from the voice information, and the reconstructed voice information of the selected advertisement information is synthesized by voice synthesis. At the same time, based on the content and scene of the selected advertisement information, the built models, the reconstructed voice information and the advertisement scene are fused to obtain advertisement information in which the people have been merged.

The playing module 130 plays the corresponding advertisement information on the display screen.

Preferably, the device of the present invention further comprises a post-viewing information acquisition module (not shown), a satisfaction determination module (not shown) and a first replacement module (not shown). After the corresponding advertisement information has been played on the display screen, the post-viewing information acquisition module obtains post-viewing information of the at least two human bodies regarding the played advertisement information. The satisfaction determination module determines the satisfaction of the at least two human bodies with the advertisement information from the post-viewing information, based on a predetermined satisfaction calculation method. The first replacement module compares the satisfaction with a predetermined satisfaction threshold and replaces the advertisement information when the satisfaction is lower than the predetermined satisfaction threshold. The post-viewing information comprises eye viewpoint location information, facial expression information and voice content information.

More preferably, the post-viewing information acquisition module, the satisfaction determination module and the first replacement module repeat their respective operations until the satisfaction is greater than the predetermined satisfaction threshold.

More preferably, the device of the present invention further comprises an association information update module (not shown) and a reselection module (not shown). When the number of advertisement replacements is greater than a predetermined replacement threshold, the association information update module extracts the association information of the at least two human bodies again; the reselection module then selects the advertisement information corresponding to the association information again according to the updated association information.
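The feedback loop in the three preceding paragraphs can be sketched as follows. The source does not define the "predetermined satisfaction calculation method", so the weighted sum in `satisfaction`, its weights, and the default thresholds are assumptions for illustration; `run_feedback_loop` and its `measure` callback are hypothetical names. A return value of `None` for the advertisement stands for "re-extract the association information".

```python
def satisfaction(gaze_on_screen, positive_expression, positive_speech,
                 weights=(0.5, 0.3, 0.2)):
    """Assumed scoring: weighted sum of three signals, each in [0, 1]."""
    signals = (gaze_on_screen, positive_expression, positive_speech)
    return sum(w * s for w, s in zip(weights, signals))


def run_feedback_loop(candidate_ads, measure, threshold=0.6,
                      max_replacements=3):
    """Play ads in order until one satisfies the audience.

    measure(ad) returns the satisfaction observed after playing ad.
    Returns (ad, replacements); ad is None when the replacement count
    exceeds max_replacements or the candidates run out, which signals
    that the association information should be extracted again.
    """
    replacements = 0
    for ad in candidate_ads:
        if measure(ad) >= threshold:
            return ad, replacements
        replacements += 1  # satisfaction too low: replace the advertisement
        if replacements > max_replacements:
            break
    return None, replacements


if __name__ == "__main__":
    observed = {"honeymoon travel": 0.3, "jewelry": 0.8}
    print(run_feedback_loop(["honeymoon travel", "jewelry"], observed.get))
```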

The device of the present invention further comprises an identity identification module (not shown), a historical satisfaction determination module (not shown) and a second replacement module (not shown). The identity identification module identifies the identity identification information of the at least two human bodies. The historical satisfaction determination module queries the historical play record information according to the identity identification information and determines the historical satisfaction of any one of the at least two human bodies with the advertisement type to which the currently played advertisement information belongs. When the historical satisfaction is lower than the predetermined satisfaction threshold, the second replacement module replaces the advertisement information. The process of identifying the identity identification information is shown in Figure 18. First, facial structure localization is performed on the at least two human bodies, and texture information of the pupil and iris, the area around the eyes and the whole face image is extracted. Then, the extracted texture information is matched against the face texture information with identity IDs stored in an information library. If the match fails, the person is determined to have no identity ID; an identity ID is assigned, and the identity ID and the corresponding texture information are recorded in the information library. If the match succeeds, the historical play record information is queried by the identity ID to determine the historical satisfaction of any one of the at least two human bodies with the advertisement type to which the currently played advertisement information belongs; when the historical satisfaction is lower than the predetermined satisfaction threshold, the advertisement information is replaced. The historical play record information comprises the identity identification information, the advertisement types of the advertisement information played to that person in the past, and the corresponding historical satisfaction.

Preferably, the device of the present invention further comprises an update module (not shown), which updates the historical play record information. Specifically, the identity identification information of any one of the at least two human bodies, the advertisement type of the advertisement information currently played to that person, and the satisfaction with that advertisement information are written to the historical play record information as one data record.
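The historical play record and the replacement decision can be sketched with a small in-memory store. The source does not say how several past satisfaction values for the same person and advertisement type are combined; averaging them is an assumption made here. The class `PlayHistory`, the function `should_replace` and the identity ID strings are hypothetical, and the texture matching that produces an identity ID is outside the scope of this sketch.

```python
from collections import defaultdict


class PlayHistory:
    """Historical play records keyed by (identity ID, advertisement type)."""

    def __init__(self):
        self._scores = defaultdict(list)

    def update(self, identity_id, ad_type, satisfaction):
        """Write one record: who watched which ad type, and how satisfied."""
        self._scores[(identity_id, ad_type)].append(satisfaction)

    def history_satisfaction(self, identity_id, ad_type):
        """Mean past satisfaction (assumed aggregate), or None if unseen."""
        scores = self._scores.get((identity_id, ad_type))
        if not scores:
            return None
        return sum(scores) / len(scores)


def should_replace(history, identity_ids, ad_type, threshold=0.6):
    """True if any viewer's past satisfaction with ad_type is below threshold."""
    for identity_id in identity_ids:
        past = history.history_satisfaction(identity_id, ad_type)
        if past is not None and past < threshold:
            return True
    return False


if __name__ == "__main__":
    history = PlayHistory()
    history.update("id-1", "wedding", 0.2)
    print(should_replace(history, ["id-1", "id-2"], "wedding"))
```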

Those skilled in the art will appreciate that the present invention covers one or more devices for performing the operations described in this application. These devices may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. These devices store computer programs that are selectively activated or reconfigured. Such computer programs may be stored in a readable medium of a device (for example, a computer) or in any type of medium suitable for storing electronic instructions and coupled to a bus. The computer-readable medium includes but is not limited to any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards. That is, a computer-readable medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer).

Those skilled in the art will appreciate that each block in these structure diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions. Those skilled in the art will also appreciate that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing methods, so that the schemes specified in one or more blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention are executed by the processor of the computer or other programmable data processing methods.

Those skilled in the art will appreciate that the steps, measures and schemes in the various operations, methods and flows discussed in the present invention may be alternated, changed, combined or deleted. Further, other steps, measures and schemes in the various operations, methods and flows discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined or deleted. Further, steps, measures and schemes in the prior art that correspond to those in the various operations, methods and flows disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined or deleted.

The above are only some embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for playing an advertisement on a display screen based on association information, characterized by comprising:

collecting data of at least two human bodies, and extracting association information between the human bodies from the data;

automatically selecting, according to the association information, advertisement information corresponding to the association information;

playing the corresponding advertisement information on the display screen.

2. The method for playing an advertisement on a display screen based on association information according to claim 1, characterized in that collecting the data of at least two human bodies present in front of the display screen comprises:

collecting the data of the at least two human bodies present in front of the display screen by a camera or a microphone;

wherein the camera comprises any one or more of the following devices: a visible light camera, a depth camera and an infrared camera.

3. The method for playing an advertisement on a display screen based on association information according to claim 2, characterized in that the data comprise the following information:

image information and/or voice information.

4. The method for playing an advertisement on a display screen based on association information according to claim 3, characterized in that extracting the association information between the human bodies from the data comprises:

extracting feature information of at least two people from the data;

determining the association information between the human bodies according to the feature information;

wherein the feature information comprises one or more of the following:

human body feature information;

voice feature information;

physiological feature information.

5. The method for playing an advertisement on a display screen based on association information according to claim 4, characterized in that extracting the feature information of at least two people from the data comprises one or more of the following:

extracting the human body feature information of the at least two people from the image information;

extracting the voice feature information of the at least two people from the voice information;

extracting the physiological feature information of the at least two people from the image information.

6. The method for playing an advertisement on a display screen based on association information according to claim 5, characterized in that the human body feature information of the at least two people comprises one or more of the following:

distance information of the at least two people;

face-related information of the at least two people;

body-related information of the at least two people.

7. The method for playing an advertisement on a display screen based on association information according to claim 6, characterized in that extracting the human body feature information of the at least two people from the image information comprises:

determining the distance information of the at least two people from the image information by distance calculation.

8. The method for playing an advertisement on a display screen based on association information according to claim 6, characterized in that extracting the human body feature information of the at least two people from the image information comprises:

extracting the face-related information of the at least two people from the image information by face detection processing.

9. The method for playing an advertisement on a display screen based on association information according to claim 8, characterized in that the face-related information of the at least two people comprises one or more of the following:

number of faces; facial organ feature information; facial expression information; facial skin color information; age information; gender information; eye viewpoint location information; face wear information.

10. The method for playing an advertisement on a display screen based on association information according to claim 6, characterized in that extracting the body-related information of the at least two people from the image information comprises:

extracting the body-related information of the at least two people from the image information by human body detection processing.

11. The method for playing an advertisement on a display screen based on association information according to claim 10, characterized in that the body-related information of the at least two people comprises one or more of the following:

number of human bodies; body part feature information; action behavior information of the human bodies; hair style information; clothing information; build information; carried item information.

12. The method for playing advertisements on a display screen based on relevance information according to claim 5, characterized in that the voice feature information of the at least two persons comprises one or more of the following:

Language type;

Voice content;

Voice source.

13. The method for playing advertisements on a display screen based on relevance information according to claim 12, characterized in that extracting the language type of the at least two persons from the voice information comprises:

Extracting acoustic feature information and spectral feature information from the voice information;

Performing recognition processing on the acoustic feature information and the spectral feature information by a machine learning algorithm, to determine the first-level language type to which the voice information belongs.

14. The method for playing advertisements on a display screen based on relevance information according to claim 13, characterized in that extracting the voice source of the at least two persons from the voice information comprises:

Extracting facial organ feature information from the image information;

Performing mouth shape matching processing according to the language type and the voice content, in combination with the mouth shape feature information contained in the facial organ feature information, to locate the voice source of the voice information.

15. The method for playing advertisements on a display screen based on relevance information according to claim 4, characterized in that determining the relevance information between human bodies according to the feature information comprises:

Matching one or more of the human body feature information, the voice feature information and the physiological feature information against a feature-relationship correspondence list, to determine the relevance information between human bodies.
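
By way of non-limiting illustration, the matching against a feature-relationship correspondence list described above might be sketched as follows. The `Person` fields, the rule thresholds, the relationship labels and the first-match-wins ordering are all assumptions made for this sketch; the claims do not specify the contents or structure of the list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    """Hypothetical per-person features; field names are illustrative."""
    age: int
    gender: str  # e.g. "male" or "female"


# Hypothetical correspondence list: each entry pairs a predicate over
# (person_a, person_b, distance in metres) with a relationship label.
# The first matching entry wins. Thresholds are assumptions, not from the claims.
Rule = Tuple[Callable[[Person, Person, float], bool], str]

CORRESPONDENCE_LIST: List[Rule] = [
    (lambda a, b, d: abs(a.age - b.age) >= 40 and d < 1.0, "skip-generation family"),
    (lambda a, b, d: 20 <= abs(a.age - b.age) < 40 and d < 1.0, "two-generation family"),
    (lambda a, b, d: abs(a.age - b.age) < 10 and a.gender != b.gender and d < 0.5, "lovers"),
    (lambda a, b, d: abs(a.age - b.age) < 10 and d < 1.5, "ordinary friends"),
]


def match_relationship(a: Person, b: Person, distance_m: float) -> Optional[str]:
    """Return the first relationship label whose predicate matches, else None."""
    for predicate, label in CORRESPONDENCE_LIST:
        if predicate(a, b, distance_m):
            return label
    return None
```

Keeping the list as data rather than as nested conditionals means entries can be added or reordered without touching the matching routine.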

16. The method for playing advertisements on a display screen based on relevance information according to claim 15, characterized in that the relevance information comprises one or more of the following:

Social relationship information and/or common personal attribute information.

17. The method for playing advertisements on a display screen based on relevance information according to claim 16, characterized in that the social relationship information comprises one or more of the following:

Family relationship; friend relationship; colleague relationship;

Wherein the family relationship comprises: a two-generation relationship or a skip-generation relationship;

The friend relationship comprises: a lovers relationship or an ordinary friend relationship;

The colleague relationship comprises: a same-level relationship or a superior-subordinate relationship.

18. The method for playing advertisements on a display screen based on relevance information according to claim 16, characterized in that the common personal attribute information comprises one or more of the following:

Gender; age; skin color; hairstyle; clothing; body build; facial accessories; items carried on the body.
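
By way of non-limiting illustration, attributes shared by all persons in front of the display might be derived as sketched below. The dictionary representation and the attribute names (`gender`, `age_group`, `hairstyle`) are assumptions made for this sketch and are not prescribed by the claims.

```python
from typing import Dict, List


def common_attributes(people: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the attribute/value pairs shared by every person in `people`."""
    if len(people) < 2:
        raise ValueError("at least two persons are required")
    first, *rest = people
    # Keep only the attributes whose value is identical for every other person.
    return {
        key: value
        for key, value in first.items()
        if all(other.get(key) == value for other in rest)
    }
```

An empty result indicates that, under this representation, the persons have no attribute in common.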

19. The method for playing advertisements on a display screen based on relevance information according to claim 1, characterized in that automatically selecting, according to the relevance information, the advertisement information corresponding to the relevance information comprises:

Selecting, according to the relevance information and based on a predetermined advertisement selection strategy, the advertisement information corresponding to the relevance information.
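
By way of non-limiting illustration, one possible predetermined advertisement selection strategy is a priority-ordered lookup keyed by the relationship between the persons. The catalogue contents, the advertisement identifiers and the fallback to a default advertisement are assumptions made for this sketch; the claims leave the strategy open.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical catalogue: relationship label -> candidate advertisement IDs,
# ordered by priority. Labels and IDs are illustrative only.
AD_CATALOGUE: Dict[str, List[str]] = {
    "lovers": ["wedding_ring", "couple_travel"],
    "two-generation family": ["family_car", "kids_education"],
    "ordinary friends": ["group_dining"],
}
DEFAULT_AD = "general_brand"  # assumed fallback when nothing more specific applies


def select_ad(relationship: Optional[str], already_shown: Sequence[str] = ()) -> str:
    """Return the highest-priority unseen ad for the relationship, else the default."""
    for ad in AD_CATALOGUE.get(relationship or "", []):
        if ad not in already_shown:
            return ad
    return DEFAULT_AD
```

The optional `already_shown` argument lets a caller request a different advertisement when an earlier one has been rejected.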

20. The method for playing advertisements on a display screen based on relevance information according to claim 1, characterized in that, after automatically selecting the advertisement information corresponding to the relevance information, the method further comprises:

Performing role matching in the selected advertisement information according to the relevance information, to determine the character roles of the at least two human bodies in the selected advertisement information;

Fusing the selected advertisement information with the character roles to obtain fused advertisement information.

21. The method for playing advertisements on a display screen based on relevance information according to claim 20, characterized in that fusing the advertisement information with the character roles to obtain the fused advertisement information comprises:

Establishing three-dimensional face and body models of the at least two human bodies by three-dimensional modeling;

Extracting tone information of the at least two human bodies from the voice information, and synthesizing reconstructed voice information for the selected advertisement information by speech synthesis processing;

Fusing the three-dimensional face and body models, the reconstructed voice information and the selected advertisement information to obtain the fused advertisement information.

22. The method for playing advertisements on a display screen based on relevance information according to claim 1, characterized in that, after playing the corresponding advertisement information on the display screen, the method further comprises:

Step X: obtaining post-viewing information of the at least two human bodies regarding the played advertisement information;

Step Y: determining, according to the post-viewing information and based on a predetermined satisfaction calculation method, the satisfaction of the at least two human bodies with the advertisement information;

Step Z: comparing the satisfaction with a predetermined satisfaction threshold, and replacing the advertisement information when the satisfaction is determined to be lower than the predetermined satisfaction threshold.

23. The method for playing advertisements on a display screen based on relevance information according to claim 22, characterized by further comprising:

Repeating steps X, Y and Z until the satisfaction is greater than the predetermined satisfaction threshold.

24. The method for playing advertisements on a display screen based on relevance information according to claim 23, characterized by further comprising:

Re-extracting the relevance information of the at least two human bodies when the number of advertisement replacements is determined to be greater than a predetermined replacement threshold;

Re-selecting, according to the updated relevance information, the advertisement information corresponding to that relevance information.
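
By way of non-limiting illustration, the feedback procedure of steps X, Y and Z, together with re-extraction once the replacement count exceeds its threshold, might be sketched as follows. The three callables are hypothetical stand-ins for the capture, selection and satisfaction-calculation steps, and the threshold values are illustrative. The `max_rounds` bound and the choice to treat a satisfaction equal to the threshold as acceptable are assumptions of this sketch, not features of the claims.

```python
from typing import Callable, List, Tuple


def run_feedback_loop(
    extract_relevance: Callable[[], str],
    select_ad: Callable[[str, List[str]], str],
    measure_satisfaction: Callable[[str], float],
    satisfaction_threshold: float = 0.6,
    replacement_threshold: int = 2,
    max_rounds: int = 10,
) -> Tuple[str, List[str]]:
    """Play ads until the audience is satisfied; return (final_ad, play_history)."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    relevance = extract_relevance()
    shown: List[str] = []
    replacements = 0
    ad = select_ad(relevance, shown)
    for _ in range(max_rounds):          # safety bound added for this sketch
        shown.append(ad)                 # the ad is played
        satisfaction = measure_satisfaction(ad)      # steps X and Y
        if satisfaction >= satisfaction_threshold:   # step Z: keep the ad
            break
        replacements += 1                # step Z: the ad will be replaced
        if replacements > replacement_threshold:
            relevance = extract_relevance()  # re-extract the relevance information
            replacements = 0
        ad = select_ad(relevance, shown)
    return shown[-1], shown
```

Passing the play history to `select_ad` allows the selection step to avoid repeating an advertisement that has already been rejected.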

25. A device for playing advertisements on a display screen based on relevance information, characterized by comprising a first extraction module, a selection module and a playing module:

The first extraction module is configured to collect data of at least two human bodies and to extract relevance information between the human bodies from the data;

The selection module is configured to automatically select, according to the relevance information, advertisement information corresponding to the relevance information;

The playing module is configured to play the corresponding advertisement information on the display screen.
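
By way of non-limiting illustration, the cooperation of the three modules might be sketched as a simple pipeline. Modelling each module as a plain callable, representing each person's data as a dictionary, and the class name `AdPlaybackDevice` are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class AdPlaybackDevice:
    """Hypothetical composition of the three modules as plain callables."""
    extract_relevance: Callable[[Sequence[dict]], str]  # first extraction module
    select_ad: Callable[[str], str]                     # selection module
    play: Callable[[str], None]                         # playing module

    def run(self, audience_data: Sequence[dict]) -> str:
        """Extract relevance, select the matching ad, play it, and return it."""
        if len(audience_data) < 2:
            raise ValueError("data of at least two human bodies is required")
        relevance = self.extract_relevance(audience_data)
        ad = self.select_ad(relevance)
        self.play(ad)
        return ad
```

Because each module is injected, any one of them can be exchanged (for example, a different selection strategy) without altering the others.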

26. The device for playing advertisements on a display screen based on relevance information according to claim 25, characterized in that the data comprises the following information:

Image information and/or voice information.

27. The device for playing advertisements on a display screen based on relevance information according to claim 26, characterized in that the first extraction module comprises a second extraction module and an association determination module:

The second extraction module is configured to extract feature information of at least two persons from the data;

The association determination module is configured to determine the relevance information between human bodies according to the feature information;

Wherein the feature information comprises one or more of the following:

Human body feature information;

Voice feature information;

Physiological feature information.

28. The device for playing advertisements on a display screen based on relevance information according to claim 25, characterized in that the relevance information comprises one or more of the following:

Social relationship information and/or common personal attribute information.

29. The device for playing advertisements on a display screen based on relevance information according to claim 25, characterized in that the selection module is configured to select, according to the relevance information and based on a predetermined advertisement selection strategy, the advertisement information corresponding to the relevance information.

30. The device for playing advertisements on a display screen based on relevance information according to claim 25, characterized by further comprising a fusion module:

The fusion module is configured to, after the advertisement information corresponding to the relevance information has been automatically selected, perform role matching in the selected advertisement information according to the relevance information, to determine the character roles of the at least two human bodies in the selected advertisement information; and to fuse the selected advertisement information with the character roles to obtain fused advertisement information.

31. The device for playing advertisements on a display screen based on relevance information according to claim 30, characterized in that the fusion module comprises a model building module, a speech synthesis module and an advertisement acquisition module:

The model building module is configured to establish three-dimensional face and body models of the at least two human bodies by three-dimensional modeling;

The speech synthesis module is configured to extract tone information of the at least two human bodies from the voice information, and to synthesize reconstructed voice information for the selected advertisement information by speech synthesis processing;

The advertisement acquisition module is configured to fuse the three-dimensional face and body models, the reconstructed voice information and the selected advertisement information to obtain the fused advertisement information.

32. The device for playing advertisements on a display screen based on relevance information according to claim 25, characterized by further comprising a post-viewing information acquisition module, a satisfaction determination module and a first replacement module:

The post-viewing information acquisition module is configured to obtain, after the corresponding advertisement information has been played on the display screen, post-viewing information of the at least two human bodies regarding the played advertisement information;

The satisfaction determination module is configured to determine, according to the post-viewing information and based on a predetermined satisfaction calculation method, the satisfaction of the at least two human bodies with the advertisement information;

The first replacement module is configured to compare the satisfaction with a predetermined satisfaction threshold, and to replace the advertisement information when the satisfaction is determined to be lower than the predetermined satisfaction threshold.

33. A terminal device, characterized by comprising the device for playing advertisements on a display screen based on relevance information according to any one of claims 25 to 32.