CN103984738B - Role labelling method based on search matching - Google Patents
- Publication number
- CN103984738B (grant) · CN201410218854.7A / CN201410218854A (application)
- Authority
- CN
- China
- Prior art keywords
- face
- role
- marked
- mark
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7837—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
- G06F16/784—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
Abstract
The invention discloses a method for labelling film and television drama roles based on search matching. The method comprises the following steps: obtaining the set of objects to be labelled in a labelling scene, together with the information of each object to be labelled, from a list of objects to be labelled; constructing text keywords for each object to be labelled and obtaining the corresponding image set with an image search engine; performing face detection and visual-attribute analysis on the retrieved images and removing the noise among them, so as to obtain, for each object to be labelled, a set of role faces closely related to the labelling scene; performing face detection and tracking on the labelling scene to obtain all face sequences in it; and labelling the roles of the scene based on the visual similarity between face sequences and the analysis of the visual similarity between face sequences and the role faces of the objects to be labelled. The method labels drama roles using the face images of those roles available on the Internet. Its beneficial effects are that the labelling process is fully automatic, the labelling accuracy is high, and the method is highly extensible and universal.
Description
Technical field
The present invention relates to the technical field of intelligent video analysis, and in particular to a role labelling method based on search matching.
Background technology
With the flourishing of the film and television industry, a large number of film and television programmes are produced every year, greatly enriching people's entertainment. The story of most dramas centres on characters; these roles are played by real actors, and the plot develops through the appearance of, and interaction between, the roles. Character labelling of a drama, i.e. attaching the corresponding role name to each face that appears in it, establishes the mapping between faces and role names, from which the specific time segments and spatial regions in which each character appears can be obtained. This makes character labelling an important topic with broad application value. At present, drama character labelling has become a basic supporting technology for services such as the intelligent and personalized management, browsing and retrieval of large-scale drama data, and plays the role of a core module in applications such as role-centred drama browsing, intelligent video summarization and role-oriented video retrieval.
A number of drama character labelling methods have been proposed; they can be broadly divided into face-model-based methods and script-based methods. Face-model-based methods collect a certain number of faces for each role as training samples and use them to build a face model per role; character labelling is then performed according to the similarity between the faces in the drama and the face models of the different roles. Although such methods have been successfully applied in many systems, they require training samples to be collected manually, which usually costs time and effort, and the trained face models are generally difficult to apply to other dramas: even for the same actor, his or her visual appearance may differ considerably between dramas, so face-model-based methods are hard to extend to the processing and analysis of large-scale drama collections. Script-based methods, on the other hand, perform character labelling by mining the temporal consistency between the textual and visual information of a drama. Typically, the script and subtitle text of a programme are first obtained from an external channel such as the Internet; by aligning the script with the subtitles, the information of which role is speaking at which point in time is obtained. According to the time points at which faces are detected in the drama, a preliminary mapping between faces and role names is then established, which is further refined and made more accurate using the visual similarity between faces. The advantage of script-based methods is that the labelling process is automatic (without manual intervention). However, the scripts and subtitles of many dramas are not easy to obtain: many dramas do not publish their scripts, scripts and subtitles may not fully correspond, and many dubbed films have no Chinese script or subtitles. These factors limit the universality of script-based methods.
In addition to the above methods, some search-based celebrity image labelling methods have recently been proposed. These methods first collect celebrity face images with a search engine to build a celebrity library. For an image to be labelled, they compute its visual similarity with the images in the library, obtain a small number of highly similar images, and label the image with the celebrity information attached to those images. However, the effectiveness of such methods has only been confirmed on libraries containing a few hundred celebrities; moreover, this line of work targets the image domain rather than the video domain, and therefore cannot exploit valuable labelling cues such as video structure.
The prosperity of the Internet has put a large number of character images online. For an actor with a certain popularity, many of his or her face images can be retrieved through an image search engine using the actor's real name as the query. These faces generally have the following characteristics: 1) the results contain images of the actor both in different dramas and in daily life, so the faces show a certain variation in visual appearance; 2) the face images usually contain some noise, such as faces of other people appearing in the images; 3) the proportion of correct images is generally higher among the top-ranked results than among the lower-ranked ones. On the other hand, if the drama title plus the name of the role the actor plays in the drama is used as the query, the retrieved face images have different characteristics, because the query is stricter. Usually, when the queried role is a leading role of the drama, most of the top-ranked images in the results are face images of that role in that drama; but when the role is not a leading role, the noise proportion among the top-ranked results is usually higher, and the results also have a higher probability of containing face images of leading roles of other dramas.
The face images obtained by searching for drama roles, together with the above characteristics, can obviously be used to achieve better role labelling. However, the prior art does not make good use of this information, in particular of the characteristics of the images returned by different queries. The present invention is based on this insight. Specifically, the images retrieved with "drama title plus role name" generally contain face images of the role as it appears in that drama, so a vision-matching approach can achieve a good labelling effect; but the retrieved image set may also contain a few, or even many, noise images, and how to identify the noise and remove its influence is a difficulty. The present invention therefore novelly exploits the fact that the image set retrieved with the real name usually has a relatively low noise proportion: the visual attributes of the actor are mined from the "real name" face set and then used to denoise the "drama title plus role name" face set, thereby obtaining the actor's role face set. On this basis, the visual similarity between the role faces and the faces in the drama, together with the visual similarity between faces within the drama, is used to achieve high-accuracy role labelling. Compared with traditional face-model-based methods, the labelling process of the invention is automatic and needs no manual intervention, and the role face images are determined adaptively for each drama, giving good extensibility. Compared with script-based methods, the invention only needs the cast list of the drama; obtaining a cast list is much easier than obtaining scripts and subtitles, and even if no cast list can be obtained, compiling one manually is far easier than manually compiling script and subtitle text. The invention therefore has stronger universality and is applicable to more dramas. In addition, whereas search-based celebrity labelling methods collect face images using names only, the invention fully mines the correlation between the face images returned by different queries and accordingly collects drama role faces in a highly targeted way. Moreover, the invention also mines the structural information of the video to better achieve character labelling, and is thus technically more advanced, with higher labelling precision. See also the invention patent of Application No. 201210215951.1, entitled "A method for automatically generating summaries of the main characters in TV programmes", and the invention patent of Application No. 201110406765.1, entitled "A role-based TV drama video analysis method".
Content of the invention
The purpose of the present invention is to fully mine and effectively use the face images of drama roles available on the Internet, so as to provide an automatic, extensible, highly universal and high-precision character labelling method, offering a basic supporting technology for services such as the intelligent and personalized management, browsing and retrieval of massive drama data.
To achieve the above purpose, the present invention provides a character labelling method based on search matching, comprising the following steps:
S1, according to a list of objects to be labelled, obtaining the set of objects to be labelled in the labelling scene and the information of all objects to be labelled;
S2, constructing text keywords for each object to be labelled and obtaining the corresponding set of search result images with an image search engine;
S3, performing face detection and visual-attribute analysis on the obtained search result images, removing the noise among them using the consistency of facial visual attributes, and obtaining, for each object to be labelled, a role face set closely related to the labelling scene;
S4, performing face detection and tracking on the labelling scene to obtain all face sequences in it;
S5, labelling the roles of the labelling scene based on the visual similarity between face sequences and the analysis of the visual similarity between face sequences and the role faces of the objects to be labelled.
According to the invention, a drama character labelling method based on search matching is proposed. By mining the relations between the face images returned by different queries, the method obtains role face images closely related to the drama, and then performs character labelling according to the visual similarity between these role face images and the face sequences in the drama, as well as the visual similarity between face sequences within the drama. The method has the advantages that the labelling process is fully automatic and free of manual intervention, the labelling precision is high, it is suitable for large-scale drama data processing, and it is strongly extensible and universal, being applicable to many types of dramas. The method can also serve as an important basic supporting technology in services such as the intelligent and personalized management, browsing and retrieval of large-scale drama data, and plays the role of a core module in applications such as role-centred drama browsing, intelligent video summarization and role-oriented video retrieval.
Brief description of the drawings
Fig. 1 is a flow chart of the character labelling method based on search matching according to one embodiment of the invention.
Specific embodiment
To make the purpose, technical solutions and advantages of the present invention clearer, the invention is described in more detail below in conjunction with specific embodiments and with reference to the accompanying drawing.
As shown in Fig. 1, the character labelling method based on search matching of the invention comprises the following steps:
S1, according to a list of objects to be labelled, such as a cast list, obtaining the set of objects to be labelled in the labelling scene and the information of each object to be labelled: the real name and the role name;
S2, constructing text keywords for each actor and obtaining the corresponding set of search result images with an image search engine;
S3, performing face detection and visual-attribute analysis on the obtained search result image sets, removing the noise among them using the consistency of facial visual attributes, and obtaining, for each actor, a role face set closely related to the drama;
S4, performing face detection and tracking on the drama to obtain all face sequences in the drama;
S5, labelling the roles of the drama based on the visual similarity between face sequences and the analysis of the visual similarity between face sequences and the actors' role faces.
According to a preferred embodiment of the invention, the detailed process of obtaining the real names and role names of all objects to be labelled from a list of objects to be labelled, such as a cast list, is as follows:
Step 11, visiting websites that specialize in drama cast lists and plot introductions, such as Aiyanyuan (http://www.ayanyuan.com/) and IMDB (http://www.imdb.com/), and querying with the drama title to obtain the web pages related to the drama, i.e. the labelling scene;
Step 12, according to the page layout of those web pages, crawling the cast-list section to obtain the actor set of the drama and, for each actor, information such as the real name and the role name.
According to a preferred embodiment of the invention, for the actor set obtained in step 12, two groups of text keywords, the real name and the drama title plus the role name, are constructed for each actor, and search result images are obtained with an image search engine as follows:
Step 21, for each actor in the actor set obtained in step 12, constructing two text keywords: one is the actor's real name, the other is the combination of the full drama title and the name of the role the actor plays;
Step 22, after the text keywords are constructed, submitting the two text keywords in turn to an image search engine, for example the Google image search engine through the application programming interface provided by Google, with the search parameter set to retrieve images containing faces, and returning a number of search result images for the actor. If, for example, the number of retrieved result images is set to 64, the Google image search engine returns the uniform resource locators (URL addresses) of the face images of the top 64 retrieval results to the retrieval end, which then downloads the corresponding images from those addresses. Ideally all images can be downloaded normally, and this step yields 64 search result images; in practice, the number of images downloadable per keyword is generally between 50 and 64. The image sets downloaded with the real name and with the drama title plus role name are called the "real name" and the "drama title plus role name" image sets respectively.
The above process is repeated for each actor in the actor set, obtaining the "real name" and "drama title plus role name" image sets of every actor.
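Steps 21–22 above can be sketched as follows. This is a minimal illustration of the keyword construction only; the patent does not specify how the two query strings are formatted, so the single-space separator is our assumption, and the actual submission to the search engine and downloading of the returned URLs is omitted.

```python
def build_keywords(real_name, drama_title, role_name):
    # Step 21: one query is the actor's real name, the other the full
    # drama title plus the name of the role the actor plays.
    # The separator (a single space) is an assumption of this sketch.
    return [real_name, drama_title + " " + role_name]

# For each keyword, the retrieval end would then request up to 64
# face-containing result images and download them from the returned
# URLs (step 22), yielding the "real name" and "drama title plus role
# name" image sets.
queries = build_keywords("Zhang San", "Some Drama", "Li Si")
```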
According to a preferred embodiment of the invention, face detection and visual-attribute analysis is performed on the "real name" and "drama title plus role name" image sets obtained in step 2, and the noise among them is removed using the consistency of facial visual attributes, obtaining for each actor a role face set closely related to the drama, as follows:
Step 31, calling a face detection tool, such as the face detection interface of the face recognition cloud service Face++ (http://www.faceplusplus.com.cn/), to perform face detection on the "real name" and "drama title plus role name" image sets, and expressing the image sets as the corresponding "real name" and "drama title plus role name" face sets according to the detection results. At the same time, the visual attributes of each face to be labelled are extracted; in one embodiment of the invention, the visual attributes are of three kinds, sex, age and ethnicity, and M facial key regions of each face are located. In one embodiment of the invention there are nine facial key regions, namely: the left and right corners of the two eyes, the lower-left, lower-middle and lower-right edges of the nose, and the left and right corners of the mouth. In each facial key region an N-dimensional feature vector is extracted (for example a 128-dimensional SIFT feature vector), and the nine 128-dimensional feature vectors are concatenated into a 1152-dimensional face visual feature descriptor. The above process is repeated for each actor in the actor set, obtaining the "real name" and "drama title plus role name" face sets of every actor, together with the three visual attributes and the facial key region locations of every face;
Step 32, for the "real name" face set of each actor, generating a statistical histogram for each of the three visual attributes: for the sex attribute, a 2-dimensional histogram whose two bins correspond to male and female; for the age attribute, an 8-dimensional histogram whose 1st and 8th bins correspond to faces below 10 and above 70 years old respectively, an age falling in the interval [10*(i-1), 10*i) corresponding to the i-th bin; for the ethnicity attribute, a 3-dimensional histogram whose three bins correspond to "Asian", "White" and "Black". Each face votes for the corresponding bins of the statistical histograms according to its three visual attributes. When all faces in the actor's "real name" face set have voted, the ratio of the bin with the most votes to the number of faces is computed for each histogram; if the ratio exceeds a set threshold, such as 0.5, the visual attribute is considered significant in the "real name" face set. An actor is defined as recognizable if and only if all three of his or her visual attributes are significant; these three significant attributes are also defined as the character attributes of the actor. The above process is repeated for the "real name" face sets of all actors, obtaining all recognizable actors and their character attributes. Actors not determined to be recognizable will not be considered in the subsequent character labelling, since their character attributes cannot be identified from the network face images;
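The histogram voting and significance test of step 32 can be sketched as follows. This assumes the raw attribute values have already been mapped to bin indices (e.g. the age bin as min(age // 10, 7) clamped to the first bin for under-10s); that mapping helper is not part of the patent text.

```python
def significant_attribute(votes, num_bins, threshold=0.5):
    # Build the statistical histogram of step 32 by voting each face's
    # attribute bin, then check whether the most-voted bin's share of
    # all votes exceeds the threshold (0.5 in the example embodiment).
    hist = [0] * num_bins
    for v in votes:
        hist[v] += 1
    best = max(range(num_bins), key=lambda b: hist[b])
    return hist[best] / len(votes) > threshold, best

# Six faces in a "real name" set voting on sex (0 = male, 1 = female):
# five of six agree, 5/6 > 0.5, so the attribute is significant.
ok, dominant = significant_attribute([0, 0, 0, 1, 0, 0], num_bins=2)
```

An actor is recognizable only when this test passes for all three attribute histograms (sex, age, ethnicity).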
Step 33, for each recognizable actor obtained in step 32 (without loss of generality, denote the actor's role name and "drama title plus role name" face set by Per_i and CF_i respectively), performing face clustering on the "drama title plus role name" face set based on the 1152-dimensional face visual feature descriptors obtained in step 31. In one embodiment of the invention, the affinity propagation (Affinity Propagation) algorithm is used for face clustering. The clustering algorithm requires the face similarity matrix S ≡ [s_{i,j}]_{T×T}, where the element s_{i,j} is the visual similarity of faces f_i and f_j: when i ≠ j it is the cosine similarity of the descriptors of f_i and f_j, and when i = j it is the average of all pairwise face similarities in the set; T is the number of faces in the set CF_i.
According to the clustering process, CF_i can be expressed in the form of formula (1):
CF_i = {C_i^1, C_i^2, …, C_i^w}   (1)
where w is the number of classes produced by the clustering, C_i^j is the j-th clustering result in the set, and each face in C_i^j is represented by its descriptor (the k-th face of C_i^j being denoted d_k^j). Only result classes containing 3 or more faces are retained by the clustering.
For each cluster result class C_i^j obtained from formula (1), the appearance ratios within that class of the actor's three character attributes obtained in step 32 (sex, age, ethnicity) are counted. When the appearance ratios of all three attributes are greater than a predetermined threshold, such as 0.6, the faces in C_i^j are all considered candidate role faces of actor Per_i closely related to the drama. The above process is repeated for all classes C_i^j, obtaining all candidate role faces of Per_i, and then for all recognizable actors, obtaining their respective candidate role face sets;
Step 34, performing image de-duplication on the candidate role face set of each object to be labelled, i.e. on the candidate role face set of actor Per_i obtained in step 33: since a certain number of visual copy images generally exist among network face images, visual copy detection is performed on the face images in the set to remove the influence of copy images. In one embodiment of the invention, the tool kit SOTU is used as the detection kit for visual copy detection (for details see http://vireo.cs.cityu.edu.hk/research/project/sotu.htm). If visual copy faces are detected within the set, the face ranked lower in the Google retrieval results is deleted, according to the ranking of the face images in the retrieval results; this process is repeated until no copy faces remain in the set. The above process is repeated for all recognizable actors, so that no visual copy faces remain in their respective candidate role face sets;
Step 35, based on the results of step 34, further performing face de-duplication, i.e. detecting whether visual copy faces exist across the candidate role face sets of different recognizable actors, since a visual copy face can belong to only one role. If a copy face f is detected in the candidate role face sets of actors Per_i and Per_j, the average visual similarity of f to the other faces in each of the two sets is computed, and f is deleted from the face set with the lower similarity. This process is repeated until no copy faces remain between different actors. Through the above steps, the set Γ of the K recognizable actors and their respective role face sets A_i are obtained, denoted respectively as:
Γ = {A_1, A_2, …, A_K}, where A_i = {a_1^i, a_2^i, …}
and a_j^i denotes the descriptor of the j-th face in the role face set of Per_i.
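The cross-set copy resolution of step 35 can be sketched as follows. The function shape is our own; the patent only specifies the rule (keep the copy face in the set it is more similar to on average, delete it from the other), and any pairwise similarity function can be plugged in.

```python
def resolve_copy_face(set_i, set_j, f, sim):
    # Step 35: face f was detected as a visual copy in both candidate
    # role face sets. Compare f's average similarity to the *other*
    # faces of each set, and delete it from the set where it fits
    # less well, since a copy face can belong to only one role.
    def avg(faces):
        others = [g for g in faces if g != f]
        return sum(sim(f, g) for g in others) / len(others)
    if avg(set_i) >= avg(set_j):
        return set_i, [g for g in set_j if g != f]
    return [g for g in set_i if g != f], set_j
```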
According to a preferred embodiment of the invention, the detailed process of performing face detection and tracking on the drama to obtain its face sequences is as follows:
Step 41, performing shot boundary detection on the drama; if s-1 shot boundary points are detected, the drama is decomposed into s shots according to these s-1 shot boundary points;
Step 42, calling a tool such as the face detection and tracking interface of the face recognition cloud service Face++ to perform face detection and tracking within each shot, obtaining the face sequences detected in that shot. This process is repeated for all s shots, obtaining all face sequences in the drama. Of course, other face detection and tracking methods may also be used; the present invention places no limitation on the face detection and tracking method.
According to a preferred embodiment of the invention, the recognizable performer and the respective role of s/hes for being obtained based on step 35
Face set, and the movie and television play face sequence that step 42 is obtained, based on the vision similarity between face sequence, and face
Sequence is analyzed with the vision similarity of performer role's face, and realization is to the detailed process of the character labeling of movie and television play:
Step 51: suppose T face sequences were obtained in step 42. For each face sequence, extract color histogram features from all of its faces and cluster based on this feature. The clustering algorithm again uses the Affinity Propagation algorithm, and the face similarity matrix is computed on the same principle as described in step 33. According to the clustering result, the face sequence FTk is represented as a set of pairs, each pair consisting of the class center vector of class i and the number of faces in class i, for i = 1, ..., w; the class center vector is the feature representation of the face nearest to the center of class i, and w is the number of classes;
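The per-face color histogram feature of step 51 might be computed as below; this is a sketch, and the choice of 8 bins per RGB channel with L1 normalisation is an assumption not specified in the patent:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Per-channel color histogram for one RGB face crop (H x W x 3 uint8
    array): the three channel histograms are concatenated and the result
    is L1-normalised so faces of different sizes are comparable."""
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()
```

The resulting 3x8 = 24-dimensional vectors would then feed the Affinity Propagation clustering mentioned above.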
Step 52: face sequences that appear at the same moment generally cannot correspond to the same person. According to the overlap of the face sequences' appearance times, generate the collision matrix C ≡ [ci,j]T×T: if the appearance times of face sequences FTi and FTj overlap, then ci,j = 1; if they do not overlap, ci,j = 0;
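The collision matrix of step 52 follows directly once each face sequence is summarised by its on-screen time interval; representing appearance times as (start, end) pairs is an assumption of this sketch:

```python
def collision_matrix(intervals):
    """Build C for step 52. `intervals` holds the (start, end) appearance
    times of the T face sequences; C[i][j] = 1 when sequences i and j are
    on screen at the same time (so they cannot be the same person)."""
    T = len(intervals)
    C = [[0] * T for _ in range(T)]
    for i in range(T):
        for j in range(T):
            if i != j:
                a, b = intervals[i], intervals[j]
                if a[0] < b[1] and b[0] < a[1]:  # open-interval overlap test
                    C[i][j] = 1
    return C
```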
Step 53: based on the face sequence representation obtained in step 51, calculate the visual similarity of face sequences FTi and FTj using the Earth Mover's Distance, denoted fsi,j. Repeat this calculation for every pair of face sequences, and obtain the probability propagation matrix P ≡ [pi,j]T×T of face sequence similarities by formula (2);
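The body of formula (2) is not reproduced above; a common construction consistent with the description, offered here only as an assumption, is to row-normalise the pairwise similarities fsi,j into a stochastic matrix:

```python
import numpy as np

def propagation_matrix(fs):
    """One plausible reading of formula (2): row-normalise the pairwise
    similarity matrix fs so each row of P sums to 1, making P a stochastic
    matrix suitable for the label propagation of step 57."""
    fs = np.asarray(fs, dtype=float)
    np.fill_diagonal(fs, 0.0)            # no self-propagation (assumption)
    row_sums = fs.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0        # isolated sequences keep zero rows
    return fs / row_sums
```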
Step 54: calculate the matching confidence matrix S ≡ [si,j]T×K between roles and face sequences, where si,j is the similarity between face sequence FTi and the role face set of Perj. This similarity equals the visual similarity of the two most similar faces in the two sets, and is calculated according to formula (3), in which each term is the similarity between the m-th class center vector of face sequence FTi and the n-th role face in the role face set of Perj;
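Formula (3) takes the similarity of a sequence and a role set as the best pairwise face similarity; the sketch below assumes cosine similarity over feature vectors, a measure the patent does not fix:

```python
import numpy as np

def match_confidence(seq_centers, role_faces):
    """si,j of formula (3): the similarity of a face sequence and a role
    face set, taken as the maximum pairwise similarity between any class
    center vector of the sequence and any face feature in the role set.
    Cosine similarity is an assumption of this sketch."""
    def cos(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return max(cos(c, f) for c in seq_centers for f in role_faces)
```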
Step 55: update the matching confidence matrix S using the collision matrix C by formula (4). This operation avoids simultaneously assigning high matching confidence to face sequences whose appearance times overlap, so that they are not labeled as the same role in subsequent steps;
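Formula (4) is not reproduced above; one plausible reading, shown purely as an assumption, is that of two time-overlapping sequences only the one with the higher confidence for a given role keeps its score:

```python
def resolve_conflicts(S, C):
    """A hypothetical reading of formula (4): when sequences i and k
    overlap in time (C[i][k] == 1) they cannot both be role j, so only
    the sequence with the higher confidence keeps its score for j and
    the other's score is zeroed."""
    T, K = len(S), len(S[0])
    S2 = [row[:] for row in S]
    for i in range(T):
        for k in range(T):
            if C[i][k]:
                for j in range(K):
                    if S[i][j] < S[k][j]:
                        S2[i][j] = 0.0
    return S2
```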
Step 56: using the matrix S updated in step 55, a similarity threshold V1 (e.g. V1 = 0.8) and a dissimilarity threshold V2 (e.g. V2 = 0.2), generate the initial label matrix L(0) by formula (5). In the matrix L(0), an entry of 1 indicates that FTi is a face of role Perj, an entry of -1 indicates that face sequence FTi is not a face of role Perj, and an entry of 0 indicates that the role corresponding to face sequence FTi cannot yet be determined from matching confidence alone. Every two-tuple <FTi,Perj> whose entry is 1 is added to the labeled-role set LFaces. This realizes role labeling of the two-tuples <FTi,Perj> with high matching confidence values and no conflicts;
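The thresholding of step 56 can be sketched as follows, assuming the conventional +1 / -1 / 0 encoding for "is the role", "is not the role" and "undetermined" (the patent's own symbols for these entries were rendered as images and are not reproduced here):

```python
def initial_labels(S, v1=0.8, v2=0.2):
    """Formula (5) as described in step 56: threshold the updated
    confidence matrix S into the initial label matrix L(0), with +1
    (sequence is the role), -1 (is not) and 0 (still undetermined)."""
    return [[1 if s >= v1 else (-1 if s <= v2 else 0) for s in row]
            for row in S]
```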
Step 57: based on the probability propagation matrix P obtained by formula (2) and the initial label matrix L(0) obtained by formula (5), apply the Label Propagation algorithm, that is, iteratively execute formula (6) and formula (7) to update the undetermined elements of the initial label matrix L(0) until the algorithm converges:
L(t+1) ≡ PL(t) (6)
By executing the label propagation algorithm, the existing high-confidence role labeling results are propagated, with certain probabilities, to the face sequences whose roles cannot yet be determined, according to the similarities between face sequences;
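Step 57's iteration of formula (6) can be sketched as below; re-clamping the already-decided entries of L(0) after every multiplication is our reading of the unshown formula (7), and is an assumption:

```python
import numpy as np

def label_propagation(P, L0, iters=100, tol=1e-6):
    """Iterate formula (6), L <- P @ L, re-clamping the entries already
    decided in L0 after each step (assumed content of formula (7)),
    until the matrix stops changing."""
    P = np.asarray(P, dtype=float)
    L = np.asarray(L0, dtype=float)
    clamped = L != 0                     # high-confidence entries stay fixed
    fixed = L.copy()
    for _ in range(iters):
        L_new = P @ L
        L_new[clamped] = fixed[clamped]
        if np.abs(L_new - L).max() < tol:
            return L_new
        L = L_new
    return L
```

In a two-sequence toy case where sequence 0 is already labeled and P links the two sequences, the label spreads to sequence 1 and the iteration converges in two steps.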
Step 58: let LΔ be the label matrix after the algorithm converges. Update, according to formula (8), the labeling confidence of the elements of LΔ that satisfy the stated condition, where α ∈ (0,1) is a threshold regulating the relative weight of labeling confidence and matching confidence, set to 0.5. Through formula (8), the similarity between face sequences and the matching confidence between face sequences and role faces are effectively fused;
Step 59: from the LΔ updated in step 58, successively find the element with the maximum value that satisfies condition (9), add the corresponding two-tuple <FTi,Perj> to the labeled-role set LFaces, and update the matrix LΔ according to formula (10). Repeat this search process until no element in LΔ satisfies condition (9). Here <FTi,Perj> denotes a two-tuple, composed of face sequence FTi and role Perj, with a high matching confidence value and no conflict, and Tlabel is a preset decision threshold, set to 0.5.
According to formulas (9) and (10), the face sequence and role name combination with the highest current confidence is chosen and labeled at each step. When no element of LΔ satisfies condition (9) any longer, the labeling process ends. The results in the labeled-role set LFaces are the role labeling results.
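Steps 58-59 amount to a greedy assignment loop. The sketch below assumes L already holds the fused confidences of formula (8), and treats formula (10) as suppressing the chosen sequence's other entries and barring its time-overlapping rivals from the same role; both readings are assumptions:

```python
def greedy_assign(L, C, t_label=0.5):
    """Greedy loop for steps 58-59: repeatedly pick the largest remaining
    entry of the fused confidence matrix L, label that (sequence, role)
    pair, then suppress the chosen sequence's row and bar every
    time-overlapping sequence (C[i][k] == 1) from the same role, until
    no entry exceeds the threshold T_label."""
    T, K = len(L), len(L[0])
    L = [row[:] for row in L]
    labeled = []                          # plays the part of the LFaces set
    while True:
        best = max((L[i][j], i, j) for i in range(T) for j in range(K))
        if best[0] <= t_label:
            return labeled
        _, i, j = best
        labeled.append((i, j))
        for jj in range(K):
            L[i][jj] = float('-inf')      # sequence i is now assigned
        for k in range(T):
            if C[i][k]:
                L[k][j] = float('-inf')   # conflicting sequences lose role j
```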
The particular embodiments described above further elaborate the purpose, technical scheme and beneficial effects of the present invention. It should be understood that the foregoing is only a specific embodiment of the invention and is not intended to limit the invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the invention shall be included within its scope of protection.
Claims (9)
1. A role labeling method based on search matching, characterized in that the method comprises the following steps:
S1: according to a list of objects to be labeled, obtain the set of objects to be labeled of the labeling scene and the information of all objects to be labeled;
S2: form a text keyword for each object to be labeled, and obtain the corresponding search result image set using an image search engine;
S3: carry out face detection and perceptual attribute analysis on the obtained search result images, remove the noise therein using the consistency of face perceptual attributes, and obtain the role face sets of the objects to be labeled that are closely related to the labeling scene;
S4: carry out face detection and tracking on the labeling scene, and obtain all face sequences therein;
S5: carry out role labeling of the labeling scene based on the analysis of the visual similarity between face sequences and the visual similarity between face sequences and the role faces of the objects to be labeled;
wherein said step S4 comprises the following steps:
step 41: carry out shot boundary detection on the labeling scene, and decompose the labeling scene into s shots according to the detection result;
step 42: carry out face detection and tracking in each of the s shots, and obtain all face sequences in the labeling scene.
2. The method according to claim 1, characterized in that said step S1 comprises the following steps:
step 11: retrieve web pages related to the labeling scene;
step 12: according to the retrieved web pages, obtain the set of objects to be labeled of the labeling scene and the information of each object to be labeled.
3. The method according to claim 2, characterized in that the information of the object to be labeled includes a real name and a role name.
4. The method according to claim 1, characterized in that said step S2 comprises the following steps:
step 21: form a text keyword for each object to be labeled in the set of objects to be labeled;
step 22: based on the text keywords, retrieve with an image search engine, for each object to be labeled, the several search result image sets corresponding to its text keywords.
5. The method according to claim 4, characterized in that the text keywords include the combination of the labeling scene title with the role name corresponding to the object to be labeled, as well as the real name of the object to be labeled; the search result image set corresponding to the real name of the object to be labeled is denoted Peri, and the search result image set corresponding to the combination of the labeling scene title with the role name of the object to be labeled is denoted CFi.
6. The method according to claim 1, characterized in that said step S3 comprises the following steps:
step 31: carry out face detection on the search result image sets, extract the visual attributes of each object to be labeled's face, locate M facial key areas of the face, extract an N-dimensional feature vector in each facial key area, and obtain an M×N-dimensional facial visual feature descriptor;
step 32: for the image set Peri of each object to be labeled, generate statistical histograms corresponding to the respective perceptual attributes, vote on the corresponding dimensions of the statistical histograms according to the occurrence of each perceptual attribute, and judge the significance of each perceptual attribute according to the voting result; if and only if all perceptual attributes of an object to be labeled are significant, the object is considered recognizable, and its corresponding perceptual attributes are taken as the role attributes of the object;
step 33: for each recognizable object to be labeled, carry out face clustering on its corresponding image set CFi based on the facial visual feature descriptor, and obtain the candidate role face set of the corresponding object to be labeled according to the occurrence ratio of its role attributes in each cluster category;
step 34: carry out image de-duplication on the candidate role face set of the object to be labeled;
step 35: carry out face de-duplication on the candidate role face set after image de-duplication, using the average visual similarity of the faces.
7. The method according to claim 6, characterized in that the perceptual attributes include gender, age and ethnicity.
8. The method according to claim 1, characterized in that said step S5 comprises the following steps:
step 51: extract color histogram features from all faces in each face sequence, and cluster based on this feature;
step 52: according to the clustering result and the overlap of the face sequences' appearance times, generate the collision matrix C;
step 53: calculate the visual similarity between face sequences, and obtain the probability propagation matrix P of face sequence similarities;
step 54: calculate the matching confidence matrix S between roles and face sequences, wherein the elements of S are the similarities between face sequences and role face sets;
step 55: update the matching confidence matrix S using the collision matrix C, to avoid simultaneously assigning high matching confidence to face sequences whose appearance times overlap;
step 56: generate the initial label matrix L(0) using the updated matching confidence matrix S, a similarity threshold V1 and a dissimilarity threshold;
step 57: based on the probability propagation matrix P and the initial label matrix L(0), update the undetermined elements of the initial label matrix L(0) by the label propagation algorithm until the algorithm converges;
step 58: let LΔ be the label matrix after the algorithm converges, and update the labeling confidence of the elements of LΔ, so as to fuse the similarity between face sequences and the matching confidence between face sequences and role faces;
step 59: successively find, in the updated label matrix LΔ, the element with the maximum value that satisfies a certain condition, and update the label matrix LΔ; repeat the above process until no element in LΔ satisfies the condition, labeling at each step the face sequence and role name combination with the highest current confidence.
9. The method according to claim 8, characterized in that the certain condition in step 59 is that the element indicates that the role corresponding to face sequence FTi cannot yet be determined from matching confidence alone; wherein Tlabel is a preset decision threshold, <FTi, Perj> denotes a two-tuple, composed of face sequence FTi and role Perj, with a high matching confidence value and no conflict, LFaces denotes the labeled-role set, Label(FTk) denotes the label of face sequence FTk, ci,k is the element in row i and column k of the collision matrix C, and ci,k = 1 indicates that the appearance times of face sequences FTi and FTk overlap.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410218854.7A CN103984738B (en) | 2014-05-22 | 2014-05-22 | Role labelling method based on search matching |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103984738A CN103984738A (en) | 2014-08-13 |
CN103984738B true CN103984738B (en) | 2017-05-24 |
Family
ID=51276711
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410218854.7A Expired - Fee Related CN103984738B (en) | 2014-05-22 | 2014-05-22 | Role labelling method based on search matching |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103984738B (en) |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104217008B (en) * | 2014-09-17 | 2018-03-13 | 中国科学院自动化研究所 | Internet personage video interactive mask method and system |
CN104778481B (en) * | 2014-12-19 | 2018-04-27 | 五邑大学 | A kind of construction method and device of extensive face pattern analysis sample storehouse |
CN105335726B (en) * | 2015-11-06 | 2018-11-27 | 广州视源电子科技股份有限公司 | Recognition of face confidence level acquisition methods and system |
CN105913275A (en) * | 2016-03-25 | 2016-08-31 | 哈尔滨工业大学深圳研究生院 | Clothes advertisement putting method and system based on video leading role identification |
CN105843949B (en) * | 2016-04-11 | 2019-07-16 | 麒麟合盛网络技术股份有限公司 | A kind of image display method and device |
CN106682094B (en) * | 2016-12-01 | 2020-05-22 | 深圳市梦网视讯有限公司 | Face video retrieval method and system |
CN106708806B (en) * | 2017-01-17 | 2020-06-02 | 科大讯飞股份有限公司 | Sample confirmation method, device and system |
CN107153817B (en) * | 2017-04-29 | 2021-04-27 | 深圳市深网视界科技有限公司 | Pedestrian re-identification data labeling method and device |
CN107273859B (en) * | 2017-06-20 | 2020-10-02 | 南京末梢信息技术有限公司 | Automatic photo marking method and system |
CN108228871A (en) | 2017-07-21 | 2018-06-29 | 北京市商汤科技开发有限公司 | Facial image dynamic storage method and device, electronic equipment, medium, program |
CN107633048B (en) * | 2017-09-15 | 2021-02-26 | 国网重庆市电力公司电力科学研究院 | Image annotation identification method and system |
CN107909088B (en) * | 2017-09-27 | 2022-06-28 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device and computer storage medium for obtaining training samples |
CN107886109B (en) * | 2017-10-13 | 2021-06-25 | 天津大学 | Video abstraction method based on supervised video segmentation |
CN108228845B (en) * | 2018-01-09 | 2020-10-27 | 华南理工大学 | Mobile phone game classification method |
JP2020035086A (en) * | 2018-08-28 | 2020-03-05 | 富士ゼロックス株式会社 | Information processing system, information processing apparatus and program |
CN109740623B (en) * | 2018-11-21 | 2020-12-04 | 北京奇艺世纪科技有限公司 | Actor screening method and device |
CN109933719B (en) * | 2019-01-30 | 2021-08-31 | 维沃移动通信有限公司 | Searching method and terminal equipment |
CN110135804B (en) * | 2019-04-29 | 2024-03-29 | 深圳市元征科技股份有限公司 | Data processing method and device |
CN110555117B (en) * | 2019-09-10 | 2022-05-31 | 联想(北京)有限公司 | Data processing method and device and electronic equipment |
CN110807108A (en) * | 2019-10-15 | 2020-02-18 | 华南理工大学 | Asian face data automatic collection and cleaning method and system |
CN111770299B (en) * | 2020-04-20 | 2022-04-19 | 厦门亿联网络技术股份有限公司 | Method and system for real-time face abstract service of intelligent video conference terminal |
CN111813660B (en) * | 2020-06-12 | 2021-10-12 | 北京邮电大学 | Visual cognition search simulation method, electronic equipment and storage medium |
CN113052079B (en) * | 2021-03-26 | 2022-01-21 | 重庆紫光华山智安科技有限公司 | Regional passenger flow statistical method, system, equipment and medium based on face clustering |
CN113283480B (en) * | 2021-05-13 | 2023-09-05 | 北京奇艺世纪科技有限公司 | Object identification method and device, electronic equipment and storage medium |
CN113792186B (en) * | 2021-08-16 | 2023-07-11 | 青岛海尔科技有限公司 | Method, device, electronic equipment and storage medium for name retrieval |
CN115482618A (en) * | 2022-08-10 | 2022-12-16 | 青岛民航凯亚系统集成有限公司 | Remote airplane boarding check auxiliary method based on face recognition |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1311677C (en) * | 2004-03-12 | 2007-04-18 | 冯彦 | Substitute method of role head of digital TV. program |
US8300953B2 (en) * | 2008-06-05 | 2012-10-30 | Apple Inc. | Categorization of digital media based on media characteristics |
KR20130000828A (en) * | 2011-06-24 | 2013-01-03 | 엘지이노텍 주식회사 | A method of detecting facial features |
CN102521340B (en) * | 2011-12-08 | 2014-09-03 | 中国科学院自动化研究所 | Method for analyzing TV video based on role |
CN102542292B (en) * | 2011-12-26 | 2014-03-26 | 湖北莲花山计算机视觉和信息科学研究院 | Method for determining roles of staffs on basis of behaviors |
CN102902821B (en) * | 2012-11-01 | 2015-08-12 | 北京邮电大学 | The image high-level semantics mark of much-talked-about topic Network Based, search method and device |
CN103309953B (en) * | 2013-05-24 | 2017-02-08 | 合肥工业大学 | Method for labeling and searching for diversified pictures based on integration of multiple RBFNN classifiers |
CN103793697B (en) * | 2014-02-17 | 2018-05-01 | 北京旷视科技有限公司 | The identity mask method and face personal identification method of a kind of facial image |
- 2014-05-22 CN CN201410218854.7A patent/CN103984738B/en not_active Expired - Fee Related
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20170524