CN102752540B - An automated cataloging method based on face recognition technology

Info

Publication number: CN102752540B
Application number: CN201110453762.3A
Authority: CN
Inventors: 张峰
Original assignee: China Digital Video Beijing Ltd
Current assignee: China Digital Video Beijing Ltd
Priority date: 2011-12-30
Filing date: 2011-12-30
Publication date: 2017-12-29
Anticipated expiration: 2031-12-30
Also published as: CN102752540A

Abstract

The invention discloses an automated cataloging method based on face recognition technology. The method includes: receiving a face material database; receiving a multimedia file; obtaining a key frame record and the corresponding key frame data picture from the video file; obtaining a key frame face picture from the key frame data picture; querying the face image information in the face material database with the key frame face picture to obtain matching face material text information; performing speech recognition on the audio file according to the key frame record to obtain key frame cataloging text; and merging the face material text information into the key frame cataloging text according to the key frame record to obtain a catalog file. The invention solves the problem that a catalog file could not be generated and edited from a video file, improves the precision and flexibility of catalog file generation and processing, lowers system cost, reduces the error rate, and is widely applicable.

Description

An automated cataloging method based on face recognition technology

Technical field

The present invention relates to the field of material data editing and processing in broadcasting systems, in particular radio and television systems, with emphasis on applications in the digital video and audio industry, and more particularly to an automated cataloging method based on face recognition technology.

Background art

With the development and spread of television production technology, the multimedia material collected during program production is commonly pre-processed, and the speech it contains is recognized to obtain the corresponding catalog information. This is especially so as sports, news, interview, and variety programs account for an ever larger share of programming. Cataloging such programs by hand is time-consuming and laborious. At the same time, these programs are built around key persons, for example sports stars, state leaders, hosts, and news anchors, and the people who appear are relatively fixed. Having a computer automatically analyze the inherent biometric information of faces and use it as the primary catalog information of the video would save a great deal of manual cataloging work. In the prior art, however, this person information cannot be obtained directly from the audio file and must be obtained in some other way. The usual approach is to identify the video content manually: an operator inserts name information into the catalog file according to the broadcast picture. When a large amount of manual identification is needed, generating and handling catalog information from pictures of people requires substantial manpower and material resources, and because people are involved, human factors also affect the production quality and efficiency of the cataloged material.

In the course of realizing the present invention, the inventor found the following defect in the prior art: when person information is added during catalog file editing, the person information must be identified manually from the different pictures of people, and the corresponding catalog file is then edited. The production quality and working efficiency of the catalog file therefore depend entirely on manual operation, which is time-consuming and laborious, consumes a large amount of system resources, and cannot produce a good catalog file.

Summary of the invention

In view of the defects in the prior art, the present invention solves the problem that a catalog file cannot be generated and edited from a video file.

To solve the above technical problem, the invention provides an automated cataloging method based on face recognition technology, which specifically includes:

Receiving a face material database, the face material database specifically including: face image information and face material text information;

Receiving a multimedia file, the multimedia file including: a video file and an audio file;

Obtaining a key frame record and the corresponding key frame data picture from the video file;

Obtaining a key frame face picture from the key frame data picture;

Querying the face image information in the face material database with the key frame face picture to obtain matching face material text information;

Performing speech recognition on the audio file according to the key frame record to obtain key frame cataloging text;

Merging the face material text information into the key frame cataloging text according to the key frame record to obtain a catalog file.
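One possible reading of performing speech recognition "according to the key frame record" is to cut the audio timeline at the key frame timestamps and recognize each segment separately. The sketch below illustrates that reading only. The function names and the `recognizer` callable are hypothetical, and the source does not say how the audio is segmented or which speech engine is used.

```python
from typing import Callable, Dict, List, Tuple


def segments_from_key_frames(key_frame_times: List[float],
                             duration: float) -> List[Tuple[float, float]]:
    """Split [0, duration) into segments that each start at a key frame time."""
    times = sorted(t for t in key_frame_times if 0 <= t < duration)
    bounds = times + [duration]
    return list(zip(bounds, bounds[1:]))


def transcribe_segments(segments: List[Tuple[float, float]],
                        recognizer: Callable[[float, float], str]) -> List[Dict]:
    """Run a caller-supplied recognizer (an assumption) on every segment."""
    return [{"start": start, "end": end, "text": recognizer(start, end)}
            for start, end in segments]


# Stub standing in for a real speech recognition engine.
def fake_recognizer(start: float, end: float) -> str:
    return f"speech between {start} and {end}"


segments = segments_from_key_frames([5.0, 0.0, 12.5], duration=20.0)
catalog_text = transcribe_segments(segments, fake_recognizer)
```

Keeping the recognizer as a parameter lets the segmentation logic be exercised without committing to any particular speech recognition library.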

Wherein, before the step of receiving the face material database, the method further includes: establishing the face material database.

Wherein, the step of establishing the face material database specifically includes: receiving face material, the face material being identified by a face material keyword, a single face material including multi-angle material, emotion-type expression material, and speaking-type expression material; and establishing the face material database from the face material keywords and the corresponding face material.
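As a minimal sketch of such a database, the keyword can serve as a dictionary key that maps to every stored material for one person, each tagged with its angle and expression type. The class names, field names, and file names below are illustrative assumptions, not structures defined by the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FaceMaterial:
    """One stored face image with the attributes the text mentions."""
    image_path: str   # hypothetical location of the image file
    angle: str        # e.g. "front", "left", "right"
    expression: str   # e.g. "neutral", "smile", "speaking"


@dataclass
class FaceMaterialDatabase:
    """Maps a face material keyword (here, a name) to its materials."""
    entries: Dict[str, List[FaceMaterial]] = field(default_factory=dict)

    def add(self, keyword: str, material: FaceMaterial) -> None:
        self.entries.setdefault(keyword, []).append(material)

    def materials_for(self, keyword: str) -> List[FaceMaterial]:
        return self.entries.get(keyword, [])


db = FaceMaterialDatabase()
db.add("Zhang San", FaceMaterial("zhang_front.png", "front", "neutral"))
db.add("Zhang San", FaceMaterial("zhang_left.png", "left", "speaking"))
db.add("Li Si", FaceMaterial("li_front.png", "front", "smile"))
```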

Wherein, the step of establishing the face material database specifically includes: receiving a face material three-dimensional model, the face material three-dimensional model including face control point model information and the corresponding face material three-dimensional model text information; and establishing the face material database from the face material three-dimensional model.

Wherein, the face image information further includes a brightness information attribute.

Wherein, the step of obtaining the key frame face picture from the key frame data picture specifically includes:

Obtaining shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information from the key frame data picture; performing face extraction on the key frame data picture to obtain the key frame face picture; and obtaining key frame face image information from the shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information.

Wherein, the step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes: querying the face image information in the face material database with the key frame face picture and the key frame face image information to obtain the matching face material text information.
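The query described here uses both the face picture and its attributes. One way to sketch this is to first narrow the candidates to stored materials with the same angle and expression type, then compare feature vectors. Everything in the block is an assumption made for illustration: the source does not specify a feature representation, a similarity measure, or a threshold, and cosine similarity is simply a common stand-in.

```python
import math
from typing import List, NamedTuple, Optional, Sequence


class FaceRecord(NamedTuple):
    name: str                  # face material text information
    angle: str
    expression: str
    features: Sequence[float]  # hypothetical feature vector of the stored image


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_face(query_features: Sequence[float], angle: str, expression: str,
               records: List[FaceRecord], threshold: float = 0.9) -> Optional[str]:
    """Return the best matching name, or None if nothing reaches the threshold."""
    # Prefer records stored with the same angle and expression type;
    # fall back to the whole database when no such record exists.
    candidates = [r for r in records
                  if r.angle == angle and r.expression == expression]
    if not candidates:
        candidates = list(records)
    best_name, best_score = None, threshold
    for record in candidates:
        score = cosine_similarity(query_features, record.features)
        if score >= best_score:
            best_name, best_score = record.name, score
    return best_name


records = [
    FaceRecord("Zhang San", "front", "neutral", [1.0, 0.0, 0.0]),
    FaceRecord("Li Si", "left", "speaking", [0.0, 1.0, 0.0]),
]
```

Filtering by attributes first keeps a frontal, neutral query from being compared with profile or speaking images, which is the apparent purpose of storing multi-angle and expression material.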

Wherein, the face material text information specifically includes: name information.

Wherein, the step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes: obtaining face control point model information from the key frame face picture; and querying the face material three-dimensional models in the face material database with the face control point model information to obtain the matching face material three-dimensional model text information.

Wherein, the face control point model information specifically includes: face boundary control point model information and facial feature control point model information.
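To illustrate matching on control points, the sketch below treats the boundary and facial feature control points as one ordered list of 2D coordinates, normalizes away position and scale, and picks the stored model with the smallest mean point-to-point distance. This is a simplified 2D stand-in chosen for illustration. The source does not describe how its three-dimensional models are compared, and the function names and the distance limit are assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def normalize(points: List[Point]) -> List[Point]:
    """Center the points on their centroid and scale to unit RMS radius."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


def control_point_distance(a: List[Point], b: List[Point]) -> float:
    """Mean distance between corresponding normalized control points."""
    na, nb = normalize(a), normalize(b)
    return sum(math.hypot(px - qx, py - qy)
               for (px, py), (qx, qy) in zip(na, nb)) / len(na)


def match_model(query: List[Point], models: Dict[str, List[Point]],
                max_distance: float = 0.1) -> Optional[str]:
    """Return the text of the closest model, or None if none is close enough."""
    best_text, best_distance = None, max_distance
    for text, points in models.items():
        if len(points) != len(query):
            continue  # models with a different point layout are not comparable
        distance = control_point_distance(query, points)
        if distance <= best_distance:
            best_text, best_distance = text, distance
    return best_text


models = {
    "Zhang San": [(0, 0), (4, 0), (2, 3), (2, 1)],
    "Li Si": [(0, 0), (4, 0), (2, 6), (2, 5)],
}
# The same shape as the "Zhang San" model, shifted and twice as large.
query = [(10, 10), (18, 10), (14, 16), (14, 12)]
```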

Wherein, the step of obtaining the key frame record and the corresponding key frame data picture from the video file specifically includes: receiving shooting brightness information; adjusting the video file according to the shooting brightness information; and obtaining the key frame record and the corresponding key frame data picture from the adjusted video file.
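A rough sketch of this step follows, assuming each frame is a flat list of grayscale pixel values and the shooting brightness information is reduced to a single gain factor. The key frame rule used here, a new key frame whenever the mean absolute difference from the previous key frame exceeds a threshold, is a common heuristic chosen for illustration; the source does not say how key frames are detected.

```python
from typing import Dict, List


def adjust_brightness(frame: List[int], gain: float) -> List[int]:
    """Scale every pixel by the gain and clamp to the 0..255 range."""
    return [min(255, max(0, round(p * gain))) for p in frame]


def mean_abs_diff(a: List[int], b: List[int]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def extract_key_frames(frames: List[List[int]], fps: float,
                       gain: float = 1.0, threshold: float = 30.0) -> List[Dict]:
    """Return key frame records (index, time) with their adjusted pictures."""
    records = []
    last_key = None
    for index, frame in enumerate(frames):
        adjusted = adjust_brightness(frame, gain)
        # The first frame always starts a shot; later frames count as key
        # frames only when they differ enough from the previous key frame.
        if last_key is None or mean_abs_diff(adjusted, last_key) > threshold:
            records.append({"index": index, "time": index / fps,
                            "picture": adjusted})
            last_key = adjusted
    return records


frames = [
    [10, 10, 10, 10],
    [12, 10, 11, 10],
    [200, 200, 200, 200],
    [198, 201, 200, 200],
]
key_frames = extract_key_frames(frames, fps=2.0)
```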

Wherein, after the catalog file is obtained, the method further includes: obtaining a subtitle file from the catalog file; and broadcasting, by the playout control system, according to the subtitle file.
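The source does not name a subtitle format. As an illustration only, the sketch below writes catalog entries out in the widely used SubRip (SRT) layout, assuming each entry carries a start time, an end time, an optional name, and the recognized text. The entry keys are hypothetical.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def catalog_to_srt(entries: List[Dict]) -> str:
    """Turn catalog entries into SRT text, one numbered block per entry."""
    blocks = []
    for number, entry in enumerate(entries, start=1):
        # Prefix the speaker's name when face recognition supplied one.
        if entry.get("name"):
            text = f"{entry['name']}: {entry['text']}"
        else:
            text = entry["text"]
        start = format_timestamp(entry["start"])
        end = format_timestamp(entry["end"])
        blocks.append(f"{number}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


srt = catalog_to_srt([
    {"start": 0.0, "end": 2.5, "name": "Zhang San", "text": "Good evening."},
    {"start": 2.5, "end": 6.0, "name": None, "text": "Here is the news."},
])
```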

Compared with the prior art, the embodiments of the present invention have the following advantages. The audio and video content of the multimedia material is separated. On the one hand, key frame pictures are captured from the video file, and face images are extracted from the key frame pictures and matched against the face pictures already stored in the face database, which yields the person information corresponding to each face. On the other hand, the corresponding speech is recognized to obtain the corresponding text information. The person information obtained by face recognition and the text information are then merged according to the keyword information, so that the catalog file is generated automatically. The invention therefore no longer requires manual participation, and it improves the efficiency of cataloging, synthesizing, and processing multimedia program material. Having a computer automatically analyze the inherent biometric information of faces as the primary catalog information of the video saves a great deal of manual cataloging work. The invention improves the precision and flexibility of catalog file generation and processing, lowers system cost, reduces the error rate, and is widely applicable.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1: a schematic diagram of an automated cataloging method based on face recognition technology in Embodiment 1 of the present invention;

Fig. 2: a schematic diagram of another automated cataloging method based on face recognition technology in Embodiment 2 of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Embodiment 1 of the present invention provides an automated cataloging method based on face recognition technology. As shown in Fig. 1, it includes the following steps:

S101: Receive the face material database;

This step specifically includes: receiving a face material database, the face material database specifically including: face image information and face material text information;

S102: Receive the multimedia file;

This step specifically includes: receiving a multimedia file, the multimedia file including: a video file and an audio file;

S103: Obtain the key frame record and the corresponding key frame data picture;

This step specifically includes: obtaining a key frame record and the corresponding key frame data picture from the video file;

S104: Obtain the key frame face picture;

This step specifically includes: obtaining a key frame face picture from the key frame data picture;

S105: Obtain the matching face material text information;

This step specifically includes: querying the face image information in the face material database with the key frame face picture to obtain matching face material text information;

S106: Obtain the key frame cataloging text;

This step specifically includes: performing speech recognition on the audio file according to the key frame record to obtain key frame cataloging text;

S107: Merge the face material text information to obtain the catalog file;

This step specifically includes: merging the face material text information into the key frame cataloging text according to the key frame record to obtain a catalog file.
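The final merge step can be pictured as a join on the key frame record: the recognized names and the recognized speech that belong to the same key frame end up in one catalog entry. The sketch below assumes both inputs are dictionaries keyed by key frame index and serializes the result as JSON. The dictionary layout, the field names, and the JSON format are assumptions, since the source does not define the catalog file format.

```python
import json
from typing import Dict, List


def merge_catalog(catalog_text: Dict[int, str],
                  face_text: Dict[int, List[str]]) -> List[Dict]:
    """Join speech text and recognized names that share a key frame index."""
    merged = []
    for index in sorted(catalog_text):
        merged.append({
            "key_frame": index,
            # Key frames without a recognized face get an empty list.
            "people": face_text.get(index, []),
            "text": catalog_text[index],
        })
    return merged


catalog = merge_catalog(
    catalog_text={2: "Now to the sports results.", 0: "Good evening."},
    face_text={0: ["Zhang San"]},
)
# Keep non-ASCII names readable in the serialized catalog file.
catalog_file = json.dumps(catalog, ensure_ascii=False, indent=2)
```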

Embodiment 2 of the present invention provides another automated cataloging method based on face recognition technology. As shown in Fig. 2, it includes the following steps:

S201: Establish the face material database;

This step specifically includes: before the step of receiving the face material database, establishing the face material database;

The step of establishing the face material database specifically includes: receiving face material, the face material being identified by a face material keyword, a single face material including multi-angle material, emotion-type expression material, and speaking-type expression material; and establishing the face material database from the face material keywords and the corresponding face material;

The step of establishing the face material database specifically includes: receiving a face material three-dimensional model, the face material three-dimensional model including face control point model information and the corresponding face material three-dimensional model text information; and establishing the face material database from the face material three-dimensional model;

S202: Receive the face material database;

The face material text information specifically includes: name information;

S203: Receive the multimedia file;

S204: Obtain the key frame record and the corresponding key frame data picture;

The step of obtaining the key frame record and the corresponding key frame data picture from the video file specifically includes: receiving shooting brightness information; adjusting the video file according to the shooting brightness information;

Obtaining the key frame record and the corresponding key frame data picture from the adjusted video file;

S205: Obtain the key frame face picture;

The face image information further includes a brightness information attribute;

The step of obtaining the key frame face picture from the key frame data picture specifically includes:

Obtaining shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information from the key frame data picture;

Performing face extraction on the key frame data picture to obtain the key frame face picture;

Obtaining key frame face image information from the shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information;

S206: Obtain the matching face material text information;

The step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes:

Querying the face image information in the face material database with the key frame face picture and the key frame face image information to obtain the matching face material text information;

The step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes:

Obtaining face control point model information from the key frame face picture;

Querying the face material three-dimensional models in the face material database with the face control point model information to obtain the matching face material three-dimensional model text information;

The face control point model information specifically includes: face boundary control point model information and facial feature control point model information;

S207: Obtain the key frame cataloging text;

S208: Merge the face material text information to obtain the catalog file;

This step specifically includes: merging the face material text information into the key frame cataloging text according to the key frame record to obtain a catalog file;

S209: Obtain the subtitle file and broadcast it;

After the catalog file is obtained, the method further includes: obtaining a subtitle file from the catalog file; and broadcasting, by the playout control system, according to the subtitle file.

From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented in hardware, or in software together with a necessary general-purpose hardware platform. On this understanding, the technical solution of the present invention can be embodied as a software product, which can be stored in a non-volatile storage medium (a CD-ROM, a USB flash drive, a removable hard disk, or the like) and includes instructions that cause a computer device (a personal computer, a server, a network device, or the like) to perform the method described in each embodiment of the present invention.

Those skilled in the art will appreciate that the drawings are schematic diagrams of a preferred embodiment, and that the modules or flows in the drawings are not necessarily required to implement the present invention.

Those skilled in the art will appreciate that the modules of the device in an embodiment can be distributed in the device of the embodiment as described, or can be changed accordingly and located in one or more devices different from that of the embodiment. The modules of the above embodiments can be combined into one module or further split into multiple sub-modules.

The embodiments of the present invention are numbered for description only; the numbering does not indicate the relative merits of the embodiments.

The above discloses only several specific embodiments of the present invention. The present invention is not, however, limited to them, and any change that a person skilled in the art can conceive shall fall within the protection scope of the present invention.

Claims

1. An automated cataloging method based on face recognition technology, characterized by including:

Receiving a face material database, the face material database specifically including: face image information and face material text information;

Receiving a multimedia file, the multimedia file including: a video file and an audio file;

Obtaining a key frame record and the corresponding key frame data picture from the video file;

Obtaining a key frame face picture from the key frame data picture;

Querying the face image information in the face material database with the key frame face picture to obtain matching face material text information;

Performing speech recognition on the audio file according to the key frame record to obtain key frame cataloging text;

Merging the face material text information into the key frame cataloging text according to the key frame record to obtain a catalog file.

2. The method of claim 1, characterized in that, before the step of receiving the face material database, the method further includes: establishing the face material database.

3. The method of claim 2, characterized in that the step of establishing the face material database specifically includes:

Receiving face material, the face material being identified by a face material keyword, a single face material including: multi-angle material, emotion-type expression material, and speaking-type expression material;

Establishing the face material database from the face material keywords and the corresponding face material.

4. The method of claim 2, characterized in that the step of establishing the face material database specifically includes:

Receiving a face material three-dimensional model, the face material three-dimensional model including: face control point model information and the corresponding face material three-dimensional model text information;

Establishing the face material database from the face material three-dimensional model.

5. The method of claim 1, characterized in that the face image information further includes a brightness information attribute.

6. The method of claim 1 or 5, characterized in that the step of obtaining the key frame face picture from the key frame data picture specifically includes:

Obtaining shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information from the key frame data picture;

Performing face extraction on the key frame data picture to obtain the key frame face picture;

Obtaining key frame face image information from the shooting angle information, shooting brightness information, emotion-type expression material information, and/or speaking-type expression material information.

7. The method of claim 6, characterized in that the step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes:

Querying the face image information in the face material database with the key frame face picture and the key frame face image information to obtain the matching face material text information.

8. The method of claim 1, characterized in that the face material text information specifically includes: name information.

9. The method of claim 4, characterized in that the step of querying the face image information in the face material database with the key frame face picture to obtain matching face material text information specifically includes:

Obtaining face control point model information from the key frame face picture;

Querying the face material three-dimensional models in the face material database with the face control point model information to obtain the matching face material three-dimensional model text information.

10. The method of claim 9, characterized in that the face control point model information specifically includes: face boundary control point model information and facial feature control point model information.

11. The method of claim 1, characterized in that the step of obtaining the key frame record and the corresponding key frame data picture from the video file specifically includes:

Receiving shooting brightness information;

Adjusting the video file according to the shooting brightness information;

Obtaining the key frame record and the corresponding key frame data picture from the adjusted video file.

12. The method of claim 1, characterized in that, after the catalog file is obtained, the method further includes:

Obtaining a subtitle file from the catalog file;

Broadcasting, by the playout control system, according to the subtitle file.