CN110418076A - Video Roundup generation method, device, electronic equipment and storage medium - Google Patents
- Publication number
- CN110418076A (Application No. CN201910711992.1A)
- Authority
- CN
- China
- Prior art keywords
- video
- face
- time information
- human
- identification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present invention provide a video highlight generation method and apparatus, an electronic device, and a storage medium. The video highlight generation method includes: extracting face features from image frames of a video; performing tracking detection based on the video to obtain time information of the same person appearing in the video; when a face feature library does not store a person identifier for the face features, assigning a person identifier to the face features and storing the mapping relationship between the face features and the person identifier into the face feature library; establishing an association relationship between the person identifier and the video; and, when a face image triggering highlight generation is received, generating a video highlight containing the face image based on the time information and the association relationship.
Description
Technical Field
The present invention relates to the field of information technology, and in particular to a video highlight generation method and apparatus, an electronic device, and a storage medium.
Background
In the prior art, video highlights are generated in two ways:
the first: a dedicated team is hired to shoot the footage, which is then manually edited and synthesized;
the second: face information is entered in advance and face recognition technology is used.
The first approach lacks universality, is inefficient, and is excessively costly.
The second requires heavy user participation and advance enrollment; its recognition accuracy depends on the shooting angle and the stability of the video pictures, and its support for dynamic pictures is poor.
Disclosure of Invention
In view of this, the invention provides a video highlight generation method, a video highlight generation apparatus, an electronic device, and a storage medium.
The technical scheme of the invention is realized as follows:
A video highlight generation method, comprising:
extracting face features from image frames of a video;
performing tracking detection based on the video to obtain time information of the same person appearing in the video;
when a face feature library does not store a person identifier for the face features, assigning a person identifier to the face features, and storing the mapping relationship between the face features and the person identifier into the face feature library;
establishing an association relationship between the person identifier and the video;
and when a face image triggering the generation of a video highlight is received, generating a video highlight containing the face image based on the time information and the association relationship.
Based on the above scheme, performing tracking detection based on the video to obtain the time information of the same person appearing in the video includes:
combining face tracking and human body tracking to obtain the time information of the same person appearing in the video.
Based on the above scheme, combining face tracking and human body tracking to obtain the time information of the same person appearing in the video includes:
obtaining, based on the face tracking, first time information of the face of the same person appearing in the video;
obtaining, based on the human body tracking, second time information of the human body of the same person appearing in the video;
and merging the first time information and the second time information of the same person based on image frames in which the face and the human body appear simultaneously.
Based on the above scheme, establishing the association relationship between the person identifier and the video includes:
generating structured data containing the person identifier, the time information, and the video identifier of the video.
Based on the above scheme, assigning the person identifier to the face features includes:
assigning the person identifier to face features extracted from faces meeting a preset presentation condition.
Based on the above scheme, a face meeting the preset presentation condition satisfies at least one of the following:
the sharpness of the face is not less than a sharpness threshold;
the integrity of the face is not less than an integrity threshold;
the degree of occlusion of the face is lower than an occlusion threshold.
Based on the above scheme, when a face image triggering the generation of a video highlight is received, generating the video highlight containing the face image based on the time information and the association relationship includes:
when the face image triggering highlight generation is received, displaying prompt information for the highlight generation;
and querying the time information and the association relationship based on a confirmation operation on the prompt information, to generate a video highlight containing the face image.
Based on the above scheme, when a face image sent by an acquisition device located at a predetermined position is received, it is determined that the face image triggering highlight generation has been received;
or,
when a face image uploaded by a user device over a network is received, it is determined that the face image triggering highlight generation has been received.
Based on the above scheme, when a face image triggering the generation of a video highlight is received, generating the video highlight containing the face image based on the time information includes:
performing highlight synthesis on the video clips and on material specified by user input, to generate a video highlight containing the face image.
A video highlight generation apparatus, comprising:
an extraction module, configured to extract face features from image frames of a video;
a tracking detection module, configured to perform tracking detection based on the video to obtain time information of the same person appearing in the video;
an assignment module, configured to, when a face feature library does not store a person identifier for the face features, assign a person identifier to the face features and store the mapping relationship between the face features and the person identifier into the face feature library;
an establishing module, configured to establish an association relationship between the person identifier and the video;
and a generating module, configured to, when a face image triggering the generation of a video highlight is received, generate a video highlight containing the face image based on the time information and the association relationship.
An electronic device, comprising: a processor and a memory for storing a computer program capable of running on the processor;
wherein the processor is configured, when running the computer program, to implement the video highlight generation method provided by any of the foregoing technical solutions.
A computer storage medium having a computer program stored therein, wherein the computer program, when executed by a processor, implements the video highlight generation method provided by any of the foregoing technical solutions.
According to the technical solutions provided by the embodiments of the present invention, persons for whom a video highlight is to be generated do not need to be assigned identifiers in advance. Instead, when video is captured, faces to which no person identifier has yet been assigned in the face feature library are recognized automatically, and a unique, system-wide person identifier is assigned. The assignment runs in the background and is imperceptible to the user. The person identifier can be used to collect the time information of the same person appearing in the video, so that when a face image triggering highlight generation is subsequently received, the video segments covered by that person's time information can be found based on the face, and the video highlight can be generated. The highlight therefore needs neither manual synthesis nor prior collection and registration of face images, which simplifies highlight generation and improves its efficiency.
Drawings
Fig. 1 is a schematic flow chart of a video highlight generation method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a time information obtaining method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a video highlight generation apparatus according to an embodiment of the present invention;
fig. 4 is a schematic flowchart of a video processing method according to an embodiment of the present invention;
fig. 5 is a schematic flow chart of video search in a video highlight generation process according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Before further detailed description of the present invention, terms and expressions referred to in the embodiments of the present invention are described, and the terms and expressions referred to in the embodiments of the present invention are applicable to the following explanations.
As shown in fig. 1, the present embodiment provides a video highlight generation method, including:
step S110: extracting face features from image frames of a video;
step S120: performing tracking detection based on the video to obtain time information of the same person appearing in the video;
step S130: when a face feature library does not store a person identifier for the face features, assigning a person identifier to the face features, and storing the mapping relationship between the face features and the person identifier into the face feature library;
step S140: establishing an association relationship between the person identifier and the video;
step S150: when a face image triggering the generation of a video highlight is received, generating a video highlight containing the face image based on the time information and the association relationship.
The video highlight generation method provided by this embodiment can be applied to a video platform; for example, the video platform includes one or more video devices.
The video to be stored can be collected by one or more devices, and can be a live video stream or a recorded video awaiting storage.
Face features are extracted from the image frames of the video through face recognition; the face features include features of the face as a whole and features of the individual facial parts (eyes, eyebrows, nose, mouth, and ears).
A face feature library can be provided in the video platform; it stores face features and, for each, a person identifier that is unique across the whole system. Face feature matching can determine whether extracted features belong to an unknown face, i.e. one to which the face feature library has not yet assigned a person identifier; if a person identifier has already been assigned, the face is not unknown. When the currently extracted face features are determined to come from a known face, that is, a person identifier has already been assigned to them in the face feature library, step S140 or S150 may be performed directly. The person identifier in the embodiments of this application is unique within the face feature library and assigned according to the library's own allocation rule; it can differ from real-life identifiers such as an identity-card number or a passport number.
That is, in some embodiments, the method further includes: matching the currently extracted face features against the faces in the face feature library; when a stored face feature whose similarity exceeds a first matching threshold is found, determining that the face feature library has already assigned a person identifier to the currently extracted face features; otherwise, assigning a person identifier to the currently extracted face features.
In some embodiments, if matching shows that the currently extracted face features have already been assigned a person identifier, the face feature library may be updated with the current face features.
For example, if the matching degree between the currently extracted face features and the best-matching face features in the library is greater than the first matching threshold but lower than a second matching threshold, the current face features are added to the correspondence between the best-matching face features and the person identifier, to improve matching of this face in subsequent queries, as the sketch below illustrates.
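To make the two-threshold rule concrete, here is a minimal sketch in Python. It assumes face features are L2-normalized embedding vectors compared by cosine similarity; the class and threshold names (FaceFeatureLibrary, first_match_threshold, second_match_threshold) are illustrative, not taken from the patent.

```python
import uuid
import numpy as np

class FaceFeatureLibrary:
    """Minimal sketch of the face feature library: person ID -> stored embeddings."""

    def __init__(self, first_match_threshold=0.75, second_match_threshold=0.9):
        self.first_match_threshold = first_match_threshold    # identity decision
        self.second_match_threshold = second_match_threshold  # library-update decision
        self.features = {}  # person_id -> list of L2-normalized feature vectors

    def best_match(self, feature):
        """Return (person_id, similarity) of the closest stored feature."""
        best_id, best_sim = None, -1.0
        for person_id, vectors in self.features.items():
            for v in vectors:
                sim = float(np.dot(feature, v))  # cosine similarity for unit vectors
                if sim > best_sim:
                    best_id, best_sim = person_id, sim
        return best_id, best_sim

    def lookup_or_assign(self, feature):
        """Apply the two-threshold rule described above."""
        person_id, sim = self.best_match(feature)
        if person_id is not None and sim > self.first_match_threshold:
            # Known face; if the match is good but not a near-duplicate,
            # store the new feature to improve matching in later queries.
            if sim < self.second_match_threshold:
                self.features[person_id].append(feature)
            return person_id
        # Unknown face: assign a new, system-wide unique person identifier.
        new_id = uuid.uuid4().hex
        self.features[new_id] = [feature]
        return new_id
```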
In this embodiment, even an unknown face is assigned a person identifier, so that through subsequent video slicing the person can enter a video highlight, making it convenient to later synthesize a highlight directly from video segments based on the person identifier and the face features.
In this embodiment, the time information may be one or more time points indicating when the face image appears in the video.
In some embodiments, the time information may include timeline information; the timeline may be constructed from one or more time points and one or more time segments, arranged on a time axis in chronological order.
In other embodiments, the time information may include time zone information, which records the time zones in which the same person appears in the video, for example with 15 seconds or 24 seconds as one time zone.
In some embodiments, step S120 may track the same person based on face detection alone.
In other embodiments, step S120 may include: combining face tracking and human body tracking to obtain the time information of the same person appearing in the video.
When acquisition devices are sparsely deployed, some captured faces are incomplete, or only a person's back is captured. To improve the tracking success rate, this embodiment therefore combines face tracking and human body tracking to obtain the time information of the same person appearing in the video.
Human body tracking can be performed to avoid losing the same person when the face is poorly presented. For example, body tracking is performed based on appearance features and/or geometric features of the human body.
The geometric features include body contour, height, build proportions, and the like.
The appearance features include clothing, accessories, and the like.
In short, in this application, human body tracking supplements face tracking to keep following a person through the video, reducing the loss of persons that occurs under face tracking alone and thereby achieving accurate, continuous tracking of the same person.
Further, as shown in fig. 2, step S120 may include:
step S121: obtaining, based on the face tracking, first time information of the face of the same person appearing in the video;
step S122: obtaining, based on the human body tracking, second time information of the human body of the same person appearing in the video;
step S123: merging the first time information and the second time information of the same person based on image frames in which the face and the human body appear simultaneously.
For example, through face tracking, the time information of person A appearing in the video is found as first time information, which forms a first time set; through human body tracking, the time information of person A appearing in the video is found as second time information, which forms a second time set. Merging the first time set and the second time set yields all the time information of the same person in the video, reducing omissions and lost tracks, as the sketch below illustrates.
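The merge of the first and second time sets can be illustrated with the following sketch, which represents each time set as a list of (start, end) intervals in seconds and unions overlapping intervals; the interval representation is an assumption made for illustration, not a format prescribed by the patent.

```python
def merge_time_sets(first_time_set, second_time_set):
    """Union two lists of (start, end) intervals into one sorted, merged timeline."""
    intervals = sorted(first_time_set + second_time_set)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Person A as seen by face tracking and by body tracking (which bridges occlusions):
face_times = [(10.0, 14.5), (30.0, 33.0)]
body_times = [(12.0, 20.0), (60.0, 65.0)]
print(merge_time_sets(face_times, body_times))
# -> [(10.0, 20.0), (30.0, 33.0), (60.0, 65.0)]
```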
In some embodiments, establishing the association relationship between the person identifier and the video includes: generating structured data containing the person identifier, the time information, and the video identifier of the video.
In this embodiment, structured data containing the person identifier, the time information, and the identifier of the video in which the corresponding person appears facilitates subsequent fast query and retrieval, as the record sketch below shows.
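A minimal sketch of such a structured record, assuming one record per person-video pair; the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PersonVideoRecord:
    """Structured data linking a person identifier to a video, as described above."""
    person_id: str                        # unique identifier from the face feature library
    video_id: str                         # identifier of the stored video
    time_info: List[Tuple[float, float]]  # merged (start, end) segments, in seconds

# Example record for person A appearing twice in a hypothetical stored video:
record = PersonVideoRecord("a1b2c3", "park_cam_03_20190802", [(10.0, 20.0), (30.0, 33.0)])
```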
In some embodiments, assigning the person identifier to the face features includes:
assigning the person identifier to face features extracted from faces meeting a preset presentation condition.
To ensure that the faces in the face feature library are clear enough for subsequent accurate identifier queries, in this embodiment the face features chosen for identifier assignment, among the multiple extracted face features, are those extracted from faces meeting the preset presentation condition.
Further, a face meeting the preset presentation condition satisfies at least one of the following:
the sharpness of the face is not less than a sharpness threshold;
the integrity of the face is not less than an integrity threshold;
the degree of occlusion of the face is lower than an occlusion threshold.
The sharpness threshold may be a preset value; in a blurred image, the face image is blurred as well.
A face showing only a sliver of profile may not be recognizable as any particular person. For example, if a complete frontal face has an integrity of 1, the integrity threshold is a value between 0 and 1, such as 0.6, 0.7, 0.8, or 0.9.
During video acquisition people are moving, so they may be occluded by objects or by other people; occlusion is therefore assessed to ensure that the image frames in the video highlight all include the same person. The degree of occlusion can likewise be any value from 0 to 1, and the occlusion threshold may be 0.6, 0.7, 0.8, 0.9, etc. The sketch after this paragraph expresses the check.
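The presentation condition can be expressed as a simple quality gate, sketched below. The patent allows any subset of the three criteria ("at least one of"); this sketch checks all three jointly, and the score fields and threshold values are illustrative assumptions, with how the scores are computed left to the face detector.

```python
def meets_presentation_condition(face,
                                 sharpness_threshold=0.6,
                                 integrity_threshold=0.8,
                                 occlusion_threshold=0.6):
    """Return True if a detected face is good enough for identifier assignment.

    `face` is assumed to expose sharpness, integrity, and occlusion scores
    in [0, 1], as produced by the face detector (an assumption of this sketch).
    """
    return (face["sharpness"] >= sharpness_threshold
            and face["integrity"] >= integrity_threshold
            and face["occlusion"] < occlusion_threshold)

# A sharp, nearly frontal, unoccluded face passes the gate:
print(meets_presentation_condition(
    {"sharpness": 0.9, "integrity": 0.95, "occlusion": 0.1}))  # True
```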
In some embodiments, step S150 may include:
when a face image triggering the generation of a video highlight is received, displaying prompt information for the highlight generation;
and querying the time information and the association relationship based on a confirmation operation on the prompt information, to generate a video highlight containing the face image.
In this way, the user is notified via the prompt information that a highlight can be generated, while the highlight is only generated upon a confirmation operation on the prompt, so that it is generated only when the user actually wants one.
In some embodiments, the video highlight is not generated if a rejection operation on the prompt information is received.
In some embodiments, to reduce the data storage load of the media asset library, the stored videos or video segments may be compressed at predetermined intervals.
In some embodiments, the method further includes:
sending the user device a deletion prompt covering the video segments of the corresponding person, and, if a confirmation operation on the deletion prompt is received, deleting the corresponding video segments and/or the corresponding time information.
In other embodiments, the prompt information is output together with the deletion prompt; if a confirmation operation on the deletion prompt is detected, no highlight is generated and the corresponding video segments and/or time information are deleted.
In still other embodiments, step S150 may include:
when a face image triggering highlight generation is received, displaying, based on the time information and the association relationship, the queried information on the video clips that include the person corresponding to the face image;
and generating the video highlight upon detecting a confirmation operation on the video clip information.
In some embodiments, when a face image sent by an acquisition device located at a predetermined position is received, it is determined that the face image triggering highlight generation has been received.
For example, suppose a user visits an amusement park, is captured by cameras at different positions in the park, and the video is stored in the media asset library using the method above. When the user leaves, the system receives a face image captured by the camera at the park exit; this is the face image that triggers highlight generation. To avoid unnecessary generation, however, prompt information or the like is output asking the user to confirm whether to generate the video highlight.
In some embodiments, a user leaving a specific place such as an amusement park may not yet know whether a highlight is wanted, but may still want one after going home or after some time; the user can then upload his or her face image from a mobile phone, tablet, notebook, or other user terminal to trigger online generation of the video highlight.
Therefore, in some embodiments, when a face image uploaded by a user device over a network is received, it is determined that the face image triggering highlight generation has been received.
In some embodiments, step S150 may include: performing highlight synthesis on the video clips and on material specified by user input, to generate a video highlight containing the face image.
When generating the video highlight, the user may specify material to be added, including but not limited to audio material, special-effect material, and the like.
In some embodiments, synthesizing a video highlight from video segments may include:
merging the video clips into one video file in a certain play order according to a highlight generation strategy.
For example, the videos are merged into one video file in order of the clips' capture times, from earliest to latest or from latest to earliest.
As another example, the clips are merged into a video file according to the landmark events they contain and the video effect those events should produce.
When the highlight is generated based on landmark events, the clips are merged in combination with the video effect the highlight requires. The video effects may include a warm and romantic effect, a humorous effect, a funny effect, and the like. For a funny effect, clips with different capture times may be deliberately played in reverse order, composing a highlight with comic intent.
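Here is a sketch of the chronological merge strategy, assuming each clip exists as a file with a known capture time; the ffmpeg concat demuxer is one common way to join clips without re-encoding and is an implementation choice of this sketch, not a tool prescribed by the patent.

```python
import subprocess
import tempfile

def synthesize_highlight(clips, output_path, reverse=False):
    """Merge clips into one file in capture-time order (reverse=True for a comic effect).

    `clips` is a list of (capture_time, file_path) tuples -- an assumed format.
    """
    ordered = sorted(clips, reverse=reverse)  # earliest-first, or latest-first
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        for _, path in ordered:
            f.write(f"file '{path}'\n")  # one concat-demuxer list entry per clip
        list_file = f.name
    # Join the clips without re-encoding using ffmpeg's concat demuxer.
    subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0",
                    "-i", list_file, "-c", "copy", output_path], check=True)

# e.g. synthesize_highlight([(1564712400, "a.mp4"), (1564716000, "b.mp4")], "out.mp4")
```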
As shown in fig. 3, this embodiment further provides a video highlight generation apparatus, including:
an extraction module 110, configured to extract face features from image frames of a video;
a tracking detection module 120, configured to perform tracking detection based on the video to obtain time information of the same person appearing in the video;
an assignment module 130, configured to, when a face feature library does not store a person identifier for the face features, assign a person identifier to the face features and store the mapping relationship between the face features and the person identifier into the face feature library;
an establishing module 140, configured to establish an association relationship between the person identifier and the video;
and a generating module 150, configured to, when a face image triggering the generation of a video highlight is received, generate a video highlight containing the face image based on the time information and the association relationship.
In some embodiments, the extraction module 110, tracking detection module 120, assignment module 130, establishing module 140, and generating module 150 may be program modules which, when executed by a processor, perform the functions of the above functional modules.
In other embodiments, these modules may be combined software-hardware modules, which may include various programmable arrays; programmable arrays include, but are not limited to, complex programmable arrays and field-programmable arrays.
In still other embodiments, the modules may be pure hardware modules; pure hardware modules include, but are not limited to, complex programmable arrays.
In some embodiments, the tracking detection module 120 is specifically configured to combine face tracking and human body tracking to obtain the time information of the same person appearing in the video.
In some embodiments, the tracking detection module 120 is specifically configured to obtain, based on the face tracking, first time information of the face of the same person appearing in the video; obtain, based on the human body tracking, second time information of the human body of the same person appearing in the video; and merge the first time information and the second time information of the same person based on image frames in which the face and the human body appear simultaneously.
In some embodiments, the establishing module 140 is specifically configured to generate structured data containing the person identifier, the time information, and the video identifier of the video.
In some embodiments, the assignment module 130 is specifically configured to assign the person identifier to face features extracted from faces meeting a preset presentation condition.
In some embodiments, a face meeting the preset presentation condition satisfies at least one of the following:
the sharpness of the face is not less than a sharpness threshold;
the integrity of the face is not less than an integrity threshold;
the degree of occlusion of the face is lower than an occlusion threshold.
In some embodiments, the generating module 150 is specifically configured to display prompt information for highlight generation when a face image triggering the generation of a video highlight is received, and to query the time information and the association relationship based on a confirmation operation on the prompt information, to generate a video highlight containing the face image.
In some embodiments, when a face image sent by an acquisition device located at a predetermined position is received, it is determined that the face image triggering highlight generation has been received; or, when a face image uploaded by a user device over a network is received, it is determined that the face image triggering highlight generation has been received.
In some embodiments, the generating module 150 is specifically configured to perform highlight synthesis on the video segments and on material specified by user input, to generate a video highlight containing the face image.
One specific example is provided below in connection with any of the embodiments described above:
this example provides a video highlight processing system, which may include:
a first video processing module, used for processing before videos are stored, and a video search module, used for retrieving video segments when a face image requiring highlight generation is received;
and a second video processing module, usable for functions such as face recognition, human body tracking detection, and video segment synthesis.
As shown in fig. 4, the video processing of the first video processing module may include:
collecting videos or live streams shot by multiple cameras and other acquisition devices (in fig. 4 these are acquisition device 1 through acquisition device N, where N is any positive integer), generating the corresponding material according to certain logic, and attaching person tags; this video processing mainly covers video/live-stream collection and encoding/decoding.
Face detection and comparison: performing face detection on a video segment (region, angle, blur, occlusion, and the like), forming a pre-screened slice (the timeline of a person appearing in the video segment), creating a unique person identifier (ID), and storing the person's face information in the face feature library.
Face detection plus human body tracking specifically means: human body tracking detection serves as an auxiliary means for face recognition when face recognition fails because the face is blurred, occluded, or otherwise degraded. For example, it can determine whether a person appeared within a closely adjacent time period. This works mainly because human body tracking relies more on the body contour, so it remains usable even when face recognition is inaccurate.
Face recognition yields a person identifier, which is unique and is stored in the face feature library in correspondence with the face features; that is, the mapping relationship between the face features and the person identifier is stored in the face feature library.
Video & tag structured data may include: slicing the video along the timeline, tagging each slice with the person ID to form video material, and storing the material in the media asset library for subsequent search.
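The slicing step might look like the sketch below, which cuts one file per (start, end) segment of the person's timeline with ffmpeg and returns material records tagged with the person ID; the paths, naming scheme, and record layout are illustrative assumptions.

```python
import subprocess

def slice_and_tag(video_path, video_id, person_id, time_info, out_dir="assets"):
    """Cut the video along the person's timeline and tag each slice with the person ID."""
    materials = []
    for i, (start, end) in enumerate(time_info):
        out_path = f"{out_dir}/{video_id}_{person_id}_{i}.mp4"
        # Stream-copy the segment [start, end) into its own file.
        subprocess.run(["ffmpeg", "-i", video_path, "-ss", str(start),
                        "-to", str(end), "-c", "copy", out_path], check=True)
        materials.append({"person_id": person_id, "video_id": video_id,
                          "segment": (start, end), "path": out_path})
    return materials  # to be stored in the media asset library
```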
As shown in fig. 5, in the video search, face recognition finds an identifier identical or similar to that of the current user, i.e. the person identifier (ID) described above, and the related material videos are then retrieved from the media asset library; on this basis, after the user independently selects material, the video synthesis capability is used to quickly generate the highlight video.
The mapping relationship between the face features and the person identifiers is stored in the face feature library, and the videos are stored in the media asset library.
Face detection and comparison: detect the face of the current person and compare it against the face feature library to obtain a person identifier (ID) similar to or consistent with the current user, for the next step of processing.
Video search: search the media asset library, via the person ID, for media asset data carrying the same tag, for the next step of processing.
Video synthesis can rely on a self-built video synthesis capability: the user can independently select video clips, background music, transitions, filters, or auxiliary material such as subtitles, and highlight videos such as video highlights are synthesized automatically.
The video highlight generation method provided by this example can identify unknown faces without registering faces in advance or labeling them manually; for the user, the whole process only requires showing the face at the terminal and independently selecting the material to be synthesized.
Face detection, comparison, and human body tracking are combined: blurring and occlusion in motion scenes degrade face recognition and can cause video segments of the relevant user to be missed; combining face recognition with human body tracking mitigates this problem.
The present embodiment further provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and when being executed by a processor, the computer program implements the video highlight generation method provided in any of the foregoing technical solutions; for example, at least one of the methods shown in fig. 1-2, 4, and 5.
The present embodiment also provides an electronic device, including: a processor and a memory for storing a computer program capable of running on the processor; when the processor is used to run the computer program, the method for generating the video highlights, provided by any of the foregoing technical solutions, is implemented, for example, at least one of the methods shown in fig. 1 to fig. 2, fig. 4 and fig. 5.
The electronic device may further include: at least one network interface. The various components in the electronic device are coupled together by a bus system. It will be appreciated that the bus system is used to enable connection and communication among these components. In addition to a data bus, the bus system includes a power bus, a control bus, and a status signal bus; for clarity of illustration, however, the various buses are all referred to as the bus system.
The memory may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memories described in the embodiments of the present invention are intended to include, without being limited to, these and any other suitable types of memory.
The memory in embodiments of the present invention is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device, such as an operating system and application programs. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs may include various application programs for implementing various application services. Here, the program that implements the method of the embodiment of the present invention may be included in an application program.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.
Claims (12)
1. A video highlight generation method, characterized by comprising:
extracting face features from image frames of a video;
performing tracking detection based on the video to obtain time information of the same person appearing in the video;
when a face feature library does not store a person identifier for the face features, assigning a person identifier to the face features, and storing the mapping relationship between the face features and the person identifier into the face feature library;
establishing an association relationship between the person identifier and the video;
and when a face image triggering the generation of a video highlight is received, generating a video highlight containing the face image based on the time information and the association relationship.
2. The method of claim 1, wherein performing tracking detection based on the video to obtain time information of the same person appearing in the video comprises:
combining face tracking and human body tracking to obtain the time information of the same person appearing in the video.
3. The method of claim 2,
wherein combining face tracking and human body tracking to obtain the time information of the same person appearing in the video comprises:
obtaining, based on the face tracking, first time information of the face of the same person appearing in the video;
obtaining, based on the human body tracking, second time information of the human body of the same person appearing in the video;
and merging the first time information and the second time information of the same person based on image frames in which the face and the human body appear simultaneously.
4. The method of claim 1, wherein establishing the association relationship between the person identifier and the video comprises:
generating structured data containing the person identifier, the time information, and the video identifier of the video.
5. The method of claim 1 or 2, wherein assigning the person identifier to the face features comprises:
assigning the person identifier to face features extracted from faces meeting a preset presentation condition.
6. The method according to claim 5, wherein a face meeting the preset presentation condition satisfies at least one of the following:
the sharpness of the face is not less than a sharpness threshold;
the integrity of the face is not less than an integrity threshold;
the degree of occlusion of the face is lower than an occlusion threshold.
7. The method according to claim 1 or 2, wherein generating a video highlight containing the face image based on the time information and the association relationship when a face image triggering the generation of the video highlight is received comprises:
when the face image triggering highlight generation is received, displaying prompt information for the highlight generation;
and querying the time information and the association relationship based on a confirmation operation on the prompt information, to generate a video highlight containing the face image.
8. The method according to claim 1 or 2, wherein, when a face image sent by an acquisition device located at a predetermined position is received, it is determined that the face image triggering highlight generation has been received;
or,
when a face image uploaded by a user device over a network is received, it is determined that the face image triggering highlight generation has been received.
9. The method according to claim 1 or 2, wherein generating a video highlight containing the face image based on the time information when a face image triggering the generation of the video highlight is received comprises:
performing highlight synthesis on the video clips and on material specified by user input, to generate a video highlight containing the face image.
10. A video highlight generation apparatus, comprising:
an extraction module, configured to extract face features from image frames of a video;
a tracking detection module, configured to perform tracking detection based on the video to obtain time information of the same person appearing in the video;
an assignment module, configured to, when a face feature library does not store a person identifier for the face features, assign a person identifier to the face features and store the mapping relationship between the face features and the person identifier into the face feature library;
an establishing module, configured to establish an association relationship between the person identifier and the video;
and a generating module, configured to, when a face image triggering the generation of a video highlight is received, generate a video highlight containing the face image based on the time information and the association relationship.
11. An electronic device, comprising: a processor and a memory for storing a computer program capable of running on the processor;
wherein the processor is configured to implement the method provided in any one of claims 1 to 9 when running the computer program.
12. A computer storage medium in which a computer program is stored, which computer program, when being executed by a processor, carries out the method as set forth in any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910711992.1A CN110418076A (en) | 2019-08-02 | 2019-08-02 | Video Roundup generation method, device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910711992.1A CN110418076A (en) | 2019-08-02 | 2019-08-02 | Video Roundup generation method, device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110418076A true CN110418076A (en) | 2019-11-05 |
Family
ID=68365472
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910711992.1A Pending CN110418076A (en) | 2019-08-02 | 2019-08-02 | Video Roundup generation method, device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110418076A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101236599A (en) * | 2007-12-29 | 2008-08-06 | 浙江工业大学 | Human face recognition detection device based on multi- video camera information integration |
CN101854516A (en) * | 2009-04-02 | 2010-10-06 | 北京中星微电子有限公司 | Video monitoring system, video monitoring server and video monitoring method |
CN106375872A (en) * | 2015-07-24 | 2017-02-01 | 三亚中兴软件有限责任公司 | Method and device for video editing |
CN105224925A (en) * | 2015-09-30 | 2016-01-06 | 努比亚技术有限公司 | Video process apparatus, method and mobile terminal |
CN107358146A (en) * | 2017-05-22 | 2017-11-17 | 深圳云天励飞技术有限公司 | Method for processing video frequency, device and storage medium |
CN107292240A (en) * | 2017-05-24 | 2017-10-24 | 深圳市深网视界科技有限公司 | It is a kind of that people's method and system are looked for based on face and human bioequivalence |
CN109426787A (en) * | 2017-08-31 | 2019-03-05 | 杭州海康威视数字技术股份有限公司 | A kind of human body target track determines method and device |
CN108171207A (en) * | 2018-01-17 | 2018-06-15 | 百度在线网络技术(北京)有限公司 | Face identification method and device based on video sequence |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI817014B (en) * | 2019-11-25 | 2023-10-01 | 仁寶電腦工業股份有限公司 | Method, system and storage medium for providing a timeline-based graphical user interface |
CN111356022A (en) * | 2020-04-18 | 2020-06-30 | 徐琼琼 | Video file processing method based on voice recognition |
CN111506770A (en) * | 2020-04-22 | 2020-08-07 | 新华智云科技有限公司 | Interview video collection generation method and system |
CN111506770B (en) * | 2020-04-22 | 2023-10-27 | 新华智云科技有限公司 | Interview video gathering generation method and system |
CN111627470A (en) * | 2020-05-29 | 2020-09-04 | 深圳市天一智联科技有限公司 | Video editing method, device, storage medium and equipment |
CN114153342A (en) * | 2020-08-18 | 2022-03-08 | 深圳市万普拉斯科技有限公司 | Visual information display method and device, computer equipment and storage medium |
CN112291574A (en) * | 2020-09-17 | 2021-01-29 | 上海东方传媒技术有限公司 | Large-scale sports event content management system based on artificial intelligence technology |
CN112182297A (en) * | 2020-09-30 | 2021-01-05 | 北京百度网讯科技有限公司 | Training information fusion model, and method and device for generating collection video |
CN113542791A (en) * | 2021-07-08 | 2021-10-22 | 山东云缦智能科技有限公司 | Personalized video collection generation method |
CN113938747A (en) * | 2021-10-15 | 2022-01-14 | 深圳市智此一游科技服务有限公司 | Video generation method and device and server |
WO2023241377A1 (en) * | 2022-06-16 | 2023-12-21 | 北京字跳网络技术有限公司 | Video data processing method and device, equipment, system, and storage medium |
CN115439796A (en) * | 2022-11-09 | 2022-12-06 | 江西省天轴通讯有限公司 | Specific area personnel tracking and identifying method, system, electronic equipment and storage medium |
CN116055762A (en) * | 2022-12-19 | 2023-05-02 | 北京百度网讯科技有限公司 | Video synthesis method and device, electronic equipment and storage medium |
CN117979123A (en) * | 2024-03-29 | 2024-05-03 | 江西省亿发姆科技发展有限公司 | Video gathering generation method and device for travel record and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110418076A (en) | Video Roundup generation method, device, electronic equipment and storage medium | |
CN107633241B (en) | Method and device for automatically marking and tracking object in panoramic video | |
JP5355422B2 (en) | Method and system for video indexing and video synopsis | |
Pan et al. | Automatic detection of replay segments in broadcast sports programs by detection of logos in scene transitions | |
US11393208B2 (en) | Video summarization using selected characteristics | |
CN103581705A (en) | Method and system for recognizing video program | |
CN111368622A (en) | Personnel identification method and device, and storage medium | |
CN110933299B (en) | Image processing method and device and computer storage medium | |
WO2012064532A1 (en) | Aligning and summarizing different photo streams | |
CN112118395B (en) | Video processing method, terminal and computer readable storage medium | |
CN112800255A (en) | Data labeling method, data labeling device, object tracking method, object tracking device, equipment and storage medium | |
CN105159959A (en) | Image file processing method and system | |
CN111241872A (en) | Video image shielding method and device | |
CN113709545A (en) | Video processing method and device, computer equipment and storage medium | |
CN114339423A (en) | Short video generation method and device, computing equipment and computer readable storage medium | |
CN108540817B (en) | Video data processing method, device, server and computer readable storage medium | |
CN113297499A (en) | Information recommendation system, method, computer equipment and storage medium | |
CN110457998B (en) | Image data association method and apparatus, data processing apparatus, and medium | |
WO2018171234A1 (en) | Video processing method and apparatus | |
CN113784058A (en) | Image generation method and device, storage medium and electronic equipment | |
JP2005191892A (en) | Information acquisition device and multi-media information preparation system using it | |
WO2014092553A2 (en) | Method and system for splitting and combining images from steerable camera | |
CN116366913B (en) | Video playing method, computer equipment and storage medium | |
JP2005039354A (en) | Metadata input method and editing system | |
CN115858854B (en) | Video data sorting method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20191105 |