CN111666447A - Content-based three-dimensional CG animation searching method and device - Google Patents


Info

Publication number
CN111666447A
Authority
CN
China
Prior art keywords: video, shot, frames, clustering, extracting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010506909.XA
Other languages
Chinese (zh)
Inventor
刘潇峰 (Liu Xiaofeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhenjiang Aoyou Network Technology Co ltd
Original Assignee
Zhenjiang Aoyou Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhenjiang Aoyou Network Technology Co., Ltd.
Priority to CN202010506909.XA
Publication of CN111666447A

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
                    • G06F 16/70 Information retrieval of video data
                        • G06F 16/71 Indexing; Data structures therefor; Storage structures
                        • G06F 16/75 Clustering; Classification
                        • G06F 16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
                            • G06F 16/783 Retrieval characterised by using metadata automatically derived from the content
            • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V 20/00 Scenes; Scene-specific elements
                    • G06V 20/40 Scenes; Scene-specific elements in video content
                        • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
                        • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The invention relates to the field of computer technology, and in particular to a content-based three-dimensional CG animation searching method and device. The method comprises the following steps: structured analysis of the video; shot segmentation; key frame extraction; feature extraction; index construction; and querying. By combining video structural analysis, video shot segmentation, key frame extraction, feature extraction, video similarity measurement and video query, the invention enables the analysis and processing of three-dimensional CG animation whose data volume is huge and whose content is ambiguous, so that developers can quickly and accurately retrieve the animation they need, improving product development efficiency and reducing development cost.

Description

Content-based three-dimensional CG animation searching method and device
Technical Field
The invention relates to the technical field of computers, in particular to a content-based three-dimensional CG animation searching method and device.
Background
With the rapid development of information and network technology, multimedia data, and video data in particular, continue to accumulate, and CG animation resources are increasingly abundant. However, these resources are poorly integrated, so they are used inefficiently. CG animation data has a complex structure and rich video content, which makes effective analysis and processing of such videos very difficult; three-dimensional CG animation search aims to find the required videos within this massive video data.
In view of these problems, the inventor, drawing on years of practical engineering experience and professional knowledge with such products and combining theory with application, has actively researched and innovated to create a more practical content-based three-dimensional CG animation searching method and device.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in the multimedia data available on the internet, the volume of CG animation resources is huge, the data structure is complex, and the video content is very rich, making effective analysis and retrieval of CG animation very difficult.
To achieve this purpose, the invention adopts the following technical scheme: a content-based three-dimensional CG animation searching method and device. The content-based three-dimensional CG animation searching method comprises the following steps:
step 1, video structural analysis, wherein video data is divided into the following layers according to the levels: video sequence, scene, shot, image frame;
step 2, shot segmentation, namely segmenting a video into a plurality of video shots;
step 3, extracting key frames, and selecting a plurality of image frames from each shot to represent the main visual content of the shot after the shot segmentation is finished;
step 4, feature extraction, namely extracting motion information from the shot and extracting visual feature information from the key frame on the basis of shot segmentation and key frame extraction;
step 5, forming an index, and storing the characteristics into a characteristic database of the retrieval system to form the index;
and step 6, querying, namely performing similarity measurement between the query request submitted by the user and the descriptions and representations of the videos, and presenting the results to the user in descending order of similarity.
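As a rough illustration of how steps 1 to 6 compose, the following is a minimal, self-contained Python sketch. All function names, the histogram-difference shot cut, the middle-frame key frame rule and the thresholds are assumptions for demonstration; the patent does not fix any of them.

```python
# Illustrative sketch of the six-step pipeline: structure -> shots ->
# key frames -> features -> index -> query. Names and thresholds are
# assumptions, not taken from the patent.

def frame_histogram(frame, bins=4):
    """Normalized gray-level histogram; a frame is a flat list of 0-255 values."""
    hist = [0] * bins
    for p in frame:
        hist[min(p * bins // 256, bins - 1)] += 1
    return [h / len(frame) for h in hist]

def segment_shots(frames, threshold=0.5):
    """Step 2: cut where the L1 histogram difference between consecutive frames is large."""
    shots, start = [], 0
    for i in range(1, len(frames)):
        d = sum(abs(a - b) for a, b in zip(frame_histogram(frames[i - 1]),
                                           frame_histogram(frames[i])))
        if d > threshold:
            shots.append((start, i))
            start = i
    shots.append((start, len(frames)))
    return shots

def key_frame(shot):
    """Step 3, radically simplified: the middle frame stands for the shot."""
    lo, hi = shot
    return (lo + hi) // 2

def build_index(frames):
    """Steps 4-5: one (key frame id, feature) entry per shot."""
    return [(key_frame(s), frame_histogram(frames[key_frame(s)]))
            for s in segment_shots(frames)]

def query(index, example):
    """Step 6: rank key frames by histogram similarity, best first."""
    qh = frame_histogram(example)
    scored = [(1 - sum(abs(a - b) for a, b in zip(h, qh)) / 2, fid)
              for fid, h in index]
    return sorted(scored, reverse=True)

# Two synthetic "shots": five dark frames, then five bright frames.
video = [[10] * 64 for _ in range(5)] + [[240] * 64 for _ in range(5)]
index = build_index(video)
ranked = query(index, [245] * 64)
```

Querying with a bright example frame ranks the key frame of the bright shot first, mirroring the descending-similarity presentation of step 6.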
Further, the key frame extraction adopts an automatic extraction algorithm based on spatio-temporal slice clustering, and the spatio-temporal slices are formed by combining one row and/or column of pixels extracted from the same position of a continuous video image sequence.
Further, the automatic extraction algorithm based on spatio-temporal slice clustering firstly clusters spatio-temporal slices of a video to form a sub-shot, and key frames are extracted from the sub-shot;
the clustering algorithm comprises the following steps: selecting an initial clustering center, dividing a video slice into a plurality of equal parts, setting a process variable related to the total number of video frames and the number of the clustering centers, and defining the clustering centers through the process variable; changing the clustering center, calculating the mean value of all samples in each class, and then finding out the sample closest to the mean value in each class as a new clustering center; calculating the distance, namely calculating the distance between frames according to the number of samples and the time sequence between the samples; and selecting the number of the clustering centers, and automatically selecting the optimal number of the clustering centers for different videos.
Further, the automatic extraction algorithm based on spatio-temporal slice clustering comprises the following steps:
step 1, convert the video image frames to grayscale, then extract a horizontal spatio-temporal slice of the shot;
step 2, clustering the video space-time slices;
step 3, handle short runs in the clustering result: when fewer than N consecutive frames (N = 10) are grouped into a class and the classes on both adjacent sides have the same color, merge the three into one class; if the classes on the two sides differ in color, assign the short class to whichever of the two has the nearer cluster center;
step 4, extracting candidate key frames, and taking the frame with the maximum image information entropy in each class as a candidate key frame;
and step 5, key frame extraction: select the final key frames from the candidates; when the edge histogram difference between two adjacent candidate key frames is smaller than a threshold, the two frames are redundant, so the redundancy is removed and the final key frames are extracted.
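Steps 1, 3 and 4 can be sketched as follows. The luma weights, the run-merging rule and the entropy measure are standard choices used here as assumptions; the different-color case of step 3, which needs the cluster centers, is omitted from the merge helper.

```python
import math

def to_gray(frame_rgb):
    """Step 1: integer luma approximation; a frame is rows of (r, g, b) tuples."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in frame_rgb]

def horizontal_slice(gray_frames, row):
    """A horizontal spatio-temporal slice: the same pixel row from every frame."""
    return [f[row] for f in gray_frames]

def merge_short_runs(labels, n_min=10):
    """Step 3, same-color case only: a run of fewer than n_min frames whose
    two neighbouring runs share a label is absorbed into that label."""
    runs = []
    for l in labels:
        if runs and runs[-1][0] == l:
            runs[-1][1] += 1
        else:
            runs.append([l, 1])
    out = []
    for i, (l, length) in enumerate(runs):
        if (length < n_min and 0 < i < len(runs) - 1
                and runs[i - 1][0] == runs[i + 1][0]):
            l = runs[i - 1][0]
        out.extend([l] * length)
    return out

def image_entropy(gray):
    """Step 4: Shannon entropy of the gray-level distribution; the frame with
    the largest entropy in each class becomes a candidate key frame."""
    flat = [p for row in gray for p in row]
    n = len(flat)
    counts = {}
    for p in flat:
        counts[p] = counts.get(p, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```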
Further, the key frame extraction determines the number of key frames according to the dynamics of the video content, and extracts key frames at the different levels of the video's structural decomposition.
Further, the similarity measure includes one or more of: feature similarity, order similarity, and time-span.
Further, at the frame level, similarity is measured using block-based color histograms and/or the inter-image distance methods of content-based image retrieval; at the shot level, shot similarity is measured from low-level key frame features, shot motion, or object features.
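One common way to realize the frame-level measure is to split each frame into blocks, histogram each block, and sum the block-wise L1 distances. The 2x2 block layout and 4-bin histograms below are illustrative assumptions.

```python
# Block color-histogram distance between two frames given as 2-D lists
# of gray values. Smaller distance = more similar.

def block_hist(frame, blocks=2, bins=4):
    """One normalized histogram per block of the frame."""
    h, w = len(frame), len(frame[0])
    hists = []
    for bi in range(blocks):
        for bj in range(blocks):
            hist = [0.0] * bins
            sub = [frame[i][j]
                   for i in range(bi * h // blocks, (bi + 1) * h // blocks)
                   for j in range(bj * w // blocks, (bj + 1) * w // blocks)]
            for p in sub:
                hist[min(p * bins // 256, bins - 1)] += 1
            hists.append([v / len(sub) for v in hist])
    return hists

def frame_distance(f1, f2, blocks=2, bins=4):
    """Sum of per-block L1 histogram distances."""
    return sum(sum(abs(a - b) for a, b in zip(h1, h2))
               for h1, h2 in zip(block_hist(f1, blocks, bins),
                                 block_hist(f2, blocks, bins)))
```

Blocking makes the measure sensitive to where in the frame a color appears, not just how much of it appears, which a whole-frame histogram cannot distinguish.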
Further, a content-based three-dimensional CG animation search apparatus comprises: a query module, which offers users several query modes, supporting retrieval according to the user's chosen mode and needs, including query by video example, by selecting a template, and by submitting a feature template; a description module, which extracts video features both when a video enters the database and when a user submits video content as a query; a matching module, which searches the video database for the required videos according to a given matching principle; an extraction module, which extracts from the database the matched videos satisfying the user's conditions and presents them to the user; and a feedback module, which uses the user's feedback for human-computer interaction to progressively reach a satisfactory result; in general, the videos presented by the extraction module are a group of videos meeting the user's requirements to different degrees, listed in descending order of similarity.
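A hypothetical skeleton of how these modules could be wired together is sketched below. All class names, the mean-gray placeholder feature and the distance-based matcher are invented for illustration; the feedback loop, which would re-run the search with user-adjusted input, is only noted in a comment.

```python
class DescriptionModule:
    """Extracts a feature both at ingest time and at query time.
    Placeholder feature: mean gray value over all frames (an assumption)."""
    def features(self, video):
        flat = [p for frame in video for row in frame for p in row]
        return sum(flat) / len(flat)

class MatchingModule:
    """Ranks database entries by closeness to the query feature."""
    def rank(self, entries, query_feature):
        return sorted(entries, key=lambda kv: abs(kv[1] - query_feature))

class SearchDevice:
    """Query -> description -> matching -> extraction; a feedback module
    would re-invoke search() with refined input (not modelled here)."""
    def __init__(self):
        self.describe = DescriptionModule()
        self.match = MatchingModule()
        self.db = {}  # video id -> feature (the index of steps 4-5)

    def ingest(self, vid, video):
        self.db[vid] = self.describe.features(video)

    def search(self, example_video, top_k=3):
        q = self.describe.features(example_video)
        ranked = self.match.rank(list(self.db.items()), q)
        return [vid for vid, _ in ranked[:top_k]]  # extraction module output

dark = [[[10, 10], [10, 10]]]        # one 2x2 dark frame
bright = [[[240, 240], [240, 240]]]  # one 2x2 bright frame
device = SearchDevice()
device.ingest("dark", dark)
device.ingest("bright", bright)
```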
The invention has the beneficial effects that: by adopting video structural analysis, video shot segmentation, key frame extraction, feature extraction, video similarity measurement and video query, the analysis and processing of the CG animation with huge data volume, ambiguity and three-dimension are realized. The video is structured, the video can be analyzed and processed on different levels of the video, the video content can be reflected from multiple angles through the extraction and analysis of the features, the influence of subjective factors on a retrieval result is avoided, and the original video content can be well expressed.
Drawings
To illustrate the embodiments of the invention or the prior-art solutions more clearly, the drawings needed for their description are briefly introduced below. The drawings described below are clearly only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a logic block diagram of an embodiment of the present invention;
FIG. 2 is a diagram showing a structure of a search device according to an embodiment of the present invention;
FIG. 3 is a flow chart of a clustering algorithm in an embodiment of the present invention;
fig. 4 is a video structural analysis diagram in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings; the described embodiments are clearly only some, not all, of the embodiments of the invention.
The invention discloses a content-based three-dimensional CG animation searching method, which comprises the following steps: step 1, video structural analysis, in which video data is divided hierarchically into the following layers: video sequence, scene, shot, and image frame; step 2, shot segmentation, in which a video is segmented into a plurality of video shots; step 3, key frame extraction, in which, after shot segmentation, several image frames are selected from each shot to represent its main visual content; step 4, feature extraction, in which, building on shot segmentation and key frame extraction, motion information is extracted from the shots and visual feature information from the key frames; step 5, index formation, in which the features are stored in the feature database of the retrieval system to form an index; and step 6, querying, in which similarity is measured between the query request submitted by the user and the descriptions and representations of the videos, and the results are presented to the user in descending order of similarity.
In a specific implementation, the method disclosed in this application is applied in a content-based three-dimensional CG animation search apparatus comprising a query module, a description module, a matching module, an extraction module and a feedback module. The query module offers users several query modes, supporting retrieval according to the user's chosen mode and needs, including query by video example, by selecting a template, and by submitting a feature template; the description module extracts video features both when a video enters the database and when a user submits video content as a query; the matching module searches the video database for the required videos according to a given matching principle; the extraction module extracts from the database the matched videos satisfying the user's conditions and presents them to the user; the feedback module uses the user's feedback for human-computer interaction to progressively reach a satisfactory result; in general, the videos presented by the extraction module are a group of videos meeting the user's requirements to different degrees, listed in descending order of similarity. By combining video structural analysis, video shot segmentation, key frame extraction, feature extraction, video similarity measurement and video query, the apparatus enables the analysis and processing of three-dimensional CG animation whose data volume is huge and whose content is ambiguous, so that developers can quickly and accurately retrieve the animation they need, improving product development efficiency and reducing development cost.
As a preferred embodiment of this application, key frame extraction uses an automatic extraction algorithm based on spatio-temporal slice clustering; a spatio-temporal slice is formed by combining one row and/or column of pixels extracted from the same position in a continuous video image sequence. The algorithm clusters the spatio-temporal slices of a video to form sub-shots, and key frames are extracted from the sub-shots;
the clustering algorithm comprises the following steps:
selecting an initial cluster center: divide the video slice into several equal parts, set a process variable related to the total number of video frames and the number of cluster centers, and define the cluster centers through this variable;
updating the cluster center: compute the mean of all samples in each class, then take the sample in each class closest to the mean as the new cluster center;
computing distances: compute the inter-frame distance from the number of samples and their temporal order;
and selecting the number of cluster centers: automatically choose the optimal number of cluster centers for different videos.
The automatic extraction algorithm of the space-time slice clustering comprises the following steps:
step 1, convert the video image frames to grayscale, then extract a horizontal spatio-temporal slice of the shot;
step 2, clustering the video space-time slices;
step 3, handle short runs in the clustering result: when fewer than N consecutive frames (N = 10) are grouped into a class and the classes on both adjacent sides have the same color, merge the three into one class; if the classes on the two sides differ in color, assign the short class to whichever of the two has the nearer cluster center;
step 4, extracting candidate key frames, and taking the frame with the maximum image information entropy in each class as a candidate key frame;
and step 5, key frame extraction: select the final key frames from the candidates; when the edge histogram difference between two adjacent candidate key frames is smaller than a threshold, the two frames are redundant, so the redundancy is removed and the final key frames are extracted.
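The redundancy-removal rule of step 5 can be sketched as follows. The simple horizontal-gradient edge histogram and the threshold value are illustrative assumptions; the patent does not specify either.

```python
# Drop a candidate key frame when its edge histogram is too close (L1)
# to the previously kept frame. Frames are 2-D lists of gray values.

def edge_histogram(gray, bins=4):
    """Histogram of quantized horizontal gradient magnitudes."""
    grads = [abs(gray[i][j + 1] - gray[i][j])
             for i in range(len(gray)) for j in range(len(gray[0]) - 1)]
    hist = [0.0] * bins
    for g in grads:
        hist[min(g * bins // 256, bins - 1)] += 1
    return [v / len(grads) for v in hist]

def drop_redundant(candidates, threshold=0.2):
    """Keep a candidate only if its edge histogram differs enough from
    the last kept frame; otherwise it is redundant and removed."""
    kept = [candidates[0]]
    for frame in candidates[1:]:
        diff = sum(abs(a - b) for a, b in
                   zip(edge_histogram(frame), edge_histogram(kept[-1])))
        if diff >= threshold:
            kept.append(frame)
    return kept

flat_frame = [[10] * 4 for _ in range(4)]          # no edges
textured_frame = [[0, 255, 0, 255] for _ in range(4)]  # strong edges
kept = drop_redundant([flat_frame, flat_frame, textured_frame])
```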
In a specific implementation, the number of key frames is determined according to the dynamics of the video content, and key frames are extracted at the different levels of the video's structural decomposition.
The automatic extraction algorithm based on spatio-temporal slice clustering takes the temporal continuity of the video into account, and the final number of sub-shots is not necessarily equal to the number of cluster centers. For still shots, minor changes may cause extra sub-shots to form when the video slices are clustered; the algorithm above removes this redundancy. The algorithm extracts key frames automatically, without manually entered parameters, which avoids subjective influence on the results and expresses the original video content well.
In this embodiment, the similarity measure includes one or more of: feature similarity, order similarity, and time span. At the frame level, similarity is measured using block-based color histograms and/or the inter-image distance methods of content-based image retrieval; at the shot level, shot similarity is measured from low-level key frame features, shot motion, or object features. The similarity measure makes the videos presented to the user more accurate.
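One simple realization of the shot-level measure is to compare the key-frame histograms of two shots and take the best match; this is just one of the options the text lists, and the best-pair combination rule is an assumption.

```python
# Shot similarity as the best histogram match over key-frame pairs.
# Key frames are flat lists of 0-255 gray values.

def gray_hist(frame, bins=4):
    """Normalized gray-level histogram of one key frame."""
    hist = [0.0] * bins
    for p in frame:
        hist[min(p * bins // 256, bins - 1)] += 1
    return [v / len(frame) for v in hist]

def shot_similarity(key_frames_a, key_frames_b):
    """1.0 = some pair has identical histograms; 0.0 = fully disjoint."""
    best = 0.0
    for fa in key_frames_a:
        for fb in key_frames_b:
            ha, hb = gray_hist(fa), gray_hist(fb)
            best = max(best, 1 - sum(abs(a - b) for a, b in zip(ha, hb)) / 2)
    return best
```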
Those skilled in the art will understand that the invention is not limited to the embodiments described above, which are presented in the specification and drawings only to illustrate its principle; various changes and modifications may be made without departing from the spirit and scope of the invention, and these fall within the scope of the claimed invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (8)

1. A content-based three-dimensional CG animation searching method, characterized by comprising the following steps:
step 1, video structural analysis, wherein video data is divided into the following layers according to the levels: video sequence, scene, shot, image frame;
step 2, shot segmentation, namely segmenting a video into a plurality of video shots;
step 3, extracting key frames, and selecting a plurality of image frames from each shot to represent the main visual content of the shot after the shot segmentation is finished;
step 4, feature extraction, namely extracting motion information from the shot and extracting visual feature information from the key frame on the basis of shot segmentation and key frame extraction;
step 5, forming an index, and storing the characteristics into a characteristic database of the retrieval system to form the index;
and step 6, querying, namely performing similarity measurement between the query request submitted by the user and the descriptions and representations of the videos, and presenting the results to the user in descending order of similarity.
2. The method of claim 1, wherein the key frame extraction employs an automatic extraction algorithm based on spatio-temporal slice clustering, a spatio-temporal slice being a combination of one row and/or column of pixels extracted from the same position of a sequence of consecutive video images.
3. The method as claimed in claim 2, wherein the automatic extraction algorithm based on spatio-temporal slice clustering clusters spatio-temporal slices of a video to form a sub-shot, and extracts key frames from the sub-shot;
wherein the clustering algorithm comprises:
selecting an initial cluster center: dividing the video slice into several equal parts, setting a process variable related to the total number of video frames and the number of cluster centers, and defining the cluster centers through this variable;
updating the cluster center: computing the mean of all samples in each class, then taking the sample in each class closest to the mean as the new cluster center;
computing distances: computing the inter-frame distance from the number of samples and their temporal order;
and selecting the number of cluster centers: automatically choosing the optimal number of cluster centers for different videos.
4. The method as claimed in claim 3, wherein the automatic extraction algorithm based on spatio-temporal slice clustering comprises the following steps:
step 1, converting the video image frames to grayscale, then extracting a horizontal spatio-temporal slice of the shot;
step 2, clustering the video space-time slices;
step 3, handling short runs in the clustering result: when fewer than N consecutive frames (N = 10) are grouped into a class and the classes on both adjacent sides have the same color, merging the three into one class; if the classes on the two sides differ in color, assigning the short class to whichever of the two has the nearer cluster center;
step 4, extracting candidate key frames, and taking the frame with the maximum image information entropy in each class as a candidate key frame;
and step 5, key frame extraction: selecting the final key frames from the candidates; when the edge histogram difference between two adjacent candidate key frames is smaller than a threshold, the two frames are redundant, so the redundancy is removed and the final key frames are extracted.
5. The method as claimed in claim 4, wherein the key frame extraction determines the number of key frames according to the dynamics of the video content, and key frames are extracted at the different levels of the video structural analysis.
6. The method of claim 1, wherein the similarity metric comprises one or more of: feature similarity, order similarity, and time-span.
7. The method of claim 6, wherein at the frame level the similarity measure is based on block color histograms and/or the inter-image distance methods of content-based image retrieval; and at the shot level, shot similarity is measured from low-level key frame features, shot motion, or object features.
8. A content-based three-dimensional CG animation search apparatus, comprising:
a query module, which provides users with several query modes, supporting video query and retrieval according to the user's chosen mode and needs, and accepting input by video example, by template selection, or by submission of a feature template;
a description module, which extracts video features when a video enters the database and when a user submits video content as a query;
the matching module is used for searching the required video in the video database according to a certain matching principle;
the extraction module extracts the matched video meeting the given conditions of the user from the database and presents the video to the user;
and a feedback module, which uses the user's feedback for human-computer interaction to progressively reach a satisfactory result; in general, the videos presented to the user by the extraction module are a group of videos meeting the user's requirements to different degrees, listed in descending order of similarity.
CN202010506909.XA, filed 2020-06-05, priority 2020-06-05: Content-based three-dimensional CG animation searching method and device, Pending, published as CN111666447A.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010506909.XA CN111666447A (en) 2020-06-05 2020-06-05 Content-based three-dimensional CG animation searching method and device


Publications (1)

Publication Number Publication Date
CN111666447A 2020-09-15

Family

ID=72386610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010506909.XA Pending CN111666447A (en) 2020-06-05 2020-06-05 Content-based three-dimensional CG animation searching method and device

Country Status (1)

Country Link
CN (1) CN111666447A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102314681A (en) * 2011-07-08 2012-01-11 太原理工大学 Adaptive KF (keyframe) extraction method based on sub-lens segmentation
US20180137892A1 (en) * 2016-11-16 2018-05-17 Adobe Systems Incorporated Robust tracking of objects in videos


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
曹长青 (Cao Changqing): "Research on Key Frame Extraction Algorithms in Content-Based Video Retrieval", China Master's Theses Full-text Database, Information Science and Technology series *

Similar Documents

Publication Publication Date Title
JP3568117B2 (en) Method and system for video image segmentation, classification, and summarization
Almeida et al. Vison: Video summarization for online applications
Cheung et al. Efficient video similarity measurement with video signature
CN110866896B (en) Image saliency target detection method based on k-means and level set super-pixel segmentation
Gharbi et al. Key frame extraction for video summarization using local description and repeatability graph clustering
CN107153670B (en) Video retrieval method and system based on multi-image fusion
CN105389590B (en) Video clustering recommendation method and device
CN1851709A (en) Embedded multimedia content-based inquiry and search realizing method
CN107451200B (en) Retrieval method using random quantization vocabulary tree and image retrieval method based on same
Parihar et al. Multiview video summarization using video partitioning and clustering
CN110188625B (en) Video fine structuring method based on multi-feature fusion
JP5116017B2 (en) Video search method and system
CN109241315B (en) Rapid face retrieval method based on deep learning
Ejaz et al. Video summarization using a network of radial basis functions
CN114187558A (en) Video scene recognition method and device, computer equipment and storage medium
Adly et al. Issues and challenges for content-based video search engines a survey
Lin et al. Visual search engine for product images
Singh et al. PICS: a novel technique for video summarization
Gupta et al. A learning-based approach for automatic image and video colorization
CN111666447A (en) Content-based three-dimensional CG animation searching method and device
Zhao Application of a clustering algorithm in sports video image extraction and processing
Lv et al. Pf-face: A parallel framework for face classification and search from massive videos based on spark
Sudha et al. Reducing semantic gap in video retrieval with fusion: A survey
Raheem et al. Video Important Shot Detection Based on ORB Algorithm and FLANN Technique
Chatur et al. A simple review on content based video images retrieval

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination