CN113486201A - Cartoon figure image classification processing method and system - Google Patents

Cartoon figure image classification processing method and system

Info

Publication number
CN113486201A
Authority
CN
China
Prior art keywords
cartoon
cartoon character
module
characters
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110687533.1A
Other languages
Chinese (zh)
Inventor
杨大为
宋世唯
周强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Thermosphere Information Technology Co ltd
Original Assignee
Shanghai Stratosphere Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Stratosphere Intelligent Technology Co ltd filed Critical Shanghai Stratosphere Intelligent Technology Co ltd
Priority to CN202110687533.1A priority Critical patent/CN113486201A/en
Publication of CN113486201A publication Critical patent/CN113486201A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a cartoon character image classification processing method and system. The method comprises: obtaining cartoon character image data of each cartoon character in a cartoon video; extracting feature vectors from the cartoon character image data to obtain cartoon character feature vectors; clustering and classifying the cartoon characters according to the cartoon character feature vectors to obtain cartoon character classification data; and judging, according to the cartoon character classification data, whether each cartoon character is a known cartoon character, then labeling the known cartoon characters and storing them in a database. The system comprises a detection module, a feature vector extraction module, a cluster classification module, a similarity comparison module and a database. The method and system help animation production staff quickly locate, identify and count cartoon characters.

Description

Cartoon figure image classification processing method and system
Technical Field
The invention relates to the field of image classification processing, and in particular to a cartoon character image classification processing method and system.
Background
At present, the animation industry is developing rapidly, animations are increasingly popular, and more forms of presentation have appeared as the industry prospers. In the animation production process, the growing diversity of animation characters makes it increasingly tedious for staff to identify the characters' identities, and this cumbersome, inefficient workflow causes considerable trouble for workers in the animation industry.
Most applications on the market concern cartoon face recognition or portrait recognition, and image-based identification of whole cartoon characters is still rare. With the development of image recognition technology in the computer field and its ever wider range of applications, this problem can now be addressed.
Disclosure of Invention
The technical problem the invention aims to solve is that, in the existing animation production process, the diversity of animation characters makes it increasingly tedious for staff to identify cartoon characters, and this cumbersome, inefficient workflow causes much trouble for workers in the animation industry. The invention provides a cartoon character image classification processing method that traverses a cartoon video, identifies the cartoon character images in it, acquires each cartoon character image and computes character feature vectors, and then compares whether the similarity with a known character is greater than a threshold. If the similarity is greater than the threshold, the known cartoon character information is displayed; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is displayed as a known cartoon character whenever it appears in subsequent videos. This helps animation production staff quickly locate, identify and count cartoon characters, thereby overcoming the defects of the prior art.
The invention further provides a cartoon figure image classification processing system.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, a cartoon character image classification processing method includes:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting characteristic vectors of the cartoon character image data to obtain cartoon character characteristic vectors;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
and judging whether each cartoon character is a known cartoon character according to the cartoon character classification data, then labeling the known cartoon characters and storing them in a database, wherein the database is used for storing the cartoon character image data, the cartoon character feature vectors and the cartoon character classification data, and whether a cartoon character is known is judged by comparing the cartoon character classification data with the existing cartoon character database.
In the above cartoon character image classification processing method, a convolutional neural network is used to detect the cartoon characters appearing in each frame of the cartoon video and to mark a detection box on each cartoon character image; the content inside the detection box is cropped to obtain the cartoon character image data of each cartoon character, and this cropped image data is reused multiple times in the subsequent steps. The convolutional neural network can be a common object detection network such as the YOLO series or Faster R-CNN, and can be trained end to end directly.
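A minimal sketch of this detect-and-crop step is shown below. It assumes a COCO-pretrained Faster R-CNN from torchvision as a stand-in for a detector trained end to end on cartoon character annotations, and the score threshold and the helper name crop_characters are illustrative assumptions rather than details from the patent.

```python
import cv2
import torch
import torchvision

# Assumption: a COCO-pretrained detector stands in for one trained on cartoon character boxes.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).eval()

def crop_characters(video_path, score_thresh=0.7):
    """Yield the content of each detection box (H x W x 3 arrays) from every frame."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:                                   # the loop ends when the video ends
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
        with torch.no_grad():
            det = detector([tensor])[0]              # dict with 'boxes', 'labels', 'scores'
        for box, score in zip(det["boxes"], det["scores"]):
            if score < score_thresh:
                continue
            x1, y1, x2, y2 = box.int().tolist()
            yield frame[y1:y2, x1:x2]                # cropped detection-box content
    cap.release()
```

In a real system the detector would be fine-tuned on labeled cartoon frames so that its classes correspond to cartoon characters rather than COCO objects.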
In the above cartoon character image classification processing method, after the characteristic vector of the cartoon character is obtained, it is further necessary to determine whether the cartoon video is finished;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
In the above cartoon character image classification processing method, a convolutional neural network is used to extract feature vectors from the cartoon character image data to obtain cartoon character feature vectors; a cartoon character feature vector describes the features of one character and is discriminative between different characters;
the convolutional neural network extracts the cartoon character feature vector of a cartoon character according to a feature extraction model. In theory the feature extraction model can use any classification network with its final fully connected classification head removed, taking the features output by the last layer as the cartoon character feature vector; common choices include the ResNet series, ShuffleNet and the like. Considering the difference between cartoon character recognition and general object classification tasks, the network is also modified and adjusted to improve feature extraction, as follows. An attention module is added. The attention module implements an attention mechanism, a technique in neural networks that, much as a person focuses on specific details when observing things, lets the feature extraction model focus on the important and effective parts of the features; it computes an importance distribution over the intermediate features and uses it to adjust them. In a cartoon character recognition task the most discriminative features are often concentrated in local areas such as the face, so adding an attention mechanism can improve the accuracy of the model. The convolutional neural network consists of layers of convolution and other operations, and the intermediate result of each layer is a set of feature maps; the attention module takes the feature maps of certain layers, passes them through a convolution and a Sigmoid activation function to generate a weight mask of the same size as the feature maps, and multiplies the mask with the feature maps to obtain corrected feature maps;
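A minimal PyTorch sketch of the convolution-plus-Sigmoid weight mask described above; the single 1x1 convolution, the channel count and the placement after a backbone stage are illustrative assumptions rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class SpatialAttentionMask(nn.Module):
    """Generate a weight mask from a feature map and multiply it back in."""
    def __init__(self, channels):
        super().__init__()
        # Assumption: a single 1x1 convolution produces the mask logits.
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat):                    # feat: (N, C, H, W) intermediate feature maps
        mask = torch.sigmoid(self.conv(feat))   # weight mask, same size as the feature maps
        return feat * mask                      # corrected feature maps

# Usage sketch: attn = SpatialAttentionMask(256); x = attn(x) after a backbone stage.
```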
the attention module with more complicated design, such as non-local module, can be used, compared with the common attention module non-local module, when generating the weight mask (mask), the matrix multiplication is carried out on the local feature and the global feature, so that the receptive field is increased, the global information can be obtained, and the attention module can be added in each down-sampling stage of the whole feature extraction network;
the attribute extraction branch is added, and some attributes of the cartoon characters can be well distinguished and recognized, so that the attribute extraction branch can be added into the feature extraction model to improve the capability of the model for extracting features, defined attributes can be binary or multivariate, such as gender, color, species and the like, and are trained by adding an attribute classification head.
In the above cartoon character image classification processing method, the feature extraction model is trained by adding a classification head after the feature extraction network, so that it can be trained as a classification task using a cross entropy loss, where the classification labels are the annotated character identities. If only a classification loss function is used for training, the model learns to increase the distance between different classes but not to reduce the distance within the same class; the method therefore also includes optimizing the feature extraction model with a triplet loss. The triplet loss directly optimizes the similarity between features of different classes and of the same class and can better improve the model's ability to extract features. The specific method is as follows:
the triplet loss is defined on a triplet data input <a, p, n>;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
for example, in the cartoon character data set, a is one sample of the cartoon character a, p is another sample of the cartoon character a, and n is a sample of the cartoon character B.
The loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter for controlling the distribution intervals of different types of data, and is set according to requirements;
selecting c classes for each training batch (batch) using an online hard sampling mining (online hard sampling) method, each class selecting k samples, for a total of c × k samples;
calculating a distance matrix among all samples in the current batch (batch), taking each sample as an anchor sample, and selecting p which enables d (a, p) to be maximum and n which enables d (a, n) to be minimum as a difficult example to calculate a loss value L;
updating the feature extraction model according to the loss value L and through back propagation;
the training data should be cartoon character images, each image containing only one character, and the annotation information is the identity class of the character; the training samples come from cartoon videos, small images are cropped from the full frames according to the detection annotation boxes, each small image is a sample image of a single cartoon character, and the class of each sample is annotated;
optimizing the triplet loss makes the model learn to pull data of the same class closer together while pushing data of different classes further apart. Training with a triplet loss requires a special sample selection strategy, because traversing all sample triplets would be prohibitively expensive, which is why the online hard example mining described above is used; a batch-hard training sketch is given below.
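A minimal PyTorch sketch of the batch-hard procedure described in these steps (c classes times k samples per batch, hardest positive and hardest negative per anchor). The margin value is an illustrative assumption; the Euclidean distance follows the text.

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """L = max(d(a, p) - d(a, n) + M, 0) with online hard example mining.

    embeddings: (B, D) feature vectors of a batch built from c classes x k samples (k >= 2).
    labels:     (B,) integer character identities.
    """
    dist = torch.cdist(embeddings, embeddings)            # pairwise Euclidean distance matrix
    same = labels.unsqueeze(0) == labels.unsqueeze(1)     # (B, B) same-class mask

    pos = dist.clone()                                    # hardest positive: farthest same-class sample
    pos[~same] = float("-inf")
    pos.fill_diagonal_(float("-inf"))
    hardest_pos = pos.max(dim=1).values

    neg = dist.clone()                                    # hardest negative: closest other-class sample
    neg[same] = float("inf")
    hardest_neg = neg.min(dim=1).values

    return torch.clamp(hardest_pos - hardest_neg + margin, min=0).mean()
```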
In the cartoon character image classification processing method, a density clustering method is used to cluster the cartoon character feature vectors to obtain the cartoon character classification data. Through the previous steps, a cartoon video has been detected and feature extraction has been performed, yielding a large number of cartoon character images and the feature vector corresponding to each image; this information now needs to be summarized and consolidated to obtain the number of characters that actually appear and to carry out identity comparison and verification;
a clustering method is used to process and summarize the data. Because the number of categories to be clustered is not known in advance, the clustering method cannot rely on a known number of categories (as k-means clustering does); a density-based clustering method such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or OPTICS (Ordering Points To Identify the Clustering Structure) is used instead. Images of the same character in different frames are clustered into one category, so most images are classified, while images of characters that appear rarely or with unusual expressions are discarded as outliers. Because a user verification step follows, avoiding two different characters being clustered into the same category is more important than clustering every image of a single character into one category, so the clustering parameters can be adjusted and the maximum distance threshold set relatively small to prevent data of different classes from being grouped into one class;
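A minimal sketch of this clustering step with scikit-learn's DBSCAN; the eps value, min_samples and the cosine metric are illustrative assumptions, and the label -1 marks the outliers that are discarded.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_characters(feature_vectors, eps=0.25, min_samples=5):
    """Group per-image feature vectors into per-character clusters.

    feature_vectors: (N, D) array, one row per cropped cartoon character image.
    Returns {cluster_id: average feature vector}; noise points (label -1) are dropped.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples, metric="cosine").fit_predict(feature_vectors)
    clusters = {}
    for cid in set(labels):
        if cid == -1:                           # rare characters or unusual views: discarded as outliers
            continue
        members = feature_vectors[labels == cid]
        clusters[cid] = members.mean(axis=0)    # average feature vector of the cluster
    return clusters
```

Keeping eps (the maximum distance threshold) relatively small reflects the preference stated above for never merging two different characters into one cluster, at the cost of discarding more images as noise.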
after clustering and classification are finished, the summary extraction of the cartoon characters in the current video is complete. Next, each extracted cartoon character is compared with the known cartoon characters to confirm and display its identity. The comparison method again calculates the cosine distance between feature vectors: each known character has a recorded average feature vector, and for the characters clustered from the current video the average feature vector of the points in each cluster is taken as that cartoon character's average feature vector. When the similarity is higher than a certain threshold, the current cartoon character is considered to be the known cartoon character and its identity is shown to the user; if the similarity between the current cartoon character and all known cartoon characters is lower than the threshold, the current character is considered an unknown character, and the user can choose to add the identity of the cartoon character to the known cartoon character library in the database as required.
The cartoon character image classification processing method comprises the steps that cartoon character information data of multiple categories are stored in the cartoon character classification data, and the cartoon character information data of each category comprise cartoon character information, cartoon character feature vectors and cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
the cartoon character information data of one category is compared with the known cartoon character information in the database by calculating the cosine distance between feature vectors to obtain a similarity;
if the similarity is greater than or equal to a preset threshold value, the cartoon character is judged to be known, and the cartoon character information and the cartoon character image data of the known cartoon character are displayed;
if the similarity is smaller than the preset threshold value, the cartoon character is judged to be unknown; the user then either labels it (adding information to the current cartoon character so that, if it is detected again later in the video, it is treated as a known cartoon character) and stores it, or discards it.
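A minimal sketch of this similarity comparison and threshold decision, assuming cosine similarity against each known character's recorded average feature vector and an illustrative threshold of 0.8; representing the known-character library as a plain dict is also an assumption.

```python
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def identify(cluster_vector, known_characters, threshold=0.8):
    """known_characters: {name: average feature vector} from the database.
    Returns the matched name, or None when the character is unknown."""
    best_name, best_sim = None, -1.0
    for name, ref_vector in known_characters.items():
        sim = cosine_similarity(cluster_vector, ref_vector)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim >= threshold:
        return best_name          # known cartoon character: display its information
    return None                   # unknown: the user labels and stores it, or discards it
```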
The second aspect is a cartoon figure image classification processing system, which comprises a detection module, a feature vector extraction module, a cluster classification module, a similarity comparison module and a database;
the detection module is used for detecting cartoon character images in the cartoon videos to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module;
the characteristic vector extraction module is used for receiving the cartoon character image data, extracting characteristic vectors of cartoon characters from the cartoon character image data to obtain characteristic vectors of the cartoon characters, and transmitting the characteristic vectors of the cartoon characters to the cluster classification module;
the clustering classification module is used for receiving the cartoon character feature vectors and clustering and classifying them to obtain cartoon character classification data, which is transmitted to the similarity comparison module;
the similarity comparison module is used for receiving the cartoon character classification data, comparing the similarity of the cartoon character classification data with the similarity of known cartoon characters, labeling the cartoon character classification data and transmitting the labeled cartoon character classification data to the database;
the database is used for storing the cartoon character classification data and known cartoon characters.
In the cartoon character image classification processing system, a convolutional neural network is stored in each of the detection module and the feature vector extraction module;
the detection module detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and acquires the image data of the cartoon characters of each cartoon character;
the feature vector extraction module extracts feature vectors of the cartoon character image data by adopting a convolutional neural network to obtain cartoon character feature vectors;
the detection module is internally provided with a video judgment module which is used for judging whether the cartoon video is finished or not.
The cartoon figure image classification processing system is characterized in that a feature extraction model optimization module is arranged in the feature vector extraction module, and the feature extraction model optimization module is used for optimizing the feature extraction model.
The technical scheme provided by the cartoon character image classification processing method and system provided by the invention has the following technical effects:
the method comprises the steps of identifying cartoon character images in a cartoon video after traversing the cartoon video, obtaining each cartoon character image, calculating to obtain a cartoon character feature vector, comparing whether the similarity with a certain known character is greater than a threshold value, if so, showing the known cartoon character information, if not, showing the known cartoon character, and after adding identity information or marking information by a user, showing the known cartoon character if the cartoon character appears in the subsequent video in the identification process, so that a worker making the cartoon can be helped to quickly position and identify the cartoon character and quickly identify and count the cartoon character.
Drawings
FIG. 1 is a schematic structural diagram of a cartoon character image classification processing method according to the present invention;
FIG. 2 is a schematic structural diagram of a cartoon character image classification processing system according to the present invention;
FIG. 3 is a schematic flow diagram of the cartoon character image classification processing method according to the present invention;
FIG. 4 is a schematic diagram of non-local modules.
Wherein the reference numbers are as follows:
the system comprises a detection module 101, a feature vector extraction module 102, a cluster classification module 103, a similarity comparison module 104 and a database 105.
Detailed Description
In order to make the technical means, the inventive features, the objectives and the effects of the invention easily understood and appreciated, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the specific drawings, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments.
All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be understood that the structures, ratios, sizes, and the like shown in the drawings and described in the specification are only used to match the content disclosed in the specification so that it can be understood and read by those skilled in the art; they are not intended to limit the conditions under which the present invention can be implemented and thus have no substantive technical significance, and any structural modification, change of ratio, or adjustment of size that does not affect the efficacy or the achievable purpose of the present invention still falls within the scope of the invention.
In addition, the terms "upper", "lower", "left", "right", "middle" and "one" used in this specification are for clarity of description only and are not intended to limit the implementable scope of the present invention; changes or adjustments of their relative relationships, without substantive changes to the technical content, are also to be regarded as falling within the implementable scope of the present invention.
The invention provides a cartoon character image classification processing method and system. The aim is to traverse a cartoon video, identify the cartoon character images in it, acquire each cartoon character image and compute a cartoon character feature vector, and compare whether the similarity with a known character is greater than a threshold. If the similarity is greater than the threshold, the known cartoon character information is shown; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is shown as a known cartoon character if it appears again in subsequent video identification. This helps animation production staff quickly locate, identify and count cartoon characters.
As shown in fig. 1, in a first aspect, a cartoon character image classification processing method includes:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting the characteristic vector of the image data of the cartoon character to obtain the characteristic vector of the cartoon character;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
judging whether each cartoon character is a known cartoon character according to the cartoon character classification data, then labeling the known cartoon characters and storing them in the database 105, wherein the database 105 is used for storing the cartoon character image data, the cartoon character feature vectors and the cartoon character classification data, and whether a cartoon character is known is judged by comparing the cartoon character classification data with the existing cartoon character database.
A convolutional neural network is used to detect the cartoon characters appearing in each frame of the cartoon video and to mark a detection box on each cartoon character image; the content inside the detection box is cropped to obtain the cartoon character image data of each cartoon character, and this cropped image data is reused multiple times in the subsequent steps. The convolutional neural network can be a common object detection network such as the YOLO series or Faster R-CNN and is trained end to end directly.
After the characteristic vector of the cartoon character is obtained, whether the cartoon video is finished or not needs to be judged;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
A convolutional neural network is used to extract feature vectors from the cartoon character image data to obtain cartoon character feature vectors; a cartoon character feature vector describes the features of one character and is discriminative between different characters;
the convolutional neural network extracts the cartoon character feature vectors of the cartoon characters according to the feature extraction model. In theory the feature extraction model can use any classification network with its final fully connected classification head removed, taking the features output by the last layer as the cartoon character feature vector; common choices include the ResNet series, ShuffleNet and the like. Considering the difference between cartoon character recognition and general object classification tasks, the network is also modified and adjusted to improve feature extraction: an attention module is added. The attention module implements an attention mechanism, a technique in neural networks that, much as a person focuses on specific details when observing things, lets the feature extraction model focus on the important and effective parts of the features; it computes an importance distribution over the intermediate features and uses it to adjust them. In a cartoon character recognition task the most discriminative features are often concentrated in local areas such as the face, so adding an attention mechanism can improve the accuracy of the model. The convolutional neural network consists of layers of convolution and other operations, and the intermediate result of each layer is a set of feature maps; the attention module takes the feature maps of certain layers, passes them through a convolution and a Sigmoid activation function to generate a weight mask of the same size as the feature maps, and multiplies the mask with the feature maps to obtain corrected feature maps;
an attention module with a more complex design, such as the non-local module shown in fig. 4, may also be used. Compared with an ordinary attention module, the non-local module performs matrix multiplication between local features and global features when generating the weight mask, which enlarges the receptive field and captures global information; such an attention module may be added at each downsampling stage of the whole feature extraction network;
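A minimal sketch of a non-local block of the kind referred to above, where the weight is formed by matrix multiplication between local and global features; the 1x1 convolutions, channel reduction factor, softmax normalization and residual connection are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Re-weight each position using pairwise products of local and global features."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inner = channels // reduction
        self.theta = nn.Conv2d(channels, inner, 1)     # local query features
        self.phi = nn.Conv2d(channels, inner, 1)       # global key features
        self.g = nn.Conv2d(channels, inner, 1)         # global value features
        self.out = nn.Conv2d(inner, channels, 1)

    def forward(self, x):                              # x: (N, C, H, W)
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (N, HW, C')
        k = self.phi(x).flatten(2)                     # (N, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (N, HW, C')
        attn = torch.softmax(q @ k, dim=-1)            # (N, HW, HW): every position attends to all others
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                         # residual connection keeps the original features
```

Such a block can be inserted at each downsampling stage of the feature extraction network, as the text suggests.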
the attribute extraction branch is added, and some attributes of the cartoon characters can be well distinguished and recognized, so that the attribute extraction branch can be added into the feature extraction model to improve the capability of the model for extracting features, defined attributes can be binary or multivariate, such as gender, color, species and the like, and are trained by adding an attribute classification head.
The training of the feature extraction model adds a classification head after the feature extraction network, so that it can be trained as a classification task using a cross entropy loss, where the classification labels are the annotated character identities. If only a classification loss function is used for training, the model learns to increase the distance between different classes but not to reduce the distance within the same class; the method therefore also includes optimizing the feature extraction model with a triplet loss. The triplet loss optimizes the similarity between features of different classes and of the same class and can better improve the model's ability to extract features. The specific method is as follows:
triple loss is defined as < a, p, n > on a triple data input;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
the loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter and is set according to requirements;
selecting c classes for each training batch (batch) using an online hard sampling mining (online hard sampling) method, each class selecting k samples, for a total of c × k samples;
calculating a distance matrix among all samples in the current batch (batch), taking each sample as an anchor sample, and selecting p which enables d (a, p) to be maximum and n which enables d (a, n) to be minimum as a difficult example to calculate a loss value L;
updating the feature extraction model through back propagation according to the loss value L;
the training data should be cartoon character images, each image containing only one character, and the annotation information is the identity class of the character; the training samples come from cartoon videos, small images are cropped from the full frames according to the detection annotation boxes, each small image is a sample image of a single cartoon character, and the class of each sample is annotated;
optimizing the triplet loss makes the model learn to pull data of the same class closer together while pushing data of different classes further apart. Training with a triplet loss requires a special sample selection strategy, because traversing all sample triplets would be prohibitively expensive, which is why online hard example mining is used.
A density clustering method is used to cluster the cartoon character feature vectors to obtain the cartoon character classification data. Through the previous steps, a cartoon video has been detected and feature extraction has been performed, yielding a large number of cartoon character images and the feature vector corresponding to each image; this information now needs to be summarized and consolidated to obtain the number of characters that actually appear and to carry out identity comparison and verification;
a clustering method is used to process and summarize the data. Because the number of categories to be clustered is not known in advance, the clustering method cannot rely on a known number of categories (as k-means clustering does); a density-based clustering method such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or OPTICS (Ordering Points To Identify the Clustering Structure) is used instead. Images of the same character in different frames are clustered into one category, so most images are classified, while images of characters that appear rarely or with unusual expressions are discarded as outliers. Because a user verification step follows, avoiding two different characters being clustered into the same category is more important than clustering every image of a single character into one category, so the clustering parameters can be adjusted and the maximum distance threshold set relatively small to prevent data of different classes from being grouped into one class;
after clustering and classification are finished, the summary extraction of the cartoon characters in the current video is complete. Next, each extracted cartoon character is compared with the known cartoon characters to confirm and display its identity. The comparison method again calculates the cosine distance between feature vectors: each known character has a recorded average feature vector, and for the characters clustered from the current video the average feature vector of the points in each cluster is taken as that cartoon character's average feature vector. When the similarity is higher than a certain threshold, the current cartoon character is considered to be the known cartoon character and its identity is shown to the user; if the similarity between the current cartoon character and all known cartoon characters is lower than the threshold, the current character is considered an unknown character, and the user can choose to add the identity of the cartoon character to the known cartoon character library in the database 105 as required.
The cartoon character classification data stores cartoon character information data of multiple categories, and the cartoon character information data of each category comprises cartoon character information, cartoon character feature vectors and cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
the cartoon character information data of one category is compared with the known cartoon character information in the database 105 by calculating the cosine distance between feature vectors to obtain a similarity;
if the similarity is greater than or equal to a preset threshold value, the cartoon character is judged to be known, and the cartoon character information and the cartoon character image data of the known cartoon character are displayed;
if the similarity is smaller than the preset threshold value, the cartoon character is judged to be unknown; the user then either labels it (adding information to the current cartoon character so that, if it is detected again later in the video, it is treated as a known cartoon character) and stores it, or discards it.
As shown in fig. 2, in a second aspect, a cartoon character image classification processing system includes a detection module 101, a feature vector extraction module 102, a cluster classification module 103, a similarity comparison module 104, and a database 105;
the detection module 101 is used for detecting cartoon character images in the cartoon video to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module 102;
the feature vector extraction module 102 is configured to receive the image data of the cartoon character, extract feature vectors of the cartoon character from the image data to obtain feature vectors of the cartoon character, and transmit the feature vectors of the cartoon character to the cluster classification module 103;
the clustering classification module 103 is used for receiving the cartoon character feature vectors and clustering and classifying them to obtain cartoon character classification data, which is transmitted to the similarity comparison module 104;
the similarity comparison module 104 is used for receiving the cartoon character classification data, comparing the similarity of the data with the similarity of known cartoon characters, labeling the data, and transmitting the labeled cartoon character classification data to the database 105;
the database 105 is used for storing the cartoon character classification data and the known cartoon characters.
The detection module 101 and the feature vector extraction module 102 both store convolutional neural networks therein;
the detection module 101 detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and obtains the cartoon character image data of each cartoon character;
the feature vector extraction module 102 extracts feature vectors of cartoon character image data by using a convolutional neural network to obtain cartoon character feature vectors;
the detection module 101 is internally provided with a video judgment module, and the video judgment module is used for judging whether the cartoon video is finished.
The feature vector extraction module 102 is internally provided with a feature extraction model optimization module, and the feature extraction model optimization module is used for optimizing the feature extraction model.
As shown in fig. 3, the specific classification process of the cartoon character image classification processing system is as follows:
1. starting;
2. reading a frame of video image;
3. detecting cartoon characters;
4. intercepting each cartoon character image, and calculating a cartoon character feature vector;
5. recording the cartoon character image and the cartoon character feature vector;
6. judging whether the video is finished (the end condition of the loop);
if not, reading a frame of video image again;
if the video has finished, continue to the next step;
7. clustering all cartoon character images according to the feature vectors, and calculating average feature vectors;
8. taking a category;
9. calculating similarity with all known cartoon character feature vectors;
10. judging whether the similarity with a known cartoon character is greater than a threshold value;
if yes, displaying cartoon character information and category images;
if not, entering the next step;
11. the user labels or discards the category according to the category images;
if discarded, go to step 13 to judge whether all cluster categories have been traversed;
if labeled, enter the next step;
12. adding the labeling information and the category feature vector to the known cartoon character database 105;
13. judging whether all cluster categories have been traversed;
1) if not, take another category (return to step 8);
2) if yes, the flow ends. An end-to-end sketch of this loop is given below.
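A minimal end-to-end sketch of this loop; crop_characters, cluster_characters and identify are the hypothetical helpers sketched earlier, embed_fn stands for the feature extraction model, and the interactive prompt is a stand-in for the user labeling step.

```python
import numpy as np

def process_video(video_path, embed_fn, known_characters, threshold=0.8):
    """Traverse a cartoon video, cluster the characters in it and match them against
    the known-character library (a dict {name: average feature vector})."""
    vectors = []
    for crop in crop_characters(video_path):            # steps 2-6: read frames, detect, crop
        vectors.append(embed_fn(crop))                   # step 4: cartoon character feature vector
    clusters = cluster_characters(np.stack(vectors))     # step 7: cluster and average
    for cid, mean_vec in clusters.items():               # steps 8-13: one category at a time
        name = identify(mean_vec, known_characters, threshold)
        if name is not None:
            print(f"cluster {cid}: known character {name}")                      # step 10: above threshold
        else:
            label = input(f"cluster {cid} is unknown, label (empty to discard): ")  # step 11
            if label:
                known_characters[label] = mean_vec        # step 12: add to the known library
    return known_characters
```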
Those of skill in the art would understand that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of software and electronic hardware;
whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution;
skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments of the present application, the disclosed system, apparatus and method may be implemented in other ways;
for example, the division of a unit or a module is only one logic function division, and there may be another division manner in actual implementation;
for example, a plurality of units or modules or components may be combined or may be integrated into another system;
in addition, functional units or modules in the embodiments of the present application may be integrated into one processing unit or module, or may exist separately and physically.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a machine-readable storage medium;
therefore, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include several instructions to cause an electronic device to execute all or part of the processes of the technical solution described in the embodiments of the present application;
the storage medium may include various media that can store program codes, such as ROM, RAM, a removable disk, a hard disk, a magnetic disk, or an optical disk.
In summary, with the cartoon character image classification processing method and system provided by the invention, after a cartoon video is traversed the cartoon character images in it can be identified, each cartoon character image is obtained and a cartoon character feature vector is computed, and the similarity with each known character is compared against a threshold. If the similarity is greater than the threshold, the known cartoon character information is displayed; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is displayed as a known cartoon character whenever it appears in subsequently identified videos. This helps animation production staff quickly locate, identify and count cartoon characters.
Specific embodiments of the invention have been described above. It is to be understood that the invention is not limited to the particular embodiments described; devices and structures not described in detail are understood to be implemented in a manner common in the art, and various changes or modifications may be made by those skilled in the art within the scope of the claims without departing from the spirit of the invention, and these do not affect the substance of the invention.

Claims (10)

1. A cartoon character image classification processing method is characterized by comprising the following steps:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting characteristic vectors of the cartoon character image data to obtain cartoon character characteristic vectors;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
and judging whether the cartoon characters are known cartoon characters according to the cartoon character classification data, labeling and storing the known cartoon characters in a database.
2. The cartoon character image classification processing method of claim 1, wherein a convolutional neural network is adopted to detect cartoon characters appearing in each frame of a cartoon video and mark a detection box on an image of a cartoon character, and the contents in the detection box are captured to obtain the cartoon character image data of each cartoon character.
3. The cartoon character image classification processing method of claim 1 or 2, wherein after the cartoon character feature vector is obtained, whether the cartoon video is finished is judged;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
4. The cartoon character image classification processing method of claim 3, wherein a convolutional neural network is adopted to extract the characteristic vector of the cartoon character image data to obtain a cartoon character characteristic vector;
the convolutional neural network extracts the cartoon character feature vector of the cartoon character according to a feature extraction model;
the convolutional neural network comprises an attention module, wherein the attention module is used for acquiring a feature map of certain layers in the convolutional neural network, generating a weight mask with the same size as the feature map after the feature map is subjected to convolution and a Sigmoid activation function, and multiplying the mask and the feature map to acquire a modified feature map.
5. The cartoon character image classification processing method of claim 4, further comprising optimizing the feature extraction model by using a triplet loss, the specific method being as follows:
the triplet loss is defined on a triplet data input <a, p, n>;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
the loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter for controlling the distribution interval of different types of data;
c classes are selected for each training batch by using an online hard case mining method, k samples are selected for each class, and the total of c multiplied by k samples is obtained;
calculating a distance matrix among all samples in the current batch, taking each sample as an anchor sample, and taking p enabling d (a, p) to be maximum and n enabling d (a, n) to be minimum as a difficult example to calculate a loss value L;
and updating the feature extraction model according to the loss value L and through back propagation.
6. The cartoon character image classification processing method of claim 1, wherein the cartoon character classification data is obtained by clustering the cartoon character feature vectors by a density clustering method.
7. The cartoon character image classification processing method of claim 6, wherein the cartoon character classification data stores cartoon character information data of a plurality of categories, each category of cartoon character information data including cartoon character information, the cartoon character feature vector, the cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
extracting the cartoon character information data of one category and the known cartoon character information in the database to obtain similarity by a method of calculating cosine distance between feature vectors;
if the similarity is larger than or equal to a preset threshold value, displaying the cartoon character information and the cartoon character image data of the cartoon character;
and if the similarity is smaller than a preset threshold value, marking and storing or discarding.
8. A cartoon figure image classification processing system is characterized by comprising a detection module, a feature vector extraction module, a clustering classification module, a similarity comparison module and a database;
the detection module is used for detecting cartoon character images in the cartoon videos to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module;
the characteristic vector extraction module is used for receiving the cartoon character image data, extracting characteristic vectors of cartoon characters from the cartoon character image data to obtain characteristic vectors of the cartoon characters, and transmitting the characteristic vectors of the cartoon characters to the cluster classification module;
the clustering classification module is used for receiving the cartoon character feature vectors, clustering and classifying to obtain cartoon character classification data transmitted to the similarity comparison module;
the similarity comparison module is used for receiving the cartoon character classification data, comparing the similarity of the cartoon character classification data with the similarity of known cartoon characters, labeling the cartoon character classification data and transmitting the labeled cartoon character classification data to the database;
the database is used for storing the cartoon character classification data and known cartoon characters.
9. The cartoon character image classification processing system of claim 8, wherein the detection module and the feature vector extraction module each have a convolutional neural network stored therein;
the detection module detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and acquires the image data of the cartoon characters of each cartoon character;
the feature vector extraction module extracts feature vectors of the cartoon character image data by adopting a convolutional neural network to obtain cartoon character feature vectors;
the detection module is internally provided with a video judgment module which is used for judging whether the cartoon video is finished or not.
10. The cartoon character image classification processing system of claim 9, wherein the feature vector extraction module is embedded with a feature extraction model optimization module for optimizing the feature extraction model.
CN202110687533.1A 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system Pending CN113486201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110687533.1A CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110687533.1A CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Publications (1)

Publication Number Publication Date
CN113486201A true CN113486201A (en) 2021-10-08

Family

ID=77934100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110687533.1A Pending CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Country Status (1)

Country Link
CN (1) CN113486201A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114881079A (en) * 2022-05-11 2022-08-09 北京大学 Human body movement intention abnormity detection method and system for wearable sensor


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240226

Address after: 5 / F, 277 Huqingping Road, Minhang District, Shanghai, 201100

Applicant after: Shanghai Thermosphere Information Technology Co.,Ltd.

Country or region after: China

Address before: Building C, No.888, Huanhu West 2nd Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Applicant before: Shanghai stratosphere Intelligent Technology Co.,Ltd.

Country or region before: China

TA01 Transfer of patent application right