CN113486201A - Cartoon figure image classification processing method and system - Google Patents

Cartoon figure image classification processing method and system

Info

Publication number
CN113486201A
Authority
CN
China
Prior art keywords
cartoon
cartoon character
module
characters
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110687533.1A
Other languages
Chinese (zh)
Inventor
杨大为
宋世唯
周强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Thermosphere Information Technology Co ltd
Original Assignee
Shanghai Stratosphere Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Stratosphere Intelligent Technology Co ltd filed Critical Shanghai Stratosphere Intelligent Technology Co ltd
Priority to CN202110687533.1A priority Critical patent/CN113486201A/en
Publication of CN113486201A publication Critical patent/CN113486201A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a cartoon character image classification processing method and system. The method comprises: obtaining cartoon character image data of each cartoon character in a cartoon video; extracting feature vectors from the cartoon character image data to obtain cartoon character feature vectors; clustering and classifying the cartoon characters according to the cartoon character feature vectors to obtain cartoon character classification data; and judging, according to the cartoon character classification data, whether each cartoon character is a known cartoon character, then labeling the known cartoon characters and storing them in a database. The system comprises a detection module, a feature vector extraction module, a cluster classification module, a similarity comparison module and a database. The method and system help animation production staff quickly locate, identify and count cartoon characters.

Description

Cartoon figure image classification processing method and system
Technical Field
The invention relates to the field of image classification processing, and in particular to a cartoon character image classification processing method and system.
Background
At present, the animation industry is developing rapidly, animations are increasingly popular, and more forms of presentation have appeared as the industry prospers. In the animation production process, the growing diversity of animation characters makes it increasingly tedious for staff to identify the characters' identities, and this cumbersome, inefficient workflow causes considerable trouble for workers in the animation industry.
Most applications on the market concern cartoon face recognition or portrait recognition, and image-based identification of whole cartoon characters is still rare. With the development of image recognition technology in the computer field and its ever wider range of applications, this problem can now be addressed.
Disclosure of Invention
The technical problem the invention aims to solve is that, in the existing animation production process, the diversity of animation characters makes it increasingly tedious for staff to identify cartoon characters, and this cumbersome, inefficient workflow causes much trouble for workers in the animation industry. The invention provides a cartoon character image classification processing method that traverses a cartoon video, identifies the cartoon character images in it, acquires each cartoon character image and computes character feature vectors, and then compares whether the similarity with a known character is greater than a threshold. If the similarity is greater than the threshold, the known cartoon character information is displayed; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is displayed as a known cartoon character whenever it appears in subsequent videos. This helps animation production staff quickly locate, identify and count cartoon characters, thereby overcoming the defects of the prior art.
The invention further provides a cartoon figure image classification processing system.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, a cartoon character image classification processing method includes:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting characteristic vectors of the cartoon character image data to obtain cartoon character characteristic vectors;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
and judging whether each cartoon character is a known cartoon character according to the cartoon character classification data, then labeling the known cartoon characters and storing them in a database, wherein the database is used for storing the cartoon character image data, the cartoon character feature vectors and the cartoon character classification data, and whether a cartoon character is known is judged by comparing the cartoon character classification data with the existing cartoon character database.
In the above cartoon character image classification processing method, a convolutional neural network is used to detect the cartoon characters appearing in each frame of the cartoon video and to mark a detection box on each cartoon character image; the content inside the detection box is cropped to obtain the cartoon character image data of each cartoon character, and this cropped image data is reused multiple times in the subsequent steps. The convolutional neural network can be a common object detection network such as the YOLO series or Faster R-CNN, and can be trained end to end directly.
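A minimal sketch of this detect-and-crop step is shown below. It assumes a COCO-pretrained Faster R-CNN from torchvision as a stand-in for a detector trained end to end on cartoon character annotations, and the score threshold and the helper name crop_characters are illustrative assumptions rather than details from the patent.

```python
import cv2
import torch
import torchvision

# Assumption: a COCO-pretrained detector stands in for one trained on cartoon character boxes.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True).eval()

def crop_characters(video_path, score_thresh=0.7):
    """Yield the content of each detection box (H x W x 3 arrays) from every frame."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:                                   # the loop ends when the video ends
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
        with torch.no_grad():
            det = detector([tensor])[0]              # dict with 'boxes', 'labels', 'scores'
        for box, score in zip(det["boxes"], det["scores"]):
            if score < score_thresh:
                continue
            x1, y1, x2, y2 = box.int().tolist()
            yield frame[y1:y2, x1:x2]                # cropped detection-box content
    cap.release()
```

In a real system the detector would be fine-tuned on labeled cartoon frames so that its classes correspond to cartoon characters rather than COCO objects.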
In the above cartoon character image classification processing method, after the characteristic vector of the cartoon character is obtained, it is further necessary to determine whether the cartoon video is finished;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
In the above cartoon character image classification processing method, a convolutional neural network is used to extract feature vectors from the cartoon character image data to obtain cartoon character feature vectors; a cartoon character feature vector describes the features of one character and is discriminative between different characters;
the convolutional neural network extracts the cartoon character feature vector of a cartoon character according to a feature extraction model. In theory the feature extraction model can use any classification network with its final fully connected classification head removed, taking the features output by the last layer as the cartoon character feature vector; common choices include the ResNet series, ShuffleNet and the like. Considering the difference between cartoon character recognition and general object classification tasks, the network is also modified and adjusted to improve feature extraction, as follows. An attention module is added. The attention module implements an attention mechanism, a technique in neural networks that, much as a person focuses on specific details when observing things, lets the feature extraction model focus on the important and effective parts of the features; it computes an importance distribution over the intermediate features and uses it to adjust them. In a cartoon character recognition task the most discriminative features are often concentrated in local areas such as the face, so adding an attention mechanism can improve the accuracy of the model. The convolutional neural network consists of layers of convolution and other operations, and the intermediate result of each layer is a set of feature maps; the attention module takes the feature maps of certain layers, passes them through a convolution and a Sigmoid activation function to generate a weight mask of the same size as the feature maps, and multiplies the mask with the feature maps to obtain corrected feature maps;
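A minimal PyTorch sketch of the convolution-plus-Sigmoid weight mask described above; the single 1x1 convolution, the channel count and the placement after a backbone stage are illustrative assumptions rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class SpatialAttentionMask(nn.Module):
    """Generate a weight mask from a feature map and multiply it back in."""
    def __init__(self, channels):
        super().__init__()
        # Assumption: a single 1x1 convolution produces the mask logits.
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat):                    # feat: (N, C, H, W) intermediate feature maps
        mask = torch.sigmoid(self.conv(feat))   # weight mask, same size as the feature maps
        return feat * mask                      # corrected feature maps

# Usage sketch: attn = SpatialAttentionMask(256); x = attn(x) after a backbone stage.
```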
the attention module with more complicated design, such as non-local module, can be used, compared with the common attention module non-local module, when generating the weight mask (mask), the matrix multiplication is carried out on the local feature and the global feature, so that the receptive field is increased, the global information can be obtained, and the attention module can be added in each down-sampling stage of the whole feature extraction network;
the attribute extraction branch is added, and some attributes of the cartoon characters can be well distinguished and recognized, so that the attribute extraction branch can be added into the feature extraction model to improve the capability of the model for extracting features, defined attributes can be binary or multivariate, such as gender, color, species and the like, and are trained by adding an attribute classification head.
In the above cartoon character image classification processing method, the feature extraction model is trained by adding a classification head after the feature extraction network, so that it can be trained as a classification task using a cross entropy loss, where the classification labels are the annotated character identities. If only a classification loss function is used for training, the model learns to increase the distance between different classes but not to reduce the distance within the same class; the method therefore also includes optimizing the feature extraction model with a triplet loss. The triplet loss directly optimizes the similarity between features of different classes and of the same class and can better improve the model's ability to extract features. The specific method is as follows:
the triplet loss is defined on a triplet data input <a, p, n>;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
for example, in the cartoon character data set, a is one sample of the cartoon character a, p is another sample of the cartoon character a, and n is a sample of the cartoon character B.
The loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter for controlling the distribution intervals of different types of data, and is set according to requirements;
selecting c classes for each training batch (batch) using an online hard sampling mining (online hard sampling) method, each class selecting k samples, for a total of c × k samples;
calculating a distance matrix among all samples in the current batch (batch), taking each sample as an anchor sample, and selecting p which enables d (a, p) to be maximum and n which enables d (a, n) to be minimum as a difficult example to calculate a loss value L;
updating the feature extraction model according to the loss value L and through back propagation;
the training data should be cartoon character images, each image containing only one character, and the annotation information is the identity class of the character; the training samples come from cartoon videos, small images are cropped from the full frames according to the detection annotation boxes, each small image is a sample image of a single cartoon character, and the class of each sample is annotated;
optimizing the triplet loss makes the model learn to pull data of the same class closer together while pushing data of different classes further apart. Training with a triplet loss requires a special sample selection strategy, because traversing all sample triplets would be prohibitively expensive, which is why the online hard example mining described above is used; a batch-hard training sketch is given below.
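A minimal PyTorch sketch of the batch-hard procedure described in these steps (c classes times k samples per batch, hardest positive and hardest negative per anchor). The margin value is an illustrative assumption; the Euclidean distance follows the text.

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """L = max(d(a, p) - d(a, n) + M, 0) with online hard example mining.

    embeddings: (B, D) feature vectors of a batch built from c classes x k samples (k >= 2).
    labels:     (B,) integer character identities.
    """
    dist = torch.cdist(embeddings, embeddings)            # pairwise Euclidean distance matrix
    same = labels.unsqueeze(0) == labels.unsqueeze(1)     # (B, B) same-class mask

    pos = dist.clone()                                    # hardest positive: farthest same-class sample
    pos[~same] = float("-inf")
    pos.fill_diagonal_(float("-inf"))
    hardest_pos = pos.max(dim=1).values

    neg = dist.clone()                                    # hardest negative: closest other-class sample
    neg[same] = float("inf")
    hardest_neg = neg.min(dim=1).values

    return torch.clamp(hardest_pos - hardest_neg + margin, min=0).mean()
```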
In the cartoon character image classification processing method, a density clustering method is used to cluster the cartoon character feature vectors to obtain the cartoon character classification data. Through the previous steps, a cartoon video has been detected and feature extraction has been performed, yielding a large number of cartoon character images and the feature vector corresponding to each image; this information now needs to be summarized and consolidated to obtain the number of characters that actually appear and to carry out identity comparison and verification;
a clustering method is used to process and summarize the data. Because the number of categories to be clustered is not known in advance, the clustering method cannot rely on a known number of categories (as k-means clustering does); a density-based clustering method such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or OPTICS (Ordering Points To Identify the Clustering Structure) is used instead. Images of the same character in different frames are clustered into one category, so most images are classified, while images of characters that appear rarely or with unusual expressions are discarded as outliers. Because a user verification step follows, avoiding two different characters being clustered into the same category is more important than clustering every image of a single character into one category, so the clustering parameters can be adjusted and the maximum distance threshold set relatively small to prevent data of different classes from being grouped into one class;
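A minimal sketch of this clustering step with scikit-learn's DBSCAN; the eps value, min_samples and the cosine metric are illustrative assumptions, and the label -1 marks the outliers that are discarded.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_characters(feature_vectors, eps=0.25, min_samples=5):
    """Group per-image feature vectors into per-character clusters.

    feature_vectors: (N, D) array, one row per cropped cartoon character image.
    Returns {cluster_id: average feature vector}; noise points (label -1) are dropped.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples, metric="cosine").fit_predict(feature_vectors)
    clusters = {}
    for cid in set(labels):
        if cid == -1:                           # rare characters or unusual views: discarded as outliers
            continue
        members = feature_vectors[labels == cid]
        clusters[cid] = members.mean(axis=0)    # average feature vector of the cluster
    return clusters
```

Keeping eps (the maximum distance threshold) relatively small reflects the preference stated above for never merging two different characters into one cluster, at the cost of discarding more images as noise.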
after clustering and classification are finished, the summary extraction of the cartoon characters in the current video is complete. Next, each extracted cartoon character is compared with the known cartoon characters to confirm and display its identity. The comparison method again calculates the cosine distance between feature vectors: each known character has a recorded average feature vector, and for the characters clustered from the current video the average feature vector of the points in each cluster is taken as that cartoon character's average feature vector. When the similarity is higher than a certain threshold, the current cartoon character is considered to be the known cartoon character and its identity is shown to the user; if the similarity between the current cartoon character and all known cartoon characters is lower than the threshold, the current character is considered an unknown character, and the user can choose to add the identity of the cartoon character to the known cartoon character library in the database as required.
The cartoon character image classification processing method comprises the steps that cartoon character information data of multiple categories are stored in the cartoon character classification data, and the cartoon character information data of each category comprise cartoon character information, cartoon character feature vectors and cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
the cartoon character information data of one category is compared with the known cartoon character information in the database by calculating the cosine distance between feature vectors to obtain a similarity;
if the similarity is greater than or equal to a preset threshold value, the cartoon character is judged to be known, and the cartoon character information and the cartoon character image data of the known cartoon character are displayed;
if the similarity is smaller than the preset threshold value, the cartoon character is judged to be unknown; the user then either labels it (adding information to the current cartoon character so that, if it is detected again later in the video, it is treated as a known cartoon character) and stores it, or discards it.
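A minimal sketch of this similarity comparison and threshold decision, assuming cosine similarity against each known character's recorded average feature vector and an illustrative threshold of 0.8; representing the known-character library as a plain dict is also an assumption.

```python
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def identify(cluster_vector, known_characters, threshold=0.8):
    """known_characters: {name: average feature vector} from the database.
    Returns the matched name, or None when the character is unknown."""
    best_name, best_sim = None, -1.0
    for name, ref_vector in known_characters.items():
        sim = cosine_similarity(cluster_vector, ref_vector)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim >= threshold:
        return best_name          # known cartoon character: display its information
    return None                   # unknown: the user labels and stores it, or discards it
```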
The second aspect is a cartoon figure image classification processing system, which comprises a detection module, a feature vector extraction module, a cluster classification module, a similarity comparison module and a database;
the detection module is used for detecting cartoon character images in the cartoon videos to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module;
the characteristic vector extraction module is used for receiving the cartoon character image data, extracting characteristic vectors of cartoon characters from the cartoon character image data to obtain characteristic vectors of the cartoon characters, and transmitting the characteristic vectors of the cartoon characters to the cluster classification module;
the clustering classification module is used for receiving the cartoon character feature vectors and clustering and classifying them to obtain cartoon character classification data, which is transmitted to the similarity comparison module;
the similarity comparison module is used for receiving the cartoon character classification data, comparing the similarity of the cartoon character classification data with the similarity of known cartoon characters, labeling the cartoon character classification data and transmitting the labeled cartoon character classification data to the database;
the database is used for storing the cartoon character classification data and known cartoon characters.
In the cartoon character image classification processing system, a convolutional neural network is stored in each of the detection module and the feature vector extraction module;
the detection module detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and acquires the image data of the cartoon characters of each cartoon character;
the feature vector extraction module extracts feature vectors of the cartoon character image data by adopting a convolutional neural network to obtain cartoon character feature vectors;
the detection module is internally provided with a video judgment module which is used for judging whether the cartoon video is finished or not.
The cartoon figure image classification processing system is characterized in that a feature extraction model optimization module is arranged in the feature vector extraction module, and the feature extraction model optimization module is used for optimizing the feature extraction model.
The technical scheme provided by the cartoon character image classification processing method and system provided by the invention has the following technical effects:
the method comprises the steps of identifying cartoon character images in a cartoon video after traversing the cartoon video, obtaining each cartoon character image, calculating to obtain a cartoon character feature vector, comparing whether the similarity with a certain known character is greater than a threshold value, if so, showing the known cartoon character information, if not, showing the known cartoon character, and after adding identity information or marking information by a user, showing the known cartoon character if the cartoon character appears in the subsequent video in the identification process, so that a worker making the cartoon can be helped to quickly position and identify the cartoon character and quickly identify and count the cartoon character.
Drawings
FIG. 1 is a schematic structural diagram of a cartoon character image classification processing method according to the present invention;
FIG. 2 is a schematic structural diagram of a cartoon character image classification processing system according to the present invention;
FIG. 3 is a schematic flow diagram of the cartoon character image classification processing method according to the present invention;
FIG. 4 is a schematic diagram of non-local modules.
Wherein the reference numbers are as follows:
the system comprises a detection module 101, a feature vector extraction module 102, a cluster classification module 103, a similarity comparison module 104 and a database 105.
Detailed Description
In order to make the technical means, the inventive features, the objectives and the effects of the invention easily understood and appreciated, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the specific drawings, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments.
All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be understood that the structures, ratios, sizes, and the like shown in the drawings and described in the specification are only used to match the content disclosed in the specification so that it can be understood and read by those skilled in the art; they are not intended to limit the conditions under which the present invention can be implemented and thus have no substantive technical significance, and any structural modification, change of ratio, or adjustment of size that does not affect the efficacy or the achievable purpose of the present invention still falls within the scope of the invention.
In addition, the terms "upper", "lower", "left", "right", "middle" and "one" used in this specification are for clarity of description only and are not intended to limit the implementable scope of the present invention; changes or adjustments of their relative relationships, without substantive changes to the technical content, are also to be regarded as falling within the implementable scope of the present invention.
The invention provides a cartoon character image classification processing method and system. The aim is to traverse a cartoon video, identify the cartoon character images in it, acquire each cartoon character image and compute a cartoon character feature vector, and compare whether the similarity with a known character is greater than a threshold. If the similarity is greater than the threshold, the known cartoon character information is shown; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is shown as a known cartoon character if it appears again in subsequent video identification. This helps animation production staff quickly locate, identify and count cartoon characters.
As shown in fig. 1, in a first aspect, a cartoon character image classification processing method includes:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting the characteristic vector of the image data of the cartoon character to obtain the characteristic vector of the cartoon character;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
judging whether each cartoon character is a known cartoon character according to the cartoon character classification data, then labeling the known cartoon characters and storing them in the database 105, wherein the database 105 is used for storing the cartoon character image data, the cartoon character feature vectors and the cartoon character classification data, and whether a cartoon character is known is judged by comparing the cartoon character classification data with the existing cartoon character database.
A convolutional neural network is used to detect the cartoon characters appearing in each frame of the cartoon video and to mark a detection box on each cartoon character image; the content inside the detection box is cropped to obtain the cartoon character image data of each cartoon character, and this cropped image data is reused multiple times in the subsequent steps. The convolutional neural network can be a common object detection network such as the YOLO series or Faster R-CNN and is trained end to end directly.
After the characteristic vector of the cartoon character is obtained, whether the cartoon video is finished or not needs to be judged;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
A convolutional neural network is used to extract feature vectors from the cartoon character image data to obtain cartoon character feature vectors; a cartoon character feature vector describes the features of one character and is discriminative between different characters;
the convolutional neural network extracts the cartoon character feature vectors of the cartoon characters according to the feature extraction model. In theory the feature extraction model can use any classification network with its final fully connected classification head removed, taking the features output by the last layer as the cartoon character feature vector; common choices include the ResNet series, ShuffleNet and the like. Considering the difference between cartoon character recognition and general object classification tasks, the network is also modified and adjusted to improve feature extraction: an attention module is added. The attention module implements an attention mechanism, a technique in neural networks that, much as a person focuses on specific details when observing things, lets the feature extraction model focus on the important and effective parts of the features; it computes an importance distribution over the intermediate features and uses it to adjust them. In a cartoon character recognition task the most discriminative features are often concentrated in local areas such as the face, so adding an attention mechanism can improve the accuracy of the model. The convolutional neural network consists of layers of convolution and other operations, and the intermediate result of each layer is a set of feature maps; the attention module takes the feature maps of certain layers, passes them through a convolution and a Sigmoid activation function to generate a weight mask of the same size as the feature maps, and multiplies the mask with the feature maps to obtain corrected feature maps;
an attention module with a more complex design, such as the non-local module shown in fig. 4, may also be used. Compared with an ordinary attention module, the non-local module performs matrix multiplication between local features and global features when generating the weight mask, which enlarges the receptive field and captures global information; such an attention module may be added at each downsampling stage of the whole feature extraction network;
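A minimal sketch of a non-local block of the kind referred to above, where the weight is formed by matrix multiplication between local and global features; the 1x1 convolutions, channel reduction factor, softmax normalization and residual connection are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Re-weight each position using pairwise products of local and global features."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inner = channels // reduction
        self.theta = nn.Conv2d(channels, inner, 1)     # local query features
        self.phi = nn.Conv2d(channels, inner, 1)       # global key features
        self.g = nn.Conv2d(channels, inner, 1)         # global value features
        self.out = nn.Conv2d(inner, channels, 1)

    def forward(self, x):                              # x: (N, C, H, W)
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (N, HW, C')
        k = self.phi(x).flatten(2)                     # (N, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (N, HW, C')
        attn = torch.softmax(q @ k, dim=-1)            # (N, HW, HW): every position attends to all others
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                         # residual connection keeps the original features
```

Such a block can be inserted at each downsampling stage of the feature extraction network, as the text suggests.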
the attribute extraction branch is added, and some attributes of the cartoon characters can be well distinguished and recognized, so that the attribute extraction branch can be added into the feature extraction model to improve the capability of the model for extracting features, defined attributes can be binary or multivariate, such as gender, color, species and the like, and are trained by adding an attribute classification head.
The training of the feature extraction model adds a classification head after the feature extraction network, so that it can be trained as a classification task using a cross entropy loss, where the classification labels are the annotated character identities. If only a classification loss function is used for training, the model learns to increase the distance between different classes but not to reduce the distance within the same class; the method therefore also includes optimizing the feature extraction model with a triplet loss. The triplet loss optimizes the similarity between features of different classes and of the same class and can better improve the model's ability to extract features. The specific method is as follows:
triple loss is defined as < a, p, n > on a triple data input;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
the loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter and is set according to requirements;
selecting c classes for each training batch (batch) using an online hard sampling mining (online hard sampling) method, each class selecting k samples, for a total of c × k samples;
calculating a distance matrix among all samples in the current batch (batch), taking each sample as an anchor sample, and selecting p which enables d (a, p) to be maximum and n which enables d (a, n) to be minimum as a difficult example to calculate a loss value L;
updating the feature extraction model through back propagation according to the loss value L;
the training data should be cartoon character images, each image containing only one character, and the annotation information is the identity class of the character; the training samples come from cartoon videos, small images are cropped from the full frames according to the detection annotation boxes, each small image is a sample image of a single cartoon character, and the class of each sample is annotated;
optimizing the triplet loss makes the model learn to pull data of the same class closer together while pushing data of different classes further apart. Training with a triplet loss requires a special sample selection strategy, because traversing all sample triplets would be prohibitively expensive, which is why online hard example mining is used.
A density clustering method is used to cluster the cartoon character feature vectors to obtain the cartoon character classification data. Through the previous steps, a cartoon video has been detected and feature extraction has been performed, yielding a large number of cartoon character images and the feature vector corresponding to each image; this information now needs to be summarized and consolidated to obtain the number of characters that actually appear and to carry out identity comparison and verification;
a clustering method is used to process and summarize the data. Because the number of categories to be clustered is not known in advance, the clustering method cannot rely on a known number of categories (as k-means clustering does); a density-based clustering method such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or OPTICS (Ordering Points To Identify the Clustering Structure) is used instead. Images of the same character in different frames are clustered into one category, so most images are classified, while images of characters that appear rarely or with unusual expressions are discarded as outliers. Because a user verification step follows, avoiding two different characters being clustered into the same category is more important than clustering every image of a single character into one category, so the clustering parameters can be adjusted and the maximum distance threshold set relatively small to prevent data of different classes from being grouped into one class;
after clustering and classification are finished, the summary extraction of the cartoon characters in the current video is complete. Next, each extracted cartoon character is compared with the known cartoon characters to confirm and display its identity. The comparison method again calculates the cosine distance between feature vectors: each known character has a recorded average feature vector, and for the characters clustered from the current video the average feature vector of the points in each cluster is taken as that cartoon character's average feature vector. When the similarity is higher than a certain threshold, the current cartoon character is considered to be the known cartoon character and its identity is shown to the user; if the similarity between the current cartoon character and all known cartoon characters is lower than the threshold, the current character is considered an unknown character, and the user can choose to add the identity of the cartoon character to the known cartoon character library in the database 105 as required.
The cartoon character classification data stores cartoon character information data of multiple categories, and the cartoon character information data of each category comprises cartoon character information, cartoon character feature vectors and cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
the cartoon character information data of one category is compared with the known cartoon character information in the database 105 by calculating the cosine distance between feature vectors to obtain a similarity;
if the similarity is greater than or equal to a preset threshold value, the cartoon character is judged to be known, and the cartoon character information and the cartoon character image data of the known cartoon character are displayed;
if the similarity is smaller than the preset threshold value, the cartoon character is judged to be unknown; the user then either labels it (adding information to the current cartoon character so that, if it is detected again later in the video, it is treated as a known cartoon character) and stores it, or discards it.
As shown in fig. 2, in a second aspect, a cartoon character image classification processing system includes a detection module 101, a feature vector extraction module 102, a cluster classification module 103, a similarity comparison module 104, and a database 105;
the detection module 101 is used for detecting cartoon character images in the cartoon video to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module 102;
the feature vector extraction module 102 is configured to receive the image data of the cartoon character, extract feature vectors of the cartoon character from the image data to obtain feature vectors of the cartoon character, and transmit the feature vectors of the cartoon character to the cluster classification module 103;
the clustering classification module 103 is used for receiving the cartoon character feature vectors and clustering and classifying them to obtain cartoon character classification data, which is transmitted to the similarity comparison module 104;
the similarity comparison module 104 is used for receiving the cartoon character classification data, comparing the similarity of the data with the similarity of known cartoon characters, labeling the data, and transmitting the labeled cartoon character classification data to the database 105;
the database 105 is used for storing the cartoon character classification data and the known cartoon characters.
The detection module 101 and the feature vector extraction module 102 both store convolutional neural networks therein;
the detection module 101 detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and obtains the cartoon character image data of each cartoon character;
the feature vector extraction module 102 extracts feature vectors of cartoon character image data by using a convolutional neural network to obtain cartoon character feature vectors;
the detection module 101 is internally provided with a video judgment module, and the video judgment module is used for judging whether the cartoon video is finished.
The feature vector extraction module 102 is internally provided with a feature extraction model optimization module, and the feature extraction model optimization module is used for optimizing the feature extraction model.
As shown in fig. 3, the specific classification process of the cartoon character image classification processing system is as follows:
1. starting;
2. reading a frame of video image;
3. detecting cartoon characters;
4. intercepting each cartoon character image, and calculating a cartoon character feature vector;
5. recording the cartoon character image and the cartoon character feature vector;
6. judging whether the video is finished (the end condition of the loop);
if not, reading a frame of video image again;
if the video has finished, continue to the next step;
7. clustering all cartoon character images according to the feature vectors, and calculating average feature vectors;
8. taking a category;
9. calculating similarity with all known cartoon character feature vectors;
10. judging whether the similarity with a known cartoon character is greater than a threshold value;
if yes, displaying cartoon character information and category images;
if not, entering the next step;
11. the user labels or discards the category according to the category images;
if discarded, go to step 13 to judge whether all cluster categories have been traversed;
if labeled, enter the next step;
12. adding the labeling information and the category feature vector to the known cartoon character database 105;
13. judging whether all cluster categories have been traversed;
1) if not, take another category (return to step 8);
2) if yes, the flow ends. An end-to-end sketch of this loop is given below.
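A minimal end-to-end sketch of this loop; crop_characters, cluster_characters and identify are the hypothetical helpers sketched earlier, embed_fn stands for the feature extraction model, and the interactive prompt is a stand-in for the user labeling step.

```python
import numpy as np

def process_video(video_path, embed_fn, known_characters, threshold=0.8):
    """Traverse a cartoon video, cluster the characters in it and match them against
    the known-character library (a dict {name: average feature vector})."""
    vectors = []
    for crop in crop_characters(video_path):            # steps 2-6: read frames, detect, crop
        vectors.append(embed_fn(crop))                   # step 4: cartoon character feature vector
    clusters = cluster_characters(np.stack(vectors))     # step 7: cluster and average
    for cid, mean_vec in clusters.items():               # steps 8-13: one category at a time
        name = identify(mean_vec, known_characters, threshold)
        if name is not None:
            print(f"cluster {cid}: known character {name}")                      # step 10: above threshold
        else:
            label = input(f"cluster {cid} is unknown, label (empty to discard): ")  # step 11
            if label:
                known_characters[label] = mean_vec        # step 12: add to the known library
    return known_characters
```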
Those of skill in the art would understand that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of software and electronic hardware;
whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution;
skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments of the present application, the disclosed system, apparatus and method may be implemented in other ways;
for example, the division of a unit or a module is only one logic function division, and there may be another division manner in actual implementation;
for example, a plurality of units or modules or components may be combined or may be integrated into another system;
in addition, functional units or modules in the embodiments of the present application may be integrated into one processing unit or module, or may exist separately and physically.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a machine-readable storage medium;
therefore, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include several instructions to cause an electronic device to execute all or part of the processes of the technical solution described in the embodiments of the present application;
the storage medium may include various media that can store program codes, such as ROM, RAM, a removable disk, a hard disk, a magnetic disk, or an optical disk.
In summary, with the cartoon character image classification processing method and system provided by the invention, after a cartoon video is traversed the cartoon character images in it can be identified, each cartoon character image is obtained and a cartoon character feature vector is computed, and the similarity with each known character is compared against a threshold. If the similarity is greater than the threshold, the known cartoon character information is displayed; if it is less than the threshold, the character is an unknown cartoon character, and after the user adds identity or labeling information, the character is displayed as a known cartoon character whenever it appears in subsequently identified videos. This helps animation production staff quickly locate, identify and count cartoon characters.
Specific embodiments of the invention have been described above. It is to be understood that the invention is not limited to the particular embodiments described; devices and structures not described in detail are understood to be implemented in a manner common in the art, and various changes or modifications may be made by those skilled in the art within the scope of the claims without departing from the spirit of the invention, and these do not affect the substance of the invention.

Claims (10)

1. A cartoon character image classification processing method is characterized by comprising the following steps:
acquiring cartoon character image data of each cartoon character in a cartoon video;
extracting characteristic vectors of the cartoon character image data to obtain cartoon character characteristic vectors;
clustering and classifying the cartoon characters according to the characteristic vectors of the cartoon characters to obtain cartoon character classification data;
and judging whether the cartoon characters are known cartoon characters according to the cartoon character classification data, labeling and storing the known cartoon characters in a database.
2. The cartoon character image classification processing method of claim 1, wherein a convolutional neural network is adopted to detect cartoon characters appearing in each frame of a cartoon video and mark a detection box on an image of a cartoon character, and the contents in the detection box are captured to obtain the cartoon character image data of each cartoon character.
3. The cartoon character image classification processing method of claim 1 or 2, wherein after the cartoon character feature vector is obtained, whether the cartoon video is finished is judged;
if the cartoon video is not finished, continuously acquiring cartoon character image data of cartoon characters in the cartoon video;
and if the cartoon video is finished, finishing the acquisition of the cartoon character image data.
4. The cartoon character image classification processing method of claim 3, wherein a convolutional neural network is adopted to extract the characteristic vector of the cartoon character image data to obtain a cartoon character characteristic vector;
the convolutional neural network extracts the cartoon character feature vector of the cartoon character according to a feature extraction model;
the convolutional neural network comprises an attention module, wherein the attention module is used for acquiring a feature map of certain layers in the convolutional neural network, generating a weight mask with the same size as the feature map after the feature map is subjected to convolution and a Sigmoid activation function, and multiplying the mask and the feature map to acquire a modified feature map.
5. The cartoon character image classification processing method of claim 4, further comprising optimizing the feature extraction model by using a triplet loss, the specific method being as follows:
the triplet loss is defined on a triplet data input <a, p, n>;
wherein a is an anchor sample in the triplet, and the anchor sample is a sample of a certain category;
p is a sample of the same class as a;
n is a sample of a different class than a;
the loss function is defined as L = max(d(a, p) - d(a, n) + M, 0);
a, p and n are all the feature vectors of the corresponding samples;
d (a, p) is a function for calculating the distance between the sample a and the sample p, d (a, n) is a function for calculating the distance between the sample a and the sample n, and Euclidean distance is adopted;
m is a hyper-parameter for controlling the distribution interval of different types of data;
c classes are selected for each training batch by using an online hard case mining method, k samples are selected for each class, and the total of c multiplied by k samples is obtained;
calculating a distance matrix among all samples in the current batch, taking each sample as an anchor sample, and taking p enabling d (a, p) to be maximum and n enabling d (a, n) to be minimum as a difficult example to calculate a loss value L;
and updating the feature extraction model according to the loss value L and through back propagation.
6. The cartoon character image classification processing method of claim 1, wherein the cartoon character classification data is obtained by clustering the cartoon character feature vectors by a density clustering method.
7. The cartoon character image classification processing method of claim 6, wherein the cartoon character classification data stores cartoon character information data of a plurality of categories, each category of cartoon character information data including cartoon character information, the cartoon character feature vector, the cartoon character image data;
the process of judging whether the cartoon character is a known cartoon character according to the cartoon character classification data is as follows:
extracting the cartoon character information data of one category and the known cartoon character information in the database to obtain similarity by a method of calculating cosine distance between feature vectors;
if the similarity is larger than or equal to a preset threshold value, displaying the cartoon character information and the cartoon character image data of the cartoon character;
and if the similarity is smaller than a preset threshold value, marking and storing or discarding.
8. A cartoon figure image classification processing system is characterized by comprising a detection module, a feature vector extraction module, a clustering classification module, a similarity comparison module and a database;
the detection module is used for detecting cartoon character images in the cartoon videos to obtain cartoon character image data of each cartoon character and transmitting the cartoon character image data to the feature vector extraction module;
the characteristic vector extraction module is used for receiving the cartoon character image data, extracting characteristic vectors of cartoon characters from the cartoon character image data to obtain characteristic vectors of the cartoon characters, and transmitting the characteristic vectors of the cartoon characters to the cluster classification module;
the clustering classification module is used for receiving the cartoon character feature vectors, clustering and classifying to obtain cartoon character classification data transmitted to the similarity comparison module;
the similarity comparison module is used for receiving the cartoon character classification data, comparing the similarity of the cartoon character classification data with the similarity of known cartoon characters, labeling the cartoon character classification data and transmitting the labeled cartoon character classification data to the database;
the database is used for storing the cartoon character classification data and known cartoon characters.
9. The cartoon character image classification processing system of claim 8, wherein the detection module and the feature vector extraction module each have a convolutional neural network stored therein;
the detection module detects the cartoon characters appearing in each frame of the cartoon video by adopting a convolutional neural network, marks a detection box on the images of the cartoon characters, captures the contents in the detection box and acquires the image data of the cartoon characters of each cartoon character;
the feature vector extraction module extracts feature vectors of the cartoon character image data by adopting a convolutional neural network to obtain cartoon character feature vectors;
the detection module is internally provided with a video judgment module which is used for judging whether the cartoon video is finished or not.
10. The cartoon character image classification processing system of claim 9, wherein the feature vector extraction module is embedded with a feature extraction model optimization module for optimizing the feature extraction model.
CN202110687533.1A 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system Pending CN113486201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110687533.1A CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110687533.1A CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Publications (1)

Publication Number Publication Date
CN113486201A true CN113486201A (en) 2021-10-08

Family

ID=77934100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110687533.1A Pending CN113486201A (en) 2021-06-21 2021-06-21 Cartoon figure image classification processing method and system

Country Status (1)

Country Link
CN (1) CN113486201A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114881079A (en) * 2022-05-11 2022-08-09 北京大学 Human body movement intention abnormity detection method and system for wearable sensor


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240226

Address after: 5 / F, 277 Huqingping Road, Minhang District, Shanghai, 201100

Applicant after: Shanghai Thermosphere Information Technology Co.,Ltd.

Country or region after: China

Address before: Building C, No.888, Huanhu West 2nd Road, Lingang New District, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Applicant before: Shanghai stratosphere Intelligent Technology Co.,Ltd.

Country or region before: China

TA01 Transfer of patent application right