CN111832498B - Cartoon face recognition method based on convolutional neural network - Google Patents

Cartoon face recognition method based on convolutional neural network

Info

Publication number
CN111832498B
CN111832498B (Application CN202010692679.0A)
Authority
CN
China
Prior art keywords
picture
convolutional neural
neural network
pictures
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010692679.0A
Other languages
Chinese (zh)
Other versions
CN111832498A (en
Inventor
王笛
田玉敏
黄珍
刘瑗
万波
杨鹏飞
赵辉
罗楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202010692679.0A priority Critical patent/CN111832498B/en
Publication of CN111832498A publication Critical patent/CN111832498A/en
Application granted granted Critical
Publication of CN111832498B publication Critical patent/CN111832498B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a cartoon face recognition method based on a convolutional neural network, comprising the following steps: (1) generating a training set; (2) generating a C-F Loss function; (3) training an Xception convolutional neural network; (4) generating an identification picture set; and (5) recognizing the cartoon face pictures. The invention uses an Xception convolutional neural network to extract features, so more complete cartoon face features can be extracted and a higher recognition rate obtained. At the same time, a Focal Loss term is added to the cross-entropy loss function to form the C-F Loss function, which solves the problems of an unbalanced number of pictures across classes and an unbalanced difficulty of outputting the correct class name after training.

Description

Cartoon face recognition method based on convolutional neural network
Technical Field
The invention belongs to the technical field of image processing, and further relates to a cartoon face recognition method based on a convolutional neural network in the technical field of image recognition. The invention can be applied to identifying the identity information corresponding to a cartoon face from a facial image.
Background
Caricature is an art form that depicts a face with simple and exaggerated techniques. The same person may have facial caricatures in different styles, each highlighting different facial features. A person can easily tell which person such a cartoon face portrait belongs to, but this remains challenging for a machine. Studying cartoon face recognition can help us better understand human perception of faces: beyond plain face recognition, the study of caricatures reveals the intrinsic nature of human face perception. Insights that computer scientists gain from such psychological studies may advance machine learning methods and further improve the performance of caricature and face recognition.
Pushkar Shukla et al., in the paper "CartoonNet: Caricature Recognition of Public Figures" (Proceedings of the 3rd International Conference on Computer Vision and Image Processing, pp. 1-10, 2019), propose a cartoon face recognition method based on a deep convolutional neural network. The training set in the paper is built from the public IIIT-CFW dataset (Mishra et al., European Conference on Computer Vision, 2016), which contains cartoon face images of public figures. In the paper's experiments, only classes with more than 35 pictures in the dataset are added to the training process. The method applies well to practical cartoon face recognition, but it still has a drawback: because only classes with more than 35 pictures of the same type are selected and classes with fewer images are ignored, the number of people that can be recognized during cartoon face recognition is reduced, which affects how many face identities can be identified.
Hangzhou Dianzi University discloses a cartoon face recognition method using gated fusion of discriminative features in its patent application (application number: 201911157921.8, publication number: CN 111079549 A). The method first preprocesses the data, then extracts and fuses features, fusing 17 local features with global features. The global features are extracted by scaling each picture to 112×96 and feeding it into a lightweight network; finally, the cosine distance is computed between the fused caricature features and the face photo features. The method achieves good recognition results, but it has a drawback: because the picture used for global feature extraction is scaled down, reducing the pixel size of each picture, and a lightweight network is used for feature extraction, the extracted features of each picture are incomplete, which affects the recognition rate of the identity information corresponding to a face.
Disclosure of Invention
The invention aims to provide a cartoon face recognition method based on a convolutional neural network that addresses the above defects of the prior art. It solves two problems: the reduced number of people recognized during cartoon face recognition caused by ignoring classes with few images, and the lower recognition rate of the identity information corresponding to a face caused by incomplete picture feature extraction.
The C-F Loss function allows classes with few images to be added to training, so the number of people recognized during cartoon face recognition is no longer reduced. During training, an Xception convolutional neural network is used to extract picture features, solving the problem that incomplete feature extraction lowers the recognition rate of identity information.
The method comprises the following specific steps:
(1) Generating a training set:
(1a) Collect cartoon face pictures and face photos of each person to be identified, at least 15 pictures per person;
(1b) Mark the corners of both eyes in each picture as key points, obtain face-aligned pictures with an eye-based face alignment method, and crop each aligned picture to 250×350 to obtain cropped pictures;
(1c) Group all cropped pictures of each person to be identified into one class, use the person's name as the class name, take all cropped pictures in each class as training pictures, and combine the training pictures of all classes into a training set;
(2) The C-F Loss function is generated as follows:
F = -[y·log(y′) + (1-y)·log(1-y′)] + [-α·(1-y′)^γ·log(y′)] × e
where y denotes the true class label of the picture input to the Xception convolutional neural network; y′ denotes the predicted output of the Xception network during training; α is a parameter addressing the imbalance in the number of pictures across classes, with value range [0, 1]; γ is a parameter addressing the imbalance in how difficult different pictures are to classify correctly after training, with value range [0, +∞); and e is a recognition-rate factor that adjusts the weight of the focal term, with value range [0, 1].
(3) Training the Xception convolutional neural network:
Input the training set into the Xception convolutional neural network and iteratively train on the training pictures with an Adam optimizer until the value of the C-F Loss function converges to its minimum, yielding a trained Xception convolutional neural network; save the weights of the trained network;
(4) Generating an identification picture set:
Collect cartoon face pictures of each person to be identified that do not repeat any picture in the training set, at least 1 picture per person, and combine all these cartoon face pictures into an identification picture set;
(5) Identifying cartoon face pictures:
Input each picture in the identification picture set into the trained Xception convolutional neural network in turn, and output each picture together with its corresponding class name.
Compared with the prior art, the invention has the following advantages:
First, the invention generates a C-F Loss function for training the Xception convolutional neural network, overcoming two problems in the prior art: the imbalance in the number of pictures across classes, and the imbalance in how difficult different pictures are to classify correctly after training. Because classes with few pictures no longer have to be excluded from training, no pictures need to be deleted before training the Xception convolutional neural network and the number of classes is not reduced, which increases the number of cartoon face classes that can be identified during recognition.
Second, the invention trains the Xception convolutional neural network with the pictures in the training set. Because features are extracted with multiple convolution kernels of different sizes, the network adapts to features of different scales in the picture to be identified and extracts the features in the picture more completely. This solves the prior-art problem that reduced pixel sizes and lightweight feature-extraction networks lower the recognition rate of the identity information corresponding to a face; by extracting more complete features from the picture to be identified, the invention improves that recognition rate.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic illustration of cartoon face samples from the WebCaricature dataset used in the simulation experiment;
FIG. 3 is a schematic diagram of the cartoon face samples after the eye-based face alignment method is applied to each picture in FIG. 2.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The specific steps of the present invention will be described in further detail with reference to fig. 1.
And step 1, generating a training set.
Firstly, collect cartoon face pictures and face photos of each person to be identified, at least 15 pictures per person;
Secondly, mark the corners of both eyes in each picture as key points, obtain face-aligned pictures with an eye-based face alignment method, and crop each aligned picture to 250×350 to obtain cropped pictures;
Thirdly, group all cropped pictures of each person to be identified into one class, use the person's name as the class name, take all cropped pictures in each class as training pictures, and combine the training pictures of all classes into a training set.
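The eye-based alignment in the second step can be sketched as a similarity transform computed from the two eye key points, so that the eyes land at fixed positions inside the 250×350 crop. The target eye coordinates below are illustrative assumptions, not values given in the patent:

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye,
                            target_left=(95.0, 125.0),
                            target_right=(155.0, 125.0)):
    """Similarity transform (rotation + scale + translation) mapping the
    detected eye centres onto fixed target positions inside a 250x350 crop.
    The target coordinates are assumptions for illustration only.
    Returns a 2x3 affine matrix usable with e.g. cv2.warpAffine."""
    src = np.asarray([left_eye, right_eye], dtype=float)
    dst = np.asarray([target_left, target_right], dtype=float)
    # rotation angle and scale from the eye-to-eye vectors
    v_src, v_dst = src[1] - src[0], dst[1] - dst[0]
    angle = np.arctan2(v_dst[1], v_dst[0]) - np.arctan2(v_src[1], v_src[0])
    scale = np.linalg.norm(v_dst) / np.linalg.norm(v_src)
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    t = dst[0] - R @ src[0]          # translation so the left eye hits its target
    return np.hstack([R, t[:, None]])
```

Applying the returned matrix with an affine warp and then taking the 250×350 window yields the cropped, face-aligned picture.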
and 2, generating a C-F Loss function as follows.
F = -[y·log(y′) + (1-y)·log(1-y′)] + [-α·(1-y′)^γ·log(y′)] × e
Where y denotes the true class label of the picture input to the Xception convolutional neural network; y′ denotes the predicted output of the Xception network during training; α is a parameter addressing the imbalance in the number of pictures across classes, with value range [0, 1]; γ is a parameter addressing the imbalance in how difficult different pictures are to classify correctly after training, with value range [0, +∞); and e is a recognition-rate factor that adjusts the weight of the focal term, with value range [0, 1].
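Under these definitions, a minimal sketch of the C-F Loss is the binary cross-entropy term plus the Focal Loss term scaled by e. The defaults α = 0.25 and γ = 2 come from the original Focal Loss paper and are only illustrative; the patent does not fix their values:

```python
import numpy as np

def cf_loss(y, y_pred, alpha=0.25, gamma=2.0, e=1.0):
    """C-F Loss: cross-entropy plus an e-scaled focal-loss term.

    y      : ground-truth label (0 or 1).
    y_pred : predicted probability, in (0, 1).
    alpha  : class-imbalance parameter, range [0, 1].
    gamma  : focusing parameter, range [0, +inf).
    e      : recognition-rate factor, range [0, 1]."""
    eps = 1e-12                                   # avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    ce = -(y * np.log(y_pred) + (1.0 - y) * np.log(1.0 - y_pred))  # cross-entropy
    focal = -alpha * (1.0 - y_pred) ** gamma * np.log(y_pred)      # focal term
    return ce + focal * e
```

Confidently correct predictions (y′ near 1 for y = 1) are down-weighted by the (1-y′)^γ factor, so training focuses on hard examples.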
Step 3, training the Xception convolutional neural network.
Input the training set into the Xception convolutional neural network and iteratively train on the training pictures with an Adam optimizer until the value of the C-F Loss function converges to its minimum, yielding a trained Xception convolutional neural network; save the weights of the trained network.
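The Adam-driven minimization of step 3 can be illustrated with the bare update rule; here a toy one-dimensional quadratic stands in for the C-F Loss over the network weights:

```python
import numpy as np

def adam_minimize(grad_fn, w, lr=0.01, steps=500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimal Adam update loop, as used to drive the loss toward its
    minimum during training (`w` stands in for the network weights)."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g        # 1st-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g    # 2nd-moment (variance) estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# toy loss (w - 3)^2 with gradient 2(w - 3); Adam converges toward w = 3
w_final = adam_minimize(lambda w: 2 * (w - 3.0), np.array([0.0]),
                        lr=0.05, steps=1000)
```

In practice the gradient comes from backpropagating the C-F Loss through the Xception network, but the update rule is the same.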
and 4, generating an identification picture set.
Collect cartoon face pictures of each person to be identified that do not repeat any picture in the training set, at least 1 picture per person, and combine all these cartoon face pictures into an identification picture set.
and 5, recognizing the cartoon face picture.
Input each picture in the identification picture set into the trained Xception convolutional neural network in turn, and output each picture together with its corresponding class name.
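Step 5 amounts to a forward pass followed by post-processing that maps the network's outputs to class names. The softmax/argmax post-processing shown here is a standard assumption, since the patent does not detail it:

```python
import numpy as np

def identify(logits, class_names):
    """Map network outputs for a batch of identification pictures to class
    names (person names). `logits` has shape (batch, num_classes)."""
    # numerically stable softmax over classes
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    # the class with the highest probability gives the predicted name
    return [class_names[i] for i in probs.argmax(axis=1)]
```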
The effects of the present invention can be further illustrated by the following simulation experiments.
1. Simulation conditions:
The simulation experiment of the invention uses PyCharm as the simulation tool; the computer is configured with an Intel Core i7 at 3.6 GHz, 16 GB of memory, and a 64-bit Windows 7 operating system.
The data used in the simulation experiments come from the WebCaricature cartoon face dataset created by Nanjing University, which consists of 6042 caricatures and 5974 photographs covering 252 person identities. The size of each image is 250×350. Fig. 2 shows 3 cartoon face samples for each of two person identities selected from the WebCaricature dataset.
2. Simulation experiment contents:
The simulation experiment is carried out on the WebCaricature cartoon face dataset; the face caricature images are identified with the method of the invention to obtain the recognition rate. The eye-based face alignment method is applied to each picture in the WebCaricature dataset to obtain face-aligned pictures. Fig. 3 shows the cartoon face samples after face alignment of the pictures in Fig. 2. The aligned dataset is randomly divided into a training set, a validation set, and a test set in a 6:2:2 ratio. The training set is input into the Xception convolutional neural network for training, and data augmentation is used during training to improve the generalization ability of the model. After each training epoch, the model is evaluated on the validation set to obtain the recognition rate and loss value; early stopping based on these values avoids overfitting from over-training. Finally, the best-performing model is evaluated on the test set to obtain the recognition rate.
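The 6:2:2 random split described above can be sketched as follows (the fixed seed is an assumption added for reproducibility, not part of the patent):

```python
import random

def split_622(items, seed=0):
    """Randomly divide a list of samples into train/validation/test
    subsets in a 6:2:2 ratio, as in the simulation experiment."""
    items = list(items)
    random.Random(seed).shuffle(items)   # deterministic shuffle for the sketch
    n = len(items)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```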

Claims (1)

1. A cartoon face recognition method based on a convolutional neural network, characterized in that a C-F Loss function is used to train an Xception convolutional neural network; the method comprises the following steps:
(1) Generating a training set:
(1a) Collect cartoon face pictures and face photos of each person to be identified, at least 15 pictures per person;
(1b) Mark the corners of both eyes in each picture as key points, obtain face-aligned pictures with an eye-based face alignment method, and crop each aligned picture to 250×350 to obtain cropped pictures;
(1c) Group all cropped pictures of each person to be identified into one class, use the person's name as the class name, take all cropped pictures in each class as training pictures, and combine the training pictures of all classes into a training set;
(2) The C-F Loss function is generated as follows:
F = -[y·log(y′) + (1-y)·log(1-y′)] + [-α·(1-y′)^γ·log(y′)] × e
where y denotes the true class label of the picture input to the Xception convolutional neural network; y′ denotes the predicted output of the Xception network during training; α is a parameter addressing the imbalance in the number of pictures across classes, with value range [0, 1]; γ is a parameter addressing the imbalance in how difficult different pictures are to classify correctly after training, with value range [0, +∞); and e is a recognition-rate factor that adjusts the weight of the focal term, with value range [0, 1];
(3) Training the Xception convolutional neural network:
Input the training set into the Xception convolutional neural network and iteratively train on the training pictures with an Adam optimizer until the value of the C-F Loss function converges to its minimum, yielding a trained Xception convolutional neural network; save the weights of the trained network;
(4) Generating an identification picture set:
Collect cartoon face pictures of each person to be identified that do not repeat any picture in the training set, at least 1 picture per person, and combine all these cartoon face pictures into an identification picture set;
(5) Identifying cartoon face pictures:
Input each picture in the identification picture set into the trained Xception convolutional neural network in turn, and output each picture together with its corresponding class name.
CN202010692679.0A 2020-07-17 2020-07-17 Cartoon face recognition method based on convolutional neural network Active CN111832498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010692679.0A CN111832498B (en) 2020-07-17 2020-07-17 Cartoon face recognition method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010692679.0A CN111832498B (en) 2020-07-17 2020-07-17 Cartoon face recognition method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111832498A CN111832498A (en) 2020-10-27
CN111832498B true CN111832498B (en) 2023-07-28

Family

ID=72923534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010692679.0A Active CN111832498B (en) 2020-07-17 2020-07-17 Cartoon face recognition method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111832498B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112015B (en) * 2021-04-06 2023-10-20 咪咕动漫有限公司 Model training method, device, electronic equipment and computer readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214308A (en) * 2018-08-15 2019-01-15 武汉唯理科技有限公司 A kind of traffic abnormity image identification method based on focal loss function
CN109214360A (en) * 2018-10-15 2019-01-15 北京亮亮视野科技有限公司 A kind of construction method of the human face recognition model based on ParaSoftMax loss function and application
GB201910720D0 (en) * 2019-07-26 2019-09-11 Tomtom Global Content Bv Generative adversarial Networks for image segmentation
CN110516576A (en) * 2019-08-20 2019-11-29 西安电子科技大学 Near-infrared living body faces recognition methods based on deep neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CNN-based image recognition model for crop diseases and pests; Shi Bingying; Li Jiaqi; Zhang Lei; Li Jian; Computer Systems & Applications (Issue 06); full text *
Semantic-segmentation-based straw detection in complex scenes; Liu Yuanyuan; Zhang Shuo; Yu Haiye; Wang Yueyong; Wang Jiamu; Optics and Precision Engineering (Issue 01); full text *

Also Published As

Publication number Publication date
CN111832498A (en) 2020-10-27

Similar Documents

Publication Publication Date Title
CN110569721B (en) Recognition model training method, image recognition method, device, equipment and medium
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
JP6484333B2 (en) Intelligent scoring method and system for descriptive problems
CN107967475A (en) A kind of method for recognizing verification code based on window sliding and convolutional neural networks
Karatzas et al. ICDAR 2011 robust reading competition-challenge 1: reading text in born-digital images (web and email)
CN107808358B (en) Automatic detection method for image watermark
CN109410184B (en) Live broadcast pornographic image detection method based on dense confrontation network semi-supervised learning
CN108446700A (en) A kind of car plate attack generation method based on to attack resistance
CN112801057B (en) Image processing method, image processing device, computer equipment and storage medium
CN109359550B (en) Manchu document seal extraction and removal method based on deep learning technology
CN112001282A (en) Image recognition method
CN105608454A (en) Text structure part detection neural network based text detection method and system
CN109446345A (en) Nuclear power file verification processing method and system
CN111046760A (en) Handwriting identification method based on domain confrontation network
CN110163567A (en) Classroom roll calling system based on multitask concatenated convolutional neural network
CN111832498B (en) Cartoon face recognition method based on convolutional neural network
Saudagar et al. Augmented reality mobile application for arabic text extraction, recognition and translation
CN109741351A (en) A kind of classification responsive type edge detection method based on deep learning
CN112801923A (en) Word processing method, system, readable storage medium and computer equipment
CN113553947B (en) Method and device for generating and describing multi-mode pedestrian re-recognition and electronic equipment
CN111813996B (en) Video searching method based on sampling parallelism of single frame and continuous multi-frame
CN111832622A (en) Method and system for identifying ugly pictures of specific figures
Pang et al. PTRSegNet: A Patch-to-Region Bottom-Up Pyramid Framework for the Semantic Segmentation of Large-Format Remote Sensing Images
Lima et al. Using convolutional neural networks for fingerspelling sign recognition in brazilian sign language
Jain et al. Dynamic Visualization of an Image for Interactive Actions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant