CN113627528B - Automatic identification method for painters belonging to traditional Chinese painting based on human eye vision deep learning - Google Patents
- Publication number
- CN113627528B (granted publication of application CN202110917411.7A / CN202110917411A)
- Authority
- CN
- China
- Prior art keywords
- chinese painting
- traditional chinese
- sample
- artist
- visual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/2415 - Classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false acceptance rate versus false rejection rate
- G06N3/047 - Probabilistic or stochastic neural networks
- G06N3/08 - Neural network learning methods
- G06T5/10 - Image enhancement or restoration using non-spatial domain filtering
- G06T5/20 - Image enhancement or restoration using local operators
- G06T5/90 - Dynamic range modification of images or parts thereof
- G06T2207/10004 - Still image; photographic image
- G06T2207/20024 - Filtering details
- G06T2207/20048 - Transform domain processing
- G06T2207/20081 - Training; learning
- G06T2207/20084 - Artificial neural networks [ANN]
Abstract
The invention discloses a method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning, applying deep learning built on visual processing mechanisms to the field of art. Each traditional Chinese painting sample in a sample data set first undergoes visual front-end saliency processing and extraction to produce a visual saliency map; visual back-end perception information processing, based on the relation between each pixel value and its neighborhood pixel values, then yields an ordered style effect map. Taking the ordered style effect map obtained from each sample as input and the corresponding artist as output, a convolutional neural network is trained on the sample data set to obtain an artist identification model, which is then used to identify the artist of a traditional Chinese painting to be identified. Compared with existing identification methods, the method improves the accuracy of automatic identification and agrees closely with the judgment of human observers.
Description
Technical Field
The invention relates to the application of deep learning based on visual processing mechanisms in the field of art, and in particular to a method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning.
Background
With rising living standards, people increasingly pursue cultural life, and the demand for appreciating traditional Chinese painting has grown accordingly. Large numbers of digitized Chinese paintings now appear in networks and digital museums, but how to efficiently utilize and manage them remains an open problem. Research on automatically classifying and identifying the artistic style of Chinese paintings and their authors therefore has great practical value. Related work exists in the field of Chinese painting feature extraction and classification. Li et al. designed algorithms to classify the works of painters such as Shen Zhou and Zhang Daqian, but their features could not fully describe the paintings, and the omission of other important information led to unsatisfactory classification results. Jiang et al. extracted color and texture features from two classes of Chinese painting and classified them with a support vector machine (SVM). Chen Junjie et al. extracted color, shape, and other features from landscape and flower-and-bird Chinese paintings and likewise classified with an SVM. Sun et al. proposed feature selection using a Monte Carlo convex hull model to classify the authors of Chinese paintings. Liu Xiaowei et al. distinguished different paintings through four features (color-palette entropy, redundancy, order degree, and complexity), but their algorithm extracts only color-related features and ignores the influence of other features on classification. Wang Zheng et al. proposed a supervised heterogeneous sparse feature selection algorithm for feature screening and author classification of Chinese paintings.
Sheng et al. proposed a fusion algorithm that integrates the classification results of global and local features via information entropy, extracting texture features with a 3-level wavelet transform for Chinese painting classification. The existing literature extracts Chinese painting features with traditional methods, and the resulting features are single and one-sided. In addition, Land et al. used neural networks to extract Chinese painting features, Tenenbaum et al. proposed a feature recognition method with embedded machine learning, Li Yuzhi et al. used convolutional neural networks (CNNs) to extract visual features of Chinese paintings and classified artists' works with an improved embedded learning algorithm, and Zhang Jia et al. adopted manual evaluation and regression-analysis strategies.
At present, feature extraction and automatic identification for traditional Chinese painting still face the following problems: 1) existing feature extraction methods are all based on mathematical modeling algorithms in the image pixel domain and do not consider human visual perception characteristics or the subjective judgment experience of observers, so the automatic recognition accuracy of these methods is low and falls well short of human judgment; 2) they place high demands on image quality, and any noise pollution or human intervention degrades or disables their performance.
Disclosure of Invention
In view of the above problems and technical requirements, the inventors propose a method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning. The technical scheme of the invention is as follows:
A method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning, comprising the following steps:
Constructing a sample data set, wherein the sample data set comprises a plurality of Chinese painting samples belonging to a plurality of different painters;
Performing visual front-end saliency processing and extraction on each traditional Chinese painting sample in the sample data set to obtain a visual saliency map;
performing visual back-end perception information processing on the visual saliency map based on the pixel value relation of each pixel point and the neighborhood pixel points to obtain an ordered style effect map;
taking an ordered style effect graph obtained by processing each traditional Chinese painting sample as input and a corresponding artist as output, and performing model training based on a convolutional neural network by using a sample data set to obtain an artist identification model;
And extracting the ordered style effect map of the traditional Chinese painting to be identified, inputting it into the artist identification model, and outputting the artist to which the painting is identified as belonging.
The further technical scheme is that the traditional Chinese painting samples in the sample data set comprise original paintings and lossy paintings obtained by simulation after image preprocessing of the originals, wherein the image preprocessing comprises at least one of brightness enhancement, color rendering, contrast enhancement, Gaussian noise contamination, random brightness change, sharpness processing, image rotation, and image flipping.
The further technical scheme is that performing visual front-end saliency processing and extraction on each traditional Chinese painting sample in the sample data set to obtain a visual saliency map comprises, for each sample:
performing vision-based threshold filtering on the traditional Chinese painting sample;
applying a logarithmic transformation of frequency-domain light intensity to the sample after the threshold filtering;
and performing feature extraction on the transformed sample with the SDSP extraction algorithm to obtain the visual saliency map.
The further technical scheme is that the threshold value filtering processing based on vision is carried out on the traditional Chinese painting sample, and the method comprises the following steps:
According to the visual contrast sensitivity threshold CSF(1), the visual fovea function τ(i, j), and the visual threshold δ, threshold filtering is applied to the pixel value y(i, j) at any coordinate (i, j) in the traditional Chinese painting sample to obtain the filtered pixel value z(i, j) as follows:
Wherein the visual contrast sensitivity threshold is:
CSF(1) = 2.6·[0.0192 + λ·2⁻¹·r·v·tan(0.5°)]·exp{-[λ·2⁻¹·r·v·tan(0.5°)]^1.1};
the visual fovea function is τ(i, j),
where r is a resolution parameter, v is a viewing-distance parameter, λ is a first adjustment parameter, d_t is a second adjustment parameter, d(i, j) is the distance between the pixel at coordinate (i, j) in the sample and the center of the sample image, and d_0 is the distance from the image center to the image edge.
The further technical scheme is that the logarithmic transformation of the frequency domain light intensity is carried out on the traditional Chinese painting sample after the threshold filtering treatment, and the method comprises the following steps:
The result of applying the logarithmic transformation of frequency-domain light intensity to the pixel value z(i, j) at any coordinate (i, j) in the threshold-filtered sample is:
where F(·) denotes the forward frequency-domain transform, F⁻¹(·) denotes the inverse frequency-domain transform, and G(ω, θ_j) is a log-Gabor filter.
The further technical scheme is that the method for processing the visual back end perception information of the visual saliency map to obtain the ordered style effect map based on the pixel value relation of each pixel point and the neighborhood pixel points thereof comprises the following steps:
Calculating the mutual information between the pixel value t(i, j) at coordinate (i, j) and the pixel value t_k(i, j) of each of its neighborhood pixels;
based on mutual information between the pixel value t (i, j) and the pixel value of each neighborhood pixel point, the pixel value Q (i, j) of the pixel point at the coordinate (i, j) in the ordered style effect map is obtained.
The further technical scheme is that mutual information between a pixel value t (i, j) and a pixel value t k (i, j) of a neighboring pixel point is as follows:
I[t(i, j), t_k(i, j)] = log p[t(i, j) | t_k(i, j)] - log p[t(i, j)];
where p[t(i, j)] is the statistical probability of the pixel value t(i, j), p[t(i, j) | t_k(i, j)] is the statistical conditional probability of t(i, j) given the neighborhood pixel value t_k(i, j), and log denotes the logarithm.
The further technical scheme is that the pixel value Q (i, j) of the pixel point at the coordinate (i, j) in the ordered style effect graph is as follows:
where T(i, j) is the set of pixel values of all neighborhood pixels of the pixel at coordinate (i, j), C_l is the normalization coefficient corresponding to the neighborhood pixel value t_l(i, j), Ω(i, j) is a Gaussian weight function, m×n is the window size of T(i, j), I[t(i, j), t_l(i, j)] is the mutual information between t(i, j) and t_l(i, j), and t_k(i, j) denotes the pixel value of any one neighborhood pixel in T(i, j).
The further technical scheme is that when the artist identification model is obtained through training:
According to the normalization formula, normalization processing is carried out on the ordered style effect map obtained from each traditional Chinese painting sample; the normalized map of each sample is taken as input and the corresponding artist as output, and model training based on a convolutional neural network is carried out using the sample data set to obtain the artist identification model, where Q(i, j) is a pixel value in the ordered style effect map.
The convolutional neural network trained to obtain the artist identification model comprises five convolution modules and three fully connected layers; each convolution module comprises convolution layers, an activation function, and a pooling layer; the activation function in each convolution module is the ReLU; the pooling function of each pooling layer is max pooling; and the activation function of each fully connected layer is the ReLU.
Outputting the artist to which the traditional Chinese painting to be identified belongs comprises:
outputting the predicted probability that the painting belongs to each candidate artist and taking the candidate artist with the highest predicted probability as the identified artist, wherein the candidate artists are all artists to which the traditional Chinese painting samples in the sample data set belong.
The beneficial technical effects of the invention are as follows:
The application discloses a method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning, which fully considers human visual perception characteristics and the subjective judgment experience of observers. The method maintains high recognition accuracy under various noise-pollution and human-intervention conditions.
Drawings
Fig. 1 is a schematic flow chart of the artist identification method for traditional Chinese painting disclosed in the application.
Fig. 2 is a visual saliency map obtained by visual front-end saliency processing and extraction of a traditional Chinese painting sample.
Fig. 3 is an ordered style effect diagram obtained by processing the visual back-end perception information of the visual saliency map shown in fig. 2.
Detailed Description
The following describes the embodiments of the present invention further with reference to the drawings.
The application discloses a method for automatically identifying the artist of a traditional Chinese painting based on human-eye-vision deep learning. Referring to the flow chart shown in Fig. 1, the method comprises the following steps:
Step 1, construct a sample data set comprising multiple Chinese painting samples belonging to several different painters. The samples comprise original paintings and lossy paintings obtained by simulation after image preprocessing of the originals; the preprocessing can simulate the influence on a painting of factors such as aging over time, technical restoration, and viewing conditions. Image preprocessing includes at least one of brightness enhancement, color rendering, contrast enhancement, Gaussian noise contamination, random brightness change, sharpness processing, image rotation, and image flipping; rotation may be by any angle, typically 90°, 180°, or 270° clockwise or counterclockwise, and flipping may be left-right and/or up-down. The preprocessing actually applied combines one or more of these means; for example, an original painting may be rotated 90° counterclockwise and flipped up-down to obtain a lossy painting, and other combinations follow analogously.
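As an illustration, the geometric part of this preprocessing (rotation plus flipping) can be sketched with NumPy. The function name and defaults below are hypothetical, and the photometric operations (brightness, contrast, Gaussian noise, sharpness) are omitted:

```python
import numpy as np

def simulate_lossy_painting(img, k_rot=1, flip_ud=True):
    """Simulate a lossy traditional Chinese painting sample from an original image.

    k_rot: number of 90-degree counterclockwise rotations (0-3, hypothetical parameter);
    flip_ud: whether to flip the rotated image up-down afterwards.
    """
    out = np.rot90(img, k=k_rot)
    if flip_ud:
        out = np.flipud(out)
    return out
```

With `k_rot=1` and `flip_ud=True` this reproduces the example combination above: a 90° counterclockwise rotation followed by an up-down flip.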
Original paintings can be obtained from many sources. In an actual experiment, originals were collected from the relevant departments of Chinese museums at all levels, the Yachang art network, and Flickr downloads, and processed accordingly, yielding 5 painters with 600 original paintings and 600 lossy paintings each, i.e., a sample data set of 3000 traditional Chinese painting samples in total.
And 2, performing visual front-end saliency processing and extraction on each traditional Chinese painting sample in the sample data set to obtain a visual saliency map. Comprising the following steps:
(1) Apply vision-based threshold filtering to the sample. Specifically, according to the visual contrast sensitivity threshold CSF(1), the visual fovea function τ(i, j), and the visual threshold δ, filter the pixel value y(i, j) at any coordinate (i, j) in the sample to obtain the filtered pixel value z(i, j) as follows:
Wherein the visual contrast sensitivity threshold is:
CSF(1)=2.6[0.0192+λ·2-1·r·v·tan(0.5°)]exp{-[λ·2-1·r·v·tan(0.5°)]1.1};
where r is a resolution parameter and v is a viewing-distance parameter; according to the subjective test conditions, r = 96 pixels/inch and v = 19.1 inch. λ is the first adjustment parameter; preferably, λ = 0.228.
The visual fovea function is τ(i, j), where d_t is a second adjustment parameter, d(i, j) is the distance between the pixel at coordinate (i, j) in the sample and the center of the sample image, and d_0 is the distance from the image center to the image edge. Preferably, d_t = 4.0.
Preferably, the visual threshold δ=6.0 is taken.
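The CSF(1) expression above can be evaluated directly from the stated parameter values. This sketch implements only that formula (the fovea function τ(i, j) and the piecewise filtering rule are not reproduced in the text and are therefore omitted):

```python
import math

def contrast_sensitivity_threshold(r=96.0, v=19.1, lam=0.228):
    """Evaluate CSF(1) = 2.6[0.0192 + a] * exp(-a^1.1), with
    a = lam * 2^-1 * r * v * tan(0.5 deg), using the parameter values
    stated in the description (r in pixels/inch, v in inches)."""
    a = lam * 0.5 * r * v * math.tan(math.radians(0.5))
    return 2.6 * (0.0192 + a) * math.exp(-(a ** 1.1))
```

With the preferred values r = 96, v = 19.1, λ = 0.228, the threshold evaluates to roughly 0.69.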
(2) Apply the logarithmic transformation of frequency-domain light intensity to the threshold-filtered sample; for the pixel value z(i, j) at any coordinate (i, j), the result is:
Wherein, the function F (-) represents performing frequency domain positive transformation processing, F -1 (-) represents performing frequency domain inverse transformation processing, and G (omega, theta j) is a log-Gabor filter.
(3) Perform feature extraction on the transformed sample with the SDSP extraction algorithm to obtain the visual saliency map; the pixel value at any coordinate (i, j) of the map is obtained by applying the SDSP extraction algorithm, denoted SDSP(·).
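A minimal sketch of the frequency-domain step, assuming a standard radial log-Gabor transfer function G(ω) = exp(-ln²(ω/ω₀) / (2 ln²σ)) and reading the "logarithmic transformation of light intensity" as a log1p compression of pixel intensities; ω₀ and σ are hypothetical parameter choices, and the angular component θ_j of the filter is omitted:

```python
import numpy as np

def log_gabor(shape, omega0=0.1, sigma=0.55):
    # Radial log-Gabor transfer function on the FFT frequency grid.
    rows, cols = shape
    U, V = np.meshgrid(np.fft.fftfreq(cols), np.fft.fftfreq(rows))
    radius = np.hypot(U, V)
    radius[0, 0] = 1.0                     # avoid log(0) at the DC bin
    G = np.exp(-np.log(radius / omega0) ** 2 / (2 * np.log(sigma) ** 2))
    G[0, 0] = 0.0                          # a log-Gabor filter passes no DC
    return G

def frequency_log_intensity(z, omega0=0.1, sigma=0.55):
    # Log-compress intensities (assumed reading of the text), filter in the
    # frequency domain, and return the real part of the inverse transform.
    Z = np.fft.fft2(np.log1p(z))
    return np.real(np.fft.ifft2(log_gabor(z.shape, omega0, sigma) * Z))
```

Because the filter's DC gain is zero, the filtered output is zero-mean regardless of the input.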
And step 3, performing visual back-end perception information processing on the visual saliency map based on the pixel value relation of each pixel point and the neighborhood pixel points thereof to obtain an ordered style effect map. Comprising the following steps:
(1) Calculate the mutual information between the pixel value t(i, j) and each neighborhood pixel value t_k(i, j). Specifically, the mutual information is:
I[t(i, j), t_k(i, j)] = log p[t(i, j) | t_k(i, j)] - log p[t(i, j)].
where p[t(i, j)] is the statistical probability of the pixel value t(i, j), p[t(i, j) | t_k(i, j)] is the statistical conditional probability of t(i, j) given the neighborhood pixel value t_k(i, j), and log denotes the logarithm.
(2) Based on mutual information between the pixel value t (i, j) and the pixel value of each neighborhood pixel point, the pixel value Q (i, j) of the pixel point at the coordinate (i, j) in the ordered style effect map is obtained. Specifically, the pixel value Q (i, j) of the pixel point at the coordinate (i, j) in the ordered style effect map is:
where T(i, j) is the set of pixel values of all neighborhood pixels of the pixel at coordinate (i, j), C_l is the normalization coefficient corresponding to the neighborhood pixel value t_l(i, j), Ω(i, j) is a Gaussian weight function, m×n is the window size of T(i, j), I[t(i, j), t_l(i, j)] is the mutual information between t(i, j) and t_l(i, j), and t_k(i, j) denotes the pixel value of any one neighborhood pixel in T(i, j), i.e., t_k(i, j) ∈ T(i, j).
Preferably, m×n = 16×16.
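The mutual-information term can be estimated from empirical histograms. The sketch below is a simplification under stated assumptions: it quantises pixel values in [0, 1) into a hypothetical number of bins and uses a single neighbour offset (the pixel to the right) instead of the full m×n Gaussian-weighted neighbourhood:

```python
import numpy as np

def pmi_map(img, bins=16):
    """Estimate I[t, t_k] = log p(t | t_k) - log p(t) for the right-hand
    neighbour of every pixel, from the image's own joint histogram."""
    q = np.minimum((img * bins).astype(int), bins - 1)   # quantised values
    a, b = q[:, :-1].ravel(), q[:, 1:].ravel()           # pixel / right neighbour
    joint = np.zeros((bins, bins))
    np.add.at(joint, (a, b), 1.0)
    joint /= joint.sum()
    p_t = joint.sum(axis=1)                              # marginal p[t]
    p_tk = joint.sum(axis=0)                             # marginal p[t_k]
    with np.errstate(divide="ignore", invalid="ignore"):
        cond = joint / p_tk[None, :]                     # p[t | t_k]
        pmi = np.log(cond) - np.log(p_t[:, None])
    return pmi[a, b].reshape(q[:, :-1].shape)
```

Because the probabilities are empirical, bins never observed produce NaN entries in the intermediate tables; only observed pixel pairs are indexed, so the returned map is finite. For an image whose rows are constant, each pixel determines its neighbour, so the map equals -log p[t], i.e. log 2 for two equally likely values.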
For example, Fig. 2 shows the visual saliency map obtained from a traditional Chinese painting sample by visual front-end saliency processing and extraction, and Fig. 3 shows the ordered style effect map obtained by the subsequent visual back-end perception information processing.
And 4, taking an ordered style effect graph obtained by processing each traditional Chinese painting sample as input and a corresponding artist as output, and performing model training based on a convolutional neural network by using a sample data set to obtain an artist identification model.
In practical application, the ordered style effect map is generally not used directly as input; each sample's map is first normalized according to the normalization formula, where Q(i, j) is a pixel value in the ordered style effect map.
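The exact normalisation formula is not reproduced in the text; purely as an assumption, the sketch below uses min-max scaling of the ordered style effect map to [0, 1]:

```python
import numpy as np

def normalize_style_map(Q):
    """Hypothetical min-max normalisation of an ordered style effect map;
    a constant map is mapped to all zeros to avoid division by zero."""
    qmin, qmax = Q.min(), Q.max()
    if qmax == qmin:
        return np.zeros_like(Q, dtype=float)
    return (Q - qmin) / (qmax - qmin)
```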
The normalized ordered style effect map of each sample is then taken as input and the corresponding artist as output, and model training based on a convolutional neural network is performed with the sample data set to obtain the artist identification model. During training, the sample data set is first randomly divided into a training set and a test set at a ratio of 8:2; the training set is used for model training and the test set for model testing, finally yielding the artist identification model.
The convolutional neural network used to train the artist identification model is modified from the standard VGG16 model. It comprises five convolution modules and three fully connected layers; each convolution module comprises convolution layers, an activation function, and a pooling layer, the activation function in each convolution module is the ReLU, and the pooling function of each pooling layer is max pooling. In this application, the first two convolution modules contain two convolution layers each and the last three contain three each. The activation function of each fully connected layer is also the ReLU. The per-layer network parameters in one experiment were as follows:
As an optimization, the training parameters were set as follows: the optimizer is stochastic gradient descent (SGD), the loss function is cross-entropy, the number of iterations is 30, the number of input images per batch is 16, the learning rate is 1e-5, and the momentum is 0.9.
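Assuming a 224×224 input (the standard VGG16 resolution, which the text does not state), the spatial size of the feature maps after each of the five convolution modules can be checked with a small helper: the 3×3 same-padded convolutions preserve size, and each module's 2×2 max pool halves it.

```python
def feature_map_sizes(input_size=224, conv_layers=(2, 2, 3, 3, 3)):
    """Spatial feature-map size after each convolution module.

    conv_layers lists the number of 3x3 convolution layers per module
    (2, 2, 3, 3, 3 per the description); same-padded convolutions keep the
    spatial size, and the 2x2 max pool closing each module halves it.
    """
    sizes, s = [], input_size
    for _ in conv_layers:
        s //= 2
        sizes.append(s)
    return sizes
```

For a 224×224 input this gives 112, 56, 28, 14, 7, matching the VGG16 pattern that the fully connected layers then consume.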
And step 5, extracting the ordered style effect map of the traditional Chinese painting to be identified, inputting it into the artist identification model, and outputting the identified artist.
The painting to be identified may be an original or a lossy traditional Chinese painting. The extraction of its ordered style effect map follows the same procedure as in model training; if the normalized ordered style effect map was used as input during training, the map must likewise be normalized before being input here. The candidate artist with the highest predicted probability is taken as the artist of the painting to be identified, the candidate artists being all artists to which the samples in the sample data set belong.
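The final decision corresponds to a softmax over the candidate artists followed by an argmax; the artist names and logit values in the test below are illustrative only:

```python
import numpy as np

def predict_artist(logits, artists):
    """Numerically stable softmax over candidate-artist logits; returns the
    most probable artist together with the full probability vector."""
    exp = np.exp(logits - np.max(logits))
    probs = exp / exp.sum()
    return artists[int(np.argmax(probs))], probs
```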
Automatic identification accuracy was counted for both the conventional methods and the method of the present application, first under conditions free of noise pollution and artificial interference, and then under conditions with noise pollution and artificial interference. Comparing the statistical results, the experiments show that the method provided by the present application still achieves a high automatic identification accuracy even under various environmental conditions involving noise pollution and artificial interference.
The above is only a preferred embodiment of the present application, and the present application is not limited to the above examples. It is to be understood that other modifications and variations directly derived or conceived by those skilled in the art without departing from the spirit and concept of the present application are deemed to be included within the scope of protection of the present application.
Claims (8)
1. An automatic identification method, based on human eye vision deep learning, for the artist to which a traditional Chinese painting belongs, characterized by comprising the following steps:
constructing a sample data set, wherein the sample data set comprises a plurality of Chinese painting samples belonging to a plurality of different painters;
performing visual front-end saliency processing and extraction on each traditional Chinese painting sample in the sample data set to obtain a visual saliency map, comprising: performing vision-based threshold filtering processing on the traditional Chinese painting sample, performing a logarithmic transformation of the frequency-domain light intensity on the threshold-filtered sample, and performing feature extraction on the transformed sample with the SDSP extraction algorithm to obtain the visual saliency map;
performing visual back-end perception information processing on the visual saliency map, based on the relation between the pixel value of each pixel point and the pixel values of its neighborhood pixel points, to obtain an ordered style effect map, comprising: calculating the mutual information between the pixel value of the pixel point at each coordinate and the pixel value of each of its neighborhood pixel points, and obtaining the pixel value at that coordinate in the ordered style effect map from the mutual information between the pixel value and the pixel values of the neighborhood pixel points;
taking the ordered style effect map obtained by processing each traditional Chinese painting sample as input and the corresponding artist as output, and performing model training based on a convolutional neural network using the sample data set to obtain an artist identification model;
and extracting an ordered style effect map of a traditional Chinese painting to be identified, inputting it into the artist identification model, and outputting the artist to which the traditional Chinese painting to be identified belongs.
2. The method of claim 1, wherein the traditional Chinese painting samples in the sample data set include original traditional Chinese paintings and damaged traditional Chinese paintings simulated by image preprocessing of the original paintings, the image preprocessing including at least one of brightness enhancement, color rendering, contrast enhancement, Gaussian noise pollution, random brightness change, sharpening, image rotation and image flipping.
3. The method of claim 1, wherein the performing of vision-based threshold filtering processing on the traditional Chinese painting sample comprises:

performing, according to the visual contrast sensitivity threshold, the visual foveal function and the vision threshold, threshold filtering processing on the pixel value of the pixel point at any coordinate in the traditional Chinese painting sample, the threshold-filtered pixel value being:

(formula omitted)

wherein the visual contrast sensitivity threshold is:

(formula omitted)

and the visual foveal function is:

(formula omitted)

wherein the parameters of the formulas are a resolution parameter, a viewing distance parameter, a first adjustment parameter and a second adjustment parameter, together with the distance from the pixel point at the coordinate to the center of the sample image and the distance from the center of the sample image to the image edge.
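The exact threshold formulas in claim 3 are given in the patent as images. The sketch below only illustrates the described idea of a location-dependent visual threshold that grows with distance from the image center, with made-up parameter values:

```python
# Hedged sketch of vision-based threshold filtering: a pixel is kept only
# if its value reaches a visual threshold that increases with the pixel's
# distance from the image center (a simple foveal-falloff assumption).
# The base/slope parameters are invented for illustration and do not
# reproduce the patent's CSF or foveal formulas.

def visual_threshold(d, d_max, base=4.0, slope=8.0):
    """Hypothetical threshold: smallest at the center, largest at the edge."""
    return base + slope * (d / d_max)

def threshold_filter(pixel, d, d_max):
    t = visual_threshold(d, d_max)
    return pixel if pixel >= t else 0.0

print(threshold_filter(10.0, d=0.0, d_max=100.0))    # center: kept as 10.0
print(threshold_filter(10.0, d=100.0, d_max=100.0))  # edge: threshold 12.0, zeroed
```

The effect is that low-contrast detail far from the fovea is suppressed before saliency extraction, mimicking what a human viewer would actually resolve.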
4. The method according to claim 1, wherein the performing of the logarithmic transformation of the frequency-domain light intensity on the traditional Chinese painting sample after the threshold filtering processing comprises:

for the pixel value of the pixel point at any coordinate in the threshold-filtered traditional Chinese painting sample, the result of the logarithmic transformation of the frequency-domain light intensity is:

(formula omitted)

wherein the formula involves a frequency-domain forward transform, a frequency-domain inverse transform, and a log-Gabor filter.
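Claim 4's formula (an image in the patent) combines a forward frequency-domain transform, a log-Gabor filter, and an inverse transform. The sketch below illustrates that pipeline on a 1-D signal with an assumed log-Gabor response; the center frequency and bandwidth are invented values, and a 1-D DFT stands in for the 2-D image transform:

```python
import cmath
import math

# Forward transform -> multiply by a log-Gabor response -> inverse
# transform, sketched in 1-D. The log-Gabor parameters (f0, sigma) are
# assumptions, not values from the patent.

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def log_gabor(f, f0=0.25, sigma=0.55):
    """Log-Gabor magnitude response; zero at DC by construction."""
    if f == 0:
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2 * math.log(sigma) ** 2))

signal = [1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0]
spectrum = dft(signal)
freqs = [min(j, len(signal) - j) / len(signal) for j in range(len(signal))]
filtered = idft([s * log_gabor(f) for s, f in zip(spectrum, freqs)])
print([round(v.real, 3) for v in filtered])
```

Because a log-Gabor filter has no DC component, the filtered signal loses its mean: the output sums to zero, leaving only band-pass structure, which is the kind of stroke-scale detail useful for style analysis.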
5. The method of claim 1, wherein the mutual information between the pixel value and the pixel value of a neighborhood pixel point is:

(formula omitted)

wherein the formula involves the statistical probability of the pixel value, the statistical conditional probability of the pixel value given the neighborhood pixel value, and a logarithmic operation.
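Claim 5's formula is an image in the patent; since it is defined through a statistical probability and a statistical conditional probability with a logarithm, one consistent reading is the pointwise mutual information log(p(a|b) / p(a)), sketched here under that assumption with invented probabilities:

```python
import math

# Hedged sketch of the mutual information between a pixel value a and a
# neighborhood pixel value b, read as pointwise mutual information:
# log(p(a|b) / p(a)). The probabilities below are illustrative only.

def pointwise_mi(p_a, p_a_given_b):
    return math.log(p_a_given_b / p_a)

# Value a occurs 10% of the time overall but 40% of the time next to
# value b, so observing b is informative about a (positive MI).
mi = pointwise_mi(p_a=0.10, p_a_given_b=0.40)
print(round(mi, 4))  # log(4)
```

A positive value means the two pixel values co-occur more often than chance, so strongly correlated strokes contribute more to the ordered style effect map than random texture.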
6. The method of claim 1, wherein the pixel value at the coordinate in the ordered style effect map is:

(formula omitted)

wherein the formula involves the neighborhood pixel value set formed by the pixel values of all neighborhood pixel points of the pixel point at the coordinate, a normalization coefficient corresponding to each neighborhood pixel value, a Gaussian weight function, the window size of the neighborhood, and the mutual information between the pixel value and each neighborhood pixel value.
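Claim 6's exact expression is an image in the patent. The sketch below is a hedged reading of the described ingredients: the output pixel is a normalized combination of neighborhood values, with each weight formed from a Gaussian weight function modulated by the mutual information between center and neighbor:

```python
import math

# Hedged sketch of the ordered-style-effect aggregation: each neighbor
# contributes its value weighted by a Gaussian spatial weight times the
# mutual information with the center pixel, normalized over the window.
# The sigma and the example numbers are invented for illustration.

def gaussian_weight(dist2, sigma=1.0):
    return math.exp(-dist2 / (2 * sigma ** 2))

def ordered_value(centre, neighbours):
    """neighbours: list of (value, squared_distance, mutual_information)."""
    weights = [gaussian_weight(d2) * mi for _, d2, mi in neighbours]
    total = sum(weights)
    if total == 0:
        return centre  # no informative neighbours: keep the centre value
    return sum(w * v for w, (v, _, _) in zip(weights, neighbours)) / total

out = ordered_value(100.0, [(90.0, 1.0, 0.5), (110.0, 1.0, 0.5), (100.0, 2.0, 0.2)])
print(round(out, 3))  # symmetric neighbours average back to 100.0
```

The normalization coefficient in the claim plays the role of `total` here: it keeps the output on the same scale as the input regardless of window size.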
7. The method of claim 1, wherein, in training to obtain the artist identification model:
normalizing the ordered style effect map obtained by processing each traditional Chinese painting sample according to:

(formula omitted)

and taking the normalized ordered style effect map of each traditional Chinese painting sample as input and the corresponding artist as output, performing model training based on a convolutional neural network using the sample data set to obtain the artist identification model, wherein the variable in the formula is the pixel value in the ordered style effect map.
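The normalization formula of claim 7 is an image in the patent. A common choice consistent with the text, mapping the pixel values of the ordered style effect map into a fixed range before training, is min-max scaling, sketched here purely as an assumption:

```python
# Hedged sketch of the normalization step as min-max scaling of the
# ordered style effect map's pixel values into [0, 1]. This is an assumed
# stand-in for the patent's formula, which is not reproduced here.

def normalize(pixels):
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0 for _ in pixels]  # constant map: avoid division by zero
    return [(q - lo) / (hi - lo) for q in pixels]

print(normalize([0, 51, 102, 255]))  # values mapped into [0, 1]
```

Whatever the exact formula, the purpose is the same: putting every sample's effect map on a common scale so the network's inputs are comparable across paintings.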
8. The method of claim 1, wherein the convolutional neural network for training to obtain the artist recognition model comprises five convolutional modules and three fully connected layers, each convolutional module comprises a convolutional layer, an activation function and a pooling layer, the activation function in each convolutional module is a ReLU, the pooling function of each pooling layer is a maximum pooling, and the activation function of each fully connected layer is a ReLU;
wherein the outputting of the artist to which the traditional Chinese painting to be identified belongs comprises:

outputting the prediction probability that the traditional Chinese painting to be identified belongs to each candidate artist, and taking the candidate artist with the largest prediction probability as the artist to which the traditional Chinese painting to be identified belongs, wherein the candidate artists are the artists to which the traditional Chinese painting samples in the sample data set belong.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110917411.7A CN113627528B (en) | 2021-08-11 | 2021-08-11 | Automatic identification method for painters belonging to traditional Chinese painting based on human eye vision deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113627528A CN113627528A (en) | 2021-11-09 |
CN113627528B true CN113627528B (en) | 2024-09-06 |
Family
ID=78384282
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109308697A (en) * | 2018-09-18 | 2019-02-05 | 安徽工业大学 | A kind of leaf disease recognition method based on machine learning algorithm |
CN111325290A (en) * | 2020-03-20 | 2020-06-23 | 西安邮电大学 | Chinese painting image classification method based on multi-view fusion and multi-example learning |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ITMI20131244A1 (en) * | 2013-07-24 | 2015-01-25 | Telecom Italia Spa | IDENTIFICATION OF KEYPOINTS |
CN109118459B (en) * | 2017-06-23 | 2022-07-19 | 南开大学 | Image salient object detection method and device |
CN112434590B (en) * | 2020-11-18 | 2023-11-24 | 中国人民解放军国防科技大学 | SAR image wind stripe recognition method based on wavelet transformation |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |