CN111950565B - Abstract painting image orientation recognition method based on feature fusion and naive Bayes - Google Patents

Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Info

Publication number
CN111950565B
CN111950565B
Authority
CN
China
Prior art keywords
image
sub
block
len
abstract
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010737934.9A
Other languages
Chinese (zh)
Other versions
CN111950565A (en)
Inventor
白茹意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou Weimili Technology Co ltd
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University filed Critical Shanxi University
Priority to CN202010737934.9A priority Critical patent/CN111950565B/en
Publication of CN111950565A publication Critical patent/CN111950565A/en
Application granted granted Critical
Publication of CN111950565B publication Critical patent/CN111950565B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 10/507: Summing image-intensity values; Histogram projection analysis
    • G06V 10/56: Extraction of image or video features relating to colour
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N 3/045: Combinations of networks
    • G06N 3/047: Probabilistic or stochastic networks
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods
    • G06N 7/01: Probabilistic graphical models, e.g. probabilistic networks
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/90: Determination of colour characteristics
    • G06T 2207/10004: Still image; Photographic image
    • G06T 2207/10024: Color image
    • G06T 2207/20076: Probabilistic image processing
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of image processing and computer vision, and specifically relates to an abstract painting image orientation recognition method based on feature fusion and naive Bayes, comprising the following steps: S1, rotating an abstract painting image to obtain four images with different orientations, and splitting each image into four sub-blocks; S2, extracting low-level features of the abstract painting image; S3, extracting high-level features of the abstract painting image with a convolutional neural network (CNN); S4, linearly combining the low-level and high-level feature values to obtain the final feature value of the abstract painting image; and S5, feeding the final feature value into a naive Bayes classifier for training and prediction. The method obtains the feature value of an image by fusing low-level and high-level features and then trains and predicts with a naive Bayes (NB) classifier, thereby realizing automatic prediction of the orientation of abstract painting images and improving prediction accuracy.

Description

Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Technical Field

The invention belongs to the technical field of image processing and computer vision, and in particular relates to an abstract painting image orientation recognition method based on feature fusion and naive Bayes.

Background

Abstract art is a visual language that composes with shape, color and line and is, to some extent, independent of the world. Paintings created for emotional expression are called "hot abstraction", while paintings that describe the world in an abstract way are called "cold abstraction". When creating an abstract painting, the artist usually decides the correct hanging orientation of the work according to his or her own aesthetic judgment. Although the correct orientation is usually marked on the back of the canvas, it is not obvious to lay viewers. Moreover, several studies in psychology in recent years have addressed the orientation of abstract paintings, and most agree that correctly oriented paintings receive higher aesthetic evaluations. Experiments with participants showed that about half of the preferred-orientation decisions agreed with the artist's intended orientation, which is well above chance but below perfect performance. These findings provide evidence for the relationship between painting orientation and aesthetic quality. Research on orientation recognition can thus reveal objective laws of visual aesthetic evaluation.

With the trend toward digitization of information, digital images of paintings can easily be found on the Internet, which makes computer-aided painting analysis possible. Various aesthetic evaluation methods have been studied by directly exploring the relationship between human aesthetic perception and computational visual features, but none of them addresses aesthetic evaluation through computer-aided orientation judgment. The state of research on image orientation in recent years is as follows: (1) research on image orientation recognition has mainly targeted photographs, such as natural or scene images, where recognition rates are fairly satisfactory; for abstract paintings, however, content and semantics are more implicit than in photographs, so orientation recognition is harder, and related work in recent years is scarce. (2) Humans generally recognize orientation by understanding image content, so some methods use high-level semantic features to recognize image orientation, with clearly higher accuracy; but that accuracy depends largely on bridging the semantic gap between high-level cues and low-level features.

The extensive research on the orientation of natural images motivates us to explore orientation judgment for abstract paintings. The purpose of the present invention is to better understand the sense of orientation of abstract paintings, especially in the absence of substantive content, by establishing the relationship between the visual content of an image and its correct orientation within a machine learning framework.

Summary of the Invention

The present invention overcomes deficiencies of the prior art by providing an abstract painting image orientation recognition method based on feature fusion and naive Bayes, which realizes automatic prediction of the orientation of abstract painting images by computer.

To solve the above technical problems, the present invention adopts the following technical solution: an abstract painting image orientation recognition method based on feature fusion and naive Bayes, comprising the following steps:

S1. Rotate the abstract painting image by 0°, 90°, 180° and 270° to obtain four abstract painting images with different orientations; split each abstract painting image evenly into top and bottom halves and evenly into left and right halves, so that each image is divided into four sub-blocks: upper, lower, left and right.

S2. Extract the low-level features of the abstract painting image: compute the low-level feature description of each sub-block, and use the comparison results between the sub-blocks' low-level feature descriptions as the image's low-level feature values; a comparison result is represented as 1 if true and 0 otherwise.

S3. Extract the high-level features of the abstract painting image with a convolutional neural network (CNN), as follows:

S301. Resize the four sub-blocks of the abstract painting to 128×128 RGB color images.

S302. Feed the four sub-blocks separately into the CNN, which contains three convolutional layers with stride 1, three 2×2 max pooling layers and two fully connected layers; the activation function in the convolutional layers is ReLU, the dimensions of the two fully connected layers are 1024 and 512 respectively, and a 512-dimensional vector is finally obtained for each sub-block as its neural network feature vector.

S303. Compare the feature vectors of the upper and lower sub-blocks and of the left and right sub-blocks, and use the comparison results as the image's high-level feature values f14 and f15:

f14 = (f_cnn_A ≥ f_cnn_B); f15 = (f_cnn_L ≥ f_cnn_R);

where f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R denote the neural network feature vectors of the upper, lower, left and right sub-blocks, respectively.

S4. Linearly combine the image low-level feature values obtained in step S2 with the image high-level feature values obtained in step S3 to obtain the final feature value of the abstract painting image.

S5. Perform steps S1 to S4 on all abstract paintings in the image library to obtain their final feature values, and input them into a naive Bayes classifier for training and prediction; each abstract painting is finally classified as "up", "down", "left" or "right", thereby realizing automatic prediction of the orientation of abstract painting images.

Step S2 specifically includes the following steps:

S201. Convert the four sub-blocks of step S1 from the RGB color space to the HSV model, divide the H-S space into 16 hues and 8 saturations, and count the number of pixels of the 128 resulting colors as the color histogram vector of the abstract painting; the comparison results of the histogram vectors of the upper/lower and left/right sub-block pairs serve as image feature values f1 and f2:

f1 = (Hist_A ≥ Hist_B); f2 = (Hist_L ≥ Hist_R);

where Hist_A, Hist_B, Hist_L and Hist_R are the histogram vectors of the upper, lower, left and right sub-blocks, respectively.

S202. Take the maximum gradient of an image as its complexity and compute the complexity of the four sub-blocks; the comparison results of the complexities of the upper/lower and left/right sub-block pairs serve as image feature values f3 and f4:

f3 = (Comp_A ≥ Comp_B); f4 = (Comp_L ≥ Comp_R);

where Comp_A, Comp_B, Comp_L and Comp_R denote the complexities of the upper, lower, left and right sub-blocks, respectively.

S203. Compute the similarity between every pair of the four sub-blocks, and use the comparison results of the similarities as image feature values f5, f6 and f7:

f5 = (Sim(A,L) ≥ Sim(A,R)); f6 = (Sim(B,L) ≥ Sim(B,R)); f7 = (Sim(A,B) ≥ Sim(L,R));

S204. Detect the salient straight lines of the four sub-blocks with the Hough transform, judge from the inclination angle α whether each line is static or dynamic, and compute the numbers of static and dynamic lines and the average length of all lines as image features; the comparison results of these line attributes between sub-block pairs serve as image feature values f8, f9, f10, f11, f12 and f13:

f8 = (Len_S_A ≥ Len_S_B); f9 = (Len_D_A ≥ Len_D_B); f10 = (Ave_Len_A ≥ Ave_Len_B);

f11 = (Len_S_L ≥ Len_S_R); f12 = (Len_D_L ≥ Len_D_R); f13 = (Ave_Len_L ≥ Ave_Len_R);

where Len_S_A, Len_S_B, Len_S_L and Len_S_R denote the numbers of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R denote the numbers of dynamic lines in the four sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R denote the average lengths of all lines in the four sub-blocks, respectively.

In step S202, the image complexity is computed as follows:

G_max(x, y) = max(∇R(x, y), ∇G(x, y), ∇B(x, y));

Comp_G = ( Σ_(x,y)∈G G_max(x, y) ) / Pixelnum(G);

where G_max(x, y) denotes the maximum gradient at pixel (x, y) in the RGB color space, ∇R(x, y), ∇G(x, y) and ∇B(x, y) denote the gradient values of the R, G and B channels at point (x, y) in the image, Pixelnum(G) denotes the total number of pixels of image G, and Comp_G denotes the complexity of image G.

In step S201, the image is converted from the RGB color space to the HSV model as follows:

r′ = r/255; g′ = g/255; b′ = b/255;

kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin;

h = 0°, if Δ = 0; h = 60° × ((g′ − b′)/Δ mod 6), if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2), if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4), if kmax = b′;

s = 0, if kmax = 0; otherwise s = Δ/kmax;

v = kmax;

where r, g and b denote the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax and kmin denote the maximum and minimum of r′, g′ and b′, and h, s and v denote the hue, saturation and value of the image pixel in the HSV model.

The network structure of the CNN is as follows: the first convolutional layer consists of 16 3×3 kernels, the second of 8 3×3 kernels, and the third of 4 3×3 kernels; the feature map obtained after each convolution is zero-padded at the edges so that its size remains unchanged; after each convolutional layer, 2×2 max pooling reduces the feature resolution; finally, fully connected layers convert the four 16×16 two-dimensional matrices into a 1024-dimensional feature vector, which is then reduced to 512 dimensions.

In step S4, the final feature value of the abstract painting image after linear combination is a vector of dimension 1291.

In step S5, when the naive Bayes classifier predicts the orientation of an abstract painting image, the specific method for the four-way classification into "up", "down", "left" and "right" is as follows:

The four cases are divided into four groups: in each group, one direction is selected as one class and the remaining three directions as the other class, and the ratio of the posterior probabilities of the two classes in each group is computed as follows:

R_θ = P(C_θ|F) / P(C_θ̄|F);

where R_θ is the posterior probability ratio of the two classes in each group, P(C_θ|F) denotes the posterior probability of the selected direction, and P(C_θ̄|F) denotes the posterior probability of the remaining three directions; the ratios R_θ of the four groups are compared, and the direction with the largest R_θ is selected as the correct orientation of the abstract painting image.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention provides an abstract painting image orientation recognition method based on feature fusion and naive Bayes. (1) Each abstract painting image is split evenly into top/bottom halves and into left/right halves, so that each painting is divided into four sub-blocks (upper, lower, left and right); all image features are comparison results between the feature descriptions of these four sub-blocks, which reflects the directional structure of the image more concretely. (2) Based on the basic principles of abstract painting theory, low-level features of all abstract painting images are extracted, including color, complexity, similarity and line attributes; from the perspective of painting principles, these features both express the basic characteristics of abstract paintings and reflect the directionality of the image. (3) A convolutional neural network (CNN) extracts the high-level features of the abstract painting images. (4) The low-level and high-level features are linearly combined, and the combined vector is the final feature value of the abstract painting image; this fuses the local and global features of the image better and detects image orientation more accurately.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the rotation of an abstract painting in an embodiment of the present invention;

Fig. 2 is a schematic diagram of the segmentation of an abstract painting in an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the CNN model adopted in an embodiment of the present invention;

Fig. 4 is the abstract painting image orientation recognition framework in an embodiment of the present invention.

Detailed Description of the Embodiments

To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention; based on these embodiments, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.

The present invention provides an abstract painting image orientation recognition method based on feature fusion and naive Bayes. Paintings from a public website were selected for experiments; the specific implementation steps are as follows:

S1: Select 500 abstract paintings from the WikiArt (http://www.wikiart.org) dataset. Rotate all abstract painting images clockwise into four orientations (0°, 90°, 180°, 270°), as shown in Fig. 1, finally obtaining four differently oriented versions of each painting, 2000 images in total. Split each abstract painting image evenly into top and bottom halves and into left and right halves, so that each painting is divided into four sub-blocks (upper (A), lower (B), left (L) and right (R)), as shown in Fig. 2.
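The rotation and splitting of step S1 can be sketched in Python with Pillow; this is a minimal illustration, and the function name and return structure are assumptions, not part of the patent:

```python
from PIL import Image

def rotations_and_subblocks(path):
    """Step S1 sketch: four clockwise rotations plus the A/B/L/R sub-blocks."""
    img = Image.open(path).convert("RGB")
    # PIL's rotate() is counter-clockwise, so negate the angle for clockwise rotation.
    rotated = {deg: img.rotate(-deg, expand=True) for deg in (0, 90, 180, 270)}
    w, h = img.size
    blocks = {
        "A": img.crop((0, 0, w, h // 2)),   # upper half
        "B": img.crop((0, h // 2, w, h)),   # lower half
        "L": img.crop((0, 0, w // 2, h)),   # left half
        "R": img.crop((w // 2, 0, w, h)),   # right half
    }
    return rotated, blocks
```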

S2: Based on the basic principles of abstract painting theory, extract the low-level features of all abstract painting images. Compute the low-level feature description of each sub-block, and use the comparison results of these feature descriptions as image feature values: a comparison result is represented as 1 if true and 0 otherwise. The specific steps are as follows:

S201: Convert the four sub-blocks of S1 from the RGB color space to the HSV model (hue (h), saturation (s), value (v)), computed as follows:

h = 0°, if Δ = 0; h = 60° × ((g′ − b′)/Δ mod 6), if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2), if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4), if kmax = b′; (1)

s = 0, if kmax = 0; otherwise s = Δ/kmax; (2)

v = kmax; (3)

where r, g and b denote the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax and kmin denote the maximum and minimum of r′, g′ and b′, and h, s and v denote the hue, saturation and value of the image pixel in the HSV model.

r′ = r/255; g′ = g/255; b′ = b/255; (4)

kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin; (5)

In orientation recognition, value (brightness) has little influence, so the H-S space is divided into 16 hues and 8 saturations, and the pixel counts of the 128 resulting colors are taken as the color histogram vector of the painting. The image feature values f1 and f2 are the comparison results of the histogram vectors of two sub-blocks:

f1 = (Hist_A ≥ Hist_B); f2 = (Hist_L ≥ Hist_R); (6)

where Hist_A, Hist_B, Hist_L and Hist_R are the histogram vectors of the four sub-blocks, respectively. f1 and f2 are 128-dimensional.
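A minimal sketch of the 128-bin H-S histogram of step S201, using matplotlib's rgb_to_hsv for the color conversion; the binning into 16 hues × 8 saturations follows the text, while the helper name is an illustrative choice:

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

def hs_histogram(block):
    """128-bin color histogram: 16 hue bins x 8 saturation bins (step S201)."""
    rgb = np.asarray(block, dtype=np.float64) / 255.0
    hsv = rgb_to_hsv(rgb)                                   # h, s, v all in [0, 1]
    h_bin = np.minimum((hsv[..., 0] * 16).astype(int), 15)  # 16 hue bins
    s_bin = np.minimum((hsv[..., 1] * 8).astype(int), 7)    # 8 saturation bins
    return np.bincount((h_bin * 8 + s_bin).ravel(), minlength=128)

# f1 and f2 are element-wise 0/1 comparisons of sub-block histograms, e.g.:
# f1 = (hs_histogram(blocks["A"]) >= hs_histogram(blocks["B"])).astype(int)
```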

S202: Take the maximum-gradient map of an image as its complexity, then compute the complexity of the four sub-block images of step S1. Let the image be G; in the RGB color space, compute the maximum gradient G_max(x, y) of each pixel (x, y), then take the average of G_max over all pixels as the complexity of the image:

G_max(x, y) = max(∇R(x, y), ∇G(x, y), ∇B(x, y)); (7)

Comp_G = ( Σ_(x,y)∈G G_max(x, y) ) / Pixelnum(G); (8)

where (x, y) are the coordinates of a pixel in the image, ∇R(x, y), ∇G(x, y) and ∇B(x, y) are the gradient values of the R, G and B channels at point (x, y), Pixelnum(G) is the total number of pixels of image G, and Comp_G is the complexity of image G. The image feature values f3 and f4 are the comparison results of the complexities of two sub-blocks:

f3 = (Comp_A ≥ Comp_B); f4 = (Comp_L ≥ Comp_R); (9)

where Comp_A, Comp_B, Comp_L and Comp_R denote the complexities of the upper, lower, left and right sub-blocks, respectively.
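Equations (7)-(8) translate directly into NumPy; a sketch, with np.gradient standing in for the (unspecified) gradient operator:

```python
import numpy as np

def complexity(block):
    """Comp_G of step S202: mean over pixels of the maximum per-channel gradient."""
    img = np.asarray(block, dtype=np.float64)
    mags = []
    for c in range(3):                       # R, G, B channels
        gy, gx = np.gradient(img[..., c])
        mags.append(np.hypot(gx, gy))        # gradient magnitude of one channel
    g_max = np.maximum.reduce(mags)          # G_max(x, y), equation (7)
    return g_max.mean()                      # sum / Pixelnum(G), equation (8)

# f3 = int(complexity(blocks["A"]) >= complexity(blocks["B"]))
```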

S203: Compute the similarity between every pair of the four sub-blocks.

Given two images G1 and G2, a histogram-of-oriented-gradients (HOG) pyramid is used to compute the similarity between them. Each image is treated as a single cell with 8 orientations, and HOG features of the 3 channels are computed in RGB mode. The similarity Sim(G1, G2) between the two images is computed as follows:

Sim(G1, G2) = Σ_(i=1..m) min(H1(i), H2(i)); (10)

where G1, G2 ∈ RGB, H1 and H2 are the corresponding normalized histograms of images G1 and G2, respectively, and m is the number of cells present in the HOG feature. The image feature values f5, f6 and f7 are the comparison results of the similarities between sub-blocks:

f5 = (Sim(A,L) ≥ Sim(A,R)); f6 = (Sim(B,L) ≥ Sim(B,R)); f7 = (Sim(A,B) ≥ Sim(L,R)); (11)
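A sketch of the similarity of step S203. The patent treats each image as a single HOG cell with 8 orientations over the 3 RGB channels; the orientation-binning details and the histogram-intersection form of Sim below are assumptions, since equation (10) appears only as an image in the original:

```python
import numpy as np

def hog_descriptor(block, n_bins=8):
    """Normalized 8-orientation gradient histogram per RGB channel (one cell per image)."""
    img = np.asarray(block, dtype=np.float64)
    feats = []
    for c in range(3):
        gy, gx = np.gradient(img[..., c])
        mag = np.hypot(gx, gy)
        ang = np.mod(np.arctan2(gy, gx), np.pi)    # unsigned orientation in [0, pi)
        hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
        feats.append(hist / (hist.sum() + 1e-12))  # normalize each channel's histogram
    return np.concatenate(feats)

def sim(block1, block2):
    """Sim(G1, G2): histogram intersection of the normalized HOG descriptors (assumed form)."""
    h1, h2 = hog_descriptor(block1), hog_descriptor(block2)
    return float(np.minimum(h1, h2).sum())

# f5 = int(sim(blocks["A"], blocks["L"]) >= sim(blocks["A"], blocks["R"]))
```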

S204: Detect the salient straight lines of the four sub-blocks with the Hough transform. According to the inclination angle α of a line, if (−15° < α < 15°) or (75° < α < 105°) the line is a static line, otherwise it is a dynamic line. Compute the numbers of static and dynamic lines and the average length of all lines as image features. The image feature values f8, f9, f10, f11, f12 and f13 are the comparison results of the line attribute values between two sub-blocks:

f8 = (Len_S_A ≥ Len_S_B); f9 = (Len_D_A ≥ Len_D_B); f10 = (Ave_Len_A ≥ Ave_Len_B); (12)

f11 = (Len_S_L ≥ Len_S_R); f12 = (Len_D_L ≥ Len_D_R); f13 = (Ave_Len_L ≥ Ave_Len_R); (13)

where Len_S_A, Len_S_B, Len_S_L and Len_S_R denote the numbers of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R denote the numbers of dynamic lines in the four sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R denote the average lengths of all lines in the four sub-blocks, respectively.
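The line features of step S204 can be sketched with OpenCV's probabilistic Hough transform; the Canny and Hough parameters below are illustrative, since the patent does not specify them:

```python
import cv2
import numpy as np

def line_features(block):
    """Numbers of static/dynamic lines and mean line length (step S204)."""
    gray = cv2.cvtColor(np.asarray(block), cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                            minLineLength=30, maxLineGap=5)
    n_static, n_dynamic, lengths = 0, 0, []
    for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
        alpha = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180.0  # inclination in [0, 180)
        # Static: near-horizontal (within 15 deg) or near-vertical (75..105 deg).
        if alpha < 15 or alpha > 165 or 75 < alpha < 105:
            n_static += 1
        else:
            n_dynamic += 1
        lengths.append(np.hypot(x2 - x1, y2 - y1))
    ave_len = float(np.mean(lengths)) if lengths else 0.0
    return n_static, n_dynamic, ave_len
```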

S3: A convolutional neural network (CNN) is used to extract the high-level features of all abstract painting images; the model is shown in Fig. 3. The specific steps are as follows:

S301: Resize the four sub-blocks of the abstract painting to 128×128 RGB color images.

S302: Feed the four sub-blocks separately into the CNN, which contains three convolutional layers with stride 1, three 2×2 max pooling layers and two fully connected layers; the activation function in the convolutional layers is ReLU, the dimensions of the two fully connected layers are 1024 and 512 respectively, and a 512-dimensional vector is finally obtained for each sub-block as its neural network feature vector.

S303: Compare the feature vectors of the upper and lower sub-blocks and of the left and right sub-blocks; if a comparison is true it is represented as 1, otherwise 0, and the comparison results are the image high-level feature values f14 and f15:

f14 = (f_cnn_A ≥ f_cnn_B); f15 = (f_cnn_L ≥ f_cnn_R); (14)

where f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R denote the neural network feature vectors of the upper, lower, left and right sub-blocks, respectively. f14 and f15 are 512-dimensional.

In this embodiment, the CNN contains three convolutional layers with stride 1, with ReLU as the activation function; each convolutional layer convolves the input with its filters to obtain feature maps. The first convolutional layer consists of 16 3×3 kernels, the second of 8 3×3 kernels, and the third of 4 3×3 kernels; the feature map obtained after each convolution is zero-padded at the edges to keep its size unchanged. The CNN contains three 2×2 max pooling layers to reduce resolution; the pooling layers sample the input data to reduce parameters and avoid overfitting. The CNN contains two fully connected layers connecting all neurons, of dimensions 1024 and 512 respectively; the final 512-dimensional vector is taken as the neural network feature value, denoted f_cnn. Other parameter settings of the CNN: batch_size is 8, epochs is 10, the learning rate is 1e-4, the cost function is the cross-entropy loss, and the optimizer is Adam.
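The layer sizes above pin down the network shape: a 128×128×3 input, three conv/pool stages ending at 4×16×16 = 1024 activations, then fully connected layers of 1024 and 512. A PyTorch sketch follows; the framework choice is an assumption, as the patent names no framework:

```python
import torch
import torch.nn as nn

class SubBlockCNN(nn.Module):
    """Feature extractor of step S3: three conv+pool stages, then FC 1024 -> 512."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 128 -> 64
            nn.Conv2d(16, 8, 3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 64 -> 32
            nn.Conv2d(8, 4, 3, stride=1, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 16
        )
        self.fc = nn.Sequential(
            nn.Flatten(),              # 4 x 16 x 16 = 1024
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 512),      # 512-dim f_cnn feature vector
        )

    def forward(self, x):              # x: (batch, 3, 128, 128)
        return self.fc(self.features(x))

model = SubBlockCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # settings from the text
```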

S4: Linearly combine the image features f1-f15 of S2 and S3; the combined vector is the final feature value of the abstract painting image, with dimension 1291 (2×128 + 11 + 2×512).
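A sketch of the concatenation of step S4, reusing the helper sketches above; to_tensor_128 is a hypothetical preprocessing helper (resize to 128×128 and convert to a (1, 3, 128, 128) tensor):

```python
import numpy as np
import torch

def final_feature_vector(blocks, cnn):
    """Assemble the 1291-dim binary feature: f1,f2 (2x128) + f3..f13 (11) + f14,f15 (2x512)."""
    ge = lambda a, b: (np.atleast_1d(a) >= np.atleast_1d(b)).astype(np.int8)
    hist = {k: hs_histogram(b) for k, b in blocks.items()}
    comp = {k: complexity(b) for k, b in blocks.items()}
    line = {k: line_features(b) for k, b in blocks.items()}  # (n_static, n_dynamic, ave_len)
    with torch.no_grad():
        # to_tensor_128: hypothetical helper producing a (1, 3, 128, 128) float tensor
        f_cnn = {k: cnn(to_tensor_128(b)).numpy().ravel() for k, b in blocks.items()}
    parts = [
        ge(hist["A"], hist["B"]), ge(hist["L"], hist["R"]),                # f1, f2
        ge(comp["A"], comp["B"]), ge(comp["L"], comp["R"]),                # f3, f4
        ge(sim(blocks["A"], blocks["L"]), sim(blocks["A"], blocks["R"])),  # f5
        ge(sim(blocks["B"], blocks["L"]), sim(blocks["B"], blocks["R"])),  # f6
        ge(sim(blocks["A"], blocks["B"]), sim(blocks["L"], blocks["R"])),  # f7
    ]
    for i in range(3):                                   # f8, f9, f10 (upper vs. lower)
        parts.append(ge(line["A"][i], line["B"][i]))
    for i in range(3):                                   # f11, f12, f13 (left vs. right)
        parts.append(ge(line["L"][i], line["R"][i]))
    parts += [ge(f_cnn["A"], f_cnn["B"]), ge(f_cnn["L"], f_cnn["R"])]  # f14, f15
    return np.concatenate(parts)                         # 2*128 + 11 + 2*512 = 1291
```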

S5: Randomly select 400 paintings as original training-set images and 100 paintings as the test set; after rotation, the final training set therefore contains 1600 samples and the test set 400. To obtain more accurate classification results, the classification model is evaluated with 10-fold cross-validation. The feature values of the abstract paintings obtained in step S4 are fed into naive Bayes (NB) for training and prediction, and the abstract paintings are finally classified into four classes ("up", "down", "left" and "right"), thereby realizing automatic prediction of the orientation of abstract painting images. The orientation recognition framework is shown in Fig. 4.

When the naive Bayes classifier performs binary classification ("up" vs. "not up"), the ratio of the posterior probabilities is:

P(C1|F) / P(C2|F) = ( P(C1) · P(F|C1) ) / ( P(C2) · P(F|C2) ) = ( P(C1) · Π_i P(fi|C1) ) / ( P(C2) · Π_i P(fi|C2) ); (15)

where F = [f1, f2, …, f15] denotes the feature vector of abstract painting image G, C1 denotes the "up" class and C2 the "not up" class; P(C1) and P(C2) are the prior probabilities of the two classes, P(C1|F) and P(C2|F) their posterior probabilities, P(F|C1) and P(F|C2) the conditional probabilities of all features, and P(fi|C1) and P(fi|C2) the conditional probabilities of the i-th feature state.

All features are discrete, and P(fi|Cj) (i = 1, 2, …, 1291; j = 1, 2) follows a 0-1 distribution. The conditional probability P(fi|Cj) of each feature state can be computed in the training stage. In the prediction stage, the class to which abstract painting G should be assigned is determined from the posterior probability ratio:

G is classified as C1 if P(C1|F) / P(C2|F) ≥ T, and as C2 otherwise; (16)

where T is a threshold; in this embodiment of the present invention, T is set to 0.5.

In this embodiment, the naive Bayes classifier can also perform a four-way classification of abstract paintings, identifying the image orientation as "up", "down", "left" or "right". The specific method is as follows: the four cases are divided into four groups; in each group one direction θ is selected as one class and the remaining three directions θ̄ as the other class. The ratio of the posterior probabilities of the two classes in each group is then computed:

R_θ = P(C_θ|F) / P(C_θ̄|F); (17)

where R_θ is the posterior probability ratio of the two classes in each group. The R_θ values of the groups are compared, and the direction with the largest R_θ is taken as the correct orientation of the abstract painting image.
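Since every feature value is 0 or 1, the classifier matches scikit-learn's BernoulliNB; a sketch of the one-vs-rest ratio test of equation (17) follows (the use of this particular library is an assumption, not named in the patent):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

def predict_orientation(X_train, y_train, x):
    """Four-way decision of step S5: y_train holds labels 0..3 for up/down/left/right,
    x is one 1291-dim binary feature vector; returns the label with the largest R_theta."""
    ratios = {}
    for theta in range(4):
        nb = BernoulliNB()
        nb.fit(X_train, (y_train == theta).astype(int))  # direction theta vs. the rest
        p = nb.predict_proba(x.reshape(1, -1))[0]        # p[1] = P(C_theta|F)
        ratios[theta] = p[1] / max(p[0], 1e-12)          # R_theta, equation (17)
    return max(ratios, key=ratios.get)
```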

To fully verify the effectiveness and applicability of the method of the present invention, the classification model was tested with low-level features only, high-level features only, and the fusion of low-level and high-level features; the classification accuracies are shown in Table 1. The experimental results show that, whether the abstract painting images are divided into two classes or four, the fusion of low-level and high-level features achieves the highest classification accuracy.

Table 1: Comparison of classification accuracy under different features

[Table 1 appears only as an image in the original publication.]

In addition, the fused features were tested on commonly used classifiers; the results are shown in Table 2. The results show that, since all feature values in this embodiment are 1 or 0, the naive Bayes multi-classification model of the present invention achieves higher classification accuracy.

Table 2: Comparison of classification accuracy under different classifiers

[Table 2 appears only as an image in the original publication.]

In summary, the present invention provides an abstract painting image orientation recognition method based on feature fusion and naive Bayes: the feature value of an image is obtained by fusing low-level and high-level features and then fed into a naive Bayes classifier (NB) for training and prediction. This realizes automatic prediction of the orientation of abstract painting images and effectively identifies image orientation, i.e., it establishes the relationship between the visual content of an image and its correct orientation within a machine learning framework, and improves prediction accuracy.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features equivalently replaced, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. An abstract painting image orientation recognition method based on feature fusion and naive Bayes, characterized by comprising the following steps:
S1, rotating the abstract painting image by 0°, 90°, 180° and 270° to obtain four abstract painting images with different orientations, and splitting the abstract painting image evenly into top/bottom halves and into left/right halves, so that each abstract painting image is divided into four sub-blocks: upper, lower, left and right;
S2, extracting low-level features of the abstract painting image: computing the low-level feature description of each sub-block, and taking the comparison results of the sub-blocks' low-level feature descriptions as the image's low-level feature values, a comparison result being represented as 1 if true and 0 otherwise; step S2 specifically comprising the following steps:
S201, converting the four sub-blocks of step S1 from the RGB color space into the HSV model, dividing the H-S space into 16 hues and 8 saturations, and counting the number of pixels of the 128 resulting colors as the color histogram vector of the abstract painting; the comparison results of the histogram vectors of the upper/lower and left/right sub-block pairs serving as image feature values f1 and f2:
f1 = (Hist_A ≥ Hist_B); f2 = (Hist_L ≥ Hist_R);
wherein Hist_A, Hist_B, Hist_L and Hist_R are the histogram vectors of the upper, lower, left and right sub-blocks, respectively;
the formulas for converting an image from the RGB color space to the HSV model being:

r′ = r/255; g′ = g/255; b′ = b/255;

kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin;

h = 0°, if Δ = 0; h = 60° × ((g′ − b′)/Δ mod 6), if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2), if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4), if kmax = b′;

s = 0, if kmax = 0; otherwise s = Δ/kmax;

v = kmax;
wherein r, g and b respectively represent the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax represents the maximum and kmin the minimum of r′, g′ and b′, and h, s and v represent the hue, saturation and value of the image pixel in the HSV model;
S202, taking the maximum gradient of an image as the complexity of the image and calculating the complexity of the four sub-blocks; the comparison results of the complexities of the upper/lower and left/right sub-block pairs serving as image feature values f3 and f4:
f3 = (Comp_A ≥ Comp_B); f4 = (Comp_L ≥ Comp_R);
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexities of the upper, lower, left and right sub-blocks;
the complexity being calculated as:

G_max(x, y) = max(∇R(x, y), ∇G(x, y), ∇B(x, y));

Comp_G = ( Σ_(x,y)∈G G_max(x, y) ) / Pixelnum(G);

wherein G_max(x, y) represents the maximum gradient of pixel (x, y) in the RGB color space, ∇R(x, y), ∇G(x, y) and ∇B(x, y) respectively represent the gradient values of the R, G and B channels at point (x, y) in the image, Pixelnum(G) represents the total number of pixels of image G, and Comp_G represents the complexity of image G;
S203, calculating the similarity between every pair of the four sub-blocks, the comparison results of the similarities between sub-blocks being taken as image feature values f5, f6 and f7:
f5 = (Sim(A,L) ≥ Sim(A,R)); f6 = (Sim(B,L) ≥ Sim(B,R)); f7 = (Sim(A,B) ≥ Sim(L,R));
the similarity being calculated as:

Sim(G1, G2) = Σ_(i=1..m) min(H1(i), H2(i));

wherein Sim(G1, G2) denotes the similarity of images G1 and G2, G1, G2 ∈ RGB, H1 and H2 are respectively the corresponding normalized histograms of images G1 and G2, and m is the number of cells present in the HOG feature;
S204, detecting the salient straight lines of the four sub-blocks with the Hough transform, judging from the inclination angle α whether each line is a static line or a dynamic line, calculating the numbers of static and dynamic lines and the average length of all lines as image features, and respectively taking the comparison results of the line attribute values between sub-block pairs as image feature values f8, f9, f10, f11, f12 and f13:
f8 = (Len_S_A ≥ Len_S_B); f9 = (Len_D_A ≥ Len_D_B); f10 = (Ave_Len_A ≥ Ave_Len_B);
f11 = (Len_S_L ≥ Len_S_R); f12 = (Len_D_L ≥ Len_D_R); f13 = (Ave_Len_L ≥ Ave_Len_R);
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the numbers of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the numbers of dynamic lines in the four sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average lengths of all lines in the four sub-blocks;
S3, extracting high-level features of the abstract painting image with a convolutional neural network (CNN), specifically:
S301, resizing the four sub-blocks of the abstract painting to 128×128 RGB color images;
S302, respectively inputting the four sub-blocks into the CNN, the CNN comprising three convolutional layers with stride 1, three 2×2 max pooling layers and two fully connected layers, the activation function in the convolutional layers being ReLU and the dimensions of the two fully connected layers being 1024 and 512 respectively, a 512-dimensional vector finally being obtained for each sub-block as its neural network feature vector;
S303, comparing the feature vectors of the upper/lower and left/right sub-block pairs, the comparison results serving as image high-level feature values f14 and f15:
f14 = (f_cnn_A ≥ f_cnn_B); f15 = (f_cnn_L ≥ f_cnn_R);
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper, lower, left and right sub-blocks;
S4, linearly combining the image low-level feature values obtained in step S2 with the image high-level feature values obtained in step S3 to obtain the final feature value of the abstract painting image;
and S5, performing steps S1-S4 on all abstract paintings in the image library to obtain their final feature values, inputting them into a naive Bayes classifier for training and prediction, and finally classifying each abstract painting as "up", "down", "left" or "right", thereby realizing automatic prediction of the orientation of abstract painting images.
2. The abstract painting image orientation recognition method based on feature fusion and naive Bayes according to claim 1, wherein the network structure of the CNN is as follows: the first convolutional layer consists of 16 3×3 convolution kernels; the second convolutional layer consists of 8 3×3 convolution kernels; the third convolutional layer consists of 4 3×3 convolution kernels; the feature map obtained after each convolution is zero-padded at the edges to keep its size unchanged; after each convolutional layer, 2×2 max pooling reduces the feature resolution; finally, fully connected layers convert the four 16×16 two-dimensional matrices into a 1024-dimensional feature vector, which is then reduced to 512 dimensions.
3. The abstract painting image orientation recognition method based on feature fusion and naive Bayes according to claim 1, wherein in step S4 the vector dimension of the final feature value of the linearly combined abstract painting image is 1291.
4. The abstract painting image orientation recognition method based on feature fusion and naive Bayes according to claim 1, wherein in step S5, when the naive Bayes classifier predicts the orientation of the abstract painting image, the specific method for the four-way classification into "up", "down", "left" and "right" is:
dividing the four cases into four groups: in each group, selecting one direction as one class and the remaining three directions as the other class, and calculating the ratio of the posterior probabilities of the two classes in each group as:

R_θ = P(C_θ|F) / P(C_θ̄|F);

wherein R_θ is the posterior probability ratio of the two classes in each group, P(C_θ|F) represents the posterior probability of the selected direction, and P(C_θ̄|F) represents the posterior probability of the remaining three directions; and comparing the posterior probability ratios R_θ of the four groups and selecting the direction with the largest R_θ as the correct orientation of the abstract painting image.
CN202010737934.9A 2020-07-28 2020-07-28 Abstract painting image orientation recognition method based on feature fusion and naive Bayes Active CN111950565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010737934.9A CN111950565B (en) 2020-07-28 2020-07-28 Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010737934.9A CN111950565B (en) 2020-07-28 2020-07-28 Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Publications (2)

Publication Number Publication Date
CN111950565A CN111950565A (en) 2020-11-17
CN111950565B true CN111950565B (en) 2022-05-20

Family

ID=73338368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010737934.9A Active CN111950565B (en) 2020-07-28 2020-07-28 Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Country Status (1)

Country Link
CN (1) CN111950565B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557771A (en) * 2016-11-17 2017-04-05 电子科技大学 Skin disease color of image feature extracting method based on Naive Bayes Classifier
CN110276278A (en) * 2019-06-04 2019-09-24 刘嘉津 Insect image identification entirety and the recognition methods of multiple clips comprehensive automation
CN110956184A (en) * 2019-11-18 2020-04-03 山西大学 An Abstract Graph Orientation Determination Method Based on HSI-LBP Features

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6282193B2 (en) * 2014-07-28 2018-02-21 クラリオン株式会社 Object detection device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557771A (en) * 2016-11-17 2017-04-05 电子科技大学 Skin disease color of image feature extracting method based on Naive Bayes Classifier
CN110276278A (en) * 2019-06-04 2019-09-24 刘嘉津 Insect image identification entirety and the recognition methods of multiple clips comprehensive automation
CN110956184A (en) * 2019-11-18 2020-04-03 山西大学 An Abstract Graph Orientation Determination Method Based on HSI-LBP Features

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Nonlocal Patch Tensor Sparse Representation for Hyperspectral Image Super-Resolution;Yang Xu等;《IEEE Transactions on Image Processing》;20190118;第28卷(第6期);3034-3047 *
Orientation judgment for abstract paintings;Jia Liu等;《Multimedia Tools And Applications》;20171221;第76卷(第1期);1017-1036 *
Why my photos look sideways or upside down? Detecting canonical orientation of images using convolutional neural networks;Kunal Swami等;《2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)》;20170907;495-500 *
A survey of computable image complexity evaluation methods (in Chinese); Guo Xiaoying et al.; Acta Electronica Sinica; 20200415; vol. 48, no. 4; 819-826 *
Emotional design based on deep learning (in Chinese); Wang Xiaohui et al.; Packaging Engineering; 20170320; vol. 38, no. 6; 12-16 *
A survey of research methods on the aesthetics of painting images (in Chinese); Bai Ruyi et al.; Journal of Image and Graphics; 20191116; vol. 24, no. 11; 1860-1881 *
A survey of painting feature extraction methods and emotion analysis (in Chinese); Jia Chunhua et al.; Journal of Image and Graphics; 20180716; vol. 23, no. 7; 937-952 *

Also Published As

Publication number Publication date
CN111950565A (en) 2020-11-17

Similar Documents

Publication Publication Date Title
CN109949317B (en) Semi-supervised image example segmentation method based on gradual confrontation learning
US9633282B2 (en) Cross-trained convolutional neural networks using multimodal images
Kao et al. Visual aesthetic quality assessment with a regression model
Kao et al. Hierarchical aesthetic quality assessment using deep convolutional neural networks
US6738494B1 (en) Method for varying an image processing path based on image emphasis and appeal
Lee et al. Automatic content-aware color and tone stylization
JP4335476B2 (en) Method for changing the number, size, and magnification of photographic prints based on image saliency and appeal
Redi et al. The beauty of capturing faces: Rating the quality of digital portraits
CN110991389B (en) A Matching Method for Determining the Appearance of Target Pedestrians in Non-overlapping Camera Views
CN112070044B (en) Video object classification method and device
Lee et al. Photographic composition classification and dominant geometric element detection for outdoor scenes
CN111178208A (en) Pedestrian detection method, device and medium based on deep learning
WO2022199710A1 (en) Image fusion method and apparatus, computer device, and storage medium
CN106203448B (en) A scene classification method based on nonlinear scale space
CN107066916A (en) Scene Semantics dividing method based on deconvolution neutral net
CN111046868B (en) Object saliency detection method based on matrix low-rank sparse decomposition
CN109740539B (en) 3D object recognition method based on extreme learning machine and fusion convolutional network
CN110827304A (en) A TCM tongue image localization method and system based on deep convolutional network and level set method
CN109213886A (en) Image retrieval method and system based on image segmentation and fuzzy pattern recognition
CN114359323A (en) Image target area detection method based on visual attention mechanism
Lee et al. Property-specific aesthetic assessment with unsupervised aesthetic property discovery
CN110956184A (en) An Abstract Graph Orientation Determination Method Based on HSI-LBP Features
CN109325434A (en) A Multi-feature Probabilistic Topic Model for Image Scene Classification
Anwar et al. A survey on image aesthetic assessment
CN111950565B (en) Abstract painting image orientation recognition method based on feature fusion and naive Bayes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230629

Address after: No. 304-314, No. 16 (Plant B), Huifeng East Second Road, Zhongkai High tech Zone, Huizhou, Guangdong Province, 516000

Patentee after: HUIZHOU WEIMILI TECHNOLOGY Co.,Ltd.

Address before: 030006 No. 92, Hollywood Road, Taiyuan, Shanxi

Patentee before: SHANXI University