CN111950565B - Abstract picture image direction identification method based on feature fusion and naive Bayes - Google Patents

Abstract picture image direction identification method based on feature fusion and naive Bayes

Info

Publication number
CN111950565B
Authority
CN
China
Prior art keywords
image
sub
block
len
abstract
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010737934.9A
Other languages
Chinese (zh)
Other versions
CN111950565A (en)
Inventor
白茹意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou Weimili Technology Co ltd
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University filed Critical Shanxi University
Priority to CN202010737934.9A priority Critical patent/CN111950565B/en
Publication of CN111950565A publication Critical patent/CN111950565A/en
Application granted granted Critical
Publication of CN111950565B publication Critical patent/CN111950565B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/90 Determination of colour characteristics
    • G06V10/56 Extraction of image or video features relating to colour
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10024 Color image
    • G06T2207/20076 Probabilistic image processing
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]


Abstract

The invention belongs to the technical field of image processing and computer vision, and particularly relates to an abstract picture image direction identification method based on feature fusion and naive Bayes, which comprises the following steps: S1, rotating the abstract picture image to obtain four abstract picture images with different directions, and simultaneously dividing the abstract picture image to obtain four sub-blocks; S2, extracting low-level features of the abstract picture image; S3, extracting high-level features of the abstract picture image by adopting a convolutional neural network (CNN); S4, linearly combining the image low-level feature value and the image high-level feature value to obtain a final feature value of the abstract picture image; and S5, inputting the final feature value of the abstract picture image into a naive Bayes classifier for training and prediction. The method acquires the feature value of the image by fusing the low-level and high-level features, and then puts the feature value of the image into a naive Bayes classifier (NB) for training and prediction, thereby realizing automatic prediction of the direction of the abstract picture image and improving the prediction precision.

Description

Abstract picture image direction identification method based on feature fusion and naive Bayes
Technical Field
The invention belongs to the technical field of image processing and computer vision, and particularly relates to an abstract picture image direction identification method based on feature fusion and naive Bayes.
Background
Abstract art is a visual language that is somewhat independent of the visible world, using shapes, colors, and lines to build a composition. A painting created for emotional expression is called "hot abstraction", while a painting describing the world in an abstract manner is called "cold abstraction". When creating abstract paintings, artists usually determine the correct hanging direction of the work according to their own aesthetic concepts. Although the correct direction is typically marked on the back of the canvas, it is not apparent to non-professional viewers. In recent years, several studies in psychology have addressed the problem of the orientation of abstract paintings, and most of them hold that correctly oriented paintings receive higher aesthetic appreciation. Experiments with participants showed that the preferred orientation matched the artist's intended orientation in about half of the cases, which is much higher than chance but far from perfect. These findings provide evidence for a relationship between painting orientation and aesthetic quality, and research on orientation recognition can therefore reveal objective rules of visual aesthetic evaluation.
With the trend of information digitization, digital images of paintings can easily be found on the Internet, which makes computer-aided painting analysis possible. Various aesthetic evaluation methods have been studied by directly exploring the relationship between human aesthetic perception and computational visual characteristics, but none of these methods addresses aesthetic evaluation through computer-aided orientation judgment. The state of research on image orientation in recent years is as follows: (1) research on image orientation recognition has mainly targeted photographic pictures such as natural or scene images, where the recognition rate is satisfactory; in abstract picture images, however, the content and semantics are far less conspicuous than in photographs, so recognizing the orientation of an abstract painting is difficult, and related work in recent years is relatively scarce. (2) Humans generally recognize orientation by understanding the image content, so some methods adopt high-level semantic features to recognize image orientation, with clearly higher accuracy; this accuracy depends to a large extent on whether the semantic gap between high-level cues and low-level features can be closed.
The extensive research on the orientation of natural images prompts us to explore the orientation judgment problem for abstract paintings. The aim of the invention is to better understand the sense of orientation of abstract paintings, and in particular to establish, within a machine learning framework, the relationship between the visual content of images and their correct orientation without relying on substantial semantic content.
Disclosure of Invention
The invention overcomes the defects of the prior art, provides the method for identifying the direction of the abstract picture image based on feature fusion and naive Bayes, and can realize the automatic prediction of the direction of the abstract picture image through computer operation.
In order to solve the technical problems, the invention adopts the technical scheme that: an abstract picture image direction identification method based on feature fusion and naive Bayes comprises the following steps:
s1, rotating the abstract drawing image by 0 degrees, 90 degrees, 180 degrees and 270 degrees to obtain four abstract drawing images with different directions, and performing upper-lower average segmentation and left-right average segmentation on the abstract drawing images; therefore, each abstract picture image is divided into an upper sub-block, a lower sub-block, a left sub-block and a right sub-block;
s2, extracting low-level features of the abstract picture image, respectively calculating low-level feature descriptions of each sub-block, taking a comparison result of the low-level feature descriptions of each block as an image low-level feature value, if the comparison result is true, representing as 1, otherwise, representing as 0;
s3, extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN), which comprises the following concrete steps:
s301, adjusting the four sub-blocks of the abstract picture into 128 × 128 RGB color images;
s302, respectively inputting the four sub-blocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with a stride of 1, 3 maximum pooling layers of 2 × 2 and 2 fully-connected layers, a ReLU is adopted as the activation function in each convolutional layer, the dimensionalities of the two fully-connected layers are respectively 1024 and 512, and finally, 512-dimensional vectors are respectively obtained and used as the neural network feature vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
s4, linearly combining the image low-level characteristic value obtained in the step S2 and the image high-level characteristic value obtained in the step S3 to obtain a final characteristic value of the abstract picture image;
and S5, performing the operations of the steps S1-S4 on all abstract pictures in the image library to obtain the final characteristic value of the abstract picture, inputting the final characteristic value into a naive Bayes classifier to train and predict, and finally dividing the abstract picture into upward, downward, leftward or rightward, so that the automatic prediction of the direction of the abstract picture image is realized.
The step S2 specifically includes the following steps:
s201, converting the four subblocks in the step S1 from an RGB color space into HSV models, dividing the H-S space into 16 hues and 8 saturations, and counting the number of pixels of 128 colors to be used as a color histogram vector of an abstract picture; judging the comparison result of the histogram vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f1 and f2, wherein the specific formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
s202, representing the maximum gradient of the image as the complexity of the image, and calculating the complexity of four sub-blocks; judging the comparison result of the complexity of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f3 and f4, wherein the following formula is adopted:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexities of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
s203, calculating the similarity between every two of the four sub-blocks, and taking the comparison results of the similarities between the sub-blocks as the image feature values f5, f6 and f7; the formula is as follows:
f5=Sim(A,L)≥Sim(A,R);f6=Sim(B,L)≥Sim(B,R);f7=Sim(A,B)≥Sim(L,R);
s204, detecting the significant straight lines of the four sub-blocks by using Hough transformation, judging whether the straight lines are static lines or dynamic lines according to the inclination angles alpha of the straight lines, calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics, and respectively taking the comparison results of the straight line attribute values between the two sub-blocks as image characteristic values f8, f9, f10, f11, f12 and f13, wherein the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks.
In step S202, the calculation formula of the image complexity is as follows:
G_max(x, y) = max(∇_R(x, y), ∇_G(x, y), ∇_B(x, y))
Comp_G = ( Σ_{(x,y)∈G} G_max(x, y) ) / Pixelnum(G)
wherein G_max(x, y) represents the maximum gradient of a pixel point (x, y) of the image in the RGB color space, ∇_R(x, y), ∇_G(x, y) and ∇_B(x, y) respectively represent the gradient values of the R, G and B channels at the point (x, y), Pixelnum(G) represents the total number of pixels of the image G, and Comp_G represents the complexity of the image G.
In step S201, the formula for converting the image from the RGB color space to the HSV model is as follows:
r′ = r/255; g′ = g/255; b′ = b/255
kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin
h = 0 if Δ = 0; h = 60° × ((g′ − b′)/Δ mod 6) if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2) if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4) if kmax = b′
s = 0 if kmax = 0; s = Δ/kmax otherwise
v = kmax
wherein r, g and b respectively represent the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax represents the maximum of r′, g′ and b′, kmin represents the minimum of r′, g′ and b′, and h, s and v represent the hue, saturation and value (brightness) of the image pixel in the HSV model.
The network structure of the convolutional neural network CNN is as follows: the first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3; the feature map obtained after each convolution is zero-padded at the edges, and its size remains unchanged; after each convolutional layer, the feature resolution is reduced with 2 × 2 maximum pooling; finally, the 4 two-dimensional 16 × 16 matrices are converted into a 1024-dimensional feature vector using the fully-connected layers, and the 1024 dimensions are reduced to 512 dimensions.
In step S4, the vector dimension of the final feature value of the linearly combined abstract picture image is 1291.
In step S5, the specific method of performing four classifications, i.e., "up", "down", "left", and "right", when the naive bayes classifier predicts the direction of the abstract drawing image is:
these four cases are divided into four groups: one direction is selected as one type in each group, the other three directions are used as the other types, the ratio of the posterior probabilities of the two types in each group is calculated, and the calculation formula is as follows:
λ_θ = P(C_θ | F) / P(C_θ̄ | F)
wherein λ_θ is the posterior probability ratio of the two classes in each group, P(C_θ | F) represents the posterior probability of the direction selected in that group, and P(C_θ̄ | F) represents the posterior probability of the remaining three directions; the posterior probability ratios λ_θ of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction of the abstract picture image.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides an abstract picture image direction identification method based on feature fusion and naive Bayes, which comprises the following steps of (1) carrying out upper and lower average segmentation and left and right average segmentation on an abstract picture image. Thus, each abstract drawing is divided into four sub-blocks (up, down, left, and right). The image features are based on the comparison result of the four direction feature descriptions, so that the direction structure of the image can be embodied more specifically. (2) According to the basic principle of the abstract drawing theory, low-level features including color, complexity, similarity and linear attributes of all abstract drawing images are extracted. From the perspective of the drawing principle, the characteristics can better express the basic characteristics of abstract drawing and reflect the directionality of the image. (3) And extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN). (4) And linearly combining the low-level features and the high-level features, wherein the combined vector is the final characteristic value of the abstract picture image. Therefore, local and global characteristics of the image can be better fused, and the image direction can be more accurately detected.
Drawings
FIG. 1 is a diagram illustrating abstract drawing rotation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of abstract picture segmentation according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a CNN model adopted in the embodiment of the present invention;
FIG. 4 is a frame for identifying an abstract image direction according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments; all other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides an abstract drawing image direction recognition method based on feature fusion and naive Bayes; paintings from a public website are selected for the experiments, and the concrete implementation steps are as follows:
s1: 500 abstract pictures in the Wikiart (http:// www.wikiart.org) dataset were chosen. All abstract drawing images are rotated clockwise four directions (0 °,90 °,180 °,270 °) with reference to fig. 1. Finally, 2000 abstract picture images with different directions are obtained. And performing average segmentation on the abstract picture image up and down, and average segmentation on the abstract picture image left and right. Thus, each abstract picture is divided into four sub-blocks (up (a), down (B), left (L), and right (R)), referring to fig. 2.
S2: and extracting low-level features of all abstract drawing images according to the basic principle of the abstract drawing theory. And respectively calculating the low-level feature description of each sub-block, taking the comparison result of the feature descriptions as the final feature value of the image, and if the comparison result is true, representing as 1, otherwise, representing as 0. The method comprises the following specific steps:
s201: the four sub-blocks in S1 are converted from the RGB color space into HSV models (hue (h), saturation (S), value (v)). The calculation formula is as follows:
Figure BDA0002605778750000051
Figure BDA0002605778750000052
v=kmax; (3)
the method comprises the steps that r, g and b respectively represent RGB values of image pixels in an RGB color space, r ', g ' and b ' are intermediate variables, kmax represents the maximum value of r ', g ' and b ', kmin represents the minimum value of r ', g ' and b ', and h, s and v represent hue values, saturation and brightness of the image pixels in an HSV model.
Figure BDA0002605778750000061
kmax=max(r′,g′,b′);kmin=min(r′,g′,b′);Δ=kmax-kmin; (5)
In the direction recognition, the influence factor of brightness is small, so that the H-S space is divided into 16 hues and 8 saturations, and the number of pixels of 128 colors is counted as a color histogram vector of the painting. The image feature values f1 and f2 are the comparison result of two sub-block histogram vectors, and the formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R; (6)
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the four sub-blocks. The dimensions of f1 and f2 are 128.
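A minimal sketch of the 16 × 8 H-S color histogram and the comparison features f1 and f2 of step S201 is given below (NumPy and matplotlib's rgb_to_hsv conversion are assumed; the uniform binning of the H-S space and the helper names are illustrative assumptions rather than the authors' exact implementation):

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

def hs_histogram(block_rgb):
    """Step S201: 16 hue bins x 8 saturation bins -> 128-dimensional color histogram.

    block_rgb: H x W x 3 uint8 array of one sub-block.
    """
    hsv = rgb_to_hsv(block_rgb.astype(np.float64) / 255.0)   # h, s, v all in [0, 1]
    h_bin = np.minimum((hsv[..., 0] * 16).astype(int), 15)    # 16 hue bins
    s_bin = np.minimum((hsv[..., 1] * 8).astype(int), 7)      # 8 saturation bins
    hist = np.zeros((16, 8), dtype=np.int64)
    np.add.at(hist, (h_bin.ravel(), s_bin.ravel()), 1)        # count pixels per (h, s) cell
    return hist.ravel()                                        # 128-dimensional vector

def color_features(block_A, block_B, block_L, block_R):
    """f1 and f2: element-wise comparison of the sub-block histograms (1 if true, else 0)."""
    f1 = (hs_histogram(block_A) >= hs_histogram(block_B)).astype(int)   # 128 binary values
    f2 = (hs_histogram(block_L) >= hs_histogram(block_R)).astype(int)   # 128 binary values
    return f1, f2
```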
S202: The maximum gradient of the image is taken as the complexity of the image, and the complexity of the four sub-block images in step S1 is then calculated. Let the image be G; in the RGB color space, the maximum gradient of a pixel point (x, y) in the image is computed as G_max(x, y), and the average of G_max over all pixel points in the image is taken as the complexity of the image. The calculation formulas are as follows:
G_max(x, y) = max(∇_R(x, y), ∇_G(x, y), ∇_B(x, y)); (7)
Comp_G = ( Σ_{(x,y)∈G} G_max(x, y) ) / Pixelnum(G); (8)
wherein (x, y) represents the coordinates of a pixel point in the image, ∇_R(x, y), ∇_G(x, y) and ∇_B(x, y) are the gradient values of the R, G and B channels at the point (x, y), Pixelnum(G) is the total number of pixels in image G, and Comp_G is the complexity of image G. The image feature values f3 and f4 are the comparison results of the complexities of two sub-blocks, and the formula is as follows:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R; (9)
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexities of the upper, lower, left and right sub-blocks.
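The complexity measure of formulas (7)-(8) and the features f3 and f4 could be computed as in the following sketch (the use of np.gradient as the gradient operator is an assumption, since the text does not specify a particular operator):

```python
import numpy as np

def complexity(block_rgb):
    """Step S202: mean of the per-pixel maximum channel gradient (formulas (7)-(8))."""
    img = block_rgb.astype(np.float64)
    channel_grads = []
    for c in range(3):                          # R, G, B channels
        gy, gx = np.gradient(img[..., c])       # vertical and horizontal derivatives
        channel_grads.append(np.hypot(gx, gy))  # gradient magnitude of this channel
    g_max = np.maximum.reduce(channel_grads)    # G_max(x, y): maximum over the three channels
    return g_max.sum() / g_max.size             # Comp_G: average over all pixels

def complexity_features(block_A, block_B, block_L, block_R):
    """f3 and f4: complexity comparisons between opposite sub-blocks (1 if true, else 0)."""
    f3 = int(complexity(block_A) >= complexity(block_B))
    f4 = int(complexity(block_L) >= complexity(block_R))
    return f3, f4
```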
S203: and calculating the similarity between every two sub-blocks in the four sub-blocks.
Suppose two images G_1 and G_2; the histogram of oriented gradients (HOG) is used to compute the similarity between the two images. The HOG features of the 3 channels are calculated in RGB mode, with 8 orientations. The similarity Sim(G_1, G_2) between the two images is calculated as follows:
Sim(G_1, G_2) = Σ_{RGB} Σ_{i=1}^{M} min(H_1(i), H_2(i)) (10)
wherein G_1, G_2 ∈ RGB, H_1 and H_2 are respectively the HOG features of the images G_1 and G_2, and M is the number of cells in the HOG feature. The image feature values f5, f6 and f7 are the comparison results of the similarities between the sub-blocks, and the formula is as follows: f5 = Sim(A, L) ≥ Sim(A, R); f6 = Sim(B, L) ≥ Sim(B, R); f7 = Sim(A, B) ≥ Sim(L, R); (11)
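Since formula (10) appears only as an image in the original publication, the following sketch assumes a histogram-intersection similarity over per-channel HOG descriptors with 8 orientations, computed with scikit-image after resizing both sub-blocks to a common size; these choices (histogram intersection, the 128 × 128 resize, and the cell size) are assumptions rather than the authors' exact implementation:

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def channel_hog(block_rgb, size=(128, 128), orientations=8):
    """Per-channel HOG descriptors (8 orientations) after resizing to a common size."""
    img = resize(block_rgb, size, anti_aliasing=True)   # common size so descriptors are comparable
    return [hog(img[..., c], orientations=orientations,
                pixels_per_cell=(16, 16), cells_per_block=(1, 1),
                feature_vector=True)
            for c in range(3)]                           # one descriptor per R, G, B channel

def similarity(block1_rgb, block2_rgb):
    """Step S203: similarity of two sub-blocks via intersection of their HOG histograms."""
    h1, h2 = channel_hog(block1_rgb), channel_hog(block2_rgb)
    return sum(np.minimum(a, b).sum() for a, b in zip(h1, h2))

def similarity_features(A, B, L, R):
    """f5, f6 and f7: pairwise similarity comparisons between the four sub-blocks."""
    f5 = int(similarity(A, L) >= similarity(A, R))
    f6 = int(similarity(B, L) >= similarity(B, R))
    f7 = int(similarity(A, B) >= similarity(L, R))
    return f5, f6, f7
```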
s204: the Hough transform was used to detect the significant straight lines of the four sub-blocks. According to the inclination angle α of a straight line, the line is a static line if the inclination angle (-15 ° < α <15 °) or (75 ° < α <105 °), and a dynamic line otherwise. And calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics. The image feature values f8, f9, f10, f11, f12 and f13 are the comparison results of the straight line attribute values between two sub-blocks, and the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B; (12)
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R; (13)
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks.
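A possible realization of the line statistics of step S204 with OpenCV's probabilistic Hough transform is sketched below; the Canny and Hough parameters are illustrative values only and are not taken from the patent:

```python
import numpy as np
import cv2

def line_statistics(block_rgb):
    """Step S204: (number of static lines, number of dynamic lines, mean line length).

    A line is 'static' when its inclination lies within (-15, 15) or (75, 105) degrees,
    and 'dynamic' otherwise. block_rgb is expected to be a uint8 RGB array.
    """
    gray = cv2.cvtColor(block_rgb, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, 50, 150)                       # illustrative edge thresholds
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=20, maxLineGap=5)
    if lines is None:
        return 0, 0, 0.0
    n_static, n_dynamic, lengths = 0, 0, []
    for x1, y1, x2, y2 in lines[:, 0]:
        alpha = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180   # inclination in [0, 180)
        near_horizontal = alpha < 15 or alpha > 165               # equivalent to -15 < alpha < 15
        near_vertical = 75 < alpha < 105
        if near_horizontal or near_vertical:
            n_static += 1
        else:
            n_dynamic += 1
        lengths.append(np.hypot(x2 - x1, y2 - y1))
    return n_static, n_dynamic, float(np.mean(lengths))

# f8..f13 are then obtained by comparing these three statistics between the
# upper/lower and left/right sub-blocks, exactly as in formulas (12)-(13).
```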
S3: A convolutional neural network (CNN) is used to extract the high-level features of all abstract painting images; the model is shown in FIG. 3. The specific steps are as follows:
s301, adjusting the four sub-blocks of the abstract picture into 128 × 128 RGB color images;
s302, respectively inputting the four sub-blocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with a stride of 1, 3 maximum pooling layers of 2 × 2 and 2 fully-connected layers, a ReLU is adopted as the activation function in each convolutional layer, the dimensionalities of the two fully-connected layers are respectively 1024 and 512, and finally, 512-dimensional vectors are respectively obtained and used as the neural network feature vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block, if true, representing as 1, otherwise, representing as 0, and taking the comparison result as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R; (14)
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper, lower, left and right sub-blocks. The dimensions of f14 and f15 are 512.
In this embodiment, the CNN includes 3 convolutional layers with a stride of 1; the activation function is ReLU, and each convolutional layer convolves the input samples with filters to obtain feature maps. The first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3; the feature map obtained after each convolution is zero-padded at the edges, and its size remains unchanged. The CNN contains 3 maximum pooling layers of 2 × 2 to reduce the resolution; the pooling layers downsample the input data to reduce the number of parameters and avoid overfitting. The CNN contains 2 fully-connected layers for connecting all neurons; the dimensions of the two fully-connected layers are 1024 and 512 respectively, and the final 512-dimensional vector is used as the neural network feature value, denoted f_cnn. The other parameter settings of the CNN are: batch_size is 8, epochs is 10, the learning rate is 1e-4, the cost function is the cross-entropy loss function, and the optimizer is Adam.
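The described network could be expressed, for example, in Keras as follows (a sketch under the stated layer sizes; the classification head used during training is not detailed in the text, so this block only builds the feature extractor, and the hyperparameters above would be applied when fitting it):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn():
    """CNN of step S3: 128 x 128 x 3 sub-block -> 512-dimensional feature vector f_cnn."""
    return models.Sequential([
        tf.keras.Input(shape=(128, 128, 3)),
        layers.Conv2D(16, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),            # 128 -> 64
        layers.Conv2D(8, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),            # 64 -> 32
        layers.Conv2D(4, (3, 3), strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),            # 32 -> 16, i.e. 4 feature maps of 16 x 16
        layers.Flatten(),                       # 4 * 16 * 16 = 1024 values
        layers.Dense(1024, activation="relu"),  # first fully-connected layer
        layers.Dense(512, activation="relu"),   # second fully-connected layer -> f_cnn
    ])

# feature_extractor = build_cnn()
# f_cnn = feature_extractor.predict(subblock_batch)   # one 512-dimensional vector per sub-block
```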
S4: The image features f1-f15 of S2 and S3 are linearly combined, and the combined vector is the final feature value of the abstract painting image. The combined feature vector dimension is 1291 (128 + 128 dimensions for f1 and f2, 1 dimension each for f3 through f13, and 512 + 512 dimensions for f14 and f15).
S5: 400 paintings are randomly selected as the original images of the training set and 100 paintings as the test set, so that 1600 training samples and 400 test samples are obtained after the original images are rotated. To obtain more accurate classification results, the classification model is evaluated with 10-fold cross-validation. The feature values of the abstract paintings obtained in step S4 are put into a naive Bayes (NB) classifier for training and prediction, and each abstract painting is finally classified as upward, downward, leftward or rightward, thereby realizing automatic prediction of the direction of the abstract painting image. The abstract picture image orientation recognition framework is shown in FIG. 4.
When a naive Bayes classifier is used for two classifications ("upward" and "non-upward"), the ratio of the posterior probabilities is:
P(C_1 | F) / P(C_2 | F) = [ P(C_1) · P(F | C_1) ] / [ P(C_2) · P(F | C_2) ] = [ P(C_1) · ∏_i P(f_i | C_1) ] / [ P(C_2) · ∏_i P(f_i | C_2) ] (15)
wherein F = [f1, f2, …, f15] represents the orientation features of an abstract picture image G, C_1 denotes the "upward" class and C_2 denotes the "non-upward" class; P(C_1) and P(C_2) are respectively the prior probabilities of the two classes, P(C_1 | F) and P(C_2 | F) respectively represent the posterior probabilities of the two classes, P(F | C_1) and P(F | C_2) respectively represent the conditional probabilities of all the features, and P(f_i | C_1) and P(f_i | C_2) respectively represent the conditional probabilities of the i-th feature state.
All features are discrete, and P(f_i | C_j) (i = 1, 2, …, 1291; j = 1, 2) follows a 0-1 distribution. The conditional probability P(f_i | C_j) of each feature state can be calculated in the training phase. In the prediction stage, the class into which the abstract painting G should be classified is determined according to the posterior probability ratio, and the formula is as follows:
G ∈ C_1 if P(C_1 | F) / ( P(C_1 | F) + P(C_2 | F) ) ≥ T, otherwise G ∈ C_2 (16)
wherein T is a threshold; in the embodiment of the present invention, the threshold T takes the value 0.5.
In this embodiment, the abstract paintings can also be divided into four categories by the naive Bayes classifier, i.e. the abstract picture image is identified as one of the four directions "up", "down", "left" and "right". The specific method is as follows: these four cases are divided into four groups; in each group, one of the directions θ is selected as one class, and the remaining three directions θ̄ are taken as the other class. Then, the ratio of the posterior probabilities of the two classes in each group is calculated, and the formula is as follows:
λ_θ = P(C_θ | F) / P(C_θ̄ | F) (17)
wherein λ_θ is the posterior probability ratio of the two classes in each group. The λ_θ values of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction of the abstract picture image.
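As an illustration of the four one-vs-rest posterior-ratio decisions, the following sketch uses scikit-learn's BernoulliNB, which fits the 0-1 distributed features; the choice of BernoulliNB and the helper names are assumptions, not a statement about the authors' implementation:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

DIRECTIONS = ["up", "down", "left", "right"]

def train_one_vs_rest(X, y):
    """Train one Bernoulli naive Bayes model per direction (theta vs. not-theta).

    X: (n_samples, 1291) binary feature matrix; y: array of direction labels.
    """
    models = {}
    for theta in DIRECTIONS:
        binary_labels = (np.asarray(y) == theta).astype(int)
        models[theta] = BernoulliNB().fit(X, binary_labels)
    return models

def predict_direction(models, x):
    """Pick the direction with the largest ratio P(C_theta | F) / P(C_not_theta | F)."""
    ratios = {}
    for theta, nb in models.items():
        p_not_theta, p_theta = nb.predict_proba(x.reshape(1, -1))[0]   # classes_ is [0, 1]
        ratios[theta] = p_theta / max(p_not_theta, 1e-12)              # guard against division by zero
    return max(ratios, key=ratios.get)
```

In the embodiment, the 1600 rotated training paintings would be used to fit the four models, and predict_direction reproduces the argmax over the posterior ratios of formula (17).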
In order to fully verify the effectiveness and applicability of the method, the classification model is tested with the low-level features, the high-level features and the fused features respectively, and the classification accuracies are shown in Table 1. The experimental results show that the classification accuracy obtained by fusing the low-level and high-level features is the highest, regardless of whether the abstract painting images are divided into two or four classes.
Table 1: comparison of classification accuracy under different characteristics
In addition, the fused features were subjected to classification tests on several common classifiers, and the test results are shown in Table 2. The results show that, since the feature values in the embodiment of the invention are all 1 or 0, the classification accuracy obtained with the naive Bayes multi-classification model of the invention is higher.
Table 2: comparison of classification accuracy under different classifiers
In summary, the invention provides a method for identifying the direction of an abstract drawing image based on feature fusion and naive Bayes, which obtains the feature value of the image by means of fusion of low-level and high-level features, and then puts the feature value of the image into a naive Bayes classifier (NB) for training and prediction, thereby realizing automatic prediction of the direction of the abstract drawing image, effectively identifying the direction of the image, namely, establishing the relationship between the visual content of the image and the correct direction in the framework of machine learning, and improving the prediction precision.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. An abstract picture image direction identification method based on feature fusion and naive Bayes is characterized by comprising the following steps:
s1, rotating the abstract drawing image by 0 degrees, 90 degrees, 180 degrees and 270 degrees to obtain four abstract drawing images with different directions, and performing upper-lower average segmentation and left-right average segmentation on the abstract drawing images; therefore, each abstract picture image is divided into an upper sub-block, a lower sub-block, a left sub-block and a right sub-block;
s2, extracting low-level features of the abstract picture image, respectively calculating low-level feature descriptions of each sub-block, taking a comparison result of the low-level feature descriptions of each block as an image low-level feature value, if the comparison result is true, representing as 1, otherwise, representing as 0; the step S2 specifically includes the following steps:
s201, converting the four subblocks in the step S1 from an RGB color space into HSV models, dividing the H-S space into 16 hues and 8 saturations, and counting the number of pixels of 128 colors to be used as a color histogram vector of an abstract picture; judging the comparison result of the histogram vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f1 and f2, wherein the specific formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
the formula for converting the image from the RGB color space to the HSV model is as follows:
r′ = r/255; g′ = g/255; b′ = b/255
kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin
h = 0 if Δ = 0; h = 60° × ((g′ − b′)/Δ mod 6) if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2) if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4) if kmax = b′
s = 0 if kmax = 0; s = Δ/kmax otherwise
v = kmax
wherein r, g and b respectively represent the RGB values of image pixels in the RGB color space, r′, g′ and b′ are intermediate variables, kmax represents the maximum of r′, g′ and b′, kmin represents the minimum of r′, g′ and b′, and h, s and v represent the hue, saturation and brightness of the image pixels in the HSV model;
s202, representing the maximum gradient of the image as the complexity of the image, and calculating the complexity of the four sub-blocks; judging the comparison result of the complexity of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f3 and f4, wherein the following formula is adopted:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexities of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
the calculation formula of the complexity is as follows:
G_max(x, y) = max(∇_R(x, y), ∇_G(x, y), ∇_B(x, y))
Comp_G = ( Σ_{(x,y)∈G} G_max(x, y) ) / Pixelnum(G)
wherein G_max(x, y) represents the maximum gradient of a pixel point (x, y) of the image in the RGB color space, ∇_R(x, y), ∇_G(x, y) and ∇_B(x, y) respectively represent the gradient values of the R, G and B channels at the point (x, y), Pixelnum(G) represents the total number of pixels of the image G, and Comp_G represents the complexity of the image G;
s203, calculating the similarity between every two of the four sub-blocks, and taking the comparison results of the similarities between the sub-blocks as the image feature values f5, f6 and f7; the formula is as follows:
f5=Sim(A,L)≥Sim(A,R);f6=Sim(B,L)≥Sim(B,R);f7=Sim(A,B)≥Sim(L,R);
the similarity calculation formula is as follows:
Sim(G_1, G_2) = Σ_{RGB} Σ_{i=1}^{M} min(H_1(i), H_2(i))
wherein Sim(G_1, G_2) represents the similarity of the images G_1 and G_2, G_1, G_2 ∈ RGB, H_1(i) and H_2(i) are respectively the HOG features of the i-th cell of the images G_1 and G_2, and M is the number of cells in the HOG feature;
s204, detecting the significant straight lines of the four sub-blocks by using Hough transformation, judging whether the straight lines are static lines or dynamic lines according to the inclination angles alpha of the straight lines, calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics, and respectively taking the comparison results of the straight line attribute values between the two sub-blocks as image characteristic values f8, f9, f10, f11, f12 and f13, wherein the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks, Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks, and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks;
s3, extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN), which comprises the following concrete steps:
s301, adjusting the four sub-blocks of the abstract picture into 128 × 128 RGB color images;
s302, respectively inputting the four sub-blocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with a stride of 1, 3 maximum pooling layers of 2 × 2 and 2 fully-connected layers, a ReLU is adopted as the activation function in each convolutional layer, the dimensionalities of the two fully-connected layers are respectively 1024 and 512, and finally, 512-dimensional vectors are respectively obtained and used as the neural network feature vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper sub-block, the lower sub-block, the left sub-block and the right sub-block;
s4, linearly combining the image low-level characteristic value obtained in the step S2 and the image high-level characteristic value obtained in the step S3 to obtain a final characteristic value of the abstract picture image;
and S5, performing the operations of S1-S4 on all abstract pictures in the image library to obtain the final characteristic value of the abstract picture, inputting the final characteristic value into a naive Bayes classifier for training and prediction, and finally dividing the abstract picture into 'upward', 'downward', 'left' or 'right', thereby realizing the automatic prediction of the direction of the abstract picture image.
2. The method for identifying the direction of the abstract picture image based on the feature fusion and naive Bayes as claimed in claim 1, wherein the network structure of the convolutional neural network CNN is as follows: the first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3; the feature map obtained after each convolution is zero-padded at the edges, and its size remains unchanged; after each convolutional layer, the feature resolution is reduced with 2 × 2 maximum pooling; finally, the 4 two-dimensional 16 × 16 matrices are converted into a 1024-dimensional feature vector using the fully-connected layers, and the 1024 dimensions are reduced to 512 dimensions.
3. The method for identifying the direction of an abstract drawing image based on feature fusion and naive Bayes as claimed in claim 1, wherein in said step S4, the vector dimension of the final feature value of the linearly combined abstract drawing image is 1291.
4. The method for identifying the direction of the abstract drawing image based on the feature fusion and naive Bayes as claimed in claim 1, wherein in said step S5, the concrete method for performing four classifications of "upward", "downward", "leftward" and "rightward" when the naive Bayes classifier predicts the direction of the abstract drawing image is:
these four cases are divided into four groups: one direction is selected as one type in each group, the other three directions are used as the other types, the ratio of the posterior probabilities of the two types in each group is calculated, and the calculation formula is as follows:
λ_θ = P(C_θ | F) / P(C_θ̄ | F)
wherein λ_θ is the posterior probability ratio of the two classes in each group, P(C_θ | F) represents the posterior probability of the direction selected in that group, and P(C_θ̄ | F) represents the posterior probability of the remaining three directions; the posterior probability ratios λ_θ of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction of the abstract picture image.
CN202010737934.9A 2020-07-28 2020-07-28 Abstract picture image direction identification method based on feature fusion and naive Bayes Active CN111950565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010737934.9A CN111950565B (en) 2020-07-28 2020-07-28 Abstract picture image direction identification method based on feature fusion and naive Bayes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010737934.9A CN111950565B (en) 2020-07-28 2020-07-28 Abstract picture image direction identification method based on feature fusion and naive Bayes

Publications (2)

Publication Number Publication Date
CN111950565A CN111950565A (en) 2020-11-17
CN111950565B true CN111950565B (en) 2022-05-20

Family

ID=73338368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010737934.9A Active CN111950565B (en) 2020-07-28 2020-07-28 Abstract picture image direction identification method based on feature fusion and naive Bayes

Country Status (1)

Country Link
CN (1) CN111950565B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557771A (en) * 2016-11-17 2017-04-05 电子科技大学 Skin disease color of image feature extracting method based on Naive Bayes Classifier
CN110276278A (en) * 2019-06-04 2019-09-24 刘嘉津 Insect image identification entirety and the recognition methods of multiple clips comprehensive automation
CN110956184A (en) * 2019-11-18 2020-04-03 山西大学 Abstract diagram direction determination method based on HSI-LBP characteristics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6282193B2 (en) * 2014-07-28 2018-02-21 クラリオン株式会社 Object detection device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557771A (en) * 2016-11-17 2017-04-05 电子科技大学 Skin disease color of image feature extracting method based on Naive Bayes Classifier
CN110276278A (en) * 2019-06-04 2019-09-24 刘嘉津 Insect image identification entirety and the recognition methods of multiple clips comprehensive automation
CN110956184A (en) * 2019-11-18 2020-04-03 山西大学 Abstract diagram direction determination method based on HSI-LBP characteristics

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
Nonlocal Patch Tensor Sparse Representation for Hyperspectral Image Super-Resolution; Yang Xu et al.; IEEE Transactions on Image Processing; 2019-01-18; Vol. 28, No. 6; 3034-3047 *
Orientation judgment for abstract paintings; Jia Liu et al.; Multimedia Tools and Applications; 2017-12-21; Vol. 76, No. 1; 1017-1036 *
Why my photos look sideways or upside down? Detecting canonical orientation of images using convolutional neural networks; Kunal Swami et al.; 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW); 2017-09-07; 495-500 *
A survey of computable image complexity evaluation methods; 郭小英 et al.; Acta Electronica Sinica; 2020-04-15; Vol. 48, No. 4; 819-826 *
Emotional design based on deep learning; 王晓慧 et al.; Packaging Engineering; 2017-03-20; Vol. 38, No. 6; 12-16 *
A survey of research methods for the aesthetics of painting images; 白茹意 et al.; Journal of Image and Graphics; 2019-11-16; Vol. 24, No. 11; 1860-1881 *
A survey of painting feature extraction methods and emotion analysis; 贾春花 et al.; Journal of Image and Graphics; 2018-07-16; Vol. 23, No. 7; 937-952 *

Also Published As

Publication number Publication date
CN111950565A (en) 2020-11-17

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
Narihira et al. Learning lightness from human judgement on relative reflectance
Karayev et al. Recognizing image style
JP4335476B2 (en) Method for changing the number, size, and magnification of photographic prints based on image saliency and appeal
CN106446872A (en) Detection and recognition method of human face in video under low-light conditions
CN109151501A (en) A kind of video key frame extracting method, device, terminal device and storage medium
CN109359541A (en) A kind of sketch face identification method based on depth migration study
CN109948566B (en) Double-flow face anti-fraud detection method based on weight fusion and feature selection
CN112070044B (en) Video object classification method and device
EP1168247A2 (en) Method for varying an image processing path based on image emphasis and appeal
CN107169417B (en) RGBD image collaborative saliency detection method based on multi-core enhancement and saliency fusion
CN1975759A (en) Human face identifying method based on structural principal element analysis
CN112329851B (en) Icon detection method and device and computer readable storage medium
CN106529494A (en) Human face recognition method based on multi-camera model
CN109522883A (en) A kind of method for detecting human face, system, device and storage medium
Johnson et al. Sparse codes as alpha matte
CN111047543A (en) Image enhancement method, device and storage medium
CN109740539A (en) 3D object identification method based on transfinite learning machine and fusion convolutional network
CN110517270A (en) A kind of indoor scene semantic segmentation method based on super-pixel depth network
CN109325434A (en) A kind of image scene classification method of the probability topic model of multiple features
Liu et al. Modern architecture style transfer for ruin or old buildings
CN110956184A (en) Abstract diagram direction determination method based on HSI-LBP characteristics
CN111612090B (en) Image emotion classification method based on content color cross correlation
CN116975828A (en) Face fusion attack detection method, device, equipment and storage medium
KR20180092453A (en) Face recognition method Using convolutional neural network and stereo image

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230629

Address after: No. 304-314, No. 16 (Plant B), Huifeng East Second Road, Zhongkai High tech Zone, Huizhou, Guangdong Province, 516000

Patentee after: HUIZHOU WEIMILI TECHNOLOGY Co.,Ltd.

Address before: 030006 No. 92, Hollywood Road, Taiyuan, Shanxi

Patentee before: SHANXI University

TR01 Transfer of patent right