CN111950565B - Abstract picture image direction identification method based on feature fusion and naive Bayes - Google Patents
- Publication number
- CN111950565B (application CN202010737934.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- sub
- block
- len
- abstract
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 26
- 230000004927 fusion Effects 0.000 title claims abstract description 14
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 32
- 238000012549 training Methods 0.000 claims abstract description 8
- 239000013598 vector Substances 0.000 claims description 24
- 238000004364 calculation method Methods 0.000 claims description 12
- 230000011218 segmentation Effects 0.000 claims description 9
- 230000003068 static effect Effects 0.000 claims description 9
- 238000013528 artificial neural network Methods 0.000 claims description 7
- 238000011176 pooling Methods 0.000 claims description 7
- 230000004913 activation Effects 0.000 claims description 4
- 239000003086 colorant Substances 0.000 claims description 4
- 230000009466 transformation Effects 0.000 claims description 2
- 238000012545 processing Methods 0.000 abstract description 2
- 238000010422 painting Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 5
- 238000011160 research Methods 0.000 description 5
- 230000000007 visual effect Effects 0.000 description 5
- 238000013145 classification model Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000011156 evaluation Methods 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 241000282414 Homo sapiens Species 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 238000010801 machine learning Methods 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 210000004027 cell Anatomy 0.000 description 1
- 238000002790 cross-validation Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000002996 emotional effect Effects 0.000 description 1
- 238000013100 final test Methods 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
- G06V10/507—Summing image-intensity values; Histogram projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Algebra (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of image processing and computer vision, and particularly relates to an abstract picture image direction identification method based on feature fusion and naive Bayes, which comprises the following steps: S1, rotating the abstract picture image to obtain four abstract picture images with different directions, and simultaneously segmenting each abstract picture image to obtain four sub-blocks; S2, extracting low-level features of the abstract picture image; S3, extracting high-level features of the abstract picture image with a convolutional neural network (CNN); S4, linearly combining the image low-level feature values and the image high-level feature values to obtain the final feature value of the abstract picture image; and S5, inputting the final feature value of the abstract picture image into a naive Bayes classifier for training and prediction. The method obtains the feature value of an image by fusing low-level and high-level features and then feeds this feature value into a naive Bayes (NB) classifier for training and prediction, thereby realizing automatic prediction of the direction of an abstract picture image and improving prediction precision.
Description
Technical Field
The invention belongs to the technical field of image processing and computer vision, and particularly relates to an abstract picture image direction identification method based on feature fusion and naive Bayes.
Background
Abstract art is a visual language that is somewhat independent of the world, using shapes, colors, and lines to make a composition. A painting created for emotional expression is called "hot abstraction", while a painting describing the world in an abstract manner is called "cold abstraction". When creating abstract paintings, artists usually determine the correct hanging direction of the work according to their own aesthetic concepts. Although the correct direction is typically specified on the back of the canvas, it is not apparent to non-professional viewers. Several recent studies in psychology have addressed the orientation of abstract paintings, most of them concluding that correctly oriented paintings receive higher aesthetic appreciation. In experiments, roughly half of the orientations preferred by participants coincided with the artist's intended orientation, which is much better than chance but far from perfect. These findings provide evidence of a relationship between painting orientation and aesthetic quality, and research on orientation recognition can therefore reveal objective rules of visual aesthetic evaluation.
With the trend of information digitization, digital images of paintings can easily be found on the internet, which makes computer-aided painting analysis possible. Various aesthetic evaluation methods have been studied by directly exploring the relationship between human aesthetic perception and computational visual features, but none of them approaches aesthetic evaluation through computer-aided orientation judgment. The state of recent research on image orientation is as follows: (1) research on image direction identification has mainly targeted photographic pictures such as natural or scene images, for which the recognition rate is satisfactory; in abstract picture images, however, content and semantics are far less conspicuous than in photographs, so recognizing the direction of an abstract painting is difficult and related work in recent years is scarce. (2) Humans generally recognize direction by understanding image content, so some methods adopt high-level semantic features to recognize image direction and achieve clearly higher accuracy; that accuracy depends to a large extent on whether the semantic gap between high-level cues and low-level features can be closed.
At present, the extensive research on natural image orientation prompts us to explore the orientation judgment problem for abstract paintings. The aim of the invention is to better understand the sense of orientation of abstract paintings, and in particular to establish the relationship between the visual content of an image and its correct orientation within a machine learning framework, even in the absence of salient semantic content.
Disclosure of Invention
The invention overcomes the defects of the prior art, provides the method for identifying the direction of the abstract picture image based on feature fusion and naive Bayes, and can realize the automatic prediction of the direction of the abstract picture image through computer operation.
In order to solve the technical problems, the invention adopts the technical scheme that: an abstract picture image direction identification method based on feature fusion and naive Bayes comprises the following steps:
s1, rotating the abstract drawing image by 0 degrees, 90 degrees, 180 degrees and 270 degrees to obtain four abstract drawing images with different directions, and performing upper-lower average segmentation and left-right average segmentation on the abstract drawing images; therefore, each abstract picture image is divided into an upper sub-block, a lower sub-block, a left sub-block and a right sub-block;
s2, extracting low-level features of the abstract picture image, respectively calculating low-level feature descriptions of each sub-block, taking a comparison result of the low-level feature descriptions of each block as an image low-level feature value, if the comparison result is true, representing as 1, otherwise, representing as 0;
s3, extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN), which comprises the following concrete steps:
s301, adjusting four sub-blocks of the abstract picture into a 128 multiplied by 128 RGB color image;
s302, respectively inputting the four subblocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with the step length of 1, 3 maximum pooling layers of 2 multiplied by 2 and 2 full-connection layers, a ReLU is adopted as an activation function in each convolutional layer, the dimensionalities of the two full-connection layers are respectively 1024 and 521, and finally, 512-dimensional vectors are respectively obtained and used as neural network characteristic vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R;
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper, lower, left and right sub-blocks;
s4, linearly combining the image low-level characteristic value obtained in the step S2 and the image high-level characteristic value obtained in the step S3 to obtain a final characteristic value of the abstract picture image;
and S5, performing the operations of the steps S1-S4 on all abstract pictures in the image library to obtain the final characteristic value of the abstract picture, inputting the final characteristic value into a naive Bayes classifier to train and predict, and finally dividing the abstract picture into upward, downward, leftward or rightward, so that the automatic prediction of the direction of the abstract picture image is realized.
The step S2 specifically includes the following steps:
s201, converting the four subblocks in the step S1 from an RGB color space into HSV models, dividing the H-S space into 16 hues and 8 saturations, and counting the number of pixels of 128 colors to be used as a color histogram vector of an abstract picture; judging the comparison result of the histogram vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f1 and f2, wherein the specific formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R;
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the upper, lower, left and right sub-blocks;
s202, representing the maximum gradient of the image as the complexity of the image, and calculating the complexity of four sub-blocks; judging the comparison result of the complexity of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f3 and f4, wherein the following formula is adopted:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R;
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexity of the upper, lower, left and right sub-blocks;
s203, calculating the similarity between every two sub-blocks in the four sub-blocks; and the comparison result of the similarity between the sub-blocks is taken as the image feature values f5, and f6 and f 7; the formula is as follows:
f5=Sim(A,L)≥Sim(A,R);f6=Sim(B,L)≥Sim(B,R);f7=Sim(A,B)≥Sim(L,R);
s204, detecting the significant straight lines of the four sub-blocks by using Hough transformation, judging whether the straight lines are static lines or dynamic lines according to the inclination angles alpha of the straight lines, calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics, and respectively taking the comparison results of the straight line attribute values between the two sub-blocks as image characteristic values f8, f9, f10, f11, f12 and f13, wherein the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B;
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R;
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks; Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks; and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks.
In step S202, the image complexity is calculated as follows:
G_max(x, y) = max(G_R(x, y), G_G(x, y), G_B(x, y));
Comp_G = Σ_(x,y) G_max(x, y) / Pixelnum(G);
wherein G_max(x, y) represents the maximum gradient of the pixel point (x, y) of the image in the RGB color space, G_R(x, y), G_G(x, y) and G_B(x, y) respectively represent the gradient values of the point (x, y) in the R, G and B channels of the image, Pixelnum(G) represents the total number of pixels of the image G, and Comp_G represents the complexity of the image G.
In step S201, the image is converted from the RGB color space to the HSV model by the standard conversion:
r′ = r/255; g′ = g/255; b′ = b/255;
kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin;
h = 0 if Δ = 0; h = 60° × (((g′ − b′)/Δ) mod 6) if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2) if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4) if kmax = b′;
s = 0 if kmax = 0, otherwise s = Δ/kmax;
v = kmax;
wherein r, g and b respectively represent the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax represents the maximum of r′, g′ and b′, kmin represents the minimum of r′, g′ and b′, and h, s and v represent the hue, saturation and brightness of the image pixel in the HSV model.
The network structure of the convolutional neural network CNN is as follows: the first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3; the feature map obtained after each convolution is zero-padded at the edges so that its size remains unchanged; after each convolutional layer, the feature resolution is reduced by 2 × 2 maximum pooling; finally, the 4 two-dimensional matrices of 16 × 16 are converted into a 1024-dimensional feature vector by the fully-connected layers, and the 1024 dimensions are reduced to 512 dimensions.
In step S4, the vector dimension of the final feature value of the linearly combined abstract picture image is 1291.
In step S5, the specific method of performing four classifications, i.e., "up", "down", "left", and "right", when the naive bayes classifier predicts the direction of the abstract drawing image is:
these four cases are divided into four groups: in each group, one direction θ is selected as one class and the remaining three directions form the other class, and the ratio of the posterior probabilities of the two classes in each group is calculated as:
λ_θ = P(C_θ|F) / P(C_θ′|F);
wherein λ_θ is the posterior probability ratio of the two classes in the group, P(C_θ|F) represents the posterior probability of the selected direction, and P(C_θ′|F) represents the posterior probability of the remaining three directions; the posterior probability ratios of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction for the abstract picture image.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides an abstract picture image direction identification method based on feature fusion and naive Bayes, which comprises the following steps of (1) carrying out upper and lower average segmentation and left and right average segmentation on an abstract picture image. Thus, each abstract drawing is divided into four sub-blocks (up, down, left, and right). The image features are based on the comparison result of the four direction feature descriptions, so that the direction structure of the image can be embodied more specifically. (2) According to the basic principle of the abstract drawing theory, low-level features including color, complexity, similarity and linear attributes of all abstract drawing images are extracted. From the perspective of the drawing principle, the characteristics can better express the basic characteristics of abstract drawing and reflect the directionality of the image. (3) And extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN). (4) And linearly combining the low-level features and the high-level features, wherein the combined vector is the final characteristic value of the abstract picture image. Therefore, local and global characteristics of the image can be better fused, and the image direction can be more accurately detected.
Drawings
FIG. 1 is a diagram illustrating abstract drawing rotation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of abstract picture segmentation according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a CNN model adopted in the embodiment of the present invention;
FIG. 4 is the framework for abstract picture image direction identification according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments; all other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides an abstract picture image direction identification method based on feature fusion and naive Bayes; paintings from a public website were selected for the experiments, and the concrete implementation steps are as follows:
s1: 500 abstract pictures in the Wikiart (http:// www.wikiart.org) dataset were chosen. All abstract drawing images are rotated clockwise four directions (0 °,90 °,180 °,270 °) with reference to fig. 1. Finally, 2000 abstract picture images with different directions are obtained. And performing average segmentation on the abstract picture image up and down, and average segmentation on the abstract picture image left and right. Thus, each abstract picture is divided into four sub-blocks (up (a), down (B), left (L), and right (R)), referring to fig. 2.
S2: and extracting low-level features of all abstract drawing images according to the basic principle of the abstract drawing theory. And respectively calculating the low-level feature description of each sub-block, taking the comparison result of the feature descriptions as the final feature value of the image, and if the comparison result is true, representing as 1, otherwise, representing as 0. The method comprises the following specific steps:
s201: the four sub-blocks in S1 are converted from the RGB color space into HSV models (hue (h), saturation (S), value (v)). The calculation formula is as follows:
v=kmax; (3)
the method comprises the steps that r, g and b respectively represent RGB values of image pixels in an RGB color space, r ', g ' and b ' are intermediate variables, kmax represents the maximum value of r ', g ' and b ', kmin represents the minimum value of r ', g ' and b ', and h, s and v represent hue values, saturation and brightness of the image pixels in an HSV model.
kmax=max(r′,g′,b′);kmin=min(r′,g′,b′);Δ=kmax-kmin; (5)
In the direction recognition, the influence factor of brightness is small, so that the H-S space is divided into 16 hues and 8 saturations, and the number of pixels of 128 colors is counted as a color histogram vector of the painting. The image feature values f1 and f2 are the comparison result of two sub-block histogram vectors, and the formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R; (6)
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the four sub-blocks. The dimensions of f1 and f2 are both 128.
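A minimal sketch of the 16 × 8 H-S histogram and the resulting f1/f2 bits, assuming OpenCV with the sub-blocks given as BGR arrays (OpenCV's 8-bit hue range of 0–179 is used for the bin boundaries):

```python
import cv2
import numpy as np

def hs_histogram(img_bgr):
    """16 hue bins x 8 saturation bins -> 128-dimensional colour histogram."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [16, 8], [0, 180, 0, 256])
    return hist.flatten()  # 128 values

def colour_features(sub):
    """sub: dict with BGR arrays for blocks 'A', 'B', 'L', 'R'."""
    hists = {k: hs_histogram(v) for k, v in sub.items()}
    f1 = (hists["A"] >= hists["B"]).astype(np.uint8)  # 128 binary values
    f2 = (hists["L"] >= hists["R"]).astype(np.uint8)  # 128 binary values
    return f1, f2
```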
S202: the maximum gradient map of the image is represented as the complexity of the image, and then the complexity of the four subblock images in step S1 is calculated. Setting the image as G, and calculating the maximum gradient of pixel points (x, y) in the image as G in the RGB color spacemax(x, y). Then G of all pixel points in the image is calculatedmaxAs the complexity of the image. The calculation formula is as follows:
wherein (x, y) represents the coordinates of a pixel point in the image,is the gradient value of (x, y) point, Pixelnum (G) is the total number of pixels in image G, CompGIs the complexity of the image G. The image feature values f3 and f4 are the comparison result of the complexity of the two sub-blocks, and the formula is as follows:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R; (9)
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexity of the upper, lower, left and right sub-blocks.
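The complexity measure of formulas (7)–(8) can be sketched as follows; the gradient operator is not named in the embodiment, so Sobel gradients are an assumption:

```python
import cv2
import numpy as np

def complexity(img_bgr):
    """Average over all pixels of the maximum gradient magnitude of the three channels."""
    grads = []
    for channel in cv2.split(img_bgr):
        gx = cv2.Sobel(channel, cv2.CV_64F, 1, 0, ksize=3)
        gy = cv2.Sobel(channel, cv2.CV_64F, 0, 1, ksize=3)
        grads.append(np.hypot(gx, gy))
    g_max = np.maximum.reduce(grads)   # G_max(x, y)
    return g_max.sum() / g_max.size    # Comp_G

# With sub the sub-block dict from the step-S1 sketch (as BGR arrays):
# f3 = int(complexity(sub["A"]) >= complexity(sub["B"]))
# f4 = int(complexity(sub["L"]) >= complexity(sub["R"]))
```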
S203: and calculating the similarity between every two sub-blocks in the four sub-blocks.
Suppose two images G_1 and G_2; the histogram of oriented gradients (HOG) is used to compute the similarity between them. The HOG features of the 3 channels are calculated in RGB mode, with 8 orientations per cell. The similarity Sim(G_1, G_2) between the two images is then computed from their HOG feature vectors, wherein G_1, G_2 ∈ RGB, H_1 and H_2 are respectively the HOG feature vectors of images G_1 and G_2, and m is the number of cells in the HOG feature. The image feature values f5, f6 and f7 are the comparison results of the similarities between sub-blocks, as follows:
f5 = Sim(A, L) ≥ Sim(A, R); f6 = Sim(B, L) ≥ Sim(B, R); f7 = Sim(A, B) ≥ Sim(L, R); (11)
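Because the exact Sim(G_1, G_2) formula is not reproduced above, the sketch below uses cosine similarity between per-channel 8-orientation HOG descriptors as a stand-in; the resizing to a common size and the cell/block sizes are likewise assumptions:

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def hog_descriptor(img_rgb):
    """Concatenated 8-orientation HOG of the R, G and B channels."""
    img = resize(img_rgb, (128, 128), anti_aliasing=True)  # make blocks comparable
    feats = [hog(img[..., c], orientations=8,
                 pixels_per_cell=(16, 16), cells_per_block=(1, 1))
             for c in range(3)]
    return np.concatenate(feats)

def similarity(img1, img2):
    """Assumed similarity measure: cosine similarity of the two HOG descriptors."""
    h1, h2 = hog_descriptor(img1), hog_descriptor(img2)
    return float(h1 @ h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12)

# f5 = int(similarity(A, L) >= similarity(A, R)), etc., with A, B, L, R the RGB sub-blocks.
```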
s204: the Hough transform was used to detect the significant straight lines of the four sub-blocks. According to the inclination angle α of a straight line, the line is a static line if the inclination angle (-15 ° < α <15 °) or (75 ° < α <105 °), and a dynamic line otherwise. And calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics. The image feature values f8, f9, f10, f11, f12 and f13 are the comparison results of the straight line attribute values between two sub-blocks, and the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B; (12)
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R; (13)
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks; Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks; and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks.
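The line features of step S204 can be sketched with OpenCV's probabilistic Hough transform; the Canny and Hough parameters below are assumptions, and only the static/dynamic angle rule comes from the embodiment:

```python
import cv2
import numpy as np

def line_features(img_bgr):
    """Number of static lines, number of dynamic lines and average line length."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=20, maxLineGap=5)
    n_static, n_dynamic, lengths = 0, 0, []
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            alpha = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1))) % 180
            if alpha < 15 or alpha > 165 or 75 < alpha < 105:
                n_static += 1        # near-horizontal or near-vertical line
            else:
                n_dynamic += 1       # oblique ("dynamic") line
            lengths.append(np.hypot(x2 - x1, y2 - y1))
    ave_len = float(np.mean(lengths)) if lengths else 0.0
    return n_static, n_dynamic, ave_len
```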
S3: the convolutional Neural network CNN (convolutional Neural networks) is adopted to extract the high-level features of all abstract picture images, and the model refers to FIG. 3. The method comprises the following specific steps:
s301, adjusting four sub-blocks of the abstract picture into a 128 multiplied by 128 RGB color image;
s302, respectively inputting the four subblocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with the step length of 1, 3 maximum pooling layers of 2 multiplied by 2 and 2 full-connection layers, a ReLU is adopted as an activation function in each convolutional layer, the dimensionalities of the two full-connection layers are respectively 1024 and 521, and finally, 512-dimensional vectors are respectively obtained and used as neural network characteristic vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block, if true, representing as 1, otherwise, representing as 0, and taking the comparison result as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R; (14)
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper, lower, left and right sub-blocks. The dimensions of f14 and f15 are both 512.
In this embodiment, the CNN contains 3 convolutional layers with a step length of 1; the activation function adopts ReLU, and each convolutional layer convolves the input samples with filters to obtain feature maps. The first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3. The feature map obtained after each convolution is zero-padded at the edges so that its size remains unchanged. The CNN contains 3 maximum pooling layers of 2 × 2 to reduce resolution; the pooling layers down-sample the input data to reduce parameters and avoid overfitting. The CNN contains 2 fully-connected layers that connect all neurons, with dimensions 1024 and 512 respectively; the final 512-dimensional vector is used as the neural network feature value and is denoted f_cnn. The other parameter settings of the CNN are: batch_size is 8, epochs is 10, the learning rate is 1e-4, the cost function is the cross-entropy loss, and the optimizer is Adam.
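The network just described can be written, for example, in Keras as below; the 4-way softmax head used to drive the cross-entropy training is an assumption (the embodiment fixes the loss, optimizer, batch size and epochs but not the training target), and the 512-dimensional f_cnn layer is read out as the feature vector:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_feature_cnn():
    inp = keras.Input(shape=(128, 128, 3))
    x = layers.Conv2D(16, 3, strides=1, padding="same", activation="relu")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(8, 3, strides=1, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(4, 3, strides=1, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)                 # -> 4 feature maps of 16 x 16
    x = layers.Flatten()(x)                       # -> 1024 values
    x = layers.Dense(1024, activation="relu")(x)
    f_cnn = layers.Dense(512, activation="relu", name="f_cnn")(x)
    out = layers.Dense(4, activation="softmax")(f_cnn)   # assumed training head
    model = keras.Model(inp, out)
    model.compile(optimizer=keras.optimizers.Adam(1e-4),
                  loss="sparse_categorical_crossentropy")
    extractor = keras.Model(inp, f_cnn)           # yields the 512-d f_cnn vector
    return model, extractor

# model.fit(x_train, y_train, batch_size=8, epochs=10) as in this embodiment.
```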
S4: and linearly combining the image features f1-f15 of the S2 and the S3, wherein the combined vector is the final feature value of the abstract drawing image. The combined feature vector dimension is 1291.
S5: and randomly selecting 400 paintings as original images of a training set and 100 paintings as a test set, so that 1600 final training set samples and 400 final test set samples are obtained after the original images are rotated. To obtain more accurate classification results, the classification model was evaluated using 10-fold cross-validation. And (4) putting the characteristic values of the abstract drawing obtained in the step (S4) into Naive Bayes (NB) for training and prediction, and finally dividing the abstract drawing into four types of upward, downward, leftward and rightward, thereby realizing the automatic prediction of the image direction of the abstract drawing. The abstract picture image orientation recognition framework refers to fig. 4.
When the naive Bayes classifier is used for two classes ("upward" and "non-upward"), the ratio of the posterior probabilities is:
λ = P(C1|F) / P(C2|F) = P(C1)·P(F|C1) / (P(C2)·P(F|C2)) = P(C1)·Π_i P(fi|C1) / (P(C2)·Π_i P(fi|C2));
wherein F = [f1, f2, …, f15] represents the direction features of an abstract painting image G, C1 denotes the "upward" class and C2 the "non-upward" class, P(C1) and P(C2) are the prior probabilities of the two classes, P(C1|F) and P(C2|F) respectively represent their posterior probabilities, P(F|C1) and P(F|C2) respectively represent the conditional probabilities of all features, and P(fi|C1) and P(fi|C2) respectively represent the conditional probabilities of the i-th feature state.
All features are discrete, and P(fi|Cj) (i = 1, 2, …, 1291; j = 1, 2) follows a 0-1 (Bernoulli) distribution, so the conditional probability P(fi|Cj) of each feature state can be estimated in the training phase. In the prediction stage, the class of the abstract painting G is determined from the posterior probability ratio: G is assigned to class C1 when λ ≥ T and to class C2 otherwise, wherein T is a threshold whose value is 0.5 in the embodiment of the present invention.
In this embodiment, the abstract paintings can also be divided into four categories by the naive Bayes classifier, identifying the abstract picture image as "up", "down", "left" or "right" as follows. These four cases are divided into four groups: in each group, one direction θ is selected as one class and the remaining three directions form the other class. The ratio of the posterior probabilities of the two classes in each group is then calculated as:
λ_θ = P(C_θ|F) / P(C_θ′|F);
wherein λ_θ is the posterior probability ratio of the two classes in the group, C_θ denotes the class of the selected direction θ, and C_θ′ denotes the class formed by the remaining three directions. The λ_θ values of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction for the abstract picture image.
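A minimal sketch of this four-group rule, again assuming scikit-learn's BernoulliNB to estimate the per-group posteriors (the grouping and argmax logic follows the description above; all names are illustrative):

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

DIRECTIONS = ["up", "down", "left", "right"]

def predict_direction(X_train, y_train, f):
    """One binary NB per direction theta ("theta" vs. "not theta"); the direction
    with the largest posterior ratio lambda_theta wins."""
    log_ratios = []
    for theta in range(4):
        clf = BernoulliNB(alpha=1.0).fit(X_train, (y_train == theta).astype(int))
        log_post = clf.predict_log_proba(f.reshape(1, -1))[0]
        # clf.classes_ == [0, 1]: index 1 is "theta", index 0 is "not theta"
        log_ratios.append(log_post[1] - log_post[0])   # log lambda_theta
    return DIRECTIONS[int(np.argmax(log_ratios))]
```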
In order to fully verify the effectiveness and applicability of the method, the classification model was tested with the low-level features, the high-level features, and the fused features respectively; the classification accuracies are shown in Table 1. The experimental results show that, whether the abstract picture images are divided into two or four classes, the highest classification accuracy is obtained with the fused low-level and high-level features.
Table 1: comparison of classification accuracy under different characteristics
In addition, the fused features were also classified with several common classifiers for comparison, and the test results are shown in Table 2. The results show that, because the feature values in the embodiment of the invention are all 0 or 1, the naive Bayes multi-class model of the invention achieves higher classification accuracy.
Table 2: comparison of classification accuracy under different classifiers
In summary, the invention provides a method for identifying the direction of an abstract drawing image based on feature fusion and naive Bayes, which obtains the feature value of the image by means of fusion of low-level and high-level features, and then puts the feature value of the image into a naive Bayes classifier (NB) for training and prediction, thereby realizing automatic prediction of the direction of the abstract drawing image, effectively identifying the direction of the image, namely, establishing the relationship between the visual content of the image and the correct direction in the framework of machine learning, and improving the prediction precision.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (4)
1. An abstract picture image direction identification method based on feature fusion and naive Bayes is characterized by comprising the following steps:
s1, rotating the abstract drawing image by 0 degrees, 90 degrees, 180 degrees and 270 degrees to obtain four abstract drawing images with different directions, and performing upper-lower average segmentation and left-right average segmentation on the abstract drawing images; therefore, each abstract picture image is divided into an upper sub-block, a lower sub-block, a left sub-block and a right sub-block;
s2, extracting low-level features of the abstract picture image, respectively calculating low-level feature descriptions of each sub-block, taking a comparison result of the low-level feature descriptions of each block as an image low-level feature value, if the comparison result is true, representing as 1, otherwise, representing as 0; the step S2 specifically includes the following steps:
s201, converting the four subblocks in the step S1 from an RGB color space into HSV models, dividing the H-S space into 16 hues and 8 saturations, and counting the number of pixels of 128 colors to be used as a color histogram vector of an abstract picture; judging the comparison result of the histogram vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f1 and f2, wherein the specific formula is as follows:
f1 = Hist_A ≥ Hist_B; f2 = Hist_L ≥ Hist_R;
wherein Hist_A, Hist_B, Hist_L and Hist_R are respectively the histogram vectors of the upper, lower, left and right sub-blocks;
the formula for converting the image from the RGB color space to the HSV model is as follows:
r′ = r/255; g′ = g/255; b′ = b/255;
kmax = max(r′, g′, b′); kmin = min(r′, g′, b′); Δ = kmax − kmin;
h = 0 if Δ = 0; h = 60° × (((g′ − b′)/Δ) mod 6) if kmax = r′; h = 60° × ((b′ − r′)/Δ + 2) if kmax = g′; h = 60° × ((r′ − g′)/Δ + 4) if kmax = b′;
s = 0 if kmax = 0, otherwise s = Δ/kmax;
v = kmax;
wherein r, g and b respectively represent the RGB values of an image pixel in the RGB color space, r′, g′ and b′ are intermediate variables, kmax represents the maximum of r′, g′ and b′, kmin represents the minimum of r′, g′ and b′, and h, s and v represent the hue, saturation and brightness of the image pixel in the HSV model;
s202, representing the maximum gradient of the image as the complexity of the image, and calculating the complexity of the four sub-blocks; judging the comparison result of the complexity of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image characteristic values f3 and f4, wherein the following formula is adopted:
f3 = Comp_A ≥ Comp_B; f4 = Comp_L ≥ Comp_R;
wherein Comp_A, Comp_B, Comp_L and Comp_R respectively represent the complexity of the upper, lower, left and right sub-blocks;
the calculation formula of the complexity is as follows:
G_max(x, y) = max(G_R(x, y), G_G(x, y), G_B(x, y));
Comp_G = Σ_(x,y) G_max(x, y) / Pixelnum(G);
wherein G_max(x, y) represents the maximum gradient of the pixel point (x, y) of the image in the RGB color space, G_R(x, y), G_G(x, y) and G_B(x, y) respectively represent the gradient values of the point (x, y) in the R, G and B channels of the image, Pixelnum(G) represents the total number of pixels of the image G, and Comp_G represents the complexity of the image G;
s203, calculating the similarity between every two sub-blocks in the four sub-blocks; and the comparison result of the similarity between the sub-blocks is taken as the image feature values f5, and f6 and f 7; the formula is as follows:
f5=Sim(A,L)≥Sim(A,R);f6=Sim(B,L)≥Sim(B,R);f7=Sim(A,B)≥Sim(L,R);
the similarity is calculated from the HOG features of the two sub-blocks, wherein Sim(G_1, G_2) indicates the similarity of the images G_1 and G_2, G_1, G_2 ∈ RGB, H_1(i) and H_2(i) are respectively the i-th elements of the HOG feature vectors of images G_1 and G_2, and m is the number of cells in the HOG feature;
s204, detecting the significant straight lines of the four sub-blocks by using Hough transformation, judging whether the straight lines are static lines or dynamic lines according to the inclination angles alpha of the straight lines, calculating the number of the static lines and the dynamic lines and the average length of all the lines as image characteristics, and respectively taking the comparison results of the straight line attribute values between the two sub-blocks as image characteristic values f8, f9, f10, f11, f12 and f13, wherein the formula is as follows:
f8 = Len_S_A ≥ Len_S_B; f9 = Len_D_A ≥ Len_D_B; f10 = Ave_Len_A ≥ Ave_Len_B;
f11 = Len_S_L ≥ Len_S_R; f12 = Len_D_L ≥ Len_D_R; f13 = Ave_Len_L ≥ Ave_Len_R;
wherein Len_S_A, Len_S_B, Len_S_L and Len_S_R respectively represent the number of static lines in the upper, lower, left and right sub-blocks; Len_D_A, Len_D_B, Len_D_L and Len_D_R respectively represent the number of dynamic lines in the upper, lower, left and right sub-blocks; and Ave_Len_A, Ave_Len_B, Ave_Len_L and Ave_Len_R respectively represent the average length of all lines in the upper, lower, left and right sub-blocks;
s3, extracting high-level features of the abstract picture image by adopting a Convolutional Neural Network (CNN), which comprises the following concrete steps:
s301, adjusting four sub-blocks of the abstract picture into a 128 multiplied by 128 RGB color image;
s302, respectively inputting the four subblocks into a convolutional neural network CNN, wherein the convolutional neural network CNN comprises 3 convolutional layers with the step length of 1, 3 maximum pooling layers of 2 multiplied by 2 and 2 full-connection layers, a ReLU is adopted as an activation function in each convolutional layer, the dimensionalities of the two full-connection layers are respectively 1024 and 521, and finally, 512-dimensional vectors are respectively obtained and used as neural network characteristic vectors;
s303, judging the comparison result of the feature vectors of the upper sub-block, the lower sub-block and the left sub-block and the right sub-block as image high-level feature values f14 and f15, wherein the calculation formula is as follows:
f14 = f_cnn_A ≥ f_cnn_B; f15 = f_cnn_L ≥ f_cnn_R;
wherein f_cnn_A, f_cnn_B, f_cnn_L and f_cnn_R respectively represent the neural network feature vectors of the upper, lower, left and right sub-blocks;
s4, linearly combining the image low-level characteristic value obtained in the step S2 and the image high-level characteristic value obtained in the step S3 to obtain a final characteristic value of the abstract picture image;
and S5, performing the operations of S1-S4 on all abstract pictures in the image library to obtain the final characteristic value of the abstract picture, inputting the final characteristic value into a naive Bayes classifier for training and prediction, and finally dividing the abstract picture into 'upward', 'downward', 'left' or 'right', thereby realizing the automatic prediction of the direction of the abstract picture image.
2. The method for identifying the direction of the abstract picture image based on the feature fusion and naive Bayes as claimed in claim 1, wherein the network structure of the convolutional neural network CNN is as follows: the first convolutional layer consists of 16 convolution kernels of 3 × 3; the second convolutional layer consists of 8 convolution kernels of 3 × 3; the third convolutional layer consists of 4 convolution kernels of 3 × 3; the feature map obtained after each convolution is zero-padded at the edges so that its size remains unchanged; after each convolutional layer, the feature resolution is reduced by 2 × 2 maximum pooling; finally, the 4 two-dimensional matrices of 16 × 16 are converted into a 1024-dimensional feature vector by the fully-connected layers, and the 1024 dimensions are reduced to 512 dimensions.
3. The method for identifying the direction of an abstract drawing image based on feature fusion and naive Bayes as claimed in claim 1, wherein in said step S4, the vector dimension of the final feature value of the linearly combined abstract drawing image is 1291.
4. The method for identifying the direction of the abstract drawing image based on the feature fusion and naive Bayes as claimed in claim 1, wherein in said step S5, the concrete method for performing four classifications of "upward", "downward", "leftward" and "rightward" when the naive Bayes classifier predicts the direction of the abstract drawing image is:
these four cases are divided into four groups: in each group, one direction θ is selected as one class and the remaining three directions form the other class, and the ratio of the posterior probabilities of the two classes in each group is calculated as:
λ_θ = P(C_θ|F) / P(C_θ′|F);
wherein λ_θ is the posterior probability ratio of the two classes in the group, P(C_θ|F) represents the posterior probability of the selected direction, and P(C_θ′|F) represents the posterior probability of the remaining three directions; the posterior probability ratios of the four groups are compared, and the direction with the largest λ_θ is taken as the correct direction for the abstract picture image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010737934.9A CN111950565B (en) | 2020-07-28 | 2020-07-28 | Abstract picture image direction identification method based on feature fusion and naive Bayes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010737934.9A CN111950565B (en) | 2020-07-28 | 2020-07-28 | Abstract picture image direction identification method based on feature fusion and naive Bayes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111950565A CN111950565A (en) | 2020-11-17 |
CN111950565B true CN111950565B (en) | 2022-05-20 |
Family
ID=73338368
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010737934.9A Active CN111950565B (en) | 2020-07-28 | 2020-07-28 | Abstract picture image direction identification method based on feature fusion and naive Bayes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111950565B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106557771A (en) * | 2016-11-17 | 2017-04-05 | 电子科技大学 | Skin disease color of image feature extracting method based on Naive Bayes Classifier |
CN110276278A (en) * | 2019-06-04 | 2019-09-24 | 刘嘉津 | Insect image identification entirety and the recognition methods of multiple clips comprehensive automation |
CN110956184A (en) * | 2019-11-18 | 2020-04-03 | 山西大学 | Abstract diagram direction determination method based on HSI-LBP characteristics |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6282193B2 (en) * | 2014-07-28 | 2018-02-21 | クラリオン株式会社 | Object detection device |
-
2020
- 2020-07-28 CN CN202010737934.9A patent/CN111950565B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106557771A (en) * | 2016-11-17 | 2017-04-05 | 电子科技大学 | Skin disease color of image feature extracting method based on Naive Bayes Classifier |
CN110276278A (en) * | 2019-06-04 | 2019-09-24 | 刘嘉津 | Insect image identification entirety and the recognition methods of multiple clips comprehensive automation |
CN110956184A (en) * | 2019-11-18 | 2020-04-03 | 山西大学 | Abstract diagram direction determination method based on HSI-LBP characteristics |
Non-Patent Citations (7)
Title |
---|
Nonlocal Patch Tensor Sparse Representation for Hyperspectral Image Super-Resolution;Yang Xu等;《IEEE Transactions on Image Processing》;20190118;第28卷(第6期);3034-3047 * |
Orientation judgment for abstract paintings;Jia Liu等;《Multimedia Tools And Applications》;20171221;第76卷(第1期);1017-1036 * |
Why my photos look sideways or upside down? Detecting canonical orientation of images using convolutional neural networks;Kunal Swami等;《2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)》;20170907;495-500 * |
A Survey of Computable Image Complexity Assessment Methods (可计算图像复杂度评价方法综述); Guo Xiaoying et al.; Acta Electronica Sinica (电子学报); 20200415; Vol. 48, No. 4; 819-826 *
Emotional Design Based on Deep Learning (基于深度学习的情感化设计); Wang Xiaohui et al.; Packaging Engineering (包装工程); 20170320; Vol. 38, No. 6; 12-16 *
A Survey of Research Methods for Painting Image Aesthetics (绘画图像美学研究方法综述); Bai Ruyi et al.; Journal of Image and Graphics (中国图象图形学报); 20191116; Vol. 24, No. 11; 1860-1881 *
A Survey of Painting Feature Extraction and Emotion Analysis (绘画特征提取方法与情感分析研究综述); Jia Chunhua et al.; Journal of Image and Graphics (中国图象图形学报); 20180716; Vol. 23, No. 7; 937-952 *
Also Published As
Publication number | Publication date |
---|---|
CN111950565A (en) | 2020-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
Narihira et al. | Learning lightness from human judgement on relative reflectance | |
Karayev et al. | Recognizing image style | |
JP4335476B2 (en) | Method for changing the number, size, and magnification of photographic prints based on image saliency and appeal | |
CN106446872A (en) | Detection and recognition method of human face in video under low-light conditions | |
CN109151501A (en) | A kind of video key frame extracting method, device, terminal device and storage medium | |
CN109359541A (en) | A kind of sketch face identification method based on depth migration study | |
CN109948566B (en) | Double-flow face anti-fraud detection method based on weight fusion and feature selection | |
CN112070044B (en) | Video object classification method and device | |
EP1168247A2 (en) | Method for varying an image processing path based on image emphasis and appeal | |
CN107169417B (en) | RGBD image collaborative saliency detection method based on multi-core enhancement and saliency fusion | |
CN1975759A (en) | Human face identifying method based on structural principal element analysis | |
CN112329851B (en) | Icon detection method and device and computer readable storage medium | |
CN106529494A (en) | Human face recognition method based on multi-camera model | |
CN109522883A (en) | A kind of method for detecting human face, system, device and storage medium | |
Johnson et al. | Sparse codes as alpha matte | |
CN111047543A (en) | Image enhancement method, device and storage medium | |
CN109740539A (en) | 3D object identification method based on transfinite learning machine and fusion convolutional network | |
CN110517270A (en) | A kind of indoor scene semantic segmentation method based on super-pixel depth network | |
CN109325434A (en) | A kind of image scene classification method of the probability topic model of multiple features | |
Liu et al. | Modern architecture style transfer for ruin or old buildings | |
CN110956184A (en) | Abstract diagram direction determination method based on HSI-LBP characteristics | |
CN111612090B (en) | Image emotion classification method based on content color cross correlation | |
CN116975828A (en) | Face fusion attack detection method, device, equipment and storage medium | |
KR20180092453A (en) | Face recognition method Using convolutional neural network and stereo image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20230629 Address after: No. 304-314, No. 16 (Plant B), Huifeng East Second Road, Zhongkai High tech Zone, Huizhou, Guangdong Province, 516000 Patentee after: HUIZHOU WEIMILI TECHNOLOGY Co.,Ltd. Address before: 030006 No. 92, Hollywood Road, Taiyuan, Shanxi Patentee before: SHANXI University |
|
TR01 | Transfer of patent right |