CN109472209A - A kind of image-recognizing method, device and storage medium - Google Patents
A kind of image-recognizing method, device and storage medium
- Publication number
- CN109472209A (application CN201811191691.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- description information
- target
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses an image recognition method, comprising: obtaining a target image; recognizing the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information and determining, according to the detection result, whether the target image is a target-class image. The invention also discloses an image recognition device and a computer-readable storage medium.
Description
Technical field
The present invention relates to machine learning techniques, and more particularly to an image recognition method, device and computer-readable storage medium.
Background technique
In the prior art, recognition of bad images mostly uses conventional supervised learning: bad images are collected and labeled according to their type, and a classifier is then trained. Here, to recognize different types of bad images, images of each type must be collected and a separate classifier learned for each; different classifiers then recognize different types of bad images. No unified recognition scheme can be obtained, and the effect is poor.
Summary of the invention
In view of this, the main purpose of the present invention is to provide an image recognition method, device and computer-readable storage medium.
In order to achieve the above objectives, the technical scheme of the present invention is realized as follows:
The embodiment of the invention provides an image recognition method, the method comprising:
obtaining a target image;
recognizing the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image;
detecting the target description information and determining, according to the detection result, whether the target image is a target-class image.
In the above scheme, the method further includes: generating the image recognition model;
generating the image recognition model comprises:
obtaining a preset number of sample images and obtaining the sample description information of each sample image;
extracting respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image;
training a computation model according to the image features, the text word features and the sample description information, and obtaining the image recognition model based on the trained computation model.
In the above scheme, the computation model includes a time-recurrent neural network, namely a Long Short-Term Memory network (LSTM);
training the LSTM according to the image features, the text word features and the sample description information comprises:
sequentially inputting the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features;
classifying the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generating prediction description information according to the at least one predicted word;
comparing the prediction description information with the sample description information and optimizing the LSTM according to the comparison result.
In the above scheme, extracting the text word feature of the sample description information corresponding to a sample image comprises:
segmenting the sample description information into words to obtain at least one sample description word, and determining the text word feature according to the sample description words.
In the above scheme, recognizing the target image with the preset image recognition model to obtain the target description information for the target image comprises:
extracting the image feature of the target image with the preset image recognition model and determining at least one description word according to the image feature;
generating the target description information for the target image according to the at least one description word.
In the above scheme, detecting the target description information and determining, according to the detection result, whether the target image is a target-class image comprises:
detecting whether the target description information includes a preset sensitive word, and when determining that the target description information includes a preset sensitive word, determining that the target image is a target-class image.
The embodiment of the invention provides an image recognition device, the device including a first processing module, a second processing module and a third processing module; wherein
the first processing module is configured to obtain a target image;
the second processing module is configured to recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image;
the third processing module is configured to detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
In the above scheme, the device further includes a preprocessing module configured to generate the image recognition model;
the preprocessing module is specifically configured to obtain a preset number of sample images and obtain the sample description information of each sample image; extract respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image; and train a computation model according to the image features, the text word features and the sample description information, obtaining the image recognition model based on the trained computation model.
The embodiment of the invention provides an image recognition device, the device including a processor and a memory for storing a computer program runnable on the processor; wherein
the processor is configured to execute the steps of any one of the above image recognition methods when running the computer program.
The embodiment of the invention provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of any one of the above image recognition methods when executed by a processor.
The image recognition method, device and computer-readable storage medium provided by the embodiments of the present invention obtain a target image; recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detect the target description information, determining according to the detection result whether the target image is a target-class image. In the embodiments of the present invention, recognizing the target image yields its description information, and whether the target image is a target-class image is judged from that description information; image recognition no longer requires training multiple classifiers, realizing a unified image recognition method and greatly improving the recognition effect.
Detailed description of the invention
Fig. 1 is a flow diagram of an image recognition method provided in an embodiment of the present invention;
Fig. 2 is a flow diagram of another image recognition method provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a ResNet50 network structure provided in an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a downsampling module provided in an embodiment of the present invention;
Fig. 5 is a convolution flow diagram provided in an embodiment of the present invention;
Fig. 6 is a maximum-value pooling flow diagram provided in an embodiment of the present invention;
Fig. 7 is a training flow chart of a computation model provided in an embodiment of the present invention;
Fig. 8 is a basic structural schematic diagram of an LSTM memory unit provided in an embodiment of the present invention;
Fig. 9 is a structural schematic diagram of an image recognition device provided in an embodiment of the present invention;
Figure 10 is a structural schematic diagram of another image recognition device provided in an embodiment of the present invention.
Specific embodiment
In the various embodiments of the present invention: a target image is obtained; the target image is recognized with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and the target description information is detected, whether the target image is a target-class image being determined according to the detection result.
The present invention is described below in further detail with reference to embodiments.
Fig. 1 is a flow diagram of an image recognition method provided in an embodiment of the present invention; the method can be applied to a server. As shown in Fig. 1, the method comprises:
Step 101: obtain a target image.
Here, the target image is the image to be recognized. In one embodiment, the target image may be stored in the server, and the server reads the target image it has saved, i.e., obtains the target image; in another embodiment, another terminal may send the target image to the server, so that the server obtains the target image.
Step 102: recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image.
In the present embodiment, the method further includes: generating the image recognition model.
Specifically, generating the image recognition model comprises:
obtaining a preset number of sample images and obtaining the sample description information of each sample image;
extracting respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image;
training a computation model according to the image features, the text word features and the sample description information, and obtaining the image recognition model based on the trained computation model.
Both the image features and the text word features are in word-vector form.
Specifically, the computation model includes an LSTM.
Training the LSTM according to the image features, the text word features and the sample description information comprises:
sequentially inputting the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features;
classifying the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generating prediction description information according to the at least one predicted word;
comparing the prediction description information with the sample description information and optimizing the LSTM according to the comparison result.
Here, the LSTM is optimized repeatedly through the above steps to obtain the optimized LSTM, and the computation model is obtained from the optimized LSTM.
The computation model further includes: a module for extracting the image feature of each sample image, a module for extracting the text word feature of the sample description information corresponding to each sample image, and a module for performing word prediction according to the classification results to obtain at least one predicted word.
Specifically, extracting the text word feature of the sample description information corresponding to a sample image comprises:
segmenting the sample description information into words to obtain at least one sample description word, and determining the text word feature according to the sample description words.
Specifically, extracting the image feature of each sample image comprises:
recognizing the sample image and performing feature extraction on the sample image to obtain the image feature of the sample image.
In the present embodiment, recognizing the target image with the preset image recognition model to obtain the target description information for the target image comprises:
extracting the image feature of the target image with the preset image recognition model and determining at least one description word according to the image feature;
generating the target description information for the target image according to the at least one description word.
Step 103: detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
Specifically, detecting the target description information and determining, according to the detection result, whether the target image is a target-class image comprises:
detecting whether the target description information includes a preset sensitive word, and when determining that the target description information includes a preset sensitive word, determining that the target image is a target-class image.
Here, the target description information describes the content presented by the target image, for example: "a group of riot police advancing through a night thick with gunpowder smoke", "an angry crowd of protesters burning a national flag", "a terrorist holding two rifles", "a group playing by the sea". The preset sensitive words include: terrorist, protest, riot, etc. Matching the above target description information against the sensitive words, the descriptions containing the sensitive words "riot", "protest" and "terrorist" — namely "a group of riot police advancing through a night thick with gunpowder smoke", "an angry crowd of protesters burning a national flag" and "a terrorist holding two rifles" — correspond to target images that are target-class images.
Fig. 2 is a flow diagram of another image recognition method provided in an embodiment of the present invention; the method is for recognizing bad images and can be applied to a server. As shown in Fig. 2, the method comprises:
Step 201: collect good images as positive samples.
Specifically, step 201 includes: obtaining a first preset number of good images as positive samples. Each good image corresponds to first description information, which may include: an image download address, an image file name, and five Chinese descriptions corresponding to the image.
Step 202: collect bad images and add Chinese descriptions as negative samples.
Specifically, step 202 includes: obtaining a second preset number of bad images as negative samples, and determining second description information of the bad images. The second description information may include: an image download address, an image file name, and five Chinese descriptions corresponding to the image. Here, the bad images may include images involving violence, pornography, terror and other undesirable content. The ratio of the first preset quantity to the second preset quantity may be between 1/2 and 1/5, preferably 1/3.
Step 203: preprocess the Chinese descriptions in the positive samples and negative samples respectively.
Specifically, in order to better extract features, the Chinese descriptions of the positive samples and the negative samples may each be preprocessed; the preprocessing comprises at least one of the following:
Chinese word segmentation, for which the Jieba segmentation tool may be used to segment the Chinese text;
word2ix, mapping each word to a word index;
ix2word, mapping each word index to a word;
id2ix, determining the image index corresponding to an image file name;
ix2id, determining the image file name corresponding to an image index;
filtering low-frequency words, i.e., removing words that appear rarely in the text descriptions, including some auxiliary words;
padding to equal length, i.e., completing data of different lengths so that all sequences are equally long.
It should be noted that associating image file names and words with indices is equivalent to defining a dictionary: the index corresponding to a word can be looked up from the word, and the word from its index. This facilitates word statistics and also enables low-frequency words to be filtered.
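The dictionary-style preprocessing above (word2ix/ix2word, low-frequency filtering, padding to equal length) can be sketched as follows. This is an illustrative sketch only: the captions and the `</pad>` token are made up for the example, not taken from the patent's actual implementation.

```python
from collections import Counter

def build_vocab(tokenized_captions, min_freq=2, pad="</pad>"):
    """Build the word2ix / ix2word maps, dropping low-frequency words."""
    counts = Counter(w for caption in tokenized_captions for w in caption)
    words = [pad] + sorted(w for w, c in counts.items() if c >= min_freq)
    word2ix = {w: i for i, w in enumerate(words)}
    ix2word = {i: w for w, i in word2ix.items()}
    return word2ix, ix2word

def pad_to_equal_length(seqs, pad_ix=0):
    """Complete sequences of different lengths so all are equally long."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_ix] * (max_len - len(s)) for s in seqs]

# Toy segmented captions (stand-ins for Jieba output).
captions = [["riot", "police", "night"], ["riot", "crowd"], ["sea"]]
word2ix, ix2word = build_vocab(captions, min_freq=2)
encoded = [[word2ix[w] for w in c if w in word2ix] for c in captions]
padded = pad_to_equal_length(encoded, pad_ix=word2ix["</pad>"])
```

With `min_freq=2`, only "riot" survives filtering, so every caption reduces to its occurrences of that word and is then padded to the longest encoded length.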
Step 204: extract image features through a deep residual network (ResNet).
Specifically, ResNet performs the conversion from the image semantic space to the word-vector semantic space, extracting the word vector in the word-vector semantic space, i.e., the image feature.
Here, ResNet extracts a 2048-dimensional vector, which becomes a 256-dimensional vector after a fully connected layer. The output of ResNet's pooling layer — the 2048-dimensional feature vector of the second-to-last layer — serves as input to a fully connected layer constructed with a 2048-dimensional input and a 256-dimensional output; feeding the extracted feature vector into this fully connected layer yields the 256-dimensional vector, completing the conversion from the image semantic space to the word-vector semantic space.
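The 2048-to-256-dimensional fully connected projection described above can be illustrated as follows. The weights here are random placeholders standing in for learned parameters, and the NumPy vector stands in for a real ResNet pooling-layer output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for the ResNet pooling-layer output:
# one 2048-dimensional feature vector.
pooled_feature = rng.standard_normal(2048)

# Fully connected layer mapping the 2048-dim image feature into the
# 256-dim word-vector semantic space (random placeholder weights;
# in practice they are learned jointly with the rest of the model).
W = 0.01 * rng.standard_normal((256, 2048))
b = np.zeros(256)

image_feature = W @ pooled_feature + b  # 256-dim vector fed to the LSTM
```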
In the present embodiment, ResNet50 as shown in Fig. 3 may be used as the ResNet; the structure of the ResNet module in Fig. 3 is shown in Fig. 4.
In Fig. 4, BN is Batch Normalization. RELU is the rectified linear unit (Rectified Linear Unit) function, of the form θ(x) = max(0, x). CONV is a convolutional layer, which extracts image features by performing convolution operations on the image. In a convolutional neural network, each convolutional layer usually contains multiple trainable convolution templates (i.e., convolution kernels), and different convolution templates correspond to different image features. After a convolution kernel performs the convolution operation on the input image, a nonlinear activation function — such as the Sigmoid function, the rectified linear unit (RELU) function, or the ELU function — maps the result into the corresponding feature map (Feature Map). The parameters of the convolution kernels are usually computed with a specific learning algorithm (such as stochastic gradient descent). The convolution referred to here is the operation of taking the weighted sum of the template parameters and the pixel values at the corresponding positions of the image. A typical convolution process is shown in Fig. 5: by sliding the template window over all positions of the input image, the corresponding feature map is obtained.
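The sliding-template convolution just described can be sketched as a minimal NumPy cross-correlation without padding (the form commonly implemented in CNN frameworks); the 4x4 image and 2x2 kernel are illustrative values only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the template over every position and take the weighted sum of
    template parameters and the pixel values underneath (no padding)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input image
kernel = np.array([[1.0, 0.0], [0.0, 1.0]])        # illustrative 2x2 template
feature_map = conv2d_valid(image, kernel)          # 3x3 feature map
```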
In the present embodiment, a convolutional neural network is advantageous in that it abandons the "full connection" design between adjacent layers in traditional neural networks: through local connections and weight sharing, the number of model parameters to be trained is significantly reduced, reducing the amount of computation. Local connection means that each neuron in the convolutional neural network is connected to a local region of the input image rather than fully connected to all neurons. Weight sharing means that different regions of the input image share the connection parameters (i.e., the convolution kernel parameters). In addition, the local-connection and weight-sharing design gives the features extracted by the network a high degree of stability, insensitive to translation, scaling, deformation and the like.
A pooling layer usually appears paired with a convolutional layer, following it, and is used to perform a downsampling operation on the input feature maps. After convolution, an input image commonly yields a large number of feature maps, and the excessively high feature dimensionality sharply increases the network's computation. By reducing the dimension of the feature maps, the pooling layer greatly reduces the number of model parameters, which on the one hand reduces the computation of network operations and on the other hand reduces the risk of network over-fitting. The feature maps obtained by pooling correspond one-to-one to those of the convolutional layer, so the pooling operation only reduces the feature-map dimensions; their number does not change.
Three pooling methods are involved in the convolutional neural network of the present embodiment: maximum-value pooling (Max Pooling), mean-value pooling (Mean Pooling) and stochastic pooling (Stochastic Pooling). For a sampling subregion, maximum-value pooling selects the point with the largest pixel value as the output of the region; mean-value pooling computes the mean of all pixels in the region and uses it as the output; stochastic pooling randomly selects one pixel value from the region as the output, with larger pixel values generally having a higher probability of being selected. The maximum-value pooling process is shown in Fig. 6.
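The maximum-value and mean-value pooling operations above can be sketched as follows, assuming non-overlapping size-by-size subregions (stochastic pooling is omitted for brevity); the 4x4 feature map is an illustrative example.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Downsample by taking the max (or mean) of each size x size subregion."""
    H, W = feature_map.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feature_map[i * size:(i + 1) * size,
                                j * size:(j + 1) * size]
            out[i, j] = patch.max() if mode == "max" else patch.mean()
    return out

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 1.],
                 [0., 1., 5., 6.],
                 [2., 2., 7., 8.]])
max_pooled = pool2d(fmap, mode="max")    # each 2x2 block -> its maximum
mean_pooled = pool2d(fmap, mode="mean")  # each 2x2 block -> its mean
```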
Step 205: obtain text word features from the preprocessed Chinese descriptions.
Specifically, the preprocessed Chinese descriptions pass through an embedding layer (Embedding) to obtain the text word features, each text word feature being a 256-dimensional vector. Common embedding methods include Word2vec, GloVe, etc.
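An embedding layer of the kind described, viewed as a lookup table from word indices to 256-dimensional vectors, can be sketched as follows; the random table and the word indices are illustrative stand-ins for learned Word2vec/GloVe-style embeddings and real word2ix output.

```python
import numpy as np

vocab_size, embed_dim = 1000, 256
rng = np.random.default_rng(1)
# Embedding layer as a lookup table: row i is the 256-dim vector of word index i.
embedding = rng.standard_normal((vocab_size, embed_dim))

caption_ixs = np.array([3, 17, 42])          # word indices after word2ix mapping
text_word_features = embedding[caption_ixs]  # one 256-dim vector per word
```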
Step 206: train a computation model according to the image features and text word features, and obtain the image recognition model based on the trained computation model.
Specifically, the computation model includes an LSTM.
Step 206 includes: taking, in order, the image feature and text word features (i.e., the word vectors in Fig. 7) corresponding to each image obtained in steps 204 and 205, and sequentially inputting them into the LSTM for training; computing the output for the word vector of each word, i.e., obtaining the first result feature from the image feature and the second result features from the text word features; and classifying the first result feature and second result features, obtaining at least one predicted word according to the classification results. The above specific process is shown in Fig. 7.
Prediction description information is generated according to the at least one predicted word; the prediction description information is compared with the sample description information, and the LSTM is optimized according to the comparison result.
Here, the image feature may be regarded as the word vector of the first word and spliced together with the other word vectors (i.e., the text word features) as input to the LSTM; the output of the LSTM serves as the classification feature for predicting the index of the next word.
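The splicing of the image feature as the "first word" with the text word vectors can be sketched as follows (shapes only; the zero and one arrays are dummies standing in for real features).

```python
import numpy as np

image_feature = np.zeros((1, 256))      # 256-dim image feature, acting as the "first word"
text_word_features = np.ones((9, 256))  # 256-dim vectors for nine caption words

# Splice the image feature in front of the word vectors; the resulting
# sequence is what is fed into the LSTM step by step.
lstm_input = np.concatenate([image_feature, text_word_features], axis=0)
```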
Here, the basic structure of the LSTM memory unit is shown in Fig. 8, where x_t is the input at the current time step. Suppose the values of the input gate unit and the forget gate unit are i_t and f_t respectively, satisfying formulas (1) and (2):
i_t = σ(W_xi·x_t + W_hi·h_(t-1) + W_ci·c_(t-1) + b_i) (1)
f_t = σ(W_xf·x_t + W_hf·h_(t-1) + W_cf·c_(t-1) + b_f) (2)
where f_t is the forget gate; the state c_t of the memory cell can then be calculated by formula (3):
c_t = f_t·c_(t-1) + i_t·tanh(W_xc·x_t + W_hc·h_(t-1) + b_c) (3)
The value of the output gate unit is determined by the current cell state, but it is the filtered cell value: a Sigmoid unit first determines the output gate value, and the hyperbolic tangent function then constrains the cell state to the range -1 to 1, that is:
o_t = σ(W_xo·x_t + W_ho·h_(t-1) + W_co·c_t + b_o) (4)
The output h_t of the hidden unit is jointly determined by the cell state and the output gate unit, satisfying:
h_t = o_t·tanh(c_t) (5)
Above, σ(·) denotes the Sigmoid function, W the weight matrices connecting the units, and b the bias vectors of the units.
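One memory-unit step following formulas (1)-(5) can be sketched directly in NumPy. The random parameter matrices are placeholders for learned weights, and the peephole terms W_ci, W_cf and W_co on the cell state are included as in the formulas; this is an illustrative sketch, not the patent's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    # Input gate (1) and forget gate (2), with peephole terms on c_(t-1).
    i_t = sigmoid(p["Wxi"] @ x_t + p["Whi"] @ h_prev + p["Wci"] @ c_prev + p["bi"])
    f_t = sigmoid(p["Wxf"] @ x_t + p["Whf"] @ h_prev + p["Wcf"] @ c_prev + p["bf"])
    # Cell state update (3).
    c_t = f_t * c_prev + i_t * np.tanh(p["Wxc"] @ x_t + p["Whc"] @ h_prev + p["bc"])
    # Output gate (4), with a peephole term on the new cell state c_t.
    o_t = sigmoid(p["Wxo"] @ x_t + p["Who"] @ h_prev + p["Wco"] @ c_t + p["bo"])
    # Hidden output (5).
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

dim_x, dim_h = 256, 128
rng = np.random.default_rng(2)
p = {}
for g in ("i", "f", "c", "o"):
    p["Wx" + g] = 0.01 * rng.standard_normal((dim_h, dim_x))
    p["Wh" + g] = 0.01 * rng.standard_normal((dim_h, dim_h))
    p["b" + g] = np.zeros(dim_h)
for g in ("i", "f", "o"):  # peephole weights; formula (3) has none for the cell input
    p["Wc" + g] = 0.01 * rng.standard_normal((dim_h, dim_h))

h, c = np.zeros(dim_h), np.zeros(dim_h)
h, c = lstm_step(rng.standard_normal(dim_x), h, c, p)
```

Because h_t = o_t·tanh(c_t) with o_t in (0, 1) and tanh in (-1, 1), every component of the hidden output lies strictly between -1 and 1.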
Here, the word prediction method is described further. In the present embodiment, the output of each word (referring to the image feature and the text word features) is classified, and the next word is predicted according to the classification result. The specific method is: the first n-1 words are used as input, and the last n-1 words serve as the prediction targets. In the present embodiment, considering that a greedy algorithm easily traps the search in a local optimum, a dynamic-programming beam search algorithm is used for training: at each search step, only the n most likely words are recorded; the search then continues to the next word, yielding n*n sequences; the word after that yields n*n*n sequences, of which the n*n with the highest probability are kept; the search continues in this way until the optimal result is finally obtained.
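A minimal beam search of the kind described can be sketched as follows. The per-step word distributions are toy values and, unlike a real model, are not conditioned on the words chosen so far; scores are accumulated as log-probabilities.

```python
import math

def beam_search(step_probs, beam_width):
    """Beam search over a fixed-horizon word lattice: at each step keep only
    the beam_width most probable partial sequences."""
    beams = [([], 0.0)]  # (sequence of words, cumulative log-probability)
    for probs in step_probs:  # probs: word -> P(word) at this step
        candidates = []
        for seq, score in beams:
            for word, prob in probs.items():
                candidates.append((seq + [word], score + math.log(prob)))
        candidates.sort(key=lambda sc: sc[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the beam width
    return beams[0][0]

# Toy per-step distributions (independent purely for illustration).
step_probs = [
    {"a": 0.6, "the": 0.4},
    {"terrorist": 0.7, "crowd": 0.3},
    {"holds": 0.9, "burns": 0.1},
]
best = beam_search(step_probs, beam_width=2)
```

With independent steps the beam simply tracks the two best prefixes at each step and ends on the highest-probability full sequence.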
Step 207: obtain an image to be recognized, recognize it with the image recognition model to obtain target description information for the image to be recognized, the target description information describing the content presented by the image to be recognized; detect the target description information and determine, according to the detection result, whether the image is a bad image.
Specifically, step 207 includes: extracting the semantic feature (image feature) of the image to be recognized with the image recognition model, and obtaining the target description information of the image to be recognized according to the semantic feature; when a preset sensitive word appears in the target description information, the image to be recognized is determined to be a bad image.
Here, the sensitive words may be pre-saved in the server and may include undesirable words such as terror, violence and pornography. The server performs matching using a sensitive-word filtering algorithm (DFA); when a sensitive word is matched in the target description information, the image to be recognized is determined to be a bad image.
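A trie-based sensitive-word filter of the DFA kind described can be sketched as follows; the word list and the `"#end"` marker key are illustrative choices, not the patent's actual implementation.

```python
def build_trie(sensitive_words):
    """Build the DFA/trie used for sensitive-word filtering."""
    root = {}
    for word in sensitive_words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#end"] = True  # marks a complete sensitive word
    return root

def contains_sensitive_word(text, trie):
    """Scan the description once; True if any sensitive word occurs."""
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if node.get("#end"):
                return True
    return False

trie = build_trie(["terrorist", "riot"])
```

Descriptions matching a sensitive word would then flag the corresponding image as a bad image, while descriptions without any match pass through.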
Fig. 9 is a structural schematic diagram of an image recognition device provided in an embodiment of the present invention; as shown in Fig. 9, the device includes: a first processing module 301, a second processing module 302 and a third processing module 303.
The first processing module 301 is configured to obtain a target image.
The second processing module 302 is configured to recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image.
The third processing module 303 is configured to detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
Specifically, the device further includes a preprocessing module configured to generate the image recognition model.
The preprocessing module is specifically configured to obtain a preset number of sample images and obtain the sample description information of each sample image; extract respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image; and train a computation model according to the image features, the text word features and the sample description information, obtaining the image recognition model based on the trained computation model.
Here, the computation model includes an LSTM.
The preprocessing module is specifically configured to sequentially input the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features; classify the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generate prediction description information according to the at least one predicted word; and compare the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
Specifically, the preprocessing module is specifically configured to segment the sample description information into words, obtaining at least one sample description word, and determine the text word feature according to the sample description words.
Specifically, the second processing module 302 is specifically configured to extract the image feature of the target image with the preset image recognition model, determine at least one description word according to the image feature, and generate the target description information for the target image according to the at least one description word.
Specifically, the third processing module 303 is specifically configured to detect whether the target description information includes a preset sensitive word and, when determining that the target description information includes a preset sensitive word, determine that the target image is a target-class image.
It should be understood that when the image recognition device provided by the above embodiment performs image recognition, the division into the above program modules is merely illustrative; in practical applications, the above processing may be distributed to different program modules as needed, i.e., the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the image recognition device provided by the above embodiment and the image recognition method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which will not be repeated here.
Figure 10 is a structural schematic diagram of another image recognition device provided by an embodiment of the present invention. The image recognition device can be applied to a server. As shown in Figure 10, the device 40 includes a processor 401 and a memory 402 for storing a computer program runnable on the processor. When running the computer program, the processor 401 executes: obtaining a target image; identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information, determining from the detection result whether the target image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: obtaining a preset number of sample images and the sample description information of each sample image; extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
In one embodiment, when running the computer program, the processor 401 also executes: inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature; classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
In one embodiment, when running the computer program, the processor 401 also executes: segmenting the sample description information into words, obtaining at least one sample description word, and determining the text word feature from the sample description words.
In one embodiment, when running the computer program, the processor 401 also executes: extracting the image feature of the target image with the preset image recognition model, determining at least one descriptor from the image feature, and generating the target description information for the target image from the at least one descriptor.
In one embodiment, when running the computer program, the processor 401 also executes: detecting whether the target description information includes a preset sensitive word and, when it does, determining that the target image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: when the first recognition result meets a first preset condition, determining that the first image is a target-class image, the first preset condition being that the sum of the confidences corresponding to at least two first attributes in the first recognition result is greater than a first preset threshold; and when the first recognition result does not meet the first preset condition, determining the weight corresponding to each of the at least one first attribute, obtaining a first confidence from the confidences and weights corresponding to the at least one first attribute, and determining from the first confidence whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: recognizing the first image to obtain a second recognition result, the second recognition result including at least one emotion type expressed by the first image and the confidence corresponding to each of the at least one emotion type. Correspondingly, determining from the first confidence whether the first image is a target-class image includes: determining from the first confidence and the second recognition result whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: when the second recognition result meets a second preset condition, determining that the first image is a target-class image, the second preset condition being that the confidence corresponding to a target emotion type in the second recognition result is greater than a second preset threshold; and when the second recognition result does not meet the second preset condition, determining the weight corresponding to each of the at least one emotion type, determining a second confidence from the weights and confidences corresponding to the at least one emotion type, and determining from the first confidence together with the second confidence whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: determining a first weight corresponding to the first confidence and a second weight corresponding to the second confidence; obtaining a target confidence from the first confidence, the first weight, the second confidence and the second weight; and determining from the target confidence whether the first image is a target-class image.
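The final combination step reads as a weighted blend of the two confidences compared against a threshold. A minimal sketch, assuming illustrative weight values and threshold (none of which are fixed by the patent):

```python
def target_confidence(first_confidence, second_confidence,
                      first_weight=0.6, second_weight=0.4):
    """Target confidence as the weighted combination of the first and
    second confidences; the weights here are placeholders."""
    return first_confidence * first_weight + second_confidence * second_weight

def is_target_class(first_confidence, second_confidence, threshold=0.5):
    """Decide target-class membership from the combined confidence."""
    return target_confidence(first_confidence, second_confidence) > threshold
```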
In one embodiment, when running the computer program, the processor 401 also executes: when the first image is determined to include a face, extracting at least one face image from the first image and recognizing the face image with a preset second image recognition model to obtain the second recognition result, the second recognition result including at least one facial emotion type expressed by the first image and the confidence corresponding to each facial emotion type; and when the first image is determined not to include a face, extracting a scene feature from the first image and recognizing the scene feature with a preset third image recognition model to obtain the second recognition result, the second recognition result including at least one environmental emotion type expressed by the first image and the confidence corresponding to each environmental emotion type.
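The face-or-scene branch above is a simple model dispatch. In this sketch both models are hypothetical stand-in callables returning `{emotion_type: confidence}` mappings; the face-detection result is passed in as a flag.

```python
def second_recognition(image, contains_face, face_model, scene_model):
    """Pick the model that produces the second recognition result:
    a facial-emotion model when a face is present, otherwise an
    environment-emotion model applied to scene features."""
    if contains_face:
        return face_model(image)    # preset second image recognition model
    return scene_model(image)       # preset third image recognition model
```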
In one embodiment, when running the computer program, the processor 401 also executes: obtaining a preset number of sample images, each sample image corresponding to at least one first attribute; and performing learning and training based on a convolutional neural network with the preset number of sample images and the at least one first attribute corresponding to each sample image, obtaining the first image recognition model.
In one embodiment, when running the computer program, the processor 401 also executes: setting the convolutional neural network to multi-label mode, the convolutional layers of the convolutional neural network including multiple convolution modules for learning and training, different convolution modules corresponding to different image features; performing learning and training on each of the at least one first attribute with the multiple convolution modules and the preset number of sample images; and obtaining the first image recognition model for recognizing the at least one first attribute.
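The point of multi-label mode is that each convolution module scores one first attribute independently, so a single image may be tagged with several attributes at once, unlike a softmax single-label classifier. A sketch of the inference side, with hypothetical attribute names, scores and threshold:

```python
def multilabel_predict(module_scores, threshold=0.5):
    """Given per-attribute scores (one per convolution module), return every
    first attribute whose score clears the threshold, sorted for stability."""
    return sorted(a for a, s in module_scores.items() if s >= threshold)
```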
It should be noted that the image recognition device provided by the above embodiment and the image recognition method embodiment belong to the same concept; its implementation is detailed in the method embodiment and is not repeated here.
In practical applications, the device 40 may further include at least one network interface 403. The components of the image recognition device 40 are coupled through a bus system 404. It will be understood that the bus system 404 realizes the connections and communication between these components. Besides the data bus, the bus system 404 also includes a power bus, a control bus and a status signal bus; for clarity of explanation, however, the various buses are all labelled as bus system 404 in Figure 10. The number of processors 401 may be at least one. The network interface 403 is used for wired or wireless communication between the image recognition device 40 and other equipment.
The memory 402 in the embodiments of the present invention stores various types of data to support the operation of the device 40. The methods disclosed in the embodiments of the present invention may be applied in, or implemented by, the processor 401. The processor 401 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods may be completed by an integrated hardware logic circuit in the processor 401 or by instructions in the form of software. The processor 401 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, and so on. The processor 401 may implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. A software module may be located in a storage medium located in the memory 402; the processor 401 reads the information in the memory 402 and completes the steps of the foregoing methods in combination with its hardware.
In an exemplary embodiment, the image recognition device 40 may be implemented by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor) or other electronic components, for executing the foregoing methods.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When run by a processor, the computer program executes: obtaining a target image; identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information, determining from the detection result whether the target image is a target-class image.
In one embodiment, when run by a processor, the computer program executes: obtaining a preset number of sample images and the sample description information of each sample image; extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
In one embodiment, when run by a processor, the computer program executes: inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature; classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
In one embodiment, when run by a processor, the computer program executes: segmenting the sample description information into words, obtaining at least one sample description word, and determining the text word feature from the sample description words.
In one embodiment, when run by a processor, the computer program executes: extracting the image feature of the target image with the preset image recognition model, determining at least one descriptor from the image feature, and generating the target description information for the target image from the at least one descriptor.
In one embodiment, when run by a processor, the computer program executes: detecting whether the target description information includes a preset sensitive word and, when it does, determining that the target image is a target-class image.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be realized in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function; other divisions are possible in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical or of other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated in one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated in one unit. The integrated unit may be realized in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
Alternatively, if the above integrated unit of the present invention is realized in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk or an optical disc.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (10)
1. An image recognition method, characterized in that the method includes:
obtaining a target image;
identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and
detecting the target description information, determining from the detection result whether the target image is a target-class image.
2. The method according to claim 1, characterized in that the method also includes generating the image recognition model, the generating of the image recognition model including:
obtaining a preset number of sample images, and obtaining the sample description information of each sample image;
extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and
training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
3. The method according to claim 2, characterized in that the computation model includes a long short-term memory (LSTM) network;
the training of the LSTM with the image feature, the text word feature and the sample description information including:
inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature;
classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and
comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
4. The method according to claim 2, characterized in that extracting the text word feature of the sample description information corresponding to the sample image includes:
segmenting the sample description information into words, obtaining at least one sample description word; and determining the text word feature from the sample description words.
5. The method according to claim 1, characterized in that identifying the target image with the preset image recognition model to obtain the target description information for the target image includes:
extracting the image feature of the target image with the preset image recognition model, and determining at least one descriptor from the image feature; and
generating the target description information for the target image from the at least one descriptor.
6. The method according to claim 1, characterized in that detecting the target description information and determining from the detection result whether the target image is a target-class image includes:
detecting whether the target description information includes a preset sensitive word and, when the target description information does include a preset sensitive word, determining that the target image is a target-class image.
7. An image recognition device, characterized in that the device includes a first processing module, a second processing module and a third processing module, wherein:
the first processing module is configured to obtain a target image;
the second processing module is configured to identify the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and
the third processing module is configured to detect the target description information, determining from the detection result whether the target image is a target-class image.
8. The device according to claim 7, characterized in that the device further includes a preprocessing module configured to generate the image recognition model;
the preprocessing module being specifically configured to: obtain a preset number of sample images and the sample description information of each sample image; extract, for each sample image, its image feature and the text word feature of the corresponding sample description information; and train a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
9. An image recognition device, characterized in that the device includes a processor and a memory for storing a computer program runnable on the processor, wherein the processor is configured to perform, when running the computer program, the steps of the method of any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, realizes the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811191691.2A CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811191691.2A CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109472209A true CN109472209A (en) | 2019-03-15 |
CN109472209B CN109472209B (en) | 2021-06-29 |
Family
ID=65663731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811191691.2A Active CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109472209B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110162639A (en) * | 2019-04-16 | 2019-08-23 | 深圳壹账通智能科技有限公司 | Knowledge figure knows the method, apparatus, equipment and storage medium of meaning |
CN110705460A (en) * | 2019-09-29 | 2020-01-17 | 北京百度网讯科技有限公司 | Image category identification method and device |
CN111181835A (en) * | 2019-10-17 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN111241993A (en) * | 2020-01-08 | 2020-06-05 | 咪咕文化科技有限公司 | Seat number determination method and device, electronic equipment and storage medium |
CN111291649A (en) * | 2020-01-20 | 2020-06-16 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111709406A (en) * | 2020-08-18 | 2020-09-25 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN111931840A (en) * | 2020-08-04 | 2020-11-13 | 中国建设银行股份有限公司 | Picture classification method, device, equipment and storage medium |
CN112614568A (en) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | Inspection image processing method and device, storage medium and electronic equipment |
CN112906726A (en) * | 2019-11-20 | 2021-06-04 | 北京沃东天骏信息技术有限公司 | Model training method, image processing method, device, computing device and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104680189A (en) * | 2015-03-15 | 2015-06-03 | 西安电子科技大学 | Pornographic image detection method based on improved bag-of-words model |
US20170115853A1 (en) * | 2015-10-21 | 2017-04-27 | Google Inc. | Determining Image Captions |
CN106846306A (en) * | 2017-01-13 | 2017-06-13 | 重庆邮电大学 | A kind of ultrasonoscopy automatic describing method and system |
CN107122806A (en) * | 2017-05-16 | 2017-09-01 | 北京京东尚科信息技术有限公司 | A kind of nude picture detection method and device |
CN107133951A (en) * | 2017-05-22 | 2017-09-05 | 中国科学院自动化研究所 | Distorted image detection method and device |
CN107391505A (en) * | 2016-05-16 | 2017-11-24 | 腾讯科技(深圳)有限公司 | A kind of image processing method and system |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110162639A (en) * | 2019-04-16 | 2019-08-23 | 深圳壹账通智能科技有限公司 | Knowledge figure knows the method, apparatus, equipment and storage medium of meaning |
CN110705460A (en) * | 2019-09-29 | 2020-01-17 | 北京百度网讯科技有限公司 | Image category identification method and device |
CN111181835A (en) * | 2019-10-17 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN111181835B (en) * | 2019-10-17 | 2021-07-27 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN112906726A (en) * | 2019-11-20 | 2021-06-04 | 北京沃东天骏信息技术有限公司 | Model training method, image processing method, device, computing device and medium |
CN112906726B (en) * | 2019-11-20 | 2024-01-16 | 北京沃东天骏信息技术有限公司 | Model training method, image processing device, computing equipment and medium |
CN111241993A (en) * | 2020-01-08 | 2020-06-05 | 咪咕文化科技有限公司 | Seat number determination method and device, electronic equipment and storage medium |
CN111241993B (en) * | 2020-01-08 | 2023-10-20 | 咪咕文化科技有限公司 | Seat number determining method and device, electronic equipment and storage medium |
CN111291649B (en) * | 2020-01-20 | 2023-08-25 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111291649A (en) * | 2020-01-20 | 2020-06-16 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111931840A (en) * | 2020-08-04 | 2020-11-13 | 中国建设银行股份有限公司 | Picture classification method, device, equipment and storage medium |
CN111709406B (en) * | 2020-08-18 | 2020-11-06 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN111709406A (en) * | 2020-08-18 | 2020-09-25 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN112614568A (en) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | Inspection image processing method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109472209B (en) | 2021-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109472209A (en) | A kind of image-recognizing method, device and storage medium | |
Xu et al. | Reasoning-rcnn: Unifying adaptive global reasoning into large-scale object detection | |
Wang et al. | Research on face recognition based on deep learning | |
CN107633207B (en) | AU characteristic recognition methods, device and storage medium | |
Li et al. | Multiple-human parsing in the wild | |
Zhang et al. | Relationship proposal networks | |
Alani et al. | Hand gesture recognition using an adapted convolutional neural network with data augmentation | |
CN108664924B (en) | Multi-label object identification method based on convolutional neural network | |
He et al. | Supercnn: A superpixelwise convolutional neural network for salient object detection | |
Guo et al. | Human attribute recognition by refining attention heat map | |
Chen et al. | Research on recognition of fly species based on improved RetinaNet and CBAM | |
CN106951825A (en) | A kind of quality of human face image assessment system and implementation method | |
CN110276248B (en) | Facial expression recognition method based on sample weight distribution and deep learning | |
CN109522925A (en) | A kind of image-recognizing method, device and storage medium | |
Lei et al. | A skin segmentation algorithm based on stacked autoencoders | |
CN109117879A (en) | Image classification method, apparatus and system | |
CN110765954A (en) | Vehicle weight recognition method, equipment and storage device | |
CN106485260A (en) | The method and apparatus that the object of image is classified and computer program | |
CN109344856B (en) | Offline signature identification method based on multilayer discriminant feature learning | |
CN106599864A (en) | Deep face recognition method based on extreme value theory | |
Shang et al. | Image spam classification based on convolutional neural network | |
CN114882521A (en) | Unsupervised pedestrian re-identification method and unsupervised pedestrian re-identification device based on multi-branch network | |
CN107818299A (en) | Face recognition algorithms based on fusion HOG features and depth belief network | |
Li et al. | Latent semantic representation learning for scene classification | |
CN110427912A (en) | A kind of method for detecting human face and its relevant apparatus based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |