CN109472209A - A kind of image-recognizing method, device and storage medium - Google Patents
A kind of image-recognizing method, device and storage medium
- Publication number
- CN109472209A (application CN201811191691.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- feature
- description information
- target
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses an image recognition method, comprising: obtaining a target image; recognizing the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information and determining, according to the detection result, whether the target image is a target-class image. The invention also discloses an image recognition device and a computer-readable storage medium.
Description
Technical field
The present invention relates to machine learning techniques, and more particularly to an image recognition method, device and computer-readable storage medium.
Background technique
In the prior art, recognition of bad images mostly uses conventional supervised learning: bad images are collected and labeled according to their type, and a classifier is then trained. Here, to recognize different types of bad images, images of each type must be collected and a separate classifier learned for each; different classifiers then recognize different types of bad images. No unified recognition scheme can be obtained, and the effect is poor.
Summary of the invention
In view of this, the main purpose of the present invention is to provide an image recognition method, device and computer-readable storage medium.
In order to achieve the above objectives, the technical scheme of the present invention is realized as follows:
The embodiment of the invention provides an image recognition method, the method comprising:
obtaining a target image;
recognizing the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image;
detecting the target description information and determining, according to the detection result, whether the target image is a target-class image.
In the above scheme, the method further includes: generating the image recognition model;
generating the image recognition model comprises:
obtaining a preset number of sample images and obtaining the sample description information of each sample image;
extracting respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image;
training a computation model according to the image features, the text word features and the sample description information, and obtaining the image recognition model based on the trained computation model.
In the above scheme, the computation model includes a time-recurrent neural network, namely a Long Short-Term Memory network (LSTM);
training the LSTM according to the image features, the text word features and the sample description information comprises:
sequentially inputting the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features;
classifying the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generating prediction description information according to the at least one predicted word;
comparing the prediction description information with the sample description information and optimizing the LSTM according to the comparison result.
In the above scheme, extracting the text word feature of the sample description information corresponding to a sample image comprises:
segmenting the sample description information into words to obtain at least one sample description word, and determining the text word feature according to the sample description words.
In the above scheme, recognizing the target image with the preset image recognition model to obtain the target description information for the target image comprises:
extracting the image feature of the target image with the preset image recognition model and determining at least one description word according to the image feature;
generating the target description information for the target image according to the at least one description word.
In the above scheme, detecting the target description information and determining, according to the detection result, whether the target image is a target-class image comprises:
detecting whether the target description information includes a preset sensitive word, and when determining that the target description information includes a preset sensitive word, determining that the target image is a target-class image.
The embodiment of the invention provides an image recognition device, the device including a first processing module, a second processing module and a third processing module; wherein
the first processing module is configured to obtain a target image;
the second processing module is configured to recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image;
the third processing module is configured to detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
In the above scheme, the device further includes a preprocessing module configured to generate the image recognition model;
the preprocessing module is specifically configured to obtain a preset number of sample images and obtain the sample description information of each sample image; extract respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image; and train a computation model according to the image features, the text word features and the sample description information, obtaining the image recognition model based on the trained computation model.
The embodiment of the invention provides an image recognition device, the device including a processor and a memory for storing a computer program runnable on the processor; wherein
the processor is configured to execute the steps of any one of the above image recognition methods when running the computer program.
The embodiment of the invention provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of any one of the above image recognition methods when executed by a processor.
The image recognition method, device and computer-readable storage medium provided by the embodiments of the present invention obtain a target image; recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detect the target description information, determining according to the detection result whether the target image is a target-class image. In the embodiments of the present invention, recognizing the target image yields its description information, and whether the target image is a target-class image is judged from that description information; image recognition no longer requires training multiple classifiers, realizing a unified image recognition method and greatly improving the recognition effect.
Detailed description of the invention
Fig. 1 is a flow diagram of an image recognition method provided in an embodiment of the present invention;
Fig. 2 is a flow diagram of another image recognition method provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a ResNet50 network structure provided in an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a downsampling module provided in an embodiment of the present invention;
Fig. 5 is a convolution flow diagram provided in an embodiment of the present invention;
Fig. 6 is a maximum-value pooling flow diagram provided in an embodiment of the present invention;
Fig. 7 is a training flow chart of a computation model provided in an embodiment of the present invention;
Fig. 8 is a basic structural schematic diagram of an LSTM memory unit provided in an embodiment of the present invention;
Fig. 9 is a structural schematic diagram of an image recognition device provided in an embodiment of the present invention;
Figure 10 is a structural schematic diagram of another image recognition device provided in an embodiment of the present invention.
Specific embodiment
In the various embodiments of the present invention: a target image is obtained; the target image is recognized with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and the target description information is detected, whether the target image is a target-class image being determined according to the detection result.
The present invention is described below in further detail with reference to embodiments.
Fig. 1 is a flow diagram of an image recognition method provided in an embodiment of the present invention; the method can be applied to a server. As shown in Fig. 1, the method comprises:
Step 101: obtain a target image.
Here, the target image is the image to be recognized. In one embodiment, the target image may be stored in the server, and the server reads the target image it has saved, i.e., obtains the target image; in another embodiment, another terminal may send the target image to the server, so that the server obtains the target image.
Step 102: recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image.
In the present embodiment, the method further includes: generating the image recognition model.
Specifically, generating the image recognition model comprises:
obtaining a preset number of sample images and obtaining the sample description information of each sample image;
extracting respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image;
training a computation model according to the image features, the text word features and the sample description information, and obtaining the image recognition model based on the trained computation model.
Both the image features and the text word features are in word-vector form.
Specifically, the computation model includes an LSTM.
Training the LSTM according to the image features, the text word features and the sample description information comprises:
sequentially inputting the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features;
classifying the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generating prediction description information according to the at least one predicted word;
comparing the prediction description information with the sample description information and optimizing the LSTM according to the comparison result.
Here, the LSTM is optimized repeatedly through the above steps to obtain the optimized LSTM, and the computation model is obtained from the optimized LSTM.
The computation model further includes: a module for extracting the image feature of each sample image, a module for extracting the text word feature of the sample description information corresponding to each sample image, and a module for performing word prediction according to the classification results to obtain at least one predicted word.
Specifically, extracting the text word feature of the sample description information corresponding to a sample image comprises:
segmenting the sample description information into words to obtain at least one sample description word, and determining the text word feature according to the sample description words.
Specifically, extracting the image feature of each sample image comprises:
recognizing the sample image and performing feature extraction on the sample image to obtain the image feature of the sample image.
In the present embodiment, recognizing the target image with the preset image recognition model to obtain the target description information for the target image comprises:
extracting the image feature of the target image with the preset image recognition model and determining at least one description word according to the image feature;
generating the target description information for the target image according to the at least one description word.
Step 103: detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
Specifically, detecting the target description information and determining, according to the detection result, whether the target image is a target-class image comprises:
detecting whether the target description information includes a preset sensitive word, and when determining that the target description information includes a preset sensitive word, determining that the target image is a target-class image.
Here, the target description information describes the content presented by the target image, for example: "a group of riot police advancing through a night thick with gunpowder smoke", "an angry crowd of protesters burning a national flag", "a terrorist holding two rifles", "a group playing by the sea". The preset sensitive words include: terrorist, protest, riot, etc. Matching the above target description information against the sensitive words, the descriptions containing the sensitive words "riot", "protest" and "terrorist" — namely "a group of riot police advancing through a night thick with gunpowder smoke", "an angry crowd of protesters burning a national flag" and "a terrorist holding two rifles" — correspond to target images that are target-class images.
Fig. 2 is a flow diagram of another image recognition method provided in an embodiment of the present invention; the method is for recognizing bad images and can be applied to a server. As shown in Fig. 2, the method comprises:
Step 201: collect good images as positive samples.
Specifically, step 201 includes: obtaining a first preset number of good images as positive samples. Each good image corresponds to first description information, which may include: an image download address, an image file name, and five Chinese descriptions corresponding to the image.
Step 202: collect bad images and add Chinese descriptions as negative samples.
Specifically, step 202 includes: obtaining a second preset number of bad images as negative samples, and determining second description information of the bad images. The second description information may include: an image download address, an image file name, and five Chinese descriptions corresponding to the image. Here, the bad images may include images involving violence, pornography, terror and other undesirable content. The ratio of the first preset quantity to the second preset quantity may be between 1/2 and 1/5, preferably 1/3.
Step 203: preprocess the Chinese descriptions in the positive samples and negative samples respectively.
Specifically, in order to better extract features, the Chinese descriptions of the positive samples and the negative samples may each be preprocessed; the preprocessing comprises at least one of the following:
Chinese word segmentation, for which the Jieba segmentation tool may be used to segment the Chinese text;
word2ix, mapping each word to a word index;
ix2word, mapping each word index to a word;
id2ix, determining the image index corresponding to an image file name;
ix2id, determining the image file name corresponding to an image index;
filtering low-frequency words, i.e., removing words that appear rarely in the text descriptions, including some auxiliary words;
padding to equal length, i.e., completing data of different lengths so that all sequences are equally long.
It should be noted that associating image file names and words with indices is equivalent to defining a dictionary: the index corresponding to a word can be looked up from the word, and the word from its index. This facilitates word statistics and also enables low-frequency words to be filtered.
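The dictionary-style preprocessing above (word2ix/ix2word, low-frequency filtering, padding to equal length) can be sketched as follows. This is an illustrative sketch only: the captions and the `</pad>` token are made up for the example, not taken from the patent's actual implementation.

```python
from collections import Counter

def build_vocab(tokenized_captions, min_freq=2, pad="</pad>"):
    """Build the word2ix / ix2word maps, dropping low-frequency words."""
    counts = Counter(w for caption in tokenized_captions for w in caption)
    words = [pad] + sorted(w for w, c in counts.items() if c >= min_freq)
    word2ix = {w: i for i, w in enumerate(words)}
    ix2word = {i: w for w, i in word2ix.items()}
    return word2ix, ix2word

def pad_to_equal_length(seqs, pad_ix=0):
    """Complete sequences of different lengths so all are equally long."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_ix] * (max_len - len(s)) for s in seqs]

# Toy segmented captions (stand-ins for Jieba output).
captions = [["riot", "police", "night"], ["riot", "crowd"], ["sea"]]
word2ix, ix2word = build_vocab(captions, min_freq=2)
encoded = [[word2ix[w] for w in c if w in word2ix] for c in captions]
padded = pad_to_equal_length(encoded, pad_ix=word2ix["</pad>"])
```

With `min_freq=2`, only "riot" survives filtering, so every caption reduces to its occurrences of that word and is then padded to the longest encoded length.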
Step 204: extract image features through a deep residual network (ResNet).
Specifically, ResNet performs the conversion from the image semantic space to the word-vector semantic space, extracting the word vector in the word-vector semantic space, i.e., the image feature.
Here, ResNet extracts a 2048-dimensional vector, which becomes a 256-dimensional vector after a fully connected layer. The output of ResNet's pooling layer — the 2048-dimensional feature vector of the second-to-last layer — serves as input to a fully connected layer constructed with a 2048-dimensional input and a 256-dimensional output; feeding the extracted feature vector into this fully connected layer yields the 256-dimensional vector, completing the conversion from the image semantic space to the word-vector semantic space.
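The 2048-to-256-dimensional fully connected projection described above can be illustrated as follows. The weights here are random placeholders standing in for learned parameters, and the NumPy vector stands in for a real ResNet pooling-layer output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for the ResNet pooling-layer output:
# one 2048-dimensional feature vector.
pooled_feature = rng.standard_normal(2048)

# Fully connected layer mapping the 2048-dim image feature into the
# 256-dim word-vector semantic space (random placeholder weights;
# in practice they are learned jointly with the rest of the model).
W = 0.01 * rng.standard_normal((256, 2048))
b = np.zeros(256)

image_feature = W @ pooled_feature + b  # 256-dim vector fed to the LSTM
```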
In the present embodiment, ResNet50 as shown in Fig. 3 may be used as the ResNet; the structure of the ResNet module in Fig. 3 is shown in Fig. 4.
In Fig. 4, BN is Batch Normalization. RELU is the rectified linear unit (Rectified Linear Unit) function, of the form θ(x) = max(0, x). CONV is a convolutional layer, which extracts image features by performing convolution operations on the image. In a convolutional neural network, each convolutional layer usually contains multiple trainable convolution templates (i.e., convolution kernels), and different convolution templates correspond to different image features. After a convolution kernel performs the convolution operation on the input image, a nonlinear activation function — such as the Sigmoid function, the rectified linear unit (RELU) function, or the ELU function — maps the result into the corresponding feature map (Feature Map). The parameters of the convolution kernels are usually computed with a specific learning algorithm (such as stochastic gradient descent). The convolution referred to here is the operation of taking the weighted sum of the template parameters and the pixel values at the corresponding positions of the image. A typical convolution process is shown in Fig. 5: by sliding the template window over all positions of the input image, the corresponding feature map is obtained.
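The sliding-template convolution just described can be sketched as a minimal NumPy cross-correlation without padding (the form commonly implemented in CNN frameworks); the 4x4 image and 2x2 kernel are illustrative values only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the template over every position and take the weighted sum of
    template parameters and the pixel values underneath (no padding)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input image
kernel = np.array([[1.0, 0.0], [0.0, 1.0]])        # illustrative 2x2 template
feature_map = conv2d_valid(image, kernel)          # 3x3 feature map
```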
In the present embodiment, a convolutional neural network is advantageous in that it abandons the "full connection" design between adjacent layers in traditional neural networks: through local connections and weight sharing, the number of model parameters to be trained is significantly reduced, reducing the amount of computation. Local connection means that each neuron in the convolutional neural network is connected to a local region of the input image rather than fully connected to all neurons. Weight sharing means that different regions of the input image share the connection parameters (i.e., the convolution kernel parameters). In addition, the local-connection and weight-sharing design gives the features extracted by the network a high degree of stability, insensitive to translation, scaling, deformation and the like.
A pooling layer usually appears paired with a convolutional layer, following it, and is used to perform a downsampling operation on the input feature maps. After convolution, an input image commonly yields a large number of feature maps, and the excessively high feature dimensionality sharply increases the network's computation. By reducing the dimension of the feature maps, the pooling layer greatly reduces the number of model parameters, which on the one hand reduces the computation of network operations and on the other hand reduces the risk of network over-fitting. The feature maps obtained by pooling correspond one-to-one to those of the convolutional layer, so the pooling operation only reduces the feature-map dimensions; their number does not change.
Three pooling methods are involved in the convolutional neural network of the present embodiment: maximum-value pooling (Max Pooling), mean-value pooling (Mean Pooling) and stochastic pooling (Stochastic Pooling). For a sampling subregion, maximum-value pooling selects the point with the largest pixel value as the output of the region; mean-value pooling computes the mean of all pixels in the region and uses it as the output; stochastic pooling randomly selects one pixel value from the region as the output, with larger pixel values generally having a higher probability of being selected. The maximum-value pooling process is shown in Fig. 6.
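The maximum-value and mean-value pooling operations above can be sketched as follows, assuming non-overlapping size-by-size subregions (stochastic pooling is omitted for brevity); the 4x4 feature map is an illustrative example.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Downsample by taking the max (or mean) of each size x size subregion."""
    H, W = feature_map.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feature_map[i * size:(i + 1) * size,
                                j * size:(j + 1) * size]
            out[i, j] = patch.max() if mode == "max" else patch.mean()
    return out

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 1.],
                 [0., 1., 5., 6.],
                 [2., 2., 7., 8.]])
max_pooled = pool2d(fmap, mode="max")    # each 2x2 block -> its maximum
mean_pooled = pool2d(fmap, mode="mean")  # each 2x2 block -> its mean
```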
Step 205: obtain text word features from the preprocessed Chinese descriptions.
Specifically, the preprocessed Chinese descriptions pass through an embedding layer (Embedding) to obtain the text word features, each text word feature being a 256-dimensional vector. Common embedding methods include Word2vec, GloVe, etc.
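An embedding layer of the kind described, viewed as a lookup table from word indices to 256-dimensional vectors, can be sketched as follows; the random table and the word indices are illustrative stand-ins for learned Word2vec/GloVe-style embeddings and real word2ix output.

```python
import numpy as np

vocab_size, embed_dim = 1000, 256
rng = np.random.default_rng(1)
# Embedding layer as a lookup table: row i is the 256-dim vector of word index i.
embedding = rng.standard_normal((vocab_size, embed_dim))

caption_ixs = np.array([3, 17, 42])          # word indices after word2ix mapping
text_word_features = embedding[caption_ixs]  # one 256-dim vector per word
```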
Step 206: train a computation model according to the image features and text word features, and obtain the image recognition model based on the trained computation model.
Specifically, the computation model includes an LSTM.
Step 206 includes: taking, in order, the image feature and text word features (i.e., the word vectors in Fig. 7) corresponding to each image obtained in steps 204 and 205, and sequentially inputting them into the LSTM for training; computing the output for the word vector of each word, i.e., obtaining the first result feature from the image feature and the second result features from the text word features; and classifying the first result feature and second result features, obtaining at least one predicted word according to the classification results. The above specific process is shown in Fig. 7.
Prediction description information is generated according to the at least one predicted word; the prediction description information is compared with the sample description information, and the LSTM is optimized according to the comparison result.
Here, the image feature may be regarded as the word vector of the first word and spliced together with the other word vectors (i.e., the text word features) as input to the LSTM; the output of the LSTM serves as the classification feature for predicting the index of the next word.
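The splicing of the image feature as the "first word" with the text word vectors can be sketched as follows (shapes only; the zero and one arrays are dummies standing in for real features).

```python
import numpy as np

image_feature = np.zeros((1, 256))      # 256-dim image feature, acting as the "first word"
text_word_features = np.ones((9, 256))  # 256-dim vectors for nine caption words

# Splice the image feature in front of the word vectors; the resulting
# sequence is what is fed into the LSTM step by step.
lstm_input = np.concatenate([image_feature, text_word_features], axis=0)
```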
Here, the basic structure of the LSTM memory unit is shown in Fig. 8, where x_t is the input at the current time step. Suppose the values of the input gate unit and the forget gate unit are i_t and f_t respectively, satisfying formulas (1) and (2):
i_t = σ(W_xi·x_t + W_hi·h_(t-1) + W_ci·c_(t-1) + b_i) (1)
f_t = σ(W_xf·x_t + W_hf·h_(t-1) + W_cf·c_(t-1) + b_f) (2)
where f_t is the forget gate; the state c_t of the memory cell can then be calculated by formula (3):
c_t = f_t·c_(t-1) + i_t·tanh(W_xc·x_t + W_hc·h_(t-1) + b_c) (3)
The value of the output gate unit is determined by the current cell state, but it is the filtered cell value: a Sigmoid unit first determines the output gate value, and the hyperbolic tangent function then constrains the cell state to the range -1 to 1, that is:
o_t = σ(W_xo·x_t + W_ho·h_(t-1) + W_co·c_t + b_o) (4)
The output h_t of the hidden unit is jointly determined by the cell state and the output gate unit, satisfying:
h_t = o_t·tanh(c_t) (5)
Above, σ(·) denotes the Sigmoid function, W the weight matrices connecting the units, and b the bias vectors of the units.
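One memory-unit step following formulas (1)-(5) can be sketched directly in NumPy. The random parameter matrices are placeholders for learned weights, and the peephole terms W_ci, W_cf and W_co on the cell state are included as in the formulas; this is an illustrative sketch, not the patent's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    # Input gate (1) and forget gate (2), with peephole terms on c_(t-1).
    i_t = sigmoid(p["Wxi"] @ x_t + p["Whi"] @ h_prev + p["Wci"] @ c_prev + p["bi"])
    f_t = sigmoid(p["Wxf"] @ x_t + p["Whf"] @ h_prev + p["Wcf"] @ c_prev + p["bf"])
    # Cell state update (3).
    c_t = f_t * c_prev + i_t * np.tanh(p["Wxc"] @ x_t + p["Whc"] @ h_prev + p["bc"])
    # Output gate (4), with a peephole term on the new cell state c_t.
    o_t = sigmoid(p["Wxo"] @ x_t + p["Who"] @ h_prev + p["Wco"] @ c_t + p["bo"])
    # Hidden output (5).
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

dim_x, dim_h = 256, 128
rng = np.random.default_rng(2)
p = {}
for g in ("i", "f", "c", "o"):
    p["Wx" + g] = 0.01 * rng.standard_normal((dim_h, dim_x))
    p["Wh" + g] = 0.01 * rng.standard_normal((dim_h, dim_h))
    p["b" + g] = np.zeros(dim_h)
for g in ("i", "f", "o"):  # peephole weights; formula (3) has none for the cell input
    p["Wc" + g] = 0.01 * rng.standard_normal((dim_h, dim_h))

h, c = np.zeros(dim_h), np.zeros(dim_h)
h, c = lstm_step(rng.standard_normal(dim_x), h, c, p)
```

Because h_t = o_t·tanh(c_t) with o_t in (0, 1) and tanh in (-1, 1), every component of the hidden output lies strictly between -1 and 1.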
Here, the word prediction method is described further. In the present embodiment, the output of each word (referring to the image feature and the text word features) is classified, and the next word is predicted according to the classification result. The specific method is: the first n-1 words are used as input, and the last n-1 words serve as the prediction targets. In the present embodiment, considering that a greedy algorithm easily traps the search in a local optimum, a dynamic-programming beam search algorithm is used for training: at each search step, only the n most likely words are recorded; the search then continues to the next word, yielding n*n sequences; the word after that yields n*n*n sequences, of which the n*n with the highest probability are kept; the search continues in this way until the optimal result is finally obtained.
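A minimal beam search of the kind described can be sketched as follows. The per-step word distributions are toy values and, unlike a real model, are not conditioned on the words chosen so far; scores are accumulated as log-probabilities.

```python
import math

def beam_search(step_probs, beam_width):
    """Beam search over a fixed-horizon word lattice: at each step keep only
    the beam_width most probable partial sequences."""
    beams = [([], 0.0)]  # (sequence of words, cumulative log-probability)
    for probs in step_probs:  # probs: word -> P(word) at this step
        candidates = []
        for seq, score in beams:
            for word, prob in probs.items():
                candidates.append((seq + [word], score + math.log(prob)))
        candidates.sort(key=lambda sc: sc[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the beam width
    return beams[0][0]

# Toy per-step distributions (independent purely for illustration).
step_probs = [
    {"a": 0.6, "the": 0.4},
    {"terrorist": 0.7, "crowd": 0.3},
    {"holds": 0.9, "burns": 0.1},
]
best = beam_search(step_probs, beam_width=2)
```

With independent steps the beam simply tracks the two best prefixes at each step and ends on the highest-probability full sequence.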
Step 207: obtain an image to be recognized, recognize it with the image recognition model to obtain target description information for the image to be recognized, the target description information describing the content presented by the image to be recognized; detect the target description information and determine, according to the detection result, whether the image is a bad image.
Specifically, step 207 includes: extracting the semantic feature (image feature) of the image to be recognized with the image recognition model, and obtaining the target description information of the image to be recognized according to the semantic feature; when a preset sensitive word appears in the target description information, the image to be recognized is determined to be a bad image.
Here, the sensitive words may be pre-saved in the server and may include undesirable words such as terror, violence and pornography. The server performs matching using a sensitive-word filtering algorithm (DFA); when a sensitive word is matched in the target description information, the image to be recognized is determined to be a bad image.
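A trie-based sensitive-word filter of the DFA kind described can be sketched as follows; the word list and the `"#end"` marker key are illustrative choices, not the patent's actual implementation.

```python
def build_trie(sensitive_words):
    """Build the DFA/trie used for sensitive-word filtering."""
    root = {}
    for word in sensitive_words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#end"] = True  # marks a complete sensitive word
    return root

def contains_sensitive_word(text, trie):
    """Scan the description once; True if any sensitive word occurs."""
    for start in range(len(text)):
        node = trie
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if node.get("#end"):
                return True
    return False

trie = build_trie(["terrorist", "riot"])
```

Descriptions matching a sensitive word would then flag the corresponding image as a bad image, while descriptions without any match pass through.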
Fig. 9 is a structural schematic diagram of an image recognition device provided in an embodiment of the present invention; as shown in Fig. 9, the device includes: a first processing module 301, a second processing module 302 and a third processing module 303.
The first processing module 301 is configured to obtain a target image.
The second processing module 302 is configured to recognize the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image.
The third processing module 303 is configured to detect the target description information and determine, according to the detection result, whether the target image is a target-class image.
Specifically, the device further includes a preprocessing module configured to generate the image recognition model.
The preprocessing module is specifically configured to obtain a preset number of sample images and obtain the sample description information of each sample image; extract respectively the image feature of each sample image and the text word feature of the sample description information corresponding to each sample image; and train a computation model according to the image features, the text word features and the sample description information, obtaining the image recognition model based on the trained computation model.
Here, the computation model includes an LSTM.
The preprocessing module is specifically configured to sequentially input the image features and the text word features into the LSTM to obtain result features, the result features including a first result feature obtained from the image features and a second result feature obtained from the text word features; classify the first result features and the second result features, obtaining at least one predicted word according to the classification results, and generate prediction description information according to the at least one predicted word; and compare the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
Specifically, the preprocessing module is specifically configured to segment the sample description information into words, obtaining at least one sample description word, and determine the text word feature according to the sample description words.
Specifically, the second processing module 302 is specifically configured to extract the image feature of the target image with the preset image recognition model, determine at least one description word according to the image feature, and generate the target description information for the target image according to the at least one description word.
Specifically, the third processing module 303 is specifically configured to detect whether the target description information includes a preset sensitive word and, when determining that the target description information includes a preset sensitive word, determine that the target image is a target-class image.
It should be understood that when the image recognition device provided by the above embodiment performs image recognition, the division into the above program modules is merely illustrative; in practical applications, the above processing may be distributed to different program modules as needed, i.e., the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the image recognition device provided by the above embodiment and the image recognition method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which will not be repeated here.
Figure 10 is a structural schematic diagram of another image recognition device provided by an embodiment of the present invention. The image recognition device can be applied to a server. As shown in Figure 10, the device 40 includes a processor 401 and a memory 402 for storing a computer program runnable on the processor. When running the computer program, the processor 401 executes: obtaining a target image; identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information, determining from the detection result whether the target image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: obtaining a preset number of sample images and the sample description information of each sample image; extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
In one embodiment, when running the computer program, the processor 401 also executes: inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature; classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
In one embodiment, when running the computer program, the processor 401 also executes: segmenting the sample description information into words, obtaining at least one sample description word, and determining the text word feature from the sample description words.
In one embodiment, when running the computer program, the processor 401 also executes: extracting the image feature of the target image with the preset image recognition model, determining at least one descriptor from the image feature, and generating the target description information for the target image from the at least one descriptor.
In one embodiment, when running the computer program, the processor 401 also executes: detecting whether the target description information includes a preset sensitive word and, when it does, determining that the target image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: when the first recognition result meets a first preset condition, determining that the first image is a target-class image, the first preset condition being that the sum of the confidences corresponding to at least two first attributes in the first recognition result is greater than a first preset threshold; and when the first recognition result does not meet the first preset condition, determining the weight corresponding to each of the at least one first attribute, obtaining a first confidence from the confidences and weights corresponding to the at least one first attribute, and determining from the first confidence whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: recognizing the first image to obtain a second recognition result, the second recognition result including at least one emotion type expressed by the first image and the confidence corresponding to each of the at least one emotion type. Correspondingly, determining from the first confidence whether the first image is a target-class image includes: determining from the first confidence and the second recognition result whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: when the second recognition result meets a second preset condition, determining that the first image is a target-class image, the second preset condition being that the confidence corresponding to a target emotion type in the second recognition result is greater than a second preset threshold; and when the second recognition result does not meet the second preset condition, determining the weight corresponding to each of the at least one emotion type, determining a second confidence from the weights and confidences corresponding to the at least one emotion type, and determining from the first confidence together with the second confidence whether the first image is a target-class image.
In one embodiment, when running the computer program, the processor 401 also executes: determining a first weight corresponding to the first confidence and a second weight corresponding to the second confidence; obtaining a target confidence from the first confidence, the first weight, the second confidence and the second weight; and determining from the target confidence whether the first image is a target-class image.
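The final combination step reads as a weighted blend of the two confidences compared against a threshold. A minimal sketch, assuming illustrative weight values and threshold (none of which are fixed by the patent):

```python
def target_confidence(first_confidence, second_confidence,
                      first_weight=0.6, second_weight=0.4):
    """Target confidence as the weighted combination of the first and
    second confidences; the weights here are placeholders."""
    return first_confidence * first_weight + second_confidence * second_weight

def is_target_class(first_confidence, second_confidence, threshold=0.5):
    """Decide target-class membership from the combined confidence."""
    return target_confidence(first_confidence, second_confidence) > threshold
```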
In one embodiment, when running the computer program, the processor 401 also executes: when the first image is determined to include a face, extracting at least one face image from the first image and recognizing the face image with a preset second image recognition model to obtain the second recognition result, the second recognition result including at least one facial emotion type expressed by the first image and the confidence corresponding to each facial emotion type; and when the first image is determined not to include a face, extracting a scene feature from the first image and recognizing the scene feature with a preset third image recognition model to obtain the second recognition result, the second recognition result including at least one environmental emotion type expressed by the first image and the confidence corresponding to each environmental emotion type.
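The face-or-scene branch above is a simple model dispatch. In this sketch both models are hypothetical stand-in callables returning `{emotion_type: confidence}` mappings; the face-detection result is passed in as a flag.

```python
def second_recognition(image, contains_face, face_model, scene_model):
    """Pick the model that produces the second recognition result:
    a facial-emotion model when a face is present, otherwise an
    environment-emotion model applied to scene features."""
    if contains_face:
        return face_model(image)    # preset second image recognition model
    return scene_model(image)       # preset third image recognition model
```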
In one embodiment, when running the computer program, the processor 401 also executes: obtaining a preset number of sample images, each sample image corresponding to at least one first attribute; and performing learning and training based on a convolutional neural network with the preset number of sample images and the at least one first attribute corresponding to each sample image, obtaining the first image recognition model.
In one embodiment, when running the computer program, the processor 401 also executes: setting the convolutional neural network to multi-label mode, the convolutional layers of the convolutional neural network including multiple convolution modules for learning and training, different convolution modules corresponding to different image features; performing learning and training on each of the at least one first attribute with the multiple convolution modules and the preset number of sample images; and obtaining the first image recognition model for recognizing the at least one first attribute.
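The point of multi-label mode is that each convolution module scores one first attribute independently, so a single image may be tagged with several attributes at once, unlike a softmax single-label classifier. A sketch of the inference side, with hypothetical attribute names, scores and threshold:

```python
def multilabel_predict(module_scores, threshold=0.5):
    """Given per-attribute scores (one per convolution module), return every
    first attribute whose score clears the threshold, sorted for stability."""
    return sorted(a for a, s in module_scores.items() if s >= threshold)
```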
It should be noted that the image recognition device provided by the above embodiment and the image recognition method embodiment belong to the same concept; its implementation is detailed in the method embodiment and is not repeated here.
In practical applications, the device 40 may further include at least one network interface 403. The components of the image recognition device 40 are coupled through a bus system 404. It will be understood that the bus system 404 realizes the connections and communication between these components. Besides the data bus, the bus system 404 also includes a power bus, a control bus and a status signal bus; for clarity of explanation, however, the various buses are all labelled as bus system 404 in Figure 10. The number of processors 401 may be at least one. The network interface 403 is used for wired or wireless communication between the image recognition device 40 and other equipment.
The memory 402 in the embodiments of the present invention stores various types of data to support the operation of the device 40. The methods disclosed in the embodiments of the present invention may be applied in, or implemented by, the processor 401. The processor 401 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods may be completed by an integrated hardware logic circuit in the processor 401 or by instructions in the form of software. The processor 401 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, and so on. The processor 401 may implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. A software module may be located in a storage medium located in the memory 402; the processor 401 reads the information in the memory 402 and completes the steps of the foregoing methods in combination with its hardware.
In an exemplary embodiment, the image recognition device 40 may be implemented by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor) or other electronic components, for executing the foregoing methods.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When run by a processor, the computer program executes: obtaining a target image; identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and detecting the target description information, determining from the detection result whether the target image is a target-class image.
In one embodiment, when run by a processor, the computer program executes: obtaining a preset number of sample images and the sample description information of each sample image; extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
In one embodiment, when run by a processor, the computer program executes: inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature; classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
In one embodiment, when run by a processor, the computer program executes: segmenting the sample description information into words, obtaining at least one sample description word, and determining the text word feature from the sample description words.
In one embodiment, when run by a processor, the computer program executes: extracting the image feature of the target image with the preset image recognition model, determining at least one descriptor from the image feature, and generating the target description information for the target image from the at least one descriptor.
In one embodiment, when run by a processor, the computer program executes: detecting whether the target description information includes a preset sensitive word and, when it does, determining that the target image is a target-class image.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be realized in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function; other divisions are possible in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical or of other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated in one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated in one unit. The integrated unit may be realized in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
Alternatively, if the above integrated unit of the present invention is realized in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk or an optical disc.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (10)
1. An image recognition method, characterized in that the method includes:
obtaining a target image;
identifying the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and
detecting the target description information, determining from the detection result whether the target image is a target-class image.
2. The method according to claim 1, characterized in that the method also includes generating the image recognition model, the generating of the image recognition model including:
obtaining a preset number of sample images, and obtaining the sample description information of each sample image;
extracting, for each sample image, its image feature and the text word feature of the corresponding sample description information; and
training a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
3. The method according to claim 2, characterized in that the computation model includes a long short-term memory (LSTM) network;
the training of the LSTM with the image feature, the text word feature and the sample description information including:
inputting the image feature and the text word feature into the LSTM in sequence to obtain result features, the result features including a first result feature obtained from the image feature and a second result feature obtained from the text word feature;
classifying the first result feature and the second result feature, obtaining at least one prediction word according to the classification result, and generating prediction description information from the at least one prediction word; and
comparing the prediction description information with the sample description information, optimizing the LSTM according to the comparison result.
4. The method according to claim 2, characterized in that extracting the text word feature of the sample description information corresponding to the sample image includes:
segmenting the sample description information into words, obtaining at least one sample description word; and determining the text word feature from the sample description words.
5. The method according to claim 1, characterized in that identifying the target image with the preset image recognition model to obtain the target description information for the target image includes:
extracting the image feature of the target image with the preset image recognition model, and determining at least one descriptor from the image feature; and
generating the target description information for the target image from the at least one descriptor.
6. The method according to claim 1, characterized in that detecting the target description information and determining from the detection result whether the target image is a target-class image includes:
detecting whether the target description information includes a preset sensitive word and, when the target description information does include a preset sensitive word, determining that the target image is a target-class image.
7. An image recognition device, characterized in that the device includes a first processing module, a second processing module and a third processing module, wherein:
the first processing module is configured to obtain a target image;
the second processing module is configured to identify the target image with a preset image recognition model to obtain target description information for the target image, the target description information describing the content presented by the target image; and
the third processing module is configured to detect the target description information, determining from the detection result whether the target image is a target-class image.
8. The device according to claim 7, characterized in that the device further includes a preprocessing module configured to generate the image recognition model;
the preprocessing module being specifically configured to: obtain a preset number of sample images and the sample description information of each sample image; extract, for each sample image, its image feature and the text word feature of the corresponding sample description information; and train a computation model with the image features, the text word features and the sample description information, the image recognition model being obtained from the computation model after training.
9. An image recognition device, characterized in that the device includes a processor and a memory for storing a computer program runnable on the processor, wherein the processor is configured to perform, when running the computer program, the steps of the method of any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, realizes the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811191691.2A CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811191691.2A CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109472209A true CN109472209A (en) | 2019-03-15 |
CN109472209B CN109472209B (en) | 2021-06-29 |
Family
ID=65663731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811191691.2A Active CN109472209B (en) | 2018-10-12 | 2018-10-12 | Image recognition method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109472209B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110162639A (en) * | 2019-04-16 | 2019-08-23 | 深圳壹账通智能科技有限公司 | Knowledge figure knows the method, apparatus, equipment and storage medium of meaning |
CN110705460A (en) * | 2019-09-29 | 2020-01-17 | 北京百度网讯科技有限公司 | Image category identification method and device |
CN111181835A (en) * | 2019-10-17 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN111241993A (en) * | 2020-01-08 | 2020-06-05 | 咪咕文化科技有限公司 | Seat number determination method and device, electronic equipment and storage medium |
CN111291649A (en) * | 2020-01-20 | 2020-06-16 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111709406A (en) * | 2020-08-18 | 2020-09-25 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN111931840A (en) * | 2020-08-04 | 2020-11-13 | 中国建设银行股份有限公司 | Picture classification method, device, equipment and storage medium |
CN112614568A (en) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | Inspection image processing method and device, storage medium and electronic equipment |
CN112906726A (en) * | 2019-11-20 | 2021-06-04 | 北京沃东天骏信息技术有限公司 | Model training method, image processing method, device, computing device and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104680189A (en) * | 2015-03-15 | 2015-06-03 | 西安电子科技大学 | Pornographic image detection method based on improved bag-of-words model |
US20170115853A1 (en) * | 2015-10-21 | 2017-04-27 | Google Inc. | Determining Image Captions |
CN106846306A (en) * | 2017-01-13 | 2017-06-13 | 重庆邮电大学 | A kind of ultrasonoscopy automatic describing method and system |
CN107122806A (en) * | 2017-05-16 | 2017-09-01 | 北京京东尚科信息技术有限公司 | A kind of nude picture detection method and device |
CN107133951A (en) * | 2017-05-22 | 2017-09-05 | 中国科学院自动化研究所 | Distorted image detection method and device |
CN107391505A (en) * | 2016-05-16 | 2017-11-24 | 腾讯科技(深圳)有限公司 | A kind of image processing method and system |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110162639A (en) * | 2019-04-16 | 2019-08-23 | 深圳壹账通智能科技有限公司 | Knowledge figure knows the method, apparatus, equipment and storage medium of meaning |
CN110705460A (en) * | 2019-09-29 | 2020-01-17 | 北京百度网讯科技有限公司 | Image category identification method and device |
CN111181835A (en) * | 2019-10-17 | 2020-05-19 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN111181835B (en) * | 2019-10-17 | 2021-07-27 | 腾讯科技(深圳)有限公司 | Message monitoring method, system and server |
CN112906726A (en) * | 2019-11-20 | 2021-06-04 | 北京沃东天骏信息技术有限公司 | Model training method, image processing method, device, computing device and medium |
CN112906726B (en) * | 2019-11-20 | 2024-01-16 | 北京沃东天骏信息技术有限公司 | Model training method, image processing device, computing equipment and medium |
CN111241993A (en) * | 2020-01-08 | 2020-06-05 | 咪咕文化科技有限公司 | Seat number determination method and device, electronic equipment and storage medium |
CN111241993B (en) * | 2020-01-08 | 2023-10-20 | 咪咕文化科技有限公司 | Seat number determining method and device, electronic equipment and storage medium |
CN111291649B (en) * | 2020-01-20 | 2023-08-25 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111291649A (en) * | 2020-01-20 | 2020-06-16 | 广东三维家信息科技有限公司 | Image recognition method and device and electronic equipment |
CN111931840A (en) * | 2020-08-04 | 2020-11-13 | 中国建设银行股份有限公司 | Picture classification method, device, equipment and storage medium |
CN111709406B (en) * | 2020-08-18 | 2020-11-06 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN111709406A (en) * | 2020-08-18 | 2020-09-25 | 成都数联铭品科技有限公司 | Text line identification method and device, readable storage medium and electronic equipment |
CN112614568A (en) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | Inspection image processing method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109472209B (en) | 2021-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109472209A (en) | A kind of image-recognizing method, device and storage medium | |
Xu et al. | Reasoning-rcnn: Unifying adaptive global reasoning into large-scale object detection | |
Wang et al. | Research on face recognition based on deep learning | |
CN107633207B (en) | AU characteristic recognition methods, device and storage medium | |
Li et al. | Multiple-human parsing in the wild | |
Zhang et al. | Relationship proposal networks | |
Alani et al. | Hand gesture recognition using an adapted convolutional neural network with data augmentation | |
CN108664924B (en) | Multi-label object identification method based on convolutional neural network | |
He et al. | Supercnn: A superpixelwise convolutional neural network for salient object detection | |
Guo et al. | Human attribute recognition by refining attention heat map | |
Chen et al. | Research on recognition of fly species based on improved RetinaNet and CBAM | |
CN106951825A (en) | A kind of quality of human face image assessment system and implementation method | |
CN110276248B (en) | Facial expression recognition method based on sample weight distribution and deep learning | |
CN109522925A (en) | A kind of image-recognizing method, device and storage medium | |
Lei et al. | A skin segmentation algorithm based on stacked autoencoders | |
CN109117879A (en) | Image classification method, apparatus and system | |
CN110765954A (en) | Vehicle weight recognition method, equipment and storage device | |
CN106485260A (en) | The method and apparatus that the object of image is classified and computer program | |
CN109344856B (en) | Offline signature identification method based on multilayer discriminant feature learning | |
CN106599864A (en) | Deep face recognition method based on extreme value theory | |
Shang et al. | Image spam classification based on convolutional neural network | |
CN114882521A (en) | Unsupervised pedestrian re-identification method and unsupervised pedestrian re-identification device based on multi-branch network | |
CN107818299A (en) | Face recognition algorithms based on fusion HOG features and depth belief network | |
Li et al. | Latent semantic representation learning for scene classification | |
CN110427912A (en) | A kind of method for detecting human face and its relevant apparatus based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |