CN107862322A - Method, apparatus, and system for picture attribute classification combining picture and text
- Publication number
- CN107862322A, CN201710832627.7A
- Authority
- CN
- China
- Prior art keywords
- picture
- text
- neural network
- network model
- carried out
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
Abstract
The invention discloses a method, apparatus, and system for picture attribute classification that combine a picture with its text, and belongs to the field of computer technology. The method includes: identifying the image features of the picture and the text features of the picture with a preset neural network model, and forming a joint feature; classifying the joint feature and outputting a picture attribute classification result. The preset neural network model includes at least a preset deep convolutional neural network model and a recurrent neural network model. Because the image features and the text features of a picture complement each other, combining them provides more comprehensive picture feature data and better expresses the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for picture attribute classification combining picture and text.
Background
The global internet has reached scale and its applications are diversifying, changing ever more deeply how people study, work, and live. In network data analysis, accurately knowing attributes such as the habits and needs of internet users is an important prerequisite for precisely targeted content promotion, better customer service, and advertisement placement. Existing techniques for identifying the attributes of media users on the internet are all based on samples of users' articles or pictures. Picture samples matter in particular, because the user attribute information contained in pictures from certain fields has great potential use. Specifically, the full historical samples of users are first collected, the sample data are organized into a sample library, and the sample library is classified into labeled corpora; for example, one corpus may represent content such as "shopping", "fashion", or "apparel". The sample library is then matched against the samples of an internet user to identify that user's attributes. In other words, the conventional approach to identifying user attributes on the internet starts from sample data and uses machine learning, training a data model to judge the attributes of internet users. Classifying attributes from the collected sample data is an important step in that process. To meet growing market demand, how to classify the attributes of pictures on the network in more detail and more comprehensively is a problem that urgently needs solving.
Summary of the invention
To solve the problems of the prior art, embodiments of the present invention provide a method, apparatus, and system for picture attribute classification combining picture and text. The technical solution is as follows:
In a first aspect, a method for picture attribute classification combining picture and text is provided, the method including:
identifying the image features of the picture and the text features of the picture with a preset neural network model, and forming a joint feature;
classifying the joint feature and outputting a picture attribute classification result;
wherein the preset neural network model includes at least a preset deep convolutional neural network model and a recurrent neural network model.
With reference to the first aspect, in a second possible implementation, identifying the image features of the picture and the text features of the picture with a preset neural network model and forming a joint feature includes:
performing image representation with the preset neural network model to obtain an image representation result;
performing text representation with the preset neural network model to obtain a text representation result;
performing joint representation with the preset neural network model according to the image representation result and the text representation result, forming the joint feature.
With reference to the second possible implementation of the first aspect, in a third possible implementation, performing image representation with the preset neural network model to obtain an image representation result includes:
performing global image representation with the preset deep convolutional neural network model to obtain the image representation result.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation, performing text representation with the preset neural network model to obtain a text representation result includes:
performing word vector representation with a preset recurrent neural network model to obtain a word vector representation result;
performing global text representation with the preset recurrent neural network model according to the word vector representation result to obtain the text representation result.
With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation, before the word vector representation is performed with the preset recurrent neural network model, the method further includes the step of:
performing Chinese word segmentation on the text of the picture to obtain Chinese words.
With reference to the second possible implementation of the first aspect, in a sixth possible implementation, performing joint representation with the preset neural network model according to the image representation result and the text representation result and forming the joint feature includes:
performing a weighted connection of the image representation result and the text representation result to form the joint feature.
With reference to the first aspect, in a seventh possible implementation, identifying the image features of the picture and the text features of the picture with a preset neural network model and forming a joint feature includes:
performing joint representation of the image and the text of the picture with the preset neural network model, and forming the joint feature.
With reference to the first aspect, in an eighth possible implementation, classifying the joint feature and outputting a picture attribute classification result includes:
performing softmax classification on the joint feature with the preset neural network model and outputting the picture attribute classification result.
In a second aspect, an apparatus for picture attribute classification combining picture and text is provided, the apparatus including:
a recognition and computation module, configured to identify the image features of the picture and the text features of the picture with a preset neural network model and form a joint feature, and further configured to classify the joint feature, wherein the preset neural network model includes at least a preset deep convolutional neural network model and a recurrent neural network model;
an output module, configured to output the picture attribute classification result.
In a third aspect, an apparatus for picture attribute classification combining picture and text is provided, the apparatus including:
a memory and a processor connected to the memory,
wherein the memory is configured to store a set of program code, and the processor calls the program code stored in the memory to perform the following operations:
identifying the image features of the picture and the text features of the picture with a preset neural network model, and forming a joint feature;
classifying the joint feature and outputting a picture attribute classification result;
wherein the preset neural network model includes at least a preset deep convolutional neural network model and a recurrent neural network model.
In a fourth aspect, a system for picture attribute classification combining picture and text is provided, the system including:
a recognition and computation device, configured to identify the image features of the picture and the text features of the picture with a preset neural network model and form a joint feature, and further configured to classify the joint feature, wherein the preset neural network model includes at least a preset deep convolutional neural network model and a recurrent neural network model;
an output device, configured to output the picture attribute classification result.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
The method, apparatus, and system for picture attribute classification combining picture and text provided by the embodiments of the present invention perform the following steps: identifying the image features of the picture and the text features of the picture with a preset neural network model, and forming a joint feature; classifying the joint feature and outputting a picture attribute classification result. Because the image features and the text features of a picture complement each other, combining them provides more comprehensive picture feature data and better expresses the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 1 of the invention;
Fig. 2 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 2 of the invention;
Fig. 3 is a schematic diagram of the preset neural network model based on picture and text provided by Embodiment 2 of the invention;
Fig. 4 is a schematic diagram of the VGG model provided by Embodiment 2 of the invention;
Fig. 5 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 3 of the invention;
Fig. 6 is a schematic diagram of the preset neural network model based on picture and text provided by Embodiment 3 of the invention;
Fig. 7 is a schematic structural diagram of the apparatus for picture attribute classification combining picture and text provided by Embodiment 4 of the invention;
Fig. 8 is a schematic structural diagram of the system for picture attribute classification combining picture and text provided by Embodiment 5 of the invention;
Fig. 9 is a schematic structural diagram of the apparatus for picture attribute classification combining picture and text provided by Embodiment 6 of the invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
The embodiments of the present invention provide a method, apparatus, and system for picture attribute classification combining picture and text. Because the image features and the text features of a picture complement each other, combining them provides more comprehensive picture feature data and better expresses the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
The method, apparatus, and system for picture attribute classification combining picture and text provided by the embodiments of the present invention are further described below with reference to specific embodiments and the drawings.
Embodiment 1
Fig. 1 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 1 of the invention. As shown in Fig. 1, the method provided by this embodiment includes the following steps:
101. Identify the image features of the picture and the text features of the picture with a preset neural network model, and form a joint feature.
Specifically, the preset neural network model here includes at least a preset deep convolutional neural network model and a recurrent neural network model.
Specifically, identifying the image features of the picture and the text features of the picture with the preset neural network model and forming a joint feature includes:
performing image representation with the preset neural network model to obtain an image representation result;
performing text representation with the preset neural network model to obtain a text representation result;
performing joint representation with the preset neural network model according to the image representation result and the text representation result, forming the joint feature.
Because image features and text features differ in character, the above process uses the preset neural network model to perform image representation and text representation separately, obtaining the image representation result and the text representation result independently, and then uses the preset neural network to jointly represent the two, finally forming the joint feature. This allows a suitable representation method or process to be chosen for each kind of feature, so the resulting joint representation is more accurate and the process more efficient.
Specifically, the step of performing image representation with the preset neural network model to obtain an image representation result includes:
performing global image representation with the preset deep convolutional neural network model to obtain the image representation result.
Specifically, the step of performing text representation with the preset neural network model to obtain a text representation result includes:
performing word vector representation with a preset recurrent neural network model to obtain a word vector representation result;
performing global text representation with the preset recurrent neural network model according to the word vector representation result to obtain the text representation result.
Specifically, before the word vector representation is performed with the preset recurrent neural network model, the method further includes the step of:
performing Chinese word segmentation on the text of the picture to obtain Chinese words.
Specifically, the step of performing joint representation with the preset neural network model according to the image representation result and the text representation result and forming the joint feature includes:
performing a weighted connection of the image representation result and the text representation result to form the joint feature.
102. Classify the joint feature and output a picture attribute classification result.
This embodiment provides a method for picture attribute classification combining picture and text. A preset neural network model identifies the image features of the picture and the text features of the picture and forms a joint feature, the joint feature is classified, and a picture attribute classification result is output. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data and better express the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
Embodiment 2
Fig. 2 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 2 of the invention, Fig. 3 is a schematic diagram of the preset neural network model based on picture and text provided by Embodiment 2, and Fig. 4 is a schematic diagram of the VGG model provided by Embodiment 2. As shown in Figs. 2 and 3, the method provided by this embodiment includes the following steps:
201. Perform image representation with a preset neural network model to obtain an image representation result.
Specifically, the preset neural network model performs image representation on all elements of the picture (for example, an element may be one pattern block of the picture) or on some of the elements, obtaining an image representation result for each element. Each representation result corresponds to an attribute label and expresses the image information of the picture. Depending on whether the preset neural network model handles all elements of the picture or only some of them, the representation process falls into the following two cases:
1. All elements of the picture are traversed with the one or more preset neural network models in use, finally obtaining the image representation result of each element;
2. When only some elements of the picture need to be represented according to the target article, those elements are determined by a predefined rule and then traversed with the one or more preset neural network models in use, finally obtaining the image representation result of each of those elements.
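The two cases above can be sketched as follows. This is only an illustration under assumptions: the picture is a small 2D list, an "element" is a square block, and the names `split_into_blocks`, `select_elements`, and the `rule` callback are hypothetical, not taken from the patent.

```python
def split_into_blocks(image, block_size):
    """Split a 2D list into square blocks keyed by (block_row, block_col)."""
    rows, cols = len(image), len(image[0])
    blocks = {}
    for top in range(0, rows, block_size):
        for left in range(0, cols, block_size):
            key = (top // block_size, left // block_size)
            blocks[key] = [row[left:left + block_size]
                           for row in image[top:top + block_size]]
    return blocks


def select_elements(blocks, rule=None):
    """Case 1 (rule is None): keep every element.
    Case 2: keep only the elements accepted by a predefined rule."""
    if rule is None:
        return dict(blocks)
    return {pos: block for pos, block in blocks.items() if rule(pos, block)}


# Toy 4x4 "picture" whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = split_into_blocks(image, block_size=2)

all_elements = select_elements(blocks)
# Hypothetical rule: only the top row of blocks shows the target article.
top_elements = select_elements(blocks, rule=lambda pos, block: pos[0] == 0)
```

Each selected block would then be passed to the preset neural network model to obtain its image representation result.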
For example, global image representation is performed with the preset deep convolutional neural network model to obtain the image representation result, for instance with a 16-layer VGG model. The preset deep convolutional neural network model uses a multi-layer neural network to characterize a series of features in the picture, from simple to complex: lower layers learn simple patterns such as basic shapes, colors, and textures, which are combined step by step into increasingly complex patterns that carry semantic information, such as facial features or collar features. As shown in Fig. 4, the convolutional part of the VGG model consists of five blocks of [3*3*N convolutional layers + 2*2 max-pooling + ReLU]. Two fully connected layers (fc6, fc7) follow and produce a 4096-dimensional feature. One more fully connected layer (fc8) then produces the multi-class logits. Finally, softmax is applied to the logits to obtain the probability of every category.
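The shape arithmetic and the final softmax step can be checked with a short sketch. The 224-pixel input side and the 512 output channels are assumptions taken from the commonly published VGG-16 configuration, not from this document; the function names are illustrative.

```python
import math


def pooled_size(input_size, num_blocks=5):
    """Spatial side length after num_blocks rounds of 2x2 max-pooling (stride 2)."""
    size = input_size
    for _ in range(num_blocks):
        size //= 2  # each pooling step halves height and width
    return size


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


side = pooled_size(224)        # assumed 224x224 input
flattened = side * side * 512  # assumed 512 channels feeding fc6
probs = softmax([2.0, 1.0, 0.1])  # toy fc8 logits for three categories
```

Under these assumptions the five pooling steps reduce the feature map to 7x7, so fc6 receives a 25088-dimensional input before producing the 4096-dimensional feature.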
It is worth noting that, besides the approach described above, step 201 (performing image representation with a preset neural network model to obtain an image representation result) may also be implemented in other ways; the embodiment of the present invention does not limit the specific approach.
202. Perform text representation with a preset neural network model to obtain a text representation result.
Specifically, Chinese word segmentation is performed on the text of the picture to obtain Chinese words; word vector representation is performed with a preset recurrent neural network model to obtain a word vector representation result; and global text representation is performed with the preset recurrent neural network model according to the word vector representation result to obtain the text representation result.
As shown in Fig. 3, the text comes from the product name, product description, and similar content corresponding to the image. The first part performs Chinese word segmentation to obtain a series of Chinese words. The second part obtains the representation of each Chinese word: a dictionary of pre-trained continuous word vectors maps each word to a lower-dimensional word representation (the word vectors may be obtained with a method based on a recurrent neural network (RNN), or with a method based on continuous BoW/Skip-gram). The third part obtains the representation of the whole sentence or paragraph: an RNN or LSTM models the sequence of word vectors, and the hidden state vector output at the last word vector serves as the representation of the whole paragraph.
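A minimal sketch of the second and third parts follows, assuming the text has already been segmented into words. It uses a plain (Elman) RNN cell rather than an LSTM, and the tiny dictionary, the weights, and the English stand-in tokens are made-up illustration values, not trained parameters.

```python
import math


def rnn_last_hidden(word_vectors, w_in, w_rec, bias):
    """Run a simple tanh RNN over the word vectors and return the final hidden state."""
    hidden = [0.0] * len(bias)
    for x in word_vectors:
        # The comprehension reads the previous hidden state; the new state
        # replaces it only after all units have been computed.
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
                + bias[i]
            )
            for i in range(len(bias))
        ]
    return hidden


# Hypothetical word-vector dictionary; tokens stand in for segmented Chinese words.
embedding = {"red": [1.0, 0.0], "dress": [0.0, 1.0]}
tokens = ["red", "dress"]
vectors = [embedding[t] for t in tokens]

w_in = [[0.5, -0.5], [0.25, 0.75], [1.0, 1.0]]             # input-to-hidden weights
w_rec = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]  # hidden-to-hidden weights
bias = [0.0, 0.0, 0.0]

text_repr = rnn_last_hidden(vectors, w_in, w_rec, bias)
```

Because the hidden state carries information forward, the final vector depends on word order, which is what lets it stand for the whole sentence or paragraph.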
It is worth noting that, besides the approach described above, step 202 (performing text representation with a preset neural network model to obtain a text representation result) may also be implemented in other ways; the embodiment of the present invention does not limit the specific approach.
203. Perform joint representation with the preset neural network model according to the image representation result and the text representation result, forming a joint feature.
Specifically, a weighted connection of the image representation result and the text representation result forms the joint feature.
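The document does not spell out how the weighted connection is computed. One plausible reading, shown here purely as an assumption, is to scale each representation by a scalar weight and concatenate the results into the joint (union) feature; the function name and default weights are hypothetical.

```python
def weighted_concat(image_repr, text_repr, image_weight=0.5, text_weight=0.5):
    """Scale each representation by its weight, then concatenate them."""
    scaled_image = [image_weight * v for v in image_repr]
    scaled_text = [text_weight * v for v in text_repr]
    return scaled_image + scaled_text


joint = weighted_concat([1.0, 2.0], [4.0], image_weight=0.5, text_weight=0.25)
```

In a trained model the weights could equally be learned parameters, or per-dimension rather than scalar; the sketch fixes them only to keep the example small.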
It is worth noting that, besides the approach described above, step 203 (performing joint representation with the preset neural network model according to the image representation result and the text representation result, forming a joint feature) may also be implemented in other ways; the embodiment of the present invention does not limit the specific approach.
204. Perform softmax classification on the joint feature with the preset neural network model and output the picture attribute classification result.
After the image representation and the text representation are obtained, the two are joined by a weighted connection to obtain the joint representation. One or more fully connected layers are applied to the joint representation to obtain N-class logits, and softmax classification is applied to the logits. The classification loss is backpropagated with the stochastic gradient descent algorithm, and the loss propagates back down the image branch and the text branch respectively. The depth of backpropagation is controlled according to the size of the database. For example, for a smaller training set, to prevent overfitting, the loss is backpropagated only as far as the fc6 layer of the VGG model and the recurrent neural network layer of the text model; for a large data set, it can be backpropagated as far as the convolutional layers of the image branch and the word vector layer of the text branch.
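The rule for limiting backpropagation depth can be sketched as a small layer-selection helper. The layer names follow the text (fc6, fc7, the recurrent layer, the word vector layer), but the grouping of convolutional layers into `conv1` to `conv5`, the `classifier` head name, and the 10,000-sample threshold are hypothetical choices made for illustration.

```python
# Layers listed from input side to output side for each branch.
IMAGE_LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7"]
TEXT_LAYERS = ["word_vectors", "rnn"]


def trainable_layers(num_samples, small_threshold=10_000):
    """Return the layers that receive gradient updates for a given data set size."""
    if num_samples < small_threshold:
        # Small training set: stop backpropagation at fc6 and at the RNN layer
        # so the earlier layers stay frozen, which limits overfitting.
        image = IMAGE_LAYERS[IMAGE_LAYERS.index("fc6"):]
        text = TEXT_LAYERS[TEXT_LAYERS.index("rnn"):]
    else:
        # Large data set: backpropagate down to the convolutional layers
        # and the word vector layer.
        image = list(IMAGE_LAYERS)
        text = list(TEXT_LAYERS)
    return {"image": image, "text": text, "head": ["classifier"]}


small_plan = trainable_layers(500)
large_plan = trainable_layers(1_000_000)
```

In a deep learning framework the same decision would typically be applied by disabling gradient computation for every layer not named in the returned plan.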
This embodiment provides a method for picture attribute classification combining picture and text. Image representation is performed with a preset neural network model to obtain an image representation result; text representation is performed with the preset neural network model to obtain a text representation result; joint representation is performed with the preset neural network model according to the image representation result and the text representation result, forming a joint feature; and softmax classification is performed on the joint feature with the preset neural network model to output the picture attribute classification result. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data and better express the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
Embodiment 3
Fig. 5 is a flowchart of the method for picture attribute classification combining picture and text provided by Embodiment 3 of the invention; Fig. 6 is a schematic diagram of the preset neural network model based on picture and text provided by Embodiment 3. As shown in Figs. 5 and 6, the method provided by this embodiment includes the following steps:
301. Perform joint representation of the image and the text of the picture with a preset neural network model, and form a joint feature.
Specifically, unlike Embodiments 1 and 2, this step passes the image and the text of the picture through the preset neural network model together for joint representation. As shown in Fig. 6, before the joint representation, a preliminary image representation of the picture can first be obtained with the preset deep convolutional network model and passed to an embedding layer. It is then jointly represented by the preset recurrent neural network model together with the text representation result obtained from word segmentation and word vector representation of the picture attribute words, forming the joint feature.
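Fig. 6 is not reproduced here, so the following is only one possible reading of step 301: the preliminary image feature is projected by an embedding layer into the word-vector space, placed in front of the word vectors, and the whole sequence is run through a simple tanh RNN whose last hidden state serves as the joint (union) feature. All names, dimensions, and weights are hypothetical.

```python
import math


def linear(matrix, vector):
    """Matrix-vector product on plain lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def joint_rnn_feature(image_feature, word_vectors, w_embed, w_in, w_rec):
    """Embed the image feature, prepend it to the word vectors, and return
    the final hidden state of a simple RNN over the combined sequence."""
    image_token = linear(w_embed, image_feature)  # embedding layer
    hidden = [0.0] * len(w_rec)
    for x in [image_token] + word_vectors:
        pre = [a + b for a, b in zip(linear(w_in, x), linear(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    return hidden


w_embed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3-dim image feature -> 2-dim token
w_in = [[1.0, 0.0], [0.0, 1.0]]               # input-to-hidden weights
w_rec = [[0.5, 0.0], [0.0, 0.5]]              # hidden-to-hidden weights
words = [[0.2, 0.4]]                          # one toy word vector

feature_a = joint_rnn_feature([1.0, 0.0, 0.0], words, w_embed, w_in, w_rec)
feature_b = joint_rnn_feature([0.0, 1.0, 0.0], words, w_embed, w_in, w_rec)
```

With the same text but different images the two joint features differ, showing that both modalities contribute to the representation that is later classified.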
It is worth noting that, besides the approach described above, step 301 (performing joint representation of the image and the text of the picture with a preset neural network model and forming a joint feature) may also be implemented in other ways; the embodiment of the present invention does not limit the specific approach.
302. Perform softmax classification on the joint feature with the preset neural network model and output the picture attribute classification result.
It is worth noting that, besides the approach described above, step 302 (performing softmax classification on the joint feature with the preset neural network model and outputting the picture attribute classification result) may also be implemented in other ways; the embodiment of the present invention does not limit the specific approach.
This embodiment provides a method for picture attribute classification combining picture and text. Joint representation of the image and the text of the picture is performed with a preset neural network model to form a joint feature, and softmax classification is performed on the joint feature with the preset neural network model to output the picture attribute classification result. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data and better express the attributes of the picture, the article, or other related objects, yielding more detailed and accurate object attribute classification results. In addition, because the image features and the text features of the picture are jointly represented together, the procedure is simplified. The method can be used for picture attribute extraction, for improving a knowledge graph, or for services such as querying and searching by picture attribute category.
Embodiment 4
Fig. 7 is a schematic structural diagram of the apparatus for picture attribute classification combining picture and text provided by Embodiment 4 of the invention. As shown in Fig. 7, the apparatus provided by this embodiment includes:
a recognition and computation module 41, configured to identify the image features of the picture and the text features of the picture with a preset neural network model and form a joint feature, and further configured to classify the joint feature;
an output module 42, configured to output the picture attribute classification result.
Specifically, the process by which the recognition and computation module 41 identifies the image features of the picture and the text features of the picture with the preset neural network model and forms the joint feature includes:
performing image representation with the preset neural network model to obtain an image representation result;
performing text representation with the preset neural network model to obtain a text representation result;
performing joint representation with the preset neural network model according to the image representation result and the text representation result, forming the joint feature.
Because image features and text features differ in character, the above process uses the preset neural network model to perform image representation and text representation separately, obtaining the image representation result and the text representation result independently, and then uses the preset neural network to jointly represent the two, finally forming the joint feature. This allows a suitable representation method or process to be chosen for each kind of feature, so the resulting joint representation is more accurate and the process more efficient.
Specifically, the step of performing image representation through the preset neural network model to obtain the image representation result includes:
Performing global image representation through a preset deep convolutional neural network model to obtain the image representation result.
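As a rough illustration of a global image representation, the sketch below applies a single convolution layer, a ReLU, and global average pooling to a small grayscale image, yielding one number per kernel. This is a deliberately tiny stand-in: a real deep convolutional network stacks many such layers, and the use of global average pooling, the function names, and the kernels are assumptions not stated in the patent.

```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation without padding of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Clamp negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average_pool(feature_map):
    """Collapse a whole feature map into a single number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def global_image_representation(image, kernels):
    """One pooled activation per kernel: a fixed-length vector for the whole image."""
    return [global_average_pool(relu(conv2d_valid(image, k))) for k in kernels]


image = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
kernels = [
    [[1.0, 1.0], [1.0, 1.0]],    # responds to overall brightness
    [[1.0, -1.0], [1.0, -1.0]],  # responds to vertical edges
]
image_vector = global_image_representation(image, kernels)  # [4.0, 0.0]
```

The output length depends only on the number of kernels, not on the image size, which is what makes the vector usable as a picture-level feature.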
Specifically, the step of performing text representation through the preset neural network model to obtain the text representation result includes:
Performing word-vector representation through a preset recurrent neural network model to obtain a word-vector representation result;
Performing global text representation through the preset recurrent neural network model according to the word-vector representation result to obtain the text representation result.
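The two text steps can be sketched as a word-vector lookup followed by a simple recurrent pass. The sketch assumes a basic tanh recurrence and takes the final hidden state as the global text representation; the patent does not specify the recurrent cell or how the global representation is read out, and the embedding table and weight matrices below are invented for illustration.

```python
import math


def rnn_text_representation(words, embeddings, w_in, w_rec, unknown):
    """Run a simple tanh recurrence over word vectors and return the last hidden state."""
    hidden = [0.0] * len(w_rec)
    for word in words:
        x = embeddings.get(word, unknown)  # word-vector representation of this word
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden


# Hypothetical two-dimensional word vectors and weights.
embeddings = {"红色": [1.0, 0.0], "连衣裙": [0.0, 1.0]}
w_in = [[1.0, 0.0], [0.0, 1.0]]   # input-to-hidden weights
w_rec = [[0.5, 0.0], [0.0, 0.5]]  # hidden-to-hidden weights
text_vector = rnn_text_representation(["红色", "连衣裙"], embeddings, w_in, w_rec, [0.0, 0.0])
```

Because the hidden state is carried from word to word, the final vector depends on the whole word sequence rather than on any single word.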
Specifically, before the word-vector representation is performed through the preset recurrent neural network model, the process further includes the step of:
Performing Chinese word segmentation on the text of the picture to obtain Chinese words.
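The patent does not name a segmentation algorithm. As one simple possibility, the sketch below uses dictionary-based forward maximum matching; the vocabulary, the `max_len` parameter, and the function name are hypothetical.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word."""
    words, pos = [], 0
    while pos < len(text):
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                pos += size
                break
    return words


vocabulary = {"红色", "长袖", "连衣裙"}
tokens = forward_max_match("红色长袖连衣裙", vocabulary)  # ["红色", "长袖", "连衣裙"]
```

Characters that are not covered by the vocabulary fall back to single-character tokens, so the function always returns a complete segmentation of the input.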
Specifically, the step of performing joint representation through the preset neural network model according to the image representation result and the text representation result to form the joint feature includes:
Performing a weighted connection of the image representation result and the text representation result to form the joint feature.
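One plausible reading of the weighted connection is a weighted concatenation: each vector is scaled by its own weight and the two are joined end to end. The sketch below assumes scalar weights supplied by the caller; the patent does not say whether the weights are fixed or learned, and the function and parameter names are illustrative.

```python
def weighted_concat(image_vec, text_vec, image_weight=0.5, text_weight=0.5):
    """Scale each representation by its weight and join them into one joint feature."""
    return [image_weight * v for v in image_vec] + [text_weight * v for v in text_vec]


joint_feature = weighted_concat([1.0, 2.0], [4.0], image_weight=0.5, text_weight=0.25)
# joint_feature == [0.5, 1.0, 1.0]
```

The joint feature has as many dimensions as the two inputs combined, so a downstream classifier sees both the image and the text evidence at once.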
In addition, the identification and computation module 41 is further configured to perform classification processing on the joint feature to obtain a classification result.
This embodiment of the invention provides a device for classifying picture attributes by combining picture and text. Using the identification and computation module and the output module it includes, the device identifies the image features and text features of a picture through a preset neural network model, forms a joint feature, performs classification processing on the joint feature, and outputs the picture attribute classification result. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data, which makes it possible to better express the attributes of pictures, articles, or other related objects and to obtain more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, and for services such as querying and searching by picture attribute category.
Embodiment 5
Fig. 8 is a schematic structural diagram of the system for classifying picture attributes by combining picture and text, provided by Embodiment 5 of the invention. As shown in Fig. 8, the system provided by this embodiment of the invention includes:
An identification and computation device 51, configured to identify the image features of a picture and the text features of the picture through a preset neural network model and to form a joint feature, and further configured to perform classification processing on the joint feature;
An output device 52, configured to output the picture attribute classification result.
Specifically, the process by which the identification and computation device 51 identifies the image features and text features of the picture through the preset neural network model and forms the joint feature may be as follows:
Image representation is performed through the preset neural network model to obtain an image representation result. Preferably, global image representation is performed through a preset deep convolutional neural network model to obtain the image representation result. A deep convolutional neural network (DCNN), for example the 16-layer VGG model, is used for the image representation.
Text representation is performed through the preset neural network model to obtain a text representation result. Chinese word segmentation is performed on the text of the picture to obtain Chinese words; word-vector representation is performed through a preset recurrent neural network model to obtain a word-vector representation result; and global text representation is performed through the preset recurrent neural network model according to the word-vector representation result to obtain the text representation result.
Joint representation is performed through the preset neural network model according to the image representation result and the text representation result to form the joint feature. Specifically, a weighted connection of the image representation result and the text representation result is performed to form the joint feature.
Softmax classification is performed on the joint feature through the preset neural network model to obtain a classification result.
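The softmax classification step can be sketched as a linear layer followed by a softmax over the attribute classes. The weight matrix, biases, and two-class setup below are hypothetical; the patent only states that softmax classification is applied to the joint feature.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(joint_feature, weights, biases):
    """Linear scores per class, softmax probabilities, and the most likely class index."""
    logits = [
        sum(w * x for w, x in zip(row, joint_feature)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical parameters for two attribute classes.
weights = [[2.0, 0.0], [0.0, 2.0]]
biases = [0.0, 0.0]
best_class, probabilities = classify([1.0, 0.0], weights, biases)
```

The probabilities sum to one, so the output can be read either as a single predicted attribute class or as a confidence distribution over all classes.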
This embodiment of the invention provides a system for classifying picture attributes by combining picture and text. Using the identification and computation device and the output device it includes, the system performs image representation through a preset neural network model to obtain an image representation result; performs text representation through the preset neural network model to obtain a text representation result; performs joint representation through the preset neural network model according to the image representation result and the text representation result to form a joint feature; and performs softmax classification on the joint feature through the preset neural network model and outputs the picture attribute classification result. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data, which makes it possible to better express the attributes of pictures, articles, or other related objects and to obtain more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, and for services such as querying and searching by picture attribute category.
Embodiment 6
Fig. 9 is a schematic structural diagram of device 6 for classifying picture attributes by combining picture and text, provided by Embodiment 6 of the invention. As shown in Fig. 9, the device provided by this embodiment of the invention includes a memory 61 and a processor 62 connected to the memory. The memory 61 is used to store a set of program code, and the processor 62 calls the program code stored in the memory 61 to perform the following operations:
Identifying the image features of the picture and the text features of the picture through a preset neural network model and forming a joint feature, which specifically includes: performing joint representation on the image and the text of the picture through the preset neural network model and forming the joint feature.
Performing classification processing on the joint feature and outputting the picture attribute classification result, which specifically includes: performing softmax classification on the joint feature through the preset neural network model and outputting the picture attribute classification result.
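This embodiment does not describe how the image and text are represented together inside the model. As one possible reading only, the sketch below feeds both inputs into a single dense layer at once, so no separate image or text representation is produced first. The input encodings (flattened pixel values and word counts), the layer shape, and all names are assumptions made for illustration.

```python
import math


def joint_representation(pixels, word_counts, weights, biases):
    """Map both modalities to a joint feature with one shared tanh layer."""
    raw = list(pixels) + list(word_counts)  # both inputs enter the model together
    return [
        math.tanh(sum(w * x for w, x in zip(row, raw)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical inputs: two pixel values and counts for a two-word vocabulary.
pixels = [0.5, 0.5]
word_counts = [1.0, 0.0]
weights = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
biases = [0.0, 0.0]
joint_feature = joint_representation(pixels, word_counts, weights, biases)
```

Compared with representing each modality separately and then connecting the results, this variant has fewer stages, which matches the simplification this embodiment claims, at the cost of not tailoring the representation to each modality.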
This embodiment of the invention provides a device for classifying picture attributes by combining picture and text. The device performs joint representation on the image and the text of the picture through a preset neural network model and forms a joint feature, then performs softmax classification on the joint feature through the preset neural network model and outputs the picture attribute classification result. Extraction and classification thus combine the image features of the picture with its text features. Because the two complement each other, they provide more comprehensive picture feature data, which makes it possible to better express the attributes of pictures, articles, or other related objects and to obtain more detailed and accurate object attribute classification results. In addition, because the image features and text features of the picture are jointly represented in a single step, the procedure is simplified. The method can be used for picture attribute extraction, for improving a knowledge graph, and for services such as querying and searching by picture attribute category.
All of the optional technical solutions above may be combined in any manner to form alternative embodiments of the invention, which are not described one by one here.
In summary, the method, device and system for classifying picture attributes by combining picture and text provided by the embodiments of the invention implement the following steps: identifying the image features of a picture and the text features of the picture through a preset neural network model and forming a joint feature; and performing classification processing on the joint feature and outputting the picture attribute classification result. By combining the image features of the picture with its text features, the two complement each other and provide more comprehensive picture feature data, which makes it possible to better express the attributes of pictures, articles, or other related objects and to obtain more detailed and accurate object attribute classification results. The method can therefore be used for picture attribute extraction, for improving a knowledge graph, and for services such as querying and searching by picture attribute category.
It should be noted that, when the device and system provided by the above embodiments classify picture attributes by combining picture and text, the above division into functional modules is used only as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device or system may be divided into different functional modules to perform all or part of the functions described above. In addition, the device and system provided by the above embodiments belong to the same concept as the method embodiments for classifying picture attributes by combining picture and text. For their specific implementation, refer to the method embodiments; the details are not repeated here.
A person of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (10)
1. A method for classifying picture attributes by combining picture and text, characterized in that the method comprises:
identifying the image features of the picture and the text features of the picture through a preset neural network model, and forming a joint feature;
performing classification processing on the joint feature, and outputting a picture attribute classification result;
wherein the preset neural network model comprises at least a preset deep convolutional neural network model and a recurrent neural network model.
2. The method according to claim 1, characterized in that identifying the image features of the picture and the text features of the picture through the preset neural network model and forming the joint feature comprises:
performing image representation through the preset neural network model to obtain an image representation result;
performing text representation through the preset neural network model to obtain a text representation result;
performing joint representation through the preset neural network model according to the image representation result and the text representation result to form the joint feature.
3. The method according to claim 2, characterized in that performing image representation through the preset neural network model to obtain the image representation result comprises:
performing global image representation through the preset deep convolutional neural network model to obtain the image representation result.
4. The method according to claim 2, characterized in that performing text representation through the preset neural network model to obtain the text representation result comprises:
performing word-vector representation through a preset recurrent neural network model to obtain a word-vector representation result;
performing global text representation through the preset recurrent neural network model according to the word-vector representation result to obtain the text representation result.
5. The method according to claim 4, characterized in that, before the word-vector representation is performed through the preset recurrent neural network model, the method further comprises the step of:
performing Chinese word segmentation on the text of the picture to obtain Chinese words.
6. The method according to claim 2, characterized in that performing joint representation through the preset neural network model according to the image representation result and the text representation result to form the joint feature comprises:
performing a weighted connection of the image representation result and the text representation result to form the joint feature.
7. The method according to claim 1, characterized in that identifying the image features of the picture and the text features of the picture through the preset neural network model and forming the joint feature comprises:
performing joint representation on the image and the text of the picture through the preset neural network model, and forming the joint feature.
8. The method according to claim 1, characterized in that performing classification processing on the joint feature and outputting the picture attribute classification result comprises:
performing softmax classification on the joint feature through the preset neural network model, and outputting the picture attribute classification result.
9. A device for classifying picture attributes by combining picture and text, characterized in that the device comprises:
an identification and computation module, configured to identify the image features of the picture and the text features of the picture through a preset neural network model and to form a joint feature, and further configured to perform classification processing on the joint feature;
an output module, configured to output a picture attribute classification result.
10. A system for classifying picture attributes by combining picture and text, characterized in that the system comprises:
an identification and computation device, configured to identify the image features of the picture and the text features of the picture through a preset neural network model and to form a joint feature, and further configured to perform classification processing on the joint feature;
an output device, configured to output a picture attribute classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710832627.7A CN107862322B (en) | 2017-09-15 | 2017-09-15 | Method, device and system for classifying picture attributes by combining picture and text |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107862322A (en) | 2018-03-30 |
CN107862322B (en) | 2022-01-07 |
Family
ID=61699555
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710832627.7A Active CN107862322B (en) | 2017-09-15 | 2017-09-15 | Method, device and system for classifying picture attributes by combining picture and text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107862322B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101354733A (en) * | 2008-09-18 | 2009-01-28 | 上海交通大学 | System for data to excavate grid intermediate part facing to occupant restriction system performance analysis |
CN105469087A (en) * | 2015-07-13 | 2016-04-06 | 百度在线网络技术(北京)有限公司 | Method for identifying clothes image, and labeling method and device of clothes image |
CN105426356A (en) * | 2015-10-29 | 2016-03-23 | 杭州九言科技股份有限公司 | Target information identification method and apparatus |
CN107066583A (en) * | 2017-04-14 | 2017-08-18 | 华侨大学 | A kind of picture and text cross-module state sensibility classification method merged based on compact bilinearity |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108734212A (en) * | 2018-05-17 | 2018-11-02 | 腾讯科技(深圳)有限公司 | A kind of method and relevant apparatus of determining classification results |
CN110222189A (en) * | 2019-06-19 | 2019-09-10 | 北京百度网讯科技有限公司 | Method and apparatus for output information |
CN110399516A (en) * | 2019-07-29 | 2019-11-01 | 拉扎斯网络科技(上海)有限公司 | A kind of method, apparatus of image procossing, readable storage medium storing program for executing and electronic equipment |
CN110728328A (en) * | 2019-10-22 | 2020-01-24 | 支付宝(杭州)信息技术有限公司 | Training method and device for classification model |
CN110728328B (en) * | 2019-10-22 | 2022-03-01 | 支付宝(杭州)信息技术有限公司 | Training method and device for classification model |
CN112232339A (en) * | 2020-10-15 | 2021-01-15 | 中国民航大学 | Flight display equipment fault detection method and monitoring device based on convolutional neural network |
CN112232339B (en) * | 2020-10-15 | 2023-04-07 | 中国民航大学 | Aviation display equipment fault detection method and monitoring device based on convolutional neural network |
CN114782670A (en) * | 2022-05-11 | 2022-07-22 | 中航信移动科技有限公司 | Multi-mode sensitive information identification method, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN107862322B (en) | 2022-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107862322A (en) | The method, apparatus and system of picture attribute classification are carried out with reference to picture and text | |
Campos et al. | From pixels to sentiment: Fine-tuning CNNs for visual sentiment prediction | |
CN110287320B (en) | Deep learning multi-classification emotion analysis model combining attention mechanism | |
CN107861972A (en) | The method and apparatus of the full result of display of commodity after a kind of user's typing merchandise news | |
CN110633373B (en) | Automobile public opinion analysis method based on knowledge graph and deep learning | |
CN110489755A (en) | Document creation method and device | |
CN107862239A (en) | A kind of combination text carries out the method and its device of picture recognition with picture | |
CN106776711A (en) | A kind of Chinese medical knowledge mapping construction method based on deep learning | |
CN110046671A (en) | A kind of file classification method based on capsule network | |
CN109871446A (en) | Rejection method for identifying, electronic device and storage medium in intention assessment | |
CN109658271A (en) | A kind of intelligent customer service system and method based on the professional scene of insurance | |
CN107705066A (en) | Information input method and electronic equipment during a kind of commodity storage | |
CN109934260A (en) | Image, text and data fusion sensibility classification method and device based on random forest | |
CN106815192A (en) | Model training method and device and sentence emotion identification method and device | |
CN104142995B (en) | The social event recognition methods of view-based access control model attribute | |
CN110196945B (en) | Microblog user age prediction method based on LSTM and LeNet fusion | |
CN110750656A (en) | Multimedia detection method based on knowledge graph | |
CN107679110A (en) | The method and device of knowledge mapping is improved with reference to text classification and picture attribute extraction | |
CN109753602A (en) | A kind of across social network user personal identification method and system based on machine learning | |
CN110245228A (en) | The method and apparatus for determining text categories | |
CN109902202A (en) | A kind of video classification methods and device | |
CN111832573A (en) | Image emotion classification method based on class activation mapping and visual saliency | |
CN113051914A (en) | Enterprise hidden label extraction method and device based on multi-feature dynamic portrait | |
CN109359198A (en) | A kind of file classification method and device | |
Li | Intelligent environmental art design combining big data and artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20211202. Address after: 510000 building 6, No. 20, Huahai street, Fangcun, Liwan District, Guangzhou City, Guangdong Province (office only). Applicant after: GUANGZHOU PINWEI SOFTWARE Co.,Ltd. Address before: 510000 room 01, No.314, Fangcun Avenue middle, Liwan District, Guangzhou City, Guangdong Province. Applicant before: GUANGZHOU WEIPINHUI RESEARCH INSTITUTE CO.,LTD. |
GR01 | Patent grant | ||