CN108062421A - A kind of extensive picture multiscale semanteme search method - Google Patents
A kind of extensive picture multiscale semanteme search method
- Publication number
- CN108062421A (application CN201810020300.4A)
- Authority
- CN
- China
- Prior art keywords
- picture
- network
- vector
- text
- pictures
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Abstract
A large-scale picture semantic retrieval method trains a network with an unsupervised deep learning model to obtain picture feature vectors, and takes into account the semantic relations between the text descriptions of pictures, so as to retrieve pictures at large scale. For the picture feature vectors, a generative adversarial network composed of a 4-6 layer discriminator network and a 4-6 layer generator network is used to extract picture features. For the picture text, a distributed word-vector representation is used to obtain a vector for the picture, and word embeddings describe the semantic information of the picture. The retrieved pictures are clustered with a clustering method so that only one item from each class of goods is shown to the user, which reduces the time the user spends looking for goods. The text description vector of each picture is obtained from the trained word vectors; the text vector and the picture vector are concatenated as the feature representation of the picture; the pictures are then clustered with k-means++.
Description
Technical field
The present invention relates to large-scale picture semantic retrieval, and in particular to a multi-scale semantic retrieval method for large-scale e-commerce pictures.
Background art
Existing picture retrieval techniques fall mainly into text-based picture retrieval and content-based picture retrieval. Text-based retrieval describes the features of a picture by means of text. Content-based retrieval analyzes and retrieves pictures by their color, texture, layout and so on. Text-based retrieval describes a picture by its author, age, school and size, which cannot express the semantic similarity between pictures. Content-based retrieval requires picture features to be extracted manually, which demands manpower and material resources. In recent years deep learning has achieved considerable success in computer vision, so implementing image retrieval with deep learning is a promising approach.
For example, the retrieval method disclosed in CN106777177A receives a retrieval request sent by a client, the retrieval request including a target picture; parses the target picture and extracts text information and image features; matches the text information against the text information of each preset picture in a preset picture set to determine a first similarity and, in response to the first similarity exceeding a preset first threshold, matches the image features against the image features of the preset picture; determines, based on the matching result, whether the preset picture is an identical picture; and obtains the associated information of the identical picture and sends the identical picture and the associated information to the client as the retrieval result, so that the client displays the retrieval result.
CN105760390A discloses a picture retrieval system that runs on an electronic device and comprises: a picture acquisition module for obtaining a picture to be identified; a picture processing module for preprocessing the picture to be identified; a feature extraction module for extracting the image features of the picture to be identified; and a retrieval module for retrieving, according to the extracted image features, pictures that match the picture to be identified from a preset cloud storage.
Summary of the invention
The invention aims to overcome two shortcomings of existing methods: incomplete semantic representation and the need for a large investment of manpower and material resources. To this end, the invention proposes a large-scale picture semantic retrieval technique that considers the relations between pictures at multiple scales. A network is trained with an unsupervised deep learning model to obtain picture feature vectors, and the semantic relations between the text descriptions of pictures are taken into account, so as to retrieve pictures at large scale. Pictures need not be labeled, which reduces manpower, and the semantic relations between pictures are considered as a whole. The method combines the advantages of text-based picture retrieval and content-based picture retrieval.
The technical solution adopted by the invention to solve the large-scale picture retrieval problem is a large-scale picture semantic retrieval method that trains a network with an unsupervised deep learning model to obtain picture feature vectors, and takes into account the semantic relations between the text descriptions of pictures, so as to retrieve pictures at large scale;
For the picture feature vectors, a generative adversarial network composed of a 4-6 layer discriminator network and a 4-6 layer generator network is used to extract picture features; see Fig. 3, which shows a generative adversarial network composed of a 5-layer discriminator network and a 5-layer generator network;
For the picture text, a distributed word-vector representation is used to obtain a vector for the picture, and word embeddings describe the semantic information of the picture; see Fig. 7 of the embodiment;
The retrieved pictures are clustered with a clustering method so that only one item from each class of goods is shown to the user (see Fig. 4 of the embodiment), which reduces the time the user spends looking for goods; the clustering method used is k-means++;
After the picture vectors are obtained, their similarity to the query picture is calculated, and pictures whose similarity exceeds 0.5 are taken as candidates;
The text description vector of each picture is then obtained from the trained word vectors; the text vector and the picture vector are concatenated as the feature representation of the picture; the pictures are then clustered with k-means++, and one picture from each cluster is shown to the user; if the user wants to see all pictures in the cluster containing that picture, the user clicks the picture and all of them are displayed.
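The candidate selection and concatenation steps above can be sketched as follows. This is a minimal illustration only: the text does not name the similarity measure, so cosine similarity is an assumption here, and the function names `cosine_similarity`, `select_candidates` and `joint_feature` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_candidates(query_vec, picture_vecs, threshold=0.5):
    """Return ids of pictures whose similarity to the query exceeds the threshold.

    Assumption: cosine similarity; the source only states the 0.5 threshold.
    """
    return [pid for pid, vec in picture_vecs.items()
            if cosine_similarity(query_vec, vec) > threshold]

def joint_feature(picture_vec, text_vec):
    """Concatenate the picture vector and the text vector into one feature."""
    return list(picture_vec) + list(text_vec)
```

The joint features of the candidates would then be passed to the clustering step.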
Further, the discriminator network of the generative adversarial network is used to obtain the feature representation of each picture, and similar pictures are found through the similarity between features (see Figs. 5 and 6); meanwhile, word vectors are used to obtain a vector representation of each picture's text description; the picture vector and the text description vector are then concatenated as the representation of the picture, the pictures are clustered with k-means, and one picture from each class is shown to the user.
The implementation is divided into two steps, training and the production environment; the training step trains the generative adversarial network; training uses the TensorFlow platform; the discriminator network used in training is a convolutional neural network, and the generator network is a deconvolutional neural network;
The network is typically a generative adversarial network composed of a 5-layer discriminator network and a 5-layer generator network; in this network, the input of the generator network is a 100-dimensional random vector and its output is a 64*64*3 picture; the input of the discriminator network is a 64*64*3 picture and its output is a number between 0 and 1 representing the probability that the picture is a real picture;
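One way a 5-layer generator could map a 100-dimensional vector to a 64*64*3 picture is sketched below as a shape calculation. The layer layout is an assumption in the style of DCGAN (a projection to a 4*4 feature map followed by stride-2 transposed convolutions); the source states only the input and output sizes, and `base_size` and `base_channels` are hypothetical values.

```python
def generator_shapes(latent_dim=100, base_size=4, base_channels=512,
                     num_layers=5, out_channels=3):
    """List the tensor shape after each generator layer.

    Assumed layout: layer 1 projects the latent vector to a small feature
    map; each later layer is a stride-2 transposed convolution that doubles
    height and width and halves the channel count, except the last layer,
    which outputs `out_channels` channels.
    """
    shapes = [(latent_dim,)]
    size, channels = base_size, base_channels
    shapes.append((size, size, channels))
    for layer in range(2, num_layers + 1):
        size *= 2
        channels = out_channels if layer == num_layers else channels // 2
        shapes.append((size, size, channels))
    return shapes

for shape in generator_shapes():
    print(shape)
```

Under these assumptions the spatial size doubles four times, from 4 to 64, which matches the stated 64*64*3 output.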
During training, the adversarial game is formed by minimizing the loss on real pictures and the loss on generated pictures respectively; batch normalization is used in the network to address exploding and vanishing gradients during training, and the fully connected layers are removed to speed up convergence; after training, the output of the second-to-last layer of the discriminator network is used as the feature of a picture, and the pictures with higher similarity are selected according to the feature similarity between pictures.
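The two losses mentioned above can be illustrated with the standard generative adversarial network formulation. The source does not give the loss formulas, so the cross-entropy form and the non-saturating generator loss below are assumptions; `d_real` and `d_fake` are hypothetical names for the discriminator outputs on real and generated pictures.

```python
import math

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Mean of -log D(x) on real pictures plus mean of -log(1 - D(G(z))) on generated ones."""
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss (an assumption): mean of -log D(G(z))."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)
```

The discriminator loss falls as real pictures score near 1 and generated pictures near 0, while the generator loss falls as generated pictures score near 1, which is what makes the two networks adversaries.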
When training the word vectors, the text descriptions of the goods corresponding to the pictures are the input, and the output is the vector corresponding to each word; the vectors of the words contained in the text description of each picture are then summed to obtain the semantic representation of that picture.
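The summation of word vectors can be sketched as follows. This is a minimal illustration: splitting on whitespace is an assumption (Chinese descriptions would need a word segmenter first), and `text_vector` and `word_vectors` are hypothetical names.

```python
def text_vector(description, word_vectors, dim):
    """Sum the vectors of the words in a description; unknown words are skipped."""
    total = [0.0] * dim
    for word in description.split():
        vec = word_vectors.get(word)
        if vec is None:
            continue  # word not seen during word-vector training
        total = [t + v for t, v in zip(total, vec)]
    return total

# Toy 2-dimensional vectors, for illustration only.
toy_vectors = {"red": [1.0, 0.0], "dress": [0.0, 2.0]}
print(text_vector("red dress", toy_vectors, 2))
```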
Compared with a one-hot representation, the distributed representation of the word vectors described above can express the semantic similarity between pictures.
The clustering described above ensures that, when results are shown to the user, only one picture from each class is displayed, which reduces the user's search burden. Compared with k-means, k-means++ initializes the cluster centers so that they are far apart from one another, which improves on the k-means method.
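The initialization step that distinguishes k-means++ from k-means can be sketched as below. This follows the commonly described seeding rule (each new center is sampled with probability proportional to its squared distance from the nearest center already chosen); the function names are hypothetical, and the sketch assumes the data contains at least k distinct points.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centers with k-means++ seeding.

    Assumes `points` holds at least k distinct vectors.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        # Sample the next center with probability proportional to that distance,
        # so far-away points are favored and chosen centers (weight 0) are never repeated.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers
```

The ordinary k-means iterations then start from these centers instead of from uniformly random ones.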
Beneficial effects of the invention: the discriminator network of the generative adversarial network is used to obtain the feature representation of each picture, and similar pictures are found through the similarity between features. Meanwhile, word vectors are used to obtain a vector representation of each picture's text description. The picture vector and the text description vector are then concatenated as the representation of the picture, the pictures are clustered with k-means, and one picture from each class is shown to the user. The invention considers the semantic features of pictures at multiple scales. Compared with earlier methods, it does not need extensive manual participation: picture features are obtained automatically by deep learning, and the semantic features of the picture descriptions are considered as well, so the method is suitable for multi-scale semantic retrieval over tens of millions of pictures. The feature representation of a picture is more diverse and better abstracts its deep features. In particular, because picture features are extracted by unsupervised learning, the method remains applicable to large-scale picture sets.
Brief description of the drawings
Fig. 1 is the framework of the generative adversarial network.
Fig. 2 is the flow chart of the whole system.
Fig. 3 is a specific implementation of the generator network.
Fig. 4 is the flow chart of keyword search results.
Fig. 5 is the flow chart of the generator network.
Fig. 6 is the flow chart of the discriminator network.
Fig. 7 is the generation flow of the text description vector.
Detailed description
The invention is further described below with reference to the accompanying drawings. As shown in the figures, the implementation is divided into two parts, training and the production environment. The training part mainly trains the generative adversarial network, using the TensorFlow platform. The discriminator network is a convolutional neural network and the generator network is a deconvolutional neural network. Each training iteration uses 64 pictures. The overall framework is shown in Fig. 2.
After training is complete, the trained model is used to set up a standard TensorFlow model server. In actual use, one picture or a batch of pictures can be sent to this server at a time to obtain the picture vectors.
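A request to such a server might look like the sketch below. It assumes the server is TensorFlow Serving with its REST predict endpoint enabled, which the source does not state explicitly; the host, port and model name `discriminator` are hypothetical placeholders.

```python
import json
import urllib.request

def build_predict_request(pictures, host="localhost", port=8501,
                          model_name="discriminator"):
    """Build (url, body) for a TensorFlow Serving REST predict call.

    `pictures` is a list of 64x64x3 nested lists; host, port and
    model_name are hypothetical placeholders.
    """
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    body = json.dumps({"instances": pictures}).encode("utf-8")
    return url, body

def fetch_picture_vectors(pictures, **kwargs):
    """Send one picture or a batch and return one feature vector per picture."""
    url, body = build_predict_request(pictures, **kwargs)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]
```

Sending a batch in one request, rather than one picture at a time, keeps the number of round trips low when many pictures need vectors.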
After the picture vectors are obtained, their similarity to the query picture is calculated, and pictures whose similarity exceeds 0.5 are taken as candidates. The text description vector of each picture is then obtained from the trained word vectors. The text vector and the picture vector are concatenated as the feature representation of the picture. The pictures are then clustered with k-means++, and one picture from each cluster is shown to the user; if the user wants to see all pictures in the cluster containing that picture, the user clicks the picture and all of them are displayed. See the flows in Figs. 3 to 7.
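The display step can be sketched as follows, given cluster labels already produced by the clustering step. The source does not say which picture of a cluster is shown, so taking the first member is an assumption, and all function names are hypothetical.

```python
def group_by_cluster(picture_ids, labels):
    """Map each cluster label to the picture ids assigned to it."""
    clusters = {}
    for pid, label in zip(picture_ids, labels):
        clusters.setdefault(label, []).append(pid)
    return clusters

def representatives(clusters):
    """Pick one picture per cluster to display (here simply the first member)."""
    return {label: members[0] for label, members in clusters.items()}

def expand(clusters, clicked_id):
    """Return every picture in the cluster of the picture the user clicked."""
    for members in clusters.values():
        if clicked_id in members:
            return members
    return []
```

A result page would render `representatives(clusters)` first and call `expand` only when the user clicks one of the displayed pictures.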
Referring to Fig. 3, a generative adversarial network composed of a 5-layer discriminator network and a 5-layer generator network is used;
Referring to Fig. 4, the retrieved pictures are clustered with a clustering method so that only one item from each class of goods is shown to the user, which reduces the time the user spends looking for goods; the clustering method used is k-means++;
After the picture vectors are obtained, their similarity to the query picture is calculated, and pictures whose similarity exceeds 0.5 are taken as candidates;
Refer to Figs. 5 and 6; meanwhile, word vectors are used to obtain a vector representation of each picture's text description; the picture vector and the text description vector are then concatenated as the representation of the picture, the pictures are clustered with k-means, and one picture from each class is shown to the user.
Referring to Fig. 7, a distributed word-vector representation is used to obtain a vector for the picture, and word embeddings describe the semantic information of the picture.
The present invention is not limited to the embodiment described above; other designs obtained using a structure identical or similar to that of the above embodiment fall within the scope of protection of the present invention.
Claims (5)
1. A large-scale picture semantic retrieval method, characterized in that a network is trained with an unsupervised deep learning model to obtain picture feature vectors, and the semantic relations between the text descriptions of pictures are taken into account, so as to retrieve pictures at large scale;
for the picture feature vectors, a generative adversarial network composed of a 4-6 layer discriminator network and a 4-6 layer generator network is used to extract picture features;
for the picture text, a distributed word-vector representation is used to obtain a vector for the picture, and word embeddings describe the semantic information of the picture;
the retrieved pictures are clustered with a clustering method so that only one item from each class of goods is shown to the user, which reduces the time the user spends looking for goods; the clustering method used is k-means++;
after the picture vectors are obtained, their similarity to the query picture is calculated, and pictures whose similarity exceeds 0.5 are taken as candidates;
the text description vector of each picture is then obtained from the trained word vectors; the text vector and the picture vector are concatenated as the feature representation of the picture; the pictures are then clustered with k-means++, and one picture from each cluster is shown to the user; if the user wants to see all pictures in the cluster containing that picture, the user clicks the picture and all of them are displayed.
2. The large-scale picture semantic retrieval method according to claim 1, characterized in that, in processing the picture feature vectors, the discriminator network of the generative adversarial network is used to obtain the feature representation of each picture, and similar pictures are found through the similarity between features; meanwhile, word vectors are used to obtain a vector representation of each picture's text description; the picture vector and the text description vector are then concatenated as the representation of the picture, the pictures are clustered with k-means, and one picture from each class is shown to the user.
3. The large-scale picture semantic retrieval method according to claim 2, characterized in that the implementation is divided into two steps, training and the production environment; the training step trains the generative adversarial network; training uses the TensorFlow platform; the discriminator network used in training is a convolutional neural network, and the generator network is a deconvolutional neural network;
a 5-layer discriminator network and a 5-layer generator network are used; in the generative adversarial network composed of the 5-layer discriminator network and the 5-layer generator network, the input of the generator network is a 100-dimensional random vector and its output is a 64*64*3 picture; the input of the discriminator network is a 64*64*3 picture and its output is a number between 0 and 1 representing the probability that the picture is a real picture;
during training, the adversarial game is formed by minimizing the loss on real pictures and the loss on generated pictures respectively; batch normalization is used in the network to address exploding and vanishing gradients during training, and the fully connected layers are removed to speed up convergence; after training, the output of the second-to-last layer of the discriminator network is used as the feature of a picture, and the pictures with higher similarity are selected according to the feature similarity between pictures.
4. The large-scale picture semantic retrieval method according to claim 1, characterized in that, when training the word vectors, the text descriptions of the goods corresponding to the pictures are the input, and the output is the vector corresponding to each word; the vectors of the words contained in the text description of each picture are then summed to obtain the semantic representation of that picture.
5. The large-scale picture semantic retrieval method according to claim 1, characterized in that the clustering ensures that, when results are shown to the user, only one picture from each class is displayed, which reduces the user's search burden; compared with k-means, k-means++ initializes the cluster centers so that they are far apart from one another, which improves on the k-means method;
after training is complete, the trained model is used to set up a standard TensorFlow model server; in actual use, one picture or a batch of pictures can be sent to this server at a time to obtain the picture vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810020300.4A CN108062421A (en) | 2018-01-09 | 2018-01-09 | A kind of extensive picture multiscale semanteme search method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810020300.4A CN108062421A (en) | 2018-01-09 | 2018-01-09 | A kind of extensive picture multiscale semanteme search method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108062421A (en) | 2018-05-22 |
Family
ID=62141120
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810020300.4A Pending CN108062421A (en) | 2018-01-09 | 2018-01-09 | A kind of extensive picture multiscale semanteme search method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108062421A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829847A (en) * | 2018-06-20 | 2018-11-16 | 山东大学 | Commodity search method and system based on multi-modal shopping preferences |
CN108932660A (en) * | 2018-07-26 | 2018-12-04 | 北京旷视科技有限公司 | A kind of commodity using effect analogy method, device and equipment |
CN109063772A (en) * | 2018-08-02 | 2018-12-21 | 广东工业大学 | A kind of image individuation semantic analysis, device and equipment based on deep learning |
CN109584257A (en) * | 2018-11-28 | 2019-04-05 | 中国科学院深圳先进技术研究院 | A kind of image processing method and relevant device |
CN109901835A (en) * | 2019-01-25 | 2019-06-18 | 北京三快在线科技有限公司 | Method, apparatus, equipment and the storage medium of layout element |
CN110059217A (en) * | 2019-04-29 | 2019-07-26 | 广西师范大学 | A kind of image text cross-media retrieval method of two-level network |
CN111339340A (en) * | 2018-12-18 | 2020-06-26 | 顺丰科技有限公司 | Training method of image description model, image searching method and device |
CN113656582A (en) * | 2021-08-17 | 2021-11-16 | 北京百度网讯科技有限公司 | Training method of neural network model, image retrieval method, device and medium |
CN115186119A (en) * | 2022-09-07 | 2022-10-14 | 深圳市华曦达科技股份有限公司 | Picture processing method and system based on picture and text combination and readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102253996A (en) * | 2011-07-08 | 2011-11-23 | 北京航空航天大学 | Multi-visual angle stagewise image clustering method |
CN106126581A (en) * | 2016-06-20 | 2016-11-16 | 复旦大学 | Cartographical sketching image search method based on degree of depth study |
CN106997380A (en) * | 2017-03-21 | 2017-08-01 | 北京工业大学 | Imaging spectrum safe retrieving method based on DCGAN depth networks |
CN107154023A (en) * | 2017-05-17 | 2017-09-12 | 电子科技大学 | Face super-resolution reconstruction method based on generation confrontation network and sub-pix convolution |
CN107330364A (en) * | 2017-05-27 | 2017-11-07 | 上海交通大学 | A kind of people counting method and system based on cGAN networks |
US20170365038A1 (en) * | 2016-06-16 | 2017-12-21 | Facebook, Inc. | Producing Higher-Quality Samples Of Natural Images |
- 2018-01-09 CN CN201810020300.4A patent/CN108062421A/en active Pending
Non-Patent Citations (2)
Title |
---|
刘玉杰: "基于条件生成对抗网络的手绘图像检索", 《计算机辅助设计与图形学学报》 * |
樊雷: "一种基于TensorFlow的DCGAN模型实现", 《电脑知识与技术》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829847B (en) * | 2018-06-20 | 2020-11-17 | 山东大学 | Multi-modal modeling method based on translation and application thereof in commodity retrieval |
CN108829847A (en) * | 2018-06-20 | 2018-11-16 | 山东大学 | Commodity search method and system based on multi-modal shopping preferences |
CN108932660A (en) * | 2018-07-26 | 2018-12-04 | 北京旷视科技有限公司 | A kind of commodity using effect analogy method, device and equipment |
CN109063772A (en) * | 2018-08-02 | 2018-12-21 | 广东工业大学 | A kind of image individuation semantic analysis, device and equipment based on deep learning |
CN109063772B (en) * | 2018-08-02 | 2022-05-10 | 广东工业大学 | Image personalized semantic analysis method, device and equipment based on deep learning |
CN109584257A (en) * | 2018-11-28 | 2019-04-05 | 中国科学院深圳先进技术研究院 | A kind of image processing method and relevant device |
CN109584257B (en) * | 2018-11-28 | 2022-12-09 | 中国科学院深圳先进技术研究院 | Image processing method and related equipment |
CN111339340A (en) * | 2018-12-18 | 2020-06-26 | 顺丰科技有限公司 | Training method of image description model, image searching method and device |
CN109901835A (en) * | 2019-01-25 | 2019-06-18 | 北京三快在线科技有限公司 | Method, apparatus, equipment and the storage medium of layout element |
CN110059217A (en) * | 2019-04-29 | 2019-07-26 | 广西师范大学 | A kind of image text cross-media retrieval method of two-level network |
CN110059217B (en) * | 2019-04-29 | 2022-11-04 | 广西师范大学 | Image text cross-media retrieval method for two-stage network |
CN113656582A (en) * | 2021-08-17 | 2021-11-16 | 北京百度网讯科技有限公司 | Training method of neural network model, image retrieval method, device and medium |
CN115186119A (en) * | 2022-09-07 | 2022-10-14 | 深圳市华曦达科技股份有限公司 | Picture processing method and system based on picture and text combination and readable storage medium |
CN115186119B (en) * | 2022-09-07 | 2022-12-06 | 深圳市华曦达科技股份有限公司 | Picture processing method and system based on picture and text combination and readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108062421A (en) | A kind of extensive picture multiscale semanteme search method | |
CN108985377B (en) | A kind of image high-level semantics recognition methods of the multiple features fusion based on deep layer network | |
CN107220657B (en) | A kind of method of high-resolution remote sensing image scene classification towards small data set | |
CN106845529B (en) | Image feature identification method based on multi-view convolution neural network | |
CN107944559B (en) | Method and system for automatically identifying entity relationship | |
CN109902665A (en) | Similar face retrieval method, apparatus and storage medium | |
CN110532996A (en) | The method of visual classification, the method for information processing and server | |
CN107025284A (en) | The recognition methods of network comment text emotion tendency and convolutional neural networks model | |
CN111353542A (en) | Training method and device of image classification model, computer equipment and storage medium | |
CN112119388A (en) | Training image embedding model and text embedding model | |
CN108875076B (en) | Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network | |
CN106778921A (en) | Personnel based on deep learning encoding model recognition methods again | |
CN109993102A (en) | Similar face retrieval method, apparatus and storage medium | |
CN109783794A (en) | File classification method and device | |
CN110399895A (en) | The method and apparatus of image recognition | |
CN112949740B (en) | Small sample image classification method based on multilevel measurement | |
CN111898703B (en) | Multi-label video classification method, model training method, device and medium | |
CN110287323A (en) | A kind of object-oriented sensibility classification method | |
CN112074828A (en) | Training image embedding model and text embedding model | |
CN107609055B (en) | Text image multi-modal retrieval method based on deep layer topic model | |
CN105989336A (en) | Scene identification method based on deconvolution deep network learning with weight | |
CN109325529B (en) | Sketch identification method and application of sketch identification method in commodity retrieval | |
CN109871749A (en) | A kind of pedestrian based on depth Hash recognition methods and device, computer system again | |
CN113688894A (en) | Fine-grained image classification method fusing multi-grained features | |
CN111581364B (en) | Chinese intelligent question-answer short text similarity calculation method oriented to medical field |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180522 |