WO2020201866A1 - Image search system and image search method - Google Patents
Image search system and image search method
- Publication number
- WO2020201866A1 (PCT/IB2020/052405)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- database
- image data
- data
- tag
- query
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
- G06F16/532—Query formulation, e.g. graphical querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/56—Information retrieval; Database structures therefor; File system structures therefor of still image data having vectorial format
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/268—Morphological analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Definitions
- One aspect of the present invention relates to an image search system and an image search method.
- One aspect of the present invention is not limited to the above technical fields.
- Examples of the technical field include a semiconductor device, a display device, a light-emitting device, a power storage device, a memory device, an electronic device, a lighting device, a driving method thereof, and a manufacturing method thereof.
- Prior art documents, such as domestic and foreign patent documents and papers obtained through a prior art search, can be used to confirm the novelty and inventive step of an invention and to decide whether to file a patent application.
- By searching the prior art documents for invalidating material, it is possible to investigate whether one's own patent right is at risk of being invalidated, or whether a patent right owned by another party can be invalidated.
- The above-mentioned prior art search can be carried out by searching for prior art documents that disclose a drawing similar to a drawing embodying the technique before filing. Specifically, for example, when a user inputs a drawing into an image search system, prior art documents containing drawings similar to the input drawing can be retrieved.
- Patent Document 1 discloses a method of determining the similarity between images using a neural network.
- When the similarity between the input image and the images to be searched is calculated using only image data, an image whose concept differs from that of the input image may be retrieved. As a result, images that act as noise may be mixed into the search results, and the intended image may not be output. The search accuracy for similar images may therefore be low.
- one aspect of the present invention is to provide an image search system with high search accuracy.
- one aspect of the present invention is to provide an image search system capable of performing a search in a short time.
- one aspect of the present invention is to provide an image search system capable of easily performing a search.
- one aspect of the present invention is to provide a new image search system.
- one aspect of the present invention is to provide an image search method with high search accuracy.
- one aspect of the present invention is to provide an image search method capable of performing a search in a short time.
- one aspect of the present invention is to provide an image search method capable of easily performing a search.
- one aspect of the present invention is to provide a novel image retrieval method.
- One aspect of the present invention is an image search system including a database, a processing unit, and an input unit. The database has a function of storing document data and a plurality of database image data. The processing unit has a function of acquiring, for each of the plurality of database image data, database image feature amount data representing a feature amount of the database image data. The processing unit has a function of generating a plurality of database tags using the document data and associating the database tags with the database image data, and a function of acquiring, for each of the plurality of database tags, a database tag vector representing the database tag. When query image data is input to the input unit, the processing unit has a function of acquiring query image feature amount data representing a feature amount of the query image data, and a function of calculating, for each of the plurality of database image data, a first similarity, which is the similarity of the database image data to the query image data. The processing unit has a function of acquiring a query tag associated with the query image data by using some of the database tags based on the first similarity, and a function of acquiring a query tag vector representing the query tag. The processing unit has a function of acquiring first data including the database image feature amount data and the database tag vector, a function of acquiring second data including the query image feature amount data and the query tag vector, and a function of calculating a second similarity, which is the similarity of the first data to the second data.
- the database tag may include a word.
- the processing unit may have a function of generating a database tag by performing morphological analysis on the document data.
- The processing unit may have a first neural network and a second neural network; the database image feature amount data and the query image feature amount data may be acquired using the first neural network, and the database tag vector and the query tag vector may be acquired using the second neural network.
- the first neural network has a convolution layer and a pooling layer, and the database image feature amount data and the query image feature amount data may be output from the pooling layer.
- the database tag vector and the query tag vector may be distributed representation vectors.
- the first similarity and the second similarity may be cosine similarity.
- One aspect of the present invention is an image search method using an image search system that has an input unit and a database in which document data and a plurality of database image data are stored. In the method, database image feature amount data representing a feature amount of the database image data is acquired for each of the plurality of database image data; a plurality of database tags are generated using the document data and associated with the database image data; and a database tag vector representing each database tag is acquired. When query image data is input to the input unit, query image feature amount data representing a feature amount of the query image data is acquired, and a first similarity, which is the similarity of the database image data to the query image data, is calculated for each of the plurality of database image data. Based on the first similarity, a query tag associated with the query image data is acquired using some of the database tags, and a query tag vector representing the query tag is acquired. Then, first data including the database image feature amount data and the database tag vector and second data including the query image feature amount data and the query tag vector are acquired, and a second similarity of the first data to the second data is calculated.
- the database tag may include a word.
- a database tag may be generated by performing morphological analysis on the document data.
- The database image feature amount data and the query image feature amount data may be acquired using a first neural network, and the database tag vector and the query tag vector may be acquired using a second neural network.
- the first neural network has a convolution layer and a pooling layer, and the database image feature amount data and the query image feature amount data may be output from the pooling layer.
- the database tag vector and the query tag vector may be distributed representation vectors.
- the first similarity and the second similarity may be cosine similarity.
- One aspect of the present invention can provide an image search system with high search accuracy.
- one aspect of the present invention can provide a novel image retrieval system.
- one aspect of the present invention can provide an image search method with high search accuracy.
- one aspect of the present invention can provide a novel image retrieval method.
- FIG. 1 is a block diagram showing a configuration example of an image search system.
- FIG. 2 is a flowchart showing an example of a method of generating search data.
- 3A and 3B are diagrams showing a configuration example of a neural network.
- FIG. 4 is a diagram showing an example of a convolution process and a pooling process.
- FIG. 5 is a diagram showing a configuration example of a neural network.
- 6A and 6B are diagrams showing an example of a method of generating search data.
- FIG. 7A is a diagram showing an example of a method of generating search data.
- FIG. 7B is a diagram showing a configuration example of a neural network.
- 8A and 8B are diagrams showing an example of a method of generating search data.
- FIG. 9 is a flowchart showing an example of an image search method.
- FIG. 10 is a diagram showing an example of an image search method.
- 11A and 11B are diagrams showing an example of an image retrieval method.
- 12A and 12B are diagrams showing an example of an image retrieval method.
- FIG. 13 is a diagram showing an example of an image search method.
- FIG. 14 is a flowchart showing an example of an image search method.
- FIG. 15 is a diagram showing an example of an image search method.
- 16A and 16B are diagrams showing an example of an image retrieval method.
- FIG. 17 is a flowchart showing an example of an image search method.
- 18A and 18B are diagrams showing an example of an image retrieval method.
- FIG. 19 is a diagram showing an example of an image search method.
- 20A, 20B1 and 20B2 are diagrams showing an example of an image retrieval method.
- 21A and 21B are diagrams showing an example of an image retrieval method.
- 22A and 22B are diagrams showing an example of an image retrieval method.
- FIG. 23 is a flowchart showing an example of the image search method.
- 24A and 24B are diagrams showing an example of an image retrieval method.
- FIG. 25 is a diagram showing an example of an image search method.
- FIG. 26 is a diagram showing an example of an image search method.
- the image search system includes an input unit, a database, and a processing unit.
- the processing unit has a first neural network and a second neural network.
- the first and second neural networks are provided with layers having neurons.
- In this specification, a neural network refers to a general model that imitates a biological neural network, determines the connection strengths between neurons by learning, and thereby has problem-solving ability.
- determining the connection strength (also referred to as a weighting coefficient) between neurons from existing information is referred to as "learning”.
- Image data is stored in the database.
- The image search system of one aspect of the present invention searches the database for image data similar to the input image data and outputs it.
- the image data stored in the database is referred to as database image data.
- the image data input to the input unit is called query image data.
- the database image data and the query image data may be collectively referred to as image data.
- By inputting image data into the first neural network of the processing unit, image feature amount data can be acquired.
- Image feature amount data is data representing a feature amount of image data.
- the data representing the feature amount of the database image data is called the database image feature amount data
- the data representing the feature amount of the query image data is called the query image feature amount data.
- the first neural network can be, for example, a convolutional neural network having a convolutional layer and a pooling layer.
- When the first neural network is a convolutional neural network, the data output from the pooling layer when the image data is input to the first neural network can be used as the image feature amount data.
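As an illustration of this idea, the following sketch builds a feature vector from the output of a convolution layer followed by a pooling layer. The kernels, sizes, and ReLU activation are hypothetical choices for illustration; the publication does not specify the network architecture.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2-D convolution (single channel, stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling of a feature map."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def image_features(image, kernels):
    """Convolve, apply ReLU, pool, and flatten the pooled maps into one
    feature vector (the data 'output from the pooling layer')."""
    return np.concatenate([max_pool(np.maximum(convolve2d(image, k), 0)).ravel()
                           for k in kernels])
```

An 8x8 image convolved with a 3x3 kernel yields a 6x6 map, pooled to 3x3, so each kernel contributes 9 components to the feature vector.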
- a tag is associated with the database image data.
- the tag can be associated by storing the document data associated with the database image data in the database and performing morphological analysis on the document data.
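A minimal sketch of generating tags from document data. A real system would use a morphological analyzer (e.g. MeCab for Japanese text) and keep only content words such as nouns; the whitespace tokenizer and stop-word list below are simplified stand-ins.

```python
import re
from collections import Counter

# Hypothetical stop-word list standing in for part-of-speech filtering.
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "in", "to", "for", "with"}

def generate_tags(document, num_tags=5):
    """Split a document into words and keep the most frequent content
    words as tags. This is a stand-in for true morphological analysis."""
    words = re.findall(r"[a-z]+", document.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [w for w, _ in counts.most_common(num_tags)]
```

Each returned tag is one word, matching the description that one tag can represent one word.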
- the tag can be a keyword representing an image concept, technical content, attention point, etc. corresponding to database image data.
- one tag can represent one word.
- Multiple tags can be associated with the database image data.
- the tag associated with the database image data is referred to as a database tag.
- the tag associated with the query image data is called a query tag.
- By inputting a tag into the second neural network of the processing unit, the tag can be represented by a vector.
- the tag can be represented by a 300-dimensional distributed representation vector.
- a vector representing a tag is referred to as a tag vector.
- a vector representing a database tag is called a database tag vector
- a vector representing a query tag is called a query tag vector.
- one tag vector indicates a tag vector corresponding to one tag.
- the term vector refers to a set of a plurality of values. Further, the number of values constituting one vector is called a number of dimensions. For example, the vector represented by (5,1,4,3,2) can be said to be a five-dimensional vector. The value that constitutes the vector may be called a component.
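The lookup below sketches how a tag can be mapped to a 300-dimensional distributed-representation vector. The vocabulary and randomly initialized embedding matrix are hypothetical; in practice the second neural network would be trained (word2vec-style, for example) so that semantically related tags receive nearby vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary; 300 dimensions follows the example in the text.
VOCAB = {"transistor": 0, "oxide": 1, "display": 2}
EMBEDDINGS = rng.standard_normal((len(VOCAB), 300))

def tag_vector(tag):
    """Return the distributed-representation vector for a tag."""
    return EMBEDDINGS[VOCAB[tag]]
```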
- the database image feature amount data representing the feature amount of the database image is stored in the database in advance.
- the database tag associated with the database image data and the database tag vector representing the database tag are also stored in the database in advance.
- the database tag itself does not have to be stored in the database.
- The query image data is input to the first neural network, and the query image feature amount data is generated.
- the similarity of the database image data to the query image data is calculated using the database image feature data and the query image feature data. For example, the cosine similarity is calculated.
- the degree of similarity to the query image data can be calculated, for example, for each of all the database image data.
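The first similarity computation can be sketched as follows, using the cosine similarity mentioned above; the helper names are illustrative.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors; 1.0 means the
    vectors point in the same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_database(query_features, database_features):
    """Compute the first similarity for every database image and return
    the indices sorted from most to least similar, plus the similarities."""
    sims = [cosine_similarity(query_features, f) for f in database_features]
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return order, sims
```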
- the query tag is acquired using the database tag based on the calculation result of the similarity. For example, among the database tags associated with the database image data having a high degree of similarity, the database tag having a high frequency of appearance can be used as a query tag.
- the number of query tags can be, for example, the same as the number of database tags associated with one database image data.
- one image data means, for example, image data representing one image displayed in one frame period.
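The query-tag acquisition described above can be sketched as follows; the number of similar images consulted (`top_k`) and the number of tags returned (`num_tags`) are illustrative parameters, not values fixed by the publication.

```python
from collections import Counter

def acquire_query_tags(similarities, database_tags, top_k=5, num_tags=5):
    """Pick the top_k database images most similar to the query, then
    take the most frequently appearing database tags among them as the
    query tags."""
    ranked = sorted(range(len(similarities)), key=lambda i: similarities[i],
                    reverse=True)[:top_k]
    counts = Counter(tag for i in ranked for tag in database_tags[i])
    return [tag for tag, _ in counts.most_common(num_tags)]
```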
- the first data including the database image feature amount data and the database tag vector is acquired.
- the second data including the query image feature amount data and the query tag vector is acquired.
- the similarity of the database image data with respect to the query image data is corrected. For example, the correction is performed by calculating the cosine similarity between the first data and the second data.
- one first data includes, for example, one database image feature amount data and a database tag vector corresponding to a database tag associated with the database image data corresponding to the database image feature amount data.
- the one second data can include the query image feature amount data and the same number of query tag vectors as the database tag vector included in the one first data.
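One plausible way to realize the first and second data and the corrected (second) similarity is to concatenate each image feature amount vector with its tag vectors and take the cosine similarity of the results; the publication does not fix the exact construction, so this is an assumption for illustration.

```python
import numpy as np

def combined_similarity(query_features, query_tag_vectors,
                        db_features, db_tag_vectors):
    """Form the 'second data' (query side) and 'first data' (database
    side) by concatenation, then compute their cosine similarity."""
    second = np.concatenate([query_features] + list(query_tag_vectors))
    first = np.concatenate([db_features] + list(db_tag_vectors))
    return float(np.dot(first, second) /
                 (np.linalg.norm(first) * np.linalg.norm(second)))
```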
- ranking data including information on the ranking of similarity after the correction is generated, and the search result is output to the outside of the image search system of one aspect of the present invention.
- a query tag is acquired by using a database tag.
- the acquisition method is a simpler method than, for example, a method of acquiring a query tag based on query image feature amount data. Therefore, the image search system according to one aspect of the present invention can perform a search in a short time.
- Compared with a method in which the user of the image search system specifies all the query tags, the method of acquiring the query tags from the database tags can comprehensively acquire tags representing, for example, the concept, technical content, and points of interest of the image corresponding to the query image data. Therefore, the image search system of one aspect of the present invention can perform a search easily and with high accuracy.
- FIG. 1 is a block diagram showing a configuration example of the image search system 10.
- In the block diagram, components are classified by function and shown as blocks independent of each other.
- a component may be involved in multiple functions.
- one function may be related to a plurality of components. For example, a plurality of processes performed by the processing unit 13 may be executed by different servers.
- the image search system 10 has at least a processing unit 13.
- the image retrieval system 10 shown in FIG. 1 further includes an input unit 11, a transmission line 12, a storage unit 15, a database 17, and an output unit 19.
- Image data and the like are supplied to the input unit 11 from the outside of the image search system 10.
- the image data or the like supplied to the input unit 11 is supplied to the processing unit 13, the storage unit 15, or the database 17 via the transmission line 12.
- the image data input to the input unit 11 is called query image data.
- the transmission line 12 has a function of transmitting image data and the like. Information can be transmitted and received between the input unit 11, the processing unit 13, the storage unit 15, the database 17, and the output unit 19 via the transmission line 12.
- the processing unit 13 has a function of performing calculations, inferences, and the like using image data and the like supplied from the input unit 11, the storage unit 15, the database 17, and the like.
- the processing unit 13 has a neural network, and can perform calculations, inferences, and the like using the neural network. In addition, the processing unit 13 can perform operations and the like without using a neural network.
- the processing unit 13 can supply the calculation result, the inference result, and the like to the storage unit 15, the database 17, the output unit 19, and the like.
- It is preferable to use a transistor having a metal oxide in its channel forming region for the processing unit 13. Since such a transistor has an extremely low off-state current, a long data retention period can be secured by using the transistor as a switch for holding the electric charge (data) that flows into a capacitive element functioning as a storage element.
- The processing unit 13 is operated only when necessary; at other times, the information of the immediately preceding processing is saved in the storage element, which makes it possible to turn off the processing unit 13. That is, normally-off computing becomes possible, and the power consumption of the image search system can be reduced.
- a metal oxide is a metal oxide in a broad sense. Metal oxides are classified into oxide insulators, oxide conductors (including transparent oxide conductors), oxide semiconductors (also referred to as Oxide Semiconductor or simply OS) and the like. For example, when a metal oxide is used in the semiconductor layer of a transistor, the metal oxide may be referred to as an oxide semiconductor. That is, when a metal oxide has at least one of an amplification action, a rectifying action, and a switching action, the metal oxide can be referred to as a metal oxide semiconductor, or OS for short.
- a transistor using an oxide semiconductor or a metal oxide in a channel forming region is referred to as an Oxide Semiconductor transistor or an OS transistor.
- the metal oxide contained in the channel forming region preferably contains indium (In).
- When the metal oxide contained in the channel forming region is a metal oxide containing indium, the carrier mobility (electron mobility) of the OS transistor is high.
- the metal oxide contained in the channel forming region is preferably an oxide semiconductor containing the element M.
- The element M is preferably aluminum (Al), gallium (Ga), tin (Sn), or the like. Other elements applicable to the element M include boron (B), silicon (Si), titanium (Ti), iron (Fe), nickel (Ni), germanium (Ge), yttrium (Y), and zirconium (Zr).
- the element M is, for example, an element having a high binding energy with oxygen.
- the metal oxide contained in the channel forming region is preferably a metal oxide containing zinc (Zn). Metal oxides containing zinc may be more likely to crystallize.
- the metal oxide contained in the channel forming region is not limited to the metal oxide containing indium.
- The semiconductor layer may be, for example, a metal oxide containing zinc, a metal oxide containing tin, or the like, such as zinc tin oxide or gallium tin oxide.
- the processing unit 13 has, for example, an arithmetic circuit, a central processing unit (CPU: Central Processing Unit), or the like.
- CPU Central Processing Unit
- the processing unit 13 may have a microprocessor such as a DSP (Digital Signal Processor) or a GPU (Graphics Processing Unit).
- the microprocessor may have a configuration realized by a PLD (Programmable Logic Device) such as FPGA (Field Programmable Gate Array) or FPAA (Field Programmable Analog Array).
- PLD Programmable Logic Device
- FPGA Field Programmable Gate Array
- FPAA Field Programmable Analog Array
- the processing unit 13 can perform various data processing and program control by interpreting and executing instructions from various programs by the processor.
- the program that can be executed by the processor is stored in at least one of the memory area of the processor and the storage unit 15.
- the processing unit 13 may have a main memory.
- the main memory has at least one of a volatile memory such as RAM (Random Access Memory) and a non-volatile memory such as ROM (Read Only Memory).
- RAM Random Access Memory
- ROM Read Only Memory
- As the RAM, for example, a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory) is used, and a memory space is virtually allocated and used as a work space of the processing unit 13.
- the operating system, application program, program module, program data, lookup table, and the like stored in the storage unit 15 are loaded into the RAM for execution. These data, the program, and the program module loaded in the RAM are each directly accessed and operated by the processing unit 13.
- the ROM can store a BIOS (Basic Input / Output System), firmware, and the like that do not require rewriting.
- BIOS Basic Input / Output System
- Examples of the ROM include a mask ROM, an OTPROM (One Time Programmable Read Only Memory), and an EPROM (Erasable Programmable Read Only Memory).
- Examples of the EPROM include a UV-EPROM (Ultra-Violet Erasable Programmable Read Only Memory), whose stored data can be erased by irradiation with ultraviolet rays, and an EEPROM (Electrically Erasable Programmable Read Only Memory).
- the storage unit 15 has a function of storing a program executed by the processing unit 13. Further, the storage unit 15 may have a function of storing the calculation result and the inference result generated by the processing unit 13, the image data input to the input unit 11, and the like.
- the storage unit 15 has at least one of a volatile memory and a non-volatile memory.
- the storage unit 15 may have, for example, a volatile memory such as a DRAM or SRAM.
- The storage unit 15 may have, for example, a non-volatile memory such as a ReRAM (Resistive Random Access Memory, also referred to as resistance-change memory), a PRAM (Phase-change Random Access Memory), a FeRAM (Ferroelectric Random Access Memory), or a flash memory.
- the storage unit 15 may have a recording media drive such as a hard disk drive (Hard Disk Drive: HDD) and a solid state drive (Solid State Drive: SSD).
- the database 17 has a function of storing image data to be searched. As described above, the image data stored in the database is called database image data. Further, the database 17 has a function of storing the calculation result and the inference result generated by the processing unit 13. Further, it may have a function of storing image data or the like input to the input unit 11. The storage unit 15 and the database 17 do not have to be separated from each other. For example, the image retrieval system 10 may have a storage unit having both functions of the storage unit 15 and the database 17.
- the output unit 19 has a function of supplying information to the outside of the image search system 10. For example, the calculation result or the inference result in the processing unit 13 can be supplied to the outside.
- FIG. 2 is a flowchart showing an example of the processing method.
- the database image data GD DB is input to the processing unit 13 from the database 17 via the transmission line 12.
- The database image data GD DB can be, for example, data representing a drawing included in information on intellectual property.
- examples of the information on intellectual property include publications such as patent documents (public patent gazettes, patent gazettes, etc.), utility model gazettes, design gazettes, and papers. Not limited to domestically published publications, publications published around the world can be used as information on intellectual property.
- Information on intellectual property is not limited to publications.
- various files such as image files independently owned by the user or the organization that uses the image search system can also be used as the database image data GD DB .
- a drawing or the like explaining an invention, a device, or a design can be mentioned.
- The database image data GD DB may include, for example, data representing a drawing described in a patent document of a specific applicant, or data representing a drawing described in a patent document of a specific technical field.
- The image search system 10 has a function of searching for database image data GD DB similar to the query image data. Therefore, by using the image search system 10, for example, patent documents, papers, or industrial products similar to an invention before filing can be searched. This makes it possible to conduct a prior art search for the invention before filing the application. By grasping and reexamining the related prior art, it is possible to strengthen the invention and make it a strong patent that is difficult for other companies to avoid.
- By using the image search system 10, it is possible to search for patent documents, papers, or industrial products similar to an industrial product before its release.
- the database image data GD DB has data corresponding to the image described in the company's patent document, it can be confirmed whether the technology related to the industrial product before the release has been sufficiently applied for a patent in the company.
- the database image data GD DB has data corresponding to the images described in the patent documents of other companies, it can be confirmed whether the industrial products before the release infringe the intellectual property rights of the other companies. ..
- By grasping and reexamining the related prior art it is possible to discover new inventions and make them strong patented inventions that contribute to the company's business.
- not only the industrial products before the release but also the industrial products after the release may be searched.
- the image search system 10 can be used to search for patent documents, papers, or industrial products similar to a specific patent. In particular, by searching based on the filing date of the patent, it is possible to easily and accurately check whether the patent contains grounds for invalidation.
- Step S02 Next, the database image data GD DB is input to the neural network of the processing unit 13.
- FIG. 3A is a diagram showing a configuration example of a neural network 30 which is a neural network included in the processing unit 13.
- the neural network 30 has layers 31 [1] to 31 [m] (m is an integer of 1 or more).
- Layers 31 [1] to 31 [m] have neurons, and neurons provided in each layer are connected to each other.
- the neurons provided in layer 31 [1] are connected to the neurons provided in layer 31 [2].
- the neurons provided in the layer 31 [2] are connected to the neurons provided in the layer 31 [1] and the neurons provided in the layer 31 [3]. That is, a hierarchical neural network is composed of layers 31 [1] to 31 [m].
- the database image data GD DB is input to the layer 31 [1], and the layer 31 [1] outputs the data corresponding to the input image data.
- the data is input to the layer 31 [2], and the layer 31 [2] outputs the data corresponding to the input data.
- the data output from the layer 31 [m-1] is input to the layer 31 [m], and the layer 31 [m] outputs the data corresponding to the input data.
- for example, the layer 31 [1] can be an input layer, the layers 31 [2] to 31 [m-1] can be intermediate layers, and the layer 31 [m] can be an output layer.
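- the flow of data through the layers 31 [1] to 31 [m] can be sketched as follows; the layer sizes, the random weights, and the use of NumPy with a ReLU activation are illustrative assumptions, not details stated above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy weights for a 3-layer hierarchy: a 9-value input (e.g. a 3x3 image),
# one intermediate layer of 6 neurons, and an output layer of 4 neurons.
weights = [rng.normal(size=(9, 6)), rng.normal(size=(6, 4))]

def forward(image_data):
    x = image_data
    for W in weights:
        # each layer outputs data corresponding to its input: a
        # product-sum with the connection weights followed by ReLU
        x = np.maximum(0.0, x @ W)
    return x  # data output by the final layer 31[m]

features = forward(np.ones(9))
print(features.shape)  # (4,)
```

after learning, the data output by the final layer in such a chain is what the text above treats as the feature amount of the input image data.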
- the data output from the layers 31 [1] to 31 [m] is learned in advance so as to represent the feature amount of the image data input to the neural network 30.
- Learning can be performed by unsupervised learning, supervised learning, or the like.
- unsupervised learning is preferable because teacher data (also referred to as a correct label) is not required.
- an error back propagation method or the like can be used as the learning algorithm.
- the neural network 30 can perform learning by using all of the database image data GD DB stored in the database 17 as training data.
- the neural network 30 can perform learning by using a part of the database image data GD DB as training data.
- alternatively, the neural network 30 can perform learning by using, as training data, image data stored in the storage unit 15 or image data input to the processing unit 13 from the outside of the image search system 10 via the input unit 11.
- the neural network 30 can perform learning by using only the image data input from the outside of the image search system 10 to the processing unit 13 via the input unit 11 as the learning data.
- the neural network 30 can be a convolutional neural network (CNN: Convolutional Neural Network).
- FIG. 3B is a diagram showing a configuration example of the neural network 30 when CNN is applied as the neural network 30.
- the neural network 30 to which CNN is applied is referred to as a neural network 30a.
- the neural network 30a has a convolution layer CL, a pooling layer PL, and a fully connected layer FCL.
- FIG. 3B shows an example in which the neural network 30a has m convolution layers CL and m pooling layers PL (m is an integer of 1 or more), and one fully connected layer FCL.
- the neural network 30a may have two or more fully connected layers FCL.
- the convolution layer CL has a function of convolving the data input to the convolution layer CL.
- the convolution layer CL [1] has a function of convolving the image data input to the processing unit 13.
- the convolution layer CL [2] has a function of convolving the data output from the pooling layer PL [1].
- the convolution layer CL [m] has a function of convolving the data output from the pooling layer PL [m-1].
- the convolution is performed by repeating the product-sum operation of the data input to the convolution layer CL and the weight filter.
- by the convolution in the convolution layer CL, the features of the image corresponding to the image data input to the neural network 30a are extracted.
- the convolved data is converted by the activation function and then output to the pooling layer PL.
- as the activation function, ReLU (Rectified Linear Unit) or the like can be used.
- ReLU is a function that outputs "0" when the input value is negative, and outputs the input value as it is when the input value is "0" or more.
- a sigmoid function, a tanh function, or the like can also be used as the activation function.
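- the activation functions named above can be sketched as follows; the use of NumPy and the sample input values are illustrative assumptions:

```python
import numpy as np

def relu(x):
    # ReLU outputs 0 when the input value is negative, and outputs
    # the input value as it is when the input value is 0 or more.
    return np.maximum(0.0, x)

def sigmoid(x):
    # the sigmoid function squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))  # negative values are clipped to 0
```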
- the pooling layer PL has a function of pooling the data input from the convolution layer CL. Pooling is a process of dividing data into a plurality of regions, extracting predetermined data for each region, and arranging the data in a matrix. By pooling, the amount of data can be reduced while retaining the features extracted by the convolution layer CL. In addition, it is possible to enhance the robustness against minute deviations of the input data. As the pooling, maximum pooling, average pooling, Lp pooling and the like can be used.
- the fully connected layer FCL has a function of determining an image using the data output from the pooling layer PL [m].
- the fully connected layer FCL has a configuration in which all the nodes of one layer are connected to all the nodes of the next layer.
- the data output from the convolution layer CL or the pooling layer PL is a two-dimensional feature map, and when input to the fully connected layer FCL, it is expanded in one dimension. Then, the vector obtained by the inference by the fully connected layer FCL is output from the fully connected layer FCL.
- the configuration of the neural network 30a is not limited to the configuration of FIG. 3B.
- one pooling layer PL may be provided for every plurality of convolution layers CL. That is, the number of pooling layers PL included in the neural network 30a may be less than the number of convolution layers CL. Further, when it is desired to retain the position information of the extracted features as much as possible, the pooling layer PL may be omitted.
- by performing learning, the neural network 30a can optimize the filter values of the weight filter, the weighting coefficients of the fully connected layer FCL, and the like.
- it is assumed that the data input to the convolution layer CL consists of input data values in 3 rows and 3 columns (input data value i11, input data value i12, input data value i13, input data value i21, input data value i22, input data value i23, input data value i31, input data value i32, and input data value i33). Further, it is assumed that the weight filter has filter values in 2 rows and 2 columns (filter value f11, filter value f12, filter value f21, filter value f22).
- the data input to the convolution layer CL [1] can be image data.
- the input data value can be a pixel value included in the image data.
- the pixel value indicates a value representing the gradation of the brightness of the light emitted by the pixel.
- for example, when the pixel value is an 8-bit value, the pixel can emit light with 256 gradations of brightness.
- the image data can be said to include a set of pixel values, and can include, for example, the same number of pixel values as pixels. For example, when the number of pixels of an image is 2 × 2, it can be said that the image data representing the image includes 2 × 2 pixel values.
- the input data values input to the convolution layer CL [2] can be the output values of the pooling layer PL [1], and the input data values input to the convolution layer CL [m] can be the output values of the pooling layer PL [m-1].
- the convolution is performed by the product-sum operation of the input data value and the filter value.
- the filter value can be data indicating a predetermined feature (referred to as feature data).
- FIG. 4 shows how the convolution layer CL acquires the convolution value C11 included in the data output from the convolution layer CL by filtering the input data value i11, the input data value i12, the input data value i21, and the input data value i22; the convolution value C12 by filtering the input data value i12, the input data value i13, the input data value i22, and the input data value i23; the convolution value C21 by filtering the input data value i21, the input data value i22, the input data value i31, and the input data value i32; and the convolution value C22 by filtering the input data value i22, the input data value i23, the input data value i32, and the input data value i33. From the above, it can be said that the stride of the convolution process shown in FIG. 4 is 1.
- the convolution value C11, the convolution value C12, the convolution value C21, and the convolution value C22 can be obtained by the following product-sum operations, respectively:
  C11 = i11 × f11 + i12 × f12 + i21 × f21 + i22 × f22
  C12 = i12 × f11 + i13 × f12 + i22 × f21 + i23 × f22
  C21 = i21 × f11 + i22 × f12 + i31 × f21 + i32 × f22
  C22 = i22 × f11 + i23 × f12 + i32 × f21 + i33 × f22
- the convolution value C11, the convolution value C12, the convolution value C21, and the convolution value C22 acquired by the convolution layer CL are arranged in a matrix according to their addresses and then output to the pooling layer PL. Specifically, the convolution value C11 is arranged in the first row, first column, the convolution value C12 in the first row, second column, the convolution value C21 in the second row, first column, and the convolution value C22 in the second row, second column.
- FIG. 4 shows a state in which a convolution value C11, a convolution value C12, a convolution value C21, and a convolution value C22 are input to the pooling layer PL, and one value is set as the pooling value P based on the four convolution values.
- the maximum value among the convolution value C11, the convolution value C12, the convolution value C21, and the convolution value C22 can be set as the pooling value P.
- the average value of the convolution value C11, the convolution value C12, the convolution value C21, and the convolution value C22 can be set as the pooling value P.
- the pooling value P is an output value output from the pooling layer PL.
- FIG. 4 shows an example in which the data input to the convolution layer CL is processed by one weight filter, but it may be processed by two or more weight filters. In this case, a plurality of features included in the image data input to the neural network 30a can be extracted.
- the processing shown in FIG. 4 is performed for each filter. Further, as described above, the stride is set to 1 in FIG. 4, but the stride may be set to 2 or more.
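- the stride-1 convolution and pooling described above can be sketched as follows; the input data values and filter values are arbitrary placeholders, and the use of NumPy is an implementation assumption:

```python
import numpy as np

# Input data values i11..i33 (3 rows, 3 columns) and a weight filter
# f11..f22 (2 rows, 2 columns); the numbers are arbitrary placeholders.
i = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
f = np.array([[1., 0.],
              [0., 1.]])

# Stride-1 convolution: slide the 2x2 filter over the 3x3 input and take
# a product-sum at each position, giving convolution values C11..C22.
C = np.zeros((2, 2))
for r in range(2):
    for c in range(2):
        C[r, c] = np.sum(i[r:r + 2, c:c + 2] * f)

# Max pooling reduces the four convolution values to one pooling value P
# while retaining the largest extracted feature.
P = C.max()
print(C)  # C11 = i11*f11 + i12*f12 + i21*f21 + i22*f22, etc.
print(P)
```

with two or more weight filters, the same loop is simply repeated per filter, yielding one such matrix of convolution values for each filter.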
- FIG. 5 is a diagram showing a configuration example of the convolution layer CL and the pooling layer PL included in the neural network 30a.
- FIG. 5 shows an example in which the convolution layer CL and the pooling layer PL perform the operations shown in FIG.
- FIG. 5 shows the neuron 32. Specifically, as the neuron 32, the neuron 32a, the neuron 32b, and the neuron 32c are shown.
- the value output from the neuron 32 is described inside the neuron 32. The value is output in the direction of the arrow.
- the weighting coefficient is described in the vicinity of the arrow.
- the filter value f11, the filter value f12, the filter value f21, and the filter value f22 are used as weighting coefficients.
- the neuron 32a is a neuron 32 included in the layer L, which is the layer preceding the convolution layer CL shown in FIG. 5.
- the layer L can be, for example, the input layer when the convolution layer CL shown in FIG. 5 is the convolution layer CL [1], the pooling layer PL [1] when it is the convolution layer CL [2], and the pooling layer PL [m-1] when it is the convolution layer CL [m].
- neurons 32a [1] to 32a [9] are shown as neurons 32a.
- the neuron 32a [1] outputs the input data value i11, the neuron 32a [2] outputs the input data value i12, the neuron 32a [3] outputs the input data value i13, the neuron 32a [4] outputs the input data value i21, the neuron 32a [5] outputs the input data value i22, the neuron 32a [6] outputs the input data value i23, the neuron 32a [7] outputs the input data value i31, the neuron 32a [8] outputs the input data value i32, and the neuron 32a [9] outputs the input data value i33.
- the neuron 32b is a neuron 32 included in the convolution layer CL shown in FIG. 5. In FIG. 5, neurons 32b [1] to 32b [4] are shown as the neurons 32b.
- to the neuron 32b [1] are input the value obtained by multiplying the input data value i11 by the filter value f11, the value obtained by multiplying the input data value i12 by the filter value f12, the value obtained by multiplying the input data value i21 by the filter value f21, and the value obtained by multiplying the input data value i22 by the filter value f22. Then, the convolution value C11, which is the sum of these values, is output from the neuron 32b [1].
- the neuron 32b [4] has a value obtained by multiplying the input data value i22 by the filter value f11, a value obtained by multiplying the input data value i23 by the filter value f12, and a value obtained by multiplying the input data value i32 by the filter value f21. And the value obtained by multiplying the input data value i33 by the filter value f22 are input. Then, the convolution value C22, which is the sum of these values, is output from the neuron 32b [4].
- each of the neurons 32b [1] to 32b [4] is connected to a part of the neurons 32a [1] to 32a [9]. Therefore, it can be said that the convolution layer CL is a partially connected layer.
- the neuron 32c is a neuron 32 included in the pooling layer PL shown in FIG. 5.
- the convolution value C11, the convolution value C12, the convolution value C21, and the convolution value C22 are input to the neuron 32c.
- the pooling value P is output from the neuron 32c.
- the convolution values output from the neurons 32b are input to the neuron 32c without being multiplied by weighting coefficients.
- the weighting coefficient is a parameter optimized by the learning of the neural network. Therefore, the pooling layer PL can be configured so that the parameters it uses in its calculation include no parameters optimized by learning.
- the processing unit 13 can acquire the database image feature amount data GFD DB representing the feature amount of the database image data GD DB .
- the data output from the layer 31 [m] can be used as the database image feature data GFD DB .
- the data output from the pooling layer PL [m] can be used as the database image feature data GFD DB .
- the database image feature data GFD DB may include output data of two or more layers. When the database image feature data GFD DB includes the output data of more layers, it can more accurately represent the features of the database image data GD DB .
- the database image feature amount data GFD DB acquired by the processing unit 13 can be stored in the database 17.
- FIG. 6A is a diagram showing an example of a method of acquiring the database tag TAG DB .
- the illustration of each data shown in FIG. 6A is an example, and the present invention is not limited to this. Further, the illustration of each data, vector, etc. shown in other figures is also an example, and is not limited to the contents shown.
- a tag is associated with each of the database image data GD DB [1] and the database image data GD DB [100]. Further, it is assumed that the document data TD DB corresponding to the database image data GD DB is stored in the database 17 in advance. Further, it is assumed that the figure number is associated with the database image data GD DB .
- the document data TD DB can be data corresponding to documents described in publications such as patent documents, utility model publications, design publications, and papers in which drawings represented by database image data GD DB are published.
- the data corresponding to the specification can be the document data TD DB .
- the data corresponding to the claims, the utility model registration claims, or the abstract can be used as the document data TD DB .
- when the publication in which the database image data GD DB is published is a design gazette, the data corresponding to the application can be used as the document data TD DB .
- the database tag TAG DB can be acquired by performing morphological analysis on a paragraph explaining the drawing represented by the database image data GD DB .
- in FIG. 6A, the figure number of the image corresponding to the database image data GD DB [1] is "FIG. 1", and an example is shown in which "FIG. 1" is described in the paragraph [0xx0] of the document represented by the document data TD DB [1] associated with the database image data GD DB [1]. In this case, morphological analysis is performed on the sentences described in the paragraph [0xx0].
- the database tag TAG DB [1] can be acquired.
- similarly, the figure number of the image corresponding to the database image data GD DB [100] is "FIG. 15", and an example is shown in which "FIG. 15" is described in the paragraph [0xx7] of the document represented by the document data TD DB [100] associated with the database image data GD DB [100].
- by morphological analysis, a sentence written in natural language can be divided into morphemes (the smallest units having meaning in a language), and the part of speech of each morpheme can be discriminated.
- the database tag TAG DB [1] can be acquired.
- words such as “circuit diagram”, “aaa”, “bbbb”, “ccc”, and “ddd” are assumed to be the database tag TAG DB [1].
- words such as “block diagram”, “ggg”, “aaa”, “ccc”, and "hhh” are the database tag TAG DB [100].
- the database tag TAG DB can be obtained, for example, by performing morphological analysis on the document data TD DB associated with the database image data GD DB .
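- the tag acquisition described above can be sketched as follows; a real system would use a morphological analyzer with part-of-speech filtering, so the regex tokenizer, the toy stopword list, and the sample sentence below are all stand-in assumptions:

```python
import re

# Toy stopword list standing in for part-of-speech filtering:
# only content words survive as tag candidates.
STOPWORDS = {"fig", "is", "a", "the", "of", "and"}

def extract_tags(sentence):
    # Stand-in for morphological analysis: split the figure-description
    # sentence into lowercase word tokens, then drop function words.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS]

sentence = "FIG. 1 is a circuit diagram of the pixel and the driver."
print(extract_tags(sentence))  # ['circuit', 'diagram', 'pixel', 'driver']
```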
- by acquiring the database tag TAG DB by this method, it is possible to comprehensively acquire tags representing the concept, technical content, points of interest, etc. of the image corresponding to the database image data GD DB .
- one tag means, for example, one word.
- the number of database tags TAG DB [1] can be 5 or more.
- the number of database tags TAG DB [100] can be 5 or more.
- alternatively, a predetermined number of words can be selected from the extracted words and used as the database tags TAG DB .
- for example, a predetermined number of words having a high TF-IDF (Term Frequency-Inverse Document Frequency) value can be selected.
- TF-IDF is calculated based on two indexes: term frequency (TF) and inverse document frequency (IDF). Words that appear often throughout the entire document have a high TF but a low IDF. Therefore, such words have a lower TF-IDF than words that appear frequently in the paragraph or the like from which candidate words for the database tag TAG DB are extracted but appear less frequently in other paragraphs or the like. Words that appear often throughout a document may not be words that strongly represent image features such as the concept, technical content, and points of interest.
- the image search system 10 can perform a search with high accuracy.
- the database tag TAG DB may be acquired using only TF, for example, without calculating TF-IDF. In this case, the calculation performed by the processing unit 13 can be simplified.
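- the TF-IDF ranking described above can be sketched as follows; the word lists are toy placeholders for morphologically analyzed paragraphs, and the plain log-based weighting is one common formulation, assumed here for illustration:

```python
import math
from collections import Counter

def tfidf_scores(target_words, all_paragraphs):
    # TF: how often each word appears in the target paragraph.
    # IDF: log(number of paragraphs / paragraphs containing the word),
    # so a word appearing in every paragraph scores 0.
    tf = Counter(target_words)
    n = len(all_paragraphs)
    scores = {}
    for w in set(target_words):
        df = sum(1 for p in all_paragraphs if w in p)
        scores[w] = tf[w] * math.log(n / df)
    return scores

# Toy word lists standing in for morphologically analyzed paragraphs.
paragraphs = [
    ["circuit", "diagram", "transistor", "the", "the"],
    ["the", "display", "panel"],
    ["the", "transistor", "layout"],
]
scores = tfidf_scores(paragraphs[0], paragraphs)
# Keep the predetermined number (here 2) of highest-scoring words as tags.
top = sorted(scores, key=scores.get, reverse=True)[:2]
print(sorted(top))  # ['circuit', 'diagram']
```

"the" appears in every paragraph, so its IDF (hence TF-IDF) is 0 despite its high TF; using only TF, as the simplification above allows, would instead rank it first.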
- morphological analysis may be performed on sentences of two or more paragraphs. For example, morphological analysis may be performed on the paragraphs before and after the paragraph in which the description of the drawing represented by the database image data GD DB is considered to appear. For example, when performing morphological analysis on the document data TD DB [1] shown in FIG. 6A, morphological analysis may also be performed on the paragraph [0xx1], which is the paragraph following the paragraph [0xx0]. In this case, for example, the word "eee" described in the paragraph [0xx1] can be used as the database tag TAG DB [1]. Similarly, when performing morphological analysis on the document data TD DB [100] shown in FIG. 6A, morphological analysis may also be performed on the paragraph [0xx6], which is the paragraph preceding the paragraph [0xx7]. In this case, for example, the word "fff" described in the paragraph [0xx6] can be used as the database tag TAG DB [100].
- morphological analysis may be performed on all the paragraphs in which the figure numbers associated with the database image data GD DB are described.
- morphological analysis may be performed on a paragraph in which the figure number associated with the database image data GD DB is described and other figure numbers are not described.
- the morphological analysis may be performed only on a part of the sentences included in the sentences described in the predetermined paragraph. For example, in the case shown in FIG. 6A, the morphological analysis may be performed only on the sentences including "FIG. 1" among the sentences described in the paragraph [0xx0]. In this case, the word "ddd” does not become the database tag TAG DB [1].
- synonyms of the word may be used as the database tag TAG DB .
- for example, synonym dictionary data can be stored in the storage unit 15 or the database 17 in advance, and both the words extracted by morphological analysis and the words registered in the synonym dictionary as their synonyms can be used as the database tags TAG DB .
- for the synonym dictionary, a generally available synonym dictionary may be used, or synonyms extracted by using distributed representations of words may be used. Further, the extraction of synonyms using distributed representations may be performed using a database including other documents in the field to which the documents to be searched belong.
- by also using synonyms, the database tags TAG DB can strongly represent features of the database image data GD DB such as the concept, technical content, and points of interest.
- the database tag TAG DB may be acquired without using morphological analysis.
- the database tag TAG DB may be acquired based on the database image feature amount data GFD DB .
- FIG. 6B is a diagram showing an example of a method of associating a figure number with the database image data GD DB .
- the publication data PD includes the image data GD DB [1], the image data GD DB [2], and the document data TD DB .
- the publication represented by the publication data PD includes the text "FIG. 1 xxx” and the text "FIG. 2 yyy”. It is assumed that the data representing the text "FIG. 1 xxx” and the data representing the text "FIG. 2 yyy" are not included in the document data TD DB .
- "x1", “x2", “x1 ⁇ x2”, broken lines, arrows, etc. shown in FIG. 6B are attached for convenience of explanation, and are actually publications represented by publication data PD. It shall not be described in.
- for example, the figure number of the drawing provided at the closest distance to the text "FIG. N" can be set to "N".
- the distance between the coordinates representing the center of the text (center coordinates) and the center coordinates of the drawing can be set as the distance from the text to the drawing.
- N is not limited to an integer and may include, for example, a character.
- N may be "1 (A)".
- here, the distance x1 between the center coordinates of the text "FIG. 1 xxx" and the center coordinates of the drawing corresponding to the database image data GD DB [1] is shorter than the distance x2 between the center coordinates of the text "FIG. 1 xxx" and the center coordinates of the drawing corresponding to the database image data GD DB [2]. Therefore, the drawing provided at the closest distance to the text "FIG. 1 xxx" is the drawing corresponding to the database image data GD DB [1], and the figure number associated with the database image data GD DB [1] can be set to "1".
- in FIG. 6B, an example is shown in which "FIG. 1 is" is described in the paragraph [0zz3] of the document represented by the document data TD DB , and "FIG. 2 is" is described in the paragraph [0zz4].
- in this case, the database tag TAG DB [1] associated with the database image data GD DB [1] can be acquired, for example, by performing morphological analysis on the sentences described in the paragraph [0zz3].
- words such as “block diagram”, “iii”, “kkk”, “hhh”, and “ppp” described in paragraph [0zz3] are assumed to be the database tag TAG DB [1]. ..
- alternatively, the center coordinates of all the drawings may be arranged into a first one-dimensional array, and the center coordinates of all the texts "FIG. N" into a second one-dimensional array. Then, the coordinates included in the first one-dimensional array and the coordinates included in the second one-dimensional array can be compared, and each drawing can be linked to the text "FIG. N" located at the closest coordinates. That is, the figure number of the drawing located at the coordinates closest to the coordinates representing the position of the text "FIG. N" can be set to "N".
- the comparison between the coordinates included in the first one-dimensional array and the coordinates included in the second one-dimensional array can be performed, for example, by calculating the sum of the square of the difference of the x-coordinates and the square of the difference of the y-coordinates; the pair of elements with the smallest sum are the elements located at the closest coordinates.
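- the nearest-coordinate association described above can be sketched as follows; the drawing names, the "FIG. N" texts, and the center coordinates are hypothetical placeholders:

```python
# Each drawing and each "FIG. N" text is reduced to its center
# coordinates; a figure number is assigned to the drawing whose center
# is nearest, using the sum of squared coordinate differences.
drawings = {"GD_DB[1]": (100, 200), "GD_DB[2]": (400, 200)}
texts = {"FIG. 1": (120, 300), "FIG. 2": (390, 310)}

def nearest_drawing(text_xy, drawings):
    # distance = (x difference)^2 + (y difference)^2; smallest sum wins
    return min(drawings,
               key=lambda d: (drawings[d][0] - text_xy[0]) ** 2
                           + (drawings[d][1] - text_xy[1]) ** 2)

assignment = {t: nearest_drawing(xy, drawings) for t, xy in texts.items()}
print(assignment)  # {'FIG. 1': 'GD_DB[1]', 'FIG. 2': 'GD_DB[2]'}
```

the square root is unnecessary for the comparison, since the element with the smallest sum of squares is also the element at the smallest distance.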
- the processing unit 13 can acquire the database tag TAG DB .
- the database tag TAG DB acquired by the processing unit 13 can be stored in the database 17.
- FIG. 7A is a diagram showing how the database tag TAG DB shown in FIG. 6A is represented by a vector.
- the database tag vector TAGV DB can be acquired by inputting the database tag TAG DB into, for example, the neural network of the processing unit 13.
- the database tag vector TAGV DB can be, for example, a distributed representation vector.
- the distributed representation vector is a vector in which words are represented by quantified continuous values for each feature element (dimension). Words with similar meanings have similar vectors.
- the neural network used to acquire the distributed representation vector can have a different configuration from the neural network used to acquire the image feature amount data described above.
- FIG. 7B is a diagram showing a configuration example of a neural network 40, which is a neural network used to acquire a distributed representation vector.
- in this specification and the like, the neural network used for acquiring image feature amount data may be referred to as a first neural network, and the neural network used for acquiring a distributed representation vector may be referred to as a second neural network.
- these ordinal numbers are merely examples; the neural network used to acquire the distributed representation vector may instead be called the first neural network and the neural network used to acquire the image feature amount data the second neural network, or either neural network may be called a third neural network or the like.
- the neural network 40 has an input layer IL, an intermediate layer ML, and an output layer OL.
- the neural network 40 can be configured to have one intermediate layer ML.
- the neural network 40 can acquire a distributed representation vector representing a word input to the input layer IL by using, for example, Word2Vec, which is an open source algorithm.
- an example of a method in which the neural network 40 having the configuration shown in FIG. 7B acquires the database tag vector TAGV DB representing the database tag TAG DB input to the input layer IL will be described.
- a vector representing the database tag TAG DB as a one-hot vector is input to the input layer IL.
- in the one-hot vector, one component represents one word; the component corresponding to the word input to the input layer IL can be 1, and the other components can be 0. That is, the one-hot vector is a vector in which one component is 1 and all other components are 0.
- the number of neurons contained in the input layer IL can be the same as the number of components constituting the one-hot vector.
- the intermediate layer ML has a function of generating a distributed representation vector based on the one-hot vector input to the input layer IL.
- the intermediate layer ML can generate a distributed representation vector by multiplying the one-hot vector by a predetermined weight. Since the weight can be represented by a matrix, the neural network 40 can generate a distributed representation vector by performing a product-sum operation between the one-hot vector and the weight matrix.
- the number of neurons in the intermediate layer ML can be the same as the number of dimensions of the distributed representation vector. For example, when the number of dimensions of the distributed representation vector is 300, the intermediate layer ML can be configured to have 300 neurons.
- the weight matrix can be obtained by learning, for example, supervised learning. Specifically, a word represented by a one-hot vector is input to the input layer IL, and peripheral words of that word are given to the output layer OL. Here, for each word input to the input layer IL, a plurality of peripheral words are given to the output layer OL. Then, the values of the weight matrix of the neural network 40 are adjusted so that the output layer OL can output the probability that each word is a peripheral word of the word input to the input layer IL. For example, one neuron of the output layer OL corresponds to one word.
- the above is an example of the learning method of the neural network 40.
- one neuron can correspond to one word. Therefore, the number of neurons in the input layer IL and the number of neurons in the output layer OL can be the same.
- the number of neurons in the intermediate layer ML can be smaller than the number of neurons in the input layer IL.
- for example, the number of words that can be processed by the neural network 40, that is, the number of neurons in the input layer IL, can be 10,000, while the number of dimensions of the distributed representation vector, that is, the number of neurons in the intermediate layer ML, can be 300. In the distributed representation, the number of dimensions can be kept small even when the number of expressible words increases, so the amount of calculation is unlikely to grow. Therefore, the image search system 10 can perform a search in a short time.
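- the one-hot-to-distributed-representation conversion described above can be sketched as follows; the toy vocabulary, the 3 embedding dimensions, and the random weight matrix standing in for a trained one are all illustrative assumptions:

```python
import numpy as np

# Toy vocabulary and a trained-weight stand-in: the weight matrix of the
# intermediate layer has one row per vocabulary word and one column per
# embedding dimension (random values here for illustration).
vocab = ["circuit", "diagram", "transistor", "display"]
rng = np.random.default_rng(0)
W = rng.normal(size=(len(vocab), 3))

def one_hot(word):
    # one component is 1 and all other components are 0
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

# The product-sum of the one-hot vector and the weight matrix simply
# selects one row of W; that row is the word's distributed
# representation vector.
vec = one_hot("diagram") @ W
print(np.allclose(vec, W[1]))  # True
```

this is why the embedding stays compact: growing the vocabulary adds rows to W but leaves the vector dimension (here 3, 300 in the example above) unchanged.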
- the processing unit 13 can acquire the database tag vector TAGV DB .
- the database tag vector TAGV DB acquired by the processing unit 13 can be stored in the database 17.
- the processing unit 13 acquires the database image feature amount data GFD DB , the database tag TAG DB , and the database tag vector TAGV DB , and stores them in the database 17.
- the image search system 10 can search a database image similar to the query image.
- the database tag TAG DB does not have to be stored in the database 17.
- after the processing unit 13 acquires the database tag TAG DB and the database tag vector TAGV DB in steps S03 and S04, the database image feature amount data GFD DB may be acquired.
- in the above, the vector itself output from the neural network 40 when the database tag TAG DB is input to the neural network 40 is used as the database tag vector TAGV DB , but one aspect of the present invention is not limited to this.
- a modified example of the acquisition method of the database tag vector TAGV DB will be described.
- the processing unit 13 acquires a word that is a candidate for the database tag TAG DB .
- Words that are candidates for the database tag TAG DB can be obtained by morphological analysis, for example, as shown in FIGS. 6A and 6B.
- the acquired words are represented by vectors.
- by inputting each acquired word into the neural network 40, the word can be represented by a distributed representation vector.
- a predetermined number of clusters are generated by performing clustering on the distributed representation vectors. For example, as many clusters are generated as the number of database tags TAG DB to be acquired.
- Clustering can be performed by the K-means method, the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method, or the like.
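A minimal K-means implementation (an illustrative sketch, not code from the patent) makes the clustering step concrete; the 2-D points below stand in for the distributed representation vectors:

```python
import numpy as np

def kmeans(vectors, k, iters=20, seed=0):
    """Minimal K-means: returns cluster labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid
        dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned vectors
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = vectors[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated groups of 2-D word vectors
vecs = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                 [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
labels, cents = kmeans(vecs, k=2)
assert labels[0] == labels[1] == labels[2]
assert labels[3] == labels[4] == labels[5]
assert labels[0] != labels[3]
```

In practice a library implementation (for example scikit-learn's KMeans, or a DBSCAN implementation) would be used instead of this hand-rolled loop.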
- FIG. 8A shows an example in which 20 words are acquired by the processing unit 13 as candidates for tags associated with the database image data GD DB [1], and each of these words is represented by a database word vector WORDV DB . Further, FIG. 8A shows an example of generating 5 clusters (cluster CST1, cluster CST2, cluster CST3, cluster CST4, and cluster CST5) based on the 20 database word vectors WORDV DB .
- the vector shown in FIG. 8A is a two-dimensional vector, and the horizontal axis direction represents one component of the two-dimensional vector and the vertical axis direction represents the other component of the two-dimensional vector.
- the database word vector WORDV DB or the like can be, for example, a 300-dimensional vector.
- for each cluster, a vector representing a representative point is obtained.
- the vector representing the representative point can be used as the database tag vector TAGV DB [1].
- FIG. 8A shows an example in which the vector representing the representative point of the cluster CST1 is the database tag vector TAGV1 DB [1], the vector representing the representative point of the cluster CST2 is the database tag vector TAGV2 DB [1], the vector representing the representative point of the cluster CST3 is the database tag vector TAGV3 DB [1], the vector representing the representative point of the cluster CST4 is the database tag vector TAGV4 DB [1], and the vector representing the representative point of the cluster CST5 is the database tag vector TAGV5 DB [1].
- each component of the vector representing the representative point can be, for example, the average value of the corresponding components of the database word vectors WORDV DB included in the cluster. For example, when a cluster includes the five database word vectors WORDV DB (0.1, 0.7), (0.2, 0.5), (0.3, 0.5), (0.4, 0.2), and (0.5, 0.1), the vector representing the representative point of the cluster can be, for example, (0.3, 0.4).
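The averaging in this example can be checked with a short NumPy snippet (illustrative only):

```python
import numpy as np

# The five database word vectors WORDV_DB in the cluster from the example above
cluster = np.array([[0.1, 0.7], [0.2, 0.5], [0.3, 0.5],
                    [0.4, 0.2], [0.5, 0.1]])

# Representative point: component-wise average of the cluster members
representative = cluster.mean(axis=0)
assert np.allclose(representative, [0.3, 0.4])
print(representative)
```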
- the processing unit 13 can acquire the database tag vector TAGV DB [1].
- the database tag vector TAGV DB [2] and later can also be obtained by the same method.
- FIG. 8B is a table showing the components of each vector when five database tag vectors TAGV DB (database tag vector TAGV1 DB , database tag vector TAGV2 DB , database tag vector TAGV3 DB , database tag vector TAGV4 DB , and database tag vector TAGV5 DB ) are acquired for each of the database image data GD DB [1] to the database image data GD DB [100].
- the components shown in FIG. 8B are examples for convenience of explanation.
- the database tag vector TAGV DB can be weighted.
- the weight can be, for example, a value obtained by dividing the number of database word vectors WORDV DB included in one cluster by the total number of words acquired by the processing unit 13 as candidates for tags associated with the database image data GD DB .
- FIGS. 8A and 8B show an example in which the processing unit 13 acquires 20 words as tag candidates associated with the database image data GD DB [1].
- further, the cluster CST1 contains eight database word vectors WORDV DB
- the cluster CST2 contains four database word vectors WORDV DB
- the cluster CST3 contains two database word vectors WORDV DB
- the cluster CST4 contains three database word vectors WORDV DB
- and the cluster CST5 contains three database word vectors WORDV DB . Therefore, as shown in FIG. 8B, for example, for the database image data GD DB [1], the weight of the database tag vector TAGV1 DB [1] included in the cluster CST1 can be 8/20, the weight of the database tag vector TAGV2 DB [1] included in the cluster CST2 can be 4/20, the weight of the database tag vector TAGV3 DB [1] included in the cluster CST3 can be 2/20, the weight of the database tag vector TAGV4 DB [1] included in the cluster CST4 can be 3/20, and the weight of the database tag vector TAGV5 DB [1] included in the cluster CST5 can be 3/20.
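The weights 8/20, 4/20, 2/20, 3/20, and 3/20 in this example can be computed from the cluster sizes as follows (an illustrative sketch):

```python
from collections import Counter

# Cluster label of each of the 20 tag-candidate word vectors for GD_DB[1];
# sizes follow the example above (CST1: 8, CST2: 4, CST3: 2, CST4: 3, CST5: 3)
labels = ["CST1"] * 8 + ["CST2"] * 4 + ["CST3"] * 2 + ["CST4"] * 3 + ["CST5"] * 3

total = len(labels)  # 20 candidate words in total
weights = {c: n / total for c, n in Counter(labels).items()}
print(weights)  # {'CST1': 0.4, 'CST2': 0.2, 'CST3': 0.1, 'CST4': 0.15, 'CST5': 0.15}
```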
- the image search system 10 can perform a search with high accuracy.
- FIG. 9 is a flowchart showing an example of the processing method.
- Step S11 First, the user of the image search system 10 inputs the query image data GD Q into the input unit 11.
- the query image data GD Q is supplied from the input unit 11 to the processing unit 13 via the transmission line 12.
- the query image data GD Q may be stored in the storage unit 15 or the database 17 via the transmission line 12 and supplied from the storage unit 15 or the database 17 to the processing unit 13 via the transmission line 12.
- the query image data GD Q can represent, for example, an invention, a device, or a design before filing an application, an industrial product before release, technical information, an image explaining a technical idea, or the like.
- the query image data GD Q is input to the neural network of the processing unit 13.
- the query image data GD Q can be input to the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B.
- the processing unit 13 can acquire the query image feature amount data GFD Q representing the feature amount of the query image data GD Q.
- the data output from the layer 31 [m] shown in FIG. 3A can be used as the query image feature amount data GFD Q.
- the data output from the pooling layer PL [m] shown in FIG. 3B can be used as the query image feature amount data GFD Q.
- the query image feature data GFD Q may include output data of two or more layers as in the database image feature data GFD DB . Since the query image feature data GFD Q includes the output data of many layers, the query image feature data GFD Q can be made to more accurately represent the features of the query image data GD Q.
- Step S13 the processing unit 13 calculates the similarity of the database image data GD DB with respect to the query image data GD Q.
- FIG. 10 is a diagram showing the calculation of the similarity of the database image data GD DB with respect to the query image data GD Q.
- FIG. 10 shows an example in which one query image data GD Q and 100 database image data GD DB are input to the neural network 30a shown in FIG. 3B. Further, FIG. 10 shows an example in which the query image feature data GFD Q and the database image feature data GFD DB each have pooling values P of x rows and y columns (x and y are integers of 1 or more).
- the pooling value of the query image feature data GFD Q is described as the pooling value P Q
- the pooling value of the database image feature data GFD DB is described as the pooling value P DB
- the pooling value of the database image feature amount data GFD DB [1] is described as the pooling value P1 DB
- the pooling value of the database image feature amount data GFD DB [100] is described as the pooling value P100 DB .
- the similarity to the query image feature data GFD Q is calculated for each of the database image feature data GFD DB [1] to the database image feature data GFD DB [100]. Then, the similarity can be set as the similarity of the database image data GD DB [1] to the database image data GD DB [100] with respect to the query image data GD Q.
- the degree of similarity to the query image feature data GFD Q may be calculated for all the database image feature data GFD DB stored in the database 17. Alternatively, the degree of similarity to the query image feature data GFD Q may be calculated for a part of the database image feature data GFD DB stored in the database 17.
- the similarity is preferably, for example, a cosine similarity. Alternatively, it may be Euclidean similarity or Minkowski similarity.
- the cosine similarity of the database image feature data GFD DB [1] to the query image feature data GFD Q can be calculated by the following formula. It can be said that the larger the value of the cosine similarity is, the more similar the database image data GD DB is to the query image data GD Q.
- the cosine similarity of the database image feature data GFD DB [2] to the database image feature data GFD DB [100] to the query image feature data GFD Q can also be calculated by the same method. From the above, the similarity of the database image data GD DB [1] to the database image data GD DB [100] with respect to the query image data GD Q can be calculated.
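Since the cosine-similarity formula itself is reproduced as an image in the original publication, the standard definition is sketched below (an illustrative, assumption-free restatement, not the patent's exact notation); the pooling values are flattened into one vector before the computation:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature maps (flattened pooling values)."""
    a, b = np.ravel(a), np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

gfd_q = np.array([[1.0, 2.0], [3.0, 4.0]])   # query pooling values P_Q (x = y = 2)
gfd_db = np.array([[1.0, 2.0], [3.0, 5.0]])  # one database image's pooling values

sim = cosine_similarity(gfd_q, gfd_q)
assert abs(sim - 1.0) < 1e-12  # identical feature data gives similarity 1
print(cosine_similarity(gfd_q, gfd_db))
```

The larger the value, the more similar the database image data GD DB is to the query image data GD Q, which is why sorting by this value in step S14 ranks the database images.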
- the image search system 10 can perform a search with high accuracy.
- the cosine similarity can be calculated by a simple calculation. Therefore, when the processing unit 13 has a GPU, the similarity can be obtained by the GPU. Therefore, the similarity can be calculated in a short time, and the image search system 10 can perform the search in a short time.
- Step S14 the processing unit 13 acquires the query tag TAG Q , which is a tag associated with the query image data GD Q , based on the calculation result of the similarity of the database image data GD DB to the query image data GD Q .
- FIGS. 11A and 11B are diagrams showing an example of a method for acquiring the query tag TAG Q .
- the database image data GD DB [1] to the database image data GD DB [100] are rearranged based on the similarity calculated in step S13.
- the database image data GD DB are sorted in descending order of similarity to the query image data GD Q .
- the similarity of the database image data GD DB [2] is the highest at 0.999
- the similarity of the database image data GD DB [31] is the second highest at 0.971
- the similarity of the database image data GD DB [73] is the third highest at 0.964
- the similarity of the database image data GD DB [52] is the fourth highest at 0.951
- and the similarity of the database image data GD DB [28] is the fifth highest at 0.937.
- the database tag TAG DB associated with the database image data GD DB having a high degree of similarity is extracted.
- in FIG. 11A, the database tags TAG DB associated with the database image data GD DB having the first to fifth highest degrees of similarity are extracted.
- for example, the tags "aaa", "bbb", "ccc", "ddd", and "eee" associated with the database image data GD DB [2], the tags "aaa", "ccc", "fff", "ggg", and "hhh" associated with the database image data GD DB [31], and the tags associated with the database image data GD DB [73] and the subsequent database image data are extracted.
- tags to be extracted may be duplicated.
- although the number of database image data GD DB from which the database tags TAG DB are extracted is set to a predetermined number here, one embodiment of the present invention is not limited to this.
- the database tag associated with the database image data GD DB whose similarity is equal to or higher than a predetermined value may be extracted.
- in that case, the number of database image data GD DB from which the database tags TAG DB are extracted is not fixed.
- for example, the tag "aaa" is associated with all of the database image data GD DB [2], the database image data GD DB [31], the database image data GD DB [73], the database image data GD DB [52], and the database image data GD DB [28], so its number of appearances is 5.
- on the other hand, the tag "ddd" is associated only with the database image data GD DB [2] among the database image data GD DB [2], the database image data GD DB [31], the database image data GD DB [73], the database image data GD DB [52], and the database image data GD DB [28], so its number of appearances is 1.
- a predetermined number of tags are further extracted in order from the tag having the largest number of appearances, and the extracted tags are designated as the query tags TAG Q .
- in FIG. 11B, five tags are to be extracted as the query tags TAG Q in order from the tag having the largest number of appearances. Specifically, the tag "aaa", whose number of appearances of 5 is the largest, and the tag "ccc", whose number of appearances of 3 is the second largest, have been extracted.
- when there are a plurality of tags having the same number of appearances, a tag associated with database image data GD DB having a higher degree of similarity can be preferentially extracted.
- specifically, the rank of similarity of each database image data GD DB is expressed numerically. Then, among tags having the same number of appearances, the totals of the numerical values representing the similarity ranks of the associated database image data GD DB can be compared, and tags can be extracted in ascending order of the total.
- in FIG. 11B, the number of query tags TAG Q is set to 5; the number of appearances of the tag "aaa" is 5 and that of the tag "ccc" is 3, so three more tags need to be extracted from the tags whose number of appearances is 2 or less.
- the rank of similarity of the database image data GD DB [2] to which the tag "bbb” is associated is 1, and the rank of similarity of the database image data GD DB [73] is 3.
- therefore, the total of the similarity ranks related to the tag "bbb" is 4.
- the total rank of similarity related to the tag "fff” is 5
- the total rank of similarity related to the tag “ggg” is 6
- the total rank of similarity related to the tag "kkk” is 8.
- accordingly, the query tags TAG Q can be the tag "aaa" having 5 appearances, the tag "ccc" having 3 appearances, and, among the tags having 2 appearances, the tags "bbb", "fff", and "ggg" whose totals of similarity ranks are the first to third smallest.
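The selection procedure above (count occurrences, then break ties by the smaller total of similarity ranks) can be sketched as follows; the tag sets per rank are hypothetical but chosen to be consistent with the totals in the example:

```python
from collections import defaultdict

# Database tags tied to the five most similar database images (rank 1 = most similar).
# Illustrative sets consistent with the example: "bbb" totals 4, "fff" 5, "ggg" 6, "kkk" 8.
top_tags = {
    1: ["aaa", "bbb", "ccc", "ddd", "eee"],
    2: ["aaa", "ccc", "fff", "ggg", "hhh"],
    3: ["aaa", "bbb", "fff", "kkk", "iii"],
    4: ["aaa", "ccc", "ggg", "lll", "mmm"],
    5: ["aaa", "kkk", "nnn", "ooo", "ppp"],
}

counts = defaultdict(int)     # number of appearances of each tag
rank_sums = defaultdict(int)  # total of the similarity ranks at which the tag appears
for rank, tags in top_tags.items():
    for tag in tags:
        counts[tag] += 1
        rank_sums[tag] += rank

# More appearances first; among equal counts, the smaller rank total wins
query_tags = sorted(counts, key=lambda t: (-counts[t], rank_sums[t]))[:5]
print(query_tags)  # ['aaa', 'ccc', 'bbb', 'fff', 'ggg']
```

With these inputs the five selected tags reproduce the example's outcome: "aaa" and "ccc" by count, then "bbb", "fff", and "ggg" by the tie-break.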
- synonyms of the words in the database tags TAG DB may be included in the query tags TAG Q .
- for example, synonym dictionary data is stored in the storage unit 15 or the database 17 in advance, and a word included in a database tag TAG DB , together with the words registered in the synonym dictionary as its synonyms, can be included in the query tags TAG Q .
- the processing unit 13 automatically selects the query tag TAG Q from the extracted database tag TAG DB , but one aspect of the present invention is not limited to this.
- the extracted database tag TAG DB may be presented to the user of the image search system 10, and the user of the image search system 10 may select a tag to be the query tag TAG Q from the presented tags.
- a database image having a high degree of similarity may be presented to the user of the image search system 10, and the presented database image may be selected by the user of the image search system 10. Then, all or a part of the database tag TAG DB associated with the database image data GD DB representing the selected database image may be used as the query tag TAG Q.
- the query tag TAG Q is selected from the database tag TAG DB , but one aspect of the present invention is not limited to this.
- a new tag may be generated based on the database tag TAG DB, and the tag may be used as the query tag TAG Q.
- a method in which the processing unit 13 acquires the query tag vector TAGV Q representing the query tag TAG Q by using the database tag vectors TAGV DB representing the database tags TAG DB will be described.
- the method described with reference to FIG. 11A can also be applied to the case of acquiring the query tag vector TAGV Q by the method described below. That is, the database tag TAG DB can be extracted by the same method as that shown in FIG. 11A.
- Clustering: after the database tags TAG DB are extracted, a predetermined number of clusters are generated by performing clustering on the database tag vectors TAGV DB representing the extracted database tags TAG DB . For example, the same number of clusters as the number of query tags TAG Q to be acquired are generated. Clustering can be performed by the K-means method, the DBSCAN method, or the like.
- FIG. 12A shows an example in which the 25 database tags TAG DB shown in FIG. 11A are acquired by the processing unit 13. Further, FIG. 12A shows an example in which five clusters (cluster CST1, cluster CST2, cluster CST3, cluster CST4, and cluster CST5) are generated based on the database tag vectors TAGV DB corresponding to the database tags TAG DB shown in FIG. 11A.
- the vector shown in FIG. 12A is a two-dimensional vector, and the horizontal axis direction represents one component of the two-dimensional vector and the vertical axis direction represents the other component of the two-dimensional vector. Actually, it can be, for example, a 300-dimensional vector.
- the numbers in parentheses shown in FIG. 12A indicate the number of occurrences of the extracted database tag TAG DB . For example, "aaa (5)" indicates that the number of occurrences of the tag "aaa” is 5.
- for each cluster, a vector representing a representative point is obtained.
- the vector representing the representative point can be the query tag vector TAGV Q.
- the vector representing the representative point of the cluster CST1 is the query tag vector TAGV1 Q
- the vector representing the representative point of the cluster CST2 is the query tag vector TAGV2 Q
- the vector representing the representative point of the cluster CST3 is the query tag vector TAGV3 Q.
- and an example is shown in which the vector representing the representative point of the cluster CST4 is the query tag vector TAGV4 Q and the vector representing the representative point of the cluster CST5 is the query tag vector TAGV5 Q .
- Each component of the vector representing the representative point can be, for example, an average value of each component of the database tag vector TAGV DB included in the cluster.
- the processing unit 13 can acquire the query tag vector TAGV Q.
- FIG. 12B is a table showing the components of the query tag vector TAGV1 Q to the query tag vector TAGV5 Q .
- the components shown in FIG. 12B are examples for convenience of explanation.
- the query tag vector TAGV Q can be weighted.
- the weight can be, for example, a value obtained by dividing the number of database tag vectors TAGV DB included in one cluster by the total number of database tags TAG DB extracted by the method shown in FIG. 12A or the like.
- FIGS. 12A and 12B show an example in which 25 database tags TAG DB are extracted.
- the cluster CST1 contains 11 database tag vectors TAGV DB
- the cluster CST2 contains 4 database tag vectors TAGV DB
- the cluster CST3 contains 5 database tag vectors TAGV DB
- the cluster CST4 contains 2 database tag vectors TAGV DB , and the cluster CST5 contains 3 database tag vectors TAGV DB .
- therefore, the weight of the query tag vector TAGV1 Q included in the cluster CST1 can be 11/25, the weight of the query tag vector TAGV2 Q included in the cluster CST2 can be 4/25, and the weight of the query tag vector TAGV3 Q included in the cluster CST3 can be 5/25
- the weight of the query tag vector TAGV4 Q included in the cluster CST4 can be 2/25
- the weight of the query tag vector TAGV5 Q included in the cluster CST5 can be 3/25.
- the image search system 10 can perform a search with high accuracy.
- the method for acquiring the query tags TAG Q shown in steps S13 and S14 is simpler than, for example, a method for acquiring the query tags TAG Q without using the database tags TAG DB as a base. Therefore, the image search system 10 can perform a search in a short time. Further, with the acquisition of the query tags TAG Q by the methods shown in steps S13 and S14, tags representing the concept, technical content, points of interest, and the like of the image corresponding to the query image data GD Q can be acquired comprehensively, compared with, for example, a case where the user of the image search system 10 specifies all of the query tags TAG Q or a case where candidates for the query tags TAG Q are not presented to the user. Therefore, the image search system 10 can perform a search easily and with high accuracy.
- Step S15 the processing unit 13 acquires the data D DB including the database image feature amount data GFD DB and the database tag vector TAGV DB . Further, the processing unit 13 acquires the data D Q including the query image feature amount data GFD Q and the query tag vector TAGV Q.
- FIG. 13 is a diagram showing a configuration example of the data D DB and the data D Q.
- the database image feature data GFD DB and the query image feature data GFD Q can have the same configuration as that shown in FIG.
- the database tag vector TAGV DB can be configured to have a component VC DB [1] to a component VC DB [h] (h is an integer of 2 or more).
- the query tag vector TAGV Q can be configured to have a component VC Q [1] to a component VC Q [h].
- for example, h is 1500.
- note that the components of the database tag vector TAGV DB [1] are described as the components VC1 DB , and the components of the database tag vector TAGV DB [100] are described as the components VC100 DB .
- the term component can sometimes be rephrased as the term value.
- the image feature amount data and the tag vector are both a set of a plurality of values. Therefore, the term data and the term vector may be used interchangeably.
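FIG. 13 is not reproduced here, so the exact layout of the data D is an assumption; one plausible reading, consistent with treating data and vectors interchangeably as sets of values, is that the image feature values and the tag-vector components are simply placed side by side in a single vector:

```python
import numpy as np

# Illustrative sizes: 4 image feature values and h = 3 tag-vector components
gfd = np.array([0.2, 0.8, 0.1, 0.5])  # image feature amount data (flattened pooling values)
tagv = np.array([0.3, 0.4, 0.7])      # tag vector components VC[1]..VC[h]

d = np.concatenate([gfd, tagv])       # data D as one sequence of values
print(d.shape)  # (7,)
```

Under this reading, the similarity of the data D in step S16 is computed over this combined vector, so both the image features and the tags contribute to the corrected similarity.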
- Step S16 the processing unit 13 calculates the similarity of the data D DB to the data D Q.
- the similarity with respect to the data D Q is calculated for each of the data D DB [1] to the data D DB [100].
- the similarity can be set as the similarity of the database image data GD DB [1] to the database image data GD DB [100] with respect to the query image data GD Q. Therefore, the similarity of the database image data GD DB calculated by the processing unit 13 in step S13 to the query image data GD Q can be corrected.
- the weighting can be performed by, for example, multiplying the component of the tag vector by the weight.
- it is preferable that the similarity of the data D DB to the data D Q be of the same kind as the similarity calculated by the processing unit 13 in step S13.
- for example, when the cosine similarity is calculated in step S13, it is preferable to also calculate the cosine similarity as the similarity of the data D DB to the data D Q .
- the cosine similarity of the data D DB [1] to the data D Q can be calculated by the following formula.
- the cosine similarity of the data D DB [2] to the data D DB [100] to the data D Q can also be calculated by the same method. From the above, the similarity of the data D DB [1] to the data D DB [100] with respect to the data D Q can be calculated. Thereby, the similarity of the database image data GD DB [1] to the database image data GD DB [100] calculated in step S13 with respect to the query image data GD Q can be corrected.
- the search result can be changed by adjusting the ratio between the number of values possessed by the image feature amount data and the number of components possessed by the tag vector. For example, when the number of values possessed by the query image feature data GFD Q and the database image feature data GFD DB is increased, or the number of components possessed by the query tag vector TAGV Q and the database tag vector TAGV DB is reduced, the corrected similarity emphasizes the image feature amount.
- in this case, if the feature amount of the database image data GD DB is similar to the feature amount of the query image data GD Q , the corrected similarity of the database image data GD DB to the query image data GD Q is high even if the database tag TAG DB is somewhat different from the query tag TAG Q . On the other hand, when the number of values possessed by the query image feature data GFD Q and the database image feature data GFD DB is reduced, or the number of components possessed by the query tag vector TAGV Q and the database tag vector TAGV DB is increased, the corrected similarity emphasizes the tags.
- in this case, if the database tag TAG DB is similar to the query tag TAG Q , the corrected similarity of the database image data GD DB to the query image data GD Q is high even if the feature amount of the database image data GD DB is somewhat different from the feature amount of the query image data GD Q .
- to adjust the ratio, for example, the number of tags associated with the image data may be increased or decreased. Further, for example, by using only a part of the values of the image feature amount data for the calculation of the similarity, the similarity can be calculated with an emphasis on the tags. For example, by not using, in the calculation of the similarity, the values representing the feature amounts of parts that do not give a strong impression when the image is viewed, the similarity can be calculated with an emphasis on the tags while suppressing an increase in the similarity of database images whose appearance is significantly different from that of the query image. Therefore, the image search system 10 can perform a search with high accuracy.
- the search result can also be changed by multiplying the values of the image feature amount data or the components of the tag vector by a predetermined coefficient. For example, by multiplying the values of the query image feature data GFD Q and the values of the database image feature data GFD DB by a real number greater than 1, the corrected similarity can be obtained as a result that emphasizes the image feature amount. Likewise, by multiplying the components of the query tag vector TAGV Q and the components of the database tag vector TAGV DB by a real number of 0 or more and less than 1, the corrected similarity can be obtained as a result that emphasizes the image feature amount.
- conversely, by multiplying the values of the query image feature data GFD Q and the values of the database image feature data GFD DB by a real number of 0 or more and less than 1, the corrected similarity can be obtained as a result that emphasizes the tags.
- further, by multiplying the components of the query tag vector TAGV Q and the components of the database tag vector TAGV DB by a real number larger than 1, the corrected similarity can be obtained as a result that emphasizes the tags.
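The effect of such a coefficient can be sketched as follows (illustrative; the concatenated layout of the data D is an assumption): scaling the tag-vector components by a coefficient below 1 makes the cosine similarity track the image features, and a coefficient above 1 makes it track the tags.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity(gfd_q, tagv_q, gfd_db, tagv_db, tag_coeff=1.0):
    """Cosine similarity of data D with the tag components scaled by tag_coeff."""
    d_q = np.concatenate([gfd_q, tag_coeff * tagv_q])
    d_db = np.concatenate([gfd_db, tag_coeff * tagv_db])
    return cosine(d_q, d_db)

gfd_q = np.array([1.0, 2.0, 3.0])   # query image features
gfd_db = np.array([1.0, 2.0, 3.0])  # identical database image features
tagv_q = np.array([1.0, 0.0])       # but completely different tag vectors
tagv_db = np.array([0.0, 1.0])

# Down-weighting the tags raises the similarity of this feature-wise identical pair
assert similarity(gfd_q, tagv_q, gfd_db, tagv_db, tag_coeff=0.1) > \
       similarity(gfd_q, tagv_q, gfd_db, tagv_db, tag_coeff=2.0)
```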
- Step S17 Next, the processing unit 13 generates ranking data including information on the rank of similarity after correction calculated in step S16, and outputs the search result to the outside of the image search system 10.
- the processing unit 13 can supply the ranking data to the storage unit 15 or the database 17 via the transmission line 12. Further, the processing unit 13 can supply the ranking data to the output unit 19 via the transmission line 12. As a result, the output unit 19 can supply the ranking data to the outside of the image search system 10.
- the ranking data can include the ranking of the similarity with respect to the query image of each database image, the value of the similarity, and the like.
- the ranking data preferably includes a file path to the database image.
- the user of the image search system 10 can easily access the target image from the ranking data.
- the tag associated with the query image and the output database image may be confirmed.
- when publication data representing a publication in which a database image is posted is stored in the database 17 or the like, the user of the image search system 10 can easily access the publication in which the database image associated with the ranking data is posted.
- the above is an example of an image search method using the image search system 10.
- in the image search method described above, the similarity of the database image data GD DB to the query image data GD Q is first calculated without using the tags.
- after that, the tags are used to correct the similarity.
- the processing unit 13 acquires the query tag TAG Q based on the database tag TAG DB associated with the database image data GD DB having the highest similarity to the query image data GD Q.
- this can prevent the query tag TAG Q from being acquired from the database image data GD DB having the sixth or lower degree of similarity, that is, from image data that may have a concept, technical content, point of interest, or the like different from that of the query image. Therefore, it is possible to prevent images that become noise from being mixed into the search result and to prevent the image to be searched from failing to be output.
- the image search system 10 can perform a search with high accuracy.
- the query tag TAG Q is acquired based on the database tag TAG DB .
- this acquisition method is simpler than a method of acquiring the query tags TAG Q without using the database tags TAG DB as a base. Therefore, the image search system 10 can perform a search in a short time.
- further, tags representing the concept, features, technical content, points of interest, and the like of the image corresponding to the query image data GD Q can be acquired comprehensively, compared with, for example, a case where the user of the image search system 10 specifies all of the query tags TAG Q or a case where candidates for the query tags TAG Q are not presented to the user. Therefore, the image search system 10 can perform a search easily and with high accuracy.
- FIG. 14 is a flowchart showing an example of an image search method using the image search system 10 when the user of the image search system 10 manually inputs a part of the query tags TAG Q . Also when the image search system 10 is operated by the method shown in FIG. 14, the processing shown in FIG. 2 is preferably performed in advance, in the same manner as when the image search system 10 is operated by the image search method shown in FIG. 9.
- Step S21 First, the user of the image search system 10 inputs the query tag TAG Q , in addition to the query image data GD Q , into the input unit 11.
- the number of query tags TAG Q input by the user of the image search system 10 and the contents of the query tags TAG Q can be set arbitrarily by the user. Further, the user may be able to set the total number of query tags TAG Q , including the query tags TAG Q automatically acquired in a later step.
- FIG. 15 is a diagram showing input of the query image data GD Q and the query tag TAG Q to the input unit 11.
- in FIG. 15, the user of the image search system 10 inputs two query tags TAG Q representing the query image data GD Q , namely "circuit diagram" and "semiconductor", in addition to the query image data GD Q .
- by inputting the query tag TAG Q , the calculation result of the similarity of the database image data GD DB to the query image data GD Q can be changed.
- for example, when the query tag TAG Q "capacitive element" is input to the input unit 11
- the similarity of database image data representing a circuit diagram in which no capacitive element is drawn can be lowered.
- the query image data GD Q is input to the neural network of the processing unit 13.
- the query image data GD Q can be input to the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B.
- the processing unit 13 can acquire the query image feature amount data GFD Q representing the feature amount of the query image data GD Q.
- Step S23 the processing unit 13 acquires the data D DB including the database image feature amount data GFD DB and the database tag vector TAGV DB . Further, the processing unit 13 acquires the data D Q including the query image feature amount data GFD Q and the query tag vector TAGV Q.
- the tags included in the data D DB are selected from the tags associated with the database image data GD DB . For example, it is assumed that five database tags TAG DB are associated with one database image data GD DB , and that the number of query tags TAG Q input to the input unit 11 is two. In this case, among the five database tags TAG DB , for example, the tag having the highest TF-IDF and the tag having the second highest TF-IDF can be used as the tags possessed by the data D DB .
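The patent does not define its exact TF-IDF variant, so the standard formulation below is an assumption; it sketches picking the two highest-TF-IDF tags of one database image, with each image's tag list treated as one "document" (tag names are hypothetical):

```python
import math
from collections import Counter

# Hypothetical tag lists; each database image's tags act as one "document"
image_tags = {
    "GD_DB[1]": ["aaa", "bbb", "ccc", "ddd", "eee"],
    "GD_DB[2]": ["aaa", "bbb", "fff", "ggg", "hhh"],
    "GD_DB[3]": ["aaa", "iii", "jjj", "kkk", "lll"],
}

n_docs = len(image_tags)
df = Counter(t for tags in image_tags.values() for t in set(tags))  # document frequency

def tf_idf(tag, tags):
    tf = tags.count(tag) / len(tags)  # term frequency within this image's tags
    idf = math.log(n_docs / df[tag])  # rarer across images -> larger idf
    return tf * idf

# Keep the two highest-TF-IDF tags of GD_DB[1]; the ubiquitous "aaa" scores 0
tags = image_tags["GD_DB[1]"]
selected = sorted(tags, key=lambda t: tf_idf(t, tags), reverse=True)[:2]
print(selected)
```

Selecting by TF-IDF keeps the tags that are most distinctive for that image, which matches the intent of matching a small number of query tags against the most informative database tags.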
- Step S24 the processing unit 13 calculates the similarity of the data D DB to the data D Q.
- the similarity can be calculated by the same method as that shown in FIG.
- Step S25 Next, the query tag TAG Q is added or modified based on the calculation result of the similarity of the data D DB to the data D Q.
- FIG. 16A and 16B are diagrams showing an example of a method of adding the query tag TAG Q.
- the data D DB is rearranged based on the similarity calculated in step S24.
- FIG. 16A shows an example of rearranging 100 data D DB.
- the data D DB are sorted in descending order of their similarity to the data D Q.
- the similarity of the data D DB [2] is the highest at 0.999
- the similarity of the data D DB [41] is the second highest at 0.971
- the similarity of the data D DB [53] is the lowest.
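- the rearrangement can be sketched as a simple descending sort (the index-to-similarity values here are hypothetical, loosely following FIG. 16A):

```python
# Hypothetical similarities of data D_DB[i] to the data D_Q, keyed by index i.
similarities = {2: 0.999, 41: 0.971, 7: 0.850, 53: 0.612}

# Sort the entries in descending order of similarity.
ranking = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```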
- the database tag TAG DB associated with the database image data GD DB possessed by the data D DB having a high degree of similarity is extracted.
- for example, the database tags TAG DB associated with the database image data GD DB possessed by the data D DB having the first to fifth highest similarities are extracted.
- for example, the tags "aaa", "kkk", "rrr", "sss", and "ttt" associated with the database image data GD DB [88] are extracted.
- note that the same tag may be extracted more than once.
- the number of occurrences is calculated for each of the extracted tags.
- a predetermined number of tags are further extracted from the extracted tags, and the extracted tags are designated as a new query tag TAG Q.
- two tags (“circuit diagram” and “semiconductor”) have already been acquired as query tags TAG Q in step S21.
- here, the number of query tags TAG Q is set to five, which is equal to the number of database tags TAG DB associated with one database image data GD DB.
- Extraction of the tag to be the new query tag TAG Q can be performed by the same method as that shown in FIG. 11B.
- the tags with the highest number of occurrences can be extracted in order.
- when tags have the same number of occurrences, the tag associated with the database image data GD DB possessed by the data D DB having the higher similarity can be extracted preferentially.
- the tags "aaa", "bbbb", and "ccc" can be extracted as new query tags TAG Q.
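- a minimal sketch of this tag-proposal step, assuming hypothetical tag lists for the five most similar entries (ordered from most to least similar) and a hypothetical helper propose_query_tags; ties in the occurrence count are broken in favor of the tag that first appears in a more similar entry, as described above:

```python
from collections import Counter

def propose_query_tags(ranked_entries, m):
    """ranked_entries: tag lists of the top-N entries, highest similarity first."""
    counts = Counter(tag for tags in ranked_entries for tag in tags)
    def best_rank(tag):
        # Index of the most similar entry that contains the tag (tie-breaker).
        return next(i for i, tags in enumerate(ranked_entries) if tag in tags)
    return sorted(counts, key=lambda t: (-counts[t], best_rank(t)))[:m]

top5 = [
    ["aaa", "bbb", "ccc", "ddd", "eee"],  # most similar entry
    ["aaa", "bbb", "fff", "ggg", "hhh"],
    ["aaa", "ccc", "fff", "iii", "jjj"],
    ["bbb", "ggg", "kkk", "lll", "mmm"],
    ["aaa", "kkk", "rrr", "sss", "ttt"],  # fifth most similar entry
]
print(propose_query_tags(top5, m=3))
```

- "aaa" occurs four times and "bbb" three times, so they rank first; among the tags occurring twice, "ccc" wins the tie because it appears in the most similar entry.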
- a part or all of the tags input by the user of the image search system 10 in the input unit 11 may be deleted from the query tags TAG Q.
- in this case, the tags "circuit diagram" and "semiconductor" may be deleted from the tags TAG Q, and five tags may be extracted from the tags shown in FIG. 16B to obtain new tags TAG Q.
- in this case, the tags "aaa", "bbbb", "ccc", "fff", and "ggg" can be used as new tags TAG Q.
- Step S26 the tags possessed by the data D DB are added or modified in response to the addition or modification of the query tags TAG Q.
- specifically, the number of database tag vectors TAGV DB possessed by one data D DB is made equal to the number of query tags TAG Q.
- Step S27 the processing unit 13 recalculates the similarity of the data D DB to the data D Q.
- the similarity can be calculated by the same method as that shown in step S24. Thereby, the similarity of the data D DB to the data D Q can be corrected.
- Step S28 the processing unit 13 generates ranking data including information on the rank of similarity after correction calculated in step S27, and outputs the search result to the outside of the image search system 10.
- the user of the image search system 10 can confirm, for example, the order of similarity of each database image to the query image, the value of the similarity, the searched database image, the tag, and the like.
- Step S29, Step S30 the user of the image search system 10 confirms whether the ranking data is the expected result. If the result is as expected, the search ends. If the expected result is not obtained, the user of the image search system 10 adds or modifies the query tags TAG Q, and the process returns to step S23.
- the above is an example of an image search method using the image search system 10.
- the image search system 10 may calculate the similarity of the database image data GD DB to the query image data GD Q by comparing the entire area of the database image data GD DB with the entire area of the query image data GD Q.
- alternatively, the similarity of the database image data GD DB to the query image data GD Q may be calculated by comparing the entire area of the database image data GD DB with a partial area of the query image data GD Q.
- FIG. 17 shows an example of an image search method using the image search system 10 in the case where the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing a partial area of the database image data GD DB with the entire area of the query image data GD Q.
- first, the image search system 10 performs step S11 shown in FIG. 9 or step S21 shown in FIG.
- next, the processing unit 13 compares the query image data GD Q with the database image data GD DB and extracts the database image data GD DB that includes an area having a high degree of coincidence with the query image data GD Q.
- hereinafter, the extracted database image data GD DB is referred to as the extracted image data GD Ex.
- the comparison between the query image data GD Q and the database image data GD DB can be performed, for example, by area-based matching.
- step S31 as shown in FIG. 18A, the query image data GD Q is compared with each of n database image data GD DB (n is an integer of 1 or more).
- n may be equal to the number of database image data GD DB stored in the database 17, or may be smaller than that. Further, n may be larger than the number of database image data GD DB stored in the database 17.
- when n is larger than the number of database image data GD DB stored in the database 17, image data stored in the storage unit 15, or image data input from the outside of the image search system 10 to the processing unit 13 via the input unit 11, can also be compared with the query image data GD Q. Even when n is equal to or less than the number of database image data GD DB, such image data may be compared with the query image data GD Q.
- when n is small, the operation of step S31 can be performed in a short time. On the other hand, when n is large, the database image data GD DB including a region having a high degree of matching with the query image data GD Q can be extracted with high accuracy.
- FIG. 18B is a diagram illustrating a procedure for comparing the query image data GD Q and the database image data GD DB by area-based matching.
- here, the number of pixels of the image corresponding to the query image data GD Q is assumed to be 2 × 2, and the number of pixels of the image corresponding to the database image data GD DB is assumed to be 4 × 4. That is, the query image data GD Q has 2 × 2 pixel values and the database image data GD DB has 4 × 4 pixel values.
- the 2 ⁇ 2 pixel values of the query image data GD Q are the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22, respectively.
- the pixel value corresponding to the pixel in the first row and the first column is the pixel value vq11, the pixel value corresponding to the pixel in the first row and the second column is the pixel value vq12, the pixel value corresponding to the pixel in the second row and the first column is the pixel value vq21, and the pixel value corresponding to the pixel in the second row and the second column is the pixel value vq22.
- the 4 ⁇ 4 pixel values of the database image data GD DB are set to pixel values vdb11 to pixel values vdb44, respectively.
- the pixel value corresponding to the pixel in the first row and the first column is the pixel value vdb11, the pixel value corresponding to the pixel in the first row and the fourth column is the pixel value vdb14, the pixel value corresponding to the pixel in the fourth row and the first column is the pixel value vdb41, and the pixel value corresponding to the pixel in the fourth row and the fourth column is the pixel value vdb44.
- the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22 are compared with the pixel value vdb11, the pixel value vdb12, the pixel value vdb21, and the pixel value vdb22.
- the degree of coincidence between the query image data GD Q and the area of the database image data GD DB composed of the pixel value vdb11, the pixel value vdb12, the pixel value vdb21, and the pixel value vdb22 can be calculated.
- FIG. 18B among the pixel values of the database image data GD DB , the pixel values to be compared with the query image data GD Q are shown as a comparison data area 21 surrounded by a dotted line.
- the comparison data area 21 is moved by one column with respect to the pixel value of the database image data GD DB , the pixel values are compared in the same manner, and the degree of coincidence is calculated. Specifically, the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22 are compared with the pixel value vdb12, the pixel value vdb13, the pixel value vdb22, and the pixel value vdb23.
- the degree of coincidence between the query image data GD Q and the area of the database image data GD DB composed of the pixel value vdb12, the pixel value vdb13, the pixel value vdb22, and the pixel value vdb23 can be calculated.
- the comparison data area 21 is moved by one column with respect to the pixel value of the database image data GD DB , the pixel values are compared in the same manner, and the degree of coincidence is calculated. Specifically, the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22 are compared with the pixel value vdb13, the pixel value vdb14, the pixel value vdb23, and the pixel value vdb24.
- the degree of coincidence between the query image data GD Q and the area of the database image data GD DB composed of the pixel value vdb13, the pixel value vdb14, the pixel value vdb23, and the pixel value vdb24 can be calculated.
- next, the comparison data area 21 is moved down by one row with respect to the pixel values of the database image data GD DB, and the pixel values in the second and third rows of the database image data GD DB are compared, column by column, with the pixel values constituting the query image data GD Q in the same manner as described above.
- thereby, the degree of coincidence between the query image data GD Q and each area composed of the pixel values in the second and third rows of the database image data GD DB can be calculated for each column in the same manner as above.
- then, the comparison data area 21 is moved down by one more row with respect to the pixel values of the database image data GD DB, and the pixel values in the third and fourth rows of the database image data GD DB are compared, column by column, with the pixel values constituting the query image data GD Q in the same manner as above.
- thereby, the degree of coincidence between the query image data GD Q and each area composed of the pixel values in the third and fourth rows of the database image data GD DB can be calculated for each column in the same manner as described above.
- the highest degree of matching is defined as the degree of matching of the database image data GD DB with respect to the query image data GD Q.
- the above comparison is performed for each of the n database image data GD DB.
- then, the database image data GD DB having a high degree of coincidence with the query image data GD Q is extracted as the extracted image data GD Ex.
- a specified number may be extracted as the extracted image data GD Ex in order from the database image data GD DB having the highest degree of matching.
- the database image data GD DB whose degree of coincidence with the query image data GD Q is equal to or higher than the specified value may be extracted as the extracted image data GD Ex .
- FIG. 19 is a diagram illustrating extraction of the database image data GD DB .
- FIG. 19 shows an example of extracting one image data as extracted image data GD Ex from the database image data GD DB [1] to the database image data GD DB [3].
- the image corresponding to the query image data GD Q shown in FIG. 19 includes, for example, a transistor symbol.
- it is assumed that the image corresponding to the database image data GD DB [2] shown in FIG. 19 contains a transistor symbol, while the images corresponding to the database image data GD DB [1] and the database image data GD DB [3] do not contain a transistor symbol.
- in this case, the degree of coincidence of the database image data GD DB [2] with the query image data GD Q becomes higher than the degrees of coincidence of the database image data GD DB [1] and the database image data GD DB [3] with the query image data GD Q. Therefore, the database image data GD DB [2] can be extracted as the extracted image data GD Ex.
- the comparison of the query image data GD Q and the database image data GD DB and the calculation of the degree of matching can be performed by SAD (Sum of Absolute Differences), SSD (Sum of Squared Differences), NCC (Normalized Cross-Correlation), ZNCC (Zero-mean Normalized Cross-Correlation), POC (Phase-Only Correlation), or the like.
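- the sliding-window comparison described above can be sketched with SAD as the metric (the pixel values below are toy assumptions; a lower SAD means a higher degree of matching, and the stride parameter generalizes "moved by one column or one row"):

```python
import numpy as np

def sad(a, b):
    # Sum of Absolute Differences: 0 means a perfect match.
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def best_match(query, db_image, stride=1):
    """Slide the query-sized window over db_image and return
    (best SAD, row, column) of the best-matching region."""
    qh, qw = query.shape
    best = None
    for r in range(0, db_image.shape[0] - qh + 1, stride):
        for c in range(0, db_image.shape[1] - qw + 1, stride):
            score = sad(query, db_image[r:r + qh, c:c + qw])
            if best is None or score < best[0]:
                best = (score, r, c)
    return best

# 2x2 query compared against every 2x2 region of a 4x4 database image,
# as in the procedure of FIG. 18B.
query = np.array([[9, 9],
                  [9, 0]])
db = np.array([[0, 0, 0, 0],
               [0, 9, 9, 0],
               [0, 9, 0, 0],
               [0, 0, 0, 0]])
print(best_match(query, db))  # exact match (SAD 0) at row 1, column 1
```

- a stride of 2 or more skips window positions, reducing the number of comparison operations at the cost of possibly missing the best region.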
- the comparison data area 21 is moved by one column or one row with respect to the pixel value of the database image data GD DB, but one aspect of the present invention is not limited to this.
- the comparison data area 21 may be moved by two columns or more or two rows or more with respect to the pixel value of the database image data GD DB .
- for example, after comparing the pixel values vq11, vq12, vq21, and vq22 with the pixel values vdb11, vdb12, vdb21, and vdb22, the pixel values vq11, vq12, vq21, and vq22 may be compared with the pixel values vdb13, vdb14, vdb23, and vdb24.
- that is, the pixel values vq11, vq12, vq21, and vq22 need not be compared with the pixel values vdb12, vdb13, vdb22, and vdb23.
- further, the pixel values vq11, vq12, vq21, and vq22 may then be compared with the pixel values vdb31, vdb32, vdb41, and vdb42.
- the number of comparison operations between the pixel value of the query image data GD Q and the pixel value of the database image data GD DB can be reduced.
- the degree of coincidence of the database image data GD DB with the query image data GD Q can be calculated in a short time.
- FIG. 18A shows an example in which one query image data GD Q is compared with n database image data GD DB, but one aspect of the present invention is not limited to this.
- a plurality of query image data GD Q having different numbers of pixel values may be generated based on the query image data GD Q input to the processing unit 13.
- FIG. 20A shows an example of generating query image data GD Q [1], query image data GD Q [2], and query image data GD Q [3], which have different numbers of pixel values, based on the query image data GD Q input to the processing unit 13.
- as shown in FIG. 20A, the numbers of pixels of the images corresponding to the query image data GD Q [1], the query image data GD Q [2], and the query image data GD Q [3] differ from one another. That is, the images corresponding to the query image data GD Q [1] to the query image data GD Q [3] can be said to be enlarged or reduced versions of the image corresponding to the query image data GD Q input to the processing unit 13.
- in this case, each of the plurality of query image data GD Q is compared with each of the database image data GD DB [1] to the database image data GD DB [n]. Thereby, for each of the database image data GD DB [1] to the database image data GD DB [n], the degree of matching can be calculated for each of the plurality of query image data GD Q. Then, for example, the highest of the degrees of matching with the plurality of query image data GD Q can be set as the degree of matching of the database image data GD DB with the query image data GD Q input to the processing unit 13.
- specifically, the query image data GD Q [1] is compared with each of the database image data GD DB [1] to the database image data GD DB [n], the query image data GD Q [2] is compared with each of the database image data GD DB [1] to the database image data GD DB [n], and the query image data GD Q [3] is compared with each of the database image data GD DB [1] to the database image data GD DB [n].
- then, the highest among the degree of matching with the query image data GD Q [1], the degree of matching with the query image data GD Q [2], and the degree of matching with the query image data GD Q [3] can be set as the degree of matching of the database image data GD DB with the query image data GD Q input to the processing unit 13.
- for example, when the degree of matching of the database image data GD DB [1] with one of the plurality of query image data GD Q is the highest, that high degree of matching can be set as the degree of matching of the database image data GD DB [1] with the query image data GD Q input to the processing unit 13.
- when none of the degrees of matching with the plurality of query image data GD Q is high, it may be determined that the database image data GD DB in question does not include a region having a high degree of matching with the query image data GD Q.
- for example, suppose that the same element, a transistor symbol, is shown in both the image corresponding to the query image data GD Q and the image corresponding to the database image data GD DB.
- however, the size of the transistor symbol shown in the image corresponding to the query image data GD Q differs from the size of the transistor symbol shown in the image corresponding to the database image data GD DB. In this case, it may be determined that the degree of matching of the database image data GD DB with the query image data GD Q is low.
- in contrast, suppose that the same element, a transistor symbol, is shown in both the image corresponding to the query image data GD Q and the image corresponding to the database image data GD DB, and furthermore that the sizes of the elements are the same. In this case, the processing unit 13 can determine that the database image data GD DB includes an area having a high degree of matching with the query image data GD Q.
- by generating a plurality of query image data GD Q having different numbers of pixel values, the size of the element shown in the image corresponding to the query image data GD Q can be enlarged or reduced. Therefore, the degree of matching between the two image data can be made high even when the same element is shown in different sizes in the image corresponding to the query image data GD Q input to the processing unit 13 and in the image corresponding to the database image data GD DB.
- for example, when the query image data GD Q shown in FIG. 20B1 is input to the processing unit 13, query image data GD Q in which the number of pixel values is varied can be generated, whereby the degree of matching of the database image data GD DB with the query image data GD Q can be made high. As described above, the degree of matching of the database image data GD DB with the query image data GD Q input to the processing unit 13 can be calculated with high accuracy.
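- generating enlarged or reduced copies of the query and keeping the best degree of matching can be sketched as follows (nearest-neighbour resampling and the toy diagonal pattern are assumptions of this sketch; a SAD of 0 means an exact match):

```python
import numpy as np

def rescale_nn(img, factor):
    # Nearest-neighbour enlargement/reduction: a stand-in for whatever
    # resampling the system actually uses.
    h, w = img.shape
    rows = (np.arange(max(1, round(h * factor))) / factor).astype(int)
    cols = (np.arange(max(1, round(w * factor))) / factor).astype(int)
    return img[np.ix_(rows, cols)]

def sad_best(query, db_image):
    # Lowest SAD over every position of the query window inside db_image.
    qh, qw = query.shape
    return min(int(np.abs(query.astype(int)
                          - db_image[r:r + qh, c:c + qw].astype(int)).sum())
               for r in range(db_image.shape[0] - qh + 1)
               for c in range(db_image.shape[1] - qw + 1))

def multiscale_degree(query, db_image, factors=(0.5, 1.0, 2.0)):
    # Compare enlarged/reduced copies of the query and keep the best score,
    # so the same symbol drawn at a different size can still match.
    best = None
    for f in factors:
        q = rescale_nn(query, f)
        if q.shape[0] > db_image.shape[0] or q.shape[1] > db_image.shape[1]:
            continue  # this scaled query no longer fits inside the database image
        s = sad_best(q, db_image)
        best = s if best is None or s < best else best
    return best

query = np.array([[9, 0, 0],
                  [0, 9, 0],
                  [0, 0, 9]])
# Database image that contains the same pattern drawn at twice the size.
db = np.kron(query, np.ones((2, 2), dtype=int))
print(multiscale_degree(query, db))
```

- comparing only at the original scale misses the enlarged pattern, while the 2x copy of the query matches it exactly.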
- Step S32 the processing unit 13 extracts, from the extracted image data GD Ex, the partial image data GD part, which is the data of the region having a high degree of coincidence with the query image data GD Q.
- specifically, the degree of coincidence of each region of the database image data GD DB with the query image data GD Q is calculated by the method shown in FIG. 18B, and the region having the highest degree of coincidence is extracted as the partial image data GD part. Therefore, the number of pixel values possessed by the partial image data GD part can be equal to the number of pixel values possessed by the query image data GD Q.
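- the extraction can be sketched as follows (toy pixel values; the returned region has the same number of pixel values as the query, and a lower SAD means a higher degree of coincidence):

```python
import numpy as np

def extract_partial(query, db_image):
    """Return the region of db_image with the lowest SAD against the query;
    its shape equals the query's shape (the partial image data)."""
    qh, qw = query.shape
    best_score, best_rc = None, None
    for r in range(db_image.shape[0] - qh + 1):
        for c in range(db_image.shape[1] - qw + 1):
            s = int(np.abs(query.astype(int)
                           - db_image[r:r + qh, c:c + qw].astype(int)).sum())
            if best_score is None or s < best_score:
                best_score, best_rc = s, (r, c)
    r, c = best_rc
    return db_image[r:r + qh, c:c + qw]

query = np.array([[9, 9],
                  [9, 0]])
db = np.array([[0, 0, 0, 0],
               [0, 9, 8, 0],
               [0, 9, 1, 0],
               [0, 0, 0, 0]])
part = extract_partial(query, db)
print(part.tolist(), part.shape)
```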
- FIGS. 21A and 21B are diagrams showing an example of the operation of step S32.
- in FIGS. 21A and 21B, the regions having a high degree of coincidence with the query image data GD Q are shown with hatching.
- the hatched regions can be extracted to obtain the partial image data GD part [1] to the partial image data GD part [4].
- here, the image data extracted from the extracted image data GD Ex [1] to the extracted image data GD Ex [4] are referred to as the partial image data GD part [1] to the partial image data GD part [4], respectively.
- FIG. 21A shows an example in which one query image data GD Q is compared with the database image data GD DB as shown in FIG. 18A.
- in this case, the numbers of pixels of the images corresponding to the partial image data GD part can all be equal.
- FIG. 21B shows an example in which a plurality of query image data GD Q having different numbers of pixel values are compared with the database image data GD DB as shown in FIG. 20A.
- the number of pixels of the image corresponding to the partial image data GD part can be made equal to, for example, the number of pixels of the image corresponding to the query image data GD Q having the highest degree of matching.
- when there are a plurality of partial image data GD part, the number of pixels of the image corresponding to each partial image data GD part may differ from one partial image data GD part to another.
- FIG. 21B shows an example in which the number of pixels of the image corresponding to the partial image data GD part [1] to the partial image data GD part [4] is different.
- note that the image search method using the image search system 10 can also be executed without extracting the partial image data GD part.
- Step S33 Next, by inputting the query image data GD Q into the neural network of the processing unit 13, the processing unit 13 acquires the query image feature amount data GFD Q. Further, by inputting the partial image data GD part into the neural network of the processing unit 13, the processing unit 13 acquires the database image feature amount data GFD DB .
- the query image data GD Q and the partial image data GD part can be input to, for example, the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B.
- step S02 shown in FIG. 2 does not have to be performed. That is, it is not necessary to acquire the database image feature amount data GFD DB representing the feature amount of the entire area of the database image data GD DB .
- the database image data GD DB can be used as the learning data of the neural network 30.
- it is preferable that the number of pixel values of the image data used for the training data be equal to the number of pixel values of the image data input to the neural network 30. Therefore, when the neural network 30 performs learning, it is preferable that the database image data GD DB or the like used for the training data be adjusted by increasing or decreasing the number of pixel values as necessary.
- the number of pixel values is preferably increased by padding, for example, and the padding is preferably performed by zero padding, for example.
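- aligning the numbers of pixel values by zero padding can be sketched with NumPy's np.pad (padding on the bottom/right is just one possible choice made in this sketch):

```python
import numpy as np

def pad_to(img, target_h, target_w):
    """Zero-pad an image so its pixel count matches the size the
    neural network expects (pads on the bottom/right)."""
    ph, pw = target_h - img.shape[0], target_w - img.shape[1]
    assert ph >= 0 and pw >= 0, "image is larger than the target size"
    return np.pad(img, ((0, ph), (0, pw)), mode="constant", constant_values=0)

# Two toy images with different pixel counts, aligned to a common size.
images = [np.ones((2, 3), dtype=int), np.ones((4, 2), dtype=int)]
target = (max(im.shape[0] for im in images), max(im.shape[1] for im in images))
aligned = [pad_to(im, *target) for im in images]
print([im.shape for im in aligned])
```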
- FIG. 22A is a diagram illustrating adjustment of the number of pixel values included in the database image data GD DB .
- in FIG. 22A, the numbers of pixel values of the database image data GD DB [1] to the database image data GD DB [4] are all different.
- in such a case, it is preferable to align the numbers of pixel values possessed by these image data as shown in FIG. 22A.
- FIG. 22B is a diagram illustrating adjustment of the number of pixel values included in the partial image data GD part .
- the number of pixel values of the partial image data GD part is preferably equal to the number of pixel values of the image data used for learning the neural network 30.
- similarly, the number of pixel values of the query image data GD Q is preferably made equal to the number of pixel values of the image data used for learning of the neural network 30.
- after step S33, the image search system 10 performs step S13 shown in FIG. 9 or step S23 shown in FIG. Specifically, if step S11 was performed before step S31, step S13 is performed after step S33, and if step S21 was performed before step S31, step S23 is performed after step S33. The above is an example of an image search method using the image search system 10 in the case where the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing a partial area of the database image data GD DB with the entire area of the query image data GD Q.
- in this method, the query image data GD Q and the database image data GD DB are compared by area-based matching or the like, and the database image data GD DB including an area having a high degree of matching with the query image data GD Q is extracted as the extracted image data GD Ex.
- then, the region having the high degree of matching is extracted from the extracted image data GD Ex as the partial image data GD part, and the query image data GD Q and the partial image data GD part are input to the neural network of the processing unit 13.
- by extracting the database image data GD DB in this way, the processing unit 13 can be prevented from inputting to the neural network database image data GD DB representing database images that do not include an image having a high degree of matching with the image corresponding to the query image data GD Q. Therefore, a database image including a part similar to the image corresponding to the query image data GD Q can be searched with high accuracy in a short time. When the number of database image data GD DB to be compared with the query image data GD Q is small, the above search can be performed with high accuracy and in a short time without extracting the database image data GD DB.
- FIG. 23 shows an example of an image search method using the image search system 10 in the case where the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing the entire area of the database image data GD DB with a partial area of the query image data GD Q.
- first, the image search system 10 performs step S11 shown in FIG. 9 or step S21 shown in FIG.
- next, the processing unit 13 compares the query image data GD Q with the database image data GD DB and extracts, as the extracted image data GD Ex, the database image data GD DB having a high degree of matching with a part of the query image data GD Q.
- the comparison between the query image data GD Q and the database image data GD DB can be performed, for example, by area-based matching as in step S31.
- step S41 As shown in FIG. 24A, the query image data GD Q is compared with each of the n database image data GD DB.
- FIG. 24B is a diagram illustrating a procedure in the case of comparing the query image data GD Q and the database image data GD DB by area-based matching.
- here, the number of pixels of the image corresponding to the query image data GD Q is assumed to be 4 × 4, and the number of pixels of the image corresponding to the database image data GD DB is assumed to be 2 × 2. That is, the query image data GD Q has 4 × 4 pixel values and the database image data GD DB has 2 × 2 pixel values.
- the 4 ⁇ 4 pixel values of the query image data GD Q are defined as pixel values vq11 to pixel values vq44, respectively.
- the pixel value corresponding to the pixel in the first row and the first column is the pixel value vq11, the pixel value corresponding to the pixel in the first row and the fourth column is the pixel value vq14, the pixel value corresponding to the pixel in the fourth row and the first column is the pixel value vq41, and the pixel value corresponding to the pixel in the fourth row and the fourth column is the pixel value vq44.
- the 2 ⁇ 2 pixel values of the database image data GD DB are set to pixel value vdb11, pixel value vdb12, pixel value vdb21, and pixel value vdb22, respectively.
- the pixel value corresponding to the pixel in the first row and the first column is the pixel value vdb11, the pixel value corresponding to the pixel in the first row and the second column is the pixel value vdb12, the pixel value corresponding to the pixel in the second row and the first column is the pixel value vdb21, and the pixel value corresponding to the pixel in the second row and the second column is the pixel value vdb22.
- the pixel value vdb11, the pixel value vdb12, the pixel value vdb21, and the pixel value vdb22 are compared with the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22.
- the degree of coincidence between the database image data GD DB and the region of the query image data GD Q composed of the pixel value vq11, the pixel value vq12, the pixel value vq21, and the pixel value vq22 can be calculated.
- the pixel values to be compared with the database image data GD DB are shown as a comparison data area 21 surrounded by a dotted line.
- the comparison data area 21 is moved by one column with respect to the pixel value of the query image data GD Q , the pixel values are compared in the same manner, and the degree of matching is calculated. Specifically, the pixel value vdb11, the pixel value vdb12, the pixel value vdb21, and the pixel value vdb22 are compared with the pixel value vq12, the pixel value vq13, the pixel value vq22, and the pixel value vq23.
- the degree of coincidence between the database image data GD DB and the region of the query image data GD Q composed of the pixel value vq12, the pixel value vq13, the pixel value vq22, and the pixel value vq23 can be calculated.
- the comparison data area 21 is moved by one column with respect to the pixel value of the query image data GD Q , the pixel values are compared in the same manner, and the degree of matching is calculated. Specifically, the pixel value vdb11, the pixel value vdb12, the pixel value vdb21, and the pixel value vdb22 are compared with the pixel value vq13, the pixel value vq14, the pixel value vq23, and the pixel value vq24.
- the degree of coincidence between the database image data GD DB and the region of the query image data GD Q composed of the pixel value vq13, the pixel value vq14, the pixel value vq23, and the pixel value vq24 can be calculated.
- next, the comparison data area 21 is moved down by one row at a time with respect to the pixel values of the query image data GD Q, and the pixel values in the third and fourth rows of the query image data GD Q are compared, column by column, with the pixel values constituting the database image data GD DB in the same manner as described above.
- thereby, the degree of coincidence between each area composed of the pixel values in the third and fourth rows of the query image data GD Q and the database image data GD DB can be calculated for each column in the same manner as described above.
- the highest degree of matching is defined as the degree of matching of the database image data GD DB with respect to the query image data GD Q.
- the above comparison is performed for each of the n database image data GD DB.
- then, the database image data GD DB having a high degree of coincidence with the query image data GD Q is extracted as the extracted image data GD Ex.
- FIG. 25 is a diagram illustrating extraction of the database image data GD DB .
- FIG. 25 shows an example of extracting one image data as extracted image data GD Ex from the database image data GD DB [1] to the database image data GD DB [3].
- the image corresponding to the query image data GD Q shown in FIG. 25 includes, for example, a transistor symbol and a capacitive element symbol.
- it is assumed that the image corresponding to the database image data GD DB [2] shown in FIG. 25 contains a transistor symbol, while the images corresponding to the database image data GD DB [1] and the database image data GD DB [3] contain neither a transistor symbol nor a capacitive element symbol.
- in this case, the degree of coincidence of the database image data GD DB [2] with the query image data GD Q becomes higher than the degrees of coincidence of the database image data GD DB [1] and the database image data GD DB [3] with the query image data GD Q. Therefore, the database image data GD DB [2] can be extracted as the extracted image data GD Ex.
- the same method as that used in step S31 can be used.
- In the above description, the comparison data area 21 is moved by one column or one row with respect to the pixel values of the query image data GD Q; however, as in step S31, the comparison data area 21 may be moved by two or more columns or two or more rows with respect to the pixel values of the query image data GD Q.
- Further, a plurality of query image data GD Q having different numbers of pixel values may be generated based on the query image data GD Q input to the processing unit 13.
- Next, the processing unit 13 extracts, from the query image data GD Q, partial image data GD part-Q, which is data in a region having a high degree of coincidence with the extracted image data GD Ex.
- For example, when the degree of coincidence of each area of the query image data GD Q with respect to the database image data GD DB is calculated by the method shown in FIG. 24B, the area having the highest degree of coincidence is extracted as the partial image data GD part-Q. Therefore, the number of pixel values included in the partial image data GD part-Q can be equal to the number of pixel values included in the extracted image data GD Ex.
- FIG. 26 is a diagram showing an example of the operation of step S42.
- The upper left portion of the image corresponding to the query image data GD Q is the region having the highest degree of coincidence with the extracted image data GD Ex [1]. Therefore, among the query image data GD Q, the data corresponding to the upper left region is extracted as the partial image data GD part-Q [1].
- The lower right portion of the image corresponding to the query image data GD Q is the region having the highest degree of coincidence with the extracted image data GD Ex [2]. Therefore, among the query image data GD Q, the data corresponding to the lower right region is extracted as the partial image data GD part-Q [2]. That is, a plurality of partial image data GD part-Q are extracted from one query image data GD Q.
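Extracting the best-matching region itself (rather than only its score) can be sketched the same way. The absolute-difference measure below is again an illustrative assumption, not the document's specified metric.

```python
def extract_partial_image(query, extracted_image):
    """Return the sub-area of the query image (GD_part-Q) with the
    highest degree of coincidence with one extracted image (GD_Ex).
    The returned region has as many pixel values as GD_Ex."""
    qh, qw = len(query), len(query[0])
    eh, ew = len(extracted_image), len(extracted_image[0])
    best, best_region = -1.0, None
    for top in range(qh - eh + 1):
        for left in range(qw - ew + 1):
            region = [row[left:left + ew] for row in query[top:top + eh]]
            diff = sum(abs(a - b)
                       for rq, re in zip(region, extracted_image)
                       for a, b in zip(rq, re))
            score = 1.0 - diff / (255.0 * eh * ew)
            if score > best:
                best, best_region = score, region
    return best_region
```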
- the same number of image data as the extracted image data GD Ex may be extracted from the query image data GD Q as partial image data GD part-Q .
- a smaller number of image data than the extracted image data GD Ex may be extracted from the query image data GD Q as partial image data GD part-Q .
- When a plurality of extracted image data GD Ex have their highest degree of coincidence with the same region of the query image data GD Q, the number of partial image data GD part-Q extracted from that region can be one. That is, it is not necessary to extract a plurality of identical partial image data GD part-Q from the query image data GD Q.
- the following description can be applied by appropriately reading the partial image data GD part-Q as the query image data GD Q.
- the entire query image data GD Q can be used as the partial image data GD part-Q .
- Alternatively, the image search method using the image search system 10 can be executed without extracting the partial image data GD part-Q.
- Step S43: Next, the partial image data GD part-Q and the extracted image data GD Ex are input to the neural network of the processing unit 13.
- For step S43, the description of step S33 can be referred to by appropriately reading the query image data GD Q as the partial image data GD part-Q and the partial image data GD part as the extracted image data GD Ex.
- Alternatively, the query image data GD Q may be read as the extracted image data GD Ex and the partial image data GD part may be read as the partial image data GD part-Q.
- The above, in which the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing the entire area of the database image data GD DB with a part of the area of the query image data GD Q, is an example of an image search method using the image search system 10.
- In the above image search method, the query image data GD Q and the database image data GD DB are compared by area-based matching or the like, and database image data GD DB having a high degree of coincidence with a part of the query image data GD Q is extracted as extracted image data GD Ex.
- Then, the region having a high degree of coincidence is extracted from the query image data GD Q as the partial image data GD part-Q, and the partial image data GD part-Q and the extracted image data GD Ex are input to the neural network of the processing unit 13.
- By extracting the database image data GD DB in this way, database image data GD DB representing a database image that does not include an image with a high degree of coincidence with the image corresponding to the query image data GD Q can be kept from being input to the neural network of the processing unit 13. Therefore, a database image similar to a part of the image corresponding to the query image data GD Q can be searched for with high accuracy in a short time. When the number of database image data GD DB to be compared with the query image data GD Q is small, the above search can be performed with high accuracy in a short time without extracting the database image data GD DB.
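Assuming caller-supplied matching, feature-extraction, and similarity functions (interfaces not specified in this document), the two-stage flow just summarized — prune by cheap area-based matching, then rank survivors with neural-network features — can be sketched as:

```python
def two_stage_search(query, db_images, match_fn, feature_fn, similarity_fn,
                     threshold=0.9):
    """Sketch of the two-stage search described above.
    Stage 1: a cheap matching pass discards database images that
    contain no region similar to part of the query (keeping GD_Ex).
    Stage 2: only the survivors are run through the (expensive)
    feature extractor and ranked by feature similarity.
    match_fn, feature_fn, similarity_fn, and threshold are assumed
    interfaces, not names taken from this document."""
    extracted = [(i, db) for i, db in enumerate(db_images)
                 if match_fn(query, db) >= threshold]
    q_feat = feature_fn(query)
    return sorted(((i, similarity_fn(q_feat, feature_fn(db)))
                   for i, db in extracted),
                  key=lambda pair: pair[1], reverse=True)
```

With toy stand-ins for the three functions, only the database entries passing the matching stage are scored and ranked.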
- In this example, one image was input to the image search system as a query image, and 100 database images similar to the image were searched for under Condition 1 and Condition 2.
- The query image is a schematic diagram showing a semiconductor manufacturing apparatus.
- As database images, circuit diagrams, circuit layout diagrams, block diagrams, and the like were prepared in addition to schematic diagrams showing semiconductor manufacturing apparatuses. The database images are drawings published in patent documents.
- steps S11 to S13 shown in FIG. 9 were performed to calculate the similarity of the database image to the query image.
- Step S17 was then performed to generate ranking data representing the first to 100th database images in descending order of similarity.
- Step S17 was also performed to generate ranking data representing the first to 100th database images in descending order of corrected similarity.
Description
FIG. 2 is a flowchart showing an example of a method for generating search data.
FIGS. 3A and 3B are diagrams showing configuration examples of a neural network.
FIG. 4 is a diagram showing an example of convolution processing and pooling processing.
FIG. 5 is a diagram showing a configuration example of a neural network.
FIGS. 6A and 6B are diagrams showing an example of a method for generating search data.
FIG. 7A is a diagram showing an example of a method for generating search data. FIG. 7B is a diagram showing a configuration example of a neural network.
FIGS. 8A and 8B are diagrams showing an example of a method for generating search data.
FIG. 9 is a flowchart showing an example of an image search method.
FIG. 10 is a diagram showing an example of an image search method.
FIGS. 11A and 11B are diagrams showing an example of an image search method.
FIGS. 12A and 12B are diagrams showing an example of an image search method.
FIG. 13 is a diagram showing an example of an image search method.
FIG. 14 is a flowchart showing an example of an image search method.
FIG. 15 is a diagram showing an example of an image search method.
FIGS. 16A and 16B are diagrams showing an example of an image search method.
FIG. 17 is a flowchart showing an example of an image search method.
FIGS. 18A and 18B are diagrams showing an example of an image search method.
FIG. 19 is a diagram showing an example of an image search method.
FIGS. 20A, 20B1, and 20B2 are diagrams showing an example of an image search method.
FIGS. 21A and 21B are diagrams showing an example of an image search method.
FIGS. 22A and 22B are diagrams showing an example of an image search method.
FIG. 23 is a flowchart showing an example of an image search method.
FIGS. 24A and 24B are diagrams showing an example of an image search method.
FIG. 25 is a diagram showing an example of an image search method.
FIG. 26 is a diagram showing an example of an image search method.
In this embodiment, an image search system and an image search method of one embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing a configuration example of an image search system 10. Note that, in the drawings attached to this specification, components are classified by function and shown as blocks independent of each other; however, it is difficult to completely separate actual components by function, and one component may be involved in a plurality of functions. Conversely, one function may involve a plurality of components; for example, a plurality of processes performed in the processing unit 13 may be executed by servers different from each other.
Image data and the like are supplied to the input unit 11 from the outside of the image search system 10. The image data and the like supplied to the input unit 11 are supplied to the processing unit 13, the storage unit 15, or the database 17 via the transmission path 12. As described above, the image data input to the input unit 11 is referred to as query image data.
The transmission path 12 has a function of transmitting image data and the like. Transmission and reception of information between the input unit 11, the processing unit 13, the storage unit 15, the database 17, and the output unit 19 can be performed via the transmission path 12.
The processing unit 13 has a function of performing arithmetic operations, inference, and the like using the image data and the like supplied from the input unit 11, the storage unit 15, the database 17, and the like. The processing unit 13 includes a neural network and can perform arithmetic operations, inference, and the like using the neural network. The processing unit 13 can also perform arithmetic operations and the like without using the neural network. The processing unit 13 can supply operation results, inference results, and the like to the storage unit 15, the database 17, the output unit 19, and the like.
The storage unit 15 has a function of storing a program executed by the processing unit 13. The storage unit 15 may also have a function of storing operation results and inference results generated by the processing unit 13, image data input to the input unit 11, and the like.
The database 17 has a function of storing image data to be searched. As described above, the image data stored in the database is referred to as database image data. The database 17 also has a function of storing operation results and inference results generated by the processing unit 13. Furthermore, it may have a function of storing image data and the like input to the input unit 11. Note that the storage unit 15 and the database 17 do not have to be separated from each other. For example, the image search system 10 may include a storage unit having the functions of both the storage unit 15 and the database 17.
The output unit 19 has a function of supplying information to the outside of the image search system 10. For example, it can supply operation results, inference results, and the like obtained in the processing unit 13 to the outside.
First, processing performed in advance in order to perform a search using the image search system 10 will be described. FIG. 2 is a flowchart showing an example of a method of this processing.
First, database image data GD DB is input from the database 17 to the processing unit 13 via the transmission path 12. The database image data GD DB can be data representing drawings included in intellectual property information. Here, examples of intellectual property information include publications such as patent documents (published patent applications, patent gazettes, and the like), utility model publications, design publications, and papers. Not only publications issued domestically but also publications issued in countries around the world can be used as intellectual property information.
Next, the database image data GD DB is input to the neural network included in the processing unit 13.
Next, the processing unit 13 acquires database tags TAG DB linked to the database image data GD DB. It is preferable to acquire the database tags TAG DB such that tags representing the concept, technical content, points of interest, and the like of the image corresponding to the database image data GD DB become the database tags TAG DB. FIG. 6A is a diagram showing an example of a method of acquiring the database tags TAG DB. Note that the illustration of each data item in FIG. 6A is an example and is not limited thereto. Similarly, the illustrations of data, vectors, and the like in the other drawings are examples and are not limited to the illustrated contents.
Next, the database tag TAG DB is represented by a vector. The vector representing the database tag TAG DB is referred to as a database tag vector TAGV DB. FIG. 7A is a diagram showing how the database tags TAG DB shown in FIG. 6A are represented by vectors.
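One simple way to picture the tag vectors: each tag word is mapped to a distributed representation vector through an embedding table. The table below is a hand-made toy standing in for a trained model (the claims specify only that the tag vectors are distributed representation vectors obtained with a neural network); the tag words and 3-dimensional values are illustrative assumptions, not data from this document.

```python
# Toy embedding table: in practice these vectors would come from a
# trained model, and semantically related tags would lie close together.
EMBEDDINGS = {
    "transistor": [0.9, 0.1, 0.0],
    "capacitor":  [0.8, 0.3, 0.1],
    "flowchart":  [0.0, 0.2, 0.9],
}

def tag_vectors(tags, table=EMBEDDINGS):
    """Map each tag (TAG_DB or TAG_Q) to its vector (TAGV_DB or TAGV_Q)."""
    return [table[t] for t in tags]
```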
First, a user of the image search system 10 inputs query image data GD Q to the input unit 11. The query image data GD Q is supplied from the input unit 11 to the processing unit 13 via the transmission path 12. Alternatively, the query image data GD Q may be stored in the storage unit 15 or the database 17 via the transmission path 12 and then supplied from the storage unit 15 or the database 17 to the processing unit 13 via the transmission path 12.
Next, the query image data GD Q is input to the neural network included in the processing unit 13. For example, the query image data GD Q can be input to the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B. As a result, the processing unit 13 can acquire query image feature data GFD Q representing the feature values of the query image data GD Q. For example, the data output from the layer 31[m] shown in FIG. 3A can be used as the query image feature data GFD Q. Alternatively, the data output from the pooling layer PL[m] shown in FIG. 3B can be used as the query image feature data GFD Q. Note that, like the database image feature data GFD DB, the query image feature data GFD Q may include the output data of two or more layers. When the query image feature data GFD Q includes the output data of many layers, the query image feature data GFD Q can represent the features of the query image data GD Q more accurately.
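A minimal sketch of how a convolution layer followed by a pooling layer yields feature data such as GFD Q, in pure Python on a single-channel image with one filter. The kernel values and pooling size are illustrative assumptions; a real implementation would use a trained CNN with many filters and layers.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as in most CNN
    implementations) of a single-channel image with one filter."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for top in range(ih - kh + 1):
        row = []
        for left in range(iw - kw + 1):
            row.append(sum(image[top + i][left + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(fmap[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def feature_data(image, kernel):
    """Flatten the pooled feature map into a feature vector (GFD)."""
    pooled = max_pool(convolve2d(image, kernel))
    return [v for row in pooled for v in row]
```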
Next, the processing unit 13 calculates the similarity of the database image data GD DB to the query image data GD Q.
Next, based on the calculation result of the similarity of the database image data GD DB to the query image data GD Q, the processing unit 13 acquires query tags TAG Q, which are tags linked to the query image data GD Q.
Next, the processing unit 13 acquires data D DB including the database image feature data GFD DB and the database tag vectors TAGV DB. The processing unit 13 also acquires data D Q including the query image feature data GFD Q and the query tag vectors TAGV Q.
Next, the processing unit 13 calculates the similarity of the data D DB to the data D Q. In the case shown in FIG. 13, the similarity to the data D Q is calculated for each of the data D DB [1] to the data D DB [100]. The calculated similarities can then be used as the similarities of the database image data GD DB [1] to the database image data GD DB [100] to the query image data GD Q. In this way, the similarity of the database image data GD DB to the query image data GD Q calculated by the processing unit 13 in step S13 can be corrected.
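One plausible way to compute the similarity of data D DB (image feature data plus tag vectors) to data D Q is to concatenate each side's components and take the cosine similarity, which is the measure named in the claims. The concatenation scheme itself is an assumption; the document does not specify exactly how the feature data and tag vectors are combined.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def corrected_similarity(feat_db, tagvecs_db, feat_q, tagvecs_q):
    """Similarity of data D_DB to data D_Q, computed here as the
    cosine similarity of the concatenated feature + tag vectors
    (an assumed combination scheme)."""
    d_db = feat_db + [x for vec in tagvecs_db for x in vec]
    d_q = feat_q + [x for vec in tagvecs_q for x in vec]
    return cosine_similarity(d_db, d_q)
```

Identical feature data and tag vectors on both sides give a similarity of 1.0; the score decreases as either component diverges.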
Next, the processing unit 13 generates ranking data including information on the ranking of the corrected similarities calculated in step S16 and outputs it to the outside of the image search system 10 as a search result.
In the image search method shown in FIG. 9 and the like, the user of the image search system 10 does not input query tags TAG Q; however, one embodiment of the present invention is not limited to this. FIG. 14 is a flowchart showing an example of an image search method using the image search system 10 in the case where the user of the image search system 10 manually inputs part of the query tags TAG Q. Note that even when the image search system 10 is operated by the method shown in FIG. 14, the processing shown in FIG. 2 is preferably performed in advance, as in the case where the image search system 10 is operated by the image search method shown in FIG. 9.
First, the user of the image search system 10 inputs query tags TAG Q to the input unit 11 in addition to the query image data GD Q. The number of query tags TAG Q input by the user of the image search system 10 and the contents of the query tags TAG Q can be set arbitrarily by the user. The user may also be allowed to set the total number of query tags TAG Q, including the query tags TAG Q automatically acquired in a later step.
Next, the query image data GD Q is input to the neural network included in the processing unit 13. For example, the query image data GD Q can be input to the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B. As a result, the processing unit 13 can acquire query image feature data GFD Q representing the feature values of the query image data GD Q.
Next, the processing unit 13 acquires data D DB including the database image feature data GFD DB and the database tag vectors TAGV DB. The processing unit 13 also acquires data D Q including the query image feature data GFD Q and the query tag vectors TAGV Q.
Next, the processing unit 13 calculates the similarity of the data D DB to the data D Q. The similarity can be calculated by a method similar to that shown in FIG. 13.
Next, based on the calculation result of the similarity of the data D DB to the data D Q, query tags TAG Q are added or modified.
Next, in response to the addition or modification of the query tags TAG Q, the tags included in the data D DB are added or modified. For example, the number of database tag vectors TAGV DB included in one piece of data D DB is made equal to the number of query tags TAG Q.
Next, the processing unit 13 calculates the similarity of the data D DB to the data D Q again. The similarity can be calculated by a method similar to that shown in step S24. In this way, the similarity of the data D DB to the data D Q can be corrected.
Next, the processing unit 13 generates ranking data including information on the ranking of the corrected similarities calculated in step S27 and outputs it to the outside of the image search system 10 as a search result. In this way, the user of the image search system 10 can check, for example, the similarity ranking of each database image with respect to the query image, the similarity values, the retrieved database images, the tags, and the like.
Next, the user of the image search system 10 checks whether the ranking data is the expected result. If it is the expected result, the search ends. If the expected result is not obtained, the user of the image search system 10 adds or modifies the query tags TAG Q, and then the process returns to step S23. The above is an example of an image search method using the image search system 10.
In Embodiment 1, the image search system 10 calculates the similarity of the query image data GD Q to the database image data GD DB by comparing the entire area of the database image data GD DB with the entire area of the query image data GD Q; however, one embodiment of the present invention is not limited to this. For example, the similarity of the database image data GD DB to the query image data GD Q may be calculated by comparing a partial area of the database image data GD DB with the entire area of the query image data GD Q. Alternatively, the similarity of the database image data GD DB to the query image data GD Q may be calculated by comparing the entire area of the database image data GD DB with a partial area of the query image data GD Q.
FIG. 17 shows an example of an image search method using the image search system 10 in the case where the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing a partial area of the database image data GD DB with the entire area of the query image data GD Q. First, the image search system 10 performs step S11 shown in FIG. 9 or step S21 shown in FIG. 14.
Next, the processing unit 13 compares the query image data GD Q with the database image data GD DB and extracts database image data GD DB that includes an area with a high degree of coincidence with the query image data GD Q. Here, the extracted database image data GD DB is referred to as extracted image data GD Ex. The comparison between the query image data GD Q and the database image data GD DB can be performed by, for example, area-based matching.
Next, the processing unit 13 extracts, from the extracted image data GD Ex, partial image data GD part, which is data in an area with a high degree of coincidence with the query image data GD Q. For example, when the degree of coincidence of each area of the database image data GD DB with respect to the query image data GD Q is calculated by the method shown in FIG. 18B, the area with the highest degree of coincidence is extracted as the partial image data GD part. Therefore, the number of pixel values included in the partial image data GD part can be equal to the number of pixel values included in the query image data GD Q.
Next, the processing unit 13 acquires the query image feature data GFD Q by inputting the query image data GD Q to the neural network included in the processing unit 13. The processing unit 13 also acquires the database image feature data GFD DB by inputting the partial image data GD part to the neural network. The query image data GD Q and the partial image data GD part can be input to, for example, the neural network 30 having the configuration shown in FIG. 3A or FIG. 3B. Note that when the image search system 10 is operated by the method shown in FIG. 17, step S02 shown in FIG. 2 does not have to be performed. That is, the database image feature data GFD DB representing the feature values of the entire area of the database image data GD DB does not have to be acquired.
FIG. 23 shows an example of an image search method using the image search system 10 in the case where the similarity of the database image data GD DB to the query image data GD Q is calculated by comparing the entire area of the database image data GD DB with a partial area of the query image data GD Q. First, the image search system 10 performs step S11 shown in FIG. 9 or step S21 shown in FIG. 14.
Next, the processing unit 13 compares the query image data GD Q with the database image data GD DB and extracts, as extracted image data GD Ex, database image data GD DB with a high degree of coincidence with a part of the query image data GD Q. As in step S31, the comparison between the query image data GD Q and the database image data GD DB can be performed by, for example, area-based matching.
Next, the processing unit 13 extracts, from the query image data GD Q, partial image data GD part-Q, which is data in an area with a high degree of coincidence with the extracted image data GD Ex. For example, when the degree of coincidence of each area of the query image data GD Q with respect to the database image data GD DB is calculated by the method shown in FIG. 24B, the area with the highest degree of coincidence is extracted as the partial image data GD part-Q. Therefore, the number of pixel values included in the partial image data GD part-Q can be equal to the number of pixel values included in the extracted image data GD Ex.
Next, the partial image data GD part-Q and the extracted image data GD Ex are input to the neural network included in the processing unit 13.
Claims (14)
- 1. An image search system comprising a database, a processing unit, and an input unit, wherein the database has a function of storing document data and a plurality of database image data; the processing unit has a function of acquiring, for each of the plurality of database image data, database image feature data representing feature values of the database image data; the processing unit has a function of generating a plurality of database tags using the document data and linking the database tags to the database image data; the processing unit has a function of acquiring, for each of the plurality of database tags, a database tag vector representing the database tag; the processing unit has a function of acquiring, when query image data is input to the input unit, query image feature data representing feature values of the query image data; the processing unit has a function of calculating, for each of the plurality of database image data, a first similarity that is a similarity of the database image data to the query image data; the processing unit has a function of acquiring, on the basis of the first similarity, a query tag linked to the query image data by using part of the database tags; the processing unit has a function of acquiring a query tag vector representing the query tag; the processing unit has a function of acquiring first data including the database image feature data and the database tag vector; the processing unit has a function of acquiring second data including the query image feature data and the query tag vector; and the processing unit has a function of calculating a second similarity that is a similarity of the first data to the second data.
- 2. The image search system according to claim 1, wherein the database tags include words.
- 3. The image search system according to claim 1 or 2, wherein the processing unit has a function of generating the database tags by performing morphological analysis on the document data.
- 4. The image search system according to any one of claims 1 to 3, wherein the processing unit includes a first neural network and a second neural network, the database image feature data and the query image feature data are acquired using the first neural network, and the database tag vector and the query tag vector are acquired using the second neural network.
- 5. The image search system according to claim 4, wherein the first neural network includes a convolutional layer and a pooling layer, and the database image feature data and the query image feature data are output from the pooling layer.
- 6. The image search system according to claim 4 or 5, wherein the database tag vector and the query tag vector are distributed representation vectors.
- 7. The image search system according to any one of claims 1 to 6, wherein the first similarity and the second similarity are cosine similarities.
- 8. An image search method using an image search system including an input unit and a database in which document data and a plurality of database image data are stored, the method comprising: acquiring, for each of the plurality of database image data, database image feature data representing feature values of the database image data; generating a plurality of database tags using the document data and linking the database tags to the database image data; acquiring, for each of the plurality of database tags, a database tag vector representing the database tag; inputting query image data to the input unit; acquiring query image feature data representing feature values of the query image data; calculating, for each of the plurality of database image data, a first similarity that is a similarity of the database image data to the query image data; acquiring, on the basis of the first similarity, a query tag linked to the query image data by using part of the database tags; acquiring a query tag vector representing the query tag; acquiring first data including the database image feature data and the database tag vector, and second data including the query image feature data and the query tag vector; and calculating a second similarity that is a similarity of the first data to the second data.
- 9. The image search method according to claim 8, wherein the database tags include words.
- 10. The image search method according to claim 8 or 9, wherein the database tags are generated by performing morphological analysis on the document data.
- 11. The image search method according to any one of claims 8 to 10, wherein the database image feature data and the query image feature data are acquired using a first neural network, and the database tag vector and the query tag vector are acquired using a second neural network.
- 12. The image search method according to claim 11, wherein the first neural network includes a convolutional layer and a pooling layer, and the database image feature data and the query image feature data are output from the pooling layer.
- 13. The image search method according to claim 11 or 12, wherein the database tag vector and the query tag vector are distributed representation vectors.
- 14. The image search method according to any one of claims 8 to 13, wherein the first similarity and the second similarity are cosine similarities.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217032757A KR20210145763A (ko) | 2019-03-29 | 2020-03-17 | 화상 검색 시스템 및 화상 검색 방법 |
DE112020001625.0T DE112020001625T5 (de) | 2019-03-29 | 2020-03-17 | Bildsuchsystem und Bildsuchverfahren |
JP2021510572A JPWO2020201866A1 (ja) | 2019-03-29 | 2020-03-17 | |
US17/439,684 US20220164381A1 (en) | 2019-03-29 | 2020-03-17 | Image retrieval system and image retrieval method |
CN202080023434.7A CN114026568A (zh) | 2019-03-29 | 2020-03-17 | 图像检索系统及图像检索方法 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019065757 | 2019-03-29 | ||
JP2019-065757 | 2019-03-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020201866A1 (ja) | 2020-10-08 |
Family
ID=72666145
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2020/052405 WO2020201866A1 (ja) | 2019-03-29 | 2020-03-17 | 画像検索システム、及び画像検索方法 |
Country Status (7)
Country | Link |
---|---|
US (1) | US20220164381A1 (ja) |
JP (1) | JPWO2020201866A1 (ja) |
KR (1) | KR20210145763A (ja) |
CN (1) | CN114026568A (ja) |
DE (1) | DE112020001625T5 (ja) |
TW (1) | TW202105200A (ja) |
WO (1) | WO2020201866A1 (ja) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115129825B (zh) * | 2022-08-25 | 2022-12-20 | 广东知得失网络科技有限公司 | 一种专利信息推送方法及系统 |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018073429A (ja) * | 2017-11-15 | 2018-05-10 | ヤフー株式会社 | 検索装置、検索方法および検索プログラム |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9607014B2 (en) * | 2013-10-31 | 2017-03-28 | Adobe Systems Incorporated | Image tagging |
US20170039198A1 (en) * | 2014-05-15 | 2017-02-09 | Sentient Technologies (Barbados) Limited | Visual interactive search, scalable bandit-based visual interactive search and ranking for visual interactive search |
US20150331908A1 (en) * | 2014-05-15 | 2015-11-19 | Genetic Finance (Barbados) Limited | Visual interactive search |
WO2016147260A1 (ja) * | 2015-03-13 | 2016-09-22 | 株式会社日立製作所 | 画像検索装置、及び画像を検索する方法 |
JP6345203B2 (ja) | 2016-05-19 | 2018-06-20 | 株式会社 ディー・エヌ・エー | 対象物の類似度判定のためのプログラム、システム、及び方法 |
US10534809B2 (en) * | 2016-08-10 | 2020-01-14 | Zeekit Online Shopping Ltd. | Method, system, and device of virtual dressing utilizing image processing, machine learning, and computer vision |
US10496699B2 (en) * | 2017-03-20 | 2019-12-03 | Adobe Inc. | Topic association and tagging for dense images |
US10515275B2 (en) * | 2017-11-17 | 2019-12-24 | Adobe Inc. | Intelligent digital image scene detection |
- 2020-03-17 WO PCT/IB2020/052405 patent/WO2020201866A1/ja active Application Filing
- 2020-03-17 KR KR1020217032757A patent/KR20210145763A/ko unknown
- 2020-03-17 CN CN202080023434.7A patent/CN114026568A/zh active Pending
- 2020-03-17 DE DE112020001625.0T patent/DE112020001625T5/de active Pending
- 2020-03-17 US US17/439,684 patent/US20220164381A1/en active Pending
- 2020-03-17 JP JP2021510572A patent/JPWO2020201866A1/ja active Pending
- 2020-03-25 TW TW109109919A patent/TW202105200A/zh unknown
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018073429A (ja) * | 2017-11-15 | 2018-05-10 | ヤフー株式会社 | 検索装置、検索方法および検索プログラム |
Also Published As
Publication number | Publication date |
---|---|
JPWO2020201866A1 (ja) | 2020-10-08 |
DE112020001625T5 (de) | 2021-12-23 |
US20220164381A1 (en) | 2022-05-26 |
TW202105200A (zh) | 2021-02-01 |
KR20210145763A (ko) | 2021-12-02 |
CN114026568A (zh) | 2022-02-08 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20782624; Country of ref document: EP; Kind code of ref document: A1
| ENP | Entry into the national phase | Ref document number: 2021510572; Country of ref document: JP; Kind code of ref document: A
| ENP | Entry into the national phase | Ref document number: 20217032757; Country of ref document: KR; Kind code of ref document: A
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20782624; Country of ref document: EP; Kind code of ref document: A1