WO2012176317A1 - Interest graph collection system using relevance search incorporating an image recognition system - Google Patents
- Publication number
- WO2012176317A1 (PCT application PCT/JP2011/064463, JP2011064463W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- image
- node
- relevance
- graph
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
- G06F16/532—Query formulation, e.g. graphical querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
- G06F18/2178—Validation; Performance evaluation; Active pattern learning techniques based on feedback of a supervisor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/20—Drawing from basic elements, e.g. lines or circles
- G06T11/206—Drawing of charts or graphs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/7635—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks based on graphs, e.g. graph cuts or spectral clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/778—Active pattern-learning, e.g. online learning of image or video features
- G06V10/7784—Active pattern-learning, e.g. online learning of image or video features based on feedback from supervisors
Definitions
- The present invention uses a relevance search system that includes an image recognition engine constructed on the server side and accessed via a network, so that the components of an image can be recognized by the image recognition engine and then explored through relevance search.
- Through this system, the user can visually search for and explore target objects of interest.
- The invention thus relates to a system that collects, on the server side, the interests of each user, of each specific user group, or of the entire user base as a graph.
- Some sites that sell goods over the Internet additionally present recommended products or related services derived from a user's purchase history or site browsing history, or, based on records of what other users who purchased the same product also bought, present recommendations for similar products on the terminals of users who have not yet purchased them. This makes it possible to offer products and services with higher matching accuracy to a more diverse user population (Patent Document 1).
- For services that send short messages of, for example, 140 characters or less over the Internet, it has also been proposed to exploit the fact that many interested users follow a specific sender or a specific topic, and to classify and analyze the content and themes in order to effectively locate users' interests (Patent Document 2).
- As another example of locating a user's interests, there is an apparatus that implements an algorithm for estimating a user's changing interests in real time from words that propagate between files viewed by the user (Patent Document 3).
- The apparatus disclosed in Patent Document 3 includes: means for inputting, from the user's browsing history, the words contained in a plurality of files as text for each file; means for dividing the text into words; means for extracting the "propagating words" referred to by the user across the plurality of files; means for storing one or more of these "propagating words"; means for obtaining, from the appearance frequency of all the "propagating words" over all files, a predetermined "influence degree" and an iDF value indicating the degree to which a "propagating word" occurs in a specific file; and means for extracting, as user profile information, the set of words of interest to the user according to an "influence degree - iDF value" that is a function of the "influence degree" and the iDF value.
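- As a minimal illustration of this idea, the sketch below scores words that "propagate" across the files a user has browsed. The token lists, the smoothing, and the exact form of the influence-degree times iDF score are assumptions made for illustration, not the definitions in Patent Document 3.

```python
import math
from collections import Counter

def propagating_word_scores(docs, influence=None):
    """Score "propagating words" across a user's browsed files (sketch)."""
    n = len(docs)
    df = Counter()                        # number of files each word appears in
    for tokens in docs:
        df.update(set(tokens))
    scores = {}
    for word, d in df.items():
        if d < 2:                         # must appear in 2+ files to "propagate"
            continue
        idf = math.log((1 + n) / d)       # assumed, smoothed iDF-style weight
        weight = (influence or {}).get(word, 1.0)   # assumed "influence degree"
        scores[word] = weight * d * idf
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Three browsed pages; "camera" propagates across all of them.
pages = [["camera", "lens", "review"], ["camera", "price"], ["travel", "camera"]]
print(propagating_word_scores(pages))
```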
- Also disclosed is an apparatus that expresses a content system as a graph composed of relationships between users and items, enabling a user to search content of interest easily and accurately on the basis of semantic information (Patent Document 4).
- The apparatus disclosed in Patent Document 4 includes: approximation degree measurement means that, supplied with interest ontology data representing interest ontologies in which each individual's interests are arranged in a class hierarchy, measures the degree of approximation between interest ontologies, that is, the closeness of interest between users;
- user graph forming means for forming user graph data from which user communities whose inter-ontology approximation degree falls within a predetermined range can be identified, based on the measurement results of the approximation degree measurement means;
- and user graph reconstruction means that, while managing the user graph data formed by the user graph forming means, reconstructs the relationships between users on a graph basis by attaching taxonomic semantic information to the edges connecting the users that form the nodes of the graph.
- In the present invention, image information containing various objects, together with their theme and surrounding situation, is used without relying on characters, and the location of interest, which differs from user to user, is sought through the user's search for and exploration of images of interest.
- The individual image components contained in these images are detected in real time with the help of an image recognition engine, and a wider range of relevance is developed from the recognized individual image components,
- so that the user can search for and explore objects of interest visually and interactively.
- In one form, the interest graph collection system is a search system that uses image information containing various objects and subjects as its input means, rather than input means using ideographic representations such as keywords, metadata, or sentences.
- Among the large number of images existing on the Internet or on a dedicated network, or images uploaded by the user to the Internet via a network terminal, the user selects an entire image of interest, or a specific area of the image, on the network terminal and submits the selection to the image recognition engine on the server side via the network, whereby the group of image components in the entire selected image or in the specified image area is recognized.
- For each recognized image component, a group of related elements is extracted based on the multidimensional feature vectors that describe the direct relationships between elements, stored in the relevance knowledge database within the relevance search engine,
- and the relationships between the nodes are presented as a relevance graph in which the recognized image components and the related elements extracted by the relevance search engine are the nodes.
- In the relevance search operation, the user selects an arbitrary node on the relevance graph displayed on the network terminal by tapping or touching it on the touch screen, or by moving a pointer cursor over it; or moves to an arbitrary area on the relevance graph by flicking on the touch screen toward that area, by dragging the entire screen to scroll with the pointer cursor, by equivalent operations with direction keys or the like, or by input operations with the same effect using the user's gestures, line of sight, voice, or brain waves.
- A new relevance graph centered on the selected node or the destination area, including the progress up to that point, is then additionally sent by the relevance search engine to the network terminal, so that the user can visually follow an ever-wider range of relevance while tracing the nodes or areas of interest on the relevance graph.
- In the relevance search operation, when the user selects a specific image component from the plurality of image components presented by the image recognition engine, or selects a specific node on the relevance graph displayed on the network terminal by double-tapping or pinching out on the touch screen, by pointer operation, or by an input operation with the same effect using the user's gestures, line of sight, voice, or brain waves, a more detailed relevance graph centered on that node can be expressed visually on the user's network terminal, and this series of operations can be applied to the nodes in turn.
- In the relevance search operation, instead of tracing a node selected and focused on the relevance graph, the user can re-submit the node's image over the network to the image recognition engine on the server side, thereby obtaining a new group of image components related to that node with the help of the image recognition engine, and developing a new group of related elements starting from those image components.
- When the user traverses such a series of relationships, the relevance search engine infers that the user recognizes and is making use of those relationships, and adaptively strengthens, on the multidimensional feature vectors describing the direct relationships between elements, the feature vector values representing the depth of the direct relationships between the nodes constituting the series, thereby enabling additional learning of the relevance knowledge database within the relevance search engine.
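- The two points above describe a graph whose edges carry multidimensional feature vectors and whose traced edges are adaptively strengthened. The sketch below models both; the dimension names, the learning rate, and the update rule are illustrative assumptions, not values taken from the patent.

```python
from collections import defaultdict

class RelevanceGraph:
    """Minimal sketch of a relevance knowledge database (assumed structure)."""
    DIMS = ("co_occurrence", "visual_similarity", "user_traced")

    def __init__(self):
        # each edge holds a multidimensional feature vector of relation depths
        self.edges = defaultdict(lambda: dict.fromkeys(self.DIMS, 0.0))

    def _key(self, a, b):
        return (a, b) if a <= b else (b, a)

    def relate(self, a, b, dim, value):
        self.edges[self._key(a, b)][dim] = value

    def neighbors(self, node):
        """Yield the related element group of a node with its feature vectors."""
        for (a, b), vec in self.edges.items():
            if node in (a, b):
                yield (b if node == a else a), vec

    def reinforce(self, path, rate=0.1):
        """Adaptively strengthen each edge the user traced (additional learning)."""
        for a, b in zip(path, path[1:]):
            vec = self.edges[self._key(a, b)]
            vec["user_traced"] += rate * (1.0 - vec["user_traced"])

g = RelevanceGraph()
g.relate("sneaker", "brand_logo", "visual_similarity", 0.8)
g.reinforce(["sneaker", "brand_logo"])     # the user traced this relationship
print(dict(g.edges))
```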
- In the relevance search operation, for the image components recognizable by the image recognition engine and the related elements corresponding to them, the relevance search engine sends reduced image thumbnails, generated from representative photographs, illustrations, characters, symbols, logos, favicons, and the like, to the network terminal in place of the original images, so that the nodes on the graph can be displayed and selected as image thumbnails.
- In the relevance search operation, a plurality of nodes can be submitted together to the image recognition engine on the server side, and logical operators (AND, OR) are introduced as an input condition selection function in the image recognition process: when AND is selected, the nodes directly related to all of the submitted nodes, and when OR is selected, the nodes directly related to any one or more of them, are expressed visually on the network terminal together with the depth of their mutual relationships.
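- A minimal sketch of this AND/OR selection follows; the adjacency structure and the depth values are invented for illustration.

```python
def related_nodes(edges, query_nodes, mode="AND"):
    """AND/OR input-condition selection over a relevance graph (sketch)."""
    # edges maps a node to {neighbor: depth of relation}
    neighbor_sets = [set(edges.get(q, {})) for q in query_nodes]
    hits = set.intersection(*neighbor_sets) if mode == "AND" \
        else set.union(*neighbor_sets)
    # return each hit with its total depth of relation to the query nodes
    return {n: sum(edges.get(q, {}).get(n, 0.0) for q in query_nodes)
            for n in hits}

edges = {
    "camera": {"lens": 0.9, "tripod": 0.6},
    "travel": {"tripod": 0.4, "backpack": 0.8},
}
print(related_nodes(edges, ["camera", "travel"], "AND"))  # {'tripod': 1.0}
print(related_nodes(edges, ["camera", "travel"], "OR"))
```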
- Also in the relevance search operation, after a plurality of nodes have been submitted to the image recognition engine on the server side, a relationship search operator (Connection Search) is introduced in the image recognition process, whereby the relationships among multiple nodes that appear entirely unrelated are sought as direct and indirect relationships involving each input node group.
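- One plausible realization of Connection Search is a breadth-first search for the shortest chain of direct relationships between the input nodes, as sketched below; the patent does not specify the actual algorithm.

```python
from collections import deque

def connection_search(adj, start, goal, max_hops=4):
    """Find how two seemingly unrelated nodes are indirectly connected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path                      # shortest chain of relationships
        if len(path) > max_hops:
            continue
        for nxt in adj.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

adj = {"espresso": ["coffee bean"], "coffee bean": ["espresso", "Brazil"],
       "Brazil": ["coffee bean", "carnival"], "carnival": ["Brazil"]}
print(connection_search(adj, "espresso", "carnival"))
```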
- In the relevance search operation, for a node that has only an indirect relationship with the user, or another node that has no relationship with the user at all, a connection operator (LIKE) that connects the node to the user as a direct relationship, and a disconnection operator (DISLIKE) that severs an existing direct relationship, are introduced.
- By these operators, the value representing the depth of the user's interest in the node is increased, decreased, or destroyed on the multidimensional feature vector describing the direct relationships between elements with the user as the central node, making it possible to update the interest graph corresponding to each user, centered on that user.
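- A minimal sketch of the LIKE/DISLIKE updates on the user-centered interest graph; representing the depth of interest as a single scalar per (user, node) pair and the step size are simplifying assumptions.

```python
def apply_operator(interest, user, node, op, step=0.2):
    """Apply a LIKE or DISLIKE operator to a user-centered interest graph."""
    # interest maps (user, node) to the depth-of-interest value
    key = (user, node)
    value = interest.get(key, 0.0)
    if op == "LIKE":                 # create or deepen a direct relationship
        interest[key] = min(1.0, value + step)
    elif op == "DISLIKE":            # weaken, and destroy when it reaches zero
        value -= step
        if value <= 0.0:
            interest.pop(key, None)
        else:
            interest[key] = value
    return interest

graph = {}
apply_operator(graph, "user42", "espresso", "LIKE")
apply_operator(graph, "user42", "espresso", "LIKE")
print(graph)   # {('user42', 'espresso'): 0.4}
```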
- In the relevance search operation, for nodes other than the user, operators asserting the existence or non-existence of a new direct relationship are introduced: a reference operator (REFERENCE) suggesting that multiple nodes should be directly connected, and a dereference operator (UNREFERENCE) suggesting that nodes already directly connected have a questionable direct association and should be disconnected.
- By these operators, the relevance search engine can update the values of the corresponding feature vectors, reflect the update on the network terminal as an updated relevance graph for the node group, and further notify all users of the update information concerning the existence or non-existence of the new direct relationship.
- Unlike information search means that require multilingual handling, such as character-based search, the system according to the present invention enables information search using images themselves as the input means, regardless of characters, and can therefore provide a language-free search system to users in a wider range of countries and regions. Furthermore, by replacing both the search input and the search results with image information instead of conventional characters, information can be searched for and discovered in a way that is more intuitive for human beings.
- FIG. 1 shows an embodiment of a system according to the present invention.
- The system 100 includes a server 101, a graph database (hereinafter also "GDB") 102A, a mother database (hereinafter also "MDB") 102B, and a plurality of network terminal devices 105a to 105d used by users.
- The server 101 is connected to the GDB 102A and the MDB 102B via a connection 103, and the server 101 and the network terminal devices 105a to 105d are connected to a network or the Internet 104.
- Here, a server is one or more computer programs that process data in response to requests from clients and provide the results as a service. A server may be implemented on a single computer system, distributed across a group of computer systems, implemented on one or more computer systems alongside other server functions, or configured to have a plurality of independent processing functions. In this specification, the term "server" is used in this sense.
- The computer system as hardware is, in its most basic configuration, an electronic computer having an arithmetic logic unit, a control unit, a storage device, and input/output devices connected by an instruction bus and a data bus. Based on information (bit data) input from the input/output devices via the input/output interface, arithmetic, logical, comparison, and shift operations are executed in the arithmetic logic unit. The resulting data is stored in the storage device as necessary and output from the input/output devices. This series of processes is controlled by software programs stored in the storage device.
- Each server machine used in the embodiments of the present invention is likewise hardware having at least the basic functions of a computer as described above, and is controlled by a group of programs such as an operating system, device drivers, middleware, and application software.
- FIG. 2 shows functional blocks of the server 101 and the GDB 102A and MDB 102B in an embodiment of the system according to the present invention.
- As software function blocks, the server 101 includes an area processing unit 201, a general object recognition unit 202, a specific object recognition unit 203, an MDB search unit 206, an MDB learning unit 207, an MDB management unit 208, a network communication control unit 204, a data search processing unit 205, a statistical information processing unit 209, a specific user filter processing unit 210, a graph calculation unit 221, a graph storage unit 222, a graph management unit 223, and a relevance calculation unit 224.
- The area processing unit 201, the general object recognition unit 202, the specific object recognition unit 203, the MDB search unit 206, the MDB learning unit 207, and the MDB management unit 208 constitute the image recognition engine 200.
- The image recognition engine 200 may be replaced by the image recognition system described later with reference to FIG. 6A.
- The graph calculation unit 221, the graph storage unit 222, the graph management unit 223, and the relevance calculation unit 224 constitute the relevance search engine 220.
- Although the functional blocks of the server 101 are not necessarily limited to these, these representative functions will be briefly described.
- The area processing unit 201 performs division of areas within an image, cutting out of partial images, and the like.
- The general object recognition unit 202 recognizes an object included in the image by a general name (category).
- The specific object recognition unit 203 collates the object with information registered in the MDB to identify it.
- The network communication control unit 204 performs image input/output processing, information communication control with the network terminals, and the like.
- The data search processing unit 205 collects information from link destinations and performs inquiries, collection, and searches.
- The MDB search unit 206 searches tag data such as the names of objects.
- The MDB learning unit 207 performs addition of new design data, addition of detailed information, registration and updating of time information, addition of incidental information, and the like.
- The MDB management unit 208 extracts feature points and feature amounts from design data, extracts category information from incidental information and registers it in the category data, and expands, divides, updates, integrates, and modifies category classifications in the category data, as well as adding new ones.
- The relevance search engine 220 includes at least the graph calculation unit 221, the graph storage unit 222, the graph management unit 223, and the relevance calculation unit 224.
- The graph calculation unit 221 processes the various graph computations executed on the server; the graph storage unit 222 develops the graph structure in memory from the node data and link data stored in the graph database, arranging the data format so that graph processing can be performed easily; and the graph management unit 223 manages and arbitrates the large number of graph operations executed by the graph calculation unit 221. The relevance calculation unit 224 calculates the relevance between nodes using graph mining techniques.
- The statistical information processing unit 209 performs statistical information processing using the graph data stored in the GDB 102A.
- The specific user filter processing unit 210 filters search results based on the user's subjectivity. For example, the user's interests based on co-occurrence probabilities can be processed by extracting a partial graph from the type information given to each node and applying graph mining.
- The GDB 102A is composed of node data 231 and link data 232. Although the GDB 102A is not necessarily limited to these, these representative functions will be briefly described.
- The node data 231 stores data related to the nodes; an example of the data structure will be described later with reference to FIG. 14A(D).
- The link data 232 stores data related to the links; an example of the data structure will be described later with reference to FIG. 14A(E).
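- The actual structures of the node data 231 and link data 232 are the ones defined in FIG. 14A; the dataclasses below are purely hypothetical shapes, sketched only to make the node/link split concrete.

```python
from dataclasses import dataclass, field

@dataclass
class NodeData:
    """Hypothetical shape for a node data 231 record."""
    node_id: int
    node_type: str                 # e.g. "user", "image", "object" (assumed)
    label: str
    thumbnail_url: str = ""

@dataclass
class LinkData:
    """Hypothetical shape for a link data 232 record."""
    link_id: int
    source: int                    # node_id of one endpoint
    target: int                    # node_id of the other endpoint
    feature_vector: dict = field(default_factory=dict)  # depth of relation per dimension

nodes = [NodeData(1, "user", "user42"), NodeData(2, "object", "espresso machine")]
links = [LinkData(10, 1, 2, {"interest": 0.7})]
print(links[0])
```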
- The MDB 102B includes design data 251, incidental information data 252, feature amount data 253, category data 254, and unspecified object data 255. The MDB 102B is not necessarily limited to these, but these representative functions will be briefly described.
- The design data 251 is generated from the databases used to manufacture an object, and holds the basic information necessary to construct the object, such as its structure, shape, dimensions, component connection information, layout, movable parts, movable range, weight, and rigidity.
- The incidental information data 252 holds all information related to the object, such as the name of the object, the manufacturer, the part number, the date and time, the material, the composition, and processing information.
- The feature amount data 253 holds the feature points and feature amount information of individual objects generated based on the design information.
- The category data 254 holds the information used when the general object recognition unit classifies an object.
- The unspecified object data 255 holds information about objects that cannot be recognized as specific objects at the present time; if objects with similar characteristics are frequently detected thereafter, they are newly registered as new specific objects.
- FIG. 3 shows a network terminal apparatus in an embodiment of the system according to the present invention.
- The network terminal devices 105a to 105d are client terminal devices widely used by users, and include computers, personal digital assistants (PDAs and pads), mobile phones, and the like; that is, the network terminal devices 105a to 105d represent a state in which a large number of electronic information devices of various types are connected to a network such as the Internet.
- In the following, the term "network terminal device 105" refers to any one of the network terminal devices 105a to 105d connected to the network.
- The network terminal devices 105a to 105d need not all be the same model; any terminal device having equivalent functions (or the minimum functions required) may be used.
- Typical functional blocks of the network terminal device 105 will now be described.
- In the network terminal 105 of FIG. 3, the moving image input function and the display function may exist together in one device or may exist in separate devices.
- In a device such as a mobile phone or a recent smartphone, the network terminal 105 includes an operation unit 105-01, a display unit 105-02, an audio input/output unit 105-03, an image transmission/reception unit 105-04, a camera unit 105-05, a network communication unit 105-06, a CPU 105-07, a storage unit 105-08, a power supply unit 105-09, a position information acquisition unit 105-10, and various sensor groups 105-11.
- In other devices, such as a video camera and a TV, the input and output functions exist as separate units.
- The operation unit 105-01 includes, for example, input devices such as a touch pad (including one built into the display), a key input unit, a pointing device, and a jog dial.
- The display unit 105-02 is a display unit having a resolution and a video memory appropriate to the output device.
- The audio input/output unit 105-03 includes input/output devices such as a microphone for voice recognition and a speaker.
- The image transmission/reception unit 105-04 includes a codec unit, a memory unit, and the like necessary for transmitting moving image data captured by the network terminal 105 to the server or for receiving moving image data distributed from the server; the moving image data also includes still images.
- The camera unit 105-05 is a moving image capturing unit including an imaging device such as a CCD or MOS sensor.
- The network communication unit 105-06 is an interface for connecting to a network such as the Internet, and may be either wired or wireless.
- The CPU 105-07 is a central processing unit; the storage unit 105-08 is a temporary storage device such as a flash memory; and the power supply unit 105-09 is a battery or the like that supplies power to the entire network terminal.
- The position information acquisition unit 105-10 is a position information detection device such as a GPS receiver, and the various sensor groups 105-11 include an acceleration sensor, a tilt sensor, a magnetic sensor, and the like.
- The image recognition process starts with the input of an original image, for example uploaded from the network terminal device 105 or collected by crawling from a server (S402).
- An original image that already exists on the server may also be used, and the original image may be either a two-dimensional image or a three-dimensional image.
- The target region of an object in the original image may be indicated through a device (not shown) such as a pointing device, or the entire original image may be input as the processing target without any indication of a point of interest.
- Next, general object recognition processing is performed in S404. For this recognition, a BoF (Bag-of-Features) technique can be adopted.
- Here, the category of the detected object (the general name of the object) is recognized, and the process branches at S405 depending on whether the category could be recognized. If the category could be recognized, the process proceeds to the specific object recognition process at S409.
- If the category could not be recognized, the process proceeds to S406, where, based on the information distance between the feature amount of the target object and the feature amounts of objects belonging to the existing categories known to the MDB 102B, it is determined whether to register a new category containing the target object (S407) or to consider extending an existing category close to the target object (S408). When a new category is registered (S407), the process returns to S404; when an existing category is extended (S408), the process proceeds to S409.
- In S411, it is determined whether the specific object has been identified. If the specific object could be identified, the process proceeds to S413, where it is determined whether the individual object image cut out in S409 contains more detailed information than the detailed data of that object registered in the MDB 102B. If the determination in S413 is Yes, the process proceeds to S414, where the MDB learning unit 207 updates the detailed data of the object in the MDB 102B so that it holds the more detailed information. If the determination in S413 is No, the process proceeds to S415 for the next determination.
- S415 covers the case where it was determined in S405 that the general object could not be recognized, the process then having proceeded through S408, S409, and S410 based on the determination in S406, and where the specific object was then identified (Yes in S411).
- In S415, it is determined whether the identified object belongs to the existing category. If so, the definition of the existing category registered in the MDB 102B is extended; if the extension would disperse the information distances between objects within the category, or the information distance to a neighboring category would become less than or equal to the information distances between objects within the category, or registering the identified object reveals an error in the information of an existing object, the necessary correction is made and the category data 254 is updated (S416).
- If the identified object does not belong to the existing category, the process jumps to S407, and the object is registered as a new category.
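- The control flow of S402 to S416 can be summarized as below. Every helper is a trivial stand-in (strings play the role of images) so that the sketch runs end to end; the thresholds and stubs are assumptions, not the patent's implementation.

```python
NEW_CATEGORY_THRESHOLD = 0.5           # assumed information-distance threshold

def general_object_recognition(image, categories):          # S404 (stub)
    return next((c for c in categories if c in image), None)

def specific_object_recognition(image, category, mdb):      # S409/S410 (stub)
    return mdb.get((category, image))

def recognize(image, mdb, categories):
    category = general_object_recognition(image, categories)
    if category is None:                                     # S405: not recognized
        # S406: compare the information distance to existing categories
        # (stubbed as a fixed value), then register new (S407) or extend (S408)
        distance = 1.0
        if distance > NEW_CATEGORY_THRESHOLD:
            categories.append(image)                         # S407: new category
            category = image
        else:
            category = categories[0]                         # S408: extend existing
    obj = specific_object_recognition(image, category, mdb)
    if obj is not None:                                      # S411: identified
        # S413/S414 would compare detail levels and let the MDB learning
        # unit update the MDB with the more detailed information
        print(f"identified {obj!r} in category {category!r}")
    return category, obj

categories = ["chair", "car"]
mdb = {("chair", "office chair"): "Model-X office chair"}
print(recognize("office chair", mdb, categories))
```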
- FIG. 5 is a flowchart showing another embodiment of the specific object recognition process and of part of the learning process shown in the preceding figure; it is described in detail below.
- The specific object recognition process starts from S501.
- As data input, in addition to an image of a single object, design data of the same layer can be used; that is, both design data linked to an image and design data itself can be used.
- Next, feature points and feature amounts in the original image are extracted and compared with the feature amount data generated from the MDB; two comparison methods are used.
- The first method maps, from the three-dimensional information of each minimum unit (design data, etc.) constituting the object, projections onto a two-dimensional plane from arbitrary angles, and generates from the mapped images the feature amounts used to identify the object. The feature amounts extracted from the input image are then compared with these in terms of where they appear and how frequently (S504). The feature amounts here are generated based on, for example, a contour extraction method or the SURF method.
- The second method maps the three-dimensional shape information, consisting of the set of minimum units (design data, etc.) constituting the object, onto a two-dimensional plane while varying the projection angle, magnification, and so on, and evaluates the difference between the feature points and feature amounts of the object as a degree of coincidence (tune method) (S505).
- In S506, it is determined whether the object has been identified. If it is determined that the object has been identified, the process proceeds to S510, where it is determined whether the data used for identification is more detailed or more up-to-date than the data in the MDB.
- If the object could not be identified, identification is attempted again by other means (S507 and S508, described below), after which it is determined once more in S506 whether the object has been identified.
- When the object has been identified, the process proceeds to S510 to determine whether the data used for identification is more detailed or more up-to-date than the MDB data; based on these determinations, the object-specific information (design data, etc.) and the time information (object type, version information) are updated and registered in the MDB, and the specific object recognition process is exited.
- In S508, in addition to or instead of the identification processing based on information other than image information shown in S507, collective intelligence can also be used for object identification. The processing of S508 is performed, for example, by searching an encyclopedia on the Internet or by automatically posting to a Q&A bulletin board.
- In the former case, a search query is created from the feature amounts generated from the MDB together with the category obtained by general object recognition, and the search is executed; new feature amounts are then extracted from the returned content to attempt identification of the object again. In the latter case, the original image is uploaded to the bulletin board together with the category obtained by general object recognition.
- FIG. 6A shows functional blocks of an image recognition system in another embodiment of the system according to the present invention.
- The image recognition system 202 shown in FIG. 6A can be operated as a part of the server 101, or as a server system independent of the server 101.
- The image recognition system 202 includes a scene recognition system for recognizing scenes, in addition to the general object recognition system and the specific object recognition system corresponding to the general object recognition unit and the specific object recognition unit in the server 101; in this respect it differs from the image recognition function unit in the server 101, and it is described in detail below.
- The image recognition system 202 includes a network communication control unit 204, an area processing unit 201, a data search processing unit 205, a general object recognition system 106, a scene recognition system 108, a specific object recognition system 110, an image category database 107, a scene component database 109, and an MDB 111.
- The general object recognition system 106 includes a general object recognition unit 106-01, a category recognition unit 106-02, a category learning unit 106-03, and a new category registration unit 106-04.
- The scene recognition system 108 includes an area extraction unit 108-01, a feature extraction unit 108-02, a weight learning unit 108-03, and a scene recognition unit 108-04; the specific object recognition system 110 includes a specific object recognition unit 110-01.
- The image category database 107 includes a category classification database 107-01 and unspecified category data 107-02.
- The scene component database 109 is composed of a scene element database 109-01 and a metadata dictionary 109-02; the MDB 111 is composed of detailed design data 111-01, supplementary information data 111-02, feature amount data 111-03, and unspecified object data 111-04.
- Although the functional blocks of the image recognition system 202 are not necessarily limited to these, these representative functions will be briefly described.
- The general object recognition system 106 recognizes an object included in an image by a general name or category.
- The categories here are hierarchical: even objects recognized as the same general object can be classified and recognized in further subdivided categories (the same "chair" may be, for example, a four-legged chair or a legless chair) or in larger categories (chairs, desks, and chests are all broadly classified into the "furniture" category).
- Category recognition means classification in this sense, that is, the proposition of classifying an object into a known class; a category is also called a class.
- The general object recognition unit 106-01 extracts local feature amounts from the feature points of an object in the input image, and performs processing to determine whether the object is a known general object by comparing those local feature amounts against the descriptions of predetermined feature amounts obtained in advance by learning.
- The category recognition unit 106-02 specifies or estimates, by collation with the category classification database 107-01, to which category (class) an object recognized as a general object belongs, and the result is stored in the database under that specific category. When additional feature amounts that can be added or modified are found, the category learning unit 106-03 re-learns and updates the description of the general object in the category classification database 107-01.
- When an object does not fit any known category, the new category registration unit 106-04 newly registers its feature amounts in the category classification database 107-01 and assigns a new general name.
- For scene recognition, characteristic components governing the whole or part of the input image are detected using a plurality of feature extraction systems with different properties, and collated with the descriptions in the scene component database 109.
- By referring to the scene element database 109-01 in a multidimensional space, the pattern with which each input element group is detected in a specific scene is obtained by statistical processing, and it is recognized what kind of scene the whole image, or the area governing a part of it, represents.
- By collating the metadata group attached to the input image with the components described in the metadata dictionary 109-02 registered in advance in the scene component database 109, the accuracy of scene detection can be further improved.
- The area extraction unit 108-01 divides the entire image into a plurality of areas as necessary, enabling scene discrimination for each area. For example, a high-resolution surveillance camera installed on the roof of a building in an urban space can overlook a plurality of scenes, such as intersections and the entrances of many stores.
- The feature extraction unit 108-02 inputs the recognition results obtained from the various available feature amounts, such as the local feature amounts of the plurality of feature points detected in the designated image area, color information, and object shape, to the weight learning unit 108-03 in the subsequent stage; the probability that each element co-occurs in a specific scene is obtained and input to the scene recognition unit 108-04, which performs the final scene discrimination for the input image.
- The specific object recognition system 110 sequentially compares the features of an object detected in the input image with the features of the specific object groups stored in advance in the MDB, and finally performs the object identification process.
- Since the total number of specific objects existing on earth is enormous, it is not practical to collate against all of them; as described later, the object category and the search range therefore need to be narrowed down within a predetermined range before specific object recognition.
- The specific object recognition unit 110-01 compares the local feature amounts at the detected feature points with the feature parameter groups in the MDB obtained by learning, and determines by identification processing which specific object the object corresponds to.
- The MDB describes the detailed data available at that time regarding each specific object.
- The detailed design data 111-01 holds in the MDB the basic information necessary for constructing and manufacturing an object, such as its finishing.
- The incidental information data 111-02 holds all information related to the object, such as the name of the object, the manufacturer, the part number, the date, the material, the composition, and processing information.
- The feature amount data 111-03 holds the feature points and feature amount information of individual objects generated based on the design information.
- The unspecified object data 111-04 provisionally stores in the MDB, for future analysis, data on objects that at that time do not correspond to any specific object.
- The MDB search unit 110-02 provides a function for searching the detailed data corresponding to a specific object, and the MDB learning unit 110-03 adds to and modifies the description contents of the MDB through an adaptive, dynamic learning process. When objects with similar characteristics are frequently detected among the unspecified object data 111-04, the new MDB registration unit 110-04 newly registers them as a new specific object.
- FIG. 6B shows a system configuration example and functional block example of the general object recognition unit 106-01.
- Although the functional blocks of the general object recognition unit 106-01 are not necessarily limited to these, the general object recognition method when Bag-of-Features (hereinafter BoF) is applied as a representative feature extraction method is briefly described below.
- The general object recognition unit 106-01 includes a learning unit 106-01a, a Visual Word dictionary (CodeBook) 106-01e, a vector quantization unit 106-01f, a vector quantization histogram unit 106-01g, and a vector quantization histogram identification unit 106-01h.
- The learning unit 106-01a includes a local feature amount extraction unit 106-01b, a clustering unit 106-01c, and a Visual Word creation unit 106-01d; the vector quantization histogram identification unit 106-01h includes a Support Vector Machine (hereinafter SVM) unit 106-01i.
- BoF extracts the feature points appearing in an image by various methods, expresses them as a collection of a large number of local feature amounts (Visual Words) without using their relative positional relationships, compares them against a Visual Word dictionary (CodeBook) 106-01e extracted by learning from known objects, and determines which object the appearance frequencies of these local features most closely match. It is widely known as a representative object recognition method.
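- A compact sketch of the BoF pipeline just described, using scikit-learn for the clustering and the SVM; random vectors stand in for real 128-dimensional local descriptors, and the dictionary size of 32 is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

# BoF: local descriptors -> Visual Word dictionary (CodeBook) via k-means ->
# vector-quantization histogram per image -> SVM class decision.
rng = np.random.default_rng(0)
train_descs = [rng.normal(size=(50, 128)) + label        # descriptors per image
               for label in (0, 1) for _ in range(5)]
labels = [0] * 5 + [1] * 5

codebook = KMeans(n_clusters=32, n_init=4, random_state=0)
codebook.fit(np.vstack(train_descs))                     # learn the Visual Words

def bof_histogram(descs):
    words = codebook.predict(descs)                      # vector quantization
    hist = np.bincount(words, minlength=32).astype(float)
    return hist / hist.sum()                             # normalize to sum 1

X = np.array([bof_histogram(d) for d in train_descs])
clf = SVC(kernel="linear").fit(X, labels)                # class (general object)
print(clf.predict([bof_histogram(rng.normal(size=(50, 128)) + 1)]))  # likely [1]
```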
- FIG. 7A shows a case using the Scale-Invariant Feature Transform (hereinafter SIFT) as a representative method of local feature amount extraction.
- SIFT is one of the feature point detection and feature amount extraction algorithms that are robust to changes in image size, rotation, and illumination.
- SIFT detects the distribution of a plurality of characteristic luminance gradients in a single image by applying differently smoothed versions of the original image and taking their differences (Difference-of-Gaussians, hereinafter DoG), and extracts the extreme values (centroid positions) as representative points, which become the feature points (key points).
- Next, the scale at each feature point is obtained from the opening amount of the Gaussian window used, and the local feature amount over its dominant range is calculated.
- On edges, which appear frequently in an image, the opening is extremely small, and such points are excluded from the key points because useful feature amounts are difficult to obtain there; points with a small DoG output are also excluded from the key points because they are highly likely to be affected by noise contained in the original image.
- In FIG. 7A, the key points detected by these processes and their scales are indicated by white circles.
- Next, a representative orientation (the direction of the principal component) is obtained for each key point. For example, luminance gradient strengths are obtained for all 36 directions in increments of 10 degrees, and the orientation at which the strength takes its maximum value is adopted as the orientation representing the key point; that is, the representative direction of the main luminance gradient is obtained and used as the main orientation of each key point.
- The entire surrounding area, scaled according to each key point's scale, is then divided into a total of 16 blocks (4 x 4) while being rotated according to the orientation obtained above, and a gradient direction histogram over 8 directions in 45-degree increments is computed within each block. From these, a 128-dimensional feature vector (16 blocks x 8 directions) is generated.
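- The same steps are available in OpenCV's SIFT implementation; the sketch below (a synthetic test image; opencv-python with SIFT support is assumed) prints each key point's position, scale, and orientation and the shape of the 128-dimensional descriptors.

```python
import cv2
import numpy as np

img = np.zeros((256, 256), np.uint8)
cv2.rectangle(img, (60, 60), (200, 180), 255, -1)   # synthetic shape with corners

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

for kp in keypoints[:3]:
    # each key point carries the position, scale, and orientation described above
    print(f"pt={kp.pt}, scale={kp.size:.1f}, orientation={kp.angle:.1f} deg")
if descriptors is not None:
    print(descriptors.shape)    # (number of key points, 128)
```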
- The 128-dimensional feature vectors obtained by the local feature amount extraction unit 106-01b constituting the learning unit 106-01a are clustered into multidimensional feature vector groups by the subsequent clustering unit 106-01c.
- The Visual Word creation unit 106-01d then generates a Visual Word from each centroid vector. As clustering methods, the k-means method and the mean-shift method are known.
- The generated Visual Words are stored in the Visual Word dictionary (CodeBook) 106-01e. At recognition time, the Visual Words extracted from the input image are collated against the Visual Word dictionary 106-01e, the vector quantization unit 106-01f performs vector quantization for each feature, and the vector quantization histogram unit 106-01g generates a histogram over the dimensions.
- FIG. 7B shows an example of a generated Visual Word dictionary (CodeBook), and FIG. 7C shows an example of an extracted vector quantization histogram. In general, the total number of bins (the number of dimensions) in the histogram is as large as thousands to tens of thousands, so there are many histogram bins with no matching features at all. Normalization is performed so that the sum of all bin values of the histogram becomes 1.
- The obtained vector quantization histogram is input to the vector quantization histogram identification unit 106-01h in the subsequent stage, and a representative vector classifier, the Support Vector Machine (SVM) 106-01i, recognizes the class (general object) to which the object belongs.
- The recognition result here can also be used in the learning process for the Visual Word dictionary; recognition judgments, including those obtained by other methods, can be used as learning feedback so that the dictionary describes the features of each class most appropriately, and adaptive correction and calibration can be continued so as to maintain a good degree of separation from other classes.
- FIG. 8 shows a schematic block diagram of the entire general object recognition system 106 including the general object recognition unit 106-01.
- General objects belong to various categories, and these have a multiple hierarchical structure. For example, humans belong to the higher category "mammals", mammals belong to the higher category "animals", and so on; a human can also be recognized along other category dimensions, such as hair color, eye color, or adult versus child.
- The existence of the category classification database 107-01 is indispensable for making these recognition judgments. It is a collection of the "knowledge" of civilization up to the present, and it will continue to evolve as new "knowledge" is added through future learning and discovery.
- The classes identified by the general object recognition unit 106-01 have various multidimensional and hierarchical structures, and these are described in the category classification database 107-01.
- A recognized general object is checked against the category classification database 107-01, and the category recognition unit 106-02 recognizes the category to which it belongs. The recognition result is then delivered to the category learning unit 106-03, and its consistency with the description in the category classification database 107-01 is checked in detail; an object recognized as a general object often carries a plurality of recognition results.
- When a new category is found, the new category registration unit 106-04 registers the information in the category classification database 107-01. Objects that are unknown at that time are temporarily stored in the category classification database 107-01 as unspecified category data 107-02 for future analysis and collation.
- FIG. 9 is a block diagram showing a typical embodiment according to the present invention of the scene recognition system 108, which recognizes and determines the scene included in an input image.
- In general, a plurality of objects can be recognized from a learning image or an input image. For example, if objects such as "trees", "grass", and "animals" can be recognized simultaneously with areas such as "sky", "sun", and "ground", it can be inferred from the overall landscape and the co-occurrence relationships with the other discovered objects whether the scene is a "zoo" or "Africa"; if a fence, a bulletin board, or the like is also detected, for example, a "zoo" becomes the more plausible inference.
- The scene recognition system 108 includes an area extraction unit 108-01, a feature extraction unit 108-02, a strong identification unit 108-03, a scene recognition unit 108-04, and a scene component database 109.
- The feature extraction unit 108-02 includes a general object recognition unit 108-05, a color information extraction unit 108-06, an object shape extraction unit 108-07, a context extraction unit 108-08, and weak classifiers 108-09 to 108-12; the scene recognition unit 108-04 includes a scene classification unit 108-13, a scene learning unit 108-14, and a new scene registration unit 108-15; and the scene component database 109 includes a scene element database 109-01 and a metadata dictionary 109-02.
- The area extraction unit 108-01 performs area extraction on the target image in order to extract the features of the target object effectively, without being affected by the background or other objects.
- As an area extraction method, a graph-based region segmentation method (Graph-Based Image Segmentation) or the like is known.
- The extracted object images are input to the local feature amount extraction unit 108-05, the color information extraction unit 108-06, the object shape extraction unit 108-07, and the context extraction unit 108-08, and the feature amounts obtained from each of these extraction units are classified by the weak classifiers 108-09 to 108-12 and integrated into a modeled multidimensional feature amount group.
- These modeled feature amount groups are input to the strong classifier 108-03, which has a weighted learning function, to obtain the final recognition determination result for the object image.
- Examples of the weak classifiers include SVMs, and examples of the strong classifier include AdaBoost.
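- The weak/strong split can be sketched with scikit-learn, where AdaBoost (whose default weak learner is a depth-1 decision stump) plays the strong classifier with a weighted learning function; the synthetic feature columns stand in for the local-feature, color, shape, and context cues, and all numbers are invented.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))                  # 4 weak feature channels (cues)
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # label depends on cues 0 and 2

# AdaBoost trains weak classifiers and learns a weight for each of them
strong = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X[:150], y[:150])
print("held-out accuracy:", strong.score(X[150:], y[150:]))
```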
- An input image often includes a plurality of objects and the plurality of categories that are their superordinate concepts, and a human can call to mind a specific scene or situation (context) from it at a glance.
- For a machine, however, it is difficult to determine directly what kind of scene the input image represents. The circumstances in which these objects exist, their relative positions, and the probability that the objects or categories appear at the same time (their co-occurrence relationships) therefore carry an important meaning for the subsequent scene discrimination.
- The object groups and category groups that could be image-recognized in the preceding stage are collated against the frequent-appearance probabilities of each element group for each scene described in the scene element database 109-01, and the result is passed to the scene recognition unit 108-04 in the subsequent stage.
- The metadata 109-02 attached to an image can also be useful information. However, metadata attached by humans may be misleading, plainly wrong, or capture the image only indirectly as a metaphor, and does not always correctly represent the objects or categories in the input image. Even in such cases, it is desirable that object or category recognition is ultimately performed in consideration of the results obtained by the image recognition system, or by a knowledge information system based on co-occurrence relationships.
- A plurality of scenes may also be obtained from one image (for example, the "sea" and the "beach" at the same time); in that case, a plurality of scene names are assigned together. When it is difficult to judge from the image alone whether the scene name to be attached should be "sea" or "beach", a final decision will need to be made with the help of a knowledge database, based on co-occurrence relationships.
- FIG. 10A shows a description example of the scene element database 109-01.
- Scene (A) includes a plurality of categories, category m and category n; general object α and general object β are components of category m, and general object γ, specific object δ, and specific object ε are components of category n, each described together with its appearance probability.
- FIG. 10(B) shows an example of the components of the scene "intersection".
- If roads such as a multi-lane "main road", a one-lane-each-way "general road", or a "sidewalk" are recognized, and at the same time road surface markings such as a "lane separation display", a "pedestrian crossing display", and a "travel direction indication" are found on the "road", it can be inferred with considerable probability that the scene is an "intersection" or a place close to an "intersection".
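- A toy version of this inference follows: each scene lists appearance probabilities for its components, as in the scene element database, and a naive log-probability vote picks the scene that best explains the detected element group. All probabilities below are invented for illustration.

```python
import math

SCENE_ELEMENTS = {
    "intersection": {"road": 0.95, "pedestrian crossing display": 0.7,
                     "traffic signal": 0.6, "sidewalk": 0.5},
    "zoo":          {"animal": 0.9, "fence": 0.7, "bulletin board": 0.5},
}

def infer_scene(detected, prior=0.01):
    scores = {}
    for scene, elements in SCENE_ELEMENTS.items():
        # co-occurrence score: sum of log appearance probabilities, with a
        # small prior for detected elements the scene does not list
        scores[scene] = sum(math.log(elements.get(obj, prior)) for obj in detected)
    return max(scores, key=scores.get)

print(infer_scene({"road", "pedestrian crossing display", "traffic signal"}))
```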
- FIG. 11 shows a configuration example and function blocks of the entire system of the specific object recognition system 110.
- The specific object recognition system 110 includes the general object recognition system 106, the scene recognition system 108, the MDB 111, a specific object recognition unit 110-01, an MDB search unit 110-02, an MDB learning unit 110-03, and a new MDB registration unit 110-04.
- The specific object recognition unit 110-01 includes a two-dimensional mapping unit 110-05, an individual image clipping unit 110-06, a local feature amount extraction unit 110-07, and a clustering unit 110-08.
- Only after the class (category) to which the target object belongs has been recognized by the general object recognition system 106 can the process move on to narrowing down whether the object can further be recognized as a specific object. If the class is not specified to some extent, the search would be forced over a myriad of specific objects, which is not practical in terms of time or cost.
- This narrowing can be refined further on the basis of the recognition results of the scene recognition system 108.
- The MDB search unit 110-02 sequentially extracts the detailed data and design data of a plurality of object candidates from the MDB 111 from among the narrowed-down possibilities, and proceeds to a matching process with the input image based on them. Even when the object is not an artificial object, or when detailed design data itself does not exist, a given object can be recognized to some extent by matching individual features in detail if a photograph or the like is available. In rare cases, however, the input image and the comparison image look almost identical yet are recognized as different objects.
- The two-dimensional mapping unit 110-05 visualizes (renders) the three-dimensional data in the MDB according to the appearance of the input image, which makes extremely accurate feature matching possible. Since performing detailed rendering in all directions in the two-dimensional mapping unit 110-05 would cause an unnecessary increase in computation time and cost, the rendering needs to be narrowed down according to the appearance of the input image.
- In the learning process, the various feature amount groups of an object obtained from high-precision rendered images based on the MDB can be obtained in advance, which is more effective in constructing a practical system.
- In the learning process, the local feature amounts of the object are detected by the local feature amount extraction unit 110-07, each feature amount is separated into a plurality of similar feature groups by the clustering unit 110-08, converted into multidimensional feature amount sets by the Visual Word creation unit 110-09, and registered in the Visual Word dictionary 110-10. This is continued until sufficient recognition accuracy is obtained for a large number of learning images. If the learning images are photographs, insufficient image resolution, the influence of noise, occlusion, and the influence of objects other than the target object image are unavoidable; but if the learning images are based on the MDB, feature extraction can be performed under ideal conditions, and a specific object recognition system with significantly improved resolving power compared to conventional methods can be configured.
- For the input image, after the region of the target specific object is cut out by the individual image clipping unit 110-06, feature points and feature amounts are extracted by the local feature amount extraction unit 110-07 and compared with those prepared in advance through learning.
- Using the Visual Word dictionary 110-10, vector quantization is performed for each feature amount; the vector quantization histogram unit 110-12 then expands them into a multidimensional feature histogram, and the vector quantization histogram identification unit 110-13 determines whether the object is the same as the reference object.
- For this identification, classifiers such as an SVM (Support Vector Machine) or AdaBoost can be used.
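- A minimal sketch of the Visual Word pipeline described above (units 110-08 through 110-13), assuming NumPy and scikit-learn as the implementation libraries and an illustrative dictionary size; the patent does not specify an implementation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_visual_word_dictionary(local_features: np.ndarray, n_words: int = 256) -> KMeans:
    """Cluster local feature vectors into visual words (clustering unit 110-08,
    creation unit 110-09, Visual Word dictionary 110-10)."""
    return KMeans(n_clusters=n_words, n_init=10).fit(local_features)

def vq_histogram(dictionary: KMeans, image_features: np.ndarray) -> np.ndarray:
    """Vector-quantize one image's local features into a normalized visual-word
    histogram (vector quantization histogram unit 110-12)."""
    words = dictionary.predict(image_features)
    hist, _ = np.histogram(words, bins=np.arange(dictionary.n_clusters + 1))
    return hist / max(hist.sum(), 1)

def train_identifier(histograms: np.ndarray, labels: np.ndarray) -> SVC:
    """Train an SVM to decide whether a histogram matches the reference object
    (identification unit 110-13); AdaBoost would be an equivalent choice."""
    return SVC(kernel="rbf").fit(histograms, labels)
```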
- In addition to the local feature amount, the shape feature of the object can be used for the purpose of further improving detection accuracy.
- the object cut out from the input image is input to the shape comparison unit 110-16 via the shape feature amount extraction unit 110-15, and is identified using the shape feature of the object.
- the result is fed back to the MDB search unit 110-02, and narrowing down to MDBs corresponding to possible specific objects is performed.
- As a means of shape feature quantity extraction, HoG (Histograms of Oriented Gradients) is known.
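- A sketch of extracting a HoG shape descriptor, assuming scikit-image as the library (an assumed choice; the file name and parameters are illustrative):

```python
from skimage import color, io
from skimage.feature import hog

# Hypothetical clipped object image produced by the individual image clipping unit.
image = color.rgb2gray(io.imread("clipped_object.png"))
shape_descriptor = hog(
    image,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
)
# shape_descriptor can now be compared with descriptors computed from MDB renderings.
```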
- the shape feature amount is also useful for the purpose of reducing unnecessary rendering processing for obtaining a two-dimensional map using MDB.
- the color characteristics and surface treatment (texture) of the object are useful for the purpose of improving the image recognition accuracy.
- The clipped input image is input to the color information extraction unit 110-17, the color information or texture of the object is extracted and compared by the color comparison unit 110-18, and the result is fed back to the MDB search unit 110-02, making it possible to further narrow down the MDBs to be compared. Through this series of processes, specific object recognition is performed effectively.
- FIGS. 12A to 12E illustrate a user interface in an embodiment of the system according to the present invention.
- In FIG. 12A(A), on the display of the network terminal device 105, several images including images 1201 and 1202, a relevance search window 1203, and an output window (OUTPUT) 1205 are displayed.
- the images 1201 and 1202 are images, illustrations, characters, symbols, and the like representing image component groups that can be recognized by the image recognition engine 200 and related element groups corresponding to the image component groups.
- A thumbnail image generated from a logo, favicon, or the like is an image tile sent by the relevance search engine to the network terminal in place of the original image, and can be dragged to an arbitrary position on the screen by the user's finger (1206) or the like.
- the relevance search window 1203 can be arranged on an arbitrary screen such as a home screen related to the network terminal device 105 or a screen managed by a specific application operating on the home screen.
- The system can be configured so that, while staying on the home screen after starting the network terminal device 105, the user can at any time select the entire image to be searched or a specific area of the image, drag and drop it onto the relevance search window 1203, and thereby start the image recognition and the subsequent relevance search process.
- FIG. 12A (A) shows an operation in which the user drags and drops the image 1201 to the relevance search window 1203.
- Alternatively, without preparing the relevance search window 1203 in particular, any interface may be adopted as long as the user can select the entire image of interest or a specific area of the image on the network terminal 105, and the selected image can be transmitted to the server side via the network to query the image recognition engine 200.
- For example, by an operation of explicitly double-tapping the entire image to be searched, or a specific image area, on the display screen of the network terminal 105, the server-side image recognition engine 200 can be queried for recognition processing of the selected image.
- On a PC or the like, instead of an input operation on a touch panel, a pointing device 1204 such as a mouse is used to move the cursor 1207 onto the image 1201 and drag and drop the target image 1201 directly onto the relevance search window 1203a (or onto the icon 1203b associated with the relevance search), or the mouse cursor is double-clicked on the image 1201; in this way, too, the server-side image recognition engine 200 can be queried for recognition processing of the selected image.
- FIG. 12B(A) shows the group of image components sent from the server 101 to the network terminal 105 as a result of the relevance search for the selected image 1201, together with other related element groups highly relevant to each node.
- The relevance graph is displayed on the entire screen of the network terminal 105, and each node on the relevance graph can be traced seamlessly from left to right by flicking (1210) those nodes on the touch screen.
- When a specific node is selected, it is also possible for the entire display to be automatically scrolled on the network terminal 105 side so that the relevance graph is displayed centered on that node.
- FIG. 12B (A) shows an example of the relevance graph, and shows a state where a part of the graph is cut out and drawn on the network terminal 105.
- The size of the actual relevance graph is much larger than in this example; the nodes belonging to the area 1209 that cannot be displayed on the network terminal 105, together with the link information expressing the relevance between them, are additionally sent by the relevance search engine 220 to the network terminal 105 in accordance with the user's scrolling operation, so that a wide range of relevance spanning a plurality of nodes can be presented visually to the user while the user seamlessly traces nodes or regions of interest on the relevance graph.
- In FIG. 12B(A), as a result of flicking, orange juice 1221 and grape 1222 are displayed as related elements of grape juice 1220, and fruit groups 1223 to 1226 are further displayed as related elements of grape 1222.
- Suppose the grape juice 1220 in FIG. 12B(A) is explicitly selected (double tap, touch, etc.) and sent via the network to the image recognition engine 200 on the server side.
- As a result, the bottle cap 1231 and the bottle 1232, which are image components recognized by the image recognition engine 200, are displayed, together with the manufacturer's logo 1233.
- The scroll operation may be a gesture by the user, or an input operation having the same effect using line of sight, voice, brain waves, or the like (although not shown in the figure, many sensing techniques already in practical use can be introduced for detecting gestures, including pinch-in/pinch-out, as well as eye gaze, brain waves, and so on).
- the relevance graph can be arranged in a three-dimensional space or a multidimensional space.
- The representation of the relevance graph is not limited to a geometric graph that visually depicts a plurality of nodes together with their relevance and relevance strength; for a mobile terminal or the like whose image display sizes must be uniform, it is also useful to represent the graph as a set of equal-sized image tiles arranged in a tiled layout.
- It is useful to display, side by side in allocated areas on the display screen of the network terminal 105, (1) the first input image (1501), (2) the plurality of image component candidate groups (1251) detected and recognized by the image recognition engine, and (3) for each of these individual image components, the other related element groups (1252 and 1253) as distinct groups.
- The related element groups (1252 and 1253) can be displayed layer by layer according to the degree of relevance, such as primary connection, secondary connection, tertiary connection, and so on.
- the screen can be scrolled (1254) at high speed, and the entire relevance can be effectively browsed.
- the strength of the relationship between the nodes can be added as data such as numerical values and symbols in the vicinity of each node.
- By querying the server-side image recognition engine 200 again with an arbitrary node image of the relevance graph expressed in these tiles, and obtaining a new image component group from that input image, it is possible to acquire a new relevance graph starting from the relevance search engine 220.
- In that case, a new image component detection and image recognition request for the image is made from the network terminal 105 to the server 101.
- The image recognition engine 200 detects and recognizes a new image component group on the server side and returns the result to the network terminal 105, so that it can be presented anew as an image recognition element group on the display screen of the network terminal 105 (FIG. 12B(B)).
- These new image component groups may be displayed instead of the conventional relevance graph, or may be displayed superimposed on the conventional relevance graph using a translucent display function or the like.
- The original relevance graph can then be restored.
- The user can also double-tap a node representing an image component to display a new relevance graph centered on that image component.
- In that case, the new related node group is sent from the server side to the network terminal 105, and the user can acquire a new relevance graph.
- the collection of interest graphs involves the image recognition engine 200, the relevance search engine 220, the statistical information processing unit 209, and the specific user filter processing unit 210. All of these can be operated as a part of the server 101, or can be operated as a server system independent of the server 101.
- FIG. 13 shows a configuration example of a detailed functional block of the graph calculation unit 221 in the relevance search engine 220.
- The graph calculation unit 221 includes a subgraph generation unit 1301, a multidimensional feature vector generation unit 1302, and a related element node extraction unit 1303, and exchanges data with the graph database 102A and the relevance knowledge database 1310 as necessary.
- The subgraph generation unit 1301 receives as input a node corresponding to an image component extracted by the image recognition engine 200, and generates a subgraph of that node while accessing the GDB 102A.
- The multidimensional feature vector generation unit 1302 generates a multidimensional feature vector from the subgraph through the calculation in the relevance calculation unit 224 (FIG. 18, described later).
- The related element node extraction unit 1303 obtains the distances between the obtained multidimensional feature vectors by, for example, measuring the Euclidean distance or the Mahalanobis distance, and extracts the related element nodes.
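- A sketch of the two distance measures named above, assuming NumPy/SciPy; the covariance matrix for the Mahalanobis distance is estimated from an illustrative sample of feature vectors:

```python
import numpy as np
from scipy.spatial.distance import euclidean, mahalanobis

rng = np.random.default_rng(0)
v1, v2 = rng.random(16), rng.random(16)    # two multidimensional feature vectors

d_euclidean = euclidean(v1, v2)

population = rng.random((200, 16))         # sample drawn from the feature space
inv_cov = np.linalg.inv(np.cov(population, rowvar=False))
d_mahalanobis = mahalanobis(v1, v2, inv_cov)
# Nodes whose vectors lie within a distance threshold are extracted as related elements.
```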
- FIG. 14A shows a basic data structure for expressing a graph in an embodiment of the system according to the present invention.
- As shown in FIG. 14A(A), the key (1401) is obtained by applying the hash operation 1404 to the generation time and the value (1402).
- For example, when the hash operation 1404 uses the SHA-1 hash algorithm, the key is 160 bits long.
- The key (1401) yields the value (1402) through the locate operation 1403.
- For example, a distributed hash table can be used for the locate operation 1403.
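- A minimal sketch of this key derivation and lookup, assuming Python's hashlib; the serialization of "generation time + value" and the in-memory dictionary standing in for the distributed hash table are assumptions:

```python
import hashlib
import time

def make_key(value: bytes) -> str:
    """Derive a 160-bit key from the generation time and the value (hash operation 1404)."""
    payload = repr(time.time()).encode() + value
    return hashlib.sha1(payload).hexdigest()

store: dict[str, bytes] = {}   # stands in for the distributed hash table

def put(value: bytes) -> str:
    key = make_key(value)
    store[key] = value         # one "(key, {value})" unit
    return key

def locate(key: str) -> bytes:
    """Locate operation 1403: resolve a key to its value."""
    return store[key]
```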
- The relationship between key and value is expressed as "(key, {value})" (FIG. 14A(B)), which is the unit stored in the GDB 102A as node data and link data. For example, when the two nodes in FIG. 14A(C) are linked, node n1 (1410) is expressed as "(n1, {node n1})" and node n2 (1411) as "(n2, {node n2})"; n1 and n2 are the keys of the node n1 (1410) and the node n2 (1411), respectively, obtained by applying the hash operation to the node entities.
- Similarly to a node, the link l1 (1412) is expressed as "(l1, {n1, n2})", and its key (l1) 1412 is obtained by applying the hash operation to {n1, n2}.
- FIG. 14A (D) shows the data structure held by the node.
- the type column stores the type of data held by the node.
- Types such as "USER", "OBJECT", "META", "SUBJECT", "URI", and "EXT" are defined.
- “USER” indicates that the node represents a user
- “OBJECT” indicates that the node represents an object
- “META” indicates the metadata of the user or the object
- “SUBJECT” (Subjectivity) indicates the subjectivity of the user
- "URI" indicates that the node holds the URI of the user or the object.
- EXT is prepared for type expansion, and the expanded data is stored in the data column.
- FIG. 14A (E) shows the data structure held by the link.
- the type column stores the type of link.
- Two link types, "UNDIRECTED" and "DIRECTED", are defined: "UNDIRECTED" indicates that the link is undirected, and "DIRECTED" indicates that it is directed.
- The data column contains a left node key, a right node key, a weight (w), and a function (f); as the weight, a value expressing the thickness of the link may be used, or a value obtained by compressing the multidimensional feature vector described later.
- The data represented by "(key, {value})" for these nodes and links is immutable (data-invariant); that is, it follows write-once-read-many semantics (written once, readable many times). The system is, however, not limited to these semantics: write-many-read-many semantics (both writing and reading possible multiple times) may also be used, in which case a modification time column is added to both the node and the link.
- The node data and link data shown in FIG. 14A are stored in the GDB 102A.
- FIG. 14B shows the operations of the GDB 102A for manipulating those data.
- Five operations, "CREATE", "CONNECT", "NODE", "LINK", and "SUBGRAPH", are defined as a typical operation set, but the set may be extended.
- If the data semantics are write-many-read-many, "DESTROY" and "UPDATE" operations may also exist.
- CREATE creates a node of the specified type.
- CONNECT generates a link that connects two specified nodes with a specified type.
- NODE acquires node data corresponding to the key.
- LINK acquires link data corresponding to the key.
- SUBGRAPH acquires a subgraph of a specified node.
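- A minimal in-memory sketch of these five operations, not the patent's actual storage engine; keys are SHA-1 hashes as above, and the handling of link types and direction is simplified:

```python
import hashlib

class GDB:
    def __init__(self):
        self.nodes, self.links = {}, {}

    def _key(self, payload: str) -> str:
        return hashlib.sha1(payload.encode()).hexdigest()

    def create(self, ntype: str, data) -> str:                     # CREATE
        key = self._key(f"{ntype}:{data}")
        self.nodes[key] = {"type": ntype, "data": data}
        return key

    def connect(self, ltype: str, left: str, right: str) -> str:   # CONNECT
        key = self._key(f"{left}:{right}")
        self.links[key] = {"type": ltype, "data": (left, right)}
        return key

    def node(self, key):                                           # NODE
        return self.nodes[key]

    def link(self, key):                                           # LINK
        return self.links[key]

    def subgraph(self, key):                                       # SUBGRAPH
        ls = {k: l for k, l in self.links.items() if key in l["data"]}
        ns = {n for l in ls.values() for n in l["data"]}
        return ns, ls
```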
- FIG. 14C shows a graph structure and a link expression in an embodiment of the system according to the present invention.
- FIG. 14C (A) shows a simple graph structure.
- Links are undirected links unless otherwise indicated.
- "(l1, {n1, n2})" is a link between the node n1 (1501) and the node n2 (1502).
- When a directed link from the node n1 (1501) to the node n2 (1502) is expressed, it is written as "(l1, {n1, n2}')".
- FIG. 14C (C) shows a case where the link is not static but is expressed by a dynamic function.
- In this case, the link between the node n1 (1401) and the node n2 (1402) is calculated by the function "f(n1, n2)".
- For example, f(n1, n2) may be an operation that compares the information distance between the nodes, or it may be the probability that a link exists between the nodes.
- FIG. 15 illustrates an example of a visual link structure and an operation example of a search-related image, an image component, and a related element image group in an embodiment of the system according to the present invention.
- FIG. 15 shows a relationship graph after the image 1501 is dragged and dropped into the relationship search window 1203.
- the image 1501 is processed by the image recognition engine 200 or the image recognition system 202, and three image components are extracted. That is, there are three image components Wine (1502), Wine Glass (1503), and Wine Bottle (1504). In the drawing, labels such as Wine, Wine Glass, and Wine Bottle are attached to them, but these are not output on the screen, and are merely for ease of explanation in the present invention.
- These image components are processed by the relevance search engine 220, and related element groups 1505 to 1518 are extracted.
- The image 1502 is related to the images of five related element groups: Olive (1505), Cheese (1506), Bread (1507), Fine Dish (1508), and Wine Glass (1503).
- Wine Glass (1503) is related to Decanter (1509) as a related element.
- Wine Bottle (1504) is associated with the images of eight related element groups: Wine (1502), Wine Glass (1503), Decanter (1509), Cork (1511), Grape (1513), DRC (1515), Wine Cellar (1516), and Oak Barrel (1517).
- the thickness of the link line between images is meaningful.
- a thick link line represents a stronger degree of association than a thin link line.
- Wine (1502) has links to Olive (1505) and Cheese (1506), but in this example the link to Cheese (1506) is thicker than the link to Olive (1505); that is, the relationship between Wine (1502) and Cheese (1506) is stronger.
- Suppose that Decanter (1509) is re-entered into the relevance search window.
- The image recognition engine 200 then processes the Decanter image (1509), extracts a new image component group, extracts new related element groups for the extracted components, and displays them.
- As a result, a relevance graph different from that of FIG. 15 unfolds.
- FIG. 16A shows the graph structure corresponding to the scenario of FIG. 15.
- a graph structure corresponding to the image 1501 and the images 1502 to 1504 is shown.
- Four nodes 1601, 1602, 1603, and 1604 correspond to the respective image components.
- To express these relationships, a data set 1605 is stored in the GDB 102A.
- FIG. 16B shows a part of metadata for each image constituent element.
- The node 1602 has two pieces of metadata, a node 1610 (red) and a node 1611 (white), and the node 1603 has three pieces of metadata, a node 1612 (crystal), a node 1613 (company name), and a node 1614 (creator name).
- the node 1604 has three pieces of metadata, that is, a node 1615 (name), a node 1616 (vintage), and a node 1617 (winery). Furthermore, these metadata are linked to further related nodes (not shown).
- FIG. 17A shows a graph structure relating to related elements corresponding to the scenario of FIG.
- The graph structure for the image 1502 and its related elements 1501, 1503, 1504, and 1505 to 1508 is shown; the nodes 1601 to 1604 and 1701 to 1704 correspond to the respective related elements.
- To express these relationships, a data set 1705 is stored in the GDB 102A.
- FIG. 17B shows a graph structure for each related element.
- Only a part of the graph structure is shown, due to space limitations.
- The node 1604 has links to a node group 1710 corresponding to metadata and to a further related link group 1711. Similar links exist for the other related element nodes 1601 to 1603.
- FIG. 18 shows an example of the relevance derivation operation according to the present invention, and shows the processing in the relevance calculation unit 224 in the relevance search engine 220.
- As seen in FIG. 17, a complex graph structure exists between the images of the image components and the nodes constituting the related elements.
- For example, suppose the graph of FIG. 18(A) is given. This is a subgraph extracted from the graph structure between two nodes.
- Here, f is calculated for each link between the nodes (that is, the function f of FIG. 14A(E) is evaluated).
- The function f may be a probability or a vector, depending on the types of the node and the link. For example, when the value obtained by calculating f for the link 1801 is taken as one row element and this is repeated for all links, the matrix (v1) of FIG. 18(B) is obtained.
- FIG. 18(C) depicts each row of the matrix (v1) as a histogram bin.
- This matrix (v1) is used as a multidimensional feature vector for calculating the relationship between nodes. That is, this multidimensional feature vector represents the strength of a direct relationship between nodes.
- In FIG. 18, the relevance between the node 1801 (n1) and the node 1809 (n2) is represented by this multidimensional feature vector and recorded in the relevance knowledge database 1310. Since a link is thereby generated between node n1 and node n2 in the GDB 102A, the link data "(ln1-n2, {f(v1)})" (where f(v1) is an access function/method into the relevance knowledge database) is stored in the GDB 102A. In this way, the relevance between nodes is learned.
- A value obtained by dimensionally compressing the multidimensional feature vector with f(v1) may be assigned to the thickness of the link line; the stronger the relationship, the thicker the link line is drawn on the graph.
- A known calculation method can be used for the dimensional compression.
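- A sketch of forming the matrix (v1) of FIG. 18(B) from per-link values of f and compressing it to a scalar link thickness, assuming NumPy; the L2 norm is one assumed choice among the "known calculation methods", and the example f is a constant for illustration:

```python
import numpy as np

def link_matrix(links, f) -> np.ndarray:
    """Stack f(link) for every link of the subgraph into the matrix (v1)."""
    return np.vstack([np.atleast_1d(f(link)) for link in links])

def link_thickness(v1: np.ndarray) -> float:
    """Dimensionally compress the multidimensional feature vector into one scalar."""
    return float(np.linalg.norm(v1))

links = [("n1", "n3"), ("n3", "n2"), ("n1", "n4"), ("n4", "n2")]  # subgraph links
v1 = link_matrix(links, lambda link: 0.5)  # here f returns a scalar weight per link
print(link_thickness(v1))                  # value drawn as the thickness of the link line
```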
- FIG. 19 shows an example of interest graph acquisition according to the present invention.
- FIG. 19A depicts a relevance graph in a simplified manner centering on a node 1903 corresponding to the user (the type of the node is “USER”).
- the node 1903 is connected to the nodes 1904, 1905, and 1906 corresponding to the three objects (they are of the type “OBJECT”).
- The multidimensional feature vector 1901 in FIG. 19A is obtained by calculating, by the procedure of FIG. 18, and summing the multidimensional feature vectors between the node 1903 and each of the nodes 1904, 1905, and 1906.
- Suppose now that two objects are added to the node 1903 (FIG. 19B): the node 1913 and the node 1914.
- Similarly, the multidimensional feature vector 1911 in FIG. 19B is obtained by calculating and summing the multidimensional feature vectors between the node 1903 and the nodes 1913 and 1914. Note the difference in the feature vectors between the dotted circle 1902 and the dotted circle 1912. By adaptively strengthening the multidimensional feature vector in this way, an interest graph having the user 1903 as its central node is acquired.
- an interest graph corresponding to each user can be obtained.
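- A sketch of this accumulation: per-object multidimensional feature vectors are summed into a user-centered vector as objects become linked to the user (FIG. 19). NumPy and the bin layout are assumptions for illustration:

```python
import numpy as np

def accumulate_interest(user_vector: np.ndarray, object_vectors) -> np.ndarray:
    """Sum the per-object multidimensional feature vectors into the user's vector,
    adaptively strengthening the bins that recur across the user's objects."""
    for v in object_vectors:
        user_vector = user_vector + v
    return user_vector

rng = np.random.default_rng(1)
user_1903 = np.zeros(8)
user_1903 = accumulate_interest(user_1903, [rng.random(8) for _ in range(3)])  # nodes 1904-1906
user_1903 = accumulate_interest(user_1903, [rng.random(8) for _ in range(2)])  # nodes 1913, 1914
```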
- If the calculation by the relevance calculation unit 224 is applied to a specific group of users, the result represents characteristics of that group of users (a so-called user cluster); if it is applied to all users, the result represents characteristics of the entire user population.
- As described in detail later, through the statistical information processing unit 209, the group of multidimensional feature vectors centered on users expresses a statistical interest graph.
- FIGS. 20A to 20C show display examples of the graph structure in an embodiment of the system according to the present invention.
- an image 2001 (for example, illustration) can be used as an image corresponding to a node in the graph structure, and a logo 2002 or an image thumbnail 2003 can also be used.
- An official image (2004) from the MDB 102B can also be used.
- Note that a company logo can carry a plurality of meanings; specifically, it may indicate the company itself, and it may also indicate the company's products.
- FIG. 20B shows an example in which a relevance graph is visually displayed together with a time axis variable as an observation time.
- FIG. 20B(A) is an example in which the time axis is displayed as the horizontal axis of the relevance graph, with the past to the left and the future to the right.
- When the user flicks the display surface (1210), the displayed time axis moves into the past (yesterday, the day before yesterday, three days ago, and so on) or into the future (tomorrow, the day after tomorrow, three days later, and so on).
- FIG. 20B(B) is an example in which a scroll bar 2011 for changing the time axis is prepared, a relevance graph at a certain time is displayed (2012), and moving the scroll bar displays the relevance graph at another point on the time axis (2013).
- a relevance graph linked to a map or a globe may be displayed based on position information.
- FIG. 20C is an example of displaying a more detailed relationship graph centering on a certain node.
- From the display 2021, by double-tapping the node 2022 (2023) or pinching out (not shown in the figure) (FIG. 20C(A)), a more detailed relevance graph (2031) centered on the node 2022 is displayed.
- In FIG. 20C(B), a node 2032 further connected to the node 2024 and a new node 2033 are additionally displayed.
- FIG. 21 shows an operation example in another embodiment of the system according to the present invention.
- In FIG. 21A, logical operators (AND 2101 and OR 2102) are introduced as input search conditions in the relevance search window 1203.
- When AND (2101) is designated, a node group that is commonly and directly related to the nodes originating from the image 2104 and the image 2105 is selected; that is, the node group having direct links from both the node 2104 and the node 2105 is selected.
- In the case of OR (2102), a node group directly related to any one or more of the respective nodes is selected; that is, both the nodes directly linked from the node 2104 and the nodes directly linked from the node 2105 are selected.
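- A sketch of these two operators as set intersection and union over direct neighbors, assuming an adjacency mapping extracted from the GDB; the node identifiers are illustrative:

```python
def neighbors(adj: dict, node: str) -> set:
    return set(adj.get(node, ()))

def and_search(adj: dict, n_a: str, n_b: str) -> set:
    """AND: nodes directly linked from both starting nodes."""
    return neighbors(adj, n_a) & neighbors(adj, n_b)

def or_search(adj: dict, n_a: str, n_b: str) -> set:
    """OR: nodes directly linked from either starting node."""
    return neighbors(adj, n_a) | neighbors(adj, n_b)

adj = {"2104": ["2108", "2109", "x"], "2105": ["2108", "2109", "y"]}
print(and_search(adj, "2104", "2105"))  # {'2108', '2109'}
```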
- FIG. 21B shows an operation example of the AND operator 2101.
- In this example, the nodes 2108 and 2109, which are commonly and directly related to both the node 2106 corresponding to the image 2104 and the node 2107 corresponding to the image 2105, are selected; the node 2108 represents a node related to the Arabicy region of Italy, and the node 2109 represents a node related to a winery.
- FIG. 22 shows an operation example in another embodiment of the system according to the present invention.
- FIG. 22A shows the operation when CONNECTION SEARCH (2103) is selected as the search condition of the relevance search window.
- A state in which two images (2201 and 2203) have been dragged and dropped into the relevance search window is shown.
- A state is shown in which the node 2206 ("something 1") is reachable from the node 2202 corresponding to the image 2201, and the node 2209 ("something 2") is reachable from the node 2204 corresponding to the image 2203.
- By searching the GDB 102A, a link in the graph structure between the node 2206 and the node 2209 is searched for, and when a direct or indirect link exists between the nodes, those nodes are displayed.
- When a direct link exists, the corresponding links are extracted from the GDB 102A, and an image is displayed each time a node holding a URI to an image is reached.
- When an indirect link exists, using the statistical information processing unit 209 described later, a subgraph rooted at the node 2202 is extracted from the GDB 102A, and among the multidimensional feature vectors generated by the multidimensional feature vector generation unit 1302, for example, a node group whose multidimensional feature vectors have a probability larger than the co-occurrence probability of the vector is selected, indirectly connecting the node 2201 and the node 2203. In this method a plurality of paths connecting the nodes may exist; in that case, the path that minimizes the number of nodes on the path, or the path that minimizes the weights between the nodes on the path, may be taken as the shortest path, and a relevance graph including it may be displayed.
- As a modification of CONNECTION SEARCH (2103), only one image, for example the image 2201, may be dragged and dropped into the relevance search window, and the links selected by the above method may be connected. After this, a direct link (2210) may be generated between the node 2202 and the node 2204. When a plurality of such indirect paths are found, the indirect relevance with the fewest relay nodes, or with the minimum weights between the nodes on the path, can be extracted as described above; furthermore, by tracing these plural indirect paths, unexpected relationships between nodes can be discovered.
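- A sketch of CONNECTION SEARCH as a breadth-first search for the path with the fewest relay nodes between two nodes; this is one concrete reading of the "shortest path" above (a weighted variant would use Dijkstra's algorithm instead), and the adjacency structure is an assumption:

```python
from collections import deque

def connection_search(adj: dict, start: str, goal: str):
    """Return the chain of nodes with the fewest relays from start to goal, or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adj.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no direct or indirect connection exists

adj = {"2202": ["2206"], "2206": ["2209"], "2209": ["2204"], "2204": []}
print(connection_search(adj, "2202", "2204"))  # ['2202', '2206', '2209', '2204']
```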
- For example, the specific associative chain shown in FIG. 23 can be derived by the CONNECTION SEARCH operator 2103. That is, the relationship between the wine bottle of image 2301 and the wine glass of image 2302 is extracted through the common substance, wine; the relationship with image 2303, a wine glass manufactured by a high-end wine glass maker, is extracted through articles of the same type (glasses); the association with image 2304, a chair of the same material, is then extracted; the association with image 2305 is further extracted through articles of the same type (chairs); and finally image 2306 is reached through the fact that it has the same creator.
- FIG. 24A shows an operation example in another embodiment of the system according to the present invention.
- In FIG. 24A, it is assumed that two objects 2403 and 2404 are associated with the node 2402 corresponding to the user 2401.
- When a new object 2410 is found by the operation shown in FIG. 22 and the connection operator LIKE (2420) is applied to it (FIG. 24A(A)), a link 2411 is generated between the user node 2402 and the object node 2410, creating a direct association (FIG. 24A(B)).
- At the same time, the link data "(2411, {user A, object C})" is newly registered in the GDB 102A.
- As shown in FIG. 24B(A), when the disconnection operator DISLIKE (2421) is applied to the object 2410, the link 2411 is cut, and the directed link "(2412, {object C, user A}')" is registered in the GDB 102A (FIG. 24B(B)).
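- A sketch of the LIKE/DISLIKE operators on top of the GDB sketch given earlier: LIKE creates an undirected user-object link, and DISLIKE replaces it with a directed "cut" link, following FIG. 24A/24B; the function names are illustrative:

```python
def like(gdb, user_key: str, object_key: str) -> str:
    """Connection operator LIKE: create a direct undirected association (e.g. link 2411)."""
    return gdb.connect("UNDIRECTED", user_key, object_key)

def dislike(gdb, user_key: str, object_key: str) -> str:
    """Disconnection operator DISLIKE: cut the association and record a directed
    link from the object back to the user (e.g. link 2412)."""
    for key, l in list(gdb.links.items()):
        if set(l["data"]) == {user_key, object_key}:
            del gdb.links[key]
    return gdb.connect("DIRECTED", object_key, user_key)
```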
- FIG. 25 shows an operation example in another embodiment of the system according to the present invention.
- In FIG. 25, it is assumed that two objects 2503 and 2504 are associated with the node 2502 corresponding to the user 2501 (FIG. 25A), while three objects 2513, 2514, and 2515 are associated with the node 2512 corresponding to the user 2511 (FIG. 25B).
- First, suppose that the direct link 2510 does not exist between the object 2504 and the object 2515.
- However, the relevance search engine 220 in the present invention may find an indirect relevance, as seen in FIG. 23. Therefore, when the user 2501 is prompted on his or her network terminal about the possible existence of the object 2515 (FIG. 25C), the user can execute the operator Reference (2506), which directly connects them.
- As a result, a link between the object 2504 and the object 2515 is proposed, and a new multidimensional feature vector is generated by the processing of the relevance calculation unit 224.
- When a plurality of such link generation requests occur and a predetermined threshold is exceeded, or by a supervisor having specific authority, a link directly associating the object 2504 and the object 2515 is created by the "CONNECT" operation of FIG. 14B.
- Such specific authority may also be given to a user.
- In that case, the link generation request by the operator Reference is executed immediately, and a link directly relating the object 2504 and the object 2515 is generated by the "CONNECT" operation of FIG. 14B.
- For example, a dotted-line provisional link (1520) is drawn between Olive Tree (1519) and Grape (1513) (FIG. 15).
- Although the two are distant from the viewpoint of the initial relevance graph, it is possible to propose that the user directly associate them with the operator Reference. In that case, communication between users may be induced, triggered by the proposal about the existence of the relevance. As a result, if the proposal proves valid, the link 1520 can be updated to a solid-line link (established as a direct relationship).
- FIG. 26 shows an operation example in another embodiment of the system according to the present invention.
- two objects 2303 and 2304 are associated with the node 2302 corresponding to the user 2301 (FIG. 26A).
- three objects 2403, 2404, and 2405 are associated with the node 2402 corresponding to the user 2401 (FIG. 26B), and a direct association link 2501 exists between the object 2304 and the object 2405.
- Suppose the user 2301 considers this association suspicious and executes the operator Unreference (2406).
- When a plurality of requests asserting the non-existence of this direct relationship occur and a predetermined threshold is exceeded, for example when more than a certain number of users execute the Unreference operation, the direct association between the object 2304 and the object 2405 is cut off as a false association. Alternatively, if the request is confirmed under the supervisor's authority, the direct association between the object 2304 and the object 2405 can be broken in the same way.
- Referring to FIG. 27, the functional block configuration of the statistical information processing unit 209 in one embodiment will be described. The statistical information processing unit 209 is composed of three elements: a graph/vector construction unit 2701, an inference engine unit 2702, and a graph mining processing unit 2703.
- The inference engine unit 2702 further comprises a decision tree processing unit 2710 and a Bayesian network processing unit 2711, and the graph mining processing unit 2703 comprises a pattern mining processing unit 2712 and an RWR (Random Walk with Restarts) processing unit 2713. Note that the graph mining processing procedures are not limited to these.
- The graph/vector construction unit 2701 in FIG. 27 uses the data from the GDB 102A and/or the statistical information database 2704 to extract a subgraph related to the input node, obtains a multidimensional feature vector through the processing in the relevance calculation unit 224, and uses it as the input to the inference engine unit 2702.
- The inference engine unit 2702 performs processing in the decision tree processing unit 2710, which executes the decision tree method (one of the probabilistic inference models), or in the Bayesian network processing unit 2711; alternatively, the input is sent to the graph mining processing unit 2703 for extracting frequent subgraphs.
- The graph mining processing unit 2703 performs graph mining using the pattern mining method or the RWR method, and generates a subgraph (FIG. 28C) as the result.
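- A sketch of Random Walk with Restarts (RWR) over a column-normalized adjacency matrix, assuming NumPy; the restart probability 0.15 is a conventional value rather than one taken from the patent, and every node is assumed to have at least one edge:

```python
import numpy as np

def rwr(adjacency: np.ndarray, seed: int, restart: float = 0.15, tol: float = 1e-8) -> np.ndarray:
    """Iterate p = (1 - c) * W p + c * r until convergence; p[i] scores how
    relevant node i is to the seed node."""
    W = adjacency / adjacency.sum(axis=0, keepdims=True)  # column-normalize
    r = np.zeros(adjacency.shape[0])
    r[seed] = 1.0
    p = r.copy()
    while True:
        p_next = (1 - restart) * (W @ p) + restart * r
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(rwr(A, seed=0))  # stationary relevance scores with respect to node 0
```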
- FIG. 28 shows the configuration of the specific user filter processing unit 210 in an embodiment of the system according to the present invention.
- The processing unit is composed of three elements: a multidimensional vector construction unit 2801, a subjective filter construction unit 2802, and a multidimensional vector processing unit 2803.
- A subgraph retrieved from the GDB 102A and processed, for example, by the statistical information processing unit 209 is reconstructed as a multidimensional vector by the multidimensional vector construction unit 2801.
- Using the information in the user database 2804, the subjective filter construction unit 2802 generates the user's subjective evaluation filter as a multidimensional feature vector (FIG. 28B).
- The multidimensional feature vectors output from these two construction units (2801 and 2802) are processed by the multidimensional vector processing unit 2803 and reconstructed as a multidimensional feature vector representing the depth of the connections between nodes reflecting the user's subjective evaluation in FIG. 28(C).
- In this way, evaluations of the nodes linked to a user are quantified.
- The numerical values may be determined by a learning process, specified directly by the user, or obtained from the number of links between the user and the node.
- These can be registered in the user database 2804, and at the same time such subjectivity becomes generically applicable as the "SUBJECT" type. The subjective filter construction unit 2802 generates a multidimensional feature vector (FIG. 28B) from the subgraphs constituting the subjective elements.
- The value of each bin of the multidimensional feature vector can be used as a numerical value reflecting the subjective element.
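- A sketch of applying such a filter: the user's per-bin filter vector weights the relevance vector elementwise, yielding a personalized depth of connection (FIG. 28). The numerical values are illustrative only:

```python
import numpy as np

relevance = np.array([0.8, 0.1, 0.5, 0.3])          # output of unit 2801 (one bin per node)
subjective_filter = np.array([1.5, 0.2, 1.0, 0.7])  # output of unit 2802, per-bin weights
personalized = relevance * subjective_filter        # unit 2803: subjectively weighted depth
print(personalized)                                 # [1.2  0.02 0.5  0.21]
```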
- FIG. 29 shows an example in which the subjectivity different for each user is visually expressed as a relevance graph.
- For the user 2901, a state is shown in which six objects including the object 2902 are directly or indirectly connected. Each relevance depth is displayed up to the second order, and the thickness of each link line represents the strength of relevance. If it is found through the processing described above that the user 2901 has a special interest in the object 2906, that object may be highlighted (2908).
- The user 2911 shares a relevance graph similar to that of the user 2901; if the object of interest is the object 2902, that object may be displayed with a visual effect (2912) such as highlighting or special decoration.
- As the input of the subjective filter construction unit 2802 in FIG. 28A, a multidimensional vector constituting an environment filter may also be used, reflecting the time zone the user is trying to search and time-axis and position information such as date, season, era, and place.
- FIG. 30 shows an operation example in another embodiment of the system according to the present invention.
- When metadata is attached in advance to the selected image, the image recognition engine 200 confirms the validity of the metadata as shown in FIG. 30.
- Through this comparison process, the coincidence between the graph structure of the metadata shown in FIG. 30A and the graph structure of the metadata shown in FIG. 30B can be confirmed; as a result, a significant reduction in processing time can be realized.
- FIG. 31A shows an example of an interest graph.
- For simplification, only users and objects (things) are drawn as nodes; in practice, information other than things, such as context and scene, is also extracted from the images by the image recognition system 202 and forms part of the interest graph.
- the relationship between three users 3101 to 3103 and six objects 3110 to 3115 is depicted. It is depicted that user 3101 is interested in objects 3110, 3111, 3112, user 3102 is interested in objects 3111, 3113, 3114, and user 3103 is interested in objects 3110, 3111, 3113, 3115.
- This interest graph consists of nodes related to users, extracted from the data of the GDB 102A by the graph calculation unit 221, and resides in the graph storage unit 222 in the relevance search engine 220. Since the information in the GDB 102A changes from moment to moment through the connection operator LIKE, the disconnection operator DISLIKE, the reference operator Reference, and the non-reference operator Unreference, the interest graph of FIG. 31A can also be acquired as a dynamic interest graph.
- Suppose the user 3102 forms a new relationship (link 3201) with the object 3112 by, for example, the connection operator LIKE.
- Such new relationships are formed by many users, and the number of links to the object 3112 on the server changes (from 1 to 2 in FIG. 31B).
- A predetermined threshold is set for the number of links; exceeding it is regarded as a significant change in the degree of attention to the node, and the change is notified to the node (3104) related to the node (3112).
- the node 3104 illustrates an advertiser, and it is possible to notify the advertiser that the number of links to the object 3112 has changed beyond a threshold.
- the notification may be made to the users 3101 and 3102 that are directly related to the object 3112.
- the notification may enable an advertiser to present an advertisement or a recommendation that stimulates the purchase intention regarding the target object.
- FIG. 31B illustrates a case where the node 3104 is an advertiser, and an advertisement regarding the object 3112 (corresponding image is 3120) can be notified to the user 3101 and the user 3102.
- FIG. 32 shows an example in which when displaying an interest graph having the user as a central node, the interest graph is displayed only for the user from the viewpoint of privacy protection.
- Here, an interest graph centered on the user 3201 is displayed, but the interest graph centered on the user 3202 (gray box 3210) concerns the privacy of the user 3202 and is required not to be displayed from the viewpoint of the user 3201.
- Display and non-display can be controlled by distinguishing the node type "USER" on the server side.
- FIG. 33 shows an example of social graph acquisition. From the comprehensive interest graph acquired through the process of visual relevance search using the relevance search engine 220 incorporating the image recognition engine 200 or the image recognition system 202 of the present invention, by extracting the nodes of type "USER" representing persons related to a specific user and mapping them onto the plane 3301, a social graph including the relationships between persons can be obtained.
- FIG. 34 shows an outline of the process diagram relating to interest graph collection.
- the entire process system is divided into a real-time system and a background system, and the graph storage unit 222 connects the two.
- the GDB 102A, the relevance calculation unit 224, and the statistical information processing unit 209 are arranged in the background system.
- The image recognition system 202 or the image recognition engine 200 (not shown), the graph calculation unit 221, and the network communication control unit 204 are arranged in the real-time system.
- The network communication control unit 204 is connected to the network terminal 105 via a network including the Internet.
- The interest graph is obtained by selecting and extracting, from among the node groups primarily connected to the user, the multidimensional vectors of a predetermined number of element groups in descending order of their degree of association with the user, yielding a finite-length multidimensional feature vector unique to the user.
- FIG. 35 is an image of the multidimensional feature vector corresponding to the interest graph of each user. Since the total number of candidate dimensions of the interest graph, corresponding to all nodes, reaches the order of the total number of nodes registered in the GDB 102A, a certain number are extracted from the candidates in descending order of relevance to the user and stored in the user database 2804 as a finite-length multidimensional feature vector, as shown in FIG. 35.
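- A sketch of truncating a user's interest vector to a finite length by keeping the N bins with the highest relevance, as in FIG. 35; the value of N, the node labels, and the scores are illustrative assumptions:

```python
def finite_interest_vector(relevance_by_node: dict, n: int = 5) -> dict:
    """Keep the n nodes with the highest relevance as the user's finite-length
    multidimensional feature vector (stored per user in the user database 2804)."""
    top = sorted(relevance_by_node.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return dict(top)

profile = finite_interest_vector({"wine": 0.9, "glass": 0.7, "cork": 0.2,
                                  "grape": 0.6, "oak": 0.3, "decanter": 0.5})
print(profile)  # the five highest-relevance bins for this user
```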
- FIG. 35A illustrates the multidimensional feature vector of Mr. A
- FIG. 35B illustrates the multidimensional feature vector of Mr. B.
- Reference numerals used in the drawings of the interest graph collection system:
- 101 Server
- 102A Graph database (GDB)
- 102B Mother database (MDB)
- 103 Connection
- 104 Network (or the Internet)
- 105a to 105d Network terminal devices
- 106 General object recognition system
- 107 Image category database
- 108 Scene recognition system
- 109 Scene component database
- 110 Specific object recognition system
- 200 Image recognition engine
- 209 Statistical information processing unit
- 210 Specific user filter processing unit
- 220 Relevance search engine
- 221 Graph calculation unit
- 222 Graph storage unit
- 223 Graph management unit
- 224 Relevance calculation unit
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Discrete Mathematics (AREA)
- Mathematical Physics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
Abstract
Description
The region processing unit 201, the general object recognition unit 202, the specific object recognition unit 203, the MDB search unit 206, the MDB learning unit 207, and the MDB management unit 208 constitute the image recognition engine 200. The image recognition engine 200 may be replaced by the image recognition system shown in FIG. 6A, described later. The graph calculation unit 221, the graph storage unit 222, the graph management unit 223, and the relevance calculation unit 224 constitute the relevance search engine 220.
The functional blocks of the server 101 are not necessarily limited to these, but these representative functions will be described briefly.
The network communication control unit 204 performs image input/output processing, information communication control with the network terminals, and the like. The data search processing unit 205 collects information from link destinations and performs queries on, collection of, and searches over collective intelligence.
The MDB search unit 206 searches tag data such as the names of objects. The MDB learning unit 207 adds new design data and detailed information, registers time information, and registers, updates, and adds incidental information. The MDB management unit 208 extracts feature points and feature quantities from design data, extracts category information from incidental information and registers it into the category data, and performs extension, division, updating, integration, correction, and new registration of the category classifications within the category data.
The link data 232 stores data concerning links. An example of the link structure will be described later with reference to FIG. 14A(E).
The additional information data 252 holds every kind of information about an object, such as the object's name, manufacturer, part number, date and time, material, composition, and processing information.
The feature quantity data 253 holds the feature points and feature quantity information of each object, generated on the basis of the design information.
The category data 254 holds the information used when the general object recognition unit performs category classification of an object.
The unspecified object data 255 holds information about objects that currently cannot be recognized as specific objects; if objects having similar features are frequently detected thereafter, they are newly registered as a new specific object.
Next, the overall flow of the image recognition system in an embodiment of the system according to the present invention will be described with reference to FIG. 4.
Next, based on the feature quantity data 253 generated from the MDB in S502, the feature points and feature quantities in the original image are extracted and compared with the feature quantity data generated from the MDB. Here, there are the following two methods of generating and comparing feature quantity data from the MDB.
FIG. 6A shows the functional blocks of an image recognition system in another embodiment of the system according to the present invention. The image recognition system 202 shown in FIG. 6A can be operated as a part of the server 101, or as a server system independent of the server 101. In addition to the general object recognition system and the specific object recognition system corresponding to the general object recognition unit and the specific object recognition unit in the server 101, the image recognition system 202 also includes a scene recognition system for recognizing scenes. It is described in detail below as another form or an application of the image recognition function unit in the server 101.
Next, the interest graph collection processing in an embodiment of the system according to the present invention will be described based on FIGS. 12 to 36.
Claims (17)
- An interest graph collection system, being a search system that uses as its input means image information containing various objects and subjects, rather than ideographic character input means such as keywords, metadata, or text, characterized in that: a user selects, on a network terminal, an entire image of interest or a specific region of an image from among a large number of images existing on the Internet or on a dedicated network, or from images the user has uploaded to the Internet via the network terminal, and queries a server-side image recognition engine with the selected image via the network; the server-side image recognition engine extracts and recognizes in real time, via the Internet, the various general objects, specific objects, persons, faces, scenes, characters, symbols, illustrations, logos, favicons, and the like contained in the selected entire image or the designated image region; the image component groups contained in the recognized input image are notified via the image recognition engine to a server-side relevance search engine; the relevance search engine extracts other related element groups judged to be directly or indirectly related, to at least a certain degree, to each of the individual image components, on the basis of multidimensional feature vectors describing the direct relevance between elements, stored in a learnable state in a relevance knowledge database within the relevance search engine; and the image component groups recognized by the image recognition engine and the related element groups extracted by the relevance search engine can be visually represented, as a relevance graph having each of them as a node together with the depth of relevance between the nodes, on the user's network terminal as a two-dimensional image, a three-dimensional image with depth, or a four-dimensional spatiotemporal image to which a time-axis variable representing the observation time of the relevance graph is added.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, when the user taps or touches an arbitrary node on the relevance graph displayed on the network terminal to select it, or moves a pointer cursor onto an arbitrary node to select it, or flicks on the touch screen toward an arbitrary region of the relevance graph, or moves the pointer cursor to an arbitrary region of the relevance graph and drags the entire screen to scroll it, or performs a similar operation with direction keys or the like, or performs an input operation with the same effect using the user's gestures, line of sight, voice, or brain waves, the relevance search engine additionally sends to the network terminal a new relevance graph centered on the selected node or on the region after the movement, including its intermediate course, so that the user can visually recognize a wide range of relevance spanning a plurality of nodes while seamlessly tracing the nodes or regions of interest on the relevance graph.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, by an operation in which the user double-taps or pinches out, on the touch screen, a specific image component selected from the plurality of image components presented by the image recognition engine or a specific node on the relevance graph displayed on the network terminal, or selects the node and enlarges the region centered on it with a pointer or the like, or performs an input operation with the same effect using gestures, line of sight, voice, or brain waves, a more detailed relevance graph centered on the node can be visually represented on the user's network terminal; the series of operations is regarded as the existence of a certain interest of the user in the node, and, by adaptively strengthening, on the multidimensional feature vector describing the direct relevance between the elements with the user as the central node, the feature vector value representing the depth of the user's interest in the node, an interest graph corresponding to the individual user, with that user as the central node, can be acquired, and by extending the acquisition of such interest graphs to a wide range of users, a statistical interest graph in the broad sense, spanning a specific user cluster or all users, can be collected.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, by querying the server-side image recognition engine again, via the network, with node images the user has focused on and selected, without tracing them on the relevance graph, a new image component group for the node is acquired with the help of the image recognition engine, and new related element groups starting from those image components are sent from the relevance search engine to the network terminal, allowing the user to visually recognize on the relevance graph new relevance to the node together with the depth of their mutual relevance; the relevance search engine then infers that the user recognizes and is using the existence of a chain of relevance from the image component serving as the starting point of the immediately preceding similar operation to the node in question, and, by adaptively strengthening, on the multidimensional feature vector describing the direct relevance between elements, the feature vector values representing the depth of the direct relationships between the respective nodes constituting that chain of relevance, additional learning of the relevance knowledge database within the relevance search engine is further enabled.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, for the image component groups made recognizable by the image recognition engine and for the related element groups corresponding to each of those image components, the relevance search engine sends to the network terminal, in place of the original images, reduced image thumbnails generated from representative photographs, illustrations, characters, symbols, logos, favicons, and the like, further enabling display and selection in units of image thumbnails as nodes on the relevance graph.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, a plurality of nodes can be queried to the server-side image recognition engine, and logical operators (AND, OR) are introduced as an input condition selection function of the image recognition process, further enabling the nodes commonly and directly related among the respective nodes when AND is selected, or the nodes directly related to any one or more of the respective nodes when OR is selected, to be visually represented on the network terminal together with the depth of their mutual relevance.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, a plurality of nodes can be queried to the server-side image recognition engine, and a relevance exploration operator (Connection Search) is introduced as an input condition selection function of the image recognition process, whereby the relationship between a plurality of nodes that at first glance appear entirely unrelated is explored as a chain of relevance passing through other nodes directly or indirectly related to each input node group, making it possible to discover indirect relationships between nodes spanning different layers (hierarchies) and to display them on the network terminal as a relevance graph including the shortest path between those nodes, while at the same time, in the relevance exploration process, the discovered indirect relationships spanning a plurality of nodes are learned and acquired in the relevance knowledge database within the relevance search engine, further making it possible to prepare for subsequent identical or similar relevance exploration requests.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, a connection operator (LIKE) that binds nodes indirectly related to the user, or other nodes deemed to have little relevance to the user, into a direct relationship with the user, and a disconnection operator (DISLIKE) that cuts an already established direct relationship between such a node and the user, are introduced, whereby, on the multidimensional feature vector describing the direct relevance between the elements with the user as the central node, the value representing the depth of the user's interest in the node is increased, decreased, or extinguished, further enabling the interest graph corresponding to the individual user, with that user as the central node, to be updated.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, a reference operator (REFERENCE) that raises the possible existence of a new direct relationship targeting nodes other than users, asserting that those plural nodes should be directly bound together, and a non-reference operator (UNREFERENCE) that raises the non-existence of a direct relationship whose existence is doubtful although the nodes are already directly bound, are introduced, enabling the relevance search engine to alert users to the possible existence or non-existence of such new direct relevance between nodes; the relevance search engine can then update the values of the feature vectors concerning the relevance between nodes judged to be related or unrelated by a supervisor having specific authority or by at least a certain number of users, reflect the updated relevance graph concerning the node group on the network terminals, and further notify all users of the update information concerning the existence or non-existence of the new direct relevance.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, a weighting operation that can reflect each user's subjective evaluation is enabled on the multidimensional feature vectors describing the direct relevance between elements stored as the relevance knowledge database within the relevance search engine, and, on the basis of the modified multidimensional feature vectors, the mutual relevance between the nodes and the depth of that relevance can further be visually represented on the user's network terminal as a relevance graph reflecting the differences in how individual users perceive them.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, environment filters such as the time zone, date and time, season, era, and place the user intends to search can be applied to the multidimensional feature vectors describing the direct relevance between elements stored as the relevance knowledge database within the relevance search engine, and, on the basis of the modified multidimensional feature vectors, the mutual relevance between the nodes and the depth of that relevance can further be represented on the user's network terminal as a relevance graph reflecting spatiotemporal factors such as observation time and location characteristics.
- The interest graph collection system according to claim 1, wherein, in the relevance search operation, when metadata is attached in advance to the image corresponding to the node selected by the user, for the purpose of verifying the compatibility between the metadata and the image, or the validity of the metadata itself, the image recognition engine preferentially matches the various feature groups and feature quantities it can extract from the image against the inherent feature groups and feature quantities of the object or subject expressed by the metadata, thereby making it possible to shorten the processing time of the image recognition by the image recognition engine and further making it possible to exclude metadata that is irrelevant to the image or whose relationship to it falls below a certain level.
- The interest graph collection system according to claim 1, wherein, through the series of relevance searches by users, when the locus of visual interest, or its transition, across a specific user cluster or the entire user population with respect to a non-person node changes beyond a certain level, the significant change in the degree of attention to the node is turned into statistical information, further making it possible to promptly notify the users directly involved with those nodes on the relevance graph, a user cluster, all users, or a specific third party.
- The interest graph collection system according to claim 1, wherein, regarding the display of the relevance graph, the display of an interest graph having the user himself or herself as the central node can further be limited to that user alone, from the viewpoint of privacy protection.
- The interest graph collection system according to claim 1, wherein, utilizing the interest graph, advertisements and recommendations that stimulate the intention to purchase the products or services represented by specific nodes can be presented directly to a specific user, or a specific group of users, who have shown at least a certain level of interest in them, or connections can be made to third parties who provide them, and at the same time, utilizing the relevance graph with the user's attributes and the spatiotemporal factors to which the user belongs taken into account, advertisements and recommendations for other products and services directly or indirectly related to them can further be presented directly, or connections to third parties who provide them can further be made.
- The interest graph collection system according to claim 1, wherein visual information and/or link information representing the advertisements, services, or recommendations that can be presented by utilizing the interest graph can further be presented on the relevance graph displayed on the target user's network terminal, in a state in which display/non-display is selectable.
- The interest graph collection system according to claim 1, wherein, utilizing the interest graph acquired through the series of visual relevance searches incorporating the image recognition engine, communication among a wide range of users having similar interests is stimulated through the relevance search, further making it possible to acquire, via the network, a dynamic social graph in the broad sense that encompasses, in addition to the interest graph, relationships between people and between people and non-people.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/064463 WO2012176317A1 (ja) | 2011-06-23 | 2011-06-23 | 画像認識システムを組込んだ関連性検索によるインタレスト・グラフ収集システム |
JP2013521387A JP5830784B2 (ja) | 2011-06-23 | 2011-06-23 | 画像認識システムを組込んだ関連性検索によるインタレスト・グラフ収集システム |
US14/129,005 US9600499B2 (en) | 2011-06-23 | 2011-06-23 | System for collecting interest graph by relevance search incorporating image recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/064463 WO2012176317A1 (ja) | 2011-06-23 | 2011-06-23 | 画像認識システムを組込んだ関連性検索によるインタレスト・グラフ収集システム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012176317A1 true WO2012176317A1 (ja) | 2012-12-27 |
Family
ID=47422192
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/064463 WO2012176317A1 (ja) | 2011-06-23 | 2011-06-23 | 画像認識システムを組込んだ関連性検索によるインタレスト・グラフ収集システム |
Country Status (3)
Country | Link |
---|---|
US (1) | US9600499B2 (ja) |
JP (1) | JP5830784B2 (ja) |
WO (1) | WO2012176317A1 (ja) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014132349A1 (ja) * | 2013-02-27 | 2014-09-04 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
JP2015026202A (ja) * | 2013-07-25 | 2015-02-05 | Japan Broadcasting Corporation (NHK) | Program search device and program search program |
WO2015141341A1 (ja) * | 2014-03-19 | 2015-09-24 | Kyoto University | Object display system for relationship graph |
WO2015147317A1 (en) * | 2014-03-27 | 2015-10-01 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
CN105354550A (zh) * | 2015-11-03 | 2016-02-24 | East China Normal University | Form content extraction method based on registration of local image feature points |
JP2016511900A (ja) * | 2013-02-14 | 2016-04-21 | Facebook, Inc. | Lock screen with socialized applications |
WO2016103651A1 (ja) * | 2014-12-22 | 2016-06-30 | NEC Corporation | Information processing system, information processing method, and recording medium |
JP2016529612A (ja) * | 2013-08-02 | 2016-09-23 | Emotient, Inc. | Filter and shutter based on image emotion content |
WO2016190264A1 (ja) * | 2015-05-26 | 2016-12-01 | Kobe Digital Labo Inc. | Interest information generation system |
WO2017110640A1 (ja) * | 2015-12-25 | 2017-06-29 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method, and computer program |
JP2017157201A (ja) * | 2016-02-29 | 2017-09-07 | Toyota Jidosha Kabushiki Kaisha | Human-centered place recognition method |
JP2017167806A (ja) * | 2016-03-16 | 2017-09-21 | Toshiba Corporation | Relationship visualization device, method, and program |
US9817845B2 (en) | 2013-12-12 | 2017-11-14 | Xyzprinting, Inc. | Three-dimensional image file searching method and three-dimensional image file searching system |
JP2017220225A (ja) * | 2016-06-07 | 2017-12-14 | Palo Alto Research Center Incorporated | Local visual graph filters for complex graph search |
WO2018142756A1 (ja) * | 2017-01-31 | 2018-08-09 | NTT Docomo, Inc. | Information processing device and information processing method |
JP2019049943A (ja) * | 2017-09-12 | 2019-03-28 | Toppan Printing Co., Ltd. | Image processing device, image processing method, and program |
US10394987B2 (en) | 2017-03-21 | 2019-08-27 | International Business Machines Corporation | Adaptive bug-search depth for simple and deep counterexamples |
CN111247536A (zh) * | 2017-10-27 | 2020-06-05 | Samsung Electronics Co., Ltd. | Electronic device for searching related images and control method therefor |
JP2020535501A (ja) * | 2017-09-15 | 2020-12-03 | Cloudminds (Beijing) Technologies Co., Ltd. | Object recognition method, device, and intelligent terminal |
JP2021061598A (ja) * | 2013-06-28 | 2021-04-15 | NEC Corporation | Video processing system, video processing method, and video processing program |
JP2022505015A (ja) * | 2019-10-08 | 2022-01-14 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, and electronic device for generating a vector representation of a knowledge graph |
CN114282545A (zh) * | 2021-09-23 | 2022-04-05 | China UnionPay Co., Ltd. | Method and system for generating object abbreviations, storage medium, and program product |
Families Citing this family (70)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013058608A2 (ko) * | 2011-10-20 | 2013-04-25 | Ajou University Industry-Academic Cooperation Foundation | Treemap visualization system and method |
US9054876B1 (en) * | 2011-11-04 | 2015-06-09 | Google Inc. | Fast efficient vocabulary computation with hashed vocabularies applying hash functions to cluster centroids that determines most frequently used cluster centroid IDs |
US10360706B2 (en) * | 2012-05-22 | 2019-07-23 | Sony Corporation | Device method and program for adjusting a display state of a superimposed image |
SG11201407749TA (en) * | 2012-05-24 | 2014-12-30 | Hitachi Ltd | Image analysis device, image analysis system, and image analysis method |
IN2015KN00242A (ja) * | 2012-07-20 | 2015-06-12 | Intertrust Tech Corp | |
EP2973044A2 (en) * | 2013-03-15 | 2016-01-20 | James Webber | Graph database devices and methods for partitioning graphs |
US10740396B2 (en) | 2013-05-24 | 2020-08-11 | Sap Se | Representing enterprise data in a knowledge graph |
US20140351241A1 (en) * | 2013-05-24 | 2014-11-27 | Sap Ag | Identifying and invoking applications based on data in a knowledge graph |
US9158599B2 (en) | 2013-06-27 | 2015-10-13 | Sap Se | Programming framework for applications |
US9348947B2 (en) | 2013-07-26 | 2016-05-24 | Helynx, Inc. | Systems and methods for visualizing and manipulating graph databases |
US10776965B2 (en) * | 2013-07-26 | 2020-09-15 | Drisk, Inc. | Systems and methods for visualizing and manipulating graph databases |
US10152495B2 (en) * | 2013-08-19 | 2018-12-11 | Qualcomm Incorporated | Visual search in real world using optical see-through head mounted display with augmented reality and user interaction tracking |
EP3044731A4 (en) * | 2013-09-11 | 2017-02-22 | See-Out Pty Ltd. | Image searching method and apparatus |
KR102120864B1 (ko) * | 2013-11-06 | 2020-06-10 | Samsung Electronics Co., Ltd. | Image processing method and apparatus |
US10885095B2 (en) * | 2014-03-17 | 2021-01-05 | Verizon Media Inc. | Personalized criteria-based media organization |
EP3147798A4 (en) * | 2014-05-22 | 2018-01-17 | Sony Corporation | Information processing device, information processing method, and program |
US10091411B2 (en) * | 2014-06-17 | 2018-10-02 | Lg Electronics Inc. | Mobile terminal and controlling method thereof for continuously tracking object included in video |
US20160203137A1 (en) * | 2014-12-17 | 2016-07-14 | InSnap, Inc. | Imputing knowledge graph attributes to digital multimedia based on image and video metadata |
US20160292247A1 (en) * | 2015-03-31 | 2016-10-06 | Kenneth Scott Kaufman | Method of retrieving categorical data entries through an interactive graphical abstraction |
KR101713197B1 (ko) * | 2015-04-01 | 2017-03-09 | CK&B Co., Ltd. | Server computing device and content-aware image search system using same |
US10078651B2 (en) * | 2015-04-27 | 2018-09-18 | Rovi Guides, Inc. | Systems and methods for updating a knowledge graph through user input |
US10402446B2 (en) | 2015-04-29 | 2019-09-03 | Microsoft Technology Licensing, LLC | Image entity recognition and response |
US9934327B2 (en) | 2015-06-01 | 2018-04-03 | International Business Machines Corporation | Mining relevant approximate subgraphs from multigraphs |
US10474320B2 (en) * | 2015-06-07 | 2019-11-12 | Apple Inc. | Document channel selection for document viewing application |
KR20170004450A (ko) * | 2015-07-02 | 2017-01-11 | LG Electronics Inc. | Mobile terminal and control method therefor |
US9396400B1 (en) * | 2015-07-30 | 2016-07-19 | Snitch, Inc. | Computer-vision based security system using a depth camera |
US10878021B2 (en) | 2015-08-17 | 2020-12-29 | Adobe Inc. | Content search and geographical considerations |
US10475098B2 (en) | 2015-08-17 | 2019-11-12 | Adobe Inc. | Content creation suggestions using keywords, similarity, and social networks |
US11048779B2 (en) | 2015-08-17 | 2021-06-29 | Adobe Inc. | Content creation, fingerprints, and watermarks |
US9881226B1 (en) * | 2015-09-24 | 2018-01-30 | Amazon Technologies, Inc. | Object relation builder |
WO2017055890A1 (en) | 2015-09-30 | 2017-04-06 | The Nielsen Company (Us), Llc | Interactive product auditing with a mobile device |
US10956948B2 (en) * | 2015-11-09 | 2021-03-23 | Anupam Madiratta | System and method for hotel discovery and generating generalized reviews |
CN106779791B (zh) * | 2015-11-25 | 2021-01-15 | Alibaba Group Holding Limited | Method and device for generating picture combinations of matching objects |
US10872114B2 (en) * | 2015-12-17 | 2020-12-22 | Hitachi, Ltd. | Image processing device, image retrieval interface display device, and method for displaying image retrieval interface |
US10185755B2 (en) * | 2015-12-28 | 2019-01-22 | Business Objects Software Limited | Orchestration of data query processing in a database system |
US20170249325A1 (en) * | 2016-02-26 | 2017-08-31 | Microsoft Technology Licensing, Llc | Proactive favorite leisure interest identification for personalized experiences |
US10452874B2 (en) | 2016-03-04 | 2019-10-22 | Disney Enterprises, Inc. | System and method for identifying and tagging assets within an AV file |
US10783382B2 (en) * | 2016-04-06 | 2020-09-22 | Semiconductor Components Industries, Llc | Systems and methods for buffer-free lane detection |
US10740385B1 (en) * | 2016-04-21 | 2020-08-11 | Shutterstock, Inc. | Identifying visual portions of visual media files responsive to search queries |
US10671930B2 (en) * | 2016-05-01 | 2020-06-02 | Indiggo Associates LLC | Knowledge inference apparatus and methods to determine emerging industrial trends and adapt strategic reasoning thereof |
US20170337293A1 (en) * | 2016-05-18 | 2017-11-23 | Sisense Ltd. | System and method of rendering multi-variant graphs |
US10339708B2 (en) * | 2016-11-01 | 2019-07-02 | Google Inc. | Map summarization and localization |
WO2018093182A1 (en) * | 2016-11-16 | 2018-05-24 | Samsung Electronics Co., Ltd. | Image management method and apparatus thereof |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
JP6811645B2 (ja) * | 2017-02-28 | 2021-01-13 | Hitachi, Ltd. | Image search device and image search method |
US20180339730A1 (en) * | 2017-05-26 | 2018-11-29 | Dura Operating, Llc | Method and system for generating a wide-area perception scene graph |
GB201709845D0 (en) * | 2017-06-20 | 2017-08-02 | Nchain Holdings Ltd | Computer-implemented system and method |
US10572743B1 (en) | 2017-08-28 | 2020-02-25 | Ambarella, Inc. | Real-time color classification for street vehicles |
JP6989450B2 (ja) * | 2018-06-21 | 2022-01-05 | Toshiba Corporation | Image analysis device, image analysis method, and program |
JP7131195B2 (ja) * | 2018-08-14 | 2022-09-06 | Nippon Telegraph and Telephone Corporation | Object recognition device, object recognition learning device, method, and program |
US11436215B2 (en) | 2018-08-20 | 2022-09-06 | Samsung Electronics Co., Ltd. | Server and control method thereof |
WO2020055910A1 (en) | 2018-09-10 | 2020-03-19 | Drisk, Inc. | Systems and methods for graph-based ai training |
WO2020054067A1 (ja) * | 2018-09-14 | 2020-03-19 | Mitsubishi Electric Corporation | Image information processing device, image information processing method, and image information processing program |
RU2707710C1 (ru) * | 2018-10-13 | 2019-11-28 | Anatoly Vasilyevich Popov | Method for extracting a feature vector for recognizing images of objects |
US11205050B2 (en) * | 2018-11-02 | 2021-12-21 | Oracle International Corporation | Learning property graph representations edge-by-edge |
US10896493B2 (en) * | 2018-11-13 | 2021-01-19 | Adobe Inc. | Intelligent identification of replacement regions for mixing and replacing of persons in group portraits |
CN111382628B (zh) * | 2018-12-28 | 2023-05-16 | Chengdu Intellifusion Technologies Co., Ltd. | Companion determination method and device |
JP7272626B2 (ja) * | 2019-01-09 | 2023-05-12 | i-PRO Co., Ltd. | Collation system, collation method, and camera device |
US11755925B2 (en) * | 2019-03-13 | 2023-09-12 | Fair Isaac Corporation | Computer-implemented decision management systems and methods |
US11720621B2 (en) * | 2019-03-18 | 2023-08-08 | Apple Inc. | Systems and methods for naming objects based on object content |
US10853983B2 (en) * | 2019-04-22 | 2020-12-01 | Adobe Inc. | Suggestions to enrich digital artwork |
US11107098B2 (en) | 2019-05-23 | 2021-08-31 | Content Aware, Llc | System and method for content recognition and data categorization |
US11861863B2 (en) * | 2019-06-17 | 2024-01-02 | Faro Technologies, Inc. | Shape dependent model identification in point clouds |
JP7155074B2 (ja) * | 2019-07-03 | 2022-10-18 | FUJIFILM Corporation | Information proposal system, information proposal method, program, and recording medium |
US11475065B2 (en) * | 2019-10-29 | 2022-10-18 | Neo4J Sweden Ab | Pre-emptive graph search for guided natural language interactions with connected data systems |
CN111582152A (zh) * | 2020-05-07 | 2020-08-25 | Weite Technologies Co., Ltd. | Method and system for recognizing complex events in images |
WO2021256184A1 (en) * | 2020-06-18 | 2021-12-23 | Nec Corporation | Method and device for adaptively displaying at least one potential subject and a target subject |
CN111967467B (zh) * | 2020-07-24 | 2022-10-04 | Beihang University | Image object detection method, apparatus, electronic device, and computer-readable medium |
US12045278B2 (en) | 2020-09-18 | 2024-07-23 | Google Llc | Intelligent systems and methods for visual search queries |
CN117610105B (zh) * | 2023-12-07 | 2024-06-07 | Shanghai Xuanyi Technology Co., Ltd. | Model view structure design method for automatic generation of system design results |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001222586A (ja) * | 2000-02-09 | 2001-08-17 | Sony Corp | Online shopping device, online shopping method, system therefor, and terminal device therefor |
US20060004892A1 (en) * | 2004-06-14 | 2006-01-05 | Christopher Lunt | Visual tags for search results generated from social network information |
JP2007264718A (ja) * | 2006-03-27 | 2007-10-11 | Yafoo Japan Corp | User interest analysis device, method, and program |
JP2009205289A (ja) * | 2008-02-26 | 2009-09-10 | Nippon Telegr & Teleph Corp <Ntt> | Interest system graph formation device, interest system graph formation method, and interest system graph formation program |
JP2010250529A (ja) * | 2009-04-15 | 2010-11-04 | Yahoo Japan Corp | Image search device, image search method, and program |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6038333A (en) * | 1998-03-16 | 2000-03-14 | Hewlett-Packard Company | Person identifier and management system |
EP1129417A4 (en) * | 1998-12-04 | 2004-06-30 | Technology Enabling Company Ll | DATA ORGANIZATION SYSTEMS AND METHODS |
GB2375212B (en) * | 1999-04-29 | 2003-06-11 | Mitsubishi Electric Inf Tech | Method and apparatus for searching for an object using shape |
US6598043B1 (en) * | 1999-10-04 | 2003-07-22 | Jarg Corporation | Classification of information sources using graph structures |
KR100353798B1 (ko) * | 1999-12-01 | 2002-09-26 | Konan Technology Inc. | Method for extracting shape information of a video object and content-based image retrieval system and method using same |
US6691126B1 (en) * | 2000-06-14 | 2004-02-10 | International Business Machines Corporation | Method and apparatus for locating multi-region objects in an image or video database |
US7624337B2 (en) * | 2000-07-24 | 2009-11-24 | Vmark, Inc. | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
TW501035B (en) * | 2001-03-20 | 2002-09-01 | Ulead Systems Inc | Interactive image searching method based on local object |
US7773800B2 (en) * | 2001-06-06 | 2010-08-10 | Ying Liu | Attrasoft image retrieval |
US7512612B1 (en) * | 2002-08-08 | 2009-03-31 | Spoke Software | Selecting an optimal path through a relationship graph |
US7672911B2 (en) * | 2004-08-14 | 2010-03-02 | Hrl Laboratories, Llc | Graph-based cognitive swarms for object group recognition in a 3N or greater-dimensional solution space |
JP2009504222A (ja) * | 2005-08-09 | 2009-02-05 | Koninklijke Philips Electronics N.V. | System and method for spatially enhancing the structure of noisy images by blind deconvolution |
US8085995B2 (en) * | 2006-12-01 | 2011-12-27 | Google Inc. | Identifying images using face recognition |
US20080201206A1 (en) * | 2007-02-01 | 2008-08-21 | 7 Billion People, Inc. | Use of behavioral portraits in the conduct of E-commerce |
JP2010525431A (ja) * | 2007-04-19 | 2010-07-22 | D-Wave Systems, Inc. | Systems, methods, and apparatus for automated image recognition |
JP2010530998A (ja) * | 2007-05-08 | 2010-09-16 | Eidgenössische Technische Hochschule Zürich | Method and system for image-based information retrieval |
US8364528B2 (en) * | 2008-05-06 | 2013-01-29 | Richrelevance, Inc. | System and process for improving product recommendations for use in providing personalized advertisements to retail customers |
US8417698B2 (en) * | 2008-05-06 | 2013-04-09 | Yellowpages.Com Llc | Systems and methods to provide search based on social graphs and affinity groups |
US8386486B2 (en) * | 2008-07-02 | 2013-02-26 | Palo Alto Research Center Incorporated | Method for facilitating social networking based on fashion-related information |
US8145521B2 (en) * | 2008-07-15 | 2012-03-27 | Google Inc. | Geographic and keyword context in embedded applications |
WO2010006367A1 (en) * | 2008-07-16 | 2010-01-21 | Imprezzeo Pty Ltd | Facial image recognition and retrieval |
US8972410B2 (en) * | 2008-07-30 | 2015-03-03 | Hewlett-Packard Development Company, L.P. | Identifying related objects in a computer database |
US8391615B2 (en) * | 2008-12-02 | 2013-03-05 | Intel Corporation | Image recognition algorithm, method of identifying a target image using same, and method of selecting data for transmission to a portable electronic device |
US8320617B2 (en) | 2009-03-27 | 2012-11-27 | Utc Fire & Security Americas Corporation, Inc. | System, method and program product for camera-based discovery of social networks |
US20110145275A1 (en) * | 2009-06-19 | 2011-06-16 | Moment Usa, Inc. | Systems and methods of contextual user interfaces for display of media items |
US9135277B2 (en) * | 2009-08-07 | 2015-09-15 | Google Inc. | Architecture for responding to a visual query |
US8670597B2 (en) * | 2009-08-07 | 2014-03-11 | Google Inc. | Facial recognition with social network aiding |
US8781231B1 (en) * | 2009-08-25 | 2014-07-15 | Google Inc. | Content-based image ranking |
KR101657565B1 (ko) * | 2010-04-21 | 2016-09-19 | LG Electronics Inc. | Augmented remote control device and operation method therefor |
US8818049B2 (en) * | 2011-05-18 | 2014-08-26 | Google Inc. | Retrieving contact information based on image recognition searches |
2011
- 2011-06-23 JP JP2013521387A patent/JP5830784B2/ja active Active
- 2011-06-23 WO PCT/JP2011/064463 patent/WO2012176317A1/ja active Application Filing
- 2011-06-23 US US14/129,005 patent/US9600499B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001222586A (ja) * | 2000-02-09 | 2001-08-17 | Sony Corp | Online shopping device, online shopping method, system therefor, and terminal device therefor |
US20060004892A1 (en) * | 2004-06-14 | 2006-01-05 | Christopher Lunt | Visual tags for search results generated from social network information |
JP2007264718A (ja) * | 2006-03-27 | 2007-10-11 | Yafoo Japan Corp | User interest analysis device, method, and program |
JP2009205289A (ja) * | 2008-02-26 | 2009-09-10 | Nippon Telegr & Teleph Corp <Ntt> | Interest system graph formation device, interest system graph formation method, and interest system graph formation program |
JP2010250529A (ja) * | 2009-04-15 | 2010-11-04 | Yahoo Japan Corp | Image search device, image search method, and program |
Non-Patent Citations (2)
Title |
---|
NORIYUKI YAMAZAKI: "03 Life Log & Recommendation ni yoru Jisedai Kensaku no Kanosei (1) Social Graph ga Tsumugu Atarashii Kankeisei" [The potential of next-generation search based on life logs and recommendation (1): the new relationships woven by the social graph], WEB SITE EXPERT, 25 June 2008 (2008-06-25), pages 52-55 *
YOSHISUKE YAMAKAWA ET AL.: "Ruiji Gazo Kensaku Gijutsu o Mochiita Shohin Suisen System" [A product recommendation system using similar-image retrieval technology], IMAGE LAB, vol. 19, no. 8, 10 August 2008 (2008-08-10), pages 11-15 *
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10241645B2 (en) | 2013-02-14 | 2019-03-26 | Facebook, Inc. | Lock screen with socialized applications |
JP2016511900A (ja) * | 2013-02-14 | 2016-04-21 | Facebook, Inc. | Lock screen with socialized applications |
CN105027162A (zh) | 2013-02-27 | 2015-11-04 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
US10438050B2 (en) | 2013-02-27 | 2019-10-08 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
WO2014132349A1 (ja) * | 2013-02-27 | 2014-09-04 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
CN105027162B (zh) | 2013-02-27 | 2018-02-02 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
JP6005837B2 (ja) * | 2013-02-27 | 2016-10-12 | Hitachi, Ltd. | Image analysis device, image analysis system, and image analysis method |
US11729347B2 (en) | 2013-06-28 | 2023-08-15 | Nec Corporation | Video surveillance system, video processing apparatus, video processing method, and video processing program |
JP7036186B2 (ja) | 2013-06-28 | 2022-03-15 | NEC Corporation | Video processing system, video processing method, and video processing program |
JP2021061598A (ja) * | 2013-06-28 | 2021-04-15 | NEC Corporation | Video processing system, video processing method, and video processing program |
JP2015026202A (ja) * | 2013-07-25 | 2015-02-05 | Japan Broadcasting Corporation (NHK) | Program search device and program search program |
JP2016529612A (ja) * | 2013-08-02 | 2016-09-23 | Emotient, Inc. | Filter and shutter based on image emotion content |
US9817845B2 (en) | 2013-12-12 | 2017-11-14 | Xyzprinting, Inc. | Three-dimensional image file searching method and three-dimensional image file searching system |
EP3121736A4 (en) * | 2014-03-19 | 2017-08-16 | Kyoto University | Object display system for relatedness graph |
WO2015141341A1 (ja) * | 2014-03-19 | 2015-09-24 | Kyoto University | Object display system for relationship graph |
US10366125B2 (en) | 2014-03-19 | 2019-07-30 | Kyoto University | Object display system for relationship graph |
JP2015179387A (ja) * | 2014-03-19 | 2015-10-08 | Kyoto University | Object display system for relationship graph |
JP2015191334A (ja) * | 2014-03-27 | 2015-11-02 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
WO2015147317A1 (en) * | 2014-03-27 | 2015-10-01 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
KR20160136391A (ko) * | 2014-03-27 | 2016-11-29 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
US10255517B2 (en) | 2014-03-27 | 2019-04-09 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
KR101964397B1 (ko) | 2014-03-27 | 2019-04-01 | Canon Kabushiki Kaisha | Information processing apparatus and information processing method |
WO2016103651A1 (ja) * | 2014-12-22 | 2016-06-30 | NEC Corporation | Information processing system, information processing method, and recording medium |
WO2016190264A1 (ja) * | 2015-05-26 | 2016-12-01 | Kobe Digital Labo Inc. | Interest information generation system |
JP2016218950A (ja) * | 2015-05-26 | 2016-12-22 | Kobe Digital Labo Inc. | Interest information generation system |
CN105354550A (zh) * | 2015-11-03 | 2016-02-24 | East China Normal University | Form content extraction method based on registration of local image feature points |
CN105354550B (zh) * | 2015-11-03 | 2018-09-28 | East China Normal University | Form content extraction method based on registration of local image feature points |
WO2017110640A1 (ja) * | 2015-12-25 | 2017-06-29 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method, and computer program |
US10049267B2 (en) | 2016-02-29 | 2018-08-14 | Toyota Jidosha Kabushiki Kaisha | Autonomous human-centric place recognition |
JP2017157201A (ja) * | 2016-02-29 | 2017-09-07 | Toyota Jidosha Kabushiki Kaisha | Human-centered place recognition method |
JP2017167806A (ja) * | 2016-03-16 | 2017-09-21 | Toshiba Corporation | Relationship visualization device, method, and program |
JP2017220225A (ja) * | 2016-06-07 | 2017-12-14 | Palo Alto Research Center Incorporated | Local visual graph filters for complex graph search |
JP7041473B2 (ja) | 2016-06-07 | 2022-03-24 | Palo Alto Research Center Incorporated | Local visual graph filters for complex graph search |
WO2018142756A1 (ja) * | 2017-01-31 | 2018-08-09 | NTT Docomo, Inc. | Information processing device and information processing method |
JPWO2018142756A1 (ja) * | 2017-01-31 | 2019-11-07 | NTT Docomo, Inc. | Information processing device and information processing method |
US10394987B2 (en) | 2017-03-21 | 2019-08-27 | International Business Machines Corporation | Adaptive bug-search depth for simple and deep counterexamples |
JP2019049943A (ja) * | 2017-09-12 | 2019-03-28 | Toppan Printing Co., Ltd. | Image processing device, image processing method, and program |
JP7006059B2 (ja) | 2017-09-12 | 2022-01-24 | Toppan Printing Co., Ltd. | Image processing device, image processing method, and program |
JP7104779B2 (ja) | 2017-09-15 | 2022-07-21 | Cloudminds (Beijing) Technologies Co., Ltd. | Object recognition method, device, and intelligent terminal |
JP2020535501A (ja) * | 2017-09-15 | 2020-12-03 | Cloudminds (Beijing) Technologies Co., Ltd. | Object recognition method, device, and intelligent terminal |
CN111247536A (zh) * | 2017-10-27 | 2020-06-05 | Samsung Electronics Co., Ltd. | Electronic device for searching related images and control method therefor |
CN111247536B (zh) * | 2017-10-27 | 2023-11-10 | Samsung Electronics Co., Ltd. | Electronic device for searching related images and control method therefor |
US11853108B2 (en) | 2017-10-27 | 2023-12-26 | Samsung Electronics Co., Ltd. | Electronic apparatus for searching related image and control method therefor |
JP2022505015A (ja) * | 2019-10-08 | 2022-01-14 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, and electronic device for generating a vector representation of a knowledge graph |
JP7262571B2 (ja) | 2019-10-08 | 2023-04-21 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, and electronic device for generating a vector representation of a knowledge graph |
US11995560B2 (en) | 2019-10-08 | 2024-05-28 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating vector representation of knowledge graph |
CN114282545A (zh) * | 2021-09-23 | 2022-04-05 | China UnionPay Co., Ltd. | Method and system for generating object abbreviations, storage medium, and program product |
Also Published As
Publication number | Publication date |
---|---|
JP5830784B2 (ja) | 2015-12-09 |
US9600499B2 (en) | 2017-03-21 |
JPWO2012176317A1 (ja) | 2015-02-23 |
US20140149376A1 (en) | 2014-05-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5830784B2 (ja) | System for collecting interest graph by relevance search incorporating image recognition system | |
KR101768521B1 (ko) | Method and system for providing information data on an object included in an image | |
US7917514B2 (en) | Visual and multi-dimensional search | |
US8837831B2 (en) | Method and system for managing digital photos | |
US10042866B2 (en) | Searching untagged images with text-based queries | |
US9053194B2 (en) | Method and apparatus for correlating and viewing disparate data | |
US11036790B1 (en) | Identifying visual portions of visual media files responsive to visual portions of media files submitted as search queries | |
CN111125422A (zh) | Image classification method, apparatus, electronic device, and storage medium | |
JP2020504378A (ja) | Visual category representation with diverse ranking | |
US20070288453A1 (en) | System and Method for Searching Multimedia using Exemplar Images | |
CN106255968A (zh) | Natural language image search | |
EP2038775A1 (en) | Visual and multi-dimensional search | |
US10740385B1 (en) | Identifying visual portions of visual media files responsive to search queries | |
Zhang et al. | Image clustering: An unsupervised approach to categorize visual data in social science research | |
JP6787831B2 (ja) | Object detection device capable of learning from search results, detection model generation device, program, and method | |
Cho et al. | Classifying tourists’ photos and exploring tourism destination image using a deep learning model | |
CN114329069A (zh) | Intelligent systems and methods for visual search queries | |
Maihami et al. | Automatic image annotation using community detection in neighbor images | |
Zhang et al. | Automatic latent street type discovery from web open data | |
Kitamura et al. | Tourist spot recommendation applying generic object recognition with travel photos | |
US20220198771A1 (en) | Discovery, Management And Processing Of Virtual Real Estate Content | |
Lei et al. | A new clothing image retrieval algorithm based on sketch component segmentation in mobile visual sensors | |
Guo et al. | Object discovery in high-resolution remote sensing images: a semantic perspective | |
US10783398B1 (en) | Image editor including localized editing based on generative adversarial networks | |
CN114489434A (zh) | Display method, display device, and computer storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11868338 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2013521387 Country of ref document: JP Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
WWE | Wipo information: entry into national phase |
Ref document number: 14129005 Country of ref document: US |
122 | Ep: pct application non-entry in european phase |
Ref document number: 11868338 Country of ref document: EP Kind code of ref document: A1 |