WO2023033199A1 - Methods for automatically identifying a match between a product image and a reference drawing based on artificial intelligence - Google Patents

Methods for automatically identifying a match between a product image and a reference drawing based on artificial intelligence

Info

Publication number
WO2023033199A1
WO2023033199A1 (PCT/KR2021/011669)
Authority
WO
WIPO (PCT)
Prior art keywords
product
image
images
similarity
respective reference
Prior art date
Application number
PCT/KR2021/011669
Other languages
French (fr)
Inventor
Jae Hyung Lee
Youngsik Kim
Yeul Na
Original Assignee
Stratio
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Stratio filed Critical Stratio
Priority to PCT/KR2021/011669 priority Critical patent/WO2023033199A1/en
Publication of WO2023033199A1 publication Critical patent/WO2023033199A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning

Definitions

  • the disclosed implementations relate generally to methods for matching a product image with reference images, and more specifically to systems and methods for matching a product image with patent drawings.
  • Determining a match between two images is an important task in image processing technologies.
  • an electronic device that can identify similar images has many potential applications.
  • a method of assessing a similarity between a product image and reference drawings is performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors.
  • the method includes obtaining one or more feature vectors corresponding to one or more images of a product; retrieving a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and providing the respective similarity between the product and the respective reference product.
  • a computer readable storage medium stores one or more programs for execution by a computer system having one or more processors and memory.
  • the one or more programs include instructions for obtaining one or more feature vectors corresponding to one or more images of a product; retrieving a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and providing the respective similarity between the product and the respective reference product.
  • a computer system includes one or more processors and memory.
  • the memory stores one or more programs including instructions, which when executed by the one or more processors, cause the one or more processors to: obtain one or more feature vectors corresponding to one or more images of a product; retrieve a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, apply the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and provide the respective similarity between the product and the respective reference product.
  • a method for building models for assessing a similarity between a product image and reference drawings is performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors.
  • the method includes, for a respective reference product of a plurality of reference products: retrieving images of the respective reference product; and forming a respective feature vector for a respective image of the retrieved images of the respective reference product.
  • the method also includes using at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and storing the trained model in a database for subsequent use in determining a similarity to the respective reference product.
  • a computer readable storage medium stores one or more programs for execution by a computer system having one or more processors and memory.
  • the one or more programs include instructions for, for a respective reference product of a plurality of reference products: retrieving images of the respective reference product; and forming a respective feature vector for a respective image of the retrieved images of the respective reference product.
  • the one or more programs also include instructions for: using at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and storing the trained model in a database for subsequent use in determining a similarity to the respective reference product.
  • a computer system includes one or more processors and memory.
  • the memory stores one or more programs including instructions, which, when executed by the one or more processors, cause the one or more processors to: for a respective reference product of a plurality of reference products: retrieve images of the respective reference product; and form a respective feature vector for a respective image of the retrieved images of the respective reference product.
  • the one or more programs also include instructions, which, when executed by the one or more processors, cause the one or more processors to: use at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and store the trained model in a database for subsequent use in determining a similarity to the respective reference product.
  • methods and systems are disclosed for building (e.g., training) models for determining a similarity between a product image and a reference drawing and for using the trained models for determining a similarity between a product image and a reference drawing.
  • Such methods and systems may be used to facilitate identifying one or more products that may appear similar to reference drawings, such as automatically identifying products that may potentially infringe third parties' intellectual property rights (e.g., design patents, trade dress, etc.).
  • Figure 1A illustrates training one or more models in accordance with some implementations.
  • Figure 1B illustrates using one or more models in accordance with some implementations.
  • Figure 2A is a block diagram illustrating a computing device according to some implementations.
  • Figure 2B is a block diagram illustrating a server according to some implementations.
  • Figures 3A - 3B illustrate how a model is trained according to some implementations.
  • Figures 4A - 4B provide a flow diagram of a method for assessing a similarity between a product image and reference drawings according to some implementations.
  • Figures 5A - 5B provide a flow diagram of a method for building models for assessing a similarity between a product image and reference drawings according to some implementations.
  • Figure 6 is a schematic diagram illustrating two operations involved in identifying a match according to some implementations.
  • Figure 1A illustrates training one or more model(s) 132 using drawing data 120 from a plurality of drawings 110 (e.g., reference drawings, such as engineering drawings or patent drawings).
  • the drawing data 120 is input into a machine learning engine 130 configured to train (e.g., produce, generate) one or more models 132 for determining a similarity to one or more drawings of the plurality of drawings 110.
  • the plurality of drawings includes drawings 110 of a same type.
  • the plurality of drawings may include drawings 110 that are all black and white drawings.
  • the plurality of drawings may include drawings 110 that are all color drawings.
  • the type of the drawings in the plurality of drawings determines the type for which the one or more model(s) 132 are trained to provide similarity value(s).
  • for example, when the plurality of drawings includes black and white drawings, the one or more model(s) 132 trained using the drawing data 120 corresponding to the plurality of drawings 110 (e.g., a plurality of black and white drawings) are trained to provide similarity value(s) for a specific black and white drawing.
  • the type of a drawing also identifies whether the drawing is an outline drawing or a photograph and/or whether the drawing includes shading or not.
  • the drawing data 120 is obtained for each drawing in the plurality of drawings.
  • the drawing data 120 includes image data 122 and image type 124.
  • the drawing data 120 also includes other data 126.
  • the first drawing data 120-1 is obtained for the first drawing 110-1.
  • the drawing data 120-1 includes image data 122-1, image type 124-1, and optionally, other data 126-1 corresponding to the first drawing 110-1.
  • the image data 122 includes information representing graphical elements of the associated drawing.
  • the image data 122 may include information indicating color (or black and white) for a plurality of pixels.
  • the image data 122 is compressed.
  • the image data 122 is not compressed.
  • the image type 124 includes the type of drawing.
  • the image type may include information such as black and white or color, outline drawings or filled drawings, drawings with shading or drawings without shading, etc.
  • the other data 126 includes information identifying an associated product or associated patent information (e.g., a registration number of a patent from which the drawing is obtained).
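As an illustrative sketch only (the disclosure does not specify a data layout), a drawing data record carrying these three fields might be represented as follows; all field names are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DrawingData:
    """One record of drawing data 120; field names are illustrative."""
    image_data: bytes                     # pixel or compressed image content (image data 122)
    image_type: str                       # e.g., "bw_outline" or "color_photo" (image type 124)
    product_id: Optional[str] = None      # other data 126: associated product
    patent_number: Optional[str] = None   # other data 126: source patent registration number
```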
  • the machine learning engine 130 forms a feature vector for each respective drawing of the plurality of drawings using the image data 122, and optionally, the image type 124 corresponding to the respective drawing.
  • the machine learning engine 130 then uses the feature vectors to train the one or more models 132 so that the models 132 can determine a similarity to one or more drawings.
  • the one or more models 132 include a plurality of models (e.g., a first model and a second model) and each model of the one or more models 132 is trained to provide a similarity value for a product image.
  • a first model may be trained to provide a similarity value for a product image of a first image type and a second model may be trained to provide a similarity value for a product image of a second image type that is different from the first image type.
  • the first image type may include a black and white plan view and the second image type may include a black and white perspective view.
  • the first model is trained using a first plurality of drawings and the second model is trained using a second plurality of drawings that is different from the first plurality of drawings.
  • for example, the first plurality of drawings includes drawings of the first image type.
  • in contrast, the second plurality of drawings includes drawings of the second image type.
  • the first plurality of drawings differs from the second plurality of drawings by at least one drawing.
  • the first plurality of drawings includes one or more drawings in common with the second plurality of drawings.
  • Figure 1B illustrates using one or more trained models 132 that are trained to determine similarity value(s).
  • New drawing data 141 corresponding to a new drawing 140 is provided to the model(s) 132.
  • the new drawing data 141 includes image data 142 and image type 144 corresponding to the new drawing 140.
  • the new drawing data 141 also includes other data 146 corresponding to the new drawing 140.
  • the new drawing data 141 is provided as input to the one or more trained models 132, and the one or more trained models 132 output results 151 for one or more different reference drawings.
  • the one or more trained models 132 output a plurality of similarity values 152.
  • FIG. 2A is a block diagram illustrating a computing device 200, corresponding to a computing system, which can train and/or execute model(s) 132 in accordance with some implementations.
  • examples of the computing device 200 include a desktop computer, a laptop computer, a tablet computer, a server computer, a server system, a wearable device such as a smart watch, and other computing devices that have a processor capable of training and/or running model(s) 132.
  • the computing device 200 may be a data server that hosts one or more databases, models, or modules, or may provide various executable applications or modules.
  • the computing device 200 typically includes one or more processing units (processors or cores) 202, one or more network or other communications interfaces 204, memory 206, and one or more communication buses 208 for interconnecting these components.
  • the communication buses 208 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
  • the computing device 200 typically includes a user interface 210.
  • the user interface 210 typically includes a display device 212 (e.g., a screen or monitor).
  • the computing device 200 includes input devices such as a keyboard, mouse, and/or other input buttons 216.
  • the display device 212 includes a touch-sensitive surface 214, in which case the display device 212 is a touch-sensitive display.
  • the touch-sensitive surface 214 is configured to detect various swipe gestures (e.g., continuous gestures in vertical and/or horizontal directions) and/or other gestures (e.g., single/double tap).
  • a physical keyboard is optional (e.g., a soft keyboard may be displayed when keyboard entry is needed).
  • the user interface 210 also includes an audio output device 218, such as speakers or an audio output connection connected to speakers, earphones, or headphones.
  • some computing devices 200 use an audio input device 220, such as a microphone, and voice recognition software to supplement or replace the keyboard.
  • the audio input device 220 (e.g., a microphone) captures audio (e.g., speech from a user).
  • the memory 206 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 206 includes one or more storage devices remotely located from the processors 202.
  • the memory 206, or alternatively the non-volatile memory devices within the memory 206, includes a non-transitory computer-readable storage medium.
  • the memory 206 or the computer-readable storage medium of the memory 206 stores the following programs, modules, and data structures, or a subset or superset thereof:
  • an operating system 222 which includes procedures for handling various basic system services and for performing hardware dependent tasks;
  • a communications module 224 which is used for connecting the computing device 200 to other computers and devices via the one or more communication network interfaces 204 (wired or wireless), such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
  • a web browser 226 (or other application capable of displaying web pages), which enables a user to communicate over a network with remote computers or devices;
  • an audio input module 228 (e.g., a microphone module) for processing audio captured by the audio input device 220.
  • the captured audio may be sent to a remote server and/or processed by an application executing on the computing device 200 (e.g., modeling application 230);
  • a modeling application 230 which includes a graphical user interface 100 that allows a user to navigate the modeling application 230, such as accessing and editing drawing data 120, including image data 122, image type 124, and other data 126, and/or accessing and editing new drawing data 141, including image data 142, image type 144, and other data 146.
  • the drawing data 120 is then compiled by the machine learning engine 130 in order to train model(s) 132.
  • the modeling application 230 may also input new drawing data 141 into the model(s) 132 and utilize the model(s) 132 to determine a similarity value.
  • the model(s) 132 take drawing data (including image data 142, image type 144, and other data 146) into account when generating similarity values;
  • a data processing module 232 configured to perform preprocessing steps necessary to convert any raw information into correct data types for the drawing data 120 or for the new drawing data 141.
  • the data processing module 232 may be configured to perform image processing to convert color drawings to black and white drawings, or convert a view of a drawing.
  • the data processing module 232 may also be configured to perform one or more calculations based on the received raw data in order to generate a data value for the drawing data 120 or the new drawing data 141.
  • the data processing module 232 may also be configured to generate imputed (e.g., inferred) data to replace missing values in the drawing data 120 or the new drawing data 141.
  • the data processing module 232 may utilize a variety of different methods to generate (e.g., determine or calculate) the imputed data.
  • the imputation or inference method used by the data processing module 232 to generate the imputed data is based at least in part on the type of data that is missing;
  • a machine learning engine 130 configured to train the model(s) 132 using the drawing data 120 (including image data 122, image type 124, and other data 126) as inputs for training the model(s) 132;
  • Drawing data 120 includes image data 122, image type 124, and other data 126.
  • New drawing data 141 includes image data 142, image type 144, and other data 146.
  • the memory 206 stores metrics and/or scores determined by the one or more models 132.
  • the memory 206 may store thresholds and other criteria, which are compared against the metrics and/or scores determined by the machine learning engine 130 and/or model(s) 132.
  • the model(s) 132 may compare (e.g., calculate) similarity values against a threshold similarity value to select similarity values that satisfy (e.g., exceed) the threshold similarity value.
  • Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above.
  • the above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various implementations.
  • the memory 206 stores a subset of the modules and data structures identified above.
  • the memory 206 may store additional modules or data structures not described above.
  • Although Figure 2A shows a computing device 200, Figure 2A is intended more as a functional description of the various features that may be present rather than as a structural schematic of the implementations described herein.
  • items shown separately could be combined and some items could be separated.
  • FIG. 2B is a block diagram of a server 250 in accordance with some implementations.
  • a server 250 may host one or more databases 290 or may provide various executable applications or modules.
  • a server 250 typically includes one or more processing units/cores (CPUs) 252, one or more network interfaces 262, memory 264, and one or more communication buses 254 for interconnecting these components.
  • the server 250 includes a user interface 256, which includes a display 258 and one or more input devices 260, such as a keyboard and a mouse.
  • the communication buses 254 include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
  • the memory 264 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices, and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 264 includes one or more storage devices remotely located from the CPU(s) 252.
  • the memory 264, or alternatively the non-volatile memory devices within the memory 264 comprises a non-transitory computer readable storage medium.
  • the memory 264 stores the following programs, modules, and data structures, or a subset thereof:
  • an operating system 270 which includes procedures for handling various basic system services and for performing hardware dependent tasks;
  • a network communication module 272 which is used for connecting the server 250 to other computers via the one or more communication network interfaces (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
  • a web server 274 (such as an HTTP server), which receives web requests from users and responds by providing responsive web pages or other resources;
  • a predictive application or a modeling web application 280 which may be downloaded and executed by a web browser 226 on a user's computing device 200.
  • a modeling web application 280 has the same functionality as the desktop modeling application 230, but provides the flexibility of access from any device at any location with network connectivity, and does not require installation and maintenance.
  • the modeling web application 280 includes various software modules to perform certain tasks.
  • the modeling web application 280 includes a graphical user interface module 282, which provides the user interface for all aspects of the modeling web application 280.
  • the modeling web application 280 includes drawing data 120 and new drawing data 141 as described above for a computing device 200;
  • a data processing module 232 for performing preprocessing steps required to convert raw information into correct data types for the drawing data 120 or for the new drawing data 141, performing one or more calculations based on the received raw data in order to generate a data value for the drawing data 120 or the new drawing data 141, and/or generating imputed (e.g., inferred) data to replace missing values in the drawing data 120 or the new drawing data 141 as described above;
  • a machine learning engine 130 for training the model(s) 132 as described above;
  • the databases 290 may store drawing data 120, new drawing data 141, results 151 (including similarity values 152), and one or more model(s) 132 as described above.
  • Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above.
  • the above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise rearranged in various implementations.
  • the memory 264 stores a subset of the modules and data structures identified above.
  • the memory 264 stores additional modules or data structures not described above.
  • Although Figure 2B shows a server 250, Figure 2B is intended more as a functional description of the various features that may be present rather than as a structural schematic of the implementations described herein.
  • items shown separately could be combined and some items could be separated.
  • some of the programs, functions, procedures, or data shown above with respect to a server 250 may be stored or executed on a computing device 200.
  • the functionality and/or data may be allocated between a computing device 200 and one or more servers 250.
  • Figure 2B need not represent a single physical device.
  • the server functionality is allocated across multiple physical devices that comprise a server system.
  • references to a "server” include various groups, collections, or arrays of servers that provide the described functionality, and the physical servers need not be physically collocated (e.g., the individual physical devices could be spread throughout the United States or throughout the world).
  • Figures 3A - 3B illustrate how a model (e.g., a first model) of the plurality of models 132 is trained according to some implementations.
  • the machine learning engine 130 receives drawing data 120 (e.g., training data) for a plurality of drawings (e.g., n number of drawings, drawings 110-1 to 110-n).
  • the drawing data 120 for a respective drawing of the plurality of drawings 110 includes image data 122, image type 124, and optionally, other data 126.
  • the machine learning engine 130 divides the drawing data 120 into a first subset of drawing data 120 to be used as training data 310 and a second subset of drawing data 120 to be used as testing data 312.
  • the first subset of drawing data (e.g., the training data 310, drawing data 120-1 to 120-p, where p < n) includes information corresponding to a first subset of drawings (e.g., drawings 110-1 to 110-p) and the second subset of drawing data (e.g., the testing data 312, drawing data 120-(p+1) to 120-n) includes information corresponding to a second subset of drawings (e.g., drawings 110-(p+1) to 110-n).
  • the training data 310 (e.g., the first subset of drawing data) includes at least 50%, 60%, 70%, 80%, or 90% of the plurality of drawing data 120.
  • the training data 310 (e.g., the first subset of the plurality of drawing data) may include 70% of the plurality of drawing data 120 and the testing data 312 (e.g., the second subset of the plurality of drawing data) includes 30% of the plurality of drawing data 120.
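A minimal sketch of the described split, assuming a random partition (the disclosure does not specify how drawings are assigned to the two subsets); the function name is hypothetical and the 70/30 default follows the example above:

```python
import random

def split_drawing_data(drawing_data, train_fraction=0.7, seed=0):
    """Split drawing data 120 into training data 310 and testing data 312.

    The 70/30 ratio follows the example in the text; any split where the
    training data is at least 50% would be equally consistent with it.
    """
    records = list(drawing_data)
    random.Random(seed).shuffle(records)
    p = int(len(records) * train_fraction)
    return records[:p], records[p:]   # (training data 310, testing data 312)
```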
  • the machine learning engine 130 uses the training data 310 and the testing data 312 to train the model of the plurality of models 132.
  • the machine learning engine 130 uses the training data 310 to train (e.g., generate) a model in-training 320 and uses the testing data 312 to test and refine the model in-training 320 in order to generate (e.g., train) the model 132-m.
  • the model 132-m can be used to determine similarity values for drawings of a specific image type.
  • This process can be repeated for a plurality of models 132, where the drawing data 120 and the plurality of drawings 110 used as inputs to train each model 132 differs for each different (e.g., distinct) model.
  • Figures 4A - 4B provide a flow diagram of a method 400 for assessing a similarity between a product image and reference drawings according to some implementations.
  • the operations of the method 400 may be performed by a computer system, such as a computing device 200 or a server 250.
  • the computer system includes one or more processors and memory.
  • the operations shown in Figures 4A - 4B correspond to instructions stored in computer memory or a computer-readable storage medium (e.g., the memory 206 of the computing device 200).
  • the memory stores one or more programs configured for execution by the one or more processors.
  • the operations of the method 400 are performed, at least in part, by a machine learning engine 130.
  • a computer system obtains (410) one or more feature vectors corresponding to one or more images of a product.
  • a feature vector is a group of numerical features that represent a certain object.
  • the computer system may receive two images of a product (e.g., a perspective view and a plan view), and each image may be represented by a respective feature vector (e.g., a first image of the product is represented by a first feature vector and a second image of the product is represented by a second feature vector).
  • the computer system receives feature vectors instead of receiving a raw image file.
  • the computer system obtains (402), before obtaining the one or more feature vectors, the one or more images of the product; and extracts the one or more feature vectors from the one or more images of the product. For example, in some implementations, the computer system receives raw images and extracts the one or more feature vectors. In some implementations, the computer system receives two images and extracts a first feature vector for a first image and a second feature vector for a second image. In some implementations, the computer system includes, or is in communication with, a camera (e.g., the computer system is a mobile phone with a built-in camera). The camera is used to collect an image of a product, which is subsequently processed for extracting one or more feature vectors.
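As one hedged example of the extraction step, a pretrained convolutional network could produce such feature vectors; the disclosure does not name a particular feature extractor, so the ResNet-50 backbone below is an assumption:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Backbone choice (ResNet-50) is an assumption; the disclosure names none.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # drop the classifier; keep 2048-d features
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_feature_vector(image_path: str) -> torch.Tensor:
    """Return a feature vector for one product image (operation 402)."""
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        return backbone(preprocess(image).unsqueeze(0)).squeeze(0)
```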
  • the computer system retrieves (420) a plurality of trained models built to determine respective similarities to respective reference products.
  • a respective model is configured for determining a similarity value with respect to a respective reference product (or a respective drawing of the reference product). For comparing the image of the product with multiple reference products, in some implementations, multiple models are used.
  • At least a first subset of the plurality of trained models is associated with respective image types (e.g., a color photograph, a black and white photograph, a rendering, a drawing, etc.).
  • the computer system selects (422) one or more trained models of at least the first subset of the plurality of trained models associated with a respective image type corresponding to an image type of a particular image of the one or more images of the product. For example, when the received product image is a rendered image, the computer system selects models configured for use with rendered images. In another example, when the received product image is an engineering drawing, the computer system selects models configured for use with engineering drawings.
  • At least a second subset of the plurality of trained models is not associated (424) with a particular image type.
  • a deep neural network engine or an open cross-domain visual search algorithm may be used to compare images of different image types.
  • the respective trained model has been trained (426) according to data for a plurality of products.
  • a first subset of the plurality of products includes products that are similar to the respective reference product.
  • a second subset of the plurality of products includes products that are not similar to the respective reference product.
  • the respective trained model is trained with both positive and negative references.
  • the computer system, for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applies (430, Figure 4B) the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product.
  • a similarity value is obtained by applying the respective trained model to a feature vector.
  • software tools such as SphereFace, CosFace, ArcFace, etc., may be used to determine the similarity.
  • At least a first subset of the plurality of trained models is associated with respective image types.
  • the computer system, for the respective trained model, converts (432) the one or more images of the product to respective images of a respective image type associated with the respective trained model. For example, as described with respect to Figure 6, an image of a first image type may be converted into an image of a second image type that is distinct from the first image type.
  • software tools such as Domain Transfer Network, Pix2Pix, etc., may be used.
  • the computer system determines (434) the respective similarity between the product and the respective reference product based on applying two or more trained models built to determine respective similarities between the product and the respective reference product.
  • the two or more trained models are associated with two or more image types.
  • applying the respective trained model to a feature vector of the one or more feature vectors includes (436) determining a distance between the feature vector of the one or more feature vectors corresponding to the one or more images of the product and a feature vector extracted from the image of the respective reference product.
  • a Euclidean distance or a Cosine distance may be calculated.
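A small sketch of the two distance measures named above; mapping a distance to a similarity value is left to the surrounding model logic, and a smaller distance indicates a higher similarity (operation 436):

```python
import numpy as np

def euclidean_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Euclidean distance between a product feature vector and a reference one."""
    return float(np.linalg.norm(u - v))

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine distance: 1 minus the cosine of the angle between the vectors."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```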
  • the image of the respective reference product includes (438) a patent drawing.
  • the computer system provides (440) the respective similarity between the product and the respective reference product.
  • Figures 5A - 5B provide a flow diagram of a method 500 for building models for assessing a similarity between a product image and reference drawings according to some implementations.
  • the steps of the method 500 may be performed by a computer system, such as a computing device 200 or a server 250.
  • the computer system includes one or more processors and memory.
  • the operations shown in Figures 5A - 5B correspond to instructions stored in a computer memory or computer-readable storage medium (e.g., the memory 206 of the computing device 200).
  • the memory stores one or more programs configured for execution by the one or more processors.
  • a computer system, for a respective reference product of a plurality of reference products, retrieves (510) images of the respective reference product.
  • the images of the respective reference products are associated with respective image types.
  • the computer system selects (512) one or more images of the images of the respective reference products corresponding to a particular image type of the respective image types.
  • the selected one or more images correspond to a particular image type, such as a photograph, a rendered image, an engineering drawing, or a patent drawing.
  • the images of the respective reference products are associated with respective image types.
  • the computer system converts (514) the images of the respective reference product to images of a particular image type of the respective image types. For example, the images of different image types are converted into a same image type, such as a photograph, a rendered image, an engineering drawing, or a patent drawing.
  • the computer system also, for the respective reference product, forms (520) a respective feature vector for a respective image of the retrieved images of the respective reference product (e.g., the computer system extracts a feature vector from the respective image).
  • the computer system uses (530) at least a subset of feature vectors to train a model to determine a similarity to the respective reference product.
  • the computer system trains (532) the model using feature vectors associated with two or more image types.
  • the computer system determines (534) a respective similarity between a product and the respective reference product.
  • determining the respective similarity includes (536) determining a distance to the respective feature vector.
  • the computer system identifies (538) one or more clusters of the feature vectors, and determining the respective similarity includes determining a distance to a respective cluster of the one or more clusters.
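A sketch of cluster-based similarity under the assumption that k-means is used; the disclosure does not name a clustering algorithm, and the number of clusters is illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_reference_features(feature_vectors: np.ndarray, n_clusters: int = 3) -> KMeans:
    """Identify clusters of reference feature vectors (operation 538).

    The clustering algorithm and cluster count are assumptions.
    """
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    kmeans.fit(feature_vectors)
    return kmeans

def distance_to_nearest_cluster(kmeans: KMeans, query: np.ndarray) -> float:
    """Determine the distance from a product feature vector to the closest cluster."""
    distances = np.linalg.norm(kmeans.cluster_centers_ - query, axis=1)
    return float(distances.min())
```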
  • the computer system stores (540) the trained model in a database for subsequent use in determining a similarity to the respective reference product.
  • Figure 6 is a schematic diagram illustrating two operations involved in identifying a match according to some implementations.
  • the first operation involves processing a received image 610 (e.g., a photograph) to obtain a processed image 620.
  • the processing includes an object detection operation.
  • the object detection operation may be performed using software tools, such as a Faster Region-based Convolutional Neural Network with a Region Proposal Network (Faster R-CNN), YOLO, etc.
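For illustration, the object detection operation could use the Faster R-CNN implementation available in torchvision, one of the tool families named above; the score threshold is an assumed parameter:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Faster R-CNN (with a Region Proposal Network) from torchvision; YOLO would
# serve equally well per the text above.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
detector.eval()

def detect_product(image_path: str, score_threshold: float = 0.8) -> torch.Tensor:
    """Return bounding boxes of objects detected in the received image 610."""
    image = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        output = detector([image])[0]
    keep = output["scores"] >= score_threshold
    return output["boxes"][keep]   # these regions can be cropped and aligned
```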
  • the processing includes aligning the received image (or at least a partially processed image, or a portion thereof).
  • the processing includes converting the received image 610 into an image of a different type (e.g., photograph to a line drawing).
  • the conversion operation may be performed using algorithms, such as Domain Transfer Network or Pix2Pix.
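A hypothetical sketch of the conversion step; "generator.pt" stands in for a trained Pix2Pix-style photograph-to-line-drawing generator, which the disclosure names as a technique but does not provide:

```python
import torch
from torchvision.transforms.functional import to_tensor, to_pil_image
from PIL import Image

# "generator.pt" is a placeholder for a trained image-to-image translation
# model (e.g., Pix2Pix or a Domain Transfer Network); no weights are given
# in the disclosure.
generator = torch.jit.load("generator.pt")
generator.eval()

def convert_image_type(image_path: str) -> Image.Image:
    """Convert the received image 610 into an image of a different type,
    e.g., a photograph into a line drawing."""
    photo = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        line_drawing = generator(photo).squeeze(0).clamp(0, 1)
    return to_pil_image(line_drawing)
```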
  • the second operation involves comparing the processed image 620 with a reference drawing 630 to determine a similarity value as described with respect to Figure 1B.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A method of assessing a similarity between a product image and reference drawings may include obtaining one or more feature vectors corresponding to one or more images of a product; retrieving a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and providing the respective similarity between the product and the respective reference product. A method of building such models is also described.

Description

METHODS FOR AUTOMATICALLY IDENTIFYING A MATCH BETWEEN A PRODUCT IMAGE AND A REFERENCE DRAWING BASED ON ARTIFICIAL INTELLIGENCE
The disclosed implementations relate generally to methods for matching a product image with reference images, and more specifically to systems and methods for matching a product image with patent drawings.
Determining a match between two images is an important task in image processing technologies. In particular, an electronic device that can identify similar images has many potential applications.
Although there have been advances in image matching techniques, such techniques do not work well with particular drawings, such as engineering drawings (e.g., computer aided design drawings) or patent drawings.
Accordingly, there is a need for tools that can accurately determine a match between a product image and such reference drawings. The methods and devices described herein address these needs. Such methods and devices may replace conventional methods and devices. Alternatively, such methods and devices may complement conventional methods and devices.
In accordance with some embodiments, a method of assessing a similarity between a product image and reference drawings is performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors. The method includes obtaining one or more feature vectors corresponding to one or more images of a product; retrieving a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and providing the respective similarity between the product and the respective reference product.
In accordance with some embodiments, a computer readable storage medium stores one or more programs for execution by a computer system having one or more processors and memory. The one or more programs include instructions for obtaining one or more feature vectors corresponding to one or more images of a product; retrieving a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and providing the respective similarity between the product and the respective reference product.
In accordance with some embodiments, a computer system includes one or more processors and memory. The memory stores one or more programs including instructions, which when executed by the one or more processors, cause the one or more processors to: obtain one or more feature vectors corresponding to one or more images of a product; retrieve a plurality of trained models built to determine respective similarities to respective reference products; for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, apply the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and provide the respective similarity between the product and the respective reference product.
In accordance with some embodiments, a method for building models for assessing a similarity between a product image and reference drawings is performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors. The method includes, for a respective reference product of a plurality of reference products: retrieving images of the respective reference product; and forming a respective feature vector for a respective image of the retrieved images of the respective reference product. The method also includes using at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and storing the trained model in a database for subsequent use in determining a similarity to the respective reference product.
In accordance with some embodiments, a computer readable storage medium stores one or more programs for execution by a computer system having one or more processors and memory. The one or more programs include instructions for, for a respective reference product of a plurality of reference products: retrieving images of the respective reference product; and forming a respective feature vector for a respective image of the retrieved images of the respective reference product. The one or more programs also include instructions for: using at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and storing the trained model in a database for subsequent use in determining a similarity to the respective reference product.
In accordance with some embodiments, a computer system includes one or more processors and memory. The memory stores one or more programs including instructions, which, when executed by the one or more processors, cause the one or more processors to: for a respective reference product of a plurality of reference products: retrieve images of the respective reference product; and form a respective feature vector for a respective image of the retrieved images of the respective reference product. The one or more programs also include instructions, which, when executed by the one or more processors, cause the one or more processors to: use at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and store the trained model in a database for subsequent use in determining a similarity to the respective reference product.
Thus, methods and systems are disclosed for building (e.g., training) models for determining a similarity between a product image and a reference drawing and for using the trained models for determining a similarity between a product image and a reference drawing. Such methods and systems may be used to facilitate identifying one or more products that may appear similar to reference drawings, such as automatically identifying products that may potentially infringe third parties' intellectual property rights (e.g., design patents, trade dress, etc.).
Both the foregoing general description and the following detailed description are exemplary and explanatory, and are intended to provide further explanation of the invention as claimed.
For a better understanding of these systems, methods, and graphical user interfaces, as well as additional systems, methods, and graphical user interfaces for assessing a similarity between product images and reference drawings, refer to the Description of Implementations below, in conjunction with the following drawings, in which like reference numerals refer to corresponding parts throughout the figures.
Figure 1A illustrates training one or more models in accordance with some implementations.
Figure 1B illustrates using one or more models in accordance with some implementations.
Figure 2A is a block diagram illustrating a computing device according to some implementations.
Figure 2B is a block diagram illustrating a server according to some implementations.
Figures 3A - 3B illustrate how a model is trained according to some implementations.
Figures 4A - 4B provide a flow diagram of a method for assessing a similarity between a product image and reference drawings according to some implementations.
Figures 5A - 5B provide a flow diagram of a method for building models for assessing a similarity between a product image and reference drawings according to some implementations.
Figure 6 is a schematic diagram illustrating two operations involved in identifying a match according to some implementations.
Reference will now be made to implementations, examples of which are illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without requiring these specific details.
Figure 1A illustrates training one or more model(s) 132 using drawing data 120 from a plurality of drawings 110 (e.g., reference drawings, such as engineering drawings or patent drawings). The drawing data 120 is input into a machine learning engine 130 configured to train (e.g., produce, generate) one or more models 132 for determining a similarity to one or more drawings of the plurality of drawings 110.
In some embodiments, the plurality of drawings includes drawings 110 of a same type. For example, the plurality of drawings may include drawings 110 that are all black and white drawings. In another example, the plurality of drawings may include drawings 110 that are all color drawings. The type of the drawings in the plurality of drawings determines the type for which the one or more model(s) 132 are trained to provide similarity value(s). For example, when the plurality of drawings includes black and white drawings, the one or more model(s) 132 trained using the drawing data 120 corresponding to the plurality of drawings 110 (e.g., a plurality of black and white drawings) are trained to provide similarity value(s) for a specific black and white drawing. In some implementations, the type of a drawing also identifies whether the drawing is an outline drawing or a photograph and/or whether the drawing includes shading or not.
The drawing data 120 is obtained for each drawing in the plurality of drawings. The drawing data 120 includes image data 122 and image type 124. In some implementations, the drawing data 120 also includes other data 126. For example, the first drawing data 120-1 is obtained for the first drawing 110-1. The drawing data 120-1 includes image data 122-1, image type 124-1, and optionally, other data 126-1 corresponding to the first drawing 110-1.
The image data 122 includes information representing graphical elements of the associated drawing. For example, the image data 122 may include information indicating color (or black and white) for a plurality of pixels. In some implementations, the image data 122 is compressed. In some implementations, the image data 122 is not compressed. The image type 124 includes the type of drawing. For example, the image type may include information such as black and white or color, outline drawings or filled drawings, drawings with shading or drawings without shading, etc. The other data 126 includes information identifying an associated product or an associated patent information (e.g., a registration number of a patent from which the drawing is obtained).
The machine learning engine 130 forms a feature vector for each respective drawing of the plurality of drawings using the image data 122, and optionally, the image type 124 corresponding to the respective drawing. The machine learning engine 130 then uses the feature vectors to train the one or more models 132 so that the models 132 can determine a similarity to one or more drawings.
In some implementations, the one or more models 132 include a plurality of models (e.g., a first model and a second model) and each model of the one or more models 132 is trained to provide a similarity value for a product image. For example, a first model may be trained to provide a similarity value for a product image of a first image type and a second model may be trained to provide a similarity value for a product image of a second image type that is different from the first image type. For instance, the first image type may include a black and white plan view and the second image type may include a black and white perspective view.
In some implementations, the first model is trained using a first plurality of drawings and the second model is trained using a second plurality of drawings that is different from the first plurality of drawings. For example, the first plurality of drawings includes drawings of the first image type. In contrast, the second plurality of drawings includes drawings of the second image type. The first plurality of drawings differs from the second plurality of drawings by at least one drawing. In some implementations, the first plurality of drawings includes one or more drawings in common with the second plurality of drawings.
Additional details regarding training the one or more models 132 are provided with respect to Figure 3A.
Figure 1B illustrates using one or more trained models 132 that are trained to determine similarity value(s). New drawing data 141 corresponding to a new drawing 140 is provided to the model(s) 132. The new drawing data 141 includes image data 142 and image type 144 corresponding to the new drawing 140. In some implementations, the new drawing data 141 also includes other data 146 corresponding to the new drawing 140. The new drawing data 141 is provided as input to the one or more trained models 132, and the one or more trained models 132 output results 151 for one or more different reference drawings. The one or more trained models 132 output a plurality of similarity values 152.
Figure 2A is a block diagram illustrating a computing device 200, corresponding to a computing system, which can train and/or execute model(s) 132 in accordance with some implementations. Various examples of the computing device 200 include a desktop computer, a laptop computer, a tablet computer, a server computer, a server system, a wearable device such as a smart watch, and other computing devices that have a processor capable of training and/or running model(s) 132. The computing device 200 may be a data server that hosts one or more databases, models, or modules, or may provide various executable applications or modules. The computing device 200 typically includes one or more processing units (processors or cores) 202, one or more network or other communications interfaces 204, memory 206, and one or more communication buses 208 for interconnecting these components. The communication buses 208 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. The computing device 200 typically includes a user interface 210. The user interface 210 typically includes a display device 212 (e.g., a screen or monitor). In some implementations, the computing device 200 includes input devices such as a keyboard, mouse, and/or other input buttons 216. Alternatively or in addition, in some implementations, the display device 212 includes a touch-sensitive surface 214, in which case the display device 212 is a touch-sensitive display. In some implementations, the touch-sensitive surface 214 is configured to detect various swipe gestures (e.g., continuous gestures in vertical and/or horizontal directions) and/or other gestures (e.g., single/double tap). In computing devices that have a touch-sensitive surface 214, a physical keyboard is optional (e.g., a soft keyboard may be displayed when keyboard entry is needed). The user interface 210 also includes an audio output device 218, such as speakers or an audio output connection connected to speakers, earphones, or headphones. Furthermore, some computing devices 200 use an audio input device 220, such as a microphone, and voice recognition software to supplement or replace the keyboard. The audio input device 220 (e.g., a microphone) captures audio (e.g., speech from a user).
The memory 206 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. In some implementations, the memory 206 includes one or more storage devices remotely located from the processors 202. The memory 206, or alternatively the non-volatile memory devices within the memory 206, includes a non-transitory computer-readable storage medium. In some implementations, the memory 206 or the computer-readable storage medium of the memory 206 stores the following programs, modules, and data structures, or a subset or superset thereof:
an operating system 222, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
a communications module 224, which is used for connecting the computing device 200 to other computers and devices via the one or more communication network interfaces 204 (wired or wireless), such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
a web browser 226 (or other application capable of displaying web pages), which enables a user to communicate over a network with remote computers or devices;
an audio input module 228 (e.g., a microphone module) for processing audio captured by the audio input device 220. The captured audio may be sent to a remote server and/or processed by an application executing on the computing device 200 (e.g., modeling application 230);
a modeling application 230, which includes a graphical user interface 100 that allows a user to navigate the modeling application 230, such as accessing and editing drawing data 120, including image data 122, image type 124, and other data 126, and/or accessing and editing new drawing data 141, including image data 142, image type 144, and other data 146. For example, one or more users may use the graphical user interface 100 of the modeling application 230 to select a subset of reference drawings. The drawing data 120 is then compiled by the machine learning engine 130 in order to train model(s) 132. The modeling application 230 may also input new drawing data 141 into the model(s) 132 and utilize the model(s) 132 to determine a similarity value. The model(s) 132 take drawing data (including image data 142, image type 144, and other data 146) into account when generating similarity values;
a data processing module 232 configured to perform preprocessing steps necessary to convert any raw information into correct data types for the drawing data 120 or for the new drawing data 141 (a minimal preprocessing sketch appears after this list). For example, the data processing module 232 may be configured to perform image processing to convert color drawings to black and white drawings, or to convert a view of a drawing. The data processing module 232 may also be configured to perform one or more calculations based on the received raw data in order to generate a data value for the drawing data 120 or the new drawing data 141. The data processing module 232 may also be configured to generate imputed (e.g., inferred) data to replace missing values in the drawing data 120 or the new drawing data 141. The data processing module 232 may utilize a variety of different methods to generate (e.g., determine or calculate) the imputed data. In some implementations, the imputation or inference method used by the data processing module 232 to generate the imputed data is based at least in part on the type of data that is missing;
a machine learning engine 130 configured to train the model(s) 132 using the drawing data 120 (including image data 122, image type 124, and other data 126) as inputs for training the model(s) 132;
one or more models 132 trained by machine learning engine 130 to provide results 151 including similarity values 152;
a database 240, which stores information, such as drawing data 120, new drawing data 141, results 151 (e.g., similarity values 152). Drawing data 120 includes image data 122, image type 124, and other data 126. New drawing data 141 includes image data 142, image type 144, and other data 146.
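As referenced in the description of the data processing module 232 above, the following is a minimal preprocessing sketch, assuming the Pillow and NumPy libraries; the function names and the mean-imputation strategy are illustrative assumptions, not part of the disclosed system:

    # Minimal preprocessing sketch (illustrative; assumes Pillow and NumPy).
    from PIL import Image
    import numpy as np

    def to_black_and_white(path):
        """Convert a color drawing/photograph to a black-and-white image,
        a rough stand-in for the color-to-drawing conversion above."""
        gray = Image.open(path).convert("L")                # grayscale
        return gray.point(lambda p: 255 if p > 128 else 0)  # binarize

    def impute_missing(values):
        """Replace missing (None) entries with the mean of the observed
        values, one simple imputation strategy the module might apply."""
        observed = [v for v in values if v is not None]
        mean = float(np.mean(observed)) if observed else 0.0
        return [mean if v is None else v for v in values]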
In some implementations, the memory 206 stores metrics and/or scores determined by the one or more models 132. In addition, the memory 206 may store thresholds and other criteria, which are compared against the metrics and/or scores determined by the machine learning engine 130 and/or model(s) 132. For example, the model(s) 132 may compare similarity values against a threshold similarity value to select similarity values that satisfy (e.g., exceed) the threshold similarity value.
Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 206 stores a subset of the modules and data structures identified above. Furthermore, the memory 206 may store additional modules or data structures not described above.
Although Figure 2A shows a computing device 200, Figure 2A is intended more as a functional description of the various features that may be present rather than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated.
Figure 2B is a block diagram of a server 250 in accordance with some implementations. A server 250 may host one or more databases 290 or may provide various executable applications or modules. A server 250 typically includes one or more processing units/cores (CPUs) 252, one or more network interfaces 262, memory 264, and one or more communication buses 254 for interconnecting these components. In some implementations, the server 250 includes a user interface 256, which includes a display 258 and one or more input devices 260, such as a keyboard and a mouse. In some implementations, the communication buses 254 include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
In some implementations, the memory 264 includes high-speed random-access memory, such as DRAM, SRAM, DDR RAM, or other random-access solid-state memory devices, and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. In some implementations, the memory 264 includes one or more storage devices remotely located from the CPU(s) 252. The memory 264, or alternatively the non-volatile memory devices within the memory 264, comprises a non-transitory computer readable storage medium.
In some implementations, the memory 264, or the computer readable storage medium of the memory 264, stores the following programs, modules, and data structures, or a subset thereof:
an operating system 270, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
a network communication module 272, which is used for connecting the server 250 to other computers via the one or more communication network interfaces (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
a web server 274 (such as an HTTP server), which receives web requests from users and responds by providing responsive web pages or other resources;
a predictive application or a modeling web application 280, which may be downloaded and executed by a web browser 226 on a user's computing device 200. In general, a modeling web application 280 has the same functionality as the desktop modeling application 230, but provides the flexibility of access from any device at any location with network connectivity, and does not require installation and maintenance. In some implementations, the modeling web application 280 includes various software modules to perform certain tasks. In some implementations, the modeling web application 280 includes a graphical user interface module 282, which provides the user interface for all aspects of the modeling web application 280. In some implementations, the modeling web application 280 includes drawing data 120 and new drawing data 141 as described above for a computing device 200;
a data processing module 232 for performing preprocessing steps required to convert raw information into correct data types for the drawing data 120 or for the new drawing data 141, performing one or more calculations based on the received raw data in order to generate a data value for the drawing data 120 or the new drawing data 141, and/or generating imputed (e.g., inferred) data to replace missing values in the drawing data 120 or the new drawing data 141 as described above;
a machine learning engine 130 for training the model(s) 132 as described above;
one or more models 132 trained to provide results 151 as described above;
one or more databases 290, which store data used or created by the modeling web application 280 or the modeling application 230. The databases 290 may store drawing data 120, new drawing data 141, results 151 (including similarity values 152), and one or more model(s) 132 as described above.
Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 264 stores a subset of the modules and data structures identified above. In some implementations, the memory 264 stores additional modules or data structures not described above.
Although Figure 2B shows a server 250, Figure 2B is intended more as a functional description of the various features that may be present rather than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated. In addition, some of the programs, functions, procedures, or data shown above with respect to a server 250 may be stored or executed on a computing device 200. In some implementations, the functionality and/or data may be allocated between a computing device 200 and one or more servers 250. Furthermore, one of skill in the art recognizes that Figure 2B need not represent a single physical device. In some implementations, the server functionality is allocated across multiple physical devices that comprise a server system. As used herein, references to a "server" include various groups, collections, or arrays of servers that provide the described functionality, and the physical servers need not be physically collocated (e.g., the individual physical devices could be spread throughout the United States or throughout the world).
Figures 3A - 3B illustrate how a model (e.g., a first model) of the plurality of models 132 is trained according to some implementations. In order to train the model of the plurality of models 132, the machine learning engine 130 receives drawing data 120 (e.g., training data) for a plurality of drawings (e.g., n number of drawings, drawings 110-1 to 110-n). The drawing data 120 for a respective drawing of the plurality of drawings 110 includes image data 122, image type 124, and, optionally, other data 126. The machine learning engine 130 divides the drawing data 120 into a first subset of drawing data 120 to be used as training data 310 and a second subset of drawing data 120 to be used as testing data 312. For example, as shown in Figure 3A, the first subset of drawing data (e.g., the training data 310, drawing data 120-1 to 120-p, where p < n) includes information corresponding to a first subset of drawings (e.g., drawings 110-1 to 110-p) and the second subset of drawing data (e.g., the testing data 312, drawing data 120-(p+1) to 120-n) includes information corresponding to a second subset of drawings (e.g., drawings 110-(p+1) to 110-n). In some implementations, the training data 310 (e.g., the first subset of drawing data) includes at least 50%, 60%, 70%, 80%, or 90% of the plurality of drawing data 120. For example, the training data 310 (e.g., the first subset of the plurality of drawing data) may include 70% of the plurality of drawing data 120 and the testing data 312 (e.g., the second subset of the plurality of drawing data) the remaining 30%.
Referring to Figure 3B, the machine learning engine 130 uses the training data 310 and the testing data 312 to train the model of the plurality of models 132. The machine learning engine 130 uses the training data 310 to train (e.g., generate) a model in-training 320 and uses the testing data 312 to test and refine the model in-training 320 in order to generate (e.g., train) the model 132-m. Once the model 132-m is trained, the model 132-m can be used to determine a similarity between a new drawing and the respective reference drawing.
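As a hedged illustration of this split-and-train cycle, consider the following minimal sketch, assuming scikit-learn; the 70/30 split mirrors the example above, while the binary similar/not-similar labels and the logistic-regression model are illustrative assumptions rather than the disclosed method:

    # Illustrative sketch of the split-and-train cycle of Figures 3A - 3B.
    # Assumes scikit-learn; labels and model choice are assumptions only.
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    def train_similarity_model(feature_vectors, labels):
        # Divide drawing data into training data (70%) and testing data (30%).
        X_train, X_test, y_train, y_test = train_test_split(
            feature_vectors, labels, train_size=0.7, random_state=0)
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)             # train the model in-training
        accuracy = model.score(X_test, y_test)  # test/refine with testing data
        return model, accuracy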
This process can be repeated for a plurality of models 132, where the drawing data 120 and the plurality of drawings 110 used as inputs to train each model 132 differs for each different (e.g., distinct) model.
Figures 4A - 4B provide a flow diagram of a method 400 for assessing a similarity between a product image and reference drawings according to some implementations. The operations of the method 400 may be performed by a computer system, such as a computing device 200 or a server 250. In some implementations, the computer system includes one or more processors and memory. Figures 4A - 4B correspond to instructions stored in computer memory or a computer-readable storage medium (e.g., the memory 206 of the computing device 200). The memory stores one or more programs configured for execution by the one or more processors. For example, the operations of the method 400 are performed, at least in part, by a machine learning engine 130.
In accordance with some implementations, a computer system (e.g., a computing device 200, a server 250, etc.) obtains (410) one or more feature vectors corresponding to one or more images of a product. A feature vector is a group of numerical features that represents a certain object. For example, the computer system may receive two images of a product (e.g., a perspective view and a plan view), and each image may be represented by a respective feature vector (e.g., a first image of the product is represented by a first feature vector and a second image of the product is represented by a second feature vector). In some implementations, the computer system receives feature vectors instead of a raw image file.
In some embodiments, the computer system obtains (402), before obtaining the one or more feature vectors, the one or more images of the product; and extracts the one or more feature vectors from the one or more images of the product. For example, in some implementations, the computer system receives raw images and extracts the one or more feature vectors. In some implementations, the computer system receives two images and extracts a first feature vector for a first image and a second feature vector for a second image. In some implementations, the computer system includes, or is in communication with, a camera (e.g., the computer system is a mobile phone with a built-in camera). The camera is used to collect an image of a product, which is subsequently processed for extracting one or more feature vectors.
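For illustration only, one plausible way to extract such feature vectors is sketched below, assuming torch and torchvision (0.13 or later) with a pretrained ResNet-18 backbone; the disclosure does not fix a particular feature extractor:

    # Illustrative feature extraction with a pretrained CNN backbone
    # (an assumption; any extractor producing fixed-length vectors would do).
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from PIL import Image

    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()  # drop classifier, keep 512-d features
    backbone.eval()

    preprocess = T.Compose([
        T.Resize(256), T.CenterCrop(224), T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def feature_vector(path):
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            return backbone(img).squeeze(0)  # 512-dimensional feature vector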
The computer system retrieves (420) a plurality of trained models built to determine respective similarities to respective reference products. In some implementations, a respective model is configured for determining a similarity value with respect to a respective reference product (or a respective drawing of the reference product). For comparing the image of the product with multiple reference products, in some implementations, multiple models are used.
In some embodiments, at least a first subset of the plurality of trained models is associated with respective image types (e.g., a color photograph, a black and white photograph, a rendering, a drawing, etc.). The computer system selects (422) one or more trained models of at least the first subset of the plurality of trained models associated with a respective image type corresponding to an image type of a particular image of the one or more images of the product. For example, when the received product image is a rendered image, the computer system selects models configured for use with rendered images. In another example, when the received product image is an engineering drawing, the computer system selects models configured for use with engineering drawings.
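A minimal sketch of this selection step (422) follows; the ImageType values and the registry layout are illustrative assumptions:

    # Illustrative selection of trained models by image type (step 422).
    from enum import Enum

    class ImageType(Enum):
        PHOTOGRAPH = "photograph"
        RENDERING = "rendering"
        DRAWING = "drawing"

    # Hypothetical registry mapping an image type to its trained models.
    model_registry = {image_type: [] for image_type in ImageType}

    def select_models(image_type):
        """Return the trained models associated with the image's type."""
        return model_registry.get(image_type, [])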
In some embodiments, at least a second subset of the plurality of trained models is not associated (424) with a particular image type. For example, a deep neural network engine or an open cross-domain visual search algorithm may be used to compare images of different image types.
In some embodiments, the respective trained model has been trained (426) according to data for a plurality of products. A first subset of the plurality of products includes products that are similar to the respective reference product. A second subset of the plurality of products includes products that are not similar to the respective reference product. For example, the respective trained model is trained with both positive and negative references.
The computer system, for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applies (430, Figure 4B) the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product. For example, as described with respect to Figure 1B, a similarity value is obtained by applying the respective trained model to a feature vector. In some implementations, software tools, such as SphereFace, CosFace, ArcFace, etc., may be used to determine the similarity.
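The sketch below illustrates step 430 in the style of the embedding-based tools named above, which compare L2-normalized embeddings by cosine similarity; the small learned projection head is an illustrative assumption, not the disclosed architecture:

    # Illustrative per-reference similarity head (step 430). Assumes torch;
    # the learned projection is an assumption, not the disclosed method.
    import torch
    import torch.nn.functional as F

    class SimilarityHead(torch.nn.Module):
        def __init__(self, dim=512):
            super().__init__()
            self.proj = torch.nn.Linear(dim, dim)  # trained per reference product

        def forward(self, product_vec, reference_vec):
            a = F.normalize(self.proj(product_vec), dim=-1)
            b = F.normalize(self.proj(reference_vec), dim=-1)
            return (a * b).sum(-1)  # cosine similarity in [-1, 1]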
In some embodiments, at least a first subset of the plurality of trained models is associated with respective image types. The computer system, for the respective trained model, converts (432) the one or more images of the product to respective images of a respective image type associated with the respective trained model. For example, as described with respect to Figures 6 and 7, an image of a first image type may be converted into an image of a second image type that is distinct from the first image type. For the conversion of the images between different image types, software tools, such as Domain Transfer Network, Pix2Pix, etc., may be used.
In some embodiments, the computer system determines (434) the respective similarity between the product and the respective reference product based on applying two or more trained models built to determine respective similarities between the product and the respective reference product. The two or more trained models are associated with two or more image types.
In some embodiments, applying the respective trained model to a feature vector of the one or more feature vectors includes (436) determining a distance between the feature vector of the one or more feature vectors corresponding to the one or more images of the product and a feature vector extracted from the image of the respective reference product. In some implementations, a Euclidean distance or a cosine distance may be calculated.
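In plain NumPy, these two distance measures read, for example:

    # Euclidean and cosine distances between two feature vectors.
    import numpy as np

    def euclidean_distance(a, b):
        return float(np.linalg.norm(a - b))

    def cosine_distance(a, b):
        cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return float(1.0 - cos_sim)  # 0 when directions match, 2 when opposite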
In some embodiments, the image of the respective reference product includes (438) a patent drawing.
The computer system provides (440) the respective similarity between the product and the respective reference product.
Figures 5A - 5B provide a flow diagram of a method 500 for building models for assessing a similarity between a product image and reference drawings according to some implementations. The steps of the method 500 may be performed by a computer system, such as a computing device 200 or a server 250. In some implementations, the computer system includes one or more processors and memory. Figures 5A - 5B correspond to instructions stored in a computer memory or computer-readable storage medium (e.g., the memory 206 of the computing device 200). The memory stores one or more programs configured for execution by the one or more processors.
In accordance with some implementations, a computer system (e.g., computing device 200, a server 250, etc.), for a respective reference product of a plurality of reference products, retrieves (510) images of the respective reference product.
In some embodiments, the images of the respective reference products are associated with respective image types. The computer system selects (512) one or more images of the images of the respective reference products corresponding to a particular image type of the respective image types. For example, the selected one or more images correspond to a particular image type, such as a photograph, a rendered image, an engineering drawing, or a patent drawing.
In some embodiments, the images of the respective reference products are associated with respective image types. The computer system converts (514) the images of the respective reference product to images of a particular image type of the respective image types. For example, the images of different image types are converted into a same image type, such as a photograph, a rendered image, an engineering drawing, or a patent drawing.
The computer system, also for the respective reference product, forms (520) a respective feature vector for a respective image of the retrieved images of the respective reference product (e.g., the computer system extracts a feature vector from the respective image).
The computer system uses (530) at least a subset of feature vectors to train a model to determine a similarity to the respective reference product.
In some embodiments, the computer system trains (532) the model using feature vectors associated with two or more image types.
In some embodiments, the computer system determines (534) a respective similarity between a product and the respective reference product.
In some embodiments, determining the respective similarity includes (536) determining a distance to the respective feature vector.
In some embodiments, the computer system identifies (538) one or more clusters of the feature vectors, and determining the respective similarity includes determining a distance to a respective cluster of the one or more clusters.
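A minimal sketch of this clustering variant, assuming scikit-learn's KMeans; the cluster count is an arbitrary illustrative choice:

    # Illustrative clustering of reference feature vectors (step 538) and
    # distance of a product vector to the nearest cluster center.
    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_reference_vectors(vectors, n_clusters=4):
        return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(vectors)

    def distance_to_nearest_cluster(kmeans, product_vec):
        dists = np.linalg.norm(kmeans.cluster_centers_ - product_vec, axis=1)
        return float(dists.min())  # smaller distance suggests higher similarity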
The computer system stores (540) the trained model in a database for subsequent use in determining a similarity to the respective reference product.
Figure 6 is a schematic diagram illustrating two operations involved in identifying a match according to some implementations.
The first operation involves processing a received image 610 (e.g., a photograph) to obtain a processed image 620. In some implementations, the processing includes an object detection operation. The object detection operation may be performed using software tools such as Faster R-CNN (a region-based convolutional neural network with a region proposal network), YOLO, etc. In some implementations, the processing includes aligning the received image (or an at least partially processed image, or a portion thereof). In some implementations, the processing includes converting the received image 610 into an image of a different type (e.g., a photograph to a line drawing). The conversion operation may be performed using algorithms such as Domain Transfer Network or Pix2Pix.
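For illustration, the detection part of this first operation might look like the following sketch, assuming torchvision's pretrained Faster R-CNN (0.13 or later); the score threshold and the crop-to-highest-scoring-box policy are illustrative assumptions:

    # Illustrative object detection and cropping (first operation). Assumes
    # torchvision's pretrained Faster R-CNN; threshold/policy are assumptions.
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
    detector.eval()

    def crop_main_object(path, score_threshold=0.8):
        img = Image.open(path).convert("RGB")
        with torch.no_grad():
            out = detector([to_tensor(img)])[0]  # dict: boxes, labels, scores
        keep = out["scores"] >= score_threshold
        if not bool(keep.any()):
            return img  # fall back to the full image if nothing is confident
        box = out["boxes"][keep][0]  # detections are sorted by score
        return img.crop(tuple(int(v) for v in box.tolist()))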
The second operation involves comparing the processed image 620 with a reference drawing 630 to determine a similarity value as described with respect to Figure 1B.
The terminology used in the description of the invention herein is for the purpose of describing particular implementations only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various implementations with various modifications as are suited to the particular use contemplated.
This work was supported by the National IT Industry Promotion Agency (NIPA) grant funded by the Korea government (MSIT) (A0222-21-1001) and the Seoul R&BD Program (CY200002, Fake Product Warning Service through the Spectrum Analysis of the "Material Fingerprints") through the Seoul Business Agency (SBA) funded by the Seoul Metropolitan Government.

Claims (20)

  1. A method of assessing a similarity between a product image and reference drawings, performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors, the method comprising:
    obtaining one or more feature vectors corresponding to one or more images of a product;
    retrieving a plurality of trained models built to determine respective similarities to respective reference products;
    for a respective trained model, of the plurality of trained models, built to determine a respective similarity between the product and the respective reference product, applying the respective trained model to a feature vector of the one or more feature vectors corresponding to the one or more images of the product for determining a similarity between an image of the one or more images of the product and an image of the respective reference product; and
    providing the respective similarity between the product and the respective reference product.
  2. The method of claim 1, wherein:
    at least a first subset of the plurality of trained models is associated with respective image types; and
    the method further comprises selecting one or more trained models of at least the first subset of the plurality of trained models associated with a respective image type corresponding to an image type of a particular image of the one or more images of the product.
  3. The method of claim 1, wherein:
    at least a first subset of the plurality of trained models is associated with respective image types; and
    the method further comprises, for the respective trained model, converting the one or more images of the product to respective images of a respective image type associated with the respective trained model.
  4. The method of claim 1, wherein:
    at least a second subset of the plurality of trained models is not associated with a particular image type.
  5. The method of any of claims 1-4, wherein:
    the respective trained model has been trained according to data for a plurality of products;
    a first subset of the plurality of products includes products that are similar to the respective reference product; and
    a second subset of the plurality of products includes products that are not similar to the respective reference product.
  6. The method of any of claims 1-5, further comprising:
    determining the respective similarity between the product and the respective reference product based on applying two or more trained models built to determine respective similarities between the product and the respective reference product, the two or more trained models being associated with two or more image types.
  7. The method of any of claims 1-6, further comprising:
    obtaining the one or more images of the product; and
    extracting the one or more feature vectors from the one or more images of the product.
  8. The method of any of claims 1-7, wherein:
    applying the respective trained model to a feature vector of the one or more feature vectors includes determining a distance between the feature vector of the one or more feature vectors corresponding to the one or more images of the product and a feature vector extracted from the image of the respective reference product.
  9. The method of any of claims 1-8, wherein:
    the image of the respective reference product includes a patent drawing.
  10. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer system having one or more processors and memory, the one or more programs comprising instructions for performing any of the methods of claims 1 - 9.
  11. A computer system, comprising:
    one or more processors;
    memory; and
    one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1 - 9.
  12. A method for building models for assessing a similarity between a product image and reference drawings, performed at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors, the method comprising:
    for a respective reference product of a plurality of reference products:
    retrieving images of the respective reference product; and
    forming a respective feature vector for a respective image of the retrieved images of the respective reference product;
    using at least a subset of feature vectors to train a model to determine a similarity to the respective reference product; and
    storing the trained model in a database for subsequent use in determining a similarity to the respective reference product.
  13. The method of claim 12, wherein:
    the images of the respective reference products are associated with respective image types; and
    the method further comprises selecting one or more images of the images of the respective reference products corresponding to a particular image type of the respective image types.
  14. The method of claim 12, wherein:
    the images of the respective reference products are associated with respective image types; and
    the method further comprises converting the images of the respective reference product to images of a particular image type of the respective image types.
  15. The method of claim 12, including:
    training the model using feature vectors associated with two or more image types.
  16. The method of any of claims 12-15, further comprising:
    determining a respective similarity between a product and the respective reference product.
  17. The method of claim 16, wherein:
    determining the respective similarity includes determining a distance to the respective feature vector.
  18. The method of claim 16 or 17, further comprising:
    identifying one or more clusters of the feature vectors; and
    determining the respective similarity includes determining a distance to a respective cluster of the one or more clusters.
  19. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer system having one or more processors and memory, the one or more programs comprising instructions for performing any of the methods of claims 12 - 18.
  20. A computer system, comprising:
    one or more processors;
    memory; and
    one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 12 - 18.