WO2021079386A1 - Method and system for searching a digital image in an online database - Google Patents

Publication number: WO2021079386A1
Authority: WO, WIPO (PCT)
Prior art keywords: digital image; online database; searching; database; online
Application number: PCT/IT2019/000083
Other languages: French (fr)
Inventor: Andrea PROVENZALE
Original Assignee: Yakkyo S.R.L.
Priority date: 2019-10-23
Filing date: 2019-10-23
Publication date: 2021-04-29
Application filed by Yakkyo S.R.L.
Priority to PCT/IT2019/000083
Publication of WO2021079386A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Abstract

The object of the present invention is a method and a system for searching images which are associated with various products and contained in online databases. The method for searching a digital image in an online database by means of a calculator includes: providing a first digital image; and comparing said first digital image to a second digital image, said second digital image being collected in a first online database, said first online database being allocated on a remote server.

Description

METHOD AND SYSTEM FOR SEARCHING A DIGITAL IMAGE IN AN ONLINE DATABASE
The object of the present invention is a method and a system for searching images which are associated with various products and contained in online databases.
The aim of the invention is to provide a system and a method for searching images which are associated with equal or equivalent products or material goods, by comparing the said images in online databases. The said images are generally contained in the online databases of various producers and/or distributors, which may be located in different places.
The present invention drastically reduces the search times, by eliminating the long process of comparing the images, which is performed manually at present.
In fact, the first obvious difficulty lies in having to manually analyse hundreds or thousands of pages, which are scattered in various online databases.
Furthermore, in most cases an online database can only be browsed in one single language, which makes a textual search almost impossible for those who do not know that language. Using digital images frees the process from this linguistic limitation.
At present, the state of the art does not include a completely automated system for solving the described problem. Many companies, known as International Procurement Organizations (IPOs), offer to search manually for a product in online databases. However, all the solutions currently available are both expensive and time consuming.
In order to find the best provider on the market, the user would have to manually visit all the online databases in which he believes a given product might be collected. For each database, he would accordingly have to perform a search (in most cases only a textual search) and analyse the proposed results.
The system and method that are the object of the present invention provide a completely automated process for searching products by means of digital images.
It is accordingly possible to reduce both the search costs and the time wasted while waiting for the search results. More specifically, such time may be radically shortened, from days to a few minutes or even seconds.
The said aims and advantages are obtained by the method for searching digital images in online databases through a calculator according to the present invention, said method comprising: providing a first digital image; comparing said first digital image to a second digital image, said second digital image being collected in a first online database, said first online database being allocated on a remote server.
In another aspect of the present invention, the described method may include comparing said first digital image to a third digital image, said third digital image being collected in a second online database, said second online database being allocated on a remote server, different from said first online database.
In a further aspect of the present invention, the comparison between said first digital image and said second or third digital image consists in a template matching.
In a further aspect of the present invention, the comparison between said first digital image and said second or third digital image consists in a feature matching.
In a further aspect of the present invention, the method includes: generating a signature for said second or third digital image; and collecting said second or third digital image in a third database.
In another aspect of the present invention, the method includes: generating a signature for said first digital image; comparing the signature generated for said first digital image to the one generated for a fourth digital image, said fourth digital image and its signature being collected in said third database.
In another aspect of the present invention, the method includes: assigning a score to said second digital image or to said third digital image, said score assessing the affinity between said first digital image and said second or third digital image.
In a further aspect of the present invention, the method includes: classifying said first digital image through a machine learning or deep learning algorithm and extracting some textual tags; using said tags to perform textual searches in said first or second online database; comparing said first digital image to said second or third digital image; collecting these tags inside said third online database.
In a second aspect, the present invention includes a system for searching a digital image in online databases, said system comprising at least one processor and at least one memory holding instructions to be executed by said processor, said processor being configured by said instructions to perform a method including: providing a first digital image; comparing said first digital image to a second digital image, said second digital image being collected in a first online database, said first online database being allocated on a remote server.
In a third aspect, the present invention includes a non-transient memory (readable by a calculator) including instructions which make - when they are executed by one or several processors - said one or several processors execute the method for searching a digital image in online databases including: providing a first digital image; comparing said first digital image to a second digital image, said second digital image being collected in a first online database, said first online database being allocated on a remote server.
The above aims (and others, as will be better described hereinafter) are achieved by the method for searching digital images in online databases through a calculator according to the present invention, which is described below in a preferred embodiment, open to further development and improvement, with reference to the attached drawings, which show respectively:
Figure 1, a flowchart which illustrates the method according to the present invention;
Figure 2, an architectural scheme according to the present invention.
Referring to Figure 1, the method according to the present invention includes distinct phases which are carried out by means of one or several calculators.
In a first phase a first digital image is provided by the user. Said first digital image may be, for example, a JPEG or PNG digital image. Said first digital image may represent a commercial product which the user wants to search for in an online database. Alternatively, the user may provide the link to an online database for a given product. In this case, the system will automatically download the images of said online database. Alternatively, the user may provide a video of a product, or an image in a different format. In this case too, the user will have to upload the media, which will then be processed by the system in the same way.
Once said first image is received by the system, it is queued, so that several concurrent processes cannot overload the server.
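The queuing step can be illustrated with a minimal sketch. It assumes a single worker thread draining a bounded first-in, first-out queue; the names `process_image`, `submit_all` and `MAX_PENDING` are hypothetical and do not come from the application, which does not specify how the queue is implemented.

```python
import queue
import threading

MAX_PENDING = 100  # hypothetical cap on queued requests


def process_image(image_bytes: bytes) -> int:
    """Placeholder for the comparison pipeline; returns the payload size."""
    return len(image_bytes)


def run_worker(jobs: queue.Queue, results: list) -> None:
    """Drain the queue one job at a time so the server is never flooded."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: stop the worker
            jobs.task_done()
            break
        results.append(process_image(job))
        jobs.task_done()


def submit_all(images: list) -> list:
    """Queue every uploaded image and process them sequentially."""
    jobs: queue.Queue = queue.Queue(maxsize=MAX_PENDING)
    results: list = []
    worker = threading.Thread(target=run_worker, args=(jobs, results))
    worker.start()
    for image in images:
        jobs.put(image)          # blocks while the queue is full
    jobs.put(None)
    worker.join()
    return results
```

With one worker, requests are handled strictly in arrival order; a real deployment might use several workers or an external message broker instead.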
Then, the system will carry out a number of comparisons by means of different computer vision algorithms, starting from those that are less time consuming and have a lower computational cost.
More particularly, the system may generate a “signature” or fingerprint of said first image. Said signature or fingerprint consists of an alphanumeric code which uniquely identifies said first digital image. In particular, this code is small enough to allow an efficient “nearest-neighbor search”, sensitive enough to effectively filter a database for possible duplicates, and reliable enough to find resized or compressed duplicates. Said generated signature or fingerprint may be used to search among the digital images collected in a database of the system, hereinafter referred to as the third database. This third database is an internal database, i.e. it is directly allocated on the system according to the present invention, and it is different from the databases in which the digital image search is carried out. More particularly, said third database consists of a proprietary database with full-text search capability, which holds the digital images already analyzed, processed and classified, its aim being to simplify and speed up future searches.
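The application does not name the algorithm used to compute the signature. As one possible illustration, the sketch below uses an average hash, a simple perceptual hash chosen here as an assumption: each pixel of a small grayscale grid contributes one bit, and signatures are compared by Hamming distance. The function names are hypothetical, and the linear scan stands in for whatever nearest-neighbor index the system actually uses.

```python
def average_hash(pixels: list[list[int]]) -> str:
    """Return a hex signature with one bit per pixel, set when the pixel is
    brighter than the image mean. `pixels` is a small grayscale grid
    (for example the image already downscaled to 8x8)."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = "".join("1" if value > mean else "0" for value in flat)
    width = (len(bits) + 3) // 4           # number of hex digits needed
    return format(int(bits, 2), "x").zfill(width)


def hamming(sig_a: str, sig_b: str) -> int:
    """Number of differing bits between two hex signatures."""
    return bin(int(sig_a, 16) ^ int(sig_b, 16)).count("1")


def nearest(query_sig: str, stored: dict[str, str]) -> tuple[str, int]:
    """Linear nearest-neighbour search over {image_id: signature}."""
    best_id = min(stored, key=lambda key: hamming(query_sig, stored[key]))
    return best_id, hamming(query_sig, stored[best_id])
```

Because the hash depends only on which pixels are above the mean, a mildly recompressed copy tends to produce the same or a very close signature, while an unrelated image lands far away in Hamming distance. Such a hash is compact rather than strictly unique, so a small distance is a hint to be confirmed by the slower comparisons.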
The system compares said first digital image to at least one digital image collected in said third database. Said at least one digital image collected in said third database (elastic search database) is hereinafter indicated as fourth digital image.
If the search in said third database is successful, i.e. said fourth digital image matches said first digital image, the process terminates and the outcome is presented to the user.
If the outcome is negative, the system starts a concurrent scraping process in different websites, or online databases.
The results of this second search in online databases are then processed, image by image, searching for a match with the reference image.
The outcome of this process is then filtered, ordered and presented to the user.
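The concurrent scraping and the image-by-image matching described above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: `fetch_images` and `is_match` are hypothetical callables supplied by the caller (the first would download candidate images from one source, the second would run the actual comparison), and a thread pool is only one way to obtain concurrency.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def search_sources(
    sources: Iterable[str],
    fetch_images: Callable[[str], list[str]],
    is_match: Callable[[str], bool],
    max_workers: int = 8,
) -> list[tuple[str, str]]:
    """Fetch candidate images from every source concurrently, then check
    each candidate against the reference image one by one."""
    sources = list(sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `sources`
        fetched = list(pool.map(fetch_images, sources))
    matches = []
    for source, images in zip(sources, fetched):
        for image in images:
            if is_match(image):
                matches.append((source, image))
    return matches
```

Passing the fetch and match functions as parameters keeps the sketch free of network code and makes it easy to exercise with in-memory data.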
More specifically, the system may compare, or search, said first digital image to a second digital image, said second digital image being collected in a first online database. Said first online database, allocated on a remote server, may include a collection of digital images of commercial products which the user intends to search.
The system may simultaneously compare, or search, said first digital image to a third digital image collected in a second online database, different from said first online database. Said second online database, allocated on a remote server, may include a collection of digital images of commercial products which the user intends to search.
The present search phase may be simultaneously carried out in a plurality of online databases, each including several digital images corresponding to the commercial products which the user intends to search. If the present search phase is successful, i.e. said second or third digital image matches said first digital image, the search process terminates and the outcome is presented to the user.
Otherwise, the system classifies said first digital image through a machine learning algorithm, generating some textual annotations (or tags).
The annotations so generated are then used to carry out textual searches in said first or second online database.
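How the tags become textual searches is not detailed in the application. A minimal sketch, assuming each online database exposes a search URL with a query parameter, is given below; the function name, the `{query}` placeholder convention and the example URL are all hypothetical.

```python
from urllib.parse import quote_plus


def build_queries(tags: list[str], search_templates: dict[str, str]) -> dict[str, str]:
    """Turn classifier tags into one search URL per online database.
    Each template contains a `{query}` placeholder."""
    query = quote_plus(" ".join(tags))
    return {
        name: template.format(query=query)
        for name, template in search_templates.items()
    }
```

For example, the tags `["red", "running shoe"]` are joined into a single URL-encoded query string and substituted into every template, so the same tags can be sent to several databases at once.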
Once the search results are collected, the system compares the input images to those obtained for each product by means of different computer vision algorithms.
These algorithms are implemented by means of a cross-platform open source library for real-time computer vision, known as “OpenCV”. Said library includes several algorithms for video and image analysis that can be applied in many different fields. Furthermore, the library integrates easily with modern machine learning frameworks, such as TensorFlow or PyTorch.
In particular, the comparison among said digital images may be carried out through a “template matching” technique, that is, a technique for processing digital images to find small parts of an image which match a template image. More specifically, it is mainly used to find “exact matches”, that is, two identical (or almost identical) digital images.
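The idea behind template matching can be shown with a dependency-free sketch. It slides a template over a grayscale image, both given as nested lists, and scores each position by the sum of squared differences; this is a simplified stand-in for a library routine such as OpenCV's `cv2.matchTemplate` with the `cv2.TM_SQDIFF` method, and the function name below is hypothetical.

```python
def match_template(
    image: list[list[int]], template: list[list[int]]
) -> tuple[int, int, int]:
    """Slide `template` over `image` and return (row, col, ssd) for the
    position with the smallest sum of squared differences (0 = exact)."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for row in range(img_h - tpl_h + 1):
        for col in range(img_w - tpl_w + 1):
            ssd = sum(
                (image[row + i][col + j] - template[i][j]) ** 2
                for i in range(tpl_h)
                for j in range(tpl_w)
            )
            if best is None or ssd < best[2]:
                best = (row, col, ssd)
    if best is None:
        raise ValueError("template is larger than image")
    return best
```

A score of zero means the template appears pixel for pixel in the image; a small non-zero score corresponds to the "almost identical" case, with the acceptance threshold left to the caller.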
Alternatively, the comparison among said digital images may be carried out by means of a “feature matching” algorithm. More specifically, a library named FLANN is used, the latter including a collection of algorithms which allow rapid “nearest neighbor” searches and a system for automatically choosing the best algorithm and the optimum parameters according to the dataset. This kind of comparison makes it possible to find similarities between two images and to identify the same product in different contexts.
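The matching step that follows feature extraction can be sketched without any library. The code below replaces FLANN's approximate search with an exact brute-force nearest-neighbor search, and it filters ambiguous matches with a ratio test; both the ratio test and the 0.75 threshold are assumptions added for illustration and are not mentioned in the application. In OpenCV, FLANN-based matching is exposed through `cv2.FlannBasedMatcher`.

```python
import math


def ratio_test_matches(query_desc, train_desc, ratio=0.75):
    """Match each query descriptor to its nearest train descriptor and keep
    the pair only if it is clearly closer than the second-nearest one."""
    matches = []
    for q_idx, q in enumerate(query_desc):
        ranked = sorted(
            (math.dist(q, t), t_idx) for t_idx, t in enumerate(train_desc)
        )
        if len(ranked) < 2:
            continue  # the ratio test needs two neighbours
        (best_d, best_idx), (second_d, _) = ranked[0], ranked[1]
        if best_d < ratio * second_d:
            matches.append((q_idx, best_idx))
    return matches
```

The number of surviving matches gives a rough similarity measure between two images: the same product photographed in different contexts tends to share many distinctive descriptors, whereas unrelated images share few.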
A third kind of comparison may be carried out by generating an alphanumeric code (called signature or fingerprint) which uniquely identifies an image. This code is therefore a simplification of the images, and it is used in order to accelerate the searches.
Furthermore, the system may assign a score to said second digital image or to said third digital image, said score assessing the affinity between said first digital image and said second or third digital image.
By means of the ascertained score, the results are then filtered and ordered. The matches found are saved in a document-oriented, non-relational, JSON-style database with a dynamic schema, and presented to the user.
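Filtering, ordering and storing the matches can be illustrated as follows. The score scale, the `min_score` threshold and the field names are hypothetical; the application only states that a score expresses the affinity between images and that matches are stored as JSON-style documents with a dynamic schema.

```python
import json


def rank_matches(candidates: list[dict], min_score: float = 0.5) -> list[dict]:
    """Drop candidates below `min_score` and sort the rest, best first."""
    kept = [c for c in candidates if c["score"] >= min_score]
    return sorted(kept, key=lambda c: c["score"], reverse=True)


def to_documents(matches: list[dict]) -> list[str]:
    """Serialise each match as a JSON document; fields may vary per match."""
    return [json.dumps(m, sort_keys=True) for m in matches]
```

Since each match is an independent document, one record can carry extra fields (such as tags) that another lacks, which is what a dynamic schema allows.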
Lastly, irrespective of the outcome of the above said process, the system will save the analysed images and the respective generated signatures or tags in said third database, in order to accelerate and improve any future searches.
In a second aspect, the present invention includes a system for searching a digital image in online databases, said system comprising one processor and at least one memory holding instructions to be executed by said processor, said processor being configured by said instructions to perform the above described method for searching digital images in an online database.
In a third aspect, the present invention includes a non-transient memory (readable by a calculator) including instructions which make - when they are executed by one or several processors - said one or several processors execute the above described method for searching digital images in an online database.

Claims

1) Method for searching a digital image in an online database by means of a calculator, said method including: providing a first digital image; comparing said first digital image to a second digital image, said second digital image being collected in a first online database, said first online database being allocated on a remote server.
2) Method for searching a digital image in an online database by means of a calculator as claimed in claim one, including: comparing said first digital image to a third digital image; said third digital image being collected in a second online database, said second online database being allocated on a remote server, said first online database being different from said second online database.
3) Method for searching a digital image in an online database as claimed in claim one, in which comparing said first digital image to said second or third digital image consists in a template matching.
4) Method for searching a digital image in an online database as claimed in any previous claims, in which comparing said first digital image to said second or third digital image consists in a feature matching.
5) Method for searching a digital image in an online database as claimed in any previous claims, including: generating a signature for said second or third digital image; and collecting said second or third digital image in a third database.
6) Method for searching a digital image in an online database by means of a calculator as claimed in any previous claims, including: generating a signature for said first digital image; comparing said first digital image to a fourth digital image, said fourth digital image being collected in said third database.
7) Method for searching a digital image in an online database as claimed in any previous claims, including: assigning a score to said second digital image or to said third digital image, said score assessing the affinity between said first digital image and said second or third digital image.
8) Method for searching a digital image in an online database as claimed in any previous claims, including: classifying said first digital image through a machine learning or deep learning algorithm; extracting at least one textual tag from said first digital image; using said at least one textual tag to perform a textual search in said first and/or second online database; comparing said first digital image to said second and/or third digital image; collecting said textual tags inside said third online database.
9) System for searching a digital image in online databases, said system including one processor and at least one memory holding instructions to be executed by said processor, said processor being configured by said instructions to perform the method as claimed in claims one to eight.
10) Non-transient memory (readable by a calculator) including instructions which make - when they are executed by one or several processors - said one or several processors execute the method for searching a digital image in online databases as claimed in claims one to eight.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IT2019/000083 WO2021079386A1 (en) 2019-10-23 2019-10-23 Method and system for searching a digital image in an online database

Publications (1)

Publication Number Publication Date
WO2021079386A1 true WO2021079386A1 (en) 2021-04-29

Family

ID=68887090

Country Status (1)

Country Link
WO (1) WO2021079386A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100034470A1 (en) * 2008-08-06 2010-02-11 Alexander Valencia-Campo Image and website filter using image comparison
WO2015017439A1 (en) * 2013-07-31 2015-02-05 Alibaba Group Holding Limited Method and system for searching images
WO2018071501A1 (en) * 2016-10-16 2018-04-19 Ebay Inc. Personal assistant with offline visual search database

Similar Documents

Publication Publication Date Title
US11232324B2 (en) Methods and apparatus for recommending collocating dress, electronic devices, and storage media
JP6144839B2 (en) Method and system for retrieving images
CN110765275A (en) Search method, search device, computer equipment and storage medium
US20160132720A1 (en) Vector-based face recognition algorithm and image search system
US20170132314A1 (en) Identifying relevant topics for recommending a resource
CN110929752A (en) Knowledge-driven and data-driven clustering method and related equipment
Meng Two-stage recognition for oracle bone inscriptions
US20210248419A1 (en) Methods for identifying biological material by microscopy
CN110780965B (en) Vision-based process automation method, equipment and readable storage medium
Skluzacek et al. Skluma: An extensible metadata extraction pipeline for disorganized data
US10642793B2 (en) Method and system for compressing genome sequences using graphic processing units
Di Ruberto et al. Comparison of statistical features for medical colour image classification
CN110021386B (en) Feature extraction method, feature extraction device, equipment and storage medium
CN113221570A (en) Processing method, device, equipment and storage medium based on-line inquiry information
He et al. Identifying genes and their interactions from pathway figures and text in biomedical articles
WO2021079386A1 (en) Method and system for searching a digital image in an online database
JP2023130409A (en) Information processing device, information processing method, and program
US20130054553A1 (en) Method and apparatus for automatically extracting information of products
Di Ruberto et al. On different colour spaces for medical colour image classification
CN115357259A (en) Product updating method, system, equipment and storage medium based on industry
Arshad Prediction and diagnosis of breast cancer using machine learning and ensemble classifiers
CN109446330B (en) Network service platform emotional tendency identification method, device, equipment and storage medium
Yaşar Çiklaçandir et al. The effects of fusion-based feature extraction for fabric defect classification
Gutierrez-Cáceres et al. Computer aided medical diagnosis tool to detect normal/abnormal studies in digital mr brain images
Pedrosa et al. Lesion-based chest radiography image retrieval for explainability in pathology detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19820887; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established
    Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 200722)
122 Ep: pct application non-entry in european phase
    Ref document number: 19820887; Country of ref document: EP; Kind code of ref document: A1