WO2019020049A1 - Image retrieval method, apparatus, and electronic device - Google Patents

Image retrieval method, apparatus, and electronic device

Info

Publication number
WO2019020049A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
target
interest
retrieved
Prior art date
Application number
PCT/CN2018/097008
Other languages
English (en)
French (fr)
Inventor
陈畅怀
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司
Priority to US16/632,775 (US11586664B2)
Priority to ES18839135T (ES2924268T3)
Priority to EP18839135.3A (EP3660700B1)
Publication of WO2019020049A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/535 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/54 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Definitions

  • The present application relates to the field of image processing and pattern recognition technologies, and in particular to an image retrieval method, apparatus, and electronic device.
  • The retrieval system receives the query image provided by the user and extracts the region of interest of the query image according to the user's instruction, where the region of interest is an area that has recognition capability and reflects the characteristics of the image, and recognition capability means the ability to distinguish different targets. It then extracts the features of the region of interest and the features of the corresponding regions of the images in the image library, compares the features of the region of interest of the query image with the features of the corresponding regions of the images in the library, and finally sorts by similarity and returns the search results, yielding images that satisfy the requirements.
  • An object of the embodiments of the present application is to provide an image retrieval method, apparatus, and electronic device to improve the accuracy of image retrieval.
  • the specific technical solutions are as follows:
  • an embodiment of the present application provides an image retrieval method, including:
  • the step of acquiring the target features corresponding to the plurality of images to be retrieved includes: acquiring target features of the plurality of images to be retrieved saved in a preset database; or determining target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
  • the predetermined feature is a region of interest feature
  • the target feature is a feature aggregated from region of interest features
  • the predetermined feature is a global feature
  • the target feature is a global feature
  • the step of determining a target feature of the query image based on the pre-trained deep neural network includes:
  • the step of determining, according to the calculated similarity, the search image corresponding to the query image from the plurality of images to be retrieved includes:
  • determining a target image to be retrieved, among the plurality of images to be retrieved, as a search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose corresponding similarity is greater than a predetermined similarity threshold.
  • the method further includes:
  • an embodiment of the present application further provides an image retrieval apparatus, including:
  • An image acquisition module is configured to obtain a query image.
  • a first feature determining module, configured to determine a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained according to each sample image and a predetermined feature, corresponding to each sample image, that is capable of forming the target feature.
  • the second feature determining module is configured to acquire target features of the plurality of images to be retrieved.
  • a calculation module configured to calculate a similarity between the target feature of the query image and the target feature of each image to be retrieved.
  • a search image determining module configured to determine, according to the calculated similarity, a search image corresponding to the query image from the plurality of to-be-retrieved images.
  • the second feature determining module is configured to acquire target features of the plurality of images to be retrieved saved in a preset database, or to determine the target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
  • the predetermined feature is a region of interest feature
  • the target feature is a feature aggregated from region of interest features
  • the first feature determining module includes:
  • a region of interest obtaining sub-module, configured to input the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained according to each sample image and the region of interest corresponding to each sample image;
  • a region of interest feature determining sub-module, configured to input the target region of interest into a pre-trained second deep neural network to obtain a target region of interest feature of the target region of interest, wherein the second deep neural network is trained according to each region of interest and the region of interest feature corresponding to each region of interest;
  • a first feature determining sub-module, configured to aggregate the target region of interest features into the target feature of the query image.
  • the predetermined feature is a global feature
  • the target feature is a global feature
  • the first feature determining module includes:
  • a second feature determining sub-module, configured to input the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained according to each sample image and the global feature corresponding to each sample image.
  • the search image determining module is specifically configured to sort the calculated similarities and determine, according to the sorting result, a search image corresponding to the query image from the plurality of images to be retrieved; or to determine a target image to be retrieved, among the plurality of images to be retrieved, as a search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose corresponding similarity is greater than a predetermined similarity threshold.
  • the device further includes: an output module, configured to output location information of the target region of interest after the target region of interest of the query image is obtained.
  • an embodiment of the present application further provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus.
  • the memory is for storing a computer program.
  • the processor, when executing the program stored in the memory, implements the following method steps:
  • the step of acquiring the target features corresponding to the multiple images to be retrieved includes:
  • the predetermined feature is a region of interest feature
  • the target feature is a feature aggregated from region of interest features
  • the step of determining a target feature of the query image based on the pre-trained deep neural network includes: inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained according to each sample image and the region of interest corresponding to each sample image; inputting the target region of interest into a pre-trained second deep neural network to obtain a target region of interest feature of the target region of interest, wherein the second deep neural network is trained according to each region of interest and the region of interest feature corresponding to each region of interest; and aggregating the target region of interest features into the target feature of the query image.
  • the predetermined feature is a global feature
  • the target feature is a global feature
  • the step of determining a target feature of the query image based on the pre-trained deep neural network includes: inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained according to each sample image and the global feature corresponding to each sample image.
  • the step of determining, according to the calculated similarity, the search image corresponding to the query image from the plurality of images to be retrieved includes: sorting the calculated similarities and determining, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved; or determining a target image to be retrieved, among the plurality of images to be retrieved, as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose corresponding similarity is greater than a predetermined similarity threshold.
  • the processor is further configured to output location information of the target region of interest after the target region of interest of the query image is obtained.
  • the embodiment of the present application further provides a storage medium for storing executable code, where the executable code is used to execute the method steps of the image retrieval method described in the above first aspect at runtime.
  • an embodiment of the present application further provides an application program for performing the method steps of the image retrieval method according to the first aspect described above at runtime.
  • The target feature of the query image may be determined based on the pre-trained deep neural network; the similarity between the target feature of the query image and the target feature of each image to be retrieved is calculated; and, according to the calculated similarity, the search image corresponding to the query image is determined from the plurality of images to be retrieved. It can be seen that, with this scheme, the features of the image need not be extracted according to the user's instruction; that is, features reflecting the characteristics of the image can be accurately determined without the user's subjective participation, thereby improving the accuracy of image retrieval.
  • the target features of the query image are determined, and the automatic positioning of the target features is realized, and the user experience is improved.
  • Of course, implementing any product or method of the present application does not necessarily require achieving all of the advantages described above at the same time.
  • FIG. 1 is a flowchart of an image retrieval method according to an embodiment of the present application
  • FIG. 2 is a flow chart of steps of an image retrieval method according to an embodiment of the present application.
  • FIG. 3 is a flowchart of determining image target features by two deep neural networks according to an embodiment of the present application
  • FIG. 4 is a flowchart of another step of an image retrieval method according to an embodiment of the present application.
  • FIG. 5 is a flowchart of determining an image target feature by using a deep neural network according to an embodiment of the present application
  • FIG. 6 is a flowchart of a specific process of image retrieval provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an image retrieval apparatus according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the image retrieval method provided by the embodiment of the present application is described in detail with reference to FIG. 1 , and the method includes:
  • Step 101 Acquire a query image.
  • the image retrieval method provided by the embodiment of the present application can be applied to an electronic device, wherein the electronic device can include a desktop computer, a portable computer, a smart mobile terminal, and the like.
  • The electronic device acquires a query image, that is, a target image that needs to be retrieved.
  • For example, an image including a cat face is acquired.
  • The query image may be manually uploaded by the user or automatically captured by the electronic device; either is reasonable.
  • Step 102 Determine a target feature of the query image based on the pre-trained deep neural network, wherein the deep neural network is trained according to each sample image and a predetermined feature, corresponding to each sample image, that is capable of forming the target feature.
  • the image retrieval is completed by comparing the target features of the query image with the features corresponding to the images in the image library. Therefore, in the process of image retrieval, determining the target feature of the query image is a very important process.
  • The electronic device may train the deep neural network according to a certain number of sample images, such as 100, 500, or 1000, and the predetermined features, corresponding to the respective sample images, that are capable of forming the target features. Based on the deep neural network, the target features of the query image can be determined.
  • the electronic device may input the query image into the pre-trained deep neural network, and then determine the target feature of the query image based on the pre-trained deep neural network.
  • In one specific implementation, the predetermined feature required for deep neural network training may be the same as the target feature; for example, the predetermined feature is a global feature and the target feature is a global feature. In another specific implementation, the predetermined feature required for deep neural network training may differ from the target feature, provided the target feature can be generated from the predetermined feature; for example, the predetermined feature is a region of interest feature and the target feature is a feature aggregated from region of interest features.
  • the so-called region of interest feature refers to an image feature corresponding to the region of interest that has the ability to recognize and can reflect the characteristics of the image.
  • Step 103 Acquire target features of multiple images to be retrieved.
  • the target features of the plurality of images to be retrieved in the image library may be directly acquired; or may be determined in real time during the process of image retrieval.
  • the target features of the plurality of images to be retrieved stored in the preset database may be directly acquired. Specifically, target features of a plurality of images to be retrieved are extracted in advance, and the target features are saved in a preset database. In this way, in the process of image retrieval, corresponding target features can be directly obtained from the preset database.
  • the target feature of the image to be retrieved is extracted in advance, and in the process of image retrieval, the target features of the plurality of images to be retrieved stored in the preset database are directly acquired.
  • the target feature of the image to be retrieved may be stored in advance to implement offline extraction of the target feature of the image to be retrieved.
  • This avoids the very long delay of extracting the target features of multiple images to be retrieved in real time, so the requirements of real-time applications can be met.
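  • The offline extraction described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the in-memory dictionary stands in for the preset database, and `build_feature_store`, `acquire_target_features`, and `toy_extract_feature` are hypothetical names. The toy extractor is only a placeholder for the pre-trained deep neural network.

```python
def build_feature_store(images, extract_feature):
    """Offline step: extract the target feature of every image to be retrieved once."""
    return {image_id: extract_feature(image) for image_id, image in images.items()}


def acquire_target_features(feature_store):
    """Retrieval-time step: read the saved features; no extraction happens here."""
    return dict(feature_store)


def toy_extract_feature(image):
    """Hypothetical extractor: mean and maximum of a flat list of pixel values."""
    return [sum(image) / len(image), max(image)]


# Hypothetical gallery of images to be retrieved, keyed by image id.
gallery = {"img_1": [0, 2, 4], "img_2": [1, 1, 1]}
store = build_feature_store(gallery, toy_extract_feature)  # done ahead of time
features = acquire_target_features(store)                  # done per query
```

  • Because the extractor runs only while the store is built, the per-query cost reduces to reading the saved features.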
  • the target features of the plurality of images to be retrieved may be determined online.
  • the target features of the plurality of images to be retrieved are determined based on the pre-trained deep neural network. Specifically, based on the pre-trained deep neural network, the process of determining the target features of the plurality of images to be retrieved is similar to the process of determining the target features of the query image based on the pre-trained deep neural network described above, and details are not described herein again.
  • Step 104 Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
  • the target feature of the query image and the target features of the plurality of images to be retrieved may be respectively compared, and then the search image corresponding to the query image is determined according to the comparison result of the target feature.
  • The feature similarity measure is an important factor affecting image retrieval performance. Therefore, in the embodiment of the present application, after the target feature of the query image and the target features of the plurality of images to be retrieved are determined, the similarity between the target feature of the query image and the target feature of each image to be retrieved may be calculated separately. Specifically, in one implementation, the target feature of the query image and the target feature of each image to be retrieved may each be represented by a feature vector, and the similarity between the feature vectors is then calculated to obtain the similarity between the target feature of the query image and the target feature of each image to be retrieved. The calculation is, of course, not limited to this.
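  • As an illustration of comparing feature vectors, the sketch below uses cosine similarity. The application does not name a specific measure, so cosine similarity is an assumption chosen for illustration, and the function names `cosine_similarity` and `similarities` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def similarities(query_feature, gallery_features):
    """Similarity between the query feature and each image to be retrieved."""
    return {
        image_id: cosine_similarity(query_feature, feature)
        for image_id, feature in gallery_features.items()
    }
```

  • Other measures, such as Euclidean distance converted to a similarity, would fit the same interface.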
  • Step 105 Determine, according to the calculated similarity, a search image corresponding to the query image from the plurality of to-be-retrieved images.
  • the similarity between the target feature of the query image and the target feature of each image to be retrieved is calculated, and the search image corresponding to the query image is determined from the plurality of images to be retrieved according to the similarity.
  • the search image corresponding to the query image may be determined from the plurality of images to be retrieved according to the order of the similarity from high to low.
  • determining the search image corresponding to the query image from the plurality of to-be-searched images according to the calculated similarity may include:
  • The calculated similarities are sorted from high to low or from low to high, and the preset number of images to be retrieved with the highest similarity are selected as the search images corresponding to the query image. For example, if the similarities are sorted from high to low, the preset number of images to be retrieved at the front of the order are selected; if they are sorted from low to high, the preset number of images to be retrieved at the end of the order are selected.
  • the preset number may be 1, 2, 10, or the like.
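  • The sort-based selection can be sketched as follows. The function name `select_search_images` and its parameters are hypothetical; the sketch only assumes that similarities are available as a mapping from image id to score.

```python
def select_search_images(similarity_by_id, preset_number, descending=True):
    """Pick the preset number of images with the highest similarity.

    With descending order the front of the ranking is taken; with ascending
    order the end of the ranking is taken. Both select the same images.
    """
    if preset_number <= 0:
        return []
    ranked = sorted(similarity_by_id.items(), key=lambda item: item[1], reverse=descending)
    chosen = ranked[:preset_number] if descending else ranked[-preset_number:]
    return [image_id for image_id, _ in chosen]


scores = {"a": 0.9, "b": 0.2, "c": 0.7}
front = select_search_images(scores, 2)                   # sorted high to low
back = select_search_images(scores, 2, descending=False)  # sorted low to high
```

  • Both calls select images "a" and "c"; only the order in which they are listed differs.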
  • determining the search image corresponding to the query image from the plurality of to-be-searched images according to the calculated similarity may include:
  • A target image to be retrieved, among the plurality of images to be retrieved, is determined as a search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose corresponding similarity is greater than a predetermined similarity threshold.
  • That is, a similarity threshold is determined in advance.
  • The images to be retrieved whose similarity is greater than the similarity threshold are the search images corresponding to the query image, and the similarity threshold may be set according to actual conditions.
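  • The threshold-based selection can be sketched in a few lines. The function name `select_above_threshold` is hypothetical, and the threshold value in the example is arbitrary.

```python
def select_above_threshold(similarity_by_id, threshold):
    """Keep the images whose similarity is strictly greater than the threshold."""
    return [
        image_id
        for image_id, similarity in similarity_by_id.items()
        if similarity > threshold
    ]


matches = select_above_threshold({"a": 0.9, "b": 0.5, "c": 0.7}, 0.7)
```

  • Unlike the sort-based selection, the number of returned images is not fixed; it depends on how many images exceed the threshold.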
  • The image retrieval method provided by the embodiment of the present application determines a target feature of the query image based on the pre-trained deep neural network; calculates the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determines, according to the calculated similarity, a search image corresponding to the query image from the plurality of images to be retrieved. It can be seen that the image retrieval method provided by the embodiment of the present application can accurately determine the features reflecting the characteristics of the image, thereby improving the accuracy of image retrieval.
  • the user may select to use the region of interest search or the global search.
  • the image may be compared by the feature of the region of interest of the image or the global feature of the image, thereby implementing the process of image retrieval.
  • The global feature of the query image may be determined directly and used as the target feature of the query image; alternatively, the region of interest of the query image may be determined first, and the features of the region of interest are then aggregated into the target feature of the query image.
  • the predetermined feature is a feature of the region of interest
  • the target feature is a feature aggregated from the features of the region of interest
  • the region of interest of the query image may be extracted by two pre-trained deep neural networks, thereby extracting features of the region of interest.
  • an image retrieval method may include the following steps:
  • Step 201 Acquire a query image.
  • Step 202 The query image is input into the pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained according to each sample image and the region of interest corresponding to each sample image.
  • The first deep neural network is trained according to a certain number of sample images, such as 100, 500, or 1000, and the region of interest corresponding to each sample image.
  • The query image is input into the pre-trained first deep neural network to obtain a target region of interest of the query image.
  • The query image is input into the pre-trained first deep neural network, and the first deep neural network operates on the query image to obtain a feature map that is either the same size as the query image or a downsampled version that preserves its aspect ratio.
  • The value at each position in the feature map indicates the recognition capability of the corresponding position in the input query image. The feature map is thresholded and processed with morphological operations to obtain a plurality of sub-regions with strong recognition capability, which are the identified regions of interest.
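  • The thresholding step can be sketched as follows. This is a simplified illustration under stated assumptions: the feature map is a plain 2D list of scores, regions are 4-connected groups of positions above the threshold, and each region is reported as a bounding box. The morphological clean-up mentioned in the text is omitted, and the function name `regions_of_interest` is hypothetical.

```python
from collections import deque


def regions_of_interest(feature_map, threshold):
    """Return (top, left, bottom, right) boxes of connected areas above the threshold."""
    rows = len(feature_map)
    cols = len(feature_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or feature_map[r][c] <= threshold:
                continue
            # Breadth-first flood fill over one connected sub-region.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and feature_map[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 4x4 feature map with two areas of strong recognition capability.
example_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.6],
    [0.0, 0.0, 0.9, 0.8],
]
boxes = regions_of_interest(example_map, 0.5)
```

  • If the feature map is downsampled, the box coordinates would need to be scaled back to the query image before the regions are cropped or reported.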
  • After the target region of interest of the query image is obtained, the location information of the target region of interest may also be output, for example to the user.
  • Step 203 Input the target region of interest into a pre-trained second deep neural network to obtain a target region of interest feature of the target region of interest, wherein the second deep neural network is trained according to each region of interest and the region of interest feature corresponding to each region of interest.
  • The second deep neural network is trained according to a certain number of regions of interest, such as 100, 500, or 1000, and the region of interest feature corresponding to each region of interest.
  • The target region of interest of the query image, obtained through the pre-trained first deep neural network, is input into the pre-trained second deep neural network to obtain the target region of interest feature of the target region of interest.
  • The recognition ability score of the region of interest may be calculated according to the recognition ability values within the region of interest, and the recognition ability score and the region of interest are then input into the pre-trained second deep neural network together with the query image.
  • The pre-trained second deep neural network extracts features according to each region of interest and its corresponding recognition ability to obtain the features of each region of interest.
  • Step 204 Aggregate the target region of interest features into the target feature of the query image.
  • the target region of interest obtained through the pre-trained first depth neural network.
  • FIG. 3 is a flowchart of determining image target features by two networks according to an embodiment of the present application.
  • The image is input into a pre-trained first deep neural network, i.e., the region of interest detection sub-network shown in FIG. 3, to obtain the region of interest of the image.
  • The obtained region of interest is input into a pre-trained second deep neural network, i.e., the region of interest feature extraction sub-network shown in FIG. 3, to obtain a region of interest feature of the image.
  • The region of interest features corresponding to all regions of interest are aggregated to obtain the target feature of the image.
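  • The aggregation of region of interest features into one target feature can be sketched as follows. The application does not specify the aggregation rule, so the score-weighted average used here is purely an assumption for illustration, and the function name `aggregate_roi_features` is hypothetical.

```python
def aggregate_roi_features(roi_features, scores=None):
    """Combine per-region feature vectors into one vector by a weighted average.

    `scores` are optional per-region weights (for example recognition ability
    scores); equal weights are used when they are not given.
    """
    if not roi_features:
        raise ValueError("at least one region of interest feature is required")
    if scores is None:
        scores = [1.0] * len(roi_features)
    if len(scores) != len(roi_features):
        raise ValueError("one score per region of interest is required")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    dimension = len(roi_features[0])
    return [
        sum(weight * feature[i] for weight, feature in zip(scores, roi_features)) / total
        for i in range(dimension)
    ]


equal = aggregate_roi_features([[1.0, 0.0], [0.0, 1.0]])
weighted = aggregate_roi_features([[2.0, 0.0], [0.0, 4.0]], scores=[3.0, 1.0])
```

  • Whatever rule is used, the aggregated vector has a fixed length regardless of how many regions of interest were detected, which is what makes the query and gallery features directly comparable.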
  • Step 205 Acquire target features of multiple images to be retrieved.
  • Step 206 Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
  • Step 207 Determine, according to the calculated similarity, a search image corresponding to the query image from the plurality of to-be-retrieved images.
  • the step 201 is the same as the step 101 in the above embodiment, and the steps 205-207 are the same as the steps 103-105 in the above embodiment, and are not described herein.
  • The target region of interest of the query image is obtained through one pre-trained deep neural network, the target region of interest feature of the target region of interest is then obtained through another pre-trained deep neural network, and the target region of interest features are aggregated into the features required by the search process.
  • The two independent deep neural networks can be trained separately, which simplifies training and reduces the complexity of image retrieval.
  • the results obtained by each deep neural network can also be output to the user to interact with the user.
  • the predetermined feature is a global feature
  • the target feature is a global feature.
  • the target feature of the query image can be obtained through a pre-trained deep neural network.
  • an image retrieval method may include the following steps:
  • Step 401 Acquire a query image.
  • Step 402 The query image is input into the pre-trained third depth neural network to obtain a global feature of the query image, wherein the third depth neural network is trained according to each sample image and the global feature corresponding to each sample image.
  • the query image is input into the pre-trained third depth neural network to obtain a global feature of the query image, wherein the third depth neural network is trained according to each sample image and the global feature corresponding to each sample image.
  • Corresponding to the training of the first and second deep neural networks, the third deep neural network is trained in advance from a certain number of sample images, for example 100, 500, or 1000, and the global feature corresponding to each sample image.
  • During image retrieval, the query image is input into the pre-trained third deep neural network to obtain the global feature of the query image, which is then used as the target feature of the query image.
  • Specifically, the query image is input into the pre-trained third deep neural network, which operates on the query image to produce a feature map that is either the same size as the query image or a downsampled version that preserves its aspect ratio.
  • The value at each position in the feature map represents both the discriminability of the corresponding position in the query image and the feature response of the query image at that position. The global feature of the query image is then determined from the feature map.
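The text does not say how the global feature is derived from the feature map. One common choice is global average pooling, and the sketch below assumes it; the function name `global_average_pool` and the channels-first nested-list layout are illustrative and do not come from the patent.

```python
from typing import List

# Assumed layout: feature_map[channel][row][col]
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Collapse each channel of a feature map to its mean response."""
    pooled = []
    for channel in feature_map:
        values = [value for row in channel for value in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two channels of a 2x2 feature map (toy values).
fmap = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
print(global_average_pool(fmap))  # [4.0, 1.0]
```

The result has one value per channel regardless of image size, which is what makes such a vector convenient to compare across images.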
  • FIG. 5 is a flowchart of determining the target feature of an image through one deep neural network in an embodiment of the present application.
  • The image is input into a pre-trained deep neural network, for example the global feature extraction sub-network shown in FIG. 5; the sub-network extracts the global feature of the image directly, and this global feature is taken as the target feature of the image.
  • Step 403 Acquire target features of multiple images to be retrieved.
  • Step 404 Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
  • Step 405 Determine, according to the calculated similarity, a search image corresponding to the query image from the plurality of to-be-retrieved images.
  • In this embodiment, step 401 is the same as step 101 in the above embodiment, and steps 403-405 are the same as steps 103-105 in the above embodiment; they are not described again here.
  • In this embodiment, the global feature of the query image is obtained through a pre-trained deep neural network, and this global feature is the target feature required in the retrieval process. Only one deep neural network needs to be trained, and the target feature of an image can then be obtained through it, which simplifies training and improves the efficiency of image retrieval.
  • As shown in the specific embodiments of FIG. 2 and FIG. 4, in the image retrieval method provided by the embodiments of the present application, both the extraction of the image region of interest and the extraction of image features are performed by pre-trained deep neural networks.
  • This end-to-end overall scheme resembles the response of the human visual system, so the extracted image features are more discriminative and expressive, which helps ensure the quality of the final retrieval result.
  • FIG. 6 is a flowchart of a specific image retrieval process according to an embodiment of the present application. The process is described in detail below with reference to FIG. 6.
  • Step 601 Acquire a query image submitted by a user.
  • Step 602 Extract the region of interest of the query image through the pre-trained deep neural network, and then aggregate the features of the region of interest or directly extract the global features of the image.
  • the location information of the region of interest can also be returned to the user for selection by the user.
  • Step 603 The user selects a retrieval mode.
  • Step 604 If the global retrieval mode is selected, the global features of the plurality of images to be retrieved are determined directly by the pre-trained deep neural network, and the global feature of the query image is then compared with the global feature of each image to be retrieved.
  • Step 605 If the region of interest retrieval mode is selected, the regions of interest of the plurality of images to be retrieved are extracted by the pre-trained deep neural network, the region-of-interest features of those regions are then extracted, and the region-of-interest feature of the query image is compared with the region-of-interest feature of each image to be retrieved.
  • Step 606 If the global retrieval mode is selected, comparing the global features yields the similarity between the global feature of the query image and the global feature of each image to be retrieved, and the search image is then determined from the plurality of images to be retrieved according to these similarities.
  • If the region of interest retrieval mode is selected, comparing the region-of-interest features yields the similarity between the region-of-interest feature of the query image and that of each image to be retrieved, and the search image is then determined from the plurality of images to be retrieved according to these similarities.
  • Specifically, in either mode the similarities may be sorted and the search image determined from the plurality of images to be retrieved according to the sorting result; alternatively, a preset number of images to be retrieved whose similarity is greater than the similarity threshold may be selected as the search images corresponding to the query image.
  • Step 607 Obtain the image to be retrieved.
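The mode selection in steps 603 to 606 can be sketched as a small dispatch routine. This is only an illustration under stated assumptions: features are taken to be precomputed and L2-normalized so that a dot product can serve as the similarity, and the names `retrieve`, `dot`, and the `"global"` / `"roi"` keys are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    # Assumption: vectors are already L2-normalized, so the dot product acts as a similarity.
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: Dict[str, Vector],
             gallery: Dict[str, Dict[str, Vector]],
             mode: str) -> List[Tuple[str, float]]:
    """Compare the features of the selected mode and rank the gallery (steps 604-606)."""
    if mode not in ("global", "roi"):
        raise ValueError(f"unknown retrieval mode: {mode}")
    scored = [(image_id, dot(query[mode], features[mode]))
              for image_id, features in gallery.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy features standing in for network outputs.
query = {"global": [1.0, 0.0], "roi": [0.0, 1.0]}
gallery = {
    "a.jpg": {"global": [1.0, 0.0], "roi": [1.0, 0.0]},
    "b.jpg": {"global": [0.0, 1.0], "roi": [0.0, 1.0]},
}
print(retrieve(query, gallery, "global")[0][0])  # a.jpg
print(retrieve(query, gallery, "roi")[0][0])     # b.jpg
```

The same gallery can rank differently depending on the selected mode, which is the reason the choice is exposed to the user.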
  • FIG. 7 is a schematic structural diagram of an image retrieval apparatus according to an embodiment of the present application.
  • The image retrieval apparatus provided by the embodiment of the present application is described in detail with reference to FIG. 7 and includes:
  • the image obtaining module 701 is configured to acquire a query image.
  • the first feature determining module 702 is configured to determine a target feature of the query image based on the pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature.
  • the second feature determining module 703 is configured to acquire target features of the plurality of images to be retrieved.
  • the calculating module 704 is configured to calculate a similarity between the target feature of the query image and the target feature of each image to be retrieved.
  • the search image determining module 705 is configured to determine a search image corresponding to the query image from the plurality of to-be-retrieved images according to the calculated similarity.
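The module structure listed above can be pictured as five pluggable components wired into one pipeline. The sketch below is a hypothetical arrangement rather than the apparatus itself: the class name, field names, and the toy lambdas are assumptions, and a real system would call trained networks where the stand-ins appear.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class ImageRetrievalApparatus:
    """Hypothetical wiring of modules 701-705; each field is a pluggable callable."""
    acquire_image: Callable[[], str]                    # image obtaining module 701
    query_feature: Callable[[str], Vector]              # first feature determining module 702
    gallery_features: Callable[[], Dict[str, Vector]]   # second feature determining module 703
    similarity: Callable[[Vector, Vector], float]       # calculating module 704
    select: Callable[[Dict[str, float]], List[str]]     # search image determining module 705

    def run(self) -> List[str]:
        image = self.acquire_image()
        target = self.query_feature(image)
        scores = {image_id: self.similarity(target, feature)
                  for image_id, feature in self.gallery_features().items()}
        return self.select(scores)


# Toy stand-ins for the five modules.
apparatus = ImageRetrievalApparatus(
    acquire_image=lambda: "query.jpg",
    query_feature=lambda image: [1.0, 0.0],
    gallery_features=lambda: {"a.jpg": [0.9, 0.1], "b.jpg": [0.1, 0.9]},
    similarity=lambda a, b: sum(x * y for x, y in zip(a, b)),
    select=lambda scores: [max(scores, key=lambda image_id: scores[image_id])],
)
print(apparatus.run())  # ['a.jpg']
```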
  • The image retrieval apparatus provided by the embodiment of the present application may determine a target feature of the query image based on the pre-trained deep neural network, calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved, and then, according to the calculated similarities, determine the search image corresponding to the query image from the plurality of images to be retrieved. With this apparatus, image features need not be extracted according to the user's instruction; that is, without the user's subjective participation, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. At the same time, determining the target feature of the query image with the pre-trained deep neural network locates the target feature automatically and improves the user experience.
  • the second feature determining module 703 is specifically configured to acquire target features of the plurality of images to be retrieved stored in the preset database; or determine target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
  • Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features; the first feature determining module 702 includes:
  • a region-of-interest obtaining sub-module, configured to input the query image into the pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image.
  • a region-of-interest feature determining sub-module, configured to input the target region of interest into the pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest.
  • the first feature determining sub-module is configured to aggregate the target region of interest features into the target features of the query image.
  • Optionally, the predetermined feature is a global feature, and the target feature is a global feature; the first feature determining module 702 includes: a second feature determining sub-module, configured to input the query image into the pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  • Optionally, the search image determining module 705 is specifically configured to sort the calculated similarities and determine, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved; or to determine a target image to be retrieved among the plurality of images to be retrieved as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  • the image retrieval apparatus provided by the embodiment of the present application further includes: an output module, configured to output location information of the target region of interest after obtaining the target region of interest of the query image.
  • the image retrieval device of the embodiment of the present application is a device applying the image retrieval method described above, and all embodiments of the image retrieval method are applicable to the device, and all of the same or similar beneficial effects can be achieved.
  • The embodiment of the present application further provides an electronic device, as shown in FIG. 8, including a processor 801, a communication interface 802, a memory 803, and a communication bus 804, wherein the processor 801, the communication interface 802, and the memory 803 communicate with each other through the communication bus 804.
  • the memory 803 is configured to store a computer program.
  • the processor 801 is configured to perform the following steps when executing the program stored on the memory 803:
  • the target features of the plurality of images to be retrieved stored in the preset database are acquired; or the target features of the plurality of images to be retrieved are determined based on the pre-trained deep neural network.
  • Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features. In this case, determining the target feature of the query image based on the pre-trained deep neural network includes:
  • inputting the query image into the pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
  • inputting the target region of interest into the pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
  • aggregating the target region-of-interest features into the target feature of the query image.
  • Optionally, the predetermined feature is a global feature, and the target feature is a global feature. In this case, the query image is input into the pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  • Optionally, the calculated similarities are sorted, and the search image corresponding to the query image is determined from the plurality of images to be retrieved according to the sorting result; or a target image to be retrieved among the plurality of images to be retrieved is determined as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  • the processor is further configured to output location information of the target region of interest after obtaining the target region of interest of the query image.
  • the communication bus mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
  • the communication bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the above electronic device and other devices.
  • the memory may include a random access memory (RAM), and may also include a non-volatile memory (NVM), such as at least one disk storage.
  • the memory may also be at least one storage device located away from the aforementioned processor.
  • The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; or may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
  • The electronic device provided by the embodiment of the present application may determine a target feature of the query image based on the pre-trained deep neural network, calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved, and then, according to the calculated similarities, determine the search image corresponding to the query image from the plurality of images to be retrieved. With this electronic device, image features need not be extracted according to the user's instruction; that is, without the user's subjective participation, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. At the same time, determining the target feature of the query image with the pre-trained deep neural network locates the target feature automatically and improves the user experience.
  • the embodiment of the present application further provides a storage medium for storing executable code, and the executable code is configured to perform the following steps at runtime:
  • the target features of the plurality of images to be retrieved stored in the preset database are acquired; or the target features of the plurality of images to be retrieved are determined based on the pre-trained deep neural network.
  • Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features. In this case, determining the target feature of the query image based on the pre-trained deep neural network includes:
  • inputting the query image into the pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
  • inputting the target region of interest into the pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
  • aggregating the target region-of-interest features into the target feature of the query image.
  • Optionally, the predetermined feature is a global feature, and the target feature is a global feature. In this case, the query image is input into the pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  • Optionally, the calculated similarities are sorted, and the search image corresponding to the query image is determined from the plurality of images to be retrieved according to the sorting result; or a target image to be retrieved among the plurality of images to be retrieved is determined as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  • the location information of the target region of interest is output.
  • The storage medium provided by the embodiment of the present application may determine a target feature of the query image based on the pre-trained deep neural network, calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved, and then, according to the calculated similarities, determine the search image corresponding to the query image from the plurality of images to be retrieved. With this storage medium, image features need not be extracted according to the user's instruction; that is, without the user's subjective participation, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. At the same time, determining the target feature of the query image with the pre-trained deep neural network locates the target feature automatically and improves the user experience.
  • the target features of the plurality of images to be retrieved stored in the preset database are acquired; or the target features of the plurality of images to be retrieved are determined based on the pre-trained deep neural network.
  • Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features. In this case, determining the target feature of the query image based on the pre-trained deep neural network includes:
  • inputting the query image into the pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
  • inputting the target region of interest into the pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
  • aggregating the target region-of-interest features into the target feature of the query image.
  • Optionally, the predetermined feature is a global feature, and the target feature is a global feature. In this case, the query image is input into the pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  • Optionally, the calculated similarities are sorted, and the search image corresponding to the query image is determined from the plurality of images to be retrieved according to the sorting result; or a target image to be retrieved among the plurality of images to be retrieved is determined as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  • the location information of the target region of interest is output.
  • The application program provided by the embodiment of the present application may determine a target feature of the query image based on the pre-trained deep neural network, calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved, and then, according to the calculated similarities, determine the search image corresponding to the query image from the plurality of images to be retrieved. With this application program, image features need not be extracted according to the user's instruction; that is, without the user's subjective participation, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. At the same time, determining the target feature of the query image with the pre-trained deep neural network locates the target feature automatically and improves the user experience.

Abstract

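Embodiments of the present application provide an image retrieval method and apparatus, and an electronic device. The method includes: acquiring a query image; determining a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature; acquiring target features of a plurality of images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved. The image retrieval method and apparatus and the electronic device provided by the embodiments of the present application can improve the accuracy of image retrieval.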

Description

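Image retrieval method and apparatus, and electronic device
The present application claims priority to Chinese patent application No. 201710632446.X, entitled "Image retrieval method and apparatus, and electronic device" and filed with the Chinese Patent Office on July 28, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing and pattern recognition, and in particular to an image retrieval method and apparatus, and an electronic device.
Background
With the continuous development of storage, multimedia, compression, and network bandwidth technologies, thousands upon thousands of pictures are produced every day. How to quickly and accurately find images that meet a user's needs in a massive image library has become an important and pressing problem in the field of image processing and pattern recognition.
To retrieve images that meet a user's needs, the user's needs must first be analyzed, and images meeting those needs are then searched for in the image library. In current image retrieval methods, the retrieval system receives a query image provided by the user and extracts a region of interest of the query image according to the user's instruction, where the region of interest is a region that is discriminative and reflects the characteristics of the image, and discriminability is the ability to distinguish different targets. The system then extracts the features of the region of interest and the features of the corresponding region in each image of the image library, compares the features of the region of interest of the query image with the features of the corresponding regions of the images in the database, and finally sorts by similarity and returns the retrieval results, yielding the images that meet the requirements.
As can be seen, in current image retrieval methods the region of interest extracted according to the user's instruction is highly subjective, which leads to large deviations in determining the region of interest and ultimately to low retrieval accuracy.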
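Summary
The purpose of the embodiments of the present application is to provide an image retrieval method and apparatus, and an electronic device, so as to improve the accuracy of image retrieval. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present application provides an image retrieval method, including:
acquiring a query image; determining a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature; acquiring target features of a plurality of images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
Optionally, the step of acquiring the target features of the plurality of images to be retrieved includes: acquiring the target features of the plurality of images to be retrieved that are stored in a preset database; or determining the target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features; the step of determining the target feature of the query image based on the pre-trained deep neural network includes:
inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; inputting the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and aggregating the target region-of-interest features into the target feature of the query image.
Optionally, the predetermined feature is a global feature, and the target feature is a global feature; the step of determining the target feature of the query image based on the pre-trained deep neural network includes:
inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, determining, according to the calculated similarities, the search image corresponding to the query image from the plurality of images to be retrieved includes:
sorting the calculated similarities and determining, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved; or determining a target image to be retrieved among the plurality of images to be retrieved as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, after the target region of interest of the query image is obtained, the method further includes:
outputting location information of the target region of interest.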
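In a second aspect, an embodiment of the present application further provides an image retrieval apparatus, including:
an image obtaining module, configured to acquire a query image.
a first feature determining module, configured to determine a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature.
a second feature determining module, configured to acquire target features of a plurality of images to be retrieved.
a calculating module, configured to calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
a search image determining module, configured to determine, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
Optionally, the second feature determining module is specifically configured to acquire the target features of the plurality of images to be retrieved that are stored in a preset database; or to determine the target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features; the first feature determining module includes:
a region-of-interest obtaining sub-module, configured to input the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; a region-of-interest feature determining sub-module, configured to input the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and a first feature determining sub-module, configured to aggregate the target region-of-interest features into the target feature of the query image.
Optionally, the predetermined feature is a global feature, and the target feature is a global feature; the first feature determining module includes:
a second feature determining sub-module, configured to input the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, the search image determining module is specifically configured to sort the calculated similarities and determine, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved; or to determine a target image to be retrieved among the plurality of images to be retrieved as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, the apparatus further includes: an output module, configured to output location information of the target region of interest after the target region of interest of the query image is obtained.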
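In a third aspect, an embodiment of the present application further provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus.
The memory is configured to store a computer program.
The processor is configured to implement the following method steps when executing the program stored in the memory:
acquiring a query image; determining a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature; acquiring target features of a plurality of images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
Optionally, the step of acquiring the target features of the plurality of images to be retrieved includes:
acquiring the target features of the plurality of images to be retrieved that are stored in a preset database; or determining the target features of the plurality of images to be retrieved based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features; the step of determining the target feature of the query image based on the pre-trained deep neural network includes: inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; inputting the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and aggregating the target region-of-interest features into the target feature of the query image.
Optionally, the predetermined feature is a global feature, and the target feature is a global feature; the step of determining the target feature of the query image based on the pre-trained deep neural network includes: inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, determining, according to the calculated similarities, the search image corresponding to the query image from the plurality of images to be retrieved includes: sorting the calculated similarities and determining, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved; or determining a target image to be retrieved among the plurality of images to be retrieved as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, the processor is further configured to output location information of the target region of interest after the target region of interest of the query image is obtained.
In a fourth aspect, an embodiment of the present application further provides a storage medium for storing executable code, the executable code being used to perform, when run, the method steps of the image retrieval method of the first aspect.
In a fifth aspect, an embodiment of the present application further provides an application program for performing, when run, the method steps of the image retrieval method of the first aspect.
In the image retrieval method provided by the embodiments of the present application, a target feature of the query image may be determined based on a pre-trained deep neural network; the similarity between the target feature of the query image and the target feature of each image to be retrieved is calculated; and the search image corresponding to the query image is then determined from the plurality of images to be retrieved according to the calculated similarities. With this solution, image features need not be extracted according to the user's instruction; that is, without the user's subjective participation, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. At the same time, determining the target feature of the query image with the pre-trained deep neural network locates the target feature automatically and improves the user experience. Of course, any product or method implementing the present application does not necessarily need to achieve all of the advantages described above at the same time.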
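Brief Description of the Drawings
FIG. 1 is a flowchart of an image retrieval method provided by an embodiment of the present application;
FIG. 2 is a flowchart of the steps of one image retrieval method provided by an embodiment of the present application;
FIG. 3 is a flowchart of determining the target feature of an image through two deep neural networks, provided by an embodiment of the present application;
FIG. 4 is a flowchart of the steps of another image retrieval method provided by an embodiment of the present application;
FIG. 5 is a flowchart of determining the target feature of an image through one deep neural network, provided by an embodiment of the present application;
FIG. 6 is a flowchart of a specific image retrieval process provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an image retrieval apparatus provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.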
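Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
FIG. 1 is a flowchart of an image retrieval method provided by an embodiment of the present application. The method is described in detail with reference to FIG. 1 and includes:
Step 101: Acquire a query image.
The image retrieval method provided by the embodiments of the present application may be applied to an electronic device, which may include a desktop computer, a portable computer, a smart mobile terminal, and the like.
In the embodiments of the present application, the electronic device acquires a query image, that is, the target image to be searched for, for example an image containing a cat's face. The query image may be uploaded manually by the user or captured automatically by the electronic device; both are reasonable.
Step 102: Determine a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a predetermined feature capable of forming the target feature.
In the embodiments of the present application, image retrieval is accomplished by comparing the target feature of the query image with the corresponding features of the images in the image library. Determining the target feature of the query image is therefore a very important part of the retrieval process.
To improve the accuracy of image retrieval, the electronic device may train the deep neural network in advance from a certain number of sample images, for example 100, 500, or 1000, and, for each sample image, the predetermined feature capable of forming the target feature. The target feature of the query image can then be determined based on this deep neural network.
Accordingly, during image retrieval, after acquiring the query image the electronic device may input it into the pre-trained deep neural network and determine the target feature of the query image based on that network.
Because the target feature can be formed from the predetermined feature, in one specific implementation the predetermined feature required for training the deep neural network may be the same as the target feature; for example, the predetermined feature is a global feature and the target feature is a global feature. In another specific implementation, the predetermined feature required for training may differ from the target feature, provided that the target feature can be generated from the predetermined feature; for example, the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features. A region-of-interest feature is the image feature corresponding to a region of interest that is discriminative and reflects the characteristics of the image.
For clarity of the solution and layout, specific implementations of determining the target feature of the query image based on the pre-trained deep neural network are introduced later in connection with specific embodiments.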
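Step 103: Acquire target features of a plurality of images to be retrieved.
To retrieve images that meet the requirements from the large number of images in the image library, the features of the library images that correspond to the target feature of the query image must be determined, that is, the target features of the plurality of images to be retrieved in the library. Specifically, pre-stored target features of the plurality of images to be retrieved may be acquired directly, or the target features may be determined in real time during retrieval.
Optionally, in the embodiments of the present application, the target features of the plurality of images to be retrieved that are stored in a preset database may be acquired directly. Specifically, the target features of the plurality of images to be retrieved are extracted in advance and stored in the preset database, so that during retrieval the corresponding target features can be read directly from the preset database.
As can be seen, because the target features of the images to be retrieved are extracted in advance and stored, they are extracted offline, and during retrieval they are simply read from the preset database. This avoids the very long latency of extracting the target features of many images to be retrieved in real time, so the requirements of real-time applications can be met.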
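The offline/online split described above can be illustrated with an in-memory dictionary standing in for the preset database. This is a sketch under stated assumptions: `build_feature_database` and `toy_extract` are hypothetical names, and `toy_extract` is only a placeholder for the pre-trained deep neural network.

```python
from typing import Callable, Dict, Iterable, List

Vector = List[float]


def build_feature_database(image_ids: Iterable[str],
                           extract: Callable[[str], Vector]) -> Dict[str, Vector]:
    """Offline step: extract and store the target feature of every image to be retrieved."""
    return {image_id: extract(image_id) for image_id in image_ids}


calls: List[str] = []


def toy_extract(image_id: str) -> Vector:
    # Placeholder for the pre-trained deep neural network (hypothetical).
    calls.append(image_id)
    return [float(len(image_id)), 1.0]


database = build_feature_database(["cat.jpg", "horse.jpg"], toy_extract)

# Online step: retrieval only reads stored features; the extractor is not called again.
features = [database[image_id] for image_id in ("cat.jpg", "horse.jpg")]
print(features)    # [[7.0, 1.0], [9.0, 1.0]]
print(len(calls))  # 2
```

In practice the dictionary would be replaced by persistent storage, but the point is the same: the expensive extraction happens once per library image, not once per query.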
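Alternatively, the target features of the plurality of images to be retrieved may be determined online. In an optional implementation of the embodiments of the present application, the target features of the plurality of images to be retrieved are determined based on the pre-trained deep neural network. This process is similar to the process of determining the target feature of the query image described above and is not repeated here.
Step 104: Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
After the target feature of the query image and the target features of the plurality of images to be retrieved are determined, the target feature of the query image may be compared with that of each image to be retrieved, and the search image corresponding to the query image is determined from the comparison results.
The similarity measure between features is an important factor in image retrieval performance. Therefore, in the embodiments of the present application, after the target features are determined, the similarity between the target feature of the query image and the target feature of each image to be retrieved may be calculated. Specifically, in one implementation, the target feature of the query image and the target feature of each image to be retrieved may be represented as feature vectors, and the similarity between the feature vectors is calculated to obtain the similarity between the target features; the method is of course not limited to this.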
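The text leaves the vector similarity measure open. Cosine similarity is one widely used option and is assumed in the sketch below; the function name `cosine_similarity` and the zero-vector convention are illustrative choices, not something the patent prescribes.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors; 1.0 means the same direction."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([3.0, 4.0], [3.0, 4.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal, 0.0
```

Other measures, such as Euclidean distance converted to a similarity, would fit the same step equally well.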
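Step 105: Determine, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
After the similarity between the target feature of the query image and that of each image to be retrieved is calculated, the search image corresponding to the query image is determined from the plurality of images to be retrieved according to the similarities. For example, the search image may be determined in descending order of similarity.
It should be noted that there are several specific ways of determining, according to the calculated similarities, the search image corresponding to the query image from the plurality of images to be retrieved.
Optionally, in one optional implementation of the embodiments of the present application, determining, according to the calculated similarities, the search image corresponding to the query image from the plurality of images to be retrieved may include:
sorting the calculated similarities and determining, according to the sorting result, the search image corresponding to the query image from the plurality of images to be retrieved.
Specifically, the calculated similarities are arranged from high to low or from low to high, and a preset number of images to be retrieved with the highest similarity are selected as the search images corresponding to the query image. For example, with a high-to-low ordering, the preset number of images at the front are selected; with a low-to-high ordering, the preset number of images at the end are selected. The preset number may be 1, 2, 10, and so on.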
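The sort-and-select rule can be written in a few lines. In this sketch the name `top_k` and the dictionary of similarities keyed by image identifier are assumptions made for illustration.

```python
from typing import Dict, List


def top_k(similarities: Dict[str, float], k: int) -> List[str]:
    """Return the k image ids with the highest similarity, best first."""
    ranked = sorted(similarities,
                    key=lambda image_id: similarities[image_id],
                    reverse=True)
    return ranked[:k]


scores = {"a.jpg": 0.42, "b.jpg": 0.91, "c.jpg": 0.77}
print(top_k(scores, 2))  # ['b.jpg', 'c.jpg']
```

Sorting in ascending order and taking the last k entries selects the same images, which matches the low-to-high case described in the text.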
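Optionally, in another optional implementation of the embodiments of the present application, determining, according to the calculated similarities, the search image corresponding to the query image from the plurality of images to be retrieved may include:
determining a target image to be retrieved among the plurality of images to be retrieved as the search image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Specifically, a similarity threshold is determined, and a preset number of images to be retrieved whose similarity is greater than the threshold are selected as the search images corresponding to the query image. The similarity threshold may be determined according to the actual situation.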
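A minimal sketch of the threshold rule follows. The names `above_threshold` and `limit` are hypothetical; `limit` plays the role of the preset number mentioned in the text and is optional here.

```python
from typing import Dict, List, Optional


def above_threshold(similarities: Dict[str, float],
                    threshold: float,
                    limit: Optional[int] = None) -> List[str]:
    """Keep images whose similarity exceeds the threshold, best first, up to `limit`."""
    hits = [(image_id, score) for image_id, score in similarities.items()
            if score > threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [image_id for image_id, _ in hits[:limit]]


scores = {"a.jpg": 0.42, "b.jpg": 0.91, "c.jpg": 0.77}
print(above_threshold(scores, 0.5))           # ['b.jpg', 'c.jpg']
print(above_threshold(scores, 0.5, limit=1))  # ['b.jpg']
```

Unlike top-k selection, this rule can return an empty list when no image in the library is similar enough to the query.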
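The image retrieval method provided by the embodiments of the present application determines a target feature of the query image based on a pre-trained deep neural network, calculates the similarity between the target feature of the query image and the target feature of each image to be retrieved, and then determines the search image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities. With this method, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval.
It should be noted that during image retrieval the user may choose region-of-interest retrieval or global retrieval; specifically, retrieval may be carried out by comparing the region-of-interest features of images or the global features of images.
In the embodiments of the present application, the global feature of the query image may be determined directly and used as the target feature of the query image; alternatively, the region-of-interest features of the query image may be determined first and then aggregated into the target feature of the query image.
An image retrieval method provided by the embodiments of the present application is introduced below in connection with a specific embodiment.
In this specific embodiment, the predetermined feature is a region-of-interest feature, and the target feature is a feature aggregated from region-of-interest features.
In this case, the region of interest of the query image, and then the features of that region of interest, may be extracted through two pre-trained deep neural networks.
As shown in FIG. 2, an image retrieval method may include the following steps:
Step 201: Acquire a query image.
Step 202: Input the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image.
The first deep neural network is trained in advance from a certain number of sample images, for example 100, 500, or 1000, and the region of interest corresponding to each sample image. During image retrieval, the query image is input into the pre-trained first deep neural network to obtain the target region of interest of the query image.
Specifically, the query image is input into the pre-trained first deep neural network, which operates on the query image to produce a feature map that is either the same size as the query image or a downsampled version that preserves its aspect ratio. The value at each position in the feature map represents the discriminability of the corresponding original position in the input query image. Thresholding and morphological operations are applied to the feature map to obtain several sub-regions with strong discriminability; these sub-regions are the determined regions of interest.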
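The post-processing of the discriminability map can be sketched as thresholding followed by grouping the surviving positions into connected sub-regions. This is an assumption-laden illustration: the patent does not name the morphological operations, so the sketch omits them and uses a plain 4-connected flood fill to delimit regions; the function name and the inclusive `(top, left, bottom, right)` box format are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def regions_of_interest(score_map: List[List[float]], threshold: float) -> List[Box]:
    """Threshold a discriminability map and return one bounding box per connected region."""
    rows, cols = len(score_map), len(score_map[0])
    mask = [[value > threshold for value in row] for row in score_map]
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one 4-connected component and track its extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


score_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.9, 0.8],
]
print(regions_of_interest(score_map, 0.5))  # [(0, 0, 1, 1), (2, 2, 3, 3)]
```

A morphological opening or closing before the grouping step would typically remove isolated positions and fill small holes, at the cost of a slightly longer routine.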
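In addition, after the target region of interest of the query image is obtained, location information of the target region of interest may also be output. For example, the location information of the obtained target region of interest may be output to the user.
Step 203: Input the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest.
Corresponding to the training of the first deep neural network, the second deep neural network is trained in advance from a certain number of sample images, for example 100, 500, or 1000, and the region-of-interest feature corresponding to each region of interest.
The target region of interest of the query image obtained through the pre-trained first deep neural network is input into the pre-trained second deep neural network, which yields the target region-of-interest feature of the target region of interest.
Specifically, a discriminability score may be calculated for each region of interest from the discriminability values within that region, and the discriminability score and the region of interest are then input, together with the query image, into the pre-trained second deep neural network. The network performs feature extraction on each region of interest according to the region and its discriminability, obtaining the feature of each region of interest.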
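How the discriminability score of a region is computed is not specified. A simple possibility, assumed here purely for illustration, is the mean of the discriminability values inside the region's bounding box; `region_score` and the inclusive box format are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def region_score(score_map: List[List[float]], box: Box) -> float:
    """Mean discriminability inside a box (assumed scoring rule, not from the patent)."""
    top, left, bottom, right = box
    values = [score_map[y][x]
              for y in range(top, bottom + 1)
              for x in range(left, right + 1)]
    return sum(values) / len(values)


score_map = [
    [1.0, 0.5],
    [0.5, 0.0],
]
print(region_score(score_map, (0, 0, 1, 1)))  # 0.5
print(region_score(score_map, (0, 0, 0, 1)))  # 0.75
```

A maximum or a sum over the region would be equally plausible scoring rules; the mean has the advantage of not favoring large regions.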
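Step 204: Aggregate the target region-of-interest features into the target feature of the query image.
In actual image retrieval, the pre-trained first deep neural network may yield more than one target region of interest; correspondingly, the pre-trained second deep neural network may yield more than one target region-of-interest feature, and these features may differ in type or size. Therefore, after the target region-of-interest features of the query image are extracted through the pre-trained first and second deep neural networks, the different target region-of-interest features corresponding to the different target regions of interest may be aggregated into the target feature of the query image. It should be noted that aggregation may consist of collecting the different target region-of-interest features into the target feature, or of adjusting target region-of-interest features of different sizes or types to the same size or type and then combining them as the target feature of the query image.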
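One way to realize the second aggregation option, adjusting features to a common size and then combining them, is sketched below. The resampling rule (average pooling into a fixed number of bins) and the combination rule (concatenation) are assumptions; `resample` and `aggregate` are hypothetical names.

```python
from typing import List


def resample(feature: List[float], length: int) -> List[float]:
    """Average-pool a vector into `length` bins so that all features share one size."""
    n = len(feature)
    if length <= 0 or n < length:
        raise ValueError("feature must contain at least `length` values")
    out = []
    for i in range(length):
        start = i * n // length
        end = (i + 1) * n // length
        chunk = feature[start:end]
        out.append(sum(chunk) / len(chunk))
    return out


def aggregate(roi_features: List[List[float]], length: int) -> List[float]:
    """Resize every region-of-interest feature, then concatenate into one target feature."""
    target: List[float] = []
    for feature in roi_features:
        target.extend(resample(feature, length))
    return target


print(aggregate([[1.0, 3.0, 5.0, 7.0], [2.0, 4.0]], 2))  # [2.0, 6.0, 2.0, 4.0]
```

With concatenation the target feature grows with the number of regions, so comparing two images would need a rule for matching regions; an element-wise average of the resized features would instead give a fixed-length target feature.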
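FIG. 3 is a flowchart of determining the target feature of an image through two networks in an embodiment of the present application.
First, the image is input into the pre-trained first deep neural network, that is, the region-of-interest detection sub-network shown in FIG. 3, to obtain the regions of interest of the image.
Second, the obtained regions of interest are input into the pre-trained second deep neural network, that is, the region-of-interest feature extraction sub-network shown in FIG. 3, to obtain the region-of-interest features of the image.
Third, the region-of-interest features corresponding to all obtained regions of interest are aggregated to obtain the target feature of the image.
Step 205: Acquire target features of a plurality of images to be retrieved.
Step 206: Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
Step 207: Determine, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
In this specific embodiment, step 201 is the same as step 101 in the above embodiment, and steps 205-207 are the same as steps 103-105 in the above embodiment; they are not described again here.
In this embodiment, the target region of interest of the query image is obtained through one pre-trained deep neural network, the target region-of-interest features of that region are then obtained through another pre-trained deep neural network, and the obtained target region-of-interest features are aggregated into the target feature required in the retrieval process. The two independent deep neural networks can be trained separately, which reduces the complexity of training and, in turn, of image retrieval. In addition, the result obtained by each deep neural network can be output to the user for interaction.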
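Another image retrieval method provided by the embodiments of the present application is introduced below in connection with another specific embodiment.
In this specific embodiment, the predetermined feature is a global feature, and the target feature is a global feature. In this case, the target feature of the query image can be obtained through a single pre-trained deep neural network.
As shown in FIG. 4, an image retrieval method may include the following steps:
Step 401: Acquire a query image.
Step 402: Input the query image into a pre-trained third deep neural network to obtain a global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Corresponding to the training of the first and second deep neural networks, the third deep neural network is trained in advance from a certain number of sample images, for example 100, 500, or 1000, and the global feature corresponding to each sample image. During image retrieval, the query image is input into the pre-trained third deep neural network to obtain the global feature of the query image, which is then used as the target feature of the query image.
Specifically, the query image is input into the pre-trained third deep neural network, which operates on the query image to produce a feature map that is either the same size as the query image or a downsampled version that preserves its aspect ratio. The value at each position in the feature map represents both the discriminability of the corresponding position in the query image and the feature response of the query image at that position. The global feature of the query image is then determined from the feature map.
FIG. 5 is a flowchart of determining the target feature of an image through one deep neural network in an embodiment of the present application. The image is input into a pre-trained deep neural network, for example the global feature extraction sub-network shown in FIG. 5; the sub-network extracts the global feature of the image directly, and this global feature is taken as the target feature of the image.
Step 403: Acquire target features of a plurality of images to be retrieved.
Step 404: Calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
Step 405: Determine, according to the calculated similarities, a search image corresponding to the query image from the plurality of images to be retrieved.
In this specific embodiment, step 401 is the same as step 101 in the above embodiment, and steps 403-405 are the same as steps 103-105 in the above embodiment; they are not described again here.
In this embodiment, the global feature of the query image is obtained through a pre-trained deep neural network, and this global feature is the target feature required in the retrieval process. Only one deep neural network needs to be trained, and the target feature of an image can then be obtained through it, which simplifies training and improves the efficiency of image retrieval.
As shown in the specific embodiments of FIG. 2 and FIG. 4, in the image retrieval method provided by the embodiments of the present application, both the extraction of the image region of interest and the extraction of image features are performed by pre-trained deep neural networks. This end-to-end overall scheme resembles the response of the human visual system, so the extracted image features are more discriminative and expressive, which helps ensure the quality of the final retrieval result.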
图6为本申请实施例图像检索的具体过程流程图,参照图6对本申请实施例图像检索的具体过程进行详细说明。
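Step 601: obtain the query image submitted by the user.
Step 602: through the pre-trained deep neural network, extract the regions of interest of the query image and aggregate the features of those regions, or directly extract the global feature of the image. In addition, the position information of the regions of interest may be returned to the user for selection.
Step 603: the user selects a retrieval mode.
Step 604: if the global retrieval mode is selected, the global features of the multiple images to be retrieved are determined directly through the pre-trained deep neural network, and the global feature of the query image is then compared with the global feature of each image to be retrieved.
Step 605: if the region-of-interest retrieval mode is selected, the regions of interest of the multiple images to be retrieved are extracted through the pre-trained deep neural network, the region-of-interest features of those regions are extracted, and the region-of-interest features of the query image are then compared with those of each image to be retrieved.
Step 606: if the global retrieval mode was selected, the comparison yields the similarity between the global feature of the query image and the global feature of each image to be retrieved, and the retrieved image is determined from the multiple images to be retrieved according to these similarities.
If the region-of-interest retrieval mode was selected, the comparison yields the similarity between the region-of-interest features of the query image and those of each image to be retrieved, and the retrieved image is determined from the multiple images to be retrieved according to these similarities.
Specifically, in either mode, the similarities may be sorted and the retrieved image determined from the multiple images to be retrieved according to the sorting result; alternatively, a preset number of images to be retrieved whose similarity is greater than a similarity threshold may be selected as the retrieved images corresponding to the query image.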
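The two selection strategies just described (rank by similarity, or keep a preset number of images above a threshold) can be sketched as below. The function names, the dictionary of precomputed similarities, and the `limit` parameter are illustrative assumptions.

```python
from typing import Dict, List, Optional


def top_k(similarities: Dict[str, float], k: int) -> List[str]:
    """Sort by similarity, highest first, and keep the first k image ids."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [image_id for image_id, _ in ranked[:k]]


def above_threshold(similarities: Dict[str, float],
                    threshold: float,
                    limit: Optional[int] = None) -> List[str]:
    """Keep ids whose similarity is strictly greater than the threshold.

    `limit` caps the result at a preset number of images; None keeps all.
    """
    kept = [(image_id, s) for image_id, s in similarities.items() if s > threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    ids = [image_id for image_id, _ in kept]
    return ids if limit is None else ids[:limit]


sims = {"a": 0.9, "b": 0.4, "c": 0.75}
best_two = top_k(sims, 2)                      # ["a", "c"]
confident = above_threshold(sims, 0.5)         # ["a", "c"]
single = above_threshold(sims, 0.5, limit=1)   # ["a"]
```

Ranking always returns results even when nothing matches well, whereas thresholding can return an empty list; which behaviour is preferable depends on how the results are presented to the user.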
Step 607: obtain the retrieved image.
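FIG. 7 is a schematic structural diagram of the image retrieval apparatus provided by an embodiment of the present application. With reference to FIG. 7, the apparatus includes:
An image acquisition module 701, configured to obtain a query image.
A first feature determination module 702, configured to determine a target feature of the query image based on a pre-trained deep neural network, where the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature.
A second feature determination module 703, configured to obtain target features of multiple images to be retrieved.
A calculation module 704, configured to calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved.
A retrieved image determination module 705, configured to determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
The image retrieval apparatus provided by the embodiment of the present application can determine the target feature of a query image based on a pre-trained deep neural network, calculate the similarity between that target feature and the target feature of each image to be retrieved, and then determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image. With this apparatus, image features need not be extracted according to a user's instruction; that is, without the user's subjective involvement, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. Moreover, determining the target feature of the query image based on the pre-trained deep neural network achieves automatic localization of the target feature and improves the user experience.
Optionally, the second feature determination module 703 is specifically configured to obtain the target features of the multiple images to be retrieved that are stored in a preset database; or to determine the target features of the multiple images to be retrieved based on the pre-trained deep neural network.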
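The two options for module 703 (read stored features, or compute them with the network) can be combined into a simple look-up that computes on a miss. This combination is an assumption made for illustration; the text presents the options as alternatives. The in-memory dictionary stands in for the preset database, and `extract` is a hypothetical stand-in for the pre-trained network.

```python
from typing import Callable, Dict, Iterable, List


def gallery_features(
    image_ids: Iterable[str],
    store: Dict[str, List[float]],
    images: Dict[str, object],
    extract: Callable[[object], List[float]],
) -> Dict[str, List[float]]:
    """Return the target feature of every requested image.

    Features already in `store` are reused; missing ones are computed with
    `extract` and written back so later queries can skip the network.
    """
    features = {}
    for image_id in image_ids:
        if image_id not in store:
            store[image_id] = extract(images[image_id])
        features[image_id] = store[image_id]
    return features
```

Precomputing gallery features means that only the query image has to pass through the network at query time, which is the usual motivation for storing them.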
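Optionally, the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features; the first feature determination module 702 includes:
A region-of-interest obtaining sub-module, configured to input the query image into a pre-trained first deep neural network to obtain the target region of interest of the query image, where the first deep neural network is trained from sample images and the region of interest corresponding to each sample image.
A region-of-interest feature determination sub-module, configured to input the target region of interest into a pre-trained second deep neural network to obtain the target region-of-interest feature of the target region of interest, where the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest.
A first feature determination sub-module, configured to aggregate the target region-of-interest features into the target feature of the query image.
Optionally, the predetermined feature is a global feature and the target feature is a global feature; the first feature determination module 702 includes a second feature determination sub-module, configured to input the query image into a pre-trained third deep neural network to obtain the global feature of the query image, where the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, the retrieved image determination module 705 is specifically configured to sort the calculated similarities and, according to the sorting result, determine from the multiple images to be retrieved the retrieved image corresponding to the query image; or to determine a target image to be retrieved among the multiple images to be retrieved as the retrieved image corresponding to the query image, where the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, the image retrieval apparatus provided by the embodiment of the present application further includes an output module, configured to output the position information of the target region of interest after the target region of interest of the query image is obtained.
It should be noted that the image retrieval apparatus of the embodiment of the present application is an apparatus that applies the image retrieval method described above; all embodiments of that method therefore apply to the apparatus and achieve the same or similar beneficial effects.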
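An embodiment of the present application further provides an electronic device which, as shown in FIG. 8, includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804, where the processor 801, the communication interface 802, and the memory 803 communicate with one another through the communication bus 804.
The memory 803 is configured to store a computer program.
The processor 801 is configured to implement the following steps when executing the program stored in the memory 803:
obtaining a query image; determining a target feature of the query image based on a pre-trained deep neural network, where the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature; obtaining target features of multiple images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
Optionally, the target features of the multiple images to be retrieved that are stored in a preset database are obtained; or the target features of the multiple images to be retrieved are determined based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features; the query image is input into a pre-trained first deep neural network to obtain the target region of interest of the query image, where the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; the target region of interest is input into a pre-trained second deep neural network to obtain the target region-of-interest feature of the target region of interest, where the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and the target region-of-interest features are aggregated into the target feature of the query image.
Optionally, the predetermined feature is a global feature and the target feature is a global feature; the query image is input into a pre-trained third deep neural network to obtain the global feature of the query image, where the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, the calculated similarities are sorted and, according to the sorting result, the retrieved image corresponding to the query image is determined from the multiple images to be retrieved; or a target image to be retrieved among the multiple images to be retrieved is determined as the retrieved image corresponding to the query image, where the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, the processor is further configured to output the position information of the target region of interest after the target region of interest of the query image is obtained.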
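The communication bus of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, it is drawn as a single thick line in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage. Optionally, the memory may also be at least one storage apparatus located remotely from the processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The electronic device provided by the embodiment of the present application can determine the target feature of a query image based on a pre-trained deep neural network, calculate the similarity between that target feature and the target feature of each image to be retrieved, and then determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image. With this electronic device, image features need not be extracted according to a user's instruction; that is, without the user's subjective involvement, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. Moreover, determining the target feature of the query image based on the pre-trained deep neural network achieves automatic localization of the target feature and improves the user experience.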
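An embodiment of the present application further provides a storage medium for storing executable code, the executable code being used to perform the following steps when run:
obtaining a query image; determining a target feature of the query image based on a pre-trained deep neural network, where the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature; obtaining target features of multiple images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
Optionally, the target features of the multiple images to be retrieved that are stored in a preset database are obtained; or the target features of the multiple images to be retrieved are determined based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features; the query image is input into a pre-trained first deep neural network to obtain the target region of interest of the query image, where the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; the target region of interest is input into a pre-trained second deep neural network to obtain the target region-of-interest feature of the target region of interest, where the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and the target region-of-interest features are aggregated into the target feature of the query image.
Optionally, the predetermined feature is a global feature and the target feature is a global feature; the query image is input into a pre-trained third deep neural network to obtain the global feature of the query image, where the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, the calculated similarities are sorted and, according to the sorting result, the retrieved image corresponding to the query image is determined from the multiple images to be retrieved; or a target image to be retrieved among the multiple images to be retrieved is determined as the retrieved image corresponding to the query image, where the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, after the target region of interest of the query image is obtained, the position information of the target region of interest is output.
The storage medium provided by the embodiment of the present application makes it possible to determine the target feature of a query image based on a pre-trained deep neural network, calculate the similarity between that target feature and the target feature of each image to be retrieved, and then determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image. With this storage medium, image features need not be extracted according to a user's instruction; that is, without the user's subjective involvement, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. Moreover, determining the target feature of the query image based on the pre-trained deep neural network achieves automatic localization of the target feature and improves the user experience.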
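An embodiment of the present application further provides an application program, used to perform the following steps when run:
obtaining a query image; determining a target feature of the query image based on a pre-trained deep neural network, where the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature; obtaining target features of multiple images to be retrieved; calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved; and determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
Optionally, the target features of the multiple images to be retrieved that are stored in a preset database are obtained; or the target features of the multiple images to be retrieved are determined based on the pre-trained deep neural network.
Optionally, the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features; the query image is input into a pre-trained first deep neural network to obtain the target region of interest of the query image, where the first deep neural network is trained from sample images and the region of interest corresponding to each sample image; the target region of interest is input into a pre-trained second deep neural network to obtain the target region-of-interest feature of the target region of interest, where the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest; and the target region-of-interest features are aggregated into the target feature of the query image.
Optionally, the predetermined feature is a global feature and the target feature is a global feature; the query image is input into a pre-trained third deep neural network to obtain the global feature of the query image, where the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
Optionally, the calculated similarities are sorted and, according to the sorting result, the retrieved image corresponding to the query image is determined from the multiple images to be retrieved; or a target image to be retrieved among the multiple images to be retrieved is determined as the retrieved image corresponding to the query image, where the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
Optionally, after the target region of interest of the query image is obtained, the position information of the target region of interest is output.
The application program provided by the embodiment of the present application can determine the target feature of a query image based on a pre-trained deep neural network, calculate the similarity between that target feature and the target feature of each image to be retrieved, and then determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image. With this application program, image features need not be extracted according to a user's instruction; that is, without the user's subjective involvement, features that reflect the characteristics of the image can be determined accurately, which improves the accuracy of image retrieval. Moreover, determining the target feature of the query image based on the pre-trained deep neural network achieves automatic localization of the target feature and improves the user experience.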
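It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus, electronic device, storage medium, and application program embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The above are merely preferred embodiments of the present application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application falls within its scope of protection.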

Claims (20)

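  1. An image retrieval method, characterized by comprising:
    obtaining a query image;
    determining a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature;
    obtaining target features of multiple images to be retrieved;
    calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved;
    determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
  2. The method according to claim 1, characterized in that the step of obtaining target features of multiple images to be retrieved comprises:
    obtaining the target features of the multiple images to be retrieved that are stored in a preset database;
    or,
    determining the target features of the multiple images to be retrieved based on the pre-trained deep neural network.
  3. The method according to claim 1 or 2, characterized in that the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features;
    the step of determining the target feature of the query image based on the pre-trained deep neural network comprises:
    inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
    inputting the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
    aggregating the target region-of-interest features into the target feature of the query image.
  4. The method according to claim 1 or 2, characterized in that the predetermined feature is a global feature and the target feature is a global feature;
    the step of determining the target feature of the query image based on the pre-trained deep neural network comprises:
    inputting the query image into a pre-trained third deep neural network to obtain the global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  5. The method according to claim 1 or 2, characterized in that determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image comprises:
    sorting the calculated similarities and, according to the sorting result, determining from the multiple images to be retrieved the retrieved image corresponding to the query image;
    or,
    determining a target image to be retrieved among the multiple images to be retrieved as the retrieved image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  6. The method according to claim 3, characterized in that, after the target region of interest of the query image is obtained, the method further comprises:
    outputting the position information of the target region of interest.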
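  7. An image retrieval apparatus, characterized by comprising:
    an image acquisition module, configured to obtain a query image;
    a first feature determination module, configured to determine a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature;
    a second feature determination module, configured to obtain target features of multiple images to be retrieved;
    a calculation module, configured to calculate the similarity between the target feature of the query image and the target feature of each image to be retrieved;
    a retrieved image determination module, configured to determine, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
  8. The apparatus according to claim 7, characterized in that the second feature determination module is specifically configured to obtain the target features of the multiple images to be retrieved that are stored in a preset database; or to determine the target features of the multiple images to be retrieved based on the pre-trained deep neural network.
  9. The apparatus according to claim 7 or 8, characterized in that the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features;
    the first feature determination module comprises:
    a region-of-interest obtaining sub-module, configured to input the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
    a region-of-interest feature determination sub-module, configured to input the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
    a first feature determination sub-module, configured to aggregate the target region-of-interest features into the target feature of the query image.
  10. The apparatus according to claim 7 or 8, characterized in that the predetermined feature is a global feature and the target feature is a global feature;
    the first feature determination module comprises:
    a second feature determination sub-module, configured to input the query image into a pre-trained third deep neural network to obtain the global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  11. The apparatus according to claim 7 or 8, characterized in that the retrieved image determination module is specifically configured to sort the calculated similarities and, according to the sorting result, determine from the multiple images to be retrieved the retrieved image corresponding to the query image; or to determine a target image to be retrieved among the multiple images to be retrieved as the retrieved image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  12. The apparatus according to claim 9, characterized in that the apparatus further comprises an output module, configured to output the position information of the target region of interest after the target region of interest of the query image is obtained.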
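  13. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to implement the following method steps when executing the program stored in the memory:
    obtaining a query image;
    determining a target feature of the query image based on a pre-trained deep neural network, wherein the deep neural network is trained from sample images and, for each sample image, a corresponding predetermined feature capable of forming the target feature;
    obtaining target features of multiple images to be retrieved;
    calculating the similarity between the target feature of the query image and the target feature of each image to be retrieved;
    determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image.
  14. The device according to claim 13, characterized in that the step of obtaining target features of multiple images to be retrieved comprises:
    obtaining the target features of the multiple images to be retrieved that are stored in a preset database;
    or,
    determining the target features of the multiple images to be retrieved based on the pre-trained deep neural network.
  15. The device according to claim 13 or 14, characterized in that the predetermined feature is a region-of-interest feature and the target feature is a feature aggregated from region-of-interest features;
    the step of determining the target feature of the query image based on the pre-trained deep neural network comprises:
    inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is trained from sample images and the region of interest corresponding to each sample image;
    inputting the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is trained from regions of interest and the region-of-interest feature corresponding to each region of interest;
    aggregating the target region-of-interest features into the target feature of the query image.
  16. The device according to claim 13 or 14, characterized in that the predetermined feature is a global feature and the target feature is a global feature;
    the step of determining the target feature of the query image based on the pre-trained deep neural network comprises:
    inputting the query image into a pre-trained third deep neural network to obtain the global feature of the query image, wherein the third deep neural network is trained from sample images and the global feature corresponding to each sample image.
  17. The device according to claim 13 or 14, characterized in that determining, from the multiple images to be retrieved and according to the calculated similarities, the retrieved image corresponding to the query image comprises:
    sorting the calculated similarities and, according to the sorting result, determining from the multiple images to be retrieved the retrieved image corresponding to the query image;
    or,
    determining a target image to be retrieved among the multiple images to be retrieved as the retrieved image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved whose similarity is greater than a predetermined similarity threshold.
  18. The device according to claim 15, characterized in that the processor is further configured to output the position information of the target region of interest after the target region of interest of the query image is obtained.
  19. A storage medium, characterized in that it is used to store executable code, the executable code being used to perform, when run, the method steps of the image retrieval method according to any one of claims 1 to 6.
  20. An application program, characterized in that it is used to perform, when run, the method steps of the image retrieval method according to any one of claims 1 to 6.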
PCT/CN2018/097008 2017-07-28 2018-07-25 Image retrieval method and apparatus, and electronic device WO2019020049A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US16/632,775 US11586664B2 (en) 2017-07-28 2018-07-25 Image retrieval method and apparatus, and electronic device
ES18839135T ES2924268T3 (es) 2017-07-28 2018-07-25 Procedimiento, aparato y dispositivo electrónico de recuperación de imágenes
EP18839135.3A EP3660700B1 (en) 2017-07-28 2018-07-25 Image retrieval method and apparatus, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710632446.X 2017-07-28
CN201710632446.XA CN110019896B (zh) 2017-07-28 2017-07-28 Image retrieval method and apparatus, and electronic device

Publications (1)

Publication Number Publication Date
WO2019020049A1 true WO2019020049A1 (zh) 2019-01-31

Family

ID=65040008

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097008 WO2019020049A1 (zh) 2017-07-28 2018-07-25 Image retrieval method and apparatus, and electronic device

Country Status (5)

Country Link
US (1) US11586664B2 (zh)
EP (1) EP3660700B1 (zh)
CN (1) CN110019896B (zh)
ES (1) ES2924268T3 (zh)
WO (1) WO2019020049A1 (zh)

Also Published As

Publication number Publication date
CN110019896B (zh) 2021-08-13
EP3660700A4 (en) 2020-06-03
EP3660700A1 (en) 2020-06-03
US20200175062A1 (en) 2020-06-04
US11586664B2 (en) 2023-02-21
EP3660700B1 (en) 2022-06-15
CN110019896A (zh) 2019-07-16
ES2924268T3 (es) 2022-10-05

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2018839135

Country of ref document: EP