WO2022241987A1 - Image retrieval method and apparatus - Google Patents


Info

Publication number
WO2022241987A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
retrieval
historical
feature
copy
Prior art date
Application number
PCT/CN2021/119402
Other languages
French (fr)
Chinese (zh)
Inventor
曾锐
林汉权
林杰兴
Original Assignee
稿定(厦门)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 稿定(厦门)科技有限公司
Publication of WO2022241987A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/538Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text

Definitions

  • the present invention relates to the technical field of image retrieval, in particular to an image retrieval method, a computer-readable storage medium, a computer device and an image retrieval device.
  • Search by image is an image retrieval function that obtains target images based on a specified image provided by the user. It does not require the user to come up with keywords or work out a retrieval strategy, so it can effectively improve retrieval efficiency and reduce the time the user spends finding the target image.
  • In existing search-by-image approaches, the entire image is usually input into a model to extract features of the whole image, and the target image is then retrieved based on those whole-image features.
  • This approach tends to overlook the important information in the specified image, which makes the final retrieval results inaccurate.
  • an object of the present invention is to propose an image retrieval method, which can extract feature information of images from multiple dimensions, deeply mine potential information of original images, and further improve the accuracy of image retrieval.
  • a second object of the present invention is to propose a computer-readable storage medium.
  • a third object of the present invention is to propose a computer device.
  • the fourth object of the present invention is to provide an image retrieval device.
  • The embodiment of the first aspect of the present invention proposes an image retrieval method, including the following steps: acquiring historical images, performing saliency detection on the historical images through a pre-trained saliency detection network, and performing semantic extraction on the historical images according to the saliency detection results to obtain semantic features of the historical images; performing copy (text) extraction on the historical images, and calculating the copy feature corresponding to each historical image according to the copy extraction result; inputting the historical images into a style recognition model to obtain style features of the historical images; calculating the retrieval vector corresponding to each historical image according to the semantic feature, the copy feature and the style feature, and generating a retrieval database from the plurality of historical images and the retrieval vector corresponding to each historical image; acquiring an image to be retrieved, calculating the vector to be retrieved corresponding to the image to be retrieved, and calculating the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector; and returning the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
  • According to the image retrieval method of the embodiment of the present invention, historical images are first obtained, and saliency detection is performed on them through a pre-trained saliency detection network to extract the main subject of each historical image. Semantic analysis is then performed on the historical images according to the saliency detection results to obtain their semantic features. Next, copy is extracted from the historical images, and the corresponding copy features are calculated from the copy extraction results. The historical images are then input into the style recognition model to extract their style features. The semantic, copy and style features are fused to obtain a retrieval vector, and each historical image and its retrieval vector are added to the retrieval database, so that the database is built from multiple historical images and their retrieval vectors. The image to be retrieved is then obtained, its vector to be retrieved is calculated, and the similarity value between the image to be retrieved and any historical image is calculated from the vector to be retrieved and that historical image's retrieval vector. Finally, the retrieval results for the image to be retrieved are returned according to the similarity values of all historical images. In this way, feature information is extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • In addition, the image retrieval method proposed according to the above embodiments of the present invention may also have the following additional technical features:
  • The training of the saliency detection network includes: acquiring an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set from the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network is trained on the training set.
  • Calculating the copy feature corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate the keywords corresponding to the copy extraction result and the weight corresponding to each keyword; and mapping the keywords to keyword vectors and computing a weighted average of the keyword vectors with their corresponding weights to obtain the copy feature corresponding to the historical image.
  • Calculating the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature includes: obtaining the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature, and performing feature fusion on the semantic feature, the copy feature and the style feature according to these three weights to obtain the retrieval vector.
  • The method further includes: obtaining the user's click data on the retrieval results, and updating the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature according to the click data.
  • The embodiment of the second aspect of the present invention provides a computer-readable storage medium on which an image retrieval program is stored; when the image retrieval program is executed by a processor, the above image retrieval method is implemented.
  • According to the computer-readable storage medium of the embodiment of the present invention, an image retrieval program is stored so that, when a processor executes it, the above image retrieval method is implemented. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • The embodiment of the third aspect of the present invention proposes a computer device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the program, the image retrieval method described above is implemented.
  • According to the computer device of the embodiment of the present invention, the image retrieval program is stored in the memory so that, when the processor executes it, the above image retrieval method is implemented. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • The embodiment of the fourth aspect of the present invention proposes an image retrieval device, including: a semantic feature module, used to obtain historical images, perform saliency detection on the historical images through a pre-trained saliency detection network, and perform semantic extraction on the historical images according to the saliency detection results to obtain semantic features of the historical images; a copy feature module, used to extract copy from the historical images and calculate the copy feature corresponding to each historical image according to the copy extraction result; a style feature module, used to input the historical images into a style recognition model to obtain style features of the historical images; a database module, used to calculate the retrieval vector corresponding to each historical image according to the semantic feature, the copy feature and the style feature, and to generate a retrieval database from the multiple historical images and the retrieval vector corresponding to each historical image; a retrieval module, used to obtain an image to be retrieved, calculate the vector to be retrieved corresponding to the image to be retrieved, and calculate the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector; and a feedback module, used to return the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
  • According to the image retrieval device of the embodiment of the present invention, the semantic feature module obtains historical images, performs saliency detection on them through the pre-trained saliency detection network, and performs semantic extraction on them according to the saliency detection results to obtain their semantic features; the copy feature module extracts copy from the historical images and calculates the corresponding copy features according to the copy extraction results; the style feature module inputs the historical images into the style recognition model to obtain their style features; the database module calculates the retrieval vectors corresponding to the historical images according to the semantic, copy and style features, and generates a retrieval database from the multiple historical images and the retrieval vector corresponding to each of them; the retrieval module obtains the image to be retrieved, calculates the corresponding vector to be retrieved, and calculates the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector; and the feedback module returns the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images. In this way, feature information is extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • In addition, the image retrieval device proposed according to the above embodiments of the present invention may also have the following additional technical features:
  • The training of the saliency detection network includes: acquiring an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set from the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network is trained on the training set.
  • Calculating the copy feature corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate the keywords corresponding to the copy extraction result and the weight corresponding to each keyword; and mapping the keywords to keyword vectors and computing a weighted average of the keyword vectors with their corresponding weights to obtain the copy feature corresponding to the historical image.
  • Fig. 1 is a schematic flow chart of an image retrieval method according to an embodiment of the present invention
  • Fig. 2 is a schematic block diagram of an image retrieval device according to an embodiment of the present invention.
  • In existing search-by-image approaches, the entire image is usually input into a model to extract features of the whole image, and the target image is then retrieved based on those whole-image features.
  • According to the image retrieval method of the embodiment of the present invention, historical images are first obtained, and saliency detection is performed on them through a pre-trained saliency detection network to extract the main subject of each historical image. Semantic analysis is then performed on the historical images according to the saliency detection results to obtain their semantic features. Next, copy is extracted from the historical images, and the corresponding copy features are calculated from the copy extraction results. The historical images are then input into the style recognition model to extract their style features. The semantic, copy and style features are fused to obtain a retrieval vector, and each historical image and its retrieval vector are added to the retrieval database, so that the database is built from multiple historical images and their retrieval vectors. The image to be retrieved is then obtained, its vector to be retrieved is calculated, and the similarity value between the image to be retrieved and any historical image is calculated from the vector to be retrieved and that historical image's retrieval vector. Finally, the retrieval results for the image to be retrieved are returned according to the similarity values of all historical images. In this way, feature information is extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • Fig. 1 is a schematic flow chart of an image retrieval method according to an embodiment of the present invention. As shown in Fig. 1, the image retrieval method includes the following steps:
  • The training of the saliency detection network includes: obtaining an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set from the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network can be trained on the training set.
  • If the training set were generated by manual labeling, it would consume a great deal of manpower and material resources. Therefore, when training the saliency detection network, the subject information of the images in the open-source dataset is first extracted and then fused with the subject-free images to generate new images. In this way, a large number of training samples can be obtained without manual labeling, which reduces the resources needed to train the saliency detection network.
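The fusion step described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `composite_subject`, the boolean-mask representation of the subject and the fixed paste position are all assumptions made for the example. It pastes a cut-out subject onto a subject-free background and returns the new image together with the saliency mask that would serve as its label.

```python
import numpy as np

def composite_subject(background, subject, subject_mask, top, left):
    """Paste a subject onto a subject-free background (hypothetical helper).

    background:   H x W x 3 uint8 image without a main subject
    subject:      h x w x 3 uint8 crop taken from an open-source dataset image
    subject_mask: h x w boolean array, True where the crop belongs to the subject
    Returns the fused image and an H x W saliency mask (1 = subject pixel).
    """
    image = background.copy()
    saliency = np.zeros(background.shape[:2], dtype=np.uint8)
    h, w = subject_mask.shape
    # Basic slices are views, so writing through them updates `image` / `saliency`.
    image[top:top + h, left:left + w][subject_mask] = subject[subject_mask]
    saliency[top:top + h, left:left + w][subject_mask] = 1
    return image, saliency
```

Each (image, saliency mask) pair produced this way is a labeled training sample, which is why no manual annotation is needed.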
  • Calculating the copy feature corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate the keywords corresponding to the copy extraction result and the weight corresponding to each keyword; and mapping the keywords to keyword vectors and computing a weighted average of the keyword vectors with their corresponding weights to obtain the copy feature corresponding to the historical image.
  • Crawlers and similar techniques are first used to collect publicly available copy from the Internet, and a training dataset is generated from the collected data. A word2vec model and a word segmentation model are then trained on this dataset. Next, text detection and recognition are performed on the historical image to extract its text. The text is segmented and keywords are extracted through the word segmentation model, yielding the keywords and the weight of each keyword. Each keyword is then mapped to a keyword vector through word2vec, and a weighted sum of the keyword vectors, using the keywords' weights, gives the copy feature vector of the historical image.
  • The style of each historical image is recognized through a pre-trained style recognition model to obtain its style features. Every image has a style of its own; for example, most Spring Festival posters use red as the main color to convey a festive atmosphere. Taking style into account effectively improves the accuracy of subsequent image retrieval.
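The last step of this pipeline, combining keyword vectors by weight, can be sketched as below. The function name `copy_feature`, the dictionary standing in for a trained word-embedding model and the handling of unknown keywords are assumptions for illustration; the weighted average itself is the operation named in the text.

```python
import numpy as np

def copy_feature(keywords, embeddings, dim=128):
    """Weighted average of keyword vectors (hypothetical helper).

    keywords:   list of (word, weight) pairs from keyword extraction
    embeddings: mapping word -> vector, standing in for a trained embedding model
    """
    vectors, weights = [], []
    for word, weight in keywords:
        if word in embeddings:  # assumption: skip words the model has never seen
            vectors.append(np.asarray(embeddings[word], dtype=np.float64))
            weights.append(weight)
    if not vectors or sum(weights) == 0:
        return np.zeros(dim)  # assumption: images without usable copy get a zero vector
    return np.average(np.stack(vectors), axis=0, weights=weights)
```

For instance, with keywords `[("sale", 3), ("spring", 1)]` the vector for "sale" contributes three quarters of the result.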
  • the training of the style recognition model may include: first, obtaining the result image corresponding to the image template (that is, the image generated by the image template), so as to use the result image corresponding to the same image template as an image of the same style; In this way, a large amount of effective training data can be obtained. Further, the dominant color of each result image in the same style can be extracted, and the color distance of the dominant color between the result images can be calculated to filter out the result images that obviously do not belong to the same style, and determine the final training data.
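The dominant-color filtering described here could look roughly like the sketch below. The text does not say how the dominant color is extracted or which distance is used, so the coarse RGB histogram, the Euclidean distance to the group's median color and the threshold `max_distance` are all assumptions.

```python
import numpy as np

def dominant_color(image, bins=4):
    """Most frequent coarsely quantized RGB color of an H x W x 3 uint8 image."""
    step = 256 // bins
    quantized = (image.reshape(-1, 3) // step).astype(np.int64)
    codes = quantized[:, 0] * bins * bins + quantized[:, 1] * bins + quantized[:, 2]
    top = int(np.bincount(codes, minlength=bins ** 3).argmax())
    r, g, b = top // (bins * bins), (top // bins) % bins, top % bins
    return (np.array([r, g, b]) + 0.5) * step  # center of the winning color bin

def filter_same_style(images, max_distance=80.0):
    """Indices of result images whose dominant color is close to the group's median."""
    colors = np.stack([dominant_color(img) for img in images])
    center = np.median(colors, axis=0)
    distances = np.linalg.norm(colors - center, axis=1)
    return [i for i, d in enumerate(distances) if d <= max_distance]
```

Result images generated from one template but recolored beyond recognition would be dropped from that template's style group before training.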
  • ResNet50 combined with triplet loss can be used to train a style recognition model.
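As a reminder of what the triplet loss optimizes, here is a minimal NumPy sketch of the standard formulation on already-computed embeddings. The squared Euclidean distance and the margin value are assumptions; the text only names ResNet50 and triplet loss, and the backbone itself is omitted here.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean triplet loss over a batch of embedding triplets.

    anchor / positive: embeddings of images with the same style (same template)
    negative:          embeddings of images with a different style
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    # Penalize triplets where the negative is not at least `margin` farther away.
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())
```

Training pulls images generated from the same template together in embedding space and pushes images of other styles away, so the embedding can be used directly as the style feature.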
  • S104 Calculate retrieval vectors corresponding to the historical images according to the semantic features, copywriting features and style features, and generate a retrieval database according to multiple historical images and the retrieval vectors corresponding to each historical image.
  • The retrieval vector corresponding to each historical image is calculated from its semantic, copy and style features. Once the calculation is complete, the historical image and its retrieval vector are added to the retrieval database. A retrieval database can thus be built from multiple historical images and the retrieval vector corresponding to each of them, so that subsequent image retrieval can be performed against this database.
  • Calculating the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature includes: obtaining the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature, and performing feature fusion on the semantic, copy and style features according to these three weights to obtain the retrieval vector.
  • The semantic, copy and style features are all one-dimensional vectors of length 128, denoted vector1, vector2 and vector3. The weights corresponding to the three features are defined as a1, a2 and a3. The final retrieval vector is then expressed as: a1*vector1 + a2*vector2 + a3*vector3.
  • The image retrieval method proposed by the embodiment of the present invention further includes: acquiring the user's click data on the retrieval results, and updating the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature according to the click data.
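The fusion formula above translates directly into code. In this sketch the function name `fuse_features` and the shape check are illustrative additions; the weighted sum and the 128-dimensional features come from the text.

```python
import numpy as np

def fuse_features(semantic, copy, style, weights=(1.0, 1.0, 1.0)):
    """Retrieval vector a1*vector1 + a2*vector2 + a3*vector3."""
    a1, a2, a3 = weights
    v1, v2, v3 = (np.asarray(v, dtype=np.float64) for v in (semantic, copy, style))
    if not (v1.shape == v2.shape == v3.shape == (128,)):
        raise ValueError("each feature must be a one-dimensional vector of length 128")
    return a1 * v1 + a2 * v2 + a3 * v3
```

Because the three features are summed rather than concatenated, the retrieval vector stays 128-dimensional regardless of the weights.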
  • the initial weight (for example, 1, 1, 1) may be used for calculation in combination with the values of the three features.
  • The accuracy of the retrieval results can be judged from the user's click data on those results. Updating the weights corresponding to the semantic, copy and style features according to the click data therefore makes the final weight setting more accurate, which in turn improves the accuracy of image retrieval.
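The text does not specify how the weights are updated from click data. Purely as an illustration of one possible rule, the sketch below raises the weight of a feature whose similarity between the query and a clicked result is above average and lowers the others; the function names, the learning rate and the update rule itself are all assumptions.

```python
import numpy as np

def _cosine(u, v):
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def update_weights(weights, query_features, clicked_features, learning_rate=0.1):
    """Hypothetical click-driven update of (a1, a2, a3).

    query_features / clicked_features: (semantic, copy, style) vectors of the
    query image and of one retrieval result the user clicked.
    """
    sims = np.array([_cosine(q, c) for q, c in zip(query_features, clicked_features)])
    # Features that agree with the click more than average gain weight.
    updated = np.asarray(weights, dtype=np.float64) + learning_rate * (sims - sims.mean())
    return np.clip(updated, 0.0, None)  # keep weights non-negative
```

Starting from the initial weights (1, 1, 1), repeated clicks on results that match the query mainly in style would gradually shift weight toward the style feature.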
  • S105 Acquire an image to be retrieved, calculate a vector to be retrieved corresponding to the image to be retrieved, and calculate a similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector.
  • The image to be retrieved uploaded by the user is obtained, the semantic, copy and style features corresponding to it are extracted, and the three features are fused to obtain the vector to be retrieved. The cosine similarity between the vector to be retrieved and the retrieval vector corresponding to any historical image in the retrieval database is then calculated and used as the similarity value between the image to be retrieved and that historical image. By traversing the retrieval database in this way, the similarity value between the image to be retrieved and each historical image can be calculated. The historical images are then sorted by similarity value, and the retrieval result corresponding to the image to be retrieved is returned according to the sorting result.
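The traversal and ranking step can be sketched as follows. The function name `rank_by_cosine`, the `top_k` cutoff and the treatment of zero vectors are assumptions; the cosine similarity and the sort by similarity value follow the text.

```python
import numpy as np

def rank_by_cosine(query, database, top_k=5):
    """Rank historical images by cosine similarity to the vector to be retrieved.

    database: (N, D) array whose rows are the retrieval vectors of historical images
    Returns a list of (row_index, similarity) pairs, most similar first.
    """
    db = np.asarray(database, dtype=np.float64)
    q = np.asarray(query, dtype=np.float64)
    norms = np.linalg.norm(db, axis=1) * np.linalg.norm(q)
    # Assumption: a zero-length vector gets similarity 0 instead of dividing by zero.
    sims = (db @ q) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order]
```

The matrix product computes all similarities in one pass, which is equivalent to traversing the database entry by entry.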
  • In summary, according to the image retrieval method of the embodiment of the present invention, historical images are first obtained, and saliency detection is performed on them through a pre-trained saliency detection network to extract the main subject of each historical image. Semantic analysis is then performed on the historical images according to the saliency detection results to obtain their semantic features. Next, copy is extracted from the historical images, and the corresponding copy features are calculated from the copy extraction results. The historical images are then input into the style recognition model to extract their style features. The semantic, copy and style features are fused to obtain a retrieval vector, and each historical image and its retrieval vector are added to the retrieval database, so that the database is built from multiple historical images and their retrieval vectors. The image to be retrieved is then obtained, its vector to be retrieved is calculated, and the similarity value between the image to be retrieved and any historical image is calculated from the vector to be retrieved and that historical image's retrieval vector. Finally, the retrieval results for the image to be retrieved are returned according to the similarity values of all historical images. In this way, feature information is extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • An embodiment of the present invention proposes a computer-readable storage medium on which an image retrieval program is stored; when the image retrieval program is executed by a processor, the above image retrieval method is implemented.
  • According to the computer-readable storage medium of the embodiment of the present invention, an image retrieval program is stored so that, when a processor executes it, the above image retrieval method is implemented. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • An embodiment of the present invention proposes a computer device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the program, the image retrieval method described above is implemented.
  • According to the computer device of the embodiment of the present invention, the image retrieval program is stored in the memory so that, when the processor executes it, the above image retrieval method is implemented. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
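  • An embodiment of the present invention proposes an image retrieval device. As shown in Fig. 2, the image retrieval device includes: a semantic feature module 10, a copy feature module 20, a style feature module 30, a database module 40, a retrieval module 50 and a feedback module 60.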
  • the semantic feature module 10 is used to obtain historical images, and perform saliency detection on historical images through a pre-trained saliency detection network, and perform semantic extraction on historical images according to the saliency detection results to obtain semantic features of historical images ;
  • the copy feature module 20 is used to extract the text of the historical image, and calculate the corresponding text feature of the historical image according to the text extraction result;
  • the style feature module 30 is used for inputting the historical image into the style recognition model, to obtain the style feature of the historical image;
  • the database module 40 is used to calculate the retrieval vectors corresponding to the historical images according to the semantic features, copywriting features and style features, and generate a retrieval database according to multiple historical images and the retrieval vectors corresponding to each historical image;
  • the retrieval module 50 is used to obtain the image to be retrieved, and calculate the vector to be retrieved corresponding to the image to be retrieved, and calculate the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector;
  • the feedback module 60 is used to return the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
  • The training of the saliency detection network includes: obtaining an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set from the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network can be trained on the training set.
  • Calculating the copy feature corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate the keywords corresponding to the copy extraction result and the weight corresponding to each keyword; and mapping the keywords to keyword vectors and computing a weighted average of the keyword vectors with their corresponding weights to obtain the copy feature corresponding to the historical image.
  • In summary, according to the image retrieval device of the embodiment of the present invention, the semantic feature module obtains historical images, performs saliency detection on them through a pre-trained saliency detection network, and performs semantic extraction on them according to the saliency detection results to obtain their semantic features; the copy feature module extracts text from the historical images and calculates the corresponding copy features according to the text extraction results; the style feature module inputs the historical images into the style recognition model to obtain their style features; the database module calculates the retrieval vectors corresponding to the historical images based on the semantic, copy and style features, and generates a retrieval database from the multiple historical images and the retrieval vector corresponding to each of them; the retrieval module obtains the image to be retrieved, calculates the corresponding vector to be retrieved, and calculates the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector; and the feedback module returns the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images. In this way, feature information is extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
  • the embodiments of the present invention may be provided as methods, systems, or computer program products. Accordingly, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
  • the invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware.
  • the use of the words first, second, and third, etc. does not indicate any order. These words can be interpreted as names.
  • the terms “first” and “second” are used for description purposes only, and cannot be interpreted as indicating or implying relative importance or implicitly indicating the quantity of indicated technical features. Thus, a feature defined as “first” or “second” may explicitly or implicitly include one or more of these features.
  • “plurality” means two or more, unless otherwise specifically defined.
  • the first and second features may be in direct contact, or the first and second features may be in indirect contact through an intermediary.
  • the first feature being “above”, “over” or “on top of” the second feature may mean that the first feature is directly above or obliquely above the second feature, or simply means that the first feature is higher in level than the second feature.
  • the first feature being “below”, “under” or “beneath” the second feature may mean that the first feature is directly below or obliquely below the second feature, or simply means that the first feature is lower in level than the second feature.

Abstract

Disclosed in the present disclosure are an image retrieval method and apparatus, and a medium and a device. The image retrieval method comprises: acquiring historical images, performing saliency detection on the historical images, and performing semantic extraction on the historical images according to saliency detection results, so as to obtain semantic features of the historical images; calculating text features corresponding to the historical images; inputting the historical images into a style recognition model, so as to obtain style features of the historical images; according to the semantic features, the text features and the style features, calculating retrieval vectors corresponding to the historical images, and generating a retrieval database; acquiring an image to be subjected to retrieval, calculating a vector to be subjected to retrieval corresponding to said image, and according to said vector and a retrieval vector, calculating a similarity value between any historical image in the retrieval database and said image; and according to the similarity values corresponding to all the historical images, returning a retrieval result corresponding to said image. Feature information of images can be extracted from a plurality of dimensions, and potential information of an original image can be deeply extracted, thereby improving the accuracy of image retrieval.

Description

Image retrieval method and device
Technical field
The present invention relates to the technical field of image retrieval, and in particular to an image retrieval method, a computer-readable storage medium, a computer device and an image retrieval device.
Background
Search by image is a function that performs image retrieval based on an image specified by the user in order to obtain target images. It does not require the user to compile keywords or work out a retrieval strategy, so it can effectively improve retrieval efficiency and reduce the time the user spends finding a target image.
In the related art, image retrieval based on a user-specified image mostly just inputs the entire image into a model to extract features of the whole image, and then retrieves target images according to those whole-image features. This approach tends to overlook important information in the specified image, which makes the final retrieval results inaccurate.
Summary of the invention
The present invention aims to solve, at least to a certain extent, one of the technical problems in the above technology. To this end, a first object of the present invention is to propose an image retrieval method that can extract feature information of images from multiple dimensions and deeply mine the latent information of the original image, thereby improving the accuracy of image retrieval.
A second object of the present invention is to propose a computer-readable storage medium.
A third object of the present invention is to propose a computer device.
A fourth object of the present invention is to propose an image retrieval device.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes an image retrieval method comprising the following steps: acquiring historical images, performing saliency detection on the historical images through a pre-trained saliency detection network, and performing semantic extraction on the historical images according to the saliency detection results to obtain semantic features of the historical images; performing copy extraction on the historical images, and calculating copy features corresponding to the historical images according to the copy extraction results; inputting the historical images into a style recognition model to obtain style features of the historical images; calculating retrieval vectors corresponding to the historical images according to the semantic features, the copy features and the style features, and generating a retrieval database according to a plurality of the historical images and the retrieval vector corresponding to each historical image; acquiring an image to be retrieved, calculating a vector to be retrieved corresponding to the image to be retrieved, and calculating, according to the vector to be retrieved and the retrieval vectors, a similarity value between any historical image in the retrieval database and the image to be retrieved; and returning a retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all the historical images.
According to the image retrieval method of the embodiment of the present invention, first, historical images are acquired, and saliency detection is performed on them through a pre-trained saliency detection network to extract the main subject of each historical image; next, semantic analysis is performed on the historical image according to the saliency detection result to obtain its semantic features; then, copy extraction is performed on the historical image, and the copy features corresponding to the historical image are calculated according to the copy extraction result; next, the historical image is input into the style recognition model, which extracts its style features; then, the semantic features, copy features and style features are fused to obtain a retrieval vector, and the historical image and its retrieval vector are added to the retrieval database, so that the retrieval database is generated from multiple historical images and their corresponding retrieval vectors; next, the image to be retrieved is acquired, the vector to be retrieved corresponding to it is calculated, and the similarity value between the image to be retrieved and any historical image is calculated according to the vector to be retrieved and the retrieval vector of that historical image; finally, the retrieval result corresponding to the image to be retrieved is returned according to the similarity values corresponding to all historical images. Feature information of images is thereby extracted from multiple dimensions and the latent information of the original image is deeply mined, which improves the accuracy of image retrieval.
In addition, the image retrieval method proposed according to the above embodiments of the present invention may also have the following additional technical features:
Optionally, the training of the saliency detection network includes: acquiring an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set according to the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network is trained according to the training set.
Optionally, calculating the copy features corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate keywords of the copy extraction result and a weight corresponding to each keyword; and mapping the keywords to keyword vectors, and performing a weighted average of the keyword vectors with their corresponding weights to obtain the copy features corresponding to the historical image.
Optionally, calculating the retrieval vector corresponding to the historical image according to the semantic features, the copy features and the style features includes: obtaining the weight corresponding to the semantic features, the weight corresponding to the copy features and the weight corresponding to the style features, and performing feature fusion on the semantic features, the copy features and the style features according to these three weights to obtain the retrieval vector.
Optionally, the method further includes: obtaining the user's click data for the retrieval results, and updating the weight corresponding to the semantic features, the weight corresponding to the copy features and the weight corresponding to the style features according to the click data.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes a computer-readable storage medium on which an image retrieval program is stored; when the image retrieval program is executed by a processor, the image retrieval method described above is implemented.
According to the computer-readable storage medium of the embodiment of the present invention, an image retrieval program is stored so that a processor, when executing the image retrieval program, implements the image retrieval method described above, thereby extracting feature information of images from multiple dimensions, deeply mining the latent information of the original image, and improving the accuracy of image retrieval.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the image retrieval method described above is implemented.
According to the computer device of the embodiment of the present invention, the image retrieval program is stored in the memory so that the processor, when executing the image retrieval program, implements the image retrieval method described above, thereby extracting feature information of images from multiple dimensions, deeply mining the latent information of the original image, and improving the accuracy of image retrieval.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes an image retrieval device comprising: a semantic feature module, configured to acquire historical images, perform saliency detection on the historical images through a pre-trained saliency detection network, and perform semantic extraction on the historical images according to the saliency detection results to obtain semantic features of the historical images; a copy feature module, configured to perform copy extraction on the historical images and calculate copy features corresponding to the historical images according to the copy extraction results; a style feature module, configured to input the historical images into a style recognition model to obtain style features of the historical images; a database module, configured to calculate retrieval vectors corresponding to the historical images according to the semantic features, the copy features and the style features, and to generate a retrieval database according to a plurality of the historical images and the retrieval vector corresponding to each historical image; a retrieval module, configured to acquire an image to be retrieved, calculate a vector to be retrieved corresponding to the image to be retrieved, and calculate, according to the vector to be retrieved and the retrieval vectors, a similarity value between any historical image in the retrieval database and the image to be retrieved; and a feedback module, configured to return a retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all the historical images.
According to the image retrieval device of the embodiment of the present invention, the semantic feature module acquires historical images, performs saliency detection on them through a pre-trained saliency detection network, and performs semantic extraction on the historical images according to the saliency detection results to obtain their semantic features; the copy feature module performs copy extraction on the historical images and calculates the corresponding copy features according to the copy extraction results; the style feature module inputs the historical images into the style recognition model to obtain their style features; the database module calculates the retrieval vectors corresponding to the historical images according to the semantic features, copy features and style features, and generates a retrieval database according to multiple historical images and the retrieval vector corresponding to each historical image; the retrieval module acquires the image to be retrieved, calculates the vector to be retrieved corresponding to it, and calculates the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vectors; and the feedback module returns the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images. Feature information of images is thereby extracted from multiple dimensions and the latent information of the original image is deeply mined, which improves the accuracy of image retrieval.
In addition, the image retrieval device proposed according to the above embodiments of the present invention may also have the following additional technical features:
Optionally, the training of the saliency detection network includes: acquiring an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set according to the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network is trained according to the training set.
Optionally, calculating the copy features corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate keywords of the copy extraction result and a weight corresponding to each keyword; and mapping the keywords to keyword vectors, and performing a weighted average of the keyword vectors with their corresponding weights to obtain the copy features corresponding to the historical image.
Description of drawings
Fig. 1 is a schematic flowchart of an image retrieval method according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of an image retrieval device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals designate, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
In the related art, image retrieval based on a user-specified image mostly just inputs the entire image into a model to extract features of the whole image, and then retrieves target images according to those whole-image features. This approach tends to overlook important information in the specified image, which makes the final retrieval results inaccurate. According to the image retrieval method of the embodiment of the present invention, first, historical images are acquired, and saliency detection is performed on them through a pre-trained saliency detection network to extract the main subject of each historical image; next, semantic analysis is performed on the historical image according to the saliency detection result to obtain its semantic features; then, copy extraction is performed on the historical image, and the copy features corresponding to the historical image are calculated according to the copy extraction result; next, the historical image is input into the style recognition model, which extracts its style features; then, the semantic features, copy features and style features are fused to obtain a retrieval vector, and the historical image and its retrieval vector are added to the retrieval database, so that the retrieval database is generated from multiple historical images and their corresponding retrieval vectors; next, the image to be retrieved is acquired, the vector to be retrieved corresponding to it is calculated, and the similarity value between the image to be retrieved and any historical image is calculated according to the vector to be retrieved and the retrieval vector of that historical image; finally, the retrieval result corresponding to the image to be retrieved is returned according to the similarity values corresponding to all historical images. Feature information of images is thereby extracted from multiple dimensions and the latent information of the original image is deeply mined, which improves the accuracy of image retrieval.
For a better understanding of the above technical solutions, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present invention can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
For a better understanding of the above technical solutions, they are described in detail below in conjunction with the accompanying drawings and specific embodiments.
Fig. 1 is a schematic flowchart of an image retrieval method according to an embodiment of the present invention. As shown in Fig. 1, the image retrieval method includes the following steps:
S101: acquire historical images, perform saliency detection on the historical images through a pre-trained saliency detection network, and perform semantic extraction on the historical images according to the saliency detection results to obtain semantic features of the historical images.
That is to say, the historical images used for training are acquired, and saliency detection is performed on them through the pre-trained saliency detection network to obtain the subject region of each historical image. If a historical image has a subject region, semantic extraction is performed on that subject region; if it has no subject region, semantic extraction is performed on the entire historical image, so as to obtain the semantic features of the historical image. It can be understood that a product image usually contains one prominent subject whose position, color and so on attract the user's attention, whereas a poster image contains many small elements distributed across the poster. Extracting the salient subject region first can effectively improve the accuracy of subsequent target-image retrieval.
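The branch described above (use the subject region when one exists, otherwise the whole image) can be sketched as follows. This is only an illustrative sketch: the function name `select_semantic_region`, the binarization threshold and the minimum-area ratio are assumptions introduced here, and the saliency map is assumed to be a per-pixel score in [0, 1] produced by some saliency detection network.

```python
import numpy as np

def select_semantic_region(image, saliency_mask, threshold=0.5, min_area_ratio=0.01):
    """Return the region that would be passed on to semantic extraction.

    image:         H x W x 3 array.
    saliency_mask: H x W array of saliency scores in [0, 1] (assumed format).
    threshold and min_area_ratio are hypothetical tuning parameters.
    """
    binary = saliency_mask >= threshold
    # No (or a negligible) subject region: fall back to the entire image.
    if binary.sum() < min_area_ratio * binary.size:
        return image
    # Otherwise crop to the bounding box of the salient pixels.
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

Cropping to a bounding box is one simple choice; masking out the background pixels instead would fit the same description.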
In some embodiments, the training of the saliency detection network includes: acquiring an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images; and generating a training set according to the open-source dataset and the fusion results of the subject information and the subject-free images, so that the saliency detection network is trained according to the training set.
It can be understood that most images in open-source datasets are pictures of natural scenes, which differ from the images found in a specific application scenario. For example, in a poster scenario the pictures contain a large number of text boxes and small elements. Generating the training set by manual labeling would consume a great deal of manpower and material resources. Therefore, when training the saliency detection network, the subject information corresponding to the images in the open-source dataset is first extracted and then fused with subject-free images to generate new images. In this way, a large number of training samples can be obtained without manual labeling, which reduces the resources consumed in training the saliency detection network.
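One possible way to fuse a subject with a subject-free image is plain alpha compositing, which yields both a new training image and its saliency label at no labeling cost. The sketch below assumes the subject is available as an RGB patch plus a mask in [0, 1], and that the patch fits inside the background at the chosen position; `composite_subject` and its parameters are hypothetical names, not taken from the source.

```python
import numpy as np

def composite_subject(subject_rgb, subject_mask, background_rgb, top, left):
    """Paste a masked subject onto a subject-free background.

    subject_rgb:    h x w x 3 uint8 patch cut from an open-source dataset image.
    subject_mask:   h x w mask in [0, 1] marking the subject pixels.
    background_rgb: H x W x 3 uint8 subject-free image (e.g. a poster background).
    Returns (new_image, new_mask); new_mask is the label for the new sample.
    Assumes the patch lies fully inside the background.
    """
    h, w = subject_mask.shape
    image = background_rgb.astype(np.float32).copy()
    mask = np.zeros(background_rgb.shape[:2], dtype=np.float32)
    alpha = subject_mask.astype(np.float32)[..., None]
    region = image[top:top + h, left:left + w]
    # Standard alpha blend: subject where the mask is 1, background elsewhere.
    image[top:top + h, left:left + w] = alpha * subject_rgb + (1.0 - alpha) * region
    mask[top:top + h, left:left + w] = subject_mask
    return image.astype(np.uint8), mask
```

Randomizing the position, scale and choice of background for each subject would multiply the number of samples obtained from a single labeled image.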
S102: perform copy extraction on the historical images, and calculate the copy features corresponding to the historical images according to the copy extraction results.
That is, text detection and recognition is first performed on the historical image to identify the text it contains, which completes the copy extraction; then the copy features corresponding to the historical image are calculated according to the copy extraction result.
There may be multiple ways to calculate the copy features corresponding to the historical image according to the copy extraction result.
In some embodiments, calculating the copy features corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate keywords of the copy extraction result and a weight corresponding to each keyword; and mapping the keywords to keyword vectors, and performing a weighted average of the keyword vectors with their corresponding weights to obtain the copy features corresponding to the historical image.
As an example, first, publicly available copy on the Internet is collected by means such as web crawlers, and a training dataset is generated from the collected data; next, a word2vector model and a word segmentation model are trained on the training dataset; then, text detection and recognition is performed on the historical image to extract the text it contains; next, the word segmentation model performs word segmentation and keyword extraction on the text to obtain the keywords and the weight corresponding to each keyword; then, word2vector maps each keyword to a corresponding keyword vector; finally, a weighted summation of the keyword vectors with their weights is performed to obtain the copy feature vector corresponding to the historical image.
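The final aggregation step can be sketched as below. The sketch assumes the keywords arrive as `(word, weight)` pairs and the word vectors as a plain dictionary standing in for a trained word2vector model; `copy_feature` and the handling of unknown words are assumptions made here. It implements the weighted average; dropping the final division gives the weighted sum mentioned in the example.

```python
import numpy as np

def copy_feature(keywords, embeddings, dim=128):
    """Weighted average of keyword vectors.

    keywords:   list of (word, weight) pairs from keyword extraction.
    embeddings: dict mapping word -> vector (stand-in for a word2vector model).
    Words missing from the embeddings are skipped (an assumption).
    """
    total = np.zeros(dim, dtype=np.float64)
    weight_sum = 0.0
    for word, weight in keywords:
        vec = embeddings.get(word)
        if vec is None:
            continue
        total += weight * np.asarray(vec, dtype=np.float64)
        weight_sum += weight
    if weight_sum == 0.0:
        # No usable keyword (e.g. an image without text): return a zero vector.
        return total
    return total / weight_sum
```

For instance, with `embeddings = {"sale": [1, 0], "spring": [0, 1]}` and `keywords = [("sale", 3), ("spring", 1)]`, calling `copy_feature(keywords, embeddings, dim=2)` yields a vector that leans three to one toward "sale".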
S103: input the historical images into the style recognition model to obtain the style features of the historical images.
That is, style recognition is performed on the historical images through a pre-trained style recognition model to obtain their style features. It can be understood that every image has its own style; for example, most Spring Festival posters use red as the main color to convey a festive atmosphere. It can also be understood that such style recognition effectively improves the accuracy of subsequent image retrieval.
As an example, the training of the style recognition model may include: first, obtaining the result images corresponding to an image template (that is, the images generated from that image template), and treating the result images corresponding to the same image template as images of the same style; in this way, a large amount of valid training data can be obtained. Further, the dominant color of each result image in the same style can be extracted, and the dominant-color distance between the result images can be calculated, so as to filter out result images that clearly do not belong to the same style and determine the final training data.
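The dominant-color filtering can be sketched as follows. The source does not say how the dominant color or the distance is computed, so every detail here is an assumption: the dominant color is taken as the most frequent coarse RGB bin, the distance is Euclidean in RGB, and each image is compared with the median dominant color of its template group rather than pairwise. The names `dominant_color`, `filter_by_dominant_color` and `max_distance` are hypothetical.

```python
import numpy as np

def dominant_color(image, bins=8):
    """Most frequent coarse RGB bin of an H x W x 3 uint8 image, as a bin-centre RGB."""
    step = 256 // bins
    q = (image.reshape(-1, 3) // step).astype(np.int64)
    codes = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]
    top = np.bincount(codes, minlength=bins ** 3).argmax()
    r, g, b = top // (bins * bins), (top // bins) % bins, top % bins
    return np.array([r, g, b], dtype=np.float64) * step + step / 2

def filter_by_dominant_color(images, max_distance=80.0):
    """Indices of the images whose dominant color is close to the group's median."""
    colors = np.stack([dominant_color(img) for img in images])
    center = np.median(colors, axis=0)
    dist = np.linalg.norm(colors - center, axis=1)
    return [i for i, d in enumerate(dist) if d <= max_distance]
```

A perceptual color space such as CIELAB would probably give distances closer to human judgment than raw RGB, at the cost of a conversion step.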
As another example, ResNet50 combined with triplet loss may be used for training to obtain the style recognition model.
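For reference, the standard triplet loss on a single (anchor, positive, negative) triple of embeddings can be written as below. This sketch covers only the loss term. The ResNet50 backbone that would produce the embeddings is omitted, and the use of Euclidean distance and the margin value are assumptions, since the text does not specify them.

```python
from math import dist
from typing import Sequence


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.5,
) -> float:
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) with Euclidean d."""
    return max(dist(anchor, positive) - dist(anchor, negative) + margin, 0.0)


# Same-style pair is close, other-style image is far away: no penalty.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Other-style image is closer than the same-style one: positive penalty.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 0.5])
print(easy, hard)
```

Under the template-based labeling of the previous example, the anchor and the positive would be result images of the same template, and the negative a result image of a different template.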
S104: calculate the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature, and generate a retrieval database according to a plurality of historical images and the retrieval vector corresponding to each historical image.
That is, the retrieval vector of the historical image is calculated from the semantic feature, the copy feature and the style feature; once the calculation is complete, the historical image and its retrieval vector are added to the retrieval database. A retrieval database can thus be built from a plurality of historical images and the retrieval vector corresponding to each of them, so that subsequent image retrieval can be performed against the retrieval database.
In some embodiments, calculating the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature includes: obtaining the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature, and performing feature fusion on the semantic feature, the copy feature and the style feature according to these three weights to obtain the retrieval vector.
As an example, the semantic feature, the copy feature and the style feature are each one-dimensional vectors of length 128, denoted vector1, vector2 and vector3 respectively. With the weights corresponding to the three features defined as a1, a2 and a3, the final retrieval vector is expressed as: a1*vector1+a2*vector2+a3*vector3.
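The fusion formula a1*vector1+a2*vector2+a3*vector3 translates directly into code. The sketch below is illustrative only: the function name `fuse_features` and the constant-valued 128-dimensional inputs are placeholders, not values from the embodiment.

```python
from typing import List, Sequence, Tuple


def fuse_features(
    semantic: Sequence[float],
    copy: Sequence[float],
    style: Sequence[float],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Element-wise a1*vector1 + a2*vector2 + a3*vector3 over equal-length vectors."""
    if not (len(semantic) == len(copy) == len(style)):
        raise ValueError("all three feature vectors must have the same length")
    a1, a2, a3 = weights
    return [a1 * s + a2 * c + a3 * t for s, c, t in zip(semantic, copy, style)]


# Placeholder 128-dimensional features, fused with initial weights (1, 1, 1).
vector1 = [1.0] * 128
vector2 = [2.0] * 128
vector3 = [3.0] * 128
retrieval_vector = fuse_features(vector1, vector2, vector3)
print(len(retrieval_vector), retrieval_vector[:3])  # prints 128 [6.0, 6.0, 6.0]
```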
In some embodiments, the image retrieval method provided by the embodiment of the present invention further includes: obtaining the user's click data on the retrieval result, and updating the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature according to the click data.
It can be understood that, when the retrieval vector is first calculated, initial weights (for example, 1, 1, 1) may be combined with the values of the three features. As the method continues to be used, the accuracy of the retrieval results can be judged from the user's click data on those results. Updating the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature according to the click data can then improve the accuracy of the final weight settings, and in turn the accuracy of image retrieval.
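The text does not specify how the click data is converted into a weight update, so the following is purely a hypothetical rule for illustration. For each clicked result, it assumes the per-feature similarity between the query and the clicked image is available. A feature whose similarity on clicked results is above the average across features has its weight raised, and one below the average has its weight lowered. The function name, the learning rate and the non-negativity clamp are all assumptions.

```python
from typing import List, Sequence, Tuple


def update_weights(
    weights: Sequence[float],
    clicked_similarities: Sequence[Tuple[float, float, float]],
    learning_rate: float = 0.1,
) -> List[float]:
    """Hypothetical update: nudge each weight by how much its feature agreed with clicks.

    `clicked_similarities` holds one (semantic, copy, style) similarity triple
    per clicked retrieval result.
    """
    if not clicked_similarities:
        return list(weights)  # no feedback, keep the current weights
    count = len(clicked_similarities)
    mean_sim = [sum(triple[k] for triple in clicked_similarities) / count for k in range(3)]
    overall = sum(mean_sim) / 3
    return [
        max(weight + learning_rate * (sim - overall), 0.0)
        for weight, sim in zip(weights, mean_sim)
    ]


# Start from the initial weights (1, 1, 1); the clicked result matched the
# query mostly on the semantic feature and least on the copy feature.
new_weights = update_weights([1.0, 1.0, 1.0], [(0.9, 0.3, 0.6)])
print(new_weights)
```

In this toy run the semantic weight rises slightly, the copy weight falls slightly, and the style weight stays roughly where it was.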
S105: obtain an image to be retrieved, calculate the vector to be retrieved corresponding to the image to be retrieved, and calculate the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector.
S106: return the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
That is, the image to be retrieved uploaded by the user is obtained; its semantic feature, copy feature and style feature are extracted and fused to obtain the vector to be retrieved corresponding to that image. Next, the cosine similarity between the vector to be retrieved and the retrieval vector of any historical image in the retrieval database is calculated and used as the similarity value between the image to be retrieved and that historical image. By traversing the retrieval database in this way, the similarity value between the image to be retrieved and every historical image can be calculated. The historical images are then sorted by similarity value, and the retrieval result corresponding to the image to be retrieved is returned according to the sorting result.
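Steps S105 and S106 amount to a brute-force cosine-similarity search followed by a sort. A minimal sketch is given below, assuming the retrieval database is a plain dictionary from a hypothetical image identifier to its retrieval vector. The `top_k` cut-off is an added convenience, as the text only says that results are returned according to the sorted order.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(
    query_vector: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Traverse the database, score every historical image, and sort by similarity."""
    scored = [
        (image_id, cosine_similarity(query_vector, vector))
        for image_id, vector in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical two-dimensional retrieval vectors for three historical images.
toy_database = {
    "poster_a": [1.0, 0.0],
    "poster_b": [0.0, 1.0],
    "poster_c": [1.0, 1.0],
}
results = retrieve([1.0, 0.1], toy_database)
print([image_id for image_id, _ in results])  # prints ['poster_a', 'poster_c', 'poster_b']
```

A linear scan like this matches the traversal described in the text. For a large database, an approximate nearest-neighbor index would be a common substitute, though that goes beyond what the embodiment describes.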
In summary, according to the image retrieval method of the embodiment of the present invention: first, a historical image is obtained, and saliency detection is performed on it by a pre-trained saliency detection network to extract the main subject of the historical image; next, semantic analysis is performed on the historical image according to the saliency detection result to obtain its semantic feature; then, copy extraction is performed on the historical image, and the copy feature corresponding to the historical image is calculated according to the copy extraction result; next, the historical image is input into the style recognition model, which extracts its style feature; then, the semantic feature, the copy feature and the style feature are fused to obtain a retrieval vector, and the historical image and its retrieval vector are added to the retrieval database, so that the retrieval database is generated from a plurality of historical images and their corresponding retrieval vectors; next, an image to be retrieved is obtained, the vector to be retrieved corresponding to it is calculated, and the similarity value between the image to be retrieved and any historical image is calculated according to the vector to be retrieved and the retrieval vector of that historical image; finally, the retrieval result corresponding to the image to be retrieved is returned according to the similarity values corresponding to all historical images. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
In order to implement the above embodiments, an embodiment of the present invention provides a computer-readable storage medium on which an image retrieval program is stored; when executed by a processor, the image retrieval program implements the image retrieval method described above.
According to the computer-readable storage medium of the embodiment of the present invention, the image retrieval program is stored so that the processor, when executing it, implements the image retrieval method described above. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
In order to implement the above embodiments, an embodiment of the present invention provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the image retrieval method described above is implemented.
According to the computer device of the embodiment of the present invention, the image retrieval program is stored in the memory so that the processor, when executing it, implements the image retrieval method described above. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
In order to implement the above embodiments, an embodiment of the present invention provides an image retrieval apparatus. As shown in FIG. 2, the image retrieval apparatus includes: a semantic feature module 10, a copy feature module 20, a style feature module 30, a database module 40, a retrieval module 50 and a feedback module 60.
The semantic feature module 10 is configured to obtain a historical image, perform saliency detection on the historical image by a pre-trained saliency detection network, and perform semantic extraction on the historical image according to the saliency detection result to obtain the semantic feature of the historical image;
the copy feature module 20 is configured to perform copy extraction on the historical image and calculate the copy feature corresponding to the historical image according to the copy extraction result;
the style feature module 30 is configured to input the historical image into the style recognition model to obtain the style feature of the historical image;
the database module 40 is configured to calculate the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature, and to generate a retrieval database according to a plurality of historical images and the retrieval vector corresponding to each historical image;
the retrieval module 50 is configured to obtain an image to be retrieved, calculate the vector to be retrieved corresponding to the image to be retrieved, and calculate the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector;
the feedback module 60 is configured to return the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
In some embodiments, training the saliency detection network includes: obtaining an open-source data set and subject-free images, extracting the subject information of the images in the open-source data set, and fusing the subject information with the subject-free images; and generating a training set from the open-source data set and the results of fusing the subject information with the subject-free images, so that the saliency detection network is trained on the training set.
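One way to picture the fusion of subject information with a subject-free image is mask-based compositing: the subject pixels are pasted onto the background, and the pasted region becomes the ground-truth saliency mask for the new training sample. The sketch below works on small grayscale grids represented as nested lists. The function name `composite`, the binary mask, and the assumption that the subject fits entirely inside the background are illustrative choices, not details from the embodiment.

```python
from typing import List, Tuple

Grid = List[List[int]]


def composite(
    background: Grid, subject: Grid, mask: Grid, top: int, left: int
) -> Tuple[Grid, Grid]:
    """Paste masked subject pixels onto a copy of the background.

    Returns the synthesized image and its saliency label (1 where a subject
    pixel was pasted). Assumes the subject fits inside the background.
    """
    image = [row[:] for row in background]
    saliency = [[0] * len(row) for row in background]
    for y, mask_row in enumerate(mask):
        for x, keep in enumerate(mask_row):
            if keep:
                image[top + y][left + x] = subject[y][x]
                saliency[top + y][left + x] = 1
    return image, saliency


# 3x3 subject-free background, 2x2 subject with an L-shaped mask.
background = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
subject = [[9, 9], [9, 9]]
subject_mask = [[1, 0], [1, 1]]

image, saliency = composite(background, subject, subject_mask, top=1, left=1)
print(image)     # prints [[0, 0, 0], [0, 9, 0], [0, 9, 9]]
print(saliency)  # prints [[0, 0, 0], [0, 1, 0], [0, 1, 1]]
```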
In some embodiments, calculating the copy feature corresponding to the historical image according to the copy extraction result includes: performing word segmentation and keyword extraction on the copy extraction result to generate the keywords of the copy extraction result and the weight corresponding to each keyword; and mapping the keywords to keyword vectors and performing a weighted average of the keyword vectors with their corresponding weights to obtain the copy feature corresponding to the historical image.
It should be noted that the above description of the image retrieval method in FIG. 1 also applies to the image retrieval apparatus and is not repeated here.
In summary, in the image retrieval apparatus according to the embodiment of the present invention, the semantic feature module obtains a historical image, performs saliency detection on it by a pre-trained saliency detection network, and performs semantic extraction on the historical image according to the saliency detection result to obtain its semantic feature; the copy feature module performs copy extraction on the historical image and calculates the corresponding copy feature according to the copy extraction result; the style feature module inputs the historical image into the style recognition model to obtain its style feature; the database module calculates the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature, and generates a retrieval database according to a plurality of historical images and the retrieval vector corresponding to each historical image; the retrieval module obtains an image to be retrieved, calculates the vector to be retrieved corresponding to it, and calculates the similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector; and the feedback module returns the retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images. Feature information is thereby extracted from the image in multiple dimensions and the latent information of the original image is mined in depth, which improves the accuracy of image retrieval.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that, in the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any order; these words may be interpreted as names.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
In the description of the present invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality of" means two or more, unless otherwise specifically defined.
In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "coupled", "connected" and "fixed" should be understood broadly. For example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific situation.
In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (10)

  1. An image retrieval method, characterized in that it comprises the following steps:
    obtaining a historical image, performing saliency detection on the historical image by a pre-trained saliency detection network, and performing semantic extraction on the historical image according to the saliency detection result to obtain a semantic feature of the historical image;
    performing copy extraction on the historical image, and calculating a copy feature corresponding to the historical image according to the copy extraction result;
    inputting the historical image into a style recognition model to obtain a style feature of the historical image;
    calculating a retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature, and generating a retrieval database according to a plurality of the historical images and the retrieval vector corresponding to each historical image;
    obtaining an image to be retrieved, calculating a vector to be retrieved corresponding to the image to be retrieved, and calculating a similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector;
    returning a retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
  2. The image retrieval method according to claim 1, characterized in that training the saliency detection network comprises:
    obtaining an open-source data set and subject-free images, extracting subject information of the images in the open-source data set, and fusing the subject information with the subject-free images;
    generating a training set according to the open-source data set and the results of fusing the subject information with the subject-free images, so as to train the saliency detection network according to the training set.
  3. The image retrieval method according to claim 1, characterized in that calculating the copy feature corresponding to the historical image according to the copy extraction result comprises:
    performing word segmentation and keyword extraction on the copy extraction result to generate keywords of the copy extraction result and weights corresponding to the keywords;
    mapping the keywords to keyword vectors, and performing a weighted average according to the keyword vectors and the corresponding weights to obtain the copy feature corresponding to the historical image.
  4. The image retrieval method according to claim 1, characterized in that calculating the retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature comprises:
    obtaining a weight corresponding to the semantic feature, a weight corresponding to the copy feature and a weight corresponding to the style feature, and performing feature fusion on the semantic feature, the copy feature and the style feature according to the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature to obtain the retrieval vector.
  5. The image retrieval method according to claim 4, characterized in that it further comprises:
    obtaining click data of a user on the retrieval result, and updating the weight corresponding to the semantic feature, the weight corresponding to the copy feature and the weight corresponding to the style feature according to the click data.
  6. A computer-readable storage medium, characterized in that an image retrieval program is stored thereon, and when the image retrieval program is executed by a processor, the image retrieval method according to any one of claims 1-5 is implemented.
  7. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, the image retrieval method according to any one of claims 1-5 is implemented.
  8. An image retrieval apparatus, characterized in that it comprises:
    a semantic feature module, configured to obtain a historical image, perform saliency detection on the historical image by a pre-trained saliency detection network, and perform semantic extraction on the historical image according to the saliency detection result to obtain a semantic feature of the historical image;
    a copy feature module, configured to perform copy extraction on the historical image and calculate a copy feature corresponding to the historical image according to the copy extraction result;
    a style feature module, configured to input the historical image into a style recognition model to obtain a style feature of the historical image;
    a database module, configured to calculate a retrieval vector corresponding to the historical image according to the semantic feature, the copy feature and the style feature, and to generate a retrieval database according to a plurality of the historical images and the retrieval vector corresponding to each historical image;
    a retrieval module, configured to obtain an image to be retrieved, calculate a vector to be retrieved corresponding to the image to be retrieved, and calculate a similarity value between any historical image in the retrieval database and the image to be retrieved according to the vector to be retrieved and the retrieval vector;
    a feedback module, configured to return a retrieval result corresponding to the image to be retrieved according to the similarity values corresponding to all historical images.
  9. The image retrieval apparatus according to claim 8, wherein training of the saliency detection network comprises:
    obtaining an open-source dataset and subject-free images, extracting subject information from the images in the open-source dataset, and fusing the subject information with the subject-free images;
    generating a training set according to the open-source dataset and the result of fusing the subject information with the subject-free images, so that the saliency detection network is trained on the training set.
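The training-set construction in claim 9 can be sketched as follows. The sketch assumes that "subject information" is a binary mask plus the pixels it selects, that images are 2D lists of grey values, and that fusion is a simple paste at a given offset; none of these details come from the claim, and the function names are hypothetical.

```python
def paste_subject(background, subject, mask, top, left):
    """Paste masked subject pixels onto a copy of a subject-free background.

    Returns (fused_image, saliency_label); the label is 1 where a subject
    pixel was written and 0 elsewhere.
    """
    fused = [row[:] for row in background]
    label = [[0] * len(row) for row in background]
    for i, mask_row in enumerate(mask):
        for j, is_subject in enumerate(mask_row):
            y, x = top + i, left + j
            inside = 0 <= y < len(fused) and 0 <= x < len(fused[y])
            if is_subject and inside:
                fused[y][x] = subject[i][j]
                label[y][x] = 1
    return fused, label


def build_training_set(open_source_samples, backgrounds, top=0, left=0):
    """Combine the original (image, mask) pairs with the fused samples."""
    training_set = list(open_source_samples)
    for image, mask in open_source_samples:
        for background in backgrounds:
            training_set.append(paste_subject(background, image, mask, top, left))
    return training_set
```

Because the saliency label is produced by the paste itself, each fused sample comes with ground truth at no extra annotation cost, which is presumably the reason for the fusion step.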
  10. The image retrieval apparatus according to claim 8, wherein calculating the copy feature corresponding to the historical image according to the copy extraction result comprises:
    performing word segmentation and keyword extraction on the copy extraction result to generate keywords of the copy extraction result and a weight corresponding to each keyword;
    mapping the keywords to keyword vectors, and computing a weighted average of the keyword vectors with the corresponding weights to obtain the copy feature corresponding to the historical image.
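The two steps of claim 10 can be sketched as below. The sketch assumes whitespace tokenisation and relative term frequency as the keyword weight, with a small dictionary standing in for a word-embedding model; the claim specifies none of these, Chinese copy would need a real word segmenter, and all names are hypothetical.

```python
from collections import Counter


def extract_keywords(text, stopwords=frozenset(), top_n=5):
    """Return up to top_n (keyword, weight) pairs, weighted by term frequency."""
    tokens = [t for t in text.lower().split() if t not in stopwords]
    counts = Counter(tokens)
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(top_n)]


def copy_feature(keywords, embeddings, dim):
    """Weighted average of keyword vectors; keywords without a vector are skipped."""
    acc = [0.0] * dim
    weight_sum = 0.0
    for word, weight in keywords:
        vector = embeddings.get(word)
        if vector is None:
            continue
        for k in range(dim):
            acc[k] += weight * vector[k]
        weight_sum += weight
    if weight_sum == 0.0:
        return acc  # no usable keyword: fall back to the zero vector
    return [value / weight_sum for value in acc]
```

Dividing by the sum of the weights actually used keeps the result a true weighted average even when some keywords have no embedding.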
PCT/CN2021/119402 2021-05-18 2021-09-18 Image retrieval method and apparatus WO2022241987A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110539585.4 2021-05-18
CN202110539585.4A CN113282781B (en) 2021-05-18 2021-05-18 Image retrieval method and device

Publications (1)

Publication Number Publication Date
WO2022241987A1 (en) 2022-11-24

Family

ID=77279558

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119402 WO2022241987A1 (en) 2021-05-18 2021-09-18 Image retrieval method and apparatus

Country Status (2)

Country Link
CN (1) CN113282781B (en)
WO (1) WO2022241987A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282781B (en) * 2021-05-18 2022-06-28 稿定(厦门)科技有限公司 Image retrieval method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291855A (en) * 2017-06-09 2017-10-24 中国电子科技集团公司第五十四研究所 A kind of image search method and system based on notable object
CN108829826A (en) * 2018-06-14 2018-11-16 清华大学深圳研究生院 A kind of image search method based on deep learning and semantic segmentation
CN110598037A (en) * 2019-09-23 2019-12-20 腾讯科技(深圳)有限公司 Image searching method, device and storage medium
CN110866140A (en) * 2019-11-26 2020-03-06 腾讯科技(深圳)有限公司 Image feature extraction model training method, image searching method and computer equipment
US20200175062A1 (en) * 2017-07-28 2020-06-04 Hangzhou Hikvision Digital Technology Co., Ltd. Image retrieval method and apparatus, and electronic device
CN113282781A (en) * 2021-05-18 2021-08-20 稿定(厦门)科技有限公司 Image retrieval method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11144587B2 (en) * 2016-03-08 2021-10-12 Shutterstock, Inc. User drawing based image search
CN106649487B (en) * 2016-10-09 2020-02-18 苏州大学 Image retrieval method based on interest target
CN111415396A (en) * 2019-01-08 2020-07-14 腾讯科技(深圳)有限公司 Image generation method and device and storage medium
CN110297931B (en) * 2019-04-23 2021-12-03 西北大学 Image retrieval method
CN110175249A (en) * 2019-05-31 2019-08-27 中科软科技股份有限公司 A kind of search method and system of similar pictures
CN110288019A (en) * 2019-06-21 2019-09-27 北京百度网讯科技有限公司 Image labeling method, device and storage medium

Also Published As

Publication number Publication date
CN113282781A (en) 2021-08-20
CN113282781B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
US9430719B2 (en) System and method for providing objectified image renderings using recognition information from images
Torralba et al. Labelme: Online image annotation and applications
US8649572B2 (en) System and method for enabling the use of captured images through recognition
WO2019169872A1 (en) Method and device for searching for content resource, and server
US10963504B2 (en) Zero-shot event detection using semantic embedding
US8935246B2 (en) Identifying textual terms in response to a visual query
CN102549603B (en) Relevance-based image selection
CN108334627B (en) Method and device for searching new media content and computer equipment
US20060251292A1 (en) System and method for recognizing objects from images and identifying relevancy amongst images and information
WO2022134701A1 (en) Video processing method and apparatus
CN103988202A (en) Image attractiveness based indexing and searching
August et al. AI naturalists might hold the key to unlocking biodiversity data in social media imagery
Weyand et al. Visual landmark recognition from internet photo collections: A large-scale evaluation
CN109271624B (en) Target word determination method, device and storage medium
WO2022241987A1 (en) Image retrieval method and apparatus
Choi et al. Multimodal location estimation of consumer media: Dealing with sparse training data
Patwardhan et al. ViTag: Automatic video tagging using segmentation and conceptual inference
JPH11250106A (en) Method for automatically retrieving registered trademark through the use of video information of content substrate
Panda et al. Heritage app: annotating images on mobile phones
CN111752922A (en) Method and device for establishing knowledge database and realizing knowledge query
Averbuch‐Elor et al. Distilled collections from textual image queries
Demidova et al. Semantic image-based profiling of users’ interests with neural networks
Lee et al. A scalable service for photo annotation, sharing, and search
Badghaiya et al. Image classification using tag and segmentation based retrieval
CN112988749A (en) Method and device for responding to retrieval request through KV storage equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21940435

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE