CN111797765A - Image processing method, image processing apparatus, server, and storage medium - Google Patents

Image processing method, image processing apparatus, server, and storage medium

Info

Publication number
CN111797765A
CN111797765A (application CN202010637297.8A)
Authority
CN
China
Prior art keywords
image
target
importance parameter
target image
content information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010637297.8A
Other languages
Chinese (zh)
Other versions
CN111797765B (en)
Inventor
张水发 (Zhang Shuifa)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010637297.8A
Publication of CN111797765A
Application granted
Publication of CN111797765B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G06F16/53 Querying
    • G06F16/535 Filtering based on additional data, e.g. user or group profiles
    • G06F16/538 Presentation of query results
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval using metadata automatically derived from the content
    • G06F16/5846 Retrieval using metadata automatically derived from the content, using extracted text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosure relates to an image processing method, an image processing apparatus, a server, and a storage medium, and belongs to the field of computer technology. The method includes: acquiring a target image; identifying the target image to obtain image content information of the target image, where the image content information includes at least one of the text content and the object keywords of the target image; determining an importance parameter of the target image according to the image content information, where the importance parameter indicates the degree of correlation between the image content information and the target image; and adding, according to the importance parameter, the target image to the images under the index corresponding to the image content information. Because the target image is filed under the index according to its importance parameter, images with higher importance parameters, that is, higher relevance to the index, can be output first when the index is hit, which improves the accuracy of image retrieval.

Description

Image processing method, image processing apparatus, server, and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an image processing method, an image processing apparatus, a server, and a storage medium.
Background
Currently, a user can search for a desired image among a large number of images by entering a search term. The server searches for images matching the search term based on the term entered by the user and the indexes of the images in an image database, and feeds the matching images back to the user.
In the related art, the server classifies the images in the image database and uses each image's category as its index. During image retrieval, the server compares the search term entered by the user against the indexes of the images in the image database and feeds the matching images back to the user.
In this process, the index of an image is simply its category. Such an index expresses the information contained in the image along a single dimension, so image retrieval based on it has low accuracy.
Disclosure of Invention
The embodiments of the present disclosure provide an image processing method, an image processing apparatus, a server, and a storage medium to improve the accuracy of image retrieval. The technical solution of the present disclosure is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided an image processing method, the method including:
acquiring a target image;
identifying the target image to obtain image content information of the target image, wherein the image content information comprises at least one of text content and object keywords of the target image;
determining an importance parameter of the target image according to the image content information, wherein the importance parameter is used for representing the degree of correlation between the image content information and the target image;
and adding, according to the importance parameter, the target image to the images under the index corresponding to the image content information.
In a possible implementation manner, the determining the importance parameter of the target image according to the image content information includes:
in response to the category indicated by the object keyword being a target category, determining the importance parameter corresponding to the target category as the importance parameter of the target image.
In another possible implementation manner, the determining the importance parameter of the target image according to the image content information includes:
in response to the category indicated by the object keyword not being a target category, determining a first importance parameter corresponding to the object keyword according to a first position, in the target image, of the object corresponding to the object keyword, wherein the first importance parameter is negatively correlated with the distance between the first position and a first target position in the target image;
determining the first importance parameter as an importance parameter of the target image.
In another possible implementation manner, before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a first correlation between the text content and the object keyword;
in response to the first correlation meeting a first target condition, increasing an importance parameter corresponding to the object keyword;
and in response to the first correlation not meeting the first target condition, reducing the importance parameter corresponding to the object keyword.
In another possible implementation manner, before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a second correlation between the object keyword and an image feature of the target image;
in response to the second correlation meeting a second target condition, increasing an importance parameter corresponding to the object keyword;
and reducing the importance parameter corresponding to the object keyword in response to the second correlation not meeting the second target condition.
In another possible implementation manner, the determining the importance parameter of the target image according to the image content information includes:
determining a second importance parameter corresponding to the text content according to a second position of the text content in the target image and a text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with a distance between the second position and the second target position of the target image and is in positive correlation with the text area;
determining the second importance parameter as the importance parameter of the target image.
In another possible implementation manner, the editing type of the text content includes a human editing type and a scene shooting type, and the second importance parameter corresponding to the text content of the human editing type is greater than the second importance parameter corresponding to the text content of the scene shooting type.
In another possible implementation manner, the image content information further includes scene information of the target image, and before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a third correlation between the text content and the scene information;
in response to the third correlation meeting a third target condition, increasing the importance parameter corresponding to the text content;
and in response to the third correlation not meeting the third target condition, reducing the importance parameter corresponding to the text content.
In another possible implementation manner, the adding the target image to the images under the index corresponding to the image content information according to the importance parameter includes:
comparing the importance parameter of the target image with the importance parameters of the plurality of images to determine the ranking position of the target image among the plurality of images;
adding the target image at the ranking position.
In another possible implementation manner, the image processing method further includes:
in response to receiving a retrieval instruction in which the included search term hits any index, outputting images corresponding to the retrieval instruction from the plurality of images under the hit index, in descending order of the importance parameters of the plurality of images.
According to a second aspect of the embodiments of the present disclosure, there is provided an image processing apparatus including:
an acquisition unit configured to perform acquisition of a target image;
the identification unit is configured to identify the target image to obtain image content information of the target image, wherein the image content information comprises at least one of text content and object keywords of the target image;
a first determination unit configured to perform determination of an importance parameter of the target image according to the image content information, the importance parameter being used to indicate a degree of correlation of the image content information with the target image;
and an adding unit configured to add, according to the importance parameter, the target image to the images under the index corresponding to the image content information.
In a possible implementation manner, the image content information includes the object keyword, and the first determining unit is configured to, in response to the category indicated by the object keyword being a target category, determine the importance parameter corresponding to the target category as the importance parameter of the target image.
In another possible implementation manner, the image content information includes the object keyword, and the first determining unit is configured to perform:
in response to the category indicated by the object keyword not being a target category, determining a first importance parameter corresponding to the object keyword according to a first position, in the target image, of the object corresponding to the object keyword, wherein the first importance parameter is negatively correlated with the distance between the first position and a first target position in the target image;
determining the first importance parameter as an importance parameter of the target image.
In another possible implementation manner, the image processing apparatus further includes:
a second determination unit configured to perform determination of a first correlation between the text content and the object keyword;
a first increasing unit configured to perform increasing an importance parameter corresponding to the object keyword in response to the first correlation meeting a first target condition;
a first reducing unit configured to perform reducing the importance parameter corresponding to the object keyword in response to the first correlation not meeting the first target condition.
In another possible implementation manner, the image processing apparatus further includes:
a third determination unit configured to perform determining a second correlation between the object keyword and an image feature of the target image;
a second increasing unit configured to perform increasing the importance parameter corresponding to the object keyword in response to the second correlation meeting a second target condition;
and the second reducing unit is configured to execute reducing the importance parameter corresponding to the object keyword in response to the second relevance not meeting the second target condition.
In another possible implementation manner, the image content information includes the text content, and the first determining unit is configured to perform:
determining a second importance parameter corresponding to the text content according to a second position of the text content in the target image and a text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with a distance between the second position and the second target position of the target image and is in positive correlation with the text area;
determining the second importance parameter as the importance parameter of the target image.
In another possible implementation manner, the editing type of the text content includes a human editing type and a scene shooting type, and the second importance parameter corresponding to the text content of the human editing type is greater than the second importance parameter corresponding to the text content of the scene shooting type.
In another possible implementation manner, the image processing apparatus further includes:
a fourth determination unit configured to perform determining a third correlation between the text content and the scene information;
a third increasing unit configured to perform increasing an importance parameter corresponding to the text content in response to the third correlation meeting a third target condition;
and the third reducing unit is configured to reduce the importance parameter corresponding to the text content in response to the third relevance not meeting a third target condition.
In another possible implementation manner, the index corresponding to the image content information corresponds to a plurality of images, and the adding unit is configured to perform:
respectively comparing the importance parameter of the target image with the importance parameters of the plurality of images to determine the ordering positions of the target image in the plurality of images;
adding the target image to the sort position.
In another possible implementation manner, the image processing apparatus further includes:
an output unit configured to, in response to receiving a retrieval instruction in which the included search term hits any index, output images corresponding to the retrieval instruction from the plurality of images under the hit index, in descending order of the importance parameters of the plurality of images.
According to a third aspect of embodiments of the present disclosure, there is provided a server, including: one or more processors; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the image processing method of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having instructions that, when executed by a processor of a server, enable the server to perform the image processing method of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, wherein the instructions of the computer program product, when executed by a processor of a server, enable the server to perform the image processing method of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least has the following beneficial effects:
the method comprises the steps of obtaining image content information comprising at least one of text content and object keywords by identifying a target image, determining importance parameters of the target image on the dimension of any image content information by fully utilizing the image content information, and adding the target image to an image pointed by an index corresponding to the image content information according to the importance parameters, so that the images pointed by the index are arranged according to the correlation degree of the image content information and the target image, and further, when the index is hit, the image with higher correlation degree with the index can be preferentially output, and the accuracy of image retrieval is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a schematic diagram of an implementation environment of a method of image processing according to an example embodiment;
FIG. 2 is a flow diagram illustrating a method of image processing according to an exemplary embodiment;
FIG. 3 is a flow diagram illustrating a method of image processing according to an exemplary embodiment;
FIG. 4 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment;
FIG. 5 is a block diagram illustrating a server in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The user information to which the present disclosure relates may be information authorized by the user or sufficiently authorized by each party.
Fig. 1 is a schematic diagram illustrating an implementation environment of an image processing method according to an exemplary embodiment. Referring to fig. 1, the implementation environment specifically includes: a terminal 101 and a server 102.
The terminal 101 may be at least one of a smartphone, a tablet, a smartwatch, a desktop computer, a laptop computer, and the like. Various applications may be installed and run on the terminal 101, such as a video sharing application, a browser, a social application, or an image sharing application; the video sharing application may be a short-video application or a live-streaming application. Through interaction with the server 102, an application on the terminal 101 may provide the user with functions such as image or video search and image or video recommendation.
Terminal 101 broadly refers to one of a plurality of terminals; the disclosed embodiments are illustrated with terminal 101 only. Those skilled in the art will appreciate that the number of terminals may be greater or fewer, from only a few to tens, hundreds, or more; the embodiments of the present disclosure limit neither the number of terminals nor the device type.
The server 102 may be a single server, a cluster of servers, a cloud computing platform, or a virtualization center. The server 102 may identify image content information from an image and index the image according to an importance parameter that indicates the degree of correlation between the image content information and the image. The server 102 may be connected to the terminal 101 through a wireless or wired network. When it receives a retrieval instruction sent by the terminal 101, the server 102 may return, from the plurality of images under the index hit by the search term included in the instruction, the corresponding images to the terminal 101 in descending order of their importance parameters, so that the terminal 101 can display the received images to the user. Optionally, there may be more or fewer servers, which is not limited by the embodiments of the present disclosure. The server 102 may, of course, also include other functional servers to provide more comprehensive and diverse services.
FIG. 2 is a flow diagram illustrating an image processing method according to an exemplary embodiment. Referring to fig. 2, the image processing method includes the following steps.
In step S201, a target image is acquired.
In step S202, the target image is identified to obtain image content information of the target image, where the image content information includes at least one of text content of the target image and an object keyword.
In step S203, an importance parameter of the target image is determined according to the image content information, and the importance parameter is used for indicating the degree of correlation between the image content information and the target image.
In step S204, the target image is added, according to the importance parameter, to the images under the index corresponding to the image content information.
In the embodiments of the present disclosure, image content information including at least one of text content and object keywords is obtained by identifying a target image. The image content information is fully used to determine an importance parameter of the target image in the dimension of each piece of image content information, and the target image is added, according to the importance parameter, to the images under the index corresponding to the image content information. The images under an index are thus arranged by the degree of correlation between the image content information and each image, so when the index is hit, images more relevant to the index can be output first, which improves the accuracy of image retrieval.
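Steps S201 to S204 can be sketched in a few lines of code. This is a minimal illustration only, not the patented implementation: the dict-based index, the function names, and the callback supplying importance parameters are all hypothetical, and image recognition itself is abstracted away.

```python
# Illustrative sketch of S201-S204: file a target image under the index
# for each piece of identified content, ordered by importance parameter.
# All names here are hypothetical, not taken from the patent.

def process_image(index, image_id, content_info, importance_of):
    """index: dict mapping image content information (text content or an
    object keyword) to a list of (image_id, importance) pairs that is
    kept in descending order of importance."""
    for info in content_info:                       # S202: identified info
        importance = importance_of(image_id, info)  # S203: importance param
        entries = index.setdefault(info, [])
        entries.append((image_id, importance))      # S204: add to index
        entries.sort(key=lambda e: e[1], reverse=True)
    return index
```

Keeping each index's image list sorted as images are added is what later lets a hit index output its most relevant images first.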
In one possible implementation manner, the determining the importance parameter of the target image according to the image content information includes:
and determining the importance parameter corresponding to the target category as the importance parameter of the target image in response to the category indicated by the object keyword being the target category.
In another possible implementation manner, the image content information includes an object keyword, and the determining the importance parameter of the target image according to the image content information includes:
in response to the category indicated by the object keyword not being the target category, determining a first importance parameter corresponding to the object keyword according to a first position, in the target image, of the object corresponding to the object keyword, wherein the first importance parameter is negatively correlated with the distance between the first position and the first target position in the target image;
the first importance parameter is determined as the importance parameter of the target image.
In another possible implementation manner, before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a first correlation between the text content and the object keyword;
in response to the first correlation meeting the first target condition, increasing the importance parameter corresponding to the object keyword;
and reducing the importance parameter corresponding to the object keyword in response to the first correlation not meeting the first target condition.
In another possible implementation manner, before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a second correlation between the object keyword and the image feature of the target image;
in response to the second correlation meeting the second target condition, increasing the importance parameter corresponding to the object keyword;
and reducing the importance parameter corresponding to the object keyword in response to the second correlation not meeting the second target condition.
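Both adjustment rules above (for the first correlation and for the second correlation) follow the same pattern: raise the keyword's importance parameter when the correlation meets its target condition, lower it otherwise. A sketch, with the target condition modeled as a simple threshold and a fixed step size, both of which are assumptions not stated in the patent:

```python
def adjust_importance(importance, correlation, threshold=0.5, step=0.1):
    """Raise the importance parameter when the correlation meets the
    target condition (modeled here as correlation >= threshold), and
    lower it (clamped at zero) otherwise."""
    if correlation >= threshold:
        return importance + step
    return max(0.0, importance - step)
```

The same helper would apply to the text-content adjustment driven by the third correlation with scene information.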
In another possible implementation manner, the image content information includes text content, and the determining the importance parameter of the target image according to the image content information includes:
determining a second importance parameter corresponding to the text content according to a second position of the text content in the target image and the text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with the distance between the second position and the second target position of the target image and is in positive correlation with the text area;
the second importance parameter is determined as the importance parameter of the target image.
In another possible implementation manner, the editing type of the text content includes a human editing type and a scene shooting type, and the second importance parameter corresponding to the text content of the human editing type is greater than the second importance parameter corresponding to the text content of the scene shooting type.
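One way to combine the three signals for text content: distance to the second target position (assumed here to be the image center), text area, and editing type. The averaging, normalization, and the 0.5 weight for scene-shot text are purely illustrative choices.

```python
import math

def second_importance(text_center, text_area, image_size,
                      edit_type="human_edited"):
    """Second importance parameter for text content: decreases with the
    distance from the text to the image center, increases with the text
    area, and is weighted higher for human-edited text than for
    scene-shot text."""
    w, h = image_size
    cx, cy = w / 2.0, h / 2.0
    dist = math.hypot(text_center[0] - cx, text_center[1] - cy)
    position_score = 1.0 - dist / math.hypot(cx, cy)   # closer -> higher
    area_score = text_area / float(w * h)              # larger -> higher
    type_weight = 1.0 if edit_type == "human_edited" else 0.5
    return type_weight * (position_score + area_score) / 2.0
```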
In another possible implementation manner, the image content information may further include scene information of the target image, and before the target image is added to the images under the index corresponding to the image content information according to the importance parameter, the image processing method further includes:
determining a third correlation between the text content and the scene information;
in response to that the third relevance meets a third target condition, increasing the importance parameter corresponding to the text content;
and in response to that the third relevance does not meet the third target condition, reducing the importance parameter corresponding to the text content.
In another possible implementation manner, the adding the target image to the images under the index corresponding to the image content information according to the importance parameter includes:
comparing the importance parameter of the target image with the importance parameters of the plurality of images to determine the ranking position of the target image among the plurality of images;
adding the target image at the ranking position.
In another possible implementation manner, the image processing method further includes:
and in response to receiving a retrieval instruction in which the included search term hits any index, outputting images corresponding to the retrieval instruction from the plurality of images under the hit index, in descending order of the importance parameters of the plurality of images.
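The ranking-position insertion and the descending-order output can be sketched together. The `bisect`-based insertion and the tuple representation are illustrative choices, not the patent's implementation:

```python
import bisect

def add_to_index(entries, image_id, importance):
    """Compare the target image's importance parameter against those of
    the images already under the index, and insert it at the ranking
    position that keeps the list in descending order of importance."""
    keys = [-imp for _, imp in entries]   # negate: bisect expects ascending
    pos = bisect.bisect_right(keys, -importance)
    entries.insert(pos, (image_id, importance))

def retrieve(index, search_term, limit=10):
    """When the search term hits an index, output its images; they are
    already stored in descending order of importance parameter."""
    return [img for img, _ in index.get(search_term, [])[:limit]]
```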
FIG. 3 is a flow diagram illustrating an image processing method according to an exemplary embodiment. Referring to fig. 3, the image processing method includes the following steps.
In step S301, the server acquires a target image.
The target image is the image for which an index needs to be constructed. The server may maintain an image database from which the target image is obtained. The server can acquire any image in the image database as a target image; the server can also obtain a newly added image without an index in the image database as a target image.
In another embodiment of the present disclosure, the target image may also be an image extracted from a video, the target image being used to represent the video content of the video. For example, the server provides a service for a video sharing application program, a user can upload a video to the server through the video sharing application program, and the server can extract a target image from the video uploaded by the user.
The server may extract an image of any frame from the video as the target image; for example, the server may extract the cover image of the video and determine the cover image as the target image, or randomly extract an image of a certain frame from the video as the target image. The server may also extract images of any multiple frames from the video as target images; for example, the server may acquire images of multiple frames from the video according to a time period, or randomly acquire images of multiple frames from the video as target images.
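As a concrete illustration of the two multi-frame strategies (period-based and random sampling), the helper below computes which frame indices to extract; the function name and parameters are hypothetical, not from the disclosure:

```python
import random

def sample_frame_indices(total_frames, fps, period_s=None, count=None, seed=None):
    """Return frame indices to extract from a video (illustrative sketch).

    period_s -> one frame every `period_s` seconds (period-based strategy);
    count    -> `count` frames chosen at random (random strategy).
    """
    if period_s is not None:
        step = max(1, int(round(fps * period_s)))  # frames per time period
        return list(range(0, total_frames, step))
    rng = random.Random(seed)                      # seeded for reproducibility
    return sorted(rng.sample(range(total_frames), count))
```

For a 10-second clip at 30 fps, `sample_frame_indices(300, 30, period_s=2.0)` yields `[0, 60, 120, 180, 240]`.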
The server may acquire one target image and process it, or acquire a plurality of target images and process each of them. Steps S302 to S305 below describe the processing of one target image by the server as an example.
In step S302, the server identifies the target image to obtain image content information of the target image, where the image content information includes text content of the target image and the object keyword.
The image content information is used for reflecting the image content expressed by the target image. The object keyword included in the image content information is used to indicate a category of the object included in the target image, and for example, the object keyword may include any one of categories such as a person, a cat, or a dog. If the image content information includes an object keyword, the image content information may include one or more object keywords, each corresponding to an object included in the target image. In addition, the image content information may further include scene information representing a scene photographed by the target image.
In a possible implementation manner, the image content information includes an object keyword of the target image, and the step of identifying the target image by the server to obtain the image content information of the target image may be: the server inputs the target image into at least one object classification model, and determines object keywords of the target image according to the output of each object classification model.
The object classification model can be a multi-classification model, and the object classification model is used for allocating a class label to an object from a plurality of class labels based on the identification of the object contained in the target image and outputting the class label; the server may determine the class label output by the object classification model as the object keyword. For example, the plurality of category labels corresponding to the object classification model may include a cat and a dog, and if the object included in the target image is a cat, the object classification model may output the category label of the target image as a cat.
The object classification model can also be a binary classification model, and the object classification model is used for determining and outputting whether an object contained in the target image is a class identified by the object classification model; if the output of the object classification model is used to indicate that the object included in the target image is the class identified by the object classification model, the server may determine the class identified by the object classification model as the object keyword. For example, if the category identified by the object classification model is a person and the object included in the target image is a person, the object classification model may output a result indicating that the person is included in the target image, and the server may determine the person as the object keyword.
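The two model types can be combined into one keyword-extraction step, sketched below; the dict/tuple output formats and the function name are assumptions made for illustration:

```python
def object_keywords(multiclass_out=None, binary_outs=None):
    """Derive object keywords from classifier outputs (illustrative sketch).

    multiclass_out: {label: score} dict from a multi-class model that
                    assigns one label out of many to the object.
    binary_outs:    iterable of (category, is_present) pairs, one per
                    binary model that recognizes a single category.
    """
    keywords = []
    if multiclass_out:
        # The multi-class model outputs one class label; use the
        # highest-scoring label as the object keyword.
        keywords.append(max(multiclass_out, key=multiclass_out.get))
    for category, is_present in (binary_outs or []):
        # A binary model only indicates whether its own category is present.
        if is_present:
            keywords.append(category)
    return keywords
```

So a multi-class score of {"cat": 0.9, "dog": 0.1} plus a positive "person" binary model yields the keywords ["cat", "person"].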
Further, if the category indicated by the object keyword meets the fourth target condition, the server may further identify the target image, and further determine the object keyword based on the identification result. The fourth target condition may be that the category indicated by the object keyword is capable of category subdivision, for example, the fourth target condition may be that the category indicated by the object keyword is a person.
If the fourth target condition is that the category indicated by the object keyword is a person and the category indicated by the object keyword meets the fourth target condition, the server further identifies the target image, and the step of determining the object keyword may be: the server extracts a first face image of a person contained in the target image; comparing the first face image with face images in a target face library through a face recognition model, wherein the target face library stores face images of persons belonging to a target category; and in response to the matching of the first face image and a second face image in the target face library, determining that the person contained in the target image belongs to a target class, and determining the target class as an object keyword corresponding to the person contained in the target image. The target category may be a category to which a person having an influence in a certain field belongs, for example, the target category may be a singer, an actor, or an official speaker.
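A minimal sketch of the library comparison, assuming each target-face-library entry pairs a target category with a face embedding and using cosine similarity with an illustrative threshold (the patent does not fix the metric or threshold):

```python
import math

def match_face(query_emb, face_library, threshold=0.8):
    """Compare a first face image (as an embedding) against a target face
    library of (category, embedding) entries; a match means the person
    belongs to that target category. Names/threshold are illustrative."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    best = max(face_library, key=lambda entry: cosine(query_emb, entry[1]))
    if cosine(query_emb, best[1]) >= threshold:
        return best[0]  # e.g. "singer" becomes an object keyword
    return None         # no second face image matched
```

If the best library match clears the threshold, its category (for example "singer") is returned as the object keyword; otherwise the function reports no match.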
It should be noted that the server may determine the target category as one of the object keywords of the target image. For example, if the server determines that the object included in the target image is a person and the target category to which the person belongs is a singer, the object keywords of the target image may include the person and the singer. The server may also update the target category to be the object keyword corresponding to the person included in the target image. For example, if the server determines that the object included in the target image is a person and the target category to which the person belongs is a singer, the object keyword of the target image may include the singer.
Further, if the first face image is matched with a second face image in the target face library, the server may further obtain basic information of a person corresponding to the second face image, and determine the basic information as an object keyword corresponding to the person included in the target image. For example, the basic information of the person may include a person name, and the server may determine the person name as an object keyword corresponding to the person included in the target image. For example, if the server determines that the object included in the target image is a person and the target category to which the person belongs is a singer, and the basic information of the person includes the person name small a, the object keywords of the target image may include the singer and the small a.
If the first face image does not match any second face image in the target face library, the server may also identify the first face image through a face classification model to determine the object keyword corresponding to the person included in the target image. Correspondingly, in this case the step of determining the object keyword by the server may be: in response to the first face image not matching a second face image in the target face library, the server inputs the first face image into the face classification model to obtain an image label corresponding to the person included in the target image, and determines the image label as the object keyword corresponding to that person. The image label is used to represent the image state of the person included in the target image; for example, the image label may be "beautiful woman" or "handsome man".
Further, the server may identify the age of the person included in the target image through an age identification model based on the first face image, and determine the age as an object keyword corresponding to that person. The server may likewise identify the gender of the person included in the target image through a gender identification model based on the first face image, and determine the gender as an object keyword corresponding to that person.
It should be noted that, if the fourth target condition is that the category indicated by the object keyword is a person, and the category indicated by the object keyword meets the fourth target condition, the server further identifies the target image, and the step of determining the object keyword may further be: the server extracts a human body image of a person contained in the target image; inputting the human body image into a posture recognition model to obtain the posture information of a person contained in the target image; and determining the posture information as an object keyword corresponding to the person contained in the target image.
The server may further determine, based on the identification of the object included in the target image, a first position of the object included in the target image, an area of the object occupied in the target image, or an aspect ratio of a region in which the object included in the target image is located; and one or more items of the first position, the object area and the aspect ratio of the area where the object contained in the target image is located are taken as the image content information.
In another possible implementation manner, the image content information includes text content of the target image, and the server may recognize the target image by using an OCR (Optical Character Recognition) technique to obtain the text content of the target image.
It should be noted that, while identifying the target image, the server may also determine the editing type of the text content, where the editing type includes a human editing type and a scene shooting type. Text content added to the image after it was shot has the human editing type; text content that already existed in the shot scene when the image was captured has the scene shooting type.
The step of the server determining the editing type of the text content may be: in response to the resolution of the region where the text content is located differing from the resolution of other regions in the target image, the server determines that the editing type of the text content is the human editing type; in response to the resolution of the region where the text content is located being the same as the resolution of other regions in the target image, the server determines that the editing type of the text content is the scene shooting type.
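The resolution test reduces to a comparison of two estimates, as in this toy sketch (the function name, inputs, and tolerance are assumptions; how the effective resolutions are estimated, e.g. by local frequency analysis, is left open here as it is in the text):

```python
def editing_type(text_region_resolution, image_resolution, tolerance=1e-6):
    """Classify text content by comparing the effective resolution of the
    text region against that of the rest of the target image."""
    if abs(text_region_resolution - image_resolution) > tolerance:
        # Overlaid text was rendered at a different resolution than the photo.
        return "human_editing"
    # Text was part of the photographed scene itself.
    return "scene_shooting"
```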
The server may further determine, based on the recognized text content, one or more of a second position of the text content in the target image, a text area occupied by the text content in the target image, an aspect ratio of a region where the text content is located, and a character size of a character included in the text content, and use one or more of the second position, the text area, the aspect ratio of the region where the text content is located, and the character size of the character included in the text content as the image content information.
In the embodiment of the disclosure, through the identification of the target image, richer image content information is mined, so that the image content information reflects the image content expressed by the target image more comprehensively, and the reality of the image content information reflecting the target image is improved.
In step S303, the server determines an importance parameter of the target image based on the object keyword, the importance parameter indicating a degree of correlation between the object keyword and the target image.
The object keyword represents an object included in the target image; the more prominently the object represented by the object keyword is highlighted in the target image, the higher the degree of correlation between the object keyword and the target image, and the higher the corresponding importance parameter.
The target image may correspond to one object keyword, and the server may determine the importance parameter of the target image according to that object keyword. For example, if the object included in the target image is a cat, the object keyword of the target image is "cat", and the server may determine the importance parameter of the target image according to "cat". The target image may also correspond to a plurality of object keywords, and the server may determine the importance parameter of the target image in the dimension of each object keyword. For example, if the objects included in the target image include the singer little A and a Shiba Inu dog, the object keywords of the target image may include "little A" and "Shiba Inu", and the importance parameters of the target image include an importance parameter corresponding to "little A" and an importance parameter corresponding to "Shiba Inu".
The server may determine the importance parameter of the target image according to whether the category indicated by the object keyword is the target category. In a possible implementation manner, if the category indicated by the object keyword is a target category, the step of determining, by the server, the importance parameter of the target image according to the image content information may be: and the server determines the importance parameter corresponding to the target category as the importance parameter of the target image in response to the fact that the category indicated by the object keyword is the target category. Wherein, the importance parameter corresponding to the target category is the importance parameter of the target image in the dimension of the object keyword.
The target category is a fine-grained category to which the object contained in the target image belongs, and the importance parameter corresponding to the target category is larger than the importance parameters corresponding to other categories. The target category may be a category of persons having influence in a certain field; the importance parameter corresponding to the target category is larger than the importance parameters corresponding to other categories, and may also be the largest of the importance parameters corresponding to each item of image content information.
For example, the target category may be "celebrity", the importance parameter corresponding to the target category may be 9, and the importance parameter corresponding to other categories such as "beautiful woman" or "handsome man" may be 7. Because the target category is obtained by identifying the object contained in the target image, it can truly reflect the image content expressed by the target image; if the image content information of the target image also includes text content, the importance parameter corresponding to the target category may be larger than the importance parameter corresponding to the text content.
It should be noted that the server may also perform importance ranking according to the object keywords, determine the importance levels of the object keywords, and then determine the importance parameters corresponding to the importance levels as the importance parameters of the target images. For example, the importance level division may include a most important level, a more important level, and a general importance level, and the server determines that the importance level of the target category is the most important level in response to the category indicated by the object keyword being the target category; and determining the importance parameter corresponding to the most important level as the importance parameter of the target image.
In the embodiment of the disclosure, when the category indicated by the object keyword is the target category, the server may determine the importance parameter corresponding to the target category as the importance parameter of the target image, and the importance parameter corresponding to the target category is greater than the importance parameters corresponding to other categories, so that when the search term hits the index corresponding to the object keyword, the image with the greater importance parameter can be preferentially output, and the accuracy of image search is improved.
In another possible implementation manner, if the category indicated by the object keyword is not the target category, the server may determine the importance parameter according to the position of the object corresponding to the object keyword. Correspondingly, the step of determining, by the server, the importance parameter of the target image according to the image content information may be: the server responds to the fact that the category indicated by the object keyword is not the target category, and determines a first importance parameter corresponding to the object keyword according to a first position of an object corresponding to the object keyword in a target image, wherein the first importance parameter is in negative correlation with the distance between the first position and the first target position in the target image; the first importance parameter is determined as the importance parameter of the target image. Wherein, the first importance parameter is the importance parameter of the target image in the dimension of the object keyword.
The first target position is a position for highlighting the subject in the target image, for example, the first target position may be a center position of the target image. The first position of the object corresponding to the object keyword in the target image may be any position of the area where the object is located, for example, the first position may be a center position of the object.
The first importance parameter is negatively correlated with the distance between the first position and the first target position, that is, the smaller the distance between the first position and the first target position is, the closer the object is to the center of the target image, the larger the importance parameter corresponding to the object is; the greater the distance between the first position and the first target position, the farther the object is from the center of the target image, and the smaller the importance parameter corresponding to the object.
It should be noted that the server may further determine, by combining with the object area of the area where the object corresponding to the object keyword is located, the first importance parameter corresponding to the object keyword, where the first importance parameter is positively correlated with the object area, and is negatively correlated with the distance between the first position and the first target position.
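A possible scoring function consistent with these two correlations (negative with the distance between the first position and the first target position, positive with the object area); the weights and normalization below are assumptions, not from the disclosure:

```python
import math

def first_importance(obj_center, target_center, obj_area, image_area,
                     w_dist=0.5, w_area=0.5):
    """First importance parameter for one object keyword (illustrative).

    obj_center:    first position of the object (e.g. its center).
    target_center: first target position, assumed to be the image center
                   at (width/2, height/2).
    """
    dist = math.hypot(obj_center[0] - target_center[0],
                      obj_center[1] - target_center[1])
    # Normalize by half the image diagonal so closeness lies in [0, 1];
    # closeness falls as the object moves away from the first target position.
    half_diag = math.hypot(target_center[0], target_center[1])
    closeness = 1.0 - min(dist / half_diag, 1.0)
    # Positively correlated with the object's area in the target image.
    area_ratio = obj_area / image_area
    return w_dist * closeness + w_area * area_ratio
```

A large, centered object thus scores higher than a small, off-center one, matching the stated correlations.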
The server may determine the importance level of the object corresponding to the object keyword according to the distance between the first position and the first target position, and then determine the importance parameter corresponding to the importance level as the first importance parameter.
In this embodiment of the disclosure, when the category indicated by the object keyword is not the target category, the server may further determine, according to a distance between the first position where the object corresponding to the object keyword is located and the first target position in the target image, an importance parameter negatively related to the distance, so that the importance parameter corresponding to the object with the smaller distance from the first target position is larger, that is, the importance parameter corresponding to the object located at the main body position of the target image is larger, and when the search term hits the index corresponding to the object keyword, the image with the larger degree of correlation between the image content and the search term can be preferentially output, thereby improving the accuracy of image retrieval.
Another point to be described is that, if the image content information includes the text content of the target image and the object keyword, after determining the importance parameter of the target image according to the object keyword, the server may further adjust the importance parameter corresponding to the object keyword according to the correlation between the text content and the object keyword. Correspondingly, the step of the server adjusting the importance parameter corresponding to the object keyword may be: the server determines a first correlation between the text content and the object keyword; in response to the first correlation meeting the first target condition, increasing the importance parameter corresponding to the object keyword; and reducing the importance parameter corresponding to the object keyword in response to the first correlation not meeting the first target condition.
Wherein the first correlation is used for representing the degree of relevance between the text content and the object keyword. The first target condition may be a predetermined condition; for example, the first target condition may be that the first correlation is greater than a first threshold. The first target condition may correspond to an adjustment coefficient: the server increases the importance parameter corresponding to the object keyword based on the adjustment coefficient in response to the first correlation meeting the first target condition, and reduces the importance parameter based on the adjustment coefficient in response to the first correlation not meeting the first target condition. For example, the server may add the adjustment coefficient to the original importance parameter corresponding to the object keyword to obtain the increased importance parameter; the server may also multiply the original importance parameter corresponding to the object keyword by the adjustment coefficient to obtain the increased importance parameter.
If the importance parameter is obtained based on an importance level, the first target condition may instead correspond to level adjustment information indicating that the corresponding importance level is to be increased or decreased; after adjusting the importance level based on the level adjustment information, the server updates the importance parameter corresponding to the adjusted importance level to be the importance parameter corresponding to the object keyword. For example, if the importance level corresponding to the object keyword is the general importance level, the server, in response to the first correlation meeting the first target condition, obtains level adjustment information corresponding to the first target condition, where the level adjustment information indicates increasing the importance level by one level; adjusts the importance level corresponding to the object keyword to the more important level according to the level adjustment information; and updates the importance parameter corresponding to the more important level to be the importance parameter corresponding to the object keyword.
The step of the server determining the first correlation between the text content and the object keyword may be: the server acquires a feature vector of text content and a feature vector of an object keyword; determining the distance between the feature vector of the text content and the feature vector of the object keyword; the distance between the feature vector of the text content and the feature vector of the object keyword is determined as a first correlation. The server may extract feature vectors of text content and feature vectors of object keywords through word2vec (a group of relevant models used to generate word vectors), that is, extract embedded features of text content and embedded features of object keywords.
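The correlation-and-adjustment procedure can be sketched as below, using cosine similarity between word2vec-style embeddings as a stand-in for the vector distance described above; the threshold and multiplicative coefficient are illustrative assumptions:

```python
import math

def adjust_importance(importance, text_vec, keyword_vec,
                      threshold=0.5, coef=1.2):
    """Adjust an object keyword's importance parameter by the first
    correlation between the text-content embedding and the keyword
    embedding (illustrative sketch; threshold/coef are assumptions)."""
    dot = sum(a * b for a, b in zip(text_vec, keyword_vec))
    norm = (math.sqrt(sum(a * a for a in text_vec))
            * math.sqrt(sum(b * b for b in keyword_vec)))
    correlation = dot / norm  # first correlation as cosine similarity
    if correlation > threshold:
        return importance * coef  # first target condition met: increase
    return importance / coef      # otherwise: reduce
```

Identical embeddings give correlation 1.0 and scale the importance up; orthogonal embeddings fail the condition and scale it down.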
In the embodiment of the present disclosure, the server may further adjust the importance parameter corresponding to the object keyword according to the first correlation between the text content and the object keyword, and when the first correlation meets the first target condition, that is, the correlation degree between the text content and the object keyword is higher, the importance parameter corresponding to the object keyword is increased, and further when the search term hits the index corresponding to the object keyword, an image with higher correlation degree between the text content and the object keyword can be preferentially output, that is, an image with higher correlation degree between the text content and the object contained in the target image is preferentially output, so that the accuracy of image retrieval is improved.
Another point to be noted is that, after determining the importance parameter of the target image according to the object keyword, the server may also adjust the importance parameter corresponding to the object keyword according to the correlation between the object keyword and the image feature of the target image. Correspondingly, the step of the server adjusting the importance parameter corresponding to the object keyword may be: the server determines a second correlation between the object keyword and the image feature of the target image; in response to the second correlation meeting the second target condition, increasing the importance parameter corresponding to the object keyword; and reducing the importance parameter corresponding to the object keyword in response to the second correlation not meeting the second target condition. The image features of the target image may be image embedding features, and the process of adjusting the importance parameters corresponding to the object keywords by the server according to the correlation between the object keywords and the image features of the target image is the same as the process of adjusting the importance parameters corresponding to the object keywords by the server according to the correlation between the text content and the object keywords.
In the embodiment of the present disclosure, the server may further adjust the importance parameter corresponding to the object keyword according to a second correlation between the object keyword and the image feature of the target image, and when the second correlation meets a second target condition, that is, the correlation degree between the object keyword and the target image is higher, the importance parameter corresponding to the object keyword is increased, so that when the search term hits the index corresponding to the object keyword, the image with the higher correlation degree between the object keyword and the target image is preferentially output, and the accuracy of image search is improved.
In step S304, the server determines an importance parameter of the target image according to the text content, wherein the importance parameter is used for indicating the degree of correlation between the text content and the target image.
The text content is an explanation of the target image; the more the text content is emphasized in the target image, the higher the degree of correlation between the text content and the target image, and the higher the corresponding importance parameter.
The server can determine the importance parameter of the target image according to the position of the text content and the text area. Correspondingly, the steps can be as follows: the server determines a second importance parameter corresponding to the text content according to a second position of the text content in the target image and the text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with the distance between the second position and the second target position of the target image and is in positive correlation with the text area; the second importance parameter is determined as the importance parameter of the target image.
The second target position is a position in the target image for highlighting the text content, for example, the second target position may be a center position of the target image, and the second target position may also be a center position of a top area of the target image.
In the embodiment of the disclosure, the second importance parameter corresponding to the text content is negatively correlated with the distance between the second position and the second target position, and is positively correlated with the text area, that is, if the text area of the text content is larger, the closer the second position is to the second target position, the more the second position can represent the theme of the target image, the larger the second importance parameter corresponding to the text content is; if the text area of the text content is smaller, the second position is not close to the second target position, and the text content is a supplementary description of the subject of the target image, the second importance parameter corresponding to the text content is smaller, so that the second importance parameter can represent the correlation degree between the text content and the target image, and further when the search word hits the index corresponding to the text content, the image with the higher correlation degree between the text content and the target image can be preferentially output, and the accuracy of image search is improved.
It should be noted that the editing type of the text content includes a human editing type and a scene shooting type, and the second importance parameter corresponding to the text content of the human editing type is greater than the second importance parameter corresponding to the text content of the scene shooting type.
In the embodiment of the disclosure, because text content of the human editing type is used for highlighting the theme of the target image, the second importance parameter corresponding to text content of the human editing type is greater than the second importance parameter corresponding to text content of the scene shooting type; when the search word hits the index corresponding to the text content, the target image whose theme matches the search word can be preferentially output, improving the accuracy of image retrieval.
Another point to be noted is that the server may further determine the second importance parameter corresponding to the text content in combination with the size of the characters included in the text content, where the second importance parameter is positively correlated with the character size, and the character size may be the area of the character region to which any character in the text content belongs. The larger the character size, the more the text content needs to be emphasized and the larger the second importance parameter corresponding to the text content; text content with a higher second importance parameter can be preferentially output during searching, improving the accuracy of image search.
Another point to be noted is that, when the text content consists of a single row or column of characters, the server may further determine the second importance parameter corresponding to the text content in combination with the aspect ratio of the region where the text content is located, where the second importance parameter is negatively correlated with that aspect ratio. The aspect ratio of the region indicates the number of characters in the text content: a larger aspect ratio, that is, a larger number of characters, suggests that the text content is a supplementary description of the target image, so its second importance parameter is smaller. If such supplementary text content matches the search term during image search, the corresponding target image is ranked later in the output, which improves the accuracy of image search.
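The two refinements above, character size and single-line aspect ratio, can be modeled as adjustments to a base second importance parameter. The coefficients below are illustrative assumptions; the embodiment only fixes the directions of the correlations.

```python
def adjust_for_layout(base, char_area, aspect_ratio,
                      size_weight=0.001, ratio_weight=0.1):
    """Illustrative refinement of a base second importance parameter:
    positively correlated with character size (area of a character region),
    negatively correlated with the aspect ratio of a single-line text
    region. The weight values are assumptions."""
    score = base + size_weight * char_area   # larger characters -> larger parameter
    score -= ratio_weight * aspect_ratio     # longer single line -> smaller parameter
    return score
```

With these assumed weights, bigger characters raise the score and a long single line of many characters lowers it, as the text requires.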
It should be noted that the image content information may further include scene information of the target image, where the scene information represents the scene captured in the target image; for example, the scene information may be a concert, a zoo, or a basketball court. After determining the importance parameter of the target image according to the text content, the server may adjust the importance parameter corresponding to the text content according to the correlation between the text content and the scene information. Correspondingly, the step of adjusting the importance parameter corresponding to the text content may be: the server determines a third correlation between the text content and the scene information; in response to the third correlation meeting a third target condition, the server increases the importance parameter corresponding to the text content; and in response to the third correlation not meeting the third target condition, the server reduces the importance parameter corresponding to the text content.
The process by which the server adjusts the importance parameter corresponding to the text content according to the correlation between the text content and the scene information is the same as the process by which it adjusts the importance parameter corresponding to the object keyword according to the correlation between the text content and the object keyword. For example, if the text content of the target image is "basketball shooting" and the scene information of the target image is a basketball court, the third correlation meets the third target condition, and the importance parameter corresponding to the text content is increased.
In the embodiment of the disclosure, the server may further adjust the importance parameter of the text content according to the correlation between the text content and the scene information, increasing the importance parameter corresponding to the text content when that correlation is high, so that when a search term hits the index corresponding to the text content, target images whose text content is highly correlated with the scene information are output preferentially, improving the accuracy of image search.
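A minimal sketch of this threshold-style adjustment follows. Modeling the third target condition as the correlation exceeding a fixed threshold, and the increase/decrease as a fixed step, are both assumptions; the embodiment specifies only increase-on-met, decrease-on-not-met.

```python
def adjust_by_scene_relevance(param, relevance, threshold=0.5, step=1.0):
    """Illustrative adjustment: if the third correlation meets the target
    condition (assumed here to mean exceeding a threshold), raise the
    importance parameter; otherwise lower it."""
    return param + step if relevance >= threshold else param - step
```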
Another point to be explained is that the image content information may include both the text content and the object keywords of the target image; the image content information may also include only the object keywords of the target image, in which case the server may directly perform step S305 after performing step S303; or the image content information may include only the text content of the target image, in which case the server may directly perform step S304 after performing step S302.
Another point to be noted is that, when the image content information includes both the text content and the object keywords of the target image, steps S303 and S304 are not subject to a strict execution order: step S303 may be executed before step S304, step S304 may be executed before step S303, or the two steps may be executed simultaneously. The embodiment of the present disclosure does not limit the execution order of steps S303 and S304.
In step S305, the server adds the target image to the images of the index corresponding to the image content information according to the importance parameter.
The index is used to point to a plurality of images corresponding to the index. For example, if the index is "basketball", it may correspond to a plurality of images whose object keywords include "basketball" or whose text content includes "basketball". Each image may correspond to a plurality of indexes; for example, if the object keywords of the target image include "small A" and "basketball", the images pointed to by the index corresponding to "small A" include the target image, and the images pointed to by the index corresponding to "basketball" may also include the target image. The target image can be found based on any index corresponding to it.
In a possible implementation manner, the target image is a newly added image without an index, and the server may compare the importance parameter of the target image with the importance parameters of the plurality of images pointed to by the index corresponding to the image content information of the target image, and add the target image to the images of that index. Correspondingly, if the index corresponding to the image content information corresponds to a plurality of images, the step of adding the target image according to the importance parameter may be: the server compares the importance parameter of the target image with the importance parameters of the plurality of images, determines the ranking position of the target image among the plurality of images, and adds the target image at that ranking position. The images before the determined ranking position have importance parameters larger than that of the target image, and the images after the ranking position have importance parameters smaller than that of the target image.
For example, if the object keyword of the target image includes "basketball", the importance parameter of the target image in the dimension of "basketball" is 8, and the plurality of images pointed by the index corresponding to "basketball" may include image a, image b and image c, where the importance parameter of image a is 9, the importance parameter of image b is 6, and the importance parameter of image c is 4, the ranking position of the target image may be between image a and image b.
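The ranking-position computation in this example can be sketched with a binary search over an index kept in descending order of importance parameter. The tuple layout and function name are illustrative, not taken from the disclosure.

```python
import bisect

def insert_by_importance(index_entries, image_id, importance):
    """Insert an image into index_entries, a list of (importance, image_id)
    pairs kept in descending order of importance, at its ranking position.
    bisect works on ascending sequences, so we search on negated keys."""
    keys = [-imp for imp, _ in index_entries]
    pos = bisect.bisect_left(keys, -importance)
    index_entries.insert(pos, (importance, image_id))
    return pos
```

With the entries from the example (9, 6, 4) and a target importance of 8, the target image lands between image a and image b.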
In another possible implementation manner, the target image is any image in the image database, and for any index the server may sort the plurality of images pointed to by that index according to their importance parameters, and add the images to the index in descending order of importance parameter.
It should be noted that, before the server sorts the images pointed to by any index according to their importance parameters, the server needs to establish a correspondence between the index and the images: the server may determine the image content information as the index, and establish a correspondence between the index and the images matching that image content information.
Another point to be noted is that the server may add the target image to the images pointed to by the index corresponding to the image content information according to the importance parameter, or may add the image identifier of the target image to the image identifiers pointed to by that index according to the importance parameter.
It should be noted that, when adding the target image to the images of the index corresponding to the image content information, the server may store the importance parameter of the target image and the correspondence between the target image and the index without sorting the images in descending order of importance parameter. When the server receives a retrieval instruction and the index is hit by a search term included in that retrieval instruction, the server may sort the plurality of images corresponding to the index in descending order of importance parameter, and output the images corresponding to the retrieval instruction in that order.
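The deferred-sorting variant can be sketched as follows, where the index stores unsorted (image, importance) pairs and ordering happens only when a search term hits the index. The table layout is an assumption made for illustration.

```python
def retrieve(index_table, term):
    """Illustrative query-time sort: index_table maps an index term to
    unsorted (image_id, importance) pairs; images are ordered by descending
    importance parameter only when the term is hit."""
    hits = index_table.get(term, [])
    return [img for img, _ in sorted(hits, key=lambda p: p[1], reverse=True)]
```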
In another embodiment of the present disclosure, if the target image is any frame extracted from a video, the server may add the video corresponding to the target image to the videos pointed to by the index corresponding to the image content information according to the importance parameter of the target image. If the server extracts multiple frames from the video as target images, the server may determine the importance parameter of the video according to the importance parameters of those target images, and add the video to the videos pointed to by the index according to the importance parameter of the video.
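A sketch of deriving a video-level importance parameter from the per-frame parameters follows. The text does not fix the aggregation rule, so both a maximum and a mean are shown as assumptions.

```python
def video_importance(frame_importances, reduce="max"):
    """Illustrative aggregation of per-frame importance parameters into a
    video-level parameter. Whether to take the maximum or the mean is not
    specified by the disclosure; both are offered as assumptions."""
    if reduce == "max":
        return max(frame_importances)
    return sum(frame_importances) / len(frame_importances)
```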
The server may be configured to, in response to receiving a retrieval instruction whose search term hits any index, output the images corresponding to the retrieval instruction from among the plurality of images corresponding to the hit index, in descending order of the importance parameters of those images. The retrieval instruction may be sent by a terminal triggered when a user performs a search operation, in which case the search term included in the retrieval instruction may be a search term input by the user. The retrieval instruction may also be sent by a terminal triggered when the user performs a recommendation operation, in which case the search term may represent an image category in which the user is interested. The retrieval instruction may also be sent by a server with a recommendation and ranking function, in which case the retrieval instruction includes a search term representing an image category in which the user is interested.
In the embodiment of the disclosure, the importance parameter is used for indicating the degree of correlation between the image content information and the target image, and the images are output according to the descending order of the importance parameter, so that the image with higher degree of correlation between the image content information and the target image can be preferentially output, and the accuracy of image retrieval is improved.
In the embodiment of the disclosure, image content information including at least one of text content and object keywords is obtained by identifying a target image, the image content information is fully utilized to determine an importance parameter of the target image in a dimension of any image content information, and the target image is added to an image pointed by an index corresponding to the image content information according to the importance parameter, so that the images pointed by the index are arranged according to the degree of correlation between the image content information and the target image, and further, when the index is hit, the image with a higher degree of correlation with the index can be preferentially output, and the accuracy of image retrieval is improved.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
Fig. 4 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment. Referring to fig. 4, the image processing apparatus includes an acquisition unit 401, a recognition unit 402, a first determination unit 403, and an addition unit 404.
An acquisition unit 401 configured to perform acquisition of a target image;
the recognition unit 402 is configured to perform recognition on the target image, and obtain image content information of the target image, wherein the image content information comprises at least one of text content and object keywords of the target image;
a first determination unit 403 configured to perform determining an importance parameter of the target image according to the image content information, the importance parameter being used for indicating a degree of correlation of the image content information with the target image;
an adding unit 404 configured to perform adding the target image to the image of the index corresponding to the image content information according to the importance parameter.
In a possible implementation manner, the image content information includes an object keyword, and the first determining unit 403 is configured to perform, in response to that the category indicated by the object keyword is the target category, determining the importance parameter corresponding to the target category as the importance parameter of the target image.
In another possible implementation manner, the image content information includes an object keyword, and the first determining unit 403 is configured to perform:
in response to the fact that the category indicated by the object keyword is not the target category, determining a first importance parameter corresponding to the object keyword according to a first position of an object corresponding to the object keyword in the target image, wherein the first importance parameter is in negative correlation with the distance between the first position and the first target position in the target image;
the first importance parameter is determined as the importance parameter of the target image.
In another possible implementation manner, the image processing apparatus further includes:
a second determination unit configured to perform determination of a first correlation between the text content and the object keyword;
a first increasing unit configured to perform increasing an importance parameter corresponding to the object keyword in response to the first correlation meeting the first target condition;
and the first reducing unit is configured to reduce the importance parameter corresponding to the object keyword in response to the first correlation not meeting the first target condition.
In another possible implementation manner, the image processing apparatus further includes:
a third determination unit configured to perform determination of a second correlation between the object keyword and the image feature of the target image;
a second increasing unit configured to perform increasing the importance parameter corresponding to the object keyword in response to the second relevance meeting the second target condition;
and the second reducing unit is configured to reduce the importance parameter corresponding to the object keyword in response to the second relevance not meeting the second target condition.
In another possible implementation manner, the image content information includes text content, and the first determining unit 403 is configured to perform:
determining a second importance parameter corresponding to the text content according to a second position of the text content in the target image and the text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with the distance between the second position and the second target position of the target image and is in positive correlation with the text area;
the second importance parameter is determined as the importance parameter of the target image.
In another possible implementation manner, the editing type of the text content includes a human editing type and a scene shooting type, and the second importance parameter corresponding to the text content of the human editing type is greater than the second importance parameter corresponding to the text content of the scene shooting type.
In another possible implementation manner, the image processing apparatus further includes:
a fourth determination unit configured to perform determining a third correlation between the text content and the scene information;
a third increasing unit configured to perform increasing the importance parameter corresponding to the text content in response to the third relevance meeting the third target condition;
and the third reducing unit is configured to reduce the importance parameter corresponding to the text content in response to the third relevance not meeting the third target condition.
In another possible implementation manner, the index corresponding to the image content information corresponds to a plurality of images, and the adding unit 404 is configured to perform:
respectively comparing the importance parameters of the target image with the importance parameters of the plurality of images, and determining the sequencing positions of the target image in the plurality of images;
the target image is added to the sort position.
In another possible implementation manner, the image processing apparatus further includes:
and the output unit is configured to, in response to receiving a retrieval instruction in which an included search term hits any index, output the images corresponding to the retrieval instruction from among the plurality of images corresponding to the hit index in descending order of the importance parameters of those images.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
It should be noted that: in the image processing apparatus provided in the above embodiment, only the division of the above functional modules is taken as an example when performing image processing, and in practical applications, the above functions may be distributed by different functional modules as needed, that is, the internal structure of the server is divided into different functional modules to complete all or part of the above described functions. In addition, the image processing apparatus and the image processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
In the embodiment of the disclosure, image content information including at least one of text content and object keywords is obtained by identifying a target image, the image content information is fully utilized to determine an importance parameter of the target image in a dimension of any image content information, and the target image is added to an image pointed by an index corresponding to the image content information according to the importance parameter, so that the images pointed by the index are arranged according to the degree of correlation between the image content information and the target image, and further, when the index is hit, the image with a higher degree of correlation with the index can be preferentially output, and the accuracy of image retrieval is improved.
Fig. 5 is a block diagram of a server 102 according to an exemplary embodiment. The server 102 may vary considerably in configuration and performance, and may include one or more processors (CPUs) 1021 and one or more memories 1022, where the memory 1022 is used for storing executable instructions, and the processor 1021 is configured to execute the executable instructions to implement the image processing methods provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input/output, and may include other components for implementing the functions of the device, which are not described herein again.
In an exemplary embodiment, there is also provided a storage medium including instructions, such as the memory 1022 including instructions, executable by the processor 1021 of the server 102 to perform the image processing method described above. Alternatively, the storage medium may be a non-transitory computer-readable storage medium; for example, the non-transitory computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided a computer program product, wherein instructions of the computer program product, when executed by a processor of a server, enable the server to perform the image processing method in the above-described respective method embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring a target image;
identifying the target image to obtain image content information of the target image, wherein the image content information comprises at least one of text content and object keywords of the target image;
determining an importance parameter of the target image according to the image content information, wherein the importance parameter is used for representing the degree of correlation between the image content information and the target image;
and adding the target image to the image of the index corresponding to the image content information according to the importance parameter.
2. The image processing method according to claim 1, wherein the image content information includes the object keyword, and the determining the importance parameter of the target image according to the image content information includes:
and determining the importance parameter corresponding to the target category as the importance parameter of the target image in response to the fact that the category indicated by the object keyword is the target category.
3. The image processing method according to claim 1, wherein the image content information includes the object keyword, and the determining the importance parameter of the target image according to the image content information includes:
in response to that the category indicated by the object keyword is not a target category, determining a first importance parameter corresponding to the object keyword according to a first position of an object corresponding to the object keyword in the target image, wherein the first importance parameter is in negative correlation with a distance between the first position and a first target position in the target image;
determining the first importance parameter as an importance parameter of the target image.
4. The image processing method according to claim 1, wherein before the target image is added to the image of the index corresponding to the image content information according to the importance parameter, the image processing method further comprises:
determining a first correlation between the text content and the object keyword;
in response to the first correlation meeting a first target condition, increasing an importance parameter corresponding to the object keyword;
and in response to the first correlation not meeting the first target condition, reducing the importance parameter corresponding to the object keyword.
5. The image processing method according to claim 1, wherein before the target image is added to the image of the index corresponding to the image content information according to the importance parameter, the image processing method further comprises:
determining a second correlation between the object keyword and an image feature of the target image;
in response to the second correlation meeting a second target condition, increasing an importance parameter corresponding to the object keyword;
and reducing the importance parameter corresponding to the object keyword in response to the second correlation not meeting the second target condition.
6. The image processing method according to claim 1, wherein the image content information includes the text content, and the determining the importance parameter of the target image according to the image content information includes:
determining a second importance parameter corresponding to the text content according to a second position of the text content in the target image and a text area occupied by the text content in the target image, wherein the second importance parameter is in negative correlation with a distance between the second position and the second target position of the target image and is in positive correlation with the text area;
determining the second importance parameter as the importance parameter of the target image.
7. The image processing method according to claim 6, wherein the editing types of the text content include a human editing type and a scene shooting type, and a second importance parameter corresponding to the text content of the human editing type is greater than a second importance parameter corresponding to the text content of the scene shooting type.
8. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to perform acquisition of a target image;
the identification unit is configured to identify the target image to obtain image content information of the target image, wherein the image content information comprises at least one of text content and object keywords of the target image;
a first determination unit configured to perform determination of an importance parameter of the target image according to the image content information, the importance parameter being used to indicate a degree of correlation of the image content information with the target image;
and the adding unit is configured to add the target image to the image of the index corresponding to the image content information according to the importance parameter.
9. A server, characterized in that the server comprises:
one or more processors;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the image processing method of any one of claims 1 to 7.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of a server, enable the server to perform the image processing method of any one of claims 1 to 7.
CN202010637297.8A 2020-07-03 2020-07-03 Image processing method, device, server and storage medium Active CN111797765B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010637297.8A CN111797765B (en) 2020-07-03 2020-07-03 Image processing method, device, server and storage medium


Publications (2)

Publication Number Publication Date
CN111797765A true CN111797765A (en) 2020-10-20
CN111797765B CN111797765B (en) 2024-04-16

Family

ID=72810334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010637297.8A Active CN111797765B (en) 2020-07-03 2020-07-03 Image processing method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN111797765B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113673427A (en) * 2021-08-20 2021-11-19 北京达佳互联信息技术有限公司 Video identification determination method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982178A (en) * 2012-12-17 2013-03-20 北京奇虎科技有限公司 Picture searching method, device and system
CN106708940A (en) * 2016-11-11 2017-05-24 百度在线网络技术(北京)有限公司 Method and device used for processing pictures
CN107679208A (en) * 2017-10-16 2018-02-09 广东欧珀移动通信有限公司 A kind of searching method of picture, terminal device and storage medium
CN107818180A (en) * 2017-11-27 2018-03-20 北京小米移动软件有限公司 Video correlating method, image display method, device and storage medium
CN109635135A (en) * 2018-11-30 2019-04-16 Oppo广东移动通信有限公司 Image index generation method, device, terminal and storage medium
CN110020185A (en) * 2017-12-29 2019-07-16 国民技术股份有限公司 Intelligent search method, terminal and server
CN110837579A (en) * 2019-11-05 2020-02-25 腾讯科技(深圳)有限公司 Video classification method, device, computer and readable storage medium

Also Published As

Publication number Publication date
CN111797765B (en) 2024-04-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant