CN111160240B - Image object recognition processing method and device, intelligent device and storage medium - Google Patents


Info

Publication number
CN111160240B
CN111160240B (application CN201911379526.4A)
Authority
CN
China
Prior art keywords
region
area
initial
image
image object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911379526.4A
Other languages
Chinese (zh)
Other versions
CN111160240A (en)
Inventor
徐昊
张瑞
任逍航
程培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201911379526.4A priority Critical patent/CN111160240B/en
Priority to CN202410601429.XA priority patent/CN118334320A/en
Publication of CN111160240A publication Critical patent/CN111160240A/en
Application granted granted Critical
Publication of CN111160240B publication Critical patent/CN111160240B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/20Scenes; Scene-specific elements in augmented reality scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the application discloses an image object recognition processing method and device, and an intelligent device. The method comprises the following steps: performing object region recognition processing on an acquired target image to obtain a prediction region in the target image; invoking an image object recognition model to perform image object recognition on the sub-image regions within the prediction region, and determining whether each sub-image region belongs to the image object to be recognized; and determining an image object region in the target image according to the recognition result of the image object recognition, wherein the recognition result comprises result information recording which sub-image regions belong to the image object to be recognized. By adopting the method and device, missed detections and false detections of the image object region can be reduced well and the efficiency of recognizing image objects from images is improved; the overall scheme is highly feasible, low in cost and high in precision.

Description

Image object recognition processing method and device, intelligent device and storage medium
Technical Field
The present application relates to the field of image recognition technologies, and in particular, to a method and apparatus for recognizing and processing an image object, an intelligent device, and a storage medium.
Background
With the rapid popularization of deep learning technology and the increasing computing power of intelligent devices, semantic segmentation has greatly improved in performance and is applied ever more widely; accordingly, demand for changing hairstyles or dyeing hair is growing in the beauty, makeup and photo-retouching fields. To locate the image object to be processed within an image, applying deep-learning semantic segmentation to the whole picture would undoubtedly incur a huge computational cost and false detections in irrelevant areas.
Disclosure of Invention
The application provides an image object identification processing method and device, intelligent equipment and storage medium, which can improve the efficiency of identifying image objects from images.
In one aspect, the present application provides a method for identifying and processing an image object, including:
Performing object region identification processing on the acquired target image to obtain a prediction region in the target image;
invoking an image object recognition model to perform image object recognition on the sub-image regions within the prediction region, and determining whether each sub-image region belongs to the image object to be recognized;
Determining an image object region in the target image according to a recognition result of the image object recognition, wherein the recognition result comprises result information recording which sub-image regions belong to the image object to be recognized.
On the other hand, the application also provides an identification processing device of the image object, which comprises the following steps:
the prediction module is used for carrying out object region identification processing on the acquired target image to obtain a prediction region in the target image;
The recognition module is used for invoking an image object recognition model to perform image object recognition on the sub-image regions within the prediction region and to determine whether each sub-image region belongs to the image object to be recognized;
The determining module is used for determining an image object region in the target image according to a recognition result of the image object recognition, wherein the recognition result comprises result information recording which sub-image regions belong to the image object to be recognized.
In still another aspect, the present application further provides an intelligent device, including a storage device and a processor, where the storage device stores program instructions, and the processor invokes the program instructions to implement a method for identifying and processing an image object.
Accordingly, the present application also provides a computer-readable storage medium in which program instructions are stored, which program instructions, when executed, implement a method of recognizing an image object.
The application first detects the region where an image object such as hair is located, and then performs fine detection within that region. On the one hand, this reduces false detections of the image object region and the amount of detection work; on the other hand, fine detection of a local region improves the accuracy of recognizing the image object from the image, thereby greatly improving the efficiency of detecting the image object.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an image object recognition processing method according to an embodiment of the present invention;
FIG. 2a is a schematic size diagram of a target image according to an embodiment of the present invention;
FIG. 2b is a schematic diagram of relevant image area parameters according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of a masking layer for a hair object according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an adjustment method for adjusting the position of an initial region in a target image according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of an initial region and a predicted region after height value expansion in accordance with an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of the invention for adjusting an initial region;
Fig. 7 is a schematic structural view of an image object recognition processing apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an intelligent device according to an embodiment of the present invention.
Detailed Description
For an obtained target image, the user can perform various processes on it as required, such as recognizing image objects like the eyes, nose and mouth of a face, applying cosmetic treatment to those objects, or adding decorative props. For example, the hair region in the target image may be segmented and extracted separately and then specially treated, e.g. dyed, replaced, or given hair decorations; specifically, the target image may first be segmented into image regions, and the image object region where the user's hair is located may then be dyed red according to the segmentation result. In some embodiments, after the target image is processed, the user can upload it to a social platform through an intelligent terminal to share it with other users; if the target image is collected during video communication between users, the processed video images, after image object processing such as the above, can be displayed on the video communication interfaces of both parties or of the other party.
The target image can be captured by the user through an intelligent terminal such as a smart phone or tablet computer, or can be an image extracted from a memory or downloaded from a network. The processing of the target image can be performed on an intelligent terminal on which an application (APP) with the corresponding function is installed; for example, when the user shoots with a smart phone, the captured target image can be processed, or the acquired preview image can be processed during the shooting preview stage so that the processed image is obtained directly when the shutter is pressed. Alternatively, after a video frame is uploaded to a server through the intelligent terminal, the server performs the corresponding image processing; for example, during a video call, the video communication server automatically processes the transmitted video image frames (target images) according to the user's operation, so that the processed video picture, such as a picture with dyed hair, is displayed on the other party's communication interface.
The processing of the target image comprises several stages: performing object region recognition on the target image to complete region prediction, performing image object recognition within the prediction region, and finally determining the region where the image object is located according to the recognition result. In one embodiment, after the target image is obtained, a rough estimate is first made of the image position area of the image object to be detected, i.e. the initial region. After the initial region is determined by rough estimation, the rectangular area corresponding to the initial region is expanded on that basis to obtain the final prediction region; the image object to be detected is then recognized within the prediction region and the region where it is located is determined, for example the region of the target image occupied by the hair mentioned above. By processing the target image in two passes, first coarsely locating the region containing the object (the prediction region mentioned above) and then finely recognizing within it, missed detections and false detections of the image object can be reduced well, the accuracy of the finally determined region where the image object is located is improved, and further processing of the target image is facilitated. For example, once the position of the hair in the target image is determined, a mask layer can be placed over the hair region, and processing such as dyeing is conveniently achieved by adjusting the mask.
Referring to fig. 1, a flowchart of an image object recognition processing method according to an embodiment of the present invention is shown. The method may be applied in various image-processing scenes, for example in social applications that share images, in video call applications, in video session systems, and in drawing and retouching applications. It may be performed by an intelligent terminal on which a corresponding application is installed, or by a server with sufficient image-processing capability. The method of the embodiment of the invention comprises the following steps.
A user who needs retouching (for example, makeup and beautification) may acquire a target image containing the image object to be detected and adjusted in various ways, for example by capturing a person image with a smart phone, extracting a person image from a memory, or downloading one from a network. After the acquired target image is loaded, object region identification processing is performed on it in S101 to obtain a prediction region in the target image. The object region identification may be performed in a variety of ways; in one embodiment, an initial region may be roughly identified from a location point clicked by the user or from a selected partial region.
In one embodiment, an image object prediction model may be invoked to perform a rough full-image detection on the target image to determine a prediction region. The image object prediction model may directly output a prediction region as needed, or output region geometric feature parameters from which a prediction region can be determined directly. Alternatively, S101 may also include: calling the image object prediction model to perform region prediction on the target image and determine an initial region; and adjusting the position of the initial region in the target image according to a region expansion rule to obtain the prediction region. That is, a rough initial region is obtained first, and fine adjustment is then performed on that basis to obtain the prediction region; the expansion rule is a complementary step that prevents the initial region from failing to contain all of the hair. In some cases, e.g. with a small amount of hair or short hair, the initial region determined by the image object prediction model is likely to enclose all of the hair, in which case it may be used directly as the prediction region. That is, after the model performs region prediction on the target image, the position of the initial region may be displayed on the user interface together with a region editing button and a confirmation button: if the confirmation button is selected, the initial region is used directly as the prediction region; if the region editing button is selected, the position of the initial region in the target image is adjusted according to the region expansion rule to obtain the prediction region.
In one embodiment, the calling the image object prediction model to perform region prediction on the target image, and determining the initial region may specifically include: calling an image object prediction model to perform region prediction on the target image, and acquiring region geometric feature parameters output by the image object prediction model after prediction; and determining an initial region according to the geometric characteristic parameters of the region.
In another embodiment, the face position area in the target image may be identified first. The face position can be determined accurately using relatively mature face recognition technology; the face area is used as the initial region, and the prediction region of an image object such as hair, eyes or nose is then roughly estimated from the face position area. For example, after confirming the initial region at the face position, the initial region is edited according to the position of the eye object on the face to obtain the prediction region.
In another embodiment, the method may further include performing face detection first: if a face is detected, the prediction region is estimated from the face position area determined by face detection; if no face is detected, the above-mentioned image object prediction model is invoked to perform region prediction on the target image and output the region geometric feature parameters of the predicted initial region, from which the prediction region is finally obtained.
The image object prediction model may be constructed based on a neural network. It predicts the rectangular region corresponding to the initial region, and at least part or all of the image object to be detected and recognized may be contained in that rectangular region. The model predicts and outputs region geometric feature parameters, which may include offset information and rectangle-related parameters (the height information and width information of the rectangular region). In one embodiment, the offset information comprises the offset of the rectangular region's center from the image center of the target image; the height information comprises the ratio of the rectangle's height to the target image's height; and the width information comprises the ratio of the rectangle's width to the target image's width. In other embodiments, the region geometric feature parameters may instead be the image coordinates of the rectangle's center, the rectangle's height, and the rectangle's width.
In one embodiment, in order to rapidly predict region geometric feature parameters such as the offsets, a neural network with an overall computation amount within 2M, consisting of convolution layers, pooling layers, nonlinear layers and fully connected layers, can be designed, and the image object prediction model can be obtained by training this network. The values predicted by the neural network, i.e. the image object prediction model, are the offsets (Δx, Δy) of the center point of the rectangular region corresponding to the image object such as hair (i.e. the offset information), and the ratios (r_w, r_h) of the rectangle's width and height to the image's width and height (i.e. the proportion of the rectangle's height to the target image's height and of its width to the target image's width); all of these are normalized values. Assume the target image has center point (x_0, y_0) and width and height (w, h). The rectangle center, width and height of the initial region where the image object such as hair is located are computed as follows:

Equation 1:

x_c = x_0 + Δx · w
y_c = y_0 + Δy · h
w_rect = r_w · w
h_rect = r_h · h

Specifically, the offset information in the region geometric feature parameters is (Δx, Δy), the height information is r_h, and the width information is r_w; the rectangle center (rectangular image center) of the initial region is (x_c, y_c), the height of the initial region is h_rect, and the width of the initial region is w_rect. The width of the target image is w, its height is h, and its image center coordinates are (x_0, y_0). Illustrations of the target image, the initial region and the region geometric feature parameters can be found in figs. 2a and 2b: the original image is shown in fig. 2a, and the rectangular region and related parameters determined after the original image is input into the neural network, i.e. the image object prediction model, are shown in fig. 2b. The dashed rectangular box in fig. 2b is the initial region of the target image.
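As a minimal sketch of the Equation 1 computation, assuming the model outputs the normalized offsets and ratios described above (the function and variable names are illustrative, not the patent's API):

```python
def initial_region(dx, dy, rw, rh, w, h):
    """Recover the initial rectangle from the model's normalized outputs:
    centre offsets (dx, dy) and width/height ratios (rw, rh), per Equation 1."""
    x0, y0 = w / 2.0, h / 2.0   # image centre of the target image
    xc = x0 + dx * w            # rectangle centre, x
    yc = y0 + dy * h            # rectangle centre, y
    rect_w = rw * w             # rectangle width
    rect_h = rh * h             # rectangle height
    return xc, yc, rect_w, rect_h
```

For a 400×600 target image with zero offsets and both ratios 0.5, this yields a 200×300 rectangle centred on the image.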
The image object prediction model may be obtained by training on a positive sample image set containing the image object, such as hair, and a negative sample image set without it; the trained model performs the rough region prediction and outputs the region geometric feature parameters of the initial region.
In one embodiment, after the initial region is determined, the above-mentioned adjustment of its position in the target image according to the region expansion rule may be performed to obtain the prediction region. The region expansion rule is mainly used to adjust the size of the initial region so as to determine a more suitable region containing the image object, i.e. the prediction region. In one embodiment, the size of the initial region may be adjusted based on its region geometric feature parameters; for example, the prediction region may be 20% (or another value) taller and 10% (or another value) wider than the initial region. For the specific adjustment process from initial region to prediction region, refer to the subsequent description. The solid rectangular box in fig. 2b is the prediction region.
After the prediction region is determined, in S102 an image object recognition model is invoked to perform image object recognition on the sub-image regions within the prediction region, and it is determined whether each sub-image region belongs to the image object to be recognized. The image object recognition model may be a dedicated model, for example one dedicated to recognizing the hair object in an image, or one dedicated to recognizing the eye object. In one embodiment, a sub-image region is the region of one pixel within the prediction region; invoking the model then comprises performing image object recognition on each pixel in the prediction region and determining the probability that each pixel belongs to the image object to be recognized, wherein pixels with a probability value greater than or equal to a preset threshold belong to the image object to be recognized. The recognition result is a probability map corresponding to the prediction region, determined from the obtained probabilities; once the probability map is obtained, the image object region within the prediction region can conveniently be determined from it.
The image object recognition model segments the region of the image object, such as hair, within the prediction region. The neural network corresponding to the model may consist of dilated (atrous) convolution layers, pooling layers, nonlinear layers and up-sampling layers. When the rectangular region of the target image (here, the prediction region) is input into the model, a probability map of the same size as the prediction region is obtained, with values ranging from 0 to 1. Each point of the probability map corresponds to one pixel (i.e. one sub-image region), and the probability value is the probability that the corresponding pixel belongs to the image object to be recognized, such as hair. When the probability is greater than or equal to a preset probability threshold, the pixel corresponding to the sub-image region belongs to the image object to be recognized; when it is smaller, the pixel does not. The image object recognition model can likewise be trained on a training set of images containing the image object such as hair, and cross entropy can be used to establish the loss function, as shown in the following formula:
Equation 2:

L = −Σ_i [ y_i · log(p_i) + (1 − y_i) · log(1 − p_i) ]

where p_i is the probability that the i-th pixel is predicted to belong to the image object such as hair, and y_i is the ground-truth label of the i-th pixel (1 if the pixel belongs to the object, 0 otherwise).
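Equation 2 is the standard per-pixel binary cross entropy; a small NumPy sketch (names are illustrative):

```python
import numpy as np

def cross_entropy_loss(p, y, eps=1e-7):
    """Binary cross entropy over a probability map: p holds the predicted
    probabilities per pixel, y the 0/1 ground-truth labels."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

For a maximally uncertain prediction (p = 0.5 everywhere) the loss equals log 2 regardless of the labels.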
In S103, an image object region in the target image is determined according to the recognition result of the image object recognition, where the recognition result comprises result information recording which sub-image regions belong to the image object to be recognized. After the probability map is obtained, probability values smaller than the preset probability threshold may be set to 0 and values greater than or equal to the threshold set to 1, yielding a binary image of the same size as the prediction region. The binary image is then mapped back onto the target image based on the prediction region's position, the image object region in the target image is determined from the points with value 1, and the final mask layer for the image object region, such as hair, is formed. Fig. 3 is a schematic diagram of the mask corresponding to the hair in the target image of fig. 2a; as shown in fig. 3, dyeing of an image object such as hair can be achieved by adjusting parameters such as the color of the mask layer.
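The thresholding and mapping-back step described above can be sketched as follows, assuming the prediction region is given by its top-left pixel coordinates (a simplified illustration, not the patent's implementation):

```python
import numpy as np

def mask_in_target(prob_map, region_xy, target_shape, threshold=0.5):
    """Binarize the probability map at `threshold` and place the resulting
    mask at the prediction region's top-left corner (x, y) within a
    full-size zero mask of shape (height, width) of the target image."""
    x, y = region_xy
    binary = (prob_map >= threshold).astype(np.uint8)
    full = np.zeros(target_shape, dtype=np.uint8)
    h, w = binary.shape
    full[y:y + h, x:x + w] = binary
    return full
```

The returned array can serve directly as the mask layer, e.g. for recoloring the hair pixels.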
Referring to fig. 4 again, an adjustment method for adjusting the position of the initial region in the target image according to an embodiment of the invention is shown, and the adjustment method corresponds to S101 described above. The method can still be implemented by an intelligent terminal or a server.
S400: and judging whether the initial area determined according to the geometric characteristic parameters of the area meets the area condition. The area of the initial region can be calculated according to the length and width of the rectangular region of the initial region, and the length and width of the rectangular region can be calculated according to the geometric characteristic parameters of the region.
S401: when the initial area determined according to the region geometric feature parameters meets the area condition, determine a second updated geometric parameter according to the geometric parameters of the target image. The initial area satisfying the area condition means that the ratio of the initial region's area to the target image's area is greater than a preset area ratio threshold; equivalently, the initial area is larger than a times the area of the target image, where a may for example be 0.8 or 0.9. When the area condition is satisfied, the captured target image is considered to be mainly a head image: the size of the target image can be taken directly as the size in the second updated geometric parameter, and the entire image area of the target image taken as the prediction region. Alternatively, b times the size of the target image may be used, where b may for example be 0.9: the target image's width multiplied by 0.9 is taken as the width in the second updated geometric parameter, and the target image's height multiplied by 0.9 as its height.
S402: adjust the position of the initial region in the target image according to the second updated geometric parameters to obtain the prediction region. As described in S401, the prediction region obtained after this adjustment may be the whole target image area or 0.9 times its size.
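A sketch of the S400–S402 branch, using the example values a = 0.8 and b = 0.9 from the text (the helper name and return convention are illustrative):

```python
def area_condition_region(rect_w, rect_h, img_w, img_h, a=0.8, b=0.9):
    """If the initial region's area exceeds a times the target image's area,
    take b times the target image size as the prediction region (the second
    updated geometric parameter); otherwise signal that the expansion rules
    of S403-S405 apply instead."""
    if rect_w * rect_h > a * img_w * img_h:
        return img_w * b, img_h * b   # prediction region width, height
    return None  # area condition not met: fall through to S403-S405
```

Returning `None` here simply marks the fall-through case; a real implementation would continue with the expansion rules.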
S403: and when the initial area determined according to the area geometric feature parameters does not meet the area condition, selecting a corresponding area expansion rule according to the area geometric feature parameters of the initial area.
S404: and determining a first updated geometric parameter according to the region expansion rule. When the area condition is not satisfied, different area expansion rules may be selected according to the area geometric feature parameter, so as to determine the first updated geometric parameter according to the different area expansion rules.
S405: and adjusting the position of the initial region in the target image according to the first updated geometric parameters to obtain a predicted region. In the embodiment of the invention, the prediction area obtained after the initial area is adjusted according to the first updated geometric parameter is different from the prediction area obtained after the initial area is adjusted according to the second updated geometric parameter. And the area or the size of the predicted area obtained according to the second updated geometric parameters is correspondingly larger than the area or the size of the predicted area obtained according to the first updated geometric parameters.
In one embodiment, selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region in S403 includes: calculating the width value and the height value of the initial region from its region geometric feature parameters (for the calculation, refer to Equation 1); and, if the width value is greater than N times the height value, selecting the first region expansion rule, where N is a positive number greater than 1. In one embodiment N equals 1.5 (or another value), i.e. the first rule is selected when the width of the initial region exceeds 1.5 times its height.
In S404, determining the first updated geometric parameter according to the region expansion rule includes: according to the first region expansion rule, taking the width value of the initial region as the width value of the first updated geometric parameter; and taking as its height value an updated height value obtained by expanding the initial region's height by a first preset proportion, where the updated height value is greater than a first height threshold and less than a second height threshold. The second height threshold may be P times the height of the target image, or the height of the target image itself, with P being e.g. 0.8 or 0.9. Under the first region expansion rule, the width of the prediction region equals that of the initial region, but the height is expanded by the first preset proportion both upward and downward. The first preset proportion may for example be 20%, so that when the height of the initial region is 100 pixels, the height value of the first updated geometric parameter corresponding to the prediction region is 140 pixels, i.e. 20 pixels are added above and 20 pixels below. The pixel values in this example are merely illustrative.
In one embodiment, in S403, selecting the corresponding region expansion rule according to the region geometric feature parameter of the initial region includes: calculating a width value and a height value of the initial region according to the region geometric feature parameters of the initial region (the calculation may refer to formula 1); if the width value is greater than the height value but less than or equal to N times the height value, selecting a second region expansion rule, where N is a positive number greater than 1. That is, the second region expansion rule is selected when the width value is greater than the height value but less than or equal to 1.5 times the height value (N may also take another value).
In S404, determining the first updated geometric parameter according to the region expansion rule includes: according to the second region expansion rule, taking, as the height value of the first updated geometric parameter, an updated height value obtained by expanding the height value of the initial region by a second preset proportion, where the updated height value is greater than a first height threshold and less than a second height threshold; and taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a third preset proportion, where the updated width value is greater than a first width threshold and less than a second width threshold. The second height threshold may be, for example, the height value of the target image or P times that height value; the second width threshold may be, for example, the width value of the target image or P times that width value; P is, for example, 0.8 or 0.9.
Under the second region expansion rule, the height value of the initial region is expanded up and down by the second preset proportion, for example 10%, with the updated height value greater than the first height threshold (for example 0) and less than the second height threshold (for example, the height value of the target image or P times that height value, where P is, for example, 0.8 or 0.9); the width value w of the initial region is expanded left and right by the third preset proportion, for example 5%, with the updated width value greater than the first width threshold (for example 0) and less than the second width threshold (for example, the width value of the target image or P times that width value). For example, when the height value of the initial region is 100 pixels, the height value of the first updated geometric parameter of the prediction region is 120 pixels, that is, 10 pixels are expanded upward and 10 pixels downward in the height direction; when the width value of the initial region is 100 pixels, the width value of the first updated geometric parameter of the prediction region is 110 pixels, that is, 5 pixels are expanded to the left and 5 pixels to the right in the width direction; the pixel values in this example are merely illustrative.
In one embodiment, in S403, selecting the corresponding region expansion rule according to the region geometric feature parameter of the initial region includes: calculating a width value and a height value of the initial region according to the region geometric feature parameters of the initial region (the calculation may refer to formula 1); if the height value is greater than M times the width value, selecting a third region expansion rule, where M is a positive number greater than 1. In one embodiment, M may be equal to 1.5 or another value; that is, the third region expansion rule is selected when the height value of the initial region exceeds 1.5 times the width value.
In S404, determining the first updated geometric parameter according to the region expansion rule includes: according to the third region expansion rule, taking the height value of the initial region as the height value of the first updated geometric parameter; and taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a fourth preset proportion, where the updated width value is greater than a first width threshold and less than a second width threshold. The second width threshold may be the width value of the target image or P times that width value, where P is, for example, 0.8 or 0.9. Under the third region expansion rule, the height value of the prediction region is the same as the height value of the initial region, but the width is expanded by the fourth preset proportion, for example 20%. For example, when the width value of the initial region is 100 pixels, the width value of the first updated geometric parameter of the prediction region is 140 pixels, that is, 20 pixels are expanded to the left and 20 pixels to the right in the width direction; the pixel values in this example are merely illustrative.
In one embodiment, in S403, selecting the corresponding region expansion rule according to the region geometric feature parameter of the initial region includes: calculating a width value and a height value of the initial region according to the region geometric feature parameters of the initial region (the calculation may refer to formula 1); if the height value is greater than the width value and less than or equal to M times the width value, selecting a fourth region expansion rule, where M is a positive number greater than 1. M may be, for example, 1.5 or another value; that is, the fourth region expansion rule may be selected when the height value is greater than the width value but less than or equal to 1.5 times the width value.
In S404, determining the first updated geometric parameter according to the region expansion rule includes: according to the fourth region expansion rule, taking, as the height value of the first updated geometric parameter, an updated height value obtained by expanding the height value of the initial region by a fifth preset proportion, where the updated height value is greater than the first height threshold and less than the second height threshold; and taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a sixth preset proportion, where the updated width value is greater than the first width threshold and less than the second width threshold. The second height threshold may be, for example, the height value of the target image or P times that height value; the second width threshold may be, for example, the width value of the target image or P times that width value; P is, for example, 0.8 or 0.9.
Under the fourth region expansion rule, the height value h of the initial region is expanded up and down by the fifth preset proportion, for example 5%, with the updated height value greater than the first height threshold (for example 0) and less than the second height threshold (for example, the height value of the target image or P times that height value, where P is, for example, 0.8 or 0.9); the width value w of the initial region is expanded left and right by the sixth preset proportion, for example 10%, with the updated width value greater than the first width threshold (for example 0) and less than the second width threshold (for example, the width value of the target image or P times that width value). For example, when the height value of the initial region is 100 pixels, the height value of the first updated geometric parameter of the prediction region is 110 pixels, that is, 5 pixels are expanded upward and 5 pixels downward in the height direction; when the width value of the initial region is 100 pixels, the width value of the first updated geometric parameter of the prediction region is 120 pixels, that is, 10 pixels are expanded to the left and 10 pixels to the right in the width direction; the pixel values in this example are merely illustrative.
As shown in fig. 5, the dotted rectangular frame is the initial region, and the solid rectangular frame is the prediction region obtained by expanding in the up-down and left-right directions. The prediction region determined by the empirical values in the above examples can ensure that the image object falls substantially within the region. Although there may be a portion of redundant area in which no hair object is present, performing hair recognition on the prediction region rather than on the whole target image greatly reduces false detection, reduces the amount of recognition computation, saves software and hardware resources, and keeps recognition accuracy relatively assured.
Referring again to fig. 6, the manner in which the initial region is adjusted and expanded is presented visually; the schematic diagram of fig. 6 corresponds to the adjustment process described above with reference to fig. 4. When the area of the initial region is greater than 0.8 times the area of the target image, the size of the target image is directly taken as the size of the prediction region, and the target image itself serves as the prediction region for identifying the image object. When the area of the initial region is less than or equal to 0.8 times the area of the target image, it is judged whether the width value of the initial region is greater than 1.5 times the height value; if so, the height value of the initial region is expanded up and down by 20% in the height direction. If the width value is less than or equal to 1.5 times the height value, it is further judged whether the width value is greater than the height value; if so, the height value of the initial region is expanded up and down by 10% in the height direction, with the expanded height value greater than 0 and less than the height value of the target image, and the width value is expanded left and right by 5% in the width direction, with the expanded width value greater than 0 and less than the width value of the target image. If the width value of the initial region is less than or equal to the height value, it is further judged whether the height value is greater than 1.5 times the width value; if so, the width value of the initial region is expanded left and right by 20% in the width direction.
If the height value of the initial region is less than or equal to 1.5 times the width value, the height value of the initial region is expanded up and down by 5% in the height direction, but the expanded height value is greater than 0 and less than the height value of the target image, and the width value of the initial region is expanded left and right by 10% in the width direction, but the expanded width value is greater than 0 and less than the width value of the target image.
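The full decision tree described above with reference to fig. 6 can be sketched as follows. This is only an illustration: the function name and the (width, height) representation are assumptions, while the 0.8 area threshold, N = M = 1.5, and the 20%/10%/5% proportions are the example values from the text.

```python
def predict_region(init, img, area_ratio=0.8, n=1.5, m=1.5):
    """Illustrative decision tree for selecting and applying a region
    expansion rule. `init` and `img` are (width, height) pairs for the
    initial region and the target image. Each rule expands per direction,
    so a 20% proportion multiplies the dimension by 1.4 in total."""
    iw, ih = init
    tw, th = img
    if iw * ih > area_ratio * tw * th:
        return tw, th                      # use the whole target image
    if iw > n * ih:                        # first rule: very wide region
        dw, dh = 0.0, 0.20
    elif iw > ih:                          # second rule: mildly wide
        dw, dh = 0.05, 0.10
    elif ih > m * iw:                      # third rule: very tall region
        dw, dh = 0.20, 0.0
    else:                                  # fourth rule: mildly tall/square
        dw, dh = 0.10, 0.05
    new_w = min(iw * (1 + 2 * dw), tw)     # keep within target-image size
    new_h = min(ih * (1 + 2 * dh), th)
    return new_w, new_h
```

For instance, a 200 by 100 initial region in a 1000 by 1000 image triggers the first rule and yields a 200 by 140 prediction region, matching the numeric example given earlier.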
The method first detects an approximate initial region of the area where the image object, such as hair, is located, and then applies a fine expansion strategy to the initial region, thereby reducing missed detection and false detection of the image object area and improving the efficiency of identifying the image object from the image; the overall scheme is highly feasible, low in cost, and high in precision.
Referring again to fig. 7, a schematic structural diagram of an image object recognition processing apparatus according to an embodiment of the present invention is shown. The apparatus may be provided in an intelligent device, where the intelligent device may be a terminal such as a smart phone, a tablet computer, or a personal computer, or may be a server; the server may be dedicated to image object processing, or may be a social platform server or a communication application server. In the embodiment of the present invention, the apparatus includes a prediction module 701, a recognition module 702, and a determination module 703.
The prediction module 701 is configured to perform object region identification processing on an acquired target image to obtain a prediction region in the target image; the recognition module 702 is configured to invoke an image object recognition model to perform image object recognition on the sub-image areas in the prediction region and determine whether each sub-image area belongs to an image object to be recognized; the determination module 703 is configured to determine an image object area in the target image according to the recognition result of the image object recognition, where the recognition result includes result information recording the sub-image areas that belong to the image object to be recognized.
In a possible implementation manner, the prediction module 701 is configured to invoke an image object prediction model to perform region prediction on the target image, and determine an initial region; and adjusting the position of the initial region in the target image according to a region expansion rule to obtain a predicted region.
In a possible implementation manner, the prediction module 701 is configured to invoke an image object prediction model to perform region prediction on the target image, and obtain a region geometric feature parameter output by the image object prediction model after prediction; determining an initial region according to the geometric characteristic parameters of the region; the initial region comprises a rectangular region, and the geometric characteristic parameters of the region comprise: offset information, height information, and width information; the offset information includes: and the offset of the area center of the rectangular area and the image center of the target image, wherein the height information comprises: a ratio of a height of the rectangular region to a height of the target image, the width information including: the ratio of the width of the rectangular area relative to the width of the target image.
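One plausible reading of how the region geometric feature parameters decode into a rectangle is sketched below. The names and the center-offset convention are assumptions; the actual computation is given by formula 1, which is not reproduced here.

```python
def decode_initial_region(offset, height_ratio, width_ratio, img_w, img_h):
    """Illustrative decoding of the region geometric feature parameters.
    The model outputs the offset of the region center from the image
    center, plus the height and width of the region as ratios of the
    target-image height and width."""
    dx, dy = offset
    cx = img_w / 2 + dx                   # region center in image coords
    cy = img_h / 2 + dy
    w = width_ratio * img_w               # absolute width and height
    h = height_ratio * img_h
    # return as (x, y, w, h): top-left corner plus size
    return cx - w / 2, cy - h / 2, w, h
```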
In a possible implementation manner, the prediction module 701 is configured to select a corresponding region expansion rule according to a region geometric feature parameter of the initial region; determining a first updated geometric parameter according to the region expansion rule; and adjusting the position of the initial region in the target image according to the first updated geometric parameters to obtain an adjusted prediction region.
In a possible implementation manner, the prediction module 701 is configured to determine a second updated geometric parameter according to the geometric parameter of the target image when the initial area determined according to the geometric feature parameter of the region meets an area condition; adjusting the position of the initial region in the target image according to the second updated geometric parameters to obtain an adjusted prediction region; and when the initial area determined according to the area geometric feature parameters does not meet the area condition, selecting a corresponding area expansion rule according to the area geometric feature parameters of the initial area.
In one possible implementation, the initial area meeting the area condition includes: the ratio between the initial area and the area of the target image is greater than a preset area ratio threshold.
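A minimal sketch of this area condition, assuming the example threshold of 0.8 used elsewhere in the description:

```python
def meets_area_condition(init_w, init_h, img_w, img_h, threshold=0.8):
    """Illustrative area condition: true when the ratio of the
    initial-region area to the target-image area exceeds the preset
    area ratio threshold."""
    return (init_w * init_h) / (img_w * img_h) > threshold
```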
In a possible implementation manner, the prediction module 701 is configured to calculate a width value of the initial area according to an area geometric feature parameter of the initial area, and calculate a height value of the initial area according to an area geometric feature parameter of the initial area; if the width value is larger than N times of the height value, selecting a first region expansion rule, wherein N is a positive number larger than 1; according to a first region expansion rule, taking the width value of the initial region as the width value of a first updated geometric parameter; according to a first region expansion rule, an updated height value obtained after expanding a first preset proportion is used as a height value of a first updated geometric parameter on the basis of the height value of an initial region, wherein the updated height value is larger than a first height threshold value and smaller than a second height threshold value.
In a possible implementation manner, the prediction module 701 is configured to calculate a width value of the initial area according to an area geometric feature parameter of the initial area, and calculate a height value of the initial area according to an area geometric feature parameter of the initial area; if the width value is larger than the height value but the width value is smaller than or equal to N times of the height value, selecting a second region expansion rule, wherein N is a positive number larger than 1; according to a second region expansion rule, expanding an updated height value obtained after a second preset proportion on the basis of the height value of the initial region to serve as the height value of the first updated geometric parameter, wherein the updated height value is larger than a first height threshold value and smaller than a second height threshold value; according to a second region expansion rule, an updated width value obtained after expanding a third preset proportion on the basis of the width value of the initial region is used as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold value and smaller than a second width threshold value.
In a possible implementation manner, the prediction module 701 is configured to calculate a width value of the initial area according to an area geometric feature parameter of the initial area, and calculate a height value of the initial area according to an area geometric feature parameter of the initial area; if the height value is greater than M times of the width value, selecting a third region expansion rule, wherein M is a positive number greater than 1; according to a third region expansion rule, taking the height value of the initial region as the height value of the first updated geometric parameter; according to a third region expansion rule, an updated width value obtained after expanding a fourth preset proportion on the basis of the width value of the initial region is used as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold value and smaller than a second width threshold value.
In a possible implementation manner, the prediction module 701 is configured to calculate a width value of the initial area according to an area geometric feature parameter of the initial area, and calculate a height value of the initial area according to an area geometric feature parameter of the initial area; if the height value is larger than the width value and the height value is smaller than or equal to M times of the width value, selecting a fourth region expansion rule, wherein M is a positive number larger than 1; according to a fourth region expansion rule, expanding an updated height value obtained after a fifth preset proportion on the basis of the height value of the initial region to serve as the height value of the first updated geometric parameter, wherein the updated height value is larger than a first height threshold and smaller than a second height threshold; according to a fourth region expansion rule, expanding an updated width value obtained after a sixth preset proportion on the basis of the width value of the initial region to serve as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold and smaller than a second width threshold.
In one possible implementation manner, the sub-image area is an area where one pixel point in the prediction region is located; the recognition module 702 is configured to invoke an image object recognition model to perform image object recognition on each pixel point in the prediction region and determine the probability that each pixel point in the prediction region belongs to the image object to be recognized, where pixel points with probability values greater than or equal to a preset threshold belong to the image object to be recognized; the recognition result of the image object recognition is a probability map corresponding to the prediction region determined according to the obtained probabilities, so that the determination module 703 determines the image object area in the prediction region according to the probability map.
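A hedged sketch of this per-pixel post-processing (the function name and the threshold value are assumptions; the text only requires comparing each probability against a preset threshold):

```python
import numpy as np

def image_object_mask(prob_map, threshold=0.5):
    """Illustrative thresholding of the probability map produced over the
    prediction region: pixels whose probability is greater than or equal
    to the preset threshold are marked as belonging to the image object."""
    return prob_map >= threshold
```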
The specific implementation of each module in the embodiment of the present application may refer to the description of the related content in the foregoing method embodiment, which is not repeated herein. The method detects the approximate initial area of the area where the image object such as hair is located, and then carries out the strategy of fine expansion of the initial area, thereby being capable of reducing missing detection and false detection of the image object area, improving the efficiency of identifying the image object from the image, and having strong feasibility, low cost and high precision of the whole scheme.
Referring to fig. 8, a schematic structural diagram of an intelligent device according to an embodiment of the present invention is shown, where the intelligent device according to an embodiment of the present invention may be an intelligent terminal such as a smart phone, a tablet computer, a personal computer, or a server. The intelligent device comprises a power module and various shell structures, and in the embodiment of the invention, the intelligent device comprises: the processor 801 and the storage device 802 may further include a communication interface 803, a user interface 804, and the like as needed to realize various data transmission and communication functions and meet the interaction requirements with the user.
The storage device 802 may include a volatile memory, such as a random-access memory (RAM); the storage device 802 may also include a non-volatile memory, such as a flash memory or a solid-state drive (SSD); the storage device 802 may also include a combination of the above types of memory.
The processor 801 may be a central processing unit (CPU). The processor 801 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The PLD may be a field-programmable gate array (FPGA), generic array logic (GAL), or the like.
The storage device 802 is also used to store program instructions. The processor 801 may call the program instructions to implement the relevant steps of the image object recognition processing method mentioned in the previous embodiment.
In one embodiment, the processor 801 invokes program instructions stored in the storage device 802 to perform object region identification processing on the acquired target image to obtain a prediction region in the target image; invoke an image object recognition model to perform image object recognition on the sub-image areas in the prediction region and determine whether each sub-image area belongs to an image object to be recognized; and determine an image object area in the target image according to the recognition result of the image object recognition, where the recognition result includes result information recording the sub-image areas that belong to the image object to be recognized.
In one embodiment, the processor 801 is configured to invoke an image object prediction model to perform region prediction on the target image, and determine an initial region; and adjusting the position of the initial region in the target image according to a region expansion rule to obtain a predicted region.
In one embodiment, the processor 801 is configured to invoke an image object prediction model to perform region prediction on the target image, and obtain a region geometric feature parameter output by the image object prediction model after prediction; determining an initial region according to the geometric characteristic parameters of the region; the initial region comprises a rectangular region, and the geometric characteristic parameters of the region comprise: offset information, height information, and width information; the offset information includes: and the offset of the area center of the rectangular area and the image center of the target image, wherein the height information comprises: a ratio of a height of the rectangular region to a height of the target image, the width information including: the ratio of the width of the rectangular area relative to the width of the target image.
In one embodiment, the processor 801 is configured to select a corresponding region expansion rule according to a region geometric feature parameter of the initial region; determining a first updated geometric parameter according to the region expansion rule; and adjusting the position of the initial region in the target image according to the first updated geometric parameters to obtain a predicted region.
In one embodiment, the processor 801 is configured to determine a second updated geometric parameter according to the geometric parameter of the target image when the area of the initial region determined according to the region geometric feature parameters satisfies an area condition; adjust the position of the initial region in the target image according to the second updated geometric parameter to obtain a prediction region; and, when the area of the initial region determined according to the region geometric feature parameters does not satisfy the area condition, execute the step of selecting a corresponding region expansion rule according to the region geometric feature parameters of the initial region.
In one embodiment, the initial area satisfying the area condition includes: the ratio between the initial area and the area of the target image is greater than a preset area ratio threshold.
In one embodiment, the processor 801 is configured to calculate a width value of the initial region according to a region geometric feature parameter of the initial region, and calculate a height value of the initial region according to a region geometric feature parameter of the initial region; if the width value is larger than N times of the height value, selecting a first region expansion rule, wherein N is a positive number larger than 1; according to a first region expansion rule, taking the width value of the initial region as the width value of a first updated geometric parameter; according to a first region expansion rule, an updated height value obtained after expanding a first preset proportion is used as a height value of a first updated geometric parameter on the basis of the height value of an initial region, wherein the updated height value is larger than a first height threshold value and smaller than a second height threshold value.
In one embodiment, the processor 801 is configured to calculate a width value of the initial region according to a region geometric feature parameter of the initial region, and calculate a height value of the initial region according to a region geometric feature parameter of the initial region; if the width value is larger than the height value but the width value is smaller than or equal to N times of the height value, selecting a second region expansion rule, wherein N is a positive number larger than 1; according to a second region expansion rule, expanding an updated height value obtained after a second preset proportion on the basis of the height value of the initial region to serve as the height value of the first updated geometric parameter, wherein the updated height value is larger than a first height threshold value and smaller than a second height threshold value; according to a second region expansion rule, an updated width value obtained after expanding a third preset proportion on the basis of the width value of the initial region is used as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold value and smaller than a second width threshold value.
In one embodiment, the processor 801 is configured to calculate a width value of the initial region according to a region geometric feature parameter of the initial region, and calculate a height value of the initial region according to a region geometric feature parameter of the initial region; if the height value is greater than M times of the width value, selecting a third region expansion rule, wherein M is a positive number greater than 1; according to a third region expansion rule, taking the height value of the initial region as the height value of the first updated geometric parameter; according to a third region expansion rule, an updated width value obtained after expanding a fourth preset proportion on the basis of the width value of the initial region is used as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold value and smaller than a second width threshold value.
In one embodiment, the processor 801 is configured to calculate a width value of the initial region according to a region geometric feature parameter of the initial region, and calculate a height value of the initial region according to a region geometric feature parameter of the initial region; if the height value is larger than the width value and the height value is smaller than or equal to M times of the width value, selecting a fourth region expansion rule, wherein M is a positive number larger than 1; according to a fourth region expansion rule, expanding an updated height value obtained after a fifth preset proportion on the basis of the height value of the initial region to serve as the height value of the first updated geometric parameter, wherein the updated height value is larger than a first height threshold and smaller than a second height threshold; according to a fourth region expansion rule, expanding an updated width value obtained after a sixth preset proportion on the basis of the width value of the initial region to serve as the width value of the first updated geometric parameter, wherein the updated width value is larger than a first width threshold and smaller than a second width threshold.
In one embodiment, the sub-image region is a region in which one pixel point in the prediction region is located; the processor 801 is configured to invoke the image object recognition model to perform image object recognition on each pixel point in the prediction region and determine the probability that each pixel point in the prediction region belongs to the image object to be recognized, where pixel points whose probability values are greater than or equal to a preset threshold belong to the image object to be recognized; the recognition result of the image object recognition is a probability map corresponding to the prediction region and determined according to the obtained probabilities.
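The per-pixel decision described above can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the 0.5 threshold and the plain-list representation of the probability map are hypothetical stand-ins for the preset threshold and the probability map of the embodiment.

```python
# Illustrative sketch: convert the per-pixel probability map produced by the
# image object recognition model into a 0/1 mask for the prediction region.
# The threshold default of 0.5 is a hypothetical stand-in for the preset
# threshold of the embodiment.

def probability_map_to_mask(prob_map, threshold=0.5):
    """prob_map: 2-D list of probabilities, one per pixel of the prediction
    region; returns a same-sized mask where 1 marks pixels judged to belong
    to the image object (probability >= threshold) and 0 marks the rest."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]
```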
For the specific implementation of the processor 801 in this embodiment of the present application, reference may be made to the description of the related content in the foregoing method embodiments, which is not repeated here. The method first detects an approximate initial region of the region where an image object such as hair is located, and then applies a strategy of finely expanding that initial region, which reduces missed detections and false detections of the image object region and improves the efficiency of recognizing the image object from the image; the overall scheme is highly feasible, low in cost, and high in precision.
Those skilled in the art will appreciate that all or part of the procedures in the methods of the above embodiments may be implemented by a computer program stored on a computer-readable storage medium; when the program is executed, it may include the procedures of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above disclosure presents only some embodiments of the present invention and is of course not intended to limit the scope of the claims; those skilled in the art will understand that equivalent changes implementing all or part of the above embodiments and made in accordance with the claims of the present invention still fall within the scope covered by the invention.

Claims (12)

1. A method of recognizing and processing an image object, comprising:
performing face recognition on an acquired target image by using a face recognition technology to determine a face region, and determining the face region as an initial region; and if no face region is detected, invoking an image object prediction model to perform region prediction on the acquired target image to determine the initial region;
adjusting the position of the initial region in the target image according to a region expansion rule to obtain a prediction region;
invoking an image object recognition model to perform image object recognition on sub-image regions in the prediction region, and determining whether each sub-image region belongs to an image object to be recognized; wherein the image object to be recognized is a hair image object, the image object recognition model is used to segment the region where the hair image object in the prediction region is located, a probability map of the same size as the prediction region is obtained through the image object recognition model, and each probability value on the probability map indicates the probability that the corresponding pixel point belongs to the hair image object;
determining an image object region in the target image according to a recognition result of the image object recognition, wherein the recognition result comprises: result information recording which sub-image regions belong to the image object to be recognized;
wherein the adjusting the position of the initial region in the target image according to the region expansion rule to obtain the prediction region comprises:
selecting a corresponding region expansion rule according to region geometric feature parameters of the initial region; wherein different region expansion rules are allowed to be selected according to the region geometric feature parameters, so that a first updated geometric parameter is determined according to the different region expansion rules; and the corresponding region expansion rule is selected according to the ratio between a width value of the initial region and a height value of the initial region, both calculated based on the region geometric feature parameters of the initial region;
determining the first updated geometric parameter according to the selected region expansion rule;
adjusting the position of the initial region in the target image according to the first updated geometric parameter to obtain the prediction region;
wherein the sub-image region is a region in which one pixel point in the prediction region is located; and the invoking the image object recognition model to perform image object recognition on the sub-image regions in the prediction region and determining whether each sub-image region belongs to the image object to be recognized comprises:
invoking the image object recognition model to perform image object recognition on each pixel point in the prediction region, and determining the probability that each pixel point in the prediction region belongs to the image object to be recognized, wherein pixel points whose probability values are greater than or equal to a preset threshold belong to the image object to be recognized;
and the recognition result of the image object recognition is a probability map corresponding to the prediction region and determined according to the obtained probabilities.
2. The method of claim 1, wherein invoking the image object prediction model to perform region prediction on the acquired target image, determining an initial region, comprises:
invoking the image object prediction model to perform region prediction on the acquired target image, and obtaining region geometric feature parameters output by the image object prediction model after prediction;
determining the initial region according to the region geometric feature parameters;
wherein the initial region comprises a rectangular region, and the region geometric feature parameters comprise: offset information, height information, and width information;
the offset information comprises: an offset between the region center of the rectangular region and the image center of the target image; the height information comprises: a ratio of the height of the rectangular region to the height of the target image; and the width information comprises: a ratio of the width of the rectangular region to the width of the target image.
3. The method of claim 1, wherein before the selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region, the method further comprises:
when the initial region determined according to the region geometric feature parameters meets an area condition, determining a second updated geometric parameter according to geometric parameters of the target image;
adjusting the position of the initial region in the target image according to the second updated geometric parameter to obtain the prediction region;
and when the initial region determined according to the region geometric feature parameters does not meet the area condition, performing the step of selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region.
4. The method of claim 3, wherein the initial region meeting the area condition comprises: a ratio between the area of the initial region and the area of the target image being greater than a preset area-ratio threshold.
5. The method according to any one of claims 2-4, wherein the selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region comprises:
calculating a width value of the initial region and a height value of the initial region according to the region geometric feature parameters of the initial region;
if the width value is greater than N times the height value, selecting a first region expansion rule, where N is a positive number greater than 1;
and the determining the first updated geometric parameter according to the region expansion rule comprises:
according to the first region expansion rule, taking the width value of the initial region as the width value of the first updated geometric parameter;
according to the first region expansion rule, taking, as the height value of the first updated geometric parameter, an updated height value obtained by expanding the height value of the initial region by a first preset proportion, wherein the updated height value is greater than a first height threshold and smaller than a second height threshold.
6. The method according to any one of claims 2-4, wherein the selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region comprises:
calculating a width value of the initial region and a height value of the initial region according to the region geometric feature parameters of the initial region;
if the width value is greater than the height value but smaller than or equal to N times the height value, selecting a second region expansion rule, where N is a positive number greater than 1;
and the determining the first updated geometric parameter according to the region expansion rule comprises:
according to the second region expansion rule, taking, as the height value of the first updated geometric parameter, an updated height value obtained by expanding the height value of the initial region by a second preset proportion, wherein the updated height value is greater than a first height threshold and smaller than a second height threshold;
according to the second region expansion rule, taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a third preset proportion, wherein the updated width value is greater than a first width threshold and smaller than a second width threshold.
7. The method according to any one of claims 2-4, wherein the selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region comprises:
calculating a width value of the initial region and a height value of the initial region according to the region geometric feature parameters of the initial region;
if the height value is greater than M times the width value, selecting a third region expansion rule, where M is a positive number greater than 1;
and the determining the first updated geometric parameter according to the region expansion rule comprises:
according to the third region expansion rule, taking the height value of the initial region as the height value of the first updated geometric parameter;
according to the third region expansion rule, taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a fourth preset proportion, wherein the updated width value is greater than a first width threshold and smaller than a second width threshold.
8. The method according to any one of claims 2-4, wherein the selecting the corresponding region expansion rule according to the region geometric feature parameters of the initial region comprises:
calculating a width value of the initial region and a height value of the initial region according to the region geometric feature parameters of the initial region;
if the height value is greater than the width value and smaller than or equal to M times the width value, selecting a fourth region expansion rule, where M is a positive number greater than 1;
and the determining the first updated geometric parameter according to the region expansion rule comprises:
according to the fourth region expansion rule, taking, as the height value of the first updated geometric parameter, an updated height value obtained by expanding the height value of the initial region by a fifth preset proportion, wherein the updated height value is greater than a first height threshold and smaller than a second height threshold;
according to the fourth region expansion rule, taking, as the width value of the first updated geometric parameter, an updated width value obtained by expanding the width value of the initial region by a sixth preset proportion, wherein the updated width value is greater than a first width threshold and smaller than a second width threshold.
9. An image object recognition processing apparatus, comprising:
the prediction module is configured to perform face recognition on an acquired target image by using a face recognition technology to determine a face region, determine the face region as an initial region, and, if no face region is detected, invoke an image object prediction model to perform region prediction on the acquired target image to determine the initial region; and adjust the position of the initial region in the target image according to a region expansion rule to obtain a prediction region;
the recognition module is configured to invoke an image object recognition model to perform image object recognition on sub-image regions in the prediction region and determine whether each sub-image region belongs to an image object to be recognized; wherein the image object to be recognized is a hair image object, the image object recognition model is used to segment the region where the hair image object in the prediction region is located, a probability map of the same size as the prediction region is obtained through the image object recognition model, and each probability value on the probability map indicates the probability that the corresponding pixel point belongs to the hair image object;
the determining module is configured to determine an image object region in the target image according to a recognition result of the image object recognition, wherein the recognition result comprises: result information recording which sub-image regions belong to the image object to be recognized;
wherein, when adjusting the position of the initial region in the target image according to the region expansion rule to obtain the prediction region, the prediction module is configured to select a corresponding region expansion rule according to region geometric feature parameters of the initial region; determine a first updated geometric parameter according to the region expansion rule; and adjust the position of the initial region in the target image according to the first updated geometric parameter to obtain the adjusted prediction region; wherein different region expansion rules are allowed to be selected according to the region geometric feature parameters, so that the first updated geometric parameter is determined according to the different region expansion rules; and the corresponding region expansion rule is selected according to the ratio between a width value of the initial region and a height value of the initial region, both calculated based on the region geometric feature parameters of the initial region;
wherein the sub-image region is a region in which one pixel point in the prediction region is located; the recognition module is configured to invoke the image object recognition model to perform image object recognition on each pixel point in the prediction region and determine the probability that each pixel point in the prediction region belongs to the image object to be recognized, wherein pixel points whose probability values are greater than or equal to a preset threshold belong to the image object to be recognized; and the recognition result of the image object recognition is a probability map corresponding to the prediction region and determined according to the obtained probabilities.
10. A smart device, comprising a storage apparatus and a processor, wherein the storage apparatus stores program instructions, and the processor invokes the program instructions to implement the method according to any one of claims 1-8.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions which, when executed, implement the method according to any one of claims 1-8.
12. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.
CN201911379526.4A 2019-12-27 2019-12-27 Image object recognition processing method and device, intelligent device and storage medium Active CN111160240B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911379526.4A CN111160240B (en) 2019-12-27 2019-12-27 Image object recognition processing method and device, intelligent device and storage medium
CN202410601429.XA CN118334320A (en) 2019-12-27 2019-12-27 Image object recognition processing method, device and equipment, storage medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911379526.4A CN111160240B (en) 2019-12-27 2019-12-27 Image object recognition processing method and device, intelligent device and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202410601429.XA Division CN118334320A (en) 2019-12-27 2019-12-27 Image object recognition processing method, device and equipment, storage medium and product

Publications (2)

Publication Number Publication Date
CN111160240A CN111160240A (en) 2020-05-15
CN111160240B true CN111160240B (en) 2024-05-24

Family

ID=70558573

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202410601429.XA Pending CN118334320A (en) 2019-12-27 2019-12-27 Image object recognition processing method, device and equipment, storage medium and product
CN201911379526.4A Active CN111160240B (en) 2019-12-27 2019-12-27 Image object recognition processing method and device, intelligent device and storage medium


Country Status (1)

Country Link
CN (2) CN118334320A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763228B (en) * 2020-06-01 2024-03-19 北京达佳互联信息技术有限公司 Image processing method, device, electronic equipment and storage medium
CN111652144B (en) * 2020-06-03 2023-09-26 广东小天才科技有限公司 Question segmentation method, device, equipment and medium based on target area fusion

Citations (8)

Publication number Priority date Publication date Assignee Title
CN102973231A (en) * 2011-07-29 2013-03-20 奥林巴斯株式会社 Image processing device, image processing method and image processing program
US20160307054A1 (en) * 2013-11-14 2016-10-20 Clarion Co., Ltd Surrounding Environment Recognition Device
CN107786904A (en) * 2017-10-30 2018-03-09 深圳Tcl数字技术有限公司 Picture amplification method, display device and computer-readable recording medium
JP2018124689A (en) * 2017-01-31 2018-08-09 株式会社日立製作所 Moving body detection device, moving body detection system and moving body detection method
CN108460389A (en) * 2017-02-20 2018-08-28 阿里巴巴集团控股有限公司 A kind of the type prediction method, apparatus and electronic equipment of identification objects in images
CN109166261A (en) * 2018-10-11 2019-01-08 平安科技(深圳)有限公司 Image processing method, device, equipment and storage medium based on image recognition
CN109977949A (en) * 2019-03-20 2019-07-05 深圳市华付信息技术有限公司 Text positioning method, device, computer equipment and the storage medium of frame fine tuning
CN110175980A (en) * 2019-04-11 2019-08-27 平安科技(深圳)有限公司 Image definition recognition methods, image definition identification device and terminal device


Also Published As

Publication number Publication date
CN111160240A (en) 2020-05-15
CN118334320A (en) 2024-07-12

Similar Documents

Publication Publication Date Title
CN109493350B (en) Portrait segmentation method and device
US20230196837A1 (en) Action recognition method and apparatus, and device and storage medium
CN110929569B (en) Face recognition method, device, equipment and storage medium
CN109344742B (en) Feature point positioning method and device, storage medium and computer equipment
CN108345892B (en) Method, device and equipment for detecting significance of stereo image and storage medium
CN111814902A (en) Target detection model training method, target identification method, device and medium
US20140334736A1 (en) Face recognition method and device
CN112102340B (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
CN106650615B (en) A kind of image processing method and terminal
CN112419170A (en) Method for training occlusion detection model and method for beautifying face image
CN110176024B (en) Method, device, equipment and storage medium for detecting target in video
CN111275034B (en) Method, device, equipment and storage medium for extracting text region from image
CN110245621B (en) Face recognition device, image processing method, feature extraction model, and storage medium
JP5578816B2 (en) Image processing device
CN111160240B (en) Image object recognition processing method and device, intelligent device and storage medium
CN111401196A (en) Method, computer device and computer readable storage medium for self-adaptive face clustering in limited space
CN112001285B (en) Method, device, terminal and medium for processing beauty images
WO2022194079A1 (en) Sky region segmentation method and apparatus, computer device, and storage medium
CN112766028B (en) Face fuzzy processing method and device, electronic equipment and storage medium
CN113221842A (en) Model training method, image recognition method, device, equipment and medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN111179287A (en) Portrait instance segmentation method, device, equipment and storage medium
CN115984977A (en) Living body detection method and system
CN112950641B (en) Image processing method and device, computer readable storage medium and electronic equipment
CN115471413A (en) Image processing method and device, computer readable storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant