CN114255389A - Target object detection method, device, equipment and storage medium


Info

Publication number
CN114255389A
Authority
CN
China
Prior art keywords
image
target object
model
detected
characteristic data
Prior art date
Legal status
Pending
Application number
CN202111348064.7A
Other languages
Chinese (zh)
Inventor
王泓清
丁晟
邱庆举
Current Assignee
Zhejiang Geely Holding Group Co Ltd
Zhejiang Shikong Daoyu Technology Co Ltd
Original Assignee
Zhejiang Geely Holding Group Co Ltd
Zhejiang Shikong Daoyu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Geely Holding Group Co Ltd, Zhejiang Shikong Daoyu Technology Co Ltd filed Critical Zhejiang Geely Holding Group Co Ltd
Priority to CN202111348064.7A
Publication of CN114255389A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention relates to a target object detection method, device, equipment and storage medium. An image to be detected is acquired and input into a first model to identify the objects in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; characteristic data of a first object are acquired to obtain first characteristic data; the first characteristic data are input into a second model to identify the first object, so as to obtain a second possibility that the first object belongs to the target object; and whether the target object is contained in the image to be detected is determined based on the second possibility. With this scheme, a target detection result more accurate than that of a deep learning algorithm alone can be obtained, target objects of different categories can be detected, the range of detectable target object categories is expanded, and the applicability is improved.

Description

Target object detection method, device, equipment and storage medium
Technical Field
The present invention relates to the field of computer vision, and in particular, to a method, an apparatus, a device, and a storage medium for detecting a target object.
Background
At present, deep learning algorithms achieve good results in target object detection, but they still have shortcomings when detecting target objects in remote sensing images.
First, a deep learning model is trained on a large number of image samples. The principle by which the deep learning model detects the target object is to rely on image features of the pixels in the image, such as texture, details, edges, pixel gradients and shapes. However, marine remote sensing images are affected by illumination, weather, sea state and other conditions, so their resolution varies widely, and the image features of low-resolution remote sensing images are not obvious.
In addition, because the deep learning model uses supervised learning, class preference easily occurs when the training data are unbalanced, so the detection effect on objects in images of the same type that lie outside the training set is poor. For example, when detecting ship targets in marine remote sensing images, the deep learning model detects ships of the same type and structure as those in the sample images well, but cannot detect ships of other types that differ greatly from the sample images.
Therefore, the accuracy of target detection on remote sensing images with a deep learning algorithm alone is low, and for multi-class detection targets the algorithm covers few classes and has poor applicability.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. To this end, a first aspect of the present invention provides a target object detection method, including:
acquiring an image to be detected;
inputting the image to be detected into a first model to identify the object in the image to be detected, and obtaining a first possibility that each object in the image to be detected belongs to a target object; the first model is obtained by training according to a first sample image of the target object;
acquiring characteristic data of a first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold;
inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are obtained by extracting text feature information corresponding to features of each object in a second sample image, and the second sample image comprises images of the target objects in different categories;
and determining whether the target object is contained in the image to be detected or not based on the second possibility.
Optionally, after obtaining the second possibility that the first object belongs to the target object, the method further includes:
acquiring the second characteristic data;
determining a third likelihood that the first object belongs to the target object based on the second feature data and the first feature data.
Optionally, the first feature data at least includes size information of the first object, and the determining of the third likelihood that the first object belongs to the target object comprises:
acquiring size information of the first object from the first characteristic data to obtain a first size;
acquiring pixel equivalent information of the second sample image from the second characteristic data;
determining an actual size of the first object according to the pixel equivalent information and the first size;
determining a third likelihood that the first object belongs to the target object based on the actual size.
Optionally, the determining whether the image to be detected includes the target object based on the second possibility includes:
carrying out weighted summation on the second possibility and the third possibility to obtain a target possibility;
determining whether the image to be detected comprises the target object or not according to the relation between the target possibility and a preset second threshold value; the second threshold is greater than the first threshold.
Optionally, after determining whether the target object is included in the image to be detected based on the second possibility, the method further includes:
if the target object is determined to be contained in the image to be detected, acquiring characteristic data of the target object;
acquiring feature information corresponding to each target object category in the second feature data to obtain category feature information;
and matching the characteristic data of the target object with the category characteristic information to determine the category of the target object.
Optionally, before the image to be detected is input into the first model to identify the object in the image to be detected, the method further includes:
setting an output result of the first model, wherein the output result comprises characteristic data of each object in the image to be detected;
the acquiring feature data of the first object comprises:
and acquiring characteristic data of the first object from the output result.
Optionally, the first model is obtained by training through the following method:
preprocessing an original image to obtain a first sample image; the preprocessing at least comprises the steps of carrying out linear transformation, spatial transformation and image enhancement on the original image;
determining a use scene and a use requirement of target object detection, and selecting a matched algorithm model from a plurality of algorithm models according to the use scene and the use requirement to obtain a first algorithm model.
And training the first algorithm model by using the first sample image to obtain a first model.
Optionally, the second model is obtained by training through the following method:
extracting text characteristic information corresponding to a target object to be detected from the second sample image to obtain second characteristic data;
determining an output value type detected by a target object, and selecting an algorithm model matched with the output value type from a plurality of algorithm models to obtain a second algorithm model; the output value type comprises one of a continuous value and a type value;
and training the second algorithm model by using the second characteristic data to obtain a second model.
Optionally, after obtaining the second model, the method further includes:
acquiring a third sample image, wherein an object in the third sample image is an object which is not present in the first sample image and the second sample image;
extracting feature information of a target object to be detected from the third sample image to obtain newly added sample feature data;
and retraining the second algorithm model by using the newly added sample characteristic data and the sample characteristic data obtained according to the second sample image to obtain an updated second model.
Optionally, the second characteristic data at least includes: category information, length-width ratio information, color information, shape information and geographical position information of the target object, and pixel equivalent information of the second sample image.
A second aspect of the present invention provides a target object detection apparatus, the apparatus comprising:
the acquisition module is used for acquiring an image to be detected;
the first identification module is used for inputting the image to be detected into a first model to identify the object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; the first model is obtained by training according to a first sample image of the target object;
the first characteristic data acquisition module is used for acquiring the characteristic data of the first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold;
the second identification module is used for inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are extracted from text feature information corresponding to features of each object in a second sample image, and the second sample image at least comprises the first sample image;
a determining module, configured to determine whether the target object is included in the image to be detected based on the second likelihood.
Optionally, the apparatus further comprises:
the second characteristic data acquisition module is used for acquiring the second characteristic data;
a third likelihood determination module, configured to determine a third likelihood that the first object belongs to the target object according to the second feature data and the first feature data.
Optionally, the first feature data at least includes size information of the first object, and the third possibility determining module is specifically configured to:
acquiring size information of the first object from the first characteristic data to obtain a first size;
acquiring pixel equivalent information of the second sample image from the second characteristic data;
determining an actual size of the first object according to the pixel equivalent information and the first size;
determining a third likelihood that the first object belongs to the target object based on the actual size.
Optionally, the determining module is specifically configured to:
carrying out weighted summation on the second possibility and the third possibility to obtain a target possibility;
determining whether the image to be detected comprises the target object or not according to the relation between the target possibility and a preset second threshold value; the second threshold is greater than the first threshold.
Optionally, the apparatus further comprises:
the target object characteristic data acquisition module is used for acquiring the characteristic data of the target object if the target object is determined to be contained in the image to be detected;
a category characteristic information obtaining module, configured to obtain characteristic information corresponding to each target object category in the second characteristic data, to obtain category characteristic information;
and the target object category determining module is used for matching the feature data of the target object with the category feature information to determine the category of the target object.
Optionally, the apparatus further comprises:
the output result setting module is used for setting the output result of the first model, and the output result comprises the characteristic data of each object in the image to be detected;
the first characteristic data acquisition module is specifically configured to:
and acquiring characteristic data of the first object from the output result.
Optionally, the apparatus further includes a first model training module, where the first model training module is specifically configured to:
preprocessing an original image to obtain a first sample image; the preprocessing at least comprises the steps of carrying out linear transformation, spatial transformation and image enhancement on the original image;
determining a use scene and a use requirement of target object detection, and selecting a matched algorithm model from a plurality of algorithm models according to the use scene and the use requirement to obtain a first algorithm model.
And training the first algorithm model by using the first sample image to obtain a first model.
Optionally, the apparatus further includes a second model training module, where the second model training module is specifically configured to:
extracting text characteristic information corresponding to a target object to be detected from the second sample image to obtain second characteristic data;
determining an output value type detected by a target object, and selecting an algorithm model matched with the output value type from a plurality of algorithm models to obtain a second algorithm model; the output value type comprises one of a continuous value and a type value; and training the second algorithm model by using the second characteristic data to obtain a second model.
Optionally, the second model training module is further configured to:
acquiring a third sample image, wherein an object in the third sample image is an object which is not present in the first sample image and the second sample image;
extracting feature information of a target object to be detected from the third sample image to obtain newly added sample feature data;
and retraining the second algorithm model by using the newly added sample characteristic data and the sample characteristic data obtained according to the second sample image to obtain an updated second model.
A third aspect of the present invention proposes an apparatus comprising a processor and a memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by the processor to implement the target object detection method according to the first aspect.
A fourth aspect of the present invention proposes a computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the target object detecting method according to the first aspect.
According to the specific embodiment provided by the invention, the invention has the following technical effects:
according to the target detection method provided by the embodiment of the invention, the image to be detected is acquired, and the image to be detected is input into the first model to identify the object in the image to be detected, so that the first possibility that each object in the image to be detected belongs to the target object is obtained; acquiring characteristic data of a first object to obtain first characteristic data; inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; determining whether the target object is contained in the image to be detected based on the second possibility. According to the scheme, the image characteristics of each object in the image to be detected are inspected through the first model to obtain the first object, and the object characteristics of the first object are further inspected through the second model, so that the image characteristics and the object characteristics of the image to be detected are integrated, and a target detection result which is more accurate than a deep learning algorithm can be obtained. In addition, the second model collects the feature data of different types of target objects, so that different types of target objects can be detected, the category range of the detectable target objects is expanded, and the applicability is improved.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings used in the description of the embodiment or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art it is also possible to derive other drawings from these drawings without inventive effort.
Fig. 1 is a flowchart illustrating steps of a target object detection method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps of another method for detecting a target object according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps of a method for training a first model according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating steps of a method for training a second model according to an embodiment of the present invention;
fig. 5 is a block diagram of a target object detection apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Fig. 1 is a flowchart of a target object detection method according to an embodiment of the present invention. The present specification provides method steps as described in the embodiments or flowcharts, but more or fewer steps may be included on the basis of routine or non-inventive labor. In practice, a system or server product may execute the steps sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures.
The method may comprise the steps of:
step 101, obtaining an image to be detected.
The image to be detected can be an ordinary image or a remote sensing image. In particular, ships, as important military targets and the main carriers of maritime transportation, are of great significance and have broad application prospects in both the military and civilian fields. The embodiment of the invention therefore takes a marine remote sensing image as the image to be detected and the detection of ship targets as an example to explain the method of the invention.
Step 102, inputting the image to be detected into a first model to identify an object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; the first model is trained from a first sample image of the target object.
After the image to be detected is input into the first model for target object detection, each object in the image to be detected is identified in the output result of the first model by a bounding box, and the first possibility that the object belongs to the target object is marked on each bounding box. The first likelihood is generally represented by a probability value, which may be, for example, a percentage or a decimal.
For example, when a marine remote sensing image is input to the first model to detect a target object, the probability values of the objects belonging to a ship are respectively indicated on the detected objects such as islands, birds, clouds, and ships on the marine remote sensing image in the output result.
The first model is obtained through training of an input first sample image, and therefore the first model detects the target object according to image features such as texture, details, edges, pixel gradients, shapes and the like of pixel points in the image. However, the remote sensing images are affected by conditions such as illumination, weather, sea state, and the like, so that the resolution of the images is various, and the image features of the remote sensing images with low resolution are not obvious.
In addition, the first model is trained by using the first sample image, and supervised learning is used, so that the problem of class preference is easily caused under the condition of unbalanced training data, and the detection effect on objects in the same type of images in a non-training set is poor.
Therefore, the target object detection is performed only through the first model, the accuracy of the detection result is low, and the applicability to multiple types of target objects is poor.
103, acquiring characteristic data of the first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold.
In order to further improve the target detection accuracy, objects with a higher probability of belonging to the target object may be screened from the detection result of the first model, that is, objects with a first probability greater than a preset first threshold may be screened. Objects which are different from the target object greatly can be removed preliminarily through a first threshold value, and the rest objects are all used as objects detected in the next step. Thus, the first threshold may be set to a lower threshold.
For example, if the first likelihood is expressed in terms of a percentage, the first threshold may be set to 50%. Of course, the value of the first threshold is not specifically limited in the embodiment of the present invention.
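As a purely illustrative sketch (the `Detection` structure and the `filter_first_objects` function are hypothetical names, not part of the invention), screening first objects by the first threshold could look as follows:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # bounding box (x_min, y_min, x_max, y_max) in pixels
    first_likelihood: float         # first possibility that the object belongs to the target object

def filter_first_objects(detections: List[Detection],
                         first_threshold: float = 0.5) -> List[Detection]:
    """Keep only the objects whose first possibility exceeds the preset first threshold."""
    return [d for d in detections if d.first_likelihood > first_threshold]

# Detections produced by the first model for one marine remote sensing image.
detections = [
    Detection((10, 20, 110, 45), 0.92),   # probably a ship, kept as a first object
    Detection((300, 80, 330, 95), 0.35),  # probably an island or a wave, removed preliminarily
]
first_objects = filter_first_objects(detections, first_threshold=0.5)
```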
Step 104, inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are extracted from text feature information corresponding to features of objects in a second sample image, and the second sample image comprises images of the target objects of different categories.
The second model is trained on the second feature data, which are text information extracted from the second sample image, so the second model has a strong ability to process text data. Using the mapping, formed during training, between the feature data of objects and the target object, the second model can judge how similar the text feature data of an object to be detected are to the text feature data of the target object, and thus obtain the possibility that the object to be detected belongs to the target object.
In addition, the second sample image includes images of target objects of different types, and the obtained second feature data also includes feature data of the target objects of different types, so that the second model trained by the second feature data can be used for detecting the target objects of different types.
In the embodiment of the invention, a first object with high possibility of belonging to a target object is screened out through a first model, and then characteristic attribute related data of the first object is extracted to obtain first characteristic data. The first feature data is text data and the second model is adept at processing the text data, and the first feature data is input into the second model, so that the possibility that the first object belongs to the target object can be accurately predicted.
Optionally, the second characteristic data at least includes: category information, length-width ratio information, color information, shape information and geographical position information of the target object, and pixel equivalent information of the second sample image.
Second feature data including feature attribute information of the target object and feature attribute information of the image, which can comprehensively determine whether an object is the target object, is extracted from the second sample image.
The category information of the target object refers to the category to which the target object belongs; for example, for ship objects the categories include aircraft carriers, battleships, cruisers, destroyers and the like. The length-width ratio information of the target object refers to the ratio between the length and the width of the target object, and the geographical position information of the target object refers to the longitude and latitude of the target object.
The pixel equivalent information of the second sample image belongs to the feature attribute information of the image. The pixel equivalent refers to the actual physical size represented by a pixel point in the image.
The characteristic information basically covers all characteristic attributes of the target object, so that the detection precision of the second model is improved, and the applicability of the second model to multiple classes of target objects is improved.
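A minimal sketch of how the second model could score the first feature data is given below; it assumes the feature fields have been encoded as a numeric vector (categorical fields such as color and shape would need encoding in practice) and uses a gradient-boosting classifier from scikit-learn as a stand-in for the second model, which the patent does not prescribe:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative numeric encoding of second feature data rows:
# [length_width_ratio, mean_hue, shape_code, latitude, longitude]
X_train = np.array([
    [7.2, 0.10, 1, 30.5, 122.1],   # ship samples of several categories
    [6.5, 0.12, 1, 31.0, 121.8],
    [1.1, 0.30, 0, 30.9, 122.0],   # non-ship samples (islands, waves, ...)
    [1.4, 0.28, 0, 30.4, 121.9],
])
y_train = np.array([1, 1, 0, 0])   # 1 = target object (ship), 0 = not a target object

second_model = GradientBoostingClassifier().fit(X_train, y_train)

# First feature data of one first object, encoded in the same way.
first_feature_data = np.array([[6.8, 0.11, 1, 30.7, 122.3]])
second_likelihood = second_model.predict_proba(first_feature_data)[0, 1]
```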
And 105, determining whether the target object is contained in the image to be detected or not based on the second possibility.
The second model outputs a second likelihood that the first object belongs to the target object, which may also be in the form of a percentage or a decimal. Whether the image to be detected contains the target object can be finally determined according to the second possibility.
The second possibility is obtained by feeding the detection result of the first model into the second model for re-verification. The first object is obtained by comprehensively considering image information of the image to be detected, such as texture, details, edges, pixel gradients and shape, and the second model further considers feature information of the first object itself, such as its length-width ratio, color and shape. By combining the image information and the object feature information in this way, a more accurate target detection result is obtained than with the first model or the second model alone.
In addition, the second feature data includes feature information of target objects of different types, so that the target objects of different types can be detected, and the detection types of the target objects are expanded.
For example, a marine remote sensing image is input into the first model to detect a target object, and an object which is more than a first threshold and possibly belongs to a ship is selected from the output result as a first object, wherein the first object may include objects which are not ships, such as small islands, dark sea waves and the like. And extracting first characteristic data of the first object, and inputting the first characteristic data into the second model. The second model may determine a second possibility that the first feature data of each first object belongs to the feature data of the ship, i.e., a second possibility that the first object belongs to the ship object. Therefore, whether the ship is included in the ocean remote sensing image or not can be judged according to the second possibility, and the object in the image is the ship.
For example, if the extracted first feature data indicate that the first object has a length-width ratio of 5:1, the second probability obtained by inputting the first feature data into the second model may be only 20%, because no ship with a length-width ratio of 5:1 exists in the second feature data, so the second model accurately judges that the probability that the first object belongs to a ship is very small.
To sum up, in the target detection method provided by the embodiment of the present invention, the image to be detected is obtained, and the image to be detected is input into the first model to identify the object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to the target object; acquiring characteristic data of a first object to obtain first characteristic data; inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; determining whether the target object is contained in the image to be detected based on the second possibility. According to the scheme, the image characteristics of each object in the image to be detected are inspected through the first model to obtain the first object, and the object characteristics of the first object are further inspected through the second model, so that the image characteristics and the object characteristics of the image to be detected are integrated, and a more accurate target detection result can be obtained compared with a deep learning algorithm or compared with the independent first model or the independent second model. In addition, the second model collects the feature data of different types of target objects, so that different types of target objects can be detected, the category range of the detectable target objects is expanded, and the applicability is improved.
Fig. 2 is a flowchart illustrating steps of another target object detection method according to an embodiment of the present invention.
The method may comprise the steps of:
step 201, obtaining an image to be detected.
In the embodiment of the present invention, step 201 may refer to step 101, which is not described herein again.
Step 202, setting an output result of the first model, wherein the output result comprises characteristic data of each object in the image to be detected.
The output of the first model will generally mark each object in the image to be detected with a bounding box and indicate, for each bounding box, the first likelihood that the object belongs to the target object. In order to facilitate subsequent use of the feature data of each object, the first model may be set to output an additional result, namely the feature data of each object.
The feature data of each object can be obtained by algorithm identification calculation in the image. The feature data may include length-width ratio information, color information, shape information, size information, and the like of the object.
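For illustration only, the feature data of an object could be derived from its bounding box roughly as follows, assuming the image is a NumPy array of shape (H, W, 3) and the box coordinates come from the first model's output; the exact attributes computed by the invention may differ:

```python
import numpy as np

def object_feature_data(image: np.ndarray, box) -> dict:
    """Derive simple feature data (size, length-width ratio, mean color) from one bounding box."""
    x_min, y_min, x_max, y_max = box
    width, height = x_max - x_min, y_max - y_min
    crop = image[y_min:y_max, x_min:x_max]
    return {
        "size_px": (width, height),  # size information in pixels
        "length_width_ratio": max(width, height) / max(1, min(width, height)),
        "mean_color": crop.reshape(-1, crop.shape[-1]).mean(axis=0).tolist(),
    }
```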
Step 203, inputting the image to be detected into a first model to identify the object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; the first model is trained from a first sample image of the target object.
In the embodiment of the present invention, step 203 may refer to step 102, which is not described herein again.
Step 204, acquiring characteristic data of the first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold.
In the embodiment of the present invention, step 204 may refer to step 103, which is not described herein again.
Optionally, the acquiring the feature data of the first object includes: and acquiring characteristic data of the first object from the output result.
Specifically, a first object is screened out from the output result, and characteristic data of the first object is obtained from the output result.
The characteristic data of the first object is directly obtained from the output result, and the method is simpler, quicker and more accurate.
The first characteristic data may also be acquired in other ways. For example, after the first object is determined, an image of the first object is acquired through third-party software, and feature data of the first object is extracted.
Step 205, inputting the first characteristic data into a second model to identify the first object, so as to obtain a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data is extracted from text feature information corresponding to features of each object in a second sample image, and the second sample image at least comprises the first sample image.
In the embodiment of the present invention, step 205 may refer to step 104, which is not described herein again.
The above steps 201 to 205 are processes of acquiring the first feature data from the first model, and inputting the first feature data into the second model for comprehensive detection. In order to further improve the detection accuracy, the second feature data used in the training of the second model may also be compared with the first feature data output by the first model to obtain a new detection result, i.e., step 206-step 207.
And step 206, acquiring the second characteristic data.
And step 207, determining a third possibility that the first object belongs to the target object according to the second characteristic data and the first characteristic data.
In steps 206-207, each item in the first feature data may be compared with a corresponding item in the second feature data, thereby determining a third likelihood that the first object belongs to the target object.
Optionally, step 207 includes the following steps A1-A4:
a1, obtaining size information of the first object from the first characteristic data to obtain a first size;
a2, acquiring pixel equivalent information of the second sample image from the second characteristic data;
a3, determining the actual size of the first object according to the pixel equivalent information and the first size;
a4, determining a third possibility that the first object belongs to the target object according to the actual size.
In steps a1-a4, the pixel equivalent refers to the actual physical size represented by a pixel in the image, the first size is the number of pixels in the image of parameters such as the length and width of the first object, and the actual physical size of the first object can be calculated through the pixel equivalent information of the sample image. The third possibility may be determined based on the degree of difference between the actual size and the first size.
For example, if the first object has a first size of 5 x 3 pixels in the image and the pixel equivalent of the second sample image is 100 cm per pixel, i.e. one pixel in the image corresponds to a physical size of 100 cm, the actual physical size of the first object can be calculated to be 500 cm x 300 cm. Whether a 500 cm x 300 cm first object is a ship can then be determined according to the sizes of real ships.
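A small sketch of steps A1-A4 follows; the plausible ship-length range and the decay rule used to turn the actual size into a third possibility are illustrative assumptions, not values taken from the patent:

```python
def third_likelihood(first_size_px, pixel_equivalent_cm, plausible_length_cm=(3000, 35000)):
    """Convert the first size into an actual size and score how plausible that size is for a ship."""
    length_px = max(first_size_px)                      # longer side of the first size
    actual_length_cm = length_px * pixel_equivalent_cm  # A3: actual size from the pixel equivalent
    low, high = plausible_length_cm
    if low <= actual_length_cm <= high:
        return 1.0                                      # size consistent with real ships
    nearest = low if actual_length_cm < low else high
    # A4: the further the actual size lies outside the plausible range, the lower the third possibility.
    return max(0.0, 1.0 - abs(actual_length_cm - nearest) / nearest)

# Worked example from the description: 5 x 3 pixels at 100 cm per pixel -> 500 cm long,
# far smaller than a real ship, so the third possibility is low.
p3 = third_likelihood((5, 3), 100.0)
```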
And step 208, carrying out weighted summation on the second possibility and the third possibility to obtain a target possibility.
Weights for the second possibility and the third possibility are preset according to experimental results, and the two possibilities are weighted and summed, so that the obtained target possibility combines the second possibility and the third possibility and the detection result is more comprehensive and accurate.
In addition, the target possibility can also be obtained by comprehensively judging the second possibility and the third possibility with a decision method such as ensemble learning or EMV (Expected Monetary Value) decision comparison.
Step 209, determining whether the image to be detected includes the target object according to a relation between the target possibility and a preset second threshold; the second threshold is greater than the first threshold.
The second threshold may be set to a value larger than the first threshold, and an optimal value may be determined experimentally.
For example, if the second threshold is 90% and the value of the target possibility is 92%, it may be determined that the first object corresponding to the target possibility is the target object, and the conclusion that the image to be detected includes the target object is obtained.
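The fusion of steps 208-209 can be sketched as below; the weights 0.7/0.3 and the 0.9 second threshold are placeholder values that would in practice be tuned experimentally, as the description notes:

```python
def target_likelihood(second_likelihood: float, third_likelihood: float,
                      w2: float = 0.7, w3: float = 0.3) -> float:
    """Weighted summation of the second and third possibilities (weights preset experimentally)."""
    return w2 * second_likelihood + w3 * third_likelihood

def contains_target_object(second_likelihood: float, third_likelihood: float,
                           second_threshold: float = 0.9) -> bool:
    """The image is judged to contain the target object when the target possibility exceeds the second threshold."""
    return target_likelihood(second_likelihood, third_likelihood) > second_threshold

# Example matching the description: the fused target possibility 0.92 exceeds the 0.9 second threshold.
print(contains_target_object(second_likelihood=0.95, third_likelihood=0.85))  # True (0.92 > 0.9)
```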
Step 210, if it is determined that the target object is included in the image to be detected, acquiring feature data of the target object.
And step 211, obtaining feature information corresponding to each target object category in the second feature data to obtain category feature information.
Step 212, matching the feature data of the target object with the category feature information, and determining the category of the target object.
In steps 210 to 212, since the second feature data includes the category information of the target object, the feature data corresponding to different categories of the target object may be counted to obtain the category feature information. And matching the characteristic information of the target object with the category characteristic information to obtain the category of the target object.
For example, the category feature information includes the geographical range in which a ship of a given category would normally be located, such as the geographical range in which the Liaoning would appear around China and the geographical range in which the Reshen would appear around the United States. The geographical position of the target object, which lies within China, is acquired from the feature data of the target object, and the target object is determined, according to the category feature information, to belong to one of the Chinese ship categories.
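The category matching of steps 210-212 might be sketched as follows; the category table, its feature fields and the scoring rule are invented for illustration and are not taken from the patent:

```python
# Hypothetical category feature information summarised from the second feature data.
category_features = {
    "chinese_carrier": {"lat_range": (18.0, 41.0), "lon_range": (105.0, 126.0), "length_width_ratio": 4.9},
    "us_carrier":      {"lat_range": (24.0, 49.0), "lon_range": (-125.0, -66.0), "length_width_ratio": 4.3},
}

def match_category(target_feature_data: dict) -> str:
    """Return the category whose feature information best matches the detected target object."""
    lat, lon = target_feature_data["position"]
    best_name, best_score = None, float("-inf")
    for name, feats in category_features.items():
        in_region = (feats["lat_range"][0] <= lat <= feats["lat_range"][1]
                     and feats["lon_range"][0] <= lon <= feats["lon_range"][1])
        ratio_gap = abs(feats["length_width_ratio"] - target_feature_data["length_width_ratio"])
        score = (2.0 if in_region else 0.0) - ratio_gap   # favour matching geography, then shape
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# A target object detected near the Chinese coast is matched to a Chinese ship category.
category = match_category({"position": (30.7, 122.3), "length_width_ratio": 5.0})
```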
Fig. 3 is a flowchart illustrating steps of a training method for a first model according to an embodiment of the present invention.
The method comprises the following steps:
step 301, performing data cleaning and preprocessing on an original image to obtain a first sample image; the preprocessing at least comprises the steps of carrying out linear transformation, spatial transformation and image enhancement on the original image.
The manually acquired original images are first cleaned, mainly to remove dirty or erroneous data in the original images. The original image is then preprocessed, including but not limited to linear transformation, spatial transformation, image enhancement and other image processing, so that the image features become more prominent and the algorithm model can learn them more easily, thereby speeding up training and improving the detection effect of the model.
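As one possible reading of step 301, the preprocessing could be sketched with OpenCV as below; the specific parameters (contrast gain, target resolution, and the choice of histogram equalisation as the enhancement) are assumptions rather than values specified by the invention:

```python
import cv2
import numpy as np

def preprocess(original_bgr: np.ndarray) -> np.ndarray:
    """Illustrative preprocessing: linear transformation, spatial transformation and image enhancement."""
    # Linear transformation: scale the pixel values (contrast) and add an offset (brightness).
    img = cv2.convertScaleAbs(original_bgr, alpha=1.2, beta=10)
    # Spatial transformation: resize to the input resolution expected by the detection model.
    img = cv2.resize(img, (1024, 1024))
    # Image enhancement: equalise the luminance channel to make weak features more prominent.
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```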
Step 302, determining a usage scenario and a usage requirement of target object detection, and selecting a matched algorithm model from a plurality of algorithm models according to the usage scenario and the usage requirement to obtain a first algorithm model.
The use scenario and the requirements of target object detection are determined, and different algorithm models are selected accordingly. For example, for a scene with high requirements on detection quality or operational stability, a two-stage target detection model such as Faster R-CNN (Faster Region-based Convolutional Neural Network) may be selected; for a scene with high requirements on processing speed, a one-stage target detection model such as YOLO (You Only Look Once) may be selected.
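Purely as an example of scenario-based model selection (step 302), the following sketch uses torchvision's built-in detectors; an SSD stands in for the one-stage branch because torchvision does not ship YOLO, and a recent torchvision (>= 0.13) is assumed for the `weights=` argument:

```python
from torchvision.models import detection

def select_first_algorithm_model(use_scenario: str):
    """Two-stage detector for accuracy-critical scenes, one-stage detector for speed-critical scenes."""
    if use_scenario == "accuracy":
        # Two-stage model (Faster R-CNN), favoured when detection quality or stability matters most.
        return detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=2)
    # One-stage model, favoured when processing speed matters most (the description names YOLO;
    # SSD is used here only because it is available in torchvision).
    return detection.ssd300_vgg16(weights=None, num_classes=2)

first_algorithm_model = select_first_algorithm_model("accuracy")
```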
Step 303, training the first algorithm model by using the first sample image to obtain a first model.
After the first algorithm model is determined, the first model is finally obtained through the steps of characteristic information extraction, model training, model parameter adjustment, model evaluation and the like.
In the steps 301 to 303, the original image is preprocessed, so that the features of the image are more prominent, and the algorithm model can learn the features of the image more easily, thereby accelerating the training speed and improving the detection effect of the model; according to the use scene and the use requirement of the target object detection, the matched algorithm model is selected, so that the training result can better meet the practical requirement, and the user has better use experience.
Fig. 4 is a flowchart illustrating steps of a training method for a second model according to an embodiment of the present invention.
The method comprises the following steps:
step 401, extracting text feature information corresponding to the target object to be detected from the second sample image to obtain second feature data.
First, the manually acquired original images are cleaned and preprocessed, including removing logical errors and abnormal values, filling in missing values, and correcting format and content.
Then, the attribute features to be extracted are selected with feature selection techniques, that is, from all features of the current target object, the features that give the model the best detection effect are chosen. Different feature selection methods are used according to the requirements: a filter method when training speed is the priority, a wrapper method when model performance is the priority, and an embedded method to balance the two.
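The choice between filter, wrapper and embedded feature selection could be expressed with scikit-learn as in the sketch below; the estimators and the number of selected features are illustrative choices only:

```python
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

def select_features(X, y, strategy: str = "filter"):
    """Filter method for training speed, wrapper method for model performance, embedded method as a balance."""
    if strategy == "filter":
        selector = SelectKBest(score_func=f_classif, k=5)
    elif strategy == "wrapper":
        selector = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=5)
    else:  # embedded
        selector = SelectFromModel(estimator=LogisticRegression(penalty="l1", solver="liblinear"))
    return selector.fit_transform(X, y)
```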
Step 402, determining an output value type detected by a target object, and selecting an algorithm model matched with the output value type from a plurality of algorithm models to obtain a second algorithm model; the output value type includes one of a continuous value, a type value.
In different use environments, the type of the output result of the model may differ. If a continuous value needs to be returned, for example a characteristic value of a ship fed back in different proportions, it is a regression problem; if a type value needs to be returned, for example a ship type code, it is a classification problem.
And selecting an algorithm model matched with the output value type from the multiple algorithm models according to the output result type to obtain a second algorithm model.
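A minimal sketch of matching the algorithm model to the output value type (step 402), again using scikit-learn estimators as stand-ins that the patent does not name:

```python
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

def select_second_algorithm_model(output_value_type: str):
    """A continuous output value is a regression problem; a type value is a classification problem."""
    if output_value_type == "continuous":
        return GradientBoostingRegressor()    # e.g. feeding back a characteristic value of the ship
    return GradientBoostingClassifier()       # e.g. returning a ship type code
```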
And 403, training the second algorithm model by using the second feature data to obtain a second model.
In the process of training the second algorithm model by using the second feature data, the method comprises a step of adjusting model parameters, and technicians are required to adjust the parameters according to the used model and the training result condition.
After the model training is completed, the quality of the model needs to be judged. Model evaluation indexes are the means of judging model quality, and this step uses them to do so. Different evaluation indexes are used for different types of problems and situations.
Whether further iterative training is needed is judged from the results of the model: if the results are good, the model training is complete and the model can be used for prediction; if the effect is not ideal, training is carried out again following the model parameter adjustment steps described above.
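For example (the metric names here are illustrative; the patent does not fix particular evaluation indexes), the evaluation step could be sketched as:

```python
from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error, mean_squared_error

def evaluate_second_model(y_true, y_pred, output_value_type: str) -> dict:
    """Different evaluation indexes are used for regression and classification problems."""
    if output_value_type == "continuous":
        return {"mae": mean_absolute_error(y_true, y_pred),
                "mse": mean_squared_error(y_true, y_pred)}
    return {"accuracy": accuracy_score(y_true, y_pred),
            "f1_macro": f1_score(y_true, y_pred, average="macro")}
```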
In steps 401 to 403, the text feature information of the target object is extracted from the second sample image to obtain the second feature data, an algorithm model matched with the output value type is selected, and the second model is obtained by training, so that the second model can accurately judge, from text feature data, the possibility that an object belongs to the target object.
Optionally, after obtaining the second model, the following steps B1-B3 are further included:
b1, acquiring a third sample image, wherein the object in the third sample image is an object which is not in the first sample image and the second sample image;
b2, extracting the feature information of the target object to be detected from the third sample image to obtain newly added sample feature data;
and B3, retraining the second algorithm model by using the newly added sample characteristic data and the sample characteristic data obtained according to the second sample image to obtain an updated second model.
In a conventional deep learning algorithm, if a new target object type is to be added, a large amount of training data of that type of target object needs to be collected, the data set needs to be labeled manually, and the model must be retrained on the new training data together with the old data, which requires a large amount of time and money.
In this scenario, however, only the second model need be trained separately. Specifically, if a target object type different from the original target object type is to be added, only a third sample image of the target object of the type needs to be collected, and the added sample feature data of the shape, structure, size, and the like of the target object is extracted from the third sample image. And then retraining the newly added sample characteristic data and the old second characteristic data together to obtain an updated second model.
Thus, even if the first model cannot accurately recognize that type of target object in step 102, the second model holds the feature data of that type of target object, and inputting the feature data of that type of target object into the second model achieves accurate recognition.
Therefore, when the target object category is newly added, the method only trains the second model, the effect of identifying the newly added target object can be achieved, the training time is far shorter than that of the traditional method, and the time cost and the labor cost are greatly reduced.
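The incremental update of steps B1-B3 might look like the following sketch, assuming the old and newly added sample feature data are already numeric arrays and that the same stand-in classifier is reused:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def update_second_model(old_X, old_y, new_X, new_y):
    """Retrain the second algorithm model on the old second feature data plus the newly added sample feature data."""
    X = np.concatenate([old_X, new_X])
    y = np.concatenate([old_y, new_y])
    return GradientBoostingClassifier().fit(X, y)   # only the second model is retrained; the first model is untouched
```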
Fig. 5 is a block diagram of a target object detection apparatus according to an embodiment of the present invention.
The object detection device 500 includes:
an obtaining module 501, configured to obtain an image to be detected;
a first identification module 502, configured to input the image to be detected into a first model to identify an object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; the first model is obtained by training according to a first sample image of the target object;
a first feature data obtaining module 503, configured to obtain feature data of a first object to obtain first feature data; the first object is an object of which the first possibility is greater than a preset first threshold;
a second identification module 504, configured to input the first feature data into a second model to identify the first object, so as to obtain a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are extracted from text feature information corresponding to features of each object in a second sample image, and the second sample image at least comprises the first sample image;
a determining module 505, configured to determine whether the target object is included in the image to be detected based on the second likelihood.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In yet another embodiment provided by the present invention, an apparatus is also provided, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the target object detection method described in the embodiment of the present invention.
In yet another embodiment provided by the present invention, a computer-readable storage medium is further provided, in which at least one instruction, at least one program, code set, or instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the target object detection method described in the embodiment of the present invention.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the invention to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (13)

1. A target object detection method, the method comprising:
acquiring an image to be detected;
inputting the image to be detected into a first model to identify the object in the image to be detected, and obtaining a first possibility that each object in the image to be detected belongs to a target object; the first model is obtained by training according to a first sample image of the target object;
acquiring characteristic data of a first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold;
inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are obtained by extracting text feature information corresponding to features of each object in a second sample image, and the second sample image comprises images of the target objects in different categories;
and determining whether the target object is contained in the image to be detected or not based on the second possibility.
2. The method of claim 1, after obtaining the second likelihood that the first object belongs to the target object, further comprising:
acquiring the second characteristic data;
determining a third likelihood that the first object belongs to the target object based on the second feature data and the first feature data.
3. The method according to claim 2, characterized in that the first characteristic data at least comprise size information of the first object, and the determining of the third likelihood that the first object belongs to the target object comprises:
acquiring size information of the first object from the first characteristic data to obtain a first size;
acquiring pixel equivalent information of the second sample image from the second characteristic data;
determining an actual size of the first object according to the pixel equivalent information and the first size;
determining a third likelihood that the first object belongs to the target object based on the actual size.
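One way to read claim 3 is as a physical-size consistency check: the pixel equivalent recorded with the second sample image (real-world length per pixel) converts the first size from pixels into an actual size, which is then scored against the size expected for the target object. The following sketch is only an illustration under that reading; the expected size range and the scoring rule are assumptions not taken from the disclosure.

```python
# Hedged sketch of the size-based third possibility of claim 3 (expected_size_range is assumed).
def third_possibility_from_size(first_size_px, pixel_equivalent, expected_size_range=(0.5, 2.0)):
    width_px, height_px = first_size_px
    # Convert the first size from pixels to real-world units using the pixel equivalent.
    actual_size = max(width_px, height_px) * pixel_equivalent

    lo, hi = expected_size_range
    if lo <= actual_size <= hi:
        return 1.0  # the actual size is fully consistent with the target object
    # Outside the expected range, decay the possibility with the relative deviation (an assumption).
    deviation = (lo - actual_size) / lo if actual_size < lo else (actual_size - hi) / hi
    return max(0.0, 1.0 - deviation)
```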
4. The method according to claim 2 or 3, wherein the determining whether the target object is contained in the image to be detected based on the second possibility comprises:
carrying out weighted summation on the second possibility and the third possibility to obtain a target possibility;
and determining, according to the relationship between the target possibility and a preset second threshold, whether the image to be detected comprises the target object; the second threshold is greater than the first threshold.
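Claim 4 fuses the two scores by weighted summation before comparing the result with the second threshold; a minimal sketch, with weights chosen here purely for illustration (the disclosure does not fix them):

```python
# Minimal sketch of the score fusion of claim 4; w2 and w3 are assumed weights.
def contains_target(second_possibility, third_possibility, w2=0.7, w3=0.3, second_threshold=0.6):
    target_possibility = w2 * second_possibility + w3 * third_possibility
    return target_possibility > second_threshold
```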
5. The method according to claim 1 or 4, characterized in that after determining whether the target object is contained in the image to be detected based on the second possibility, the method further comprises:
if the target object is determined to be contained in the image to be detected, acquiring characteristic data of the target object;
acquiring feature information corresponding to each target object category in the second feature data to obtain category feature information;
and matching the characteristic data of the target object with the category characteristic information to determine the category of the target object.
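Claim 5 resolves the concrete category by matching the detected target object's feature data against the per-category feature information; the sketch below assumes numeric feature vectors and a nearest-match rule, neither of which is specified in the disclosure.

```python
# Illustrative category matching for claim 5; Euclidean distance is an assumed metric.
import math

def match_category(target_features, category_feature_info):
    """target_features: list[float]; category_feature_info: dict mapping category -> list[float]."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Return the category whose reference features lie closest to the detected object's features.
    return min(category_feature_info, key=lambda c: distance(target_features, category_feature_info[c]))
```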
6. The method according to claim 1, further comprising, before inputting the image to be detected into the first model to identify objects in the image to be detected:
setting an output result of the first model, wherein the output result comprises characteristic data of each object in the image to be detected;
the acquiring feature data of the first object comprises:
and acquiring characteristic data of the first object from the output result.
7. The method of any of claims 1-6, wherein the first model is trained by:
preprocessing an original image to obtain a first sample image; the preprocessing at least comprises the steps of carrying out linear transformation, spatial transformation and image enhancement on the original image;
determining a usage scene and a usage requirement of the target object detection, and selecting a matched algorithm model from a plurality of algorithm models according to the usage scene and the usage requirement to obtain a first algorithm model;
and training the first algorithm model by using the first sample image to obtain the first model.
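The training flow of claim 7 (preprocess the original images, pick an algorithm model for the usage scene and requirement, then train it) might be organised as below. The transforms, the candidate-model registry and the selection key are illustrative assumptions only; the claim does not name specific algorithms.

```python
# Hedged sketch of the first-model training flow of claim 7 (all concrete choices are assumed).
def preprocess(original_images, transform_fns):
    # Linear transformation, spatial transformation and image enhancement, each supplied by the caller.
    return [fn(img) for img in original_images for fn in transform_fns]

def train_first_model(original_images, transform_fns, candidate_models, scene, requirement):
    first_sample_images = preprocess(original_images, transform_fns)
    # Select the algorithm model matching the usage scene and requirement,
    # e.g. a lightweight detector for in-vehicle use, a heavier one for offline analysis.
    first_algorithm_model = candidate_models[(scene, requirement)]
    # Annotations are assumed to travel with the sample images in this sketch.
    first_algorithm_model.fit(first_sample_images)
    return first_algorithm_model
```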
8. The method of any of claims 1-6, wherein the second model is trained by:
extracting text feature information corresponding to a target object to be detected from the second sample image to obtain the second feature data;
determining an output value type of the target object detection, and selecting an algorithm model matched with the output value type from a plurality of algorithm models to obtain a second algorithm model; the output value type comprises one of a continuous value and a category value;
and training the second algorithm model by using the second feature data to obtain the second model.
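Claim 8 picks the second algorithm model by the required output value type: a regression-style model when a continuous value is wanted, a classification-style model when a category value is wanted. A brief sketch under that reading, with placeholder factories rather than names from the disclosure:

```python
# Hedged sketch of the output-type-driven model selection of claim 8.
def train_second_model(second_feature_data, labels, output_value_type,
                       regression_model_factory, classification_model_factory):
    if output_value_type == "continuous":
        second_algorithm_model = regression_model_factory()
    else:  # category value
        second_algorithm_model = classification_model_factory()
    second_algorithm_model.fit(second_feature_data, labels)
    return second_algorithm_model
```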
9. The method according to claim 8, further comprising, after obtaining the second model:
acquiring a third sample image, wherein an object in the third sample image is an object that is present in neither the first sample image nor the second sample image;
extracting feature information of the target object to be detected from the third sample image to obtain newly added sample feature data;
and retraining the second algorithm model by using the newly added sample feature data and the sample feature data obtained according to the second sample image to obtain an updated second model.
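Claim 9 updates the second model by retraining on the union of the existing sample feature data and the newly added sample feature data drawn from the third sample image; a short sketch, with the feature-extraction helper assumed:

```python
# Hedged sketch of the incremental retraining of claim 9.
def update_second_model(second_algorithm_model, old_feature_data, old_labels,
                        third_sample_images, new_labels, extract_text_feature_info):
    # Extract feature information for the target object from the newly acquired third sample images.
    new_feature_data = [extract_text_feature_info(img) for img in third_sample_images]
    # Retrain on the combined old and new sample feature data to obtain the updated second model.
    second_algorithm_model.fit(old_feature_data + new_feature_data, old_labels + new_labels)
    return second_algorithm_model
```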
10. The method according to any one of claims 1-9, wherein the second feature data comprises at least: category information, length-width ratio information, color information, shape information and geographical position information of the target object, and pixel equivalent information of the second sample image.
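The fields enumerated in claim 10 can be pictured as a simple per-object record; the field names in the sketch below are chosen here for illustration and are not taken from the disclosure.

```python
# Illustrative record for the second feature data of claim 10 (field names are assumptions).
from dataclasses import dataclass
from typing import Tuple

@dataclass
class SecondFeatureRecord:
    category: str                      # category information of the target object
    length_width_ratio: float          # length-width ratio information of the target object
    color: str                         # color information of the target object
    shape: str                         # shape information of the target object
    geo_position: Tuple[float, float]  # geographical position information (e.g. latitude, longitude)
    pixel_equivalent: float            # pixel equivalent information of the second sample image
```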
11. A target object detection apparatus, characterized in that the apparatus comprises:
the image acquisition module to be detected is used for acquiring an image to be detected;
the first identification module is used for inputting the image to be detected into a first model to identify the object in the image to be detected, so as to obtain a first possibility that each object in the image to be detected belongs to a target object; the first model is obtained by training according to a first sample image of the target object;
the first characteristic data acquisition module is used for acquiring the characteristic data of the first object to obtain first characteristic data; the first object is an object of which the first possibility is greater than a preset first threshold;
the second identification module is used for inputting the first characteristic data into a second model to identify the first object, and obtaining a second possibility that the first object belongs to the target object; the second model is obtained by training according to second feature data, the second feature data are extracted from text feature information corresponding to features of each object in a second sample image, and the second sample image at least comprises the first sample image;
and the target object determining module is used for determining whether the target object is contained in the image to be detected or not based on the second possibility.
12. An apparatus comprising a processor and a memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by the processor to implement a target object detection method according to any one of claims 1 to 10.
13. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the target object detection method according to any one of claims 1 to 10.
CN202111348064.7A 2021-11-15 2021-11-15 Target object detection method, device, equipment and storage medium Pending CN114255389A (en)

Priority Applications (1)

Application Number: CN202111348064.7A | Priority Date: 2021-11-15 | Filing Date: 2021-11-15 | Title: Target object detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202111348064.7A | Priority Date: 2021-11-15 | Filing Date: 2021-11-15 | Title: Target object detection method, device, equipment and storage medium

Publications (1)

Publication Number: CN114255389A | Publication Date: 2022-03-29

Family

ID=80790868

Family Applications (1)

Application Number: CN202111348064.7A | Status: Pending | Publication: CN114255389A (en) | Priority Date: 2021-11-15 | Filing Date: 2021-11-15 | Title: Target object detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114255389A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020164282A1 (en) * 2019-02-14 2020-08-20 平安科技(深圳)有限公司 Yolo-based image target recognition method and apparatus, electronic device, and storage medium
CN112434178A (en) * 2020-11-23 2021-03-02 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN112734641A (en) * 2020-12-31 2021-04-30 百果园技术(新加坡)有限公司 Training method and device of target detection model, computer equipment and medium
CN113221918A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Target detection method, and training method and device of target detection model


Similar Documents

Publication Publication Date Title
CN108898047B (en) Pedestrian detection method and system based on blocking and shielding perception
CN113160192B (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN113705478B (en) Mangrove single wood target detection method based on improved YOLOv5
CN108564085B (en) Method for automatically reading of pointer type instrument
CN110956615B (en) Image quality evaluation model training method and device, electronic equipment and storage medium
CN108764312B (en) Optimize multi objective dam defect image detecting method based on DS
CN108171119B (en) SAR image change detection method based on residual error network
CN112541372A (en) Difficult sample screening method and device
CN111291818B (en) Non-uniform class sample equalization method for cloud mask
CN115082781A (en) Ship image detection method and device and storage medium
CN113128518B (en) Sift mismatch detection method based on twin convolution network and feature mixing
CN113988222A (en) Forest fire detection and identification method based on fast-RCNN
CN112348750B (en) SAR image change detection method based on threshold fusion and neighborhood voting
CN113313179A (en) Noise image classification method based on l2p norm robust least square method
KR102230559B1 (en) Method and Apparatus for Creating Labeling Model with Data Programming
CN107784285B (en) Method for automatically judging civil and military attributes of optical remote sensing image ship target
CN114255389A (en) Target object detection method, device, equipment and storage medium
CN116246161A (en) Method and device for identifying target fine type of remote sensing image under guidance of domain knowledge
CN114743048A (en) Method and device for detecting abnormal straw picture
CN111209567B (en) Method and device for judging perceptibility of improving robustness of detection model
KR20210031444A (en) Method and Apparatus for Creating Labeling Model with Data Programming
CN113705672A (en) Threshold value selection method, system and device for image target detection and storage medium
CN112465821A (en) Multi-scale pest image detection method based on boundary key point perception
CN113963249B (en) Detection method and system for star image
CN113139077B (en) Method, device, terminal and storage medium for identifying ship identity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination