CN113762220B - Object recognition method, electronic device, and computer-readable storage medium - Google Patents

Object recognition method, electronic device, and computer-readable storage medium

Info

Publication number
CN113762220B
Authority
CN
China
Prior art keywords
image
detection result
sub
target
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111292368.6A
Other languages
Chinese (zh)
Other versions
CN113762220A (en)
Inventor
郭宇鹏
王晓
毛少将
雷庆庆
袁帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CRSC Communication and Information Group Co Ltd CRSCIC
Original Assignee
CRSC Communication and Information Group Co Ltd CRSCIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CRSC Communication and Information Group Co Ltd CRSCIC filed Critical CRSC Communication and Information Group Co Ltd CRSCIC
Priority to CN202111292368.6A
Publication of CN113762220A
Application granted
Publication of CN113762220B
Active legal status
Anticipated expiration legal status

Abstract

The present disclosure provides a target identification method, an electronic device, and a computer-readable storage medium. The target identification method includes: judging whether an original image meets a preset first image segmentation condition; under the condition that the original image meets the first image segmentation condition, segmenting the original image to obtain M1 1 st sub-images; for each ith sub-image, inputting the ith sub-image into a target recognition model to obtain an (i + 2) th detection result; judging whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result; under the condition that the ith sub-image is determined to meet the second image segmentation condition according to the (i + 2) th detection result, segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images; and under the condition that the ith sub-image is determined not to meet the second image segmentation condition according to the (i + 2) th detection result, combining the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result.

Description

Object recognition method, electronic device, and computer-readable storage medium
Technical Field
The embodiments of the present disclosure relate to the technical field of image processing, and in particular to a target identification method, an electronic device, and a computer-readable storage medium.
Background
Image target recognition has long been one of the main application scenarios of artificial intelligence, covering face recognition, pedestrian detection, mask-wearing detection, and the like, as well as intrusion detection, perimeter protection, and crowd gathering detection built on these basic recognition functions. Although current mainstream deep learning methods achieve high recognition accuracy in most experimental scenes, practical application scenes are often more complex, so the deployed effect of target recognition algorithms is frequently not ideal. For example, cameras in public places such as high-speed rail stations are generally installed at high positions, targets occupy only a small part of the image, and the environment is complex, so that even human eyes have some difficulty resolving them; this poses a serious challenge to the accuracy of target recognition algorithms.
Object recognition algorithms in the industry generally input image data of a specific size into a model for detection to obtain a final detection result. The obvious drawback of this approach is that the detection model is limited by the maximum image size it can accept: when the input image is larger than the size acceptable to the model, the input image is reduced; when the input image is smaller than that size, it is enlarged. Enlargement causes no loss of feature information in the input image, but reduction causes a relatively large loss, especially when the input image has a high resolution and a wide field of view, resulting in a relatively poor final detection effect.
Disclosure of Invention
The embodiments of the present disclosure provide a target identification method, an electronic device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a target identification method, including: judging whether an original image meets a preset first image segmentation condition; under the condition that the original image meets the first image segmentation condition, segmenting the original image to obtain M1 1 st sub-images, wherein M1 is an integer greater than or equal to 1; for each ith sub-image, inputting the ith sub-image into a target recognition model to obtain an (i + 2) th detection result; judging whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result; under the condition that the ith sub-image is determined to meet the second image segmentation condition according to the (i + 2) th detection result, segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images, wherein Mi+1 is an integer greater than or equal to 1 and i takes the values 1, 2, … in sequence; continuing to perform, for each (i + 1) th sub-image, the step of inputting the (i + 1) th sub-image into the target recognition model to obtain an (i + 3) th detection result; and under the condition that the ith sub-image is determined not to meet the second image segmentation condition according to the (i + 2) th detection result, combining the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result.
In some exemplary embodiments, after merging the detection results corresponding to all the sub-images obtained by segmentation to obtain the final detection result, the method further includes: deleting each target whose target frame size in the final detection result is smaller than or equal to a first preset threshold.
In some exemplary embodiments, the judging whether the original image meets a preset first image segmentation condition includes: under the condition that the size of the original image is larger than the standard input size of a target recognition model, inputting the original image into the target recognition model to obtain a 1 st detection result; reducing the size of the original image to 1/N1 of the original size to obtain a reduced original image, and inputting the reduced original image into the target recognition model to obtain a 2 nd detection result, wherein N1 is an integer greater than or equal to 2; and judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result.
In some exemplary embodiments, the judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result includes at least one of: determining that the original image meets the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is greater than or equal to a second preset threshold and the number of small targets in the 1 st detection result is greater than or equal to a third preset threshold; determining that the original image does not meet the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is less than or equal to a fourth preset threshold, or when the number of small targets in the 1 st detection result is less than the third preset threshold.
In some exemplary embodiments, the small target in the 1 st detection result is a target in the 1 st detection result whose difference between the size and the maximum target frame size is greater than a fifth preset threshold.
In some exemplary embodiments, the determining whether the ith sub-image satisfies a preset second image segmentation condition according to the (i + 2) th detection result includes at least one of: determining that the ith sub-image meets the second image segmentation condition when the number of small targets in the (i + 2) th detection result is greater than or equal to the third preset threshold; determining that the ith sub-image does not satisfy the second image segmentation condition if the number of small targets in the (i + 2) th detection result is less than the third preset threshold.
In some exemplary embodiments, the segmenting the original image to obtain M1 1 st sub-images includes: according to the standard input size of the target recognition model, performing sliding segmentation on the original image so that the size of each 1 st sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the 1 st detection result, and the 1 st detection result is the detection result obtained by inputting the original image into the target recognition model.
In some exemplary embodiments, the segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images includes: magnifying the ith sub-image by N2 times, wherein N2 is an integer greater than or equal to 2; and performing sliding segmentation on the enlarged ith sub-image according to the standard input size of the target recognition model, so that the size of each (i + 1) th sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; wherein the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the (i + 2) th detection result.
In a second aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; a memory having at least one program stored thereon, which when executed by the at least one processor, causes the at least one processor to implement any of the above-described object recognition methods.
In a third aspect, the disclosed embodiments provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the above-mentioned object recognition methods.
According to the target identification method provided by the embodiment of the disclosure, the accuracy of target identification is improved by performing adaptive segmentation on the original image, so that the final detection effect is improved.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. Detailed exemplary embodiments are described by reference to the accompanying drawings, in which:
fig. 1 is a flowchart of a target identification method according to an embodiment of the present disclosure;
fig. 2 is a block diagram of a target recognition apparatus according to another embodiment of the present disclosure.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present disclosure, the object recognition method, the electronic device, and the computer-readable storage medium provided in the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but the described embodiments may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of at least one of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of at least one other feature, integer, step, operation, element, component, and/or group thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Fig. 1 is a flowchart of a target identification method according to an embodiment of the present disclosure.
In a first aspect, referring to fig. 1, an embodiment of the present disclosure provides a target identification method, including:
Step 100, judging whether the original image meets a preset first image segmentation condition.
In some exemplary embodiments, the judging whether the original image meets a preset first image segmentation condition includes: under the condition that the size of the original image is larger than the standard input size of a target recognition model, inputting the original image into the target recognition model to obtain a 1 st detection result; reducing the size of the original image to 1/N1 of the original size to obtain a reduced original image, and inputting the reduced original image into the target recognition model to obtain a 2 nd detection result, wherein N1 is an integer greater than or equal to 2; and judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result.
In some exemplary embodiments, the value of N1 may be set empirically. For example, N1 may be set to 2.
In some exemplary embodiments, the size of the original image being larger than the standard input size of the target recognition model comprises at least one of: the width of the original image is larger than the standard input width of the target recognition model; the height of the original image is greater than the standard input height of the target recognition model.
In some exemplary embodiments, the standard input size of the target recognition model refers to the maximum size of the input image supported by the target recognition model, and the size of the image generally includes the width and height of the image.
The object recognition model in the embodiments of the present disclosure may be any object recognition model known to those skilled in the art, or may be a new object recognition model developed in the future, and the object recognition model is used for recognizing an object in an image.
In some exemplary embodiments, the 1 st detection result may include at least one of: the number of the detected targets in the original image, the confidence of each detected target, and the position information of the target frame in which each detected target is located in the original image.
In some exemplary embodiments, the position information of the target frame in the original image includes: coordinates of any point on the target frame in the original image, width of the target frame and height of the target frame.
In some exemplary embodiments, the 2 nd detection result may include at least one of: the number of the targets detected from the reduced original image, the confidence of each detected target, and the position information of the target frame where each detected target is located in the reduced original image.
In some exemplary embodiments, the position information of the target frame in the reduced original image includes: coordinates of any point on the target frame in the reduced original image, width of the target frame and height of the target frame.
In some exemplary embodiments, the judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result includes at least one of:
determining that the original image meets the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is greater than or equal to a second preset threshold and the number of small targets in the 1 st detection result is greater than or equal to a third preset threshold;
determining that the original image does not satisfy the first image segmentation condition when a difference between the 1 st detection result and the 2 nd detection result is less than or equal to a fourth preset threshold, or when the number of small objects in the 1 st detection result is less than the third preset threshold.
In some exemplary embodiments, the difference between the 1 st detection result and the 2 nd detection result is calculated in at least one of the following ways: the absolute value of the difference between the number of targets in the 1 st detection result and the number of targets in the 2 nd detection result; or, for each detected target, the absolute value of the difference between the confidence of the target in the 1 st detection result and the confidence of the target in the 2 nd detection result.
In other exemplary embodiments, the difference between the 1 st detection result and the 2 nd detection result is calculated using at least one of: the ratio of the absolute value of the difference between the number of targets in the 1 st detection result and the number of targets in the 2 nd detection result to the number of targets in the 1 st detection result; or, for each detected target, the ratio of the absolute value of the difference between the confidence of the target in the 1 st detection result and the confidence of the target in the 2 nd detection result to the confidence of the target in the 1 st detection result.
In other exemplary embodiments, the difference between the 1 st detection result and the 2 nd detection result is calculated using at least one of: the ratio of the absolute value of the difference between the number of targets in the 1 st detection result and the number of targets in the 2 nd detection result to the number of targets in the 2 nd detection result; or, for each detected target, the ratio of the absolute value of the difference between the confidence of the target in the 1 st detection result and the confidence of the target in the 2 nd detection result to the confidence of the target in the 2 nd detection result.
In some exemplary embodiments, the second preset threshold and the fourth preset threshold may be set according to actual needs, for example, the second preset threshold is set to be 30%, and the fourth preset threshold is set to be 20%.
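The patent text itself contains no code; purely as an illustrative aid, the following Python sketch shows the count-based and confidence-based difference metrics described above. The data layout (a detection result modeled as a list of per-target confidences, and pre-matched confidence pairs) is an assumption of this sketch, not part of the disclosure.

```python
# Illustrative sketch only; names and data layout are assumptions.
# A detection result is modelled as a list of per-target confidences;
# `matched_pairs` is assumed to pair the same physical target across the
# 1st and 2nd detection results (the box-matching step is not shown).

def count_difference_ratio(result_1, result_2):
    """|N1 - N2| / N1: the count-based difference, relative to the 1st
    detection result (one of the alternative forms described above)."""
    if not result_1:
        return 0.0
    return abs(len(result_1) - len(result_2)) / len(result_1)

def confidence_difference_ratios(matched_pairs):
    """|c1 - c2| / c1 for each matched target: the per-target
    confidence-based difference, relative to the 1st detection result."""
    return [abs(c1 - c2) / c1 for (c1, c2) in matched_pairs if c1 > 0]
```

With the example thresholds above, a count_difference_ratio of at least 0.30 would count toward the first image segmentation condition, while a value of at most 0.20 would rule it out.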
In some exemplary embodiments, the small target in the 1 st detection result is a target in the 1 st detection result whose difference between the size and the maximum target frame size is greater than a fifth preset threshold.
In some exemplary embodiments, the fifth preset threshold may be set according to actual needs, for example, the fifth preset threshold is set to 40%.
In some exemplary embodiments, the difference between the size of the small target in the 1 st detection result and the maximum target frame size is the difference between the maximum target frame size in the 1 st detection result and the size of the target frame in which the small target in the 1 st detection result is located.
In other exemplary embodiments, the difference between the size of the small target in the 1 st detection result and the size of the maximum target frame is a ratio of a difference between the size of the maximum target frame in the 1 st detection result and the size of the target frame in which the small target in the 1 st detection result is located to the size of the maximum target frame in the 1 st detection result.
In other exemplary embodiments, the difference between the size of the small target in the 1 st detection result and the size of the maximum target frame is a ratio of a difference between the size of the maximum target frame in the 1 st detection result and the size of the target frame in which the small target in the 1 st detection result is located to the size of the target frame in which the small target in the 1 st detection result is located.
In other exemplary embodiments, the difference between the size of the small target and the size of the maximum target frame in the 1 st detection result may be calculated in other manners, and the specific calculation manner is not used to limit the protection scope of the embodiment of the present disclosure.
In some exemplary embodiments, the maximum target frame size refers to the size of the target frame having the largest area or the size of the target frame having the longest circumference, and the size of the target frame generally includes the width and height of the target frame.
In some exemplary embodiments, the third preset threshold may be set according to actual needs, for example, the third preset threshold is set to 10.
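Continuing the sketch, the small-target count and the resulting decision might look as follows; the (x, y, w, h) box format and the area-ratio form of the size difference are assumptions, and the default thresholds simply echo the example values above (40%, 30%, 10).

```python
# Illustrative sketch only; the (x, y, w, h) box format is an assumption.

def count_small_targets(boxes, fifth_threshold=0.40):
    """A small target is one whose box size falls short of the maximum
    target frame size by a ratio above the fifth preset threshold
    (the area-ratio variant described above)."""
    if not boxes:
        return 0
    max_area = max(w * h for (_, _, w, h) in boxes)
    return sum(1 for (_, _, w, h) in boxes
               if (max_area - w * h) / max_area > fifth_threshold)

def first_segmentation_condition(result_1, result_2, boxes_1,
                                 second_threshold=0.30,
                                 third_threshold=10):
    """True when the difference between the two detection results and the
    small-target count in the 1st result both reach their thresholds."""
    diff = count_difference_ratio(result_1, result_2)  # from the sketch above
    return (diff >= second_threshold
            and count_small_targets(boxes_1) >= third_threshold)
```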
Step 101, under the condition that the original image meets the first image segmentation condition, segmenting the original image to obtain M1 1 st sub-images; wherein M1 is an integer greater than or equal to 1.
In some exemplary embodiments, the segmenting the original image to obtain M1 1 st sub-images includes: according to the standard input size of the target recognition model, performing sliding segmentation on the original image so that the size of each 1 st sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the 1 st detection result, and the 1 st detection result is the detection result obtained by inputting the original image into the target recognition model.
In some exemplary embodiments, the size of the divided 1 st sub-image is smaller than or equal to the standard input size of the target recognition model, that is, the width of the divided 1 st sub-image is smaller than or equal to the standard input width of the target recognition model, and the height of the divided 1 st sub-image is smaller than or equal to the standard input height of the target recognition model.
In some exemplary embodiments, the standard input width of the target recognition model refers to a maximum width of the input image supported by the target recognition model, and the standard input height of the target recognition model refers to a maximum height of the input image supported by the target recognition model.
In some exemplary embodiments, the width-direction sliding step is a difference between a standard input width of the target recognition model and a width of a maximum target frame in the 1 st detection result, and the height-direction sliding step is a difference between a standard input height of the target recognition model and a height of the maximum target frame in the 1 st detection result.
In some exemplary embodiments, the purpose of the sliding segmentation of the original image is to prevent missing detection of the target.
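As an illustration of this sliding segmentation, the sketch below computes the window origins. The function name and the handling of the right and bottom edges are assumptions of the sketch; the step length (standard input size minus maximum target frame size) follows the text.

```python
# Illustrative sketch only. Overlapping windows whose step equals the
# standard input size minus the maximum target frame size guarantee that a
# target cut by one window boundary lies wholly inside a neighbouring window.

def sliding_windows(image_w, image_h, std_w, std_h, max_box_w, max_box_h):
    """Return the top-left corners of the sub-image windows."""
    step_x = max(std_w - max_box_w, 1)
    step_y = max(std_h - max_box_h, 1)
    xs = list(range(0, max(image_w - std_w, 0) + 1, step_x))
    ys = list(range(0, max(image_h - std_h, 0) + 1, step_y))
    # Add a final window so the right and bottom edges are always covered.
    if xs[-1] + std_w < image_w:
        xs.append(image_w - std_w)
    if ys[-1] + std_h < image_h:
        ys.append(image_h - std_h)
    return [(x, y) for y in ys for x in xs]
```

For a 1920x1080 image, a 640x640 standard input size, and a 100x120 maximum target frame, the step is 540 horizontally and 520 vertically, and the last column and row of windows are shifted so they end exactly at the image border.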
Step 102, for each ith sub-image, inputting the ith sub-image into a target recognition model to obtain an (i + 2) th detection result; judging whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result; under the condition that the ith sub-image is determined to meet the second image segmentation condition according to the (i + 2) th detection result, segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images; wherein Mi+1 is an integer greater than or equal to 1, and i takes the values 1, 2, … in sequence; and continuing to perform, for each (i + 1) th sub-image, the step of inputting the (i + 1) th sub-image into the target recognition model to obtain an (i + 3) th detection result.
In some exemplary embodiments, inputting the ith sub-image into the target recognition model to obtain the (i + 2) th detection result may be implemented in a parallel or serial manner. For example, when the parallel manner is adopted, two target recognition models can be set up for parallel processing, with different target recognition models processing different ith sub-images; the implementation code of each target recognition model is exactly the same. A minimal sketch of such parallel dispatch is given below.
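This sketch is illustrative only: `model.detect` is a placeholder for the target recognition model's inference call, not a real API, and a thread pool stands in for whatever parallel mechanism (processes, multiple GPU streams, batched inference) an implementation would actually use.

```python
# Illustrative sketch only; `model.detect` is a placeholder, not a real API.
from concurrent.futures import ThreadPoolExecutor

def detect_sub_images(model, sub_images, workers=2):
    """Run the same detection code on each ith sub-image; results come back
    in input order, so parallel and serial execution give the same output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(model.detect, sub_images))
```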
In some exemplary embodiments, the (i + 2) th detection result may include at least one of: the number of the targets detected from the ith sub-image, the confidence level of each detected target, and the position information of the target frame where each detected target is located in the ith sub-image.
In some exemplary embodiments, the position information of the target frame in the ith sub-image includes: coordinates of any point on the target frame in the ith sub-image, width of the target frame and height of the target frame.
In some exemplary embodiments, the determining whether the ith sub-image satisfies a preset second image segmentation condition according to the (i + 2) th detection result includes at least one of:
determining that the ith sub-image meets the second image segmentation condition when the number of small targets in the (i + 2) th detection result is greater than or equal to the third preset threshold;
determining that the ith sub-image does not satisfy the second image segmentation condition if the number of small targets in the (i + 2) th detection result is less than the third preset threshold.
In some exemplary embodiments, the small target in the (i + 2) th detection result is a target in the (i + 2) th detection result whose difference between the size and the maximum target frame size is greater than a fifth preset threshold.
In some exemplary embodiments, the difference between the size of the small target in the (i + 2) th detection result and the maximum target frame size is the difference between the maximum target frame size in the (i + 2) th detection result and the size of the target frame in which the small target in the (i + 2) th detection result is located.
In other exemplary embodiments, the difference between the size of the small target in the (i + 2) th detection result and the maximum target frame size is a ratio of a difference between the maximum target frame size in the (i + 2) th detection result and the size of the target frame in which the small target in the (i + 2) th detection result is located to the maximum target frame size in the (i + 2) th detection result.
In other exemplary embodiments, the difference between the size of the small target in the (i + 2) th detection result and the maximum target frame size is a ratio of a difference between the maximum target frame size in the (i + 2) th detection result and the size of the target frame in which the small target in the (i + 2) th detection result is located to the size of the target frame in which the small target in the (i + 2) th detection result is located.
In other exemplary embodiments, the difference between the size of the small target and the size of the maximum target frame in the (i + 2) th detection result may be calculated in other manners, and the specific calculation manner is not used to limit the protection scope of the embodiment of the present disclosure.
In some exemplary embodiments, the segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images includes: magnifying the ith sub-image by N2 times, wherein N2 is an integer greater than or equal to 2; and performing sliding segmentation on the enlarged ith sub-image according to the standard input size of the target recognition model, so that the size of each (i + 1) th sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; wherein the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the (i + 2) th detection result.
In some exemplary embodiments, the specific value of N2 may be set empirically, e.g., N2 may be set to 2.
In some exemplary embodiments, the size of the (i + 1) th segmented sub-image being smaller than or equal to the standard input size of the target recognition model means that the width of the (i + 1) th segmented sub-image being smaller than or equal to the standard input width of the target recognition model and the height of the (i + 1) th segmented sub-image being smaller than or equal to the standard input height of the target recognition model.
In some exemplary embodiments, the width-direction sliding step is a difference between a standard input width of the target recognition model and a width of a maximum target frame in the (i + 2) th detection result, and the height-direction sliding step is a difference between a standard input height of the target recognition model and a height of a maximum target frame in the (i + 2) th detection result.
In some exemplary embodiments, the purpose of the sliding segmentation of the enlarged ith sub-image is to prevent missing detection of the target.
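Combining the enlargement and the sliding segmentation, one recursive step might be sketched as follows; OpenCV's resize and NumPy-style image arrays are assumed dependencies, and `sliding_windows` refers to the earlier sketch.

```python
# Illustrative sketch only; assumes OpenCV (cv2) and NumPy image arrays,
# and reuses sliding_windows() from the earlier sketch.
import cv2

def split_sub_image(sub_image, std_w, std_h, max_box_w, max_box_h, n2=2):
    """Enlarge the ith sub-image by N2, then slide-cut it into
    (i + 1)th sub-images no larger than the standard input size."""
    h, w = sub_image.shape[:2]
    enlarged = cv2.resize(sub_image, (w * n2, h * n2),
                          interpolation=cv2.INTER_LINEAR)
    corners = sliding_windows(w * n2, h * n2, std_w, std_h,
                              max_box_w, max_box_h)
    return [enlarged[y:y + std_h, x:x + std_w] for (x, y) in corners]
```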
Step 103, under the condition that the ith sub-image is determined not to meet the second image segmentation condition according to the (i + 2) th detection result, combining the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result.
In some exemplary embodiments, all the sub-images obtained by segmentation refer to all the ith sub-images obtained by segmentation, where i takes the values 1, 2, 3, … in sequence. For example, when i = 1, all the sub-images obtained by segmentation include: the M1 1 st sub-images obtained by segmentation; when i = 2, all the sub-images obtained by segmentation include: the M1 1 st sub-images and the M2 2 nd sub-images obtained by segmentation; when i = 3, all the sub-images obtained by segmentation include: the M1 1 st sub-images, the M2 2 nd sub-images, and the M3 3 rd sub-images obtained by segmentation; and so on.
In some exemplary embodiments, merging the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result includes: calculating the position information of each target in the original image according to the position information of the target frame where the target is located in the detection results corresponding to all the sub-images obtained by segmentation, the sliding step length adopted in each segmentation, and the N2 value adopted in each segmentation; and combining the position information of all the targets in the original image by adopting Intersection over Union (IoU) and Non-Maximum Suppression (NMS) to obtain the final detection result.
In some exemplary embodiments, IoU is a concept used in target detection: the overlap ratio between a generated candidate frame and the original labeled frame, i.e., the ratio of their intersection to their union. In the ideal case the two frames overlap completely and the ratio is 1.
In some exemplary embodiments, NMS, as its name implies, suppresses elements that are not local maxima; in target detection it keeps the target detection boxes with high confidence and suppresses false detection boxes with low confidence.
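As an illustration of the merging step, the sketch below gives a plain IoU computation and a greedy NMS over boxes already mapped back to original-image coordinates; the (x, y, w, h) box format and the 0.5 IoU threshold are assumptions of the sketch.

```python
# Illustrative sketch only; boxes are (x, y, w, h) in original-image
# coordinates (window offsets and accumulated N2 scaling already undone).

def iou(a, b):
    """Intersection over Union of two boxes; 1.0 means complete overlap."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def non_maximum_suppression(detections, iou_threshold=0.5):
    """Keep high-confidence boxes; suppress lower-confidence boxes that
    overlap a kept box by more than the IoU threshold.
    detections: list of (box, confidence) tuples."""
    kept = []
    for box, conf in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, conf))
    return kept
```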
In some exemplary embodiments, after merging the detection results corresponding to all the sub-images obtained by segmentation to obtain the final detection result, the method further includes: deleting each target whose target frame size in the final detection result is smaller than or equal to a first preset threshold.
According to the target identification method provided by the embodiment of the disclosure, the accuracy of target identification is improved by performing adaptive segmentation on the original image, so that the final detection effect is improved.
In a second aspect, another embodiment of the present disclosure provides an electronic device, including: at least one processor; a memory having at least one program stored thereon, the at least one program, when executed by the at least one processor, causing the at least one processor to implement the object recognition method of any one of the above.
The processor is a device with data processing capability, including but not limited to a central processing unit (CPU); the memory is a device with data storage capability, including but not limited to random-access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH).
In some embodiments, the processor and the memory, as well as the other components of the computing device, are connected to one another by a bus.
In a third aspect, another embodiment of the present disclosure provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement any one of the above-mentioned object recognition methods.
Fig. 2 is a block diagram of a target recognition apparatus according to another embodiment of the present disclosure.
In a fourth aspect, another embodiment of the present disclosure provides a target recognition apparatus, including: a judging module 201, configured to judge whether an original image meets a preset first image segmentation condition; a segmentation module 202, configured to segment the original image to obtain M1 1 st sub-images when the original image meets the first image segmentation condition, wherein M1 is an integer greater than or equal to 1; the judging module 201 is further configured to: for each ith sub-image, input the ith sub-image into a target recognition model to obtain an (i + 2) th detection result, and judge whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result; the segmentation module 202 is further configured to: segment the ith sub-image to obtain Mi+1 (i + 1) th sub-images when it is determined according to the (i + 2) th detection result that the ith sub-image meets the second image segmentation condition, wherein Mi+1 is an integer greater than or equal to 1 and i takes the values 1, 2, … in sequence, and continue to perform, for each (i + 1) th sub-image, the step of inputting the (i + 1) th sub-image into the target recognition model to obtain an (i + 3) th detection result; and a merging module 203, configured to merge the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result when it is determined according to the (i + 2) th detection result that the ith sub-image does not meet the second image segmentation condition.
In some exemplary embodiments, the merge module 203 is further configured to: and deleting the target of which the size of the target frame in the final detection result is smaller than or equal to a first preset threshold value.
In some exemplary embodiments, the judging module 201 is specifically configured to judge whether the original image meets a preset first image segmentation condition by: under the condition that the size of the original image is larger than the standard input size of a target recognition model, inputting the original image into the target recognition model to obtain a 1 st detection result; reducing the size of the original image to 1/N1 of the original size to obtain a reduced original image, and inputting the reduced original image into the target recognition model to obtain a 2 nd detection result, wherein N1 is an integer greater than or equal to 2; and judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result.
In some exemplary embodiments, the judging module 201 is specifically configured to judge whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result by at least one of the following manners: determining that the original image meets the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is greater than or equal to a second preset threshold and the number of small targets in the 1 st detection result is greater than or equal to a third preset threshold; determining that the original image does not meet the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is less than or equal to a fourth preset threshold, or when the number of small targets in the 1 st detection result is less than the third preset threshold.
In some exemplary embodiments, the small target in the 1 st detection result is a target in the 1 st detection result whose difference between the size and the maximum target frame size is greater than a fifth preset threshold.
In some exemplary embodiments, the determining module 201 is specifically configured to determine whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result by at least one of the following manners: determining that the ith sub-image meets the second image segmentation condition when the number of small targets in the (i + 2) th detection result is greater than or equal to the third preset threshold; determining that the ith sub-image does not satisfy the second image segmentation condition if the number of small targets in the (i + 2) th detection result is less than the third preset threshold.
In some exemplary embodiments, the segmentation module 202 is specifically configured to segment the original image to obtain the M1 1 st sub-images in the following manner: according to the standard input size of the target recognition model, performing sliding segmentation on the original image so that the size of each 1 st sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the 1 st detection result, and the 1 st detection result is the detection result obtained by inputting the original image into the target recognition model.
In some exemplary embodiments, the segmentation module 202 is specifically configured to segment the ith sub-image to obtain the Mi+1 (i + 1) th sub-images in the following manner: magnifying the ith sub-image by N2 times, wherein N2 is an integer greater than or equal to 2; and performing sliding segmentation on the enlarged ith sub-image according to the standard input size of the target recognition model, so that the size of each (i + 1) th sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; wherein the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the (i + 2) th detection result.
The specific implementation process of the target identification device is the same as that of the target identification method in the foregoing embodiment, and is not described here again.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (8)

1. An object recognition method, comprising:
judging whether the original image meets a preset first image segmentation condition or not;
under the condition that the original image meets the first image segmentation condition, segmenting the original image to obtain M1 1 st sub-images; wherein M1 is an integer greater than or equal to 1;
for each ith sub-image, inputting the ith sub-image into a target recognition model to obtain an (i + 2) th detection result; judging whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result; under the condition that the ith sub-image is determined to meet the second image segmentation condition according to the (i + 2) th detection result, segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images; wherein Mi+1 is an integer greater than or equal to 1; i takes the values 1, 2, … in sequence; and continuing to perform, for each (i + 1) th sub-image, the step of inputting the (i + 1) th sub-image into the target recognition model to obtain an (i + 3) th detection result;
under the condition that the ith sub-image is determined not to meet the second image segmentation condition according to the (i + 2) th detection result, combining the detection results corresponding to all the sub-images obtained by segmentation to obtain a final detection result;
wherein, the judging whether the original image meets a preset first image segmentation condition comprises:
under the condition that the size of the original image is larger than the standard input size of a target recognition model, inputting the original image into the target recognition model to obtain a 1 st detection result;
reducing the size of the original image to 1/N1 of the original size to obtain a reduced original image, and inputting the reduced original image into the target recognition model to obtain a 2 nd detection result; wherein N1 is an integer greater than or equal to 2;
judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result;
wherein the judging whether the ith sub-image meets a preset second image segmentation condition according to the (i + 2) th detection result comprises at least one of the following steps:
determining that the ith sub-image meets the second image segmentation condition when the number of small targets in the (i + 2) th detection result is greater than or equal to a third preset threshold;
determining that the ith sub-image does not satisfy the second image segmentation condition if the number of small targets in the (i + 2) th detection result is less than the third preset threshold.
2. The object recognition method according to claim 1, wherein after the detection results corresponding to all the divided sub-images are combined to obtain the final detection result, the method further comprises:
and deleting the target of which the size of the target frame in the final detection result is smaller than or equal to a first preset threshold value.
3. The target recognition method of claim 1, wherein the judging whether the original image meets the first image segmentation condition according to the 1 st detection result, the 2 nd detection result, and the number of small targets in the 1 st detection result comprises at least one of:
determining that the original image meets the first image segmentation condition when the difference between the 1 st detection result and the 2 nd detection result is greater than or equal to a second preset threshold and the number of small targets in the 1 st detection result is greater than or equal to a third preset threshold;
determining that the original image does not satisfy the first image segmentation condition when a difference between the 1 st detection result and the 2 nd detection result is less than or equal to a fourth preset threshold, or when the number of small objects in the 1 st detection result is less than the third preset threshold.
4. The target identification method according to claim 1, wherein the small target in the 1 st detection result is a target in the 1 st detection result whose difference between the size and the maximum target frame size is greater than a fifth preset threshold.
5. The object recognition method according to any one of claims 1-2, wherein the segmenting the original image to obtain M1 1 st sub-images comprises:
according to the standard input size of the target recognition model, performing sliding segmentation on the original image to enable the size of the 1 st sub-image obtained by segmentation to be smaller than or equal to the standard input size of the target recognition model; the step length of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the 1 st detection result, and the 1 st detection result is the detection result obtained by inputting the original image into the target recognition model.
6. The object recognition method of any one of claims 1-2, wherein the segmenting the ith sub-image to obtain Mi+1 (i + 1) th sub-images comprises:
magnifying the ith sub-image by N2 times; wherein N2 is an integer greater than or equal to 2;
performing sliding segmentation on the enlarged ith sub-image according to the standard input size of the target recognition model, so that the size of the (i + 1) th sub-image obtained by segmentation is smaller than or equal to the standard input size of the target recognition model; wherein the step size of the sliding is the difference between the standard input size of the target recognition model and the maximum target frame size in the (i + 2) th detection result.
7. An electronic device, comprising:
at least one processor;
memory having stored thereon at least one program which, when executed by the at least one processor, causes the at least one processor to carry out the object recognition method of any one of claims 1-6.
8. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the object recognition method of any one of claims 1 to 6.
CN202111292368.6A 2021-11-03 2021-11-03 Object recognition method, electronic device, and computer-readable storage medium Active CN113762220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111292368.6A CN113762220B (en) 2021-11-03 2021-11-03 Object recognition method, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111292368.6A CN113762220B (en) 2021-11-03 2021-11-03 Object recognition method, electronic device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN113762220A CN113762220A (en) 2021-12-07
CN113762220B true CN113762220B (en) 2022-03-15

Family

ID=78784569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111292368.6A Active CN113762220B (en) 2021-11-03 2021-11-03 Object recognition method, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN113762220B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663418A (en) * 2022-04-06 2022-06-24 京东安联财产保险有限公司 Image processing method and device, storage medium and electronic equipment
CN115937169A (en) * 2022-12-23 2023-04-07 广东创新科技职业学院 Shrimp fry counting method and system based on high resolution and target detection

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017071160A1 (en) * 2015-10-28 2017-05-04 深圳大学 Sea-land segmentation method and system for large-size remote-sensing image
WO2018076138A1 (en) * 2016-10-24 2018-05-03 深圳大学 Target detection method and apparatus based on large-scale high-resolution hyper-spectral image

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112802027A (en) * 2019-11-13 2021-05-14 成都天府新区光启未来技术研究院 Target object analysis method, storage medium and electronic device
CN112991349B (en) * 2019-12-17 2023-12-26 阿里巴巴集团控股有限公司 Image processing method, device, equipment and storage medium
CN111191730B (en) * 2020-01-02 2023-05-12 中国航空工业集团公司西安航空计算技术研究所 Method and system for detecting oversized image target oriented to embedded deep learning
CN111401463A (en) * 2020-03-25 2020-07-10 维沃移动通信有限公司 Method for outputting detection result, electronic device, and medium
CN111223115B (en) * 2020-04-22 2020-07-14 杭州涂鸦信息技术有限公司 Image segmentation method, device, equipment and medium
CN111598091A (en) * 2020-05-20 2020-08-28 北京字节跳动网络技术有限公司 Image recognition method and device, electronic equipment and computer readable storage medium
CN112348835B (en) * 2020-11-30 2024-04-16 广联达科技股份有限公司 Material quantity detection method and device, electronic equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017071160A1 (en) * 2015-10-28 2017-05-04 深圳大学 Sea-land segmentation method and system for large-size remote-sensing image
WO2018076138A1 (en) * 2016-10-24 2018-05-03 深圳大学 Target detection method and apparatus based on large-scale high-resolution hyper-spectral image

Also Published As

Publication number Publication date
CN113762220A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN109325954B (en) Image segmentation method and device and electronic equipment
US11423634B2 (en) Object detection model training method, apparatus, and device
JP5775225B2 (en) Text detection using multi-layer connected components with histograms
CN113762220B (en) Object recognition method, electronic device, and computer-readable storage medium
US8995714B2 (en) Information creation device for estimating object position and information creation method and program for estimating object position
US20190156499A1 (en) Detection of humans in images using depth information
EP3203417A2 (en) Method for detecting texts included in an image and apparatus using the same
US9152856B2 (en) Pedestrian detection system and method
KR102119897B1 (en) Skin condition detection method and electronic device
US9300828B1 (en) Image segmentation
CN111259878A (en) Method and equipment for detecting text
US20180047271A1 (en) Fire detection method, fire detection apparatus and electronic equipment
CN113221768A (en) Recognition model training method, recognition method, device, equipment and storage medium
US20220058422A1 (en) Character recognition method and terminal device
CN113420682A (en) Target detection method and device in vehicle-road cooperation and road side equipment
CN110647818A (en) Identification method and device for shielding target object
EP3726421A2 (en) Recognition method and apparatus for false detection of an abandoned object and image processing device
CN109712134B (en) Iris image quality evaluation method and device and electronic equipment
CN113378857A (en) Target detection method and device, electronic equipment and storage medium
CN110969640A (en) Video image segmentation method, terminal device and computer-readable storage medium
CN112001336A (en) Pedestrian boundary crossing alarm method, device, equipment and system
CN113762027B (en) Abnormal behavior identification method, device, equipment and storage medium
CN111598013A (en) Nut-pin state identification method and related device
CN113312949A (en) Video data processing method, video data processing device and electronic equipment
CN112784638B (en) Training sample acquisition method and device, pedestrian detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant