CN113111921A - Object recognition method, object recognition device, electronic equipment and storage medium - Google Patents

Object recognition method, object recognition device, electronic equipment and storage medium

Info

Publication number
CN113111921A
CN113111921A (application CN202110294884.6A)
Authority
CN
China
Prior art keywords
picture
size
recognized
template
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110294884.6A
Other languages
Chinese (zh)
Other versions
CN113111921B (en)
Inventor
曹秀伟
苏世龙
樊则森
雷俊
马栓鹏
丁沛然
田璐璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Science and Technology Group Co Ltd
Original Assignee
China Construction Science and Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Science and Technology Group Co Ltd
Priority to CN202110294884.6A
Publication of CN113111921A
Application granted
Publication of CN113111921B
Legal status: Active
Anticipated expiration

Classifications

    • G06F18/23213 — Pattern recognition; non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F18/2415 — Pattern recognition; classification based on parametric or probabilistic models
    • G06N3/045 — Neural networks; combinations of networks
    • G06N3/08 — Neural networks; learning methods
    • G06V10/28 — Image preprocessing; quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V10/50 — Extraction of image features by operations within image blocks or by using histograms, e.g. histogram of oriented gradients [HoG]
    • G06V10/751 — Image or video pattern matching; comparing pixel or feature values with positional relevance, e.g. template matching
    • G06V10/26 — Image preprocessing; segmentation of patterns in the image field
    • G06V2201/07 — Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The present application is applicable to the technical field of image processing and provides an object recognition method, an object recognition apparatus, an electronic device, and a storage medium. The object recognition method includes: acquiring a picture to be recognized, where the picture to be recognized is the minimum circumscribed rectangle picture corresponding to an object to be recognized; performing target processing on the picture to be recognized according to a preset template size to obtain a target picture whose size is the template size, where the target processing includes background expansion processing; and determining a recognition result corresponding to the object to be recognized according to the target picture and a preset object recognition network, where the object recognition network is a pre-trained neural network for recognizing objects. The method and apparatus can improve the accuracy of object recognition.

Description

Object recognition method, object recognition device, electronic equipment and storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular relates to an object recognition method and apparatus, an electronic device, and a storage medium.
Background
With the increasing automation of manufacturing and industrial production lines, an object on a factory line is often photographed to obtain a corresponding picture, which is then subjected to recognition processing to determine the category of the object and obtain a recognition result.
In existing recognition technology based on traditional machine learning, the features of different types of objects are summarized manually and described with mathematical formulas. This approach is prone to serious misjudgment in complex environments, and manually designing features places heavy demands on the skill of the algorithm engineer, so classification algorithms based on traditional machine learning have gradually been displaced by neural network algorithms. Recognition algorithms based on neural networks offer high recognition accuracy, fast training, and easy deployment, but they impose a constraint: the input pictures must have a uniform size. When the objects to be recognized and classified are similar in shape and identical in aspect ratio but different in size, scaling the pictures to a uniform size reduces the differences between object types, which lowers the accuracy of the recognition network and therefore the accuracy of object recognition.
Disclosure of Invention
In view of this, embodiments of the present application provide an object recognition method, an object recognition apparatus, an electronic device, and a storage medium, so as to solve the prior-art problem of low object recognition accuracy.
A first aspect of the embodiments of the present application provides an object recognition method, including:
acquiring a picture to be recognized, where the picture to be recognized is the minimum circumscribed rectangle picture corresponding to an object to be recognized;
performing target processing on the picture to be recognized according to a preset template size to obtain a target picture whose size is the template size, where the target processing includes background expansion processing; and
determining a recognition result corresponding to the object to be recognized according to the target picture and a preset object recognition network, where the object recognition network is a pre-trained neural network for recognizing objects.
Optionally, acquiring the picture to be recognized includes:
acquiring a grayscale picture, where the grayscale picture is the minimum circumscribed rectangle picture, in grayscale format, corresponding to the object to be recognized; and
performing binarization processing on the grayscale picture to obtain the picture to be recognized.
Optionally, performing binarization processing on the grayscale picture to obtain the picture to be recognized includes:
determining a binarization threshold according to the grayscale histogram of the grayscale picture; and
performing binarization processing on the grayscale picture according to the binarization threshold to obtain the picture to be recognized.
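The histogram-based threshold determination above can be sketched as follows. The disclosure does not name a specific method, so Otsu's method (choosing the threshold that maximises the between-class variance of the grayscale histogram) is assumed here as one common histogram-based choice; NumPy is used for the histogram arithmetic.

```python
import numpy as np

def histogram_threshold(gray):
    """Pick a binarization threshold from the grayscale histogram using
    Otsu's method: maximise the between-class variance over all splits."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum_count = np.cumsum(hist)                  # pixels with value <= t
    cum_sum = np.cumsum(hist * np.arange(256))   # intensity mass <= t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = cum_count[t] / total                # background class weight
        w1 = 1.0 - w0                            # foreground class weight
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = cum_sum[t] / cum_count[t]          # mean of class <= t
        mu1 = (cum_sum[-1] - cum_sum[t]) / (total - cum_count[t])
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

For a picture with two clearly separated intensity modes, the returned threshold falls between the modes, which is the behaviour the binarization step relies on.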
Optionally, performing target processing on the picture to be recognized according to the preset template size to obtain the target picture whose size is the template size includes:
if the size of the picture to be recognized is smaller than the template size, performing background expansion processing around the picture to be recognized, with the picture to be recognized at the center, to obtain a target picture of the template size.
Optionally, performing target processing on the picture to be recognized according to the preset template size to obtain the target picture whose size is the template size includes:
if the template size is square and the size of the picture to be recognized is larger than the template size, determining a scaling ratio as the ratio of the side length of the template size to the length of the long side of the picture to be recognized;
scaling the picture to be recognized according to the scaling ratio to obtain a scaled picture; and
expanding the short side of the scaled picture, through background expansion processing, until it equals the side length of the template size, to obtain a target picture of the template size.
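A minimal sketch of this scale-then-expand step, assuming a square template and a single-channel picture; the nearest-neighbour resampling below is a stand-in for whatever scaling routine an implementation would actually use (e.g. a library resize), and the function and parameter names are illustrative.

```python
import numpy as np

def fit_to_square_template(img, side, bg=0):
    """Scale a single-channel picture so its long side equals the square
    template side, then pad the short side with the background value."""
    h, w = img.shape
    scale = side / max(h, w)                 # template side / long side
    nh = max(1, round(h * scale))
    nw = max(1, round(w * scale))
    # nearest-neighbour resampling via index arrays
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    scaled = img[rows[:, None], cols]
    # background expansion of the short side, picture centred
    out = np.full((side, side), bg, dtype=img.dtype)
    top = (side - nh) // 2
    left = (side - nw) // 2
    out[top:top + nh, left:left + nw] = scaled
    return out
```

Because only one scaling ratio is used for both axes, the object's aspect ratio is preserved, and the size difference between objects survives as differently sized background margins.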
Optionally, the object recognition method is configured to recognize objects of preset types; correspondingly, before acquiring the picture to be recognized, the method includes:
taking the object with the largest size among the preset types of objects as a template object; and
determining the template size according to the size of the minimum circumscribed rectangle picture corresponding to the template object.
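Deriving the template size can be sketched as follows, assuming the minimum circumscribed rectangle sizes (height, width) of all object types are known in advance; the optional `margin` factor is an illustrative assumption, not part of the disclosure.

```python
import math

def template_size_from_objects(rect_sizes, margin=1.0):
    """Derive the square template side length from the minimum
    circumscribed rectangle sizes (h, w) of every object type: the
    template must cover the largest object, so take its longest side.
    `margin` > 1.0 would leave head-room around the largest object."""
    longest = max(max(h, w) for h, w in rect_sizes)
    return math.ceil(longest * margin)
```

With this choice, every picture to be recognized is no larger than the template, so the centred background expansion of the previous optional step always applies.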
Optionally, determining the recognition result corresponding to the object to be recognized according to the target picture and the preset object recognition network includes:
reducing the target picture according to the size of the input layer of the object recognition network to obtain a picture to be input; and
inputting the picture to be input into the object recognition network for processing to obtain the recognition result corresponding to the object to be recognized.
A second aspect of an embodiment of the present application provides an object recognition apparatus, including:
the image recognition device comprises a to-be-recognized image acquisition unit, a recognition unit and a recognition unit, wherein the to-be-recognized image acquisition unit is used for acquiring a to-be-recognized image which is a minimum external rectangular image corresponding to an object to be recognized;
the target processing unit is used for carrying out target processing on the picture to be recognized according to a preset template size to obtain a target picture with the size of the template; wherein the target process comprises a background augmentation process;
the identification result determining unit is used for determining an identification result corresponding to the object to be identified according to the target picture and a preset object identification network; the object recognition network is a neural network which is trained in advance and used for recognizing the object.
A third aspect of the embodiments of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, causes the electronic device to implement the steps of the object recognition method described above.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, causes an electronic device to implement the steps of the object recognition method described above.
A fifth aspect of the embodiments of the present application provides a computer program product which, when run on an electronic device, causes the electronic device to execute the object recognition method of any implementation of the first aspect.
Compared with the prior art, the embodiments of the present application have the following advantages. In the embodiments of the present application, a picture to be recognized is acquired, where the picture to be recognized is the minimum circumscribed rectangle picture corresponding to an object to be recognized; target processing, which includes background expansion processing, is performed on the picture to be recognized according to a preset template size to obtain a target picture of the template size; and a recognition result corresponding to the object to be recognized is determined according to the target picture and a preset object recognition network, where the object recognition network is a pre-trained neural network for recognizing objects. Under the same shooting conditions, the minimum circumscribed rectangle picture of an object is related to the actual size of the object, so the size of the picture to be recognized reflects, to a certain degree, the size of the object. When target pictures of the template size are obtained through background expansion processing, pictures to be recognized of different sizes leave background-expansion regions of different sizes. That is, compared with the existing approach of uniformly scaling all pictures to be recognized, background expansion processing expands the background of differently sized pictures to different degrees, so that the size of the object's image region in the target picture stays synchronized, to a certain degree, with the size of the picture to be recognized. Target pictures corresponding to objects that are similar in shape but different in size therefore differ substantially, and when object recognition is subsequently performed according to the target picture and the object recognition network, different objects can be distinguished more accurately, improving the accuracy of object recognition.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the embodiments or the description of the prior art will be briefly described below.
Fig. 1 is a schematic flow chart illustrating an implementation of an object identification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating a comparison of a scaling process performed on a picture to be recognized according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a comparison between target processing and recognition of a picture to be recognized according to an embodiment of the present application;
fig. 4 is an exemplary diagram of a picture obtained by a picture enhancement process according to an embodiment of the present application;
fig. 5 is a schematic diagram of a template object and a corresponding template picture according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of a to-be-identified picture of a set of rebars and a target picture corresponding to the to-be-identified picture provided in the embodiment of the present application;
fig. 7 is a schematic diagram of an object recognition apparatus according to an embodiment of the present application;
fig. 8 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
In addition, in the description of the present application, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
At present, object recognition based on a neural network is subject to a limitation: the input pictures must have a uniform size. Under this limitation, when the objects to be recognized and classified are similar in shape and identical in aspect ratio but different in size, scaling the pictures to a uniform size reduces the differences between object types, lowering the accuracy of the recognition network and therefore the accuracy of object recognition. To solve this technical problem, embodiments of the present application provide an object recognition method, an apparatus, an electronic device, and a storage medium, including: acquiring a picture to be recognized, where the picture to be recognized is the minimum circumscribed rectangle picture corresponding to an object to be recognized; performing target processing on the picture to be recognized according to a preset template size to obtain a target picture of the template size, where the target processing includes background expansion processing; and determining a recognition result corresponding to the object to be recognized according to the target picture and a preset object recognition network, where the object recognition network is a pre-trained neural network for recognizing objects.
Under the same shooting conditions, the minimum circumscribed rectangle picture of an object is related to the actual size of the object, so the size of the picture to be recognized reflects, to a certain degree, the size of the object. When target pictures of the template size are obtained through background expansion processing, pictures to be recognized of different sizes leave background-expansion regions of different sizes. That is, compared with the existing approach of uniformly scaling all pictures to be recognized, background expansion processing expands the background of differently sized pictures to different degrees, so that the size of the object's image region in the target picture stays synchronized, to a certain degree, with the size of the picture to be recognized. Target pictures corresponding to objects that are similar in shape but different in size therefore differ substantially, and when object recognition is subsequently performed according to the target picture and the object recognition network, different objects can be distinguished more accurately, improving the accuracy of object recognition.
The first embodiment is as follows:
fig. 1 shows a schematic flow chart of an object identification method provided in an embodiment of the present application, which is detailed as follows:
in S101, a picture to be recognized is obtained, wherein the picture to be recognized is a minimum circumscribed rectangle picture corresponding to an object to be recognized.
In the embodiments of the present application, the picture to be recognized is the minimum circumscribed rectangle picture corresponding to the object to be recognized, that is, the smallest rectangular picture that just completely contains the image information of the object to be recognized. For example, a designated area where the object to be recognized is placed may be photographed by a camera to obtain a photograph containing the image information of the object. The minimum circumscribed rectangular region corresponding to the object's image is then determined from the photograph, for example by object detection or image segmentation, and extracted to obtain the minimum circumscribed rectangle picture corresponding to the object to be recognized, which is taken as the picture to be recognized.
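The extraction step can be sketched as follows, assuming a binary mask of object pixels is already available (e.g. from segmentation) and simplifying the minimum circumscribed rectangle to an axis-aligned bounding box; the function name is illustrative.

```python
import numpy as np

def min_bounding_rect_crop(mask, image):
    """Crop the axis-aligned minimum bounding rectangle of the object
    from `image`, given a boolean mask of the object's pixels."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no object pixels")
    top, bottom = ys.min(), ys.max()
    left, right = xs.min(), xs.max()
    # +1 because slice ends are exclusive: keep the last object row/column
    return image[top:bottom + 1, left:right + 1]
```

The resulting crop is exactly the rectangular picture that "just completely contains" the object's pixels, so its size tracks the object's actual size.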
In S102, target processing is performed on the picture to be recognized according to a preset template size to obtain a target picture whose size is the template size, where the target processing includes background expansion processing.
In the embodiments of the present application, the template size is set in advance according to the requirements of subsequent picture recognition. For example, since an object recognition network generally requires a square input picture, the template size is generally square. After the picture to be recognized is obtained, target processing is performed on it according to the preset template size to obtain a target picture that contains the image information of the object to be recognized and has the template size, thereby standardizing the picture size and facilitating subsequent object recognition.
Specifically, unlike the existing approach of changing the picture size by simple scaling, the target processing in the embodiments of the present application includes background expansion processing, which changes the size of a picture by changing the size of its background region while keeping the image information of the original object unchanged. For example, a preset image area of the template size may be defined; after the picture to be recognized is placed in this area, the region outside the picture is filled with pixels having the same value as the picture's background, thereby expanding the background and yielding a target picture of the template size. Through this target processing, the size of the object's image region in the target picture stays synchronized, to a certain degree, with the size of the picture to be recognized, so that target pictures corresponding to objects that are similar in shape but different in size differ substantially; the target picture of the current object is thus effectively distinguished from the target pictures of similarly shaped objects of other types, and an accurate recognition result can subsequently be obtained from the target picture.
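The background expansion just described can be sketched as follows: the picture is placed at the centre of a template-sized canvas and the remainder is filled with the background pixel value. Function and parameter names are illustrative, not from the disclosure.

```python
import numpy as np

def expand_background(img, template_h, template_w, bg=0):
    """Background expansion: centre the single-channel picture on a
    template-sized canvas filled with the background pixel value."""
    h, w = img.shape
    if h > template_h or w > template_w:
        raise ValueError("picture larger than template; scale it first")
    canvas = np.full((template_h, template_w), bg, dtype=img.dtype)
    top = (template_h - h) // 2
    left = (template_w - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas
```

Unlike scaling, the object's pixels are copied unchanged; only the surrounding background grows, so a small object yields a wide margin and a large object a narrow one.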
Exemplarily, fig. 2 is a schematic diagram showing, for a picture A to be recognized corresponding to an object A and a picture B to be recognized corresponding to an object B, the scaled picture A and scaled picture B obtained with the existing scaling processing. Fig. 3 is a schematic diagram showing the target picture A and target picture B obtained by applying the target processing of the embodiments of the present application to the same two pictures. Object A and object B are similar in shape and aspect ratio but different in size, so picture A and picture B differ in size. As shown in fig. 2, with the existing simple scaling processing, scaled picture A and scaled picture B have the same size, and the image of object A is similar in shape to the image of object B, so the two scaled pictures are highly similar; object recognition based on the scaled pictures therefore cannot effectively distinguish object A from object B, and recognition accuracy is low. With the target processing of the embodiments of the present application, as shown in fig. 3, the images of the objects in target picture A and target picture B differ significantly in size, so the two target pictures differ significantly; subsequent object recognition based on the target pictures can effectively distinguish object A from object B, improving the accuracy of object recognition.
In S103, a recognition result corresponding to the object to be recognized is determined according to the target picture and a preset object recognition network, where the object recognition network is a pre-trained neural network for recognizing objects.
The preset object recognition network is a neural network trained in advance on a preset number of sample pictures and used to recognize objects. The sample pictures may be obtained in advance as the minimum circumscribed rectangle pictures corresponding to the preset types of objects to be recognized. In one embodiment, when the object recognition network is trained, a preset number of minimum circumscribed rectangle pictures may be obtained as a training set, the training set containing minimum circumscribed rectangle pictures for each of the preset types of objects; the pictures in the training set are then subjected to enhancement processing, for example white rectangular blocks (as shown in fig. 4) are randomly added to a certain proportion of the pictures (for example, 20% of the preset number) for random picture enhancement, so that the trained object recognition network is more robust.
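The random white-block enhancement can be sketched as follows; the block-size bound `max_frac` and the function name are illustrative assumptions, since the embodiment specifies only that white rectangular blocks are added at random.

```python
import numpy as np

def add_random_white_block(img, rng, max_frac=0.2):
    """Paste one randomly sized, randomly placed white rectangular block
    onto a copy of a single-channel picture (occlusion-style augmentation)."""
    h, w = img.shape
    bh = rng.integers(1, max(2, int(h * max_frac) + 1))  # block height
    bw = rng.integers(1, max(2, int(w * max_frac) + 1))  # block width
    top = rng.integers(0, h - bh + 1)
    left = rng.integers(0, w - bw + 1)
    out = img.copy()
    out[top:top + bh, left:left + bw] = 255
    return out
```

Applying this to a random subset of the training set (e.g. 20% of the pictures, per the embodiment) teaches the network to tolerate small occlusions and specular highlights.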
According to the target picture and the preset object recognition network, the recognition result corresponding to the object to be recognized can be obtained accurately from the image information of the object. The recognition result may be any one or more of the category, the name, and the grasping position of the object to be recognized.
Under the same shooting conditions, the minimum circumscribed rectangle picture of an object is related to the actual size of the object, so the size of the picture to be recognized reflects, to a certain degree, the size of the object. When target pictures of the template size are obtained through background expansion processing, pictures to be recognized of different sizes leave background-expansion regions of different sizes. That is, compared with the existing approach of uniformly scaling all pictures to be recognized, background expansion processing expands the background of differently sized pictures to different degrees, so that the size of the object's image region in the target picture stays synchronized, to a certain degree, with the size of the picture to be recognized. Target pictures corresponding to objects that are similar in shape but different in size therefore differ substantially, and when object recognition is subsequently performed according to the target picture and the object recognition network, different objects can be distinguished more accurately, improving the accuracy of object recognition.
Optionally, the step S101 includes:
acquiring a gray level picture, wherein the gray level picture is a minimum circumscribed rectangle picture in gray level format corresponding to the object to be recognized;
and carrying out binarization processing on the gray level picture to obtain a picture to be identified.
In this embodiment of the application, the picture to be recognized is specifically a minimum circumscribed rectangle picture in binary format corresponding to the object to be recognized. Generally, a picture shot by a camera is in a color format, for example RGB (Red, Green, Blue) format containing the three colors red, green, and blue; that is, the initially obtained minimum circumscribed rectangle picture of the object to be recognized is usually in a color format. Since a color-format picture carries color information in 3 channels, directly processing the initial color-format picture requires a large amount of calculation. Therefore, in this embodiment, gray processing may first be performed on the color-format minimum circumscribed rectangle picture corresponding to the object to be recognized, to obtain the minimum circumscribed rectangle picture in gray level format corresponding to the object to be recognized, that is, the gray level picture. The gray level picture is a single-channel picture, so the amount of calculation in subsequent picture processing can be reduced.
After the gray level picture is obtained, binarization processing is performed on it to obtain the picture to be recognized, which is a minimum circumscribed rectangle picture in binary format corresponding to the object to be recognized. In an embodiment, a fixed binarization threshold (for example, 100) may be set in advance, and binarization is performed on the gray level picture according to this threshold: the pixel values at the image positions where the object to be recognized is located are set to a first gray value, and the pixel values of the background region (the part of the gray level picture other than the image positions of the object to be recognized) are set to a second gray value, so as to obtain the picture to be recognized. For example, the first gray value may be 255 and the second gray value 0, so that the image corresponding to the object to be recognized is white and the background area is black; in this case, when the background expansion processing of step S102 is performed, a black border can be added to the picture to be processed to expand the background area of the picture to be recognized and obtain the target picture of the template size. It is understood that the first gray value may instead be set to 0 and the second gray value to 255.
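The fixed-threshold binarization described above (threshold 100, object pixels set to 255 and background to 0) can be sketched as:

```python
import numpy as np

def binarize(gray, threshold=100, object_value=255, background_value=0):
    """Fixed-threshold binarization: pixels brighter than the threshold
    are treated as the object (first gray value), the rest as background
    (second gray value). Assumes the object is brighter than the
    background; the values 100/255/0 are the examples from the text."""
    out = np.where(gray > threshold, object_value, background_value)
    return out.astype(np.uint8)

gray = np.array([[30, 90], [150, 220]], dtype=np.uint8)
binary = binarize(gray)   # → [[0, 0], [255, 255]]
```

When the object is darker than the background, swapping `object_value` and `background_value` gives the inverted variant mentioned at the end of the paragraph.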
The picture to be recognized obtained through binarization processing filters out interference information (such as stains in the background) present in the background regions of the original color-format picture and the gray level picture, so the target contour of interest is highlighted and the background area of the picture to be recognized is clearly distinguished from the image area corresponding to the object to be recognized, further improving the accuracy of subsequent object recognition.
Optionally, the binarizing the grayscale image to obtain the image to be identified includes:
determining a binarization threshold value according to the gray level histogram of the gray level picture;
and carrying out binarization processing on the gray level picture according to the binarization threshold value to obtain a picture to be identified.
In this embodiment of the application, before binarization processing is performed on the gray level picture, a gray level histogram of the gray level picture can be obtained, and the binarization threshold determined according to the gray level histogram. Specifically, after the gray level histogram is calculated from the gray level picture, clustering is performed through a k-means clustering algorithm to determine two cluster centers in the gray level histogram; the two cluster centers respectively represent the gray value corresponding to the background region of the gray level picture and the gray value corresponding to the region where the image of the object to be recognized is located. A binarization threshold that accurately separates the background area of the gray level picture from the image area of the object to be recognized is then determined according to the two cluster centers. For example, the two gray values corresponding to the two cluster centers may be added and divided by 2, and the resulting average used as the binarization threshold.
And then, carrying out binarization processing on the gray level picture by using the binarization threshold value, so as to obtain the picture to be identified, which can accurately distinguish the background area from the image area.
In the embodiment of the application, the binarization threshold value suitable for the current gray level picture can be determined according to the gray level histogram of the gray level picture, so that the accuracy of binarization processing can be improved, a more accurate picture to be identified can be obtained, and the accuracy of object identification can be improved.
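The histogram-based threshold selection can be sketched as a 1-D two-centre k-means over the pixel intensities, with the threshold taken as the mean of the two centres, as the text describes. This is a minimal illustration, not a production implementation:

```python
import numpy as np

def kmeans_threshold(gray, iters=20):
    """Two-centre 1-D k-means over pixel intensities. The centres are
    initialised at the darkest and brightest values; the binarization
    threshold is the average of the converged centres."""
    vals = gray.astype(np.float64).ravel()
    c0, c1 = vals.min(), vals.max()
    for _ in range(iters):
        # assign each pixel to its nearer centre
        mask = np.abs(vals - c0) <= np.abs(vals - c1)
        if mask.all() or (~mask).all():
            break  # degenerate case: one cluster empty
        c0, c1 = vals[mask].mean(), vals[~mask].mean()
    return (c0 + c1) / 2.0

gray = np.array([[10, 12, 11], [240, 245, 250]], dtype=np.uint8)
t = kmeans_threshold(gray)   # centres 11 and 245 → threshold 128.0
```

Running 1-D k-means directly on the flattened pixel values is equivalent to clustering the gray level histogram with frequency weights, which is how the text phrases it.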
Optionally, the step S102 includes:
and if the size of the picture to be recognized is smaller than the size of the template, performing background expansion processing around the picture to be recognized by taking the picture to be recognized as the center to obtain a target picture with the size of the template.
In this embodiment of the application, after the picture to be recognized is obtained, its size is compared with the template size. If the size of the picture to be recognized is smaller than the template size, the picture to be recognized is taken as the center and pixel points with the same pixel value as its background are added around it until the picture reaches the template size, thereby realizing the background expansion processing and obtaining a target picture of the template size. Specifically, a preset image area of the template size can be set and the picture to be recognized placed at its center; the blank area of the preset image area is then filled with pixel points whose pixel value is the background pixel value, realizing the background expansion and obtaining a target picture of the template size. The blank area is the part of the preset image area other than the area occupied by the picture to be recognized, and the background pixel value is the pixel value corresponding to the background area of the picture to be recognized. For example, when the background area of the picture to be recognized is black, that is, its pixel value (specifically, gray value) is 0, the blank area is filled with black pixel points.
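The centred background expansion for a picture smaller than the template can be sketched as follows, using the black background value of 0 from the example above:

```python
import numpy as np

def expand_background(pic, template_size, background_value=0):
    """Centre the picture on a template-sized canvas filled with the
    background value -- the background-expansion step described above.
    Assumes the picture fits inside the template."""
    th, tw = template_size
    h, w = pic.shape
    canvas = np.full((th, tw), background_value, dtype=pic.dtype)
    y, x = (th - h) // 2, (tw - w) // 2   # top-left offset for centring
    canvas[y:y + h, x:x + w] = pic
    return canvas

pic = np.full((2, 3), 255, dtype=np.uint8)      # small all-white picture
target = expand_background(pic, (6, 6))          # 6x6 canvas, centred
```

The same helper also covers the short-side expansion used in the scale-then-pad case later, since only the blank area around the placed picture is filled.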
Optionally, the step S102 includes:
if the template size is a square size and the size of the picture to be recognized is larger than the template size, determining a scaling ratio according to the ratio of the side length of the template size to the length of the long side of the picture to be recognized;
zooming the picture to be identified according to the zooming proportion to obtain a zoomed picture;
and expanding the short edge of the zoomed picture to be consistent with the side length of the template size through background expansion processing to obtain a target picture with the size of the template.
In this embodiment of the application, when the template size is a square size and the size of the current picture to be recognized is larger than the template size, a target picture of the template size cannot be obtained from the current rectangular picture to be recognized by background expansion alone or by scaling alone. In this case, the scaling ratio may be determined from the ratio of the side length of the template size to the length of the long side of the picture to be recognized. The picture to be recognized is then scaled proportionally according to this ratio to obtain a scaled picture, which is a rectangular picture whose long side equals the side length of the template size. The short side of the scaled picture is then extended to the side length of the template size through background expansion processing, yielding a square target picture of the template size. Specifically, a preset image area may be set according to the square template size, the scaled picture placed at its center, and the remaining area filled with pixel points whose pixel value is the background pixel value, realizing background expansion in the short-side direction and obtaining the target picture. The background pixel value is the pixel value corresponding to the background area of the picture to be recognized.
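The combined scale-then-pad case for a square template can be sketched as follows; the nearest-neighbour resize here is a dependency-free stand-in for a library call such as cv2.resize:

```python
import numpy as np

def fit_to_square_template(pic, side, background_value=0):
    """Shrink so the long side equals the template side, then pad the
    short side with background pixels -- the two-step case described
    above. Nearest-neighbour resampling is used for simplicity."""
    h, w = pic.shape
    scale = side / max(h, w)                       # ratio from long side
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    ys = (np.arange(nh) * h / nh).astype(int)      # nearest-neighbour rows
    xs = (np.arange(nw) * w / nw).astype(int)      # nearest-neighbour cols
    scaled = pic[ys][:, xs]
    canvas = np.full((side, side), background_value, dtype=pic.dtype)
    y, x = (side - nh) // 2, (side - nw) // 2      # centre on the canvas
    canvas[y:y + nh, x:x + nw] = scaled
    return canvas

pic = np.full((20, 10), 255, dtype=np.uint8)       # tall all-white picture
target = fit_to_square_template(pic, 8)            # 8x8, short side padded
```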
Optionally, the object identification method is configured to identify an object of a preset type, and correspondingly, before the obtaining of the picture to be identified, the method further includes:
taking the object with the largest size in the preset types of objects as a template object;
and determining the size of the template according to the size of the minimum circumscribed rectangle picture corresponding to the template object.
In the embodiment of the present application, the object identification method is specifically configured to identify objects of a preset type, and the template size in step S102 is specifically determined according to the size of the objects of the preset type. Specifically, the size of each kind of object in the preset kind of objects may be obtained first, and the object with the largest size may be used as the template object.
And then, determining the size of the template according to the size of the minimum circumscribed rectangle picture corresponding to the template object. Specifically, the minimum circumscribed rectangle picture (referred to as a template object picture for short) corresponding to the template object is the minimum rectangle picture including the image information of the template object, which is obtained by placing the template object in the designated area, and shooting and extracting the minimum circumscribed rectangle area through the camera. Correspondingly, when the picture to be recognized is acquired subsequently, the shooting conditions (such as the distance between the camera and the designated area, the shooting parameters and the like) corresponding to the picture to be recognized are kept consistent with the shooting conditions of the template object picture, so that the accuracy of subsequent object recognition is ensured.
In one embodiment, a preset number of template object pictures may be obtained and the average of their sizes used as the template size; during subsequent object recognition, only the pictures to be recognized corresponding to the template object may then exceed the template size, while the pictures to be recognized corresponding to all other types of objects are smaller than the template size. In another embodiment, after a preset number of template object pictures are obtained, the maximum of their sizes (the maximum template object picture size for short) may be obtained, and the template size set to a value larger than this maximum, so that during object recognition the size corresponding to the object to be recognized is smaller than the template size no matter what kind of object it is. In yet another embodiment, the template size is a square size and the minimum circumscribed rectangle picture is a rectangular picture with long and short sides; in this case, the long-side lengths of the preset number of template object pictures can be obtained and averaged to obtain the long-side average, and the template size determined from this average (a side length equal to or slightly larger than the long-side average is taken as the template size).
With any of the foregoing embodiments, at most the minimum circumscribed rectangle picture of one object (the template object) may be slightly larger than the template size, while the minimum circumscribed rectangle pictures of all other objects are smaller than the template size, so that during subsequent object recognition the picture to be recognized generally reaches the template size through background expansion. Correspondingly, step S102 may specifically include: if the size of the picture to be recognized is smaller than the template size, performing background expansion processing on the picture to be recognized to obtain the target picture; otherwise, performing scaling processing, or scaling processing plus background expansion processing (specifically, the case where the object to be recognized is the template object), on the picture to be recognized to obtain the target picture.
In the embodiment of the application, the object with the largest size in the preset objects is obtained in advance to serve as the template object, and the size of the template is determined according to the size of the minimum external rectangular picture of the template object, so that the size of the template can be accurately determined according to the specific size condition of the object needing to be identified at present, and the accuracy and the efficiency of subsequent target processing are improved.
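The long-side-average variant of template-size selection can be sketched as follows; the 20-pixel margin is an assumption chosen so that the steel bar example later in the text (average picture size 1380 × 780, template size 1400) is reproduced:

```python
def template_size_from_long_sides(picture_sizes):
    """Take the mean long side of the template-object pictures and use a
    value slightly above it as the square template side. The +20 margin
    is illustrative, not specified by the patent."""
    long_sides = [max(h, w) for h, w in picture_sizes]
    mean_long = sum(long_sides) / len(long_sides)
    return int(mean_long) + 20

# three hypothetical shots of the largest (template) object
side = template_size_from_long_sides([(1380, 780), (1360, 790), (1400, 770)])
# → 1400, matching the 1400 * 1400 template in the rebar example
```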
Optionally, the step S103 includes:
reducing the target picture according to the size of an input layer of the object recognition network to obtain a picture to be input;
And inputting the picture to be input into the object recognition network for processing to obtain a recognition result corresponding to the object to be recognized.
In the embodiment of the application, the size of the input layer of the object identification network is usually set to be smaller, that is, the size of the image required to be input by the object identification network is smaller, so that the operation amount of the object identification network during object identification processing is reduced, and the processing efficiency of the object identification network is improved. The size (i.e., the size of the template) of the target picture obtained through the target processing in step S102 is usually much larger than the size of the input layer, so that the target picture needs to be reduced according to the input size of the object recognition network, and a picture with a size consistent with the size of the input layer is obtained as the picture to be input. And then, inputting the picture to be input into the object identification network for processing, so as to obtain an identification result corresponding to the object to be identified.
In one embodiment, the object recognition network is specifically a convolutional neural network. Since the size of its input layer is small, that is, the picture to be input is small, the convolutional neural network may omit pooling layers; no downsampling is performed, so the detail features of the picture to be input are retained. In particular, the object recognition network may be a convolutional neural network comprising two convolutional layers and one fully-connected layer. For example, the size of the input layer may be 128 × 128; the first convolutional layer has a 3 × 3 convolution kernel, 16 channels, and a stride of 3; the second convolutional layer has a 3 × 3 convolution kernel, 64 channels, and a stride of 1; the fully-connected layer has 1849 × 64 nodes; and the classification layer has 7 output nodes (that is, the object to be recognized is identified as one of 7 preset objects).
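The quoted layer sizes can be checked with the standard convolution output-size formula; reproducing the 43 × 43 feature maps requires assuming a padding of 1, which the text does not state explicitly:

```python
def conv_out(size, kernel, stride, padding):
    """Standard convolution output-size formula:
    floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

s1 = conv_out(128, 3, 3, 1)   # first conv, stride 3 → 43
s2 = conv_out(s1, 3, 1, 1)    # second conv, stride 1 → 43
nodes = s2 * s2               # 43 * 43 = 1849, matching the text
```

With padding 1 the arithmetic agrees with both the 43 × 43 feature maps and the 1849 × 64 fully-connected layer quoted above; without padding the first convolution would yield 42 × 42 instead.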
Optionally, after the step S103, the method further includes:
and according to the identification result, indicating a mechanical arm to grab the object to be identified.
In this embodiment of the application, the recognition result may specifically be the type information of the object to be recognized. After the recognition result is determined, a grabbing-position calculation algorithm and a grabbing action suitable for that type of object are obtained, and the mechanical arm is instructed to grab the object to be recognized according to the resulting grabbing point and grabbing action. For example, if the recognition result for the current object to be recognized is determined to be the No. 8 steel bar, the grabbing-position calculation algorithm and grabbing action corresponding to the No. 8 steel bar can be obtained through the recognition result, so that the No. 8 steel bar can be grabbed. Illustratively, the grabbing-position calculation algorithm corresponding to the No. 8 steel bar includes: determining the position of the center point of the image of the No. 8 steel bar; determining the positive direction of the image of the No. 8 steel bar; and determining the current grabbing position according to the positional relation between the direction vector and the vertical line and the angle of the steel bar. Illustratively, the grabbing action includes: grabbing the steel bar with a mechanical-arm tool head and then binding it with a binding gun; alternatively, correcting the posture of the steel bar by rotating a poking rod and positioning it by electromagnet adsorption.
The following describes an example of the object identification method according to the embodiment of the present application, with steel bar identification as an application scenario:
The object recognition method in this embodiment is used to distinguish 7 different types of steel bars, numbered 1-7; that is, the current task is to determine which of the 7 steel bar types the object to be recognized is. Among the No. 1-7 steel bars to be recognized, the No. 2 steel bar is the largest, and the average size of its minimum circumscribed rectangle picture is 1380 × 780, so the No. 2 steel bar is taken as the template object, and the template size is set slightly larger than the long side of that average size: 1400 × 1400. Fig. 5 shows a comparison between the minimum circumscribed rectangle picture corresponding to the No. 2 steel bar and the template picture (a picture whose size is the template size determined according to the No. 2 steel bar).
On a steel bar production line, the steel bar to be recognized is taken as the object to be recognized, and the recognition process is as follows:
(1) A camera is arranged above the designated area through which the steel bar to be recognized passes, and shoots the steel bar to obtain a color picture containing its image information. The minimum circumscribed rectangle area where the image of the steel bar is located is extracted from the color picture to obtain a target color picture, which is the minimum circumscribed rectangle picture in color format corresponding to the steel bar to be recognized.
(2) Gray processing is performed on the target color picture to obtain the gray level picture corresponding to the steel bar to be recognized.
(3) Binarization processing is performed on the gray level picture to obtain the picture to be recognized, in which the background area is black and the steel bar image area is white.
(4) If the size of the picture to be recognized is larger than the 1400 × 1400 template size (which usually means the picture corresponds to the No. 2 steel bar), the picture is scaled to 1400 × 1400 by a resize instruction.
(5) If the size of the picture to be recognized is smaller than the template size, the picture to be recognized is placed in the center of a preset image area, the blank space around it is calculated, and that space is filled with black pixels, thereby expanding the background of the picture to be recognized and obtaining a target picture of the template size. For example, fig. 6 shows a comparison between the pictures to be recognized and the target pictures corresponding to the No. 1 and No. 3-7 steel bars.
(6) The 1400 × 1400 target picture is reduced to the 128 × 128 size of the input layer of the object recognition network to obtain the picture to be input.
(7) The 128 × 128 picture to be input is fed into the object recognition network. A first convolution with a 3 × 3 × 16 kernel and a stride of 3 is performed, changing the picture size to 43 × 43 × 16; a second convolution with a 3 × 3 × 64 kernel and a stride of 1 is then performed, changing the picture size to 43 × 43 × 64. A fully-connected layer with parameters 43 × 43 × 64 is added at this point, and its output is fed into the softmax output layer of the neural network to obtain the corresponding recognition result, determining which number of steel bar the current steel bar to be recognized is. Since 7 types of bent steel bars are recognized here, the output layer has 7 parameters.
With the above object recognition method, different types of objects can be recognized accurately; the method is particularly suitable for classifying and recognizing objects with similar shapes and similar length-width ratios but different sizes (such as steel bars). Its network structure is small and its recognition speed is fast, so it is suitable for deployment on an industrial terminal, cooperating with an industrial robot arm to classify and detect objects on an assembly line and to subsequently grab and place them.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Example two:
fig. 7 is a schematic structural diagram of an object recognition device according to an embodiment of the present application, and for convenience of description, only the parts related to the embodiment of the present application are shown:
the object recognition apparatus includes: a picture to be recognized acquisition unit 71, a target processing unit 72 and a recognition result determination unit 73. Wherein:
and the to-be-identified picture acquiring unit 71 is configured to acquire a to-be-identified picture, where the to-be-identified picture is a minimum circumscribed rectangle picture corresponding to the to-be-identified object.
The target processing unit 72 is configured to perform target processing on the picture to be recognized according to a preset template size to obtain a target picture with the size of the template; wherein the target process comprises a background augmentation process.
An identification result determining unit 73, configured to determine an identification result corresponding to the object to be identified according to the target picture and a preset object identification network; the object recognition network is a neural network which is trained in advance and used for recognizing the object.
Optionally, the to-be-identified picture obtaining unit 71 includes a grayscale picture obtaining module and a binarization processing module:
the grayscale image acquisition module is used for acquiring a grayscale image, wherein the grayscale image is a minimum external rectangular image in a grayscale format corresponding to an object to be identified;
and the binarization processing module is used for carrying out binarization processing on the gray level picture to obtain a picture to be identified.
Optionally, the binarization processing module is specifically configured to determine a binarization threshold according to a gray level histogram of the gray level picture; and carrying out binarization processing on the gray level picture according to the binarization threshold value to obtain a picture to be identified.
Optionally, the target processing unit 72 is specifically configured to, if the size of the to-be-identified picture is smaller than the size of the template, perform background expansion processing around the to-be-identified picture by taking the to-be-identified picture as a center, and obtain a target picture with the size of the template.
Optionally, the target processing unit 72 is specifically configured to determine a scaling ratio according to a ratio of a side length of the template size to a length of a long side of the picture to be recognized if the template size is a square size and the size of the picture to be recognized is larger than the template size; zooming the picture to be identified according to the zooming proportion to obtain a zoomed picture; and expanding the short edge of the zoomed picture to be consistent with the side length of the template size through background expansion processing to obtain a target picture with the size of the template.
Optionally, the object identification apparatus further includes:
a template size determining unit for taking the object with the largest size among the preset kinds of objects as a template object; and determining the size of the template according to the size of the minimum circumscribed rectangle picture corresponding to the template object.
Optionally, the recognition result determining unit 73 is specifically configured to perform reduction processing on the target picture according to the size of the input layer of the object recognition network to obtain a picture to be input; and inputting the picture to be input into the object recognition network for processing to obtain a recognition result corresponding to the object to be recognized.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Example three:
fig. 8 is a schematic diagram of an electronic device provided in an embodiment of the present application. As shown in fig. 8, the electronic apparatus 8 of this embodiment includes: a processor 80, a memory 81 and a computer program 82, such as an object identification program, stored in said memory 81 and operable on said processor 80. The processor 80, when executing the computer program 82, implements the steps in the various object identification method embodiments described above, such as the steps S101 to S103 shown in fig. 1. Alternatively, the processor 80 executes the computer program 82 to implement the functions of the modules/units in the device embodiments, such as the functions of the to-be-recognized picture acquiring unit 71 to the recognition result determining unit 73 shown in fig. 7.
Illustratively, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 82 in the electronic device 8.
The electronic device 8 may be a desktop computer, a notebook, a palm computer, or other computing devices. The electronic device may include, but is not limited to, a processor 80, a memory 81. Those skilled in the art will appreciate that fig. 8 is merely an example of an electronic device 8 and does not constitute a limitation of the electronic device 8 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the electronic device may also include input-output devices, network access devices, buses, etc.
The Processor 80 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The storage 81 may be an internal storage unit of the electronic device 8, such as a hard disk or a memory of the electronic device 8. The memory 81 may also be an external storage device of the electronic device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 8. Further, the memory 81 may also include both an internal storage unit and an external storage device of the electronic device 8. The memory 81 is used for storing the computer program and other programs and data required by the electronic device. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of the functional units and modules is illustrated. In practical applications, the above functions may be allocated to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit, and the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from one another and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. An object recognition method, comprising:
acquiring a picture to be recognized, wherein the picture to be recognized is a minimum circumscribed rectangle picture corresponding to an object to be recognized;
performing target processing on the picture to be recognized according to a preset template size to obtain a target picture whose size is the template size, wherein the target processing comprises background expansion processing; and
determining a recognition result corresponding to the object to be recognized according to the target picture and a preset object recognition network, wherein the object recognition network is a neural network trained in advance for recognizing objects.
2. The object recognition method according to claim 1, wherein the acquiring the picture to be recognized comprises:
acquiring a grayscale picture, wherein the grayscale picture is a minimum circumscribed rectangle picture in grayscale format corresponding to the object to be recognized; and
performing binarization processing on the grayscale picture to obtain the picture to be recognized.
3. The object recognition method according to claim 2, wherein the performing binarization processing on the grayscale picture to obtain the picture to be recognized comprises:
determining a binarization threshold according to a grayscale histogram of the grayscale picture; and
performing binarization processing on the grayscale picture according to the binarization threshold to obtain the picture to be recognized.
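Claims 2 and 3 do not fix how the binarization threshold is derived from the grayscale histogram. As one common, illustrative choice (an assumption, not the patent's stated rule), the sketch below uses Otsu's between-class-variance criterion, with NumPy only:

```python
import numpy as np

def binarize_from_histogram(gray):
    """Derive a threshold from the grayscale histogram (Otsu's method,
    assumed here) and binarize the picture to 0/255."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum_w = np.cumsum(hist)                   # pixel count at or below t
    cum_m = np.cumsum(hist * np.arange(256))  # intensity mass at or below t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0, w1 = cum_w[t], total - cum_w[t]
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_m[t] / w0                    # mean of the dark class
        m1 = (cum_m[-1] - cum_m[t]) / w1      # mean of the bright class
        var = w0 * w1 * (m0 - m1) ** 2        # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return np.where(gray > best_t, 255, 0).astype(np.uint8)
```

Any histogram-driven rule (e.g. a fixed quantile) would satisfy the claim wording equally well; Otsu is chosen only because it is parameter-free.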
4. The object recognition method according to claim 1, wherein the performing target processing on the picture to be recognized according to the preset template size to obtain the target picture whose size is the template size comprises:
if the size of the picture to be recognized is smaller than the template size, performing background expansion processing around the picture to be recognized, with the picture to be recognized as the center, to obtain the target picture whose size is the template size.
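A minimal sketch of the centered background-expansion step of claim 4, assuming a constant background fill value (0 here, i.e. a black background for a binarized picture; the patent does not fix the fill value):

```python
import numpy as np

def expand_to_template(img, template_h, template_w, background=0):
    """Place the picture at the center of a template-sized canvas and
    fill the surrounding area with the background value."""
    h, w = img.shape[:2]
    if h > template_h or w > template_w:
        raise ValueError("picture must be no larger than the template")
    canvas = np.full((template_h, template_w), background, dtype=img.dtype)
    top = (template_h - h) // 2
    left = (template_w - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas
```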
5. The object recognition method according to claim 1, wherein the performing target processing on the picture to be recognized according to the preset template size to obtain the target picture whose size is the template size comprises:
if the template size is a square size and the size of the picture to be recognized is larger than the template size, determining a scaling ratio according to a ratio of the side length of the template size to the length of the long side of the picture to be recognized;
scaling the picture to be recognized according to the scaling ratio to obtain a scaled picture; and
expanding the short side of the scaled picture to the side length of the template size through background expansion processing, to obtain the target picture whose size is the template size.
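The scale-then-expand branch of claim 5 can be sketched as follows. Two assumptions are made that the claim leaves open: the resize uses nearest-neighbour sampling (to stay dependency-free), and the expanded short side is centered, matching the centering of claim 4:

```python
import numpy as np

def scale_and_expand(img, template_side, background=0):
    """Scale so the long side equals the square template side, then
    expand the short side with background pixels to template_side."""
    h, w = img.shape[:2]
    scale = template_side / max(h, w)          # ratio of template side to long side
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    rows = np.arange(new_h) * h // new_h       # nearest-neighbour row picks
    cols = np.arange(new_w) * w // new_w       # nearest-neighbour column picks
    resized = img[rows][:, cols]
    canvas = np.full((template_side, template_side), background, dtype=img.dtype)
    top = (template_side - new_h) // 2
    left = (template_side - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```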
6. The object recognition method according to claim 1, wherein the object recognition method is used for recognizing preset kinds of objects, and correspondingly, before the acquiring the picture to be recognized, the method further comprises:
taking the object with the largest size among the preset kinds of objects as a template object; and
determining the template size according to the size of the minimum circumscribed rectangle picture corresponding to the template object.
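Claim 6 derives the template size from the largest object's minimum circumscribed rectangle. A short sketch, with two labeled assumptions the claim does not fix: "largest" is taken as largest rectangle area, and the square template side is that rectangle's longer edge:

```python
def template_side_from_rects(rect_sizes):
    """rect_sizes: (height, width) of each object kind's minimum
    circumscribed rectangle picture. Returns the square template side."""
    # Largest object chosen by rectangle area (assumption).
    h, w = max(rect_sizes, key=lambda hw: hw[0] * hw[1])
    # Square template side taken as the longer edge (assumption),
    # so every object's rectangle fits inside the template.
    return max(h, w)
```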
7. The object recognition method according to any one of claims 1 to 6, wherein the determining the recognition result corresponding to the object to be recognized according to the target picture and the preset object recognition network comprises:
performing reduction processing on the target picture according to the size of an input layer of the object recognition network to obtain a picture to be input; and
inputting the picture to be input into the object recognition network for processing to obtain the recognition result corresponding to the object to be recognized.
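The reduction step of claim 7 can be sketched as plain down-sampling to the network's input-layer size (assumed square here; the patent does not fix the sampling method, and the recognition network itself is treated as a black box):

```python
import numpy as np

def reduce_to_input(target, input_side):
    """Shrink the template-sized target picture to the input-layer side
    length by nearest-neighbour sampling."""
    h, w = target.shape[:2]
    rows = np.arange(input_side) * h // input_side
    cols = np.arange(input_side) * w // input_side
    return target[rows][:, cols]
```

The reduced picture would then be fed to the pre-trained object recognition network to obtain the recognition result.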
8. An object recognition device, comprising:
the image recognition device comprises a to-be-recognized image acquisition unit, a recognition unit and a recognition unit, wherein the to-be-recognized image acquisition unit is used for acquiring a to-be-recognized image which is a minimum external rectangular image corresponding to an object to be recognized;
the target processing unit is used for carrying out target processing on the picture to be recognized according to a preset template size to obtain a target picture with the size of the template; wherein the target process comprises a background augmentation process;
the identification result determining unit is used for determining an identification result corresponding to the object to be identified according to the target picture and a preset object identification network; the object recognition network is a neural network which is trained in advance and used for recognizing the object.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the computer program, when executed by the processor, causes the electronic device to carry out the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, causes an electronic device to carry out the steps of the method according to any one of claims 1 to 7.
CN202110294884.6A 2021-03-19 2021-03-19 Object identification method, device, electronic equipment and storage medium Active CN113111921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110294884.6A CN113111921B (en) 2021-03-19 2021-03-19 Object identification method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113111921A true CN113111921A (en) 2021-07-13
CN113111921B CN113111921B (en) 2024-06-14

Family

ID=76711750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110294884.6A Active CN113111921B (en) 2021-03-19 2021-03-19 Object identification method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113111921B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110243376A1 (en) * 2007-08-04 2011-10-06 Continental Teves Ag & Ohg Method and a device for detecting objects in an image
JP2013020617A (en) * 2011-07-08 2013-01-31 Fujitsu Ltd Grayscale character image normalization device and method
CN110706158A (en) * 2019-10-15 2020-01-17 Oppo广东移动通信有限公司 Image processing method, image processing device and terminal equipment
CN110826566A (en) * 2019-11-01 2020-02-21 北京环境特性研究所 Target slice extraction method based on deep learning
CN112347985A (en) * 2020-11-30 2021-02-09 广联达科技股份有限公司 Material type detection method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PEIJIE LIU, ET AL.: "Based on CycleGAN infrared ship image expansion method", 2020 International Conference on Image, Video Processing and Artificial Intelligence, 31 December 2020 (2020-12-31), pages 1-13 *
BAO Yi, et al.: "Research on Robust Recognition of Three-Dimensional Targets Based on Template Matching", Woodworking Machine Tool, no. 4, pages 17-19 *

Also Published As

Publication number Publication date
CN113111921B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
CN110060237B (en) Fault detection method, device, equipment and system
CN107944450B (en) License plate recognition method and device
CN111310775A (en) Data training method and device, terminal equipment and computer readable storage medium
CN113160257A (en) Image data labeling method and device, electronic equipment and storage medium
CN112446363A (en) Image splicing and de-duplication method and device based on video frame extraction
CN112348778B (en) Object identification method, device, terminal equipment and storage medium
CN110852233A (en) Hand-off steering wheel detection and training method, terminal, device, medium, and system
CN111680690A (en) Character recognition method and device
CN110276759B (en) Mobile phone screen bad line defect diagnosis method based on machine vision
CN111860060A (en) Target detection method and device, terminal equipment and computer readable storage medium
CN111242240A (en) Material detection method and device and terminal equipment
CN111340796A (en) Defect detection method and device, electronic equipment and storage medium
CN113159064A (en) Method and device for detecting electronic element target based on simplified YOLOv3 circuit board
CN111191582A (en) Three-dimensional target detection method, detection device, terminal device and computer-readable storage medium
CN112419214A (en) Method and device for generating labeled image, readable storage medium and terminal equipment
CN111523429A (en) Deep learning-based steel pile identification method
CN114444565A (en) Image tampering detection method, terminal device and storage medium
CN111754502A (en) Method for detecting surface defects of magnetic core based on fast-RCNN algorithm of multi-scale feature fusion
CN108960246B (en) Binarization processing device and method for image recognition
CN114219402A (en) Logistics tray stacking identification method, device, equipment and storage medium
CN113762159A (en) Target grabbing detection method and system based on directional arrow model
CN113111921B (en) Object identification method, device, electronic equipment and storage medium
CN113947529B (en) Image enhancement method, model training method, component identification method and related equipment
CN114677373A (en) Printed matter content error detection method and device, electronic equipment and medium
CN111598943B (en) Book in-place detection method, device and equipment based on book auxiliary reading equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant