CN113807315A - Method, device, equipment and medium for constructing recognition model of object to be recognized - Google Patents

Method, device, equipment and medium for constructing recognition model of object to be recognized Download PDF

Info

Publication number
CN113807315A
Authority
CN
China
Prior art keywords
recognized, picture, identified, graph, sample picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111171015.0A
Other languages
Chinese (zh)
Other versions
CN113807315B (en)
Inventor
陈茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wensihai Huizhike Technology Co ltd
Original Assignee
Wensihai Huizhike Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wensihai Huizhike Technology Co ltd
Priority to CN202111171015.0A
Publication of CN113807315A
Application granted; publication of CN113807315B
Legal status: Active

Classifications

    • G06F16/5838: Information retrieval of still image data; retrieval characterised by metadata automatically derived from the content, using colour
    • G06F16/5854: Retrieval characterised by metadata automatically derived from the content, using shape and object relationship
    • G06F16/587: Retrieval characterised by metadata using geographical or spatial information, e.g. location
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/08: Neural networks; learning methods


Abstract

The application provides a method, an apparatus, a device, and a medium for constructing a recognition model of an object to be recognized. The method comprises: obtaining sample pictures; acquiring a first position coordinate of the graph of the object to be recognized in each sample picture containing the object; inputting each sample picture into an initial recognition model of the object to be recognized to obtain a second position coordinate of a predicted graph of the object; and training the initial recognition model based on the second position coordinate of the predicted graph and the first position coordinate of the labelled graph, to obtain the recognition model of the object to be recognized. The method and device solve the prior-art problem that the recognition accuracy of a trained recognition model is not high.

Description

Method, device, equipment and medium for constructing recognition model of object to be recognized
Technical Field
The present application relates to the field of computer information technology, and in particular, to a method, an apparatus, a device, and a medium for constructing an identification model of an object to be identified.
Background
With the rapid development of automation technology in recent years, the demand for automatic detection and recognition of pictures keeps growing. For example, traffic signs are an important component of road facilities and an important carrier of road traffic information: they convey key information such as speed limits and upcoming road-condition changes, provide road information to drivers, and give timely safety warnings that encourage cautious driving. Traffic sign recognition in the field of automatic driving therefore needs to be both faster and more accurate.
In the prior art there are many picture recognition methods. A common one is to build a recognition model and feed it the picture to be recognized, so as to determine whether that picture contains the desired object. With this method, however, the recognition model is trained on sample pictures that do or do not contain the object to be recognized, and training is completed simply by comparing the model's prediction of whether a sample picture contains the object against the ground truth. Because only picture-level labels are used, the accuracy of the resulting recognition model is limited.
Disclosure of Invention
In view of this, an object of the present application is to provide a method, an apparatus, a device, and a medium for constructing a recognition model of an object to be recognized, so as to solve the prior-art problem that the recognition accuracy of a trained recognition model is not high.
In a first aspect, an embodiment of the present application provides a method for constructing a recognition model of an object to be recognized, where the method includes:
obtaining a sample picture;
acquiring a first position coordinate of a graph of the object to be identified in each sample picture with the object to be identified;
for each sample picture, inputting the sample picture into an identification initial model of the object to be identified to obtain a second position coordinate of a prediction graph of the object to be identified;
and training the recognition initial model of the object to be recognized based on the second position coordinate of the prediction graph of the object to be recognized and the first position coordinate of the graph of the object to be recognized to obtain the recognition model of the object to be recognized.
Further, the training of the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified includes:
if the sample picture corresponding to the object to be recognized prediction graph is a picture without the object to be recognized, adjusting the training parameters of the object to be recognized recognition initial model until the object to be recognized prediction graph output by the trained object to be recognized recognition initial model is empty;
if the sample picture corresponding to the predicted graph of the object to be recognized is a picture with the object to be recognized, acquiring a first pixel point of the predicted graph of the object to be recognized from the predicted graph of the object to be recognized;
acquiring a first pixel number marked as an object to be identified in a sample picture, and acquiring a second pixel number marked as the object to be identified from a predicted graph of the object to be identified;
calculating a loss value based on the second position coordinate of the first pixel point, the first position coordinate corresponding to the first pixel point, the first pixel number and the second pixel number;
and if the loss value is greater than the preset loss threshold, adjusting the training parameters of the initial model for recognizing the object to be recognized until the loss value of the trained initial model for recognizing the object to be recognized is not greater than the loss threshold.
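The training steps above can be sketched in Python. The exact form of the loss is not given in the text, so the equal weighting of the coordinate term and the pixel-count term (`count_weight`) is an assumption, as are the function names:

```python
def combined_loss(pred_coords, true_coords, pred_pixel_count, true_pixel_count,
                  count_weight=1.0):
    """Hypothetical loss combining the coordinate error of the matched pixel
    points with the difference between the two pixel counts, as the steps
    above describe. The weighting between the terms is an assumption."""
    # Mean squared error between second (predicted) and first (labelled)
    # position coordinates of the matched pixel points.
    coord_err = sum((px - tx) ** 2 + (py - ty) ** 2
                    for (px, py), (tx, ty) in zip(pred_coords, true_coords))
    coord_err /= max(len(pred_coords), 1)
    # Penalty for a mismatch between the number of pixels labelled as the
    # object in the sample picture and in the predicted graph.
    count_err = abs(pred_pixel_count - true_pixel_count)
    return coord_err + count_weight * count_err


def keep_training(loss_value, loss_threshold):
    """Training parameters keep being adjusted while the loss exceeds the
    preset threshold (decision logic only; updates are model-specific)."""
    return loss_value > loss_threshold
```

A perfect prediction (identical coordinates and pixel counts) gives a loss of zero, so training stops once the loss falls to the threshold or below.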
Further, the method further comprises:
adjusting the obtained sample picture to the input picture size required by the identification model of the object to be identified;
carrying out data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture;
selecting enhanced pictures with random numbers to be spliced to obtain spliced pictures;
adjusting the spliced picture to the size of the input picture, and acquiring the position coordinates of each object to be identified in the spliced picture with the adjusted size;
and expanding the sample picture with the adjusted size according to the spliced picture with the adjusted size.
Further, the data enhancement comprises: random scaling, color gamut variation, and flipping.
Further, the data enhancement includes random amplification, and performs data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture, including:
and adding an additional bar around the sample picture with the adjusted size to obtain an enhanced picture with the additional bar.
Further, the method further comprises:
acquiring a picture to be recognized, and adjusting the acquired picture to be recognized to an input picture size required by the recognition model of the object to be recognized;
and inputting the picture to be recognized with the adjusted size into the recognition model of the object to be recognized to obtain the image of the object to be recognized.
Further, the object to be identified is a traffic sign, and the method further includes:
and inquiring a preset mapping relation library of each traffic sign template graph and the traffic sign type, and identifying the traffic sign type of the object graph to be identified.
In a second aspect, an embodiment of the present application provides an apparatus for constructing a recognition model of an object to be recognized, where the apparatus includes:
the sample picture acquisition module is used for acquiring a sample picture;
the first position coordinate acquisition module is used for acquiring a first position coordinate of a graph of the object to be identified in each sample picture with the object to be identified;
the second position coordinate acquisition module is used for inputting the sample picture into the identification initial model of the object to be identified aiming at each sample picture to obtain a second position coordinate of the prediction graph of the object to be identified;
and the identification model determining module of the object to be identified is used for training the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified to obtain the identification model of the object to be identified.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; the processor and the memory communicate via the bus when the electronic device runs; and the machine-readable instructions, when executed by the processor, perform the steps of the method for constructing a recognition model of an object to be recognized described above.
In a fourth aspect, the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the method for constructing a recognition model of an object to be recognized described above.
According to the method and the device for constructing the identification model of the object to be identified, a sample picture is obtained; then, acquiring a first position coordinate of the graph of the object to be identified in each sample picture with the object to be identified; for each sample picture, inputting the sample picture into an identification initial model of the object to be identified to obtain a second position coordinate of a prediction graph of the object to be identified; and finally, training the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified to obtain the identification model of the object to be identified.
According to the method and device for constructing the recognition model of the object to be recognized, when the initial recognition model is trained, the position coordinates of the object's graph in the sample picture are compared with the position coordinates of the predicted graph, and pixel-level labels are compared as well. The trained recognition model therefore locates the object to be recognized more accurately, and its recognition accuracy is higher.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a flowchart of a method for constructing an object recognition model to be recognized according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for training an object to be recognized to recognize an initial model according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an apparatus for constructing an identification model of an object to be identified according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. Every other embodiment that can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present application falls within the protection scope of the present application.
With the rapid development of automation technology in recent years, the demand for automatic detection and recognition of pictures keeps growing. For example, traffic signs are an important component of road facilities and an important carrier of road traffic information: they convey key information such as speed limits and upcoming road-condition changes, provide road information to drivers, and give timely safety warnings that encourage cautious driving. Traffic sign recognition in the field of automatic driving therefore needs to be both faster and more accurate.
It has been found that there are many methods for identifying an object to be identified in the prior art, such as color-based detection, shape-based detection, multi-feature fusion-based detection, and candidate region-based target detection algorithms. However, there are many disadvantages to the several approaches described above.
Color-based detection methods fall into two categories. The first is the RGB color-model method, which segments the captured RGB image directly; this reduces the amount of computation, greatly improves speed, and meets the algorithm's real-time requirement. It has a clear drawback, however: when the environment around the traffic sign is complex, the sign blends with background noise and the algorithm cannot achieve a good detection result. The second is the HSI color-model method. Because the HSI color space is largely invariant to illumination, it is more robust, but converting RGB to HSI costs a certain amount of computation, so real-time performance has to be improved with hardware support.
The basic idea of the shape-based detection method is to divide the image into cells, accumulate histograms of edge directions within each cell, and finally combine the histogram entries into a feature that describes the object. This method is invariant to rotation and scaling, but it is computationally expensive.
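The per-cell histogram step of this shape-based method can be sketched as follows. The 9-bin unsigned-orientation layout is a common convention (as in HOG descriptors) and an assumption here, not something the text specifies:

```python
import math

def cell_orientation_histogram(gx, gy, bins=9):
    """Accumulate a magnitude-weighted histogram of edge directions for one
    cell; gx and gy are the cell's horizontal and vertical gradient grids."""
    hist = [0.0] * bins
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            magnitude = math.hypot(dx, dy)
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(dy, dx)) % 180
            hist[int(angle // (180 / bins)) % bins] += magnitude
    return hist
```

Concatenating such histograms over all cells yields the final descriptor; the per-pixel trigonometry over every cell is part of what makes the method computationally expensive.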
The detection method based on multi-feature fusion combines information from the RGB and HSI color channels to segment the traffic sign. The algorithm merges the segmentation results of the RGB and HSI color spaces, overcoming the image-information loss caused by segmenting on the S channel alone in HSI space, and improves detection accuracy; however, its detection speed is extremely low and cannot meet the requirements of real-time applications.
The candidate-region-based target detection algorithm uses a rich hierarchy of feature layers for accurate object detection and semantic segmentation, and achieves excellent detection accuracy by classifying object proposals with a deep convolutional neural network. Its detection speed is slow, however, because it repeatedly extracts and stores features for every candidate region, consuming a great deal of computation time and storage.
Whether detection of the object to be recognized is based on color, shape, multi-feature fusion, or candidate regions, each approach has a corresponding recognition model. In the prior art these models are basically trained on the picture as a whole, for example on the color or the shape of the entire object to be recognized. The precision of this training mode is not high, so the prediction precision of the resulting recognition model is not high either, and recognition errors may occur.
Based on this, the embodiments of the present application provide a method for constructing a recognition model of an object to be recognized, so as to solve the prior-art problem of low recognition accuracy and to improve the recognition accuracy of the trained model.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for constructing an object recognition model according to an embodiment of the present disclosure. As shown in fig. 1, a method for constructing an object recognition model to be recognized according to an embodiment of the present application includes:
and S101, acquiring a sample picture.
It should be noted that a sample picture is a training sample in the model training set used to train the prediction model. A sample picture may or may not contain the object to be recognized. As an optional implementation, the sample picture may be a picture with or without a traffic sign. A traffic sign conveys guidance, restriction, warning, or indication information in words or symbols; in general, safe, conspicuous, clear, and bright traffic signs are an important measure for implementing traffic management and ensuring the safety and smoothness of road traffic.

A sample picture with traffic signs can contain various sign types, distinguished in several ways: primary and auxiliary signs; movable and fixed signs; illuminated, luminous, and reflective signs; and variable-information signs reflecting changes in the driving environment.

After a sample picture is obtained, the traffic signs in it need to be labelled. There are many ways to do this: the sample picture can be labelled manually, or the signs can be identified with existing target detection algorithms based on color, shape, multi-feature fusion, or candidate regions. How those algorithms work is described in detail in the prior art and is not repeated here. As an optional implementation, the sample picture may be taken by a camera or uploaded by a user; the application does not limit this.
Here, it should be noted that the above example for the sample picture is merely an example, and actually, the sample picture is not limited to the above example.
When sample pictures are used to train the recognition model of the object to be recognized, different sample pictures may have different sizes, so adjusting all obtained sample pictures to the same size speeds up construction of the recognition model. As an alternative embodiment, the sample picture is processed through the following steps:
step 1011, adjusting the obtained sample picture to the input picture size required by the identification model of the object to be identified.
It should be noted that the recognition model of the object to be recognized is a model for recognizing the object in a picture. The input picture size is the preset picture size required by this recognition model.
In a specific implementation of step 1011, the sample picture obtained in step S101 is adjusted to the input picture size required by the recognition model, yielding a picture of the same size as the input. First, judge whether the sample picture is larger than the input picture size; if so, shrink it to the input size. If the sample picture is smaller than the input size, add additional bars around it until it matches the input size. Here, an additional bar is an extra border of uniform color added around the original picture, outside its normal content. As an alternative embodiment the bar may be black or gray; the application is not limited in this respect. In a specific implementation, once the sample picture is judged smaller than the input size, additional bars are added around it so that the padded picture matches the input picture size. For example, if the obtained sample picture has a 16:9 aspect ratio and the input picture size is 4:3, additional bars must be added around the original sample picture so that the adjusted picture reaches 4:3.
Here, it should be noted that the above selection of the color of the additional bar is merely an example, and in reality, the color of the additional bar is not limited to the above example.
In this way, when the recognition model of the object to be recognized is constructed, all sample pictures are adjusted to the same size, namely the picture size required by the traffic sign recognition model. Picture size no longer needs to be considered during model construction, each processed sample picture matches the input picture size, and the speed of constructing the recognition model is improved.
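The resize-or-pad decision of step 1011 can be sketched as pure geometry (the color of the bars aside); the function name is an illustration, not the patent's terminology:

```python
def letterbox_geometry(width, height, target_w, target_h):
    """Return the scaled size of the sample picture and the padding offsets
    of the additional bars needed to reach the model's input picture size."""
    scale = min(target_w / width, target_h / height)  # uniform fit scale
    new_w, new_h = int(width * scale), int(height * scale)
    pad_left = (target_w - new_w) // 2
    pad_top = (target_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top
```

For the 16:9 example in the text, `letterbox_geometry(1600, 900, 400, 300)` scales the picture to 400 x 225 and centres it with bars above and below (`pad_top` = 37) to reach the 4:3 input size.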
Step 1012: performing data enhancement processing on the resized sample picture to obtain an enhanced picture.
As an optional embodiment, the data enhancement comprises: random scaling, color gamut variation, and flipping.
An enhanced picture is a picture obtained by applying data enhancement to the resized sample picture. Random scaling is an operation that scales the resized sample picture; the color gamut change alters its brightness, saturation, and hue; and flipping mirrors it left to right.
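Minimal sketches of these operations, working on a row-major grid of 0-255 values; the scale range in `random_scale` and all function names are assumptions, since the text does not specify them:

```python
import random

def random_scale(width, height, lo=0.5, hi=1.5):
    """Randomly rescale the picture's dimensions (range lo..hi is assumed)."""
    s = random.uniform(lo, hi)
    return int(width * s), int(height * s)

def adjust_brightness(pixels, factor):
    """One part of the color gamut change: scale brightness, clamped to 255."""
    return [[min(255, int(v * factor)) for v in row] for row in pixels]

def horizontal_flip(pixels):
    """Flip the picture left to right."""
    return [row[::-1] for row in pixels]
```

In practice these would operate on full three-channel images (e.g. NumPy arrays), and the hue and saturation changes would be done in an HSV representation; the grids here only illustrate the operations.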
As an optional implementation manner, the data enhancement includes random amplification, and performs data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture, including:
and adding an additional bar around the sample picture with the adjusted size to obtain an enhanced picture with the additional bar.
Random enlargement is an operation that randomly enlarges the resized sample picture. The additional bar is, as before, an extra border of uniform color added around the original picture, outside its normal content. As an alternative embodiment the bar may be black or gray; the application is not limited in this respect. In a specific implementation, when the data enhancement operation is random enlargement, an additional bar can be added around the resized sample picture to obtain an enhanced picture with the bar. Because the color inside the bar is uniform, when the initial recognition model processes such a picture and detects that a pixel has the same color as the preset bar, it determines that the position of that pixel cannot contain the object to be recognized; the model therefore only examines the image area outside the additional bars.
Step 1013: selecting a random number of enhanced pictures and splicing them to obtain a spliced picture.
It should be noted that splicing means stitching at least two enhanced pictures into one spliced picture. As an alternative embodiment, four enhanced pictures may be selected at random and stitched together using Mosaic data enhancement. Specifically, Mosaic enhancement randomly selects four enhanced pictures and splices them in a randomly distributed layout. Continuing the four-picture example: first read four enhanced pictures at random and place them together in some layout, for instance the first in the upper left corner, the second in the upper right, the third in the lower left, and the fourth in the lower right. After placement, a fixed rectangular region of each of the four pictures is cropped, and the crops are stitched into a new picture that serves as the spliced picture.
This splicing mode greatly enriches the model training set; in particular, random scaling adds many small targets, which improves the robustness of the prediction model. Splicing several pictures into one spliced picture before prediction and then feeding the spliced picture into the initial model for identifying the object to be identified means that four enhanced pictures are passed to the neural network at once. This enriches the backgrounds of the detected objects and allows the data of several sample pictures to be computed in a single pass, so that the GPU is used efficiently.
Here, it should be noted that the selection of the splicing manner of the enhanced pictures and the selection of the splicing number of the enhanced pictures are merely examples, and in practice, the splicing manner of the enhanced pictures and the splicing number of the enhanced pictures are not limited to the above examples.
Step 1014: adjust the spliced picture to the input picture size, and acquire the position coordinates of each object to be identified in the resized spliced picture.
For the above step 1014, after the stitched picture is obtained, the stitched picture is adjusted to the input picture size, and specifically, the method for adjusting the size of the stitched picture is the same as the method for adjusting the size of the acquired sample picture to the input picture size in step 1011, and is not described herein again. After the size is adjusted, the position coordinates of each object to be identified in the spliced picture with the adjusted size are also acquired. After the sample picture is randomly scaled and spliced, the position coordinates of the object to be recognized in the sample picture are also changed. For example, the sample picture has a size of 500 pixels × 500 pixels, and the position coordinates of the object to be recognized in the sample picture are (100, 50). When stitching, the sample picture is reduced in size to 50% of the original sample picture. At this time, the size of the reduced sample picture is 250 pixels × 250 pixels, and the position coordinates of the object to be recognized in the reduced sample picture are (50, 25).
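The coordinate adjustment in the worked example (a (100, 50) coordinate in a picture scaled to 50% becomes (50, 25)) can be expressed as a small helper. The optional offset argument, an assumption added here, covers the case where the resized picture is pasted at some position inside a spliced picture:

```python
def transform_coordinate(coord, scale, offset=(0, 0)):
    """Map an (x, y) position coordinate through a resize by `scale`,
    then through a paste at `offset` inside a larger spliced picture."""
    x, y = coord
    ox, oy = offset
    return (x * scale + ox, y * scale + oy)
```

For the example in the text, `transform_coordinate((100, 50), 0.5)` yields (50.0, 25.0); if that reduced picture were then pasted 250 pixels to the right in a mosaic, the coordinate would become (300.0, 25.0).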
Step 1015: expand the set of resized sample pictures according to the resized spliced picture.
In step 1015, after the resized spliced picture is obtained, it is also used as a resized sample picture, so that the training data for constructing the initial model for identifying the object to be identified is richer, and the resulting identification model of the object to be identified is more accurate.
S102, aiming at each sample picture with the object to be identified in the sample pictures, obtaining a first position coordinate of the object to be identified in the sample picture.
It should be noted that the object to be recognized refers to an object existing in the sample picture that is desired to be recognized from it. The first position coordinates represent the contour position coordinates of the graph of the object to be identified in the sample picture. Continuing with the above embodiment, when the sample picture is a picture with a traffic sign, the object to be recognized is the traffic sign in the sample picture, and the first position coordinate is the position coordinate of the pattern of the traffic sign in the sample picture.
For step S102, for each sample picture with the object to be identified, a first position coordinate of the object to be identified in the sample picture is obtained. Specifically, after a sample picture with an object to be recognized is recognized, the contour of the object to be recognized is obtained, and the contour pixel points of the object to be recognized are marked in the sample picture according to the pixel points lying on the contour. After the contour pixel points of the object to be recognized in the sample picture are obtained, a coordinate system can be established with the vertex at the lower left corner of the sample picture as the origin, and the first position coordinates of the graph of the object to be recognized in the sample picture are determined based on that coordinate system.
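One possible way to realize the contour marking and the lower-left-origin coordinate system described above is sketched below, assuming the object has already been segmented into a boolean mask; the 4-neighbour contour test and the helper name are illustrative choices, not details mandated by the application:

```python
import numpy as np

def contour_coordinates(mask: np.ndarray):
    """Return the (x, y) coordinates of the object's contour pixel points,
    with the origin at the picture's lower-left corner. `mask` is an H x W
    boolean array marking object pixels; a contour pixel is an object pixel
    with at least one non-object (or out-of-bounds) 4-neighbour."""
    h, w = mask.shape
    coords = []
    for r in range(h):
        for c in range(w):
            if not mask[r, c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(not (0 <= nr < h and 0 <= nc < w) or not mask[nr, nc]
                   for nr, nc in neighbours):
                coords.append((c, h - 1 - r))  # x = column, y counted from the bottom row
    return coords
```

The `h - 1 - r` conversion implements the lower-left-corner origin, since image arrays conventionally index rows from the top.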
Here, it should be noted that the above-described manner of acquiring the first position coordinates of the graph of the object to be recognized in the sample picture is merely an example; in practice, the manner of acquiring the first position coordinates is not limited to the above example.
S103, aiming at each sample picture, inputting the sample picture into an identification initial model of the object to be identified to obtain a second position coordinate of the prediction graph of the object to be identified.
It should be noted that the identification initial model of the object to be identified refers to an initial model for identifying the object to be identified in the sample picture. The object to be recognized prediction graph refers to a graph recognized by the object to be recognized recognition initial model aiming at the sample picture. Since the sample picture may be a picture with the object to be recognized or a picture without the object to be recognized, the second position coordinates of the predicted pattern of the object to be recognized, which is recognized by the initial model for recognizing the object to be recognized, may not exist.
In the specific implementation of step S103, for each sample picture, the sample picture is input into the initial model for identifying the object to be identified, and the neural network in the initial model for identifying the object to be identified is used to determine the second position coordinates of the predicted graph of the object to be identified in the sample picture.
Specifically, after the initial model for recognizing the object to be recognized determines the predicted graph of the object to be recognized in the sample picture, the predicted graph is also marked to obtain its second position coordinates in the sample picture. After the predicted graph of the object to be recognized is recognized, its contour is obtained, and the contour pixel points of the predicted graph are marked in the sample picture according to the pixel points lying on the contour. After the contour pixel points of the predicted graph in the sample picture are obtained, a coordinate system can be established with the vertex at the lower left corner of the sample picture as the origin, and the second position coordinates of the predicted graph of the object to be recognized in the sample picture are determined based on that coordinate system.
And S104, training the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified to obtain the identification model of the object to be identified.
After the second position coordinate of the predicted graph of the object to be recognized and the first position coordinate of the graph of the object to be recognized are determined, the initial model of the object to be recognized is trained by using the two parameters, so as to obtain the recognition model of the object to be recognized in step S104.
Here, the first position coordinate is the contour position coordinate of the graph of the object to be identified in the resized sample picture.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for training an object to be recognized to recognize an initial model according to an embodiment of the present disclosure. As shown in fig. 2, the training the object to be recognized identification initial model based on the second position coordinate of the object to be recognized prediction graph and the first position coordinate of the object to be recognized graph includes:
S201, if the sample picture corresponding to the predicted graph of the object to be recognized is a picture without the object to be recognized, adjusting the training parameters of the initial model for recognizing the object to be recognized until the predicted graph output by the trained initial model is empty.
In step S201, the sample pictures include pictures with the object to be recognized and pictures without it. When the initial model for recognizing the object to be recognized outputs a predicted graph for a picture that contains no object to be recognized, the recognition is considered wrong, and the training parameters of the initial model need to be modified; specifically, the training parameters may be the learning rate, the network parameters, and the like. The training parameters are adjusted continuously in an iterative manner: in each iteration, the initial model outputs a new predicted graph of the object to be recognized; if that predicted graph is not empty, the training parameters are adjusted again and a new predicted graph is produced with the new parameters, until the predicted graph output by the trained initial model is empty. At that point, the recognition of the initial model is considered accurate.
S202, if the sample picture corresponding to the predicted graph of the object to be recognized is a picture with the object to be recognized, obtaining a first pixel point of the predicted graph of the object to be recognized from the predicted graph of the object to be recognized.
For step S202, when the initial model for recognizing the object to be recognized recognizes a picture with the object to be recognized, it outputs a predicted graph of the object to be recognized. The predicted graph contains both the region of the object to be recognized and regions that do not belong to it, so the pixel points marked as the object to be recognized need to be obtained as the first pixel points. Here, a pixel point means one of the many small squares into which an image is divided. According to the embodiment provided by the application, the obtained predicted graph of the object to be recognized is divided into a number of small squares, and the pixel points marked as the object to be recognized are obtained as the first pixel points.
S203, acquiring a first pixel number marked as an object to be identified in the sample picture, and acquiring a second pixel number marked as the object to be identified from the predicted graph of the object to be identified.
It should be noted that the number of pixels refers to the total number of pixels for marking the object to be recognized. For step S203, in a specific implementation, the total number of the pixel points of the object to be identified marked in the sample picture is obtained from the sample picture based on the object to be identified in the sample picture, and is used as the first pixel number of the object to be identified. And acquiring the total number of pixel points of the marked object to be recognized based on the first pixel points marked as the object to be recognized from the object to be recognized prediction graph output by the initial model for recognizing the object to be recognized, and taking the total number as the second pixel number of the object to be recognized.
S204, calculating a loss value based on the second position coordinate of the first pixel point, the first position coordinate corresponding to the first pixel point, the first pixel number and the second pixel number.
It should be noted that the loss value is the value of a loss function, which maps a random event, or the value of a random variable related to it, to a non-negative real number representing the "risk" or "loss" of that event. In applications, the loss function is usually associated with an optimization problem as the learning criterion, i.e., the model is solved and evaluated by minimizing the loss function.
For step S204, the loss value of the initial model for identifying the object to be identified consists of two parts: one part is calculated from the error between the first position coordinates and the second position coordinates of the first pixel points, and the other part judges the accuracy of the initial model by calculating a loss from the first pixel number and the second pixel number.
When the loss is calculated from the error between the first and second position coordinates of a first pixel point, whether the prediction of the initial model is accurate is judged by comparing the two coordinates; when they differ, the prediction is considered inaccurate. For example, if the first position coordinate of the first pixel point is determined to be (250, 250) and the second position coordinate is determined to be (100, 50), an error exists between them, that is, the prediction of the initial model for identifying the object to be identified is inaccurate. At this time, the loss value of the initial model in its current state needs to be calculated. The manner in which the loss value is calculated is described in detail in the prior art and will not be elaborated here.
When the loss value is calculated from the first pixel number and the second pixel number, whether the prediction of the initial model is accurate is judged by comparing the two numbers; when they differ, the prediction is considered inaccurate, and the loss value of the initial model in its current state needs to be calculated. The manner in which the loss value is calculated is described in detail in the prior art and will not be elaborated here.
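The application defers the exact loss formulas to the prior art; the sketch below shows one plausible instantiation of the two-part loss, assuming a mean squared coordinate error for the first part and an absolute pixel-count difference for the second. The function name and the weighting between the two parts are assumptions:

```python
def combined_loss(first_coords, second_coords,
                  first_pixel_count, second_pixel_count,
                  count_weight=1.0):
    """Two-part loss: mean squared error between the first (ground-truth)
    and second (predicted) position coordinates of the first pixel points,
    plus a weighted penalty on the labelled-vs-predicted pixel-count gap."""
    coord_loss = sum((x1 - x2) ** 2 + (y1 - y2) ** 2
                     for (x1, y1), (x2, y2)
                     in zip(first_coords, second_coords)) / len(first_coords)
    count_loss = abs(first_pixel_count - second_pixel_count)
    return coord_loss + count_weight * count_loss
```

With the coordinates from the example above, `combined_loss([(250, 250)], [(100, 50)], 10, 10)` evaluates to 62500.0: the squared errors 150² + 200² with no pixel-count penalty.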
When the sample picture is a spliced picture, that is, when it may contain several original sample pictures, there are correspondingly several first pixel points marked as the object to be identified, and several corresponding first position coordinates, second position coordinates, first pixel numbers, and second pixel numbers. In that case, the parameters corresponding to each object to be recognized must be compared separately. For example, suppose the sample picture is spliced from two pictures: sample picture A containing object A to be identified, and sample picture B containing object B to be identified. After the spliced picture is input into the initial model, the model correspondingly outputs two predicted graphs, predicted graph A for the object in sample picture A and predicted graph B for the object in sample picture B. The two predicted graphs are then compared separately, predicted graph A against object A in sample picture A and predicted graph B against object B in sample picture B, to judge whether the prediction of the initial model is accurate.
And S205, if the loss value is greater than a preset loss threshold, adjusting the training parameters of the to-be-recognized object recognition initial model until the loss value of the trained to-be-recognized object recognition initial model is not greater than the loss threshold.
In the embodiments provided in the present application, the loss threshold is a criterion set in advance. As an alternative implementation, the threshold may be chosen at the point where the second derivative of the loss curve is close to 0: when the second derivative approaches 0, the slope of the loss curve barely changes, that is, the change in the loss value between two iterations of the initial model is already small. When the loss value approaches the loss threshold, the initial model for identifying the object to be identified is considered to have reached a convergence state, and its prediction at that point is relatively accurate.
For step S205, after the loss value of the initial model in its current state is calculated in step S204, the training parameters of the initial model are adjusted continuously; specifically, they may be the learning rate, the network parameters, and the like. The loss of the initial model is minimized iteratively: the loss value is computed in each iteration, and whenever it has not reached the loss threshold, the training parameters are updated and a new loss value is computed with the new parameters, so that the loss value shows a fluctuating downward trend over the iterations. Finally, when the loss value plateaus, that is, when the loss value of the trained initial model is no longer greater than the loss threshold and no longer drops noticeably compared with the previous iteration, the initial model for identifying the object to be identified is considered to have converged; training then ends, and the identification model of the object to be identified is obtained.
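The iterate-until-below-threshold procedure described above can be sketched generically. The helper names and the toy loss and update functions in the test are illustrative assumptions; in practice, `update_params` would be a gradient-based optimizer step over the network parameters and learning rate:

```python
def train_until_converged(compute_loss, update_params, params,
                          loss_threshold, max_iters=1000):
    """Iterate: compute the loss; stop once it is no greater than the
    threshold; otherwise update the training parameters and try again."""
    for _ in range(max_iters):
        loss = compute_loss(params)
        if loss <= loss_threshold:
            return params, loss            # converged: loss at or below threshold
        params = update_params(params)     # adjust training parameters
    return params, compute_loss(params)    # give up after max_iters
```

The `max_iters` guard, an added assumption, prevents the loop from running forever if the loss never reaches the threshold.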
According to the method and device for constructing the recognition model of the object to be recognized provided by the application, when the initial model is trained, the position coordinates of the graph of the object to be recognized in the sample picture are compared with the position coordinates of the predicted graph, and the pixel marks are compared as well. The higher the accuracy of the resulting identification model, the more accurately the object to be identified is recognized.
After the identification model of the object to be identified is constructed, the identification model of the object to be identified is used for identifying the object to be identified in the sample picture, and specifically, the method further comprises the following steps:
a: and acquiring a picture to be recognized, and adjusting the acquired picture to be recognized to the size of the input picture required by the recognition model of the object to be recognized.
It should be noted that the picture to be recognized refers to a picture that may contain the object to be recognized and that is to be processed by the recognition model. As an optional implementation manner, the picture to be recognized may be a picture shot by a camera or a picture uploaded by a user, and the application is not limited in this respect.
For the above steps, in specific implementation, after the picture to be recognized is acquired, the acquired picture to be recognized is adjusted to the input picture size required by the recognition model of the object to be recognized. Specifically, the method for adjusting the size of the picture to be recognized is the same as the method for adjusting the size of the acquired sample picture to the input picture in step 1011, and is not repeated here.
B: and inputting the picture to be recognized with the adjusted size into the recognition model of the object to be recognized to obtain the image of the object to be recognized.
For the above step, in a specific implementation, the resized picture to be recognized is input into the recognition model of the object to be recognized to obtain the object map to be recognized. Here, the object map to be recognized refers to a picture with the object to be recognized. Specifically, when obtaining the object map, the pixel points of the object to be recognized in the resized picture are first determined. These pixel points are marked, and the position coordinates of the object to be recognized in the resized picture are acquired. The determined position coordinates are then drawn in the resized picture, that is, the positions corresponding to the coordinates are connected with lines to obtain a position frame diagram. The picture content inside the position frame diagram is the object to be identified, so it is taken as the object to be identified in the picture to be identified.
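Connecting the determined position coordinates into a position frame diagram amounts, in the common axis-aligned case, to taking the bounding box of the marked pixel coordinates; the sketch below assumes that simplification, and the helper name is illustrative:

```python
def bounding_box(coords):
    """Return the position frame of the recognized object as the axis-aligned
    box (x_min, y_min, x_max, y_max) enclosing its marked pixel coordinates."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))
```

The picture content inside this box would then be cropped out as the object map to be recognized.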
As an optional implementation manner, the object to be identified is a traffic sign, and the method further includes:
and inquiring a preset mapping relation library of each traffic sign template graph and the traffic sign type, and identifying the traffic sign type of the object graph to be identified.
It should be noted that the traffic sign template map refers to a pre-stored template map used for identifying the type of a traffic sign. The mapping relation library refers to a database that stores mapping relations between objects, i.e., a database that represents information in the form of objects. A mapping relation generally refers to object-relational mapping, which is used in object-oriented programming languages to convert between data of different type systems. According to the embodiment provided by the application, preset traffic sign template maps and traffic sign types can be stored in the mapping relation library, with one traffic sign template map corresponding to one traffic sign type.
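A minimal sketch of such a mapping relation library follows, assuming each template map is identified by a name string; the image-matching step that a real implementation would need to associate a recognized object map with a template is omitted, and all names and entries here are hypothetical:

```python
def build_sign_library(entries):
    """Build a mapping relation library: each traffic sign template map
    (identified here by a name string) corresponds to one traffic sign type."""
    return dict(entries)

def lookup_sign_type(library, template_name, default="unknown"):
    """Query the library for the traffic sign type of a matched template map."""
    return library.get(template_name, default)
```

In use, the recognition model's output would first be matched against the stored template maps, and the matched template's name would then be looked up to obtain the traffic sign type.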
Traffic sign types can be distinguished in various ways: main signs and auxiliary signs; movable signs and fixed signs; illuminated signs, luminous signs, and reflective signs; and variable-information signs that reflect changes in the driving environment. The main signs may include the following four major categories: road traffic warning signs, which warn drivers and pedestrians of danger so that measures can be taken in time; road traffic indication signs, which direct drivers and pedestrians to travel in the specified directions and places; road traffic guide signs, which indicate the direction of the road; and road traffic prohibition signs, which impose restrictions on certain traffic behaviors of vehicles and pedestrians.
Here, it should be noted that the above description for the traffic sign type in the mapping relation library is merely an example, and actually, the traffic sign type in the mapping relation library is not limited to the above example.
As an optional implementation manner, after the traffic sign map in the picture to be recognized is obtained, the traffic sign type of the object map to be recognized may be recognized by querying a mapping relation library of preset traffic sign template maps and traffic sign types.
According to the embodiment provided by the application, the picture to be recognized can be input into the recognition model of the object to be recognized, the traffic sign map in the picture to be recognized is quickly recognized, the preset mapping relation library of the traffic sign template maps and the traffic sign types is inquired, the traffic sign types of the traffic sign map are recognized, road information is provided for vehicles in time, and the unmanned vehicles can be helped to select correct roads to run.
Referring to fig. 3, fig. 3 is a schematic structural diagram of an apparatus for constructing a recognition model of an object to be recognized according to an embodiment of the present disclosure. As shown in fig. 3, the apparatus 300 for constructing an object recognition model to be recognized includes:
a sample picture obtaining module 301, configured to obtain a sample picture;
a first position coordinate obtaining module 302, configured to obtain, for each sample picture with an object to be identified in the sample pictures, a first position coordinate of the object to be identified in the sample picture;
a second position coordinate obtaining module 303, configured to, for each sample picture, input the sample picture into the identification initial model of the object to be identified, and obtain a second position coordinate of the prediction graph of the object to be identified;
and the to-be-recognized object recognition model determining module 304 is configured to train the to-be-recognized object recognition initial model based on the second position coordinate of the to-be-recognized object prediction graph and the first position coordinate of the to-be-recognized object graph, so as to obtain the to-be-recognized object recognition model.
Further, the training of the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified includes:
if the sample picture corresponding to the object to be recognized prediction graph is a picture without the object to be recognized, adjusting the training parameters of the object to be recognized recognition initial model until the object to be recognized prediction graph output by the trained object to be recognized recognition initial model is empty;
if the sample picture corresponding to the predicted graph of the object to be recognized is a picture with the object to be recognized, acquiring a first pixel point of the predicted graph of the object to be recognized from the predicted graph of the object to be recognized;
acquiring a first pixel number marked as an object to be identified in a sample picture, and acquiring a second pixel number marked as the object to be identified from a predicted graph of the object to be identified;
calculating a loss value based on the second position coordinate of the first pixel point, the first position coordinate corresponding to the first pixel point, the first pixel number and the second pixel number;
and if the loss value is greater than the preset loss threshold, adjusting the training parameters of the initial model for recognizing the object to be recognized until the loss value of the trained initial model for recognizing the object to be recognized is not greater than the loss threshold.
Further, the apparatus 300 for constructing an object recognition model to be recognized is further configured to:
adjusting the obtained sample picture to the input picture size required by the identification model of the object to be identified;
carrying out data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture;
selecting enhanced pictures with random numbers to be spliced to obtain spliced pictures;
adjusting the spliced picture to the size of the input picture, and acquiring the position coordinates of each object to be identified in the spliced picture with the adjusted size;
and expanding the sample picture with the adjusted size according to the spliced picture with the adjusted size.
Further, the data enhancement comprises: random scaling, gamut variation, flipping.
Further, the data enhancement includes random amplification, and performs data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture, including:
and adding an additional bar around the sample picture with the adjusted size to obtain an enhanced picture with the additional bar.
Further, the apparatus 300 for constructing an object recognition model to be recognized is further configured to:
acquiring a picture to be recognized, and adjusting the acquired picture to be recognized to an input picture size required by the recognition model of the object to be recognized;
and inputting the picture to be recognized with the adjusted size into the recognition model of the object to be recognized to obtain the image of the object to be recognized.
Further, the object to be recognized is a traffic sign, and the apparatus 300 for constructing a recognition model of the object to be recognized is further configured to:
and inquiring a preset mapping relation library of each traffic sign template graph and the traffic sign type, and identifying the traffic sign type of the object graph to be identified.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 4, the electronic device 400 includes a processor 410, a memory 420, and a bus 430.
The memory 420 stores machine-readable instructions executable by the processor 410, when the electronic device 400 runs, the processor 410 communicates with the memory 420 through the bus 430, and when the machine-readable instructions are executed by the processor 410, the steps of the method for constructing the identification model of the object to be identified in the method embodiments shown in fig. 1 and fig. 2 can be executed, so that the problem that the identification precision of the identification model of the object to be identified obtained by training in the prior art is not high is solved, and specific implementation manners can refer to the method embodiments and are not described herein again.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the method for constructing the recognition model of the object to be recognized in the method embodiments shown in fig. 1 and fig. 2 may be executed, so as to solve the problem that the recognition accuracy of the recognition model of the object to be recognized obtained by training in the prior art is not high.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that like reference numbers and letters denote like items in the figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can still, within the technical scope disclosed in the present application, modify the technical solutions described in the foregoing embodiments, easily conceive of changes, or make equivalent substitutions of some technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and shall all be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of constructing a recognition model of an object to be recognized, the method comprising:
obtaining a sample picture;
acquiring a first position coordinate of a graph of the object to be identified in each sample picture with the object to be identified;
for each sample picture, inputting the sample picture into an identification initial model of the object to be identified to obtain a second position coordinate of a prediction graph of the object to be identified;
and training the recognition initial model of the object to be recognized based on the second position coordinate of the prediction graph of the object to be recognized and the first position coordinate of the graph of the object to be recognized to obtain the recognition model of the object to be recognized.
2. The method according to claim 1, wherein the first position coordinates are contour position coordinates of the object to be recognized in the sample picture, and the training of the recognition initial model of the object to be recognized based on the second position coordinates of the prediction graph of the object to be recognized and the first position coordinates of the graph of the object to be recognized comprises:
if the sample picture corresponding to the prediction graph of the object to be recognized is a picture without the object to be recognized, adjusting the training parameters of the recognition initial model of the object to be recognized until the prediction graph of the object to be recognized output by the trained recognition initial model is empty;
if the sample picture corresponding to the prediction graph of the object to be recognized is a picture with the object to be recognized, acquiring a first pixel point from the prediction graph of the object to be recognized;
acquiring a first pixel number marked as the object to be recognized in the sample picture, and acquiring a second pixel number marked as the object to be recognized from the prediction graph of the object to be recognized;
calculating a loss value based on the second position coordinate of the first pixel point, the first position coordinate corresponding to the first pixel point, the first pixel number and the second pixel number;
and if the loss value is greater than a preset loss threshold, adjusting the training parameters of the recognition initial model of the object to be recognized until the loss value of the trained recognition initial model of the object to be recognized is not greater than the loss threshold.
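Claim 2 does not fix a loss formula, only its inputs: the matched pixel coordinates and the two pixel counts. A minimal sketch, assuming a squared-error coordinate term plus a relative pixel-count discrepancy term (both are illustrative choices, not the claimed formula):

```python
import numpy as np

def contour_loss(pred_coords, true_coords, n_true_pixels, n_pred_pixels):
    """Combine a coordinate-regression term with a pixel-count term.

    pred_coords / true_coords: matched (second/first) position coordinates
    of the first pixel points; n_true_pixels / n_pred_pixels: the first and
    second pixel numbers marked as the object to be recognized.
    """
    # Coordinate term: mean squared error over matched contour positions.
    coord_term = np.mean((np.asarray(pred_coords, dtype=float)
                          - np.asarray(true_coords, dtype=float)) ** 2)
    # Count term: relative discrepancy between predicted and labeled
    # object-pixel counts (guarded against empty labels).
    count_term = abs(n_pred_pixels - n_true_pixels) / max(n_true_pixels, 1)
    return float(coord_term + count_term)
```

Training then reduces to the usual loop: compute this loss per sample and adjust the initial model's parameters until the loss falls to the threshold or below.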
3. The method of claim 1, further comprising:
adjusting the obtained sample picture to the input picture size required by the identification model of the object to be identified;
carrying out data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture;
selecting a random number of the enhanced pictures and splicing them to obtain a spliced picture;
adjusting the spliced picture to the size of the input picture, and acquiring the position coordinates of each object to be identified in the spliced picture with the adjusted size;
and expanding the sample picture with the adjusted size according to the spliced picture with the adjusted size.
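The splice-and-resize steps of claim 3 resemble mosaic-style augmentation. A dependency-free sketch, in which the side-by-side splice, the candidate counts {2, 4}, and the nearest-neighbour resize are all assumptions for illustration (a real mosaic typically tiles a 2x2 grid and remaps the label coordinates accordingly):

```python
import random
import numpy as np

def nn_resize(img, out_h, out_w):
    """Nearest-neighbour resize, to avoid an image-library dependency."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows][:, cols]

def mosaic_splice(enhanced_pictures, input_hw=(64, 64)):
    """Select a random number of enhanced pictures, splice them side by
    side, then resize the spliced picture back to the input size."""
    k = random.choice([2, 4])                    # random number of pictures
    chosen = random.sample(enhanced_pictures, k)
    h, w = input_hw
    tiles = [nn_resize(p, h, w // k) for p in chosen]
    spliced = np.concatenate(tiles, axis=1)      # naive horizontal splice
    return nn_resize(spliced, h, w)              # back to the input size
```

The resulting spliced pictures, with their remapped object coordinates, are then appended to the resized sample set as the expansion step describes.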
4. The method of claim 3, wherein the data enhancement comprises: random scaling, color gamut variation, and flipping.
5. The method according to claim 4, wherein the data enhancement comprises random amplification, and the performing of data enhancement processing on the sample picture with the adjusted size to obtain an enhanced picture comprises:
adding bars around the sample picture with the adjusted size to obtain an enhanced picture with the added bars.
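Adding constant-value bars around a resized picture is the familiar letterbox-style padding. A minimal sketch; the pad width, the grayscale fill value 114 (a convention borrowed from YOLO-family pipelines), and the single-channel shape are assumptions:

```python
import numpy as np

def add_bars(picture, pad, fill=114):
    """Surround the resized sample picture with constant-value bars on all
    four sides, producing the enhanced picture with added bars."""
    h, w = picture.shape[:2]
    out = np.full((h + 2 * pad, w + 2 * pad), fill, dtype=picture.dtype)
    out[pad:pad + h, pad:pad + w] = picture      # original content centered
    return out
```

Any labeled object coordinates would be shifted by the same `pad` offset so annotations stay aligned with the padded picture.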
6. The method according to any one of claims 1 to 5, further comprising:
acquiring a picture to be recognized, and adjusting the acquired picture to be recognized to an input picture size required by the recognition model of the object to be recognized;
and inputting the picture to be recognized with the adjusted size into the recognition model of the object to be recognized to obtain the image of the object to be recognized.
7. The method of claim 6, wherein the object to be identified is a traffic sign, the method further comprising:
and inquiring a preset mapping relation library of each traffic sign template graph and the traffic sign type, and identifying the traffic sign type of the object graph to be identified.
8. An apparatus for constructing a recognition model of an object to be recognized, the apparatus comprising:
the sample picture acquisition module is used for acquiring a sample picture;
the first position coordinate acquisition module is used for acquiring a first position coordinate of a graph of the object to be identified in each sample picture with the object to be identified;
the second position coordinate acquisition module is used for inputting the sample picture into the identification initial model of the object to be identified aiming at each sample picture to obtain a second position coordinate of the prediction graph of the object to be identified;
and the identification model determining module of the object to be identified is used for training the identification initial model of the object to be identified based on the second position coordinate of the prediction graph of the object to be identified and the first position coordinate of the graph of the object to be identified to obtain the identification model of the object to be identified.
9. An electronic device, comprising: a processor, a memory, and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions, when executed by the processor, performing the steps of the method of constructing a recognition model of an object to be recognized according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method of constructing a recognition model of an object to be recognized according to any one of claims 1 to 7.
CN202111171015.0A 2021-10-08 2021-10-08 Method, device, equipment and medium for constructing object recognition model to be recognized Active CN113807315B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111171015.0A CN113807315B (en) 2021-10-08 2021-10-08 Method, device, equipment and medium for constructing object recognition model to be recognized

Publications (2)

Publication Number Publication Date
CN113807315A true CN113807315A (en) 2021-12-17
CN113807315B CN113807315B (en) 2024-06-04

Family

ID=78897340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111171015.0A Active CN113807315B (en) 2021-10-08 2021-10-08 Method, device, equipment and medium for constructing object recognition model to be recognized

Country Status (1)

Country Link
CN (1) CN113807315B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115564656A (en) * 2022-11-11 2023-01-03 成都智元汇信息技术股份有限公司 Multi-graph merging and graph recognizing method and device based on scheduling

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002229727A (en) * 2001-02-02 2002-08-16 Canon Inc Coordinate input device
CN102156980A (en) * 2011-01-14 2011-08-17 耿则勋 Method for evaluating influence of data compression on positioning accuracy of remote sensing image
CN106340062A (en) * 2015-07-09 2017-01-18 长沙维纳斯克信息技术有限公司 Three-dimensional texture model file generating method and device
CN110472602A (en) * 2019-08-20 2019-11-19 腾讯科技(深圳)有限公司 A kind of recognition methods of card card, device, terminal and storage medium
CN111476159A (en) * 2020-04-07 2020-07-31 哈尔滨工业大学 Method and device for training and detecting detection model based on double-angle regression
CN111523465A (en) * 2020-04-23 2020-08-11 中船重工鹏力(南京)大气海洋信息系统有限公司 Ship identity recognition system based on camera calibration and deep learning algorithm
CN112508109A (en) * 2020-12-10 2021-03-16 锐捷网络股份有限公司 Training method and device for image recognition model
CN112560834A (en) * 2019-09-26 2021-03-26 武汉金山办公软件有限公司 Coordinate prediction model generation method and device and graph recognition method and device
CN113021355A (en) * 2021-03-31 2021-06-25 重庆正格技术创新服务有限公司 Agricultural robot operation method for predicting sheltered crop picking point
CN113096017A (en) * 2021-04-14 2021-07-09 南京林业大学 Image super-resolution reconstruction method based on depth coordinate attention network model
CN113436251A (en) * 2021-06-24 2021-09-24 东北大学 Pose estimation system and method based on improved YOLO6D algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JAIDEV: "Weighted Loss Functions for Instance Segmentation", Retrieved from the Internet <URL:https://jaidevd.com/posts/weighted-loss-functions-for-instance-segmentation/> *
SENBINYU: "Classification and Summary of Loss Functions in Image Segmentation", Retrieved from the Internet <URL:https://blog.csdn.net/senbinyu/article/details/108232122> *


Similar Documents

Publication Publication Date Title
CN111563442B (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN111178355B (en) Seal identification method, device and storage medium
CN109871829B (en) Detection model training method and device based on deep learning
CN110969592B (en) Image fusion method, automatic driving control method, device and equipment
CN111738252B (en) Text line detection method, device and computer system in image
CN110288612B (en) Nameplate positioning and correcting method and device
CN113989167B (en) Contour extraction method, device, equipment and medium based on seed point self-growth
CN113158977B (en) Image character editing method for improving FANnet generation network
CN113255578B (en) Traffic identification recognition method and device, electronic equipment and storage medium
US20220358634A1 (en) Methods and systems of utilizing image processing systems to measure objects
CN111368682A (en) Method and system for detecting and identifying station caption based on faster RCNN
CN111126393A (en) Vehicle appearance refitting judgment method and device, computer equipment and storage medium
JP2009163682A (en) Image discrimination device and program
CN114898321A (en) Method, device, equipment, medium and system for detecting road travelable area
CN113807315B (en) Method, device, equipment and medium for constructing object recognition model to be recognized
CN114005120A (en) License plate character cutting method, license plate recognition method, device, equipment and storage medium
CN117593420A (en) Plane drawing labeling method, device, medium and equipment based on image processing
CN110874170A (en) Image area correction method, image segmentation method and device
CN114118127B (en) Visual scene sign detection and recognition method and device
CN102682308B (en) Imaging processing method and device
CN112381034A (en) Lane line detection method, device, equipment and storage medium
CN115393379A (en) Data annotation method and related product
CN105654457A (en) Device and method for processing image
CN117523087B (en) Three-dimensional model optimization method based on content recognition
CN115359346B (en) Small micro-space identification method and device based on street view picture and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant