CN115294552A - Rod-shaped object identification method, device, equipment and storage medium - Google Patents

Rod-shaped object identification method, device, equipment and storage medium

Info

Publication number
CN115294552A
CN115294552A
Authority
CN
China
Prior art keywords
central point
abscissa
segmentation
recognized
shaft
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210945907.XA
Other languages
Chinese (zh)
Inventor
李德辉 (Li Dehui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210945907.XA priority Critical patent/CN115294552A/en
Publication of CN115294552A publication Critical patent/CN115294552A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a rod-shaped object identification method, apparatus, device and storage medium, which are used for acquiring accurate position information of a rod-shaped object. The method comprises the following steps: acquiring an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized; inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation, obtained by training on labeled training images; scanning the first segmentation result to obtain a first central point set corresponding to the first rod-shaped object to be recognized, wherein the first central point set comprises the central points obtained by scanning the first rod-shaped object to be recognized; obtaining a first centerline of the first rod-shaped object to be recognized from the central points in the first central point set; and determining the position information of the first rod-shaped object to be recognized from the first centerline. The application can be applied to the fields of traffic, maps and the Internet of Vehicles.

Description

Rod-shaped object identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of image recognition, and in particular, to a method, an apparatus, a device, and a storage medium for recognizing a rod-shaped object.
Background
A high-precision map is an electronic map with higher precision and more data dimensions, and its construction is considered an important technical link of automatic driving. A high-precision map contains a large amount of information that an autonomous vehicle can refer to during travel, such as road information including the position, type, width, gradient and curvature of lane lines, and information on fixed objects around the lane such as traffic signs, traffic lights, guardrails, road edges and rod-shaped objects. An important application of high-precision maps is high-precision positioning. Automatic driving requires centimeter-level positioning accuracy, while Global Positioning System (GPS) positioning has errors on the order of meters or even tens of meters, so the position can be corrected by combining it with elements of the high-precision map. Since rod-shaped objects are common static objects in road scenes, they are often used as important reference elements in high-precision maps.
Existing rod-shaped object perception in images is generally based on object detection, i.e., a circumscribed rectangular box is output for each rod-shaped object. Such a detection box can only indicate a region containing the rod-shaped object, so the position information it gives covers a large range and the accurate position of the rod-shaped object in the image cannot be identified.
Disclosure of Invention
The embodiment of the application provides a rod-shaped object identification method, apparatus, device and storage medium, which are used for identifying accurate position information of a rod-shaped object.
In view of the above, an aspect of the present application provides a rod-shaped object identification method, including: acquiring an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized; inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation, obtained by training an initial segmentation network on training image data; scanning the first segmentation result to obtain a first central point set corresponding to the first rod-shaped object to be recognized, wherein the first central point set comprises the central points obtained by scanning the first rod-shaped object to be recognized; obtaining a first centerline of the first rod-shaped object to be recognized from the central points in the first central point set; and determining the position information of the first rod-shaped object to be recognized according to the first centerline.
Another aspect of the present application provides a rod-shaped object identification apparatus, including:
an acquisition module, configured to acquire an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized, and to input the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation, obtained by training an initial segmentation network on training image data;
a processing module, configured to scan the first segmentation result to obtain a first central point set corresponding to the first rod-shaped object to be recognized, the first central point set comprising the central points obtained by scanning the first rod-shaped object to be recognized, and to obtain a first centerline of the first rod-shaped object to be recognized according to the central points in the first central point set;
and a determining module, configured to determine the position information of the first rod-shaped object to be recognized according to the first centerline.
In one possible design, in another implementation of another aspect of the embodiment of the present application, the processing module is specifically configured to perform line-by-line pixel scanning on the first segmentation result to determine a first abscissa and a second abscissa of the first segmentation result in each line, where the first abscissa is the abscissa of the position where the first segmentation result first appears in the line and the second abscissa is the abscissa of the position where it last appears in the line; and to take the middle value of the first abscissa and the second abscissa as the abscissa of the first central point of the first segmentation result in that line, take the ordinate of the line as the ordinate of the first central point, and generate the first central point set from the first central points.
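The line-by-line scanning described above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name `row_center_points` and the representation of the segmentation result as a binary NumPy mask are assumptions made for the example.

```python
import numpy as np

def row_center_points(mask):
    """Scan a binary segmentation mask line by line.

    For each line containing foreground pixels, take the abscissas of the
    first and last foreground pixel (the first and second abscissa), use
    their middle value as the center-point abscissa, and pair it with the
    line index as the ordinate. Returns a list of (x_center, y) tuples.
    """
    centers = []
    for y in range(mask.shape[0]):
        xs = np.flatnonzero(mask[y])
        if xs.size == 0:
            continue  # this line contains no part of the rod-shaped object
        x_first, x_last = xs[0], xs[-1]
        centers.append(((x_first + x_last) / 2.0, y))
    return centers
```

The result is the "first central point set" of the text: one candidate center per occupied line.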
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the processing module is specifically configured to divide the first segmentation result according to a preset number of rows to obtain scanning lines; to perform pixel scanning on each scanning line to determine a third abscissa and a fourth abscissa of the first segmentation result on the scanning line, where the third abscissa is the abscissa of the position where the first segmentation result first appears on the scanning line and the fourth abscissa is the abscissa of the position where it last appears; and to take the middle value of the third abscissa and the fourth abscissa as the abscissa of the second central point of the first segmentation result on the scanning line, take the ordinate of the scanning line as the ordinate of the second central point, and generate the first central point set from the second central points.
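The interval-scanning variant above, which samples only every k-th row to reduce computation, might look like the following sketch (the stride parameter and function name are illustrative assumptions, not the patent's API):

```python
import numpy as np

def strided_center_points(mask, stride):
    """Scan only every `stride`-th row of a binary segmentation mask and
    return (x_center, y) pairs for the sampled rows that intersect the
    segmentation result."""
    centers = []
    for y in range(0, mask.shape[0], stride):
        xs = np.flatnonzero(mask[y])
        if xs.size:  # the scanning line crosses the rod-shaped object
            centers.append(((xs[0] + xs[-1]) / 2.0, y))
    return centers
```

A stride of 1 degenerates to full line-by-line scanning; larger strides trade center-point density for speed.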
In a possible design, in another implementation manner of another aspect of the embodiment of the present application, the processing module is specifically configured to determine, according to the abscissas of the central points in the first central point set, the abscissa variation range corresponding to the first central point set;
when the abscissa variation range is smaller than or equal to a first preset threshold, the first centerline is defined as a vertical centerline; its abscissa is the mean of the abscissas of the central points in the first central point set, and its ordinate spans from the minimum to the maximum ordinate of those central points;
when the abscissa variation range is larger than the first preset threshold and smaller than a second preset threshold, the central points of the first central point set are fitted to obtain the first centerline;
when the abscissa variation range is larger than or equal to the second preset threshold, abnormal central points are discarded from the first central point set to obtain a second central point set, and the central points of the second central point set are fitted to obtain the first centerline, where an abnormal central point is a central point whose abscissa differs from the mean abscissa of the central points in the first central point set by more than a third preset threshold.
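The three-way decision above can be sketched as follows. The threshold values `t1`, `t2`, `t3` stand in for the first, second and third preset thresholds, whose actual values the text does not give, and the use of a degree-1 least-squares fit for "fitting" is an assumption for illustration.

```python
import numpy as np

def fit_centerline(centers, t1, t2, t3):
    """Form the first centerline from (x, y) center points per the three cases."""
    xs = np.array([c[0] for c in centers], dtype=float)
    ys = np.array([c[1] for c in centers], dtype=float)
    x_range = xs.max() - xs.min()
    if x_range <= t1:
        # vertical centerline: mean abscissa, ordinate spanning min..max
        return ("vertical", xs.mean(), ys.min(), ys.max())
    if x_range < t2:
        # moderate spread: fit x as a linear function of y over all points
        slope, intercept = np.polyfit(ys, xs, 1)
        return ("fitted", slope, intercept)
    # large spread: discard abnormal points whose abscissa deviates from the
    # mean by more than t3 (yielding the second central point set), then fit
    keep = np.abs(xs - xs.mean()) <= t3
    slope, intercept = np.polyfit(ys[keep], xs[keep], 1)
    return ("fitted", slope, intercept)
```

The outlier rejection in the third branch makes the fit robust to stray center points produced by, e.g., segmentation noise near sign boards attached to the pole.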
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the obtaining module is further configured to obtain the training image data and the initial segmentation network, where the training image data is labeled with rod-shaped objects, the initial segmentation network adopts an encoder-decoder structure, and the initial segmentation network includes a context information enhancement module that captures the vertical information of rod-shaped objects;
the rod-shaped object identification apparatus further comprises a training module, configured to train the initial segmentation network on the training image data to obtain the segmentation network.
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the context information enhancement module consists of a plurality of vertically asymmetric convolutional layers of different scales; the convolution kernel of each vertically asymmetric convolutional layer is designed as N × 1, where N is a positive integer greater than 1, and the convolution module of the encoder is a lightweight depthwise-pointwise convolution module.
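The effect of an N × 1 vertical kernel — aggregating context along the vertical direction in which a rod extends — can be illustrated on a single 2-D feature map. This is a plain-NumPy sketch of the operation, not the network's actual layers; a real implementation would use a deep-learning framework's 2-D convolution with `kernel_size=(N, 1)` (and `groups=channels` for the depthwise part of the encoder's depthwise-pointwise module).

```python
import numpy as np

def vertical_conv(feat, kernel):
    """Apply an N x 1 (vertical) kernel to a 2-D feature map with zero
    padding, so each output pixel aggregates the N pixels stacked above
    and below it. (Written as cross-correlation; identical to convolution
    for the symmetric kernels used here.)"""
    n = len(kernel)
    pad = n // 2
    padded = np.pad(feat, ((pad, pad), (0, 0)))  # pad rows only
    out = np.zeros_like(feat, dtype=float)
    for i in range(n):
        out += kernel[i] * padded[i:i + feat.shape[0], :]
    return out
```

Because the kernel has width 1, no horizontal context is mixed in: the layer enhances exactly the elongated vertical structure that distinguishes poles from their background.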
In one possible design, in another implementation manner of another aspect of the embodiment of the present application, the obtaining module is specifically configured to input the image to be recognized into the segmentation network to output M second segmentation results, where the M second segmentation results are the segmentation results of all the rod-shaped objects to be recognized in the image to be recognized, and M is an integer greater than or equal to 1;
and to perform an opening operation on the M second segmentation results so that the adhesions existing among them are broken, obtaining X independent block segmentation results, where the first segmentation result of the first rod-shaped object to be recognized is one of the X independent block segmentation results, and X is an integer greater than or equal to M.
Another aspect of the present application provides a computer device, comprising: a memory, a processor, and a bus system;
the memory is used for storing a program;
the processor is used for executing the program in the memory, and performing the method of the above aspects according to the instructions in the program code;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
Another aspect of the present application provides a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to perform the above-described aspects of the method.
In another aspect of the application, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided by the above aspects.
According to the above technical scheme, the embodiment of the application has the following advantages: after a prediction result is obtained by semantically segmenting the rod-shaped objects in the image to be recognized, the central point set of each rod-shaped object is calculated from the segmentation result, the centerline of the rod-shaped object is then obtained, and finally the position information of the rod-shaped object in the image to be recognized is determined according to the centerline, so that accurate position information of the rod-shaped object is obtained.
Drawings
Fig. 1 is a schematic diagram of an architecture of a communication system according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an example of the position information of a rod-shaped object in an automatic driving process according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an embodiment of the rod-shaped object identification method in the embodiment of the present application;
FIG. 4 is a schematic diagram of an image to be recognized in the embodiment of the present application;
FIG. 5 is a schematic diagram of the segmentation results output by the segmentation network for an image to be recognized in an embodiment of the present application;
Fig. 6 is a schematic diagram of the segmentation results corresponding to the image to be recognized, adjusted based on fig. 5, in an embodiment of the present application;
FIG. 7 is a schematic diagram of the segmentation network in an embodiment of the present application;
FIG. 8 is a schematic view of a set of center points of a rod-shaped object to be recognized in an embodiment of the present application;
FIG. 9 is another illustration of a set of center points of rod-shaped objects to be recognized in an embodiment of the present application;
FIG. 10 is another illustration of a center point set based on the segmentation result shown in FIG. 6 in the embodiment of the present application;
FIG. 11 is a schematic view of the centerline of the rod-shaped object to be recognized based on FIG. 10 in an embodiment of the present application;
FIG. 12 is a schematic view of an embodiment of the rod-shaped object identification apparatus in the present embodiment;
FIG. 12a is a schematic view of an embodiment of the rod-shaped object identification apparatus according to the present application;
Fig. 13 is a schematic view of another embodiment of the rod-shaped object identification apparatus in the embodiment of the present application;
Fig. 14 is a schematic view of another embodiment of the rod-shaped object identification apparatus in the embodiment of the present application.
Detailed Description
The embodiment of the application provides a rod-shaped object identification method, apparatus, device and storage medium, which are used for acquiring accurate position information of a rod-shaped object.
The terms "first," "second," "third," "fourth," and the like (if any) in the description, claims and drawings of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that data so labeled are interchangeable under appropriate circumstances, so that the embodiments of the application described herein can, for example, be carried out in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and any variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, system, article or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such a process, method, article or apparatus.
For ease of understanding, some of the terms in this application are explained below:
a rod-shaped material: the application refers to vertical rods appearing in traffic scenes and comprises a traffic signboard, a traffic lamp pole, a telegraph pole, an isolating rod and the like.
Semantic segmentation: a typical computer vision problem, which involves taking raw data (e.g., a flat image) as input and converting it into a mask with highlighted regions of interest. The term full-pixel semantic segmentation is often used, in which each pixel in an image is assigned a category ID according to the object of interest it belongs to. Early computer vision techniques only found elements like edges (lines and curves) or gradients, but they never provided pixel-level image understanding in a fully human-perceptible manner. Semantic segmentation solves this problem by grouping together the image parts that belong to the same object, which expands its range of applications. Its subtypes include: standard semantic segmentation, also known as full-pixel semantic segmentation, which classifies each pixel as belonging to an object class; and instance-aware semantic segmentation, a subtype of standard semantic segmentation that classifies each pixel as belonging to both an object class and a particular instance of that class.
Mask: also called image mask. A mask operation on an image recalculates the value of each pixel through a mask kernel: the mask kernel describes how much the neighborhood pixels influence the new pixel value, and the pixels are weighted-averaged according to the weight factors in the mask kernel. Image mask operations are commonly used in image smoothing, edge detection, feature analysis and similar areas.
The method provided by the present application is applied to the communication system shown in fig. 1. Referring to fig. 1, fig. 1 is an architecture diagram of the communication system in the embodiment of the present application. As shown in fig. 1, the communication system includes a server, a terminal device and a satellite, and a client is deployed on the terminal device, where the client may run on the terminal device in the form of a browser or of an application (APP); the specific presentation form of the client is not limited herein. The server involved in the application may be an independent physical server, a server cluster or distributed system composed of a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), big data and artificial intelligence platforms. The terminal device may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a palm computer, a personal computer, a smart television, a smart watch, a vehicle-mounted device or a wearable device. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in this application. The number of servers and terminal devices is also not limited. The scheme provided by the application may be completed independently by the terminal device, independently by the server, or by the terminal device and the server in cooperation, which is not specifically limited in this application.
A terminal device acquires Global Positioning System (GPS) data; typically, the terminal device can acquire GPS data from one or more GPS satellites while outdoors with line-of-sight access to the satellites. The terminal device periodically determines GPS coordinates (e.g., latitude and longitude coordinates) from the GPS data to indicate its current location. Automatic driving requires the vehicle to have centimeter-level positioning accuracy, while GPS positioning has errors on the order of meters or even tens of meters, so the vehicle can correct its position by combining GPS with elements of a high-precision map. A high-precision map is an electronic map with higher precision and more data dimensions, and its construction is considered an important technical link of automatic driving. The high-precision map contains a large amount of information that the autonomous vehicle can refer to during travel, such as road information including the position, type, width, gradient and curvature of lane lines, and information on fixed objects around the lane such as traffic signs, traffic lights, guardrails, road edges and rod-shaped objects. An exemplary scene is shown in fig. 2: while the vehicle is driving, objects such as lane lines, traffic signs and rod-shaped objects are sensed through a camera and a vehicle-side visual perception algorithm, and the positional relationship between the vehicle and these objects is then compared with the data stored in the high-precision map to calculate the accurate position of the vehicle. Since rod-shaped objects are common static objects in road scenes, they are often used as important reference elements in high-precision maps. Existing rod-shaped object perception in images is generally based on object detection, i.e., a circumscribed rectangular box is output for each rod-shaped object.
Such a detection box can only output a region containing the rod-shaped object, so the position information range of the rod-shaped object in the image is large and its accurate position cannot be identified.
In order to solve this problem, the application provides the following technical scheme: acquiring an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized; inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation, obtained by training an initial segmentation network on training image data; scanning the first segmentation result to obtain a first central point set corresponding to the first rod-shaped object to be recognized, wherein the first central point set comprises the central points obtained by scanning the first rod-shaped object to be recognized; obtaining a first centerline of the first rod-shaped object to be recognized according to the central points in the first central point set; and determining the position information of the first rod-shaped object to be recognized according to the first centerline. In this way, after the rod-shaped objects in the image to be recognized are semantically segmented to obtain a prediction result, the central point set of each rod-shaped object is calculated from the segmentation result, the centerline of the rod-shaped object is then obtained, and finally the position information of the rod-shaped object in the image to be recognized is determined according to the centerline, so that accurate position information of the rod-shaped object is obtained. The technical scheme provided by the application can be applied to the construction of high-precision maps and thereby to automatic driving.
Providing the accurate position of rod-shaped objects during road sensing makes the high-precision map more accurate and automatic driving safer.
In this application, the position information of rod-shaped objects can be stored in a database. In short, a database can be regarded as an electronic file cabinet, i.e., a place for storing electronic files, where a user can add, query, update and delete the data in the files. A "database" is a collection of data that is stored together in a manner that can be shared by multiple users, has as little redundancy as possible, and is independent of the application. A Database Management System (DBMS) is computer software designed for managing a database, and generally has basic functions such as storage, retrieval, security assurance and backup. Database management systems may be categorized according to the database model they support, such as relational or Extensible Markup Language (XML); according to the type of computer supported, e.g., server cluster or mobile phone; according to the query language used, such as Structured Query Language (SQL) or XQuery; according to the performance emphasis, such as maximum size or maximum operating speed; or by other classification schemes. Regardless of the classification used, some DBMSs are capable of spanning categories, for example by supporting multiple query languages simultaneously.
It is understood that in specific implementations of the present application, when the above embodiments are applied to specific products or technologies, data such as vehicle data sensed by sensors and road information require the approval or consent of the user, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
With reference to fig. 3, the rod-shaped object identification method of the present application will be described below, taking execution by a terminal device as an example. One embodiment of the rod-shaped object identification method in the embodiment of the present application includes:
301. Acquiring an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized.
The terminal device acquires the image to be recognized through a camera while the vehicle is driving. In this embodiment, the image to be recognized may be a real-time traffic road image and may therefore contain various rod-shaped objects. In an exemplary scheme, the image to be recognized may be as shown in fig. 4, which includes a traffic sign, a traffic light pole and a power pole.
302. Inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of the first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation, obtained by training an initial segmentation network on training image data.
After the terminal device obtains the image to be recognized, it inputs the image into a segmentation network deployed on the terminal device to obtain second segmentation results for the multiple shafts in the image. Because the shafts in the image to be recognized may overlap, the terminal device further needs to adjust the second segmentation results so that the segmentation result corresponding to each shaft becomes a first segmentation result forming an independent block. The specific operation may be as follows: the image to be recognized is input into the segmentation network, which outputs M second segmentation results, where M is an integer greater than or equal to 1; an opening operation is then performed on the M second segmentation results to break any adhesion present in them, yielding X independent block segmentation results. Both the M second segmentation results and the X independent block segmentation results indicate segmentation results of shafts to be recognized in the image, the first segmentation result of the first shaft to be recognized is among the X independent block segmentation results, and X is an integer greater than or equal to M.
In an exemplary scheme, described in conjunction with fig. 4, the image to be recognized shown in fig. 4 is input into the segmentation network, which outputs the prediction result map containing the shaft segmentation results shown in fig. 5. As can be seen from fig. 5, the segmentation results of the traffic sign and the traffic light pole partially overlap, so fig. 5 shows only 4 independent block segmentation results. To separate the traffic sign and the traffic light pole into independent blocks, the segmentation result shown in fig. 5 is adjusted by an opening operation, yielding the 5 independent block segmentation results shown in fig. 6. The five independent block segmentation results can then be labeled. For example, the segmentation result of the traffic sign is denoted traffic sign 1, that of the traffic light pole is denoted traffic light pole 2, and those of the telegraph poles are denoted telegraph pole 3, telegraph pole 4, and telegraph pole 5.
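The opening operation and independent-block labeling described above can be sketched in a few lines of NumPy. This is a minimal illustration assuming a 3 × 3 cross structuring element and 4-connected components; a production pipeline would more likely call cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel) and cv2.connectedComponents:

```python
import numpy as np

def erode(mask):
    # binary erosion with a 3x3 cross structuring element
    p = np.pad(mask, 1)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def dilate(mask):
    # binary dilation with the same structuring element
    p = np.pad(mask, 1)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def opening(mask):
    # opening = erosion followed by dilation; removes thin adhesions
    return dilate(erode(mask))

def count_blocks(mask):
    # label 4-connected components by flood fill; return the block count
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not labels[i, j]:
                count += 1
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and not labels[y, x]:
                        labels[y, x] = count
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

# two 3-pixel-wide poles joined by a 1-pixel-thick adhesion bridge
mask = np.zeros((9, 9), dtype=np.uint8)
mask[:, 0:3] = 1
mask[:, 6:9] = 1
mask[4, 3:6] = 1
```

Before the opening, the thin bridge makes the two poles a single block; after the opening, the adhesion is broken and two independent blocks remain, each of which can then be given its own label.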
In this embodiment, the segmentation network is set as an image segmentation network based on semantic segmentation, and the segmentation result may be output in the form of a pixel-level mask. It is to be understood that the segmentation result may also be output in other pixel-level labeling forms, and is not limited herein.
Optionally, to ensure that more comprehensive feature information of the shaft is extracted, the initial segmentation network adopts an encoder-decoder structure and includes a context information enhancement module for obtaining vertical information of the shaft. The context information enhancement module consists of several vertical asymmetric convolution layers of different scales, and the convolution kernel of each vertical asymmetric convolution layer is designed as N × 1. Meanwhile, so that the segmentation network can run on terminal devices with limited computing power (such as a vehicle-mounted terminal or a smartphone), the convolution module of the encoder of the initial segmentation network can be designed as a lightweight depthwise-pointwise convolution module. In an exemplary scheme, the initial segmentation network may be structured as shown in fig. 7, with the vertical information pyramid (i.e., the context information enhancement module for vertical information) added between the encoder and the decoder. As shown in fig. 7, the convolution module of the encoder is a lightweight depthwise-pointwise convolution module, the decoder includes an upsampling module, and the vertical information pyramid consists of several vertically asymmetric convolution layers of different scales. As shown in fig. 7, each depthwise-pointwise convolution module may include a convolution layer of specification "Conv, 1 × 1, ReLU", a convolution layer of specification "Dwise 3 × 3, ReLU, stride = n", a convolution layer of specification "Conv, 1 × 1", a normalization layer (i.e., a BatchNorm layer), and a fusion layer.
The processing flow of data in the depthwise-pointwise convolution module may be as follows: the input passes in turn through the convolution layer of specification "Conv, 1 × 1, ReLU", the convolution layer of specification "Dwise 3 × 3, ReLU, stride = n", the convolution layer of specification "Conv, 1 × 1", and the normalization layer to obtain an output result; this output result is then fused with the module input and processed through a ReLU function to obtain the output passed to the vertical information pyramid. As shown in fig. 7, the vertical information pyramid may include three parallel convolution modules: one containing a convolution layer of specification "Conv, 3 × 1" and a normalization layer; another containing a convolution layer of specification "Conv, 5 × 1" and a normalization layer; and a third containing a convolution layer of specification "Conv, 7 × 1" and a normalization layer. In the vertical information pyramid, the processing flow of data is as follows: the output of the depthwise-pointwise convolution module is fed into each of the three convolution modules to obtain three different output results, which are then fused with the output of the depthwise-pointwise convolution module and processed through a ReLU function to obtain the output passed to the decoder. The upsampling module in the decoder may be of the specification shown in fig. 7, including a deconvolution layer of specification "Transpose 3 × 3, stride = 2" and a normalization layer.
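To illustrate how the N × 1 branches aggregate vertical context, the pyramid's forward pass can be sketched with uniform-weight vertical convolutions. This is a toy sketch: learned weights, normalization layers, and channels are omitted, and the uniform kernels stand in for trained ones:

```python
import numpy as np

def vconv(feat, n):
    # n x 1 vertical convolution with uniform weights and zero padding,
    # so each output pixel averages n vertically adjacent inputs
    k = n // 2
    p = np.pad(feat, ((k, k), (0, 0)))
    return sum(p[i:i + feat.shape[0], :] for i in range(n)) / n

def vertical_pyramid(feat):
    # three parallel vertical branches (3x1, 5x1, 7x1) fused with the
    # branch input, then passed through a ReLU, mirroring fig. 7
    out = feat + vconv(feat, 3) + vconv(feat, 5) + vconv(feat, 7)
    return np.maximum(out, 0.0)

# a single-column "pole" response in a 7 x 5 feature map
feat = np.zeros((7, 5))
feat[:, 2] = 1.0
resp = vertical_pyramid(feat)
```

On the pole column the three branches reinforce each other (the center pixel of the pole reaches 4.0, from the residual input plus three branch responses of 1.0 each), while the background stays at zero, which is the intended effect of the vertical context enhancement.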
The data processing flow of the upsampling module may be as follows: the output of the vertical information pyramid passes through the deconvolution layer and the normalization layer to obtain an upsampled output result, which is then processed through a ReLU function to obtain the segmentation result of the image to be recognized.
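The motivation for the depthwise-pointwise design can be made concrete by counting parameters. This is a back-of-the-envelope comparison with bias terms ignored and channel counts chosen purely for illustration:

```python
def standard_conv_params(c_in, c_out, kh, kw):
    # a standard convolution learns one kh x kw kernel per (input, output) channel pair
    return c_in * c_out * kh * kw

def dw_pw_conv_params(c_in, c_out, kh, kw):
    # depthwise: one kh x kw kernel per input channel;
    # pointwise: a 1 x 1 convolution that mixes channels
    return c_in * kh * kw + c_in * c_out

c_in, c_out = 64, 128
std = standard_conv_params(c_in, c_out, 3, 3)    # 73728 parameters
dwpw = dw_pw_conv_params(c_in, c_out, 3, 3)      # 8768 parameters
print(std, dwpw, round(std / dwpw, 1))           # roughly an 8x reduction
```

This is why the encoder can run on a vehicle-mounted terminal with limited computing power. The same accounting also shows why the pyramid's N × 1 kernels are cheaper than square N × N kernels: N weights per channel instead of N².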
303. Scan the first segmentation result to obtain a first central point set corresponding to the first shaft to be recognized, where the first central point set includes the central points obtained by scanning the first shaft to be recognized.
After the terminal device obtains the segmentation result of each shaft to be recognized in the image, it performs pixel scanning on each segmentation result in turn to obtain central points, and determines the centerline of each shaft from those central points. That is, pixel scanning is performed separately on each independent block segmentation result, and only one segmentation result is scanned at a time. In this embodiment, the first segmentation result corresponding to the first shaft to be recognized is taken as an example:
In one possible implementation, the terminal device performs pixel scanning on the first segmentation result row by row and determines a first abscissa and a second abscissa of the first segmentation result on each row, where the first abscissa is the abscissa of the position where the first segmentation result first appears in the row and the second abscissa is the abscissa of the position where it last appears. The midpoint of the first abscissa and the second abscissa gives the abscissa of a first central point of the first segmentation result in that row, the ordinate of the row is taken as the ordinate of the first central point, and the first central point set is generated from these first central points. As shown in fig. 8, on the first row the first abscissa of the first segmentation result is 35, the second abscissa is 37, and the abscissa of the central point is 36. On the second row, the first abscissa is 34, the second abscissa is 36, and the abscissa of the central point is 35; by analogy, the central point of the first segmentation result on each row is obtained.
In another possible implementation, the terminal device samples the first segmentation result every preset number of rows to obtain scan rows; it then performs pixel scanning on each scan row to determine a third abscissa and a fourth abscissa of the first segmentation result on the scan row, where the third abscissa is the abscissa of the position where the first segmentation result first appears on the scan row and the fourth abscissa is the abscissa of the position where it last appears. The midpoint of the third abscissa and the fourth abscissa gives the abscissa of a second central point of the first segmentation result on the scan row, the ordinate of the scan row is taken as the ordinate of the second central point, and the first central point set is generated from these second central points. As shown in fig. 9, the terminal device takes a scan row every 20 rows of the image to be recognized; on the first scan row, the third abscissa of the first segmentation result is 35, the fourth abscissa is 37, and the abscissa of the central point is 36. On the second scan row, the third abscissa is 34, the fourth abscissa is 36, and the abscissa of the central point is 35; by analogy, the central point of the first segmentation result on each scan row is obtained.
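Both scanning variants can be sketched with a single function: step = 1 reproduces the row-by-row scan of fig. 8, while step = 20 reproduces the sampled scan rows of fig. 9. This is a minimal sketch operating on one independent-block binary mask:

```python
import numpy as np

def center_points(mask, step=1):
    # mask: binary array for one independent block; returns (x, y) central points.
    # step=1 scans every row; step=20 scans one row in every 20.
    points = []
    for y in range(0, mask.shape[0], step):
        xs = np.flatnonzero(mask[y])
        if xs.size:                              # the block may not cover every row
            x_first, x_last = int(xs[0]), int(xs[-1])
            points.append(((x_first + x_last) / 2, y))
    return points

# the toy block of fig. 8: row 0 spans columns 35..37, row 1 spans 34..36
mask = np.zeros((2, 40), dtype=np.uint8)
mask[0, 35:38] = 1
mask[1, 34:37] = 1
print(center_points(mask))   # [(36.0, 0), (35.0, 1)]
```

The trade-off between the two variants follows directly: scanning every row yields more central points and a more accurate centerline fit, while a larger step scans fewer rows and is faster.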
By analogy, the terminal device can obtain the central point sets of the other shafts to be recognized in the image in the same manner, which is not repeated here. In an exemplary scheme, in combination with the segmentation result diagram shown in fig. 6, the central point sets of the shafts in the image to be recognized may be as shown in fig. 10.
304. Obtain a first centerline corresponding to the first shaft to be recognized according to the central points in the first central point set.
After obtaining the central point set of each shaft to be recognized, the terminal device determines the corresponding centerline from that set. Taking the first shaft to be recognized as an example, the specific operation may be as follows: determine the abscissa variation range of the first central point set from the abscissas of its central points, and then select the method for determining the centerline according to that variation range.
In one possible implementation, when the abscissa variation range is less than or equal to a first preset threshold, the first centerline is defined as a vertical centerline: its abscissa is the mean of the abscissas of the central points in the first central point set, and its extent in the ordinate direction runs from the minimum to the maximum ordinate of those central points. In this embodiment, the first preset threshold may be set to 3. That is, when the abscissas of the central points of the first shaft vary by no more than 3 (for example, within the range (36, 38)), the central points are arranged approximately vertically and the centerline is likely a vertical straight line. Since the least squares method cannot fit a vertical straight line, the centerline is directly defined as the vertical line x = b, where b is the mean of the abscissas of the central points of the first shaft to be recognized.
In another possible implementation, when the abscissa variation range is greater than the first preset threshold and less than a second preset threshold, the central points of the first central point set are fitted to obtain the first centerline. In this embodiment, the first preset threshold may be set to 3 and the second preset threshold to 10. That is, when the abscissas of the central points of the first shaft vary by more than 3 but no more than 10, the central points may be fitted by the least squares method to obtain the centerline of the first shaft. For example, if the abscissa variation range of the central points in the first central point set is (36, 41), the terminal device performs a least squares fit on the central points to obtain the centerline of the first shaft to be recognized.
When the abscissa variation range is greater than or equal to the second preset threshold, abnormal central points are discarded from the first central point set to obtain a second central point set, and the central points of the second central point set are fitted to obtain the first centerline, where an abnormal central point is one whose abscissa differs from the mean abscissa of the central points in the first central point set by more than a third preset threshold. In this embodiment, the second preset threshold may be set to 10. That is, when the abscissas of the central points of the first shaft vary by more than 10, the terminal device determines that the first central point set contains abnormal central points: it computes the mean of the abscissas of the central points, computes the difference between each central point's abscissa and that mean, and marks any central point whose difference exceeds the third preset threshold as abnormal. The terminal device then discards the abnormal central points and fits the remaining central points by the least squares method to obtain the centerline. For example, if the abscissa variation range of the central points in the first central point set is (36, 50), the terminal device determines that the central point with abscissa 50 is abnormal, discards it, and then fits the remaining central points by least squares to obtain the centerline of the first shaft to be recognized.
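The three-way branching above can be sketched as follows. The thresholds 3 and 10 come from the embodiment, while the outlier threshold t3 = 5 is an assumed value, since the text leaves the third preset threshold unspecified; x is fitted as a function of y so the least squares fit stays well-posed for steep poles:

```python
import numpy as np

def fit_centerline(points, t1=3.0, t2=10.0, t3=5.0):
    # points: (x, y) central points of one shaft to be recognized
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    span = xs.max() - xs.min()
    if span <= t1:
        # near-vertical: least squares cannot fit x = b, so define it directly
        return ("vertical", float(xs.mean()))
    if span < t2:
        k, b = np.polyfit(ys, xs, 1)      # least squares line x = k*y + b
        return ("line", (k, b))
    # wide span: discard abnormal points far from the mean abscissa, then fit
    keep = np.abs(xs - xs.mean()) <= t3
    k, b = np.polyfit(ys[keep], xs[keep], 1)
    return ("line", (k, b))

kind, b = fit_centerline([(36, 0), (37, 20), (38, 40), (37, 60)])
print(kind, b)   # vertical 37.0
```

Here the abscissas span only 2 pixels, so the first branch fires and the centerline is the vertical line x = 37; a wider spread of abscissas would instead trigger a least squares fit, with or without outlier rejection.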
In this embodiment, based on the central point sets shown in fig. 10, the centerline of each shaft to be recognized in the image may be as shown in fig. 11.
305. Determine the position information of the first shaft to be recognized according to the first centerline.
After determining the centerline of each shaft to be recognized, the terminal device can determine the position information of each shaft in the image to be recognized.
In this embodiment, after determining the position information of the shaft to be recognized, the terminal device may compare the position relationship between the vehicle and the shaft to be recognized with the data stored in the high-precision map during the driving of the vehicle, and calculate the accurate vehicle position.
Referring to fig. 12, fig. 12 is a schematic view of an embodiment of the shaft identification device 20 of the present application, which includes:
an acquiring module 201, configured to acquire an image to be recognized, where the image to be recognized includes at least one shaft to be recognized; inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation and obtained by training according to training image data and an initial segmentation network;
the processing module 202 is configured to scan the first segmentation result to obtain a first center point set corresponding to the first shaft to be identified; obtaining a first central line corresponding to the first shaft to be identified according to the central point in the first central point set;
a determining module 203, configured to determine position information of the first shaft to be identified according to the first centerline.
In an embodiment of the present application, a shaft identification device is provided. With this device, after the shaft in the image to be recognized is semantically segmented to obtain a prediction result, the central point set of the shaft is computed from the segmentation result, the centerline of the shaft is derived from the central points, and the position information of the shaft in the image is finally determined from the centerline, so that accurate position information of the shaft is obtained.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft recognition device 20 provided in this embodiment of the present application, the first segmentation result is labeled at a pixel level, and the processing module 202 is specifically configured to perform row-by-row pixel scanning on the first segmentation result to determine a first abscissa and a second abscissa of the first segmentation result on each row, where the first abscissa is the abscissa of the position where the first segmentation result first appears in the row and the second abscissa is the abscissa of the position where it last appears; and to take the midpoint of the first abscissa and the second abscissa as the abscissa of the first central point of the first segmentation result in that row, take the ordinate of the row as the ordinate of the first central point, and generate the first central point set from the first central points.
In an embodiment of the present application, a shaft identification device is provided. With this device, the masks forming independent blocks are scanned pixel by pixel, row by row, yielding the abscissas of the first and last positions where the mask appears in each row; the midpoint of these two abscissas gives the abscissa of the central point of the mask in that row, and the ordinate of the row gives its ordinate, that is, the position information of the central point is obtained. This provides a large number of central points for determining the centerline of the first shaft to be recognized, making that centerline more accurate.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft recognition device 20 provided in this embodiment of the application, the first segmentation result is a pixel-level label, and the processing module 202 is specifically configured to sample the first segmentation result every preset number of rows to obtain scan rows; to perform pixel scanning on each scan row to determine a third abscissa and a fourth abscissa of the first segmentation result on the scan row, where the third abscissa is the abscissa of the position where the first segmentation result first appears on the scan row and the fourth abscissa is the abscissa of the position where it last appears; and to take the midpoint of the third abscissa and the fourth abscissa as the abscissa of the second central point of the first segmentation result on the scan row, take the ordinate of the scan row as the ordinate of the second central point, and generate the first central point set from the second central points.
In an embodiment of the present application, a shaft identification device is provided. With this device, a scan row is taken every preset number of rows of an independent-block mask, and pixel scanning of each scan row yields the abscissas of the first and last positions where the mask appears on that row; the midpoint of these two abscissas gives the abscissa of the central point of the mask on the scan row, and the ordinate of the scan row gives its ordinate, that is, the position information of the central point is obtained. This can improve shaft identification efficiency.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft identification device 20 provided in the embodiment of the present application, the processing module 202 is specifically configured to determine an abscissa variation range corresponding to the first center point set according to the abscissa of each center point in the first center point set;
when the abscissa variation range is less than or equal to a first preset threshold, define the first centerline as a vertical centerline, where the abscissa of the first centerline is the mean of the abscissas of the central points in the first central point set and its extent in the ordinate direction runs from the minimum to the maximum ordinate of those central points;
when the variation range of the abscissa is larger than the first preset threshold and smaller than a second preset threshold, fitting each central point of the first central point set to obtain a first central line;
when the variation range of the abscissa is larger than or equal to the second preset threshold, discarding an abnormal central point from the first central point set to obtain a second central point set, and performing fitting processing on the central points of the second central point set to obtain the first central line, wherein the abnormal central point is a central point of which the difference between the abscissa and the abscissa mean value of each central point in the first central point set exceeds a third preset threshold.
In an embodiment of the present application, a shaft identification device is provided. By adopting the device, the center line determining mode under different conditions is carried out according to the abscissa condition of each center point in the center point set, and the accuracy of the result of the center line can be effectively ensured.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft recognition apparatus 20 provided in the embodiment of the present application, as shown in fig. 12a, the obtaining module 201 further obtains the training image data and the initial segmentation network, where the training image data is marked with the shaft, the initial segmentation network adopts an encoder-decoder structure, and the initial segmentation network includes a context information enhancing module that obtains vertical information of the shaft;
the shaft recognition device 20 further includes a training module 204, and the training module 204 is configured to train the initial segmentation network to obtain the segmentation network according to the training image data.
In an embodiment of the present application, a shaft identification device is provided. By adopting the device, the context information enhancement module for acquiring the vertical information of the rod-shaped object is designed in the segmentation network, and the feature extraction of the rod-shaped object is increased, so that the rod-shaped object identification precision is improved.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft identification device 20 provided in the embodiment of the present application, the context information enhancement module is a plurality of vertically asymmetric convolution layers with different scales, a convolution kernel of each vertically asymmetric convolution layer is designed to be N × 1, and N is a positive integer greater than 1; the convolution module of the encoder is a lightweight depthwise-pointwise convolution module.
In an embodiment of the present application, a shaft identification device is provided. By adopting the device, the context information enhancement module is designed into a plurality of vertical asymmetric convolution layers with different scales, so that the characteristic information of the rod-shaped object can be more effectively extracted. Meanwhile, the segmentation network is designed into a lightweight depthwise-pointwise convolution module, so that the operation requirement of the rod-shaped object recognition device can be reduced.
Optionally, on the basis of the embodiment corresponding to fig. 12, in another embodiment of the shaft recognition device 20 provided in the embodiment of the present application, the obtaining module 201 is specifically configured to input the image to be recognized into the segmentation network to output M second segmentation results, where the M second segmentation results are the segmentation results of all shafts to be recognized in the image and M is an integer greater than or equal to 1;
and to perform an opening operation on the M second segmentation results so as to break any adhesion present in them and obtain X independent block segmentation results, where the first segmentation result of the first shaft to be recognized is among the X independent block segmentation results and X is an integer greater than or equal to M.
In an embodiment of the present application, a shaft identification device is provided. By adopting the device, when a plurality of rod-shaped objects exist in the image to be identified and need to be identified, the output segmentation result needs to be adjusted, so that the segmentation result is ensured to be a mask of independent blocks, thus reducing the interference in the identification process of different rod-shaped objects and improving the identification precision of the rod-shaped objects.
The shaft recognition device provided in the present application may be a server, please refer to fig. 13, fig. 13 is a schematic structural diagram of a server provided in an embodiment of the present application, and the server 300 may have a relatively large difference due to different configurations or performances, and may include one or more Central Processing Units (CPUs) 322 (e.g., one or more processors) and a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 342 or data 344. Memory 332 and storage media 330 may be, among other things, transient storage or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 322 may be configured to communicate with the storage medium 330 to execute a series of instruction operations in the storage medium 330 on the server 300.
The server 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the shaft identifying means in the above-described embodiment may be based on the server structure shown in fig. 13.
The shaft recognition device provided by the present application may be a terminal device, please refer to fig. 14, which shows only a part related to the embodiment of the present application for convenience of description, and please refer to the method part of the embodiment of the present application for details that are not disclosed. In the embodiment of the present application, a terminal device is taken as an example to describe:
fig. 14 is a block diagram illustrating a partial structure of a vehicle-mounted terminal related to a terminal device provided in an embodiment of the present application. Referring to fig. 14, the in-vehicle terminal includes: radio Frequency (RF) circuitry 410, memory 420, input unit 430, display unit 440, sensor 450, audio circuitry 460, wireless fidelity (WiFi) module 470, processor 480, and power supply 490. Those skilled in the art will appreciate that the in-vehicle terminal structure shown in fig. 14 does not constitute a limitation of the in-vehicle terminal, and may include more or less components than those shown, or combine some components, or a different arrangement of components.
The following specifically describes each constituent element of the in-vehicle terminal with reference to fig. 14:
the RF circuit 410 may be used for receiving and transmitting signals during information transmission and reception or during a call, and in particular, receives downlink information of a base station and then processes the received downlink information to the processor 480; in addition, the data for designing uplink is transmitted to the base station. In general, RF circuitry 410 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 410 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general Packet Radio Service (GPRS), code Division Multiple Access (CDMA), wideband Code Division Multiple Access (WCDMA), long Term Evolution (LTE), email, short Message Service (SMS), etc.
The memory 420 may be used to store software programs and modules, and the processor 480 executes various functional applications and data processing of the in-vehicle terminal by operating the software programs and modules stored in the memory 420. The memory 420 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the in-vehicle terminal, and the like. Further, the memory 420 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 430 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the in-vehicle terminal. Specifically, the input unit 430 may include a touch panel 431 and other input devices 432. The touch panel 431, also called a touch screen, may collect touch operations of a user on or near it (e.g., operations of the user on or near the touch panel 431 using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 431 may include two parts: a touch detection device and a touch controller. The touch detection device detects a user's touch position, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 480, and receives and executes commands sent by the processor 480. In addition, the touch panel 431 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 430 may include other input devices 432 in addition to the touch panel 431. In particular, other input devices 432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 440 may be used to display information input by the user or information provided to the user and various menus of the in-vehicle terminal. The display unit 440 may include a display panel 441, and optionally, the display panel 441 may be configured in the form of a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 431 may cover the display panel 441; when the touch panel 431 detects a touch operation on or near it, the touch operation is transmitted to the processor 480 to determine the type of the touch event, and the processor 480 then provides a corresponding visual output on the display panel 441 according to the type of the touch event. Although in fig. 14 the touch panel 431 and the display panel 441 are two independent components implementing the input and output functions of the in-vehicle terminal, in some embodiments the touch panel 431 and the display panel 441 may be integrated to implement the input and output functions of the in-vehicle terminal.
The in-vehicle terminal may also include at least one sensor 450, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which may adjust the brightness of the display panel 441 according to the brightness of ambient light, and a proximity sensor, which may turn off the display panel 441 and/or its backlight when the in-vehicle terminal is moved to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes) and detect the magnitude and direction of gravity when stationary, and can be used for applications that recognize the attitude of the vehicle-mounted terminal (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer or tap detection); other sensors that may be configured on the vehicle-mounted terminal, such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described here.
The audio circuit 460, speaker 461, and microphone 462 may provide an audio interface between the user and the in-vehicle terminal. On one hand, the audio circuit 460 may transmit an electrical signal converted from received audio data to the speaker 461, which converts it into a sound signal for output; on the other hand, the microphone 462 converts a collected sound signal into an electrical signal, which the audio circuit 460 receives and converts into audio data; the audio data is output to the processor 480 for processing and then transmitted, for example, to another vehicle-mounted terminal via the RF circuit 410, or output to the memory 420 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 470, the vehicle-mounted terminal can help the user send and receive e-mail, browse web pages, access streaming media, and the like, providing wireless broadband internet access. Although fig. 14 shows the WiFi module 470, it is understood that it is not an essential component of the in-vehicle terminal and may be omitted as needed without changing the essence of the invention.
The processor 480 is a control center of the in-vehicle terminal, connects various parts of the entire in-vehicle terminal using various interfaces and lines, and performs various functions of the in-vehicle terminal and processes data by operating or executing software programs and/or modules stored in the memory 420 and calling data stored in the memory 420, thereby performing overall monitoring of the in-vehicle terminal. Optionally, processor 480 may include one or more processing units; optionally, the processor 480 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 480.
The vehicle-mounted terminal further includes a power source 490 (e.g., a battery) for supplying power to various components, and optionally, the power source may be logically connected to the processor 480 through a power management system, so that functions of managing charging, discharging, and power consumption are implemented through the power management system.
Although not shown, the in-vehicle terminal may further include a camera, a bluetooth module, and the like, which are not described herein.
The steps performed by the terminal device in the above-described embodiment may be based on the terminal device structure shown in fig. 14.
Embodiments of the present application also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer is caused to execute the method described in the foregoing embodiments.
Embodiments of the present application also provide a computer program product including a program, which, when run on a computer, causes the computer to perform the methods described in the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present application that in essence contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A rod-shaped object identification method, comprising:
acquiring an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized;
inputting the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is the segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation and obtained by training according to training image data and an initial segmentation network;
scanning the first segmentation result to obtain a first center point set corresponding to the first rod-shaped object to be recognized, wherein the first center point set comprises center points obtained by scanning the first rod-shaped object to be recognized;
obtaining a first centerline corresponding to the first rod-shaped object to be recognized according to the center points in the first center point set;
and determining position information of the first rod-shaped object to be recognized according to the first centerline.
2. The method of claim 1, wherein the first segmentation result is represented as a pixel-level mask, and wherein scanning the first segmentation result to obtain the first center point set corresponding to the first rod-shaped object to be recognized comprises:
performing pixel scanning on the first segmentation result row by row to determine a first abscissa and a second abscissa of the first segmentation result in each row, wherein the first abscissa is the abscissa of the position where the first segmentation result first appears in the row, and the second abscissa is the abscissa of the position where the first segmentation result last appears in the row;
and taking the middle value of the first abscissa and the second abscissa as the abscissa of a first center point of the first segmentation result in each row, taking the ordinate of the row as the ordinate of the first center point, and generating the first center point set from the first center points.
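The row-wise scan of claims 1-2 can be sketched in a few lines of NumPy. This is a hypothetical helper under assumed conventions (a binary mask with rows as ordinates and columns as abscissas); the claim does not prescribe an implementation:

```python
import numpy as np

def center_points(mask):
    """For each row of a pixel-level mask, take the first and last mask
    columns (the first and second abscissas) and return their midpoint
    as the row's center point, paired with the row index as ordinate."""
    points = []
    for y in range(mask.shape[0]):
        cols = np.flatnonzero(mask[y])           # mask pixels in this row
        if cols.size:
            x = float(cols[0] + cols[-1]) / 2.0  # middle value of the two abscissas
            points.append((x, y))
    return points

# A toy 5x5 mask of a vertical pole two pixels wide:
mask = np.zeros((5, 5), dtype=np.uint8)
mask[:, 2:4] = 1
print(center_points(mask))  # [(2.5, 0), (2.5, 1), (2.5, 2), (2.5, 3), (2.5, 4)]
```

Claim 3's variant changes only the iteration: the same first/last-abscissa midpoint is computed on scan lines sampled every preset number of rows rather than on every row.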
3. The method of claim 1, wherein the first segmentation result is represented as a pixel-level mask, and wherein scanning the first segmentation result to obtain the first center point set corresponding to the first rod-shaped object to be recognized comprises:
sampling the first segmentation result at intervals of a preset number of rows to obtain scan lines;
performing pixel scanning on each scan line to determine a third abscissa and a fourth abscissa of the first segmentation result on the scan line, wherein the third abscissa is the abscissa of the position where the first segmentation result first appears on the scan line, and the fourth abscissa is the abscissa of the position where the first segmentation result last appears on the scan line;
and taking the middle value of the third abscissa and the fourth abscissa as the abscissa of a second center point of the first segmentation result on the scan line, taking the ordinate of the scan line as the ordinate of the second center point, and generating the first center point set from the second center points.
4. The method according to claim 1, wherein obtaining the first centerline corresponding to the first rod-shaped object to be recognized according to the center points in the first center point set comprises:
determining an abscissa variation range corresponding to the first center point set according to the abscissas of the center points in the first center point set;
when the abscissa variation range is smaller than or equal to a first preset threshold, defining the first centerline as a vertical centerline, wherein the abscissa of the first centerline is the mean of the abscissas of the center points in the first center point set, and its ordinates span from the minimum ordinate to the maximum ordinate of the center points in the first center point set;
when the abscissa variation range is larger than the first preset threshold and smaller than a second preset threshold, fitting the center points in the first center point set to obtain the first centerline;
and when the abscissa variation range is larger than or equal to the second preset threshold, discarding abnormal center points from the first center point set to obtain a second center point set, and fitting the center points in the second center point set to obtain the first centerline, wherein an abnormal center point is a center point whose abscissa differs from the mean abscissa of the center points in the first center point set by more than a third preset threshold.
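The three-branch centerline logic of claim 4 might look like the following sketch. The threshold values t1, t2, t3 and the linear least-squares fit are illustrative assumptions, since the claim leaves both the threshold values and the fitting method open:

```python
import numpy as np

def fit_centerline(points, t1=2.0, t2=10.0, t3=5.0):
    """Derive a centerline from (x, y) center points per the three
    cases of claim 4; t1/t2/t3 stand in for the first/second/third
    preset thresholds, whose values the claim does not fix."""
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    spread = xs.max() - xs.min()          # abscissa variation range
    if spread <= t1:
        # near-constant abscissa: vertical centerline at the mean x,
        # spanning the minimum to maximum ordinate
        return ("vertical", float(xs.mean()), float(ys.min()), float(ys.max()))
    if spread < t2:
        # moderate spread: fit x as a linear function of y
        slope, intercept = np.polyfit(ys, xs, 1)
        return ("fitted", float(slope), float(intercept))
    # large spread: discard abnormal points whose abscissa deviates from
    # the mean by more than t3, then fit the remaining (second) point set
    keep = np.abs(xs - xs.mean()) <= t3
    slope, intercept = np.polyfit(ys[keep], xs[keep], 1)
    return ("fitted", float(slope), float(intercept))
```

For a perfectly vertical pole the first branch avoids fitting altogether, which is why a centerline with near-zero abscissa spread is handled as a special case.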
5. The method according to any one of claims 1 to 4, further comprising:
acquiring the training image data and the initial segmentation network, wherein a rod-shaped object is marked in the training image data, the initial segmentation network adopts an encoder-decoder structure, and the initial segmentation network comprises a context information enhancement module for acquiring vertical information of the rod-shaped object;
and training the initial segmentation network according to the training image data to obtain the segmentation network.
6. The method of claim 5, wherein the context information enhancement module comprises a plurality of vertically asymmetric convolutional layers of different sizes, the convolution kernels of the vertically asymmetric convolutional layers being designed as N x 1, wherein N is a positive integer greater than 1; and the convolution module of the encoder is a lightweight depthwise-pointwise convolution module.
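An N x 1 vertically asymmetric kernel, as in claim 6, extends the receptive field up and down but not sideways, which suits elongated near-vertical structures such as poles. A single-channel sketch in plain NumPy (illustrative only; in practice this would be a learned convolution layer with kernel_size=(N, 1) inside the encoder):

```python
import numpy as np

def vertical_conv(feat, kernel):
    """Apply a 1-D kernel of length N down each column of a feature map
    ('same' zero padding), i.e. an N x 1 convolution that aggregates
    vertical context only."""
    n, pad = len(kernel), len(kernel) // 2
    p = np.pad(feat, ((pad, pad), (0, 0)))            # pad rows only
    out = np.zeros_like(feat, dtype=float)
    for i in range(n):
        out += kernel[i] * p[i:i + feat.shape[0], :]  # weighted row shifts
    return out

# A 3x1 kernel of ones sums along the vertical axis only:
print(vertical_conv(np.ones((5, 3)), [1.0, 1.0, 1.0])[2, 1])  # 3.0
```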
7. The method according to any one of claims 1 to 4 and 6, wherein inputting the image to be recognized into the segmentation network to obtain the first segmentation result of the first rod-shaped object to be recognized comprises:
inputting the image to be recognized into the segmentation network to output M second segmentation results, wherein the M second segmentation results are the segmentation results of all rod-shaped objects to be recognized in the image to be recognized, and M is an integer greater than or equal to 1;
and performing a morphological opening operation on the M second segmentation results so that adhesions in the M second segmentation results are broken, obtaining X independent block segmentation results, wherein the first segmentation result of the first rod-shaped object to be recognized is included in the X independent block segmentation results, and X is an integer greater than or equal to M.
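The opening operation of claim 7 (erosion followed by dilation) removes the thin bridges where adjacent pole masks stick together, so that connected-component analysis afterwards yields independent blocks. A minimal 3 x 3 opening in plain NumPy (a sketch only; production code would typically use cv2.morphologyEx with cv2.MORPH_OPEN or scipy.ndimage.binary_opening):

```python
import numpy as np

def binary_open(mask):
    """3x3 binary opening: erode (all 9 neighbors set), then dilate
    (any of 9 neighbors set). One-pixel adhesions between blobs vanish."""
    def neighborhoods(m):
        p = np.pad(m, 1)                  # zero border
        h, w = m.shape
        return np.stack([p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)])
    eroded = neighborhoods(mask).all(axis=0)
    return neighborhoods(eroded).any(axis=0).astype(np.uint8)

# Two 3x3 blocks joined by a one-pixel-thick bridge:
mask = np.zeros((3, 9), dtype=np.uint8)
mask[:, 0:3] = 1
mask[:, 6:9] = 1
mask[1, 3:6] = 1
opened = binary_open(mask)
print(opened[1, 4], opened[1, 1], opened[1, 7])  # 0 1 1
```

After opening, the bridge is gone and the two blocks survive, so labeling the result gives X = 2 independent segmentation results from M = 1 stuck-together mask.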
8. A rod-shaped object identification device, comprising:
an acquisition module, configured to acquire an image to be recognized, wherein the image to be recognized comprises at least one rod-shaped object to be recognized, and to input the image to be recognized into a segmentation network to obtain a first segmentation result, wherein the first segmentation result is a segmentation result of a first rod-shaped object to be recognized in the image to be recognized, and the segmentation network is an image segmentation network based on semantic segmentation and obtained by training according to training image data and an initial segmentation network;
a processing module, configured to scan the first segmentation result to obtain a first center point set corresponding to the first rod-shaped object to be recognized, wherein the first center point set comprises center points obtained by scanning the first rod-shaped object to be recognized, and to obtain a first centerline corresponding to the first rod-shaped object to be recognized according to the center points in the first center point set;
and a determining module, configured to determine position information of the first rod-shaped object to be recognized according to the first centerline.
9. A computer device, comprising: a memory, a processor, and a bus system;
wherein the memory is used for storing programs;
the processor is configured to execute the program in the memory, so as to perform the method of any one of claims 1 to 7 according to instructions in the program code;
the bus system is used for connecting the memory and the processor so as to enable the memory and the processor to communicate.
10. A computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of any of claims 1 to 7.
CN202210945907.XA 2022-08-08 2022-08-08 Rod-shaped object identification method, device, equipment and storage medium Pending CN115294552A (en)


Publications (1)

Publication Number Publication Date
CN115294552A true CN115294552A (en) 2022-11-04



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination