WO2022126377A1

WO2022126377A1 - Traffic lane line detection method and apparatus, and terminal device and readable storage medium

Info

Publication number: WO2022126377A1
Application number: PCT/CN2020/136540
Authority: WO
Inventors: 王磊; 钟宏亮; 马森炜; 程俊; 林佩珍; 范筱媛
Original assignee: 中国科学院深圳先进技术研究院
Priority date: 2020-12-15
Filing date: 2020-12-15
Publication date: 2022-06-23

Abstract

A traffic lane line detection method and apparatus, and a terminal device and a readable storage medium, applicable to the technical field of computer vision and image processing. The method comprises: obtaining a road image of the current scene (S201); and inputting the road image into a trained neural network model for processing, and outputting a detection result of a traffic lane line in the road image of the current scene, wherein the trained neural network model is obtained by training according to sample images in a training set and a semantic segmentation model, and the sample images in the training set comprise collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes (S202). The method can solve the problems that most of deep learning models used for lane recognition at present are relatively large in calculation amount, and the models are relatively complex and unfavorable for meeting the requirement on real-time performance in an actual application scene of an automatic driving task.

Description

Method, device, terminal device and readable storage medium for detecting lane line

technical field

The present application belongs to the technical field of computer vision and image processing, and in particular, relates to a method, apparatus, terminal device and readable storage medium for detecting lane lines.

Background technique

With the rapid development of artificial intelligence and the automotive industry, the automatic driving of vehicles (such as fully automatic driverless or semi-automatic assisted driving) plays an important role in the safe driving of cars; and lane recognition is an important part of the automatic driving system. The result of lane recognition provides the basis for the control system of automatic driving, and plays an irreplaceable role in the fields of automatic parking, anti-collision warning and unmanned driving.

In recent years, semantic segmentation models have achieved good performance in lane recognition tasks, but limited by the lack of global information and contextual information, ordinary semantic segmentation models cannot handle bad lighting conditions or lane occlusions. for the lane recognition task. In addition, most of the deep learning models currently used for lane recognition have a large amount of calculation and are relatively complex, which is not conducive to the real-time requirements in practical application scenarios of automatic driving tasks.

technical problem

One of the purposes of the embodiments of the present application is to provide a method, device, terminal device and readable storage medium for detecting lane lines, which can solve the problem that most of the deep learning models currently used for lane recognition have a large amount of calculation, and the models are relatively It is complex and is not conducive to the real-time requirements in the actual application scenarios of automatic driving tasks.

technical solutions

In order to solve the above-mentioned technical problems, the technical solutions adopted in the embodiments of the present application are:

In a first aspect, an embodiment of the present application provides a method for detecting lane lines, including:

Obtain the road image of the current scene; input the road image into the trained neural network model for processing, and output the detection result of the lane lines in the road image of the current scene; wherein, the trained neural network model is based on The sample images in the training set and the semantic segmentation model are obtained by training, and the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes.

In a possible implementation manner of the first aspect, the trained neural network model includes a residual network layer, a dilated convolutional network layer, an upsampling network layer, and a detector; the road image is input into the training The post neural network model is processed, including:

Input the road image into the residual network layer, and through the convolution processing of the residual network layer, output a first feature map containing semantic features; input the first feature map into the hole convolutional network layer , through the feature extraction of the hole convolutional network layer, output the second feature map containing detailed features; input the second feature map into the up-sampling network layer, after the up-sampling process of the up-sampling network layer, Outputting a third feature map; inputting the third feature map into the detector, and after convolution processing by the detector, the detection result of the lane lines in the road image of the current scene is output.

In a possible implementation manner of the first aspect, the residual network layer includes a first residual module, a second residual module and a third residual module; the road image is input into the residual The network layer, through the convolution processing of the residual network layer, outputs a first feature map containing semantic features, including:

Input the road image into the first residual module, and perform convolution processing through the first residual module to obtain a first result; input the first result into the second residual module, and pass the second The second result is obtained through the convolution processing of the residual module; the second result is input into the third residual module, and the first feature map is output through the convolution processing of the third residual module.

In a possible implementation manner of the first aspect, the atrous convolutional network layer includes a first convolution module, a second convolution module, a third convolution module, a fourth convolution module, and a global average pooling module ; Input the first feature map into the hole convolution network layer, and extract the features through the hole convolution network layer, including:

Input the first feature map into the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module respectively, Feature extraction is performed on the first feature map.

In a possible implementation manner of the first aspect, the first feature map is input into the atrous convolutional network layer, and after feature extraction of the atrous convolutional network layer, the method further includes:

The feature maps respectively output by the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module are processed in the channel dimension. Splicing to obtain a splicing feature map; after the splicing feature map is processed by 1×1 convolution, the second feature map is output.

In a possible implementation manner of the first aspect, after feature extraction is performed on the first feature map through the atrous convolutional network layer, the method includes:

The second feature map is subjected to up-sampling processing, and the result of the up-sampling processing is input into the detector. After the convolution processing of the detector, the classification prediction result of the road image of the current scene is output. The classification prediction result is used to indicate the position of the lane in the road image of the current scene.

In a possible implementation manner of the first aspect, the method includes:

Mark the lane line pixel coordinates, the lane line type, and the characteristics of whether the current lane is a drivable lane in the collected road images of multiple scenes, and obtain all the sample images corresponding to the road images of the multiple scenes. the marked image.

In a possible implementation manner of the first aspect, the neural network model is trained according to the sample images in the training set and the semantic segmentation model, including:

Input the road images of the multiple scenes in the sample image into the residual network layer of the neural network model, and through the convolution processing of the residual network layer, output the fourth feature map of the road image; The fourth feature map is input into the hollow convolutional network layer of the neural network model, and the fifth feature map is output through feature extraction of the hollow convolutional network layer; the fifth feature map is subjected to up-sampling processing, and the The result of the upsampling processing is input into the detector of the neural network model, and after the convolution processing of the detector, the classification prediction results of the road images of the multiple scenes are output, and the classification prediction results are used to indicate the Location of lanes in road images for multiple scenes.

In a possible implementation manner of the first aspect, the residual network layer of the neural network model includes three residual modules, and the method includes:

Perform upsampling processing on the fifth feature map to obtain a sixth feature map; input the road images of the multiple scenes in the sample image into the residual network layer, and pass through the first two layers of the residual network layer. The convolution processing of the residual module outputs a seventh feature map; the seventh feature map is input into the semantic segmentation model, and the lane line pixels in the seventh feature map are segmented through the semantic segmentation model. , output the lane line instance feature map; splicing the lane line instance feature map and the sixth feature map to obtain a spliced image.

In a possible implementation manner of the first aspect, the method includes:

The stitched image is input into a segmenter, and after convolution processing by the segmenter, a semantic segmentation prediction result is output, and the semantic segmentation prediction result is used to indicate the lane recognition results in the road images of the multiple scenes.

In a possible implementation manner of the first aspect, the method includes:

The first loss function is used to calculate the first error value of the classification prediction result of the neural network model relative to the prediction probability of the marked image in the sample image, and the parameters of the neural network model are adjusted by the first error value; A loss function is expressed as follows:

Wherein, y is the classification truth value of the road images of the multiple scenes in the sample images in the training set, and p is the predicted probability that the road images of the multiple scenes in the sample images in the training set are processed by the neural network model , γ is the preset weight value.

In a possible implementation manner of the first aspect, the method includes:

Calculate the second error value of the lane recognition result of the semantic segmentation model relative to the predicted probability of the marked image in the sample image by using the second loss function, and adjust the parameters of the neural network model by the second error value; The second loss function is expressed as follows:

Wherein, y is the recognition truth value of the road images of the multiple scenes in the sample images in the training set, and p is the predicted probability that the road images of the multiple scenes in the sample images in the training set are processed by the semantic segmentation model .

In a second aspect, an embodiment of the present application provides a device for detecting lane lines, including:

an acquisition unit for acquiring the road image of the current scene;

The processing unit is used to input the road image into the trained neural network model for processing, and output the detection result of the lane line in the road image of the current scene; wherein, the trained neural network model is based on the training set The sample images in the training set and the semantic segmentation model are trained, and the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes.

In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, when the processor executes the computer program implement the method described.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program implements the method when executed by a processor.

In a fifth aspect, an embodiment of the present application provides a computer program product that, when the computer program product runs on a terminal device, causes the terminal device to execute the method described in any one of the above-mentioned first aspects.

It can be understood that, for the beneficial effects of the second aspect to the fifth aspect, reference may be made to the relevant description in the first aspect, which is not repeated here.

beneficial effect

The beneficial effects of the embodiments of the present application are: through the embodiments of the present application, the terminal device obtains the road image of the current scene; the road image is input into the trained neural network model for processing, and the detection result of the lane line in the road image of the current scene is outputted Wherein, the neural network model after training is obtained by training according to the sample images in the training set and the semantic segmentation model, and the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes; By training according to the sample images in the training set and the semantic segmentation model, the trained neural network model can be obtained, which can solve the problem of the low accuracy of lane line recognition in the current complex environment, and the large amount of calculation of the current model for lane line detection. The more complex problem is the slow response speed; the detection accuracy is improved, and at the same time, it meets the real-time requirements in the actual application scenarios of automatic driving tasks; it has strong ease of use and practicability.

Description of drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings that are used in the description of the embodiments or exemplary technologies. Obviously, the drawings in the following description are only for the present application. In some embodiments, for those of ordinary skill in the art, other drawings can also be obtained according to these drawings without any creative effort.

1 is a schematic flowchart of an application scenario provided by an embodiment of the present application;

2 is a schematic flowchart of a method for detecting lane lines provided by an embodiment of the present application;

3 is a schematic diagram of the overall architecture of the neural network model after training provided by an embodiment of the present application;

4 is a schematic diagram of the architecture of a residual network layer provided by an embodiment of the present application;

5 is a schematic structural diagram of an atrous convolutional network layer provided by an embodiment of the present application;

6 is a schematic structural diagram of a detector provided by an embodiment of the present application;

7 is a schematic diagram of the overall architecture of the training neural network model provided by an embodiment of the present application;

8 is a schematic structural diagram of a splitter provided by an embodiment of the present application;

9 is a visual schematic diagram of a detection result of a lane line provided by an embodiment of the present application;

10 is a schematic structural diagram of a device for detecting lane lines provided by an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.

Embodiments of the present invention

In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.

It is to be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described feature, integer, step, operation, element and/or component, but does not exclude one or more other The presence or addition of features, integers, steps, operations, elements, components and/or sets thereof.

It will also be understood that, as used in this specification and the appended claims, the term "and/or" refers to and including any and all possible combinations of one or more of the associated listed items.

As used in the specification of this application and the appended claims, the term "if" may be contextually interpreted as "when" or "once" or "in response to determining" or "in response to detecting ". Similarly, the phrases "if it is determined" or "if the [described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined" or "in response to the determination" or "once the [described condition or event] is detected. ]" or "in response to detection of the [described condition or event]".

In addition, in the description of the specification of the present application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and should not be construed as indicating or implying relative importance.

References in this specification to "one embodiment" or "some embodiments" and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in other embodiments," etc. in various places in this specification are not necessarily All refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "including", "including", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.

At present, lane line detection technology plays an irreplaceable role in the field of autonomous driving, and research on lane line detection technology is also emerging one after another. Among them, based on Hough transform, combined with various factors such as road surface type and weather, edge extraction is used to detect lanes, and the operating efficiency of the algorithm is optimized; while traditional lane detection algorithm models are often based on image visual clues. Recognition: After converting the image into a model that can represent hue, saturation, and brightness through a color model (Hue-Saturation-Intensity, HSI), the fuzzy mean clustering algorithm is used to process the pixels of each row to identify the lane lines. ; The algorithm model used is more complex, and the model calculation amount is large. By using deep learning semantic segmentation technology, a large number of samples under severe weather conditions are added to the lane line recognition data set, and a multi-task model based on vanishing point analysis is proposed; or by constructing a multi-attention mechanism The lane line recognition model for knowledge extraction; or by introducing long-term and short-term neural networks, etc., to process the long-line features of the lane lines to identify the lane lines; the traditional convolutional neural network cannot well analyze the spatial relationship between rows and columns in the image .

In addition, for the lane line detection task, a single semantic segmentation angle is used for detection to realize the optimization of the model; however, the image is straightened during classification, which destroys the spatial structure of the data and cannot well analyze the image. The problem of global and contextual information.

In addition, by integrating semantic segmentation and image classification tasks, the detection task of lane lines is realized through semantic segmentation model and image classification model; however, the adopted model is relatively complex and requires a large amount of calculation, which is not conducive to the actual application of automatic driving tasks. Real-time requirements; and the processing of data by this model destroys the spatial structure of each part of the image and reduces the accuracy of lane line detection.

Based on the above problems, an embodiment of the present application proposes a method for detecting lane lines. The lane lines are detected and recognized through a trained neural network model, and the task of lane recognition is regarded as a task of classifying the positions of pixels in each row of a picture. The unit analyzes the position of the lane line, which better analyzes the global and contextual information, and optimizes the operating efficiency of the model. In addition, the model builds a model for processing semantic segmentation tasks during the training process as an aid for training; at the same time, in order to better deal with the global structural features of the image, the setting of the loss function is also adjusted to guide the The model pays better attention to the continuity law of lane lines; the neural network model used to detect lane lines has been improved. During classification and detection, the spatial structure of the data is maintained, the perception ability of the model is optimized, and the model's sensitivity is reduced. Complexity.

Referring to FIG. 1 , it is a schematic flowchart of an application scenario provided by an embodiment of the present application. After the neural network model is trained based on the assistance of the semantic segmentation model, the trained neural network model is obtained, the road image of the current scene is input into the trained neural network model, and the trained convolutional neural network model performs feature extraction and The processing of feature learning, output the detection result of the lane line, and realize the prediction of the position of the lane line.

In the embodiment of the present application, a hollow convolutional network is introduced into the neural network model for detecting lane lines, and the classifier in the neural network model is optimized. The size of the atrous convolutional network optimizes the perception ability of the model, maintains the data space structure, optimizes the complexity of the model, greatly reduces the computational load of the model, and improves the response speed.

Among them, in the training process of the neural network model, the semantic segmentation model is used as an auxiliary model for training, and the training process is optimized by combining the convolutional convolution network, which greatly improves the training speed of the neural network model; at the same time, the classifier and the neural network model are optimized. The segmenter of the semantic segmentation model maintains the spatial structure of the feature map in the last layer of the network, reduces the number of parameters of the model, reduces the complexity of the loss function, and optimizes the computational efficiency of the neural network model. In addition, the trained neural network model still achieves high detection accuracy for images collected under poor lighting conditions or under occlusion conditions.

The specific contents of model training and model architecture are further introduced below with reference to the implementation steps of the method for detecting lane lines provided by the embodiments of the present application.

Referring to FIG. 2, it is a schematic flowchart of a method for detecting lane lines provided by an embodiment of the present application; the method for detecting lane lines includes the following steps in the application process:

Step S201, obtaining a road image of the current scene.

In some embodiments, the terminal device may use a camera to capture a road image of the current driving scene of the vehicle; the captured road image includes a road image in front of, behind or on the side of the vehicle. The terminal device may be an in-vehicle device, which is respectively connected to the camera and the control system for automatic driving of the vehicle.

Among them, the terminal device can control the camera to collect the road image of the scene where the vehicle is located in real time or in a preset period according to the requirements of the vehicle driving scene. The road image may include continuous lane lines, partially occluded lane lines, or no lane lines.

It is understandable that the characteristics of the lane lines are preset in the terminal device, which provides a prediction basis for the detection of the lane lines in the middle of the road image. Among them, according to the regulations, the width of the lane lines includes 10 cm, 15 cm and 20 cm; the lane lines are divided into solid lines and dotted lines, and the colors include yellow and white.

Step S202, input the road image into the trained neural network model for processing, and output the detection result of the lane lines in the road image of the current scene; wherein, the trained neural network model is trained according to the sample images in the training set and the semantic segmentation model It is obtained that the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes.

In some embodiments, the processing of road images by the neural network model is the actual detection process of lane lines; the actually detected lane lines are composed of lane lines with different line widths, line types and colors in various environments; lane line detection The task is to identify lane lines in various environments, and the goal of lane line detection is to determine the location and direction of the lane lines.

Among them, the processing of the road image by the trained neural network model includes the extraction and detection of target features.

In some embodiments, the trained neural network model includes a residual network layer, an atrous convolutional network layer, an upsampling network layer, and a detector. Input the road image into the trained neural network model for processing, including:

The road image is input into the residual network layer, and after the convolution processing of the residual network layer, the first feature map containing semantic features is output; , output the second feature map containing detailed features; input the second feature map into the up-sampling network layer, and output the third feature map after the up-sampling process of the up-sampling network layer; input the third feature map into the detector, through the detector The convolution processing of the current scene outputs the detection results of the lane lines in the road image of the current scene.

Referring to FIG. 3 , it is a schematic diagram of the overall architecture of the neural network model after training provided by the embodiment of the present application, and the neural network model after training is a classification detection model. Among them, the classification detection model adopts the residual network layer in the residual neural network ResNet-34 (Residual Neural Network-34); the input road image is convolved through the residual network layer to extract the semantic features in the road image. , to get the first feature map. In the classification and detection model, a hole convolution network layer is also introduced; the input first feature map is sampled and classified through the hole convolution network layer, while ensuring the receptive field of the preset size, by adopting multiple long holes The convolution kernel performs convolution processing on the first feature map, extracts detailed features and global features of the first feature map, and obtains a second feature map. In order to ensure that the size of the feature map is equal to that of the original input road image, the upsampling network layer in the classification detection model performs upsampling processing on the input second feature map, and outputs the sampled feature map, that is, the third feature map. The detector of the classification detection model performs two-dimensional convolution processing on the input third feature map, and outputs the recognition result of the lane line.

In some embodiments, the residual network layer includes a first residual module, a second residual module and a third residual module; the road image is input into the residual network layer, and after convolution processing of the residual network layer, the output includes The first feature map of semantic features, including:

The road image is input into the first residual module, and the first result is obtained through convolution processing by the first residual module; Second result; the second result is input into the third residual module, and the first feature map is output after the convolution processing of the third residual module.

As shown in FIG. 4 , a schematic structural diagram of a residual network layer provided by an embodiment of the present application. As shown in (a) of Figure 4, the residual network layer in the classification and detection model includes three residual modules, namely the first residual module, the second residual module and the third residual module. As shown in (b) of Figure 4, each residual module includes six 3×3 convolutional layers with edge padding, and the result after processing the six 3×3 convolutional layers with edge padding It is superimposed with the input feature map to obtain the output of each residual module.

Exemplarily, the residual module in the ResNet-34 network is used to extract and transform the features of the road image. Taking the first residual block of ResNet-34 as an example, (b) in Figure 4 shows the structure of the residual block. The residual block uses multiple stacked convolutional layers to process the input feature map to analyze and extract the effective information of the image more deeply. The final convolution result will be accumulated with the original input to optimize the number of layers that are too deep, which may lead to vanishing gradients. In order to ensure that the size of the output feature after each convolution remains unchanged compared to the input, it is necessary to fill the edges of the input feature map with 0 values before each convolution.

Understandably, the identification result of the lane line includes the identified lanes, the positions of the lanes on each row of pixels or pixel units in the image, and the determination of which column of the row each lane belongs to or any column that does not belong to the row. a row.

In some embodiments, the atrous convolutional network layer includes a first convolution module, a second convolution module, a third convolution module, a fourth convolution module, and a global average pooling module. Input the first feature map into the atrous convolutional network layer, and extract the features of the atrous convolutional network layer, including:

The first feature map is input into the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module respectively, and feature extraction is performed on the first feature map.

In some embodiments, the atrous convolutional network layer uses convolution kernels of different sizes to perform convolution processing on the feature maps.

Referring to FIG. 5 , it is a schematic structural diagram of an atrous convolutional network layer provided by an embodiment of the present application. Atrous spatial convolution pyramid pooling module and global average pooling module in atrous convolutional network layer as shown in (a) of Fig. 5. The atrous spatial convolution pyramid pooling module includes multiple convolution kernels of different sizes and atrous convolutions with different sampling rates, and analyzes and extracts detailed features of the input first feature map through multiple atrous convolutions. For example, the 1×1 convolution shown in (a) of Figure 5, the 3×3 convolution with the sampling rate rate=1, the 3×3 convolution with the sampling rate rate=3, and the 3×3 convolution with the sampling rate rate=5 3 convolutions. The atrous convolution of each size and sampling rate performs convolution processing on the input first feature map, and extracts detailed features of different channel dimensions in the first feature map. The global features of the input first feature map are extracted through the global average pooling module of the atrous convolutional network layer.

Exemplarily, the first convolution module is a 1×1 convolution, the second convolution module is a 3×3 convolution with a sampling rate rate=1, and the third convolution module is a 3×3 convolution with a sampling rate rate=3 product, the fourth convolution module is a 3×3 convolution with a sampling rate of rate=5.

In some embodiments, the first feature map is input into the atrous convolutional network layer, and after the feature extraction of the atrous convolutional network layer, the method further includes:

The feature maps output by the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module are spliced in the channel dimension to obtain the spliced feature map; After the image is processed by 1×1 convolution, the second feature map is output.

As shown in (a) of Figure 5, after the global feature of the first feature map is extracted by the global average pooling module, the feature map is output after upsampling and 1×1 convolution processing. The feature map output by the global draw pooling module and the feature maps output by the hole convolution and 1×1 convolution of each size and sampling rate, respectively, are spliced in the channel dimension, and the spliced feature map is output; the spliced feature map is then passed through A layer of 1×1 convolutional processing outputs the second feature map.

Compared with the traditional convolution method, the hole convolution processing in the hole convolution network layer can effectively expand the convolution receptive field while reducing the amount of parameters of the classification and detection model, thereby optimizing the operating efficiency of the model.

As shown in (b) of Figure 5, the left side shows the general structure of the atrous convolution kernel. Compared with the traditional convolution kernel, the hollow convolution kernel only selects some positions in the kernel to give actual weights, such as the weight at the position of the small black square, and ignores the input of the remaining positions. The sampling interval of atrous convolution is called the sampling rate. Ordinary convolution is a hole convolution with a sampling rate of 1, which is a special case of hole convolution. As shown in the right part of (b) in Figure 5, at the end of the model, after adding multiple layers of ordinary convolution, the use of atrous convolution will not cause possible image information due to the gap between the actual weights of the convolution kernel. For example, after feature map 1 is processed by hole convolution, feature map 2 is output, and feature map 2 continues to undergo multi-layer ordinary convolution processing. In the process of ordinary convolution, the gap weight between the actual weights of hole convolution is filled. Output feature map 3 with more complete image information. Therefore, atrous convolution is an effective way to increase the image receptive field and reduce the computational complexity of the model.

In some embodiments, after feature extraction is performed on the first feature map through the atrous convolutional network layer, the method includes:

The second feature map is subjected to upsampling processing, and the result of the upsampling processing is input into the detector. After the convolution processing of the detector, the classification prediction result of the road image of the current scene is output, and the classification prediction result is used to indicate the current scene. The location of the lane in the road image.

Referring to FIG. 6 , it is a schematic structural diagram of a detector provided by an embodiment of the present application. After the second feature map input upsampling network layer is sampled, it outputs a sampling feature map in the shape of [the number of lanes, the number of rows, the number of columns + 1], and the sampling feature map continues to pass through two layers of the hole volume filled with edges. After product (such as 3×3 convolution with sampling rate=1), the final classification result of the shape of [number of lanes, number of rows, number of columns+1] is output, and the shape of the classification result is the same as that of the input sampling feature map , respectively represent each lane, locate each lane on each row of the image, and determine which column of the row each lane is located in, or any column that does not belong to the row.

Through the embodiment of the present application, based on the convolution of each hole in the hole convolution network layer, the extraction and transformation of multi-size features greatly simplifies the detector of the classification detection model, directly using bilinear interpolation and 3 × 3 two The dimensional convolution performs feature adjustment on the feature output of the hole convolution to obtain the final classification output, which further improves the operating efficiency of the model.

Through the embodiment of the present application, the road image is processed by a multi-layer residual network block and then outputs a feature map, and then after a layer of 1×1 convolution processing, the hole convolution pyramid in the middle hole convolution network layer of the model is detected by classification Pooling ASPP for feature processing. It will be processed by three atrous convolution blocks with sampling rates of 1, 3, and 5, and another 1×1 convolution. All convolution operations will use padding operations to keep the input and output sizes unchanged. In order to better grasp the global features of the feature map, the atrous convolutional network layer also adopts global average pooling to obtain a global generalization of the input feature map, and restores it to the input with the help of bilinear interpolation upsampling and 1×1 convolution. The size of the feature map, and after the outputs of each convolution and hole convolution are spliced in the channel dimension, the second feature map is output after a 1×1 convolution to adjust the dimension features. Therefore, hole convolution processing is used in the middle and end of the classification detection model to comprehensively extract and analyze image features from multiple dimensions; the last layer features in the classification detection model are no longer straightened, but the last layer of the classification detection model is used. The spatial structure of the feature map is maintained, and the final model output is generated with the help of the improved detector, which greatly reduces the number of parameters of the model and improves the operating efficiency of the model.

In the training phase, in order to better guide the output evaluation of the model and obtain a more accurate detection effect, in the training process of the neural network model, an auxiliary model based on semantic segmentation is used for training.

In some embodiments, the method includes: marking the pixel coordinates of lane lines, the type of lane lines, and the characteristics of whether the current lane is a drivable lane in the collected road images of multiple scenes, and obtains the corresponding characteristics in the sample images. the marked images corresponding to the road images of the multiple scenes.

Exemplarily, road images of different scenarios are collected, lane lines in the road images are marked, regions of interest are extracted from the road images and down-sampling operations are performed, and data sets with different annotation formats are processed by a preset processing function. , to obtain the lane line data set required for the training of the neural network model, that is, the marked images corresponding to the road images of multiple scenes in the sample image.

In some embodiments, the neural network model is trained according to the sample images in the training set and the semantic segmentation model, including:

The road images of multiple scenes in the sample image are input into the residual network layer of the neural network model, and after the convolution processing of the residual network layer, the fourth feature map of the road image is output; the fourth feature map is input into the neural network model. The hole convolution network layer, after the feature extraction of the hole convolution network layer, outputs the fifth feature map; the fifth feature map is up-sampled, and the result of the up-sampling process is input into the detector of the neural network model, after the detector After the convolution processing, the classification prediction results of the road images of the multiple scenes are output, and the classification prediction results are used to indicate the positions of the lanes in the road images of the multiple scenes.

In some embodiments, the residual network layer of the neural network model includes three residual modules, and the method includes:

The fifth feature map is up-sampled to obtain the sixth feature map; the road images of multiple scenes in the sample image are input into the residual network layer, and after the convolution processing of the first two residual modules of the residual network layer, Output the seventh feature map; input the seventh feature map into the semantic segmentation model, segment the lane line pixels in the seventh feature map through the semantic segmentation model, and output the lane line instance feature map; combine the lane line instance feature map with the sixth feature The images are stitched together to obtain a stitched image.

In some embodiments, the method includes: inputting the stitched image into a segmenter, subject to convolution processing by the segmenter, and outputting a semantic segmentation prediction result, where the semantic segmentation prediction result is used to indicate lane recognition results in road images of multiple scenes .

As shown in FIG. 7 , a schematic diagram of the overall architecture of the training neural network model provided by the embodiment of the present application; the classification detection model in the upper half is the model used in the lane line detection process, and its output will be used as the final result of the task in the prediction stage ; The semantic segmentation model in the lower part only participates in the training process of the model and is used to guide the model to obtain a more accurate recognition effect.

In the process of training the neural network model, the road images of multiple scenes in the sample image are input into the residual network layer of the neural network model. After the convolution processing of the residual network layer, the semantic features of the road image are extracted, and the first Four feature maps. The fourth feature map is input into the hole convolution of each size and sampling rate of the hole convolution network layer, 1 × 1 convolution, and global average pooling for convolution processing to extract different detailed features and global features. The image splicing process and the 1×1 convolution process output the fifth feature map. Input the fifth feature map into the detector of the neural network model, and after the two-dimensional convolution processing of the detector, output the classification prediction result [the number of lanes, the number of rows, the number of columns + 1]; the classification prediction result indicates that the classification is each lane. , locate each lane on each row of the image, and determine which column of the row each lane is in, or any column that does not belong to the row.

Among them, the feature map of the road image is sampled and classified by the hole convolution network layer in the neural network model to ensure a larger receptive field. Detail perception ability, and reduce the number of parameters of the model. Global feature extraction is performed on the feature map through global draw pooling in the atrous convolutional network layer, and the feature map is output after upsampling and 1×1 convolution. The feature map output by each hole convolution kernel and the feature map output by the global draw pooling are spliced in the channel dimension, and then the fifth feature map is output after a layer of 1×1 convolution processing.

Exemplarily, the multiple atrous convolution kernels may be 1×1 convolution, 3×3 convolution with sampling rate rate=1, 3×3 convolution with sampling rate rate=3, and 3×3 convolution with sampling rate rate=5 3 convolutions.

In some embodiments, in the training process based on the semantic segmentation model, the output result of the second residual module of the neural network model (such as ResNet-34) is input into the semantic segmentation model, and the second residual is processed by the semantic segmentation model. The output of the module is divided into pixels, and the feature map of the lane line instance is output. The fifth feature map is output after being processed by the atrous convolutional network layer for up-sampling processing, and the sixth feature map is output. The lane line instance feature map and the sixth feature map are spliced to obtain a spliced image.

In some embodiments, the stitched image is input into a segmenter, and the segmenter performs two-dimensional convolution processing to output a semantic segmentation prediction result [number of lanes + 1, number of rows, number of columns], indicating that for each basic pixel in the image zone, which lane it belongs to or does not belong to any lane.

As shown in FIG. 8 , a schematic structural diagram of a splitter provided by an embodiment of the present application. The network before the splitter upsamples the feature map to the shape of [number of lanes + 1, number of rows, number of columns], and after two layers of convolution with edge-filled holes and a layer of ordinary 3×3 convolution, the output The final semantic segmentation prediction result, the shape of the semantic segmentation prediction result is the same as the input feature map, which is [the number of lanes + 1, the number of rows, the number of columns], indicating which lane or Does not belong to any lane. Among them, the segmenter is the output part of the semantic segmentation model.

In the training phase, in order to better guide the output evaluation of the model and obtain a more accurate recognition effect. The neural network is trained using an auxiliary model based on semantic segmentation. After splicing the results of the second residual module of the neural network model ResNet-34 with the output of the up-sampled atrous convolutional network layer through the semantic segmentation model, the output is directly processed by a two-dimensional convolution-based semantic segmenter. The goal of this output is to determine the recognition results of each lane in road images of multiple scenes, such as whether each part of the image belongs to a certain lane, which lane it belongs to, and performs the task of semantic segmentation.

In some embodiments, the first loss function is used to calculate the first error value of the classification prediction result of the neural network model relative to the prediction probability of the marked image in the sample image, and the parameters of the neural network model are adjusted by the first error value; the first loss The function is represented as follows:

In the training process of the embodiment of the present application, the classification detection model adopts Focal Loss as its first loss function in the training process. Among them, y refers to the classification truth value of the sample, p refers to the predicted probability of the sample, and γ is the set weight. Focal Loss can attach larger weights to hard-to-classify and severely misclassified samples, making the model more focused on samples during training.

In some embodiments, a second error value of the lane recognition result of the semantic segmentation model relative to the predicted probability of the marked image in the sample image is calculated by a second loss function, and the parameters of the neural network model are adjusted by the second error value; the second loss The function is represented as follows:

In the training process of the embodiment of the present application, the semantic segmentation model adopts the cross entropy loss function Cross Entropy Loss as the second loss function, and the calculation formula of Cross Entropy Loss is shown in formula (2), where y refers to the classification truth value of the sample , p refers to the predicted probability of the sample. For each class of classification results, when y=1, that is, the classification truth value of the class is 1, so that the predicted probability is as close to 1 as possible, so the loss is -log(p); for the example whose classification truth value is 0 , so that the predicted probability approaches 0, and the loss is -log(1-p).

Through the embodiments of the present application, due to the effective optimization of the receptive field by the convolutional pyramid pooling of the empty space without increasing the amount of model parameters, the straightening operation is no longer performed on the feature map of the last layer in the classification detection model. The structured loss function that can constrain the spatial features of the lane lines is used for processing, and the output feature map of the atrous convolutional network layer is directly subjected to bilinear interpolation upsampling and two-dimensional convolution processing; the spatial structure of the feature map is maintained, and the reduces the training burden of the model.

Through the embodiment of the present application, the parameter quantity of the model is optimized, and the operation efficiency of the model is improved; the spatial structure of the feature map is maintained at the end of the model, which is conducive to the analysis of the overall characteristics of the image; the use of hole spatial convolution pyramid pooling optimizes the The feature analysis in the second half of the model improves the receptive field of the convolution kernel without increasing the training burden of the model, and uses multiple convolution kernels of different sizes to analyze different features of the feature map, enhancing the generalization of classification and semantic segmentation. It improves the detection accuracy of tasks.

In order to verify the effectiveness of the method proposed by the present invention, the same data set as the traditional neural network model was used for experimental verification, and the traditional neural network model was compared in terms of the training speed, convergence speed and detection accuracy of the model. Significant progress has been made. As shown in Table 1, the comparison of the detection accuracy and detection speed between the trained neural network model provided by the embodiment of the present application and the traditional original model. The correct rate refers to the correct rate of recognition of lane line pixels by the model on an image of 800×288 pixels. The running speed refers to the time it takes for the model to process a batch of 16 images.

	训练后的神经网络模型The trained neural network model	传统的原模型traditional original model
正确率Correct rate	92.96％92.96%	92.04％92.04%
运行速度running speed	32ms/batch32ms/batch	60ms/batch60ms/batch

Table 1

As shown in FIG. 9 , an embodiment of the present application provides a visual schematic diagram of a detection result of a lane line. FIG. 9 shows the detection results of the trained neural network model provided by the embodiments of the present application in two sets of test images. It can be seen that, as shown in (a) of FIG. 9 , the trained neural network model provided by the embodiment of the present application can accurately identify multiple lane lines in the image; and as shown in (b) of FIG. 9 , even if In the case that the lane line is blocked by obstacles, it can still maintain a good recognition effect.

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the sequence of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

Corresponding to the method for detecting lane lines described in the above embodiments, FIG. 10 shows a structural block diagram of the device for detecting lane lines provided by the embodiments of the present application. part.

10, the device includes:

an acquisition unit 101, used for acquiring a road image of the current scene;

The processing unit 102 is configured to input the road image into the trained neural network model for processing, and output the detection result of the lane line in the road image of the current scene; wherein, the trained neural network model is based on training The sample images in the set and the semantic segmentation model are obtained by training, and the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes.

Through the embodiment of the present invention, the terminal device obtains the road image of the current scene; inputs the road image into the trained neural network model for processing, and outputs the detection result of the lane line in the road image of the current scene; wherein, the trained neural network model In order to be obtained by training according to the sample images in the training set and the semantic segmentation model, the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of multiple scenes; The segmentation model is trained to obtain the trained neural network model, which can solve the problem of low accuracy of lane line recognition in the current complex environment, and the current model used for lane line detection has a large amount of calculation and a relatively complex model. The problem is that the response speed is slow ; Improve the detection accuracy, while meeting the real-time requirements in the actual application scenarios of automatic driving tasks; it has strong ease of use and practicability.

It should be noted that the information exchange, execution process and other contents between the above-mentioned devices/units are based on the same concept as the method embodiments of the present application. For specific functions and technical effects, please refer to the method embodiments section. It is not repeated here.

Those skilled in the art can clearly understand that, for the convenience and simplicity of description, only the division of the above-mentioned functional units and modules is used as an example. Module completion, that is, dividing the internal structure of the device into different functional units or modules to complete all or part of the functions described above. Each functional unit and module in the embodiment may be integrated in one processing unit, or each unit may exist physically alone, or two or more units may be integrated in one unit, and the above-mentioned integrated units may adopt hardware. It can also be realized in the form of software functional units. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing from each other, and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above-mentioned system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.

Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be implemented.

The embodiments of the present application provide a computer program product, when the computer program product runs on a mobile terminal, the steps in the foregoing method embodiments can be implemented when the mobile terminal executes the computer program product.

FIG. 11 is a schematic structural diagram of a terminal device 11 according to an embodiment of the present application. As shown in FIG. 11 , the terminal device 11 of this embodiment includes: at least one processor 110 (only one is shown in FIG. 11 ), a processor, a memory 111 , and a processor 111 stored in the memory 111 and available for processing in the at least one processor A computer program 112 running on the processor 110, and the processor 110 implements the steps in any of the foregoing embodiments of the certificate storage method when the processor 110 executes the computer program 112.

The terminal device 11 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a vehicle-mounted device, and a cloud server. The terminal device 11 may include, but is not limited to, a processor 110 and a memory 111 . Those skilled in the art can understand that FIG. 11 is only an example of the terminal device 11, and does not constitute a limitation on the terminal device 11, and may include more or less components than the one shown, or combine some components, or different components , for example, may also include input and output devices, network access devices, and the like.

The so-called processor 110 may be a central processing unit (Central Processing Unit, CPU), and the processor 110 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSP), application specific integrated circuits (Application Specific Integrated Circuits) , ASIC), off-the-shelf programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 111 may be an internal storage unit of the terminal device 11 in some embodiments, such as a hard disk or a memory of the terminal device 11 . In other embodiments, the memory 111 may also be an external storage device of the terminal device 11, such as a plug-in hard disk equipped on the terminal device 11, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc. Further, the memory 111 may also include both an internal storage unit of the terminal device 11 and an external storage device. The memory 111 is used to store an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as program codes of the computer program and the like. The memory 111 may also be used to temporarily store data that has been output or will be output.

The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the present application realizes all or part of the processes in the methods of the above embodiments, which can be completed by instructing the relevant hardware through a computer program, and the computer program can be stored in a computer-readable storage medium. When executed by a processor, the steps of each of the above method embodiments can be implemented. Wherein, the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form, and the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, recording medium, computer memory, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), electrical carrier signals, telecommunication signals, and software distribution media. For example, U disk, mobile hard disk, disk or CD, etc. In some jurisdictions, under legislation and patent practice, computer readable media may not be electrical carrier signals and telecommunications signals.

In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described or described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

Those of ordinary skill in the art can realize that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.

In the embodiments provided in this application, it should be understood that the disclosed apparatus/network device and method may be implemented in other manners. For example, the apparatus/network device embodiments described above are only illustrative. For example, the division of the modules or units is only a logical function division. In actual implementation, there may be other division methods, such as multiple units. Or components may be combined or may be integrated into another system, or some features may be omitted, or not implemented. On the other hand, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, but not to limit them; although the present application has been described in detail with reference to the above-mentioned embodiments, those of ordinary skill in the art should understand that: it can still be used for the above-mentioned implementations. The technical solutions described in the examples are modified, or some technical features thereof are equivalently replaced; and these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions in the embodiments of the application, and should be included in the within the scope of protection of this application.

Claims

A method for detecting lane lines, comprising:

Get the road image of the current scene;

Inputting the road image into the trained neural network model for processing, and outputting the detection result of the lane line in the road image of the current scene;

The trained neural network model is obtained by training according to the sample images in the training set and the semantic segmentation model, and the sample images in the training set include collected road images of multiple scenes and road images corresponding to the multiple scenes. Tag the image.
The method of claim 1, wherein the trained neural network model comprises a residual network layer, an atrous convolutional network layer, an upsampling network layer and a detector;

The processing of inputting the road image into the trained neural network model includes:

Inputting the road image into the residual network layer, and through convolution processing of the residual network layer, outputting a first feature map containing semantic features;

Inputting the first feature map into the hole convolution network layer, and outputting a second feature map containing detailed features through feature extraction of the hole convolution network layer;

Inputting the second feature map into the up-sampling network layer, and after up-sampling processing by the up-sampling network layer, a third feature map is output;

The third feature map is input into the detector, and the detection result of the lane line in the road image of the current scene is output through the convolution process of the detector.
The method of claim 2, wherein the residual network layer comprises a first residual module, a second residual module and a third residual module;

The said road image is input into the residual network layer, and after the convolution processing of the residual network layer, a first feature map containing semantic features is output, including:

Inputting the road image into the first residual module, and performing convolution processing through the first residual module to obtain a first result;

The first result is input into the second residual module, and the second result is obtained through the convolution processing of the second residual module;

The second result is input into the third residual module, and the first feature map is output after convolution processing by the third residual module.
The method of claim 2, wherein the atrous convolutional network layer comprises a first convolution module, a second convolution module, a third convolution module, a fourth convolution module and a global average pooling module ;

Inputting the first feature map into the atrous convolutional network layer, and extracting features from the atrous convolutional network layer, including:

Input the first feature map into the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module respectively, Feature extraction is performed on the first feature map.
The method of claim 4, wherein the first feature map is input into the atrous convolutional network layer, and after the feature extraction of the atrous convolutional network layer, the method further comprises:

The feature maps respectively output by the first convolution module, the second convolution module, the third convolution module, the fourth convolution module and the global average pooling module are processed in the channel dimension. Splicing to obtain a splicing feature map;

After the spliced feature map is processed by 1×1 convolution, the second feature map is output.
The method according to claim 5, wherein after the feature extraction is performed on the first feature map through the atrous convolutional network layer, the method comprises:

The second feature map is subjected to up-sampling processing, and the result of the up-sampling processing is input into the detector. After the convolution processing of the detector, the classification prediction result of the road image of the current scene is output. The classification prediction result is used to indicate the location of the lane in the road image of the current scene.
The method of claim 1, wherein the method comprises:

Mark the lane line pixel coordinates, the lane line type, and the characteristics of whether the current lane is a drivable lane in the collected road images of multiple scenes, and obtain all the sample images corresponding to the road images of the multiple scenes. the marked image.
The method according to claim 1 or 7, wherein, training a neural network model according to the sample images in the training set and the semantic segmentation model, comprising:

Inputting the road images of the multiple scenes in the sample image into the residual network layer of the neural network model, and through the convolution processing of the residual network layer, the fourth feature map of the road image is output;

Inputting the fourth feature map into the hollow convolutional network layer of the neural network model, and outputting the fifth feature map through feature extraction of the hollow convolutional network layer;

The fifth feature map is subjected to up-sampling processing, and the result of the up-sampling processing is input into the detector of the neural network model, and after the convolution processing of the detector, the road images of the multiple scenes are output. Classification prediction results, where the classification prediction results are used to indicate where the lanes in the road images of the multiple scenes are located.
The method of claim 8, wherein the residual network layer of the neural network model comprises three residual modules, and the method comprises:

performing up-sampling processing on the fifth feature map to obtain a sixth feature map;

Inputting the road images of the multiple scenes in the sample image into the residual network layer, and through the convolution processing of the first two residual modules of the residual network layer, the seventh feature map is output;

Inputting the seventh feature map into the semantic segmentation model, segmenting the lane line pixels in the seventh feature map through the semantic segmentation model, and outputting a lane line instance feature map;

Splicing the feature map of the lane line instance with the sixth feature map to obtain a spliced image.
The method of claim 9, wherein the method comprises:

The stitched image is input into a segmenter, and after convolution processing by the segmenter, a semantic segmentation prediction result is output, and the semantic segmentation prediction result is used to indicate the lane recognition results in the road images of the multiple scenes.
The method of claim 8, wherein the method further comprises:

The first loss function is used to calculate the first error value of the classification prediction result of the neural network model relative to the prediction probability of the marked image in the sample image, and the parameters of the neural network model are adjusted by the first error value; A loss function is expressed as follows:

Wherein, y is the classification truth value of the road images of the multiple scenes in the sample images in the training set, and p is the predicted probability that the road images of the multiple scenes in the sample images in the training set are processed by the neural network model , γ is the preset weight value.
The method of claim 8 or 11, wherein the method further comprises:

Calculate the second error value of the lane recognition result of the semantic segmentation model relative to the predicted probability of the marked image in the sample image by using the second loss function, and adjust the parameters of the neural network model by the second error value; The second loss function is expressed as follows:

Wherein, y is the recognition truth value of the road images of the multiple scenes in the sample images in the training set, and p is the predicted probability that the road images of the multiple scenes in the sample images in the training set are processed by the semantic segmentation model .
A device for detecting lane lines, comprising:

an acquisition unit for acquiring the road image of the current scene;

The processing unit is used to input the road image into the trained neural network model for processing, and output the detection result of the lane line in the road image of the current scene; wherein, the trained neural network model is based on the training set The sample images in the training set are obtained by training the sample images and the semantic segmentation model, and the sample images in the training set include collected road images of multiple scenes and marked images corresponding to the road images of the multiple scenes.
A terminal device, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, characterized in that, when the processor executes the computer program, the process according to claim 1 to The method of any one of 12.
A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1 to 12 is implemented.