WO2022227769A1 - Training method and apparatus for a lane line detection model, electronic device, and storage medium - Google Patents
Training method and apparatus for a lane line detection model, electronic device, and storage medium
- Publication number
- WO2022227769A1 (PCT/CN2022/075105)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- lane line
- model
- target
- road condition
- elements
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/588—Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- The present disclosure relates to the technical field of artificial intelligence, in particular to the fields of computer vision and deep learning, and can be applied to intelligent traffic scenarios; it relates in particular to a training method and apparatus for a lane line detection model, an electronic device, and a storage medium.
- Artificial intelligence is the study of making computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and involves both hardware-level and software-level technologies.
- Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graphs, and other major directions.
- In the related art, the logic of semantic segmentation methods for elements in a road condition image cannot be directly applied to the detection and segmentation of lane lines, and the computational complexity of lane line detection and segmentation is too high to meet real-time requirements.
- a training method, device, electronic device, storage medium and computer program product for a lane line detection model are provided.
- A method for training a lane line detection model, including: acquiring a plurality of sample road condition images and a plurality of pieces of marked lane line information respectively corresponding to the plurality of sample road condition images; determining a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and training an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics, and the plurality of pieces of marked lane line information to obtain a lane line detection model.
- A training apparatus for a lane line detection model, comprising: an acquisition module configured to acquire a plurality of sample road condition images and a plurality of pieces of marked lane line information respectively corresponding to the plurality of sample road condition images; a determination module configured to determine a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and a training module configured to train an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics, and the plurality of pieces of marked lane line information to obtain a lane line detection model.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the method for training a lane line detection model according to an embodiment of the present disclosure.
- a non-transitory computer-readable storage medium storing computer instructions, the computer instructions are used to cause the computer to execute the training method of the lane line detection model disclosed in the embodiments of the present disclosure.
- a computer program product including a computer program that, when executed by a processor, implements the method for training a lane line detection model disclosed in the embodiments of the present disclosure.
- FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure.
- FIG. 2 is a schematic diagram of a second embodiment according to the present disclosure.
- FIG. 3 is a schematic diagram of a third embodiment according to the present disclosure.
- FIG. 4 is a schematic diagram of a fourth embodiment according to the present disclosure.
- FIG. 5 is a block diagram of an electronic device used to implement the training method of the lane line detection model according to the embodiment of the present disclosure.
- FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure.
- The execution body of the training method of the lane line detection model in the embodiment of the present disclosure is a training apparatus for the lane line detection model; the apparatus can be implemented by software and/or hardware and can be configured in an electronic device, and the electronic device may include, but is not limited to, a terminal, a server, and the like.
- The embodiments of the present disclosure relate to the technical field of artificial intelligence, in particular to the fields of computer vision and deep learning, and can be applied to intelligent traffic scenarios; they can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
- Artificial intelligence (abbreviated in English as AI) is a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence.
- Deep learning learns the inherent laws and representation levels of sample data; the information obtained during such learning is of great help in interpreting data such as text, images, and sounds.
- The ultimate goal of deep learning is to enable machines to have the ability to analyze and learn like humans, and to recognize data such as words, images, and sounds.
- Computer vision refers to using cameras and computers instead of human eyes to identify, track, and measure targets, and to further perform graphics processing so that the processed images are more suitable for human observation or for transmission to instruments for detection.
- the training method of the lane line detection model includes:
- S101 Acquire a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
- the road condition image used for training the lane line detection model may be called a sample road condition image, and the road condition image may be an image captured by a camera device in the environment in an intelligent traffic scene, which is not limited.
- A plurality of sample road condition images may be obtained from a sample road condition image pool, and the plurality of sample road condition images may be used to train an initial artificial intelligence model to obtain a lane line detection model.
- the marked lane line information can be used as a reference mark when training an initial artificial intelligence model.
- The above lane line information can be used to describe the lane-line-related information in the sample road condition image, such as the lane line type, the image features corresponding to the image area of the lane line, or whether the lane line exists (whether the lane line exists may be called the lane line state), or any other possible lane line information, which is not limited.
- The multiple sample road condition images and the multiple pieces of marked lane line information may be combined to train the initial artificial intelligence model.
- S102 Determine a plurality of elements corresponding to the plurality of sample road condition images respectively, and the semantics of the plurality of elements corresponding to the plurality of elements respectively.
- Image recognition may be performed on the plurality of sample road condition images respectively to obtain the elements corresponding to each sample road condition image and the element semantics corresponding to each element. The elements may be, for example, the sky, trees, or roads in the sample road condition image, and the element semantics may refer to the element type and element characteristics of the sky, trees, and roads. Since an element usually contains some pixels of the image, the context information of the pixels it contains can be used to classify the element and obtain the element semantics, which is not limited.
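The per-pixel element classification described above can be sketched as an argmax over per-pixel class scores. This is an illustrative sketch, not the disclosed implementation: the class list and score layout are assumptions.

```python
import numpy as np

# Illustrative element classes (the disclosure mentions sky, trees, roads).
ELEMENT_CLASSES = ["sky", "tree", "road"]

def classify_elements(logits: np.ndarray) -> np.ndarray:
    """Assign each pixel an element class from per-pixel class scores.

    logits: (H, W, C) array of per-pixel scores, one channel per class.
    Returns an (H, W) array of class indices (the element semantics map).
    """
    return logits.argmax(axis=-1)

# Toy example: a 2x2 image with scores for 3 element classes.
logits = np.array([[[0.9, 0.0, 0.1], [0.1, 0.8, 0.1]],
                   [[0.0, 0.1, 0.9], [0.1, 0.2, 0.7]]])
semantics = classify_elements(logits)  # per-pixel element indices
```

In practice the scores would come from a segmentation network rather than a hand-written array; the argmax step is the same.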
- The elements and element semantics corresponding to the sample road condition images, together with the plurality of pieces of marked lane line information, can be used to train the initial artificial intelligence model to obtain the lane line detection model.
- The processing logic based on element recognition can detect and identify lane line instances, thereby avoiding reliance on anchor information of lane lines in the road condition image, reducing the computational complexity of the model, and improving detection and recognition efficiency.
- S103 Train an initial artificial intelligence model according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple labeled lane line information to obtain a lane line detection model.
- The initial artificial intelligence model can be, for example, a neural network model, a machine learning model, or a graph neural network model; any other possible model capable of performing image recognition and analysis tasks can also be used, which is not limited.
- The multiple sample road condition images, multiple elements, and multiple element semantics can be correspondingly input into the above-mentioned neural network model, machine learning model, or graph neural network model, so as to obtain the predicted lane line information output by any of these models. The predicted lane line information is produced by the model, based on its algorithm logic, by combining the elements in the sample road condition image and the element semantics.
- Multiple sample road condition images, multiple elements, and multiple element semantics may be input into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the model, and the convergence timing of the model may then be determined according to the multiple pieces of predicted lane line information and the multiple pieces of marked lane line information. That is, in response to the target loss value between the multiple pieces of predicted lane line information and the multiple pieces of marked lane line information satisfying the set condition, the artificial intelligence model obtained by training is used as the lane line detection model. This allows the convergence timing of the model to be determined in a timely manner and enables the trained lane line detection model to effectively model the image features of lane lines in intelligent traffic scenes, effectively improving the efficiency of lane line detection and recognition so that the trained model can meet application scenarios with high real-time requirements.
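The convergence check described above compares predicted lane line information against the marked information. A minimal sketch, assuming the prediction is an existence probability, the marked state is 0/1, and the "set condition" is a per-sample loss bound (both assumptions, not specified by the disclosure):

```python
import math

def state_loss(predicted_prob: float, marked_state: int) -> float:
    """Binary cross-entropy between the predicted lane line existence
    probability and the marked lane line state (1 = exists, 0 = absent)."""
    eps = 1e-7
    p = min(max(predicted_prob, eps), 1 - eps)
    return -(marked_state * math.log(p) + (1 - marked_state) * math.log(1 - p))

def has_converged(predicted_probs, marked_states, max_loss=0.05) -> bool:
    """Assumed set condition: every per-sample target loss is below max_loss."""
    return all(state_loss(p, y) < max_loss
               for p, y in zip(predicted_probs, marked_states))
```

Other conditions (mean loss, loss plateau, fixed step count) would fit the same interface.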
- the number of the above target loss values may be one or more, and the loss value between the predicted lane line information and the marked lane line information may be referred to as the target loss value.
- any other possible methods may also be used to determine the convergence timing of the initial artificial intelligence model, until the artificial intelligence model satisfies certain convergence conditions, the artificial intelligence model obtained by training is used as the lane line detection model.
- The initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple pieces of marked lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
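The overall S101-S103 flow can be sketched as a loop over training steps that stops once the target loss satisfies the set condition. Everything here is an assumption for illustration: `model_step` stands in for one real optimizer update, and `max_loss` stands in for the unspecified convergence condition.

```python
def train_until_converged(model_step, samples, elements, element_semantics,
                          labels, max_steps=100, max_loss=0.05):
    """Sketch of S101-S103: repeatedly update the model on the four inputs
    and stop once the target loss value meets the assumed set condition."""
    loss = float("inf")
    for step in range(max_steps):
        # model_step performs one update and returns the current target loss.
        loss = model_step(samples, elements, element_semantics, labels)
        if loss < max_loss:
            return step, loss  # converged: use the trained model
    return max_steps, loss

# Toy stand-in for a real training step: the loss halves on each call.
state = {"loss": 0.4}
def fake_step(*_):
    state["loss"] /= 2
    return state["loss"]

step, loss = train_until_converged(fake_step, None, None, None, None)
```

A real implementation would compute the loss from predicted versus marked lane line information inside `model_step`; the control flow is what this sketch shows.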
- FIG. 2 is a schematic diagram of a second embodiment according to the present disclosure.
- the training method of the lane line detection model includes:
- S201 Acquire a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
- S202 Determine a plurality of elements corresponding to the plurality of sample road condition images respectively, and the semantics of a plurality of elements corresponding to the plurality of elements respectively.
- S203 Input multiple sample road condition images, multiple elements, and multiple element semantics into the element detection sub-model to obtain target elements output by the element detection sub-model.
- the initial artificial intelligence model may include: an element detection sub-model and a lane line detection sub-model connected in sequence, so that when training the initial artificial intelligence model, multiple sample road condition images, multiple elements, and The semantics of multiple elements are input into the element detection sub-model to obtain target elements output by the element detection sub-model, and the target elements can be used to assist in the detection and identification of lane line instances.
- The above-mentioned element detection sub-model can be used for image feature extraction and can be regarded as a pre-training model for lane line instance segmentation.
- The backbone network of the element detection sub-model is trained to perform feature extraction on each sample road condition image in the Cityscapes dataset, so as to identify elements and the corresponding element semantics from each sample road condition image.
- The above-mentioned element detection sub-model may specifically be a deep high-resolution representation learning model for visual recognition with object-contextual representation (Deep High-Resolution Representation Learning for Visual Recognition, Object-Contextual Representation, HRNet-OCR), which is not limited. That is, the backbone network of the HRNet-OCR model can be used for image feature extraction; the embodiment of the present disclosure can then improve the structure of the HRNet-OCR model and train the improved HRNet-OCR model to realize the fused application of the element semantic segmentation logic and lane line detection and recognition.
- The target element can specifically be an element whose element type is a road type. Since the target element is identified first, and then the target element, the target element semantics corresponding to the target element, and the multiple sample road condition images are input into the lane line detection sub-model to obtain multiple pieces of predicted lane line information output by the lane line detection sub-model, the pertinence of the model's processing and recognition can be improved, interference caused by other elements to lane line detection can be avoided, and the accuracy of detection and recognition is improved while the detection and recognition efficiency of the lane line detection model is improved.
- S204 Input the target element, the semantics of the target element corresponding to the target element, and multiple sample road condition images into the lane line detection sub-model to obtain multiple predicted lane line information output by the lane line detection sub-model.
- the above-mentioned element-based detection sub-model processes multiple sample road condition images, multiple elements, and multiple element semantics to output the target element, and then the target element, the target element semantics corresponding to the target element, and the multiple sample road condition images can be input. into the lane line detection sub-model to obtain multiple predicted lane line information output by the lane line detection sub-model.
- the element semantics corresponding to the target elements may be called target element semantics. If the target element is an element whose element type is a road type, the target element semantics may be a road type and an image feature corresponding to the road.
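Restricting the lane line detection sub-model to the road-type target element can be sketched as masking the image with the element semantics map. The road class index and the zero-fill strategy are illustrative assumptions, not details from the disclosure.

```python
import numpy as np

ROAD_CLASS = 2  # assumed index of the road-type element class

def keep_target_element(image: np.ndarray, semantics: np.ndarray) -> np.ndarray:
    """Zero out every pixel whose element semantics are not the road type,
    so the lane line detection sub-model only sees the target element."""
    road_mask = (semantics == ROAD_CLASS)
    return image * road_mask[..., None]  # broadcast mask over channels

# Toy 2x2 RGB image; only the bottom row is classified as road.
image = np.ones((2, 2, 3))
semantics = np.array([[0, 1], [2, 2]])
masked = keep_target_element(image, semantics)
```

In the disclosed design the target element is an input to the sub-model rather than necessarily a hard mask; masking is simply one concrete way to realize "other elements do not interfere".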
- the above-mentioned predicted lane line information may specifically refer to the predicted lane line state, and/or the predicted context information of multiple pixels in the image area covered by the lane line.
- the marked lane line information may specifically refer to the marked lane line state, and/or the marked context information of multiple pixels in the image area covered by the lane line.
- The above lane line state may refer to the presence or absence of a lane line. Context information can be used to characterize the pixel features corresponding to each pixel in the image area covered by the lane line, as well as the relative relationship, in terms of image features, between each pixel and other pixels, for example, a relative position relationship or a relative depth relationship, which is not limited.
- The above-mentioned predicted lane line state and/or the predicted context information of multiple pixels in the image area covered by the lane line, together with the corresponding marked lane line state and/or the marked context information, can be combined to determine the convergence timing of the artificial intelligence model. Accurately determining the convergence timing can effectively reduce the computing resources consumed by model training and ensure the detection and recognition effect of the trained lane line detection model.
- S205 Determine a plurality of predicted lane line states and a plurality of first loss values between the corresponding plurality of marked lane line states.
- The loss value between each predicted lane line state and the corresponding marked lane line state can be determined and used as the first loss value.
- The first loss value can be used to characterize the loss of the lane line detection model in predicting the lane line state.
- S206 Select the target first loss value from among the plurality of first loss values, and determine the target predicted lane line information and the target marked lane line information corresponding to the target first loss value.
- The first loss value greater than the set loss threshold among the plurality of first loss values may be used as the target first loss value. That is, when a first loss value is greater than the set loss threshold, it indicates that the predicted lane line state is closer to the marked lane line state, reflecting that the model at this time has a more accurate state recognition result, so that the determination of the loss value better conforms to the detection logic of the actual model, ensuring the practicality and rationality of the method.
- The loss threshold may be set to, for example, 0.5. When the first loss value is greater than 0.5, it indicates that the detection accuracy of the lane line state by the lane line detection model at this stage meets certain requirements, and the accuracy of its predicted detection results for the lane line type or other lane line information can then be determined.
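The selection of target first loss values above the threshold can be sketched as a simple filter. The 0.5 threshold comes from the example above; returning (index, value) pairs so the corresponding predicted and marked information can be looked up is an illustrative choice.

```python
LOSS_THRESHOLD = 0.5  # the set loss threshold from the example above

def select_target_first_losses(first_loss_values):
    """Return (index, value) pairs for the first loss values above the
    threshold; these become the target first loss values, and the index
    identifies the matching predicted/marked lane line information."""
    return [(i, v) for i, v in enumerate(first_loss_values)
            if v > LOSS_THRESHOLD]
```

The second loss value (S208) would then be computed only for the selected indices.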
- The first loss value selected from the plurality of first loss values that satisfies a certain condition may be referred to as the target first loss value; the predicted lane line information to which the predicted lane line state corresponding to the target first loss value belongs may be referred to as the target predicted lane line information, and the marked lane line information to which the marked lane line state corresponding to the target first loss value belongs may be referred to as the target marked lane line information.
- S207 Determine the prediction context information included in the target predicted lane line information, and determine the labeling context information included in the target labeled lane line information.
- The prediction context information contained in the target predicted lane line information can be determined, and the marked context information contained in the target marked lane line information can be determined, and the subsequent steps are then triggered.
- S208 Determine a second loss value between the prediction context information and the target marked lane line information, and use the second loss value as the target loss value.
- A loss function can be configured for the structure of the above-mentioned improved HRNet-OCR model; the loss function can be used to fit the difference between the predicted context information and the target marked lane line information, and the obtained second loss value is used as the above-mentioned target loss value, which is not limited.
- the lane line detection model can obtain a more accurate detection and recognition effect.
- a branch structure can be added to the network structure of the HRNet-OCR model for detection and segmentation of lane line instances.
- For example, if the preset number of lane lines is 4, four lane line categories are added to the element categories of HRNet.
- With the loss of element segmentation denoted l_seg_ele, the loss of lane line detection and recognition can be added accordingly.
- The loss of lane line segmentation consists of two parts: one part is the pixel-wise loss l_seg_lane, and the other part is the binary classification loss l_exist for the existence of the 4 lane lines, where the existence label of the i-th lane line is 1 if that lane line exists and 0 otherwise.
- The total output loss of the HRNet-OCR model can then be expressed as:
- l_total = l_seg_ele + l_seg_lane + 0.1 * l_exist
- In the lane line detection and recognition stage, if the existence prediction for the i-th lane line indicates that the lane line exists, the pixel-level prediction results (including the prediction context information, the predicted lane line category, and the like) are output for it; otherwise the lane line is considered not to exist.
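The weighted total loss given above can be written directly; the 0.1 weight on the existence loss is from the text, while exposing it as a parameter is an illustrative choice.

```python
def total_loss(l_seg_ele: float, l_seg_lane: float, l_exist: float,
               exist_weight: float = 0.1) -> float:
    """l_total = l_seg_ele + l_seg_lane + 0.1 * l_exist, as in the text.

    l_seg_ele:  element segmentation loss
    l_seg_lane: pixel-wise lane line segmentation loss
    l_exist:    binary classification loss for lane line existence
    """
    return l_seg_ele + l_seg_lane + exist_weight * l_exist
```

In a framework like PyTorch each term would be a tensor produced by its own head of the improved HRNet-OCR network; the weighted sum is identical.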
- A segmentation network structure that effectively fuses element semantics and lane line instance recognition for road condition images is thus realized, improving the accuracy of element semantic segmentation and lane line instance segmentation and providing reliable lane line segmentation results for intelligent traffic and smart city systems.
- The initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple pieces of marked lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
- The predicted lane line information can improve the pertinence of the model's processing and recognition, avoid interference from other elements in lane line detection, and improve the accuracy of detection and recognition while improving the detection and recognition efficiency of the lane line detection model.
- the above-mentioned predicted lane line state, and/or the predicted context information of multiple pixels in the image area covered by the lane line, and the corresponding labeled lane line state, and/or the labeled context information can be used to generate Determining the convergence timing of the artificial intelligence model, so as to accurately determine the convergence timing of the artificial intelligence model, can effectively reduce the consumption of computing resources for model training, and ensure the detection and recognition effect of the trained lane line detection model.
- FIG. 3 is a schematic diagram of a third embodiment according to the present disclosure.
- the training device 30 of the lane line detection model includes:
- the obtaining module 301 is configured to obtain a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
- the determining module 302 is configured to determine multiple elements corresponding to the multiple sample road condition images respectively, and multiple element semantics corresponding to the multiple elements respectively.
- the training module 303 is configured to train an initial artificial intelligence model according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple marked lane line information, so as to obtain a lane line detection model.
- The training apparatus 40 of the lane line detection model includes: an acquisition module 401, a determination module 402, and a training module 403, wherein the training module 403 includes:
- the acquisition sub-module 4031 is configured to input multiple sample road condition images, multiple elements, and multiple element semantics into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model;
- the training sub-module 4032 is configured to, in response to the target loss value between the plurality of pieces of predicted lane line information and the plurality of pieces of marked lane line information meeting the set condition, use the artificial intelligence model obtained by training as the lane line detection model.
- The lane line information includes: a lane line state, and/or context information of multiple pixels in the image area covered by the lane line, where the image area is a partial image area of the sample road condition image to which the lane line belongs.
- the training sub-module 4032 is specifically used for:
- the target first loss value is selected from the plurality of first loss values, and the target predicted lane line information and the target marked lane line information corresponding to the target first loss value are determined;
- a second loss value between the prediction context information and the target labeled lane line information is determined, and the second loss value is used as the target loss value.
- the training sub-module 4032 is specifically used for:
- a first loss value greater than the set loss threshold value among the plurality of first loss values is used as the target first loss value.
- the initial artificial intelligence model includes: an element detection sub-model and a lane line detection sub-model connected in sequence, wherein the acquisition sub-module 4031 is specifically used for:
- the initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics and multiple pieces of labeled lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
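The sequentially connected element detection sub-model and lane line detection sub-model can be sketched as a two-stage pipeline. The toy sub-models and the `"lane"` semantic tag below are invented for illustration only; the real sub-models are learned networks.

```python
def predict_lane_lines(images, elements, semantics,
                       element_detector, lane_detector):
    """Two-stage sketch: the element detection sub-model first screens the
    target elements; the lane line detection sub-model then predicts lane line
    information from the target elements, their semantics and the images."""
    target_elements = element_detector(images, elements, semantics)
    # Look up the element semantics corresponding to each target element.
    target_semantics = [semantics[elements.index(e)] for e in target_elements]
    return lane_detector(target_elements, target_semantics, images)

# Toy sub-models: keep only elements whose semantics flag them as lane-related.
def toy_element_detector(images, elements, semantics):
    return [e for e, s in zip(elements, semantics) if s == "lane"]

def toy_lane_detector(target_elements, target_semantics, images):
    return [{"element": e, "semantic": s}
            for e, s in zip(target_elements, target_semantics)]

preds = predict_lane_lines(
    images=["img0"],
    elements=["curb", "marking"],
    semantics=["barrier", "lane"],
    element_detector=toy_element_detector,
    lane_detector=toy_lane_detector,
)
# preds keeps only the lane-related element together with its semantics
```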
- the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
- FIG. 5 is a block diagram of an electronic device used to implement the training method of the lane line detection model according to the embodiment of the present disclosure.
- Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
- Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
- the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
- the device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the device 500.
- the computing unit 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
- An input/output (I/O) interface 505 is also connected to the bus 504.
- Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard, a mouse, etc.; an output unit 507, such as various types of displays, speakers, etc.; a storage unit 508, such as a magnetic disk, an optical disk, etc.; and a communication unit 509, such as a network card, a modem, a wireless communication transceiver, and the like.
- the communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
- Computing unit 501 may be various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of computing unit 501 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various specialized artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processors, controllers, microcontrollers, etc.
- the computing unit 501 executes the various methods and processes described above, e.g., the training method of the lane line detection model.
- a method of training a lane line detection model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508 .
- part or all of the computer program may be loaded and/or installed on device 500 via ROM 502 and/or communication unit 509 .
- the computing unit 501 may be configured by any other suitable means (e.g., by means of firmware) to perform the training method of the lane line detection model.
- Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
- These various embodiments may include implementation in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
- Program code for implementing the training method of the lane line detection model of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, special-purpose computer or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
- the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
- a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
- the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
- Machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
- the systems and techniques described herein may be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
- Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic, voice, or tactile input).
- the systems and techniques described herein may be implemented on a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
- the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and blockchain networks.
- a computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
- the server can be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that solves the defects of difficult management and weak business scalability in traditional physical hosts and VPS ("Virtual Private Server") services.
- the server can also be a server of a distributed system, or a server combined with a blockchain.
Claims (15)
- A training method of a lane line detection model, comprising: acquiring multiple sample road condition images and multiple pieces of labeled lane line information respectively corresponding to the multiple sample road condition images; determining multiple elements respectively corresponding to the multiple sample road condition images, and multiple element semantics respectively corresponding to the multiple elements; and training an initial artificial intelligence model according to the multiple sample road condition images, the multiple elements, the multiple element semantics and the multiple pieces of labeled lane line information, so as to obtain a lane line detection model.
- The method according to claim 1, wherein the training an initial artificial intelligence model according to the multiple sample road condition images, the multiple elements, the multiple element semantics and the multiple pieces of labeled lane line information to obtain a lane line detection model comprises: inputting the multiple sample road condition images, the multiple elements and the multiple element semantics into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model; and in response to a target loss value between the multiple pieces of predicted lane line information and the multiple pieces of labeled lane line information meeting a set condition, using the artificial intelligence model obtained by training as the lane line detection model.
- The method according to claim 2, wherein the lane line information comprises: a lane line state, and/or context information of multiple pixels in an image area covered by the lane line, the image area being a local image area within the sample road condition image to which the lane line belongs.
- The method according to claim 3, wherein determining the target loss value between the multiple pieces of predicted lane line information and the multiple pieces of labeled lane line information comprises: determining multiple first loss values between the multiple predicted lane line states and the corresponding multiple labeled lane line states; selecting a target first loss value from the multiple first loss values, and determining target predicted lane line information and target labeled lane line information corresponding to the target first loss value; determining prediction context information contained in the target predicted lane line information, and determining labeled context information contained in the target labeled lane line information; and determining a second loss value between the prediction context information and the target labeled lane line information, and using the second loss value as the target loss value.
- The method according to claim 4, wherein the selecting a target first loss value from the multiple first loss values comprises: using a first loss value among the multiple first loss values that is greater than a set loss threshold as the target first loss value.
- The method according to any one of claims 2-5, wherein the initial artificial intelligence model comprises: an element detection sub-model and a lane line detection sub-model connected in sequence, and the inputting the multiple sample road condition images, the multiple elements and the multiple element semantics into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model comprises: inputting the multiple sample road condition images, the multiple elements and the multiple element semantics into the element detection sub-model to obtain target elements output by the element detection sub-model; and inputting the target elements, target element semantics corresponding to the target elements, and the multiple sample road condition images into the lane line detection sub-model to obtain multiple pieces of predicted lane line information output by the lane line detection sub-model.
- A training apparatus of a lane line detection model, comprising: an acquisition module configured to acquire multiple sample road condition images and multiple pieces of labeled lane line information respectively corresponding to the multiple sample road condition images; a determination module configured to determine multiple elements respectively corresponding to the multiple sample road condition images, and multiple element semantics respectively corresponding to the multiple elements; and a training module configured to train an initial artificial intelligence model according to the multiple sample road condition images, the multiple elements, the multiple element semantics and the multiple pieces of labeled lane line information, so as to obtain a lane line detection model.
- The apparatus according to claim 7, wherein the training module comprises: an acquisition sub-module configured to input the multiple sample road condition images, the multiple elements and the multiple element semantics into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model; and a training sub-module configured to, in response to a target loss value between the multiple pieces of predicted lane line information and the multiple pieces of labeled lane line information meeting a set condition, use the artificial intelligence model obtained by training as the lane line detection model.
- The apparatus according to claim 8, wherein the lane line information comprises: a lane line state, and/or context information of multiple pixels in an image area covered by the lane line, the image area being a local image area within the sample road condition image to which the lane line belongs.
- The apparatus according to claim 9, wherein the training sub-module is specifically configured to: determine multiple first loss values between the multiple predicted lane line states and the corresponding multiple labeled lane line states; select a target first loss value from the multiple first loss values, and determine target predicted lane line information and target labeled lane line information corresponding to the target first loss value; determine prediction context information contained in the target predicted lane line information, and determine labeled context information contained in the target labeled lane line information; and determine a second loss value between the prediction context information and the target labeled lane line information, and use the second loss value as the target loss value.
- The apparatus according to claim 10, wherein the training sub-module is specifically configured to: use a first loss value among the multiple first loss values that is greater than a set loss threshold as the target first loss value.
- The apparatus according to any one of claims 8-11, wherein the initial artificial intelligence model comprises: an element detection sub-model and a lane line detection sub-model connected in sequence, and the acquisition sub-module is specifically configured to: input the multiple sample road condition images, the multiple elements and the multiple element semantics into the element detection sub-model to obtain target elements output by the element detection sub-model; and input the target elements, target element semantics corresponding to the target elements, and the multiple sample road condition images into the lane line detection sub-model to obtain multiple pieces of predicted lane line information output by the lane line detection sub-model.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-6.
- A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to perform the method according to any one of claims 1-6.
- A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227027156A KR20220117341A (ko) | 2021-04-28 | 2022-01-29 | 차선 검출 모델의 트레이닝 방법, 장치, 전자 기기 및 저장 매체 |
US18/003,463 US20230245429A1 (en) | 2021-04-28 | 2022-01-29 | Method and apparatus for training lane line detection model, electronic device and storage medium |
JP2022580383A JP2023531759A (ja) | 2021-04-28 | 2022-01-29 | 車線境界線検出モデルの訓練方法、車線境界線検出モデルの訓練装置、電子機器、記憶媒体及びコンピュータプログラム |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110470476.1A CN113191256B (zh) | 2021-04-28 | 2021-04-28 | 车道线检测模型的训练方法、装置、电子设备及存储介质 |
CN202110470476.1 | 2021-04-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022227769A1 true WO2022227769A1 (zh) | 2022-11-03 |
Family
ID=76980425
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/075105 WO2022227769A1 (zh) | 2021-04-28 | 2022-01-29 | 车道线检测模型的训练方法、装置、电子设备及存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113191256B (zh) |
WO (1) | WO2022227769A1 (zh) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113191256B (zh) * | 2021-04-28 | 2024-06-11 | 北京百度网讯科技有限公司 | 车道线检测模型的训练方法、装置、电子设备及存储介质 |
CN113673586B (zh) * | 2021-08-10 | 2022-08-16 | 北京航天创智科技有限公司 | 融合多源高分辨率卫星遥感影像的海上养殖区域分类方法 |
CN113705513B (zh) * | 2021-09-03 | 2023-09-26 | 北京百度网讯科技有限公司 | 模型训练和车道线预测方法、电子设备和自动驾驶车辆 |
CN113705515B (zh) * | 2021-09-03 | 2024-04-12 | 北京百度网讯科技有限公司 | 语义分割模型的训练和高精地图车道线的生成方法和设备 |
CN113762397B (zh) * | 2021-09-10 | 2024-04-05 | 北京百度网讯科技有限公司 | 检测模型训练、高精度地图更新方法、设备、介质及产品 |
CN113869249B (zh) * | 2021-09-30 | 2024-05-07 | 广州文远知行科技有限公司 | 一种车道线标注方法、装置、设备及可读存储介质 |
CN113837313B (zh) * | 2021-09-30 | 2024-06-14 | 广州文远知行科技有限公司 | 车道线标注模型的训练方法、装置、设备及可读存储介质 |
CN113963011A (zh) * | 2021-10-08 | 2022-01-21 | 北京百度网讯科技有限公司 | 图像识别方法、装置、电子设备及存储介质 |
CN114677570B (zh) * | 2022-03-14 | 2023-02-07 | 北京百度网讯科技有限公司 | 道路信息更新方法、装置、电子设备以及存储介质 |
CN117593717B (zh) * | 2024-01-18 | 2024-04-05 | 武汉大学 | 一种基于深度学习的车道追踪方法及系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460072A (zh) * | 2020-04-01 | 2020-07-28 | 北京百度网讯科技有限公司 | 车道线检测方法、装置、设备和存储介质 |
CN112528878A (zh) * | 2020-12-15 | 2021-03-19 | 中国科学院深圳先进技术研究院 | 检测车道线的方法、装置、终端设备及可读存储介质 |
CN112633380A (zh) * | 2020-12-24 | 2021-04-09 | 北京百度网讯科技有限公司 | 兴趣点特征提取方法、装置、电子设备及存储介质 |
CN113191256A (zh) * | 2021-04-28 | 2021-07-30 | 北京百度网讯科技有限公司 | 车道线检测模型的训练方法、装置、电子设备及存储介质 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11068724B2 (en) * | 2018-10-11 | 2021-07-20 | Baidu Usa Llc | Deep learning continuous lane lines detection system for autonomous vehicles |
CN110084095B (zh) * | 2019-03-12 | 2022-03-25 | 浙江大华技术股份有限公司 | 车道线检测方法、车道线检测装置和计算机存储介质 |
CN111310593B (zh) * | 2020-01-20 | 2022-04-19 | 浙江大学 | 一种基于结构感知的超快速车道线检测方法 |
CN111507226B (zh) * | 2020-04-10 | 2023-08-11 | 北京觉非科技有限公司 | 道路图像识别模型建模方法、图像识别方法及电子设备 |
CN112200172B (zh) * | 2020-12-07 | 2021-02-19 | 天津天瞳威势电子科技有限公司 | 一种可行驶区域的检测方法及装置 |
CN112528864A (zh) * | 2020-12-14 | 2021-03-19 | 北京百度网讯科技有限公司 | 模型生成方法、装置、电子设备和存储介质 |
-
2021
- 2021-04-28 CN CN202110470476.1A patent/CN113191256B/zh active Active
-
2022
- 2022-01-29 WO PCT/CN2022/075105 patent/WO2022227769A1/zh active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460072A (zh) * | 2020-04-01 | 2020-07-28 | 北京百度网讯科技有限公司 | 车道线检测方法、装置、设备和存储介质 |
CN112528878A (zh) * | 2020-12-15 | 2021-03-19 | 中国科学院深圳先进技术研究院 | 检测车道线的方法、装置、终端设备及可读存储介质 |
CN112633380A (zh) * | 2020-12-24 | 2021-04-09 | 北京百度网讯科技有限公司 | 兴趣点特征提取方法、装置、电子设备及存储介质 |
CN113191256A (zh) * | 2021-04-28 | 2021-07-30 | 北京百度网讯科技有限公司 | 车道线检测模型的训练方法、装置、电子设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
CN113191256B (zh) | 2024-06-11 |
CN113191256A (zh) | 2021-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022227769A1 (zh) | 车道线检测模型的训练方法、装置、电子设备及存储介质 | |
EP3910492A2 (en) | Event extraction method and apparatus, and storage medium | |
WO2023015941A1 (zh) | 文本检测模型的训练方法和检测文本方法、装置和设备 | |
WO2022257487A1 (zh) | 深度估计模型的训练方法, 装置, 电子设备及存储介质 | |
KR20220122566A (ko) | 텍스트 인식 모델의 트레이닝 방법, 텍스트 인식 방법 및 장치 | |
CN113033622A (zh) | 跨模态检索模型的训练方法、装置、设备和存储介质 | |
WO2022257614A1 (zh) | 物体检测模型的训练方法、图像检测方法及其装置 | |
US20230073550A1 (en) | Method for extracting text information, electronic device and storage medium | |
CN114648676B (zh) | 点云处理模型的训练和点云实例分割方法及装置 | |
US20230245429A1 (en) | Method and apparatus for training lane line detection model, electronic device and storage medium | |
EP4191544A1 (en) | Method and apparatus for recognizing token, electronic device and storage medium | |
CN113361572A (zh) | 图像处理模型的训练方法、装置、电子设备以及存储介质 | |
WO2022227759A1 (zh) | 图像类别的识别方法、装置和电子设备 | |
JP2022185143A (ja) | テキスト検出方法、テキスト認識方法及び装置 | |
CN114111813B (zh) | 高精地图元素更新方法、装置、电子设备及存储介质 | |
CN113963186A (zh) | 目标检测模型的训练方法、目标检测方法及相关装置 | |
CN114972910B (zh) | 图文识别模型的训练方法、装置、电子设备及存储介质 | |
CN115482436B (zh) | 图像筛选模型的训练方法、装置以及图像筛选方法 | |
CN114220163B (zh) | 人体姿态估计方法、装置、电子设备及存储介质 | |
CN116127319A (zh) | 多模态负样本构建、模型预训练方法、装置、设备及介质 | |
CN113051926B (zh) | 文本抽取方法、设备和存储介质 | |
CN112818972B (zh) | 兴趣点图像的检测方法、装置、电子设备及存储介质 | |
CN114817476A (zh) | 语言模型的训练方法、装置、电子设备和存储介质 | |
CN114119972A (zh) | 模型获取及对象处理方法、装置、电子设备及存储介质 | |
CN113205131A (zh) | 图像数据的处理方法、装置、路侧设备和云控平台 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 20227027156 Country of ref document: KR Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22794250 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022580383 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22794250 Country of ref document: EP Kind code of ref document: A1 |