WO2022227769A1 - Method and apparatus for training lane line detection model, electronic device and storage medium - Google Patents

Method and apparatus for training lane line detection model, electronic device and storage medium Download PDF

Info

Publication number
WO2022227769A1
WO2022227769A1 PCT/CN2022/075105 CN2022075105W WO2022227769A1 WO 2022227769 A1 WO2022227769 A1 WO 2022227769A1 CN 2022075105 W CN2022075105 W CN 2022075105W WO 2022227769 A1 WO2022227769 A1 WO 2022227769A1
Authority
WO
WIPO (PCT)
Prior art keywords
lane line
model
target
road condition
elements
Prior art date
Application number
PCT/CN2022/075105
Other languages
English (en)
French (fr)
Inventor
何悦
李莹莹
谭啸
孙昊
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司 filed Critical 北京百度网讯科技有限公司
Priority to KR1020227027156A priority Critical patent/KR20220117341A/ko
Priority to US18/003,463 priority patent/US20230245429A1/en
Priority to JP2022580383A priority patent/JP2023531759A/ja
Publication of WO2022227769A1 publication Critical patent/WO2022227769A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • The present disclosure relates to the technical field of artificial intelligence, in particular to technical fields such as computer vision and deep learning, can be applied to intelligent traffic scenarios, and more particularly relates to a training method and apparatus for a lane line detection model, an electronic device and a storage medium.
  • Artificial intelligence is the discipline of studying how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking and planning), and it involves both hardware-level and software-level technologies.
  • Artificial intelligence hardware technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage and big data processing; artificial intelligence software technologies mainly include several major directions such as computer vision technology, speech recognition technology, natural language processing technology, machine learning/deep learning, big data processing technology and knowledge graph technology.
  • In the related art, the logic of semantic segmentation methods for elements in road condition images cannot be directly applied to the detection and segmentation of lane lines, and the computational complexity of lane line detection and segmentation is high and cannot meet real-time requirements.
  • a training method, device, electronic device, storage medium and computer program product for a lane line detection model are provided.
  • According to a first aspect, a method for training a lane line detection model is provided, including: acquiring a plurality of sample road condition images, and a plurality of pieces of labeled lane line information respectively corresponding to the plurality of sample road condition images; determining a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and training an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information, to obtain a lane line detection model.
  • According to a second aspect, a training apparatus for a lane line detection model is provided, including: an acquisition module configured to acquire a plurality of sample road condition images and a plurality of pieces of labeled lane line information respectively corresponding to the plurality of sample road condition images; a determination module configured to determine a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and a training module configured to train an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information, to obtain a lane line detection model.
  • According to a third aspect, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the method for training a lane line detection model according to the embodiments of the present disclosure.
  • a non-transitory computer-readable storage medium storing computer instructions, the computer instructions are used to cause the computer to execute the training method of the lane line detection model disclosed in the embodiments of the present disclosure.
  • a computer program product including a computer program that, when executed by a processor, implements the method for training a lane line detection model disclosed in the embodiments of the present disclosure.
  • FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a second embodiment according to the present disclosure.
  • FIG. 3 is a schematic diagram of a third embodiment according to the present disclosure.
  • FIG. 4 is a schematic diagram of a fourth embodiment according to the present disclosure.
  • FIG. 5 is a block diagram of an electronic device used to implement the training method of the lane line detection model according to the embodiment of the present disclosure.
  • FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure.
  • It should be noted that the execution body of the training method of the lane line detection model in the embodiments of the present disclosure is a training device of the lane line detection model; the device can be implemented by software and/or hardware and can be configured in an electronic device, and the electronic device may include, but is not limited to, a terminal, a server, and the like.
  • The embodiments of the present disclosure relate to the technical field of artificial intelligence, in particular to technical fields such as computer vision and deep learning, and can be applied to intelligent traffic scenarios, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
  • Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence.
  • Deep learning learns the inherent laws and representation levels of sample data, and the information obtained during such learning is of great help in interpreting data such as text, images and sounds.
  • the ultimate goal of deep learning is to enable machines to have the ability to analyze and learn like humans, and to recognize data such as words, images, and sounds.
  • Computer vision refers to using cameras and computers instead of human eyes to identify, track and measure targets, and further performing graphics processing so that the processed images are more suitable for human observation or for transmission to instruments for detection.
  • the training method of the lane line detection model includes:
  • S101 Acquire a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
  • the road condition image used for training the lane line detection model may be called a sample road condition image, and the road condition image may be an image captured by a camera device in the environment in an intelligent traffic scene, which is not limited.
  • In the embodiments of the present disclosure, a plurality of sample road condition images may be obtained from a sample road condition image pool, and the plurality of sample road condition images may be used to train an initial artificial intelligence model to obtain a lane line detection model.
  • the marked lane line information can be used as a reference mark when training an initial artificial intelligence model.
  • The above lane line information can be used to describe lane-line-related information in the sample road condition image, such as the lane line type, the image features corresponding to the image area of the lane line, or whether the lane line exists (whether the lane line exists may be referred to as the lane line state), or it can be any other possible lane line information, which is not limited.
  • That is, after acquiring the multiple sample road condition images and the multiple pieces of marked lane line information respectively corresponding to them, the multiple sample road condition images and the multiple pieces of marked lane line information may be combined to train the initial artificial intelligence model.
  • S102 Determine a plurality of elements corresponding to the plurality of sample road condition images respectively, and the semantics of the plurality of elements corresponding to the plurality of elements respectively.
  • After acquiring the multiple sample road condition images, image recognition may be performed on each of them to obtain the elements corresponding to each sample road condition image and the element semantics corresponding to each element. The elements can be, for example, the sky, trees and roads in the sample road condition image, and the element semantics can refer to the element type and element features of the sky, trees and roads. Since an element usually contains some pixels of the image, the context information of the pixels it contains can be used to classify the element and obtain the element semantics, which is not limited.
  • After determining the multiple elements respectively corresponding to the multiple sample road condition images and the multiple element semantics respectively corresponding to the multiple elements, the corresponding elements and element semantics in the sample road condition images, together with the multiple pieces of marked lane line information, can be used to train the initial artificial intelligence model to obtain the lane line detection model.
  • In this way, the semantic segmentation logic for elements and the detection and recognition of lane lines are applied in a fused manner: lane line instances can be detected and identified based on the element-recognition processing logic, thus avoiding reliance on anchor information of the lane lines in the road condition image, reducing the complexity of model computation and improving detection and recognition efficiency.
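  • As an illustration only, the following sketch shows one way per-image elements and simple element semantics could be derived from a generic per-pixel segmentation output; the class list, function name and mask-based features are assumptions made for illustration and are not part of the disclosure.

```python
import numpy as np

# Hypothetical element classes; the disclosure only names sky, trees and roads as examples.
ELEMENT_CLASSES = {1: "sky", 2: "tree", 3: "road"}

def extract_elements(seg_map: np.ndarray):
    """Derive elements and simple element semantics from a per-pixel class map.

    seg_map: (H, W) integer array of class ids produced by some semantic
    segmentation step (an assumed preprocessing stage, not specified here).
    Returns one dict per element found in the image.
    """
    elements = []
    for class_id, name in ELEMENT_CLASSES.items():
        mask = seg_map == class_id
        if not mask.any():
            continue
        ys, xs = np.nonzero(mask)
        elements.append({
            "element_type": name,                   # element semantics: the element type
            "pixel_count": int(mask.sum()),         # coarse context of the pixels it contains
            "bbox": (int(xs.min()), int(ys.min()),  # where the element sits in the image
                     int(xs.max()), int(ys.max())),
        })
    return elements
```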
  • S103 Train an initial artificial intelligence model according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple labeled lane line information to obtain a lane line detection model.
  • The initial artificial intelligence model can be, for example, a neural network model, a machine learning model or a graph neural network model; any other possible model capable of performing image recognition and analysis tasks can also be used, which is not limited.
  • After acquiring the multiple sample road condition images, the multiple elements, the multiple element semantics and the multiple pieces of labeled lane line information, the multiple sample road condition images, the multiple elements and the multiple element semantics can be correspondingly input into the above-mentioned neural network model, machine learning model or graph neural network model, so as to obtain the predicted lane line information output by any of the aforementioned models. The predicted lane line information can be the lane line information predicted by any of the aforementioned models, based on the model's algorithmic processing logic, in combination with the elements and element semantics in the sample road condition image.
  • In some embodiments, when training the initial artificial intelligence model, the multiple sample road condition images, the multiple elements and the multiple element semantics may be input into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model. The convergence timing of the artificial intelligence model is then determined according to the multiple pieces of predicted lane line information and the multiple pieces of labeled lane line information; that is, in response to the target loss value between the multiple pieces of predicted lane line information and the multiple pieces of labeled lane line information satisfying a set condition, the artificial intelligence model obtained by training is used as the lane line detection model. This makes it possible to determine the convergence timing of the model in a timely manner and enables the trained lane line detection model to effectively model the image features of lane lines in intelligent traffic scenes, which effectively improves the efficiency of lane line detection and recognition, so that the trained lane line detection model can effectively meet application scenarios with high real-time requirements.
  • the number of the above target loss values may be one or more, and the loss value between the predicted lane line information and the marked lane line information may be referred to as the target loss value.
  • In other embodiments, any other possible method may also be used to determine the convergence timing of the initial artificial intelligence model; when the artificial intelligence model satisfies certain convergence conditions, the artificial intelligence model obtained by training is used as the lane line detection model.
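  • A minimal training-loop sketch of this convergence check is given below, assuming a PyTorch-style model that takes a sample image together with its elements and element semantics and returns predicted lane line information; the model interface, loss function and threshold are placeholders rather than the disclosed implementation.

```python
def train_lane_model(model, dataloader, optimizer, loss_fn,
                     loss_threshold=0.05, max_epochs=100):
    """Train until the target loss between the predicted and labeled lane line
    information satisfies the set condition (here: mean loss below a threshold)."""
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, elements, element_semantics, labeled_lane_info in dataloader:
            # Predicted lane line information output by the (initial) AI model.
            predicted_lane_info = model(images, elements, element_semantics)
            # Target loss value between predicted and labeled lane line information.
            target_loss = loss_fn(predicted_lane_info, labeled_lane_info)
            optimizer.zero_grad()
            target_loss.backward()
            optimizer.step()
            epoch_loss += target_loss.item()
        # Convergence timing: the set condition on the target loss is met.
        if epoch_loss / len(dataloader) < loss_threshold:
            break
    return model  # the trained model is then used as the lane line detection model
```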
  • the initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics and multiple labeled lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
  • FIG. 2 is a schematic diagram of a second embodiment according to the present disclosure.
  • the training method of the lane line detection model includes:
  • S201 Acquire a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
  • S202 Determine a plurality of elements corresponding to the plurality of sample road condition images respectively, and the semantics of a plurality of elements corresponding to the plurality of elements respectively.
  • S203 Input multiple sample road condition images, multiple elements, and multiple element semantics into the element detection sub-model to obtain target elements output by the element detection sub-model.
  • the initial artificial intelligence model may include: an element detection sub-model and a lane line detection sub-model connected in sequence, so that when training the initial artificial intelligence model, multiple sample road condition images, multiple elements, and the semantics of multiple elements are input into the element detection sub-model to obtain target elements output by the element detection sub-model, and the target elements can be used to assist in the detection and identification of lane line instances.
  • The above-mentioned element detection sub-model can be used for image feature extraction and can be regarded as a pre-training model for lane line instance segmentation. In a case where the acquired multiple sample road condition images constitute the Cityscapes dataset, the backbone network of the element detection sub-model can be trained for feature extraction on each sample road condition image in the Cityscapes dataset, so as to identify the elements and the corresponding element semantics from each sample road condition image.
  • The above-mentioned element detection sub-model may specifically be a deep high-resolution representation learning for visual recognition–object contextual representation model (Deep High-Resolution Representation Learning for Visual Recognition, Object Contextual Representation, HRNet-OCR), which is not limited. That is, the backbone network of the HRNet-OCR model can be used for image feature extraction; the embodiments of the present disclosure can then improve the structure of the HRNet-OCR model and train the improved HRNet-OCR model so that it realizes the fused application of the element semantic segmentation logic and the detection and recognition of lane lines.
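  • As a structural illustration of such an improvement, the sketch below adds a lane line branch alongside an element segmentation head on top of a generic feature-extraction backbone; the backbone interface, channel counts and head designs are assumptions and do not reproduce the actual HRNet-OCR code.

```python
import torch.nn as nn

class ElementAndLaneNet(nn.Module):
    """A generic feature-extraction backbone with an element segmentation head,
    plus an added branch for lane line pixels and lane line existence."""

    def __init__(self, backbone: nn.Module, feat_channels: int,
                 num_element_classes: int, num_lanes: int = 4):
        super().__init__()
        self.backbone = backbone                                    # e.g. an HRNet-style feature extractor
        self.element_head = nn.Conv2d(feat_channels, num_element_classes, 1)
        # Added branch: per-pixel scores for each preset lane line (+ background).
        self.lane_head = nn.Conv2d(feat_channels, num_lanes + 1, 1)
        # Added branch: one existence logit per preset lane line.
        self.exist_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_channels, num_lanes))

    def forward(self, images):
        feats = self.backbone(images)              # (B, feat_channels, H', W')
        return (self.element_head(feats),          # element semantic segmentation
                self.lane_head(feats),             # lane line pixel predictions
                self.exist_head(feats))            # lane line existence logits
```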
  • It can be understood that, since lane lines are usually marked on the road surface, the target element can specifically be an element whose element type is a road type. Because the target element is identified first, and then the target element, the target element semantics corresponding to the target element and the multiple sample road condition images are input into the lane line detection sub-model to obtain the multiple pieces of predicted lane line information output by the lane line detection sub-model, the pertinence of the model's processing and recognition can be improved and interference from other elements to lane line detection can be avoided, which improves the accuracy of detection and recognition while helping to improve the detection and recognition efficiency of the lane line detection model.
  • S204 Input the target element, the semantics of the target element corresponding to the target element, and multiple sample road condition images into the lane line detection sub-model to obtain multiple predicted lane line information output by the lane line detection sub-model.
  • After the above-mentioned element detection sub-model processes the multiple sample road condition images, the multiple elements and the multiple element semantics to output the target element, the target element, the target element semantics corresponding to the target element and the multiple sample road condition images can be input into the lane line detection sub-model to obtain the multiple pieces of predicted lane line information output by the lane line detection sub-model.
  • the element semantics corresponding to the target elements may be called target element semantics. If the target element is an element whose element type is a road type, the target element semantics may be a road type and an image feature corresponding to the road.
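  • A minimal sketch of this sequential connection is given below: an element detection sub-model proposes the target (road) element, and a lane line detection sub-model predicts lane line information from the target element region together with the original image. The module names, the road class id and the masking scheme are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class TwoStageLaneDetector(nn.Module):
    """Sequentially connected sub-models: an element detection sub-model proposes
    the target (road) element, and a lane line detection sub-model predicts lane
    line information from the road region plus the original image."""

    def __init__(self, element_submodel: nn.Module, lane_submodel: nn.Module,
                 road_class_id: int = 3):
        super().__init__()
        self.element_submodel = element_submodel
        self.lane_submodel = lane_submodel
        self.road_class_id = road_class_id  # assumed class id of the road element type

    def forward(self, images: torch.Tensor):
        # Element detection: per-pixel element logits (element semantics).
        element_logits = self.element_submodel(images)       # (B, C, H, W)
        element_map = element_logits.argmax(dim=1)            # element id per pixel
        # Target element: keep only the road region to reduce interference.
        road_mask = (element_map == self.road_class_id).unsqueeze(1).float()
        # Lane line detection on the image restricted to the target element.
        lane_inputs = torch.cat([images * road_mask, road_mask], dim=1)
        return self.lane_submodel(lane_inputs)                 # predicted lane line info
```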
  • the above-mentioned predicted lane line information may specifically refer to the predicted lane line state, and/or the predicted context information of multiple pixels in the image area covered by the lane line.
  • the marked lane line information may specifically refer to the marked lane line state, and/or the marked context information of multiple pixels in the image area covered by the lane line.
  • The above lane line state may refer to the lane line being present or absent, and the context information can be used to characterize the pixel features corresponding to each pixel in the image area covered by the lane line, as well as the relative relationships, in terms of image feature dimensions, between each pixel and other pixels (for example, relative position relationships, relative depth relationships, etc.), which is not limited.
  • Thus, in the embodiments of the present disclosure, the above-mentioned predicted lane line state and/or the predicted context information of multiple pixels in the image area covered by the lane line can be combined with the corresponding labeled lane line state and/or the labeled context information to determine the convergence timing of the artificial intelligence model. Accurately determining the convergence timing in this way can effectively reduce the computing resources consumed by model training and guarantee the detection and recognition effect of the trained lane line detection model.
  • S205 Determine a plurality of predicted lane line states and a plurality of first loss values between the corresponding plurality of marked lane line states.
  • After determining the multiple predicted lane line states and the corresponding multiple labeled lane line states, the loss value between each predicted lane line state and the corresponding labeled lane line state can be determined and used as a first loss value. The first loss value can be used to characterize the loss of the lane line detection model in predicting the lane line state.
  • S206 Select the target first loss value from among the plurality of first loss values, and determine the target predicted lane line information and the target marked lane line information corresponding to the target first loss value.
  • In some embodiments, a first loss value that is greater than a set loss threshold among the plurality of first loss values may be used as the target first loss value; that is, when a first loss value is greater than the set loss threshold, it indicates that the predicted lane line state is closer to the marked lane line state, reflecting that the model at this stage already has relatively accurate state recognition results, so that the determination of the loss value is more in line with the detection logic of the actual model, ensuring the practicality and rationality of the method.
  • For example, if the loss threshold is set to 0.5, then when a first loss value is greater than 0.5, it can be shown that the lane line detection model at this stage meets certain requirements for the detection accuracy of the lane line state, and its predicted detection results for the lane line type or other lane line information can then be determined.
  • The first loss value selected from the plurality of first loss values that satisfies a certain condition may be referred to as the target first loss value; the predicted lane line information to which the predicted lane line state corresponding to the target first loss value belongs may be referred to as target predicted lane line information, and the labeled lane line information to which the labeled lane line state corresponding to the target first loss value belongs may be referred to as target labeled lane line information.
  • S207 Determine the prediction context information included in the target predicted lane line information, and determine the labeling context information included in the target labeled lane line information.
  • After determining the target predicted lane line information and the target labeled lane line information corresponding to the target first loss value, the predicted context information contained in the target predicted lane line information can be determined, the labeled context information contained in the target labeled lane line information can be determined, and then the subsequent steps are triggered.
  • S208 Determine a second loss value between the prediction context information and the target marked lane line information, and use the second loss value as the target loss value.
  • For example, a loss function can be configured for the structure of the above-mentioned improved HRNet-OCR model; the loss function can be used to fit the difference between the predicted context information and the target labeled lane line information, and the obtained second loss value is used as the above-mentioned target loss value, which is not limited.
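  • The sketch below illustrates this two-stage target-loss computation (first loss values over lane line states, threshold-based selection, then a second loss over the selected context information); the tensor shapes and the specific binary cross-entropy and mean-squared-error choices are assumptions, not the disclosed loss definitions.

```python
import torch
import torch.nn.functional as F

def two_stage_target_loss(pred_states, labeled_states,
                          pred_contexts, labeled_lane_infos,
                          loss_threshold=0.5):
    """Sketch of the two-stage target-loss computation described above.

    pred_states / labeled_states: (N,) per-lane existence predictions (in [0, 1])
    and {0, 1} labels; pred_contexts / labeled_lane_infos: (N, D) context
    information and labeled lane line information for the same N lane lines.
    """
    # First loss values, one per predicted/labeled lane line state pair.
    first_losses = F.binary_cross_entropy(pred_states, labeled_states.float(),
                                          reduction="none")
    # Target first losses: those greater than the set loss threshold.
    selected = first_losses > loss_threshold
    if not selected.any():
        return None  # no target loss contributed this step
    # Second loss between the selected predicted context information and the
    # corresponding target labeled lane line information; used as the target loss.
    return F.mse_loss(pred_contexts[selected], labeled_lane_infos[selected])
```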
  • That is to say, the embodiments of the present disclosure support using loss values of multiple dimensions to determine the convergence timing of the artificial intelligence model: only when the first loss value determined based on the lane line state satisfies a certain condition is the corresponding second loss value determined based on the predicted context information and the target labeled lane line information and used as the target loss value for determining the convergence timing. This can effectively improve the accuracy of the loss fitting, so that when the convergence timing of the model is determined based on this target loss value, the lane line detection model can obtain a more accurate detection and recognition effect.
  • a branch structure can be added to the network structure of the HRNet-OCR model for detection and segmentation of lane line instances.
  • The preset number of lane lines is 4; therefore, 4 is added to the number of element categories as the total number of categories output by the HRNet-OCR model. With the element segmentation loss denoted l_seg_ele, a loss for lane line detection and recognition can be added accordingly. The lane line segmentation loss consists of two parts: one part is the pixel loss l_seg_lane, and the other part is the binary classification loss l_exist over whether each of the 4 lane lines exists, whose target for the i-th lane line is set according to whether that lane line exists. The total output loss of the HRNet-OCR model can then be expressed as:
  • l_total = l_seg_ele + l_seg_lane + 0.1 * l_exist;
  • In the lane line detection and recognition stage, when the existence prediction for the i-th lane line indicates that the lane line exists, the result of its pixel prediction (including the predicted context information, the predicted lane line category, etc.) is output; otherwise, the lane line is considered not to exist.
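  • A hedged sketch of this combined loss is shown below; the head output shapes and the cross-entropy / binary-cross-entropy choices for l_seg_ele, l_seg_lane and l_exist are assumptions for illustration, while the 0.1 weighting follows the expression above.

```python
import torch.nn.functional as F

def total_loss(element_logits, element_labels,
               lane_logits, lane_labels,
               exist_logits, exist_labels):
    """l_total = l_seg_ele + l_seg_lane + 0.1 * l_exist (weighting from the text above).

    element_logits: (B, C_ele, H, W) element segmentation scores
    lane_logits:    (B, 4 + 1, H, W) per-pixel scores for the 4 preset lane lines + background
    exist_logits:   (B, 4) existence scores for the 4 preset lane lines
    element_labels / lane_labels are (B, H, W) class-id maps and exist_labels is a
    (B, 4) {0, 1} tensor; all shapes are assumptions for illustration.
    """
    l_seg_ele = F.cross_entropy(element_logits, element_labels)    # element segmentation loss
    l_seg_lane = F.cross_entropy(lane_logits, lane_labels)         # lane line pixel loss
    l_exist = F.binary_cross_entropy_with_logits(exist_logits,     # binary existence loss
                                                 exist_labels.float())
    return l_seg_ele + l_seg_lane + 0.1 * l_exist
```

  • At inference time, the per-lane existence scores can gate the per-pixel lane line predictions, matching the behavior described above: a lane line's pixel prediction results are only output when its existence score indicates that the lane line exists.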
  • Thus, in the embodiments of the present disclosure, a segmentation network structure that effectively fuses element semantics and lane line instance recognition for road condition images is realized, thereby improving the accuracy of element semantics and lane line instance segmentation and providing reliable lane line segmentation results for intelligent transportation and smart city systems.
  • the initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics and multiple labeled lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
  • Since the target element is identified first, and then the target element, its corresponding target element semantics and the multiple sample road condition images are input into the lane line detection sub-model to obtain the multiple pieces of predicted lane line information, the pertinence of the model's processing and recognition can be improved, interference from other elements on lane line detection can be avoided, and the accuracy of detection and recognition is improved while the detection and recognition efficiency of the lane line detection model is also improved.
  • In the embodiments of the present disclosure, the above-mentioned predicted lane line state and/or the predicted context information of multiple pixels in the image area covered by the lane line can be combined with the corresponding labeled lane line state and/or the labeled context information to determine the convergence timing of the artificial intelligence model, so that the convergence timing can be accurately determined, the computing resources consumed by model training can be effectively reduced, and the detection and recognition effect of the trained lane line detection model is guaranteed.
  • FIG. 3 is a schematic diagram of a third embodiment according to the present disclosure.
  • the training device 30 of the lane line detection model includes:
  • the obtaining module 301 is configured to obtain a plurality of sample road condition images and a plurality of marked lane line information corresponding to the plurality of sample road condition images respectively.
  • the determining module 302 is configured to determine multiple elements corresponding to the multiple sample road condition images respectively, and multiple element semantics corresponding to the multiple elements respectively.
  • the training module 303 is configured to train an initial artificial intelligence model according to multiple sample road condition images, multiple elements, multiple element semantics, and multiple marked lane line information, so as to obtain a lane line detection model.
  • In some embodiments of the present disclosure, as shown in FIG. 4, which is a schematic diagram of a fourth embodiment of the present disclosure, the training device 40 of the lane line detection model includes: an acquisition module 401, a determination module 402 and a training module 403, wherein the training module 403 includes:
  • an acquisition sub-module 4031, configured to input the multiple sample road condition images, the multiple elements and the multiple element semantics into the initial artificial intelligence model to obtain multiple pieces of predicted lane line information output by the artificial intelligence model; and
  • a training sub-module 4032, configured to, in response to the target loss value between the multiple pieces of predicted lane line information and the multiple pieces of marked lane line information satisfying a set condition, use the artificial intelligence model obtained by training as the lane line detection model.
  • In some embodiments of the present disclosure, the lane line information includes: a lane line state, and/or context information of multiple pixels in the image area covered by the lane line, the image area being a local image area within the sample road condition image to which the lane line belongs.
  • the training sub-module 4032 is specifically used for:
  • determining multiple first loss values between the multiple predicted lane line states and the corresponding multiple marked lane line states;
  • selecting a target first loss value from the multiple first loss values, and determining the target predicted lane line information and the target marked lane line information corresponding to the target first loss value;
  • determining the predicted context information contained in the target predicted lane line information, and determining the labeled context information contained in the target marked lane line information; and
  • determining a second loss value between the predicted context information and the target marked lane line information, and using the second loss value as the target loss value.
  • the training sub-module 4032 is specifically used for:
  • a first loss value greater than the set loss threshold value among the plurality of first loss values is used as the target first loss value.
  • In some embodiments of the present disclosure, the initial artificial intelligence model includes: an element detection sub-model and a lane line detection sub-model connected in sequence, wherein the acquisition sub-module 4031 is specifically configured to: input the multiple sample road condition images, the multiple elements and the multiple element semantics into the element detection sub-model to obtain a target element output by the element detection sub-model; and input the target element, the target element semantics corresponding to the target element and the multiple sample road condition images into the lane line detection sub-model to obtain the multiple pieces of predicted lane line information output by the lane line detection sub-model.
  • the initial artificial intelligence model is trained according to multiple sample road condition images, multiple elements, multiple element semantics and multiple labeled lane line information to obtain a lane line detection model, which can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 5 is a block diagram of an electronic device used to implement the training method of the lane line detection model according to the embodiment of the present disclosure.
  • Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • The device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. Various programs and data required for the operation of the device 500 can also be stored in the RAM 503.
  • the computing unit 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
  • An input/output (I/O) interface 505 is also connected to the bus 504.
  • Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506, such as a keyboard, a mouse, etc.; an output unit 507, such as various types of displays, speakers, etc.; a storage unit 508, such as a magnetic disk, an optical disk, etc.; and a communication unit 509, such as a network card, a modem, a wireless communication transceiver, and the like.
  • the communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • The computing unit 501 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any appropriate processors, controllers, microcontrollers, etc.
  • the computing unit 501 executes the various methods and processes described above, eg, a training method of a lane line detection model.
  • a method of training a lane line detection model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 508 .
  • In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the training method of the lane line detection model described above may be performed.
  • Alternatively, in other embodiments, the computing unit 501 may be configured to perform the training method of the lane line detection model by any other suitable means (e.g., by means of firmware).
  • Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and may receive data and instructions from a storage system, at least one input device and at least one output device, and transmit data and instructions to the storage system, the at least one input device and the at least one output device.
  • Program code for implementing the training method of the lane line detection model of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus, such that, when the program codes are executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented.
  • the program code may execute entirely on the machine, partly on the machine, partly on the machine and partly on a remote machine as a stand-alone software package or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (eg, visual feedback, auditory feedback, or tactile feedback); and can be in any form (including acoustic input, voice input, or tactile input) to receive input from the user.
  • The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end components, middleware components or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and blockchain networks.
  • a computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • The server can be a cloud server, also known as a cloud computing server or a cloud host; it is a host product in the cloud computing service system that solves the defects of difficult management and weak business scalability existing in traditional physical host and VPS ("Virtual Private Server", or "VPS" for short) services.
  • the server can also be a server of a distributed system, or a server combined with a blockchain.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present disclosure provides a training method and apparatus for a lane line detection model, an electronic device and a storage medium, and relates to the field of artificial intelligence technology, in particular to technical fields such as computer vision and deep learning, and can be applied to intelligent traffic scenarios. The specific implementation scheme is: acquiring a plurality of sample road condition images, and a plurality of pieces of labeled lane line information respectively corresponding to the plurality of sample road condition images; determining a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and training an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information, to obtain a lane line detection model. This can effectively reduce the computational complexity of lane line detection and recognition in road condition images, improve the efficiency of lane line detection and recognition, and improve the detection and recognition effect of lane lines.

Description

车道线检测模型的训练方法、装置、电子设备及存储介质
相关申请的交叉引用
本公开基于申请号为202110470476.1、申请日为2021年04月28日的中国专利申请提出,并要求该中国专利申请的优先权,该中国专利申请的全部内容在此引入本公开作为参考。
技术领域
本公开涉及人工智能技术领域,具体涉及计算机视觉、深度学习等技术领域,可应用于智能交通场景下,尤其涉及车道线检测模型的训练方法、装置、电子设备及存储介质。
背景技术
人工智能是研究使计算机来模拟人的某些思维过程和智能行为(如学习、推理、思考、规划等)的学科,既有硬件层面的技术也有软件层面的技术。人工智能硬件技术一般包括如传感器、专用人工智能芯片、云计算、分布式存储、大数据处理等技术;人工智能软件技术主要包括计算机视觉技术、语音识别技术、自然语言处理技术以及机器学习/深度学习、大数据处理技术、知识图谱技术等几大方向。
相关技术中,针对路况图像之中要素的语义分割方法逻辑,无法直接应用到车道线的检测分割中,车道线检测分割的计算复杂度高,无法满足实时性的要求。
发明内容
提供了一种车道线检测模型的训练方法、装置、电子设备、存储介质及计算机程序产品。
根据第一方面,提供了一种车道线检测模型的训练方法,包括:获取多个样本路况图像,和与所述多个样本路况图像分别对应的多个标注的车道线信息;确定与所述多个样本路况图像分别对应的多个要素,以及与所述多个要素分别对应的多个要素语义;以及根据所述多个样本路况图像、所述多个要素、所述多个要素语义以及所述多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型。
根据第二方面,提供了一种车道线检测模型的训练装置,包括:获取模块,用于获取多个样本路况图像,和与所述多个样本路况图像分别对应的多个标注的车道线信息;确定模块,用于确定与所述多个样本路况图像分别对应的多个要素,以及与所述多个要素分别对应的多个要素语义;以及训练模块,用于根据所述多个样本路况图像、所述多个要素、所述多个要素语义以及所述多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型。
根据第三方面,提供了一种电子设备,包括:至少一个处理器;以及与所述至少一个处理器通信连接的存储器;其中,所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器能够执行本公开实施例的车道线检测模型的训练方法。
根据第四方面,提出了一种存储有计算机指令的非瞬时计算机可读存储介质,所述计算机指令用于使所述计算机执行本公开实施例公开的车道线检测模型的训练方法。
根据第五方面,提出了一种计算机程序产品,包括计算机程序,当所述计算机程序由处理器执行时实现本公开实施例公开的车道线检测模型的训练方法。
应当理解,本部分所描述的内容并非旨在标识本公开的实施例的关键或重要特征, 也不用于限制本公开的范围。本公开的其它特征将通过以下的说明书而变得容易理解。
附图说明
附图用于更好地理解本方案,不构成对本公开的限定。其中:
图1是根据本公开第一实施例的示意图。
图2是根据本公开第二实施例的示意图。
图3是根据本公开第三实施例的示意图。
图4是根据本公开第四实施例的示意图。
图5是用来实现本公开实施例的车道线检测模型的训练方法的电子设备的框图。
具体实施方式
以下结合附图对本公开的示范性实施例做出说明,其中包括本公开实施例的各种细节以助于理解,应当将它们认为仅仅是示范性的。因此,本领域普通技术人员应当认识到,可以对这里描述的实施例做出各种改变和修改,而不会背离本公开的范围和精神。同样,为了清楚和简明,以下的描述中省略了对公知功能和结构的描述。
图1是根据本公开第一实施例的示意图。
其中,需要说明的是,本公开实施例的车道线检测模型的训练方法的执行主体为车道线检测模型的训练装置,该装置可以由软件和/或硬件的方式实现,该装置可以配置在电子设备中,电子设备可以包括但不限于终端、服务器端等。
本公开实施例涉及人工智能技术领域,具体涉及计算机视觉、深度学习等技术领域,可应用于智能交通场景下,能够有效降低路况图像中车道线检测识别的计算复杂度,提升车道线检测识别的效率,提升车道线的检测识别效果。
其中,人工智能(Artificial Intelligence),英文缩写为AI。它是研究、开发用于模拟、延伸和扩展人的智能的理论、方法、技术及应用系统的一门新的技术科学。
深度学习是学习样本数据的内在规律和表示层次,这些学习过程中获得的信息对诸如文字,图像和声音等数据的解释有很大的帮助。深度学习的最终目标是让机器能够像人一样具有分析学习能力,能够识别文字、图像和声音等数据。
计算机视觉,指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉,并进一步做图形处理,使电脑处理成为更适合人眼观察或传送给仪器检测的图像。
如图1所示,该车道线检测模型的训练方法,包括:
S101:获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息。
其中,用于训练车道线检测模型的路况图像,可以被称为样本路况图像,而路况图像,可以是智能交通场景下,环境中的摄像装置捕获到的图像,对此不做限制。
本公开实施例中,可以从样本路况图像池中获取多个样本路况图像,该多个样本路况图像可以被用于训练初始的人工智能模型以得到人体属性检测模型。
上述在获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息,该标注的车道线信息,可以被用于训练初始的人工智能模型时作为参考标注。
上述的车道线信息,可以被用于描述样本路况图像中的车道线相关的信息,例如车道线类型、车道线的图像区域对应的图像特征,或者车道线是否存在(车道线是否存在可以被称为车道线状态),或者,也可以为其它任意可能的车道线信息,对此不做限制。
也即是说,本公开实施例中在获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息之后,可以结合多个样本路况图像和多个标注的车道线信息来训练初始的人工智能模型。
S102:确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义。
上述在获取多个样本路况图像之后,可以对多个样本路况图像分别进行图像识别,得到与每个样本路况图像分别对应的要素,以及与每个要素分别对应的要素语义,其中,要素可以例如为样本路况图像中的天空、树木、道路等等,要素语义可以是指天空、树木、道路的要素类型和要素特征,而通常要素包含了图像中的部分像素,则可以通过其所包含像素的上下文信息,来对要素分类得到要素语义,对此不做限制。
上述确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义之后,可以基于样本路况图像中相应的要素和要素语义,以及多个标注的车道线信息来训练初始的人工智能模型,以得到车道线检测模型。
也即是说,本公开实施例中,由于在训练车道线检测模型时,是针对多个样本路况图像分别进行图像解析,确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义,而后基于样本路况图像中相应的要素和要素语义,以及多个标注的车道线信息来训练初始的人工智能模型,从而实现要素的语义分割方法逻辑和车道线的检测识别的融合应用,基于要素识别的处理逻辑即可以检测识别出车道线实例,从而避免依赖路况图像中车道线的锚框anchor信息,降低模型计算的复杂度,提升检测识别效率。
S103:根据多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型。
其中,初始的人工智能模型可以例如为神经网络模型、机器学习模型,或者也可以是图神经网络模型,当然,也可以采用其它任意可能的能够执行图像识别解析任务的模型,对此不做限制。
上述在获取多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息之后,可以将多个样本路况图像、多个要素、多个要素语义分别对应地输入至上述的神经网络模型,或者机器学习模型,或者图神经网络模型之中,从而得到前述任一种模型输出的预测的车道线信息,该预测的车道线信息,可以是前述任一种模型基于模型算法处理逻辑,结合样本路况图像之中的要素和要素语义预测得到的车道线信息。
一些实施例中,在训练初始的人工智能模型时,可以是将多个样本路况图像、多个要素以及多个要素语义输入至初始的人工智能模型之中,以得到人工智能模型输出的多个预测的车道线信息,而后,根据多个预测的车道线信息和多个标注的车道线信息来确定人工智能模型的收敛时机,即,响应于多个预测的车道线信息和多个标注的车道线信息之间的目标损失值满足设定条件,则将训练得到的人工智能模型作为车道线检测模型,能够及时地确定出模型的收敛时机,并且使得训练得到的车道线检测模型能够有效地建模出智能交通场景中的车道线的图像特征,能够有效地提升车道线检测模型的车道线检测识别的效率,从而使得训练得到的车道线检测模型能够有效地满足实时性要求较高的应用场景。
上述的目标损失值的数量可以是一个或者多个,预测的车道线信息和标注的车道线信息之间的损失值,可以被称为目标损失值。
另外一些实施例中,也可以采用其他任意可能的方式来确定初始的人工智能模型的收敛时机,直至人工智能模型满足一定的收敛条件时,将训练得到的人工智能模型作为车道线检测模型。
本公开实施例中,通过获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息,并确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义,以及根据多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型,能够有效降低路况图像中车道线检测识别的计算复杂度,提升车道线检测识别的效率,提升车道线的检测识别效果。
图2是根据本公开第二实施例的示意图。
如图2所示,该车道线检测模型的训练方法包括:
S201:获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线 信息。
S202:确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义。
S201-S202的描述说明可以具体参见上述实施例,在此不再赘述。
S203:将多个样本路况图像、多个要素以及多个要素语义输入至要素检测子模型之中,以得到要素检测子模型输出的目标要素。
本公开实施例中,初始的人工智能模型可以包括:顺序连接的要素检测子模型和车道线检测子模型,从而在训练初始的人工智能模型时,可以将多个样本路况图像、多个要素以及多个要素语义输入至要素检测子模型之中,以得到要素检测子模型输出的目标要素,该目标要素可以被用于辅助进行车道线实例的检测识别。
上述的要素检测子模型可以被用于进行图像特征提取,可以被视为车道线实例分割的预训练模型,在上述获取的多个样本路况图像构成了城市景观Cityscapes数据集情况下,而后,可以训练要素检测子模型的骨干网络backbone用于Cityscapes数据集中各个样本路况图像特征提取,以从各个样本路况图像之中识别出要素和相应的要素语义。
上述的要素检测子模型可以具体是用于视觉识别的深度高分辨率表示学习-对象上下文表示模型(Deep High-Resolution Representation Learning for Visual Recognition Object Contextual Representation,HRNet-OCR),对此不做限制,即可以采用HRNet-OCR模型的骨干网络进行图像特征提取,而后,本公开实施例可以改进HRNet-OCR模型的结构,并训练改进后HRNet-OCR模型,使其实现要素的语义分割方法逻辑和车道线的检测识别的融合应用。
可以理解的是,由于车道线通常标识在道路表面上,则本公开实施例中,可以支持基于要素检测子模型处理多个样本路况图像、多个要素以及多个要素语义,以输出目标要素,该目标要素可以具体是要素类型为道路类型的要素,由于是首先识别出目标要素,而后,将目标要素,和与目标要素对应的目标要素语义,以及多个样本路况图像输入至车道线检测子模型之中,以得到车道线检测子模型输出的多个预测的车道线信息,从而能够提升模型处理识别的针对性,避免其它要素对车道线检测所带来的干扰,在辅助提升车道线检测模型的检测识别效率的同时,提升了检测识别的准确性。
S204:将目标要素,和与目标要素对应的目标要素语义,以及多个样本路况图像输入至车道线检测子模型之中,以得到车道线检测子模型输出的多个预测的车道线信息。
上述基于要素检测子模型处理多个样本路况图像、多个要素以及多个要素语义,以输出目标要素之后,可以将目标要素,和与目标要素对应的目标要素语义,以及多个样本路况图像输入至车道线检测子模型之中,以得到车道线检测子模型输出的多个预测的车道线信息。
其中,与目标要素对应的要素语义,可以被称为目标要素语义,在目标要素是要素类型为道路类型的要素情况下,则目标要素语义可以是道路类型和道路对应的图像特征等。
上述预测的车道线信息,可以具体是指预测的车道线状态,和/或车道线覆盖的图像区域中多个像素的预测上下文信息。
相应的,标注的车道线信息,可以具体是指标注的车道线状态,和/或车道线覆盖的图像区域中多个像素的标注上下文信息。
也即是说,针对每个样本路况图像,会对应存在有相应的标注的车道线状态,和/或标注上下文信息,针对每个样本路况图像,会对应存在有人工智能模型输出的预测的车道线状态,和/或预测上下文信息。
上述的车道线状态可以是指车道线存在、车道线不存在,而上下文信息,可以用于表征车道线覆盖的图像区域中各个像素对应的像素特征,以及各个像素与其它像素之间基于图像特征维度的相对关系(例如,相对位置关系,相对深度关系等),对此不做限制。
从而本公开实施例中,能够结合上述的预测的车道线状态,和/或车道线覆盖的图像区 域中多个像素的预测上下文信息,以及相应的标注的车道线状态,和/或标注上下文信息来确定人工智能模型的收敛时机,从而实现准确地确定出人工智能模型的收敛时机,能够有效降低模型训练的运算资源消耗,且保障了训练得到的车道线检测模型的检测识别效果。
S205:确定多个预测的车道线状态,和相应多个标注的车道线状态之间的多个第一损失值。
上述在确定多个预测的车道线状态,和相应多个标注的车道线状态之后,可以确定各个预测的车道线状态,与相应的标注的车道线状态之间的损失值,并作为第一损失值,该第一损失值能够用于表征车道线检测模型对车道线状态预测的损失差异情况。
S206:从多个第一损失值之中选取出目标第一损失值,并确定目标第一损失值所对应的目标预测车道线信息和目标标注车道线信息。
一些实施例中,可以将多个第一损失值之中大于设定损失阈值的第一损失值作为目标第一损失值,也即是说,当第一损失值之中大于设定损失阈值时,表明预测的车道线状态更为接近标注的车道线状态,从而反映出此时的模型已具备较为准确的状态识别结果,使得损失值的确定更为符合实际模型的检测逻辑,保障方法的实用性和合理性。
设定损失阈值例如为0.5,则在第一损失值之中大于0.5的情况下,可以表明此时阶段的车道线检测模型对于车道线状态的检测准确率符合了一定的需求,进而,确定其对于车道线类型或者其他车道线信息的预测检测结果。
上述从多个第一损失值之中选取出的满足一定条件的第一损失值,可以被称为第一目标损失值,与第一目标损失值对应的预测的车道线状态所属的预测的车道线信息,可以被称为目标预测车道线信息,与第一目标损失值对应的标注的车道线状态所属的标注的车道线信息,可以被称为目标标注车道线信息。
S207:确定目标预测车道线信息包含的预测上下文信息,并确定目标标注车道线信息包含的标注上下文信息。
上述在确定目标第一损失值所对应的目标预测车道线信息和目标标注车道线信息之后,可以确定目标预测车道线信息包含的预测上下文信息,并确定目标标注车道线信息包含的标注上下文信息,而后,触发后续步骤。
S208:确定预测上下文信息和目标标注车道线信息之间的第二损失值,并将第二损失值作为目标损失值。
比如,可以针对上述改进HRNet-OCR模型的结构配置损失函数,采用该损失函数来拟合预测上下文信息和目标标注车道线信息之间的差异,将得到的第二损失值作为上述的目标损失值,对此不做限制。
也即是说,本公开实施例中,支持采用多个维度的损失值来确定人工智能模型的收敛时机,在基于车道线状态确定的第一损失值满足一定的条件的情况下,才触发基于预测上下文信息和目标标注车道线信息确定相应的第二损失值并作为目标损失值,用以确定收敛时机,能够有效提升损失值拟合的准确性,在基于该目标损失值确定出模型的收敛时机的情况下,使得车道线检测模型能够获得更为准确的检测识别效果。
举例而言,可以在HRNet-OCR模型的网络结构增加一个分支结构,用于进行车道线实例的检测分割,预设车道线的数量为4,由此,在要素类别的基础上加4作为HRNet-OCR模型输出的总类别,上述在要素分割的损失为l seg_ele的情况下,可以相应地添加车道线检测识别的损失,该车道线分割的损失包含两部分,一部分为像素损失l eeg_lane,一部分为4条车道线是否存在而形成的二分类损失l exist,在第i条车道线存在的情况下,则
Figure PCTCN2022075105-appb-000001
否则
Figure PCTCN2022075105-appb-000002
相应地,HRNet-OCR模型总输出的损失值可以表示为:
l total=l seg_ele+l seg_lane+0.1*l exist
在车道线检测识别阶段,在
Figure PCTCN2022075105-appb-000003
的情况下,则表明第i条车道线状态是:车道线存在,从而输出其像素预测的结果(包括预测上下文信息、预测车道线类别等),否则视为该车道线不存在。
从而本公开实施例中,实现了针对路况图像作出有效的融合要素语义与车道线实例识别的分割网络结构,从而提高要素语义与车道线实例分割的准确性,为智慧交通、智能城市系统提供可靠的车道线分割结果。
S209:响应于多个预测的车道线信息和多个标注的车道线信息之间的目标损失值满足设定条件,将训练得到的人工智能模型作为车道线检测模型。
S209的描述说明可以具体参见上述实施例,在此不再赘述。
本公开实施例中,通过获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息,并确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义,以及根据多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型,能够有效降低路况图像中车道线检测识别的计算复杂度,提升车道线检测识别的效率,提升车道线的检测识别效果。由于是首先识别出目标要素,而后,将目标要素,和与目标要素对应的目标要素语义,以及多个样本路况图像输入至车道线检测子模型之中,以得到车道线检测子模型输出的多个预测的车道线信息,从而能够提升模型处理识别的针对性,避免其它要素对车道线检测所带来的干扰,在辅助提升车道线检测模型的检测识别效率的同时,提升了检测识别的准确性。本公开实施例中,能够结合上述的预测的车道线状态,和/或车道线覆盖的图像区域中多个像素的预测上下文信息,以及相应的标注的车道线状态,和/或标注上下文信息来确定人工智能模型的收敛时机,从而实现准确地确定出人工智能模型的收敛时机,能够有效降低模型训练的运算资源消耗,且保障了训练得到的车道线检测模型的检测识别效果。
图3是根据本公开第三实施例的示意图。
如图3所示,该车道线检测模型的训练装置30,包括:
获取模块301,用于获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息。
确定模块302,用于确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义。
训练模块303,用于根据多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型。
在本公开的一些实施例中,如图4所示,图4是根据本公开第四实施例的示意图,该车道线检测模型的训练装置40,包括:获取模块401、确定模块402,以及训练模块403,其中,训练模块403,包括:
获取子模块4031,用于将多个样本路况图像、多个要素以及多个要素语义输入至初始的人工智能模型之中,以得到人工智能模型输出的多个预测的车道线信息;
训练子模块4032,用于响应于所述多个预测的车道线信息和所述多个标注的车道线信息之间的目标损失值满足设定条件,将训练得到的人工智能模型作为所述车道线检测模型。
在本公开的一些实施例中,车道线信息包括:车道线状态,和/或车道线覆盖的图像区域中多个像素的上下文信息,图像区域,是车道线所属的样本路况图像之中的局部的图像区域。
在本公开的一些实施例中,其中,训练子模块4032,具体用于:
确定多个预测的车道线状态,和相应多个标注的车道线状态之间的多个第一损失值;
从多个第一损失值之中选取出目标第一损失值,并确定目标第一损失值所对应的目标预测车道线信息和目标标注车道线信息;
确定目标预测车道线信息包含的预测上下文信息,并确定目标标注车道线信息包含的标注上下文信息;
确定预测上下文信息和目标标注车道线信息之间的第二损失值,并将第二损失值作为目标损失值。
在本公开的一些实施例中,其中,训练子模块4032,具体用于:
将多个第一损失值之中大于设定损失阈值的第一损失值作为目标第一损失值。
在本公开的一些实施例中,初始的人工智能模型包括:顺序连接的要素检测子模型和车道线检测子模型,其中,获取子模块4031,具体用于:
将多个样本路况图像、多个要素以及多个要素语义输入至要素检测子模型之中,以得到要素检测子模型输出的目标要素;
将目标要素,和与目标要素对应的目标要素语义,以及多个样本路况图像输入至车道线检测子模型之中,以得到车道线检测子模型输出的多个预测的车道线信息。
可以理解的是,本公开实施例附图4中的车道线检测模型的训练装置40与上述实施例中的车道线检测模型的训练装置30,获取模块401与上述实施例中的获取模块301,确定模块402与上述实施例中的确定模块302,训练模块403与上述实施例中的训练模块303,可以具有相同的功能和结构。
需要说明的是,前述对车道线检测模型的训练方法的解释说明也适用于本公开实施例的车道线检测模型的训练装置,此处不再赘述。
本公开实施例中,通过获取多个样本路况图像,和与多个样本路况图像分别对应的多个标注的车道线信息,并确定与多个样本路况图像分别对应的多个要素,以及与多个要素分别对应的多个要素语义,以及根据多个样本路况图像、多个要素、多个要素语义以及多个标注的车道线信息训练初始的人工智能模型,以得到车道线检测模型,能够有效降低路况图像中车道线检测识别的计算复杂度,提升车道线检测识别的效率,提升车道线的检测识别效果。
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。
图5是用来实现本公开实施例的车道线检测模型的训练方法的电子设备的框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。
如图5所示,设备500包括计算单元501,其可以根据存储在只读存储器(ROM)502中的计算机程序或者从存储单元508加载到随机访问存储器(RAM)503中的计算机程序,来执行各种适当的动作和处理。在RAM 503中,还可存储设备500操作所需的各种程序和数据。计算单元501、ROM 502以及RAM 503通过总线504彼此相连。输入/输出(I/O)接口505也连接至总线504。
设备500中的多个部件连接至I/O接口505,包括:输入单元506,例如键盘、鼠标等;输出单元507,例如各种类型的显示器、扬声器等;存储单元508,例如磁盘、光盘等;以及通信单元509,例如网卡、调制解调器、无线通信收发机等。通信单元509允许设备500通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。
计算单元501可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元501的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各 种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元501执行上文所描述的各个方法和处理,例如,车道线检测模型的训练方法。
例如,在一些实施例中,车道线检测模型的训练方法可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元508。在一些实施例中,计算机程序的部分或者全部可以经由ROM502和/或通信单元509而被载入和/或安装到设备500上。当计算机程序加载到RAM503并由计算单元501执行时,可以执行上文描述的车道线检测模型的训练方法的一个或多个步骤。备选地,在其他实施例中,计算单元501可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行车道线检测模型的训练方法。
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、负载可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。
用于实施本公开的车道线检测模型的训练方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器或控制器,使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行,作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
为了提供与用户的交互,可以在计算机上实施此处描述的系统和技术,该计算机具有:用于向用户显示信息的显示装置(例如,CRT(阴极射线管)或者LCD(液晶显示器)监视器);以及键盘和指向装置(例如,鼠标或者轨迹球),用户可以通过该键盘和该指向装置来将输入提供给计算机。其它种类的装置还可以用于提供与用户的交互;例如,提供给用户的反馈可以是任何形式的传感反馈(例如,视觉反馈、听觉反馈、或者触觉反馈);并且可以用任何形式(包括声输入、语音输入或者、触觉输入)来接收来自用户的输入。
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如,作为数据服务器)、或者包括中间件部件的计算系统(例如,应用服务器)、或者包括前端部件的计算系统(例如,具有图形用户界面或者网络浏览器的用户计算机,用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质的数字数据通信(例如,通信网络)来将系统的部件相互连接。通信网络的示例包括:局域网(LAN)、广域网(WAN)、互联网及区块链网络。
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器,又称为云计算服务器或云主机,是云计算服务体系中的一项主机产品,以解决了传统物理主机与VPS服务("Virtual Private Server",或简称"VPS")中,存在的管理难度大,业务扩展性弱的缺陷。服务器也可以为分布式系统的服务器,或者是结合了区块链的服务器。
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本发申请中记载的各步骤可以并行地执行也可以顺序地执行也可以不同的次序执行,只要能够实现本公开公开的技术方案所期望的结果,本文在此不进行限制。
上述具体实施方式,并不构成对本公开保护范围的限制。本领域技术人员应该明白的是,根据设计要求和其他因素,可以进行各种修改、组合、子组合和替代。任何在本公开的精神和原则之内所作的修改、等同替换和改进等,均应包含在本公开保护范围之内。

Claims (15)

  1. A method for training a lane line detection model, comprising:
    acquiring a plurality of sample road condition images, and a plurality of pieces of labeled lane line information respectively corresponding to the plurality of sample road condition images;
    determining a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and
    training an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information, to obtain a lane line detection model.
  2. The method according to claim 1, wherein training the initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information to obtain the lane line detection model comprises:
    inputting the plurality of sample road condition images, the plurality of elements and the plurality of element semantics into the initial artificial intelligence model, to obtain a plurality of pieces of predicted lane line information output by the artificial intelligence model; and
    in response to a target loss value between the plurality of pieces of predicted lane line information and the plurality of pieces of labeled lane line information satisfying a set condition, using the artificial intelligence model obtained by training as the lane line detection model.
  3. The method according to claim 2, wherein the lane line information comprises: a lane line state, and/or context information of a plurality of pixels in an image area covered by the lane line, the image area being a local image area within the sample road condition image to which the lane line belongs.
  4. The method according to claim 3, wherein determining the target loss value between the plurality of pieces of predicted lane line information and the plurality of pieces of labeled lane line information comprises:
    determining a plurality of first loss values between the plurality of predicted lane line states and the corresponding plurality of labeled lane line states;
    selecting a target first loss value from among the plurality of first loss values, and determining target predicted lane line information and target labeled lane line information corresponding to the target first loss value;
    determining predicted context information contained in the target predicted lane line information, and determining labeled context information contained in the target labeled lane line information; and
    determining a second loss value between the predicted context information and the target labeled lane line information, and using the second loss value as the target loss value.
  5. The method according to claim 4, wherein selecting the target first loss value from among the plurality of first loss values comprises:
    using a first loss value that is greater than a set loss threshold among the plurality of first loss values as the target first loss value.
  6. The method according to any one of claims 2-5, wherein the initial artificial intelligence model comprises: an element detection sub-model and a lane line detection sub-model connected in sequence, wherein
    inputting the plurality of sample road condition images, the plurality of elements and the plurality of element semantics into the initial artificial intelligence model to obtain the plurality of pieces of predicted lane line information output by the artificial intelligence model comprises:
    inputting the plurality of sample road condition images, the plurality of elements and the plurality of element semantics into the element detection sub-model, to obtain a target element output by the element detection sub-model; and
    inputting the target element, target element semantics corresponding to the target element, and the plurality of sample road condition images into the lane line detection sub-model, to obtain the plurality of pieces of predicted lane line information output by the lane line detection sub-model.
  7. A training apparatus for a lane line detection model, comprising:
    an acquisition module, configured to acquire a plurality of sample road condition images, and a plurality of pieces of labeled lane line information respectively corresponding to the plurality of sample road condition images;
    a determination module, configured to determine a plurality of elements respectively corresponding to the plurality of sample road condition images, and a plurality of element semantics respectively corresponding to the plurality of elements; and
    a training module, configured to train an initial artificial intelligence model according to the plurality of sample road condition images, the plurality of elements, the plurality of element semantics and the plurality of pieces of labeled lane line information, to obtain a lane line detection model.
  8. The apparatus according to claim 7, wherein the training module comprises:
    an acquisition sub-module, configured to input the plurality of sample road condition images, the plurality of elements and the plurality of element semantics into the initial artificial intelligence model, to obtain a plurality of pieces of predicted lane line information output by the artificial intelligence model; and
    a training sub-module, configured to, in response to a target loss value between the plurality of pieces of predicted lane line information and the plurality of pieces of labeled lane line information satisfying a set condition, use the artificial intelligence model obtained by training as the lane line detection model.
  9. The apparatus according to claim 8, wherein the lane line information comprises: a lane line state, and/or context information of a plurality of pixels in an image area covered by the lane line, the image area being a local image area within the sample road condition image to which the lane line belongs.
  10. The apparatus according to claim 9, wherein the training sub-module is specifically configured to:
    determine a plurality of first loss values between the plurality of predicted lane line states and the corresponding plurality of labeled lane line states;
    select a target first loss value from among the plurality of first loss values, and determine target predicted lane line information and target labeled lane line information corresponding to the target first loss value;
    determine predicted context information contained in the target predicted lane line information, and determine labeled context information contained in the target labeled lane line information; and
    determine a second loss value between the predicted context information and the target labeled lane line information, and use the second loss value as the target loss value.
  11. The apparatus according to claim 10, wherein the training sub-module is specifically configured to:
    use a first loss value that is greater than a set loss threshold among the plurality of first loss values as the target first loss value.
  12. The apparatus according to any one of claims 8-11, wherein the initial artificial intelligence model comprises: an element detection sub-model and a lane line detection sub-model connected in sequence, wherein
    the acquisition sub-module is specifically configured to:
    input the plurality of sample road condition images, the plurality of elements and the plurality of element semantics into the element detection sub-model, to obtain a target element output by the element detection sub-model; and
    input the target element, target element semantics corresponding to the target element, and the plurality of sample road condition images into the lane line detection sub-model, to obtain a plurality of pieces of predicted lane line information output by the lane line detection sub-model.
  13. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-6.
  14. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-6.
  15. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2022/075105 2021-04-28 2022-01-29 车道线检测模型的训练方法、装置、电子设备及存储介质 WO2022227769A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
KR1020227027156A KR20220117341A (ko) 2021-04-28 2022-01-29 차선 검출 모델의 트레이닝 방법, 장치, 전자 기기 및 저장 매체
US18/003,463 US20230245429A1 (en) 2021-04-28 2022-01-29 Method and apparatus for training lane line detection model, electronic device and storage medium
JP2022580383A JP2023531759A (ja) 2021-04-28 2022-01-29 車線境界線検出モデルの訓練方法、車線境界線検出モデルの訓練装置、電子機器、記憶媒体及びコンピュータプログラム

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110470476.1A CN113191256B (zh) 2021-04-28 2021-04-28 车道线检测模型的训练方法、装置、电子设备及存储介质
CN202110470476.1 2021-04-28

Publications (1)

Publication Number Publication Date
WO2022227769A1 true WO2022227769A1 (zh) 2022-11-03

Family

ID=76980425

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/075105 WO2022227769A1 (zh) 2021-04-28 2022-01-29 车道线检测模型的训练方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN113191256B (zh)
WO (1) WO2022227769A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191256B (zh) * 2021-04-28 2024-06-11 北京百度网讯科技有限公司 车道线检测模型的训练方法、装置、电子设备及存储介质
CN113673586B (zh) * 2021-08-10 2022-08-16 北京航天创智科技有限公司 融合多源高分辨率卫星遥感影像的海上养殖区域分类方法
CN113705513B (zh) * 2021-09-03 2023-09-26 北京百度网讯科技有限公司 模型训练和车道线预测方法、电子设备和自动驾驶车辆
CN113705515B (zh) * 2021-09-03 2024-04-12 北京百度网讯科技有限公司 语义分割模型的训练和高精地图车道线的生成方法和设备
CN113762397B (zh) * 2021-09-10 2024-04-05 北京百度网讯科技有限公司 检测模型训练、高精度地图更新方法、设备、介质及产品
CN113869249B (zh) * 2021-09-30 2024-05-07 广州文远知行科技有限公司 一种车道线标注方法、装置、设备及可读存储介质
CN113837313B (zh) * 2021-09-30 2024-06-14 广州文远知行科技有限公司 车道线标注模型的训练方法、装置、设备及可读存储介质
CN113963011A (zh) * 2021-10-08 2022-01-21 北京百度网讯科技有限公司 图像识别方法、装置、电子设备及存储介质
CN114677570B (zh) * 2022-03-14 2023-02-07 北京百度网讯科技有限公司 道路信息更新方法、装置、电子设备以及存储介质
CN117593717B (zh) * 2024-01-18 2024-04-05 武汉大学 一种基于深度学习的车道追踪方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460072A (zh) * 2020-04-01 2020-07-28 北京百度网讯科技有限公司 车道线检测方法、装置、设备和存储介质
CN112528878A (zh) * 2020-12-15 2021-03-19 中国科学院深圳先进技术研究院 检测车道线的方法、装置、终端设备及可读存储介质
CN112633380A (zh) * 2020-12-24 2021-04-09 北京百度网讯科技有限公司 兴趣点特征提取方法、装置、电子设备及存储介质
CN113191256A (zh) * 2021-04-28 2021-07-30 北京百度网讯科技有限公司 车道线检测模型的训练方法、装置、电子设备及存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11068724B2 (en) * 2018-10-11 2021-07-20 Baidu Usa Llc Deep learning continuous lane lines detection system for autonomous vehicles
CN110084095B (zh) * 2019-03-12 2022-03-25 浙江大华技术股份有限公司 车道线检测方法、车道线检测装置和计算机存储介质
CN111310593B (zh) * 2020-01-20 2022-04-19 浙江大学 一种基于结构感知的超快速车道线检测方法
CN111507226B (zh) * 2020-04-10 2023-08-11 北京觉非科技有限公司 道路图像识别模型建模方法、图像识别方法及电子设备
CN112200172B (zh) * 2020-12-07 2021-02-19 天津天瞳威势电子科技有限公司 一种可行驶区域的检测方法及装置
CN112528864A (zh) * 2020-12-14 2021-03-19 北京百度网讯科技有限公司 模型生成方法、装置、电子设备和存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460072A (zh) * 2020-04-01 2020-07-28 北京百度网讯科技有限公司 车道线检测方法、装置、设备和存储介质
CN112528878A (zh) * 2020-12-15 2021-03-19 中国科学院深圳先进技术研究院 检测车道线的方法、装置、终端设备及可读存储介质
CN112633380A (zh) * 2020-12-24 2021-04-09 北京百度网讯科技有限公司 兴趣点特征提取方法、装置、电子设备及存储介质
CN113191256A (zh) * 2021-04-28 2021-07-30 北京百度网讯科技有限公司 车道线检测模型的训练方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN113191256B (zh) 2024-06-11
CN113191256A (zh) 2021-07-30

Similar Documents

Publication Publication Date Title
WO2022227769A1 (zh) 车道线检测模型的训练方法、装置、电子设备及存储介质
EP3910492A2 (en) Event extraction method and apparatus, and storage medium
WO2023015941A1 (zh) 文本检测模型的训练方法和检测文本方法、装置和设备
WO2022257487A1 (zh) 深度估计模型的训练方法, 装置, 电子设备及存储介质
KR20220122566A (ko) 텍스트 인식 모델의 트레이닝 방법, 텍스트 인식 방법 및 장치
CN113033622A (zh) 跨模态检索模型的训练方法、装置、设备和存储介质
WO2022257614A1 (zh) 物体检测模型的训练方法、图像检测方法及其装置
US20230073550A1 (en) Method for extracting text information, electronic device and storage medium
CN114648676B (zh) 点云处理模型的训练和点云实例分割方法及装置
US20230245429A1 (en) Method and apparatus for training lane line detection model, electronic device and storage medium
EP4191544A1 (en) Method and apparatus for recognizing token, electronic device and storage medium
CN113361572A (zh) 图像处理模型的训练方法、装置、电子设备以及存储介质
WO2022227759A1 (zh) 图像类别的识别方法、装置和电子设备
JP2022185143A (ja) テキスト検出方法、テキスト認識方法及び装置
CN114111813B (zh) 高精地图元素更新方法、装置、电子设备及存储介质
CN113963186A (zh) 目标检测模型的训练方法、目标检测方法及相关装置
CN114972910B (zh) 图文识别模型的训练方法、装置、电子设备及存储介质
CN115482436B (zh) 图像筛选模型的训练方法、装置以及图像筛选方法
CN114220163B (zh) 人体姿态估计方法、装置、电子设备及存储介质
CN116127319A (zh) 多模态负样本构建、模型预训练方法、装置、设备及介质
CN113051926B (zh) 文本抽取方法、设备和存储介质
CN112818972B (zh) 兴趣点图像的检测方法、装置、电子设备及存储介质
CN114817476A (zh) 语言模型的训练方法、装置、电子设备和存储介质
CN114119972A (zh) 模型获取及对象处理方法、装置、电子设备及存储介质
CN113205131A (zh) 图像数据的处理方法、装置、路侧设备和云控平台

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 20227027156

Country of ref document: KR

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22794250

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022580383

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22794250

Country of ref document: EP

Kind code of ref document: A1