WO2022142839A1 - Image processing method and apparatus, and smart car - Google Patents

Image processing method and apparatus, and smart car

Info

Publication number
WO2022142839A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
region
interest
length
Prior art date
Application number
PCT/CN2021/131609
Other languages
English (en)
French (fr)
Inventor
郑永豪
黄梓亮
位硕权
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022142839A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T5/80
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/207 - Analysis of motion for motion estimation over a hierarchy of resolutions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 - Image coding
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 - Image coding
    • G06T9/001 - Model-based coding, e.g. wire frame
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 - Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10004 - Still image; Photographic image

Definitions

  • the present application relates to the field of image processing, and in particular, to an image processing method, device and smart car.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
  • artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and produce a new kind of intelligent machine that responds in a similar way to human intelligence.
  • Artificial intelligence is to study the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
  • Autonomous driving is a mainstream application in the field of artificial intelligence.
  • Autonomous driving technology relies on the cooperation of computer vision, radar, monitoring devices and global positioning systems to allow motor vehicles to achieve autonomous driving without active human operation.
  • Autonomous vehicles use various computing systems to help transport passengers from one location to another. Some autonomous vehicles may require some initial or continuous input from an operator, such as a pilot, driver, or passenger.
  • An autonomous vehicle permits the operator to switch from a manual mode of operation to an autonomous driving mode or a mode in between. Since automatic driving technology does not require humans to drive motor vehicles, it can theoretically effectively avoid human driving errors, reduce the occurrence of traffic accidents, and improve the efficiency of highway transportation. Therefore, autonomous driving technology is getting more and more attention.
  • Traffic lights are hub equipment for traffic operation; improving the accuracy of traffic light detection is therefore of great significance for autonomous driving.
  • the present application provides an image processing method, device and smart car, so as to improve the accuracy of object recognition at intersections, for example, to improve the accuracy of traffic light recognition.
  • a first aspect of the present application provides an image processing method, which can be used in the field of automatic driving in the field of artificial intelligence.
  • the first neural network may be a neural network for performing image segmentation tasks. Neural networks that can be used to perform image segmentation tasks in the related art can be used in all embodiments of the present application.
  • the first neural network includes, but is not limited to: a spatial convolutional neural network (spatial CNN, SCNN), a fully convolutional network (FCN), a U-shaped neural network (U-Net), a mask region convolutional neural network (Mask-RCNN), and a semantic segmentation network (SegNet).
  • the set of pixel points whose probability of belonging to the stop lane line exceeds the preset threshold can be used to obtain a region of the stop lane line in the image to be processed.
  • the set of pixel points whose probability of belonging to a guide lane line exceeds a preset threshold can be used to obtain the area of a guide lane line in the image to be processed.
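  • As a hedged illustration of this thresholding step (the array shapes, variable names and the 0.5 threshold below are assumptions made for the sketch, not values taken from this application), the lane-line regions can be obtained by comparing each per-class probability map produced by the first neural network against the preset threshold:

    import numpy as np

    # Hypothetical outputs of the first neural network: one probability map per class,
    # each of shape (H, W) with values in [0, 1].
    stop_line_prob = np.random.rand(720, 1280)    # P(pixel belongs to the stop lane line)
    guide_lane_prob = np.random.rand(720, 1280)   # P(pixel belongs to a guide lane line)

    PRESET_THRESHOLD = 0.5  # assumed value of the preset threshold

    # The set of pixels whose probability exceeds the preset threshold forms the region.
    stop_line_mask = stop_line_prob > PRESET_THRESHOLD
    guide_lane_mask = guide_lane_prob > PRESET_THRESHOLD

    # Pixel coordinates (row, col) of each region in the image to be processed.
    stop_line_pixels = np.argwhere(stop_line_mask)
    guide_lane_pixels = np.argwhere(guide_lane_mask)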
  • the second neural network may be a neural network for performing object recognition tasks, including but not limited to a convolutional neural network (CNN), a deep neural network (DNN), you only look once (YOLO) v3 (version number, representing the third edition), and a single shot multibox detector (SSD).
  • the ROI in this application means to outline, in the form of a box, the area to be processed from the image to be processed (also referred to as a matting area in this application), and to input the ROI into the second neural network to output candidate boxes and classifications of the object to be detected. Determining the region of interest includes determining the position of the region of interest, the length of the region of interest, and the width of the region of interest.
  • the solution provided in this application proposes a method of obtaining the region of interest by using lane lines. Specifically, the position of the region of interest and the length of the region of interest can be determined according to the lane lines, and the width of the region of interest can be determined according to the object height of the object to be detected.
  • the solution provided by the present application uses lane lines to select the area corresponding to the intersection and road section in the image to be processed, which is beneficial to improve the detection accuracy of the object to be detected in the intersection and road section.
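  • The flow just described can be summarized by the sketch below; segment_lane_lines, compute_roi and detect_objects are hypothetical placeholders standing in for the first neural network, the lane-line-based ROI logic and the second neural network respectively, so their names and signatures are assumptions rather than part of this application.

    def process_frame(image, object_height_m, segment_lane_lines, compute_roi, detect_objects):
        # 1. First neural network: per-pixel lane-line prediction (image segmentation).
        lane_masks = segment_lane_lines(image)
        # 2. Position and length of the ROI come from the lane lines; its width comes
        #    from the preset physical height of the object to be detected.
        x, y, w, h = compute_roi(lane_masks, object_height_m)
        # 3. Crop the ROI (the "matting area") and feed it to the second neural network,
        #    which outputs candidate boxes and classifications of the object to be detected.
        return detect_objects(image[y:y + h, x:x + w])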
  • the lane line in the first area may include a stop line
  • acquiring the region of interest of the image to be processed according to the height information and the first area may include: acquiring the length of the stop line in the image to be processed .
  • the length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected.
  • the image to be processed includes the stop lane line
  • the area of interest can be obtained according to the stop lane line, and the area corresponding to the intersection can be well selected in the image to be processed. It is beneficial to improve the detection accuracy of the object to be detected at the intersection and road section.
  • the first area may include a plurality of first pixels, the probability that each first pixel in the plurality of first pixels belongs to the stop line exceeds a first preset threshold, and the stop line is composed of the plurality of first pixels.
  • obtaining the length of the stop line in the image to be processed may include: obtaining the length of the stop line according to the distance between the two most distant pixels among the plurality of first pixels. In this embodiment, a specific way of acquiring the length of the stop line in the image to be processed is given, which increases the variety of solutions.
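  • A minimal sketch of this embodiment, assuming the first pixels are available as an (N, 2) array of (row, col) coordinates such as the one produced by the thresholding sketch above:

    import numpy as np

    def stop_line_length(first_pixels: np.ndarray) -> float:
        # Distance between the two most distant pixels among the first pixels.
        # Brute-force pairwise search; for large N the farthest pair could instead
        # be searched on the convex hull of the pixel set.
        diffs = first_pixels[:, None, :].astype(np.float64) - first_pixels[None, :, :]
        dists = np.hypot(diffs[..., 0], diffs[..., 1])
        return float(dists.max())

    # e.g. length_px = stop_line_length(stop_line_pixels)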
  • the method may further include: acquiring a first distance, where the first distance is a distance between the object to be detected and the self-vehicle.
  • a second distance is obtained, which is the distance between the stop line and the lower edge of the image to be processed.
  • the scale is obtained according to the first distance and the second distance. In this embodiment, a specific way of obtaining the scale is given, which increases the variety of solutions.
  • the lane lines in the first area may further include at least two guide lane lines
  • the method may further include: acquiring the width, in the image to be processed, of any two adjacent guide lane lines among the at least two guide lane lines.
  • the scale is obtained according to the width of any two adjacent guide lane lines in the image to be processed and the preset physical width of the two guide lane lines. In this embodiment, another specific way of obtaining the scale is given, which increases the variety of solutions.
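  • A hedged sketch of this way of obtaining the scale, assuming the scale is simply the pixels-per-metre ratio implied by the preset physical lane width (the numeric values below are illustrative only, not taken from the application):

    def scale_from_guide_lanes(lane_width_px: float, lane_width_m: float) -> float:
        # Width of two adjacent guide lane lines in the image divided by their preset
        # physical width gives the scale relating image length to physical size.
        return lane_width_px / lane_width_m

    def object_length_in_image(object_height_m: float, scale_px_per_m: float) -> float:
        # Length of the object to be detected in the image, obtained from its preset
        # physical height and the scale; used as the width of the region of interest.
        return object_height_m * scale_px_per_m

    scale = scale_from_guide_lanes(lane_width_px=150.0, lane_width_m=3.5)   # assumed values
    roi_width_px = object_length_in_image(object_height_m=7.0, scale_px_per_m=scale)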
  • obtaining the length of the region of interest according to the length of the stop line in the image to be processed may include: obtaining the length of the region of interest according to the distance between the first intersection point and the second intersection point, the first The intersection point is the intersection of the first guide lane line and one end of the stop line in the image to be processed, the second intersection point is the intersection of the second guide lane line and the other end of the stop line in the image to be processed, the first guide lane line and the second guide lane line. It is the two guide lane lines with the furthest distance among the at least two guide lane lines.
  • a specific way of obtaining the length of the region of interest according to the length of the stop line is provided, which increases the variety of solutions.
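  • The geometry can be sketched as follows, assuming the stop line and the two farthest-apart guide lane lines have each been reduced to a pair of image points (the coordinates below are made up for illustration):

    import numpy as np

    def line_intersection(p1, p2, p3, p4):
        # Intersection of the line through p1, p2 with the line through p3, p4 ((x, y) points).
        a1, b1 = p2[1] - p1[1], p1[0] - p2[0]
        c1 = a1 * p1[0] + b1 * p1[1]
        a2, b2 = p4[1] - p3[1], p3[0] - p4[0]
        c2 = a2 * p3[0] + b2 * p3[1]
        det = a1 * b2 - a2 * b1
        return np.array([(b2 * c1 - b1 * c2) / det, (a1 * c2 - a2 * c1) / det])

    stop_a, stop_b = (200.0, 400.0), (1100.0, 410.0)     # ends of the stop line
    first_guide = ((300.0, 700.0), (420.0, 390.0))       # leftmost guide lane line
    second_guide = ((1000.0, 700.0), (900.0, 395.0))     # rightmost guide lane line

    first_intersection = line_intersection(*first_guide, stop_a, stop_b)
    second_intersection = line_intersection(*second_guide, stop_a, stop_b)
    roi_length_px = float(np.linalg.norm(first_intersection - second_intersection))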
  • the position of the lower edge of the region of interest is determined according to the position of the stop line in the image to be processed.
  • the position of the lower edge of the region of interest is directly determined according to the position of the stop line in the image to be processed, which simplifies the calculation process.
  • the lane lines in the first area may include at least two guide lane lines and may not include stop lines
  • the area of interest of the image to be processed is obtained according to the height information and the first area, which may include: obtaining the length of the region of interest according to the distance between the third intersection point and the fourth intersection point, where the third intersection point is the intersection of the first guide lane line and one end of the first line segment in the image to be processed, and the fourth intersection point is the intersection of the second guide lane line and the other end of the first line segment in the image to be processed.
  • the first guide lane line and the second guide lane line are the two guide lane lines with the farthest distance among the at least two guide lane lines, and the first line segment is a line segment passing through the second pixel, where the second pixel is the pixel corresponding to the highest point of the shortest guide lane line among the at least two guide lane lines in the image to be processed.
  • the length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected.
  • the acquired image to be processed does not include stop lane lines, but includes guide lane lines, then the position of the lower edge of the ROI and the ROI area can be determined according to the positional relationship between the guide lane lines in the image to be processed length. It is ensured that when the stop lane line is not detected, the appropriate ROI area can also be determined according to the guide lane line, and the area corresponding to the intersection and road section is selected in the to-be-processed image to improve the detection accuracy of the object to be detected at the intersection and road section.
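  • Under the assumption that image row indices grow downwards and that the "shortest" guide lane line is the one with the smallest vertical extent, this case can be sketched as follows (the helper below is hypothetical, not code from the application):

    import numpy as np

    def roi_from_guide_lanes_only(guide_lane_pixel_sets):
        # guide_lane_pixel_sets: one (N, 2) array of (row, col) pixels per guide lane line.
        # The second pixel is the highest point of the shortest guide lane line, and the
        # first line segment lies on its row, parallel to the lower edge of the image.
        shortest = min(guide_lane_pixel_sets, key=lambda px: px[:, 0].max() - px[:, 0].min())
        row = int(shortest[:, 0].min())
        # Column of each guide lane line where it comes closest to that row.
        cols = [int(px[np.abs(px[:, 0] - row).argmin(), 1]) for px in guide_lane_pixel_sets]
        left, right = min(cols), max(cols)   # the two farthest-apart guide lane lines
        # Lower edge of the ROI sits on this row; the ROI length spans the two intersections.
        return row, right - left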
  • the first line segment is parallel to the lower edge of the image to be processed.
  • the lane lines in the first region may include at least two guide lane lines and may not include stop lines, and the position of the lower edge of the region of interest is determined according to the position of the first line segment in the image to be processed.
  • the first line segment occupies a preset length of pixels, one end of the first line segment intersects with the first guide lane line, and the other end of the first line segment intersects with the second guide lane line.
  • the first guide lane line and the second guide lane line are the two guide lane lines with the farthest distance among the at least two guide lane lines.
  • another method of acquiring the region of interest according to the guide lane line is provided for the case where the acquired image to be processed does not include the stop lane line but includes the guide lane line.
  • acquiring the region of interest of the image to be processed according to the height information and the first region may include: acquiring the length of the region of interest according to the length of the first line segment.
  • the length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected.
  • the method may further include: compressing the resolution of the region of interest to a second preset threshold.
  • the size of the region of interest may be too large.
  • in this case, the region of interest may also be compressed, and the compressed region of interest may be input into the second neural network.
  • the method may further include: performing super-resolution processing on the region of interest, so that the resolution of the region of interest is boosted to the second preset threshold.
  • super-resolution processing can also be performed on the region of interest to improve the picture quality of the region of interest, and the region of interest after the super-resolution processing is input into the second neural network to improve the effect of the second neural network for object detection.
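  • A minimal sketch of bringing the ROI to the second preset threshold resolution before it enters the second neural network; plain OpenCV resizing stands in here for both compression and super-resolution, whereas the application leaves the super-resolution method open (a learned model could be substituted):

    import cv2

    def normalize_roi_resolution(roi, target_w, target_h):
        h, w = roi.shape[:2]
        if w * h > target_w * target_h:
            # Resolution above the threshold: compress the region of interest.
            return cv2.resize(roi, (target_w, target_h), interpolation=cv2.INTER_AREA)
        if w * h < target_w * target_h:
            # Resolution below the threshold: upscale (placeholder for super-resolution).
            return cv2.resize(roi, (target_w, target_h), interpolation=cv2.INTER_CUBIC)
        return roi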
  • the object to be detected may include a traffic light.
  • a second aspect of the present application provides an image processing apparatus, which may include: an acquisition module configured to acquire an image to be processed.
  • the image segmentation module is used for inputting the image to be processed into the first neural network to obtain the first prediction result.
  • the region of interest module is used to obtain the region of interest of the object to be detected in the image to be processed according to the height information and the first area when the first prediction result indicates that the first area of the image to be processed is a lane line, where the height information may include a preset physical height of the object to be detected.
  • the region of interest is used by the second neural network to obtain candidate frames and classifications of the object to be detected.
  • the lane line in the first area may include a stop line
  • the region of interest module is specifically configured to: acquire the length of the stop line in the image to be processed. Obtain the length of the region of interest according to the length of the stop line in the image to be processed. The length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected. Obtain the width of the region of interest according to the length of the object to be detected in the image to be processed.
  • the first area may include a plurality of first pixels, the probability that each first pixel in the plurality of first pixels belongs to the stop line exceeds a first preset threshold, and the stop line is composed of the plurality of first pixels.
  • the region of interest module is specifically configured to: obtain the length of the stop line in the to-be-processed image according to the distance between the two most distant pixels among the plurality of first pixels.
  • the region of interest module is further configured to: acquire a first distance, where the first distance is a distance between the object to be detected and the self-vehicle.
  • a second distance is obtained, which is the distance between the stop line and the lower edge of the image to be processed.
  • the scale is obtained according to the first distance and the second distance.
  • the lane lines in the first area may further include at least two guide lane lines
  • the region of interest module is further configured to: acquire the width, in the image to be processed, of any two adjacent guide lane lines among the at least two guide lane lines.
  • the scale is obtained according to the width of any two adjacent guide lane lines in the image to be processed and the preset physical widths of the two guide lane lines.
  • the region of interest module is specifically configured to: obtain the length of the region of interest according to the distance between the first intersection point and the second intersection point, where the first intersection point is the first guide lane line in the image to be processed The intersection point with one end of the stop line, the second intersection point is the intersection point between the second guide lane line and the other end of the stop line in the image to be processed, the first guide lane line and the second guide lane line are at least two guide lane lines with the farthest distance the two guide lane lines.
  • the position of the lower edge of the region of interest is determined according to the position of the stop line in the image to be processed.
  • the lane lines in the first area may include at least two guide lane lines and may not include stop lines.
  • the region of interest module is specifically configured to: obtain the length of the region of interest according to the distance between the third intersection point and the fourth intersection point, where the third intersection point is the intersection of the first guide lane line and one end of the first line segment in the image to be processed, the fourth intersection point is the intersection of the second guide lane line and the other end of the first line segment in the image to be processed, the first guide lane line and the second guide lane line are the two guide lane lines with the farthest distance among the at least two guide lane lines, the first line segment is a line segment passing through the second pixel, and the second pixel is the pixel corresponding to the highest point of the shortest guide lane line among the at least two guide lane lines in the image to be processed.
  • the length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected. Obtain the width of the region of interest according to the length of the object to be detected in the image to be processed.
  • the first line segment is parallel to the lower edge of the image to be processed.
  • the lane lines in the first region may include at least two guide lane lines and may not include stop lines, and the position of the lower edge of the region of interest is determined according to the position of the first line segment in the image to be processed. The first line segment occupies a preset length of pixels, one end of the first line segment intersects with the first guide lane line, the other end of the first line segment intersects with the second guide lane line, and the first guide lane line and the second guide lane line are the two guide lane lines with the farthest distance among the at least two guide lane lines.
  • the region of interest module is specifically configured to: acquire the length of the region of interest according to the length of the first line segment.
  • the length of the object to be detected in the image to be processed is obtained according to the height information and the scale, and the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected.
  • a compression module may also be included, and the compression module is configured to, if the resolution of the region of interest obtained according to the height information and the first region is greater than a second preset threshold, compress the resolution of the region of interest to the second preset threshold.
  • a super-resolution processing module may also be included, and the super-resolution processing module is configured to, if the resolution of the region of interest obtained according to the height information and the first region is smaller than the second preset threshold, perform super-resolution processing on the region of interest to increase the resolution of the region of interest to the second preset threshold.
  • the object to be detected may include a traffic light.
  • a third aspect of the present application provides an image processing apparatus, which may include a processor, the processor is coupled with a memory, the memory stores program instructions, and the method described in the first aspect is implemented when the program instructions stored in the memory are executed by the processor.
  • a fourth aspect of the present application provides a computer-readable storage medium, which may include a program that, when executed on a computer, causes the computer to execute the method described in the first aspect.
  • a fifth aspect of the present application provides a computer program product which, when run on a computer, enables the computer to perform the method as described in the first aspect.
  • a sixth aspect of the present application provides a chip coupled with a memory for executing a program stored in the memory to execute the method described in the first aspect.
  • a seventh aspect of the present application provides a smart car.
  • the smart car may include a processing circuit and a storage circuit, the processing circuit and the storage circuit being configured to perform the method as described in the first aspect.
  • the solution provided by the present application is aimed at the image to be processed obtained by the vehicle.
  • the obtained image to be processed includes lane lines
  • the region of interest of the object to be detected in the image to be processed is obtained according to the lane line.
  • the position of the region of interest and the length of the region of interest may be determined according to the lane line
  • the width of the region of interest may be determined according to the object height of the object to be detected.
  • if the to-be-processed image includes a stop lane line, that is, the lane line includes the stop lane line, the position of the lower edge of the ROI and the length of the ROI area are determined according to the position of the stop lane line in the image, which can well select the area corresponding to the intersection and road section in the to-be-processed image and is beneficial to improve the detection accuracy of the object to be detected at the intersection and road section. If the acquired image to be processed does not include stop lane lines but includes guide lane lines, the position of the lower edge of the ROI and the length of the ROI area can be determined according to the positional relationship between the guide lane lines in the image to be processed.
  • the appropriate ROI area can also be determined according to the guide lane line, and the area corresponding to the intersection and road section is selected in the to-be-processed image to improve the detection accuracy of the object to be detected at the intersection and road section.
  • the region of interest in the image to be processed can be used as a cutout region, and the cutout region can be input into the second neural network, so that the second neural network can determine the candidate frame and classification of the object to be detected according to the cutout region.
  • super-resolution processing may also be performed on the cutout area to improve the picture quality of the cutout area, and the cutout area after the super-resolution processing is input into the second neural network to improve the effect of object detection by the second neural network.
  • the size of the cutout area may be too large.
  • in this case, the cutout area may also be compressed, and the compressed cutout area is input into the second neural network.
  • FIG. 1 is a schematic structural diagram of an automatic driving vehicle provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 4-a is a schematic diagram of a scheme for obtaining the length of the stop line in the embodiment of the present application.
  • FIG. 4-b is a schematic diagram of another scheme for obtaining the length of the stop line in the embodiment of the present application.
  • FIG. 4-c is a schematic diagram of a scheme for obtaining a scale in the embodiment of the present application.
  • FIG. 4-d is a schematic diagram of another scheme for obtaining a scale in the embodiment of the present application.
  • FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an application scenario of an image processing method provided by the present application.
  • FIG. 7 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of another application scenario of an image processing method provided by the present application.
  • FIG. 9 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of another application scenario of an image processing method provided by the present application.
  • FIG. 11 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of selecting a region of interest in an image to be processed
  • FIG. 13 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • FIG. 14 is another schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of another self-driving vehicle provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the embodiment of the present application provides an image processing method, which acquires a region of interest of an image to be processed according to lane lines. Through the solution provided in this application, the accuracy of object recognition in the intersection scene can be effectively improved.
  • the vehicle 100 is configured in a fully or partially autonomous driving mode, for example, the autonomous vehicle 100 may control itself while in the autonomous driving mode, and may determine the current state of the vehicle and its surrounding environment through human operation, determine the possible behavior of at least one other vehicle, and determine a confidence level corresponding to the possibility that the other vehicle performs the possible behavior, and control the autonomous vehicle 100 based on the determined information.
  • the autonomous vehicle 100 may also be placed to operate without human interaction when the autonomous vehicle 100 is in the autonomous driving mode.
  • Autonomous vehicle 100 may include various subsystems, such as travel system 102 , sensor system 104 , control system 106 , one or more peripherals 108 and power supply 110 , computer system 112 , and user interface 116 .
  • the autonomous vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple components. Additionally, each of the subsystems and components of the autonomous vehicle 100 may be wired or wirelessly interconnected.
  • the travel system 102 may include components that provide powered motion for the autonomous vehicle 100 .
  • travel system 102 may include engine 118 , energy source 119 , transmission 120 , and wheels 121 .
  • the engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations, such as a hybrid engine composed of a gasoline engine and an electric motor, and a hybrid engine composed of an internal combustion engine and an air compression engine.
  • Engine 118 converts energy source 119 into mechanical energy. Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy to other systems of the autonomous vehicle 100 .
  • Transmission 120 may transmit mechanical power from engine 118 to wheels 121 .
  • Transmission 120 may include a gearbox, a differential, and a driveshaft. In one embodiment, transmission 120 may also include other devices, such as clutches.
  • the drive shaft may include one or more axles that may be coupled to one or more wheels 121 .
  • the sensor system 104 may include several sensors that sense information about the environment surrounding the autonomous vehicle 100 .
  • the sensor system 104 may include a global positioning system 122 (the positioning system may be a global positioning GPS system, a Beidou system or other positioning systems), an inertial measurement unit (IMU) 124, a radar 126, a laser ranging instrument 128 and camera 130.
  • the sensor system 104 may also include sensors that monitor the internal systems of the autonomous vehicle 100 (e.g., an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensing data from one or more of these sensors can be used to detect objects and their corresponding properties (position, shape, orientation, velocity, etc.). Such detection and identification is a critical function for the safe operation of the autonomous vehicle 100.
  • the positioning system 122 may be used to estimate the geographic location of the autonomous vehicle 100 .
  • the IMU 124 is used to sense position and orientation changes of the autonomous vehicle 100 based on inertial acceleration.
  • IMU 124 may be a combination of an accelerometer and a gyroscope.
  • the radar 126 may utilize radio signals to perceive objects in the surrounding environment of the autonomous vehicle 100, and may be embodied as a millimeter-wave radar or a lidar. In some embodiments, in addition to sensing objects, radar 126 may be used to sense the speed and/or heading of objects.
  • the laser rangefinder 128 may utilize the laser light to sense objects in the environment in which the autonomous vehicle 100 is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, among other system components.
  • Camera 130 may be used to capture multiple images of the surrounding environment of autonomous vehicle 100 .
  • Camera 130 may be a still camera or a video camera.
  • Control system 106 controls the operation of the autonomous vehicle 100 and its components.
  • Control system 106 may include various components, including steering system 132, throttle 134, braking unit 136, computer vision system 140, route control system 142, and obstacle avoidance system 144.
  • the steering system 132 is operable to adjust the heading of the autonomous vehicle 100 .
  • it may be a steering wheel system.
  • the throttle 134 is used to control the operating speed of the engine 118 and thus the speed of the autonomous vehicle 100 .
  • the braking unit 136 is used to control the deceleration of the autonomous vehicle 100 .
  • the braking unit 136 may use friction to slow the wheels 121 .
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electrical current.
  • the braking unit 136 may also take other forms to slow the wheels 121 to control the speed of the autonomous vehicle 100 .
  • Computer vision system 140 may be operable to process and analyze images captured by camera 130 in order to identify objects and/or features in the environment surrounding autonomous vehicle 100 .
  • the objects and/or features may include traffic signals, road boundaries and obstacles.
  • Computer vision system 140 may use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking, and other computer vision techniques.
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and the like.
  • the route control system 142 is used to determine the travel route and travel speed of the autonomous vehicle 100 .
  • the route control system 142 may include a lateral planning module 1421 and a longitudinal planning module 1422, which are respectively used to determine the driving route and driving speed of the autonomous vehicle 100 in combination with data from the obstacle avoidance system 144, the GPS 122, and one or more predetermined maps.
  • Obstacle avoidance system 144 is used to identify, evaluate and avoid or otherwise traverse obstacles in the environment of autonomous vehicle 100 , which may be embodied as actual obstacles and virtual moving objects that may collide with autonomous vehicle 100 .
  • the control system 106 may additionally or alternatively include components in addition to those shown and described. Alternatively, some of the components shown above may be reduced.
  • Peripherals 108 may include a wireless communication system 146 , an onboard computer 148 , a microphone 150 and/or a speaker 152 .
  • peripherals 108 provide a means for a user of autonomous vehicle 100 to interact with user interface 116 .
  • the onboard computer 148 may provide information to a user of the autonomous vehicle 100 .
  • User interface 116 may also operate on-board computer 148 to receive user input.
  • the onboard computer 148 can be operated via a touch screen.
  • peripherals 108 may provide a means for autonomous vehicle 100 to communicate with other devices located within the vehicle.
  • Wireless communication system 146 may wirelessly communicate with one or more devices, either directly or via a communication network.
  • wireless communication system 146 may use 3G cellular communications, such as, for example, code division multiple access (CDMA), EVDO, global system for mobile communications (GSM), general packet radio service technology (general packet radio service, GPRS), or 4G cellular communications, such as long term evolution (LTE) or 5G cellular communications.
  • the wireless communication system 146 may communicate using a wireless local area network (WLAN).
  • the wireless communication system 146 may communicate directly with the device using an infrared link, Bluetooth, or ZigBee.
  • other wireless protocols, such as various vehicle communication systems, may also be used; for example, the wireless communication system 146 may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communications between vehicles and/or roadside stations.
  • the power source 110 may provide power to various components of the autonomous vehicle 100 .
  • the power source 110 may be a rechargeable lithium-ion or lead-acid battery.
  • One or more battery packs of such batteries may be configured as a power source to provide power to various components of the autonomous vehicle 100 .
  • power source 110 and energy source 119 may be implemented together, such as in some all-electric vehicles.
  • Computer system 112 may include at least one processor 113 that executes instructions 115 stored in a non-transitory computer-readable medium such as memory 114 .
  • Computer system 112 may also be a plurality of computing devices that control individual components or subsystems of autonomous vehicle 100 in a distributed fashion.
  • the processor 113 may be any conventional processor, such as a commercially available central processing unit (CPU).
  • the processor 113 may be a dedicated device such as an application specific integrated circuit (ASIC) or other hardware-based processor.
  • the processor, memory, and other components of the computer system 112 may actually include multiple processors or memories that are not stored within the same physical enclosure.
  • memory 114 may be a hard drive or other storage medium located within a different enclosure than computer system 112 .
  • references to processor 113 or memory 114 will be understood to include references to sets of processors or memories that may or may not operate in parallel.
  • some components such as the steering and deceleration components may each have their own processor that only performs computations related to component-specific functions .
  • the processor 113 may be located remotely from the autonomous vehicle 100 and in wireless communication with the autonomous vehicle 100 . In other aspects, some of the processes described herein are performed on the processor 113 disposed within the autonomous vehicle 100 while others are performed by the remote processor 113, including taking the necessary steps to perform a single maneuver.
  • memory 114 may include instructions 115 (e.g., program logic) executable by processor 113 to perform various functions of autonomous vehicle 100, including those described above. Memory 114 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of travel system 102, sensor system 104, control system 106, and peripherals 108.
  • Step 1 Consider safety factors and traffic regulations to determine the timing of changing lanes
  • Step 2 Plan a driving trajectory
  • Step 3 Control the accelerator, brakes and steering wheel to drive the vehicle along a predetermined trajectory.
  • the above operations correspond to autonomous vehicles and can be performed by the behavior planner (BP), motion planner (MoP) and motion controller (Control) of the autonomous vehicle, respectively.
  • BP is responsible for issuing high-level decisions
  • MoP is responsible for planning the expected trajectory and speed
  • Control is responsible for operating the accelerator and braking steering wheel, so that the autonomous vehicle can reach the target speed according to the target trajectory.
  • the related operations performed by the behavior planner, the motion planner and the motion controller may be performed by the processor 113 shown in FIG. 1.
  • the behavior planner, the motion planner, and the motion controller are also sometimes collectively referred to as a regulation module.
  • memory 114 may store data such as road maps, route information, vehicle location, direction, speed, and other such vehicle data, among other information. Such information may be used by the autonomous vehicle 100 and the computer system 112 during operation of the autonomous vehicle 100 in autonomous, semi-autonomous, and/or manual modes.
  • user interface 116 may include one or more input/output devices within the set of peripheral devices 108 , such as wireless communication system 146 , onboard computer 148 , microphone 150 and speaker 152 .
  • Computer system 112 may control functions of autonomous vehicle 100 based on input received from various subsystems (eg, travel system 102 , sensor system 104 , and control system 106 ) and from user interface 116 .
  • computer system 112 may utilize input from control system 106 to control steering system 132 to avoid obstacles detected by sensor system 104 and obstacle avoidance system 144 .
  • computer system 112 is operable to provide control over many aspects of autonomous vehicle 100 and its subsystems.
  • one or more of these components described above may be installed or associated with the autonomous vehicle 100 separately.
  • memory 114 may exist partially or completely separate from autonomous vehicle 100 .
  • the above-described components may be communicatively coupled together in a wired and/or wireless manner.
  • An autonomous vehicle traveling on a road can identify objects within its surroundings to determine adjustments to current speed.
  • the objects may be other vehicles, traffic control equipment, or other types of objects.
  • each identified object may be considered independently, and based on the object's respective characteristics, such as its current speed, acceleration, distance from the vehicle, etc., may be used to determine the speed at which the autonomous vehicle is to adjust.
  • autonomous vehicle 100, or a computing device associated with autonomous vehicle 100 (such as computer system 112, computer vision system 140, and memory 114 of FIG. 1), may predict the behavior of the identified objects based on the characteristics of the identified objects and the state of the surrounding environment (e.g., traffic, rain, ice on the road, etc.).
  • each identified object is dependent on the behavior of the other, so it is also possible to predict the behavior of a single identified object by considering all identified objects together.
  • the autonomous vehicle 100 can adjust its speed based on the predicted behavior of the identified object. In other words, the autonomous vehicle 100 can determine what steady state the vehicle will need to adjust to (eg, accelerate, decelerate, or stop) based on the predicted behavior of the object.
  • the computing device may also provide instructions to modify the steering angle of the autonomous vehicle 100, so that the autonomous vehicle 100 follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects in the vicinity of the autonomous vehicle 100 (for example, cars in adjacent lanes on the road).
  • the above-mentioned self-driving vehicle 100 may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, an amusement vehicle, an amusement park vehicle, construction equipment, a tram, a golf cart, a train, etc.
  • this is not particularly limited in the embodiments of the present application.
  • the embodiment of the present application provides an image processing method, which can be applied to the automatic driving vehicle 100 shown in FIG. 1 .
  • FIG. 2 it is a schematic flowchart of an image processing method according to an embodiment of the present application.
  • an image processing method provided by the present application may include the following steps:
  • the vehicle may acquire images to be processed through the sensor system 104 .
  • the vehicle may acquire images to be processed through the camera 130 .
  • the image to be processed is used to represent the environment around the vehicle.
  • the vehicle may acquire environmental information around the vehicle through the camera 130 in real time, that is, acquire the image to be processed in real time.
  • when the vehicle learns that the road ahead is about to enter an intersection, it starts to acquire the environmental information around the vehicle through the camera 130, that is, when the road ahead of the vehicle is about to enter the intersection, it starts to acquire the to-be-processed image.
  • the obtained images to be processed may be screened to obtain images to be processed whose signal-to-noise ratio satisfies a preset condition. According to the actual situation, different screening methods can be used to delete the data that does not meet the signal-to-noise ratio, and obtain the data that meets the signal-to-noise ratio. Some duplicate images to be processed can also be deleted.
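  • As one possible reading of this screening step (the SNR estimate and both thresholds below are assumptions, since the application does not fix a particular criterion), frames could be filtered like this:

    import numpy as np

    def keep_frame(gray: np.ndarray, prev_gray=None, snr_min=2.0, diff_min=4.0) -> bool:
        # Crude SNR estimate: mean intensity over intensity standard deviation.
        snr = gray.mean() / (gray.std() + 1e-6)
        if snr < snr_min:
            return False                     # does not meet the signal-to-noise condition
        if prev_gray is not None:
            # Drop near-duplicate images to be processed.
            if np.abs(gray.astype(np.float32) - prev_gray.astype(np.float32)).mean() < diff_min:
                return False
        return True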
  • the first neural network may be a neural network for performing image segmentation tasks. Neural networks that can be used to perform image segmentation tasks in the related art can be used in all embodiments of the present application.
  • the first neural network includes, but is not limited to: a spatial convolutional neural network (spatial CNN, SCNN), a fully convolutional network (FCN), a U-shaped neural network (U-Net), a mask region convolutional neural network (Mask-RCNN), and a semantic segmentation network (SegNet).
  • the first prediction result will indicate the probability that each pixel in the image to be processed belongs to the lane line, specifically the probability that each pixel belongs to the stop lane line, and the probability that each pixel belongs to the guide lane line.
  • the set of pixel points whose probability of belonging to the stop lane line exceeds the preset threshold can be used to obtain a region of the stop lane line in the image to be processed.
  • the set of pixel points whose probability of belonging to a guide lane line exceeds a preset threshold can be used to obtain the area of a guide lane line in the image to be processed.
  • whether a lane line is included in the image to be processed can be obtained through the first neural network, and if a lane line is included, the lane line is segmented from the image to be processed. Exemplarily, a lane line detection method is given below: the image is input into the first neural network for feature extraction, and the extracted features (each feature map is divided into multiple grids in advance) are then decoded by a prediction head model to generate dense line clusters (i.e., multiple predicted lane lines). Finally, according to the confidence of each predicted lane line (also known as the confidence of the grid; the confidence reflects whether there is a lane line passing through the grid and how likely it is to pass through the grid), grids whose confidence is greater than the preset value are used to predict lane lines, and grids whose confidence is lower than the preset value are considered to contribute nothing to the prediction.
  • predicted lane lines that are close to each other are divided into one group, the line cluster is divided into several groups in a similar way, and the baseline in each group is taken as the detection result of a real lane line of that group; finally, the detection results of real lane lines are output. It should be noted that a person skilled in the art can select a method for lane line detection according to actual conditions, which is not limited in this embodiment of the present application.
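  • A hedged sketch of the confidence filtering and grouping just described; the grouping criterion (mean column distance) and the thresholds are assumptions made for the sketch, as the application does not prescribe them:

    import numpy as np

    def filter_and_group_lines(pred_lines, confidences, conf_thresh=0.6, group_dist=30.0):
        # pred_lines: one (N, 2) array of (row, col) points per predicted lane line
        # (the dense line cluster); confidences: the grid confidence of each line.
        kept = [line for line, c in zip(pred_lines, confidences) if c > conf_thresh]
        kept.sort(key=lambda line: line[:, 1].mean())
        groups, baselines = [], []
        for line in kept:
            if groups and line[:, 1].mean() - groups[-1][-1][:, 1].mean() < group_dist:
                groups[-1].append(line)          # close to the previous group: same lane line
            else:
                groups.append([line])
                baselines.append(line)           # baseline representing the new group
        return baselines                         # one detection result per real lane line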
  • the first prediction result indicates that the first region of the image to be processed is a lane line
  • the height information includes the preset physical height of the object to be detected.
  • the object to be detected includes a traffic light
  • the height information includes the actual height of the preset traffic light.
  • the height of the traffic light is usually 6-7 meters, and the height of the object of the traffic light can be preset as 7 meters.
  • the height information includes the preset actual height of the car.
  • the height of the car is usually 1.4 to 1.6 meters, and the object height of the car can be preset to 1.6 meters.
  • the region of interest is used by the second neural network to obtain the candidate frame and classification of the object to be detected.
  • the second neural network may be a neural network for performing object recognition tasks, including but not limited to convolutional neural network (CNN), deep neural network (DNN), you can only Look once (you only look once, YOLO) v3 (version number, representing the third edition), single shot multibox detector (SSD).
  • the ROI in this application means to outline, in the form of a box, the area to be processed from the image to be processed (also referred to as a matting area in this application), and to input the ROI into the second neural network to output candidate boxes and classifications of the object to be detected. Determining the region of interest includes determining the position of the region of interest, the length of the region of interest, and the width of the region of interest.
  • the solution provided by the present application is aimed at the image to be processed obtained by the vehicle.
  • the obtained image to be processed includes lane lines
  • the region of interest of the object to be detected in the image to be processed is obtained according to the lane line.
  • the position of the region of interest and the length of the region of interest may be determined according to the lane line
  • the width of the region of interest may be determined according to the object height of the object to be detected.
  • if the to-be-processed image includes a stop lane line, that is, the lane line includes the stop lane line, the position of the lower edge of the ROI and the length of the ROI area are determined according to the position of the stop lane line in the image, which can well select the area corresponding to the intersection and road section in the to-be-processed image and is beneficial to improve the detection accuracy of the object to be detected at the intersection and road section. If the acquired image to be processed does not include stop lane lines but includes guide lane lines, the position of the lower edge of the ROI and the length of the ROI area can be determined according to the positional relationship between the guide lane lines in the image to be processed.
  • the appropriate ROI area can also be determined according to the guide lane line, and the area corresponding to the intersection and road section is selected in the to-be-processed image to improve the detection accuracy of the object to be detected at the intersection and road section.
  • this application may call a stop lane line a stop line, and both have the same meaning.
  • the object to be detected is a traffic light.
  • the solution provided in this application can effectively improve the accuracy of traffic light detection.
  • One factor that restricts the accuracy of traffic light detection is that, for a to-be-processed image acquired at a given focal length, the pixels occupied by a traffic light are far fewer than the pixels occupied by other objects to be detected in the same image (such as people or vehicles). Before the image to be processed is input into a neural network, it generally needs to be compressed to reduce its size and thus the amount of data the neural network has to process; since the proportion of pixels occupied by the traffic light is inherently small, the traffic light may occupy even fewer pixels after compression, which greatly increases the difficulty of detecting it.
  • in order to ensure that the traffic light occupies enough pixels in the image to be processed, an ROI area can be selected from the image to be processed and input into the neural network, so that the neural network detects the traffic light according to the ROI area. Because the ROI area of the traffic light keeps changing while the vehicle is driving, for example, as the vehicle approaches the traffic light the ROI area of the traffic light constantly moves upward in the image, how the ROI area is selected is very important for improving the accuracy of traffic light detection.
  • in the methods used in the prior art, the ROI area is generally fixed, and a fixed ROI area cannot adapt to the constantly changing ROI of the traffic light. In addition, the ROI area of the traffic light is currently usually obtained from a high-precision map together with the GPS positioning information of the vehicle, and this way of obtaining the ROI area of the traffic light fails when the GPS positioning is inaccurate or the GPS signal cannot be obtained.
  • the solution provided by this application can obtain the ROI area of the traffic light through lane lines, and is therefore not limited by GPS signals or high-precision maps. The ROI area of the traffic light is obtained from the position information and lengths of the stop line and lane lines in the image to be processed and from the positional relationship between the guide lane lines, so that the ROI area of the traffic light changes dynamically; combined with the actual physical height of the traffic light, the ROI area of the traffic light can be better selected in the image to be processed. Inputting the ROI area obtained by the solution provided in this application into the neural network, so that the neural network performs traffic light detection according to this ROI area, can effectively improve the accuracy of traffic light detection.
  • the lane lines may include stop lane lines and guide lane lines, and depending on whether the lane lines include a stop lane line, there may be different implementations of obtaining the region of interest of the object to be detected in the image to be processed according to the height information and the first area.
  • the following describes, with reference to several typical implementations, how to obtain the region of interest of the object to be detected in the image to be processed according to the lane lines.
  • the image to be processed includes the stop lane line
  • FIG. 3 it is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • another image processing method provided by the present application may include the following steps:
  • Steps 301 and 302 can be understood with reference to steps 201 and 202 in the embodiment corresponding to FIG. 2 , and details are not repeated here.
• When the first prediction result indicates that the first region of the image to be processed is a lane line and the lane line includes a stop line, acquire the length of the stop line in the image to be processed.
• The first prediction result indicates the probability that each pixel in the image to be processed belongs to the stop line. The area occupied by the pixels whose probability of belonging to the stop line exceeds a preset threshold is called area 1; area 1 within the first region can be used to indicate the position of the stop line in the image to be processed.
• Referring to FIG. 4-a, which is a schematic diagram of one solution for obtaining the length of the stop line in an embodiment of the present application: the stop line is composed of a plurality of first pixels, where the first pixels are pixels included in the first region, and the length of the stop line in the image to be processed is obtained according to the distance between the two most distant first pixels.
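• As a minimal sketch of this manner (assuming the first prediction result is available as a per-pixel stop-line probability map; the threshold value and function names below are illustrative and not taken from the original), the stop-line length can be measured between the two extreme stop-line pixels:

import numpy as np

def stop_line_length(stop_line_prob: np.ndarray, threshold: float = 0.5) -> float:
    """Length of the stop line in pixels, taken as the distance between the two
    most distant pixels whose stop-line probability exceeds the threshold."""
    ys, xs = np.nonzero(stop_line_prob > threshold)      # pixels forming "area 1"
    if len(xs) < 2:
        return 0.0
    pts = np.stack([xs, ys], axis=1).astype(np.float64)
    # For a thin, roughly straight stop line, the two extreme points along its
    # principal direction approximate the most distant pixel pair.
    center = pts.mean(axis=0)
    direction = np.linalg.svd(pts - center)[2][0]        # principal axis
    proj = (pts - center) @ direction
    p_min, p_max = pts[proj.argmin()], pts[proj.argmax()]
    return float(np.linalg.norm(p_max - p_min))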
• In another implementation, the first prediction result indicates that area 1 of the image to be processed is a stop line, and the lane lines further include guide lane lines. Referring to FIG. 4-b, which is a schematic diagram of another solution for obtaining the length of the stop line in an embodiment of the present application: in this manner, a plurality of pixel points may be selected from area 1, and a straight line may be fitted to these pixel points to obtain a fitted straight line segment.
• The first prediction result also indicates the probability that each pixel belongs to a guide lane line, and each guide lane line has its own probability map. The probability map of a guide lane line indicates the probability that each pixel belongs to that guide lane line; the area occupied by the pixels whose probability of belonging to the guide lane line exceeds a preset threshold is called area 2, and area 2 within the first region can be used to represent the position of the guide lane line in the image to be processed. Multiple pixel points can be selected from area 2 and line fitting can be performed on them; a fitted line segment is obtained in this manner for each guide lane line, so that a plurality of fitted line segments are obtained. Alternatively, curve fitting may be performed on the multiple pixel points.
• The length of the stop line in the image to be processed is then obtained according to the distance between a first intersection point and a second intersection point, where the first intersection point is the intersection between the curved line segment corresponding to the first guide lane line and one end of the straight line segment corresponding to the stop line, the second intersection point is the intersection between the curved line segment corresponding to the second guide lane line and the other end of the straight line segment corresponding to the stop line, and the first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines. Re-determining the length of the stop line through the guide lane lines is beneficial for obtaining a more accurate stop-line length and reducing the error.
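• A sketch of this manner is given below, assuming the guide lane lines have been curve-fitted as x = g(y) (column as a polynomial of the row) and the stop line as a straight line y = m*x + c; the helper names and the simple row-scanning root search are illustrative assumptions, not the patent's prescribed implementation:

import numpy as np

def fit_lane(xs, ys, deg=2):
    """Fit one guide lane line as x = g(y)."""
    return np.poly1d(np.polyfit(ys, xs, deg))

def fit_stop_line(xs, ys):
    """Fit the stop line as a straight line y = m*x + c."""
    m, c = np.polyfit(xs, ys, 1)
    return m, c

def lane_stop_intersection(lane_poly, m, c, y_lo, y_hi):
    """Point where the lane curve x = g(y) meets the stop line y = m*x + c,
    found by scanning candidate rows (a dependency-free root search)."""
    ys = np.linspace(y_lo, y_hi, 2000)
    xs = lane_poly(ys)
    i = np.abs(ys - (m * xs + c)).argmin()
    return np.array([xs[i], ys[i]])

def stop_line_length_from_lanes(left_lane, right_lane, m, c, y_lo, y_hi):
    """Stop-line length as the distance between its intersections with the two
    farthest-apart (leftmost and rightmost) fitted guide lane lines."""
    p1 = lane_stop_intersection(left_lane, m, c, y_lo, y_hi)
    p2 = lane_stop_intersection(right_lane, m, c, y_lo, y_hi)
    return float(np.linalg.norm(p1 - p2)), p1, p2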
• In one implementation, the length of the stop line is directly taken as the length of the region of interest; in another implementation, the length of the stop line may be further processed, for example increased by a preset pixel distance, to obtain the length of the region of interest.
• The length of the object to be detected in the image to be processed is obtained according to the height information and a scale, where the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected.
• Specifically, a first distance is acquired, where the first distance is the distance between the object to be detected and the ego vehicle. The distance between the object to be detected and the vehicle can be obtained in various ways, for example through the radar 126; any manner of obtaining this distance in the related art can be used in the embodiments of the present application, including but not limited to monocular ranging and binocular ranging. A second distance is also acquired, where the second distance is the distance between the stop line and the lower edge of the image to be processed; referring to FIG. 4-c, the distance between the stop line and the lower edge of the image to be processed can indicate, in the image, the distance between the object to be detected (such as a traffic light) and the vehicle. The scale is obtained according to the first distance and the second distance: through the scale, the actual physical length corresponding to one pixel can be obtained, and then the number of pixels occupied in the image to be processed by the physical height of the object to be detected can be obtained. The physical height of the object to be detected can be understood with reference to the physical height of the object to be detected described in step 203 in the embodiment corresponding to FIG. 2, and details are not repeated here.
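• The passage above does not give an explicit formula for the scale; a minimal sketch, assuming the scale is simply the ratio of the physical first distance to the pixel-valued second distance (an assumption, not a statement of the patented method), could look like this:

def scale_from_distances(first_distance_m: float, second_distance_px: float) -> float:
    """Meters represented by one pixel, assumed to be the ratio of the physical
    distance to the object (first distance) to the pixel distance between the
    stop line and the lower edge of the image (second distance)."""
    return first_distance_m / second_distance_px

def object_pixel_length(physical_height_m: float, meters_per_pixel: float) -> float:
    """Number of pixels the object's preset physical height (e.g. a traffic light)
    occupies in the image to be processed."""
    return physical_height_m / meters_per_pixel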
• In another implementation, the first prediction result indicates that the first region of the image to be processed is a lane line, and the lane lines further include guide lane lines; for example, area 2 of the first region indicates that at least two guide lane lines are included. For any two adjacent guide lane lines, the scale is obtained according to the width between the two guide lane lines in the image to be processed and the preset physical width between the two guide lane lines. The width between any two adjacent guide lane lines in the image to be processed indicates the width of the lane in the image; the physical width of a lane is usually 3.5 meters and can be preset. Through the ratio of the two, the actual physical length corresponding to one pixel can be obtained, and then the number of pixels occupied in the image to be processed by the physical height of the object to be detected can be obtained.
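• A corresponding sketch, assuming the ratio is again interpreted as meters per pixel (the 3.5 m default and the example height below are illustrative values):

def scale_from_lane_width(lane_width_px: float, lane_width_m: float = 3.5) -> float:
    """Meters per pixel derived from the width between two adjacent guide lane
    lines in the image and the preset physical lane width (typically 3.5 m)."""
    return lane_width_m / lane_width_px

# Example: for an assumed traffic-light physical height of 1.0 m and a lane that
# spans 350 pixels in the image, the light would occupy roughly
# 1.0 / scale_from_lane_width(350.0) = 100 pixels in the image to be processed.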
• In one implementation, the length of the object to be detected in the image to be processed is directly taken as the width of the region of interest; in another implementation, the length of the object to be detected in the image to be processed may be further processed, for example increased by a preset pixel distance, to obtain the width of the region of interest. The position of the lower edge of the region of interest is determined according to the position of the stop line in the image to be processed; for example, the lower edge of the region of interest is the line segment between the first intersection point and the second intersection point corresponding to the stop line obtained in step 303. In this way the length and width of the region of interest are obtained, and thus the size of the region of interest is obtained.
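• Putting the pieces together, a simplified, illustrative assembly of the region of interest (assuming image coordinates with the row index growing downward, and that the lower edge sits on the stop-line segment between the two intersections) might be:

def roi_from_stop_line(p1, p2, object_len_px, pad_px=0):
    """Return an axis-aligned ROI box (x0, y0, x1, y1). The lower edge lies on the
    stop-line segment between intersections p1 and p2, the ROI length is their
    horizontal extent (plus optional padding), and the ROI width is the object's
    pixel length (plus optional padding)."""
    x_left, x_right = sorted([p1[0], p2[0]])
    y_bottom = max(p1[1], p2[1])                 # lower edge on the stop line
    x0 = x_left - pad_px
    x1 = x_right + pad_px                        # ROI length from the stop line
    y1 = y_bottom
    y0 = y_bottom - (object_len_px + pad_px)     # ROI width from the object height
    return (int(x0), int(y0), int(x1), int(y1))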
• Stop lines and traffic lights often appear together in the image to be processed. During driving, the area of the stop line in the image to be processed is constantly changing, and the area where the traffic light is located in the image to be processed is also constantly changing. The solution provided by the present application obtains the position of the region of interest from the position of the stop line in the image to be processed, and obtains the size of the region of interest from the length of the stop line. Because the area of the stop line in the image to be processed is constantly changing, the position and size of the acquired region of interest change accordingly, so that the region of interest can accurately cover the intersection scene. This is beneficial for recognizing the objects to be detected at intersections and road sections, such as traffic light detection, and improves the accuracy of object recognition at intersections and road sections.
• The image to be processed does not include the stop line
• In some cases, the image to be processed may not include a stop line. In such cases, the region of interest can be obtained according to the guide lane lines, and different ways of obtaining the region of interest can be adopted. The following describes these with reference to several typical embodiments.
• Referring to FIG. 5, which is a schematic flowchart of another image processing method provided by an embodiment of the present application, the method may include the following steps:
  • Steps 501 and 502 can be understood with reference to steps 201 and 202 in the embodiment corresponding to FIG. 2 , and details are not repeated here.
• When the first prediction result indicates that the lane lines in the first region of the image to be processed include at least two guide lane lines and do not include a stop line, obtain the length of the region of interest according to the distance between the third intersection point and the fourth intersection point.
• The first prediction result indicates the probability that each pixel belongs to a guide lane line, and each guide lane line has its own probability map. The probability map of a guide lane line indicates the probability that each pixel belongs to that guide lane line; the area occupied by the pixels whose probability exceeds the preset threshold is called area 2, and area 2 within the first region can be used to indicate the position of the guide lane line in the image to be processed. A plurality of pixel points may be selected from area 2 and curve fitting may be performed on them; a fitted curve segment is obtained in this manner for each guide lane line, and each fitted curve segment can be regarded as a guide lane line.
• The third intersection point is the intersection of the first guide lane line with one end of the first straight line segment in the image to be processed, and the fourth intersection point is the intersection of the second guide lane line with the other end of the first straight line segment in the image to be processed. The first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines. The first straight line segment is a straight line segment passing through the second pixel, and the second pixel is the pixel corresponding to the highest point, in the image to be processed, of the shortest guide lane line among the at least two guide lane lines.
• For example, the image to be processed includes at least two guide lane lines, and at least one of them is partially missing, so that the at least two guide lane lines are inconsistent in length. The missing portions may be caused by the lane lines actually being absent, or by the processing of the image segmentation neural network; the embodiments of the present application do not limit the specific cause of the missing lane lines. The length of the region of interest is obtained according to the pixel corresponding to the highest point of the shortest guide lane line in the image to be processed, the straight line segment passing through that pixel, and the leftmost and rightmost guide lane lines; the line segment between the third intersection point and the fourth intersection point is the lower edge of the region of interest.
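• A minimal sketch of this case is given below, assuming each guide lane line is available both as its pixel set (area 2) and as a fitted curve x = g(y), and assuming the shortest lane is the one whose topmost pixel lies lowest in the image; these assumptions and all names are illustrative:

import numpy as np

def roi_lower_edge_without_stop_line(lane_pixel_sets, lane_polys):
    """lane_pixel_sets: list of (ys, xs) arrays, one per guide lane line (area 2);
    lane_polys: the corresponding fitted curves x = g(y).
    Returns the row of the ROI lower edge, the third and fourth intersection
    points, and the resulting ROI length in pixels."""
    top_rows = [ys.min() for ys, _ in lane_pixel_sets]   # highest point of each lane
    y_edge = max(top_rows)                               # shortest lane stops lowest
    xs_at_edge = [float(poly(y_edge)) for poly in lane_polys]
    x_left, x_right = min(xs_at_edge), max(xs_at_edge)   # farthest-apart lanes
    third = (x_left, y_edge)                             # third intersection point
    fourth = (x_right, y_edge)                           # fourth intersection point
    return y_edge, third, fourth, x_right - x_left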
  • Step 504 can be understood with reference to step 305 in the embodiment corresponding to FIG. 3 , and details are not repeated here.
• In one implementation, the length of the object to be detected in the image to be processed is directly taken as the width of the region of interest; in another implementation, the length of the object to be detected in the image to be processed may be further processed, for example increased by a preset pixel distance, to obtain the width of the region of interest.
• When the image to be processed does not include a stop line, the area corresponding to the intersection cannot be selected from the image to be processed according to the stop line. In this case, the position of the lower edge of the region of interest is obtained according to the pixel corresponding to the highest point of the shortest guide lane line in the image to be processed, and the size of the region of interest is then determined.
• In some possible implementations, the at least two guide lane lines in the image to be processed may all be complete guide lane lines without missing portions; in addition, in some possible implementations, at least two of the guide lane lines in the image to be processed may intersect. For example, when the difference between the abscissa of a pixel belonging to one guide lane line and the abscissa of a pixel belonging to another guide lane line is within a preset range, the two guide lane lines can be considered to intersect. For these scenarios, how to determine the size and position of the region of interest is described below with reference to a specific implementation.
• Referring to FIG. 7, which is a schematic flowchart of another image processing method provided by an embodiment of the present application, the method may include the following steps:
  • Steps 701 and 702 can be understood with reference to steps 201 and 202 in the embodiment corresponding to FIG. 2 , and details are not repeated here.
• When the first prediction result indicates that the lane lines in the first region include at least two guide lane lines and do not include a stop line, the position of the lower edge of the region of interest is determined according to the position of the first line segment in the image to be processed.
• The first prediction result indicates the probability that each pixel belongs to a guide lane line, and each guide lane line has its own probability map. The probability map of a guide lane line indicates the probability that each pixel belongs to that guide lane line; the area occupied by the pixels whose probability exceeds the preset threshold is called area 2, and area 2 within the first region can be used to indicate the position of the guide lane line in the image to be processed. A plurality of pixel points may be selected from area 2 and curve fitting may be performed on them; a fitted curve segment is obtained in this manner for each guide lane line, and each fitted curve segment can be regarded as a guide lane line.
• The first line segment occupies pixels of a preset length, where the preset length may be a range of lengths, and any line segment whose length falls within that range can serve as the first line segment; alternatively, the preset length may be a fixed length. One end of the first line segment intersects the first guide lane line, and the other end of the first line segment intersects the second guide lane line, where the first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines. For example, the first line segment occupies 300 pixels: assuming the intersection of the first line segment with the first guide lane line is intersection 1 and the intersection with the second guide lane line is intersection 2, the difference between the abscissa of intersection 1 and the abscissa of intersection 2 is the preset length in pixels, for example 300 pixels.
• For example, assume that the preset length is a range of lengths, and that the leftmost guide lane line and the rightmost guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines. Referring to line segment 1, line segment 2, and line segment 3 in FIG. 8: the pixel length occupied by line segment 1 does not meet the condition, for example it is not within the preset length range, specifically it exceeds the maximum of the preset length range; the pixel length occupied by line segment 3 does not meet the condition either, for example it is not within the preset length range; the pixel length occupied by line segment 2 satisfies the condition, for example it is within the preset length range. Any line segment satisfying the preset length range can then be selected as the first line segment, for example line segment 2, and the position of the lower edge of the region of interest is determined according to the first line segment; for example, the distance between the lower edge of the region of interest and the first line segment in the image to be processed does not exceed a preset threshold.
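• A sketch of this selection, assuming the leftmost and rightmost guide lane lines have been fitted as x = g(y) and that the preset length range below (an illustrative 250-350 pixels around the 300-pixel example) is tied to the input size of the second neural network:

def find_first_line_segment(left_poly, right_poly, y_min, y_max,
                            min_len_px=250, max_len_px=350):
    """Scan image rows from y_min to y_max and return the first horizontal segment
    between the leftmost lane curve x = left_poly(y) and the rightmost lane curve
    x = right_poly(y) whose pixel length lies within the preset range."""
    for y in range(int(y_min), int(y_max) + 1):
        x_l, x_r = float(left_poly(y)), float(right_poly(y))
        seg_len = abs(x_r - x_l)
        if min_len_px <= seg_len <= max_len_px:
            return (min(x_l, x_r), y), (max(x_l, x_r), y), seg_len
    return None        # no row satisfies the preset length range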
• In one implementation, the length of the first line segment is directly taken as the length of the region of interest; in another implementation, the length of the first line segment is processed to obtain the length of the region of interest, for example by increasing the length of the first line segment by a preset pixel distance.
  • Step 705 can be understood with reference to step 305 in the embodiment corresponding to FIG. 3 , and details are not repeated here.
• In one implementation, the length of the object to be detected in the image to be processed is directly taken as the width of the region of interest; in another implementation, the length of the object to be detected in the image to be processed may be further processed, for example increased by a preset pixel distance, to obtain the width of the region of interest.
• When the image to be processed does not include a stop line, the area corresponding to the intersection cannot be selected from the image to be processed according to the stop line. In this case, the position of the lower edge of the region of interest is obtained according to the intersections of a line segment of the preset pixel length with the leftmost guide lane line and the rightmost guide lane line, and the size of the region of interest is then determined. This ensures that, even when the image to be processed does not include a stop line, the complete area corresponding to the intersection can still be obtained, and that the obtained area corresponding to the intersection is not too small, which would otherwise make the obtained region of interest too small.
• After the position and size of the region of interest in the image to be processed are obtained, the region of interest can be used as a cropped (cutout) area, and the cropped area can be input into the second neural network, so that the second neural network determines the candidate frame and classification of the object to be detected according to the cropped area. In some possible implementations, super-resolution processing may also be performed on the cropped area to improve its picture quality, and the super-resolution-processed cropped area is input into the second neural network to improve the object detection performance of the second neural network. In some possible implementations, the size of the cropped area may be too large; in order to reduce the amount of computation of the second neural network, the cropped area may also be compressed, and the compressed cropped area is input into the second neural network.
• Referring to FIG. 9, which is a schematic flowchart of another image processing method provided by an embodiment of the present application, the method may include the following steps:
  • Steps 901 and 902 can be understood with reference to steps 201 and 202 in the embodiment corresponding to FIG. 2 , and details are not repeated here.
• When the first prediction result indicates that the first region of the image to be processed is a lane line, obtain the region of interest of the object to be detected in the image to be processed according to the height information and the first region. The manner of obtaining the region of interest described in the embodiments corresponding to FIG. 2, FIG. 3, FIG. 5, and FIG. 7 can be used in the embodiment corresponding to FIG. 9, and details are not repeated here.
• If the resolution of the region of interest obtained according to the height information and the first region is greater than a preset resolution, the region of interest is compressed so that its resolution is reduced to the preset resolution. For example, the preset resolution is 896*512 pixels: if the resolution of the obtained region of interest is greater than 896*512 pixels, the region of interest is compressed to 896*512 pixels. The preset resolution is related to the input of the second neural network; for example, if the input format of the second neural network is 896*512 pixels, the preset resolution is set to 896*512 pixels. Regarding how to compress an image, various manners may be used, which is not limited in this embodiment of the present application; for example, multiple adjacent pixels may be averaged into one pixel to compress the image.
• If the resolution of the region of interest obtained according to the height information and the first region is smaller than the second preset threshold, super-resolution processing is performed on the region of interest so that its resolution is raised to the second preset threshold. For example, the preset resolution is 896*512 pixels: if the resolution of the obtained region of interest is less than 896*512 pixels, super-resolution processing is performed on the obtained region of interest to raise its resolution to 896*512 pixels. The preset resolution is related to the input of the second neural network; for example, if the input format of the second neural network is 896*512 pixels, the preset resolution is set to 896*512 pixels. Regarding how to perform super-resolution processing on an image and raise it to a specified resolution, there are multiple possible implementations, which are not limited in this embodiment of the present application. For example, super-resolution processing can be performed by deep learning networks such as the super-resolution convolutional neural network (SRCNN) or the accelerated super-resolution convolutional neural network (FSRCNN).
• Alternatively, a bicubic interpolation algorithm can be applied to the region of interest to increase the resolution of the region of interest.
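• A minimal sketch of this resolution adjustment, using plain interpolation as a stand-in for a learned super-resolution network (OpenCV is assumed here only for illustration; the 896*512 target matches the example above):

import cv2

def match_roi_to_network_input(roi_img, target_w=896, target_h=512):
    """Bring the region of interest to the preset resolution expected by the second
    neural network: area averaging (each output pixel averages several adjacent
    input pixels) when shrinking, bicubic interpolation when enlarging."""
    h, w = roi_img.shape[:2]
    if (w, h) == (target_w, target_h):
        return roi_img
    shrinking = w > target_w or h > target_h
    interp = cv2.INTER_AREA if shrinking else cv2.INTER_CUBIC
    return cv2.resize(roi_img, (target_w, target_h), interpolation=interp)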
  • the region of interest is processed by the second neural network to obtain candidate frames and categories of objects to be detected in the region of interest.
• The region of interest can also be merged back into the image to be processed. Each pixel in the image to be processed has a corresponding coordinate, so the region of interest can be re-merged into the image to be processed according to these coordinates and then displayed in the image to be processed.
• Referring to FIG. 11, which is a schematic flowchart of an image processing method according to an embodiment of the present application. The image to be processed is acquired, and it is determined whether there is a stop line in the image to be processed. If there is a stop line, the region of interest is acquired according to the stop line and the guide lane lines: the lower edge (length and position) of the region of interest is determined according to the line segment between the two intersections of the stop line with the leftmost lane line and the rightmost lane line, and the width of the region of interest is obtained according to the scale and the actual physical height of the object to be detected. If the image to be processed does not include a stop line, it is further judged whether the shortest guide lane line in the image to be processed intersects the other guide lane lines. If it intersects, the lower edge (length and position) of the region of interest is obtained according to a target line segment; for example, the length of the target line segment is 300 pixels, one end of the target line segment intersects the leftmost lane line, and the other end intersects the rightmost lane line. It should be noted that the length of 300 pixels is only an exemplary illustration, and the length of the target line segment may be determined according to the input threshold of the second neural network. The width of the region of interest is obtained according to the scale and the actual physical height of the object to be detected. If the shortest guide lane line does not intersect the other guide lane lines, the target line segment is parallel to the lower edge of the image to be processed and passes through the pixel corresponding to the highest point of the shortest lane line in the image to be processed.
• Then the relationship between the resolution of the region of interest and the preset resolution is determined: if the resolution of the region of interest is greater than the preset resolution, the region of interest is compressed so that its resolution is reduced to the preset resolution; if the resolution of the region of interest is smaller than the preset resolution, super-resolution processing is performed on the region of interest to raise its resolution to the preset resolution.
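• The branching order of this overall flow can be summarized by the following sketch; the three lower-edge candidates are assumed to have been computed by routines such as the earlier sketches, and nothing here is the patent's literal implementation:

def choose_lower_edge(stop_line_edge, preset_segment_edge, shortest_lane_edge,
                      has_stop_line: bool, lanes_intersect: bool):
    """Decision order of the flow described above for the ROI lower edge."""
    if has_stop_line:
        return stop_line_edge        # stop line present: use its segment
    if lanes_intersect:
        return preset_segment_edge   # lanes intersect: preset-length target segment
    return shortest_lane_edge        # otherwise: highest point of shortest lane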
• Referring to FIG. 12, which is a schematic diagram of selecting the region of interest in the image to be processed: the lower edge (length and position) of the region of interest is determined according to the line segment between the two intersections of the stop line with the leftmost lane line and the rightmost lane line, and the width of the region of interest is obtained according to the scale and the actual physical height of the object to be detected.
• The region of interest may be displayed by the in-vehicle device, or projected onto the windshield, and the region of interest always includes the area corresponding to the intersection. The region of interest obtained according to the solution provided in this application will only include the traffic lights that affect the driving state of the ego vehicle; therefore, traffic light detection within the region of interest will output only one decision result.
• The image processing method provided by the embodiments of the present application has been introduced above. Through the method, the area corresponding to the intersection can be well selected in the image to be processed, which is beneficial for improving the detection accuracy of the objects to be detected at intersections and road sections.
  • FIG. 13 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • the image processing apparatus may include an acquisition module 131, an image segmentation module 132, and a region of interest module 133.
  • the acquiring module 131 is configured to acquire the image to be processed.
  • the image segmentation module 132 is configured to input the image to be processed into the first neural network to obtain the first prediction result.
• The region of interest module 133 is configured to, when the first prediction result indicates that the first region of the image to be processed is a lane line, obtain the region of interest of the object to be detected in the image to be processed according to the height information and the first region, where the height information may include the preset physical height of the object to be detected, and the region of interest is used by the second neural network to obtain the candidate frame and classification of the object to be detected.
• In a possible implementation, the lane lines in the first region may include a stop line, and the region of interest module 133 is specifically configured to: acquire the length of the stop line in the image to be processed; obtain the length of the region of interest according to the length of the stop line in the image to be processed; obtain the length of the object to be detected in the image to be processed according to the height information and a scale, where the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected; and obtain the width of the region of interest according to the length of the object to be detected in the image to be processed.
• In a possible implementation, the first region may include a plurality of first pixels, the probability that each of the plurality of first pixels belongs to the stop line exceeds a first preset threshold, and the stop line is composed of the plurality of first pixels; the region of interest module 133 is specifically configured to obtain the length of the stop line in the image to be processed according to the distance between the two most distant pixels among the plurality of first pixels.
• In a possible implementation, the region of interest module 133 is further configured to: acquire a first distance, where the first distance is the distance between the object to be detected and the ego vehicle; acquire a second distance, where the second distance is the distance between the stop line and the lower edge of the image to be processed; and obtain the scale according to the first distance and the second distance.
• In a possible implementation, the lane lines in the first region may further include at least two guide lane lines, and the region of interest module 133 is further configured to: acquire the width, in the image to be processed, between any two adjacent guide lane lines of the at least two guide lane lines; and obtain the scale according to the width of the two adjacent guide lane lines in the image to be processed and the preset physical width of the two guide lane lines.
• In a possible implementation, the region of interest module 133 is specifically configured to obtain the length of the region of interest according to the distance between the first intersection point and the second intersection point, where the first intersection point is the intersection of the first guide lane line with one end of the stop line in the image to be processed, the second intersection point is the intersection of the second guide lane line with the other end of the stop line in the image to be processed, and the first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines.
• In a possible implementation, the position of the lower edge of the region of interest is determined according to the position of the stop line in the image to be processed.
• In a possible implementation, the lane lines in the first region may include at least two guide lane lines and may not include a stop line, and the region of interest module 133 is specifically configured to: obtain the length of the region of interest according to the distance between the third intersection point and the fourth intersection point, where the third intersection point is the intersection of the first guide lane line with one end of the first line segment in the image to be processed, the fourth intersection point is the intersection of the second guide lane line with the other end of the first line segment in the image to be processed, the first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines, the first line segment is a line segment passing through the second pixel, and the second pixel is the pixel corresponding to the highest point, in the image to be processed, of the shortest guide lane line among the at least two guide lane lines; obtain the length of the object to be detected in the image to be processed according to the height information and the scale, where the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected; and obtain the width of the region of interest according to the length of the object to be detected in the image to be processed.
  • the first line segment is parallel to the lower edge of the image to be processed.
• In a possible implementation, the lane lines in the first region may include at least two guide lane lines and may not include a stop line, and the position of the lower edge of the region of interest is determined according to the position of the first line segment in the image to be processed, where the first line segment occupies pixels of a preset length, one end of the first line segment intersects the first guide lane line, the other end of the first line segment intersects the second guide lane line, and the first guide lane line and the second guide lane line are the two guide lane lines that are farthest apart among the at least two guide lane lines.
• In a possible implementation, the region of interest module 133 is specifically configured to: acquire the length of the region of interest according to the length of the first line segment; obtain the length of the object to be detected in the image to be processed according to the height information and the scale, where the scale is used to indicate the proportional relationship between the length of the object to be detected in the image to be processed and the physical height of the object to be detected; and obtain the width of the region of interest according to the length of the object to be detected in the image to be processed.
• In a possible implementation, a compression module may also be included, and the compression module is configured to, if the resolution of the region of interest obtained according to the height information and the first region is greater than a second preset threshold, compress the resolution of the region of interest to the second preset threshold. In a possible implementation, a super-resolution processing module may also be included, and the super-resolution processing module is configured to, if the resolution of the region of interest obtained according to the height information and the first region is smaller than the second preset threshold, perform super-resolution processing on the region of interest to raise its resolution to the second preset threshold.
  • the object to be detected may include a traffic light.
• Referring to FIG. 14, which is another schematic structural diagram of the image processing apparatus provided by an embodiment of the present application, the image processing apparatus includes a processor 1402 and a memory 1403.
• The processor 1402 includes, but is not limited to, one or more of a central processing unit (CPU), a network processor (NP), an application-specific integrated circuit (ASIC), or a programmable logic device (PLD). The above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • Processor 1402 is responsible for communication lines 1404 and general processing, and may also provide various functions including timing, peripheral interface, voltage regulation, power management, and other control functions.
• The memory 1403 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory may exist independently and be connected to the processor 1402 through the communication line 1404 .
  • the memory 1403 may also be integrated with the processor 1402. If the memory 1403 and the processor 1402 are separate devices, the memory 1403 and the processor 1402 are connected, for example, the memory 1403 and the processor 1402 can communicate through a communication line.
  • Communication line 1404 and processor 1402 may communicate through a communication line, or communication line 1404 may be directly connected to processor 1402 .
  • Communication lines 1404 which may include any number of interconnected buses and bridges, link together various circuits including one or more processors 1402 , represented by processor 1402 , and memory, represented by memory 1403 . Communication lines 1404 may also link together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and, therefore, will not be described further herein.
• The image processing apparatus may include a processor coupled to a memory, the memory stores program instructions, and when the program instructions stored in the memory are executed by the processor, the methods described in FIG. 2 to FIG. 11 are implemented.
• Referring to FIG. 15, which is a schematic structural diagram of the autonomous driving vehicle provided by an embodiment of the present application: the image processing apparatus described in the embodiment corresponding to FIG. 14 may be deployed on the autonomous driving vehicle 100, so as to realize the functions of the autonomous driving vehicle in the embodiments corresponding to FIG. 2 to FIG. 11.
• When the autonomous driving vehicle 100 also has a communication function, the autonomous driving vehicle 100 may further include a receiver 1201 and a transmitter 1202 in addition to the components shown in FIG. 1, and the processor 113 may include an application processor 1131 and a communication processor 1132.
  • the receiver 1201, the transmitter 1202, the processor 113, and the memory 114 may be connected by a bus or otherwise.
  • the processor 113 controls the operation of the autonomous vehicle.
  • various components of the autonomous vehicle 100 are coupled together through a bus system, where the bus system may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus.
  • the various buses are referred to as bus systems in the figures.
  • the receiver 1201 can be used to receive input numerical or character information, and generate signal input related to the relevant settings and function control of the autonomous vehicle.
  • the transmitter 1202 can be used to output digital or character information through the first interface; the transmitter 1202 can also be used to send instructions to the disk group through the first interface to modify the data in the disk group; the transmitter 1202 can also include a display device such as a display screen .
  • the application processor 1131 is configured to execute the image processing method executed by the automatic driving vehicle or the image processing apparatus in the embodiments corresponding to FIG. 2 to FIG. 11 .
• Embodiments of the present application further provide a computer-readable storage medium, where a program for planning a vehicle's driving route is stored in the computer-readable storage medium, and when it runs on a computer, the computer is caused to execute the steps performed by the autonomous driving vehicle (or the image processing apparatus) in the methods described in the embodiments shown in FIG. 2 to FIG. 11.
• Embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to execute the steps performed by the autonomous driving vehicle (or the image processing apparatus) in the methods described in the embodiments shown in FIG. 2 to FIG. 11.
• An embodiment of the present application further provides a circuit system, where the circuit system includes a processing circuit, and the processing circuit is configured to execute the steps performed by the autonomous driving vehicle (or the image processing apparatus) in the methods described in the foregoing embodiments.
  • the image processing device or the autonomous vehicle provided by the embodiment of the present application may be a chip, and the chip includes: a processing unit and a communication unit.
• The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, pins, or circuits.
  • the processing unit can execute the computer-executed instructions stored in the storage unit, so that the chip in the server executes the method for planning a vehicle travel route described in the embodiments shown in FIG. 2 to FIG. 9 .
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
• Alternatively, the storage unit may be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM), and the like.
  • FIG. 16 is a schematic structural diagram of a chip provided by an embodiment of the application.
  • the chip can be represented as a neural network processor NPU 130, and the NPU 130 is mounted as a co-processor to the main CPU (Host CPU), tasks are allocated by the Host CPU.
  • the core part of the NPU is the arithmetic circuit 1303, which is controlled by the controller 1304 to extract the matrix data in the memory and perform multiplication operations.
  • the arithmetic circuit 1303 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 1303 is a two-dimensional systolic array. The arithmetic circuit 1303 may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, arithmetic circuit 1303 is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory 1302 and buffers it on each PE in the operation circuit.
  • the arithmetic circuit fetches the data of matrix A and matrix B from the input memory 1301 to perform matrix operation, and stores the partial result or final result of the matrix in the accumulator 1308 .
  • Unified memory 1306 is used to store input data and output data.
• The weight data is transferred to the weight memory 1302 through the direct memory access controller (DMAC) 1305.
  • Input data is also moved to unified memory 1306 via the DMAC.
  • a bus interface unit (BIU) 1310 is used for the interaction between the AXI bus and the DMAC and an instruction fetch buffer (instruction fetch buffer, IFB) 1309.
  • the BIU 1310 is used for the instruction fetch memory 1309 to obtain instructions from the external memory, and is also used for the storage unit access controller 1305 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 1306 , the weight data to the weight memory 1302 , or the input data to the input memory 1301 .
  • the vector calculation unit 1307 includes a plurality of operation processing units, and further processes the output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, etc., if necessary. It is mainly used for non-convolutional/fully connected layer network computations in neural networks, such as batch normalization, pixel-level summation, and upsampling of feature planes.
  • vector computation unit 1307 can store the processed output vectors to unified memory 1306 .
  • the vector calculation unit 1307 may apply a linear function and/or a non-linear function to the output of the operation circuit 1303, such as performing linear interpolation on the feature plane extracted by the convolution layer, such as a vector of accumulated values, to generate activation values.
  • the vector computation unit 1307 generates normalized values, pixel-level summed values, or both.
  • the vector of processed outputs can be used as an activation input to the arithmetic circuit 1303, such as for use in subsequent layers in a neural network.
  • An instruction fetch buffer 1309 connected to the controller 1304 is used to store the instructions used by the controller 1304 .
  • the unified memory 1306, the input memory 1301, the weight memory 1302 and the instruction fetch memory 1309 are all On-Chip memories. External memory is private to the NPU hardware architecture.
  • each layer in the recurrent neural network can be performed by the operation circuit 1303 or the vector calculation unit 1307 .
  • the processor mentioned in any one of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the method in the first aspect.
  • the device embodiments described above are only schematic, wherein the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be A physical unit, which can be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
• The computer software product is stored in a readable storage medium, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be downloaded from a website site, computer, server, or data center Transmission to another website site, computer, server, or data center by wire (eg, coaxial cable, optical fiber, digital subscriber line) or wireless (eg, infrared, wireless, microwave, etc.).
  • the computer-readable storage medium may be any available medium that can be stored by a computer, or a data storage device such as a server, data center, etc., which includes one or more available media integrated.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state disks (SSDs)), and the like.

Abstract

An image processing method, applicable to smart cars and intelligent connected vehicles. The method may include: acquiring an image to be processed (201); inputting the image to be processed into a first neural network to obtain a first prediction result (202); and, when the first prediction result indicates that a first region of the image to be processed is a lane line, obtaining a region of interest of an object to be detected in the image to be processed according to height information and the first region (203), where the height information includes a preset physical height of the object to be detected, and the region of interest is used by a second neural network to obtain a candidate frame and classification of the object to be detected. Through the method, the accuracy of object recognition at intersections and road sections can be improved, for example the accuracy of traffic light recognition.

Description

一种图像处理方法、装置以及智能汽车
本申请要求于2020年12月31日提交中国专利局、申请号为202011640167.6、申请名称为“一种图像处理方法、装置以及智能汽车”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及图像处理领域,尤其涉及一种图像处理方法、装置以及智能汽车。
背景技术
人工智能(artificial intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个分支,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式作出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。人工智能领域的研究包括机器人,自然语言处理,计算机视觉,决策与推理,人机交互,推荐与搜索,AI基础理论等。
自动驾驶是人工智能领域的一种主流应用,自动驾驶技术依靠计算机视觉、雷达、监控装置和全球定位系统等协同合作,让机动车辆可以在不需要人类主动操作下,实现自动驾驶。自动驾驶的车辆使用各种计算系统来帮助将乘客从一个位置运输到另一位置。一些自动驾驶车辆可能要求来自操作者(诸如,领航员、驾驶员、或者乘客)的一些初始输入或者连续输入。自动驾驶车辆准许操作者从手动模操作式切换到自动驾驶模式或者介于两者之间的模式。由于自动驾驶技术无需人类来驾驶机动车辆,所以理论上能够有效避免人类的驾驶失误,减少交通事故的发生,且能够提高公路的运输效率。因此,自动驾驶技术越来越受到重视。
交通灯作为交通运转的枢纽设备,提升交通灯检测的准确度,对于自动驾驶有非常重要的意义。
发明内容
本申请提供了一种图像处理方法、装置以及智能汽车,以提升路口路段物体识别的准确度,比如提升交通灯识别的准确度。
为解决上述技术问题,本申请提供以下技术方案:
本申请第一方面提供一种图像处理方法,可用于人工智能领域的自动驾驶领域中。可以包括:获取待处理图像。将待处理图像输入至第一神经网络,以获取第一预测结果。该第一神经网络可以是用于执行图像分割任务的神经网络。相关技术中可以用于执行图像分割任务的神经网络本申请实施例均可以采用,比如第一神经网络包括但不限于:特殊卷积神经网络(specia convolutional neural network,SCNN)全卷积网络(fully convolutional networks,FCN)、U型神经网络(U-Net)、掩膜区域卷积神经网络(maskregion convolutional neural network,Mask-RCNN)、语义分割网(semanticsegmentation net,SegNet)。第一 预测结果会指示待处理图像中每个像素属于车道线的概率,具体的指示每个像素属于停止车道线的概率,每个像素属于导向车道线的概率。属于停止车道线的概率超过预设阈值的像素点的集合可以用于获取一条停止车道线在待处理图像中的区域。属于一条导向车道线的概率超过预设阈值的像素点的集合可以用于获取一条导向车道线在待处理图像中的区域。第一预测结果指示待处理图像的第一区域是车道线时,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域。高度信息可以包括预设定的待检测物体的物理高度,感兴趣区域用于第二神经网络获取待检测物体的候选框和分类。其中,该第二神经网络可以是用于还行物体识别任务的神经网络,包括但不限于卷积神经网络(convolutional neuron network,CNN)、深度神经网络(deep neural network,DNN)、你只能看一次(you only look once,YOLO)v3(版本号,代表第三版)、单发多核探测器(single shot multibox detector,SSD)。本申请中的ROI是指从待处理的图像以方框的方式勾勒出需要处理的区域(本申请也称之为抠图区域),将该ROI输入至第二神经网络,以输出待检测物体的候选框和分类。确定感兴趣区域包括确定感兴趣的位置、感兴趣区域的长度以及感兴趣区域的宽度。本申请提供的方案提出了一种利用车道线获取感兴趣区域的方式,具体的,可以根据车道线确定感兴趣区域的位置以及感兴趣区域的长度,根据待检测物体的物体高度确定感兴趣的宽度。本申请提供的方案利用车道线,在待处理图像中选出路口路段对应的区域,有利于提升路口路段的待检测物体的检测的准确度。
在一种可能的实施方式中,第一区域中的车道线可以包括停止线,根据高度信息和第一区域获取待处理图像的感兴趣区域,可以包括:获取停止线在待处理图像中的长度。根据停止线在待处理图像中的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。待处理图像中包括停止车道线时,说明车辆处于路口路段或者车辆即将驶入路口路段,根据停止车道线获取感兴趣区域,可以很好的在待处理图像中选出路口路段对应的区域,有利于提升路口路段的待检测物体的检测的准确度。
在一种可能的实施方式中,第一区域中可以包括多个第一像素,多个第一像素中各个第一像素属于停止线的概率超过第一预设阈值,停止线由多个第一像素组成,获取停止线在待处理图像中的长度,可以包括:根据多个第一像素中距离最远的两个像素之间的距离获取停止线在待处理图像中的长度。在这种实施方式中,给出了一种具体的获取停止线在待处理图像中的长度,增加了方案的多样性。
在一种可能的实施方式中,方法还可以包括:获取第一距离,第一距离是待检测物体和自车之间的距离。获取第二距离,第二距离是停止线和待处理图像的下边缘之间的距离。根据第一距离和第二距离获取比例尺。在这种实施方式中,给出了一种具体的获取比例尺的方式,增加了方案的多样性。
在一种可能的实施方式中,第一区域中的车道线还可以包括至少两条导向车道线,方法还可以包括:获取至少两条导向车道线中任意两条相邻的导向车道线在待处理图像中的宽度。根据任意两条相邻的导向车道线在待处理图像中的宽度和预设定的两条导向车道线 的物理宽度获取比例尺。在这种实施方式中,给出了另一种具体的获取比例尺的方式,增加了方案的多样性。
在一种可能的实施方式中,根据停止线在待处理图像中的长度获取感兴趣区域的长度,可以包括:根据第一交点和第二交点之间的距离获取感兴趣区域的长度,第一交点是待处理图像中第一导向车道线和停止线一端的交点,第二交点是待处理图像中第二导向车道线和停止线另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。在这种实施方式中,给出了一种具体的根据停止线的长度获取感兴趣区域的长度的方式,增加了方案的多样性。
在一种可能的实施方式中,感兴趣区域的下边缘的位置根据停止线在待处理图像中的位置确定。在这种可能的实施方式中,感兴趣区域的下边缘的位置直接根据停止线在待处理图像中的位置确定,简化计算过程。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不可以包括停止线,根据高度信息和第一区域获取待处理图像的感兴趣区域,可以包括:根据第三交点和第四交点之间的距离获取感兴趣区域的长度,第三交点是待处理图像中第一导向车道线和第一线段的一端的交点,第二交点是待处理图像中第二导向车道线和第一线段另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线,第一线段是经过第二像素的一条线段,第二像素是至少两条导向车道线中最短的导向车道线在待处理图像中最高点对应的像素。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。在这种实施方式中,获取到的待处理图像不包括停止车道线,但是包括导向车道线,则可以根据待处理图像中导向车道线之间的位置关系确定ROI的下边缘的位置以及ROI区域的长度。保证在没有检测到停止车道线的时候,也可以根据导向车道线确定合适的ROI区域,在待处理图像中选出路口路段对应的区域,提升路口路段的待检测物体的检测的准确度。
在一种可能的实施方式中,第一线段与待处理图像的下边缘平行。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不可以包括停止线,感兴趣区域的下边缘的位置根据第一线段在待处理图像中的位置确定,第一线段占据预设长度的像素,且第一线段的一端与第一导向车道线相交,第一线段的另一端与第二导向车道线相交,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。在这种实施方式中,针对获取到的待处理图像不包括停止车道线,但是包括导向车道线的情况,给出了另一种根据导向车道线获取感兴趣区域的方式。
在一种可能的实施方式中,根据高度信息和第一区域获取待处理图像的感兴趣区域,可以包括:根据第一线段的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,若根据高度信息和第一区域获取的感兴趣区域的分辨率大于第二预设阈值时,该方法还可以包括:将感兴趣区域的分辨率压缩至第二预设阈值。在这种可能的实施方式中,感兴趣区域的尺寸可能过大,为了减少第二神经网络的计算量,还可以对感兴趣区域进行压缩处理,将经过压缩处理后的感兴趣区域输入至第二神经网络。
在一种可能的实施方式中,若根据高度信息和第一区域获取的感兴趣区域的分辨率小于第二预设阈值时,方法还可以包括:对感兴趣区域进行超分辨率处理,以使感兴趣的分辨率提升至第二预设阈值。在这种可能的实施方式中,还可以对感兴趣区域进行超分辨率处理,提升感兴趣区域的画面的质量,将经过超分辨率处理后的感兴趣区域输入至第二神经网络中,以提升第二神经网络对于物体检测的效果。
在一种可能的实施方式中,待检测物体可以包括交通灯。
本申请第二方面提供一种图像处理装置,可以包括:获取模块,用于获取待处理图像。图像分割模块,用于将待处理图像输入至第一神经网络,以获取第一预测结果。感兴趣区域模块,还用于第一预测结果指示待处理图像的第一区域是车道线时,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域,高度信息可以包括预设定的待检测物体的物理高度,感兴趣区域用于第二神经网络获取待检测物体的候选框和分类。
在一种可能的实施方式中,第一区域中的车道线可以包括停止线,感兴趣区域模块,具体用于:获取停止线在待处理图像中的长度。根据停止线在待处理图像中的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,第一区域中可以包括多个第一像素,多个第一像素中各个第一像素属于停止线的概率超过第一预设阈值,停止线由多个第一像素组成,感兴趣区域模块,具体用于:根据多个第一像素中距离最远的两个像素之间的距离获取停止线在待处理图像中的长度。
在一种可能的实施方式中,感兴趣区域模块,还用于:获取第一距离,第一距离是待检测物体和自车之间的距离。获取第二距离,第二距离是停止线和待处理图像的下边缘之间的距离。根据第一距离和第二距离获取比例尺。
在一种可能的实施方式中,第一区域中的车道线还可以包括至少两条导向车道线,感兴趣区域模块,还用于:获取至少两条导向车道线中任意两条相邻的导向车道线在待处理图像中的宽度。根据任意两条相邻的导向车道线在待处理图像中的宽度和预设定的两条导向车道线的物理宽度获取比例尺。
在一种可能的实施方式中,感兴趣区域模块,具体用于:根据第一交点和第二交点之间的距离获取感兴趣区域的长度,第一交点是待处理图像中第一导向车道线和停止线一端的交点,第二交点是待处理图像中第二导向车道线和停止线另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。
在一种可能的实施方式中,感兴趣区域的下边缘的位置根据停止线在待处理图像中的位置确定。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不可以包括停止线,感兴趣区域模块,具体用于:根据第三交点和第四交点之间的距离获取感兴趣区域的长度,第三交点是待处理图像中第一导向车道线和第一线段的一端的交点,第二交点是待处理图像中第二导向车道线和第一线段另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线,第一线段是经过第二像素的一条线段,第二像素是至少两条导向车道线中最短的导向车道线在待处理图像中最高点对应的像素。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,第一线段与待处理图像的下边缘平行。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不可以包括停止线,感兴趣区域的下边缘的位置根据第一线段在待处理图像中的位置确定,第一线段占据预设长度的像素,且第一线段的一端与第一导向车道线相交,第一线段的另一端与第二导向车道线相交,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。
在一种可能的实施方式中,感兴趣区域模块,具体用于:根据第一线段的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,还可以包括压缩模块,压缩模块,用于若根据高度信息和第一区域获取的感兴趣区域的分辨率大于第二预设阈值时,将感兴趣区域的分辨率压缩至第二预设阈值。
在一种可能的实施方式中,还可以包括超分辨率处理模块,超分辨率处理模块,用于若根据高度信息和第一区域获取的感兴趣区域的分辨率小于第二预设阈值时,对感兴趣区域进行超分辨率处理,以使感兴趣的分辨率提升至第二预设阈值。
在一种可能的实施方式中,待检测物体可以包括交通灯。
本申请第三方面提供一种图像处理装置,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现第一方面所描述的方法。
本申请第四方面提供一种计算机可读存储介质,可以包括程序,当其在计算机上运行时,使得计算机执行如第一方面所描述的方法。
本申请第五方面提供一种计算机程序产品,当在计算机上运行时,使得计算机可以执行如第一方面所描述的方法。
本申请第六方面提供一种芯片,芯片与存储器耦合,用于执行存储器中存储的程序,以执行如第一方面所描述的方法。
本申请第七方面提供一种智能汽车,智能汽车可以包括处理电路和存储电路,处理电路和存储电路被配置为执行如第一方面所描述的方法。
本申请提供的方案针对于车辆获取的待处理图像,若获取到的待处理图像包括车道线,则根据车道线获取待处理图像中待检测物体的感兴趣区域。具体的,可以根据车道线确定感兴趣区域的位置以及感兴趣区域的长度,根据待检测物体的物体高度确定感兴趣的宽度。其中,待处理图像中包括停止车道线时,说明车辆处于路口路段或者车辆即将驶入路口路段,根据停止车道线获取感兴趣区域,比如车道线包括停止车道线时,根据停止车道线在待处理图像中的位置确定ROI的下边缘的位置以及ROI区域的长度,可以很好的在待处理图像中选出路口路段对应的区域,有利于提升路口路段的待检测物体的检测的准确度。若获取到的待处理图像不包括停止车道线,但是包括导向车道线,则可以根据待处理图像中导向车道线之间的位置关系确定ROI的下边缘的位置以及ROI区域的长度。保证在没有检测到停止车道线的时候,也可以根据导向车道线确定合适的ROI区域,在待处理图像中选出路口路段对应的区域,提升路口路段的待检测物体的检测的准确度。
此外,获取了感兴趣区域在待处理图像中的位置和尺寸后,可以将待处理图像中将感兴趣区作为抠图区域,将抠图区域输入至第二神经网络,使第二神经网络根据抠图区域确定待检测物体的候选框和分类。在一些可能的实施方式中,还可以对抠图区域进行超分辨率处理,提升抠图区域的画面的质量,将经过超分辨率处理后的抠图区域输入至第二神经网络中,以提升第二神经网络对于物体检测的效果。在一些可能的实施方式中,抠图区域的尺寸可能过大,为了减少第二神经网络的计算量,还可以对抠图区域进行压缩处理,将经过压缩处理后的抠图区域输入至第二神经网络。
附图说明
图1为本申请实施例提供的自动驾驶车辆的一种结构示意图;
图2为本申请实施例提供的一种图像处理方法的流程示意图;
图3为本申请实施例提供的另一种图像处理方法的流程示意图;
图4-a为本申请实施例中获取停止线长度的一种方案的示意图;
图4-b为本申请实施例中获取停止线长度的另一种方案的示意图;
图4-c为本申请实施例中获取比例尺的一种方案的示意图;
图4-d为本申请实施例中获取比例尺的另一种方案的示意图;
图5为本申请实施例提供的另一种图像处理方法的流程示意图;
图6为本申请提供的一种图像处理方法的一种应用场景示意图;
图7为本申请实施例提供的另一种图像处理方法的流程示意图;
图8为本申请提供的一种图像处理方法的另一种应用场景示意图;
图9为本申请实施例提供的另一种图像处理方法的流程示意图;
图10为本申请提供的一种图像处理方法的另一种应用场景示意图;
图11为本申请实施例提供的一种图像处理方法的流程示意图;
图12为在待处理图像中选取感兴趣区域的示意图;
图13为本申请实施例提供的图像处理装置的一种结构示意图;
图14为本申请实施例提供的图像处理装置的另一种结构示意图;
图15为本申请实施例提供的自动驾驶车辆的另一种的结构示意图;
图16为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
本申请实施例提供了一种图像处理方法,根据车道线获取待处理图像的感兴趣区域。通过本申请提供的方案,可以有效提升路口场景的物体识别的准确度。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
为了便于理解本方案,本申请实施例中首先结合图1对自动驾驶车辆的结构进行介绍,请先参阅图1,图1为本申请实施例提供的自动驾驶车辆的一种结构示意图,自动驾驶车辆100配置为完全或部分地自动驾驶模式,例如,自动驾驶车辆100可以在处于自动驾驶模式中的同时控制自身,并且可通过人为操作来确定车辆及其周边环境的当前状态,确定周边环境中的至少一个其他车辆的可能行为,并确定其他车辆执行可能行为的可能性相对应的置信水平,基于所确定的信息来控制自动驾驶车辆100。在自动驾驶车辆100处于自动驾驶模式中时,也可以将自动驾驶车辆100置为在没有和人交互的情况下操作。
自动驾驶车辆100可包括各种子系统,例如行进系统102、传感器系统104、控制系统106、一个或多个外围设备108以及电源110、计算机系统112和用户接口116。可选地,自动驾驶车辆100可包括更多或更少的子系统,并且每个子系统可包括多个部件。另外,自动驾驶车辆100的每个子系统和部件可以通过有线或者无线互连。
行进系统102可包括为自动驾驶车辆100提供动力运动的组件。在一个实施例中,行进系统102可包括引擎118、能量源119、传动装置120和车轮121。
其中,引擎118可以是内燃引擎、电动机、空气压缩引擎或其他类型的引擎组合,例如,汽油发动机和电动机组成的混动引擎,内燃引擎和空气压缩引擎组成的混动引擎。引擎118将能量源119转换成机械能量。能量源119的示例包括汽油、柴油、其他基于石油的燃料、丙烷、其他基于压缩气体的燃料、乙醇、太阳能电池板、电池和其他电力来源。能量源119也可以为自动驾驶车辆100的其他系统提供能量。传动装置120可以将来自引擎118的机械动力传送到车轮121。传动装置120可包括变速箱、差速器和驱动轴。在一个实施例中,传动装置120还可以包括其他器件,比如离合器。其中,驱动轴可包括可耦合到一个或多个车轮121的一个或多个轴。
传感器系统104可包括感测关于自动驾驶车辆100周边的环境的信息的若干个传感器。例如,传感器系统104可包括全球定位系统122(定位系统可以是全球定位GPS系统,也可以是北斗系统或者其他定位系统)、惯性测量单元(inertial measurement unit,IMU)124、雷达126、激光测距仪128以及相机130。传感器系统104还可包括被监视自动驾驶车辆100的内部系统的传感器(例如,车内空气质量监测器、燃油量表、机油温度表等)。来自这些传感器中的一个或多个的传感数据可用于检测对象及其相应特性(位置、形状、方向、速度等)。这种检测和识别是自主自动驾驶车辆100的安全操作的关键功能。
其中,定位系统122可用于估计自动驾驶车辆100的地理位置。IMU 124用于基于惯性加速度来感知自动驾驶车辆100的位置和朝向变化。在一个实施例中,IMU 124可以是加速度计和陀螺仪的组合。雷达126可利用无线电信号来感知自动驾驶车辆100的周边环 境内的物体,具体可以表现为毫米波雷达或激光雷达。在一些实施例中,除了感知物体以外,雷达126还可用于感知物体的速度和/或前进方向。激光测距仪128可利用激光来感知自动驾驶车辆100所位于的环境中的物体。在一些实施例中,激光测距仪128可包括一个或多个激光源、激光扫描器以及一个或多个检测器,以及其他系统组件。相机130可用于捕捉自动驾驶车辆100的周边环境的多个图像。相机130可以是静态相机或视频相机。
控制系统106为控制自动驾驶车辆100及其组件的操作。控制系统106可包括各种部件,其中包括转向系统132、油门134、制动单元136、计算机视觉系统140、线路控制系统142以及障碍避免系统144。
其中,转向系统132可操作来调整自动驾驶车辆100的前进方向。例如在一个实施例中可以为方向盘系统。油门134用于控制引擎118的操作速度并进而控制自动驾驶车辆100的速度。制动单元136用于控制自动驾驶车辆100减速。制动单元136可使用摩擦力来减慢车轮121。在其他实施例中,制动单元136可将车轮121的动能转换为电流。制动单元136也可采取其他形式来减慢车轮121转速从而控制自动驾驶车辆100的速度。计算机视觉系统140可以操作来处理和分析由相机130捕捉的图像以便识别自动驾驶车辆100周边环境中的物体和/或特征。所述物体和/或特征可包括交通信号、道路边界和障碍体。计算机视觉系统140可使用物体识别算法、运动中恢复结构(Structure from Motion,SFM)算法、视频跟踪和其他计算机视觉技术。在一些实施例中,计算机视觉系统140可以用于为环境绘制地图、跟踪物体、估计物体的速度等等。线路控制系统142用于确定自动驾驶车辆100的行驶路线以及行驶速度。在一些实施例中,线路控制系统142可以包括横向规划模块1421和纵向规划模块1422,横向规划模块1421和纵向规划模块1422分别用于结合来自障碍避免系统144、GPS 122和一个或多个预定地图的数据为自动驾驶车辆100确定行驶路线和行驶速度。障碍避免系统144用于识别、评估和避免或者以其他方式越过自动驾驶车辆100的环境中的障碍体,前述障碍体具体可以表现为实际障碍体和可能与自动驾驶车辆100发生碰撞的虚拟移动体。在一个实例中,控制系统106可以增加或替换地包括除了所示出和描述的那些以外的组件。或者也可以减少一部分上述示出的组件。
自动驾驶车辆100通过外围设备108与外部传感器、其他车辆、其他计算机系统或用户之间进行交互。外围设备108可包括无线通信系统146、车载电脑148、麦克风150和/或扬声器152。在一些实施例中,外围设备108为自动驾驶车辆100的用户提供与用户接口116交互的手段。例如,车载电脑148可向自动驾驶车辆100的用户提供信息。用户接口116还可操作车载电脑148来接收用户的输入。车载电脑148可以通过触摸屏进行操作。在其他情况中,外围设备108可提供用于自动驾驶车辆100与位于车内的其它设备通信的手段。例如,麦克风150可从自动驾驶车辆100的用户接收音频(例如,语音命令或其他音频输入)。类似地,扬声器152可向自动驾驶车辆100的用户输出音频。无线通信系统146可以直接地或者经由通信网络来与一个或多个设备无线通信。例如,无线通信系统146可使用3G蜂窝通信,例如码分多址(code division multiple access,CDMA)、EVDO、全球移动通信系统(global system for mobile communications,GSM),通用分组无线服务技术(general packet radio service,GPRS),或者4G蜂窝通信,例如长期演进(long term evolution,LTE)或者5G蜂窝通信。无线通信系统146可利用无线局域网(wireless local area network,WLAN)通信。在一些实施例中,无线通信系统146可利用红外链路、蓝牙或ZigBee与设备直接通信。其他无线协议,例如各种车辆通信系统,例如,无线通信系统146可包括一个或多个专用短程通信(dedicated short range communications,DSRC)设备,这些设备可包括车辆和/或路边台站之间的公共和/或私有数据通信。
电源110可向自动驾驶车辆100的各种组件提供电力。在一个实施例中,电源110可以为可再充电锂离子或铅酸电池。这种电池的一个或多个电池组可被配置为电源为自动驾驶车辆100的各种组件提供电力。在一些实施例中,电源110和能量源119可一起实现,例如一些全电动车中那样。
自动驾驶车辆100的部分或所有功能受计算机系统112控制。计算机系统112可包括至少一个处理器113,处理器113执行存储在例如存储器114这样的非暂态计算机可读介质中的指令115。计算机系统112还可以是采用分布式方式控制自动驾驶车辆100的个体组件或子系统的多个计算设备。处理器113可以是任何常规的处理器,诸如商业可获得的中央处理器(central processing unit,CPU)。可选地,处理器113可以是诸如专用集成电路(application specific integrated circuit,ASIC)或其它基于硬件的处理器的专用设备。尽管图1功能性地图示了处理器、存储器、和在相同块中的计算机系统112的其它部件,但是本领域的普通技术人员应该理解该处理器、或存储器实际上可以包括不存储在相同的物理外壳内的多个处理器、或存储器。例如,存储器114可以是硬盘驱动器或位于不同于计算机系统112的外壳内的其它存储介质。因此,对处理器113或存储器114的引用将被理解为包括可以并行操作或者可以不并行操作的处理器或存储器的集合的引用。不同于使用单一的处理器来执行此处所描述的步骤,诸如转向组件和减速组件的一些组件每个都可以具有其自己的处理器,所述处理器只执行与特定于组件的功能相关的计算。在此处所描述的各个方面中,处理器113可以位于远离自动驾驶车辆100并且与自动驾驶车辆100进行无线通信。在其它方面中,此处所描述的过程中的一些在布置于自动驾驶车辆100内的处理器113上执行而其它则由远程处理器113执行,包括采取执行单一操纵的必要步骤。在一些实施例中,存储器114可包含指令115(例如,程序逻辑),指令115可被处理器113执行来执行自动驾驶车辆100的各种功能,包括以上描述的那些功能。存储器114也可包含额外的指令,包括向行进系统102、传感器系统104、控制系统106和外围设备108中的一个或多个发送数据、从其接收数据、与其交互和/或对其进行控制的指令。例如,以向右换道为例,则对于人工驾驶员需要进行以下操作:第一步:考虑安全因素和交规因素,决定换道的时机;第二步:规划出一条行驶轨迹;第三步:控制油门、刹车和方向盘,让车辆沿着预定轨迹行驶。上述操作对应于自动驾驶车辆,可以分别由自动驾驶车辆的行为规划器(behavior planner,BP),运动规划器(motion planner,MoP)和运动控制器(Control)执行。其中,BP负责下发高层决策,MoP负责规划预期轨迹和速度,Control负责操作油门刹车方向盘,让自动驾驶车辆根据目标轨迹并达到目标速度。应理解,行为规划器、运动规划器和运动控制器执行的相关操作可以是如图1所示的处理器113执行存储器114中的指令115,该指令115可以用于指示线路控制系统142。本申请实施例有时也 将行为规划器,运动规划器以及运动控制器统称为规控模块。
除了指令115以外,存储器114还可存储数据,例如道路地图、路线信息,车辆的位置、方向、速度以及其它这样的车辆数据,以及其他信息。这种信息可在自动驾驶车辆100在自主、半自主和/或手动模式中操作期间被自动驾驶车辆100和计算机系统112使用。用户接口116,用于向自动驾驶车辆100的用户提供信息或从其接收信息。可选地,用户接口116可包括在外围设备108的集合内的一个或多个输入/输出设备,例如无线通信系统146、车载电脑148、麦克风150和扬声器152。
计算机系统112可基于从各种子系统(例如,行进系统102、传感器系统104和控制系统106)以及从用户接口116接收的输入来控制自动驾驶车辆100的功能。例如,计算机系统112可利用来自控制系统106的输入以便控制转向系统132来避免由传感器系统104和障碍避免系统144检测到的障碍体。在一些实施例中,计算机系统112可操作来对自动驾驶车辆100及其子系统的许多方面提供控制。
可选地,上述这些组件中的一个或多个可与自动驾驶车辆100分开安装或关联。例如,存储器114可以部分或完全地与自动驾驶车辆100分开存在。上述组件可以按有线和/或无线方式来通信地耦合在一起。
可选地,上述组件只是一个示例,实际应用中,上述各个模块中的组件有可能根据实际需要增添或者删除,图1不应理解为对本申请实施例的限制。在道路行进的自动驾驶车辆,如上面的自动驾驶车辆100,可以识别其周围环境内的物体以确定对当前速度的调整。所述物体可以是其它车辆、交通控制设备、或者其它类型的物体。在一些示例中,可以独立地考虑每个识别的物体,并且基于物体的各自的特性,诸如它的当前速度、加速度、与车辆的间距等,可以用来确定自动驾驶车辆所要调整的速度。
可选地,自动驾驶车辆100或者与自动驾驶车辆100相关联的计算设备如图1的计算机系统112、计算机视觉系统140、存储器114可以基于所识别的物体的特性和周围环境的状态(例如,交通、雨、道路上的冰、等等)来预测所识别的物体的行为。可选地,每一个所识别的物体都依赖于彼此的行为,因此还可以将所识别的所有物体全部一起考虑来预测单个识别的物体的行为。自动驾驶车辆100能够基于预测的所识别的物体的行为来调整它的速度。换句话说,自动驾驶车辆100能够基于所预测的物体的行为来确定车辆将需要调整到(例如,加速、减速、或者停止)什么稳定状态。在这个过程中,也可以考虑其它因素来确定自动驾驶车辆100的速度,诸如,自动驾驶车辆100在行驶的道路中的横向位置、道路的曲率、静态和动态物体的接近度等等。除了提供调整自动驾驶车辆的速度的指令之外,计算设备还可以提供修改自动驾驶车辆100的转向角的指令,以使得自动驾驶车辆100遵循给定的轨迹和/或维持与自动驾驶车辆100附近的物体(例如,道路上的相邻车道中的轿车)的安全横向和纵向距离。
上述自动驾驶车辆100可以为轿车、卡车、摩托车、公共汽车、船、飞机、直升飞机、割草机、娱乐车、游乐场车辆、施工设备、电车、高尔夫球车和火车等,本申请实施例不做特别的限定。
结合上述描述,本申请实施例提供了一种图像处理的方法,可应用于图1中示出的自动驾驶车辆100中。
参阅图2,为本申请实施例提供的一种图像处理方法的流程示意图。
如图2所示,本申请提供的一种图像处理方法可以包括如下步骤:
201、获取待处理图像。
车辆可以通过传感器系统104获取待处理图像。比如,车辆可以通过相机130获取待处理图像。待处理图像用于体现车辆周围的环境。在一个可能的实施方式中,车辆可以实时通过相机130获取车辆周围的环境信息,即实时获取待处理图像。在一个可能的实施方式中,当车辆获取到前方即将进入路口路段,则开始通过相机130获取车辆周围的环境信息,即获取到车辆前方即将进入路口路段时,开始获取待处理图像。
在一个可能的实施方式中,可以对获取到的待处理图像进行筛选处理,以获取信噪比满足预设条件的待处理图像。可以根据实际情况采用不同的筛选手段删除不满足信噪比的数据,获取满足信噪比的数据。还可以将一些重复的待处理图像进行删除。
202、将待处理图像输入至第一神经网络,以获取第一预测结果。
该第一神经网络可以是用于执行图像分割任务的神经网络。相关技术中可以用于执行图像分割任务的神经网络本申请实施例均可以采用,比如第一神经网络包括但不限于:空间卷积神经网络(spatial convolutional neural network,SCNN)、全卷积网络(fully convolutional networks,FCN)、U型神经网络(U-Net)、掩膜区域卷积神经网络(mask region convolutional neural network,Mask-RCNN)、语义分割网(semantic segmentation net,SegNet)。第一预测结果会指示待处理图像中每个像素属于车道线的概率,具体地指示每个像素属于停止车道线的概率,每个像素属于导向车道线的概率。属于停止车道线的概率超过预设阈值的像素点的集合可以用于获取一条停止车道线在待处理图像中的区域。属于一条导向车道线的概率超过预设阈值的像素点的集合可以用于获取一条导向车道线在待处理图像中的区域。
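作为理解上述逐像素概率图与车道线区域之间关系的参考,下面给出一段示意性的Python代码(并非本申请方案的限定实现,数组形状、阈值取值等均为假设):

```python
import numpy as np

def extract_lane_regions(stop_prob, guide_probs, thresh=0.5):
    """根据第一预测结果(逐像素概率图)提取停止线区域与各导向车道线区域的示意实现。
    stop_prob:   形状为 (H, W) 的数组,每个像素属于停止车道线的概率
    guide_probs: 列表,每个元素为 (H, W) 数组,表示每个像素属于某一条导向车道线的概率
    thresh:      预设阈值,为假设取值
    """
    stop_mask = stop_prob > thresh                   # 概率超过阈值的像素集合,对应停止线所在区域
    guide_masks = [p > thresh for p in guide_probs]  # 每条导向车道线各自的像素集合
    return stop_mask, guide_masks
```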
可以通过第一神经网络获取待处理图像中是否包括车道线,若包括车道线,则将车道线从待处理图像中分割出来,示例性的,下面给出一种车道线检测方法:将待处理图像输入第一神经网络进行特征提取,再将提取出来的特征(每个特征图事先划分为多个栅格)通过预测头模型进行解码,生成密集的线簇(即多条预测车道线),最后,按照每个预测车道线的置信度(也可称为栅格的置信度,置信度反映的是是否有车道线穿过该栅格以及有多大概率穿过该栅格,置信度大于预设值的栅格则用于对车道线进行预测,置信度低于预设值的栅格则被认为对预测没有贡献)取值对线簇进行排序,以置信度取值最大的预测车道线为基线且以其他预测车道线与该基线之间的间距小于阈值为条件将车道线分为一组,以类似的方式将线簇分为若干组,并分别取每组中的基线作为本组最终对一条真实车道线的检测结果输出。需要说明的是,本领域的技术人员可以根据实际情况选择车道线检测的方法,本申请实施例对此并不进行限定。
203、第一预测结果指示待处理图像的第一区域是车道线时,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域。
高度信息包括预设定的待检测物体的物理高度。比如待检测物体包括交通灯,则高度信息包括预设定的交通灯的实际高度,比如通常交通灯的高度为6-7米,可以预设定交通灯的物体高度为7米。再比如,待检测物体包括轿车,则高度信息包括预设定的轿车的实际高度,比如通常轿车的高度为1.4米至1.6米,可以预设定轿车的物体高度为1.6米。
感兴趣区域(region of interest,ROI)用于第二神经网络获取待检测物体的候选框和分类。其中,该第二神经网络可以是用于执行物体识别任务的神经网络,包括但不限于卷积神经网络(convolutional neural network,CNN)、深度神经网络(deep neural network,DNN)、你只能看一次(you only look once,YOLO)v3(版本号,代表第三版)、单发多框探测器(single shot multibox detector,SSD)。本申请中的ROI是指从待处理的图像以方框的方式勾勒出需要处理的区域(本申请也称之为抠图区域),并将该ROI输入至第二神经网络,以输出待检测物体的候选框和分类。确定感兴趣区域包括确定感兴趣区域的位置、感兴趣区域的长度以及感兴趣区域的宽度。
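下面用一小段示意性的Python代码说明"以方框方式从待处理图像中抠出ROI(抠图区域)"这一操作,抠出的区域随后即可输入第二神经网络;代码中ROI采用(x, y, w, h)的坐标形式,属于假设的表示方式:

```python
def crop_roi(image, roi):
    """示意:按 (x, y, w, h) 形式给出的感兴趣区域,从待处理图像中抠出抠图区域。
    image 为 H×W×C 的数组(例如 numpy 数组),抠出的区域随后可输入第二神经网络,
    由其输出待检测物体的候选框和分类。"""
    x, y, w, h = [int(round(v)) for v in roi]
    return image[y:y + h, x:x + w]
```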
本申请提供的方案针对于车辆获取的待处理图像,若获取到的待处理图像包括车道线,则根据车道线获取待处理图像中待检测物体的感兴趣区域。具体的,可以根据车道线确定感兴趣区域的位置以及感兴趣区域的长度,根据待检测物体的物体高度确定感兴趣区域的宽度。其中,待处理图像中包括停止车道线时,说明车辆处于路口路段或者车辆即将驶入路口路段,根据停止车道线获取感兴趣区域,比如车道线包括停止车道线时,根据停止车道线在待处理图像中的位置确定ROI的下边缘的位置以及ROI区域的长度,可以很好地在待处理图像中选出路口路段对应的区域,有利于提升路口路段的待检测物体的检测的准确度。若获取到的待处理图像不包括停止车道线,但是包括导向车道线,则可以根据待处理图像中导向车道线之间的位置关系确定ROI的下边缘的位置以及ROI区域的长度。保证在没有检测到停止车道线的时候,也可以根据导向车道线确定合适的ROI区域,在待处理图像中选出路口路段对应的区域,提升路口路段的待检测物体的检测的准确度。需要说明的是,本申请有时也将停止车道线称为停止线,二者表示相同的意思。
在一个可能的实施方式中,待检测物体是交通灯,比如待检测物体是红绿灯。本申请提供的方案可以有效提升交通灯检测的准确度。制约交通灯检测的准确度的一个因素在于,针对于同一个焦距下获取的待处理图像,交通灯在待处理图像中占据的像素,相比于其他待检测物体在待处理图像(比如人、车辆)中占据的像素要小很多;待处理图像在输入神经网络之前,需要对待处理图像进行压缩处理,以减小待处理图像的尺寸,进而可以减少神经网络对待处理图像进行处理所需要的数据量;由于交通灯在待处理图像中占据的像素比例本来就少,压缩处理后,可能会导致交通灯占据的像素更小,大大提升交通灯的检测难度;为了能够保证交通灯在待处理图像中占据的像素不会因为压缩处理而被压缩,且能够保证减少神经网络对待处理图像进行处理所需要的数据量,其中一种方式为选取出交通灯的ROI区域,并将该交通灯的ROI区域输入至神经网络中,使神经网络根据ROI区域检测交通灯;由于车辆在行驶过程中,交通灯的ROI区域是在不断变动的,比如车辆靠近交通灯的过程,交通灯的ROI区域在不断地向上方移动,因此如何选取ROI区域对于提升交通灯检测的准确度至关重要;现有技术中采用的手段,ROI的区域一般是固定不变的,固定的ROI区域无法适应交通灯的ROI区域是不断变化的情况;且目前一般采用通过高精地 图以及自车的GPS定位信息获取红绿灯的ROI区域,这种获取红绿灯的ROI区域的方法,当出现GPS定位不准,或者无法获取GPS信号、高精地图时,将无法获取红绿灯的ROI区域,则无法根据红绿灯的ROI区域检测红绿灯,可能会因此检测不出红绿灯,增加了安全隐患。而本申请提供的方案可以通过车道线获取红绿灯的ROI区域,不会受限于GPS信号以及高精地图;且根据车道线在待处理图像中的信息,比如根据停止车道线在待处理图像中的位置信息,停止线车道线在待处理图像中的长度、导向车道线之间的位置关系获取红绿灯的ROI区域,使红绿灯的ROI区域是动态变化的,且结合红绿灯的实际物理高度,可以更好的在待处理图像中选出红绿灯的ROI区域,将通过本申请提供的方案获取的ROI区域输入至神经网络中,以使神经网络根据本申请提供的方案获取的ROI区域进行红绿灯检测,可以有效的提升红绿灯检测的准确度。
图2对应的实施例中介绍到,车道线可以包括停止车道线和导向车道线,针对于车道线是否包括停止车道线,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域可能有不同的实现方式,下面结合几个典型的实施方式,对如何根据车道线获取待处理图像中待检测物体的感兴趣区域进行说明。
一、待处理图像包括停止车道线
参阅图3,为本申请实施例提供的另一种图像处理方法的流程示意图。
如图3所示,本申请提供的另一种图像处理方法可以包括如下步骤:
301、获取待处理图像。
302、将待处理图像输入至第一神经网络,以获取第一预测结果。
步骤301和步骤302可以参照图2对应的实施例中的步骤201和步骤202进行理解,这里不再重复赘述。
303、第一预测结果指示待处理图像的第一区域是车道线,且该车道线包括停止线时,获取停止线在待处理图像中的长度。
第一预测结果会指示待处理图像中每个像素属于停止车道线的概率,将属于停止线的概率超过预设阈值的像素点占据的区域称为区域1,则第一区域中的区域1可以用于表示停止车道线在待处理图像中的位置。参阅图4-a,图4-a为本申请实施例中获取停止线长度的一种方案的示意图。在待处理图像中,停止线由多个第一像素组成,第一像素是第一区域中包括的像素,在一个可能的实施方式中,可以根据第一像素中距离最远的两个像素之间的距离获取停止线在待处理图像中的长度。
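作为参考,下面给出一种计算"第一像素中距离最远的两个像素之间的距离"的示意性Python实现(像素数较多时先随机下采样以限制两两距离矩阵的规模,属于实现上的假设,并非本申请的限定):

```python
import numpy as np

def stop_line_length(stop_mask, max_pts=2000):
    """示意:在停止线掩码(区域1)中寻找距离最远的两个像素,以其距离作为停止线在图像中的长度。"""
    ys, xs = np.nonzero(stop_mask)
    pts = np.stack([xs, ys], axis=1).astype(np.float32)
    if len(pts) > max_pts:                                    # 像素过多时下采样,避免距离矩阵过大
        pts = pts[np.random.choice(len(pts), max_pts, replace=False)]
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    i, j = np.unravel_index(int(np.argmax(d)), d.shape)
    return float(d[i, j]), pts[i], pts[j]                     # 长度(像素)及两个端点
```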
在一个可能的实施方式中,第一预测结果指示待处理图像的区域1是车道线,该车道线还包括导向车道线。参阅图4-b,图4-b为本申请实施例中获取停止线长度的另一种方案的示意图。在这种方式中,可以从区域1中选取多个像素点,根据该多个像素点进行直线拟合,获取拟合后的直线线段。此外,第一预测结果会指示每个像素属于导向车道线的概率,每一条导向车道线都有各自的概率图,以一条导向车道线为例,该条导向车道线的概率图指示每个像素点属于该条导向车道线的概率,将属于该导向车道线的概率超过预设阈值的像素点占据的区域称为区域2,则该第一区域中的区域2可以用于表示该条导向车道线在待处理图像中的位置。可以从区域2中选取多个像素点,根据该多个像素点进行直线拟合。针对每一条导向车道线,都可以按照上述方式获取各自拟合后的线段,以获取多条拟合后的线段。在一些可能的实施方式中,也可以对多个像素点进行曲线拟合。根据第一交点和第二交点之间的距离获取停止线在待处理图像中的长度,第一交点是待处理图像中第一导向车道线对应的曲线线段和停止线对应的直线线段一端的交点,第二交点是待处理图像中第二导向车道线对应的曲线线段和停止线对应的直线线段另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。通过导向车道线重新确定停止车道线的长度,有利于获取更准确的停止车道线的长度,降低误差。
304、根据停止线在待处理图像中的长度获取感兴趣区域的长度。
在一个可能的实施方式中,停止线的长度即为感兴趣区域的长度。在一个可能的实施方式中,可以对停止线的长度进行处理,比如将停止线的长度增加预设像素距离以获取感兴趣区域的长度。
305、根据高度信息和比例尺获取待检测物体在待处理图像中的长度。
比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。
本申请提供的方案可以通过多种可能的方式获取比例尺,包括但不限于以下两种方式:
在一个可能的实施方式中,获取第一距离,第一距离是待检测物体和自车之间的距离。可以通过多种方式获取待检测物体和自车之间的距离,比如可以通过雷达126获取待检测物体和自车之间的距离,相关技术中可以获取待检测物体和自车之间的距离方式,本申请实施例均可以采用,包括但不限于单目测距方式,双目测距方式。获取第二距离,第二距离是停止线和待处理图像的下边缘之间的距离,比如参照图4-c,待处理图像中的停止线和待处理图像的下边缘之间的距离可以指示图像中待检测物体(比如是交通信号灯)和自车之间的距离,根据第一距离和第二距离获取比例尺。通过比例尺可以获取一个像素的长度对应的实际物理长度,进而可以获取待检测物体的物理高度在待处理图像中占据的像素的长度。其中,待检测物体的物理高度参照图2对应的实施例中的步骤203中描述的待检测物体的物理高度进行理解,这里不再重复赘述。
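作为参考,下面给出根据第一距离、第二距离估计比例尺并换算待检测物体像素长度的一段示意性Python代码;其中"第二距离(像素)与第一距离(米)近似对应同一段路程"是这段示意代码的假设,交通灯7米的物体高度也仅为示例取值:

```python
def roi_width_from_distance(d1_m, d2_px, obj_height_m=7.0):
    """示意:d1_m 为待检测物体与自车的物理距离(米),d2_px 为停止线到待处理图像
    下边缘的像素距离。假设二者近似对应同一段距离,由此得到比例尺(像素/米),
    再换算预设物理高度在图像中占据的像素长度,作为ROI宽度的依据。"""
    scale_px_per_m = d2_px / d1_m          # 比例尺:一米对应的像素数(实现方式之一,属假设)
    return obj_height_m * scale_px_per_m   # 待检测物体在待处理图像中的像素长度
```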
在一个可能的实施方式中,第一预测结果指示待处理图像的第一区域是车道线,且该车道线还包括导向车道线时,比如第一区域的区域2指示包括至少两条导向车道线。参阅图4-d,获取至少两条导向车道线中任意两条相邻的导向车道线在待处理图像中的宽度。根据任意两条相邻的导向车道线在待处理图像中的宽度和预设定的两条导向车道线的物理宽度获取比例尺。在这种实施方式中,任意两条相邻的导向车道线在待处理图像中的宽度指示待处理图像中车道的宽度,结合实际的车道宽度,比如通常车道的宽度为3.5米,其中车道的宽度可以预先设定,则通过二者的比值,可以获取一个像素的长度对应的实际物理长度,进而可以获取待检测物体的物理高度在待处理图像中占据的像素的长度。
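类似地,下面的示意性Python代码按相邻导向车道线的像素宽度与预设车道物理宽度求比例尺;其中3.5米的车道宽度、7米的物体高度均为示例假设:

```python
def roi_width_from_lane_width(lane_px, lane_m=3.5, obj_height_m=7.0):
    """示意:lane_px 为任意两条相邻导向车道线在图像中的像素宽度,lane_m 为预设的
    车道物理宽度(假设值),二者比值即比例尺,再换算待检测物体的像素长度。"""
    scale_px_per_m = lane_px / lane_m      # 比例尺:一米对应的像素数
    return obj_height_m * scale_px_per_m
```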
306、根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一个可能的实施方式中,待检测物体在待处理图像中的长度即为感兴趣区域的宽度,在一个可能的实施方式中,可以对待检测物体在待处理图像中的长度进行处理,比如将待检测物体在待处理图像中的长度增加预设像素距离以获取感兴趣区域的宽度。
在一个可能的实施方式中,感兴趣区域的下边缘的位置根据停止线在待处理图像中的位置确定,比如感兴趣区域的下边缘即为根据步骤303获取的停止线对应的第一交点和第二交点之间的线段。
通过上述步骤,可以获取感兴趣区域的长度和宽度,进而获取感兴趣区域的尺寸。对于路口路段,停止线和交通信号灯往往是一同出现在待处理图像中,车辆在行驶过程中,停止线在待处理图像中的区域也是不断变动的,交通信号灯在待处理图像中的区域也是不断变动的。本申请提供的方案通过停止线在待处理图像中的位置获取感兴趣区域在待处理图像中的位置,通过停止线的长度获取感兴趣区域的尺寸。停止线在待处理图像中的区域也是不断变动的,所以获取到的感兴趣区域的位置和尺寸也随之变化,使感兴趣区域可以精确地包括路口路段的场景,有利于对路口路段的待检测物体进行识别,比如进行红绿灯检测,提升对路口路段的待检测物体识别的准确度。
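把上述各步骤串联起来,可以得到如下示意性的Python代码,用于由停止线两端的交点和待检测物体的像素长度组装ROI;其中"停止线近似水平"以及(x, y, w, h)的返回形式均为实现上的假设:

```python
import numpy as np

def roi_from_stop_line(p1, p2, roi_width_px):
    """示意:p1、p2 为停止线两端的交点(图像坐标,x向右、y向下),ROI下边缘即 p1-p2 线段,
    ROI长度为两交点间距,ROI宽度 roi_width_px 由待检测物体的像素长度给出,ROI向图像上方延伸。"""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    length = float(np.linalg.norm(p2 - p1))          # 感兴趣区域的长度
    bottom_y = float((p1[1] + p2[1]) / 2.0)          # 下边缘纵坐标取两端点均值,假设停止线近似水平
    left_x = float(min(p1[0], p2[0]))
    return left_x, bottom_y - roi_width_px, length, roi_width_px   # (x, y, w, h) 形式的ROI框
```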
二、待处理图像不包括停止车道线
在一些场景中,待处理图像可能不包括停止车道线,此时可以根据导向车道线获取感兴趣区域,针对导向车道线的不同形态,可以采用不同的获取感兴趣区域的方式,下面结合几个典型的实施方式进行说明。
参阅图5,为本申请实施例提供的另一种图像处理方法的流程示意图。
如图5所示,本申请提供的另一种图像处理方法可以包括如下步骤:
501、获取待处理图像。
502、将待处理图像输入至第一神经网络,以获取第一预测结果。
步骤501和步骤502可以参照图2对应的实施例中的步骤201和步骤202进行理解,这里不再重复赘述。
503、第一预测结果指示待处理图像的第一区域中的车道线包括至少两条导向车道线且不包括停止线,根据第三交点和第四交点之间的距离获取感兴趣区域的长度。
第一预测结果会指示每个像素属于导向车道线的概率,每一条导向车道线都有各自的概率图,以一条导向车道线为例,该条导向车道线的概率图指示每个像素点属于该条导向车道线的概率,将属于该导向车道线的概率超过预设阈值的像素点占据的区域称为区域2,则第一区域中的区域2可以用于表示该条导向车道线在待处理图像中的位置(获取区域2中包括的像素点组成了一条导向车道线)。在一个可能的实施方式中,可以从区域2中选取多个像素点,根据该多个像素点进行曲线拟合。针对每一条导向车道线,都可以按照上述方式获取各自拟合后的曲线线段,可以认为拟合后的曲线线段为一条导向车道线。第三交点是待处理图像中第一导向车道线和第一直线线段的一端的交点,第四交点是待处理图像中第二导向车道线和第一直线线段另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线,第一直线线段是经过第二像素的一条直线线段,第二像素是至少两条导向车道线中最短的导向车道线在待处理图像中最高点对应的像素。下面结合图6进行说明,参阅图6,在一些可能的场景中,待处理图像中包括至少两条导向车道线,该两条导向车道线中的至少一条车道线有缺失,导致该至少两条车道线的长度不一致。其中车道线的缺失可能是由于车道线的实际缺失导致,也可能是因为经过图像分割神经网络处理后导致的,本申请实施例对车道线有缺失的具体情况并不进行限定。当获取到待处理图像中有至少一条车道线有缺失时,则根据该最短的车道线在待处理图像中最高点对应的像素、经过该最高点对应的像素的直线线段,与最左车道线和最右车道线的交点(第三交点、第四交点)获取感兴趣区域的长度。在一个可能的实施方式中,该第三交点和第四交点之间的线段即为感兴趣区域的下边缘。
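下面给出与图5对应实施例思路相近的一段示意性Python代码(直接基于车道线掩码做近似计算,取交点的方式、行数容差band等均为实现上的假设,并非本申请的限定):

```python
import numpy as np

def bottom_edge_from_shortest_lane(guide_masks, band=3):
    """示意:guide_masks 为各导向车道线的布尔掩码(坐标系x向右、y向下)。最短的导向车道线
    其最高点(y最小的像素)在图像中位置最低,取该行作水平线,与最左、最右两条导向车道线
    的交点(即第三交点、第四交点)近似确定ROI下边缘。"""
    tops = [int(np.nonzero(m)[0].min()) for m in guide_masks]   # 每条线最高点的 y
    y_row = max(tops)                                           # 最短导向车道线最高点所在的行
    mean_x = [float(np.nonzero(m)[1].mean()) for m in guide_masks]
    left = guide_masks[int(np.argmin(mean_x))]                  # 最左导向车道线
    right = guide_masks[int(np.argmax(mean_x))]                 # 最右导向车道线
    lo, hi = max(y_row - band, 0), y_row + band + 1
    x3 = float(np.nonzero(left[lo:hi])[1].mean())               # 第三交点横坐标(近似)
    x4 = float(np.nonzero(right[lo:hi])[1].mean())              # 第四交点横坐标(近似)
    return (x3, y_row), (x4, y_row), abs(x4 - x3)               # 两交点及感兴趣区域的长度
```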
504、根据高度信息和比例尺获取待检测物体在待处理图像中的长度。
步骤504可以参照图3对应的实施例中的步骤305进行理解,这里不再重复赘述。
505、根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一个可能的实施方式中,待检测物体在待处理图像中的长度即为感兴趣区域的宽度,在一个可能的实施方式中,可以对待检测物体在待处理图像中的长度进行处理,比如将待检测物体在待处理图像中的长度增加预设像素距离以获取感兴趣区域的宽度。
由图5对应的实施例可知,在一些可能的场景中,待处理图像不包括停止车道线,无法根据停止车道线在待处理图像中选取出路口路段对应的区域。为了能够在待处理图像不包括停止车道线时,也能够针对路口路段,从待处理图像中选取可能的路口路段对应的区域,且保证获取的路口路段对应的区域尽量是一个完整的路口路段区域,则根据最短的导向车道线在待处理图像中最高点对应的像素获取感兴趣区域的下边缘的位置、进一步的确定感兴趣区域的尺寸。
在图5对应的实施例中,待处理图像中包括的至少两条导向车道线中存在缺失的导向车道线,在一些可能的实施方式中,该待处理图像中的至少两条导向车道线可能都是完整的导向车道线,没有缺失;此外,在一些可能的实施方式中,该待处理图像中的至少两条导向车道线之间可能有相交的情况。其中,属于一条导向车道线上的像素点的横坐标与属于另一条导向车道线上的像素点的横坐标之间的差值在预设范围内,则可以认为该两条导向车道线是相交的。下面针对这些场景,结合一个具体的实施方式说明如何确定感兴趣区域的尺寸以及位置。
参阅图7,为本申请实施例提供的另一种图像处理方法的流程示意图。
如图7所示,本申请提供的另一种图像处理方法可以包括如下步骤:
701、获取待处理图像。
702、将待处理图像输入至第一神经网络,以获取第一预测结果。
步骤701和步骤702可以参照图2对应的实施例中的步骤201和步骤202进行理解,这里不再重复赘述。
703、第一区域中的车道线包括至少两条导向车道线且不包括停止线,感兴趣区域的下边缘的位置根据第一线段在待处理图像中的位置确定。
第一预测结果会指示每个像素属于导向车道线的概率,每一条导向车道线都有各自的概率图,以一条导向车道线为例,该条导向车道线的概率图指示每个像素点属于该条导向车道线的概率,将属于该导向车道线的概率超过预设阈值的像素点占据的区域称为区域2,则第一区域中的区域2可以用于表示该条导向车道线在待处理图像中的位置(获取区域2中包括的像素点组成了一条导向车道线)。在一个可能的实施方式中,可以从区域2中选取多个像素点,根据该多个像素点进行曲线拟合。针对每一条导向车道线,都可以按照上述方式获取各自拟合后的曲线线段,可以认为拟合后的曲线线段为一条导向车道线。
第一线段占据预设长度的像素,其中预设长度可以是一个长度的范围区间,该长度的范围区间中的任意一条线段为第一线段。或者,该预设长度也可以是某一个固定的长度。第一线段的一端与第一导向车道线相交,第一线段的另一端与第二导向车道线相交,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。在一个可能的实施方式中,该第一线段占据300个像素,假设第一线段与第一导向车道线的交点为交点1,第一线段与第二导向车道线的交点为交点2,则交点1的横坐标和交点2的横坐标之间的差值为预设长度的像素,比如交点1的横坐标和交点2的横坐标之间的差值为300个像素的长度。再比如,在一个可能的实施方式中,预设长度是一个长度的范围区间,参照图8进行理解,最左导向车道线和最右导向车道线为该至少两条导向车道线中距离最远的两条导向车道线,最左导向车道线和最右导向车道线之间可能有无数的线段,比如图8中的线段1,线段2,线段3:假设线段1占据的像素长度不满足条件,比如线段1占据的像素长度不在预设的长度范围内,具体的,超出预设的长度范围中的最大长度;线段3占据的像素长度不满足条件,比如线段3占据的像素长度不在预设的长度范围内,具体的,小于预设的长度范围中最小的长度,线段2占据的像素长度满足条件,比如线段2占据的像素长度在预设的长度范围内。则从满足预设长度范围的线段中任意选择一条作为第一线段,比如选择线段2作为第一线段,并根据第一线段确定感兴趣区域的下边缘的位置。比如感兴趣区域的下边缘与第一线段在待处理图像上的距离不超过预设阈值。
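下面给出一段示意性的Python代码,演示按预设像素长度(示例取300±20像素,均为假设取值)在最左、最右导向车道线之间搜索第一线段的一种可能做法;其中车道线以 x = f(y) 的多项式拟合结果表示,属于实现上的假设:

```python
import numpy as np

def bottom_edge_by_preset_length(left_fit, right_fit, img_h, target_px=300, tol=20):
    """示意:left_fit、right_fit 为最左、最右导向车道线拟合后 x = f(y) 的多项式系数
    (例如 np.polyfit 的结果)。自图像下边缘向上逐行计算两条线之间的水平像素距离,
    取首个落入预设长度区间的行,该行上两交点之间的线段即第一线段。"""
    for y in range(img_h - 1, -1, -1):
        x_l = float(np.polyval(left_fit, y))
        x_r = float(np.polyval(right_fit, y))
        width = abs(x_r - x_l)
        if abs(width - target_px) <= tol:              # 像素长度落入预设范围
            return (x_l, y), (x_r, y), width           # 第一线段两端点及其长度(即ROI长度)
    return None                                        # 未找到满足条件的线段
```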
704、根据第一线段的长度获取感兴趣区域的长度。
在一个可能的实施方式中,第一线段的长度即为感兴趣区域的长度,在一个可能的实施方式中,对第一线段的长度进行处理,以获取感兴趣区域的长度,比如将第一线段的长度增加预设像素距离以获取感兴趣区域的长度。
705、根据高度信息和比例尺获取待检测物体在待处理图像中的长度。
步骤705可以参照图3对应的实施例中的步骤305进行理解,这里不再重复赘述。
706、根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一个可能的实施方式中,待检测物体在待处理图像中的长度即为感兴趣区域的宽度,在一个可能的实施方式中,可以对待检测物体在待处理图像中的长度进行处理,比如将待检测物体在待处理图像中的长度增加预设像素距离以获取感兴趣区域的宽度。
由图7对应的实施例可知,在一些可能的场景中,待处理图像不包括停止车道线,无法根据停止车道线在待处理图像中选取出路口路段对应的区域。为了能够在待处理图像不包括停止车道线时,也能够针对路口路段,从待处理图像中选取可能的路口路段对应的区域,且在保证能够获取完整的路口路段对应的区域的基础上,保证获取的路口路段对应的区域不至于太小,避免因此导致获取的感兴趣区域太小。则根据预设的像素长度对应的线段与最左边的导向车道线、最右边导向车道线的交点获取感兴趣区域的下边缘的位置、进一步地确定感兴趣区域的尺寸。
以上对如何获取感兴趣区域在待处理图像中的位置以及感兴趣区域的尺寸进行了介绍。获取了感兴趣区域在待处理图像中的位置和尺寸后,可以将待处理图像中将感兴趣区作为抠图区域,将抠图区域输入至第二神经网络,使第二神经网络根据抠图区域确定待检测物体的候选框和分类。在一些可能的实施方式中,还可以对抠图区域进行超分辨率处理,提升抠图区域的画面的质量,将经过超分辨率处理后的抠图区域输入至第二神经网络中,以提升第二神经网络对于物体检测的效果。在一些可能的实施方式中,抠图区域的尺寸可能过大,为了减少第二神经网络的计算量,还可以对抠图区域进行压缩处理,将经过压缩处理后的抠图区域输入至第二神经网络。下面结合一个具体的实施例进行说明。
参阅图9,为本申请实施例提供的另一种图像处理方法的流程示意图。
如图9所示,本申请提供的另一种图像处理方法可以包括如下步骤:
901、获取待处理图像。
902、将待处理图像输入至第一神经网络,以获取第一预测结果。
步骤901和步骤902可以参照图2对应的实施例中的步骤201和步骤202进行理解,这里不再重复赘述。
903、第一预测结果指示待处理图像的第一区域是车道线时,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域。
图2、图3、图5、图7对应的实施例描述的根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域的方式,图9对应的实施例都可以采用,这里不再重复赘述。
904、若根据高度信息和第一区域获取的感兴趣区域的分辨率大于第二预设阈值时,将感兴趣区域的分辨率压缩至第二预设阈值。
比如预设的分辨率为896*512像素,若根据高度信息和第一区域获取的感兴趣区域的分辨率大于896*512像素,则对获取到的感兴趣区域进行压缩处理,以将感兴趣区域的分辨率压缩至896*512像素。其中预设的分辨率的大小与第二神经网络的输入相关,比如第二神经网络输入格式为896*512像素,则设定预设分辨率为896*512像素。关于如何对图像进行压缩处理,压缩至指定分辨率可以采用多种方式,本申请实施例对此并不进行限定。比如对相邻的多个像素求平均值以得到一个像素,以实现对图像进行压缩的目的。
905、若根据高度信息和第一区域获取的感兴趣区域的分辨率小于第二预设阈值时,对感兴趣区域进行超分辨率处理,以使感兴趣区域的分辨率提升至第二预设阈值。
比如预设的分辨率为896*512像素,若根据高度信息和第一区域获取的感兴趣区域的分辨率小于896*512像素,则对获取到的感兴趣区域进行超分辨率处理,以将感兴趣区域的分辨率提升至896*512像素。其中预设的分辨率的大小与第二神经网络的输入相关,比如第二神经网络输入格式为896*512像素,则设定预设分辨率为896*512像素。关于如何对图像进行超分辨率处理,提升画质到指定的像素有多种可能的实现方式,本申请实施例对此并不进行限定。比如,可以通过超分辨率卷积神经网络(super resolution convolutional neural networks,SRCNN)、快速超分辨率卷积神经网络(accelerating the super-resolution convolutional neural network,FSRCNN)等深度学习网络进行超分辨率处理。具体的,可以对感兴趣区域采用双三次插值算法进行处理,以提升感兴趣区域的分辨率。
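下面给出一段示意性的Python代码,把步骤904、905合并为对ROI分辨率的统一调整;其中896*512的输入分辨率、使用OpenCV的INTER_AREA/INTER_CUBIC插值均为示例假设,实际也可替换为SRCNN、FSRCNN等超分辨率网络:

```python
import cv2

def adjust_roi_resolution(roi_img, target_wh=(896, 512)):
    """示意:假设第二神经网络的输入为 896*512 像素(即第二预设阈值,属示例取值)。
    ROI分辨率大于该阈值时用邻域取平均的方式(INTER_AREA)压缩,
    小于该阈值时用双三次插值(INTER_CUBIC)做简单的分辨率提升。"""
    h, w = roi_img.shape[:2]
    if (w, h) == target_wh:
        return roi_img
    interp = cv2.INTER_AREA if w * h > target_wh[0] * target_wh[1] else cv2.INTER_CUBIC
    return cv2.resize(roi_img, target_wh, interpolation=interp)
```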
经过超分辨处理后,图像的细节更明显,更有利于提升目标检测的准确度。参阅图10,通过第二神经网络对感兴趣区域进行处理,以获取感兴趣区域中待检测物体的候选框和类别。根据感兴趣区域中包括的任意一个像素的坐标,可以将感兴趣区域重新合并到待处理图像中。比如待处理图像中每个像素都有对应的坐标,可以根据感兴趣区域左上角的像素在待处理图像中的坐标,将感兴趣区域重新合并到待处理图像,进而可以在待处理图像中显示待检测物体的候选框以及待检测物体的类别。
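下面给出将候选框坐标从ROI(抠图区域)坐标系换算回待处理图像坐标系的一段示意性Python代码;若ROI经过压缩或超分辨率处理,还需按缩放比例还原坐标,缩放比例参数为假设的接口形式:

```python
def map_boxes_to_full_image(boxes, roi_origin, scale_x=1.0, scale_y=1.0):
    """示意:boxes 为第二神经网络在ROI坐标系下输出的候选框 [(x1, y1, x2, y2), ...],
    roi_origin 为ROI左上角像素在待处理图像中的坐标 (x0, y0)。先按缩放比例还原,
    再平移,即可将候选框重新合并回待处理图像并在原图中显示。"""
    x0, y0 = roi_origin
    return [(x1 / scale_x + x0, y1 / scale_y + y0, x2 / scale_x + x0, y2 / scale_y + y0)
            for (x1, y1, x2, y2) in boxes]
```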
参照图11,为本申请实施例提供的一种图像处理方法的流程示意图。获取待处理图像,判断待处理图像中是否存在停止车道线,若存在停止车道线,则根据停止车道线和导向车道线获取感兴趣区域。具体的,根据停止车道线和最左车道线、最右车道线的两个交点之间的线段确定感兴趣区域的下边缘(长度以及位置),根据比例尺、待检测物体的实际物理高度获取感兴趣区域的宽度。若待处理图像中不包括停止车道线,则进一步判断待处理图像中最短的一条导向车道线是否与其他导向车道线相交。如果相交,则根据目标线段获取感兴趣区域的下边缘(长度以及位置),目标线段的长度为300个像素的长度,且目标线段的一端与最左车道线相交,目标线段的另一端与最右车道线相交。需要说明的是,300个像素的长度仅为示例性的说明,目标线段的长度可以根据第二神经网络输入的阈值确定。如果不相交,则根据目标线段与最左车道线、最右车道线的两个交点之间的线段获取感兴趣区域的下边缘(长度以及位置),根据比例尺、待检测物体的实际物理高度获取感兴趣区域的宽度。其中目标线段与待处理图像的下边缘平行,且该目标线段经过最短的车道线在待处理图像中最高点对应的像素。获取了感兴趣区域后,判断感兴趣区域的分辨率与预设的分辨率之间的关系,如果感兴趣区域的分辨率大于预设的分辨率,则对感兴趣区域进行压缩处理,以将感兴趣区域的分辨率压缩至预设的分辨率,如果感兴趣区域的分辨率小于预设的分辨率,则对感兴趣区域进行超分辨率处理,以提升感兴趣区域的分辨率至预设的分辨率。
参阅图12,为在待处理图像中选取感兴趣区域的示意图,根据停止车道线和最左车道线、最右车道线的两个交点之间的线段确定感兴趣区域的下边缘(长度以及位置),根据比例尺、待检测物体的实际物理高度获取感兴趣区域的宽度。
在一个可能的实施方式中,可以通过车载设备显示感兴趣区域,或者在挡风玻璃上投影出感兴趣区域,感兴趣区域始终包括路口路段对应的区域。根据本申请提供的方案获取的感兴趣区域,只会包括影响自车所在车道行驶状态的交通信号灯,所以针对该感兴趣区域进行交通信号灯的检测,只会输出一种决策结果。
以上对本申请实施例提供的一种图像处理的方法进行了介绍,通过本申请提供的一种图像处理的方法,可以很好的在待处理图像中选出路口路段对应的区域,有利于提升路口路段的待检测物体的检测的准确度。
在图2至图11所对应的实施例的基础上,为了更好地实施本申请实施例的上述方案,下面还提供用于实施上述方案的相关设备。具体参阅图13,图13为本申请实施例提供的图像处理装置的一种结构示意图。图像处理装置可以包括获取模块131、图像分割模块132、感兴趣区域模块133。
在一种可能的实施方式中,获取模块131,用于获取待处理图像。图像分割模块132,用于将待处理图像输入至第一神经网络,以获取第一预测结果。感兴趣区域模块133,用于第一预测结果指示待处理图像的第一区域是车道线时,根据高度信息和第一区域获取待处理图像中待检测物体的感兴趣区域,高度信息可以包括预设定的待检测物体的物理高度,感兴趣区域用于第二神经网络获取待检测物体的候选框和分类。
在一种可能的实施方式中,第一区域中的车道线可以包括停止线,感兴趣区域模块133,具体用于:获取停止线在待处理图像中的长度。根据停止线在待处理图像中的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,第一区域中可以包括多个第一像素,多个第一像素中各个第一像素属于停止线的概率超过第一预设阈值,停止线由多个第一像素组成,感兴趣区域模块133,具体用于:根据多个第一像素中距离最远的两个像素之间的距离获取停止线在待处理图像中的长度。
在一种可能的实施方式中,感兴趣区域模块133,还用于:获取第一距离,第一距离是待检测物体和自车之间的距离。获取第二距离,第二距离是停止线和待处理图像的下边缘之间的距离。根据第一距离和第二距离获取比例尺。
在一种可能的实施方式中,第一区域中的车道线还可以包括至少两条导向车道线,感兴趣区域模块133,还用于:获取至少两条导向车道线中任意两条相邻的导向车道线在待处理图像中的宽度。根据任意两条相邻的导向车道线在待处理图像中的宽度和预设定的两条导向车道线的物理宽度获取比例尺。
在一种可能的实施方式中,感兴趣区域模块133,具体用于:根据第一交点和第二交点之间的距离获取感兴趣区域的长度,第一交点是待处理图像中第一导向车道线和停止线一端的交点,第二交点是待处理图像中第二导向车道线和停止线另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。
在一种可能的实施方式中,感兴趣区域的下边缘的位置根据停止线在待处理图像中的位置确定。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不包括停止线,感兴趣区域模块133,具体用于:根据第三交点和第四交点之间的距离获取感兴趣区域的长度,第三交点是待处理图像中第一导向车道线和第一线段的一端的交点,第四交点是待处理图像中第二导向车道线和第一线段另一端的交点,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线,第一线段是经过第二像素的一条线段,第二像素是至少两条导向车道线中最短的导向车道线在待处理图像中最高点对应的像素。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,第一线段与待处理图像的下边缘平行。
在一种可能的实施方式中,第一区域中的车道线可以包括至少两条导向车道线且不包括停止线,感兴趣区域的下边缘的位置根据第一线段在待处理图像中的位置确定,第一线段占据预设长度的像素,且第一线段的一端与第一导向车道线相交,第一线段的另一端与第二导向车道线相交,第一导向车道线和第二导向车道线是至少两条导向车道线中距离最远的两条导向车道线。
在一种可能的实施方式中,感兴趣区域模块133,具体用于:根据第一线段的长度获取感兴趣区域的长度。根据高度信息和比例尺获取待检测物体在待处理图像中的长度,比例尺用于指示待检测物体在待处理图像中的长度和待检测物体的物理高度之间的比例关系。根据待检测物体在待处理图像中的长度获取感兴趣区域的宽度。
在一种可能的实施方式中,还可以包括压缩模块,压缩模块,用于若根据高度信息和第一区域获取的感兴趣区域的分辨率大于第二预设阈值时,将感兴趣区域的分辨率压缩至第二预设阈值。
在一种可能的实施方式中,还可以包括超分辨率处理模块,超分辨率处理模块,用于若根据高度信息和第一区域获取的感兴趣区域的分辨率小于第二预设阈值时,对感兴趣区域进行超分辨率处理,以使感兴趣区域的分辨率提升至第二预设阈值。
在一种可能的实施方式中,待检测物体可以包括交通灯。
参阅图14,为本申请实施例提供的图像处理装置的另一种结构示意图。包括处理器1402和存储器1403。
其中,处理器1402包括但不限于中央处理器(central processing unit,CPU),网络处理器(network processor,NP),专用集成电路(application-specific integrated circuit,ASIC)或者可编程逻辑器件(programmable logic device,PLD)中的一个或多个。上述PLD可以是复杂可编程逻辑器件(complex programmable logic device,CPLD),现场可编程逻辑门阵列(field-programmable gate array,FPGA),通用阵列逻辑(generic array logic,GAL)或其任意组合。处理器1402负责通信线路1404和通常的处理,还可以提供各种功能,包括定时,外围接口,电压调节,电源管理以及其他控制功能。
存储器1403可以是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。存储器可以是独立存在,通过通信线路1404与处理器1402相连接。存储器1403也可以和处理器1402集成在一起。如果存储器1403和处理器1402是相互独立的器件,存储器1403和处理器1402相连,例如存储器1403和处理器1402可以通过通信线路1404通信,存储器1403也可以与处理器1402直连。
通信线路1404可以包括任意数量的互联的总线和桥,通信线路1404将包括由处理器1402代表的一个或多个处理器1402和存储器1403代表的存储器的各种电路链接在一起。通信线路1404还可以将诸如外围设备、稳压器和功率管理电路等之类的各种其他电路链接在一起,这些都是本领域所公知的,因此,本申请不再对其进行进一步描述。
在一个可能的实施方式中,该图像处理装置可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现图2至图11所描述的方法。
本申请实施例还提供了一种自动驾驶车辆,结合上述对图1的描述,请参阅图15,图15为本申请实施例提供的自动驾驶车辆的一种结构示意图,其中,自动驾驶车辆100上可以部署有图14对应实施例中所描述的图像处理装置,用于实现图2至图11对应实施例中自动驾驶车辆的功能。由于在部分实施例中,自动驾驶车辆100还可以包括通信功能,则自动驾驶车辆100除了包括图1中所示的组件,还可以包括:接收器1201和发射器1202,其中,处理器113可以包括应用处理器1131和通信处理器1132。在本申请的一些实施例中,接收器1201、发射器1202、处理器113和存储器114可通过总线或其它方式连接。
处理器113控制自动驾驶车辆的操作。具体的应用中,自动驾驶车辆100的各个组件通过总线系统耦合在一起,其中总线系统除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线系统。
接收器1201可用于接收输入的数字或字符信息,以及产生与自动驾驶车辆的相关设置以及功能控制有关的信号输入。发射器1202可用于通过第一接口输出数字或字符信息;发射器1202还可用于通过第一接口向磁盘组发送指令,以修改磁盘组中的数据;发射器1202还可以包括显示屏等显示设备。
本申请实施例中,应用处理器1131,用于执行图2至图11对应实施例中的自动驾驶车辆或者图像处理装置执行的图像处理方法。
需要说明的是,对于应用处理器1131执行图像处理方法的具体实现方式以及带来的有益效果,均可以参考图2至图11对应的各个方法实施例中的叙述,此处不再一一赘述。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于图像处理的程序,当其在计算机上运行时,使得计算机执行如前述图2至图11所示实施例描述的方法中自动驾驶车辆(或者图像处理装置)所执行的步骤。
本申请实施例中还提供一种计算机程序产品,当其在计算机上运行时,使得计算机执行如前述图2至图11所示实施例描述的方法中自动驾驶车辆(或者图像处理装置)所执行的步骤。
本申请实施例中还提供一种电路系统,所述电路系统包括处理电路,所述处理电路配置为执行如前述图2至图11所示实施例描述的方法中自动驾驶车辆(或者图像处理装置)所执行的步骤。
本申请实施例提供的图像处理装置或自动驾驶车辆具体可以为芯片,芯片包括:处理单元和通信单元,所述处理单元例如可以是处理器,所述通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使图像处理装置或自动驾驶车辆内的芯片执行上述图2至图11所示实施例描述的图像处理方法。可选地,所述存储单元为所述芯片内的存储单元,如寄存器、缓存等,所述存储单元还可以是位于所述芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图16,图16为本申请实施例提供的芯片的一种结构示意图,所述芯片可以表现为神经网络处理器NPU 130,NPU 130作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路1303,通过控制器1304控制运算电路1303提取存储器中的矩阵数据并进行乘法运算。
在一些实现中,运算电路1303内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路1303是二维脉动阵列。运算电路1303还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路1303是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器1302中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器1301中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)1308中。
统一存储器1306用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(direct memory access controller,DMAC)1305被搬运到权重存储器1302中。输入数据也通过DMAC被搬运到统一存储器1306中。
总线接口单元(bus interface unit,BIU)1310,用于AXI总线与DMAC和取指存储器(instruction fetch buffer,IFB)1309的交互。
BIU1310,用于取指存储器1309从外部存储器获取指令,还用于存储单元访问控制器1305从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器1306或将权重数据搬运到权重存储器1302中或将输入数据搬运到输入存储器1301中。
向量计算单元1307包括多个运算处理单元,在需要的情况下,对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如批归一化(batch normalization),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元1307能将经处理的输出的向量存储到统一存储器1306。例如,向量计算单元1307可以将线性函数和/或非线性函数应用到运算电路1303的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元1307生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路1303的激活输入,例如用于在神经网络中的后续层中的使用。
控制器1304连接的取指存储器(instruction fetch buffer)1309,用于存储控制器1304使用的指令。
统一存储器1306,输入存储器1301,权重存储器1302以及取指存储器1309均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,循环神经网络中各层的运算可以由运算电路1303或向量计算单元1307执行。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述第一方面方法的程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(solid state disk,SSD))等。

Claims (33)

  1. 一种图像处理方法,其特征在于,包括:
    获取待处理图像;
    将所述待处理图像输入至第一神经网络,以获取第一预测结果;
    所述第一预测结果指示所述待处理图像的第一区域是车道线时,根据高度信息和所述第一区域获取所述待处理图像中待检测物体的感兴趣区域,所述高度信息包括预设定的所述待检测物体的物理高度,所述感兴趣区域用于第二神经网络获取待检测物体的候选框和分类。
  2. 根据权利要求1所述的方法,其特征在于,所述第一区域中的车道线包括停止线,所述根据高度信息和所述第一区域获取所述待处理图像的感兴趣区域,包括:
    获取所述停止线在所述待处理图像中的长度;
    根据所述停止线在所述待处理图像中的长度获取所述感兴趣区域的长度;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  3. 根据权利要求2所述的方法,其特征在于,所述第一区域中包括多个第一像素,所述多个第一像素中各个第一像素属于所述停止线的概率超过第一预设阈值,所述停止线由所述多个第一像素组成,所述获取所述停止线在所述待处理图像中的长度,包括:
    根据所述多个第一像素中距离最远的两个像素之间的距离获取所述停止线在所述待处理图像中的长度。
  4. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    获取第一距离,所述第一距离是所述待检测物体和自车之间的距离;
    获取第二距离,所述第二距离是所述停止线和所述待处理图像的下边缘之间的距离;
    根据所述第一距离和所述第二距离获取所述比例尺。
  5. 根据权利要求2所述的方法,其特征在于,所述第一区域中的车道线还包括至少两条导向车道线,所述方法还包括:
    获取所述至少两条导向车道线中任意两条相邻的导向车道线在所述待处理图像中的宽度;
    根据所述任意两条相邻的导向车道线在所述待处理图像中的宽度和预设定的两条导向车道线的物理宽度获取所述比例尺。
  6. 根据权利要求2至5任一项所述的方法,其特征在于,所述根据所述停止线在所述待处理图像中的长度获取所述感兴趣区域的长度,包括:
    根据第一交点和第二交点之间的距离获取所述感兴趣区域的长度,所述第一交点是所述待处理图像中第一导向车道线和所述停止线一端的交点,所述第二交点是所述待处理图像中第二导向车道线和所述停止线另一端的交点,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线。
  7. 根据权利要求2至6任一项所述的方法,其特征在于,所述感兴趣区域的下边缘的位置根据所述停止线在所述待处理图像中的位置确定。
  8. 根据权利要求1所述的方法,其特征在于,所述第一区域中的车道线包括至少两条导向车道线且不包括停止线,所述根据高度信息和所述第一区域获取所述待处理图像的感兴趣区域,包括:
    根据第三交点和第四交点之间的距离获取所述感兴趣区域的长度,所述第三交点是所述待处理图像中第一导向车道线和第一线段的一端的交点,所述第四交点是所述待处理图像中第二导向车道线和所述第一线段另一端的交点,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线,所述第一线段是经过第二像素的一条线段,所述第二像素是所述至少两条导向车道线中最短的导向车道线在所述待处理图像中最高点对应的像素;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  9. 根据权利要求8所述的方法,其特征在于,所述第一线段与所述待处理图像的下边缘平行。
  10. 根据权利要求1所述的方法,其特征在于,所述第一区域中的车道线包括至少两条导向车道线且不包括停止线,所述根据高度信息和所述第一区域获取所述待处理图像的感兴趣区域,包括:
    根据第一线段在所述待处理图像中的位置确定所述感兴趣区域的下边缘的位置,所述第一线段占据预设长度的像素,且所述第一线段的一端与第一导向车道线相交,所述第一线段的另一端与第二导向车道线相交,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线。
  11. 根据权利要求10所述的方法,其特征在于,所述根据高度信息和所述第一区域获取所述待处理图像的感兴趣区域,还包括:
    根据所述第一线段的长度获取所述感兴趣区域的长度;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  12. 根据权利要求1至11任一项所述的方法,其特征在于,若根据所述高度信息和所述第一区域获取的所述感兴趣区域的分辨率大于第二预设阈值时,所述方法还包括:
    将所述感兴趣区域的分辨率压缩至所述第二预设阈值。
  13. 根据权利要求1至11任一项所述的方法,其特征在于,若根据所述高度信息和所述第一区域获取的所述感兴趣区域的分辨率小于第二预设阈值时,所述方法还包括:
    对所述感兴趣区域进行超分辨率处理,以使所述感兴趣区域的分辨率提升至所述第二预设阈值。
  14. 根据权利要求1至13任一项所述的方法,其特征在于,所述待检测物体包括交通灯。
  15. 一种图像处理装置,其特征在于,包括:
    获取模块,用于获取待处理图像;
    图像分割模块,用于将所述待处理图像输入至第一神经网络,以获取第一预测结果;
    感兴趣区域模块,用于所述第一预测结果指示所述待处理图像的第一区域是车道线时,根据高度信息和所述第一区域获取所述待处理图像中待检测物体的感兴趣区域,所述高度信息包括预设定的所述待检测物体的物理高度,所述感兴趣区域用于第二神经网络获取待检测物体的候选框和分类。
  16. 根据权利要求15所述的图像处理装置,其特征在于,所述第一区域中的车道线包括停止线,所述感兴趣区域模块,具体用于:
    获取所述停止线在所述待处理图像中的长度;
    根据所述停止线在所述待处理图像中的长度获取所述感兴趣区域的长度;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  17. 根据权利要求16所述的图像处理装置,其特征在于,所述第一区域中包括多个第一像素,所述多个第一像素中各个第一像素属于所述停止线的概率超过第一预设阈值,所述停止线由所述多个第一像素组成,所述感兴趣区域模块,具体用于:
    根据所述多个第一像素中距离最远的两个像素之间的距离获取所述停止线在所述待处理图像中的长度。
  18. 根据权利要求16所述的图像处理装置,其特征在于,所述感兴趣区域模块,还用于:
    获取第一距离,所述第一距离是所述待检测物体和自车之间的距离;
    获取第二距离,所述第二距离是所述停止线和所述待处理图像的下边缘之间的距离;
    根据所述第一距离和所述第二距离获取所述比例尺。
  19. 根据权利要求16所述的图像处理装置,其特征在于,所述第一区域中的车道线还包括至少两条导向车道线,所述感兴趣区域模块,还用于:
    获取所述至少两条导向车道线中任意两条相邻的导向车道线在所述待处理图像中的宽度;
    根据所述任意两条相邻的导向车道线在所述待处理图像中的宽度和预设定的两条导向车道线的物理宽度获取所述比例尺。
  20. 根据权利要求16至19任一项所述的图像处理装置,其特征在于,所述感兴趣区域模块,具体用于:
    根据第一交点和第二交点之间的距离获取所述感兴趣区域的长度,所述第一交点是所述待处理图像中第一导向车道线和所述停止线一端的交点,所述第二交点是所述待处理图像中第二导向车道线和所述停止线另一端的交点,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线。
  21. 根据权利要求16至20任一项所述的图像处理装置,其特征在于,所述感兴趣区域的下边缘的位置根据所述停止线在所述待处理图像中的位置确定。
  22. 根据权利要求15所述的图像处理装置,其特征在于,所述第一区域中的车道线包括至少两条导向车道线且不包括停止线,所述感兴趣区域模块,具体用于:
    根据第三交点和第四交点之间的距离获取所述感兴趣区域的长度,所述第三交点是所述待处理图像中第一导向车道线和第一线段的一端的交点,所述第四交点是所述待处理图像中第二导向车道线和所述第一线段另一端的交点,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线,所述第一线段是经过第二像素的一条线段,所述第二像素是所述至少两条导向车道线中最短的导向车道线在所述待处理图像中最高点对应的像素;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  23. 根据权利要求22所述的图像处理装置,其特征在于,所述第一线段与所述待处理图像的下边缘平行。
  24. 根据权利要求15所述的图像处理装置,其特征在于,所述第一区域中的车道线包括至少两条导向车道线且不包括停止线,所述感兴趣区域的下边缘的位置根据第一线段在所述待处理图像中的位置确定,所述第一线段占据预设长度的像素,且所述第一线段的一端与第一导向车道线相交,所述第一线段的另一端与第二导向车道线相交,所述第一导向车道线和所述第二导向车道线是所述至少两条导向车道线中距离最远的两条导向车道线。
  25. 根据权利要求15至24任一项所述的图像处理装置,其特征在于,所述感兴趣区域模块,具体用于:
    根据所述第一线段的长度获取所述感兴趣区域的长度;
    根据所述高度信息和比例尺获取所述待检测物体在所述待处理图像中的长度,所述比例尺用于指示所述待检测物体在所述待处理图像中的长度和所述待检测物体的物理高度之间的比例关系;
    根据所述待检测物体在所述待处理图像中的长度获取所述感兴趣区域的宽度。
  26. 根据权利要求15至25任一项所述的图像处理装置,其特征在于,还包括压缩模块,
    所述压缩模块,用于若根据所述高度信息和所述第一区域获取的所述感兴趣区域的分辨率大于第二预设阈值时,将所述感兴趣区域的分辨率压缩至所述第二预设阈值。
  27. 根据权利要求15至25任一项所述的图像处理装置,其特征在于,还包括超分辨率处理模块,
    所述超分辨率处理模块,用于若根据所述高度信息和所述第一区域获取的所述感兴趣区域的分辨率小于第二预设阈值时,对所述感兴趣区域进行超分辨率处理,以使所述感兴趣区域的分辨率提升至所述第二预设阈值。
  28. 根据权利要求15至27任一项所述的图像处理装置,其特征在于,所述待检测物体包括交通灯。
  29. 一种图像处理装置,其特征在于,包括处理器,所述处理器和存储器耦合,所述存储器存储有程序指令,当所述存储器存储的程序指令被所述处理器执行时实现权利要求1至14中任一项所述的方法。
  30. 一种计算机可读存储介质,包括程序,当其在计算机上运行时,使得计算机执行如权利要求1至14中任一项所述的方法。
  31. 一种计算机程序产品,当在计算机上运行时,使得计算机可以执行如权利要求1至14任一所描述的方法。
  32. 一种芯片,其特征在于,所述芯片与存储器耦合,用于执行所述存储器中存储的程序,以执行如权利要求1至14任一项所述的方法。
  33. 一种智能汽车,其特征在于,所述智能汽车包括处理电路和存储电路,所述处理电路和所述存储电路被配置为执行如权利要求1至14中任一项所述的方法。
PCT/CN2021/131609 2020-12-31 2021-11-19 一种图像处理方法、装置以及智能汽车 WO2022142839A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011640167.6 2020-12-31
CN202011640167.6A CN114693540A (zh) 2020-12-31 2020-12-31 一种图像处理方法、装置以及智能汽车

Publications (1)

Publication Number Publication Date
WO2022142839A1 true WO2022142839A1 (zh) 2022-07-07

Family

ID=82135830

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131609 WO2022142839A1 (zh) 2020-12-31 2021-11-19 一种图像处理方法、装置以及智能汽车

Country Status (2)

Country Link
CN (1) CN114693540A (zh)
WO (1) WO2022142839A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116883951A (zh) * 2023-09-07 2023-10-13 杭州像素元科技有限公司 基于多源信息感知的高速施工员识别方法、装置及其应用
CN117437581A (zh) * 2023-12-20 2024-01-23 神思电子技术股份有限公司 基于图像语义分割和视角缩放的机动车拥堵长度计算方法
CN117495989A (zh) * 2023-12-29 2024-02-02 腾讯科技(深圳)有限公司 数据处理方法、装置、设备及可读存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129279B (zh) * 2023-04-14 2023-06-27 腾讯科技(深圳)有限公司 图像处理方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9607402B1 (en) * 2016-05-09 2017-03-28 Iteris, Inc. Calibration of pedestrian speed with detection zone for traffic intersection control
CN107688764A (zh) * 2016-08-03 2018-02-13 浙江宇视科技有限公司 检测车辆违章的方法及装置
CN109849922A (zh) * 2018-12-25 2019-06-07 青岛中汽特种汽车有限公司 一种用于智能车辆的基于视觉信息与gis信息融合的方法
CN111242118A (zh) * 2018-11-29 2020-06-05 长沙智能驾驶研究院有限公司 目标检测方法、装置、计算机设备和存储介质
CN111931745A (zh) * 2020-10-09 2020-11-13 蘑菇车联信息科技有限公司 车辆检测方法、装置、电子设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9607402B1 (en) * 2016-05-09 2017-03-28 Iteris, Inc. Calibration of pedestrian speed with detection zone for traffic intersection control
CN107688764A (zh) * 2016-08-03 2018-02-13 浙江宇视科技有限公司 检测车辆违章的方法及装置
CN111242118A (zh) * 2018-11-29 2020-06-05 长沙智能驾驶研究院有限公司 目标检测方法、装置、计算机设备和存储介质
CN109849922A (zh) * 2018-12-25 2019-06-07 青岛中汽特种汽车有限公司 一种用于智能车辆的基于视觉信息与gis信息融合的方法
CN111931745A (zh) * 2020-10-09 2020-11-13 蘑菇车联信息科技有限公司 车辆检测方法、装置、电子设备及存储介质

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116883951A (zh) * 2023-09-07 2023-10-13 杭州像素元科技有限公司 基于多源信息感知的高速施工员识别方法、装置及其应用
CN116883951B (zh) * 2023-09-07 2023-11-10 杭州像素元科技有限公司 基于多源信息感知的高速施工员识别方法、装置及其应用
CN117437581A (zh) * 2023-12-20 2024-01-23 神思电子技术股份有限公司 基于图像语义分割和视角缩放的机动车拥堵长度计算方法
CN117437581B (zh) * 2023-12-20 2024-03-01 神思电子技术股份有限公司 基于图像语义分割和视角缩放的机动车拥堵长度计算方法
CN117495989A (zh) * 2023-12-29 2024-02-02 腾讯科技(深圳)有限公司 数据处理方法、装置、设备及可读存储介质
CN117495989B (zh) * 2023-12-29 2024-04-19 腾讯科技(深圳)有限公司 数据处理方法、装置、设备及可读存储介质

Also Published As

Publication number Publication date
CN114693540A (zh) 2022-07-01

Similar Documents

Publication Publication Date Title
WO2021027568A1 (zh) 障碍物避让方法及装置
WO2022142839A1 (zh) 一种图像处理方法、装置以及智能汽车
WO2021102955A1 (zh) 车辆的路径规划方法以及车辆的路径规划装置
WO2021000800A1 (zh) 道路可行驶区域推理方法及装置
WO2021238306A1 (zh) 一种激光点云的处理方法及相关设备
WO2021217420A1 (zh) 车道线跟踪方法和装置
WO2021147748A1 (zh) 一种自动驾驶方法及相关设备
CN112512887B (zh) 一种行驶决策选择方法以及装置
EP3965004A1 (en) Automatic lane changing method and device, and storage medium
WO2021189210A1 (zh) 一种车辆换道方法及相关设备
WO2022016901A1 (zh) 一种规划车辆行驶路线的方法以及智能汽车
WO2021218693A1 (zh) 一种图像的处理方法、网络的训练方法以及相关设备
WO2022062825A1 (zh) 车辆的控制方法、装置及车辆
US20230399023A1 (en) Vehicle Driving Intention Prediction Method, Apparatus, and Terminal, and Storage Medium
US20230048680A1 (en) Method and apparatus for passing through barrier gate crossbar by vehicle
US20240017719A1 (en) Mapping method and apparatus, vehicle, readable storage medium, and chip
WO2022052881A1 (zh) 一种构建地图的方法及计算设备
CN113885045A (zh) 车道线的检测方法和装置
WO2022088761A1 (zh) 一种规划车辆驾驶路径的方法、装置、智能车以及存储介质
WO2022017307A1 (zh) 自动驾驶场景生成方法、装置及系统
US20220309806A1 (en) Road structure detection method and apparatus
WO2022151839A1 (zh) 一种车辆转弯路线规划方法及装置
CN115205848A (zh) 目标检测方法、装置、车辆、存储介质及芯片
WO2021159397A1 (zh) 车辆可行驶区域的检测方法以及检测装置
CN115508841A (zh) 一种路沿检测的方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913581

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21913581

Country of ref document: EP

Kind code of ref document: A1