WO2021008500A1 - Image processing method and apparatus - Google Patents

Image processing method and apparatus

Info

Publication number
WO2021008500A1
WO2021008500A1 (PCT/CN2020/101717; CN2020101717W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
vehicle
area
frame
image area
Prior art date
Application number
PCT/CN2020/101717
Other languages
French (fr)
Chinese (zh)
Inventor
Liu Xingye (刘兴业)
Xie Weilun (谢伟伦)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2021008500A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformations in the plane of the image
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053: Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T 3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • This application relates to the field of autonomous driving and assisted driving technology, and in particular to image processing methods and apparatuses.
  • Image super-resolution technology has important application value in surveillance equipment, satellite imagery, and autonomous driving technology.
  • Image super-resolution technology includes single-frame super-resolution technology and multi-frame super-resolution technology.
  • Single-frame super-resolution refers to recovering a high-resolution image from a single low-resolution image.
  • Multi-frame super-resolution refers to recovering a high-resolution image from multiple low-resolution images.
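As a rough illustration of the distinction, multi-frame super-resolution can be sketched with a deliberately naive fusion. This is a hypothetical sketch only: nearest-neighbour upsampling plus averaging stand in for the registration and reconstruction steps a real pipeline would use.

```python
import numpy as np

def upscale_nearest(img, scale=2):
    # Naive single-frame upscaling: repeat each pixel scale x scale times.
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)

def multi_frame_super_resolve(frames, scale=2):
    # Naive multi-frame fusion: upscale every (already aligned) low-resolution
    # frame and average; independent noise in individual frames averages out.
    upscaled = [upscale_nearest(f.astype(float), scale) for f in frames]
    return np.mean(upscaled, axis=0)
```

For frames corrupted by independent noise, the average over several upscaled frames is closer to the true signal than any single upscaled frame, which is the core advantage the multi-frame approach exploits.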
  • the embodiments of the present application provide an image processing method and apparatus, which help improve the image compensation effect.
  • an image processing method is provided, which is applied to an in-vehicle device.
  • the method includes: first, acquiring multiple frames of images, the multiple frames including image information of the road around the vehicle in which the in-vehicle device is located; then, acquiring a first image area in each frame of the multiple frames, where the multiple first image areas of the multiple frames (each frame includes one first image area) correspond to a first scene; and then, performing a super-resolution operation on the multiple first image areas.
  • the super-resolution operation is specifically multi-frame super-resolution.
  • in this way, the feature information of the image information in the multiple frames can be combined for image compensation, which helps improve the image compensation effect compared with prior-art solutions that use single-frame super-resolution.
  • since in the present technical solution multi-frame super-resolution is performed on image regions corresponding to the same scene in the multiple frames, rather than on the full frames themselves, the complexity of the super-resolution operation is reduced, thereby speeding up processing.
  • when the processing result of the image processing method provided by the present technical solution is applied to an assisted or autonomous driving path planning scenario, it helps improve the accuracy of the path planning.
  • the first scene may be understood as the road conditions around the vehicle or the spatial area where one or more objects in the driving field of view are located.
  • the first image area may be part or all of the area in one frame of image.
  • the first image area may or may not contain the image information of the target object.
  • the target object may be predefined, of course, the embodiment of the present application is not limited to this.
  • the method further includes: determining that image information of the target object exists in the image area obtained by the super-resolution operation.
  • the method further includes: detecting the relative position between the vehicle and the target object.
  • the relative position includes a relative distance and a relative angle, and the relative angle includes an azimuth angle and/or a pitch angle.
  • the relative position may be used to assist automatic driving path planning.
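As an illustration of how a relative angle could be derived from an image, the sketch below maps a detected object's horizontal pixel offset to an azimuth under a pinhole-camera approximation. The function and the field-of-view parameter are assumptions for illustration, not the method claimed here.

```python
import math

def azimuth_from_pixel(cx, image_width, horizontal_fov_deg):
    # Hypothetical pinhole approximation: map the horizontal pixel position
    # of a detected object's centre to an azimuth angle (degrees) relative
    # to the camera's optical axis.
    focal_px = (image_width / 2) / math.tan(math.radians(horizontal_fov_deg / 2))
    return math.degrees(math.atan2(cx - image_width / 2, focal_px))
```

An object centred in the image has azimuth 0; objects to the right of centre get positive angles, to the left negative.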
  • the first image area is an area with a confidence level lower than or equal to the first threshold; or, the first image area corresponds to a spatial area, within the drivable area of the vehicle, whose distance from the vehicle is greater than or equal to the second threshold; or, the first image area is an area at a preset position.
  • This possible design provides several features of the first image area. In a specific implementation, the first image area can be determined based on any one of these features.
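A minimal sketch of selecting image areas by these features follows; the dictionary keys and threshold values are illustrative assumptions, not values from the application.

```python
def select_regions_to_detect(regions, first_threshold=0.5, second_threshold=30.0,
                             preset_positions=()):
    # A region qualifies as a "first image area" if any feature holds:
    # its confidence is at or below the first threshold (Feature 1),
    # its spatial area is at or beyond the second threshold distance
    # from the vehicle (Feature 2), or it sits at a preset position
    # (Feature 3).
    return [r for r in regions
            if r.get("confidence", 1.0) <= first_threshold
            or r.get("distance", 0.0) >= second_threshold
            or r.get("position") in preset_positions]
```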
  • the multi-frame image includes a first image and a second image; acquiring the first image area in each frame of the multi-frame image includes: acquiring the first image area in the second image.
  • Acquiring the first image area in the second image includes: obtaining the first image area in the second image (specifically, its position and size in the second image) according to the first image area in the first image (specifically, its position and size in the first image) and the first vehicle body information of the vehicle. That is to say, the embodiments of the present application support inferring the first image area in the second image from the first image area in the first image based on vehicle body information, which helps to improve the accuracy of the super-resolution operation.
  • the vehicle body information (including the first vehicle body information and the second vehicle body information below) can be vehicle information directly detected by sensors and other equipment installed in the vehicle, or information about the vehicle obtained by processing what these sensors and other equipment detect.
  • the first vehicle body information may include at least one of the first relative distance, the second relative distance, and the first vehicle steering angle.
  • the first relative distance is the relative distance between the vehicle and the space area corresponding to the first image area when the first image is taken.
  • the second relative distance is the relative distance between the vehicle and the space area corresponding to the first image area when the second image is taken.
  • the first vehicle steering angle is the angle through which the vehicle's heading changes in the time interval between shooting the first image and the second image.
  • the first image is the reference image in the multi-frame super-resolution.
  • the second image is any non-reference image in the multi-frame super-resolution.
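One way to picture this inference is a hypothetical pinhole-model sketch (not the claimed method): under perspective projection an object's apparent size scales inversely with its distance, so the first and second relative distances give the scale of the region in the second image.

```python
def infer_second_roi(roi1, d1, d2):
    # roi1: (cx, cy, w, h) — centre and size of the first image area in the
    # first (reference) image, in pixels. d1, d2: relative distances (metres)
    # between the vehicle and the spatial area when the first and second
    # images were taken. The centre shift is ignored here for simplicity;
    # in practice it would come from the vehicle steering angle.
    cx, cy, w, h = roi1
    s = d1 / d2  # apparent size scales inversely with distance
    return (cx, cy, w * s, h * s)
```

For example, halving the distance to the spatial area doubles the region's apparent width and height.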
  • performing a super-resolution operation on the plurality of first image regions includes: performing the super-resolution operation on the plurality of first image regions after scene alignment.
  • the multiple frames of images include a third image and a fourth image; before the super-resolution operation is performed on the plurality of first image regions after scene alignment, the method further includes: performing scene alignment of the plurality of first image areas according to the second vehicle body information of the vehicle. That is to say, the embodiments of the present application support realizing scene alignment based on vehicle body information, which helps to improve the accuracy of the super-resolution operation.
  • the second vehicle body information includes at least one of the first relative angle, the second relative angle, and the second vehicle steering angle.
  • the first relative angle is the relative angle between the vehicle and the space area corresponding to the first image area when the third image is taken.
  • the second relative angle is the relative angle between the vehicle and the space area corresponding to the first image area when the fourth image is taken.
  • the second vehicle steering angle is the angle through which the vehicle's heading changes in the time interval between shooting the third image and the fourth image.
  • the third image and the first image or the second image may be the same or different, the fourth image and the first image or the second image may be the same or different, and the third image and the fourth image are different.
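Scene alignment across such image pairs can be pictured with a minimal sketch: pure nearest-neighbour rescaling of every region to the reference region's size. A full alignment would also rotate by an angle derived from the vehicle body information; that step is omitted here.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Nearest-neighbour resize using pure numpy index arithmetic.
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def align_to_reference(regions):
    # Minimal scene-alignment sketch: bring every first image area to the
    # size of the reference (first) region before fusion.
    ref_h, ref_w = regions[0].shape[:2]
    return [resize_nearest(r, ref_h, ref_w) for r in regions]
```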
  • the multi-frame images are consecutive frames in time series, which makes them convenient to process.
  • the time interval between the shooting moment of the first frame image and the shooting moment of the last frame image in the multi-frame images is less than or equal to the third threshold. In this way, it helps to improve the accuracy of the super-resolution operation.
  • the method further includes: acquiring a second image area in each frame of the multi-frame image, where the multiple second image areas of the multi-frame image correspond to a second scene; then, performing a super-resolution operation on the plurality of second image regions.
  • this application supports including multiple image regions to be detected in one frame of image. The first image area and the second image area may or may not overlap. The first scene is different from the second scene.
  • an image processing device which can be used to execute any method provided in the first aspect or any possible design of the first aspect.
  • the device may be an in-vehicle device or a chip.
  • the device may be divided into functional modules according to the method provided in the first aspect or any of the possible designs of the first aspect.
  • each functional module may correspond to one function, or two or more functions may be integrated into one processing module.
  • the device may include a memory and a processor.
  • the memory is used to store computer programs.
  • the processor is used to invoke the computer program to execute the first aspect or the method provided by any possible design of the first aspect.
  • a computer-readable storage medium such as a non-transitory computer-readable storage medium.
  • a computer program (or instruction) is stored thereon, and when the computer program (or instruction) runs on a computer, it causes the computer to execute any method provided by the first aspect or any possible design of the first aspect.
  • a computer program product which, when running on a computer, enables any method provided in the first aspect or any possible design of the first aspect to be executed.
  • any image processing device, computer storage medium, computer program product or system provided above can be applied to the corresponding method provided above. Therefore, for the beneficial effects that can be achieved, refer to the beneficial effects of the corresponding method, which are not repeated here.
  • FIG. 1 is a schematic structural diagram of a computer system applicable to an embodiment of the present application
  • FIG. 2 is a schematic diagram of a result of target detection applicable to an embodiment of the present application
  • FIG. 3 is a schematic diagram of an image segmentation applicable to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an image processing method provided by an embodiment of the application.
  • FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the application.
  • FIG. 6 is a schematic diagram of vehicle body information in a vehicle body coordinate system provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of obtaining ROI2 based on ROI1 according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of a scene of ROI1 and ROI2 provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of a view angle change when a vehicle turns right according to an embodiment of the application.
  • FIG. 10 is a schematic diagram of obtaining an alignment angle of a scene according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of mapping ROI2 to a plane where ROI1 is located according to an embodiment of the application;
  • FIG. 12 is a schematic flowchart of another image processing method provided by an embodiment of the application.
  • FIG. 13 is a schematic diagram of a drivable area provided by an embodiment of the application.
  • FIG. 14 is a schematic diagram of a first image area provided by an embodiment of this application.
  • FIG. 15 is a schematic flowchart of another image processing method provided by an embodiment of the application.
  • FIG. 16 is a schematic diagram of another first image area provided by an embodiment of this application.
  • FIG. 17 is a schematic structural diagram of a vehicle-mounted device provided by an embodiment of the application.
  • As shown in FIG. 1, it is a schematic structural diagram of a computer system applicable to the embodiments of the present application.
  • the computer system may be located on the vehicle, and the computer system may include the vehicle-mounted equipment 101, and the equipment/device/network connected directly or indirectly with the vehicle-mounted equipment.
  • the vehicle-mounted device 101 includes a processor 103, and the processor 103 is coupled to a system bus 105.
  • the processor 103 may be one or more processors, where each processor may include one or more processor cores.
  • a display adapter (video adapter) 107 can drive the display 109, and the display 109 is coupled to the system bus 105.
  • the system bus 105 is coupled to an input/output (I/O) bus 113 through a bus bridge 111.
  • the I/O interface 115 is coupled to the I/O bus.
  • the I/O interface 115 communicates with a variety of I/O devices, such as an input device 117 (e.g., keyboard, mouse, touch screen), a media tray 121 (e.g., compact disc read-only memory (CD-ROM), multimedia interface), and so on.
  • the transceiver 123 (which can send and/or receive radio communication signals), the camera 155 (which can capture static scenes and dynamic digital video images), and an external universal serial bus (USB) interface 125.
  • the interface connected to the I/O interface 115 may be a USB interface.
  • the processor 103 may be any traditional processor, including a reduced instruction set computer (RISC) processor, a complex instruction set computer (CISC) processor, or a combination of the foregoing.
  • the processor may be a dedicated device such as an application specific integrated circuit (ASIC).
  • the processor 103 may be a neural network processor or a combination of a neural network processor and the foregoing traditional processors.
  • the processor 103 may be a central processing unit (CPU).
  • the camera 155 may be any camera used to collect images, for example, it may be a monocular camera or a binocular camera.
  • the number of cameras can be one or more, and each camera can be located in the front, rear, or side of the vehicle. For the convenience of description, the following specific examples all take the camera located directly in front of the vehicle as an example.
  • the camera 155 may be used to collect information about the surrounding environment (including surrounding roads, etc.) of the vehicle.
  • the camera 155 may include a software module, and the software module may be used to record the shooting time of the image taken by the camera. Alternatively, the module for recording the shooting time may also be a piece of hardware connected to the camera 155.
  • the position of the camera 155 relative to the vehicle may be fixed. In another example, the position of the camera 155 relative to the vehicle may be changed, for example, the camera 155 may perform rotation shooting.
  • the in-vehicle device 101 may be located far away from the autonomous driving vehicle, and may wirelessly communicate with the autonomous driving vehicle.
  • some of the processes described herein are executed on a processor provided in an autonomous vehicle, and others are executed by a remote processor, including taking the actions required to execute a single maneuver.
  • the in-vehicle device 101 may communicate with a software deployment server (deploying server) 149 through a network interface 129.
  • the network interface 129 is a hardware network interface, such as a network card.
  • the network 127 may be an external network, such as the Internet, or an internal network, such as an Ethernet or a virtual private network (virtual private network, VPN).
  • the network 127 may also be a wireless network, such as a WiFi network, a cellular network, and so on.
  • the hard disk drive interface is coupled to the system bus 105.
  • the hard disk drive interface is connected with the hard disk drive.
  • the system memory 135 is coupled to the system bus 105.
  • the data running in the system memory 135 may include the operating system 137 and application programs 143 of the in-vehicle device 101.
  • the operating system includes a shell 139 and a kernel (kernel) 141.
  • Shell 139 is an interface between the user and the kernel of the operating system.
  • the shell is the outermost layer of the operating system. The shell manages the interaction between the user and the operating system: it waits for the user's input, interprets the user's input to the operating system, and processes the output of the operating system.
  • the kernel 141 is composed of those parts of the operating system that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel usually runs processes, provides inter-process communication, and handles CPU time-slice management, interrupts, memory management, I/O management, and so on.
  • the application program 143 includes programs related to controlling the automatic driving of the vehicle. For example, a program for processing an image, acquired by the on-vehicle device, that contains image information of the road around the vehicle, such as a program implementing the image processing method provided by the embodiments of the present application. For another example, programs that manage the interaction between the autonomous vehicle and road obstacles, programs that control the route or speed of the autonomous vehicle, and programs that manage the interaction between the autonomous vehicle and other autonomous vehicles on the road.
  • the application program 143 also exists on the system of the software deployment server 149. In one embodiment, when the application program 143 needs to be executed, the in-vehicle device 101 may download the application program 143 from the software deployment server 149.
  • the sensor 153 is associated with the in-vehicle device 101.
  • the sensor 153 is used to detect the environment around the in-vehicle device 101.
  • the sensor 153 can detect animals, vehicles, obstacles, and crosswalks.
  • the sensor can also detect the environment around objects such as animals, vehicles, obstacles, and crosswalks. For example, the environment around an animal includes other animals that appear around it, weather conditions, the brightness of the surrounding environment, and so on.
  • the sensor may be a camera, an infrared sensor, a chemical detector, a microphone, etc.
  • the sensor 153 may include a speed sensor, used to measure the speed information (such as speed and acceleration) of the own vehicle (that is, the vehicle in which the computer system shown in FIG. 1 is located), and an angle sensor, used to measure the direction information of the vehicle and the relative angle between the vehicle and the objects around it.
  • FIG. 1 is only an example, which does not constitute a limitation on the computer system applicable to the embodiments of the present application.
  • one or more devices connected to the vehicle-mounted equipment shown in FIG. 1 may be integrated with the vehicle-mounted equipment, for example, a camera is integrated with the vehicle-mounted equipment.
  • Multi-frame super-resolution uses the image information of the non-reference images among the multiple frames to process the image information of the reference image (for example, compensating edge features, sharpening features, etc.) to obtain one frame of image.
  • the resolution of this image is higher than the resolution of the reference image.
  • the reference image may be any one of the multiple frames of images.
  • Non-reference images are all images in the multi-frame images except for the reference image.
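One hypothetical way such compensation could look is a pixelwise median over the reference and the pre-aligned non-reference frames; this is an illustrative stand-in, not the specific algorithm of this application.

```python
import numpy as np

def compensate_reference(reference, non_references):
    # Pixelwise median over the reference and (pre-aligned) non-reference
    # frames: transient noise present in only one frame is suppressed,
    # while structure shared across the frames is preserved.
    stack = np.stack([reference, *non_references]).astype(float)
    return np.median(stack, axis=0)
```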
  • the object may also be called a road object or obstacle or road obstacle.
  • the objects may be people, vehicles, traffic lights, traffic signs (such as speed limit signs, etc.), telephone poles, trash cans, foreign objects, etc. on the roads surrounding the vehicle.
  • foreign objects refer to objects that should not have appeared on the road, such as boxes and tires left on the road.
  • the target object is the object that the vehicle-mounted equipment needs to recognize.
  • the target object may be predefined or indicated by the user, which is not limited in the embodiment of the present application.
  • the target object may include: people, cars, traffic lights, and so on.
  • Both target detection and image segmentation are image processing techniques.
  • the task of target detection is to find the areas where the image information of all targets of interest in the image is located, and to determine the size of each area and its location in the image.
  • the target of interest can be pre-defined or user-defined.
  • the target of interest may refer to the target object.
  • Different regions obtained by target detection are the regions where the image information of different targets is located.
  • As shown in FIG. 2, it is a schematic diagram of a target detection result.
  • In FIG. 2, the target of interest is a vehicle, taken as an example for illustration.
  • the area defined by each rectangular box in FIG. 2 is the area where the image information of an interested target is located.
  • Image segmentation is a computer vision task that marks designated areas in an image according to the content of the image. In short, it determines which objects' image information a frame of image contains, and where in the image that image information is located. Specifically, the purpose of image segmentation is to determine which object each pixel in the image represents. Image segmentation can include semantic segmentation, instance segmentation, and so on. There is usually no overlap between image regions obtained by image segmentation. As shown in Figure 3, it is a schematic diagram of image segmentation; each connected region in Figure 3 represents a region obtained by image segmentation.
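The per-pixel nature of segmentation output can be shown in a few lines; the label ids and the tiny label map below are made-up examples.

```python
import numpy as np

# A semantic-segmentation output is a per-pixel label map; deciding "which
# object each pixel represents" then reduces to comparing labels.
ROAD, VEHICLE = 1, 2
label_map = np.array([[1, 1, 2],
                      [1, 2, 2]])

vehicle_mask = (label_map == VEHICLE)  # boolean region for one class
```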
  • the area in the actual scene (that is, the area that exists objectively) is distinguished from the image of that area (that is, the corresponding area in the image or picture).
  • the image area to be detected refers to an area in the image that contains the image information of the target object with a high probability (for example, the probability is greater than a preset threshold). This is the definition of the image area to be detected in the embodiments of this application.
  • Feature 1: the confidence level is lower than or equal to the first threshold.
  • Feature 2: it corresponds to a spatial area, within the travelable area, whose distance from the vehicle is greater than or equal to the second threshold.
  • Feature 3: its position in the image to which it belongs is a preset position.
  • In other words, the image area to be detected is an area with a confidence level lower than or equal to the first threshold; or it corresponds to a spatial area in the drivable area of the vehicle whose distance from the vehicle is greater than or equal to the second threshold; or it is the area at a preset position in the image.
  • the image area to be detected may contain image information of one or more objects.
  • the image area to be detected may contain the image information of one or more target objects, or may not contain the image information of the target object.
  • the image area to be detected can be part or all of the area in a frame of image.
  • a frame of image can contain one or more image regions to be detected.
  • the first image area and the second image area described in the embodiments of the present application are both image areas to be detected.
  • words such as “exemplary” or “for example” are used to present examples, illustrations, or descriptions. Any embodiment or design solution described as “exemplary” or “for example” in the embodiments of the present application should not be construed as being more preferable or advantageous than other embodiments or design solutions. To be precise, words such as “exemplary” or “for example” are used to present related concepts in a specific manner.
  • At least one refers to one or more.
  • Multiple means two or more.
  • "and/or” is merely an association relationship describing associated objects, indicating that there can be three types of relationships, for example, A and/or B, which can indicate that A exists alone, and both A and A B, there are three cases of B alone.
  • the character "/" in this text generally indicates that the associated objects before and after are in an "or" relationship.
  • As shown in FIG. 4, it is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • the method can include:
  • S101: the vehicle-mounted device acquires N frames of images, where the N frames of images include image information of the roads around the vehicle in which the vehicle-mounted device is located.
  • N is an integer greater than or equal to 2.
  • the N frames of images may be images taken by a camera (such as the aforementioned camera 155) installed in the vehicle.
  • the surrounding roads of the vehicle may include one or more of the front road, the rear road, and the side road of the vehicle.
  • Any one of the N frames of images may be an image taken when the vehicle is in a stationary state, or may be an image taken when the vehicle is in a moving state (such as going straight, changing lanes, or turning).
  • the N frames of images are consecutive N frames of images in time series.
  • the N frames of images are N frames of images continuously captured by the camera installed in the vehicle.
  • the on-vehicle device may determine the "N frames of images" in S101 from the images captured by the camera based on a sliding window of size N.
  • the embodiment of the present application does not limit the value of N.
  • in different executions of the method, the value of N may be the same or different.
  • the N frames of images in S101 obtained in different executions may or may not overlap. For example, suppose that the images taken by the camera are sorted by shooting time to obtain the following sequence: image 1, image 2, image 3, ..., image n.
  • in one execution, the N frames of images in S101 may be images 2 to 4;
  • in another execution, the N frames of images in S101 may be images 3 to 6.
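The overlapping-window behaviour described above can be sketched as:

```python
def sliding_windows(images, n, stride=1):
    # Overlapping windows of n consecutive frames; with stride 1, adjacent
    # windows share n-1 frames, matching the overlapping "N frames of
    # images" examples above.
    return [images[i:i + n] for i in range(0, len(images) - n + 1, stride)]
```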
  • S102: the vehicle-mounted device acquires the first image area in each frame of the N1 frames of images among the N frames of images.
  • the N1 first image regions of the N1 frames of images correspond to the first scene, where 2 ≤ N1 ≤ N and N1 is an integer. Different first image areas are image areas in different images.
  • the first image area is an image area to be detected.
  • the image area to be detected please refer to the above.
  • the first scenario is part of the road conditions or part of the driving field of view around the vehicle.
  • the first scene can be understood as the road conditions around the vehicle or the spatial area where one or more objects in the driving field of view are located.
  • the spatial area where an object is located refers to the area containing the object.
  • for example, the first scene may be the space area where a traffic light is located, and the first image area may be the image area where the image of the traffic light is located in each frame of the N1 frames of images.
  • for another example, the first scene may be the space area where a vehicle is located, and the first image area may be the image area where the image of that vehicle is located in each frame of the N1 frames of images.
  • the vehicle-mounted device may independently determine the first image area in the image.
  • the first image area in the image is determined according to at least one of the aforementioned features 1 to 3.
  • alternatively, the vehicle-mounted device may first determine the first image area in part of the images (for example, according to at least one of the above-mentioned Features 1 to 3), and then, based on inference from it, obtain the first image areas in the other images of the N frames.
  • the shooting time interval of two adjacent frames of images in the N frames of images is less than or equal to a third threshold.
  • the shooting time interval of two frames of images refers to the time period between the moments when the two frames are shot.
  • the embodiment of the present application does not limit the specific value of the third threshold or the method of obtaining it. In this way, when the vehicle speed is high, it helps to improve the image compensation effect of multi-frame super-resolution.
  • when the vehicle speed is high, if the time interval between two adjacent frames of images is relatively large, there may be no image area corresponding to the same scene in the two frames of images.
  • for example, one frame includes the image information of traffic light 1 and of vehicles 1 to 3 and no other objects, while the next frame includes the image information of traffic light 2 and of vehicles 4 to 5 and no other objects; then there is no image area corresponding to the same scene in the two frames.
  • this may make it impossible to use image information in the other frames to compensate the image information in the image area corresponding to a scene, resulting in a poor image compensation effect for multi-frame super-resolution. Therefore, this optional implementation helps to improve the image compensation effect of multi-frame super-resolution.
• optionally, the shooting time interval between the first frame image and the last frame image in the N frames of images is less than or equal to a threshold. In this way, when the vehicle speed is high, it helps to improve the image compensation effect of multi-frame super-division.
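As an illustrative sketch only (not part of the embodiment; the function name and threshold values are assumptions), the two interval constraints above can be checked when assembling the N frames from a timestamped buffer: each adjacent gap stays within the third threshold, and the overall first-to-last span stays within the other threshold.

```python
def select_burst(timestamps, max_gap, max_span):
    """Pick the longest trailing run of frames (newest frame always kept)
    whose adjacent shooting intervals stay within max_gap and whose
    first-to-last span stays within max_span; returns frame indices."""
    n = len(timestamps)
    if n == 0:
        return []
    chosen = [n - 1]
    for i in range(n - 2, -1, -1):
        gap = timestamps[i + 1] - timestamps[i]   # adjacent-frame interval
        span = timestamps[n - 1] - timestamps[i]  # first-to-last interval
        if gap > max_gap or span > max_span:
            break
        chosen.append(i)
    return sorted(chosen)
```

With a 0.1 s adjacent-gap limit and a 0.2 s span limit, a burst interrupted by a long gap keeps only the frames shot after the gap.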
• S103: The vehicle-mounted device performs scene alignment on the N1 first image areas.
• in other words, the step of aligning the first image areas in the N1 frames of images is optional.
• specifically, the background and foreground of different images taken by the camera installed on the vehicle may change; in this case, scene alignment can be performed before the super-division operation in S104 is executed.
• the purpose of scene alignment is to ensure that the image areas corresponding to the same scene in the multiple frames of images contain the same or similar foreground and background.
• for example, the image areas corresponding to the same scene in the multi-frame images can be scaled to a uniform size, and some or all of the scene content in the multi-frame images can be rotated by parameters such as an angle, so as to ensure that the image areas corresponding to the same scene in the multiple frames contain the same or similar foreground and background.
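The scaling part of scene alignment can be sketched as follows (a minimal illustration assuming NumPy arrays and nearest-neighbour resampling; the embodiment does not prescribe an interpolation method, and the rotation step is omitted here):

```python
import numpy as np

def resize_nn(roi, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image area to (out_h, out_w)."""
    in_h, in_w = roi.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return roi[rows[:, None], cols]

def scale_to_uniform_size(rois):
    """Scale every first image area to the size of the largest one so the
    areas corresponding to the same scene can be compared pixel by pixel."""
    target_h = max(r.shape[0] for r in rois)
    target_w = max(r.shape[1] for r in rois)
    return [resize_nn(r, target_h, target_w) for r in rois]
```

An area already at the target size passes through unchanged, since the nearest-neighbour index map is then the identity.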
• S104: The in-vehicle device performs a super-division operation on the N1 scene-aligned first image areas to obtain the first target image area.
  • This super-division operation can also be called a multi-frame super-division operation.
• specifically, the in-vehicle device processes the scene-aligned first image area in the reference image (for example, compensating edge features, sharpening features, etc.) according to the scene-aligned first image areas in the non-reference images, to obtain the first target image area; wherein the resolution of the first target image area is higher than the resolution of the first image area in the reference image.
  • the specific process can refer to the prior art.
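As a toy illustration of the data flow only (real multi-frame super-division uses sub-pixel registration and learned or iterative reconstruction, for which the prior art is referenced above), the aligned first image areas can be upsampled and fused so that the non-reference frames compensate the reference frame:

```python
import numpy as np

def multiframe_superdivision(aligned_rois, scale=2):
    """Upsample each scene-aligned area by pixel replication, then average
    across frames; the output has `scale` times the input resolution."""
    ups = [np.repeat(np.repeat(r.astype(float), scale, axis=0), scale, axis=1)
           for r in aligned_rois]
    return np.mean(ups, axis=0)
```

The averaging step is where information from the non-reference frames enters the result; with sub-pixel shifts between frames, this is what recovers detail beyond a single frame.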
  • the embodiment of the present application does not limit the application scenario of the first target image area, for example, it may be applied to a target object detection scenario.
  • the above method may further include the following step S105:
• the vehicle-mounted device determines whether a target object exists in the first target image area; if it exists, the relative position of the target object and the vehicle is determined; if it does not exist, the process ends.
  • step S105 can refer to the prior art.
  • the in-vehicle device performs target detection or image segmentation on the first target image area to determine whether the target object is contained in the first target image area, and if it exists, determines the relative position of the target object and the vehicle.
• in the above method, the super-division operation is specifically multi-frame super-division, which can combine the feature information of the image information in multiple frames of images to perform image compensation; compared with the prior-art solution that performs image compensation using single-frame super-division, this helps to improve the image compensation effect.
• in addition, multi-frame super-division is performed on the multiple image areas corresponding to the same scene in the multi-frame images, instead of on the multi-frame images themselves, which helps to reduce the complexity of the super-division operation and thereby speed up its processing.
• furthermore, if the processing result of the image processing method provided by the embodiment of the present application is applied to assist automatic driving path planning, it helps to improve the accuracy of automatic driving path planning.
  • the method may further include: the vehicle-mounted device acquires the second image area in each frame of the N2 frames of the N frame of images.
• the N2 second image areas of the N2 frames of images correspond to the second scene, where 2 ≤ N2 ≤ N and N2 is an integer. Different second image areas are image areas in different images.
• the method further includes steps S102' to S105', which are obtained by replacing "first image area" in S102 to S105 with "second image area", "N1" with "N2", and "first target image area" with "second target image area".
  • the second image area is an image area to be detected that is different from the first image area.
  • the first image area and the second image area may or may not overlap partially.
  • the embodiment of the present application does not limit the magnitude relationship between N1 and N2.
  • the N1 frame image and the N2 frame image may or may not include the same image.
• both the N1 frames of images and the N2 frames of images include the reference image (that is, the reference image used in the super-division operation).
• for example, the N1 frames of images can be the first to fifth frame images, and the first image area can be the area where the image information of traffic light 1 is located; the N2 frames of images can be the first to seventh frame images, and the second image area can be the area where the image information of vehicle 1 is located.
• FIG. 5 is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • the method can include:
• S202: The vehicle-mounted device performs preprocessing (such as image segmentation or target detection, etc.) on the first image in the N frames of images according to the candidate type set, to obtain at least one candidate image area included in the first image, the recognition result of each candidate image area, and the confidence of each recognition result.
  • the first image may be any one of the N frames of images.
• optionally, the first image is the reference image, where the reference image refers to the image used as reference in multi-frame super-division.
  • the candidate type set is a set composed of at least one candidate type.
  • the candidate type is the type of target object that the vehicle-mounted device needs to recognize (that is, the type of object that the vehicle-mounted device is interested in).
  • the type of object can be understood as what the object is. For example, if an object is a person, the type of the object is a person; if an object is a traffic light, the type of the object is a traffic light.
  • the candidate type set may include: people, cars, traffic lights, and so on.
  • the candidate image area is an area that contains image information of objects of candidate types.
  • the candidate type set is a set composed of people, cars, and traffic lights
  • the candidate image area includes: the image area where the image information of the person is located, the image area where the image information of the car is located, and the image where the image information of the traffic light is located. area.
• when the "preprocessing" is target detection, taking the first image as the image shown in FIG. 2 as an example, the areas defined by the rectangular frames in FIG. 2 can be used as candidate image areas.
• when the "preprocessing" is image segmentation, taking the first image as the image shown in FIG. 3 as an example, each connected region shown in FIG. 3 can be used as a candidate image area.
  • the area occupied by the at least one candidate image area described in S202 may be part or all of the area in the first image.
  • the recognition result of the candidate image area may include: which kind of target object the image information in the candidate image area is. Optionally, it may also include the relative position between the target object and the vehicle.
  • the confidence of the recognition result of a candidate image area can be referred to as the confidence of the candidate image area, which can be understood as the accuracy of the recognition result of the candidate image area.
• generally, if an object is closer to the vehicle, the image area occupied by the object in the image taken by the camera in the vehicle is larger and clearer; if the object is farther from the vehicle, the image area occupied by the object in the image taken by the camera is smaller and blurrier. Therefore, in the same image, the confidence of the recognition result of a first candidate image area (that is, a candidate image area containing the image information of an object close to the vehicle) is generally higher than that of a second candidate image area (that is, a candidate image area containing the image information of an object far from the vehicle).
• for example, when the recognition result of a candidate image area is that the object contained in it is a person: if the object is close to the vehicle, the probability that the object is actually a "person" is high, that is, the confidence of the recognition result is high; if the object is far from the vehicle, the probability that the object is actually a "person" is low, that is, the confidence of the recognition result is low; for example, the object may actually be a telephone pole or the like.
• S203: The vehicle-mounted device acquires at least one area to be detected in the first image.
  • the area to be detected is a candidate image area whose confidence is less than or equal to the first threshold.
  • the at least one area to be detected includes a first image area.
  • acquiring the first image area in the first image may include: acquiring the position of the first image area in the first image and the size of the first image area.
  • the at least one area to be detected includes the first image area as an example for description.
  • the at least one area to be detected may further include a second image area and the like.
• generally, the recognition results obtained by the vehicle-mounted device in S202 for candidate image areas whose confidence is greater than the first threshold have high accuracy. Therefore, in subsequent steps, the vehicle-mounted device need not repeatedly detect (or recognize) the image information included in those candidate image areas; that is, in subsequent steps, detection (or recognition) is performed only on the image information included in candidate image areas whose confidence is less than or equal to the first threshold, which helps to reduce detection complexity and thereby improve detection efficiency.
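The selection rule above can be sketched as follows (the dictionary keys are illustrative, not from the embodiment):

```python
def areas_to_detect(candidates, first_threshold):
    """Keep only candidate image areas whose recognition confidence is less
    than or equal to the first threshold; the rest are accepted as-is and
    need no repeated detection."""
    return [c for c in candidates if c["confidence"] <= first_threshold]
```

Only the low-confidence areas proceed to the multi-frame super-division and re-detection steps.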
• S204: The vehicle-mounted device acquires the speed v of the vehicle, the shooting time interval T between the first image and the second image, the relative angle θ1 between the vehicle and the space area corresponding to the first image area when the first image is taken, the relative angle θ2 between the vehicle and the space area corresponding to the first image area when the second image is taken, and the steering angle θ of the vehicle between the moment t1 at which the first image is taken and the moment t2 at which the second image is taken.
  • the speed of the vehicle can be variable or constant.
• v, T, θ1, θ2, and θ can all be measured by corresponding sensors in the vehicle, or the original information used to obtain some or all of these parameters can be measured by corresponding sensors in the vehicle.
  • the original information may be the heading of the vehicle at time t1 and the heading of the vehicle at time t2.
• S205: The in-vehicle device determines, according to v, T, θ1, and θ2, the relative distance R1 between the vehicle and the space area corresponding to the first image area when the first image is taken, and the relative distance R2 between the vehicle and the space area corresponding to the first image area when the second image is taken.
• for example, at t1, the vehicle body coordinate system is the X1-Y1 coordinate system formed by the X1 axis and the Y1 axis, the position of the vehicle is the origin of the X1-Y1 coordinate system, the Y1 axis is the forward direction of the vehicle, the X1 axis is the tangential right direction of the vehicle, and the image taken by the camera is the first image; at t2, the vehicle body coordinate system is the X2-Y2 coordinate system formed by the X2 axis and the Y2 axis, the position of the vehicle is the origin of the X2-Y2 coordinate system, the Y2 axis is the forward direction of the vehicle, the X2 axis is the tangential right direction of the vehicle, and the image taken by the camera is the second image.
• θ1 is the angle between the first image area in the first image and the Y1 axis, and θ2 is the angle between the first image area in the second image and the Y2 axis.
• the positions of θ, R1, and R2 in the corresponding vehicle body coordinate systems can be as shown in FIG. 6.
• the dashed-dotted areas in FIG. 6 are the assumed imaging areas of the front-view camera of the vehicle, where imaging area 1 is the imaging area at time t1 (i.e. the first image) and imaging area 2 is the imaging area at time t2 (i.e. the second image).
• it should be noted that in an actual scenario the imaging area will not be horizontal and vertical as shown in FIG. 6; for ease of understanding, it is assumed here that the imaging area of the camera is a rectangular area in front of the vehicle.
• R1 and R2 above are determined based on the vehicle body coordinate systems in FIG. 6. In actual implementation, R1 and R2 can also be determined based on other coordinate systems (such as the world coordinate system, the camera coordinate system, etc.), which is not limited in the embodiment of this application.
• S206: The vehicle-mounted device determines the first image area in the second image according to the first image area in the first image, R1, R2, and θ.
  • the first image and the second image may be any two frames of the N frames of images described in S201.
• optionally, the first image is the reference image in the N frames of images (such as the first frame image), and the second image is any non-reference image in the N frames of images. That is to say, the embodiment of the present application supports the technical solution of "determining the first image area in any other frame of non-reference image according to the first image area in the reference image".
  • the first image and the second image are images that are adjacent in time series. That is to say, the embodiment of the present application supports the technical solution of "determining the first image area in a frame image according to the first image area in the previous frame image of the frame image".
• ROI1 is the first image area in the first image; in this example, it is a rectangle whose center point is marked in the figure.
• ROI2 is the first image area in the second image, and S206 is specifically: determining the position and size of ROI2 in the second image according to the position and size of ROI1 in the first image, R1, R2, and θ. The following describes how the position and size of ROI2 in the second image are determined.
• Position: since θ is the steering angle of the vehicle from time t1 to time t2, it can be determined that the rotation angle of ROI2 relative to ROI1 is also θ. It is assumed that the center points of ROI1 and ROI2 do not change between t1 and t2, that is, in the world coordinate system, the coordinates of the center point of ROI1 and the center point of ROI2 are the same. Therefore, the coordinates of the center point of ROI1 can be converted from the X1-Y1 coordinate system to the world coordinate system, and then to the X2-Y2 coordinate system, so that the coordinates of the center point of ROI2 in the X2-Y2 coordinate system are obtained. Thus, the position of ROI2 in the second image is obtained.
• Size: after ROI1 is rotated by θ, ROI1' is obtained, as shown in FIG. 7. In this example, compared with t1, the vehicle is closer to the spatial area corresponding to the first image area when it travels to t2; therefore, according to the imaging principle, a larger frame is theoretically required at t2 to frame the area containing the same amount of information as at t1. On this basis, the length of ROI1' can be scaled (enlarged in this example) according to the ratio of R1 to R2 (optionally, according to a certain weight) to obtain the length of ROI2; similarly, the width of ROI2 can be obtained. Thus, the size of ROI2 is obtained.
• for example, the size of the first image area in the second image must be larger than the size of the first image area in the first image (such as the area shown by the rectangular box in figure (b)), so that the first image area in the second image contains all the image information of the person riding a motorcycle.
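The position-and-size reasoning of S206 can be sketched as follows, under simplifying assumptions not stated in the embodiment: the translation between the two body coordinate origins is ignored, the rotation sign convention is illustrative, and both the length and the width are scaled by R1/R2 with weight 1:

```python
import math

def propagate_roi(center1, size1, r1, r2, theta_deg):
    """Rotate ROI1's centre by the steering angle theta to express it in the
    X2-Y2 system, and scale ROI1's size by R1/R2: the closer the vehicle
    gets to the scene (R2 < R1), the larger the frame needed at t2."""
    t = math.radians(theta_deg)
    x, y = center1
    cx = x * math.cos(t) - y * math.sin(t)  # centre in the rotated system
    cy = x * math.sin(t) + y * math.cos(t)
    s = r1 / r2                             # enlarge when the vehicle closes in
    w, h = size1
    return (cx, cy), (w * s, h * s)
```

With no steering (θ = 0) and the distance halved, the centre is unchanged and both dimensions double, matching the "larger frame at t2" reasoning above.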
  • the above S204 to S206 are described by taking the first image area in the second image inferred from the first image area in the first image as an example. Accordingly, for different second images, by executing S204-S206 one or more times, the first image area included in part or all of the N frames of images can be obtained.
• the above parameters v, T, θ1, θ2, θ, R1, and R2 are collectively referred to as vehicle body information.
• the vehicle body information can be information of the vehicle detected directly by the sensors and other equipment installed in the vehicle (such as v, T, θ1, θ2, and θ above), or information of the vehicle obtained by processing the information detected by these sensors and other equipment (such as R1 and R2).
• the above S204 to S206 are only an example of "inferring the first image area in the second image based on vehicle body information" provided by the embodiment of this application, and do not constitute a limitation on the specific implementation of "inferring the first image area in the second image based on vehicle body information" in the embodiment of this application. In specific implementation, the first image area in the second image may be inferred based on more or less vehicle body information than that listed above.
• S207: The vehicle-mounted device performs scene alignment on the N1 first image areas.
• the alignment process involves many scenarios, and different scenarios can use different solutions for scene alignment.
• the following takes the case where the camera is a front-view camera installed directly in front of the vehicle as an example to describe the alignment process of the first image areas in two frames of images.
• Example 1: the scene where the vehicle moves straight ahead.
  • the front-view camera collects the first image and the second image at different moments.
• the first image and the second image include the first image areas ROI1 and ROI2, respectively. If the center points of ROI1 and ROI2 are both exactly in front of the vehicle, then it is already ensured when acquiring the first image areas that ROI2 contains the same scene information (including foreground and background) as ROI1, so the scenes are aligned; specifically, ROI1 can be scaled (in this case, enlarged) to the size of ROI2.
• Example 2: a scene where the vehicle turns to the front right.
  • the front-view camera collects the first image and the second image at different moments (ie t1 and t2).
  • the first image and the second image include the first image regions ROI1 and ROI2, respectively .
• since ROI1 and ROI2 are at the front right of the vehicle, when the vehicle turns to the right (that is, the origin of the X2-Y2 coordinate system is at the upper right of the X1-Y1 coordinate system), the field of view will inevitably include more information on the left side of the ROI, as shown in FIG. 9.
• for ease of understanding, the image information of the object included in the first image area is described as a rectangle by way of example.
• at t1, the vehicle body coordinate system is the X1-Y1 coordinate system; at t2, the vehicle body coordinate system is the X2-Y2 coordinate system; in each case, the position of the camera is the origin of the coordinate system.
• in this case, the amount of scene information contained in the first image areas of the first image and the second image will differ, so in addition to aligning the sizes of ROI1 and ROI2, the direction of the scene in the field of view must also be aligned; that is to say, the scene alignment angle between the first image area in the first image and the first image area in the second image needs to be obtained.
• for example, the scene alignment angle may be the angle shown in FIG. 10.
• then, the scene in ROI2 can be mapped according to the scene alignment angle.
• FIG. 11 shows that ROI2 is mapped, at the scene alignment angle and in units of pixels, onto the plane where ROI1 is located to obtain ROI2'; that is, the plane where ROI2' is located is the plane where ROI1 is located.
• finally, ROI1 is scaled (in this example, enlarged) to match ROI2'.
• the example in S207 takes the alignment of the first image areas in the first image and the second image as an example for description.
• here, the first image and the second image are only used to distinguish any two images in the N1 frames of images.
• the first image and the second image in the alignment process may be the same as, or different from, the first image and the second image in the process of determining the first image area.
• it should be noted that Example 2 is only an example of "realizing scene alignment based on vehicle body information" provided by the embodiment of this application, and does not constitute a limitation on the way scene alignment is realized based on vehicle body information in the embodiment of this application. In specific implementation, scene alignment may be achieved based on more or less vehicle body information than that listed in Example 2.
• S208 to S209: reference may be made to the above S104 to S105; of course, the embodiment of the present application is not limited thereto.
• in this embodiment, a candidate image area whose confidence is lower than or equal to the first threshold in one frame of image is selected as the first image area, and the first image areas in the multiple frames of images are super-divided.
• in addition, based on the first image area in one frame of image and the vehicle body information, the first image area in another frame of image is determined, so that when performing super-division operations, it helps the vehicle-mounted device obtain more spatial information, thereby improving the accuracy of the super-division operation.
• FIG. 12 is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • the method can include:
• S301: The vehicle-mounted device performs preprocessing (such as image segmentation, etc.) on the first image in the N frames of images to obtain image information of the drivable area in the first image.
• the drivable area is the area between the vehicle and the first object encountered in each direction in the field of view.
  • the area enclosed by the black line in FIG. 13 represents a schematic diagram of a drivable area.
• S302: The in-vehicle device uses the image information corresponding to the space area in the drivable area whose distance from the vehicle is greater than or equal to the second threshold as the first image area in the first image.
• for example, the in-vehicle device may first determine the position of the vehicle in which it is located in the image captured by the camera, and then use the image area whose distance from that position is greater than or equal to a threshold as the first image area in the image.
• the ratio between the second threshold and this threshold is equal to the ratio between a spatial distance and its image distance (that is, the distance in the image to which the spatial distance is mapped).
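That proportional relation can be sketched as follows (function and parameter names are illustrative, not from the embodiment):

```python
def map_threshold_to_pixels(second_threshold_m, spatial_dist_m, image_dist_px):
    """second_threshold : pixel_threshold == spatial distance : image distance,
    so the pixel threshold follows by proportion."""
    return second_threshold_m * image_dist_px / spatial_dist_m

def far_pixel_indices(pixel_dists_px, pixel_threshold):
    """Indices of drivable-area pixels at or beyond the mapped threshold."""
    return [i for i, d in enumerate(pixel_dists_px) if d >= pixel_threshold]
```

For instance, if 100 m of road maps to 400 px in the image, a 50 m second threshold maps to 200 px, and only pixels at least that far from the vehicle's image position enter the first image area.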
• considering that the image captured by the camera generally does not contain the image information of the vehicle itself (that is, the vehicle in which the on-board equipment is located in this embodiment), the position of the vehicle in the image can be determined based on the location information of the camera in the vehicle, and optionally based on the motion state of the vehicle (such as turning or going straight).
  • the corresponding position of the vehicle in the image may be the middle of the lower boundary of the picture, as shown in FIG. 14.
  • Figure 14 also illustrates the first image area in this case.
  • the corresponding position of the vehicle in the image captured by the camera may be the lower right corner of the picture.
  • the method for determining the corresponding position of the vehicle in the image captured by the camera is not limited to this, for example, reference may be made to the prior art.
• S304 to S309: reference may be made to the above S204 to S209; of course, the embodiment of the present application is not limited thereto.
  • the vehicle-mounted device can obtain the first image area included in each frame of the N1 frames of the N frames of images.
• in this embodiment, an image area corresponding to a space area in the drivable area of the vehicle whose distance from the vehicle is greater than or equal to the second threshold is taken as the first image area, and the first image areas in the multiple frames of images are super-divided.
• since the drivable area is defined as the area between the vehicle and the first object in each direction in the field of view, there is no guarantee that the determined drivable area contains no objects (or targets); this embodiment is proposed on that basis. In this way, it helps to improve the accuracy of detecting target objects on the basis of the prior art.
• in addition, based on the first image area in one frame of image and the vehicle body information, the first image area in another frame of image is determined, so that when performing super-division operations, it helps the vehicle-mounted device obtain more spatial information, thereby improving the accuracy of the super-division operation.
• FIG. 15 is a schematic flowchart of an image processing method provided by an embodiment of this application.
  • the method can include:
• S401: The in-vehicle device uses the area at a preset position in the first image in the N frames of images as the first image area.
• for different target objects, the position of the first image area in the first image may be the same or different.
• for example, if the target object is a traffic light, since the image information of a traffic light is usually in the upper part of a frame of image, the area at the upper preset position in the frame of image (such as the upper two-fifths of the image) can be taken as the first image area.
  • a schematic diagram of the first image area is shown in FIG. 16.
• for another example, if the target object is a speed limit sign on an expressway, since the image information of such a sign is usually on the right side of an image, the area at the right preset position in the image (such as the area on the right side) can be taken as the first image area.
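A sketch of the preset-position rule, assuming NumPy images; the upper two-fifths follows the traffic-light example above, while the right-side fraction for speed-limit signs is an assumed value (one third):

```python
import numpy as np

def preset_first_image_area(image, target):
    """Crop the preset region of interest for the given target type."""
    h, w = image.shape[:2]
    if target == "traffic_light":
        return image[: h * 2 // 5, :]    # upper two-fifths of the frame
    if target == "speed_limit_sign":
        return image[:, w * 2 // 3 :]    # assumed right third of the frame
    raise ValueError("unknown target type: %s" % target)
```

Because the crop is a fixed slice, this variant needs no per-frame detection to locate the first image area, which is why the text below calls it simple and convenient.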
• S403 to S408: reference may be made to the above S204 to S209; of course, the embodiment of the present application is not limited thereto.
  • the vehicle-mounted device can obtain the first image area included in each frame of the N1 frames of the N frames of images.
  • an area at a preset position in one frame of image is used as the first image area, and super-division processing is performed on the first image area in multiple frames of images.
  • This method is relatively simple and convenient to implement.
• in addition, based on the first image area in one frame of image and the vehicle body information, the first image area in another frame of image is determined, so that when performing super-division operations, it helps the vehicle-mounted device obtain more spatial information, thereby improving the accuracy of the super-division operation.
  • the embodiments of the present application may divide the in-vehicle equipment into functional modules according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
• FIG. 17 is a schematic structural diagram of a vehicle-mounted device 170 provided by an embodiment of this application.
• the vehicle-mounted device 170 may be used to execute the steps performed by the vehicle-mounted device in the method shown in FIG. 4, FIG. 5, FIG. 12, or FIG. 15.
  • the in-vehicle device 170 may include: a first acquisition module 1701, a second acquisition module 1702, and a super-division module 1703.
  • the first acquisition module 1701 is configured to acquire a multi-frame image
  • the multi-frame image includes image information of the surrounding road of the vehicle where the on-board device is located.
  • the second acquisition module 1702 is configured to acquire the first image area in each frame of the multi-frame image; wherein, the multiple first image areas of the multi-frame image correspond to the first scene.
  • the super-division module 1703 is configured to perform super-division operations on the multiple first image regions.
  • the first acquisition module 1701 may be used to perform S101
  • the second acquisition module 1702 may be used to perform S102
  • the super-division module 1703 may be used to perform S104.
• optionally, the in-vehicle device 170 further includes: a determining module 1704, configured to determine that the image information of the target object exists in the image area obtained by the super-division operation.
  • the determining module 1704 may be used to perform S105.
• optionally, the first image area is an area whose confidence is lower than or equal to a first threshold; or, the first image area corresponds to a space area in the drivable area of the vehicle whose distance from the vehicle is greater than or equal to a second threshold; or, the first image area is an area at a preset position in the image.
  • the multi-frame image includes a first image and a second image.
  • the second acquiring module 1702 is specifically configured to acquire the first image area in the second image.
• obtaining the first image area in the second image includes: obtaining the first image area in the second image according to the first image area in the first image and the first vehicle body information of the vehicle.
• the first vehicle body information may include at least one of a first relative distance, a second relative distance, and a first vehicle steering angle; wherein the first relative distance is the relative distance between the vehicle and the space area corresponding to the first image area when the first image is taken, the second relative distance is the relative distance between the vehicle and the space area corresponding to the first image area when the second image is taken, and the first vehicle steering angle is the steering angle of the vehicle in the time interval between shooting the first image and the second image.
  • the second acquisition module 1702 may be used to perform S204 and S205.
  • the first vehicle body information may also include vehicle height parameters, such as the height of the camera from the ground and/or the vehicle body height.
  • the super-division module 1703 is specifically configured to perform super-division operations on multiple first image regions after scene alignment.
  • the super-division module 1703 may be used to execute S104 in FIG. 4, S208 in FIG. 5, S308 in FIG. 12, or S407 in FIG. 15.
  • the multi-frame image includes a third image and a fourth image.
  • the in-vehicle device 170 further includes an alignment module 1705, configured to perform scene alignment of the plurality of first image regions according to the second body information of the vehicle.
  • the second vehicle body information may include at least one of a first relative angle, a second relative angle, and a second vehicle steering angle.
  • the first relative angle is the relative angle between the vehicle and the space area corresponding to the first image area when the third image is taken
  • the second relative angle is the relative angle between the vehicle and the space area corresponding to the first image area when the fourth image is taken.
  • the second vehicle steering angle is the angle between the direction of the vehicle in the time interval of shooting the third image and the fourth image.
  • the second vehicle body information may also include vehicle height parameters, such as the height of the camera from the ground and/or the vehicle body height.
  • the multi-frame images are consecutive multi-frame images in time series.
  • the time interval between the shooting moment of the first frame image and the shooting moment of the last frame image in the multi-frame image is less than or equal to a third threshold.
  • the second obtaining module 1702 is further configured to: obtain a second image area in each frame of the multi-frame image; wherein, the multiple second image areas of the multi-frame image correspond to the second scene.
  • the super-division module 1703 is further configured to perform super-division operations on the multiple second image regions.
  • the above-mentioned first acquisition module 1701 may be implemented through the I/O interface 115 in FIG. 1.
  • At least one of the above-mentioned second acquisition module 1702, super-division module 1703, determination module 1704, and alignment module 1705 may be implemented by the processor 103 in FIG. 1 invoking the application program 143.
  • the program can be stored in a computer-readable storage medium.
  • the aforementioned storage medium may be a read-only memory, a random access memory, and the like.
  • the above-mentioned processing unit or processor may be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • the embodiments of the present application also provide a computer program product containing instructions, which when the instructions are run on a computer, cause the computer to execute any one of the methods in the foregoing embodiments.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • Computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the foregoing devices for storing computer instructions or computer programs provided in the embodiments of the present application are non-transitory.
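The module layout of in-vehicle device 170 described in the bullets above can be sketched as a small pipeline. This is an illustrative skeleton only, not an API from the patent; the method names mirror the first acquisition module (1701), second acquisition module (1702), super-division module (1703), and alignment module (1705), and the callables passed in are assumed placeholders.

```python
class InVehicleDevice:
    """Illustrative sketch of device 170: acquire frames, extract the
    per-frame first image areas, align them, then super-divide them."""

    def __init__(self, acquire_frames, extract_regions, align_regions, superdivide):
        self.acquire_frames = acquire_frames    # module 1701, e.g. via I/O interface 115
        self.extract_regions = extract_regions  # module 1702 (S204, S205)
        self.align_regions = align_regions      # module 1705, scene alignment
        self.superdivide = superdivide          # module 1703 (S104, S208, S308, or S407)

    def run(self):
        frames = self.acquire_frames()
        regions = self.extract_regions(frames)
        return self.superdivide(self.align_regions(regions))
```

For example, wiring the skeleton with trivial lambdas shows the data flow from frames to regions to a fused result.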

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an image processing method and apparatus, relating to the technical field of piloted driving or assisted driving and contributing to enhancing an image compensation effect. The method is applied to a vehicle-mounted device. The method may comprise: acquiring a plurality of frames of images, wherein the plurality of frames of images comprise image information of roads surrounding a vehicle on which the vehicle-mounted device is mounted (S101); acquiring a first image region in each frame of image of the plurality of frames of images, wherein a plurality of first image regions of the plurality of frames of images correspond to a first scene (S102); and performing a super-resolution operation on the plurality of first image regions (S104). The method can be used for target detection and tracking in assisted driving and piloted driving.

Description

Image processing method and device
This application claims priority to Chinese Patent Application No. 201910633070.3, entitled "Image Processing Method and Apparatus", filed with the State Intellectual Property Office on July 12, 2019, which is incorporated herein by reference in its entirety.
Technical field

This application relates to the field of automatic driving or assisted driving technology, and in particular to image processing methods and devices.
Background

Image super-resolution technology has important application value in surveillance equipment, satellite imagery, and autonomous driving. It includes single-frame super-division and multi-frame super-division. Single-frame super-division recovers one high-resolution image from one low-resolution image; multi-frame super-division recovers one high-resolution image from multiple low-resolution images.

In the field of automatic driving, objects on the road usually need to be detected, for example, whether an object is a person or a traffic light, and the distance between the object and the vehicle, to assist automatic driving path planning. At present, single-frame super-division is usually performed on captured images of the road around the vehicle to obtain a high-resolution image, and objects on the road are detected based on that image. Because single-frame super-division compensates only the edge features, sharpening features, and so on of objects within a single frame, and a single frame carries relatively little feature information, the image compensation effect is poor.
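The contrast drawn in the background can be sketched minimally. This toy code is illustrative only: it assumes the low-resolution frames are already registered, uses nearest-neighbour upsampling as a stand-in for real reconstruction, and is not the patent's algorithm.

```python
def single_frame_superdivision(lr, scale=2):
    """Nearest-neighbour upsampling of one frame (a list of rows): a stand-in
    for single-frame super-division, which sees only that one frame's features."""
    hr = []
    for row in lr:
        wide = [v for v in row for _ in range(scale)]
        hr.extend([wide[:] for _ in range(scale)])
    return hr

def multi_frame_superdivision(frames, scale=2):
    """Toy multi-frame super-division: average several already-registered
    low-resolution frames, pooling their feature information and suppressing
    noise, then upsample the fused frame."""
    n = len(frames)
    fused = [[sum(f[i][j] for f in frames) / n for j in range(len(frames[0][0]))]
             for i in range(len(frames[0]))]
    return single_frame_superdivision(fused, scale)
```

The design point is that the multi-frame variant has strictly more information to work with, which is exactly the advantage the application builds on.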
Summary of the invention

The embodiments of the present application provide an image processing method and device, which help improve the image compensation effect.

According to a first aspect, an image processing method applied to an in-vehicle device is provided. The method includes: first, acquiring multiple frames of images, where the multiple frames include image information of the roads around the vehicle on which the in-vehicle device is installed; then, acquiring a first image area in each frame of the multiple frames, where the multiple first image areas (each frame contributing one first image area) correspond to a first scene; and then performing a super-division operation on the multiple first image areas. The super-division operation is specifically multi-frame super-division, so the feature information of the image information in the multiple frames can be combined for image compensation, which helps improve the image compensation effect compared with the prior-art solution of single-frame super-division. In addition, because the multi-frame super-division is performed on the multiple image areas corresponding to the same scene rather than on the whole frames, the complexity of the super-division operation is reduced, which speeds up super-division processing. Further, when the processing result of this method is applied to an assisted automatic driving path planning scenario, it helps improve the accuracy of automatic driving path planning.
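The first-aspect steps (acquire multiple frames in S101, extract the per-frame first image area corresponding to one scene in S102, super-divide only those areas in S104) can be sketched as follows. The `(x, y, w, h)` box format and the `superdivide` callable are illustrative assumptions, not defined by the patent.

```python
def crop(frame, box):
    """Cut the first image area out of one frame; box = (x, y, w, h),
    frame is a list of pixel rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]

def process_frames(frames, boxes, superdivide):
    """S101-S104: given per-frame boxes that all correspond to the first
    scene, super-divide only those regions rather than the whole frames,
    which is what keeps the operation cheap."""
    regions = [crop(f, b) for f, b in zip(frames, boxes)]
    return superdivide(regions)
```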
As an example, the first scene may be understood as the road conditions around the vehicle or the spatial region where one or more objects in the driving field of view are located. The first image area may be part or all of one frame of image, and it may or may not contain image information of a target object. The target object may be predefined; of course, the embodiments of the present application are not limited thereto.
In a possible design, the method further includes: determining that image information of a target object exists in the image area obtained by the super-division operation.

In a possible design, the method further includes: detecting the relative position between the vehicle and the target object. The relative position includes a relative distance and a relative angle, and the relative angle includes an azimuth angle and/or a pitch angle. In one example, the relative position may be used to assist automatic driving path planning.
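As a hedged illustration of the relative-angle part, under a simple pinhole camera model the azimuth and pitch of a detected object can be read off from its pixel position. The intrinsics `(cx, cy, fx, fy)` are assumed parameters for illustration, not values given by the patent.

```python
import math

def relative_angles(u, v, cx, cy, fx, fy):
    """Azimuth and pitch of an image point (u, v) for a pinhole camera with
    principal point (cx, cy) and focal lengths (fx, fy) in pixels."""
    azimuth = math.atan2(u - cx, fx)  # positive to the right of the optical axis
    pitch = math.atan2(cy - v, fy)    # positive above the optical axis
    return azimuth, pitch
```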
In a possible design, for each frame of the multiple frames of images: the first image area is an area whose confidence is lower than or equal to a first threshold; or the first image area corresponds to a spatial region, in the drivable area of the vehicle, whose distance from the vehicle is greater than or equal to a second threshold; or the first image area is an area at a preset position. This possible design provides several features that the first image area may have; in a specific implementation, the first image area may be determined based on one of these features.
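The three selection rules in this design can be sketched as a predicate. The dict keys, threshold names, and `preset_box` parameter are illustrative assumptions.

```python
def is_first_image_area(region, first_threshold, second_threshold, preset_box=None):
    """A candidate region qualifies if its detection confidence is at or below
    the first threshold, if the space it maps to lies at or beyond the second
    threshold distance within the drivable area, or if it sits at a preset
    position."""
    if region["confidence"] <= first_threshold:
        return True
    if region["distance"] >= second_threshold:
        return True
    return preset_box is not None and region["box"] == preset_box
```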
In a possible design, the multiple frames of images include a first image and a second image, and acquiring the first image area in each frame includes: acquiring the first image area in the second image according to the first image area in the first image (specifically, its position and size in the first image) and first vehicle body information of the vehicle (specifically, obtaining the position and size of the first image area in the second image). That is, the embodiments of the present application support a technical solution of inferring the first image area in the second image from the first image area in the first image based on vehicle body information, which helps improve the accuracy of the super-division operation.

The vehicle body information (including the first vehicle body information and the second vehicle body information below) may be information about the vehicle directly detected by devices such as sensors installed in the vehicle, or information about the vehicle obtained by processing what those devices detect.
In a possible design, the first vehicle body information may include at least one of a first relative distance, a second relative distance, and a first vehicle steering angle. The first relative distance is the relative distance between the vehicle and the spatial region corresponding to the first image area when the first image is taken. The second relative distance is the relative distance between the vehicle and the spatial region corresponding to the first image area when the second image is taken. The first vehicle steering angle is the angle between the directions of the vehicle within the time interval of shooting the first image and the second image.

In a possible design, the first image is the reference image in the multi-frame super-division, and the second image is any non-reference image in the multi-frame super-division.
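A hedged sketch of how the distance information could drive the inference: under a pinhole model an area at distance d appears with size proportional to 1/d, so a box found in the first image can be rescaled about the image centre by d1/d2 to predict its position and size in the second image. The patent states only that position and size are derived from the body information; this particular scaling rule is an illustrative assumption.

```python
def propagate_roi(roi1, d1, d2, cx, cy):
    """Predict the first image area in the second image from (x, y, w, h)
    in the first image, the first relative distance d1, the second relative
    distance d2, and the image centre (cx, cy)."""
    x, y, w, h = roi1
    s = d1 / d2                 # the region looks larger as the vehicle closes in
    return (cx + (x - cx) * s,  # shift the corner away from the image centre
            cy + (y - cy) * s,
            w * s, h * s)
```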
In a possible design, performing the super-division operation on the multiple first image areas includes: performing the super-division operation on the multiple first image areas after scene alignment. This technical solution takes into account that both the foreground and the background may change while the vehicle is running, and it helps improve the accuracy of the super-division operation.

In a possible design, the multiple frames of images include a third image and a fourth image; before the super-division operation is performed on the aligned first image areas, the method further includes: performing scene alignment of the multiple first image areas according to second vehicle body information of the vehicle. That is, the embodiments of the present application support a technical solution of realizing scene alignment based on vehicle body information, which helps improve the accuracy of the super-division operation.

In a possible design, the second vehicle body information includes at least one of a first relative angle, a second relative angle, and a second vehicle steering angle. The first relative angle is the relative angle between the vehicle and the spatial region corresponding to the first image area when the third image is taken. The second relative angle is the relative angle between the vehicle and the spatial region corresponding to the first image area when the fourth image is taken. The second vehicle steering angle is the angle between the directions of the vehicle within the time interval of shooting the third image and the fourth image.

The third image may be the same as or different from the first image or the second image, the fourth image may be the same as or different from the first image or the second image, and the third image and the fourth image are different.
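Scene alignment driven by the second vehicle body information can be sketched as undoing an in-plane rotation between the two viewpoints. Treating the net angle as the steering angle over the interval is an illustrative assumption, since the patent leaves the exact composition of the three angles open.

```python
import math

def rotate_point(x, y, theta):
    """Rotate an image point about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return x * c - y * s, x * s + y * c

def align_region(points, steering_angle):
    """Map each point of the fourth image's first image area back toward the
    third image's viewpoint by rotating through the negative steering angle."""
    return [rotate_point(x, y, -steering_angle) for x, y in points]
```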
In a possible design, the multiple frames of images are consecutive frames in time sequence, which is convenient for processing.

In a possible design, the time interval between the shooting moment of the first frame and the shooting moment of the last frame of the multiple frames is less than or equal to a third threshold, which helps improve the accuracy of the super-division operation.

In a possible design, the method further includes: acquiring a second image area in each frame of the multiple frames, where the multiple second image areas correspond to a second scene; and then performing a super-division operation on the multiple second image areas. In other words, this application supports a solution in which one frame includes multiple image areas to be detected. The first image area and the second image area may or may not overlap, and the first scene is different from the second scene.
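The two windowing rules above (consecutive frames, first-to-last shooting interval within a third threshold) can be sketched with a sliding window over shooting timestamps. The function name is illustrative.

```python
def longest_valid_window(timestamps, third_threshold):
    """Return (first, last) indices of the longest run of consecutive frames
    whose first-to-last shooting interval is <= third_threshold; timestamps
    must be sorted in shooting order."""
    best = (0, 0)
    start = 0
    for end, t in enumerate(timestamps):
        while t - timestamps[start] > third_threshold:
            start += 1       # drop frames that fall outside the window
        if end - start > best[1] - best[0]:
            best = (start, end)
    return best
```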
According to a second aspect, an image processing apparatus is provided, which can be used to execute any method provided in the first aspect or any possible design of the first aspect. For example, the apparatus may be an in-vehicle device or a chip.

In a possible design, the apparatus may be divided into functional modules according to the method provided in the first aspect or any possible design of the first aspect; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.

In a possible design, the apparatus may include a memory and a processor. The memory is used to store a computer program, and the processor is used to invoke the computer program to execute the method provided by the first aspect or any possible design of the first aspect.

According to a third aspect, a computer-readable storage medium is provided, such as a non-transitory computer-readable storage medium, on which a computer program (or instructions) is stored. When the computer program (or instructions) runs on a computer, the computer is caused to execute any method provided by the first aspect or any possible design of the first aspect.

According to a fourth aspect, a computer program product is provided, which, when running on a computer, causes any method provided in the first aspect or any possible design of the first aspect to be executed.

It can be understood that any of the image processing apparatuses, computer storage media, computer program products, or systems provided above can be applied to the corresponding methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods, and details are not repeated here.
Description of the drawings

FIG. 1 is a schematic structural diagram of a computer system applicable to an embodiment of the present application;
FIG. 2 is a schematic diagram of a target detection result applicable to an embodiment of the present application;
FIG. 3 is a schematic diagram of image segmentation applicable to an embodiment of the present application;
FIG. 4 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of vehicle body information in a vehicle body coordinate system provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of obtaining ROI2 based on ROI1 provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a scene of ROI1 and ROI2 provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the change of viewing angle when a vehicle turns right provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of obtaining a scene alignment angle provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of mapping ROI2 to the plane where ROI1 is located provided by an embodiment of the present application;
FIG. 12 is a schematic flowchart of another image processing method provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of a drivable area provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of a first image area provided by an embodiment of the present application;
FIG. 15 is a schematic flowchart of another image processing method provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of another first image area provided by an embodiment of the present application;
FIG. 17 is a schematic structural diagram of a vehicle-mounted device provided by an embodiment of the present application.
Detailed description

As shown in FIG. 1, it is a schematic structural diagram of a computer system applicable to the embodiments of the present application. The computer system may be located on a vehicle and may include an in-vehicle device 101, as well as devices/components/networks directly or indirectly connected to the in-vehicle device. Referring to FIG. 1, the in-vehicle device 101 includes a processor 103, and the processor 103 is coupled to a system bus 105. The processor 103 may be one or more processors, each of which may include one or more processor cores. A video adapter 107 can drive a display 109, and the display 109 is coupled to the system bus 105. The system bus 105 is coupled to an input/output (I/O) bus 113 through a bus bridge 111. An I/O interface 115 is coupled to the I/O bus. The I/O interface 115 communicates with a variety of I/O devices, such as an input device 117 (for example, a keyboard, a mouse, or a touch screen), a media tray 121 (for example, a compact disc read-only memory (CD-ROM) or a multimedia interface), a transceiver 123 (which can send and/or receive radio communication signals), a camera 155 (which can capture still and dynamic digital video images), and an external universal serial bus (USB) interface 125. Optionally, the interface connected to the I/O interface 115 may be a USB interface.

The processor 103 may be any conventional processor, including a reduced instruction set computing (RISC) processor, a complex instruction set computing (CISC) processor, or a combination thereof. Optionally, the processor may be a dedicated apparatus such as an application-specific integrated circuit (ASIC). Optionally, the processor 103 may be a neural network processor or a combination of a neural network processor and the foregoing conventional processors. For example, the processor 103 may be a central processing unit (CPU).

The camera 155 may be any camera used to collect images, for example, a monocular camera or a binocular camera. There may be one or more cameras, and each camera may be located at the front, rear, or side of the vehicle. For ease of description, the specific examples below all take a camera located directly at the front of the vehicle as an example. In the embodiments of the present application, the camera 155 may be used to collect information about the surrounding environment of the vehicle (including surrounding roads). In one example, the camera 155 may include a software module used to record the shooting time of the images taken by the camera; alternatively, the module for recording the shooting time may be hardware connected to the camera 155. In one example, the position of the camera 155 relative to the vehicle may be fixed. In another example, the position of the camera 155 relative to the vehicle may change; for example, the camera 155 may rotate while shooting.
Optionally, in the various embodiments described herein, the in-vehicle device 101 may be located far away from the self-driving vehicle and may communicate with the self-driving vehicle wirelessly. In other respects, some of the processes described herein are executed on a processor provided in the self-driving vehicle, and others are executed by a remote processor, including taking the actions required to perform a single maneuver.

The in-vehicle device 101 may communicate with a software deploying server 149 through a network interface 129. The network interface 129 is a hardware network interface, such as a network card. The network 127 may be an external network, such as the Internet, or an internal network, such as Ethernet or a virtual private network (VPN). Optionally, the network 127 may also be a wireless network, such as a WiFi network or a cellular network.

A hard disk drive interface is coupled to the system bus 105 and connected to a hard disk drive. A system memory 135 is coupled to the system bus 105. The data running in the system memory 135 may include an operating system 137 and application programs 143 of the in-vehicle device 101.
The operating system includes a shell 139 and a kernel 141. The shell 139 is an interface between the user and the kernel of the operating system. The shell is the outermost layer of the operating system; it manages the interaction between the user and the operating system, waits for the user's input, interprets the user's input to the operating system, and processes the various outputs of the operating system.

The kernel 141 consists of those parts of the operating system that manage memory, files, peripherals, and system resources. Interacting directly with the hardware, the operating system kernel usually runs processes and provides inter-process communication, CPU time-slice management, interrupt handling, memory management, I/O management, and so on.

The application programs 143 include programs related to controlling the automatic driving of the vehicle, for example, a program for processing images acquired by the in-vehicle device that contain image information of the road around the vehicle, such as a program implementing the image processing method provided by the embodiments of the present application. Other examples include a program that manages the interaction between the self-driving vehicle and obstacles on the road, a program that controls the route or speed of the self-driving vehicle, and a program that controls the interaction between the self-driving vehicle and other self-driving vehicles on the road. The application programs 143 also exist on the system of the software deploying server 149. In one embodiment, when the application program 143 needs to be executed, the in-vehicle device 101 may download the application program 143 from the software deploying server 149.

A sensor 153 is associated with the in-vehicle device 101 and is used to detect the environment around the in-vehicle device 101. For example, the sensor 153 can detect animals, vehicles, obstacles, crosswalks, and so on; further, the sensor can also detect the environment around such objects, for example, the environment around an animal, such as other animals appearing around it, weather conditions, and the brightness of the surrounding environment. Optionally, if the in-vehicle device 101 is located on a self-driving vehicle, the sensor may be a camera, an infrared sensor, a chemical detector, a microphone, and so on. Optionally, the sensor 153 may include a speed sensor used to measure speed information (such as speed and acceleration) of the host vehicle (that is, the vehicle where the computer system shown in FIG. 1 is located), and an angle sensor used to measure direction information of the vehicle and the relative angles between the vehicle and objects around it.

It should be noted that the computer system shown in FIG. 1 is only an example and does not constitute a limitation on the computer systems applicable to the embodiments of the present application. For example, one or more of the components shown in FIG. 1 as connected to the in-vehicle device may be integrated with the in-vehicle device; for example, the camera may be integrated with the in-vehicle device.
The following explains some of the terms and technologies involved in the embodiments of this application:
1) Multi-frame super-resolution, reference image, non-reference image
Multi-frame super-resolution uses the image information of the non-reference images in a group of multiple frames of images to process the image information of the reference image in that group, for example, by compensating edge features, sharpening features, and the like, to obtain one frame of image whose resolution is higher than that of the reference image. The reference image may be any one of the multiple frames of images, and the non-reference images are all the images in the group other than the reference image.
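The idea can be sketched in Python as follows. This toy version assumes the frames are already scene-aligned and stands in for a real reconstruction algorithm with plain averaging plus nearest-neighbour upsampling; the function name, fusion rule, and scale factor are illustrative, not taken from the application:

```python
import numpy as np

def multi_frame_super_resolve(frames, scale=2):
    """Toy multi-frame super-resolution: fuse N aligned low-resolution
    frames (simple mean), then upsample the fused result. Production
    systems instead estimate sub-pixel shifts between frames and use
    them to reconstruct genuine high-frequency detail."""
    stack = np.stack(frames).astype(np.float64)
    # Fuse the reference frame with the non-reference frames.
    fused = stack.mean(axis=0)
    # Nearest-neighbour upsampling to a higher resolution.
    h, w = fused.shape
    rows = np.arange(h * scale) // scale
    cols = np.arange(w * scale) // scale
    return fused[np.ix_(rows, cols)]

frames = [np.full((4, 4), v, dtype=np.uint8) for v in (10, 20, 30)]
sr = multi_frame_super_resolve(frames)
print(sr.shape)  # (8, 8)
print(sr[0, 0])  # 20.0 (mean of 10, 20, 30)
```

Averaging aligned frames suppresses noise but does not by itself add detail; it is used here only to make the "N low-resolution areas in, one higher-resolution area out" shape of the operation concrete.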
2) Object, target object
An object may also be called a road object, an obstacle, a road obstacle, or the like. In the embodiments of this application, an object may be a person, a vehicle, a traffic light, a traffic sign (such as a speed-limit sign), a utility pole, a trash can, a foreign object, or the like on the roads around the vehicle. A foreign object is an object that should not be on the road, such as a box or a tire left on the road.
A target object is an object that the vehicle-mounted device needs to recognize. The target object may be predefined or indicated by a user, which is not limited in the embodiments of this application. For example, in an automatic driving scenario, target objects may include people, vehicles, traffic lights, and so on.
3) Target detection, image segmentation
Target detection and image segmentation are both image processing technologies.
The task of target detection is to find the areas in an image where the image information of all targets of interest is located, and to determine the size of each such area and its position in the image. A target of interest may be predefined or determined by a user; in the embodiments of this application, a target of interest may refer to a target object. The different areas obtained by target detection (that is, the areas where the image information of different targets is located) may or may not overlap. FIG. 2 is a schematic diagram of a target detection result, using vehicles as the targets of interest; the area delimited by each rectangular box in FIG. 2 is an area where the image information of one target of interest is located.
Image segmentation is a computer vision task of labeling specified areas in an image according to the image content. In short, it determines which objects have image information in a frame of image and where that image information is located in the image. Specifically, the purpose of image segmentation is to determine which object each pixel in the image belongs to. Image segmentation may include semantic segmentation, instance segmentation, and so on. The image areas obtained by image segmentation usually do not overlap. FIG. 3 is a schematic diagram of image segmentation; each connected region in FIG. 3 represents one area obtained by image segmentation.
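As a toy illustration of the per-pixel nature of segmentation (the class ids and the mask below are made up, not from the application), the non-overlapping regions of a label mask can be summarized as tight bounding boxes, which is also roughly how a segmentation result relates to detection-style boxes:

```python
import numpy as np

# Toy semantic-segmentation output: each pixel holds a class id
# (0 = road, 1 = vehicle, 2 = person). Every pixel belongs to exactly
# one class, so the regions cannot overlap.
mask = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 0, 0, 0],
    [2, 0, 0, 0],
])

def class_bbox(mask, cls):
    """Tight (row_min, col_min, row_max, col_max) box around one class."""
    ys, xs = np.nonzero(mask == cls)
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())

print(class_bbox(mask, 1))  # (0, 2, 1, 3)
print(class_bbox(mask, 2))  # (2, 0, 3, 0)
```
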
4) Image area, spatial area
To clearly distinguish an area in an actual scene (that is, an objectively existing area) from the image of that area (that is, an area in an image or picture), this application calls an area in the actual scene a "spatial area" and calls the image of a spatial area an "image area".
5) Image area to be detected, first image area, second image area
An image area to be detected is an area of an image that has a relatively high probability (for example, a probability greater than a preset threshold) of containing the image information of a target object. This is the definition given in the embodiments of this application; in actual implementation, however, when determining the image areas to be detected in an image, it is usually unnecessary to directly compute the probability that an image area contains the image information of a target object, or to configure the preset threshold in the vehicle-mounted device. Instead, whether that probability is higher than the preset threshold is determined indirectly by other methods. For example, when an image acquired by the vehicle-mounted device contains an image area with any of the following characteristics, that image area is determined to be an image area to be detected:
Feature 1: its confidence is lower than or equal to a first threshold.
Feature 2: it corresponds to a spatial area in the drivable area of the vehicle whose distance from the vehicle is greater than or equal to a second threshold.
Feature 3: its position in the image to which it belongs is a preset position.
That is to say, an image area to be detected is an area whose confidence is lower than or equal to the first threshold, or an image area corresponding to a spatial area in the drivable area of the vehicle whose distance from the vehicle is greater than or equal to the second threshold, or an area at a preset position in the image. Of course, where there is no conflict, any combination of Features 1 to 3 above may serve as the characteristics of an image area to be detected. For descriptions of these three features and specific examples, refer to the text below.
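A minimal sketch of how Features 1 to 3 might be checked in code. The field names, threshold values, and region representation here are all hypothetical; the application leaves them open:

```python
def needs_detection(region, conf_threshold=0.5, dist_threshold=50.0,
                    preset_positions=frozenset({"far_center"})):
    """Flag a candidate image region as 'to be detected' if it shows any
    of the three characteristics described above. `region` is assumed to
    be a dict with optional keys; thresholds are illustrative."""
    if region.get("confidence") is not None and region["confidence"] <= conf_threshold:
        return True   # Feature 1: low-confidence recognition result.
    if region.get("distance_m") is not None and region["distance_m"] >= dist_threshold:
        return True   # Feature 2: far-away spot in the drivable area.
    if region.get("position") in preset_positions:
        return True   # Feature 3: region at a preset image position.
    return False

print(needs_detection({"confidence": 0.3}))                      # True
print(needs_detection({"confidence": 0.9, "distance_m": 80.0}))  # True
print(needs_detection({"confidence": 0.9, "distance_m": 10.0}))  # False
```

Because the features are combined with "or", a region is flagged as soon as any one of them holds, matching the "any of the following characteristics" wording above.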
An image area to be detected may contain the image information of one or more objects. It may or may not contain the image information of one or more target objects. An image area to be detected may be part or all of a frame of image.
A frame of image may contain one or more image areas to be detected. The first image area and the second image area described in the embodiments of this application are both image areas to be detected.
6) Other terms
In the embodiments of this application, words such as "exemplary" and "for example" are used to present examples, illustrations, or explanations. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as more preferable or advantageous than other embodiments or designs. Rather, such words are intended to present the related concepts in a concrete manner.
In the embodiments of this application, "at least one" means one or more, and "multiple" means two or more.
In the embodiments of this application, "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" in this text generally indicates an "or" relationship between the associated objects before and after it.
The image processing method provided by the embodiments of this application is described below with reference to the accompanying drawings. The method may be applied to the vehicle-mounted device 101 described above.
FIG. 4 is a schematic flowchart of an image processing method provided by an embodiment of this application. The method may include:
S101: The vehicle-mounted device acquires N frames of images, where the N frames of images include image information of the roads around the vehicle in which the vehicle-mounted device is located, and N is an integer greater than or equal to 2.
The N frames of images may be images captured by a camera installed in the vehicle (such as the camera 155 described above). The roads around the vehicle may include one or more of the road ahead of the vehicle, the road behind it, and the roads beside it.
Any one of the N frames of images may be captured when the vehicle is stationary, or when the vehicle is moving (for example, going straight, changing lanes, or turning).
Optionally, the N frames of images are temporally consecutive; in other words, they are N frames of images captured consecutively by the camera installed in the vehicle.
Based on this optional implementation, when the method shown in FIG. 4 is executed multiple times, the vehicle-mounted device may determine the "N frames of images" in S101 from the images captured by the camera based on a sliding window of size N. The embodiments of this application do not limit the value of N; in any two executions of the method shown in FIG. 4, the values of N may be the same or different, and the N frames of images used in S101 in two adjacent executions may or may not overlap. For example, suppose the images captured by the camera, sorted by capture time, form the sequence image 1, image 2, image 3, ..., image n, where n is an integer that increases with the number of shots taken by the camera, and suppose N = 3. Then the N frames of images in S101 may be images 1 to 3 in the first execution of the method shown in FIG. 4, images 2 to 4 in the second execution, images 3 to 6 in the third execution, and so on, thereby executing the image processing method provided by the embodiments of this application.
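The sliding-window selection described above can be sketched as follows. The helper and its per-run stride list are illustrative; the application does not prescribe a stride, and windows may overlap or skip frames from one execution to the next:

```python
def sliding_windows(frame_ids, n, strides):
    """Group a frame stream into windows of n frames. `strides` gives
    the start offset between consecutive runs, so windows may overlap
    (stride < n) or leave gaps. Illustrative helper only."""
    windows, start = [], 0
    for stride in strides:
        if start + n > len(frame_ids):
            break  # not enough frames left for a full window
        windows.append(frame_ids[start:start + n])
        start += stride
    return windows

frames = list(range(1, 11))  # images 1..10 from the camera
print(sliding_windows(frames, 3, [1, 1, 3]))
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```
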
S102: The vehicle-mounted device acquires a first image area in each of N1 frames of images among the N frames of images, where the N1 first image areas of the N1 frames of images correspond to a first scene, 2 ≤ N1 ≤ N, N1 is an integer, and different first image areas are image areas in different images.
The first image area is an image area to be detected. For an explanation of image areas to be detected, refer to the text above.
The first scene is part of the road conditions, or part of the driving field of view, around the vehicle. The first scene can be understood as the spatial area in which one or more objects in the road conditions or driving field of view around the vehicle are located, where the spatial area in which an object is located is an area containing that object. For example, if the N1 frames of images all contain the image information of the same traffic light, the first scene may be the spatial area where that traffic light is located, and the first image area may be the image area where the image of that traffic light is located in each of the N1 frames of images. For another example, if the N1 frames of images all contain the image information of the same vehicle, the first scene may be the spatial area where that vehicle is located, and the first image area may be the image area where the image of that vehicle is located in each of the N1 frames of images.
It can be understood that, as the vehicle moves, or as movable objects (such as people or vehicles) on the roads around the vehicle move, the information contained in the different images captured by the camera installed on the vehicle differs. Consequently, among the N frames of images acquired in S101, it may be that not every frame contains an image area corresponding to the same scene, although it is of course also possible that every frame does. Therefore, in S102, the number of first image areas acquired by the vehicle-mounted device is denoted N1 rather than N.
In one implementation, for each of the N1 frames of images, the vehicle-mounted device may independently determine the first image area in that image, for example, according to at least one of Features 1 to 3 above.
In another implementation, for some of the N frames of images, the vehicle-mounted device may first determine the first image area in those images, for example, according to at least one of Features 1 to 3 above, and then infer from them the first image areas in the other images among the N frames. For a specific example, refer to the text below.
Optionally, the capture time interval between two adjacent frames among the N frames of images is less than or equal to a third threshold, where the capture time interval between two frames is the time period between the moments at which the two frames are captured. The embodiments of this application do not limit the specific value of the third threshold or the way it is set. In this way, when the vehicle is travelling fast, this helps improve the image compensation effect of multi-frame super-resolution.
It can be understood that, when the vehicle is travelling fast, if the capture time interval between two adjacent frames is large, the two frames may contain no image areas corresponding to the same scene. For example, at a high speed, one frame may include the image information of traffic light 1 and of vehicles 1 to 3 and of no other objects, while the next frame includes the image information of traffic light 2 and of vehicles 4 to 5 and of no other objects; the two frames then have no image areas corresponding to the same scene. As a result, the image information in the image area corresponding to a scene cannot be compensated using the image information of other frames, and the image compensation effect of multi-frame super-resolution is poor. This optional implementation therefore helps improve the image compensation effect of multi-frame super-resolution.
Optionally, the capture time interval between the first frame and the last frame among the N frames of images is less than or equal to a threshold. In this way, when the vehicle is travelling fast, this helps improve the image compensation effect of multi-frame super-resolution; for the specific analysis, refer to the text above.
S103: The vehicle-mounted device performs scene alignment on the N1 first image areas.
When the vehicle is stationary, the background of the different images captured by the camera installed on the vehicle does not change, while the foreground may change; therefore, when the vehicle is stationary, the step of aligning the first image areas in the N1 frames of images (that is, S103) may be optional. When the vehicle is moving, both the background and the foreground of the different images captured by the camera may change; in that case, scene alignment may be performed before the super-resolution operation in S104 is executed.
Scene alignment ensures that the image areas corresponding to the same scene in multiple frames of images contain a uniform or similar foreground and background. For example, the image areas corresponding to the same scene in the multiple frames may be scaled to a uniform size, and part or all of the scenery in the multiple frames may be rotated according to parameters such as angle, thereby ensuring that the image areas corresponding to that scene in the multiple frames contain a uniform or similar foreground and background.
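A crude sketch of the "scale to a uniform size, then rotate" step, using nearest-neighbour rescaling and 90-degree rotations so the example stays exact and dependency-free; real scene alignment estimates sub-pixel affine or homography warps between frames, which this does not attempt:

```python
import numpy as np

def align_region(region, target_hw, quarter_turns=0):
    """Crude scene alignment for one image area: rotate in 90-degree
    steps, then nearest-neighbour rescale to a common (height, width).
    Only illustrates the scale-and-rotate idea from the text."""
    rotated = np.rot90(region, k=quarter_turns)
    h, w = rotated.shape[:2]
    th, tw = target_hw
    rows = (np.arange(th) * h) // th   # nearest-neighbour source rows
    cols = (np.arange(tw) * w) // tw   # nearest-neighbour source cols
    return rotated[np.ix_(rows, cols)]

a = np.arange(4).reshape(2, 2)
aligned = align_region(a, (4, 4))
print(aligned.shape)  # (4, 4)
```

After each of the N1 first image areas is passed through such a mapping to the same target size and orientation, their foregrounds and backgrounds coincide well enough for pixel-wise fusion in the next step.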
S104: The vehicle-mounted device performs a super-resolution operation on the N1 first image areas after scene alignment to obtain a first target image area. This super-resolution operation may also be called a multi-frame super-resolution operation.
Specifically, the vehicle-mounted device processes the first image area in the scene-aligned reference image according to the first image areas in the scene-aligned non-reference images (for example, by compensating edge features, sharpening features, and the like) to obtain the first target image area, where the resolution of the first target image area is higher than that of the first image area in the reference image. For the specific processing procedure, refer to the prior art.
The embodiments of this application do not limit the application scenarios of the first target image area; for example, it may be applied to a target object detection scenario. When applied to a target object detection scenario, the above method may further include the following step S105:
S105: The vehicle-mounted device determines whether a target object exists in the first target image area. If so, it determines the relative position of the target object and the vehicle; if not, the procedure ends.
For the specific implementation of step S105, refer to the prior art. For example, the vehicle-mounted device performs target detection or image segmentation on the first target image area to determine whether the first target image area contains a target object and, if so, determines the relative position of the target object and the vehicle.
In the image processing method provided by the embodiments of this application, the super-resolution operation is specifically multi-frame super-resolution, so image compensation can combine the feature information of the image information in multiple frames of images; compared with the prior-art technical solution of performing image compensation with single-frame super-resolution, this helps improve the image compensation effect. In addition, because this technical solution performs multi-frame super-resolution on the multiple image areas corresponding to the same scene in the multiple frames, rather than performing the super-resolution operation on the multiple frames themselves, it helps reduce the complexity of the super-resolution operation and thus speed up super-resolution processing. Further, when the processing result of the image processing method provided by the embodiments of this application is used to assist automatic driving path planning, it helps improve the accuracy of automatic driving path planning.
Optionally, after S101 is performed, the method may further include: the vehicle-mounted device acquires a second image area in each of N2 frames of images among the N frames of images, where the N2 second image areas of the N2 frames of images correspond to a second scene, 2 ≤ N2 ≤ N, N2 is an integer, and different second image areas are image areas in different images. On this basis, the method further includes steps S102' to S105', which are obtained from S102 to S105 by replacing "first image area" with "second image area", "N1" with "N2", and "first target image area" with "second target image area".
The second image area is an image area to be detected that is different from the first image area. The first image area and the second image area may or may not partially overlap.
The embodiments of this application do not limit the magnitude relationship between N1 and N2, and the N1 frames of images and the N2 frames of images may or may not contain the same images.
Optionally, both the N1 frames of images and the N2 frames of images contain the reference image (that is, the reference image used in the super-resolution operation). For example, suppose the N frames of images are frames 1 to 10, frames 1 to 5 all contain the image information of traffic light 1, frames 1 to 7 all contain the image information of vehicle 1, and frame 1 is the reference image. Then the N1 frames of images may be frames 1 to 5, the first image area may be the area where the image information of traffic light 1 is located, the N2 frames of images may be frames 1 to 7, and the second image area may be the area where the image information of vehicle 1 is located.
The image processing method provided above is described below through specific examples:
Embodiment 1
FIG. 5 is a schematic flowchart of an image processing method provided by an embodiment of this application. The method may include:
S201: Refer to S101 above; of course, the embodiments of this application are not limited thereto.
S202: The vehicle-mounted device preprocesses a first image among the N frames of images (for example, by image segmentation or target detection) according to a candidate type set, to obtain at least one candidate image area included in the first image, the recognition result of each candidate image area, and the confidence of each recognition result. The first image may be any one of the N frames of images. Optionally, the first image is the reference image, that is, the reference image in the multi-frame super-resolution.
The candidate type set is a set of at least one candidate type. A candidate type is a type of target object that the vehicle-mounted device needs to recognize (that is, a type of object the vehicle-mounted device is interested in). The type of an object can be understood as what the object is; for example, if an object is a person, its type is "person", and if an object is a traffic light, its type is "traffic light". For example, in an automatic driving scenario, the candidate type set may include people, vehicles, traffic lights, and so on.
A candidate image area is an area containing the image information of an object of a candidate type. For example, if the candidate type set is a set consisting of people, vehicles, and traffic lights, the candidate image areas include the image area where the image information of a person is located, the image area where the image information of a vehicle is located, and the image area where the image information of a traffic light is located. For example, when the "preprocessing" is target detection, taking the image shown in FIG. 2 as the first image, the area delimited by a rectangular box in FIG. 2 may serve as a candidate image area. For another example, when the "preprocessing" is image segmentation, taking the image shown in FIG. 3 as the first image, each connected region shown in FIG. 3 may serve as a candidate image area. The area occupied by the at least one candidate image area described in S202 may be part or all of the first image.
The recognition result of a candidate image area may include which kind of target object the image information in the candidate image area belongs to, and optionally may also include the relative position between that target object and the vehicle.
The confidence of the recognition result of a candidate image area, which may be called the confidence of the candidate image area, can be understood as the accuracy of the recognition result of that candidate image area.
It can be understood that, for the same object, if the object is close to the vehicle, the image area it occupies in an image captured by the vehicle's camera is larger and clearer, whereas if the object is far from the vehicle, the image area it occupies is smaller and blurrier. Therefore, in the same image, the confidence of the recognition result of a first candidate image area (that is, a candidate image area containing the image information of an object close to the vehicle) is generally higher than that of a second candidate image area (that is, a candidate image area containing the image information of an object far from the vehicle). For example, suppose the recognition result of a candidate image area is that the image information of the object included in that area is the image information of a person. If the object is close to the vehicle, the probability that the object actually is a person is relatively high, that is, the confidence of the recognition result is high; if the object is far from the vehicle, the probability that the object actually is a person is relatively low, that is, the confidence of the recognition result is low, since the object may actually be, for example, a utility pole.
S203: The vehicle-mounted device acquires at least one area to be detected in the first image, where an area to be detected is a candidate image area whose confidence is less than or equal to the first threshold, and the at least one area to be detected includes the first image area.
Acquiring the first image area in the first image may include acquiring the position of the first image area in the first image and the size of the first image area.
The following steps are described using the example in which the at least one area to be detected includes the first image area. Optionally, the at least one area to be detected may also include the second image area, and so on.
It should be noted that, for a candidate image area whose confidence is higher than the first threshold, the recognition result acquired by the vehicle-mounted device in S202 is relatively accurate. Therefore, in subsequent steps, the vehicle-mounted device need not repeat the detection (or recognition) of the image information included in such candidate image areas; that is, in subsequent steps, only the image information included in candidate image areas whose confidence is lower than or equal to the first threshold is detected (or recognized), which helps reduce detection complexity and thus improve detection efficiency.
S204: The vehicle-mounted device acquires the speed v of the vehicle, the shooting time interval T between the first image and the second image, the relative angle θ1 between the vehicle and the spatial region corresponding to the first image area when the first image is captured, the relative angle θ2 between the vehicle and the spatial region corresponding to the first image area when the second image is captured, and the relative angle α between the headings of the vehicle within the time interval T (that is, the vehicle steering angle).
The speed of the vehicle may be variable or constant.
Each of v, T, θ1, θ2, and α may be measured by a corresponding sensor in the vehicle, or the raw information used to obtain some or all of these parameters may be measured by corresponding sensors in the vehicle. For example, for the parameter T, the raw information may be the capture time t1 of the first image and the capture time t2 of the second image, where T=t2-t1. For another example, for the parameter α, the raw information may be the heading of the vehicle at time t1 and the heading of the vehicle at time t2. Some or all of these sensors may be integrated with the vehicle-mounted device, or may be provided independently.
S205: The vehicle-mounted device determines, according to v, T, θ1, and θ2, the relative distance R1 between the vehicle and the spatial region corresponding to the first image area when the first image is captured, and the relative distance R2 between the vehicle and the spatial region corresponding to the first image area when the second image is captured.
As shown in Figure 6, assume that at time t1 the vehicle body coordinate system is the X1-Y1 coordinate system formed by the X1 axis and the Y1 axis, the position of the vehicle is the origin of the X1-Y1 coordinate system, the Y1 axis points straight ahead in the driving direction, the X1 axis points tangentially to the right of the vehicle, and the image captured by the camera is the first image. At time t2, the vehicle body coordinate system is the X2-Y2 coordinate system formed by the X2 axis and the Y2 axis, the position of the vehicle is the origin of the X2-Y2 coordinate system, the Y2 axis points straight ahead in the driving direction, the X2 axis points tangentially to the right of the vehicle, and the image captured by the camera is the second image. Then θ1 is the angle between the first image area in the first image and the Y1 axis, and θ2 is the angle between the first image area in the second image and the Y2 axis. The positions of α, R1, and R2 in the corresponding vehicle body coordinate systems may be as shown in Figure 6. The dash-dotted regions in Figure 6 are assumed imaging regions of the vehicle's front-view camera, where imaging region 1 is the imaging region at time t1 (that is, the first image), and imaging region 2 is the imaging region at time t2 (that is, the second image). Of course, in an actual scene, because the camera has parameters such as a field-of-view angle, the imaging region would not be a neatly axis-aligned rectangle as in Figure 6. For ease of explanation in this example, it is assumed that the imaging region of the camera is a rectangular region in front of the vehicle.
It should be noted that in Figure 6, R1 and R2 are determined based on the vehicle body coordinate system. In an actual implementation, R1 and R2 may alternatively be determined based on another coordinate system (such as the world coordinate system or the camera coordinate system), which is not limited in this embodiment of this application.
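S205 does not spell out the computation, but under the Figure 6 geometry, if the displacement between the two capture positions is approximated as the straight chord d = v·T (a constant-speed assumption), the triangle formed by the two vehicle positions and the spatial region has interior angle θ1 at the t1 position and γ = 180° - α - θ2 at the t2 position, so R1 and R2 follow from the law of sines. This is a hedged sketch of one possible implementation, not the method as claimed:

```python
import math

def relative_distances(v, T, theta1_deg, theta2_deg, alpha_deg):
    """Solve the triangle (vehicle at t1, vehicle at t2, spatial region).

    Assumes constant speed over T, so the travelled chord is d = v * T.
    The angle relations gamma and beta match those of Figure 10.
    """
    d = v * T
    gamma = 180.0 - alpha_deg - theta2_deg       # interior angle at the t2 position
    beta = 180.0 - gamma - theta1_deg            # angle subtended at the spatial region
    k = d / math.sin(math.radians(beta))         # common ratio of the law of sines
    R1 = k * math.sin(math.radians(gamma))       # side opposite the t2 angle
    R2 = k * math.sin(math.radians(theta1_deg))  # side opposite the t1 angle
    return R1, R2

R1, R2 = relative_distances(v=10.0, T=0.5, theta1_deg=20.0, theta2_deg=25.0, alpha_deg=3.0)
print(R2 < R1)  # True: the vehicle has moved closer to the region
```
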
S206: The vehicle-mounted device determines the first image area in the second image according to the first image area in the first image, R1, R2, and α.
The first image and the second image may be any two frames among the N frames of images described in S201. In one implementation, the first image is a reference image among the N frames of images (for example, the first frame), and the second image is any non-reference frame among the N frames of images. In other words, the embodiments of this application support the technical solution of "determining the first image area in any other non-reference frame according to the first image area in the reference image". In another implementation, the first image and the second image are temporally adjacent images. In other words, the embodiments of this application support the technical solution of "determining the first image area in a frame according to the first image area in the immediately preceding frame".
In Figure 6, ROI1 is the first image area in the first image; in the current example it is a rectangle, and its center point is drawn in bold. ROI2 is the first image area in the second image. S206 is specifically: determining the position and size of ROI2 in the second image according to the position and size of ROI1 in the first image, R1, R2, and α. The following describes how the position and size of ROI2 in the second image are determined.
Because α is the vehicle steering angle over the period from time t1 to time t2, it can be determined that the rotation angle of ROI2 relative to ROI1 is also α. Assume that the center points of ROI1 and ROI2 do not move between times t1 and t2; in other words, in the world coordinate system, the center point of ROI1 and the center point of ROI2 have the same coordinates. Therefore, the coordinates of the center point of ROI1 can be converted from the X1-Y1 coordinate system to the world coordinate system, and then to the X2-Y2 coordinate system, so as to obtain the coordinates of the center point of ROI2 in the X2-Y2 coordinate system. In this way, the position of ROI2 in the second image is obtained.
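The chain of conversions just described (X1-Y1 to world, then world to X2-Y2) can be sketched with plain 2D rigid transforms. The poses below and the sign convention for the turn angle (counterclockwise positive) are illustrative assumptions:

```python
import math

def rot(deg):
    """2x2 counterclockwise rotation matrix for an angle in degrees."""
    a = math.radians(deg)
    return ((math.cos(a), -math.sin(a)), (math.sin(a), math.cos(a)))

def apply(M, p):
    return (M[0][0] * p[0] + M[0][1] * p[1], M[1][0] * p[0] + M[1][1] * p[1])

def body_to_world(p_body, pos, heading_deg):
    """Body frame: X right, Y forward; heading is the body frame's rotation in world."""
    x, y = apply(rot(heading_deg), p_body)
    return (x + pos[0], y + pos[1])

def world_to_body(p_world, pos, heading_deg):
    return apply(rot(-heading_deg), (p_world[0] - pos[0], p_world[1] - pos[1]))

# t1 pose: world origin, heading 0; t2 pose: 5 m further forward, turned by alpha
alpha = 10.0
center_world = body_to_world((2.0, 30.0), (0.0, 0.0), 0.0)   # ROI1 center, X1-Y1 -> world
center_t2 = world_to_body(center_world, (0.0, 5.0), alpha)   # world -> X2-Y2
print(round(center_t2[0], 3), round(center_t2[1], 3))
```

The same fixed world point thus yields the ROI2 center directly in the t2 body frame, which is then projected into the second image by the camera model.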
After ROI1 is rotated by α, ROI1' is obtained, as shown in Figure 7. In this example, when the vehicle has traveled to time t2, it is closer to the spatial region corresponding to the first image area than it was at time t1. Therefore, according to the imaging principle, a larger frame is theoretically needed at time t2 to enclose a region carrying the same amount of information as at time t1. On this basis, starting from ROI1', the length of ROI1' can be scaled (enlarged, in this example) according to the ratio of R1 to R2 (optionally, further adjusted by a certain weight) to obtain the length of ROI2; the width of ROI2 can be obtained in the same way. In this way, the size of ROI2 is obtained.
An example is given here to illustrate that a larger frame is needed at time t2 to enclose a region with the same amount of information as at time t1. Assume the first image is shown in (a) of Figure 8, and the first image area in the first image is the region indicated by the rectangular frame in (a); the second image is shown in (b) of Figure 8, and the first image area in the second image is the region indicated by the rectangular frame in (b). It can be seen that the vehicle corresponding to the first image area is closer at time t2 to the vehicle in which the execution body of this embodiment is located than at time t1. Therefore, the first image area in the second image (the region indicated by the rectangular frame in (b)) must be larger than the first image area in the first image, so that the first image area in the second image can contain all the image information of the person riding the motorcycle.
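The size step described above reduces to multiplying the rotated region's width and height by the distance ratio; a minimal sketch, with the optional weight factor defaulting to 1.0:

```python
def scale_roi_size(size1, R1, R2, weight=1.0):
    """Scale the (width, height) of the rotated region ROI1' by R1/R2.

    When the vehicle is approaching the region (R2 < R1), the factor
    exceeds 1, i.e. ROI2 must be a larger frame, as the text explains.
    """
    factor = weight * (R1 / R2)
    return (size1[0] * factor, size1[1] * factor)

print(scale_roi_size((40.0, 90.0), R1=20.0, R2=10.0))  # (80.0, 180.0)
```
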
The foregoing S204 to S206 are described by using an example in which the first image area in the second image is inferred from the first image area in the first image. Accordingly, by performing S204 to S206 once or multiple times for different second images, the first image areas included in some or all of the N frames of images can be obtained.
It should be noted that the foregoing parameters v, T, θ1, θ2, α, R1, and R2 are collectively referred to as vehicle body information. The vehicle body information may be information about the vehicle detected directly by devices such as sensors installed in the vehicle (for example, v, T, θ1, θ2, and α above), or information about the vehicle obtained by processing the information detected by such devices (for example, R1 and R2). The foregoing S204 to S206 are merely an example, provided in this embodiment of this application, of "inferring the first image area in the second image based on vehicle body information"; they do not limit the specific implementations of "inferring the first image area in the second image based on vehicle body information" to which the embodiments of this application are applicable. In a specific implementation, "inferring the first image area in the second image" may be implemented based on more or less vehicle body information than that listed above.
S207: The vehicle-mounted device performs scene alignment on the N1 first image areas.
When the vehicle is in motion, because the motion state of the vehicle is uncertain, the alignment process involves many scenarios, and different scenarios may use different solutions for scene alignment. Two examples based on the vehicle's motion trajectory are given here. Both of the following examples describe the process of aligning the first image areas in two frames of images by using an example in which the camera is a front-view camera installed at the front center inside the vehicle.
Example 1: a scenario in which the vehicle moves straight ahead.
While the vehicle moves straight ahead, the front-view camera captures the first image and the second image at different moments, and the first image areas included in the first image and the second image are ROI1 and ROI2, respectively. If the center points of ROI1 and ROI2 are exactly straight ahead of the vehicle, then, because it has already been ensured when acquiring the first image areas that ROI2 contains information about the same scene (including foreground and background) as ROI1, the scene alignment may specifically be: scaling ROI1 (enlarging it, in this example) to the size of ROI2.
Example 2: a scenario in which the vehicle turns to the front right.
While the vehicle turns to the front right, the front-view camera captures the first image and the second image at different moments (that is, at times t1 and t2), and the first image areas included in the first image and the second image are ROI1 and ROI2, respectively. The turning of the vehicle causes ROI1 and ROI2 to have, in addition to the proportional relationship brought about by the vehicle's displacement, a viewing-angle change brought about by the shift in the vehicle's heading. If ROI1 is to the front right of the vehicle, then when the vehicle turns right (that is, the origin of the X2-Y2 coordinate system is to the upper right of the X1-Y1 coordinate system), the viewing angle inevitably includes more information on the left side of the ROI, as shown in Figure 9. In Figure 9, the description uses an example in which the image information of the object included in the first image area is a rectangle. At time t1, the vehicle body coordinate system is the X1-Y1 coordinate system; at time t2, the vehicle body coordinate system is the X2-Y2 coordinate system.
From Figure 9 it can clearly be seen that the first image, captured at the camera's time-t1 position, contains more information about the front of the object, while the second image, captured at the camera's time-t2 position, contains more information about the left side of the object. The position of the camera is the origin of the coordinate system. As a result, the scenes in the first image areas of the first image and the second image contain different amounts of information. Therefore, in addition to aligning the sizes of ROI1 and ROI2, the scene directions in their fields of view also need to be aligned once; that is, the scene alignment angle between the first image area in the first image and the first image area in the second image needs to be obtained. For example, the scene alignment angle may be β in Figure 10, which is drawn based on Figure 6. As shown in Figure 10, β=180°-γ-θ1, where γ=180°-α-θ2, and γ is an intermediate quantity introduced for ease of calculation. It follows that the scene alignment angle β can be obtained based on α, θ1, and θ2.
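The two angle relations above collapse into a single expression, since β = 180° - γ - θ1 with γ = 180° - α - θ2 simplifies to β = α + θ2 - θ1. The example angles below are illustrative:

```python
def scene_alignment_angle(alpha, theta1, theta2):
    """Scene alignment angle beta of Figure 10, in degrees."""
    gamma = 180.0 - alpha - theta2   # intermediate quantity from the text
    return 180.0 - gamma - theta1    # algebraically equals alpha + theta2 - theta1

print(scene_alignment_angle(alpha=3.0, theta1=20.0, theta2=25.0))  # 8.0
```
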
After the scene directions are aligned, the scenes need to be aligned pixel by pixel. In one example, based on the example in Figure 10, the scene in ROI2 can be mapped according to the scene alignment angle β. Figure 11 shows ROI2 being mapped, pixel by pixel at the angle β, onto the plane in which ROI1 is located, to obtain ROI2'; in Figure 11, the plane in which ROI2' is located is the plane in which ROI1 is located. Then, ROI1 is scaled (enlarged, in this example) against ROI2' so that ROI1 and ROI2' have the same size. At this point, the scene alignment is completed.
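The final size-matching step (scaling ROI1 to the dimensions of ROI2') can be sketched with a nearest-neighbour resampler on a list-of-lists grayscale patch. This sketch covers only the resize; the perspective mapping by β is omitted, and a production system would more likely use a library resize with bilinear interpolation:

```python
def resize_nearest(patch, new_h, new_w):
    """Nearest-neighbour resize of a 2-D grayscale patch (list of rows)."""
    h, w = len(patch), len(patch[0])
    return [[patch[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

roi1 = [[10, 20],
        [30, 40]]
print(resize_nearest(roi1, 4, 4))  # each source pixel expands to a 2x2 block
```
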
It should be noted that, for ease of description, the example in S207 is described by using the alignment of the first image areas in the first image and the second image as an example. The first image and the second image here are merely used to distinguish any two frames among the N1 frames of images. In an actual implementation, the first image and the second image in the alignment process may be the same as, or different from, the first image and the second image in the foregoing process of determining the first image area.
It should be noted that Example 2 above is merely an example, provided in this embodiment of this application, of "implementing scene alignment based on vehicle body information"; it does not limit the implementations of "implementing scene alignment based on vehicle body information" to which the embodiments of this application are applicable. In a specific implementation, scene alignment may be implemented based on more or less vehicle body information than that listed in Example 2.
S208 to S209: Reference may be made to the foregoing S104 to S105; of course, the embodiments of this application are not limited thereto.
For explanations of related content in this embodiment and the beneficial effects that can be achieved, reference may be made to the embodiment shown in Figure 4. In addition, on one hand, in this embodiment, candidate image areas in a frame of image whose confidence is lower than or equal to the first threshold are selected as first image areas, and super-resolution processing is performed on the first image areas in multiple frames of images. In this way, super-resolution processing does not need to be performed on candidate image areas whose confidence is higher than the first threshold, which helps reduce the complexity of super-resolution processing and thereby improve its efficiency. On the other hand, in this embodiment, the first image area in another frame of image is determined according to the first image area in one frame of image and the vehicle body information. In this way, when the super-resolution operation is performed, the vehicle-mounted device can obtain more spatial information, thereby improving the accuracy of the super-resolution operation.
Embodiment 2
Figure 12 is a schematic flowchart of an image processing method provided by an embodiment of this application. The method may include the following steps.
S301: Reference may be made to the foregoing S101; of course, the embodiments of this application are not limited thereto.
S302: The vehicle-mounted device preprocesses (for example, by image segmentation) the first image among the N frames of images to obtain image information of the drivable area in the first image.
The drivable area is the region extending, in each direction within the field of view, from the vehicle to the first object in that direction. For example, the region enclosed by the black outline in Figure 13 is a schematic diagram of a drivable area.
It is understandable that neither image segmentation nor any other technology for obtaining the image information of the drivable area of the first image can guarantee with complete certainty that the obtained drivable area contains no objects. Therefore, whether other objects are included can be further determined based on the technical solutions provided in the embodiments of this application; that is, the following S303 can be performed next.
For related descriptions of the first image, reference may be made to the foregoing text; details are not repeated here.
S303: According to the image information of the drivable area in the first image, the vehicle-mounted device uses the image information corresponding to the spatial region, within the drivable area, whose distance from the vehicle is greater than or equal to a second threshold as the first image area in the second image.
Optionally, the vehicle-mounted device may first determine the position, in the image captured by the camera, corresponding to the vehicle in which it is located, and then use the image region in the second image whose distance from that position is greater than or equal to a threshold as the first image area in the second image. The ratio between the second threshold and this threshold is equal to the ratio between the spatial distance and the image distance (that is, the image distance to which the spatial distance is mapped in the image).
It should be noted that, although the image captured by the camera contains no image information of the host vehicle (that is, the vehicle in which the vehicle-mounted device of this embodiment is located), the position corresponding to the vehicle in the image can be determined based on the position information of the camera in the vehicle, and optionally also based on the motion state of the vehicle (such as turning or going straight).
For example, taking the case where the camera is located at the front center inside the vehicle and the relative position of the vehicle and the camera does not change, the position corresponding to the vehicle in the image may be the middle of the lower boundary of the picture, as shown in Figure 14. Figure 14 also illustrates the first image area in this case.
For another example, taking the case where the camera is located at the front center inside the vehicle and the camera is rotated to the left relative to the vehicle, the position corresponding to the vehicle in the image captured by the camera may be the lower right corner of the picture.
The method for determining the position corresponding to the vehicle in the image captured by the camera is not limited thereto; for example, reference may be made to the prior art.
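The selection in S303 can be sketched as keeping only the drivable-area pixels that are far enough, in image distance, from the vehicle's anchor position. The mask representation and the concrete threshold are illustrative assumptions; the bottom-center anchor follows the Figure 14 example:

```python
import math

def far_drivable_region(drivable_pixels, anchor, threshold_px):
    """Keep drivable-area pixels at least threshold_px from the anchor.

    drivable_pixels: iterable of (row, col) pixels inside the drivable area.
    anchor: the vehicle's position in the image, e.g. bottom-center.
    """
    return {p for p in drivable_pixels if math.dist(p, anchor) >= threshold_px}

anchor = (479, 320)                                # bottom-center of a 480x640 image
drivable = {(470, 320), (300, 320), (100, 320)}    # toy drivable-area mask
print(sorted(far_drivable_region(drivable, anchor, threshold_px=150.0)))
# [(100, 320), (300, 320)]
```

The threshold in pixels corresponds to the second threshold in metres via the spatial-to-image distance ratio stated above.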
S304 to S309: Reference may be made to the foregoing S204 to S209; of course, the embodiments of this application are not limited thereto.
According to S302 to S306, the vehicle-mounted device can obtain the first image area included in each of the N1 frames among the N frames of images. In this example, because each frame of image usually contains a drivable area, usually N=N1.
For explanations of related content in this embodiment and the beneficial effects that can be achieved, reference may be made to the embodiment shown in Figure 4. In addition, on one hand, in this embodiment, the image region of a frame of image corresponding to the spatial region, within the drivable area of the vehicle, whose distance from the vehicle is greater than or equal to the second threshold is used as the first image area, and super-resolution processing is performed on the first image areas in multiple frames of images. The consideration is as follows: although the drivable area is defined as the region extending, in each direction within the field of view, from the vehicle to the first object in that direction, it cannot be guaranteed with complete certainty that the determined drivable area contains no object (or target object), and within the determined drivable area, the farther a region is from the vehicle, the higher the probability that it contains a target object. This embodiment is proposed on this basis, which helps improve the accuracy of detecting target objects relative to the prior art. On the other hand, in this embodiment, the first image area in another frame of image is determined according to the first image area in one frame of image and the vehicle body information. In this way, when the super-resolution operation is performed, the vehicle-mounted device can obtain more spatial information, thereby improving the accuracy of the super-resolution operation.
Embodiment 3
Figure 15 is a schematic flowchart of an image processing method provided by an embodiment of this application. The method may include the following steps.
S401: Reference may be made to the foregoing S101; of course, the embodiments of this application are not limited thereto.
S402: The vehicle-mounted device uses the region at a preset position in the first image among the N frames of images as the first image area.
In scenarios with different target objects, the position of the first image area in the first image may be the same or different.
For example, if the target object is a traffic light, because the image information of a traffic light is usually in the upper part of a frame of image, the region at a preset position in the upper part of a frame of image (for example, the upper two-fifths) may be used as the first image area. Figure 16 shows a schematic diagram of such a first image area.
For another example, if the target object is a speed-limit sign on an expressway, because the image information of a speed-limit sign on an expressway is usually on the right side of a frame of image, the region at a preset position on the right side of a frame of image (for example, the right quarter) may be used as the first image area.
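The preset positions in the two examples above can be encoded as fixed fractional crops. The (top, left, height, width) box format and the image dimensions below are assumptions for illustration; the fractions follow the text:

```python
def preset_first_image_area(image_h, image_w, target):
    """Return a (top, left, height, width) crop box for the preset position."""
    if target == "traffic_light":        # upper two-fifths of the frame
        return (0, 0, image_h * 2 // 5, image_w)
    if target == "speed_limit_sign":     # right quarter of the frame
        left = image_w * 3 // 4
        return (0, left, image_h, image_w - left)
    raise ValueError(f"no preset position for target {target!r}")

print(preset_first_image_area(1080, 1920, "traffic_light"))  # (0, 0, 432, 1920)
```
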
S403 to S408: Reference may be made to the foregoing S204 to S209; of course, the embodiments of this application are not limited thereto.
According to S402 to S405, the vehicle-mounted device can obtain the first image area included in each of the N1 frames among the N frames of images.
For explanations of related content in this embodiment and the beneficial effects that can be achieved, reference may be made to the embodiment shown in Figure 4. In addition, on one hand, in this embodiment, the region at a preset position in a frame of image is used as the first image area, and super-resolution processing is performed on the first image areas in multiple frames of images. This method is relatively simple and convenient to implement. On the other hand, in this embodiment, the first image area in another frame of image is determined according to the first image area in one frame of image and the vehicle body information. In this way, when the super-resolution operation is performed, the vehicle-mounted device can obtain more spatial information, thereby improving the accuracy of the super-resolution operation.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the method. To implement the foregoing functions, corresponding hardware structures and/or software modules for performing the functions are included. A person skilled in the art should easily be aware that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of this application.
In the embodiments of this application, the vehicle-mounted device may be divided into functional modules according to the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or may be implemented in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a logical function division; there may be other division manners in an actual implementation.
Figure 17 is a schematic structural diagram of a device provided by an embodiment of this application, which may specifically be a vehicle-mounted device 170. As an example of the device, the vehicle-mounted device 170 may be configured to perform the steps performed by the vehicle-mounted device in the method shown in Figure 4, Figure 5, Figure 12, or Figure 15.
The vehicle-mounted device 170 may include a first acquisition module 1701, a second acquisition module 1702, and a super-resolution module 1703. The first acquisition module 1701 is configured to acquire multiple frames of images, where the multiple frames of images include image information of the roads surrounding the vehicle in which the vehicle-mounted device is located. The second acquisition module 1702 is configured to acquire the first image area in each of the multiple frames of images, where the multiple first image areas of the multiple frames of images correspond to a first scene. The super-resolution module 1703 is configured to perform a super-resolution operation on the multiple first image areas. For example, with reference to Figure 4, the first acquisition module 1701 may be configured to perform S101, the second acquisition module 1702 may be configured to perform S102, and the super-resolution module 1703 may be configured to perform S104.
Optionally, the in-vehicle device 170 further includes a determining module 1704, configured to determine that image information of a target object exists in the image area obtained by the super-resolution operation. For example, with reference to FIG. 4, the determining module 1704 may be configured to perform S105.
Optionally, for each of the multiple frames of images: the first image area is an area whose confidence is lower than or equal to a first threshold; or the first image area corresponds to a spatial area, within the drivable area of the vehicle, whose distance from the vehicle is greater than or equal to a second threshold; or the first image area is an area at a preset position.
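The three alternative selection criteria reduce to a small predicate. The threshold values below are invented for illustration — the embodiment deliberately leaves the first and second thresholds unspecified:

```python
CONF_THRESHOLD = 0.5   # first threshold (assumed value)
DIST_THRESHOLD = 50.0  # second threshold, metres (assumed value)

def is_first_image_area(confidence: float,
                        distance_m: float,
                        at_preset_position: bool) -> bool:
    """A region qualifies as a 'first image area' if ANY of the three
    alternative criteria in the text holds: low detection confidence,
    a far-away spatial area within the drivable area, or a preset
    position in the frame."""
    return (confidence <= CONF_THRESHOLD
            or distance_m >= DIST_THRESHOLD
            or at_preset_position)

print(is_first_image_area(0.9, 80.0, False))  # True: far-away area
print(is_first_image_area(0.3, 10.0, False))  # True: low confidence
print(is_first_image_area(0.9, 10.0, False))  # False
```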
Optionally, the multiple frames of images include a first image and a second image. The second acquisition module 1702 is specifically configured to acquire the first image area in the second image. Acquiring the first image area in the second image includes: obtaining the first image area in the second image based on the first image area in the first image and first vehicle body information of the vehicle. For example, the first vehicle body information may include at least one of a first relative distance, a second relative distance, and a first vehicle steering angle, where the first relative distance is the relative distance between the vehicle and the spatial area corresponding to the first image area when the first image is captured, the second relative distance is the relative distance between the vehicle and the spatial area corresponding to the first image area when the second image is captured, and the first vehicle steering angle is the angle between the headings of the vehicle over the capture interval of the first image and the second image. For example, with reference to FIG. 5, the second acquisition module 1702 may be configured to perform S204 and S205. Optionally, the first vehicle body information may further include body height parameters, such as the height of the camera above the ground and/or the height of the vehicle body.
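A minimal geometric sketch of this step, under an assumed pinhole-camera model with an invented focal length `fx`: the apparent size of the spatial area scales with the ratio of the two relative distances, and the steering-angle change shifts it horizontally. This is only one plausible reading of how the first vehicle body information could be used, not the patented computation:

```python
import math

def locate_in_second_image(region: tuple, d1: float, d2: float,
                           steer_deg: float, fx: float = 1000.0) -> tuple:
    """Estimate the first image area (x, y, w, h) in the second image
    from its location in the first image and the first vehicle body
    information.  Assumptions: pinhole camera, so apparent size scales
    as d1/d2; a steering change of steer_deg degrees shifts the scene
    horizontally by about fx * tan(steer_deg) pixels; fx is an assumed
    calibration value."""
    x, y, w, h = region
    scale = d1 / d2                        # vehicle got closer => region grows
    dx = fx * math.tan(math.radians(steer_deg))
    cx, cy = x + w / 2 - dx, y + h / 2     # scene shifts opposite the turn
    w2, h2 = w * scale, h * scale
    return (round(cx - w2 / 2), round(cy - h2 / 2), round(w2), round(h2))

# Vehicle drove from 40 m to 20 m away without turning: the region
# doubles in size around the same centre.
print(locate_in_second_image((900, 500, 100, 60), 40.0, 20.0, 0.0))
# -> (850, 470, 200, 120)
```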
Optionally, the super-resolution module 1703 is specifically configured to perform the super-resolution operation on the multiple first image areas after scene alignment. For example, the super-resolution module 1703 may be configured to perform S104 in FIG. 4, S208 in FIG. 5, S308 in FIG. 12, or S407 in FIG. 15.
Optionally, the multiple frames of images include a third image and a fourth image. The in-vehicle device 170 further includes an alignment module 1705, configured to perform scene alignment of the multiple first image areas based on second vehicle body information of the vehicle. For example, the second vehicle body information may include at least one of a first relative angle, a second relative angle, and a second vehicle steering angle, where the first relative angle is the relative angle between the vehicle and the spatial area corresponding to the first image area when the third image is captured, the second relative angle is the relative angle between the vehicle and the spatial area corresponding to the first image area when the fourth image is captured, and the second vehicle steering angle is the angle between the headings of the vehicle over the capture interval of the third image and the fourth image. Optionally, the second vehicle body information may further include body height parameters, such as the height of the camera above the ground and/or the height of the vehicle body.
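A toy sketch of scene alignment from the relative angles, again under an assumed pinhole model with an invented focal length: the angular change between the third and fourth captures is converted into a horizontal pixel shift that is compensated before the super-resolution operation. A real system would likely estimate a full homography; this only illustrates the idea:

```python
import math
import numpy as np

def align_patch(patch: np.ndarray, angle3_deg: float,
                angle4_deg: float, fx: float = 1000.0) -> np.ndarray:
    """Shift the fourth image's first area horizontally so its scene
    lines up with the third image's.  Assumption: a relative-angle
    change of delta degrees maps to roughly fx * tan(delta) pixels of
    horizontal translation (fx is an assumed calibration value)."""
    delta = angle4_deg - angle3_deg
    shift = int(round(fx * math.tan(math.radians(delta))))
    return np.roll(patch, -shift, axis=1)  # shift columns to compensate

patch = np.arange(16.0).reshape(4, 4)
aligned = align_patch(patch, 0.0, 0.0573)  # ~0.001 rad => ~1 px shift
print(aligned[0])  # [1. 2. 3. 0.]
```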
Optionally, the multiple frames of images are temporally consecutive frames.
Optionally, the time interval between the capture moment of the first frame and the capture moment of the last frame among the multiple frames of images is less than or equal to a third threshold.
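The temporal constraint above reduces to a simple window check over the capture timestamps; the 0.5 s bound used below is an assumed value for the third threshold:

```python
def frames_within_window(timestamps: list, max_span_s: float) -> bool:
    """Return True if the interval between the first and last capture
    moments does not exceed the third threshold (max_span_s, seconds).
    Timestamps are assumed sorted in capture order."""
    return timestamps[-1] - timestamps[0] <= max_span_s

print(frames_within_window([0.00, 0.04, 0.08, 0.12], 0.5))  # True
print(frames_within_window([0.00, 0.40, 0.80], 0.5))        # False
```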
Optionally, the second acquisition module 1702 is further configured to acquire a second image area in each of the multiple frames of images, where the multiple second image areas of the multiple frames correspond to a second scene. The super-resolution module 1703 is further configured to perform the super-resolution operation on the multiple second image areas.
In an example, referring to FIG. 1, the first acquisition module 1701 may be implemented through the I/O interface 115 in FIG. 1, and at least one of the second acquisition module 1702, the super-resolution module 1703, the determining module 1704, and the alignment module 1705 may be implemented by the processor 103 in FIG. 1 invoking the application program 143.
For specific descriptions of the foregoing optional manners, refer to the foregoing method embodiments; details are not repeated here. For explanations of any in-vehicle device 170 provided above and descriptions of its beneficial effects, refer to the corresponding method embodiments above; details are likewise not repeated.
It should be noted that the actions described above for the respective units are merely specific examples; for the actions actually performed by each unit, refer to the actions or steps mentioned in the descriptions of the embodiments based on FIG. 4, FIG. 5, FIG. 12, or FIG. 15.
A person of ordinary skill in the art may understand that all or some of the steps of the foregoing embodiments may be implemented by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, such as a read-only memory or a random access memory. The foregoing processing unit or processor may be a central processing unit, a general-purpose processor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
The embodiments of this application further provide a computer program product containing instructions that, when run on a computer, cause the computer to perform any one of the methods in the foregoing embodiments. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)).
It should be noted that the devices for storing computer instructions or computer programs provided in the embodiments of this application, such as, but not limited to, the foregoing memory, computer-readable storage medium, and communication chip, are all non-transitory.
In implementing the claimed application, a person skilled in the art can understand and effect other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Although this application has been described with reference to specific features and embodiments thereof, various modifications and combinations may be made without departing from the spirit and scope of this application. Accordingly, the specification and drawings are merely illustrative of the application as defined by the appended claims and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of this application.

Claims (20)

  1. An image processing method, applied to an in-vehicle device, wherein the method comprises:
    acquiring multiple frames of images, wherein the multiple frames of images comprise image information of roads around a vehicle in which the in-vehicle device is located;
    acquiring a first image area in each of the multiple frames of images, wherein multiple first image areas of the multiple frames of images correspond to a first scene; and
    performing a super-resolution operation on the multiple first image areas.
  2. The method according to claim 1, wherein the method further comprises:
    determining that image information of a target object exists in an image area obtained by the super-resolution operation.
  3. The method according to claim 1 or 2, wherein, for each of the multiple frames of images:
    the first image area is an area whose confidence is lower than or equal to a first threshold;
    or, the first image area corresponds to a spatial area, within a drivable area of the vehicle, whose distance from the vehicle is greater than or equal to a second threshold;
    or, the first image area is an area at a preset position.
  4. The method according to any one of claims 1 to 3, wherein the multiple frames of images comprise a first image and a second image;
    the acquiring a first image area in each of the multiple frames of images comprises: acquiring the first image area in the second image; and
    the acquiring the first image area in the second image comprises:
    determining the first image area in the second image based on the first image area in the first image and first vehicle body information of the vehicle, wherein the first vehicle body information comprises at least one of a first relative distance, a second relative distance, and a first vehicle steering angle; the first relative distance is a relative distance between the vehicle and a spatial area corresponding to the first image area when the first image is captured; the second relative distance is a relative distance between the vehicle and the spatial area corresponding to the first image area when the second image is captured; and the first vehicle steering angle is an angle between headings of the vehicle over a capture interval of the first image and the second image.
  5. The method according to any one of claims 1 to 4, wherein the performing a super-resolution operation on the multiple first image areas comprises:
    performing the super-resolution operation on the multiple first image areas after scene alignment.
  6. The method according to claim 5, wherein the multiple frames of images comprise a third image and a fourth image, and before the performing the super-resolution operation on the multiple first image areas after scene alignment, the method further comprises:
    performing scene alignment of the multiple first image areas based on second vehicle body information of the vehicle, wherein the second vehicle body information comprises at least one of a first relative angle, a second relative angle, and a second vehicle steering angle; the first relative angle is a relative angle between the vehicle and a spatial area corresponding to the first image area when the third image is captured; the second relative angle is a relative angle between the vehicle and the spatial area corresponding to the first image area when the fourth image is captured; and the second vehicle steering angle is an angle between headings of the vehicle over a capture interval of the third image and the fourth image.
  7. The method according to any one of claims 1 to 6, wherein the multiple frames of images are temporally consecutive frames.
  8. The method according to claim 7, wherein a time interval between a capture moment of a first frame and a capture moment of a last frame among the multiple frames of images is less than or equal to a third threshold.
  9. The method according to any one of claims 1 to 8, wherein the method further comprises:
    acquiring a second image area in each of the multiple frames of images, wherein multiple second image areas of the multiple frames of images correspond to a second scene; and
    performing the super-resolution operation on the multiple second image areas.
  10. An in-vehicle device, wherein the in-vehicle device comprises:
    a first acquisition module, configured to acquire multiple frames of images, wherein the multiple frames of images comprise image information of roads around a vehicle in which the in-vehicle device is located;
    a second acquisition module, configured to acquire a first image area in each of the multiple frames of images, wherein multiple first image areas of the multiple frames of images correspond to a first scene; and
    a super-resolution module, configured to perform a super-resolution operation on the multiple first image areas.
  11. The in-vehicle device according to claim 10, wherein the in-vehicle device further comprises:
    a determining module, configured to determine that image information of a target object exists in an image area obtained by the super-resolution operation.
  12. The in-vehicle device according to claim 10 or 11, wherein, for each of the multiple frames of images:
    the first image area is an area whose confidence is lower than or equal to a first threshold;
    or, the first image area corresponds to a spatial area, within a drivable area of the vehicle, whose distance from the vehicle is greater than or equal to a second threshold;
    or, the first image area is an area at a preset position.
  13. The in-vehicle device according to any one of claims 10 to 12, wherein the multiple frames of images comprise a first image and a second image;
    the second acquisition module is specifically configured to acquire the first image area in the second image, and the acquiring the first image area in the second image comprises:
    determining the first image area in the second image based on the first image area in the first image and first vehicle body information of the vehicle, wherein the first vehicle body information comprises at least one of a first relative distance, a second relative distance, and a first vehicle steering angle; the first relative distance is a relative distance between the vehicle and a spatial area corresponding to the first image area when the first image is captured; the second relative distance is a relative distance between the vehicle and the spatial area corresponding to the first image area when the second image is captured; and the first vehicle steering angle is an angle between headings of the vehicle over a capture interval of the first image and the second image.
  14. The in-vehicle device according to any one of claims 10 to 13, wherein
    the super-resolution module is specifically configured to perform the super-resolution operation on the multiple first image areas after scene alignment.
  15. The in-vehicle device according to claim 14, wherein the multiple frames of images comprise a third image and a fourth image, and the in-vehicle device further comprises:
    an alignment module, configured to perform scene alignment of the multiple first image areas based on second vehicle body information of the vehicle, wherein the second vehicle body information comprises at least one of a first relative angle, a second relative angle, and a second vehicle steering angle; the first relative angle is a relative angle between the vehicle and a spatial area corresponding to the first image area when the third image is captured; the second relative angle is a relative angle between the vehicle and the spatial area corresponding to the first image area when the fourth image is captured; and the second vehicle steering angle is an angle between headings of the vehicle over a capture interval of the third image and the fourth image.
  16. The in-vehicle device according to any one of claims 10 to 15, wherein the multiple frames of images are temporally consecutive frames.
  17. The in-vehicle device according to claim 16, wherein a time interval between a capture moment of a first frame and a capture moment of a last frame among the multiple frames of images is less than or equal to a third threshold.
  18. The in-vehicle device according to any one of claims 10 to 17, wherein
    the second acquisition module is further configured to acquire a second image area in each of the multiple frames of images, wherein multiple second image areas of the multiple frames of images correspond to a second scene; and
    the super-resolution module is further configured to perform the super-resolution operation on the multiple second image areas.
  19. An image processing apparatus, comprising a memory and a processor, wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 9.
PCT/CN2020/101717 2019-07-12 2020-07-13 Image processing method and apparatus WO2021008500A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910633070.3A CN112215748A (en) 2019-07-12 2019-07-12 Image processing method and device
CN201910633070.3 2019-07-12

Publications (1)

Publication Number Publication Date
WO2021008500A1

Family

ID=74047262

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/101717 WO2021008500A1 (en) 2019-07-12 2020-07-13 Image processing method and apparatus

Country Status (2)

Country Link
CN (1) CN112215748A (en)
WO (1) WO2021008500A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104219488A (en) * 2013-05-31 2014-12-17 索尼公司 Method and device of generating target image as well as video monitoring system
CN106600536A (en) * 2016-12-14 2017-04-26 同观科技(深圳)有限公司 Video imager super-resolution reconstruction method and apparatus
US9691133B1 (en) * 2013-12-16 2017-06-27 Pixelworks, Inc. Noise reduction with multi-frame super resolution
CN107480772A (en) * 2017-08-08 2017-12-15 浙江大学 A kind of car plate super-resolution processing method and system based on deep learning
CN109118430A (en) * 2018-08-24 2019-01-01 深圳市商汤科技有限公司 Super-resolution image reconstruction method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8755636B2 (en) * 2011-09-14 2014-06-17 Mediatek Inc. Method and apparatus of high-resolution image reconstruction based on multi-frame low-resolution images
CN106845478B (en) * 2016-12-30 2019-09-10 同观科技(深圳)有限公司 A kind of secondary licence plate recognition method and device of character confidence level


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378969A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Fusion method, device, equipment and medium of target detection results
CN113378969B (en) * 2021-06-28 2023-08-08 北京百度网讯科技有限公司 Fusion method, device, equipment and medium of target detection results

Also Published As

Publication number Publication date
CN112215748A (en) 2021-01-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20840183

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20840183

Country of ref document: EP

Kind code of ref document: A1