CN115205311B - Image processing method, device, vehicle, medium and chip - Google Patents

Image processing method, device, vehicle, medium and chip

Info

Publication number
CN115205311B
Authority
CN
China
Prior art keywords
image
target
aerial view
target object
acquired
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210837775.9A
Other languages
Chinese (zh)
Other versions
CN115205311A (en)
Inventor
李旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaomi Automobile Technology Co Ltd
Original Assignee
Xiaomi Automobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaomi Automobile Technology Co Ltd
Priority to CN202210837775.9A
Publication of CN115205311A
Application granted
Publication of CN115205311B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The disclosure relates to an image processing method, an image processing device, a vehicle, a medium and a chip, and relates to the field of autonomous driving. The method comprises the following steps: obtaining a target difficult sample, wherein the target difficult sample comprises a first aerial view image; determining a first partial image corresponding to a target object in the first aerial view image; and adding the first partial image to a second aerial view image corresponding to a target scene to obtain a target aerial view image which corresponds to the target scene and contains the target object. Thus, based on the acquired target difficult sample, the image content corresponding to the target object can be determined, which amounts to obtaining difficult data of the target object, and the target aerial view image obtained from that image content corresponds to the target scene and contains the target object, which amounts to forming a new difficult sample corresponding to the target scene. Therefore, images corresponding to different objects can be generated for different scenes as new difficult samples, real-vehicle collection is not needed, and the efficiency of obtaining difficult samples is improved.

Description

Image processing method, device, vehicle, medium and chip
Technical Field
The present disclosure relates to the field of autonomous driving, and in particular, to an image processing method, apparatus, vehicle, medium, and chip.
Background
Currently, multi-camera visual perception is an important technology in the field of autonomous driving. In general, images are acquired by a plurality of surround-view fisheye cameras mounted on a vehicle, and a series of image processing steps is performed on the acquired images to obtain a perception result in bird's-eye view space, which is then used for different tasks (for example, ground obstacle detection, parking space detection, road marking detection, etc.). Therefore, obtaining good-quality images with rich information as training data plays an important role in improving the accuracy of these tasks. In the related art, images used as task training data are usually collected by real vehicles; this collection mode requires a lot of time and labor, so the efficiency of acquiring training data is not high. This is especially true for complex training data (i.e., difficult samples), where the high collection difficulty and the labor consumed in data screening further reduce the acquisition efficiency.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides an image processing method, apparatus, vehicle, medium, and chip.
According to a first aspect of embodiments of the present disclosure, there is provided an image processing method, the method including:
Obtaining a target difficult sample, wherein the target difficult sample comprises a first aerial view image;
determining a first local image corresponding to a target object in the first aerial view image;
and adding the first local image to a second aerial view image corresponding to a target scene to obtain a target aerial view image corresponding to the target scene and containing the target object.
Optionally, the determining a first local image corresponding to the target object in the first aerial image includes:
inputting the first aerial view image into a pre-trained image segmentation model to obtain a segmented image output by the image segmentation model, wherein the segmented image is used for indicating an image area corresponding to a preset object in the first aerial view image;
determining an image area corresponding to the target object in the segmented image as a target image area;
and extracting an image area corresponding to the target image area from the first aerial view image as the first local image.
Optionally, the image segmentation model is trained by:
acquiring training data, wherein the training data comprises a sample aerial view image and a labeling image corresponding to the sample aerial view image, and the labeling image is used for indicating preset objects corresponding to pixel points in the sample aerial view image;
And performing model training by taking the sample aerial view image as the input of a model and taking the marked image as the target output of the model so as to obtain the trained image segmentation model.
Optionally, the adding the first partial image to the second aerial image corresponding to the target scene includes:
determining an image area with a position association relation with the target object in the second aerial view image as a first association area;
the first partial image is added to the first association region.
Optionally, the method further comprises:
acquiring a first acquired image corresponding to the first aerial view image, wherein the first aerial view image is a stitched image generated based on the first acquired image;
determining a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image;
acquiring a second acquired image corresponding to the second aerial view image, wherein the second aerial view image is a stitched image generated based on the second acquired image;
and adding the second local image to the second acquisition image to obtain a target acquisition image which corresponds to the target scene and contains the target object.
Optionally, the adding the second local image to the second acquired image includes:
determining an image area with a position association relation with the target object in the second acquired image as a second association area;
the second partial image is added to the second association region.
Optionally, the target object is any one of the following:
a parking space line, a wheel stop, a speed bump and a zebra crossing.
According to a second aspect of embodiments of the present disclosure, there is provided an image processing apparatus including:
a first acquisition module configured to acquire a target difficult sample, the target difficult sample including a first bird's-eye image;
a first determination module configured to determine a first partial image of the first bird's-eye image corresponding to a target object;
and the first adding module is configured to add the first local image to a second aerial view image corresponding to a target scene to obtain a target aerial view image corresponding to the target scene and containing the target object.
Optionally, the first determining module includes:
a segmentation sub-module, configured to input the first aerial view image into a pre-trained image segmentation model, to obtain a segmented image output by the image segmentation model, where the segmented image is used for indicating an image area corresponding to a preset object in the first aerial view image;
a first determination submodule configured to determine an image region corresponding to the target object in the segmented image as a target image region;
an extraction sub-module configured to extract an image area corresponding to the target image area from the first bird's-eye image as the first partial image.
Optionally, the image segmentation model is trained by the following modules:
the second acquisition module is configured to acquire training data, wherein the training data comprises a sample aerial view image and a labeling image corresponding to the sample aerial view image, and the labeling image is used for indicating preset objects corresponding to pixel points in the sample aerial view image;
and the training module is configured to perform model training by taking the sample aerial view image as an input of a model and taking the marked image as a target output of the model so as to obtain the trained image segmentation model.
Optionally, the first adding module includes:
a second determination submodule configured to determine an image area having a position association relation with the target object in the second bird's-eye view image as a first association area;
a first adding sub-module configured to add the first partial image to the first association region.
Optionally, the apparatus further comprises:
a third acquisition module configured to acquire a first acquired image corresponding to the first aerial image, the first aerial image being a stitched image generated based on the first acquired image;
the second determining module is configured to determine a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image;
a fourth acquisition module configured to acquire a second acquired image corresponding to the second bird's-eye view image, the second bird's-eye view image being a stitched image generated based on the second acquired image;
and a second adding module configured to add the second local image to the second acquired image to obtain a target acquired image corresponding to the target scene and containing the target object.
Optionally, the second adding module includes:
a third determining submodule configured to determine an image area with a position association relation with the target object in the second acquired image as a second association area;
a second adding sub-module configured to add the second partial image to the second association region.
Optionally, the target object is any one of the following:
a parking space line, a wheel stop, a speed bump and a zebra crossing.
According to a third aspect of embodiments of the present disclosure, there is provided a vehicle comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute instructions in the memory to implement the steps of the image processing method provided in the first aspect of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the image processing method provided by the first aspect of the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided a chip comprising a processor and an interface; the processor is configured to read instructions to perform the image processing method provided in the first aspect of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
according to the technical scheme, after the first aerial view image in the target difficult sample is acquired, the first local image corresponding to the target object in the first aerial view image is determined, and the first local image is added to the second aerial view image corresponding to the target scene, so that the target aerial view image corresponding to the target scene and containing the target object is obtained. Thus, based on the acquired target difficult sample, the image content corresponding to the target object, namely the first partial image, can be determined from the acquired target difficult sample, which corresponds to the difficult data of the target object; then, the first local image is added to the second aerial image corresponding to the target scene, and the obtained target aerial image corresponds to the target scene and also contains the target object, so that the new difficult sample corresponding to the target scene is formed. Therefore, images corresponding to different objects can be generated for different scenes to serve as new difficult sample, real vehicle collection is not needed, and efficiency of obtaining the difficult sample is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment.
Fig. 2 is a block diagram of an image processing apparatus according to an exemplary embodiment.
Fig. 3 is a functional block diagram of a vehicle according to an exemplary embodiment.
Fig. 4 is a block diagram of an image processing apparatus according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be noted that all actions of acquiring signals, information or data in the present application are performed in compliance with the applicable data protection laws and policies of the country where the device is located, and with the authorization given by the owner of the corresponding device.
In general, samples on which an algorithm (or model) has difficulty obtaining a correct processing result (e.g., incorrect matching, incorrect recognition, incorrect detection, etc.) may be referred to as difficult samples, and may be regarded as obstacles that limit the performance of the algorithm (or model). As described in the background, in the related art the difficult samples required by a task are collected by real vehicles, which leads to high collection difficulty and low collection efficiency.
In order to solve the technical problems, the disclosure provides an image processing method, an image processing device, a vehicle, a medium and a chip, so as to improve efficiency of obtaining a difficult sample.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment. As shown in fig. 1, the method provided by the present disclosure may include the following steps 11 to 13.
In step 11, a target difficult sample is obtained.
As described above, a difficult sample refers to a sample that is difficult for an algorithm (or model) to process (e.g., difficult to identify, difficult to detect, etc.). Different task scenes correspond to different difficult samples. For example, in a parking space line identification task, an image in which a parking space line is partially occluded, an image containing a worn parking space line, or the like can be taken as a difficult sample.
In the application scenario of the present disclosure, some images already exist as difficult samples, and the purpose of the present disclosure is to perform data augmentation based on these images to obtain new images that can also be used as difficult samples.
The target difficult sample obtained in step 11 may include a first aerial view image. In an autonomous driving scene, image acquisition devices (for example, surround-view fisheye cameras) are generally arranged on a vehicle; a plurality of acquired images corresponding to the same acquisition time can be obtained from these image acquisition devices, and the images are stitched and fused to obtain the corresponding bird's-eye view image.
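As an illustrative sketch only (the specific stitching pipeline is not prescribed by the present disclosure), the bird's-eye view image can be obtained by warping each surround-view camera image onto the ground plane with a per-camera homography and blending the results; the function name, homographies and output size below are assumptions made for illustration.

import cv2
import numpy as np

def stitch_bird_eye_view(images, homographies, bev_size=(1000, 1000)):
    """Warp each surround-view camera image onto the ground plane and fuse them.

    images: list of undistorted camera images (H x W x 3).
    homographies: list of 3x3 ground-plane homographies, assumed to come from
    the intrinsic/extrinsic calibration of each camera.
    """
    bev_acc = np.zeros((bev_size[1], bev_size[0], 3), np.float32)
    weight = np.zeros((bev_size[1], bev_size[0], 1), np.float32)
    for img, H in zip(images, homographies):
        warped = cv2.warpPerspective(img, H, bev_size)          # project onto the BEV plane
        mask = (warped.sum(axis=2, keepdims=True) > 0).astype(np.float32)
        bev_acc += warped.astype(np.float32) * mask
        weight += mask                                           # count overlapping cameras per pixel
    return (bev_acc / np.maximum(weight, 1)).astype(np.uint8)   # simple average blending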
In step 12, a first partial image of the first aerial image corresponding to the target object is determined.
The target object may be set according to actual requirements, for example, as an object related to a task. The target object may include, but is not limited to, a parking space line, a wheel stop, a speed bump, a zebra crossing, and the like. For example, a parking space line identification task requires the ability to identify parking space lines, and thus the target object may be correspondingly set as a parking space line. For another example, a zebra crossing detection task requires the ability to detect zebra crossings, and thus the target object may be correspondingly set as a zebra crossing.
In one possible embodiment, step 12 may comprise the steps of:
inputting the first aerial view image into a pre-trained image segmentation model to obtain a segmented image output by the image segmentation model;
determining an image area corresponding to a target object in the segmented image as a target image area;
an image region corresponding to the target image region is extracted from the first bird's eye image as a first partial image.
Wherein the segmented image may be used to indicate an image area of the first aerial image corresponding to the preset object.
By way of example, the image segmentation model may be trained by:
acquiring training data;
model training is carried out by taking a sample aerial view image as the input of a model and taking a labeling image as the target output of the model, so as to obtain a trained image segmentation model.
The training data may include a sample aerial view image and a labeling image corresponding to the sample aerial view image, where the labeling image may be used to indicate preset objects corresponding to pixel points in the sample aerial view image.
The preset object may be any of various objects that may be involved in a driving scene. Illustratively, the preset objects may include, but are not limited to, the following: a parking space line, a wheel stop, a speed bump and a zebra crossing.
In one possible implementation, each pixel point in the sample aerial view image may be labeled with an N-dimensional vector (N is a positive integer), where N is the number of preset objects and each element of the N-dimensional vector corresponds to one preset object. For example, the labeled image corresponding to the sample aerial view image may be obtained by manual annotation.
For example, if the preset objects include three kinds, namely a parking space line, a wheel stop and a speed bump, the labeling information corresponding to a pixel point is [X1, X2, X3], where X1 corresponds to the parking space line, X2 corresponds to the wheel stop, X3 corresponds to the speed bump, a labeling value of 1 means yes and a labeling value of 0 means no. If a pixel belongs to the parking space line, the labeling information for that pixel may be [1, 0, 0]; if a pixel is not any of the parking space line, the wheel stop and the speed bump, the labeling information for that pixel may be [0, 0, 0].
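As a small illustration of this labeling scheme (the object names and helper function below are assumptions, not part of the disclosure), each pixel's annotation can be built as an N-dimensional vector; for training, all-zero vectors can later be mapped to a background class index.

import numpy as np

PRESET_OBJECTS = ["parking_space_line", "wheel_stop", "speed_bump"]  # N = 3 preset objects (assumed)

def one_hot(object_name):
    vec = np.zeros(len(PRESET_OBJECTS), dtype=np.uint8)
    if object_name in PRESET_OBJECTS:
        vec[PRESET_OBJECTS.index(object_name)] = 1   # 1 means "yes" for that preset object
    return vec

print(one_hot("parking_space_line"))  # [1 0 0]
print(one_hot("background"))          # [0 0 0]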
After the training data required by the training image segmentation model is obtained based on the mode, model training can be performed by taking the sample aerial view image as the input of the model and taking the marked image as the target output of the model, so that the trained image segmentation model is obtained.
In an exemplary embodiment, in a single training iteration, a sample aerial view image is input to the model used in the current iteration to obtain the output of that model; then, a loss function is computed from the output and the labeled image corresponding to the input sample aerial view image, the model used in the current iteration is updated with the loss value, and the updated model is used in the next iteration. The above steps are repeated until the condition for stopping training is met, and the resulting model is taken as the trained image segmentation model.
For example, the model training described above may use a neural network model. For another example, the model loss function may be a cross-entropy loss function. For another example, the condition for stopping training may include, but is not limited to, any of the following: the number of training iterations reaches a preset number, the training time reaches a preset duration, or the calculated loss falls below a preset loss value.
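A minimal sketch of this training procedure is given below, assuming a PyTorch-style segmentation network and a data loader yielding pairs of sample aerial view images and per-pixel class indices derived from the N-dimensional annotation vectors (with an extra background index for all-zero vectors); the hyper-parameters and names are illustrative assumptions only.

import torch
import torch.nn as nn

def train_segmentation_model(model, loader, num_epochs=20, lr=1e-3, loss_threshold=0.05):
    # model: any network mapping (B, 3, H, W) images to (B, C, H, W) per-class logits
    # loader: yields (sample_bev, label_map), label_map being a (B, H, W) long tensor of class indices
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()            # cross-entropy loss, as mentioned above
    for epoch in range(num_epochs):              # stop condition: preset number of iterations
        for sample_bev, label_map in loader:
            logits = model(sample_bev)           # forward pass of the current model
            loss = criterion(logits, label_map)  # compare output with the labeled image
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                     # update the model with the loss result
        if loss.item() < loss_threshold:         # alternative stop condition: loss below preset value
            break
    return model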
Based on the trained image segmentation model, the first aerial view image can be input into the image segmentation model to obtain the segmented image output by the image segmentation model.
Wherein the segmented image may be used to indicate the image area of the first aerial view image corresponding to each preset object. In the segmented image output by the image segmentation model, each pixel point may correspond to a multidimensional vector whose elements are in one-to-one correspondence with the preset objects; each element value represents the probability that the pixel belongs to the corresponding preset object, and the preset object with the maximum probability value is the preset object to which the pixel belongs (i.e., the preset object corresponding to the pixel). Based on this idea, the preset object corresponding to each pixel in the segmented image can be determined, and the image area corresponding to each preset object can be determined from the pixels corresponding to the same preset object.
Based on the image area corresponding to each preset object indicated in the divided image, the image area corresponding to the target object, that is, the target image area, can be determined.
After the target image area is determined, the position of the target image area in the segmented image can be determined. Since the first aerial view image and the segmented image have the same size, the corresponding image area can be extracted from the corresponding position of the first aerial view image according to the determined area position to serve as the first local image, that is, the image content (i.e., the pixels) corresponding to the target object in the first aerial view image is extracted.
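A minimal sketch of this extraction is given below, assuming the segmentation model outputs a per-pixel probability map aligned with the first aerial view image; the per-pixel argmax and masking shown are one possible realization, and the names are assumptions.

import numpy as np

def extract_partial_image(first_bev, prob_map, target_class):
    """Extract the image content of the target object from the first bird's-eye view image.

    first_bev: (H, W, 3) first aerial view image.
    prob_map: (N, H, W) per-pixel probabilities over the N preset objects.
    target_class: index of the target object (e.g. the parking space line channel, assumed).
    """
    labels = prob_map.argmax(axis=0)             # preset object with maximum probability per pixel
    target_mask = (labels == target_class)       # target image area in the segmented image
    first_partial = np.zeros_like(first_bev)
    first_partial[target_mask] = first_bev[target_mask]   # same size, so positions map directly
    return first_partial, target_mask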
In step 13, the first partial image is added to the second aerial image corresponding to the target scene, resulting in a target aerial image corresponding to the target scene and comprising the target object.
In one possible implementation, the first partial image may be added to a specified location in the second aerial image. For example, the specified position may be preset according to actual requirements. For another example, the specified location may be determined manually.
In another possible embodiment, step 13 may comprise the steps of:
determining an image area with a position association relation with the target object in the second aerial view image as a first association area;
the first partial image is added to the first association region.
In general, a target object will only appear in an area having a positional association relationship with it. For example, a wheel stop may appear in a parking area but not in a driving lane. Based on this, a set of rules may be set in advance to indicate the area having a positional association relationship with the target object. Thus, when the first partial image is added to the second bird's-eye view image, the first association region having the positional association relation with the target object may first be determined in the second bird's-eye view image, and then the first partial image may be added to the first association region. In this way, the first partial image corresponding to the target object can be prevented from being added to an unsuitable position, improving the authenticity and accuracy of the target aerial view image.
The region division of the second aerial view image may be performed manually or by an image segmentation technique.
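A minimal sketch of this rule-based placement is given below; the rule table, the region masks of the second aerial view image, and the function names are assumptions. For simplicity, the sketch keeps the object at its original bird's-eye-view coordinates and only pastes pixels that fall inside the association region; in practice the partial image could also be translated into that region.

import numpy as np

# Illustrative rule set: which semantic regions a target object may appear in (assumption).
POSITION_RULES = {
    "parking_space_line": {"parking_area"},
    "wheel_stop": {"parking_area"},
    "speed_bump": {"driving_lane"},
    "zebra_crossing": {"driving_lane", "intersection"},
}

def add_partial_to_bev(second_bev, region_masks, first_partial, target_mask, target_object):
    """Paste the target-object pixels into an area positionally associated with the object.

    region_masks: dict mapping region name -> boolean mask of the second aerial view image.
    first_partial, target_mask: outputs of extract_partial_image (same size as the BEV images).
    """
    allowed = POSITION_RULES[target_object]
    first_assoc = np.zeros(second_bev.shape[:2], dtype=bool)
    for name in allowed:                       # first association region = union of allowed regions
        first_assoc |= region_masks[name]
    paste_mask = target_mask & first_assoc     # only paste where the association region permits
    target_bev = second_bev.copy()
    target_bev[paste_mask] = first_partial[paste_mask]
    return target_bev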
According to the technical scheme, after the first aerial view image in the target difficult sample is acquired, the first local image corresponding to the target object in the first aerial view image is determined, and the first local image is added to the second aerial view image corresponding to the target scene, so that the target aerial view image corresponding to the target scene and containing the target object is obtained. Thus, based on the acquired target difficult sample, the image content corresponding to the target object, namely the first local image, can be determined from it, which amounts to obtaining difficult data of the target object; then, the first local image is added to the second aerial view image corresponding to the target scene, and the resulting target aerial view image corresponds to the target scene while also containing the target object, which amounts to forming a new difficult sample corresponding to the target scene. Therefore, images corresponding to different objects can be generated for different scenes as new difficult samples, real-vehicle collection is not needed, and the efficiency of obtaining difficult samples is improved.
Optionally, on the basis of the steps shown in fig. 1, the method provided by the present disclosure may further include the following steps:
acquiring a first acquired image corresponding to the first aerial view image;
determining a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image;
acquiring a second acquired image corresponding to the second aerial view image;
and adding the second local image to the second acquired image to obtain a target acquired image which corresponds to the target scene and contains the target object.
As described above, the first bird's-eye view image is a stitched image generated based on the acquired images, and therefore, the acquired image corresponding to the first bird's-eye view image, that is, the first acquired image can be directly acquired.
Since the first bird's-eye view image is a stitched image generated based on the first acquired image, mutually corresponding pixels in the two images have a coordinate mapping relationship; based on this coordinate mapping relationship, the pixel in the first acquired image corresponding to each pixel in the first bird's-eye view image can be located. Further, based on the coordinate mapping relationship, the second partial image corresponding to the first partial image can be determined in the first acquired image, which corresponds to the image content of the target object in the first acquired image. For example, the coordinate mapping relationship may be obtained from the intrinsic and extrinsic calibration of the camera that captures the first acquired image.
After the second partial image is determined, the second partial image may be extracted from the first acquired image, i.e., the image content (i.e., pixels) of the target object in the first acquired image may be extracted.
The second bird's-eye view image is a stitched image generated based on the second acquired image. Thus, by adding the second partial image to the second acquired image, a target acquired image corresponding to the target scene and containing the target object can be obtained.
In one possible implementation, the second partial image may be added to a specified location in the second acquired image. For example, the specified position may be preset according to actual requirements. For another example, the specified location may be determined manually.
In another possible embodiment, adding the second partial image to the second acquired image may comprise the steps of:
determining an image area with a position association relation with the target object in the second acquired image as a second association area;
the second partial image is added to the second associated region.
As described above, the target object may appear only in an area having a positional association with it. Based on this, a set of rules may be set in advance to indicate the area having a positional association relationship with the target object. Thus, when the second partial image is added to the second acquired image, the second association region having the positional association with the target object may first be determined in the second acquired image, and then the second partial image may be added to the second association region. In this way, the second partial image corresponding to the target object can be prevented from being added to an unsuitable position, improving the authenticity and accuracy of the target acquired image.
The region division in the second acquired image can be performed manually or by an image segmentation technique.
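The mapping back from the bird's-eye view to the acquired (camera) image, and the subsequent pasting, can be sketched as follows. It is assumed for illustration that the stitching was done with a per-camera ground-plane homography, so that its inverse provides the coordinate mapping relationship; the function and variable names are assumptions.

import cv2
import numpy as np

def bev_mask_to_camera(bev_mask, H_cam_to_bev, cam_size):
    """Map the target-object mask from the bird's-eye view to the acquired image.

    H_cam_to_bev: 3x3 homography used when stitching (camera plane -> BEV plane),
    assumed to come from the camera's intrinsic/extrinsic calibration.
    cam_size: (width, height) of the acquired image.
    """
    H_bev_to_cam = np.linalg.inv(H_cam_to_bev)           # coordinate mapping relationship
    mask_u8 = bev_mask.astype(np.uint8) * 255
    cam_mask = cv2.warpPerspective(mask_u8, H_bev_to_cam, cam_size)
    return cam_mask > 127

def build_target_acquired_image(first_acquired, second_acquired, cam_mask, assoc_mask):
    """Extract the second partial image and add it to the second acquired image."""
    paste_mask = cam_mask & assoc_mask                    # restrict to the second association region
    target_acquired = second_acquired.copy()
    target_acquired[paste_mask] = first_acquired[paste_mask]
    return target_acquired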
It should be noted that there is no fixed order in which the first acquired image and the second acquired image must be obtained; they may be acquired simultaneously or sequentially, which is not limited in this disclosure.
According to the above scheme, based on the first partial image extracted from the first aerial view image, the image content corresponding to the target object in the first acquired image can be further located, extracted, and added to the second acquired image of the target scene, thereby achieving data augmentation of difficult samples for acquired images in the target scene.
Fig. 2 is a block diagram of an image processing apparatus according to an exemplary embodiment. As shown in fig. 2, the apparatus 20 includes:
a first acquisition module 21 configured to acquire a target difficult sample, the target difficult sample including a first bird's-eye image;
a first determining module 22 configured to determine a first partial image of the first bird's-eye image corresponding to a target object;
the first adding module 23 is configured to add the first local image to a second aerial image corresponding to a target scene, so as to obtain a target aerial image corresponding to the target scene and containing the target object.
Optionally, the first determining module 22 includes:
a segmentation sub-module, configured to input the first aerial view image into a pre-trained image segmentation model, to obtain a segmented image output by the image segmentation model, where the segmented image is used for indicating an image area corresponding to a preset object in the first aerial view image;
a first determination submodule configured to determine an image region corresponding to the target object in the segmented image as a target image region;
an extraction sub-module configured to extract an image area corresponding to the target image area from the first bird's-eye image as the first partial image.
Optionally, the image segmentation model is trained by the following modules:
the second acquisition module is configured to acquire training data, wherein the training data comprises a sample aerial view image and a labeling image corresponding to the sample aerial view image, and the labeling image is used for indicating preset objects corresponding to pixel points in the sample aerial view image;
and the training module is configured to perform model training by taking the sample aerial view image as an input of a model and taking the marked image as a target output of the model so as to obtain the trained image segmentation model.
Optionally, the first adding module 23 includes:
a second determination submodule configured to determine an image area having a position association relation with the target object in the second bird's-eye view image as a first association area;
a first adding sub-module configured to add the first partial image to the first association region.
Optionally, the apparatus 20 further comprises:
a third acquisition module configured to acquire a first acquired image corresponding to the first aerial image, the first aerial image being a stitched image generated based on the first acquired image;
the second determining module is configured to determine a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image;
a fourth acquisition module configured to acquire a second acquired image corresponding to the second bird's-eye view image, the second bird's-eye view image being a stitched image generated based on the second acquired image;
and a second adding module configured to add the second local image to the second acquired image to obtain a target acquired image corresponding to the target scene and containing the target object.
Optionally, the second adding module includes:
a third determining submodule configured to determine an image area with a position association relation with the target object in the second acquired image as a second association area;
a second adding sub-module configured to add the second partial image to the second association region.
Optionally, the target object is any one of the following:
a parking space line, a wheel stop, a speed bump and a zebra crossing.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
Referring to fig. 3, fig. 3 is a functional block diagram of a vehicle 600 according to an exemplary embodiment. The vehicle 600 may be configured in a fully or partially autonomous mode. For example, the vehicle 600 may obtain environmental information of its surroundings through the perception system 620 and derive an automatic driving strategy based on analysis of the surrounding environmental information to achieve full automatic driving, or present the analysis results to the user to achieve partial automatic driving.
The vehicle 600 may include various subsystems, such as an infotainment system 610, a perception system 620, a decision control system 630, a drive system 640, and a computing platform 650. Alternatively, vehicle 600 may include more or fewer subsystems, and each subsystem may include multiple components. In addition, each of the subsystems and components of vehicle 600 may be interconnected via wires or wirelessly.
In some embodiments, the infotainment system 610 may include a communication system 611, an entertainment system 612, and a navigation system 613.
The communication system 611 may comprise a wireless communication system, which may communicate wirelessly with one or more devices, either directly or via a communication network. For example, the wireless communication system may use 3G cellular communication such as CDMA, EVDO or GSM/GPRS, 4G cellular communication such as LTE, or 5G cellular communication. The wireless communication system may communicate with a wireless local area network (WLAN) using WiFi. In some embodiments, the wireless communication system may communicate directly with a device using an infrared link, Bluetooth, or ZigBee. Other wireless protocols may also be used, such as various vehicle communication systems; for example, the wireless communication system may include one or more dedicated short range communications (DSRC) devices, which may include public and/or private data communication between vehicles and/or roadside stations.
The entertainment system 612 may include a display device, a microphone, and a speaker. Based on the entertainment system, a user may listen to broadcasts and play music in the vehicle; alternatively, a mobile phone may communicate with the vehicle and mirror its screen onto the display device. The display device may be a touch screen, and the user may operate it by touching the screen.
In some cases, the user's voice signal may be acquired through the microphone, and certain controls of the vehicle 600 by the user may be implemented based on analysis of the voice signal, such as adjusting the temperature inside the vehicle. In other cases, music may be played to the user through the speaker.
The navigation system 613 may include a map service provided by a map provider to provide navigation of a travel route for the vehicle 600, and the navigation system 613 may be used with the global positioning system 621 and the inertial measurement unit 622 of the vehicle. The map service provided by the map provider may be a two-dimensional map or a high-precision map.
The perception system 620 may include several types of sensors that sense information about the environment surrounding the vehicle 600. For example, sensing system 620 may include a global positioning system 621 (which may be a GPS system, or may be a beidou system, or other positioning system), an inertial measurement unit (inertial measurement unit, IMU) 622, a lidar 623, a millimeter wave radar 624, an ultrasonic radar 625, and a camera 626. The sensing system 620 may also include sensors (e.g., in-vehicle air quality monitors, fuel gauges, oil temperature gauges, etc.) of the internal systems of the monitored vehicle 600. Sensor data from one or more of these sensors may be used to detect objects and their corresponding characteristics (location, shape, direction, speed, etc.). Such detection and identification is a critical function of the safe operation of the vehicle 600.
The global positioning system 621 is used to estimate the geographic location of the vehicle 600.
The inertial measurement unit 622 is configured to sense a change in the pose of the vehicle 600 based on inertial acceleration. In some embodiments, inertial measurement unit 622 may be a combination of an accelerometer and a gyroscope.
The lidar 623 uses a laser to sense objects in the environment in which the vehicle 600 is located. In some embodiments, lidar 623 may include one or more laser sources, a laser scanner, and one or more detectors, among other system components.
The millimeter-wave radar 624 utilizes radio signals to sense objects within the surrounding environment of the vehicle 600. In some embodiments, millimeter-wave radar 624 may be used to sense the speed and/or heading of an object in addition to sensing the object.
The ultrasonic radar 625 may utilize ultrasonic signals to sense objects around the vehicle 600.
The image pickup device 626 is used to capture image information of the surrounding environment of the vehicle 600. The image capturing device 626 may include a monocular camera, a binocular camera, a structured light camera, a panoramic camera, etc., and the image information acquired by the image capturing device 626 may include still images or video stream information.
The decision control system 630 includes a computing system 631 that makes analysis decisions based on information acquired by the perception system 620, and the decision control system 630 also includes a vehicle controller 632 that controls the powertrain of the vehicle 600, as well as a steering system 633, throttle 634, and braking system 635 for controlling the vehicle 600.
The computing system 631 may be operable to process and analyze the various information acquired by the perception system 620 in order to identify targets, objects, and/or features in the environment surrounding the vehicle 600. The targets may include pedestrians or animals and the objects and/or features may include traffic signals, road boundaries, and obstacles. The computing system 631 may use object recognition algorithms, in-motion restoration structure (Structure from Motion, SFM) algorithms, video tracking, and the like. In some embodiments, the computing system 631 may be used to map the environment, track objects, estimate the speed of objects, and so forth. The computing system 631 may analyze the acquired various information and derive control strategies for the vehicle.
The vehicle controller 632 may be configured to coordinate control of the power battery and the engine 641 of the vehicle to enhance the power performance of the vehicle 600.
Steering system 633 is operable to adjust the direction of travel of vehicle 600. For example, in one embodiment it may be a steering wheel system.
Throttle 634 is used to control the operating speed of engine 641 and thereby the speed of vehicle 600.
The braking system 635 is used to control deceleration of the vehicle 600. The braking system 635 may use friction to slow the wheels 644. In some embodiments, the braking system 635 may convert kinetic energy of the wheels 644 into electrical current. The braking system 635 may take other forms to slow the rotational speed of the wheels 644 to control the speed of the vehicle 600.
The drive system 640 may include components that provide powered movement of the vehicle 600. In one embodiment, the drive system 640 may include an engine 641, an energy source 642, a transmission 643, and wheels 644. The engine 641 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations, such as a hybrid engine of a gasoline engine and an electric motor, or a hybrid engine of an internal combustion engine and an air compression engine. The engine 641 converts the energy source 642 into mechanical energy.
Examples of energy sources 642 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity. The energy source 642 may also provide energy to other systems of the vehicle 600.
The transmission 643 may transfer mechanical power from the engine 641 to wheels 644. The transmission 643 may include a gearbox, a differential, and a driveshaft. In one embodiment, the transmission 643 may also include other devices, such as a clutch. Wherein the drive shaft may include one or more axles that may be coupled to one or more wheels 644.
Some or all of the functions of the vehicle 600 are controlled by the computing platform 650. The computing platform 650 may include at least one processor 651, and the processor 651 may execute instructions 653 stored in a non-transitory computer-readable medium, such as memory 652. In some embodiments, computing platform 650 may also be a plurality of computing devices that control individual components or subsystems of vehicle 600 in a distributed manner.
The processor 651 may be any conventional processor, such as a commercially available CPU. Alternatively, the processor 651 may also include, for example, a graphics processing unit (GPU), a field programmable gate array (FPGA), a system on chip (SOC), an application specific integrated circuit (ASIC), or a combination thereof. Although FIG. 3 functionally illustrates the processor, memory, and other elements of a computer in the same block, it will be understood by those of ordinary skill in the art that the processor, computer, or memory may in fact comprise multiple processors, computers, or memories that may or may not be stored within the same physical housing. For example, the memory may be a hard disk drive or other storage medium located in a housing different from that of the computer. Thus, references to a processor or computer will be understood to include references to a collection of processors, computers, or memories that may or may not operate in parallel. Rather than using a single processor to perform the steps described herein, some components, such as the steering component and the deceleration component, may each have their own processor that performs only calculations related to the component-specific functions.
In the presently disclosed embodiments, the processor 651 may perform the image processing methods described above.
In various aspects described herein, the processor 651 can be located remotely from and in wireless communication with the vehicle. In other aspects, some of the processes described herein are performed on a processor disposed within the vehicle and others are performed by a remote processor, including taking the necessary steps to perform a single maneuver.
In some embodiments, memory 652 may contain instructions 653 (e.g., program logic), which instructions 653 may be executed by processor 651 to perform various functions of vehicle 600. Memory 652 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of infotainment system 610, perception system 620, decision control system 630, drive system 640.
In addition to instructions 653, memory 652 may store data such as road maps, route information, vehicle location, direction, speed, and other such vehicle data, as well as other information. Such information may be used by the vehicle 600 and the computing platform 650 during operation of the vehicle 600 in autonomous, semi-autonomous, and/or manual modes.
The computing platform 650 may control the functions of the vehicle 600 based on inputs received from various subsystems (e.g., the drive system 640, the perception system 620, and the decision control system 630). For example, computing platform 650 may utilize input from decision control system 630 in order to control steering system 633 to avoid obstacles detected by perception system 620. In some embodiments, computing platform 650 is operable to provide control over many aspects of vehicle 600 and its subsystems.
Alternatively, one or more of these components may be mounted separately from or associated with vehicle 600. For example, the memory 652 may exist partially or completely separate from the vehicle 600. The above components may be communicatively coupled together in a wired and/or wireless manner.
Alternatively, the above components are only an example, and in practical applications, components in the above modules may be added or deleted according to actual needs, and fig. 3 should not be construed as limiting the embodiments of the present disclosure.
An autonomous car traveling on a road, such as the vehicle 600 above, may identify objects within its surrounding environment to determine adjustments to the current speed. The object may be another vehicle, a traffic control device, or another type of object. In some examples, each identified object may be considered independently and based on its respective characteristics, such as its current speed, acceleration, spacing from the vehicle, etc., may be used to determine the speed at which the autonomous car is to adjust.
Alternatively, the vehicle 600 or a sensing and computing device associated with the vehicle 600 (e.g., computing system 631, computing platform 650) may predict the behavior of the identified object based on the characteristics of the identified object and the state of the surrounding environment (e.g., traffic, rain, ice on a road, etc.). Alternatively, each identified object depends on each other's behavior, so all of the identified objects can also be considered together to predict the behavior of a single identified object. The vehicle 600 is able to adjust its speed based on the predicted behavior of the identified object. In other words, the autonomous car is able to determine what steady state the vehicle will need to adjust to (e.g., accelerate, decelerate, or stop) based on the predicted behavior of the object. In this process, other factors may also be considered to determine the speed of the vehicle 600, such as the lateral position of the vehicle 600 in the road on which it is traveling, the curvature of the road, the proximity of static and dynamic objects, and so forth.
In addition to providing instructions to adjust the speed of the autonomous vehicle, the computing device may also provide instructions to modify the steering angle of the vehicle 600 so that the autonomous vehicle follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects in the vicinity of the autonomous vehicle (e.g., vehicles in adjacent lanes on a roadway).
The vehicle 600 may be various types of traveling tools, such as a car, a truck, a motorcycle, a bus, a ship, an airplane, a helicopter, a recreational vehicle, a train, etc., and embodiments of the present disclosure are not particularly limited.
The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the image processing method provided by the present disclosure.
The apparatus may be a stand-alone electronic device or may be part of a stand-alone electronic device; for example, in one embodiment, the apparatus may be an integrated circuit (IC) or a chip, where the integrated circuit may be one IC or a collection of ICs. The chip may include, but is not limited to, the following: a GPU (graphics processing unit), a CPU (central processing unit), an FPGA (field programmable gate array), a DSP (digital signal processor), an ASIC (application specific integrated circuit), an SOC (system on chip), etc. The integrated circuit or chip described above may be used to execute executable instructions (or code) to implement the image processing method described above. The executable instructions may be stored on the integrated circuit or chip or may be retrieved from another device or apparatus; for example, the integrated circuit or chip may include a processor, a memory, and an interface for communicating with other devices. The executable instructions may be stored in the memory and, when executed by the processor, implement the image processing method described above; alternatively, the integrated circuit or chip may receive executable instructions through the interface and transmit them to the processor for execution to implement the image processing method described above.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the above-mentioned image processing method when being executed by the programmable apparatus.
Fig. 4 is a block diagram of an image processing apparatus 1900 according to an exemplary embodiment. For example, the apparatus 1900 may be provided as a server. Referring to fig. 4, the apparatus 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that are executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the image processing methods described above.
The apparatus 1900 may further comprise a power component 1926 configured to perform power management of the apparatus 1900, a wired or wireless network interface 1950 configured to connect the apparatus 1900 to a network, and an input/output interface 1958. The apparatus 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any adaptations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (8)

1. An image processing method, the method comprising:
acquiring a target difficult sample, wherein the target difficult sample comprises a first aerial view image, and the first aerial view image is a stitched image generated based on a first acquired image;
determining a first local image corresponding to a target object in the first aerial view image;
determining an image area which has a position association relation with the target object in a second aerial view image as a first association area, and adding the first local image into the first association area to obtain a target aerial view image which corresponds to a target scene and contains the target object, wherein the second aerial view image is a stitched image generated based on a second acquired image;
wherein the method further comprises: determining a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image; and determining an image area in the second acquired image which has a position association relation with the target object as a second association area, and adding the second local image into the second association area to obtain a target acquired image which corresponds to the target scene and contains the target object.
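For illustration, the compositing and coordinate-mapping steps recited in claim 1 can be sketched as follows, assuming the target object is given as a binary mask over the first aerial view image, the first association area is given as a pixel offset in the second aerial view image, and the coordinate mapping relation between the aerial view image and the acquired image can be modelled as a 3x3 plane homography; the helper names and the use of OpenCV and NumPy are illustrative assumptions rather than part of the claimed method.

import cv2
import numpy as np

def paste_into_association_area(first_bev, object_mask, second_bev, area_offset):
    # Cut the first local image (the masked target-object pixels) out of the
    # first aerial view image and copy it into the first association area of
    # the second aerial view image; only object pixels are overwritten.
    # Assumes the association area lies fully inside second_bev.
    ys, xs = np.where(object_mask > 0)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    patch = first_bev[y0:y1, x0:x1]
    patch_mask = object_mask[y0:y1, x0:x1] > 0
    oy, ox = area_offset  # top-left corner of the first association area
    target = second_bev[oy:oy + (y1 - y0), ox:ox + (x1 - x0)]
    target[patch_mask] = patch[patch_mask]
    return second_bev

def map_local_image_to_acquired_image(object_mask, H_bev_to_cam):
    # Map the pixel coordinates of the first local image into the first
    # acquired image through the coordinate mapping relation (modelled here
    # as a homography) to locate the second local image.
    ys, xs = np.where(object_mask > 0)
    pts = np.stack([xs, ys], axis=1).astype(np.float32).reshape(-1, 1, 2)
    cam_pts = cv2.perspectiveTransform(pts, H_bev_to_cam)
    return cam_pts.reshape(-1, 2)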
2. The method of claim 1, wherein the determining a first local image corresponding to a target object in the first aerial view image comprises:
inputting the first aerial view image into a pre-trained image segmentation model to obtain a segmented image output by the image segmentation model, wherein the segmented image is used for indicating an image area corresponding to a preset object in the first aerial view image;
determining an image area corresponding to the target object in the segmented image as a target image area;
and extracting an image area corresponding to the target image area from the first aerial view image as the first local image.
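For illustration, the extraction steps of claim 2 can be sketched as follows, assuming the pre-trained image segmentation model is a PyTorch module returning per-pixel class logits and that the target object corresponds to a known class index; the model interface and the function name are assumptions made for this sketch only.

import numpy as np
import torch

def extract_first_local_image(first_bev, seg_model, target_class_id):
    # Run the segmentation model on the first aerial view image, keep the
    # pixels labelled as the target object (the target image area), and cut
    # them out as the first local image together with the binary mask.
    x = torch.from_numpy(first_bev).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        logits = seg_model(x)                       # (1, num_classes, H, W)
    labels = logits.argmax(dim=1)[0].cpu().numpy()  # segmented image: one class per pixel
    target_mask = labels == target_class_id         # target image area
    first_local = np.zeros_like(first_bev)
    first_local[target_mask] = first_bev[target_mask]
    return first_local, target_mask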
3. The method of claim 2, wherein the image segmentation model is trained by:
acquiring training data, wherein the training data comprises a sample aerial view image and a labeling image corresponding to the sample aerial view image, and the labeling image is used for indicating preset objects corresponding to pixel points in the sample aerial view image;
and performing model training by taking the sample aerial view image as the input of a model and taking the marked image as the target output of the model so as to obtain the trained image segmentation model.
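For illustration, the training procedure of claim 3 can be sketched as follows, assuming the sample aerial view images and annotation images are provided by a PyTorch data loader as float image tensors and integer label maps, and that per-pixel cross-entropy is used as the training objective; the loss choice, optimizer and hyper-parameters are assumptions, since the claim does not specify them.

import torch
import torch.nn as nn

def train_segmentation_model(seg_model, loader, num_epochs=10, lr=1e-3):
    # Model input: sample aerial view image; target output: annotation image
    # indicating the preset object of each pixel.
    optimizer = torch.optim.Adam(seg_model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()  # per-pixel classification loss
    seg_model.train()
    for _ in range(num_epochs):
        for sample_bev, annotation in loader:       # (N,3,H,W) float, (N,H,W) long
            logits = seg_model(sample_bev)          # (N,num_classes,H,W)
            loss = criterion(logits, annotation)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return seg_model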
4. A method according to any one of claims 1-3, wherein the target object is any one of the following:
a parking space line, a wheel block, a deceleration strip and a zebra crossing.
5. An image processing apparatus, characterized in that the apparatus comprises:
a first acquisition module configured to acquire a target difficult sample, the target difficult sample comprising a first bird's-eye view image that is a stitched image generated based on a first acquired image;
a first determination module configured to determine a first partial image corresponding to a target object in the first bird's-eye view image;
a first adding module configured to add the first partial image to a second bird's-eye view image corresponding to a target scene to obtain a target bird's-eye view image which corresponds to the target scene and contains the target object, the second bird's-eye view image being a stitched image generated based on a second acquired image;
the first adding module includes:
a second determination submodule configured to determine an image area in the second bird's-eye view image which has a position association relation with the target object as a first association area;
a first adding submodule configured to add the first partial image into the first association area;
the device is also for: determining a second local image corresponding to the first local image in the first acquired image according to the coordinate mapping relation between the first aerial view image and the first acquired image; and determining an image area which has a position association relation with the target object in the second acquired image, and adding the second local image into the second association area as a second association area to obtain a target acquired image which corresponds to the target scene and contains the target object.
6. A vehicle, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute instructions in the memory to implement the steps of the method of any of claims 1-4.
7. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method of any of claims 1-4.
8. A chip, comprising a processor and an interface; the processor is configured to read instructions to perform the method of any of claims 1-4.
CN202210837775.9A 2022-07-15 2022-07-15 Image processing method, device, vehicle, medium and chip Active CN115205311B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210837775.9A CN115205311B (en) 2022-07-15 2022-07-15 Image processing method, device, vehicle, medium and chip

Publications (2)

Publication Number Publication Date
CN115205311A CN115205311A (en) 2022-10-18
CN115205311B CN115205311B (en) 2024-04-05

Family

ID=83582213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210837775.9A Active CN115205311B (en) 2022-07-15 2022-07-15 Image processing method, device, vehicle, medium and chip

Country Status (1)

Country Link
CN (1) CN115205311B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115345321B (en) * 2022-10-19 2023-02-17 小米汽车科技有限公司 Data augmentation method, data augmentation device, electronic device, and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012257107A (en) * 2011-06-09 2012-12-27 Aisin Seiki Co Ltd Image generating device
CN103609100A (en) * 2011-06-09 2014-02-26 爱信精机株式会社 Image generation device
CN109255767A (en) * 2018-09-26 2019-01-22 北京字节跳动网络技术有限公司 Image processing method and device
CN110378201A (en) * 2019-06-05 2019-10-25 浙江零跑科技有限公司 A kind of hinged angle measuring method of multiple row vehicle based on side ring view fisheye camera input
CN111968133A (en) * 2020-07-31 2020-11-20 上海交通大学 Three-dimensional point cloud data example segmentation method and system in automatic driving scene
CN112464939A (en) * 2021-01-28 2021-03-09 知行汽车科技(苏州)有限公司 Data augmentation method, device and storage medium in target detection
CN113537085A (en) * 2021-07-20 2021-10-22 南京工程学院 Ship target detection method based on two-time transfer learning and data augmentation
CN113743434A (en) * 2020-05-29 2021-12-03 华为技术有限公司 Training method of target detection network, image augmentation method and device
CN114627438A (en) * 2020-11-26 2022-06-14 千寻位置网络有限公司 Target detection model generation method, target detection method, device and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8427536B2 (en) * 2009-11-19 2013-04-23 Qualcomm Incorporated Orientation determination of a mobile station using side and top view images
JP6062039B2 (en) * 2013-04-04 2017-01-18 株式会社Amatel Image processing system and image processing program

Also Published As

Publication number Publication date
CN115205311A (en) 2022-10-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant