CN118097098A - Image processing method and device, readable medium and electronic equipment - Google Patents

Image processing method and device, readable medium and electronic equipment

Info

Publication number
CN118097098A
Authority
CN
China
Prior art keywords
target
image
camera
nominal
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211457434.5A
Other languages
Chinese (zh)
Inventor
袁顺莹
何培玉
刘宇飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202211457434.5A
Publication of CN118097098A


Landscapes

  • Studio Devices (AREA)

Abstract

Embodiments of the present disclosure relate to an image processing method, an image processing device, a readable medium, and an electronic device. The method includes: acquiring a first target image captured by a target camera; performing viewing angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle; and inputting the second target image into a pre-generated image processing model to process the second target image, where the image processing model is generated by training in advance on sample images at the target viewing angle. In this way, the captured image can be transformed before image processing so that different camera viewing angles are unified to the target viewing angle, and the image processing model only needs to handle images at the target viewing angle. In the training stage, the model can be trained using only sample images at the target viewing angle, which reduces the workload of sample acquisition and improves the accuracy and reliability of the trained model.

Description

Image processing method and device, readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an image processing method, an image processing device, a readable medium, and an electronic apparatus.
Background
With the progress of robot technology, a robot can capture images of its environment with a camera and detect target objects in the environment, such as people, animals, plants, or other objects. For example, the captured environment image may be processed by a machine learning model to obtain information about the target objects in the image, such as their category information or location information.
However, in the related art, robots differ in camera mounting position and angle, so the captured images differ in viewing angle, and target detection on images of different viewing angles with a machine learning model suffers from low accuracy.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
According to a first aspect of embodiments of the present disclosure, there is provided an image processing method, the method including:
acquiring a first target image captured by a target camera;
performing viewing angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle; and
inputting the second target image into a pre-generated image processing model and processing the second target image, where the image processing model is generated by training in advance on sample images at the target viewing angle.
According to a second aspect of embodiments of the present disclosure, there is provided an image processing apparatus including:
an image acquisition module, configured to acquire a first target image captured by the target camera;
a preprocessing module, configured to perform viewing angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle; and
a processing module, configured to input the second target image into a pre-generated image processing model and process the second target image, where the image processing model is generated by training in advance on sample images at the target viewing angle.
According to a third aspect of embodiments of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processing device, implements the steps of the method of the first aspect of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided an electronic device, comprising:
a storage device having a computer program stored thereon; and
a processing device configured to execute the computer program in the storage device to implement the steps of the method of the first aspect of the present disclosure.
By adopting the above technical solution, a first target image captured by a target camera is acquired; viewing angle transformation is performed on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle; and the second target image is input into a pre-generated image processing model for processing, where the image processing model is generated by training in advance on sample images at the target viewing angle. In this way, the captured image can be transformed before image processing so that different camera viewing angles are unified to the target viewing angle, and the image processing model only needs to handle images at the target viewing angle. In the training stage, the model can be trained using only sample images at the target viewing angle, which reduces the workload of sample acquisition and improves the accuracy and reliability of the trained model.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale. In the drawings:
fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment.
Fig. 2 is a schematic diagram of a reference camera coordinate system, shown according to an exemplary embodiment.
Fig. 3 is a schematic diagram illustrating a preset virtual camera coordinate system according to an exemplary embodiment.
FIG. 4 is a schematic diagram of a nominal camera coordinate system, shown according to an exemplary embodiment.
Fig. 5 is a flowchart illustrating a method of acquiring camera external parameters according to an exemplary embodiment.
FIG. 6 is a schematic diagram illustrating a preset calibrated scale according to an exemplary embodiment.
Fig. 7 is a flowchart showing a step S503 according to the embodiment shown in fig. 5.
Fig. 8 is a block diagram of an image processing apparatus according to an exemplary embodiment.
Fig. 9 is a block diagram of another image processing apparatus according to an exemplary embodiment.
Fig. 10 is a block diagram of an electronic device, according to an example embodiment.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" and "a plurality" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that, unless the context clearly indicates otherwise, they should be understood as "one or more". In the description of the present disclosure, unless otherwise indicated, "a plurality" means two or more, and other quantifiers are similar; "at least one of" the listed items refers to any combination of those items, including any single item or any combination of multiple items. For example, "at least one of a" may represent any number of a; as another example, "one (or more) of a, b, and c" may represent: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c may each be single or plural. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may represent: A alone, both A and B, or B alone, where A and B may be singular or plural.
Although operations or steps are described in a particular order in the figures of the disclosed embodiments, this should not be understood as requiring that such operations or steps be performed in the particular order shown or in sequential order, or that all illustrated operations or steps be performed, to achieve desirable results. In embodiments of the present disclosure, these operations or steps may be performed serially, they may be performed in parallel, or only some of them may be performed.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, scope of use, usage scenarios, etc. of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly remind the user that the requested operation will require obtaining and using the user's personal information. The user can then autonomously decide, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application program, server, or storage medium, that performs the operations of the technical solution of the present disclosure.
As an optional but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user, for example, in the form of a pop-up window, in which the prompt information may be presented as text. In addition, the pop-up window may carry a selection control for the user to choose "agree" or "disagree" to providing personal information to the electronic device.
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
Meanwhile, it can be understood that the data (including but not limited to the data itself, the acquisition or the use of the data) related to the technical scheme should conform to the requirements of the corresponding laws and regulations and related regulations.
The present disclosure is described below in connection with specific embodiments.
First, an application scenario of the present disclosure is described. Embodiments of the present disclosure may be applied to image processing scenarios such as object detection or instance segmentation. A smart device, typified by a robot, may be equipped with one or more cameras, and the images captured by the cameras may be processed by a machine learning model, for example for target detection or instance segmentation. Because the cameras on the smart device differ in mounting position and angle, the captured images differ in viewing angle. To process images of different viewing angles, sample images of various viewing angles would need to be obtained in the training stage of the image processing model; the workload of sample acquisition and annotation is huge, the effective sample size is small, the training effect is poor, and the trained model has low accuracy in image processing.
In order to solve the above problems, the present disclosure provides an image processing method, an apparatus, a readable medium, and an electronic device.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment. The method may be applied to a processor in a smart device, which may include an intelligent robot, an autonomous vehicle, a smart terminal, a smart wearable device, a smart speaker, etc., which is not limited in this disclosure. As shown in fig. 1, the method may include:
s101, acquiring a first target image shot by a target camera.
The target camera may be mounted on the smart device, for example at a preset position on an intelligent robot, or, as another example, at a preset position on an autonomous vehicle.
S102, performing view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under the target view angle.
In some embodiments, the target viewing angle may be a predetermined viewing angle, such as a front viewing angle; the second target image may be referred to as a forward looking virtual image.
In some embodiments, the camera parameters may include camera intrinsic parameters and camera extrinsic parameters. The camera intrinsic parameters are properties of the target camera itself; they are determined only by the target camera and do not change with the external environment. For example, the camera intrinsic parameters may include one or more of the camera focal length, the principal point position, the skew coefficient, and the distortion parameters, where the distortion parameters may be further divided into radial distortion parameters and tangential distortion parameters. The camera extrinsic parameters characterize the camera with respect to the real world (a three-dimensional spatial coordinate system, the real scene) and may be used for the conversion between the real-world coordinate system, which may be a coordinate system based on the target viewing angle, and the image coordinate system. For example, the camera extrinsic parameters may include a rotation matrix and/or a translation matrix.
The camera internal parameter and the camera external parameter may be preset parameters or parameters determined by calibration.
In some examples, the first target image may include a plurality of target pixels, and in this step the second target image at the target viewing angle may be determined in the following manner.
First, for each target pixel of a first target image, a first pixel coordinate of the target pixel in the first target image is acquired.
And secondly, determining a second pixel coordinate corresponding to each target pixel according to the camera internal parameter, the camera external parameter and the first pixel coordinate.
The second pixel coordinates can be calculated, for example, by the following formula (1):

p2 = K · R · K^(-1) · p1    (1)

where p2 denotes the second pixel coordinate of the target pixel (in homogeneous form), K denotes the camera intrinsic matrix of the target camera, R denotes the camera extrinsic rotation matrix, and p1 denotes the first pixel coordinate of the target pixel.
Thus, the second pixel coordinates can be calculated by the formula (1).
And finally, performing view angle transformation on the first target image according to the second pixel coordinates to obtain a second target image under the target view angle.
For example, each target pixel may be mapped to a location corresponding to a second pixel coordinate, resulting in the second target image.
Thus, the viewing angle transformation can be performed in the above-described manner.
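As an illustration of the per-pixel mapping described above, the following is a minimal sketch that applies formula (1) as a single homography warp. The intrinsic matrix, the rotation, and the file name are placeholder assumptions rather than values from this disclosure, and OpenCV's warpPerspective is used only as one possible way to carry out the mapping.

import cv2
import numpy as np

def warp_to_target_view(first_target_image, K, R):
    # Homography implied by formula (1): p2 = K * R * inv(K) * p1.
    H = K @ R @ np.linalg.inv(K)
    h, w = first_target_image.shape[:2]
    # warpPerspective resamples the image so that each output pixel is taken
    # from its corresponding source location.
    return cv2.warpPerspective(first_target_image, H, (w, h))

# Placeholder intrinsics and rotation, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R, _ = cv2.Rodrigues(np.array([np.deg2rad(-15.0), 0.0, 0.0]))  # assumed 15-degree tilt correction
first_target_image = cv2.imread("first_target_image.png")
second_target_image = warp_to_target_view(first_target_image, K, R)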
In other embodiments, the camera parameters may include a perspective transformation matrix, which may be pre-calibrated, for example from the camera intrinsic parameters, the camera extrinsic parameters, and the target viewing angle. The target viewing angle may be a preset reference viewing angle; for example, the smart device on which the target camera is installed may be calibrated, and a viewing angle in a preset direction of the smart device (for example, the forward-looking or backward-looking direction) may be taken as the target viewing angle.
S103, inputting the second target image into a pre-generated image processing model, and processing the second target image.
The image processing model may be a model generated by training in advance according to a sample image at a target view angle.
The image processing model may include an object detection model and/or an instance segmentation model.
In some embodiments, the image processing model may include a target detection model that takes an image (e.g., the second target image or the first target image described above) as input and outputs target object information in the image; for example, one or more target objects of interest may be separated from the input image, and category information and location information may be obtained for each target object.
For example, a second target image may be input into the target detection model to obtain target object information output by the target detection model; the target object information includes classification information of the target object, and position information of the target object in the second target image.
It should be noted that the target detection model may be a detection model based on deep learning, and the target detection model based on deep learning may include a two-stage (two-stage) target detection model or a single-stage (single-stage) target detection model. Wherein:
A two-stage object detection model first determines one or more candidate regions (proposal regions) in the image where a target may be present, and then estimates the category information and location information of the target based on the features and locations of the candidate regions. Illustratively, the two-stage detection model may include Faster R-CNN (Region-based Convolutional Neural Network), Mask R-CNN, Cascade R-CNN, or the like.
A single-stage target detection model expresses the detection task as a unified end-to-end regression problem; no candidate-region (proposal) generation step is needed, and the category information and location information of the target are obtained directly from one pass over the image. Illustratively, the single-stage detection model may include YOLO v1-v5, FCOS (Fully Convolutional One-Stage Object Detection), CenterNet, and the like.
Through the target detection model, target detection can be performed on the second target image, and target object information is obtained.
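The disclosure does not tie the target detection model to any particular library. As a hedged illustration, the sketch below runs a torchvision Faster R-CNN (a two-stage detector) on the transformed image, assuming torchvision 0.13 or later; the pretrained weights stand in for a model trained on target-view sample images.

import torch
import torchvision

# A stand-in for the target detection model; any detector trained on
# target-view sample images could be substituted here.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(second_target_image_bgr):
    # Convert HxWx3 BGR uint8 (OpenCV layout) to the 3xHxW float tensor the model expects.
    rgb = second_target_image_bgr[:, :, ::-1].copy()
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        prediction = model([tensor])[0]
    # "labels" carries the category information, "boxes" the position information.
    return prediction["labels"], prediction["boxes"], prediction["scores"]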
By adopting the above method, a first target image captured by a target camera is acquired; viewing angle transformation is performed on the first target image according to camera parameters of the target camera to obtain a second target image at the target viewing angle; and the second target image is input into a pre-generated image processing model for processing, where the image processing model is generated by training in advance on sample images at the target viewing angle. In this way, the captured image can be transformed before image processing so that different camera viewing angles are unified to the target viewing angle, and the image processing model only needs to handle images at the target viewing angle. In the training stage, the model can be trained using only sample images at the target viewing angle, which reduces the workload of sample acquisition and improves the accuracy and reliability of the trained model.
In some embodiments of the present disclosure, the target view angle may be one of a plurality of preset view angles, and different preset view angles correspond to different preset virtual camera coordinate systems; the camera external parameters may include a target rotation matrix, which may be a pre-generated matrix for converting the first target image from camera coordinates to a preset virtual camera coordinate system corresponding to the target viewing angle. The preset viewing angle may include a front viewing angle, a rear viewing angle, a left viewing angle, and a right viewing angle, for example.
The preset virtual camera coordinate system may be a three-dimensional coordinate system including an x-axis, a y-axis, and a z-axis; the z-axes of the plurality of preset virtual camera coordinate systems may coincide, and their x-axes may be spaced at a preset angular interval. The preset angle may be any angle set in advance; for example, the preset angle may be 90 degrees or 180 degrees.
In some embodiments, in order to perform the viewing angle transformation on the first target image, camera coordinate systems corresponding to the target camera may be defined. Three kinds of camera coordinate systems are used: a reference camera coordinate system, a preset virtual camera coordinate system, and a nominal camera coordinate system, which are described below with reference to the accompanying drawings:
Fig. 2 is a schematic diagram of a reference camera coordinate system, shown according to an exemplary embodiment. As shown in fig. 2, the reference camera coordinate system may include the following parameters:
Coordinate origin position: may also be described by the height of the camera optical center above the ground. The mounting height of the target camera may be taken as the optical-center height of the camera, which may be the camera height given by the structural external parameters of the target camera.
z-axis direction: may also be referred to as the camera optical axis direction. A preset direction of the smart device may be taken as the z-axis direction, which is parallel to the horizontal plane. For example, in the case where the smart device is a robot, the robot's direction of motion may be taken as the z-axis direction, with the z-axis parallel to the horizontal plane.
y-axis direction: the direction pointing from the camera optical center vertically downward toward the horizontal plane may be taken as the y-axis direction.
x-axis direction: the x-axis direction may be determined from the z-axis and y-axis directions according to the right-hand rule; for example, the direction to the right of the robot's forward direction may be taken as the x-axis direction.
In some embodiments, the reference camera coordinate system may be used to determine the relationship between the camera and the ground. For example, the ground origin may be defined as the point in the current body frame located one camera mounting height below the camera along the y-axis, with the ground normal vector being [0, -1, 0].
In other embodiments, the preset virtual camera coordinate system and the nominal camera coordinate system described above may be transformed to the reference camera coordinate system by a pure rotation, so that their z-axes and x-axes are substantially aligned with the robot body coordinate system.
Further, since a camera on the smart device (such as a robot) can be mounted either horizontally or vertically, a preset virtual camera coordinate system can be defined to facilitate calibration of the camera coordinate system.
Fig. 3 is a schematic diagram illustrating a preset virtual camera coordinate system according to an exemplary embodiment. As shown in fig. 3, a preset virtual camera coordinate system may differ from the reference camera coordinate system by a pure rotation about the z-axis by a multiple of a preset angle. The preset angle may be, for example, 90 degrees.
As shown in fig. 3, four preset virtual camera coordinate systems may be defined, which are respectively:
the preset coordinate parameter Rz corresponding to the first virtual camera coordinate system (301) is 0 degree. For example, the reference camera coordinate system may be used as the first virtual camera coordinate system, and an image captured by a target camera based on the first virtual camera coordinate system may be input into an image processing model without being flipped to enter image processing (for example, tasks such as target object detection).
The preset coordinate parameter Rz corresponding to the second virtual camera coordinate system (302) is -90 degrees. For example, the reference camera coordinate system may be rotated clockwise by 90 degrees to obtain the second virtual camera coordinate system, and an image captured by the target camera based on the second virtual camera coordinate system may be rotated counterclockwise by 90 degrees and then input into the image processing model for image processing (for example, tasks such as target object detection).
The preset coordinate parameter Rz corresponding to the third virtual camera coordinate system (303) is 180 degrees. For example, the reference camera coordinate system may be rotated 180 degrees to obtain the third virtual camera coordinate system, and an image captured by the target camera based on the third virtual camera coordinate system may be rotated 180 degrees and then input into the image processing model for image processing (for example, tasks such as target object detection).
The preset coordinate parameter Rz corresponding to the fourth virtual camera coordinate system (304) is 90 degrees. For example, the reference camera coordinate system may be rotated counterclockwise by 90 degrees to obtain the fourth virtual camera coordinate system, and an image captured by the target camera based on the fourth virtual camera coordinate system may be rotated clockwise by 90 degrees and then input into the image processing model to enter into image processing (for example, tasks such as target object detection).
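As a small illustration of the four coordinate systems above, the sketch below maps each preset coordinate parameter Rz to the image rotation applied before the image enters the image processing model; the dictionary-based dispatch and NumPy rotation are assumptions for illustration, not part of the disclosure.

import numpy as np

# Rotation applied to a captured image before it is fed to the image processing
# model, keyed by the preset coordinate parameter Rz of the virtual camera
# coordinate system. np.rot90 with k=1 rotates counterclockwise, k=-1 clockwise.
ROTATION_BY_RZ = {
    0: lambda img: img,                   # first virtual camera coordinate system: no rotation
    -90: lambda img: np.rot90(img, k=1),  # second: rotate counterclockwise 90 degrees
    180: lambda img: np.rot90(img, k=2),  # third: rotate 180 degrees
    90: lambda img: np.rot90(img, k=-1),  # fourth: rotate clockwise 90 degrees
}

def prepare_for_model(image, rz_degrees):
    return ROTATION_BY_RZ[rz_degrees](image)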
In some embodiments, the preset virtual camera coordinate system may be used to determine whether the captured image requires a rotation operation.
In some embodiments, the preset virtual camera coordinate system may be used to determine the physical-world range that the target camera can observe.
In some embodiments, the preset virtual camera coordinate system may be used to correct the perspective of the target camera to the forward looking direction, reducing image distortion introduced by the mounting angle.
Further, the mounting angles of cameras on the smart device (e.g., a robot) differ. For example, a mounting angle may be specified in the structural specification of the smart device as an (acute) angle from the vertical, and this angle may act on the x-axis or the y-axis of the preset virtual camera coordinate system. The nominal camera coordinate system may therefore be determined from the preset virtual camera coordinate system and the mounting angle.
FIG. 4 is a schematic diagram of a nominal camera coordinate system according to an exemplary embodiment. As shown in fig. 4, a nominal camera coordinate system may be a coordinate system determined based on the above-described preset virtual camera coordinate system and the mounting angle. There may be a plurality of nominal camera coordinate systems; illustratively:
The first virtual camera coordinate system (301) may be rotated counterclockwise about the x-axis by an installation angle (which may be theta) to obtain a first nominal camera coordinate system (3011). Rx (theta) in the figure may characterize an angle of counterclockwise rotation theta about the x-axis. The nominal coordinate parameters corresponding to the first nominal camera coordinate system (3011) may include Rz and Rx, where the Rz may be a value (0 degrees) of a preset coordinate parameter of the first virtual camera coordinate system 301, and the Rx may be theta (installation angle).
The first virtual camera coordinate system (301) may be rotated clockwise about the x-axis by an installation angle to obtain a second nominal camera coordinate system (3012). Rx (-theta) in the figure may characterize a clockwise rotation angle theta about the x-axis. The nominal coordinate parameters corresponding to the second nominal camera coordinate system (3012) may include Rz and Rx, where the Rz may be a value (0 degrees) of a preset coordinate parameter of the first virtual camera coordinate system 301, and the Rx may be -theta (installation angle).
The second virtual camera coordinate system (302) may be rotated counterclockwise about the y-axis by an installation angle to obtain a third nominal camera coordinate system (3021). Ry (theta) in the figure may characterize an angle of counterclockwise rotation theta about the y-axis. The nominal coordinate parameters corresponding to the third nominal camera coordinate system (3021) may include Rz and Ry, where the Rz may be a value (-90 degrees) of the preset coordinate parameter of the second virtual camera coordinate system 302, and the Ry may be theta (installation angle).
The second virtual camera coordinate system (302) may be rotated clockwise about the y-axis by an installation angle to obtain a fourth nominal camera coordinate system (3022). Ry (-theta) in the figure may characterize a clockwise rotation angle theta about the y-axis. The nominal coordinate parameters corresponding to the fourth nominal camera coordinate system (3022) may include Rz and Ry, where the Rz may be a value (-90 degrees) of the preset coordinate parameter of the second virtual camera coordinate system 302, and the Ry may be -theta (installation angle).
The third virtual camera coordinate system (303) may be rotated clockwise about the x-axis by an installation angle to obtain a fifth nominal camera coordinate system (3031). Rx (-theta) in the figure may characterize a clockwise rotation angle theta about the x-axis. The nominal coordinate parameters corresponding to the fifth nominal camera coordinate system (3031) may include Rz and Rx, where the Rz may be the value (180 degrees) of the preset coordinate parameter of the third virtual camera coordinate system 303, and the Rx may be -theta (installation angle).
The third virtual camera coordinate system (303) may be rotated counterclockwise about the x-axis by an installation angle to obtain a sixth nominal camera coordinate system (3032). Rx (theta) in the figure may characterize an angle of counterclockwise rotation theta about the x-axis. The nominal coordinate parameters corresponding to the sixth nominal camera coordinate system (3032) may include Rz and Rx, where the Rz may be the value (180 degrees) of the preset coordinate parameter of the third virtual camera coordinate system 303, and the Rx may be theta (installation angle).
The fourth virtual camera coordinate system (304) may be rotated clockwise about the y-axis by an installation angle to obtain a seventh nominal camera coordinate system (3041). Ry (-theta) in the figure may characterize a clockwise rotation angle theta about the y-axis. The nominal coordinate parameters corresponding to the seventh nominal camera coordinate system (3041) may include Rz and Ry, where the Rz value may be the value (90 degrees) of the preset coordinate parameter of the fourth virtual camera coordinate system 304, and the Ry value may be -theta (installation angle).
The fourth virtual camera coordinate system (304) may be rotated counterclockwise about the y-axis by an installation angle to obtain an eighth nominal camera coordinate system (3042). Ry (theta) in the figure may characterize an angle of counterclockwise rotation theta about the y-axis. The nominal coordinate parameters corresponding to the eighth nominal camera coordinate system (3042) may include Rz and Ry, where the Rz may be a value (90 degrees) of a preset coordinate parameter of the fourth virtual camera coordinate system 304, and the Ry may be theta (installation angle).
The nominal camera coordinate system may include one or more of the first through eighth nominal camera coordinate systems described above.
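A minimal sketch of how the eight nominal camera coordinate systems above could be enumerated as rotation matrices follows; the composition order (Rz about the z-axis, then the tilt about x or y), the sign conventions, and the example mounting angle are assumptions for illustration only.

import numpy as np

def rot_x(deg):
    a = np.deg2rad(deg)
    return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])

def rot_y(deg):
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])

def rot_z(deg):
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])

def nominal_rotation(rz_deg, tilt_axis, tilt_deg):
    # Rotation from the reference camera coordinate system to a nominal camera
    # coordinate system: Rz about the z-axis, then the mounting tilt about x or y.
    tilt = rot_x(tilt_deg) if tilt_axis == "x" else rot_y(tilt_deg)
    return rot_z(rz_deg) @ tilt

theta = 30.0  # mounting angle from the vertical; the value is illustrative
NOMINAL_PARAMS = [  # the eight (Rz, axis, angle) combinations described above
    (0, "x", theta), (0, "x", -theta),
    (-90, "y", theta), (-90, "y", -theta),
    (180, "x", -theta), (180, "x", theta),
    (90, "y", -theta), (90, "y", theta),
]
rotations = [nominal_rotation(rz, ax, t) for rz, ax, t in NOMINAL_PARAMS]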
It should be noted that, in the case where the error of the structural external parameters of the smart device is controllable (for example, the error is less than or equal to a preset error threshold), the nominal camera coordinate system may be used as the actual camera coordinate system. According to the nominal camera coordinate system, the 3D image captured by the target camera can be mapped back to the preset virtual camera coordinate system or the reference camera coordinate system by a pure-rotation rigid transformation; alternatively, a pixel-level 2D mapping matrix can be computed from the pure-rotation rigid transformation to convert the 3D image pixel by pixel into the preset virtual camera coordinate system or the reference camera coordinate system.
In some embodiments of the present disclosure, the camera external parameter may be a preset parameter, for example, the camera external parameter may include a preset rotation matrix.
In other embodiments, the camera external parameters may be pre-calibrated parameters, for example, a target rotation matrix corresponding to the target camera may be determined according to the installation height and the installation angle of the target camera; the target rotation matrix is used as a camera external parameter.
The target camera can be installed on the intelligent device, the installation height can be used for representing the height of the installation position of the target camera on the intelligent device relative to the ground, and the installation angle can be used for representing the included angle between the target camera and the vertical direction.
Fig. 5 is a flowchart illustrating a method of determining camera parameters according to an exemplary embodiment. As shown in fig. 5, the method may include:
S501, determining a preset reference position and a preset calibration scale in a preset area.
The preset calibration scale is used for indicating the distance between a preset calibration position and a preset reference position in a preset area.
In some embodiments, the preset calibration scale may be a scale manually drawn on level ground. For example, the ground scale may be drawn on level ground so that the scale lines are parallel to each other. Following the perspective rule that near objects appear large and far objects appear small, the scale interval may be set to a smaller value at nearer distances and to a larger value at farther distances; that is, the scale lines are denser near the camera and sparser far from it. The 0-scale position of the preset calibration scale may be at 0 m, and the farthest scale line may be at the farthest distance that can be clearly distinguished in the image.
FIG. 6 is a schematic diagram illustrating a preset calibration scale according to an exemplary embodiment. The white line segments in fig. 6 are the preset calibration scale, which may include N scale lines in the preset area, numbered 1, 2, 3, ..., N, with the distance increasing from left to right. The spacing between adjacent scale lines gradually increases from left to right. Taking N as 10 as an example, the spacing between adjacent scale lines may be: a physical distance of 1 meter between scale line 1 and scale line 2, 1.5 meters between scale line 2 and scale line 3, 2 meters between scale line 3 and scale line 4, and so on, up to 5 meters between scale line 9 and scale line 10. The origin position corresponding to the preset calibration scale (i.e., the position at distance 0) may be used as the preset reference position, or a straight line passing through the origin position and parallel to the scale lines may be used as the preset reference position.
In other embodiments, the preset calibration scale may be calibrated automatically, for example determined by an automated line-segment detection scheme at the factory. The manually drawn scale is used here only to explain the principle; considering that the number of cameras on a mobile robot is manageable, if the number of cameras increases and calibration is needed on the factory side, the manually drawn scale can be replaced by an automated line-segment detection scheme.
S502, determining a nominal grid image corresponding to a plurality of nominal camera coordinate systems and nominal distances between each pixel in the nominal grid image and a preset reference position according to the mounting height and the mounting angle.
The nominal grid image may be an image including a predetermined area, and the nominal camera coordinate system may be a coordinate system obtained by rotating the installation angle around the x-axis or the y-axis based on the predetermined virtual camera coordinate system.
In this step, first, a reference grid image corresponding to a reference camera coordinate system and a reference distance between each pixel in the reference grid image and a preset reference position may be determined according to an installation height; the reference grid image is an image containing a preset area; and then, carrying out rigid transformation on the reference grid image according to the installation angle and the nominal coordinate parameters of the nominal camera coordinate system to obtain a nominal grid image and the nominal distance between each pixel in the nominal grid image and the preset reference position.
In some embodiments, the nominal grid image corresponding to the nominal camera coordinate system may be determined by:
First, a physical size parameter of a preset area in a reference camera coordinate system may be set.
By way of example, the preset region may be a region desired to be observable in the image under the reference camera coordinate system, and the physical size parameter may include at least one of:
The nearest forward distance z_min may be set to 0m.
The furthest forward distance z_max may be set to the furthest distance of the preset calibrated scale described above, for example 7m.
The forward grid interval z_interval may be any preset value, for example, may be set to 0.01m, i.e., 1cm for the forward grid.
The furthest distance x_min to the left of the reference camera may be any preset value, for example-8 m.
The furthest distance x_max to the right of the reference camera may be any preset value, for example 8m.
The lateral grid interval x_interval may be any preset value, for example, may be set to 0.01m, and the smaller the interval, the denser the grid.
And secondly, determining a reference grid image corresponding to a reference camera coordinate system according to the camera mounting height and the physical dimension parameters.
For example, a ground grid 3D coordinate map (base grids) may be generated under a reference camera coordinate system and the ground grid 3D coordinate map may be taken as the reference grid image.
The reference grid image comprises a plurality of pixels and a reference distance between each pixel and a preset reference position.
It should be noted that, the specific manner of generating the ground grid 3D coordinate graph may refer to the implementation in the related art, which is not described in detail in this disclosure.
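Since the disclosure defers the grid-generation details to the related art, the following is only a minimal sketch of one way to build the ground-grid 3D coordinate map under the reference camera coordinate system from the mounting height and the physical size parameters above; taking the forward z value as the reference distance is an assumption of this sketch.

import numpy as np

def make_reference_grid(mount_height,
                        z_min=0.0, z_max=7.0, z_interval=0.01,
                        x_min=-8.0, x_max=8.0, x_interval=0.01):
    # Ground-grid 3D coordinates (base grids) in the reference camera coordinate
    # system; the y-axis points straight down, so every ground point lies at
    # y = mount_height.
    xs = np.arange(x_min, x_max + x_interval, x_interval)
    zs = np.arange(z_min, z_max + z_interval, z_interval)
    x, z = np.meshgrid(xs, zs)
    y = np.full_like(x, mount_height)
    grid = np.stack([x, y, z], axis=-1)   # shape (len(zs), len(xs), 3)
    # Reference distance of each grid point from the preset reference position
    # (the 0-scale line below the camera), taken here as the forward distance z.
    ref_distance = z.copy()
    return grid, ref_distance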
And determining nominal coordinate parameters of a nominal camera coordinate system according to the installation angle, and performing rigid transformation on the reference grid image according to the nominal coordinate parameters to obtain a nominal grid image and a nominal distance between each pixel in the nominal grid image and a preset reference position.
The nominal coordinate parameters may include, for example, rz, which is a preset coordinate parameter of a preset virtual camera coordinate system from which the nominal camera coordinate system is derived, and may further include any one of Rx or Ry, which may be determined according to the above-described mounting angle. The different nominal camera coordinate systems may correspond to different nominal coordinate parameters, and may be specifically referred to the description in the foregoing embodiments of the disclosure, which is not repeated herein.
The pure rotation transformation matrix R_base^cam between the reference camera coordinate system and the nominal camera coordinate system can be calculated from the nominal coordinate parameters (Rz and Rx/Ry). According to R_base^cam, the reference grid image is subjected to rigid transformation to obtain the nominal grid image corresponding to the nominal camera coordinate system. Illustratively, the nominal grid image may be calculated by the following equation (2):

P_cam = R_base^cam · P_base    (2)

where P_cam denotes the nominal grid image, R_base^cam denotes the pure rotation transformation matrix between the reference camera coordinate system and the nominal camera coordinate system, and P_base denotes the reference grid image.
In some embodiments, the ground grid coordinates in the nominal camera coordinate system may also be mapped to the image coordinate system according to the camera parameters, resulting in a nominal image in the image coordinate system.
For example, the mapping can be performed by the following formula (3):

Z · p = K · P    (3)

where p denotes the pixel coordinate of the target pixel in the image coordinate system (in homogeneous form), Z denotes the z-axis coordinate value of the target pixel in the nominal camera coordinate system, K denotes the camera intrinsic matrix, and P denotes the ground grid coordinate of the target pixel in the nominal camera coordinate system.
It should be noted that an image formed from the projected grid points alone contains a large number of holes. In some embodiments, therefore, values may be filled in around the valid pixels corresponding to the grid points to obtain a final dense ideal ground-grid projection map, where the value of the invalid region may be set to 0.
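A minimal sketch of projecting a nominal grid into the image plane per formula (3) and rasterising it into a per-pixel distance map follows; using the forward z coordinate as the nominal distance and marking invalid pixels with 0 are assumptions, and the hole filling around valid pixels mentioned above is omitted for brevity.

import numpy as np

def project_nominal_grid(grid_cam, K, image_size):
    # grid_cam: (..., 3) grid points in the nominal camera coordinate system.
    h, w = image_size
    dist_map = np.zeros((h, w), dtype=np.float32)    # 0 marks the invalid region
    pts = grid_cam.reshape(-1, 3)
    pts = pts[pts[:, 2] > 1e-6]                      # keep points in front of the camera
    uvw = (K @ pts.T).T                              # Z * p = K * P, formula (3)
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    dist_map[v[inside], u[inside]] = pts[inside, 2]  # nominal distance of each valid pixel
    return dist_map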
In the above manner, all combinations of the preset virtual camera coordinate systems and the corresponding mounting angles (e.g., angles from the vertical) can be enumerated, and the nominal grid image corresponding to each nominal camera coordinate system can be generated.
S503, determining a target nominal camera coordinate system corresponding to the target visual angle from a plurality of nominal camera coordinate systems according to the nominal distance.
In some embodiments, the target nominal camera coordinate system may be determined according to the nominal distances and a calibration image captured by the target camera.
Fig. 7 is a flowchart of step S503 according to the embodiment shown in fig. 5. As shown in fig. 7, step S503 may include the following sub-steps:
s5031, shooting a preset area from a preset reference position through a target camera to obtain a calibration image.
For example, if the target camera is mounted on the robot, the robot may be placed at a 0 scale position of a ground preset calibration scale, and the preset area is photographed to obtain the calibration image.
S5032, determining the calibration distance between each pixel in the calibration image and a preset reference position according to a preset calibration scale.
For example, the calibration distances of the pixels corresponding to all scales of the preset calibration scale may be first determined, and then the calibration distances of the other pixels may be determined by using an interpolation algorithm.
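As a simplified, hedged illustration of that interpolation, the one-dimensional sketch below fills in calibration distances along one image row from the pixels that fall on the scale lines; the pixel columns and distances are made-up example values.

import numpy as np

# Pixel columns of the scale lines in one image row and their physical
# distances; both arrays are illustrative placeholders.
scale_cols = np.array([40.0, 120.0, 230.0, 360.0, 500.0])
scale_dist = np.array([0.0, 1.0, 2.5, 4.5, 7.0])
all_cols = np.arange(640)
# Calibration distance of every pixel in the row, interpolated between scale lines.
calib_distance_row = np.interp(all_cols, scale_cols, scale_dist)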
S5033, acquiring the image overlapping degree of the calibration image and each nominal grid image according to the calibration distance and the nominal distance.
In some embodiments, the calibration distance and the nominal distance of all pixels may be directly compared to obtain the image overlapping degree.
In other embodiments, considering the difference between the numerical interpolation schemes of the ground-scale reference template and the actual ground scale, as well as the actual extrinsic-parameter error, a region at a certain preset distance (for example, the region at distance 0, i.e., the invalid region) may be selected for calculating the overlap. Illustratively:
First, a first pixel position in the calibration image whose calibration distance is equal to a preset distance is determined, and a second pixel position in the nominal grid image whose nominal distance is equal to the preset distance is determined. In some implementations, the preset distance may be 0, which characterizes the invalid region.
Next, a pixel overlap of the first pixel location and the second pixel location is obtained.
The pixel overlap may be calculated as follows: set the pixel value of the first pixel positions in the calibration image to 1 and the pixel values of all other pixel positions in the calibration image to 0, obtaining a calibration mask corresponding to the calibration image; set the pixel value of the second pixel positions in the nominal grid image to 1 and the pixel values of all other pixel positions in the nominal grid image to 0, obtaining a nominal mask corresponding to the nominal grid image; and perform a bit-wise boolean AND operation on the calibration mask and the nominal mask to obtain the pixel overlap. For example, the number of values equal to 1 in the mask resulting from the bit-wise boolean AND operation may be used as the pixel overlap. The higher the pixel overlap, the higher the similarity between the calibration image and the nominal grid image.
Finally, the pixel overlap is taken as the image overlap.
Thus, the degree of image overlap can be calculated.
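The mask-based overlap above can be sketched as follows; the comparison against a single preset distance of 0 and the dictionary-style selection of the best system are illustrative assumptions.

import numpy as np

def image_overlap(calib_distance, nominal_distance, preset_distance=0.0):
    # Masks of pixels whose distance equals the preset distance (the invalid region).
    calib_mask = (calib_distance == preset_distance)
    nominal_mask = (nominal_distance == preset_distance)
    # Bit-wise boolean AND, then count the pixels equal to 1 in both masks.
    return int(np.count_nonzero(calib_mask & nominal_mask))

# overlaps = {name: image_overlap(calib_distance, d) for name, d in nominal_distance_maps.items()}
# target_system = max(overlaps, key=overlaps.get)   # highest overlap wins (step S5034)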
S5034, determining a target nominal camera coordinate system from the nominal camera coordinate systems according to the image overlapping degree.
For example, the nominal camera coordinate system with the highest image overlapping degree may be taken as the target nominal camera coordinate system.
In this way, the target nominal camera coordinate system can be determined, which in turn determines the corresponding preset virtual camera coordinate system and orientation, i.e., the corresponding nominal coordinate parameters.
S504, determining a target rotation matrix according to a target nominal camera coordinate system.
For example, a pure rotation transformation matrix between the reference camera coordinate system and the target nominal camera coordinate system may be calculated from the nominal coordinate parameters (Rz, and Rx or Ry) of the target nominal camera coordinate system, and this pure rotation transformation matrix is taken as the target rotation matrix.
S505, taking the target rotation matrix as a camera external parameter.
By the method, the camera external parameters of the target camera can be calibrated, and the more accurate camera external parameters are determined so as to perform visual angle transformation on the image according to the camera external parameters.
It should be noted that, the method for determining the camera external parameter shown in fig. 5 may be referred to as coarse calibration of the camera external parameter.
In some embodiments, fine calibration of the camera extrinsic parameters may also be performed on the basis of the coarse calibration, for example using a high-precision calibration board, joint multi-sensor calibration, or the like, so as to further improve the accuracy of the viewing angle transformation.
In some embodiments of the present disclosure, the camera intrinsic parameters described above may also be pre-calibrated; for example, a checkerboard calibration method may be used to determine the camera intrinsic parameters of the target camera.
In some embodiments of the present disclosure, the rotation transformation and the de-distortion processing may be further performed on the first target image according to a camera parameter of the target camera, so as to obtain the second target image.
For example, in the case where the above-described object camera includes a fisheye lens, the fisheye lens may generate distortion, and thus, a de-distortion process is required.
Both the de-distortion process and the rotation transformation require image remapping, i.e., a process in which the pixel at location (x, y) of the output image g is taken from location u(x, y) of the input image f. For example, the remapping of the image may be performed by the following equation (4):
g(x,y)=f(u(x,y)) (4)
where g(x, y) denotes the output image, f denotes the source image, and u(x, y) denotes the mapping function applied to (x, y). Since digital images are discrete functions, this process typically requires interpolation to sample at non-integer locations.
Both the de-distortion and rotation transformations may be performed by remapping.
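A hedged sketch of performing both operations with one remapping pass via OpenCV follows; the intrinsic matrix, distortion coefficients, rotation, and file name are placeholder assumptions, and initUndistortRectifyMap is used as one convenient way to fold undistortion and a rotation into a single map pair.

import cv2
import numpy as np

# Placeholder camera parameters for illustration only.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
dist = np.array([-0.3, 0.1, 0.0, 0.0, 0.0])   # radial and tangential distortion parameters
R, _ = cv2.Rodrigues(np.array([np.deg2rad(-15.0), 0.0, 0.0]))   # assumed rotation to the target view

first_target_image = cv2.imread("first_target_image.png")
h, w = first_target_image.shape[:2]
# initUndistortRectifyMap folds the undistortion and the rotation R into one
# pair of maps, so only a single remap pass over the image is needed.
map_x, map_y = cv2.initUndistortRectifyMap(K, dist, R, K, (w, h), cv2.CV_32FC1)
second_target_image = cv2.remap(first_target_image, map_x, map_y, cv2.INTER_LINEAR)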
In some embodiments, the rotation transformation and the de-distortion processing may be performed on the first target image, respectively, to obtain the second target image.
For example, the first target image may be rotated to obtain a third target image, and then the third target image may be subjected to a de-distortion process to obtain a second target image.
In other embodiments, the rotation transformation and the de-distortion processing may be combined into a single remapping process that is performed on the first target image to obtain the second target image. Illustratively, the combined remapping process may be performed by the following steps.
First, a first mapping function for performing a rotation transformation on a first target image and a second mapping function for performing a de-distortion process on the first target image are determined.
Next, a third mapping function is determined based on the first mapping function and the second mapping function.
And finally, carrying out remapping processing on the first target image through a third mapping function to obtain a second target image.
The principle of this mode is described below:
If the remapping itself is represented by two component functions u_x and u_y, then:

u(x, y) = (u_x(x, y), u_y(x, y))

The effect of function composition can be achieved by applying the remapping v to u_x and u_y, respectively:

(v ∘ u)(x, y) = (v(u_x(x, y)), v(u_y(x, y)))
The amount of computation for the composed remapping function v ∘ u is the same as that of whichever of the two remapping functions u and v being composed has the larger output image.
Using this property, the third mapping function can be obtained by combining the first mapping function and the second mapping function, which improves computational efficiency and thus image processing efficiency.
It should be noted that the first mapping function, the second mapping function, and the third mapping function may be in the form of mapping correspondence, for example, may be a mapping table.
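A minimal sketch of combining two mapping tables into a third one follows, assuming each mapping is stored as the float32 (map_x, map_y) pair used by cv2.remap; the function and variable names are illustrative.

import cv2

def compose_maps(first_map, second_map):
    # first_map is applied to the image first, second_map afterwards; remapping
    # the coordinate tables of the first mapping through the second composes the
    # two functions, so the image itself is resampled only once.
    fx, fy = first_map
    sx, sy = second_map
    cx = cv2.remap(fx, sx, sy, cv2.INTER_LINEAR)
    cy = cv2.remap(fy, sx, sy, cv2.INTER_LINEAR)
    return cx, cy

# third_map = compose_maps(rotation_map, undistort_map)   # order follows the chosen pipeline
# second_target_image = cv2.remap(first_target_image, *third_map, cv2.INTER_LINEAR)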
Since the position of each pixel in the output image can be computed independently during the remapping process, the process naturally lends itself to acceleration on a GPU (Graphics Processing Unit). Thus, the above remapping process may be performed with GPU acceleration.
In this way, the efficiency of image processing can be further improved by GPU acceleration.
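One possible GPU route, sketched here with PyTorch's grid_sample rather than any API prescribed by the disclosure, is to normalize the mapping tables and let each output pixel be sampled in parallel; the tensor layout and normalization follow grid_sample's conventions and are assumptions of this sketch.

import torch
import torch.nn.functional as F

def gpu_remap(image_hw3, map_x, map_y):
    # Each output pixel is looked up independently, so the remapping parallelizes
    # across GPU threads via grid_sample.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    h, w = map_x.shape
    img = torch.as_tensor(image_hw3, dtype=torch.float32, device=device)
    img = img.permute(2, 0, 1).unsqueeze(0)                  # 1 x C x H x W
    # grid_sample expects sampling locations normalized to [-1, 1].
    gx = torch.as_tensor(map_x, dtype=torch.float32, device=device) / (w - 1) * 2 - 1
    gy = torch.as_tensor(map_y, dtype=torch.float32, device=device) / (h - 1) * 2 - 1
    grid = torch.stack([gx, gy], dim=-1).unsqueeze(0)        # 1 x H x W x 2
    out = F.grid_sample(img, grid, mode="bilinear", align_corners=True)
    return out.squeeze(0).permute(1, 2, 0).cpu().numpy()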
Fig. 8 is a block diagram of an image processing apparatus 1100 according to an exemplary embodiment. As shown in fig. 8, the apparatus 1100 may include:
an image acquisition module 1101, configured to acquire a first target image captured by a target camera;
a preprocessing module 1102, configured to perform viewing angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle; and
a processing module 1103, configured to input the second target image into a pre-generated image processing model and process the second target image, where the image processing model is generated by training in advance on sample images at the target viewing angle.
According to one or more embodiments of the present disclosure, the camera parameters include a camera intrinsic parameter and a camera extrinsic parameter, and the preprocessing module 1102 is configured to, for each target pixel of the first target image, obtain a first pixel coordinate of the target pixel in the first target image; determining a second pixel coordinate corresponding to each target pixel according to the camera internal parameter, the camera external parameter and the first pixel coordinate; and carrying out view angle transformation on the first target image according to the second pixel coordinates to obtain a second target image under a target view angle.
According to one or more embodiments of the present disclosure, the target viewing angle is one of a plurality of preset viewing angles, and different preset viewing angles correspond to different preset virtual camera coordinate systems; the camera external parameters comprise a target rotation matrix, wherein the target rotation matrix is a pre-generated matrix used for converting the first target image into a preset virtual camera coordinate system corresponding to the target visual angle.
According to one or more embodiments of the present disclosure, the preset virtual camera coordinate system is a three-dimensional coordinate system including an x-axis, a y-axis, and a z-axis; the z-axes of the plurality of preset virtual camera coordinate systems are the same, and the x-axes of the plurality of preset virtual camera coordinate systems are arranged at intervals of a preset angle.
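Purely as an illustrative sketch of such a family of coordinate systems (this construction is an assumption, not a formula taken from the disclosure), the preset frames can be generated as rotations about the shared z-axis in steps of the preset angle:

```python
import numpy as np

def preset_virtual_frames(num_views, step_deg):
    """Rotation matrices of preset virtual camera coordinate systems that share
    one z-axis, with their x-axes spaced step_deg degrees apart (illustrative)."""
    frames = []
    for k in range(num_views):
        a = np.deg2rad(k * step_deg)
        # Rotation about the shared z-axis by k * step_deg degrees.
        rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a),  np.cos(a), 0.0],
                       [0.0,        0.0,       1.0]])
        frames.append(rz)
    return frames

# Example: four preset viewing angles spaced 90 degrees apart.
presets = preset_virtual_frames(4, 90.0)
```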
Fig. 9 is a block diagram of another image processing apparatus 1100 according to an exemplary embodiment. As shown in Fig. 9, the apparatus 1100 may further include:
A parameter obtaining module 1104, configured to determine a target rotation matrix corresponding to a target camera according to a mounting height and a mounting angle of the target camera; taking the target rotation matrix as the camera external parameter.
According to one or more embodiments of the present disclosure, the parameter acquisition module 1104 is configured to determine a preset reference position and a preset calibration scale within a preset area; the preset calibration scale is used for indicating the distance between a preset calibration position and the preset reference position in the preset area; determining a plurality of nominal grid images corresponding to the nominal camera coordinate systems and nominal distances between each pixel in the nominal grid images and the preset reference position according to the mounting height and the mounting angle; the nominal grid image is an image containing the preset area, and the nominal camera coordinate system is a coordinate system obtained by rotating the installation angle around an x axis or a y axis based on the preset virtual camera coordinate system; determining a target nominal camera coordinate system corresponding to the target visual angle from a plurality of nominal camera coordinate systems according to the nominal distance; and determining the target rotation matrix according to the target nominal camera coordinate system.
According to one or more embodiments of the present disclosure, the parameter obtaining module 1104 is configured to determine, according to the installation height, a reference grid image corresponding to a reference camera coordinate system, and a reference distance between each pixel in the reference grid image and the preset reference position; the reference grid image is an image containing the preset area; determining nominal coordinate parameters of the nominal camera coordinate system according to the installation angle; and carrying out rigid body transformation on the reference grid image according to the nominal coordinate parameters to obtain the nominal grid image and the nominal distance between each pixel in the nominal grid image and the preset reference position.
According to one or more embodiments of the present disclosure, the parameter obtaining module 1104 is configured to obtain a calibration image by capturing, by the target camera, the preset area from the preset reference position; determining the calibration distance between each pixel in the calibration image and the preset reference position according to the preset calibration scale; acquiring the image overlapping degree of the calibration image and each nominal grid image according to the calibration distance and the nominal distance; and determining a target nominal camera coordinate system from the nominal camera coordinate systems according to the image overlapping degree.
According to one or more embodiments of the present disclosure, the parameter obtaining module 1104 is configured to determine a first pixel position in the calibration image where the calibration distance is equal to a preset distance; determine a third pixel position in the nominal grid image where the nominal distance is equal to the preset distance; acquire the pixel overlapping degree of the first pixel position and the third pixel position; and take the pixel overlapping degree as the image overlapping degree.
According to one or more embodiments of the present disclosure, the preprocessing module 1102 is configured to perform rotation transformation and de-distortion processing on the first target image according to camera parameters of the target camera, so as to obtain a second target image under a target viewing angle.
According to one or more embodiments of the present disclosure, the preprocessing module 1102 is configured to determine a first mapping function for performing a rotation transformation on the first target image, and a second mapping function for performing a de-distortion process on the first target image; determining a third mapping function according to the first mapping function and the second mapping function; and carrying out remapping processing on the first target image through the third mapping function to obtain the second target image.
According to one or more embodiments of the present disclosure, the image processing model is a target detection model; the processing module 1103 is configured to input the second target image into the target detection model, so as to obtain target object information output by the target detection model; the target object information includes category information of a target object, and position information of the target object in the second target image.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in connection with the embodiments of the method, and will not be repeated here.
Referring now to fig. 10, a schematic diagram of the architecture of an electronic device 2000 (e.g., a terminal device, server, robot, or other smart device) suitable for implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers. Servers in embodiments of the present disclosure may include, but are not limited to, local servers, cloud servers, individual servers, distributed servers, and the like. The electronic device shown in fig. 10 is merely an example and should not be construed as limiting the functionality and scope of use of the disclosed embodiments.
As shown in fig. 10, the electronic device 2000 may include a processing apparatus (e.g., a central processing unit, a graphics processor, etc.) 2001, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 2002 or a program loaded from a storage apparatus 2008 into a Random Access Memory (RAM) 2003. Various programs and data required for the operation of the electronic device 2000 are also stored in the RAM 2003. The processing device 2001, the ROM 2002, and the RAM 2003 are connected to each other by a bus 2004. An input/output (I/O) interface 2005 is also connected to the bus 2004.
In general, the following devices may be connected to the input/output interface 2005: input devices 2006 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; output devices 2007 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage 2008 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 2009. The communication means 2009 may allow the electronic device 2000 to communicate with other devices wirelessly or by wire to exchange data. While fig. 10 shows an electronic device 2000 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 2009, or installed from the storage device 2008, or installed from the ROM 2002. The above-described functions defined in the method of the embodiment of the present disclosure are performed when the computer program is executed by the processing device 2001.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the client and the server may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a first target image shot by a target camera; performing view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under a target view angle; inputting the second target image into a pre-generated image processing model, and processing the second target image; the image processing model is a model generated after training in advance according to the sample image under the target visual angle.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or hardware. The name of the module is not limited to the module itself in some cases, and for example, the image acquisition module may also be described as "a module that acquires a first target image captured by a target camera".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided an image processing method including:
Acquiring a first target image shot by a target camera;
Performing view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under a target view angle;
inputting the second target image into a pre-generated image processing model, and processing the second target image; the image processing model is a model generated after training in advance according to the sample image under the target visual angle.
According to one or more embodiments of the present disclosure, the camera parameters include a camera intrinsic parameter and a camera extrinsic parameter, the first target image includes a plurality of target pixels; the performing perspective transformation on the first target image according to the camera parameters of the target camera, and obtaining a second target image under the target perspective includes:
For each target pixel of the first target image, acquiring a first pixel coordinate of the target pixel in the first target image;
determining a second pixel coordinate corresponding to each target pixel according to the camera internal parameter, the camera external parameter and the first pixel coordinate;
and carrying out view angle transformation on the first target image according to the second pixel coordinates to obtain a second target image under a target view angle.
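One common way to realize this per-pixel computation is sketched below; the exact formulation in the disclosure may differ, and the function name and arguments are assumptions. Each first pixel coordinate is undistorted and normalized with the camera intrinsic parameters, rotated with the target rotation matrix from the camera extrinsic parameters, and re-projected to obtain the corresponding second pixel coordinate:

```python
import cv2
import numpy as np

def second_pixel_coords(first_coords, K, dist, R):
    """Map first pixel coordinates (Nx2) to second pixel coordinates under the
    target view, given intrinsics K, distortion dist and target rotation R."""
    pts = np.asarray(first_coords, dtype=np.float32).reshape(-1, 1, 2)
    # Undistort and normalize: pixel coordinates -> ideal image-plane rays.
    norm = cv2.undistortPoints(pts, K, dist).reshape(-1, 2)
    rays = np.hstack([norm, np.ones((len(norm), 1), dtype=np.float32)])
    rot = rays @ R.T                     # rotate rays into the virtual camera frame
    proj = rot[:, :2] / rot[:, 2:3]      # perspective division
    homog = np.hstack([proj, np.ones((len(proj), 1), dtype=np.float32)])
    return (homog @ K.T)[:, :2]          # re-project to pixel coordinates
```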
According to one or more embodiments of the present disclosure, the target viewing angle is one of a plurality of preset viewing angles, and different preset viewing angles correspond to different preset virtual camera coordinate systems; the camera external parameters comprise a target rotation matrix, wherein the target rotation matrix is a pre-generated matrix used for converting the first target image into a preset virtual camera coordinate system corresponding to the target visual angle.
According to one or more embodiments of the present disclosure, the preset virtual camera coordinate system is a three-dimensional coordinate system including an x-axis, a y-axis, and a z-axis; the z-axes of the plurality of preset virtual camera coordinate systems are the same, and the x-axes of the plurality of preset virtual camera coordinate systems are arranged at intervals of a preset angle.
According to one or more embodiments of the present disclosure, the camera external parameter is obtained by:
Determining a target rotation matrix corresponding to a target camera according to the installation height and the installation angle of the target camera;
taking the target rotation matrix as the camera external parameter.
According to one or more embodiments of the present disclosure, the determining, according to the installation height and the installation angle of the target camera, the target rotation matrix corresponding to the target camera includes:
Determining a preset reference position and a preset calibration scale in a preset area; the preset calibration scale is used for indicating the distance between a preset calibration position and the preset reference position in the preset area;
Determining a plurality of nominal grid images corresponding to the nominal camera coordinate systems and nominal distances between each pixel in the nominal grid images and the preset reference position according to the mounting height and the mounting angle; the nominal grid image is an image containing the preset area, and the nominal camera coordinate system is a coordinate system obtained by rotating the installation angle around an x axis or a y axis based on the preset virtual camera coordinate system;
Determining a target nominal camera coordinate system corresponding to the target visual angle from a plurality of nominal camera coordinate systems according to the nominal distance;
and determining the target rotation matrix according to the target nominal camera coordinate system.
According to one or more embodiments of the present disclosure, the determining, according to the mounting height and the mounting angle, a nominal grid image corresponding to a plurality of nominal camera coordinate systems, and a nominal distance between each pixel in the nominal grid image and the preset reference position includes:
Determining a reference grid image corresponding to a reference camera coordinate system and a reference distance between each pixel in the reference grid image and the preset reference position according to the mounting height; the reference grid image is an image containing the preset area;
Determining nominal coordinate parameters of the nominal camera coordinate system according to the installation angle;
And carrying out rigid body transformation on the reference grid image according to the nominal coordinate parameters to obtain the nominal grid image and the nominal distance between each pixel in the nominal grid image and the preset reference position.
According to one or more embodiments of the present disclosure, the determining, from the plurality of nominal camera coordinate systems, a target nominal camera coordinate system corresponding to the target view angle according to the nominal distance includes:
shooting the preset area from the preset reference position through the target camera to obtain a calibration image;
determining the calibration distance between each pixel in the calibration image and the preset reference position according to the preset calibration scale;
Acquiring the image overlapping degree of the calibration image and each nominal grid image according to the calibration distance and the nominal distance;
and determining a target nominal camera coordinate system from the nominal camera coordinate systems according to the image overlapping degree.
According to one or more embodiments of the present disclosure, the obtaining the image overlapping degree of the calibration image and each of the nominal grid images according to the calibration distance and the nominal distance includes:
determining a first pixel position in the calibration image where the calibration distance is equal to a preset distance;
determining a third pixel position in the nominal grid image where the nominal distance is equal to the preset distance;
acquiring the pixel overlapping degree of the first pixel position and the third pixel position;
And taking the pixel overlapping degree as the image overlapping degree.
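A minimal sketch of one way such an image overlapping degree could be computed is given below; the function name, the tolerance, and the use of an intersection-over-union measure are assumptions rather than details taken from the disclosure:

```python
import numpy as np

def image_overlap(calib_dist, nominal_dist, preset_dist, tol=0.05):
    """Overlap between the calibration image and one nominal grid image.

    calib_dist / nominal_dist: HxW arrays giving, for each pixel, its distance
    to the preset reference position; preset_dist selects the compared pixels.
    """
    in_calib = np.abs(calib_dist - preset_dist) < tol      # first pixel positions
    in_nominal = np.abs(nominal_dist - preset_dist) < tol  # third pixel positions
    union = np.logical_or(in_calib, in_nominal).sum()
    if union == 0:
        return 0.0
    return np.logical_and(in_calib, in_nominal).sum() / union

# The target nominal camera coordinate system would then be the one whose
# nominal grid image yields the highest overlap with the calibration image.
```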
According to one or more embodiments of the present disclosure, the performing, according to camera parameters of the target camera, perspective transformation on the first target image, to obtain a second target image at a target perspective includes:
And performing rotation transformation and de-distortion processing on the first target image according to camera parameters of the target camera to obtain a second target image under a target visual angle.
According to one or more embodiments of the present disclosure, the performing rotation transformation and de-distortion processing on the first target image according to the camera parameters of the target camera, to obtain a second target image under the target viewing angle includes:
Determining a first mapping function for performing rotation transformation on the first target image and a second mapping function for performing de-distortion processing on the first target image;
Determining a third mapping function according to the first mapping function and the second mapping function;
And carrying out remapping processing on the first target image through the third mapping function to obtain the second target image.
According to one or more embodiments of the present disclosure, the image processing model is a target detection model; the inputting the second target image into a pre-generated image processing model, and the processing the second target image comprises the following steps:
Inputting the second target image into the target detection model to obtain target object information output by the target detection model; the target object information includes category information of a target object, and position information of the target object in the second target image.
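The disclosure does not tie the image processing model to a particular detector. Purely for illustration, the sketch below runs the second target image through an off-the-shelf torchvision detector, whose output contains the category information and position information described; the model choice and score threshold are assumptions:

```python
import torch
import torchvision

# Illustrative detector; any model trained on sample images under the target
# viewing angle could play the role of the image processing model.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(second_target_image, score_thresh=0.5):
    """second_target_image: HxWx3 uint8 array already transformed to the target view."""
    tensor = torch.from_numpy(second_target_image).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = model([tensor])[0]
    keep = pred["scores"] > score_thresh
    # Category information and position information of each detected target object.
    return pred["labels"][keep], pred["boxes"][keep]
```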
According to one or more embodiments of the present disclosure, there is provided an image processing apparatus including:
the image acquisition module is used for acquiring a first target image shot by the target camera;
The preprocessing module is used for carrying out view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under a target view angle;
The processing module is used for inputting the second target image into a pre-generated image processing model and processing the second target image; the image processing model is a model generated after training in advance according to the sample image under the target visual angle.
According to one or more embodiments of the present disclosure, the camera parameters include a camera internal parameter and a camera external parameter, and the preprocessing module is configured to, for each target pixel of the first target image, acquire a first pixel coordinate of the target pixel in the first target image; determining a second pixel coordinate corresponding to each target pixel according to the camera internal parameter, the camera external parameter and the first pixel coordinate; and carrying out view angle transformation on the first target image according to the second pixel coordinates to obtain a second target image under a target view angle.
According to one or more embodiments of the present disclosure, the target viewing angle is one of a plurality of preset viewing angles, and different preset viewing angles correspond to different preset virtual camera coordinate systems; the camera external parameters comprise a target rotation matrix, wherein the target rotation matrix is a pre-generated matrix used for converting the first target image into a preset virtual camera coordinate system corresponding to the target visual angle.
According to one or more embodiments of the present disclosure, the preset virtual camera coordinate system is a three-dimensional coordinate system including an x-axis, a y-axis, and a z-axis; the z-axes of the plurality of preset virtual camera coordinate systems are the same, and the x-axes of the plurality of preset virtual camera coordinate systems are arranged at intervals of a preset angle.
According to one or more embodiments of the present disclosure, the apparatus further comprises:
The parameter acquisition module is used for determining a target rotation matrix corresponding to the target camera according to the installation height and the installation angle of the target camera; taking the target rotation matrix as the camera external parameter.
According to one or more embodiments of the present disclosure, the parameter acquisition module is configured to determine a preset reference position and a preset calibration scale in a preset area; the preset calibration scale is used for indicating the distance between a preset calibration position and the preset reference position in the preset area; determining a plurality of nominal grid images corresponding to the nominal camera coordinate systems and nominal distances between each pixel in the nominal grid images and the preset reference position according to the mounting height and the mounting angle; the nominal grid image is an image containing the preset area, and the nominal camera coordinate system is a coordinate system obtained by rotating the installation angle around an x axis or a y axis based on the preset virtual camera coordinate system; determining a target nominal camera coordinate system corresponding to the target visual angle from a plurality of nominal camera coordinate systems according to the nominal distance; and determining the target rotation matrix according to the target nominal camera coordinate system.
According to one or more embodiments of the present disclosure, the parameter obtaining module is configured to determine, according to the installation height, a reference grid image corresponding to a reference camera coordinate system, and a reference distance between each pixel in the reference grid image and the preset reference position; the reference grid image is an image containing the preset area; determining nominal coordinate parameters of the nominal camera coordinate system according to the installation angle; and carrying out rigid body transformation on the reference grid image according to the nominal coordinate parameters to obtain the nominal grid image and the nominal distance between each pixel in the nominal grid image and the preset reference position.
According to one or more embodiments of the present disclosure, the parameter obtaining module is configured to capture, by using the target camera, the preset area from the preset reference position, to obtain a calibration image; determining the calibration distance between each pixel in the calibration image and the preset reference position according to the preset calibration scale; acquiring the image overlapping degree of the calibration image and each nominal grid image according to the calibration distance and the nominal distance; and determining a target nominal camera coordinate system from the nominal camera coordinate systems according to the image overlapping degree.
According to one or more embodiments of the present disclosure, the parameter obtaining module is configured to determine a first pixel position in the calibration image where the calibration distance is equal to a preset distance; determine a third pixel position in the nominal grid image where the nominal distance is equal to the preset distance; acquire the pixel overlapping degree of the first pixel position and the third pixel position; and take the pixel overlapping degree as the image overlapping degree.
According to one or more embodiments of the present disclosure, the preprocessing module is configured to perform rotation transformation and de-distortion processing on the first target image according to camera parameters of the target camera, so as to obtain a second target image under a target viewing angle.
According to one or more embodiments of the present disclosure, the preprocessing module is configured to determine a first mapping function for performing a rotation transformation on the first target image, and a second mapping function for performing a de-distortion process on the first target image; determining a third mapping function according to the first mapping function and the second mapping function; and carrying out remapping processing on the first target image through the third mapping function to obtain the second target image.
According to one or more embodiments of the present disclosure, the image processing model is a target detection model; the processing module is used for inputting the second target image into the target detection model to obtain target object information output by the target detection model; the target object information includes category information of a target object, and position information of the target object in the second target image.
The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to herein is not limited to the specific combinations of the features described above, but also covers other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by substituting the features described above with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (15)

1. An image processing method, the method comprising:
Acquiring a first target image shot by a target camera;
Performing view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under a target view angle;
inputting the second target image into a pre-generated image processing model, and processing the second target image; the image processing model is a model generated after training in advance according to the sample image under the target visual angle.
2. The method of claim 1, wherein the camera parameters include a camera intrinsic and a camera extrinsic, the first target image comprising a plurality of target pixels; the performing perspective transformation on the first target image according to the camera parameters of the target camera, and obtaining a second target image under the target perspective includes:
For each target pixel of the first target image, acquiring a first pixel coordinate of the target pixel in the first target image;
determining a second pixel coordinate corresponding to each target pixel according to the camera internal parameter, the camera external parameter and the first pixel coordinate;
and carrying out view angle transformation on the first target image according to the second pixel coordinates to obtain a second target image under a target view angle.
3. The method of claim 2, wherein the target perspective is one of a plurality of preset perspectives, different ones of the preset perspectives corresponding to different preset virtual camera coordinate systems; the camera external parameters comprise a target rotation matrix, wherein the target rotation matrix is a pre-generated matrix used for converting the first target image into a preset virtual camera coordinate system corresponding to the target visual angle.
4. The method according to claim 3, wherein the preset virtual camera coordinate system is a three-dimensional coordinate system including an x-axis, a y-axis, and a z-axis, the z-axes of the plurality of preset virtual camera coordinate systems are the same, and the x-axes of the plurality of preset virtual camera coordinate systems are arranged at intervals of a preset angle.
5. The method of claim 4, wherein the camera external parameter is obtained by:
Determining a target rotation matrix corresponding to a target camera according to the installation height and the installation angle of the target camera;
taking the target rotation matrix as the camera external parameter.
6. The method of claim 5, wherein determining a target rotation matrix corresponding to the target camera based on the mounting height and the mounting angle of the target camera comprises:
Determining a preset reference position and a preset calibration scale in a preset area; the preset calibration scale is used for indicating the distance between a preset calibration position and the preset reference position in the preset area;
Determining a plurality of nominal grid images corresponding to the nominal camera coordinate systems and nominal distances between each pixel in the nominal grid images and the preset reference position according to the mounting height and the mounting angle; the nominal grid image is an image containing the preset area, and the nominal camera coordinate system is a coordinate system obtained by rotating the installation angle around an x axis or a y axis based on the preset virtual camera coordinate system;
Determining a target nominal camera coordinate system corresponding to the target visual angle from a plurality of nominal camera coordinate systems according to the nominal distance;
and determining the target rotation matrix according to the target nominal camera coordinate system.
7. The method of claim 6, wherein determining a nominal grid image corresponding to a plurality of nominal camera coordinate systems from the mounting height and the mounting angle, and a nominal distance of each pixel in the nominal grid image from the preset reference position comprises:
Determining a reference grid image corresponding to a reference camera coordinate system and a reference distance between each pixel in the reference grid image and the preset reference position according to the mounting height; the reference grid image is an image containing the preset area;
Determining nominal coordinate parameters of the nominal camera coordinate system according to the installation angle;
And carrying out rigid body transformation on the reference grid image according to the nominal coordinate parameters to obtain the nominal grid image and the nominal distance between each pixel in the nominal grid image and the preset reference position.
8. The method of claim 6, wherein the determining a target nominal camera coordinate system corresponding to the target perspective from a plurality of the nominal camera coordinate systems based on the nominal distance comprises:
shooting the preset area from the preset reference position through the target camera to obtain a calibration image;
determining the calibration distance between each pixel in the calibration image and the preset reference position according to the preset calibration scale;
Acquiring the image overlapping degree of the calibration image and each nominal grid image according to the calibration distance and the nominal distance;
and determining a target nominal camera coordinate system from the nominal camera coordinate systems according to the image overlapping degree.
9. The method of claim 8, wherein said obtaining an image overlap of said calibration image and each of said nominal grid images based on said calibration distance and said nominal distance comprises:
determining a first pixel position in the calibration image where the calibration distance is equal to a preset distance;
determining a third pixel position in the nominal grid image where the nominal distance is equal to the preset distance;
acquiring the pixel overlapping degree of the first pixel position and the third pixel position;
And taking the pixel overlapping degree as the image overlapping degree.
10. The method of claim 1, wherein performing perspective transformation on the first target image according to camera parameters of the target camera to obtain a second target image at a target perspective comprises:
And performing rotation transformation and de-distortion processing on the first target image according to camera parameters of the target camera to obtain a second target image under a target visual angle.
11. The method of claim 10, wherein the performing rotational transformation and de-distortion processing on the first target image according to camera parameters of the target camera to obtain a second target image at a target viewing angle comprises:
Determining a first mapping function for performing rotation transformation on the first target image and a second mapping function for performing de-distortion processing on the first target image;
Determining a third mapping function according to the first mapping function and the second mapping function;
And carrying out remapping processing on the first target image through the third mapping function to obtain the second target image.
12. The method according to any one of claims 1 to 11, wherein the image processing model is a target detection model; the inputting the second target image into a pre-generated image processing model, and the processing the second target image comprises the following steps:
Inputting the second target image into the target detection model to obtain target object information output by the target detection model; the target object information includes category information of a target object, and position information of the target object in the second target image.
13. An image processing apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring a first target image shot by the target camera;
The preprocessing module is used for carrying out view angle transformation on the first target image according to camera parameters of the target camera to obtain a second target image under a target view angle;
The processing module is used for inputting the second target image into a pre-generated image processing model and processing the second target image; the image processing model is a model generated after training in advance according to the sample image under the target visual angle.
14. A computer readable medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processing device, implements the steps of the method according to any one of claims 1 to 12.
15. An electronic device, comprising:
A storage device having a computer program stored thereon;
Processing means for executing said computer program in said storage means to carry out the steps of the method of any one of claims 1 to 12.
CN202211457434.5A 2022-11-17 2022-11-17 Image processing method and device, readable medium and electronic equipment Pending CN118097098A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211457434.5A CN118097098A (en) 2022-11-17 2022-11-17 Image processing method and device, readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211457434.5A CN118097098A (en) 2022-11-17 2022-11-17 Image processing method and device, readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN118097098A true CN118097098A (en) 2024-05-28

Family

ID=91162173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211457434.5A Pending CN118097098A (en) 2022-11-17 2022-11-17 Image processing method and device, readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN118097098A (en)

Legal Events

Date Code Title Description
PB01 Publication