CN110555874B - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN110555874B
CN110555874B (application CN201810562100.1A)
Authority
CN
China
Prior art keywords
image
region
area
mask
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810562100.1A
Other languages
Chinese (zh)
Other versions
CN110555874A (en)
Inventor
柯政遠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201810562100.1A priority Critical patent/CN110555874B/en
Publication of CN110555874A publication Critical patent/CN110555874A/en
Application granted granted Critical
Publication of CN110555874B publication Critical patent/CN110555874B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images

Abstract

Embodiments of the present application disclose an image processing method and apparatus. The method comprises the following steps: an image processing apparatus acquires a first region in a first image and a second region in a second image; it then calculates the horizontal displacement difference between the position of the first region in the first image and the position of the second region in the second image, and determines the distance between a target object and the shooting device according to this horizontal displacement difference. The first image is an image containing the target object captured by a first camera of the shooting device, and the second image is an image containing the target object captured by a second camera of the shooting device. Because the horizontal displacement difference between the position of the first region in the first image and the position of the second region in the second image is used to represent the horizontal displacement difference between corresponding pixel points of the two images, the amount of image computation is reduced and working efficiency is improved.

Description

Image processing method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method and apparatus.
Background
With the development of computer technology, Augmented Reality (AR) and Virtual Reality (VR) applications are becoming more and more common. In AR and VR applications, the object depth distance is a very important parameter. The object depth distance refers to the distance between the object to be measured and the shooting device.
In the prior art, the object depth distance is mainly measured as follows: corresponding points in two images captured by the shooting device are compared (one image is captured by the right camera of the shooting device and the other by the left camera, with the two cameras located on the same horizontal line), the parallax of every pixel point in the image is calculated to form a parallax depth map, and depth calculation is then performed on the parallax depth map to obtain the object depth distance. For example, stereo matching is performed on image 1 captured by the left camera and image 2 captured by the right camera to obtain the relative horizontal displacement between corresponding pixel points of image 1 and image 2, and a parallax depth map is obtained from this relative horizontal displacement relationship. Since parallax is inversely proportional to object depth distance, the object depth distance can be estimated from the parallax depth map. However, this approach requires calculating the parallax value of every pair of corresponding pixel points in the whole image, so the amount of image computation is large and the working efficiency is low.
Disclosure of Invention
Embodiments of the present application provide an image processing method and apparatus, which can reduce the amount of image computation and improve working efficiency.
In a first aspect, an embodiment of the present application provides an image processing method, including: an image processing apparatus acquires a first region in a first image, where the first region is the image region corresponding to a target object in the first image. A second region is then acquired in a second image. The second region has the same shape and size as the first region, the pixel difference between the second region and the first region is smaller than a pixel difference threshold, and the second region is the image region corresponding to the target object in the second image. The image processing apparatus calculates the horizontal displacement difference between the position of the first region in the first image and the position of the second region in the second image, where the horizontal displacement difference is used to represent the parallax between the first region and the second region. The distance between the target object and the shooting device can then be determined according to this horizontal displacement difference. The first image is captured by a first camera of the shooting device, the second image is captured by a second camera of the shooting device, and the first camera and the second camera are located on the same horizontal line. Because the horizontal displacement difference between the position of the target object in the first image and its position in the second image is used to represent the horizontal displacement between the first image and the second image, the horizontal displacement difference of every pair of corresponding pixel points does not need to be calculated, which reduces the amount of computation and improves working efficiency.
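To make the overall flow concrete, the following is a minimal sketch of the four steps described above (acquire the first region, acquire the second region, compute the horizontal displacement difference, convert it to depth), written in Python with NumPy. The helper names segment_target_region and find_matching_region are hypothetical placeholders that only illustrate the structure of the method, not any specific implementation in this application.

```python
import numpy as np

def estimate_object_depth(first_img, second_img, focal_px, baseline_cm,
                          segment_target_region, find_matching_region):
    # Step 1: first region = image region of the target object in the first image
    first_region_mask, first_x = segment_target_region(first_img)

    # Step 2: second region = region of the same shape/size in the second image
    # whose pixel difference from the first region is below a threshold
    second_x = find_matching_region(second_img, first_img, first_region_mask)

    # Step 3: horizontal displacement difference between the two region positions
    disparity_px = abs(first_x - second_x)

    # Step 4: depth is inversely proportional to the displacement (Z = B * f / d)
    return baseline_cm * focal_px / disparity_px
```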
With reference to the first aspect, in one possible implementation manner, the image processing apparatus may perform image segmentation on the first image to obtain the first region. Commonly used image segmentation techniques include: threshold-based image segmentation, semantic-based image segmentation, edge detection-based image segmentation, and so forth. Because the image segmentation technology can better segment a foreground region (an image region corresponding to a target object in an image) and a background region (an image region obtained by subtracting the foreground region from the image), the first region obtained by the image segmentation technology is more accurate, and the accuracy of image processing can be improved.
With reference to the first aspect, in one possible implementation, when performing image segmentation, the image processing apparatus may determine, as the first region, the region formed by a plurality of target pixel points in the first image whose characteristic values fall within a target threshold range. Threshold-based image segmentation is simple to compute and computationally efficient, so both the accuracy and the computational efficiency of image processing can be improved.
With reference to the first aspect, in a possible implementation, before determining the region formed by at least two target pixel points in the first image as the first region, the image processing apparatus may further determine a reference region of the target object in the first image, obtain the characteristic values of a plurality of pixel points in the reference region, and determine the target threshold range for image segmentation from these characteristic values. Because the first region obtained by segmentation is the image region corresponding to the target object in the first image, and the threshold range used for segmentation is derived from the characteristic values of pixel points in the reference region of the target object, segmentation precision can be improved, a more complete first region can be obtained, and image processing precision is further improved.
With reference to the first aspect, in one possible implementation manner, the characteristic value may include a color value, a gray value, or a depth reference value. Wherein the depth reference value is used to represent a reference distance between the pixel point and the photographing apparatus. In the embodiment of the application, whether the first image is a color image or a grayscale image, the first image can be segmented by using an image segmentation method based on a threshold value to obtain a first region corresponding to a target object in the first image, so that a more complete image processing method can be provided while the operation efficiency is improved.
With reference to the first aspect, in one possible implementation, when acquiring the second region in the second image, the image processing apparatus may acquire one or more mask regions in the second image, each mask region having the same shape and size as the first region, and determine as the second region a mask region whose pixel difference from the first region is smaller than the pixel difference threshold. By searching the one or more mask regions for a mask region whose pixel difference from the first region is below the threshold, the second region can be determined in the second image.
With reference to the first aspect, in a possible implementation, when acquiring one or more mask regions in the second image, the image processing apparatus may determine a mask window corresponding to the first region, make the upper and lower edges of the mask window flush with the upper and lower edges of the second image, and then translate the mask window across the second image to obtain one or more mask regions. The mask window has the same size and shape as the first image, and each mask region has the same shape and size as the first region. Because the mask window is translated only after its upper and lower edges are made flush with those of the second image, the number of candidate mask regions in the second image is reduced, which reduces the amount of computation and improves working efficiency.
With reference to the first aspect, in a possible implementation, the pixel difference between each mask region and the first region may be the sum of the color differences between each pixel point in the mask region and the corresponding pixel point in the first region. When both the first image and the second image are color images, the image processing apparatus determines the pixel difference by calculating the color differences between the color values of corresponding pixel points in the first region and the mask region, and determines as the second region a mask region whose sum of color differences from the first region is smaller than a color difference threshold. This provides a more complete image processing method.
With reference to the first aspect, in a possible implementation, the pixel difference between each mask region and the first region may be the sum of the gray differences between the gray values of each pixel point in the mask region and the corresponding pixel point in the first region. When both the first image and the second image are grayscale images, the image processing apparatus determines the pixel difference by calculating the gray differences between the gray values of corresponding pixel points in the first region and the mask region, and determines as the second region a mask region whose sum of gray differences from the first region is smaller than a gray difference threshold. The image processing method of this embodiment therefore also applies to grayscale images, has a wide application range, and provides a more complete image processing method.
With reference to the first aspect, in one possible implementation, the first image may be a color image while the second image is a grayscale image. In this case, before determining as the second region a mask region whose pixel difference from the first region is smaller than the pixel difference threshold, the image processing apparatus may convert the color value of each pixel point in the first region into a gray value. The image processing method of this embodiment is therefore also applicable when the first image is a color image and the second image is a grayscale image, which further extends the scope of application and provides a more complete image processing method.
With reference to the first aspect, in a possible implementation, when determining the second region, the image processing apparatus may determine the mask region with the smallest pixel difference from the first region as the second region. Selecting, among the one or more mask regions, the one whose pixel difference from the first region is smallest yields the most accurate match between the second region and the first region, so the accuracy of image processing can be improved.
With reference to the first aspect, in one possible implementation, when determining the distance between the target object and the shooting device according to the horizontal displacement difference, the image processing apparatus may acquire the shooting focal length used when the first image and/or the second image was captured, acquire the separation distance between the first camera and the second camera, and determine the distance between the target object and the shooting device according to the shooting focal length, the separation distance, and the horizontal displacement difference.
With reference to the first aspect, in one possible implementation, the image processing apparatus calculates the product of the shooting focal length and the separation distance, and determines the quotient of that product and the horizontal displacement difference as the distance between the target object and the shooting device. Because the horizontal displacement difference is inversely proportional to the distance, this calculation is simple and computationally efficient.
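As a minimal numeric illustration of this quotient, assuming the focal length and the displacement difference have already been converted to the same unit (centimeters here); the values below are made up purely for illustration:

```python
def depth_from_displacement(focal_length_cm, baseline_cm, displacement_cm):
    # Z = (f * B) / d: the product of focal length and camera separation,
    # divided by the horizontal displacement difference
    return (focal_length_cm * baseline_cm) / displacement_cm

# Example with made-up values: f = 0.35 cm, B = 1.2 cm, d = 0.002 cm -> Z = 210 cm
print(depth_from_displacement(0.35, 1.2, 0.002))
```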
In a second aspect, an embodiment of the present application provides an image processing apparatus having a function of implementing the image processing method of the first aspect described above. The function can be realized by hardware, and can also be realized by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.
With reference to the second aspect, in one possible implementation, the image processing apparatus includes a first obtaining module, a second obtaining module, a calculating module, and a determining module. The first obtaining module is configured to obtain a first region in a first image, where the first region is an image region corresponding to a target object in the first image. The second acquiring module is used for acquiring a second area in a second image. The second area and the first area are the same in shape and size, the pixel difference between the second area and the first area is smaller than a pixel difference threshold value, and the second area is an image area corresponding to the target object in the second image. The calculating module is configured to calculate a horizontal displacement difference between the position of the first region in the first image acquired by the first acquiring module and the position of the second region in the second image determined by the second acquiring module, where the horizontal displacement difference is used to represent a parallax between the first region and the second region. The determining module is used for determining the distance between the target object and the shooting device according to the horizontal displacement difference calculated by the calculating module. The first image is an image shot by a first camera of the shooting device, the second image is an image shot by a second camera of the shooting device, and the first camera and the second camera are located on a horizontal line.
In a third aspect, an embodiment of the present application provides another image processing apparatus, which includes a processor and a memory, the processor and the memory being connected to each other, wherein the memory is used for storing program codes;
the processor is used for calling the program code and executing the following operations:
a first region is acquired in the first image and a second region is acquired in the second image. The first area is an image area corresponding to a target object in the first image, the second area is the same as the first area in shape and size, the pixel difference between the second area and the first area is smaller than a pixel difference threshold value, and the second area is an image area corresponding to the target object in the second image. Calculating a horizontal displacement difference between a position of the first region in the first image and a position of the second region in the second image, the horizontal displacement difference indicating a parallax between the first region and the second region. The distance between the target object and the shooting device can be determined according to the horizontal displacement difference. The first image is an image captured by a first camera of the imaging device, the second image is an image captured by a second camera of the imaging device, and the first camera and the second camera are located on a horizontal line.
In a fourth aspect, embodiments of the present application provide a computer storage medium for storing computer program instructions for an image processing apparatus, the instructions comprising a program for performing the method according to the first aspect.
By implementing the embodiment of the application, on one hand, the operation amount of the image can be reduced, and the working efficiency is improved. On the other hand, the accuracy of image processing can be improved, and more accurate object depth distance can be obtained.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below.
FIG. 1 shows a right view and a left view of the same scene;
FIG. 2 is a schematic diagram of the triangulation principle;
FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a first region after image segmentation;
FIG. 5A is a schematic diagram of obtaining a mask region on a left view;
FIG. 5B is another schematic diagram of obtaining a mask region on a left view;
FIG. 5C is yet another schematic diagram of obtaining a mask region on a left view;
FIG. 6A is a schematic diagram of a horizontal displacement difference;
FIG. 6B is a schematic diagram of another horizontal displacement difference;
FIG. 6C is a schematic diagram of yet another horizontal displacement difference;
FIG. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
The image processing method provided by the embodiment of the present application can be applied to object depth measurement. In some possible embodiments, the depth is measured as follows. First, the left and right views are matched using a stereo matching method, and the projection points of each point in the shooting scene on the left view and the right view are found; the projection points appear as pixel points on the left and right views. Then, the relative horizontal displacement between each pair of corresponding projection points on the left view and the right view is calculated using the triangulation principle, and the parallax depth map generated by the relative horizontal displacements of all projection points (pixel points) on the left or right view is obtained. Since parallax is inversely proportional to depth distance, the object depth distance can be estimated from the parallax depth map. Here, the image captured by the right camera of the shooting device is the right view, the image captured by the left camera is the left view, and the left camera and the right camera are located on the same horizontal line. For example, FIG. 1 shows a right view and a left view of the same scene. Because the right camera and the left camera lie on the same horizontal line with a fixed distance between them, there is a slight horizontal difference between the right view and the left view. The stereo matching may include local stereo matching and global stereo matching.
As shown in FIG. 2, FIG. 2 is a schematic diagram of the triangulation principle. Taking any point in the shooting scene as an example, P denotes a scene point, Z_P denotes the distance between P and the shooting device (that is, the depth distance of the scene point P), and f denotes the shooting focal length of the left and/or right camera. P_r denotes the projection of P on the right view and P_l the projection of P on the left view. O_l denotes the position of the left camera and O_r the position of the right camera, and the distance between O_l and O_r is B. pi_l denotes the maximum imaging range of the left camera and pi_r that of the right camera, where pi_l = pi_r. X_l denotes the horizontal distance from P_l to the leftmost side of the left view, and X_r the horizontal distance from P_r to the leftmost side of the right view. The relative horizontal displacement is d = |X_l - X_r|. Since triangle PP_lP_r and triangle PO_lO_r are similar, one obtains:
B / Z_P = (B - (X_l - X_r)) / (Z_P - f)     (1)
which simplifies to Z_P = Bf / (X_l - X_r) = Bf / d. Similarly, the relative horizontal displacement of each pixel point on the left or right view can be calculated, and the object depth distance is Z = Bf / d_avg, where d_avg denotes the mean of the relative horizontal displacements of all pixel points on the left or right view.
In this approach, on the one hand, the object depth distance is estimated from the relative horizontal displacement of every pixel point on the left view and its corresponding pixel point on the right view, so the relative horizontal displacement of every pixel point in the image must be calculated; the amount of computation is therefore large and the working efficiency is low. On the other hand, the object depth distance represents the distance between the object to be measured and the shooting device, yet the mean of the relative horizontal displacements of all pixel points in the image is used to represent the relative horizontal displacement of the region where the object is located, so the calculated object depth distance is inaccurate.
The image processing method provided by the embodiments of the present application can be applied to image processing apparatuses with image processing functions, such as smartphones, tablets (for example, iPads), desktop computers, and notebook computers. The shooting device of the embodiments of the present application can be integrated into the image processing apparatus or exist independently of it. The embodiments of the present application are not limited in this respect.
The following describes an image processing method and apparatus provided in an embodiment of the present application with reference to fig. 3 to 8.
To better understand and implement the solution of the embodiments of the present application, note that the shooting device of the embodiments may include a plurality of cameras (here, 2 or more). The image captured by the right camera of the shooting device is defined as the right view, the image captured by the left camera is defined as the left view, and the left camera and the right camera are located on the same horizontal line. The first image in the image processing method provided by the embodiments may be either the right view or the left view: if the first image is the right view, the second image is the left view; if the first image is the left view, the second image is the right view. Because the shooting parameters of the left camera and the right camera are the same, the left view and the right view have the same size and resolution. Referring to FIG. 3, FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the present application. For convenience of description, the image processing method shown in FIG. 3 is explained taking the first image as the right view. As shown in FIG. 3, the image processing method provided by this embodiment of the present application may include:
s101, the image processing device acquires a first area in a first image.
The first region may be an image region surrounded by a contour of the target object in the right view, or may be an image region including the contour of the target object in the right view (for example, a regular geometric shape region including the contour of the target object), which is not limited in the embodiment of the present application.
S102, the image processing device acquires a second area in the second image.
The second area may be an image area surrounded by the contour of the target object in the left view, or an image area including the contour of the target object in the left view.
S103, the image processing apparatus calculates a horizontal displacement difference between the position of the first region in the first image and the position of the second region in the second image.
Wherein, because the right camera and the left camera are on a horizontal straight line and there is a fixed distance between the right camera and the left camera in the photographing apparatus, there is a slight horizontal difference between the right view and the left view, and thus there is also a slight horizontal difference (i.e., a horizontal displacement difference) between the first region in the right view and the second region in the left view. The horizontal displacement difference can be expressed in the form of pixel points and also in the form of physical distance. For example, the horizontal displacement difference may be 30 pixel points, or may be 0.2 cm.
And S104, the image processing device determines the distance between the target object and the shooting equipment according to the horizontal displacement difference.
In the embodiment of the present application, for the step S101, there may be some possible implementation manners as follows:
in some possible embodiments, the image processing apparatus acquires a right view (referred to as the first image) containing the target object captured by the right camera (referred to as the first camera) of the shooting device, identifies the target object in the right view using an image recognition technique in Artificial Intelligence (AI), and marks the target object on the right view. The image processing apparatus may take the marked image region in the right view as the first region. Optionally, the target object captured by the right camera is presented entirely in the right view. The target object can be any object to be measured in the shooting scene.
In some possible embodiments, the image processing apparatus may perform image segmentation on the right view by using an image segmentation technique to obtain the first region. The image segmentation techniques may include threshold-based image segmentation, semantic-based image segmentation, edge detection-based image segmentation, and so on, among others. Because the image segmentation technology can better segment a foreground region (an image region corresponding to a target object in an image) and a background region (an image region obtained by subtracting the foreground region from the image), the first region obtained by the image segmentation technology is more accurate, and the accuracy of image processing can be improved.
In some possible embodiments, the image processing apparatus may perform image segmentation on the right view by using a threshold-based image segmentation method to obtain the first region. Specifically, the image processing apparatus may obtain a target threshold range for image segmentation, further obtain a feature value of each pixel point in the right view, and may set a plurality of feature values (here, greater than or equal to the target threshold range) of the pixel points in the right view within the target threshold rangeEqual to 2) target pixel points are determined as the first region (foreground region). And then a region formed by a plurality of (here, more than or equal to 2) pixel points of which the characteristic values of the pixel points in the right view are outside the target threshold range is used as a background region. Since the pixel points in the image have sizes and are usually square, even 2 pixel points located on one line can determine an area, which can be the first area. The first region may be a region formed by a plurality of pixels, or may be a minimum continuous region including the plurality of pixels. The target threshold range may be a threshold range preset by a user, or may be a threshold range calculated by the image processing device according to the image feature of the right view, where the target threshold range may be a color value range, a gray value range, or a depth reference value range, and the feature value of the corresponding pixel point may include a color value, a gray value, or a depth reference value. Here, the color value may be a red (red) green (green) blue (blue) value, i.e., an RGB value, or a color space value, i.e., a YCBCR value. For example, the color value range is 189 to 205, the gray value range is 124 to 156, and so on. The depth reference value may be a distance between each pixel point in the right view and the photographing device, for example, the depth reference value may be Z in the above formula (1) P . If the right view is a color image, the feature values of the pixels may include color values and/or depth reference values. If the right view is a gray image, the feature values of the pixel points may include gray values and/or depth reference values. As shown in fig. 4, fig. 4 is a schematic diagram of the first region after image segmentation. Assuming that the right view R is a color image and the color value range is 189 to 205, a region composed of a plurality of pixels having color values of pixels outside 189 to 205 (the color value is smaller than 189 or the color value is larger than 205) in the right view is determined as a background region, and the color values of all the pixels in the background region are set to 0, such as the black region in fig. 4. Determining a region formed by a plurality of target pixel points with color values of 189-205 of pixel points in the right view as a first region, and setting all the color values of the first region to be 255, such as a white region in fig. 4. Book (I)In the embodiment of the application, the image segmentation method based on the threshold segmentation is simple in calculation and high in calculation efficiency, so that the accuracy of image processing can be improved, and the calculation efficiency of the image processing can also be improved.
In some possible embodiments, the obtaining manner of the target threshold range is specifically:
1) The image processing device determines a reference area corresponding to the target object in the right view. For example, the image processing apparatus may recognize the right view by using an image recognition technology such as pattern recognition, a support vector machine, and the like, recognize the target object in the right view, and determine an image area occupied by the target object recognized in the right view as a reference area, or may determine an area according to a frame or click operation on the right view by a user as a reference area corresponding to the target object. The reference area is used to reflect the preliminary positioning of the target object in the right view, for example, the reference area may be an image area in the right view that is larger than the outline of the target object, or an image area in the right view that is smaller than the outline of the target object, so that the image area where the target object is located in the right view, that is, the first area, may be further determined more accurately.
2) After determining the reference region corresponding to the target object, the image processing device acquires the feature values of a plurality of (here, greater than or equal to 2) pixel points in the reference region. For example, the image processing device samples all the acquired pixels in the reference region to obtain a plurality of sampling points in the reference region, and acquires feature values of the plurality of sampling points. The image processing apparatus may also extract all the feature points in the reference region, and acquire feature values of all the feature points in the reference region. The characteristic value of the pixel point may include a color value, a gray value, or a depth reference value. The depth reference value may be used to represent a reference distance between each pixel point and the photographing device.
3) The image processing apparatus determines the target threshold range for image segmentation from the characteristic values of the plurality of pixel points in the reference region. For example, if the right view is a color image, the image processing apparatus acquires the color values and the depth reference values of all pixel points in the reference region, and from them calculates the color mean C_ave, the color standard deviation C_delta, the depth mean D_ave, and the depth standard deviation D_delta of all pixel points in the reference region. The target threshold range may then be (C_ave - C_delta) to (C_ave + C_delta) together with (D_ave - D_delta) to (D_ave + D_delta). The image processing apparatus can therefore acquire the color value and the depth reference value of each pixel point in the first image, and determine as the first region the region formed by all pixel points whose color value lies within (C_ave - C_delta) to (C_ave + C_delta) and whose depth reference value lies within (D_ave - D_delta) to (D_ave + D_delta). Accordingly, the region formed by all pixel points whose color value lies outside (C_ave - C_delta) to (C_ave + C_delta) and/or whose depth reference value lies outside (D_ave - D_delta) to (D_ave + D_delta) is taken as the background region. In this embodiment of the application, the image processing apparatus determines the reference region of the target object in the first image and then determines the target threshold range for image segmentation from the characteristic values of pixel points in that reference region. On one hand, the user does not need to set the threshold range for image segmentation manually, which removes a manual processing step. On the other hand, the reference region is an approximate region of the target object in the right view, and because the target threshold range is derived from the characteristic values of pixel points in this approximate region, the first region obtained by image segmentation is a more accurate image region of the target object in the right view; segmentation precision is thus improved, a more complete first region is obtained, and image processing precision is further improved.
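A sketch of this mean-plus/minus-standard-deviation threshold selection, assuming a single-channel color characteristic and a depth map are available as 2-D NumPy arrays and that the reference region is given as a boolean mask; all names are illustrative:

```python
import numpy as np

def segment_with_reference(color, depth, ref_mask):
    # Statistics over the pixel points inside the reference region only
    c_ave, c_std = color[ref_mask].mean(), color[ref_mask].std()
    d_ave, d_std = depth[ref_mask].mean(), depth[ref_mask].std()

    # First region: color within (C_ave - C_delta, C_ave + C_delta)
    # AND depth reference value within (D_ave - D_delta, D_ave + D_delta)
    in_color = (color >= c_ave - c_std) & (color <= c_ave + c_std)
    in_depth = (depth >= d_ave - d_std) & (depth <= d_ave + d_std)
    return in_color & in_depth
```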
In the embodiment of the present application, for the step S102, there may be some possible implementation manners as follows:
in some possible embodiments, the image processing apparatus may acquire the second region in the second image by using image segmentation or AI recognition. The method for acquiring the second region may be the same as or different from the method for acquiring the first region, and this embodiment of the present application is not limited. The pixel difference threshold may be user-defined, or may be determined by the image processing apparatus according to the feature points of the first image and the second image. For example, a first feature point corresponding to the target object in the first image and a second feature point corresponding to the first feature point in the second image are respectively extracted, a pixel difference between the first feature point and the second feature point is calculated, and a mean value of the pixel difference is used as a pixel difference threshold.
In some possible embodiments, the obtaining manner of the second area may specifically be:
1) Based on the first region obtained as described above, the image processing apparatus may obtain one or more mask regions in the left view (referred to as the second image). Each of the mask regions has the same shape and size as the first region. The left view is the image, including the target object, captured by the left camera (referred to as the second camera) of the shooting device. Optionally, the target object captured by the left camera is fully presented in the left view. As shown in FIG. 5A, FIG. 5A is a schematic diagram of obtaining a mask region on the left view. The first region a1 is obtained from the right view R, and the image processing apparatus determines the mask window A1 according to the first region a1; specifically, the mask window corresponding to the first region may be determined according to the size, shape, and position of the first region in the right view. The hatched area of A1 is opaque, and the area c1 in A1 corresponding to a1 is transparent; that is, the position of c1 in the mask window A1 is the same as the position of a1 in the right view R, and c1 and a1 have the same size and shape. The mask window A1 is laid over the left view L, the area of the left view L displayed through the area c1 of the mask window A1 is the mask region b1, and different mask regions b1 can be obtained by moving the mask window A1 over the left view L; the shape and size of the mask region b1 are the same as those of the first region a1. The mask window A1 may move left and right over the left view L, or up and down. When the mask window A1 moves over the left view L, it may move a fixed number of pixel points each time, for example 5 pixel points or 1 pixel point each time, or a different number of pixel points each time, for example 10 pixel points the first time, 8 pixel points the second time, and so on.
In some possible embodiments, the image processing apparatus makes the upper and lower edges of the mask window flush with the upper and lower edges of the left view and then translates the mask window to obtain at least one mask region in the left view, and it may record, for each of the at least one mask region, the number of pixel points by which the corresponding mask window has been translated over the left view. As shown in FIG. 5B, the image processing apparatus determines the mask window A2 according to the size, shape, and position of the first region a1, thereby determining the four vertices of the mask window A2, such as the vertices 1, 2, 3, 4 shown in FIG. 5B. The hatched area of A2 is opaque, and the area c1 of A2 corresponding to a1 is transparent. The vertices 1', 2', 3', 4' denote the 4 vertices of the left view L. Making the upper and lower edges of the mask window flush with the upper and lower edges of the left view means that the upper edge of A2, determined by vertices 1 and 2, is collinear with the upper edge of L determined by 1' and 2', and the lower edge of A2, determined by vertices 3 and 4, is collinear with the lower edge of L determined by 3' and 4'. Vertex 2 of the mask window A2 may be aligned with vertex 1' of the left view L and vertex 4 with vertex 3', and the mask window A2 is then translated to the right from this position so as to mask L; any position where the area c1 does not lie completely over the left view L is discarded, and each area of the left view L displayed through c1, having the same size and shape as a1, is taken as a mask region b2. Similarly, vertex 1 of the mask window A2 may be aligned with vertex 2' of the left view L and vertex 3 with vertex 4', and the mask window A2 is then translated to the left from this position; during the movement, any position where c1 does not lie completely over the left view L is discarded, and each area of the left view L displayed through c1, having the same size and shape as a1, is taken as a mask region b2. In this embodiment, because the left camera and the right camera of the shooting device are on the same horizontal line, their vertical positions are the same, so the vertical position of the target object in the left view is the same as in the right view; that is, only a horizontal difference exists between the left view and the right view. Making the upper and lower edges of the mask window flush with those of the left view and then translating it horizontally therefore reduces the number of mask regions in the left view, which reduces the image processing workload and improves working efficiency.
In some possible embodiments, as shown in FIG. 5C, FIG. 5C is a schematic diagram of yet another way of obtaining a mask region on the left view. Because the target object captured by the right camera of the shooting device is shifted toward the left part of the right view, while the target object captured by the left camera is shifted toward the right part of the left view, the image processing apparatus can align the 4 vertices of the mask window A3 with the corresponding vertices of the left view L (that is, vertex 1 with vertex 1', vertex 2 with vertex 2', vertex 3 with vertex 3', and vertex 4 with vertex 4') and translate the mask window A3 to the right from this position so as to mask L; during the movement, any position where the area c1 of A3 does not lie completely over the left view L is discarded, and each area of the left view L displayed through c1, having the same size and shape as a1, is taken as a mask region b3.
2) The image processing device may obtain a pixel difference between each of the one or more mask regions and the first region, so as to determine a mask region having a smallest pixel difference from the first region among the one or more mask regions as the second region. The pixel difference may include a color or gray difference between pixels. The second area may be an image area corresponding to the target object in the left view. For example, assuming that the image processing device acquires 200 mask regions in total in the left view, the image processing device may acquire the pixel difference of each of the 200 mask regions from the first region. The mask region having the smallest difference in pixels from the first region among the 200 mask regions is determined as the second region.
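The following sketch combines the row-constrained window translation of FIG. 5B with the minimum-pixel-difference selection described here, assuming both views are single-channel NumPy arrays of the same size and that the first region is given as a boolean mask over the right view; a simple sum of absolute differences stands in for the pixel difference:

```python
import numpy as np

def find_second_region_offset(right_view, left_view, first_region_mask, step=1):
    # Bounding box of the first region in the right view
    ys, xs = np.where(first_region_mask)
    top, bottom = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    template = right_view[top:bottom, x0:x1].astype(np.int32)
    shape_mask = first_region_mask[top:bottom, x0:x1]

    region_w = x1 - x0
    best_x, best_diff = None, np.inf
    # Translate the "mask window" horizontally only (same rows as the first region)
    for x in range(0, left_view.shape[1] - region_w + 1, step):
        candidate = left_view[top:bottom, x:x + region_w].astype(np.int32)
        # Pixel difference: sum of absolute differences over the region's shape
        diff = np.abs(candidate - template)[shape_mask].sum()
        if diff < best_diff:
            best_x, best_diff = x, diff

    # Horizontal displacement difference, in pixel points, between region positions
    return best_x, abs(best_x - x0), best_diff
```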
In this embodiment, because the second region is searched for using the first region as a reference, the second region has the same shape and size as the first region and its pixel difference from the first region is below a small pixel difference threshold. This eliminates the interference that other image regions in the left view, which do not belong to the target object, would cause when acquiring the second region, and ensures that the pixel difference between the first region and the second region is small enough. The subsequent calculation of the horizontal displacement difference between the position of the first region in the right view and the position of the second region in the left view is therefore more accurate, and so is the resulting distance between the target object and the shooting device.
In some possible embodiments, if the right view (referred to as the first image) and the left view (referred to as the second image) are both color images, the image processing apparatus may obtain the sum of the color differences between each pixel point in each mask region and the corresponding pixel point in the first region, and determine the mask region whose sum of color differences from the first region is the smallest as the second region. For example, assume that the 70th mask region contains the 5 pixel points A_70, B_70, C_70, D_70, E_70, and the first region contains the 5 pixel points A'_30, B'_30, C'_30, D'_30, E'_30, where A_70 corresponds to A'_30, B_70 to B'_30, C_70 to C'_30, D_70 to D'_30, and E_70 to E'_30. The image processing apparatus may calculate the color differences Cd_A, Cd_B, Cd_C, Cd_D, and Cd_E between the color values of the 5 pixel points A_70, B_70, C_70, D_70, E_70 in the 70th mask region and the corresponding 5 pixel points A'_30, B'_30, C'_30, D'_30, E'_30 in the first region. Cd_A may, for example, be computed from the RGB components of the two corresponding pixel points as
Cd_A = sqrt( (R_A - R_A')^2 + (G_A - G_A')^2 + (B_A - B_A')^2 )     (2)
where R_A denotes the red component of the RGB color value of pixel point A_70 and R_A' the red component of the RGB color value of pixel point A'_30; G_A denotes the green component of A_70 and G_A' the green component of A'_30; B_A denotes the blue component of A_70 and B_A' the blue component of A'_30. Cd_B, Cd_C, Cd_D, and Cd_E can be obtained in the same way as Cd_A. The image processing apparatus can then calculate, for each mask region, the sum of color differences from the first region, Cd_total = Cd_A + Cd_B + Cd_C + Cd_D + Cd_E, and determine the mask region with the smallest sum of color differences (Cd_total)_min from the first region as the second region. In this embodiment of the application, if the right view and the left view are color images, the image processing apparatus determines the pixel difference from the RGB color components of the pixel points in the first region and the mask region, which provides a more complete image processing method.
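A small sketch of this per-region color difference sum, assuming both regions are given as N x 3 arrays of corresponding RGB pixel values; the per-pixel distance below follows the Euclidean form used for Cd_A above:

```python
import numpy as np

def color_difference_sum(region_rgb, reference_rgb):
    # region_rgb, reference_rgb: arrays of shape (N, 3) with corresponding pixels
    diff = region_rgb.astype(np.float64) - reference_rgb.astype(np.float64)
    # Per-pixel color difference Cd = sqrt(dR^2 + dG^2 + dB^2), then summed
    return np.sqrt((diff ** 2).sum(axis=1)).sum()

def pick_second_region(mask_regions_rgb, first_region_rgb):
    # The mask region with the smallest color difference sum is the second region
    sums = [color_difference_sum(m, first_region_rgb) for m in mask_regions_rgb]
    return int(np.argmin(sums)), min(sums)
```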
In some possible embodiments, if both the right view (referred to as the first image) and the left view (referred to as the second image) are grayscale images, the image processing apparatus may obtain the sum of the gray differences between the gray value of each pixel point in each mask region and that of the corresponding pixel point in the first region, and determine the mask region whose sum of gray differences from the first region is the smallest as the second region. For example, assume that the 100th mask region contains the 5 pixel points A_100, B_100, C_100, D_100, E_100 and the first region contains the 5 pixel points A'_30, B'_30, C'_30, D'_30, E'_30, where A_100 corresponds to A'_30, B_100 to B'_30, C_100 to C'_30, D_100 to D'_30, and E_100 to E'_30. The image processing apparatus may calculate the sum of gray differences GreyD between the 5 pixel points A_100, B_100, C_100, D_100, E_100 in the 100th mask region and the corresponding pixel points A'_30, B'_30, C'_30, D'_30, E'_30 in the first region:
GreyD = |Grey_A - Grey_A'| + |Grey_B - Grey_B'| + |Grey_C - Grey_C'| + |Grey_D - Grey_D'| + |Grey_E - Grey_E'|     (3)
where |Grey_A - Grey_A'| denotes the absolute difference between the gray value Grey_A of pixel point A_100 and the gray value Grey_A' of pixel point A'_30, and the remaining terms likewise denote the absolute differences between the gray values of B_100 and B'_30, C_100 and C'_30, D_100 and D'_30, and E_100 and E'_30. The image processing apparatus may calculate the sum of gray differences GreyD between the gray values of each mask region and the first region, and determine the mask region with the smallest sum of gray differences GreyD_min from the first region as the second region. In this embodiment of the application, if both the left view and the right view are grayscale images, the image processing apparatus may determine the pixel difference from the absolute differences between the gray values of pixel points in the first region and the mask region; the image processing method of this embodiment therefore also applies to grayscale images, has a wide application range, and provides a more complete image processing method.
In some possible embodiments, if the right view (referred to as the first image) is a color image and the left view (referred to as the second image) is a grayscale image, the image processing apparatus may, for each pixel point in the first region, convert the color value of the pixel point into a gray value, then obtain the sum of gray differences between the gray value of each pixel point in each mask region and that of the corresponding pixel point in the first region, and determine the mask region whose sum of gray differences from the first region is the smallest as the second region. For example, the image processing apparatus may convert the color value of each pixel point in the first region into the corresponding gray value according to the RGB-to-grayscale conversion formula Gray = R x 0.299 + G x 0.587 + B x 0.114, then calculate the sum of gray differences GreyD between each mask region and the first region using formula (3), and determine the mask region with the smallest sum of gray differences from the first region as the second region. In this embodiment of the application, if one of the left and right views is a color image and the other is a grayscale image, for example the right view is a color image and the left view is a grayscale image, the image processing apparatus may convert the color values of the first region in the right view into gray values and then determine the pixel difference from the absolute differences between the gray values of the first region and the mask region. The image processing method provided by this embodiment therefore also applies to the case where one image is color and the other is grayscale, which further extends the scope of application.
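A short sketch of this mixed color/grayscale case, assuming the first region is an N x 3 RGB array and each mask region is a length-N gray array of corresponding pixels; the luminance weights are the ones quoted above:

```python
import numpy as np

def rgb_to_gray(region_rgb):
    # Gray = R * 0.299 + G * 0.587 + B * 0.114
    weights = np.array([0.299, 0.587, 0.114])
    return region_rgb.astype(np.float64) @ weights

def gray_difference_sum(region_gray, reference_gray):
    # Formula (3): sum of absolute gray differences of corresponding pixel points
    return np.abs(region_gray.astype(np.float64) - reference_gray).sum()

def pick_second_region_mixed(mask_regions_gray, first_region_rgb):
    first_gray = rgb_to_gray(first_region_rgb)
    sums = [gray_difference_sum(m, first_gray) for m in mask_regions_gray]
    return int(np.argmin(sums))
```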
In the embodiment of the present application, for the step S103, there may be some possible implementation manners as follows:
in some possible embodiments, for the manner of acquiring the mask region shown in FIG. 5A, the image processing apparatus may determine a first reference point in the first region and a corresponding second reference point in the second region, calculate the lateral position of the first reference point on the right view and the lateral position of the second reference point on the left view, and then calculate the horizontal displacement difference between these two lateral positions. If the first reference point is the center of gravity of the first region, the second reference point is the center of gravity of the second region; alternatively, the first reference point may be the leftmost (or rightmost) pixel point of the first region and the second reference point the leftmost (or rightmost) pixel point of the second region. As shown in FIG. 6A, FIG. 6A is a schematic diagram of a horizontal displacement difference. The image processing apparatus determines the center of gravity of the first region a1 as the first reference point RP1 according to the shape of the first region a1; since the first region a1 shown in FIG. 6A is a circle, its center of gravity is the center of the circle. Similarly, the image processing apparatus determines the second reference point RP2 in the second region b1. The image processing apparatus calculates that the first reference point RP1 is P1 = 79 pixel points from the leftmost edge of the right view and that the second reference point RP2 is P2 = 103 pixel points from the leftmost edge of the left view, and then obtains the absolute difference of 24 between P1 and P2. Finally, the absolute difference of 24 pixel points is converted into a horizontal displacement difference. The conversion from pixel points to centimeters may be: real size (inches) = pixels / resolution, with 1 inch = 2.54 cm.
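A small sketch of this pixel-to-physical-distance conversion using the reference-point positions in the example above; the 96 DPI resolution is an arbitrary assumption used only for illustration:

```python
def displacement_cm(p1_px, p2_px, resolution_dpi):
    # Absolute pixel difference between the two reference-point positions
    diff_px = abs(p1_px - p2_px)
    # real size (inches) = pixels / resolution, and 1 inch = 2.54 cm
    return diff_px / resolution_dpi * 2.54

# RP1 is 79 px from the left edge of the right view, RP2 is 103 px in the left view
print(displacement_cm(79, 103, resolution_dpi=96))  # -> 0.635 cm
```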
In some possible embodiments, for the manner of obtaining the mask regions shown in fig. 5B, the image processing device records the number of pixel points by which the mask window corresponding to each of the at least one mask region has been translated on the left view. The image processing apparatus may calculate the lateral position of the first region in the right view, and may take the number of pixel points by which the mask window corresponding to the second region has been translated on the left view as the lateral position of the second region in the left view. The image processing apparatus may then calculate the horizontal displacement difference between the lateral position of the first region in the right view and the lateral position of the second region in the left view. Fig. 6B is a schematic diagram of another horizontal displacement difference. The distance between the leftmost pixel point of the first area a1 and the leftmost edge of the right view R is R1 = 100 pixel points, that is, the lateral position of the first area a1 on the right view R is 100 pixel points. The mask window A2 corresponding to the second area b2 is translated rightward from the leftmost position of the left view L, and the number of pixel points translated when c1 in A2 coincides with b2 is recorded as R2 = 135, that is, the lateral position of the second area b2 on the left view L is 135 pixel points. The image processing device calculates the absolute difference of 35 pixel points between the 100 pixel points of the first area's lateral position in the right view and the 135 pixel points of the second area's lateral position in the left view. The image processing apparatus may obtain the resolution of the right view/left view and convert the absolute difference of 35 pixel points into a horizontal displacement difference according to the resolution.
Similarly, the distance between the rightmost pixel point of the first area a1 and the rightmost edge of the right view is R1 = 120 pixel points, that is, the lateral position of the first area a1 on the right view R is 120 pixel points. The mask window A2 corresponding to the second area b2 is translated leftward from the rightmost position of the left view L, and the number of pixel points translated when c1 in A2 coincides with b2 is recorded as R2 = 85, that is, the lateral position of the second area b2 on the left view L is 85 pixel points. The image processing device calculates the absolute difference of 35 pixel points between the 120 pixel points of the first area's lateral position in the right view and the 85 pixel points of the second area's lateral position in the left view. The image processing apparatus may obtain the resolution of the right view/left view and convert the absolute difference of 35 pixel points into a horizontal displacement difference according to the resolution.
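The variant of fig. 6B can be sketched in the same spirit: the lateral position of the first region is measured directly on the right view, while the lateral position of the second region is simply the translation recorded for the mask window on the left view (the matching step is assumed to have been done already). As before, `pixels_per_inch` is an illustrative parameter.

```python
import numpy as np

def leftmost_offset(region_mask):
    # Pixel points between the leftmost pixel of the region and the
    # left edge of its view
    ys, xs = np.nonzero(region_mask)
    return int(xs.min())

def displacement_from_translation(first_region_mask, second_region_shift,
                                  pixels_per_inch):
    # second_region_shift is the number of pixel points the mask window was
    # translated on the left view when it coincided with the second region
    # (R2 = 135 in the example above); R1 is measured on the right view
    r1 = leftmost_offset(first_region_mask)        # R1 = 100 in the example
    diff_pixels = abs(r1 - second_region_shift)    # 35 pixel points
    return diff_pixels / pixels_per_inch * 2.54    # convert pixels to cm
```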
In some possible embodiments, for the manner of obtaining the mask region shown in fig. 5C, the image processing device records the number of pixel points by which the mask window A3 corresponding to each mask region b3 has been translated on the left view. The image processing device may obtain the number of pixel points by which the mask window A3 corresponding to the second area has been translated on the left view, and convert this number of pixel points into a horizontal displacement difference. Fig. 6C is a schematic diagram of yet another horizontal displacement difference. The mask window A3 corresponding to the second area b3 is translated rightward starting from the position where the four vertices of A3 are aligned with the four vertices of the left view L, and the number of pixel points translated when c1 in A3 coincides with b3 is recorded as Pa = 35. The image processing device acquires Pa and converts Pa = 35 pixel points into a horizontal displacement difference.
In the embodiment of the present application, for the step S104, there may be some possible implementation manners as follows:
in some possible embodiments, the right camera and the left camera of the shooting device have the same shooting focal length. The image processing apparatus may acquire the shooting focal length used when shooting the right view and/or the left view, and may further acquire the separation distance between the right camera (referred to as the first camera) and the left camera (referred to as the second camera). The image processing apparatus may calculate the product of the shooting focal length and the separation distance, and determine the quotient of that product divided by the calculated horizontal displacement difference as the distance between the target object and the shooting device. For example, let Z denote the distance between the target object and the shooting device, F the shooting focal length of the shooting device, B the separation distance between the right camera and the left camera, and d the horizontal displacement difference calculated above; then the distance between the target object and the shooting device is Z = B × F / d. In the embodiment of the present application the horizontal displacement difference is calculated only once, without calculating the horizontal displacement difference of every pair of corresponding pixel points in the right view and the left view, and the distance between the target object and the shooting device is determined from this horizontal displacement difference, so the amount of calculation is reduced and the working efficiency is improved.
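Purely as an illustration of the formula above: F, B and d must be expressed in consistent length units so that Z is returned in the same unit as B.

```python
def distance_to_target(focal_length_f, separation_b, displacement_d):
    # Z = B * F / d; a zero disparity would mean the target is effectively
    # at infinity, so it is rejected here
    if displacement_d == 0:
        raise ValueError("horizontal displacement difference is zero")
    return separation_b * focal_length_f / displacement_d
```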
In the embodiment of the present application, the image processing apparatus acquires a first region in the right view (referred to as the first image) and then acquires a second region in the left view. The first region is the image region corresponding to the target object in the right view; the second region has the same shape and size as the first region, the pixel difference between the second region and the first region is smaller than a pixel difference threshold, and the second region is the image region corresponding to the target object in the left view. The image processing device then calculates the horizontal displacement difference between the position of the first region in the right view and the position of the second region in the left view, and determines the distance between the target object and the shooting device according to this horizontal displacement difference, the horizontal displacement difference being the parallax between the first region and the second region. In the embodiment of the present application, the horizontal displacement difference between the position of the target object on the right view and its position on the left view is used to represent the horizontal displacement difference between the right view and the left view, so that the horizontal displacement differences of corresponding pixel points of the right view and the left view need not be calculated, which reduces the amount of calculation and improves the working efficiency.
The method of the embodiment of the present application is explained in detail above, and in order to better implement the above-mentioned scheme of the embodiment of the present application, the embodiment of the present application further provides a corresponding apparatus.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application. As shown in fig. 7, the image processing apparatus 70 may include:
a first obtaining module 701, configured to obtain a first region in a first image. The first area is an image area corresponding to a target object in the first image.
A second acquiring module 702 is configured to acquire a second region in a second image. The second area is the same as the first area in shape and size, the pixel difference between the second area and the first area is smaller than a pixel difference threshold value, and the second area is an image area corresponding to the target object in the second image.
A calculating module 703, configured to calculate a horizontal displacement difference between the position of the first region in the first image acquired by the first acquiring module 701 and the position of the second region in the second image acquired by the second acquiring module 702. Wherein, the horizontal displacement difference is used for representing the parallax between the first area and the second area.
A determining module 704, configured to determine a distance between the target object and the shooting device according to the horizontal displacement difference calculated by the calculating module 703.
Wherein the first image is an image captured by a first camera of the shooting device, the second image is an image captured by a second camera of the shooting device, and the first camera and the second camera are located on a horizontal line.
In some possible embodiments, the first obtaining module 701 is specifically configured to perform image segmentation on the first image to obtain the first region.
In some possible embodiments, the first obtaining module 701 is specifically configured to determine a region jointly formed by a plurality of target pixel points in the first image as the first region. And the characteristic value of the target pixel point is within the range of the target threshold value.
In some possible embodiments, the first obtaining module 701 is further configured to determine a reference region corresponding to the target object in the first image, obtain feature values of a plurality of pixel points in the reference region, and determine a target threshold range for image segmentation according to the feature values of the plurality of pixel points in the reference region. Wherein, the reference area is used for reflecting the preliminary positioning of the target object in the first image.
In some possible embodiments, the characteristic value includes a color value, a gray value, or a depth reference value, and the depth reference value is used to indicate a reference distance between the pixel point and the photographing device.
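As a sketch of one way the first obtaining module's segmentation could be realized when the feature value is a gray value, the target threshold range is first derived from the pixel points of a rough reference region and then applied to the whole first image; the `margin` parameter is purely hypothetical and not taken from the embodiment.

```python
import numpy as np

def target_threshold_range(image_gray, reference_mask, margin=10):
    # Derive the target threshold range from the feature values (here gray
    # values) of the pixel points inside the preliminary reference region
    values = image_gray[reference_mask]
    return int(values.min()) - margin, int(values.max()) + margin

def first_region_mask(image_gray, threshold_range):
    # All pixel points whose feature value lies inside the target threshold
    # range jointly form the first region
    lo, hi = threshold_range
    return (image_gray >= lo) & (image_gray <= hi)
```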
In some possible embodiments, the second obtaining module 702 is specifically configured to obtain one or more mask regions in the second image, where each of the mask regions has a same shape and size as the first region, and determine, as the second region, a mask region having a pixel difference from the first region smaller than a pixel difference threshold according to a pixel difference between each of the mask regions and the first region.
In some possible embodiments, the second obtaining module 702 is specifically configured to determine a mask window corresponding to the first region according to the first region, level upper and lower edges of the mask window with upper and lower edges of the second image, and then translate the mask window to obtain one or more mask regions in the second image. Wherein, the size and shape of the mask window are the same as those of the first image.
In some possible embodiments, the pixel difference between each of the mask regions and the first region is a sum of color differences between each pixel point in each of the mask regions and a corresponding pixel point in the first region.
In some possible embodiments, the pixel difference between each of the mask regions and the first region is a sum of gray-scale differences between each pixel point in each of the mask regions and a corresponding pixel point in the first region.
In some possible embodiments, the image processing apparatus 70 further includes: a converting module 705, configured to convert the color value of each pixel point in the first region acquired by the first acquiring module 701 to obtain a gray value of each pixel point.
In some possible embodiments, the second obtaining module 702 is specifically configured to determine, as the second region, a mask region with the smallest pixel difference from the first region according to the pixel difference between each mask region and the first region.
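A compact sketch of the sliding-mask search performed by the second obtaining module, assuming rectified grayscale right and left views of equal size and a boolean mask for the first region; restricting the search to rightward translations (the usual direction when matching a right-view region in a left view) is an assumption of this sketch, not a statement of the embodiment.

```python
import numpy as np

def find_second_region_shift(right_gray, left_gray, first_region_mask):
    # For every horizontal translation d, compare the first region's gray
    # values with the left-view pixels at the same rows but shifted columns,
    # and keep the translation with the smallest gray difference sum
    ys, xs = np.nonzero(first_region_mask)
    first_vals = right_gray[ys, xs].astype(np.float64)
    width = left_gray.shape[1]
    best_d, best_diff = 0, np.inf
    for d in range(width - int(xs.max())):
        diff = np.abs(left_gray[ys, xs + d].astype(np.float64) - first_vals).sum()
        if diff < best_diff:
            best_d, best_diff = d, diff
    # The pixel points at columns xs + best_d form the candidate second region
    return best_d, best_diff
```

In practice the returned best_diff could additionally be compared against the pixel difference threshold before the match is accepted as the second region.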
In some possible embodiments, the determining module 704 is specifically configured to obtain a shooting focal length when the first image and/or the second image are shot, obtain a separation distance between the first camera and the second camera, and determine a distance between the target object and the shooting device according to the shooting focal length, the separation distance, and the horizontal displacement difference.
In some possible embodiments, the determining module 704 is further specifically configured to calculate a product of the shooting focal length and the separation distance, and determine a quotient obtained by dividing the product of the shooting focal length and the separation distance by the horizontal displacement difference as the distance between the target object and the shooting device.
In a specific implementation, each module may further correspond to the description of the method embodiment shown in fig. 3 and perform the methods and functions described in the foregoing embodiment.
Referring to fig. 8, fig. 8 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present application. As shown in fig. 8, the image processing apparatus 100 may include: a processor 110 and a memory 120 (one or more computer-readable storage media). These components may communicate over one or more communication buses 130.
The processor 110 may include an Application Processor (AP) and an Image Signal Processor (ISP). The AP and the ISP may be two relatively independent components or may be integrated on one integrated chip.
The memory 120 is coupled to the processor 110 for storing various software programs and/or sets of instructions. In particular implementations, memory 120 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The memory 120 may store an operating system (hereinafter referred to simply as a system), such as WINDOWS, LINUX, ANDROID, IOS, etc. The memory 120 may also store a network communication program that may be used to communicate with one or more additional devices, one or more terminal devices, one or more network devices. The memory 120 may further store a user interface program, which may vividly display the content of the application program through a graphical operation interface, and receive a control operation of the application program from a user through input controls such as application icons, menus, dialog boxes, and buttons. The memory 120 may also store one or more application programs. As shown in fig. 8, these applications may include: cameras, galleries, and other applications, etc. In the present application, the memory 120 may be used to store a computer program implementing the image processing method shown in fig. 3. The processor 110 calls the computer program stored in the memory 120 to implement the image processing method shown in fig. 3.
In some possible implementations, the image processing apparatus 100 may further include a communication component 140, a power management component 150, and a peripheral system (I/O) 160. The communication component 140 may control the communication connection between the image processing apparatus 100 and other communication devices, and may include radio frequency components, cellular components, and the like; it may provide a wireless communication function by means of radio frequency. Alternatively, the communication component 140 may include a network interface, a modulator/demodulator (modem), and the like for connecting the image processing apparatus 100 to a network (e.g., the Internet, a local area network, a wide area network, a telecommunication network, a cellular network, a satellite network, plain old telephone service, and the like).
The power management component 150 is mainly used to supply stable, high-precision voltages to the processor 110, the memory 120, the communication component 140, and the peripheral system 160.
The peripheral system (I/O) 160 is mainly used to implement interaction between the image processing apparatus 100 and the user/external environment, and mainly includes the input and output devices of the image processing apparatus 100. In a specific implementation, the peripheral system (I/O) 160 may include a plurality of (here, two or more) camera controllers, such as the camera controller 1, the camera controller 2, and the camera controller 3 shown in fig. 8, each of which may be coupled to its corresponding peripheral device, namely the camera 1, the camera 2, and the camera 3 respectively. In some possible embodiments, when capturing an image, the camera 1 and the camera 2 are located on a horizontal line, the camera 1 may be a color camera, and the camera 2 may be a black-and-white camera. In practice, the peripheral system (I/O) 160 may also include other I/O peripherals, which are not limited herein.
In some possible embodiments, when an image is taken, the camera 1 and the camera 2 are located on a horizontal line: the camera controller 1 controls the camera 1 to transmit the collected image signal to the ISP, and the ISP processes the received image signal to form the first image; similarly, the camera controller 2 controls the camera 2 to transmit the acquired image signal to the ISP, and the ISP processes the received image signal to form the second image. The ISP transmits the first image and the second image to the AP for the image processing described in the embodiment of fig. 3. In a specific implementation, if the cameras 1 and 2 are arranged side by side (left and right) on the image processing device, shooting in portrait orientation places them on a horizontal line; if they are arranged one above the other, shooting in landscape orientation places them on a horizontal line. The embodiment of the present application does not limit the physical positional relationship between the cameras 1 and 2 on the image processing device. The camera controller 1 and the camera controller 2 control the camera 1 and the camera 2 to acquire image signals simultaneously through a synchronization mechanism. In some possible embodiments, the AP or the ISP sends control instructions to the camera controller 1 and the camera controller 2 at the same time, and the control instructions are used for controlling the camera 1 and the camera 2 to acquire image signals simultaneously.
Those skilled in the art can understand that all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer readable storage medium and can include the processes of the method embodiments described above when executed. And the aforementioned storage medium includes: various media capable of storing program codes, such as ROM or RAM, magnetic or optical disks, etc.

Claims (24)

1. An image processing method, comprising:
acquiring a first area in a first image, wherein the first area is an image area corresponding to a target object in the first image;
acquiring a second area in a second image, wherein the shape and the size of the second area are the same as those of the first area, the pixel difference between the second area and the first area is smaller than a pixel difference threshold value, and the second area is an image area corresponding to the target object in the second image;
calculating a horizontal displacement difference between the position of the first area in the first image and the position of the second area in the second image, and determining the distance between the target object and the shooting equipment according to the horizontal displacement difference, wherein the horizontal displacement difference is the parallax between the first area and the second area;
the first image is an image shot by a first camera of the shooting equipment, the second image is an image shot by a second camera of the shooting equipment, and the first camera and the second camera are located on a horizontal line.
2. The method of claim 1, wherein acquiring the first region in the first image comprises:
determining a region formed by a plurality of target pixel points in the first image as the first region, wherein the characteristic value of the target pixel point is within a target threshold range.
3. The method according to claim 2, wherein before determining a region collectively composed of a plurality of target pixels in the first image as the first region, further comprising:
determining a reference region corresponding to the target object in the first image;
obtaining characteristic values of a plurality of pixel points in the reference region;
and determining the target threshold range for image segmentation according to the characteristic values of a plurality of pixel points in the reference region.
4. The method according to claim 2 or 3, wherein the characteristic value comprises a color value, a gray value or a depth reference value, and the depth reference value is used to represent a reference distance between a pixel point and the photographing device.
5. The method of claim 1, wherein acquiring the second region in the second image comprises:
acquiring one or more mask regions in the second image, each of the one or more mask regions being the same in shape and size as the first region;
and determining the mask area with the pixel difference with the first area smaller than a pixel difference threshold value as the second area according to the pixel difference of each mask area with the first area.
6. The method of claim 5, wherein said acquiring one or more mask regions in the second image comprises:
determining a mask window corresponding to the first area according to the first area;
respectively leveling the upper edge and the lower edge of the mask window with the upper edge and the lower edge of the second image and then translating in the horizontal direction to obtain one or more mask regions in the second image;
wherein the mask window is the same size and shape as the first image.
7. The method according to claim 5 or 6, wherein the pixel difference between each mask region and the first region is a sum of color differences between each pixel point in each mask region and a corresponding pixel point in the first region.
8. The method according to claim 5 or 6, wherein the pixel difference between each mask region and the first region is a sum of gray level differences of gray levels of each pixel point in each mask region and a corresponding pixel point in the first region.
9. The method according to claim 8, wherein before determining a mask region having a pixel difference from the first region less than a pixel difference threshold as a second region according to a pixel difference of each mask region from the first region, further comprising:
and converting the color value of each pixel point in the first area to obtain the gray value of each pixel point.
10. The method according to any one of claims 1-3, wherein said determining a distance between the target object and a photographing apparatus according to the horizontal displacement difference comprises:
acquiring a shooting focal length when the first image and/or the second image is shot;
acquiring a spacing distance between the first camera and the second camera;
and determining the distance between the target object and the shooting equipment according to the shooting focal length, the spacing distance and the horizontal displacement difference.
11. The method of claim 10, wherein the determining the distance between the target object and the photographing apparatus according to the photographing focal length, the separation distance, and the horizontal displacement difference comprises:
calculating the product of the shooting focal length and the spacing distance;
and determining the quotient obtained by dividing the product of the shooting focal length and the separation distance by the horizontal displacement difference as the distance between the target object and the shooting device.
12. An image processing apparatus characterized by comprising:
the first acquisition module is used for acquiring a first area in a first image, wherein the first area is an image area corresponding to a target object in the first image;
a second obtaining module, configured to obtain a second region in a second image, where the second region has a same shape and size as the first region, and a pixel difference between the second region and the first region is smaller than a pixel difference threshold, and the second region is an image region corresponding to the target object in the second image;
a calculating module, configured to calculate a horizontal displacement difference between the position of the first region in the first image acquired by the first acquiring module and the position of the second region in the second image acquired by the second acquiring module, where the horizontal displacement difference is used to represent a parallax between the first region and the second region;
the determining module is used for determining the distance between the target object and the shooting equipment according to the horizontal displacement difference calculated by the calculating module;
the first image is an image shot by a first camera of the shooting device, the second image is an image shot by a second camera of the shooting device, and the first camera and the second camera are located on a horizontal line.
13. The image processing apparatus according to claim 12, wherein the first obtaining module is specifically configured to determine, as the first region, a region that is jointly composed of a plurality of target pixels in the first image, where feature values of the target pixels are within a target threshold range.
14. The image processing apparatus of claim 13, wherein the first obtaining module is further configured to:
determining a reference region corresponding to the target object in the first image;
obtaining characteristic values of a plurality of pixel points in the reference region;
and determining a target threshold range for image segmentation according to the characteristic values of a plurality of pixel points in the reference region.
15. The apparatus according to claim 13 or 14, wherein the feature value includes a color value, a grayscale value, or a depth reference value, and the depth reference value is used to represent a reference distance between a pixel point and the photographing device.
16. The image processing apparatus according to claim 12, wherein the second obtaining module is specifically configured to:
obtaining one or more mask regions in the second image, each of the one or more mask regions being the same in shape and size as the first region;
and determining the mask area with the pixel difference from the first area smaller than a pixel difference threshold value as the second area according to the pixel difference of each mask area from the first area.
17. The image processing apparatus according to claim 16, wherein the second obtaining module is specifically configured to:
determining a mask window corresponding to the first area according to the first area;
respectively leveling the upper edge and the lower edge of the mask window with the upper edge and the lower edge of the second image and then translating to obtain the one or more mask regions in the second image;
wherein the mask window is the same size and shape as the first image.
18. The image processing device according to claim 16 or 17, wherein the pixel difference between each mask region and the first region is a sum of color differences between each pixel point in each mask region and a corresponding pixel point in the first region.
19. The image processing device according to claim 16 or 17, wherein the pixel difference between each mask region and the first region is a sum of gray level differences of gray levels of each pixel point in each mask region and a corresponding pixel point in the first region.
20. The image processing apparatus according to claim 19, characterized by further comprising:
and the conversion module is used for converting the color value of each pixel point in the first area acquired by the first acquisition module to obtain the gray value of each pixel point.
21. The image processing apparatus according to any of claims 12 to 14, wherein the determining module is specifically configured to:
acquiring a shooting focal length when the first image and/or the second image is shot;
acquiring a spacing distance between the first camera and the second camera;
and determining the distance between the target object and the shooting equipment according to the shooting focal length, the spacing distance and the horizontal displacement difference.
22. The image processing apparatus according to claim 21, wherein the determining module is specifically configured to:
calculating the product of the shooting focal length and the spacing distance;
and determining the quotient obtained by dividing the product of the shooting focal length and the separation distance by the horizontal displacement difference as the distance between the target object and the shooting device.
23. An image processing apparatus comprising a processor, a memory, wherein the memory is configured to store program code, and the processor is configured to invoke the program code, and when the program code is executed, the processor is configured to perform the method of any of claims 1-11.
24. A computer storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method according to any one of claims 1-11.
CN201810562100.1A 2018-05-31 2018-05-31 Image processing method and device Active CN110555874B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810562100.1A CN110555874B (en) 2018-05-31 2018-05-31 Image processing method and device

Publications (2)

Publication Number Publication Date
CN110555874A CN110555874A (en) 2019-12-10
CN110555874B (en) 2023-03-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant