CN112150355B - Image processing method and related equipment


Info

Publication number
CN112150355B
Authority
CN
China
Prior art keywords
images
image
server
cameras
adjacent
Prior art date
Legal status
Active
Application number
CN201910563110.1A
Other languages
Chinese (zh)
Other versions
CN112150355A (en)
Inventor
李静
杨涛
曾建洪
肖晶
魏朦
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201910563110.1A
Priority to PCT/CN2020/097495
Publication of CN112150355A
Application granted
Publication of CN112150355B
Status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformation in the plane of the image
    • G06T3/40: Scaling the whole image or part thereof
    • G06T3/4038: Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G06T3/4053: Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T5/00: Image enhancement or restoration
    • G06T5/20: Image enhancement or restoration by the use of local operators
    • G06T7/00: Image analysis
    • G06T7/30: Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33: Image registration using feature-based methods
    • G06T2200/00: Indexing scheme for image data processing or generation, in general
    • G06T2200/32: Indexing scheme involving image mosaicing
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/20: Special algorithmic details
    • G06T2207/20024: Filtering details

Abstract

An embodiment of the invention discloses an image processing method and related devices. The method comprises the following steps: a first server obtains N image transformation relations, where the N image transformation relations correspond one-to-one to N third images, any two adjacent third images have an image overlapping area at their adjacent position, and each image transformation relation characterizes the image overlapping area between the corresponding third image and its adjacent images; the first server receives N first images, where any two adjacent first images have an image overlapping area at their adjacent position; the first server processes the corresponding first images according to the N image transformation relations respectively to obtain N second images, where no image overlapping area exists between any two adjacent second images, and the N second images form a first panoramic image according to the arrangement of the N cameras. By implementing the above embodiment, an ultra-high-resolution image of hundred-million-pixel order can be obtained in real time.

Description

Image processing method and related equipment
Technical Field
The present invention relates to image processing technologies in the field of communications, and in particular, to an image processing method and related devices.
Background
With the development of video imaging devices and the improvement of computer performance, the demand for high-resolution, low-delay, long-distance, wide-viewing-angle video keeps growing. For example, in the field of intelligent video surveillance, high-quality video data with a wide monitoring range and a long monitoring distance, containing information such as faces and license plates, is often required. In addition, high-quality video sources enable richer and more varied tasks through intelligent information mining, strongly promoting the development of smart cities.
In images captured by a conventional camera, nearby targets appear large and distant targets appear small. To see a distant target clearly, the angle of view can only be narrowed, which leaves a large blind area nearby; if the angle of view is widened to increase nearby coverage, distant targets become unclear, so the two cannot be achieved at once. If a single camera is required both to resolve distant targets and to keep a sufficiently wide angle of view, its resolution must be raised. Most cameras on the market offer resolutions of millions or tens of millions of pixels; reaching a higher resolution, such as hundred-million-pixel order, usually requires a more complex and costly manufacturing process.
A conventional approach to hundred-million-pixel imaging is described below.
In conventional scheme one: first, the picture range to be shot is set, which can be determined by specifying the upper-left and lower-right corners of the picture; then, a dedicated pan-tilt robot with a telephoto lens automatically calculates the number of rows and columns that need to be covered, scans automatically, and sequentially shoots tens or hundreds of partial images; finally, software automatically stitches the partial images, yielding a hundred-million-pixel ultra-high-resolution image.
However, although this method can produce hundred-million-pixel super-resolution images, it actually has to shoot dozens or hundreds of partial images sequentially and then stitch them into the super-resolution image by software. Its processing speed is low and cannot meet the real-time requirements of practical applications, whereas the video surveillance field requires images in real time.
Another conventional approach to hundred-million-pixel imaging is described below.
In conventional scheme two: first, a camera array formed by a plurality of cameras shoots the target area, so that a plurality of images are obtained at once; then the images are sent to a server for stitching to obtain a stitched image. However, if each image is compressed before being sent to the server for stitching, the real-time requirement of the video surveillance field can be met, but the resolution of the stitched image is low because the images were compressed; if the images are sent uncompressed, a hundred-million-pixel ultra-high-resolution image can be obtained, but the larger data volume makes image transmission and stitching take longer, so the real-time requirement of the video surveillance field cannot be met.
Therefore, in application scenarios with high requirements on real-time image display, such as the video surveillance field, conventional scheme two cannot satisfy the real-time and ultra-high-resolution requirements at the same time, so lossless display of hundred-million-pixel video streams cannot be achieved.
How to economically realize ultra-high-resolution (e.g., hundred-million-pixel) video presentation therefore remains a serious technical challenge.
Disclosure of Invention
The embodiments of the invention provide an image processing method and related devices, which can economically realize real-time imaging of ultra-high-resolution video (e.g., hundred-million-pixel resolution).
In a first aspect, an embodiment of the present invention provides an image processing method, including: a first server obtains N image transformation relations, where the N image transformation relations correspond one-to-one to N third images, the N third images are obtained by N cameras of a camera array synchronously shooting N second areas respectively, the arrangement of the N second areas is consistent with the arrangement of the N cameras of the camera array, any two adjacent third images among the N third images have an image overlapping area at their adjacent position, each of the N image transformation relations characterizes the image overlapping area between the corresponding third image and its adjacent images, the adjacent images being all images adjacent to the corresponding third image among the N third images, and N is an integer greater than 1; the first server receives N first images, where the N first images are obtained by the N cameras synchronously shooting N first areas respectively, the arrangement of the N first areas is consistent with the arrangement of the N cameras, and any two adjacent first images among the N first images have an image overlapping area at their adjacent position; the N image transformation relations correspond one-to-one to the N first images, and the first server processes the corresponding first images according to the N image transformation relations respectively to obtain N second images, where no image overlapping area exists between any two adjacent second images among the N second images, and the N second images form a first panoramic image according to the arrangement of the N cameras.
Since the first server has obtained the N image transformation relations in advance, after receiving the N first images it can directly transform them according to the N image transformation relations, thereby obtaining N second images that can be seamlessly stitched into a panoramic image, and send the N second images to N display devices for real-time display. In transforming the N first images, the step of calculating the image transformation relations is saved, which raises the image transformation rate and reduces resource overhead without affecting image quality. By implementing this embodiment, real-time video imaging at hundred-million-pixel order can be realized economically, efficiently, and with good timeliness.
In some implementations, the first server obtains N image transformation relationships, including: the first server fuses the N third images to obtain a second panoramic image; the first server segments the second panoramic image to obtain N fourth images; the N third images and the N fourth images are in one-to-one correspondence, and the first server generates the N image transformation relations according to the N third images and the corresponding fourth images.
In some implementations, the first server performs fusion according to the N third images to obtain a second panoramic image, including: the first server extracts feature points from a preset part of images in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; the first server registers according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and the first server splices the N third images according to the characteristic point pairs to obtain the second panoramic image.
In the implementation manner, the first server only extracts the feature points of the preset partial images in each third image, so that the range of feature extraction is reduced, and the efficiency of feature extraction is improved.
In some implementations, the first server is a server in a distributed system, the distributed system further comprising a second server, the camera array further comprising M cameras, wherein the first server corresponds to the N cameras of the camera array, the second server corresponds to the M cameras of the camera array, M is an integer greater than 1; the first server acquires N image transformation relationships, including: the first server fuses the N third images and the M third images to obtain a third panoramic image; the M third images are obtained by respectively shooting M second areas synchronously by the M cameras, the arrangement mode of the M second areas is consistent with that of the M cameras, and an image overlapping area exists at the adjacent position of any adjacent two third images in the N third images and the M third images; the first server segments the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one; the first server generates the N image transformation relations according to the N third images and the N fourth images;
The method further comprises the steps of: the first server generates M image transformation relations according to the M third images and the M fourth images respectively; the first server sends the M image transformation relations to the second server; the M image transformation relations are in one-to-one correspondence with the M first images, the M image transformation relations are used for the second server to process the corresponding first images respectively, so that M second images are obtained, the M first images are obtained by respectively shooting M first areas synchronously by the M cameras, the arrangement mode of the M first areas is consistent with that of the M cameras of the camera array, an image overlapping area exists at any adjacent position of two first images in the M first images, an image overlapping area does not exist between any adjacent two second images in the M second images, and the M second images form a fourth panoramic image according to the arrangement mode of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
It can be seen that in the embodiment of the present invention, the first server calculates in advance the N image transformation relations corresponding to the N cameras and the M image transformation relations corresponding to the M cameras, stores the N image transformation relations locally, and sends the M image transformation relations to the second server. In the application stage, the first server performs real-time image transformation according to the locally stored N image transformation relations and sends the transformed images to the display devices for real-time imaging, while the second server likewise performs real-time image transformation on the images captured by the M cameras according to the M image transformation relations received in advance and sends the transformed images to the display devices. In transforming the images, the first server and the second server skip operations such as image feature point extraction, feature point registration, and image stitching, which raises the image transformation rate and reduces resource overhead without affecting image quality. In addition, the scheme adopts a distributed design; on this basis, real-time imaging of hundred-million-pixel video can be conveniently realized by scaling out to a large number of servers, and the method is economical, efficient, and timely.
In some implementations, the first server performs fusion according to the N third images and the M third images to obtain a third panoramic image, including: the first server extracts feature points from a preset part of images in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; the first server registers according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and the first server splices the N third images and the M third images according to the characteristic point pairs to obtain the third panoramic image.
In the implementation manner, the first server only extracts the feature points of the preset partial images in each third image, so that the range of feature extraction is reduced, and the efficiency of feature extraction is improved.
In a second aspect, an embodiment of the present invention provides a server for image processing, the server including:
The acquisition module is used for acquiring N image transformation relations; the N image transformation relations are in one-to-one correspondence with N third images, the N third images are obtained by respectively shooting N second areas of N cameras of a camera array in a synchronous manner, the arrangement mode of the N second areas is consistent with that of the N cameras of the camera array, any adjacent two third images in the N third images are provided with image overlapping areas at adjacent positions, each image transformation relation in the N image transformation relations represents an image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images, and N is an integer larger than 1;
the communication module is used for receiving N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
the N image transformation relations are in one-to-one correspondence with the N first images, and the image transformation module is used for respectively processing the corresponding first images according to the N image transformation relations to obtain N second images; and an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
In some implementations, the acquiring module is specifically configured to: fusing according to the N third images to obtain a second panoramic image; segmenting the second panoramic image to obtain N fourth images; the N third images and the N fourth images are in one-to-one correspondence, and the N image transformation relations are generated according to the N third images and the corresponding fourth images.
In some implementations, the acquisition module is further to: extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and splicing the N third images according to the characteristic point pairs to obtain the second panoramic image.
In a third aspect, there is provided an image processing system, the system comprising: a first server and a second server;
The first server is used for acquiring N image transformation relations corresponding to the N third images one by one and M image transformation relations corresponding to the M third images one by one; the N third images are obtained by respectively and synchronously shooting N second areas by N cameras of a camera array, the camera array further comprises M cameras, the M third images are obtained by respectively and synchronously shooting M second areas by the M cameras, the arrangement mode of the N second areas is consistent with that of the N cameras, the arrangement mode of the M second areas is consistent with that of the M cameras, an image overlapping area exists at the adjacent position of any two adjacent third images in the N third images and the M third images, each image transformation relation in the N image transformation relations and the M image transformation relations represents the image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images and the M third images, and N and M are integers larger than 1;
the first server is further configured to receive N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
The N image transformation relations are in one-to-one correspondence with the N first images, and the first server is further used for respectively processing the corresponding first images according to the N image transformation relations to obtain N second images;
the first server is further configured to send the M image transformation relations to the second server;
the second server is used for receiving M first images; the M first images are obtained by respectively shooting M first areas synchronously by the M cameras, the arrangement mode of the M first areas is consistent with that of the M cameras, and any two adjacent first images in the M first images have an image overlapping area at the adjacent position;
the M image transformation relations are in one-to-one correspondence with the M first images, and the second server is further used for respectively processing the corresponding first images according to the M image transformation relations to obtain M second images;
the image overlapping area does not exist between any two adjacent second images in the N second images and the M second images, the N second images form a first panoramic image according to the arrangement mode of the N cameras, the M second images form a fourth panoramic image according to the arrangement mode of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
In some implementations, the first server is specifically configured to fuse the N third images and the M third images to obtain a third panoramic image; segmenting the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one; generating the N image transformation relations according to the N third images and the N fourth images respectively, and generating the M image transformation relations according to the M third images and the M fourth images respectively.
In some implementations, the first server is specifically configured to: extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and splicing the N third images and the M third images according to the characteristic point pairs to obtain the third panoramic image.
In a fourth aspect, an embodiment of the present invention provides a server, including an input device, an output device, a memory, and a processor; the input device is used for receiving data, the output device is used for sending data, the memory is used for storing data and program instructions, and the processor is used for calling and executing the program instructions; the program instructions, when executed by the processor, cause the server to implement the method described in any embodiment of the first aspect.
In a fifth aspect, an embodiment of the present invention provides a camera array, where the camera array includes N cameras; the N cameras are used for respectively shooting N second areas synchronously to obtain N third images; the arrangement mode of the N second areas is consistent with the arrangement mode of the N cameras, and any two adjacent third images in the N third images have an image overlapping area at the adjacent position; the method is also used for sending the N third images to the first server; the N cameras are also used for respectively and synchronously shooting N first areas to obtain N first images; the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position; the method is also used for sending the N first images to the first server; the first server is the server described in the second aspect, or the first server is the server described in the fourth aspect.
In a sixth aspect, an embodiment of the present invention provides a display platform, where the display platform includes N display devices, where the N display devices are respectively configured to receive N second images sent by a first server, and are respectively configured to display the N second images, an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to an arrangement manner of N cameras of a camera array; the first server is the server described in the second aspect, or the first server is the server described in the fourth aspect.
In a seventh aspect, embodiments of the present invention provide a non-transitory computer readable storage medium; the computer readable storage medium is for storing code for implementing the method of the first aspect. The program code, when executed by a computing device, is adapted to carry out the method of the first aspect.
It can be seen that in the embodiment of the present invention, the first server calculates in advance the N image transformation relations corresponding to the N cameras and the M image transformation relations corresponding to the M cameras, stores the N image transformation relations locally, and sends the M image transformation relations to the second server. In the application stage, the first server performs real-time image transformation according to the locally stored N image transformation relations and sends the transformed images to the display devices for real-time imaging, while the second server likewise performs real-time image transformation on the images captured by the M cameras according to the M image transformation relations received in advance and sends the transformed images to the display devices. In transforming the images, the first server and the second server skip operations such as image feature point extraction, feature point registration, and image stitching, which raises the image transformation rate and reduces resource overhead without affecting image quality. In addition, the scheme adopts a distributed design; on this basis, real-time imaging of hundred-million-pixel video can be conveniently realized by scaling out to a large number of servers, and the method is economical, efficient, and timely.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments will be briefly described below.
Fig. 1 is a schematic diagram of a network architecture for image processing according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an arrangement of a camera array according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another network architecture for image processing according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an application scenario according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another application scenario according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of an image processing method according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart of another image processing method according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of calibrating a camera according to an embodiment of the present invention;
FIG. 9 is a schematic flow chart of another image processing method according to an embodiment of the present invention;
FIG. 10 is a schematic block diagram of an image processing server according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of system interaction according to an embodiment of the present invention;
fig. 12 is a schematic diagram of a hardware architecture of a server according to an embodiment of the present invention.
Detailed Description
The terminology used in the description of the embodiments of the invention herein is for the purpose of describing particular embodiments of the invention only and is not intended to be limiting of the invention.
The embodiments of the invention provide a novel hundred-million-pixel imaging scheme that can realize ultra-high-resolution real-time imaging at hundred-million-pixel order. Referring now to FIG. 1, which shows an image processing network architecture for the hundred-million-pixel imaging technique of the invention, the architecture includes:
(1) Camera array: the camera array is composed of a plurality of cameras, and any two adjacent cameras meet a field-of-view overlap requirement; for example, two horizontally adjacent cameras meet a horizontal field-of-view overlap requirement, and two vertically adjacent cameras meet a vertical field-of-view overlap requirement. Each camera in the camera array is responsible for image acquisition and sends the acquired images to the corresponding servers in the distributed system. In some embodiments, a synchronizer further controls the cameras of the array to expose at the same moment, ensuring that all images are acquired by exposure at the same time.
The following describes an arrangement of the camera array, with specific reference to fig. 2: the camera array is constructed in an arrangement of two rows and six columns. Each row of cameras adopts a concentric coplanar design: the optical centers of the cameras pass through one point, and the cameras are uniformly distributed on the concentric circular surface at the same horizontal angular interval, where the angle is determined by the horizontal field of view of each camera and must satisfy the horizontal field-of-view overlap requirement of adjacent cameras. The upper and lower rows of cameras adopt a concentric included-angle design: the radii of the two circular surfaces on which the upper and lower rows respectively lie are the same, their centers coincide, and a certain angle exists between the two circular surfaces, where the angle is determined by the vertical field of view of the cameras and must satisfy the vertical field-of-view overlap requirement between the upper and lower rows. It should be understood that this arrangement of the camera array is merely an example, and different arrangements may be designed according to practical requirements.
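To make the angular relationships above concrete, the following is a minimal sketch, not taken from the patent, of how the interval between adjacent cameras on one ring could be chosen from the per-camera horizontal field of view and a required overlap fraction; the function names and the example figures are illustrative assumptions.

    import math

    def ring_spacing_deg(hfov_deg: float, overlap_frac: float) -> float:
        """Angular interval between adjacent cameras on one ring such that each
        pair of neighbours shares `overlap_frac` of the horizontal field of view."""
        return hfov_deg * (1.0 - overlap_frac)

    def cameras_per_ring(hfov_deg: float, overlap_frac: float) -> int:
        """Cameras needed to cover a full 360-degree ring (rounded up)."""
        return math.ceil(360.0 / ring_spacing_deg(hfov_deg, overlap_frac))

    # Assumed example: 60-degree lenses with 10% horizontal overlap.
    print(ring_spacing_deg(60.0, 0.10))   # 54.0 degrees between neighbours
    print(cameras_per_ring(60.0, 0.10))   # 7 cameras for a closed ring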
(2) Distributed system: the distributed system is composed of a plurality of servers deployed in the cloud. The servers jointly perform image transformation on the images acquired by the camera array so as to eliminate the image overlapping areas caused by the overlap of camera fields of view, and then send the transformed images to the display devices in the screen wall so that each display device displays its transformed image. In addition, in the camera array deployment stage, one server in the distributed system, such as the first server, is responsible for offline calibration of the cameras of the camera array. Specifically, each server in the distributed system is connected to a variable number of cameras, and the number can be determined by the image processing capacity of each server. In the deployment stage, each server receives images sent by the camera array; one server is responsible for collecting the images received by itself and by the other servers, performs offline calibration based on the collected images to obtain the calibration information of each camera in the camera array, and sends the calibration information of each camera to the corresponding server. Therefore, after offline calibration is finished, each server in the distributed system can perform real-time image transformation on the images acquired by its cameras according to the existing calibration information, realizing seamless stitching of the transformed images and thus satisfying the real-time imaging application scenario of hundred-million-pixel video.
(3) Screen wall: the screen wall consists of a plurality of display devices, and each display device is responsible for displaying the transformed images sent by the servers in the distributed system. Since the images in this scheme are of hundred-million-pixel order while the resolution of a conventional display device is of million-pixel order, a single display device cannot display a hundred-million-pixel image. In this scheme, a screen wall composed of a plurality of display devices is constructed, and the image acquired by each camera in the camera array is displayed on a corresponding display device, so that a hundred-million-pixel image is displayed on one screen wall. In addition, in some embodiments, the number of display devices connected to each server is the same as the number of cameras connected to that server, that is, the image shot by each camera is displayed on one corresponding display device; in other embodiments, the number of display devices connected to a server may differ from the number of cameras connected to it, images shot by multiple cameras may be displayed on the same display device, or the image shot by one camera may be displayed on multiple displays.
In another embodiment, the image processing network architecture provided by the invention may also be the network architecture shown in fig. 3, which includes: a camera array, a first server, and a screen wall. In contrast to the distributed system in the network architecture of fig. 1, this architecture uses only one server, which may be referred to as the first server. The first server performs image transformation on the images acquired by the camera array to eliminate the image overlapping areas caused by the overlap of camera fields of view, and sends the transformed images to the display devices in the screen wall. In addition, in the camera array deployment stage, the first server is also responsible for offline calibration of the cameras of the camera array, thereby obtaining the calibration information of each camera; in the use stage of the camera array, the calibration information is used for real-time image transformation of the images acquired by the cameras. The functions of the camera array and the screen wall in fig. 3 are as described above. In one scheme, the number of display devices connected to the first server is the same as the number of cameras connected to it, that is, the image shot by each camera is displayed on one corresponding display device; in another scheme, the number of display devices connected to the first server may differ from the number of cameras connected to it, images shot by multiple cameras may be displayed on the same display device, or the image shot by one camera may be displayed on multiple displays.
In order to facilitate understanding of the embodiments of the present invention, the following describes application scenarios related to the embodiments of the present invention.
As shown in fig. 4, fig. 4 is a schematic diagram of an application scenario for offline calibration of the camera array according to an embodiment of the present invention. In the camera array deployment stage, each camera of the camera array is aimed at its own region of the photographed object (each region can also be understood as the field of view corresponding to that camera), and the regions can contain different scenery, such as regions 1-12 (i.e., 12 regions) in the figure; shooting yields one image per region, namely images P1-P12 in the figure. In the embodiment of the invention, based on the parameter settings and deployment positions of the cameras, the ranges of adjacent regions among the 12 regions overlap (i.e., the fields of view of adjacent cameras overlap), so that an image overlapping area exists at the adjacent positions of the images shot by the cameras. Each camera then sends its captured images to the corresponding server in the distributed system; the first server in the distributed system receives the original images P1, P2, P3, P4 sent by the cameras connected to itself, and also receives the original images P5, P6, P7, P8 and P9, P10, P11, P12 sent by the other servers. The first server then calibrates each camera in the camera array based on the received original images P1-P12 to obtain the image transformation matrices {H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, H11, H12}, and distributes them to the servers: for example, in fig. 4, the first server stores H1, H2, H3, H4 locally, sends H5, H6, H7, H8 to one server of the distributed system, and sends H9, H10, H11, H12 to another server of the distributed system. After the camera array is put into use, each image transformation matrix is used to transform the images shot by the corresponding camera, so that the transformed images can be seamlessly stitched into a panoramic image with no image overlapping area.
The first server calibrates each camera in the camera array based on the received original images; reference may be made to the flow in the dashed box in fig. 4. Specifically, the first server fuses the received original images into a panoramic image, then cuts the panoramic image into sub-images, and then calculates the image transformation matrix between each original image and its sub-image, thereby obtaining a plurality of image transformation matrices.
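As a concrete illustration of the dashed-box flow, the following Python sketch stitches the originals into a panorama, cuts the panorama into sub-images, and estimates one matrix per camera. It is a sketch under stated assumptions: it uses OpenCV, realizes the "image transformation matrix" as a RANSAC-estimated homography obtained from SIFT feature matches, and assumes a single-row tiling; the patent does not prescribe these particular routines.

    import cv2
    import numpy as np

    def estimate_transform(original: np.ndarray, sub_image: np.ndarray) -> np.ndarray:
        """Estimate a 3x3 matrix mapping `original` onto its panorama sub-image."""
        sift = cv2.SIFT_create()
        k1, d1 = sift.detectAndCompute(original, None)
        k2, d2 = sift.detectAndCompute(sub_image, None)
        matches = cv2.BFMatcher().knnMatch(d1, d2, k=2)
        good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
        # Assumes at least 4 good matches survive the ratio test.
        src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return H

    def calibrate(originals):
        """Fuse the originals into a panorama, cut it into sub-images, and
        return one transformation matrix per camera (single-row layout assumed)."""
        stitcher = cv2.Stitcher_create()
        status, panorama = stitcher.stitch(originals)
        assert status == 0  # cv2.Stitcher_OK
        h, w = panorama.shape[:2]
        tile_w = w // len(originals)
        subs = [panorama[:, i * tile_w:(i + 1) * tile_w] for i in range(len(originals))]
        return [estimate_transform(o, s) for o, s in zip(originals, subs)]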
In some application scenarios, the first server in fig. 4 may be a specific server in the distributed system, or may be a server with the highest image processing capability or computing capability in the distributed system.
It should be noted that the above-described division manner of 12 areas of the object, the number of cameras, the number of servers, the connection relationship between the cameras and the servers, and the like are only used for exemplary explanation of the scheme of the present invention, and are not limiting.
The following describes another application scenario provided by an embodiment of the present invention. As shown in fig. 5, fig. 5 is a schematic view of an application scenario in which the images acquired by the cameras are displayed on the screen wall after real-time image transformation. In the camera array deployment stage of the application scenario of fig. 4, each server in the distributed system has already obtained the image transformation matrices of the cameras connected to it. After the camera array is put into use, each server in the distributed system performs real-time image transformation on the images shot by its cameras, for example images P1-P12 in the figure, according to the obtained image transformation matrices, and sends the transformed images to the display devices in the screen wall so that each display device displays one transformed image; for example, in the figure the first display device displays image P1, the second display device displays image P2, and so on. Because the image transformation matrices were obtained in advance by offline calibration in the camera array deployment stage, after the camera array is put into use the first server can directly perform real-time image transformation on the images shot by the cameras according to the existing image transformation matrices, without having to calculate the image transformation matrices before each real-time image transformation.
The image processing method provided by the embodiments of the invention can obtain ultra-high-resolution images of hundred-million-pixel order in real time.
Referring to fig. 6, fig. 6 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in fig. 6, the method includes:
s101, a first server acquires N image transformation relations. Wherein N is an integer greater than 1.
In the embodiment of the invention, in the camera array deployment stage, the camera array faces a second shot target (such as urban road street view, etc.), the second shot target is divided into N different second areas, N cameras in the camera array aim at the N different second areas respectively, the N cameras are in one-to-one correspondence with the N different second areas, the N cameras in the camera array shoot the N second areas respectively and synchronously to obtain N third images, and then the first server performs calibration according to the N third images, so that N image transformation relations are obtained. The N image transformation relations are in one-to-one correspondence with the N third images, the arrangement mode of the N second areas is consistent with the arrangement mode of the N cameras, and any adjacent two third images in the N third images have an image overlapping area at the adjacent position. It should be noted that, each of the N image transformation relationships may represent an image overlapping region between the corresponding third image and other respective images adjacent to the third image.
In the embodiment of the invention, the image overlapping area between any two adjacent images is an image area with the same image information between the two images, wherein the image information can comprise an image brightness value and/or an image chromaticity value.
The N third images may be sent to the first server by the N cameras directly after shooting, or forwarded to the first server through other communication devices after shooting; alternatively, the N third images may be shot by the N cameras in advance and stored in a storage medium (for example, a hard disk), from which the first server then obtains them.
In the embodiment of the invention, the N third images are obtained by the N cameras synchronously shooting the N second areas of the second shot target respectively. Specifically, the N cameras can be controlled by hardware synchronization to expose and acquire images at the same moment; for example, a single-chip microcomputer generates a fixed-pulse square wave that triggers the N cameras to expose simultaneously.
Specifically, the N image transformation relations correspond to N image transformation matrices, which are used to perform image transformation on the images acquired by the N cameras after the camera array is put into use. The image transformation formula (1) is as follows:

    A_n * H_n = B_n,  n = 1, 2, …, N    (1)

where A_n denotes an original image shot by a camera and B_n denotes the image obtained from the original image after image transformation; both A_n and B_n are expressed mathematically as image matrices. H_n denotes the image transformation matrix, which can be obtained from the formula: multiplying the original image A_n by the image transformation matrix H_n yields the transformed image B_n. In some embodiments the image transformation matrix may be represented by a Hessian Matrix.
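For illustration only, formula (1) could be applied at run time as in the short sketch below, which warps each received first image A_n by its precomputed matrix H_n using OpenCV; warpPerspective and the fixed output size are assumptions of this sketch, not requirements stated in the patent.

    import cv2

    def transform_images(first_images, transforms, out_size):
        """first_images: list of N arrays A_n; transforms: list of N 3x3 matrices H_n;
        returns the N second images B_n (out_size is an assumed (width, height))."""
        return [cv2.warpPerspective(a, h, out_size)   # B_n = A_n transformed by H_n
                for a, h in zip(first_images, transforms)]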
In some embodiments, the first server may generate the N image transformation relations in advance and store them locally. In other embodiments, the first server may receive the N image transformation relations sent by another server, which generates them and then sends them to the first server.
S102, the first server receives N first images. The N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position.
In the embodiment of the invention, in the using stage of the camera array, the camera array faces a first shot target (such as urban road street view, etc.), the first shot target is divided into N different first areas, N cameras in the camera array aim at the N different first areas respectively, and the N cameras are in one-to-one correspondence with the N different first areas. In order to ensure that the N first images captured by the N cameras in the camera array include the entire content of the first object, it is necessary to ensure that the angles of view of any two adjacent cameras in the camera array overlap, so as to avoid a blind field of view.
It should be noted that the first shot target and the second shot target may be the same or different; that is, the shooting scene faced by the camera array after it is put into use may be the same as or different from the scene it faced during the deployment stage.
In some embodiments, the N first images are obtained by the N cameras synchronously shooting the N areas of the first shot target respectively. Specifically, the N cameras can be controlled by hardware synchronization to expose and acquire images at the same moment; for example, a single-chip microcomputer generates a fixed-pulse square wave that triggers the N cameras to expose simultaneously. Hardware synchronization ensures more accurately that the N first images are exposed at the same moment. If the scheme is applied to monitoring a dynamically changing scene, for example a highway with heavy traffic or a dense city street, controlling the N cameras to expose and shoot the scene simultaneously is particularly important.
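The patent only states that a single-chip microcomputer emits a fixed-pulse square wave; as a rough stand-in, the sketch below toggles a single trigger line from a Raspberry Pi with the RPi.GPIO library. The pin number and pulse rate are arbitrary assumptions, and the line is assumed to be fanned out to the trigger input of every camera.

    import time
    import RPi.GPIO as GPIO

    TRIGGER_PIN = 18   # assumed wiring: one line fanned out to every camera
    FPS = 25           # assumed capture rate

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(TRIGGER_PIN, GPIO.OUT)
    try:
        period = 1.0 / FPS
        while True:    # 50% duty-cycle square wave
            GPIO.output(TRIGGER_PIN, GPIO.HIGH)
            time.sleep(period / 2)
            GPIO.output(TRIGGER_PIN, GPIO.LOW)
            time.sleep(period / 2)
    finally:
        GPIO.cleanup()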
S103, the N image transformation relations are in one-to-one correspondence with the N first images, and the first server respectively processes the corresponding first images according to the N image transformation relations to obtain N second images. And an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
It should be noted that, since a large amount of calculation time is required for calibrating each camera in the camera array, if the received image is calibrated online and then subjected to image transformation under the condition of large image data volume, the method is very time-consuming and cannot meet the application scenario with high real-time requirement. In the invention, the first server can receive the N image transformation relations sent by other servers and locally store the N image transformation relations, or can calibrate the N cameras in advance so as to generate the N image transformation relations and locally store the N image transformation relations. After the first server receives the N first images, the first server can perform image transformation on the N first images according to the locally stored calibration information, so that N second images are obtained.
In the embodiment of the invention, since any two adjacent cameras in the N cameras have field of view overlapping, any two adjacent first images in the N first images shot by the N cameras correspondingly have image overlapping areas. In order to eliminate the image overlapping area, the first server processes N first images in real time according to the N image transformation relations respectively to obtain N second images without the image overlapping area, the N second images can form a first panoramic image according to the arrangement mode of the N cameras, image information in the first panoramic image changes smoothly and the image overlapping area is not present, wherein the image information can comprise an image brightness value and/or an image chromaticity value.
In some embodiments, after the first server obtains the N second images, it transmits them to N display devices respectively, so that the N display devices display the N second images respectively. Since the resolution of a conventional display device (e.g., a liquid crystal display) is of million-pixel order, the N second images cannot be displayed losslessly by a single display at the same time. Therefore, in some embodiments, the first server sends the N second images to the N display devices respectively, so that each display device displays one corresponding second image, where the positional relationship of the N display devices may be the same as that of the N cameras; for example, if 12 cameras are arranged in two rows and six columns, the display devices should also be arranged in two rows and six columns. In addition, the mapping relationship between the N cameras and the N display devices is preset; for example, the second image shot by camera No. 1 should be displayed on display device No. 1, and the mapping between the other numbered cameras and display devices follows by analogy.
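A minimal sketch of such a preset mapping follows, assuming cameras and display devices are both numbered row-major in the same two-row, six-column grid; `send_to_display` is a hypothetical stand-in for whatever transport connects the server to the screen wall.

    ROWS, COLS = 2, 6

    def display_for_camera(cam_id: int) -> int:
        """Identity mapping: camera k's image goes to display k (row-major grid)."""
        row, col = divmod(cam_id, COLS)
        return row * COLS + col   # same grid position on the screen wall

    def dispatch(second_images, send_to_display):
        for cam_id, img in enumerate(second_images):
            send_to_display(display_for_camera(cam_id), img)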
In the method embodiment of fig. 6, the first server obtains N image transformation relations; after receiving the N first images, it can directly transform them according to the N image transformation relations, obtaining N second images that can be seamlessly stitched into a panoramic image, and send the N second images to N display devices for real-time imaging. In transforming the N first images, the step of calculating the image transformation relations is saved, which raises the image transformation rate and reduces resource overhead without affecting image quality. By implementing this embodiment, real-time video imaging at hundred-million-pixel order can be realized economically, efficiently, and with good timeliness.
Referring to fig. 7, fig. 7 is a flowchart of another image processing method according to an embodiment of the present invention. As shown in fig. 7, the method includes:
s201, the first server fuses the N third images to obtain a second panoramic image.
In the embodiment of the present invention, step S201 may be specifically implemented by the following steps: the first server extracts feature points from a preset part of images in each third image; secondly, registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and then the N third images are spliced according to the characteristic point pairs, so that the second panoramic image is obtained. The preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is greater than or equal to 3. It is to be understood that when the N cameras are deployed, the spacing angles of the two cameras adjacent left and right and the spacing angles of the two cameras adjacent up and down can be determined, and according to the two spacing angles, the maximum field of view overlapping areas of the two cameras adjacent left and right and the maximum field of view overlapping areas of the two cameras adjacent up and down can be determined. Then, the range of the preset partial image may be determined by the range of the maximum image overlapping area, and the range of the preset partial image at least includes the range of the maximum image overlapping area, for example, the image overlapping area of the left and right adjacent images is an image within 3 cm from the image edge, and the first server should extract the feature point from the image within 3 cm or more from the image edge. In the above embodiment, the first server performs feature point extraction only for the preset partial image in each third image, so that the range of feature extraction is reduced, and the efficiency of feature extraction is improved.
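The restriction of feature extraction to the preset border strip could be realized as in the following sketch, which masks out the image interior before running a detector. OpenCV's SIFT is an assumed detector here, since the patent does not fix one, and `preset_width` is assumed to satisfy the constraint above (smaller than the center-to-boundary pixel distance).

    import cv2
    import numpy as np

    def border_features(image: np.ndarray, preset_width: int):
        """Detect features only within `preset_width` pixels of the image boundary,
        since overlap with neighbouring cameras can only occur there."""
        h, w = image.shape[:2]
        mask = np.full((h, w), 255, dtype=np.uint8)
        # Zero out the interior, leaving a frame of width `preset_width`.
        mask[preset_width:h - preset_width, preset_width:w - preset_width] = 0
        sift = cv2.SIFT_create()
        return sift.detectAndCompute(image, mask)   # keypoints only in the strip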
S202, the first server cuts the second panoramic image to obtain N fourth images.
Optionally, after the first server splices the N third images according to the plurality of feature point pairs to obtain the second panoramic image, it further crops the image edges of the second panoramic image to obtain a second panoramic image with aligned image edges. Accordingly, the first server should segment the edge-aligned second panoramic image, thereby obtaining the N fourth images corresponding to the N third images.
Steps S201 to S202 are described below with reference to fig. 8. Fig. 8 is a schematic flowchart of camera calibration according to the present invention; the left part of fig. 8 shows the specific implementation of each calibration step, and each image on the right corresponds to one of those steps. The description is as follows:
1. The first server extracts feature points from the preset partial image of each third image. Referring to the uppermost image of fig. 8, each marked point is a feature point: a point with a distinctive character that effectively reflects the essential features of the image and can identify a target object in it.
2. The first server registers the feature points of each third image to obtain a plurality of feature point pairs. The first server may register the feature points according to an image feature matching algorithm, including but not limited to the SIFT (Scale-Invariant Feature Transform) algorithm and the SURF (Speeded-Up Robust Features) algorithm.
One specific implementation of the registration process is as follows:
First, a Hessian matrix can be constructed for each of the 12 images $f_n(x, y)$ ($n = 1, 2, \ldots, 12$); the Hessian matrix of equation (2) is:

$$\mathcal{H}\big(f(x,y)\big)=\begin{bmatrix}\dfrac{\partial^{2}f}{\partial x^{2}} & \dfrac{\partial^{2}f}{\partial x\,\partial y}\\[4pt]\dfrac{\partial^{2}f}{\partial x\,\partial y} & \dfrac{\partial^{2}f}{\partial y^{2}}\end{bmatrix}\tag{2}$$

The image is Gaussian-filtered before the Hessian matrix is constructed; the filtered Hessian matrix of equation (3), at scale $\sigma$, is expressed as:

$$\mathcal{H}(x,\sigma)=\begin{bmatrix}L_{xx}(x,\sigma) & L_{xy}(x,\sigma)\\ L_{xy}(x,\sigma) & L_{yy}(x,\sigma)\end{bmatrix}\tag{3}$$

where $L_{xx}$, $L_{xy}$ and $L_{yy}$ denote second-order Gaussian derivatives of the image.
Second, when the discriminant of the Hessian matrix attains a local maximum, the current point is judged to be brighter or darker than the points in its surrounding neighborhood, which locates the position of a key point.
Third, a scale space of the 12 images is constructed: using box filters, each image is divided into O groups of L layers; the template size of the box filter increases gradually from group to group, while within a group the layers use filters of the same size but with gradually increasing blur coefficients.
Next, the feature points of the 12 images are located: each pixel processed by the Hessian matrix is compared with the 26 points in its neighborhood in two-dimensional image space and scale space to preliminarily locate key points; key points with weak energy and wrongly located key points are filtered out, leaving the final stable feature points. The dominant orientation of each feature point is then computed: the sums of the horizontal and vertical Haar wavelet responses of all points within a 60-degree sector of the feature point's circular neighborhood are counted, the sector is rotated in steps of 0.2 radian and the Haar wavelet responses are counted again, and finally the direction of the sector with the largest value is taken as the dominant orientation of the feature point.
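The 26-neighbour comparison amounts to non-maximum suppression over the (scale, y, x) response stack. A minimal sketch, assuming the determinant-of-Hessian responses have already been computed into a 3-D NumPy array:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def stable_keypoints(responses, threshold):
    """responses: 3-D array (scale, height, width) of Hessian responses.
    Keep points that beat all 26 neighbours and a minimum-energy threshold."""
    local_max = maximum_filter(responses, size=3) == responses  # 3x3x3 window
    return np.argwhere(local_max & (responses > threshold))     # (s, y, x)
```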
Finally, a feature point descriptor is generated for the 12 images: a 4×4 block of rectangular sub-regions is taken around each feature point, with the block oriented along the feature point's dominant orientation. For each sub-region, the Haar wavelet responses of 25 pixels are counted in the horizontal and vertical directions, both taken relative to the dominant orientation. The Haar wavelet features are four values: the sum of horizontal responses, the sum of vertical responses, the sum of absolute horizontal responses and the sum of absolute vertical responses. The feature points of the 12 images are then matched by computing the Euclidean distance between two feature points to measure their degree of matching, yielding a plurality of feature point pairs; the shorter the Euclidean distance, the better the two feature points match.
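A sketch of this distance-based matching with OpenCV's brute-force matcher; the ratio test is an addition beyond the plain shortest-distance rule described above:

```python
import cv2

def match_by_distance(des1, des2, ratio=0.7):
    """Match descriptors by Euclidean (L2) distance; a shorter distance means
    a better match. Lowe's ratio test rejects ambiguous pairs."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    candidates = matcher.knnMatch(des1, des2, k=2)
    return [m for m, n in candidates if m.distance < ratio * n.distance]
```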
In some embodiments, the plurality of feature point pairs obtained above is further optimized: a RANSAC (Random Sample Consensus) algorithm may be used to remove mismatched feature point pairs, so that the accuracy of the remaining matched pairs is higher. In other embodiments, the mismatched feature point pairs may be removed by other algorithms, which is not specifically limited in the embodiments of the present invention.
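One common way to realize this step, sketched under the assumption that adjacent images are related by a homography, is OpenCV's RANSAC-based homography fit:

```python
import cv2
import numpy as np

def ransac_inliers(kp1, kp2, matches, reproj_thresh=3.0):
    """Fit a homography with RANSAC (needs at least 4 matches) and keep
    only the feature point pairs that survive as inliers."""
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, reproj_thresh)
    return [m for m, keep in zip(matches, mask.ravel()) if keep]
```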
3. The first server splices the third images according to the feature point pairs to obtain the second panoramic image.
In some embodiments, the accurately matched feature point pairs between adjacent images obtained above may be used to stitch each third image, thereby obtaining the second panoramic image. In a specific implementation, each third image may be processed according to the accurately matched feature point pairs using a BA (Bundle Adjustment) algorithm to obtain the second panoramic image.
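As an illustrative stand-in only: OpenCV's high-level Stitcher runs its own feature matching, bundle adjustment and blending internally, so it is not the embodiment's exact pipeline, but it shows the calibration-time stitching in a few lines:

```python
import cv2

# third_images: list of N frames captured synchronously by the cameras
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, second_panorama = stitcher.stitch(third_images)
if status != cv2.Stitcher_OK:
    raise RuntimeError(f"stitching failed with status code {status}")
```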
4. The non-edge-aligned portions of the second panoramic image are cropped to obtain a second panoramic image with aligned image edges.
In some embodiments, the second panoramic image obtained in step 3 above has its image edges cropped, thereby obtaining an edge-aligned second panoramic image. For example, for the second panoramic image in fig. 8, the protruding irregular parts at the top, bottom, left and right may be removed to obtain a second panoramic image with regular image edges.
S203, the N third images and the N fourth images are in one-to-one correspondence, and the first server generates N image transformation relations according to the N third images and the corresponding fourth images.
In some embodiments, the N image transformation relationships may be represented by N image transformation matrices.
It should be noted that, because any two adjacent cameras among the N cameras have a field-of-view overlap area, an image overlap area exists between any two adjacent third images captured by the N cameras. The N fourth images, by contrast, are obtained by segmenting the second panoramic image, in which the image information varies smoothly and contains no overlap; accordingly, no image overlap area exists between any two adjacent fourth images. In general, for the same shooting area, a third image contains more content around its borders (the overlap area) than the corresponding fourth image.
From equation (1) in the method embodiment of fig. 6 described above, the following equation can be derived:

$$H_n = B_n / A_n \quad (n = 1, 2, \ldots, N) \tag{4}$$

where $H_n$ represents an image transformation matrix, $A_n$ the image matrix of the n-th third image, and $B_n$ the image matrix of the corresponding fourth image. Since $B_n$ and $A_n$ are known, the corresponding image transformation matrix $H_n$ can be calculated from equation (4).
In some embodiments, after the first server computes the N image transformation matrices, it stores them locally in XML (Extensible Markup Language) format. In other embodiments, the image transformation matrices may be stored in other formats, which is not specifically limited in the embodiments of the present invention.
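A sketch of calibration-time estimation and storage, assuming each H_n is a 3×3 homography recovered from matched features between the third image and its fourth-image tile (the embodiment states only the abstract relation of equation (4)); the function and node names are illustrative:

```python
import cv2
import numpy as np

def calibrate_and_store(third_images, fourth_images, path="transforms.xml"):
    """Estimate one 3x3 matrix H_n per camera and persist all of them to a
    single XML file through cv2.FileStorage."""
    sift = cv2.SIFT_create()
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    fs = cv2.FileStorage(path, cv2.FILE_STORAGE_WRITE)
    for n, (a, b) in enumerate(zip(third_images, fourth_images), start=1):
        kp_a, des_a = sift.detectAndCompute(a, None)
        kp_b, des_b = sift.detectAndCompute(b, None)
        good = [m for m, nn in matcher.knnMatch(des_a, des_b, k=2)
                if m.distance < 0.7 * nn.distance]
        src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        if H is not None:
            fs.write(f"H_{n}", H)        # nodes H_1 ... H_N in the XML file
    fs.release()
```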
S204, the first server receives N first images. The N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position.
In some embodiments, the specific description of step S204 may be referred to the description in the embodiment of fig. 6, and for brevity, will not be repeated here.
S205, the N image transformation relations are in one-to-one correspondence with the N first images, and the first server respectively processes the corresponding first images according to the N image transformation relations to obtain N second images. And an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
In some embodiments, step S205 may refer to the description of step S103 in the method embodiment of fig. 6, which is not repeated herein for brevity.
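To make the run-time step concrete, a minimal sketch assuming the stored relation is a 3×3 homography (a form the embodiment does not mandate); the function name is illustrative:

```python
import cv2

def to_second_image(first_image, H, out_size):
    """Apply a pre-computed transformation to one incoming first image;
    no feature extraction or registration happens at run time."""
    return cv2.warpPerspective(first_image, H, out_size)   # out_size = (w, h)
```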
In some embodiments, after the first server obtains the N second images, it sends them to the N display devices respectively, so that the N display devices display them respectively. Since the resolution of a conventional display device (e.g., a liquid crystal display) is only on the order of millions of pixels, a single display cannot show all N second images losslessly at the same time; the first server therefore sends the N second images to the N display devices respectively, so that each display device displays one second image, and the positional relationship of the N display devices may be the same as that of the N cameras.
In some embodiments, the first server is connected to the N cameras and receives the N first images they send. The number of cameras connected to the first server is positively correlated with its image processing performance, which can be characterized by the number of cores of its processor. For example, if the first server's processor has 4 cores, the number of connected cameras may be set to 4; if it has 8 cores, the number may be set to 8. It should be understood that these are merely examples and should not be construed as limiting; the correspondence between the number of processor cores and the number of cameras may be determined according to actual requirements. When each core of the processor is dedicated to one camera, the first server's capacity to process the N first images simultaneously is improved, enabling real-time image transformation of the N first images.
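A sketch of this per-core parallelism under the same homography assumption; the worker-pool details are illustrative, not part of the embodiment:

```python
from multiprocessing import Pool
import cv2

def warp_one(task):
    frame, H, out_size = task
    return cv2.warpPerspective(frame, H, out_size)

def transform_all(first_images, transforms, out_size, cores):
    """One worker per camera, assuming the core count matches the number of
    connected cameras. Call under `if __name__ == "__main__":` on platforms
    that spawn worker processes."""
    tasks = [(f, H, out_size) for f, H in zip(first_images, transforms)]
    with Pool(processes=cores) as pool:
        return pool.map(warp_one, tasks)
```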
In the method embodiment of fig. 7, the first server locally generates the N image transformation relations in advance; after receiving the N first images, it transforms them according to the locally stored N image transformation relations, obtaining N second images that can be seamlessly spliced into a panoramic image, and sends the N second images to the N display devices for real-time imaging. When transforming the N first images, the per-camera image transformation relations spare the server the steps of feature point extraction, feature point registration and image stitching, so the image transformation rate is improved and the resource overhead is reduced, while image quality is unaffected. By implementing this embodiment, real-time video imaging at the hundred-million-pixel level can be realized economically, efficiently and with good timeliness.
Fig. 9 is a schematic flowchart of another image processing method provided in an embodiment of the present invention. The first server in the method embodiment of fig. 9 is the first server of the distributed system in the architecture diagram of fig. 1; unlike the method embodiment of fig. 7, here the distributed system further includes a second server, and the first server is responsible for calculating the image transformation relations of both the N cameras connected to the first server and the M cameras connected to the second server. As shown in fig. 9, the method includes:
S301, the first server fuses the N third images and the M third images to obtain a third panoramic image.
In the embodiment of the invention, the M third images are obtained by synchronously shooting M second areas by the M cameras respectively, the arrangement mode of the M second areas is consistent with that of the M cameras, and an image overlapping area exists at the adjacent position of any adjacent two third images in the N third images and the M third images.
In some embodiments, the N third images are captured by the N cameras and sent to the first server, while the M third images are captured by the M cameras and sent to the first server through the second server. In some embodiments, the N cameras and the M cameras are all located in the same camera array and may be adjacent to one another.
At the same moment, the first server gathers the N third images shot by the N cameras and the M third images shot by the M cameras, and then fuses the N third images and the M third images to obtain the third panoramic image.
In some embodiments, the fusion process in step S301 may refer to the description of step S201 in the method embodiment of fig. 7. The difference is that step S201 fuses only the N third images, yielding a second panoramic image composed of the image information of the N third images alone, whereas step S301 fuses the N third images together with the M third images, yielding a third panoramic image composed of the image information of both.
S302, the first server segments the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one.
Optionally, after the first server obtains the third panoramic image, the first server further clips an image edge of the third panoramic image to obtain a third panoramic image with aligned image edges. Accordingly, the first server should perform segmentation based on the third panoramic image with the image edges aligned, so as to obtain N fourth images in one-to-one correspondence with the N third images, and M fourth images in one-to-one correspondence with the M third images.
S303, the first server generates N image transformation relations according to the N third images and the N fourth images.
S304, the first server generates M image transformation relations according to the M third images and the M fourth images.
In the embodiment of the present invention, the N image transformation relations in step S303 and the M image transformation relations in step S304 may be calculated according to equation (4).
S305, the first server sends M image transformation relations to the second server.
In the embodiment of the invention, the M image transformation relations are in one-to-one correspondence with the M first images and are used by the second server to process the corresponding first images, thereby obtaining M second images. The M first images are obtained by the M cameras synchronously shooting M first areas respectively, and the arrangement of the M first areas is consistent with that of the M cameras of the camera array; an image overlap area exists between any two adjacent first images among the M first images, while no image overlap area exists between any two adjacent second images among the M second images. The M second images form a fourth panoramic image according to the arrangement of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
S306, the first server receives N first images.
S307, the N image transformation relations are in one-to-one correspondence with the N first images, and the first server respectively processes the corresponding first images according to the N image transformation relations to obtain N second images.
In some embodiments, the descriptions of step S306 and step S307 may refer to the descriptions of step S102 and step S103 in the method embodiment of fig. 6, which are not repeated herein for brevity.
In some embodiments, after the first server obtains the N second images, it sends them to the N display devices respectively, so that the N display devices display them respectively. Since the resolution of a conventional display device (e.g., a liquid crystal display) is only on the order of millions of pixels, a single display cannot show all N second images losslessly at the same time; the first server therefore sends the N second images to the N display devices respectively, so that each display device displays one second image, and the positional relationship of the N display devices may be the same as that of the N cameras.
In some embodiments, the distributed system may further include more servers besides the first server and the second server, and correspondingly, the camera array may further include more cameras, which is not specifically limited in the embodiments of the present invention.
In the method embodiment of fig. 9, the first server calculates in advance the N image transformation relations corresponding to the N cameras and the M image transformation relations corresponding to the M cameras, stores the N relations locally, and sends the M relations to the second server. In the application stage, the first server performs real-time image transformation according to the locally stored N relations and sends the transformed images to the display devices for real-time imaging, while the second server performs real-time image transformation on the images captured by the M cameras according to the M relations received in advance and likewise sends the results to the display devices. During transformation, both servers skip operations such as feature point extraction, feature point registration and image stitching, so the image transformation rate is improved and the resource overhead is reduced without affecting image quality. Moreover, because the scheme adopts a distributed design, real-time imaging of hundred-million-pixel video can readily be scaled out by adding servers, making the scheme economical, efficient and timely.
The related method of the embodiment of the present invention is described above, and the related apparatus of the embodiment of the present invention is described below based on the same inventive concept.
Referring to fig. 10, fig. 10 is a schematic block diagram of an image processing server according to an embodiment of the present invention. The following describes the modules in the server of fig. 10:
As shown in fig. 10, the server 400 includes:
an acquiring module 401, configured to acquire N image transformation relationships; the N image transformation relations are in one-to-one correspondence with N third images, the N third images are obtained by respectively shooting N second areas of N cameras of a camera array in a synchronous manner, the arrangement mode of the N second areas is consistent with that of the N cameras of the camera array, any adjacent two third images in the N third images are provided with image overlapping areas at adjacent positions, each image transformation relation in the N image transformation relations represents an image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images, and N is an integer larger than 1;
A communication module 402, configured to receive N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
the N image transformation relations are in one-to-one correspondence with the N first images, and the image transformation module 403 is configured to respectively process the corresponding first images according to the N image transformation relations, so as to obtain N second images; and an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
In a possible embodiment, the obtaining module 401 is specifically configured to: fusing according to the N third images to obtain a second panoramic image; segmenting the second panoramic image to obtain N fourth images; the N third images and the N fourth images are in one-to-one correspondence, and the N image transformation relations are generated according to the N third images and the corresponding fourth images.
In a possible embodiment, the obtaining module 401 is further configured to: extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and splicing the N third images according to the characteristic point pairs to obtain the second panoramic image.
It should be noted that, for the description of the embodiment of fig. 10, reference may be made to the method embodiments of fig. 6, 7 and 9; for example, in the method of fig. 7, the acquisition module 401 may be used to execute steps S201, S202 and S203, the communication module 402 may be used to execute step S204, and the image transformation module 403 may be used to execute step S205. For brevity, no further description is provided herein.
Referring to fig. 11, fig. 11 is a schematic diagram of system interaction provided by an embodiment of the present invention, where the system includes a first server and a second server.
The first server is used for acquiring N image transformation relations corresponding to the N third images one by one and M image transformation relations corresponding to the M third images one by one; the N third images are obtained by respectively and synchronously shooting N second areas by N cameras of a camera array, the camera array further comprises M cameras, the M third images are obtained by respectively and synchronously shooting M second areas by the M cameras, the arrangement mode of the N second areas is consistent with that of the N cameras, the arrangement mode of the M second areas is consistent with that of the M cameras, an image overlapping area exists at the adjacent position of any two adjacent third images in the N third images and the M third images, each image transformation relation in the N image transformation relations and the M image transformation relations represents the image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images and the M third images, and N and M are integers larger than 1;
the first server is further configured to receive N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
The N image transformation relations are in one-to-one correspondence with the N first images, and the first server is further used for respectively processing the corresponding first images according to the N image transformation relations to obtain N second images;
the first server is further configured to send the M image transformation relations to the second server;
the second server is used for receiving M first images; the M first images are obtained by respectively shooting M first areas synchronously by the M cameras, the arrangement mode of the M first areas is consistent with that of the M cameras, and any two adjacent first images in the M first images have an image overlapping area at the adjacent position;
the M image transformation relations are in one-to-one correspondence with the M first images, and the second server is further used for respectively processing the corresponding first images according to the M image transformation relations to obtain M second images;
the image overlapping area does not exist between any two adjacent second images in the N second images and the M second images, the N second images form a first panoramic image according to the arrangement mode of the N cameras, the M second images form a fourth panoramic image according to the arrangement mode of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
In a possible embodiment, the first server is specifically configured to fuse the N third images and the M third images to obtain a third panoramic image; segmenting the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one; generating the N image transformation relations according to the N third images and the N fourth images respectively, and generating the M image transformation relations according to the M third images and the M fourth images respectively.
In a possible embodiment, the first server is specifically configured to: extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3; registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs; and splicing the N third images and the M third images according to the characteristic point pairs to obtain the third panoramic image.
It should be noted that, for brevity, details not mentioned in the system embodiment of fig. 11, please refer to the description of the method embodiment of fig. 9, and further description is omitted herein.
Referring to fig. 12, fig. 12 is a block diagram of a server according to an embodiment of the present invention; the first server referred to in fig. 1, 3, 4 and 5 may take the form shown in fig. 12. The server includes a processor 501 and a memory for storing processor-executable instructions, and the processor is configured to perform the method steps involving the first server in the method embodiments of fig. 6, 7 and 9.
In a possible embodiment, the server may further include: one or more input interfaces 502, one or more output interfaces 503, and a memory 504.
The processor 501, the input interface 502, the output interface 503 and the memory 504 are connected via a bus 505. The memory 504 is used to store instructions, and the processor 501 executes the instructions stored in the memory 504; the input interface 502 receives data, such as the first images in the method embodiment of fig. 6, and the output interface 503 outputs data, such as the second images in the method embodiment of fig. 6.
The processor 501 is configured to invoke and execute the program instructions so as to perform the method steps involving the processor of the first server in the method embodiments of fig. 6, 7 and 9.
It should be appreciated that, in the embodiments of the present invention, the processor 501 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 504 may include read-only memory and random access memory, and provides instructions and data to the processor 501. A portion of the memory 504 may also include non-volatile random access memory; for example, the memory 504 may also store information about the interface type.
In some implementations, the above-described components of the server described in the embodiments of the present invention may be used to perform the method steps involving the first server in the method embodiments of fig. 6, 7 or 9. For brevity, no further description is provided herein.
The embodiment of the invention also provides a camera array comprising N cameras. The N cameras are used to synchronously shoot N second areas respectively to obtain N third images, where the arrangement of the N second areas is consistent with that of the N cameras and any two adjacent third images among the N third images have an image overlap area at their adjacent position; the N cameras are also used to send the N third images to the first server. The N cameras are further used to synchronously shoot N first areas respectively to obtain N first images, where the arrangement of the N first areas is consistent with that of the N cameras and any two adjacent first images among the N first images have an image overlap area at their adjacent position; the N cameras are also used to send the N first images to the first server. The first server is the first server described in the method embodiments of fig. 6, 7 or 9, and for brevity is not described again here.
The embodiment of the invention also provides a display platform comprising N display devices. The N display devices are respectively used to receive the N second images sent by the first server and to display them; no image overlap area exists between any two adjacent second images among the N second images, and the N second images form a first panoramic image according to the arrangement of the N cameras of the camera array. The first server is the first server described in the method embodiments of fig. 6, 7 or 9, and for brevity is not described again here.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, they may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions which, when loaded and executed on a computer, produce in whole or in part the processes or functions according to the embodiments of the present invention. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one network site, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive).
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.

Claims (14)

1. An image processing method, the method comprising:
the method comprises the steps that a first server obtains N image transformation relations; the N image transformation relations are in one-to-one correspondence with N third images, the N third images are obtained by respectively shooting N second areas of N cameras of a camera array in a synchronous manner, the arrangement mode of the N second areas is consistent with that of the N cameras of the camera array, any adjacent two third images in the N third images are provided with image overlapping areas at adjacent positions, each image transformation relation in the N image transformation relations represents an image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images, and N is an integer larger than 1;
the first server receives N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
The N image transformation relations are in one-to-one correspondence with the N first images, and the first server respectively processes the corresponding first images according to the N image transformation relations to obtain N second images; and an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
2. The method of claim 1, wherein the first server obtains N image transformations, comprising:
the first server fuses the N third images to obtain a second panoramic image;
the first server segments the second panoramic image to obtain N fourth images;
the N third images and the N fourth images are in one-to-one correspondence, and the first server generates the N image transformation relations according to the N third images and the corresponding fourth images.
3. The method of claim 2, wherein the first server performs fusion according to the N third images to obtain a second panoramic image, comprising:
the first server extracts feature points from a preset part of images in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3;
The first server registers according to the characteristic points of each third image to obtain a plurality of characteristic point pairs;
and the first server splices the N third images according to the characteristic point pairs to obtain the second panoramic image.
4. The method of claim 1, wherein the first server is a server in a distributed system, the distributed system further comprising a second server, the camera array further comprising M cameras, wherein the first server corresponds to the N cameras of the camera array, the second server corresponds to the M cameras of the camera array, M is an integer greater than 1;
the first server acquires N image transformation relationships, including:
the first server fuses the N third images and the M third images to obtain a third panoramic image; the M third images are obtained by respectively shooting M second areas synchronously by the M cameras, the arrangement mode of the M second areas is consistent with that of the M cameras, and an image overlapping area exists at the adjacent position of any adjacent two third images in the N third images and the M third images;
The first server segments the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one;
the first server generates the N image transformation relations according to the N third images and the N fourth images;
the method further comprises the steps of:
the first server generates M image transformation relations according to the M third images and the M fourth images respectively;
the first server sends the M image transformation relations to the second server; the M image transformation relations are in one-to-one correspondence with the M first images, the M image transformation relations are used for the second server to process the corresponding first images respectively, so that M second images are obtained, the M first images are obtained by respectively shooting M first areas synchronously by the M cameras, the arrangement mode of the M first areas is consistent with that of the M cameras of the camera array, an image overlapping area exists at any adjacent position of two first images in the M first images, an image overlapping area does not exist between any adjacent two second images in the M second images, and the M second images form a fourth panoramic image according to the arrangement mode of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
5. The method of claim 4, wherein the first server performs fusion according to the N third images and the M third images to obtain a third panoramic image, comprising:
the first server extracts feature points from a preset part of images in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3;
the first server registers according to the characteristic points of each third image to obtain a plurality of characteristic point pairs;
and the first server splices the N third images and the M third images according to the characteristic point pairs to obtain the third panoramic image.
6. A server for image processing, the server comprising:
the acquisition module is used for acquiring N image transformation relations; the N image transformation relations are in one-to-one correspondence with N third images, the N third images are obtained by respectively shooting N second areas of N cameras of a camera array in a synchronous manner, the arrangement mode of the N second areas is consistent with that of the N cameras of the camera array, any adjacent two third images in the N third images are provided with image overlapping areas at adjacent positions, each image transformation relation in the N image transformation relations represents an image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images, and N is an integer larger than 1;
The communication module is used for receiving N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
the N image transformation relations are in one-to-one correspondence with the N first images, and the image transformation module is used for respectively processing the corresponding first images according to the N image transformation relations to obtain N second images; and an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of the N cameras.
7. The server according to claim 6, wherein the obtaining module is specifically configured to:
fusing according to the N third images to obtain a second panoramic image;
segmenting the second panoramic image to obtain N fourth images;
the N third images and the N fourth images are in one-to-one correspondence, and the N image transformation relations are generated according to the N third images and the corresponding fourth images.
8. The server of claim 7, wherein the acquisition module is further configured to:
extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3;
registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs;
and splicing the N third images according to the characteristic point pairs to obtain the second panoramic image.
9. An image processing system, the system comprising: a first server and a second server;
the first server is used for acquiring N image transformation relations corresponding to the N third images one by one and M image transformation relations corresponding to the M third images one by one; the N third images are obtained by respectively and synchronously shooting N second areas by N cameras of a camera array, the camera array further comprises M cameras, the M third images are obtained by respectively and synchronously shooting M second areas by the M cameras, the arrangement mode of the N second areas is consistent with that of the N cameras, the arrangement mode of the M second areas is consistent with that of the M cameras, an image overlapping area exists at the adjacent position of any two adjacent third images in the N third images and the M third images, each image transformation relation in the N image transformation relations and the M image transformation relations represents the image overlapping area between the corresponding third image and the adjacent image, the adjacent image is all images adjacent to the corresponding third image in the N third images and the M third images, and N and M are integers larger than 1;
The first server is further configured to receive N first images; the N first images are obtained by respectively shooting N first areas synchronously by the N cameras, the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position;
the N image transformation relations are in one-to-one correspondence with the N first images, and the first server is further used for respectively processing the corresponding first images according to the N image transformation relations to obtain N second images;
the first server is further configured to send the M image transformation relations to the second server;
the second server is used for receiving M first images; the M first images are obtained by respectively shooting M first areas synchronously by the M cameras, the arrangement mode of the M first areas is consistent with that of the M cameras, and any two adjacent first images in the M first images have an image overlapping area at the adjacent position;
the M image transformation relations are in one-to-one correspondence with the M first images, and the second server is further used for respectively processing the corresponding first images according to the M image transformation relations to obtain M second images;
The image overlapping area does not exist between any two adjacent second images in the N second images and the M second images, the N second images form a first panoramic image according to the arrangement mode of the N cameras, the M second images form a fourth panoramic image according to the arrangement mode of the M cameras, and the fourth panoramic image and the first panoramic image jointly form a fifth panoramic image.
10. The system of claim 9, wherein the first server is specifically configured to:
fusing the N third images and the M third images to obtain a third panoramic image;
segmenting the third panoramic image to obtain N fourth images corresponding to the N third images one by one and M fourth images corresponding to the M third images one by one;
generating the N image transformation relations according to the N third images and the N fourth images respectively, and generating the M image transformation relations according to the M third images and the M fourth images respectively.
11. The system of claim 10, wherein the first server is specifically configured to:
Extracting feature points from a preset partial image in each third image; the preset partial image in the third image represents an image, which is within a preset width range from an image boundary, in the third image, and the preset width is smaller than a pixel distance between a center point position of the third image and the image boundary when N is more than or equal to 3;
registering according to the characteristic points of each third image to obtain a plurality of characteristic point pairs;
and splicing the N third images and the M third images according to the characteristic point pairs to obtain the third panoramic image.
12. A server, characterized by comprising an input device, an output device, a memory and a processor; the input device is used for receiving data, the output device is used for sending data, the memory is used for storing data and program instructions, and the processor is used for calling and executing the program instructions; the program instructions, when executed by the processor, cause the server to implement the method of any one of claims 1 to 5.
13. The camera array is characterized by comprising N cameras;
The N cameras are used for respectively shooting N second areas synchronously to obtain N third images; the arrangement mode of the N second areas is consistent with the arrangement mode of the N cameras, and any two adjacent third images in the N third images have an image overlapping area at the adjacent position; the method is also used for sending the N third images to the first server;
the N cameras are also used for respectively and synchronously shooting N first areas to obtain N first images; the arrangement mode of the N first areas is consistent with that of the N cameras, and any two adjacent first images in the N first images have an image overlapping area at the adjacent position; the method is also used for sending the N first images to the first server;
the first server is a server according to any of claims 6-8, or the first server is a server according to claim 12.
14. The display platform is characterized by comprising N display devices, wherein the N display devices are respectively used for receiving N second images sent by a first server and respectively used for displaying the N second images, an image overlapping area does not exist between any two adjacent second images in the N second images, and the N second images form a first panoramic image according to the arrangement mode of N cameras of a camera array;
The first server is a server according to any of claims 6-8, or the first server is a server according to claim 12.
CN201910563110.1A 2019-06-26 2019-06-26 Image processing method and related equipment Active CN112150355B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910563110.1A CN112150355B (en) 2019-06-26 2019-06-26 Image processing method and related equipment
PCT/CN2020/097495 WO2020259444A1 (en) 2019-06-26 2020-06-22 Image processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910563110.1A CN112150355B (en) 2019-06-26 2019-06-26 Image processing method and related equipment

Publications (2)

Publication Number Publication Date
CN112150355A (en) 2020-12-29
CN112150355B (en) 2023-09-29

Family

ID=73870060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910563110.1A Active CN112150355B (en) 2019-06-26 2019-06-26 Image processing method and related equipment

Country Status (2)

Country Link
CN (1) CN112150355B (en)
WO (1) WO2020259444A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113438416B (en) * 2021-06-21 2022-12-09 北京小米移动软件有限公司 Image quantity acquisition method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103856727A (en) * 2014-03-24 2014-06-11 北京工业大学 Multichannel real-time video splicing processing system
CN106780344A (en) * 2017-01-06 2017-05-31 苏州大学 360 ° of the automobile for being shifted based on deviation and being split looks around image split-joint method
CN108076276A (en) * 2016-11-10 2018-05-25 张颖 A kind of real time panoramic joining method based on ranging template
US9990753B1 (en) * 2017-01-11 2018-06-05 Macau University Of Science And Technology Image stitching
KR20190066384A (en) * 2017-12-05 2019-06-13 한국항공대학교산학협력단 Apparatus and method for compose panoramic image based on image segment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5409829B2 (en) * 2012-02-17 2014-02-05 キヤノン株式会社 Image processing apparatus, imaging apparatus, image processing method, and program
CN104423946B (en) * 2013-08-30 2018-02-27 联想(北京)有限公司 A kind of image processing method and electronic equipment
WO2015114621A1 (en) * 2014-02-02 2015-08-06 Trax Technology Solutions Pte. Ltd. System and method for panoramic image processing
US20160295108A1 (en) * 2015-04-01 2016-10-06 Cheng Cao System and method for panoramic imaging

Also Published As

Publication number Publication date
WO2020259444A1 (en) 2020-12-30
CN112150355A (en) 2020-12-29

Similar Documents

Publication Publication Date Title
CN107945112B (en) Panoramic image splicing method and device
US8855441B2 (en) Method and apparatus for transforming a non-linear lens-distorted image
CN108833785B (en) Fusion method and device of multi-view images, computer equipment and storage medium
CN107516294B (en) Method and device for splicing images
CN101431617B (en) Method and system for combining videos for display in real-time
CN110300292B (en) Projection distortion correction method, device, system and storage medium
CN109474780B (en) Method and device for image processing
US20160307350A1 (en) View synthesis - panorama
US8228378B2 (en) Real-time composite image comparator
WO2018153313A1 (en) Stereoscopic camera and height acquisition method therefor and height acquisition system
CN106447602A (en) Image mosaic method and device
CN106447608B (en) A kind of video image joining method and device
CN104392416A (en) Video stitching method for sports scene
CN106709894B (en) Image real-time splicing method and system
US8160394B2 (en) Real-time capture and transformation of hemispherical video images to images in rectilinear coordinates
US11244431B2 (en) Image processing
CN112150355B (en) Image processing method and related equipment
CN116760937B (en) Video stitching method, device, equipment and storage medium based on multiple machine positions
CN113298187A (en) Image processing method and device, and computer readable storage medium
CN115619636A (en) Image stitching method, electronic device and storage medium
CN114663521A (en) All-round-view splicing processing method for assisting parking
CN114331835A (en) Panoramic image splicing method and device based on optimal mapping matrix
KR20160101762A (en) The method of auto stitching and panoramic image genertation using color histogram
WO2024001852A1 (en) Image processing method, image processing apparatus and storage medium
CN111630569A (en) Binocular matching method, visual imaging device and device with storage function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant