CN115797164B - Image stitching method, device and system in a fixed field of view

Image stitching method, device and system in a fixed field of view

Info

Publication number
CN115797164B
CN115797164B
Authority
CN
China
Prior art keywords
image
frame image
time frame
moving
time
Prior art date
Legal status
Active
Application number
CN202111058979.4A
Other languages
Chinese (zh)
Other versions
CN115797164A (en)
Inventor
张丽
唐虎
孙运达
刘永春
李栋
王志明
郑大川
Current Assignee
Tsinghua University
Nuctech Co Ltd
Original Assignee
Tsinghua University
Nuctech Co Ltd
Priority date
Filing date
Publication date
Application filed by Tsinghua University and Nuctech Co Ltd
Priority to CN202111058979.4A
Publication of CN115797164A
Application granted
Publication of CN115797164B
Legal status: Active (current)

Landscapes

  • Image Analysis (AREA)

Abstract

The present disclosure relates to the field of image processing technologies, and in particular to a method, an apparatus, a system, an electronic device, a storage medium, and a program product for image stitching in a fixed field of view. An image stitching method includes: acquiring an image sequence, the image sequence including at least a plurality of frame images of a moving target; determining the same position in the plurality of frame images as a stitching position; determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively, and extracting contextual features from the image region at the foreground; determining, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}; cropping the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position to obtain cropped images; and stitching the cropped images. The method enables a stitched display of a moving target within a fixed field of view.

Description

Image stitching method, device and system in a fixed field of view

Technical field

The present disclosure relates to the field of image processing technology, and more specifically, to an image stitching method, apparatus, system, electronic device, storage medium, and program product for a fixed field of view.

Background

In the broad sense, image stitching combines several static images with overlapping regions (possibly captured at different times, from different viewpoints, or by different sensors) into a seamless panorama or high-resolution image. Existing image stitching technology generally involves two key techniques: image registration and image fusion. Image registration uses image processing methods to compute the matching relationship between the images to be stitched and to obtain the overlapping regions, so that the images can be stitched; image fusion then produces a smooth transition across the seam of the stitched result.

However, when the field of view of the camera or other capture device is fixed and the object to be stitched is a moving target in the scene (for example, a large vehicle in motion, or objects carried on a conveyor belt), existing stitching approaches run into a problem: because the images to be stitched share the same fixed background, feature matching tends to produce incorrect image registration and therefore incorrect stitching results.

Summary of the invention

In view of this, embodiments of the present disclosure provide an image stitching method, apparatus, system, electronic device, storage medium, and program product.

One aspect of the present disclosure provides an image stitching method for a fixed field of view, including: acquiring an image sequence, the image sequence including at least a plurality of frame images that contain a moving target; determining the same position in the plurality of frame images as a stitching position; determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively, and extracting contextual features from the image region at the foreground; determining, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}; cropping the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position to obtain cropped images; and stitching the cropped images.

In some embodiments, before determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, the method further includes: acquiring a background image from the frame images; performing Gaussian mixture modeling on the background image and on the frame image at time t_n, respectively; and determining, based on a comparison between the Gaussian mixture model of the background image and the Gaussian mixture model of the frame image at time t_n, whether the frame image at time t_n contains a moving target.

In some embodiments, when it is determined that the frame image at time t_n does not contain a moving target, the Gaussian mixture model of the background image is updated to the Gaussian mixture model of the frame image at time t_n.

In some embodiments, determining the same position in the plurality of frame images as the stitching position includes: setting a row or a column of the frame images as the stitching position, where the stitching position satisfies the following condition: in the moving direction of the moving target, the stitching position is located ahead of the moving target in the frame image at time t_m, the frame image at time t_m being the first frame image that contains the moving target, m being a positive integer with m less than or equal to n.

In some embodiments, determining the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} from the contextual features includes: matching the contextual features of the frame image at time t_n against the contextual features of the frame image at time t_{n+1} to obtain feature matching pairs; and calculating the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} from the coordinate information of the feature matching pairs in the two frame images.

In some embodiments, extracting contextual features from the image region at the foreground includes: extracting multiple kinds of contextual features within the image region at the foreground; and matching the contextual features of the frame image at time t_n against the contextual features of the frame image at time t_{n+1} includes: matching a combination of the multiple kinds of contextual features of the frame image at time t_n against a combination of the multiple kinds of contextual features of the frame image at time t_{n+1}.

In some embodiments, calculating the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} from the coordinate information of the feature matching pairs includes: calculating the moving distance from the mean absolute difference, or the median of the absolute differences, of the coordinate information of the plurality of feature matching pairs.

In some embodiments, at least one of the contextual features is related to the operating mode of the device that captures the image sequence and/or to the operating environment of that device.

In some embodiments, cropping the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position includes: determining a cropping region, the cropping region being the region that starts at the stitching position and extends, in the direction opposite to the moving direction of the moving target, for a length equal to the moving distance.

In some embodiments, acquiring the image sequence includes: acquiring a video to be stitched and decoding the video to obtain the image sequence.

In some embodiments, determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1} includes: determining the foreground of the moving target in the two frame images using a morphological method.

In some embodiments, after the cropped images are stitched, if it is determined that the frame image at time t_{n+2} does not contain the moving target, the stitching of the cropped images is ended.

Another aspect of the present disclosure provides an image stitching apparatus for a fixed field of view, including: an image sequence acquisition module for acquiring an image sequence, the image sequence including at least a plurality of frame images that contain a moving target; a stitching position setting module for determining the same position in the plurality of frame images as the stitching position; a foreground determination module for determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively; a contextual feature extraction module for extracting contextual features from the image region at the foreground; a moving distance calculation module for determining, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}; a cropping module for cropping the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position, respectively, to obtain cropped images; and a stitching module for stitching the cropped images.

In some embodiments, the apparatus further includes: a background image acquisition module for acquiring a background image from the frame images; a Gaussian mixture modeling module for performing Gaussian mixture modeling on the background image and on the frame image at time t_n, respectively; and a moving target determination module for determining, based on a comparison between the Gaussian mixture model of the background image and the Gaussian mixture model of the frame image at time t_n, whether the frame image at time t_n contains a moving target.

In some embodiments, the apparatus further includes: a feature matching module for matching the contextual features of the frame image at time t_n against the contextual features of the frame image at time t_{n+1} to obtain the feature matching pairs.

In some embodiments, the apparatus further includes: a video processing module for acquiring the video to be stitched and decoding the video.

In some embodiments, the apparatus further includes: a Gaussian mixture model update module for updating the Gaussian mixture model of the background image to the Gaussian mixture model of the frame image at time t_n when it is determined that the frame image at time t_n does not contain a moving target.

Another aspect of the present disclosure provides an image stitching system for a fixed field of view, including: the image stitching apparatus for a fixed field of view described in any of the above, and an image capture device for capturing and forming a video that contains the moving target and for capturing the image background.

Another aspect of the present disclosure provides an electronic device, including: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to perform any of the methods described above.

Another aspect of the present disclosure provides a computer-readable storage medium storing executable instructions that, when executed by a processor, cause the processor to perform any of the methods described above.

Another aspect of the present disclosure provides a computer program product, including a computer program that, when executed by a processor, implements any of the methods described above.

An image stitching method according to an embodiment of the present disclosure includes: acquiring an image sequence, the image sequence including at least a plurality of frame images that contain a moving target; determining the same position in the plurality of frame images as the stitching position; determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively, and extracting contextual features from the image region at the foreground; determining, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}; cropping the two frame images according to the moving distance and the stitching position to obtain cropped images; and stitching the cropped images. In this method, the moving distance of the moving object between adjacent frame images is determined by comparing contextual features; based on this moving distance and the preset stitching position, each frame image is cropped to obtain a portion of the moving target, each crop yielding a partial image of the moving target, and a panoramic display of the moving target is finally achieved by stitching together the regions cropped from all frame images.

Brief description of the drawings

The above and other objects, features, and advantages of the present disclosure will become clearer from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:

Fig. 1 schematically shows a flow chart of a stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure;

Fig. 2 schematically shows a flow chart of another embodiment of the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure;

Fig. 3 schematically shows a flow chart of another embodiment of the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure;

Fig. 4 schematically shows a flow chart of another embodiment of the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure;

Fig. 5 schematically shows a frame image of the moving target at time t_n according to an embodiment of the present disclosure;

Fig. 6 schematically shows a frame image of the moving target at time t_{n+1} according to an embodiment of the present disclosure;

Fig. 7 schematically shows a frame image of the moving target at time t_{n+2} according to an embodiment of the present disclosure;

Fig. 8 schematically shows a structural block diagram of a stitching apparatus for a moving target in a fixed field of view according to an embodiment of the present disclosure;

Fig. 9 schematically shows a block diagram of an electronic device suitable for implementing the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure.

Detailed description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood, however, that these descriptions are exemplary only and are not intended to limit the scope of the present disclosure. In the following detailed description, numerous specific details are set forth for ease of explanation, in order to provide a thorough understanding of the disclosed embodiments. It will be apparent, however, that one or more embodiments may be practiced without these specific details. Furthermore, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. As used herein, the terms "include", "comprise", and the like indicate the presence of the stated features, steps, operations, and/or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.

Where an expression such as "at least one of A, B, or C" is used, it should generally be interpreted according to the meaning that a person skilled in the art would ordinarily give it (for example, "a system having at least one of A, B, or C" includes, but is not limited to, a system having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C together, and so on). The terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, features qualified by "first" or "second" may explicitly or implicitly include one or more of those features.

An embodiment of the present disclosure provides an image stitching method for a fixed field of view, including: acquiring an image sequence, the image sequence including at least a plurality of frame images that contain a moving target; determining the same position in the plurality of frame images as the stitching position; determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively, and extracting contextual features from the image region at the foreground; determining, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}; cropping the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position to obtain cropped images; and stitching the cropped images.

The image stitching method in the embodiments of the present disclosure is suitable for stitching a panoramic image of a moving object under a fixed field of view, and is especially suitable for the case in which the moving object is so large that the imaging range of the fixed field of view cannot directly image the whole object at once.

It should be noted that the moving target in the embodiments of the present disclosure may be a moving object to be photographed in the fixed field of view.

Fig. 1 schematically shows an application scenario of the image stitching method for a fixed field of view according to an embodiment of the present disclosure.

As shown in Fig. 1, the application scenario 100 according to this embodiment may include storage devices 101, 102, and 103. A network 104 serves as the medium that provides communication links between the storage devices 101, 102, 103 and a server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.

A user may use the storage devices 101, 102, 103 to interact with the server 105 through the network 104, so as to upload an image sequence to the server 105 for processing.

The storage devices 101, 102, and 103 may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like.

The server 105 may be a server that performs contextual feature recognition, feature matching, image cropping, and image stitching and fusion on the image sequence.

It should be noted that the stitching method for a moving target in a fixed field of view provided by the embodiments of the present disclosure may generally be executed by the server 105. Correspondingly, the stitching apparatus for a moving target in a fixed field of view provided by the embodiments of the present disclosure may generally be arranged in the server 105. The stitching method provided by the embodiments of the present disclosure may also be executed by a server or server cluster that is different from the server 105 and is capable of communicating with the storage devices 101, 102, 103 and/or the server 105. Correspondingly, the stitching system provided by the embodiments of the present disclosure may also be arranged in a server or server cluster that is different from the server 105 and is capable of communicating with the storage devices 101, 102, 103 and/or the server 105.

It should be understood that the numbers of storage devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.

Based on the scenario described in Fig. 1, the stitching method for a moving target in a fixed field of view according to the embodiments of the present disclosure is described in detail below with reference to Figs. 2 to 7.

Fig. 2 schematically shows a flow chart of the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure.

As shown in Fig. 2, the stitching method for a moving target in a fixed field of view of this embodiment includes operations S210 to S260.

In operation S210, an image sequence is acquired, the image sequence including at least a plurality of frame images that contain the moving target at motion times t.

The image sequence in the embodiments of the present disclosure may include a series of images of the moving target captured consecutively at different times, and may also include a plurality of frame images obtained by processing a video, animation, or the like that contains the moving target. In the application scenario of the embodiments of the present disclosure, the moving target moves within the fixed field of view; correspondingly, the image sequence includes at least the video produced during the entire movement of the moving target, from when it begins to enter the fixed field of view until it has completely left it, and the plurality of frame images obtained by parsing that video and indexed by the motion time t.

In application scenarios of the embodiments of the present disclosure, such as the security inspection of large containers at a port, the capture device covering the fixed field of view stays in operation during working hours, but a moving target is not present in the fixed field of view at every moment. Therefore, the image sequence may also include frame images that do not contain a moving target.

Acquiring the image sequence in the embodiments of the present disclosure may include: acquiring a video that contains the moving target to be stitched, and decoding the video to obtain a plurality of frame images.

The video may include offline video files captured by a camera with a fixed field of view, or online video files uploaded by users to the Internet. The format of the video file may include, but is not limited to, common video formats such as MPEG, AVI, MOV, and WMV, and the format of the decoded frame images may include, but is not limited to, common image formats such as PNG, BMP, or JPG.

Decoding the video in the embodiments of the present disclosure may be implemented with third-party software, for example, the frame processing functions of software such as Photoshop or Adobe Premiere.
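As an illustration, the decoding step can also be scripted; the following is a minimal sketch using OpenCV in Python. The file name and the frame step are hypothetical assumptions for illustration, not values from this disclosure.

```python
import cv2

def decode_video(video_path, frame_step=1):
    """Decode a video file into a list of frame images (BGR NumPy arrays)."""
    capture = cv2.VideoCapture(video_path)   # handles common containers such as MPEG/AVI/MOV/WMV
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()           # returns False when the video ends
        if not ok:
            break
        if index % frame_step == 0:          # keep every frame_step-th frame so adjacent frames still overlap
            frames.append(frame)
        index += 1
    capture.release()
    return frames

# Hypothetical usage: frames = decode_video("fixed_view.avi", frame_step=2)
```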

It will be appreciated that, when decoding the video, the number of frames can be controlled so that adjacent frames containing the moving target overlap, which provides the basis for image stitching in the method provided by the embodiments of the present disclosure.

It should be noted that the frame images obtained by decoding the video are named after the motion time t of the moving target, to make the technical idea of the embodiments of the present disclosure easier to understand and describe: for example, the frame image at time t_0 through the frame image at time t_n. In the following, a frame image with a special meaning or role is named the frame image at time t_m, to distinguish it more clearly.

In operation S220, the same position in the plurality of frame images is determined as the stitching position.

The frame images acquired in the embodiments of the present disclosure are images under a fixed field of view; correspondingly, the background images produced for the fixed field of view all have the same size. As shown in Figs. 5 to 7, the background image is a pixel region of columns a1 to a18 and rows b1 to b14. In this embodiment, column a9 is taken as the stitching position mentioned above; the stitching position is chosen so as to facilitate the subsequent cropping and stitching of the images.

It will be appreciated that different fields of view correspond to background images whose pixel regions have different sizes.

It will be appreciated that the stitching position may also be a row of pixels in the pixel region, or a polyline-shaped stitching line containing several rows or columns of pixels.

To further ensure the completeness of the stitched moving target, the stitching position must also satisfy the following condition when it is set:

In the moving direction of the moving target, the stitching position is located ahead of the moving target in the frame image at time t_m, where the frame image at time t_m is the first frame image that contains the moving target, m is a positive integer, and m is less than or equal to n.

Referring to Fig. 5, the horizontal arrow indicates the moving direction of the moving target. Assume that the current time t_n corresponds to the frame image at time t_m described above, and that the moving target consists of parts c1, c2, c3, and so on; for a clearer illustration and to avoid excessive overlap, only parts c1 and c2 of the moving target are shown in the drawings of this embodiment. At time t_m, the moving target enters the fixed field of view for the first time, and it must be ensured that the chosen stitching position is to the left of c1. The purpose of this constraint is that, if the stitching position intersected the moving target in the frame image at time t_m, the part of the moving target to the left of the stitching position could not be captured by the subsequent image cropping, so the panoramic image of the moving target could not be displayed when the images are stitched.

In operation S230, the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1} is determined, respectively, and contextual features are extracted from the image region at the foreground.

In the embodiments of the present disclosure, a morphological method is used to determine the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}. The morphological opening operation is applied to the binary image formed by the background and the foreground of the moving target, that is, erosion followed by dilation, so as to denoise the frame image and separate the foreground from the background image more cleanly.
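For reference, a minimal sketch of such an opening step in Python with OpenCV; the kernel shape and size are illustrative assumptions.

```python
import cv2

def clean_foreground_mask(mask):
    """Denoise a binary foreground mask with a morphological opening (erosion then dilation)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))  # 5x5 elliptical kernel is an assumed choice
    opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)        # removes small noise blobs from the mask
    return opened
```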

The contextual features in the embodiments of the present disclosure may include SIFT features, HOG features, and the like. In the embodiments of the present disclosure, the SIFT feature extraction method may include the following operations:

constructing a DoG (difference of Gaussians) scale space;

extracting interest points: after all feature points are found, corner points are used for identification, and RANSAC curve fitting is applied to the discrete points to obtain accurate position and scale information of the key points;

orientation assignment: each feature point is assigned an orientation according to the local image structure around the detected key point; a gradient orientation histogram may be used, and when computing the histogram, each sample point added to the histogram is weighted with a circular Gaussian function.

It will be appreciated that, when extracting contextual features, multiple kinds of contextual features may be extracted, so that combinations of these features can be compared in subsequent operations to obtain a more accurate cropping region and facilitate stitching. For example, the SIFT features and HOG features mentioned above can be extracted at the same time, to further weaken the influence of the external environment, such as illumination, on image stitching.
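As an illustration of the feature extraction step, a minimal sketch using OpenCV's SIFT implementation in Python; restricting detection to the foreground region through a mask is an assumption about how the foreground constraint could be applied, not a prescription from this disclosure.

```python
import cv2

def extract_sift_features(frame, foreground_mask=None):
    """Extract SIFT keypoints and descriptors, optionally restricted to the foreground region."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    # The optional mask limits detection to pixels marked as foreground (non-zero in the mask).
    keypoints, descriptors = sift.detectAndCompute(gray, foreground_mask)
    return keypoints, descriptors
```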

It will be appreciated that, when extracting contextual features, at least one of the contextual features should be related to the operating mode of the device that captures the image sequence and/or to its operating environment, so as to reduce the influence of the operating mode and operating environment (for example, daytime versus nighttime inspection) on image stitching.

It should be noted that, before the morphological method is used to determine the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, the following operations also need to be performed.

Fig. 3 schematically shows a flow chart of yet another embodiment of the stitching method for a moving target in a fixed field of view according to an embodiment of the present disclosure, including operations S201 to S203.

In operation S201, a background image is acquired from the frame images.

The background image is an image that does not contain the moving target. For example, the background image may be obtained by photographing the fixed field of view with the capture device when no moving target is present.

In operation S202, Gaussian mixture modeling is performed on the background image and on the frame image at time t_n, respectively.

In operation S203, based on a comparison between the Gaussian mixture model of the background image and the Gaussian mixture model of the frame image at time t_n, it is determined whether the frame image at time t_n contains a moving target. In the Gaussian mixture model, K Gaussian components are used to characterize each pixel of the frame image, where K usually takes a value from 3 to 5. In this determination, each pixel of the frame image at time t_n is matched against the Gaussian mixture model of the background image: if the match succeeds, the pixel belongs to the background; if it fails, the pixel belongs to the foreground. After it is determined that the frame image at time t_n contains a moving target, the morphology-based method described above is applied to further denoise and refine the foreground in the frame image.

The collection of the background image is easily affected by the external environment; for example, illumination, shooting angle, and distance can all cause the Gaussian mixture model of the background image to change. It will be appreciated that Gaussian mixture modeling is used precisely to separate the moving target from the background image more effectively. Therefore, as shown in Fig. 4, in operation S2031, when it is determined that the frame image at time t_n does not contain a moving target, the Gaussian mixture model of the background image is updated to the Gaussian mixture model of the frame image at time t_n, so that the background image stays as close as possible to the background in the frame image at time t_n.
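For reference, OpenCV ships a Gaussian-mixture background subtractor that follows the same general idea of per-pixel mixture models updated over time; the sketch below uses it to decide whether a frame contains a moving target. The parameter values and the foreground-area threshold are illustrative assumptions, not values from this disclosure.

```python
import cv2
import numpy as np

# MOG2 maintains a per-pixel Gaussian mixture background model and updates it frame by frame.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=False)

def contains_moving_target(frame, min_foreground_ratio=0.01):
    """Return (has_target, foreground_mask) for one frame; the ratio threshold is an assumed parameter."""
    mask = subtractor.apply(frame)                                 # 255 where the pixel deviates from the background model
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))  # opening (erode then dilate) removes small noise
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    ratio = np.count_nonzero(mask) / mask.size
    return ratio > min_foreground_ratio, mask
```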

Returning to Fig. 2, in operation S240, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} is determined from the contextual features.

In the embodiments of the present disclosure, the contextual features of the frame image at time t_n are matched against the contextual features of the frame image at time t_{n+1} to obtain feature matching pairs, and the moving distance of the moving target between the two frame images is calculated from the coordinate information of the feature matching pairs in the frame image at time t_n and in the frame image at time t_{n+1}. Taking the SIFT features among the contextual features as an example, the SIFT features on the moving target are extracted from the frame image at time t_n and from the frame image at time t_{n+1}, and the feature descriptors are used to determine the SIFT features that correspond to the same part of the moving target in the two frame images; SIFT features describing the same part of the moving object in different frame images therefore form a feature matching pair. The coordinates of each side of a matching pair in the pixel region of its frame image are then obtained. For example, if one side of a matching pair lies in column a13 of the pixel region in the frame image at time t_n, and the other side lies in column a9 in the frame image at time t_{n+1}, the moving distance of the moving target between the two frame images is computed as the width of four pixel columns.

It will be appreciated that, when a single kind of contextual feature is extracted, features can be extracted at multiple positions on the moving target to obtain multiple feature matching pairs, and the moving distance can be calculated from the mean absolute difference, or the median of the absolute differences, of the coordinate information of these matching pairs, so as to improve the accuracy of the moving distance calculation.

It will be appreciated that, when multiple kinds of contextual features are extracted, different features can be extracted at multiple positions on the moving target to obtain multiple matching pairs of different features, and the moving distance can likewise be calculated from the mean absolute difference, or the median of the absolute differences, of the coordinate information of these matching pairs, so as to improve the accuracy of the moving distance calculation.
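As an illustration of the matching and distance estimation, a minimal sketch in Python with OpenCV, assuming horizontal motion as in Figs. 5 to 7; the ratio-test threshold and the use of the median are illustrative choices consistent with, but not mandated by, the description above.

```python
import cv2
import numpy as np

def estimate_move_distance(desc_prev, kps_prev, desc_next, kps_next, ratio=0.75):
    """Estimate the horizontal displacement (in pixels) between two frames from SIFT matches.

    Returns the median of the absolute column differences of the matched keypoints,
    or None if no reliable matches are found.
    """
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn_matches = matcher.knnMatch(desc_prev, desc_next, k=2)
    diffs = []
    for pair in knn_matches:
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance < ratio * n.distance:          # Lowe's ratio test keeps only distinctive pairs
            x_prev = kps_prev[m.queryIdx].pt[0]      # column of the keypoint in the frame at t_n
            x_next = kps_next[m.trainIdx].pt[0]      # column of the matched keypoint in the frame at t_n+1
            diffs.append(abs(x_prev - x_next))
    if not diffs:
        return None
    return int(round(float(np.median(diffs))))       # median of absolute differences, robust to mismatches
```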

In operation S250, the frame image at time t_n and the frame image at time t_{n+1} are cropped according to the moving distance and the stitching position, respectively, to obtain cropped images.

For example, in operation S250, a cropping region may be determined; the cropping region starts at the stitching position and extends, in the direction opposite to the moving direction of the moving target, for a length equal to the moving distance. Referring to Figs. 5 to 7, the moving target moves from right to left in the drawings. Column a9 in the figures marks the stitching position described in operation S220. At time t_n in Fig. 5, the front end of part c1 of the moving target has reached column a13 of the pixel region; at time t_{n+1} in Fig. 6, it has reached column a9. Using the contextual-feature-based method for computing the moving distance described above, the moving distance is found to be the width of four pixel columns, and delimiting the region according to this rule gives the region indicated by the horizontal double-headed arrow in Fig. 6, which is the cropping region. This cropping region applies at least to the frame image at time t_n and the frame image at time t_{n+1}. Cropping the frame image at time t_n yields a blank region that does not contain the moving target, while cropping the frame image at time t_{n+1} yields the shaded portion of the moving target shown in Fig. 6. In the same way, the frame image at time t_{n+2} can be cropped to obtain the shaded portion of the moving target shown in Fig. 7.
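A minimal sketch of this cropping step, assuming right-to-left motion and column indices increasing to the right, so that the cropping region is the strip of columns starting at the stitching column and extending move_distance columns to the right; these orientation assumptions are illustrative.

```python
def crop_strip(frame, splice_col, move_distance):
    """Crop the region that starts at the stitching column and extends against the motion direction."""
    height, width = frame.shape[:2]
    end_col = min(splice_col + move_distance, width)  # clamp to the image border
    return frame[:, splice_col:end_col].copy()        # all rows, move_distance columns starting at the seam
```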

In operation S260, the cropped images are stitched.

Taking Figs. 5 to 7 as an example, the complete parts c1 and c2 of the moving target can be obtained by stitching the moving-target portions in the cropping regions, that is, by stitching the blank region cropped from Fig. 5 together with the shaded portions of the moving target in Figs. 6 and 7. When stitching the cropping regions, position a9 in Fig. 5 is taken as the starting point, and column a13 in Fig. 5 is joined to column a9 in Fig. 6 to stitch the two cropping regions; the subsequent regions are stitched in the same manner, which is not repeated here.

It will be appreciated that the cropping regions may be stitched according to the rule above either after all of them have been obtained, or by performing one stitching operation each time a cropping region is obtained.
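A minimal sketch of the overall loop that accumulates the cropped strips into a panorama by horizontal concatenation. It reuses the hypothetical helpers from the earlier sketches (extract_sift_features, estimate_move_distance, crop_strip); those names, the frame list, and the splice column are assumptions for illustration, and the left-to-right assembly assumes right-to-left motion as in Figs. 5 to 7.

```python
import numpy as np

def stitch_moving_target(frames, splice_col):
    """Build a panorama of the moving target by cropping a strip per frame and concatenating the strips."""
    strips = []
    kps_prev, desc_prev = extract_sift_features(frames[0])        # helper from the SIFT sketch above
    for frame in frames[1:]:
        kps_next, desc_next = extract_sift_features(frame)
        distance = estimate_move_distance(desc_prev, kps_prev,    # helper from the matching sketch above
                                          desc_next, kps_next)
        if distance:                                              # skip frames with no measurable motion
            strips.append(crop_strip(frame, splice_col, distance))
        kps_prev, desc_prev = kps_next, desc_next
    # Strips are collected in time order; with right-to-left motion the earliest strip holds the front
    # (leftmost) part of the target, so concatenating left to right reproduces the whole target.
    return np.hstack(strips) if strips else None
```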

After operation S260, when it is determined whether the frame image at time t_{n+2} contains the moving target, if the frame image at time t_{n+2} does not contain the moving target, the whole image stitching procedure can be ended, which indicates that the stitching operation has been completed; the method is executed again, in a loop, when a new moving target is detected.

Based on the above stitching method for a moving target in a fixed field of view, the embodiments of the present disclosure further provide a stitching apparatus for a moving target in a fixed field of view. The apparatus is described in detail below with reference to Fig. 8.

Fig. 8 schematically shows a structural block diagram of the stitching apparatus for a moving target in a fixed field of view according to an embodiment of the present disclosure.

As shown in Fig. 8, the stitching apparatus 300 includes an image sequence acquisition module 301, a stitching position setting module 302, a foreground determination module 303, a contextual feature extraction module 304, a moving distance calculation module 305, a cropping module 306, and a stitching module 307.

The image sequence acquisition module 301 is configured to acquire an image sequence, the image sequence including at least a plurality of frame images that contain the moving target at motion times t.

The stitching position setting module 302 is configured to determine the same position in the plurality of frame images as the stitching position.

The foreground determination module 303 is configured to determine the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, respectively.

The contextual feature extraction module 304 is configured to extract contextual features from the image region at the foreground.

The moving distance calculation module 305 is configured to determine, from the contextual features, the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}.

The cropping module 306 is configured to crop the frame image at time t_n and the frame image at time t_{n+1} according to the moving distance and the stitching position, respectively, to obtain cropped images.

The stitching module 307 is configured to stitch the cropped images.

For example, the stitching apparatus for a moving target in a fixed field of view according to an embodiment of the present disclosure further includes: a background image acquisition module for acquiring a background image from the frame images; a Gaussian mixture modeling module for performing Gaussian mixture modeling on the background image and on the frame image at time t_n, respectively; and a moving target determination module for determining, based on a comparison between the Gaussian mixture model of the background image and the Gaussian mixture model of the frame image at time t_n, whether the frame image at time t_n contains a moving target.

For example, the stitching apparatus according to an embodiment of the present disclosure further includes a feature matching module for matching the contextual features of the frame image at time t_n against the contextual features of the frame image at time t_{n+1} to obtain the feature matching pairs.

For example, the stitching apparatus according to an embodiment of the present disclosure further includes a video processing module for acquiring the video to be stitched and decoding the video.

For example, the stitching apparatus according to an embodiment of the present disclosure further includes a Gaussian mixture model update module for updating the Gaussian mixture model of the background image to the Gaussian mixture model of the frame image at time t_n when it is determined that the frame image at time t_n does not contain a moving target.

In the embodiments of the present disclosure, an image sequence is acquired, the image sequence including at least a plurality of frame images that contain a moving target; the same position in the plurality of frame images is determined as the stitching position; the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1} is determined, respectively, and contextual features are extracted from the image region at the foreground; the moving distance of the moving target between the two frame images is determined from the contextual features; the frame image at time t_n and the frame image at time t_{n+1} are cropped according to the moving distance and the stitching position to obtain cropped images; and the cropped images are stitched. By comparing contextual features, the moving distance of the moving object between adjacent frame images is determined; based on this moving distance and the preset stitching position, each frame image is cropped to obtain a portion of the moving target, each crop yielding a partial image of the moving target; finally, by stitching together the regions cropped from all frame images, a panoramic display of the moving target is achieved.

According to embodiments of the present disclosure, any number of the image sequence acquisition module 301, the stitching position setting module 302, the foreground determination module 303, the contextual feature extraction module 304, the moving distance calculation module 305, the cropping module 306, and the stitching module 307 may be combined and implemented in one module, or any one of them may be split into several modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the image sequence acquisition module 301, the stitching position setting module 302, the foreground determination module 303, the contextual feature extraction module 304, the moving distance calculation module 305, the cropping module 306, and the stitching module 307 may be implemented at least in part as a hardware circuit, for example, a field-programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on substrate, a system on package, or an application-specific integrated circuit (ASIC), or in any other reasonable way of integrating or packaging a circuit, in hardware or firmware, or in any one of, or an appropriate combination of, the three implementations of software, hardware, and firmware. Alternatively, at least one of these modules may be implemented at least in part as a computer program module which, when run, performs the corresponding function.

The embodiments of the present disclosure further provide a stitching system for a moving target in a fixed field of view, including the image stitching apparatus for a fixed field of view of the embodiments above and an image capture device for capturing and forming a video that contains the moving target and for capturing the image background.

FIG. 9 schematically shows a block diagram of an electronic device suitable for implementing the method of stitching a moving target in a fixed field of view according to an embodiment of the present disclosure.

As shown in FIG. 9, an electronic device 400 according to an embodiment of the present disclosure includes a processor 401, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The processor 401 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application specific integrated circuit (ASIC)), and so on. The processor 401 may also include onboard memory for caching purposes. The processor 401 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.

The RAM 403 stores various programs and data required for the operation of the electronic device 400. The processor 401, the ROM 402 and the RAM 403 are connected to one another through a bus 404. The processor 401 performs the various operations of the method flow according to embodiments of the present disclosure by executing programs in the ROM 402 and/or the RAM 403. It should be noted that the programs may also be stored in one or more memories other than the ROM 402 and the RAM 403. The processor 401 may also perform the various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.

According to an embodiment of the present disclosure, the electronic device 400 may further include an input/output (I/O) interface 405, which is also connected to the bus 404. The electronic device 400 may also include one or more of the following components connected to the I/O interface 405: an input section 406 including a keyboard, a mouse and the like; an output section 407 including a cathode ray tube (CRT), a liquid crystal display (LCD) and the like, as well as a speaker and the like; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.

The present disclosure also provides a computer-readable storage medium, which may be included in the device/apparatus/system described in the above embodiments, or may exist alone without being assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to embodiments of the present disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, but is not limited to, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 402 and/or the RAM 403 described above, and/or one or more memories other than the ROM 402 and the RAM 403.

Embodiments of the present disclosure also include a computer program product comprising a computer program that contains program code for performing the method illustrated in the flowchart. When the computer program product runs in a computer system, the program code causes the computer system to implement the image stitching method in a fixed field of view provided by the embodiments of the present disclosure.

When the computer program is executed by the processor 401, the above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed. According to embodiments of the present disclosure, the systems, apparatuses, modules, units and the like described above may be implemented by computer program modules.

In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, downloaded and installed through the communication section 409, and/or installed from the removable medium 411. The program code contained in the computer program may be transmitted over any appropriate network medium, including but not limited to wireless, wired, or any suitable combination of the above.

In such embodiments, the computer program may be downloaded and installed from a network through the communication section 409 and/or installed from the removable medium 411. When the computer program is executed by the processor 401, the above-described functions defined in the system of the embodiments of the present disclosure are performed. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units and the like described above may be implemented by computer program modules.

According to embodiments of the present disclosure, the program code for carrying out the computer program provided by the embodiments of the present disclosure may be written in any combination of one or more programming languages; specifically, these computing programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In cases involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features described in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in the present disclosure. In particular, the features described in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.

The embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments have been described separately above, this does not mean that the measures in the various embodiments cannot be advantageously used in combination. The scope of the present disclosure is defined by the appended claims and their equivalents. Without departing from the scope of the present disclosure, those skilled in the art may make various substitutions and modifications, all of which should fall within the scope of the present disclosure.

Claims (17)

1. A method of image stitching in a fixed field of view, comprising:
acquiring an image sequence, wherein the image sequence comprises at least a plurality of frame images containing a moving target;
determining the same position in the plurality of frame images as a stitching position;
respectively determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, and extracting a plurality of contextual features from the image region at the foreground, wherein n is a positive integer;
determining, according to the contextual features, a moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}, wherein determining the moving distance comprises: performing feature matching between the combination of the plurality of contextual features of the frame image at time t_n and the combination of the plurality of contextual features of the frame image at time t_{n+1} to obtain feature matching pairs; and calculating the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} according to the coordinate information of the feature matching pairs in the frame image at time t_n and in the frame image at time t_{n+1};
cropping the frame image at time t_n and the frame image at time t_{n+1} respectively according to the moving distance and the stitching position to obtain cropped images; and
stitching the cropped images.
2. The method of image stitching in a fixed field of view according to claim 1, wherein, before respectively determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1}, the method further comprises:
acquiring a background image from the frame images;
performing Gaussian mixture modeling on the background image and on the frame image at time t_n respectively; and
judging whether the frame image at time t_n contains the moving target by comparing the result of the Gaussian mixture model of the background image with the result of the Gaussian mixture model of the frame image at time t_n.
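By way of illustration only, the background/frame comparison described in claim 2 could be realized with OpenCV's Gaussian-mixture background subtractor (MOG2); the foreground-area threshold min_fg_ratio below is an assumed tuning parameter, not a feature of the claim.

import cv2
import numpy as np

back_sub = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=False)

def contains_moving_target(frame, min_fg_ratio=0.01):
    # Compare the frame against the current Gaussian mixture model of the background
    # without updating the model (learningRate=0), then judge by the foreground area.
    fg_mask = back_sub.apply(frame, learningRate=0)
    return np.count_nonzero(fg_mask) / fg_mask.size > min_fg_ratio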
3. The method of image stitching in a fixed field of view according to claim 2, wherein, when it is determined that the frame image at time t_n does not contain the moving target, the Gaussian mixture model of the background image is updated with the Gaussian mixture model of the frame image at time t_n.
4. The method of image stitching in a fixed field of view according to claim 2, wherein determining the same position in the plurality of frame images as the stitching position comprises:
setting a row or a column of the frame images as the stitching position, wherein the stitching position satisfies the following condition:
in the moving direction of the moving target, the stitching position is located in front of the moving target in the frame image at time t_m, wherein the frame image at time t_m is the first frame image that contains the moving target, m is a positive integer, and m is less than or equal to n.
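Assuming left-to-right motion and a column-wise stitching position, the stitching column of claim 4 could, for example, be chosen just ahead of the target's leading edge in the first frame that contains it; the helper below and its margin parameter are hypothetical.

import numpy as np

def choose_stitch_column(first_fg_mask, margin=5):
    # Columns occupied by the target in the binary foreground mask of the t_m-time frame image.
    occupied = np.where(first_fg_mask.any(axis=0))[0]
    leading_edge = int(occupied.max()) if occupied.size else 0
    # Place the stitching column a few pixels in front of the target, clamped to the image width.
    return min(leading_edge + margin, first_fg_mask.shape[1] - 1)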
5. The method of image stitching in a fixed field of view according to claim 1, wherein calculating the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} according to the coordinate information of the feature matching pairs in the frame image at time t_n and in the frame image at time t_{n+1} comprises:
calculating the moving distance according to the mean of the absolute differences, or the median of the absolute differences, of the coordinate information of the plurality of feature matching pairs.
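A small sketch of this distance estimate, assuming each feature matching pair is given as a pair of (x, y) points and that the target moves along the x axis; both the mean and the median variants of the claim are shown.

import numpy as np

def moving_distance(matched_pairs, use_median=True):
    # matched_pairs: iterable of ((x_n, y_n), (x_n1, y_n1)) coordinate pairs (assumed layout).
    diffs = np.array([abs(p2[0] - p1[0]) for p1, p2 in matched_pairs])
    if diffs.size == 0:
        return 0.0
    return float(np.median(diffs) if use_median else np.mean(diffs))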
6. The method of image stitching in a fixed field of view according to claim 5, wherein at least one of the contextual features relates to an operating mode of a device acquiring the image sequence and/or an operating environment of the device.
7. The method of image stitching in a fixed field of view according to claim 1, wherein cropping the frame image at time t_n and the frame image at time t_{n+1} respectively according to the moving distance and the stitching position comprises:
determining a cropping region, wherein the cropping region starts at the stitching position and extends, in the direction opposite to the moving direction of the moving target, for a length equal to the moving distance.
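As a hypothetical one-function illustration of this cropping rule, again assuming left-to-right motion and a column-wise stitching position:

def crop_strip(frame, stitch_col, move_dist):
    # The cropping region starts at the stitching column and extends move_dist pixels
    # opposite to the (assumed left-to-right) moving direction of the target.
    start = max(stitch_col - int(move_dist), 0)
    return frame[:, start:stitch_col]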
8. The method of image stitching in a fixed field of view according to any one of claims 1-7, wherein acquiring the image sequence comprises:
acquiring a video to be stitched, and decoding the video to obtain the image sequence.
9. The method of image stitching in a fixed field of view according to any one of claims 1-7, wherein respectively determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1} comprises:
determining the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1} using a morphological method.
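One possible form of this morphological step, assuming an opening-then-closing clean-up of the raw foreground mask with an illustrative kernel size:

import cv2

def refine_foreground(raw_mask, kernel_size=5):
    # Opening removes isolated noise pixels; closing fills small holes inside the target.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    opened = cv2.morphologyEx(raw_mask, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)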
10. The method according to claim 1, wherein, after stitching the cropped images, if it is determined that the frame image at time t_{n+2} does not contain the moving target, the stitching of the cropped images is ended.
11. An image stitching device in a fixed field of view, comprising:
an image sequence acquisition module, configured to acquire an image sequence, wherein the image sequence comprises at least a plurality of frame images containing a moving target;
a stitching position setting module, configured to determine the same position in the plurality of frame images as a stitching position;
a foreground determining module, configured to respectively determine the foreground of the moving target in the frame image at time t_n and in the frame image at time t_{n+1};
a contextual feature extraction module, configured to extract contextual features from the image region at the foreground, which comprises extracting a plurality of contextual features from the image region at the foreground;
a moving distance calculation module, configured to determine, according to the contextual features, a moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1}, which comprises: performing feature matching between the combination of the plurality of contextual features of the frame image at time t_n and the combination of the plurality of contextual features of the frame image at time t_{n+1} to obtain feature matching pairs; and calculating the moving distance of the moving target between the frame image at time t_n and the frame image at time t_{n+1} according to the coordinate information of the feature matching pairs in the frame image at time t_n and in the frame image at time t_{n+1};
a cropping module, configured to crop the frame image at time t_n and the frame image at time t_{n+1} respectively according to the moving distance and the stitching position to obtain cropped images; and
a stitching module, configured to stitch the cropped images.
12. The image stitching device in a fixed field of view according to claim 11, further comprising:
a background image acquisition module, configured to acquire a background image from the frame images;
a Gaussian mixture modeling module, configured to perform Gaussian mixture modeling on the background image and on the frame image at time t_n respectively; and
a moving target judging module, configured to judge whether the frame image at time t_n contains the moving target by comparing the result of the Gaussian mixture model of the background image with the result of the Gaussian mixture model of the frame image at time t_n.
13. The image stitching device in a fixed field of view according to claim 11, further comprising:
a video processing module, configured to acquire a video to be stitched and to decode the video.
14. The image stitching device in a fixed field of view according to claim 11, further comprising:
a Gaussian mixture model updating module, configured to update the Gaussian mixture model of the background image with the Gaussian mixture model of the frame image at time t_n when it is determined that the frame image at time t_n does not contain the moving target.
15. An image stitching system in a fixed field of view, comprising:
the image stitching device in a fixed field of view according to any one of claims 11-14; and
an image acquisition device, configured to acquire and form a video containing a moving target, and to acquire an image background.
16. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-10.
17. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 10.

Publications (2)

Publication Number Publication Date
CN115797164A CN115797164A (en) 2023-03-14
CN115797164B (en) 2023-12-12


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116612168B (en) * 2023-04-20 2024-06-28 北京百度网讯科技有限公司 Image processing method, device, electronic equipment, image processing system and medium

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301699A (en) * 2013-07-16 2015-01-21 浙江大华技术股份有限公司 Image processing method and device
CN105096338A (en) * 2014-12-30 2015-11-25 天津航天中为数据系统科技有限公司 Moving object extraction method and device
US9947108B1 (en) * 2016-05-09 2018-04-17 Scott Zhihao Chen Method and system for automatic detection and tracking of moving objects in panoramic video
CN108230364A (en) * 2018-01-12 2018-06-29 东南大学 A kind of foreground object motion state analysis method based on neural network
CN109447082A (en) * 2018-08-31 2019-03-08 武汉尺子科技有限公司 A kind of scene motion Target Segmentation method, system, storage medium and equipment
CN110097063A (en) * 2019-04-30 2019-08-06 网易有道信息技术(北京)有限公司 Data processing method, medium, device and the calculating equipment of electronic equipment
CN110136199A (en) * 2018-11-13 2019-08-16 北京初速度科技有限公司 A kind of vehicle location based on camera, the method and apparatus for building figure
CN110675358A (en) * 2019-09-30 2020-01-10 上海扩博智能技术有限公司 Image stitching method, system, equipment and storage medium for long object
CN110876036A (en) * 2018-08-31 2020-03-10 腾讯数码(天津)有限公司 Video generation method and related device
CN111612696A (en) * 2020-05-21 2020-09-01 网易有道信息技术(北京)有限公司 Image splicing method, device, medium and electronic equipment
CN112819694A (en) * 2021-01-18 2021-05-18 中国工商银行股份有限公司 Video image splicing method and device
CN112969037A (en) * 2021-02-26 2021-06-15 北京卓视智通科技有限责任公司 Video image lateral fusion splicing method, electronic equipment and storage medium
CN112991180A (en) * 2021-03-25 2021-06-18 北京百度网讯科技有限公司 Image splicing method, device, equipment and storage medium
WO2021129669A1 (en) * 2019-12-23 2021-07-01 RealMe重庆移动通信有限公司 Image processing method and system, electronic device, and computer-readable medium
CN113286194A (en) * 2020-02-20 2021-08-20 北京三星通信技术研究有限公司 Video processing method and device, electronic equipment and readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108205797B (en) * 2016-12-16 2021-05-11 杭州海康威视数字技术股份有限公司 Panoramic video fusion method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Research on key technologies of panoramic video stitching; 蓝先迪; China Master's Theses Full-text Database, Information Science and Technology; I138-757 *
Moving object detection algorithm for dynamic scenes based on image registration; 丁莹; 范静涛; 杨华民; 姜会林; Journal of Changchun University of Science and Technology (Natural Science Edition) (Z1); 4-9 *
Automatic stitching of video image sequences based on an improved SIFT algorithm; 卢斌; 宋夫华; Science of Surveying and Mapping (01); 23-25 *
Research on image stitching and synthesis algorithms in video surveillance; 苗立刚; Chinese Journal of Scientific Instrument (04); 857-861 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant