CN113901268A - Video image background acquisition method


Info

Publication number
CN113901268A
Authority
CN
China
Prior art keywords: image, optical flow, flow field, video, generating
Prior art date
Legal status (assumption, not a legal conclusion)
Pending
Application number
CN202111249178.6A
Other languages
Chinese (zh)
Inventor
林丕成
宋开银
叶春雨
Current Assignee
Shenyan Artificial Intelligence Technology Shenzhen Co ltd
Original Assignee
Shenyan Artificial Intelligence Technology Shenzhen Co ltd
Priority date
Filing date
Publication date
Application filed by Shenyan Artificial Intelligence Technology Shenzhen Co ltd
Priority to CN202111249178.6A
Publication of CN113901268A


Classifications

    • G06F16/7844 — Retrieval of video data using metadata automatically derived from the content (e.g. text extracted from visual content or audio transcripts)
    • G06T3/40 — Geometric image transformations: scaling of whole images or parts thereof
    • G06T5/70 — Image enhancement or restoration: denoising; smoothing
    • G06T5/90 — Image enhancement or restoration: dynamic range modification
    • G06T7/269 — Image analysis: analysis of motion using gradient-based methods
    • G06T2207/10016 — Image acquisition modality: video; image sequence


Abstract

The invention relates to the technical field of video file processing and in particular discloses a video image background acquisition system, comprising: a database generation unit for receiving a video file and generating an image database from the video file; a grayscale conversion unit for reading the video frame images in the image database and converting them into grayscale images; a calculation unit for generating an optical flow field matrix from the grayscale images and computing the corresponding optical flow field data; and a picture generation unit for obtaining the foreground object regions in the video frame images and generating a background picture from the optical flow field data and the foreground object regions. The invention combines the processing of moving pixels across consecutive video frames with foreground object detection to obtain an accumulated background picture, so that foreground objects can be effectively identified and then eliminated.

Description

Video image background acquisition method
Technical Field
The invention relates to the technical field of video file processing, in particular to a video image background acquisition method.
Background
With the development of computer technology, computer vision has been widely applied across industries in recent years. Image classification, semantic segmentation, object detection and other techniques based on deep learning algorithms continually set new records in the computer vision field and are popular in both academia and industry. A computer vision model based on a deep learning algorithm supports end-to-end training without complex image preprocessing or feature extraction, and, with the support of today's rich and diverse inference platforms, a trained model can easily be deployed as a relatively independent whole to a variety of practical application scenarios.
However, in some relatively complex tasks the image objects often cannot be processed directly by a deep learning algorithm because of occlusion. In a road damage detection task in a traffic scene, for example, foreground objects such as vehicles and pedestrians on the road occlude the objects to be detected and cause detection failure. A simple and practical method is therefore needed to eliminate the interference of the foreground with those objects.
Disclosure of Invention
The present invention is directed to a video image background acquisition method, so as to solve the problems raised in the background art above.
To achieve this object, the invention provides the following technical scheme:
a video image background acquisition method, the method comprising:
receiving a video file, and generating an image database based on the video file;
reading a video frame image in the image database, and converting the video frame image into a gray image;
generating an optical flow field matrix based on the gray-scale map, and calculating corresponding optical flow field data;
and acquiring a foreground object region in the video frame image, and generating a background picture based on the optical flow field data and the foreground object region.
As a further limitation of the technical scheme of the invention: the specific steps of receiving a video file and generating an image database based on the video file include:
receiving a video file, and decompressing the video file;
acquiring a video frame image in a decompressed video file, and reading a corresponding time item;
the video frame images are ranked based on the time entry, and an image database is generated.
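The steps above can be sketched in Python. The record structure and field names are illustrative assumptions, not from the patent: decoded frames arrive with a time item, are sorted by it, and the ordered list serves as the image database.

```python
# Sketch of database generation: frames may arrive out of order,
# so each carries a time item and is sorted by it before storage.
from dataclasses import dataclass
from typing import Any, List

@dataclass
class FrameRecord:
    time_item: float          # presentation timestamp of the frame
    image: Any                # decoded pixel data (placeholder here)

def build_image_database(records: List[FrameRecord]) -> List[FrameRecord]:
    """Rank video frame images by their time item (step S13)."""
    return sorted(records, key=lambda r: r.time_item)

# Frames received out of their true temporal order:
received = [FrameRecord(0.08, "f3"), FrameRecord(0.00, "f1"),
            FrameRecord(0.04, "f2")]
database = build_image_database(received)
print([r.image for r in database])  # -> ['f1', 'f2', 'f3']
```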
As a further limitation of the technical scheme of the invention: the specific steps of converting the video frame image into a gray scale image include:
scaling the video frame image;
sequentially traversing pixel points of a video frame image to obtain color values of the pixel points;
and generating a gray array based on the color values of the pixel points, and confirming a gray image based on the gray array.
As a further limitation of the technical scheme of the invention: the specific steps of generating an optical flow field matrix based on the gray-scale map and calculating corresponding optical flow field data comprise:
generating an optical flow field matrix based on the gray scale map;
taking the absolute value of the optical flow field matrix;
and carrying out noise reduction processing on the optical flow field matrix after absolute value processing.
As a further limitation of the technical scheme of the invention: the specific steps of generating a background picture based on the optical flow field data and the foreground object include:
confirming the positions of the pixel points in the foreground object region, and generating a fitting matrix based on the pixel point positions;
reading the processed optical flow field matrix, and performing AND operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
and generating a background picture based on the result matrix.
A video image background acquisition system, the system comprising:
the database generation unit is used for receiving a video file and generating an image database based on the video file;
the grayscale image conversion unit is used for reading the video frame image in the image database and converting the video frame image into a grayscale image;
the calculation unit is used for generating an optical flow field matrix based on the gray-scale image and calculating corresponding optical flow field data;
and the picture generation unit is used for acquiring a foreground object area in the video frame image and generating a background picture based on the optical flow field data and the foreground object area.
As a further limitation of the technical scheme of the invention: the database generation unit specifically includes:
the decompression module is used for receiving the video file and decompressing the video file;
the time item reading module is used for acquiring the video frame image in the decompressed video file and reading the corresponding time item;
and the sorting module is used for sorting the video frame images based on the time items and generating an image database.
As a further limitation of the technical scheme of the invention: the grayscale map conversion unit specifically includes:
a scaling module for scaling the video frame image;
the color value acquisition module is used for sequentially traversing pixel points of the video frame image and acquiring color values of the pixel points;
and the gray level image confirmation module is used for generating a gray level array based on the color values of the pixel points and confirming the gray level image based on the gray level array.
As a further limitation of the technical scheme of the invention: the computing unit specifically includes:
the optical flow field matrix generating module is used for generating an optical flow field matrix based on the gray-scale map;
the absolute value module is used for taking the absolute value of the optical flow field matrix;
and the noise reduction module is used for carrying out noise reduction processing on the optical flow field matrix after absolute value conversion.
As a further limitation of the technical scheme of the invention: the picture generation unit specifically includes:
the fitting matrix generating module is used for confirming the positions of the pixel points in the foreground object region and generating a fitting matrix based on the pixel point positions;
the result matrix generation module is used for reading the processed optical flow field matrix and carrying out AND operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
and the output module is used for generating a background picture based on the result matrix.
Compared with the prior art, the invention has the following beneficial effects: the invention combines the processing of moving pixels across consecutive video frames with foreground object detection to obtain an accumulated background picture. Foreground objects can be effectively identified and then eliminated, and after multiple frames are continuously superimposed and accumulated, a clean background image free of occlusion by moving foreground objects is obtained for the next stage of image processing.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention.
Fig. 1 is a flow chart of a video image background acquisition method.
Fig. 2 is a first sub-flow diagram of a video image background acquisition method.
Fig. 3 is a second sub-flow diagram of a video image background acquisition method.
Fig. 4 is a third sub-flow diagram of a video image background acquisition method.
Fig. 5 is a fourth sub-flowchart of a video image background acquisition method.
Fig. 6 is a block diagram of a video image background acquisition system.
Fig. 7 is a schematic structural diagram of a database generation unit in the video image background acquisition method.
Fig. 8 is a schematic structural diagram of a grayscale map conversion unit in the video image background acquisition method.
Fig. 9 is a schematic structural diagram of a computing unit in the video image background acquisition method.
Fig. 10 is a schematic structural diagram of a picture generation unit in the video image background acquisition method.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
Fig. 1 shows a flow chart of a video image background acquisition method, and provides a video image background acquisition method, which specifically includes:
step S1: receiving a video file, and generating an image database based on the video file;
Step S1 converts the video file. A video file is a collection of pictures and audio; during image background acquisition only the pictures matter, and the audio can be ignored. For these pictures, the best approach is to strip them from the video file and assemble them into an image database.
Step S2: reading a video frame image in the image database, and converting the video frame image into a gray image;
The data items in the image database are the individual frame images of the video file, a set of static pictures. For picture processing, grayscale is generally used: normal pictures are mostly in RGB mode, and an RGB picture carries three values per pixel, which is troublesome during processing, both increasing the difficulty of program design and placing certain demands on program execution.
Step S3: generating an optical flow field matrix based on the gray-scale map, and calculating corresponding optical flow field data;
the optical flow field matrix obtained after the optical flow calculation contains positive and negative values, wherein the positive and negative values only represent the direction of pixel point motion but not the size, and the absolute value of the optical flow field matrix calculated by the optical flow algorithm needs to be taken to facilitate the judgment of the pixel motion scale in the next step. The number of the optical flow field matrix smaller than the threshold is set to be 0 to eliminate the environmental noise and estimate the displacement of the static pixel point from the algorithm, and the number of the optical flow field matrix larger than the threshold is set to be 255 to be used as the confirmation of the moving object.
Step S4: acquiring a foreground object region in the video frame image, and generating a background picture based on the optical flow field data and the foreground object region;
and (3) deducing to obtain a foreground object bdbox in the picture through a target detection algorithm, and setting the mask matrix value at the position of the bdbox as 0. Adding the values in the x and y directions in the processed optical flow field matrix, and combining the added values into a one-dimensional matrix with the same size as the mask matrix, wherein the values in the matrix comprise: 0. 255 to 510. 0 indicates that the pixel has not found motion and is set to 1, and 255 and 510 indicate that the pixel has been shifted and is set to 0. And performing sum operation on the generated matrix and the mask matrix, zooming the result matrix after the sum closest interpolation method to the original image size in the video frame, and performing sum operation on the original image to obtain a background image. And repeating the above process to continuously superpose the background pictures to realize background accumulation.
Fig. 2 shows a first sub-flowchart of the video image background acquiring method, which details the step S1, where the receiving a video file, and the generating an image database based on the video file specifically includes:
step S11: receiving a video file, and decompressing the video file;
step S11 is a simple decompression process, as mentioned above;
step S12: acquiring a video frame image in a decompressed video file, and reading a corresponding time item;
When video frame images are acquired from a decompressed video file, each must carry time item data. Normally the images are read in time-item order, but because the images differ in size or loading speed, or because of error codes and other causes, they may not arrive arranged in time-item sequence; in short, the acquisition order of the video frame images may differ from their temporal order. The corresponding time item is therefore read when each video frame image is acquired, which facilitates the subsequent sorting operation.
Step S13: sorting the video frame images based on the time items and generating an image database;
Step S13 is essentially a comparison step: the times are compared, the video frame images are ranked based on the comparison results, and finally the image database is generated.
Fig. 3 shows a second sub-flowchart of the video image background acquisition method, which details the step S2, and the specific step of converting the video frame image into a gray-scale map includes:
step S21: scaling the video frame image;
The algorithms involved in the technical scheme of the invention include an optical flow algorithm and an object detection algorithm; so that these algorithms can proceed smoothly, the video frame image is scaled to a size suitable for both.
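Scaling can be done with any standard image library; a minimal nearest-neighbor version in NumPy (the same interpolation style the method later uses to scale the result matrix back up) might look like the following sketch. The function name is an assumption for illustration.

```python
import numpy as np

def resize_nearest(img: np.ndarray, new_h: int, new_w: int) -> np.ndarray:
    """Nearest-neighbor resize via integer index mapping."""
    h, w = img.shape[:2]
    rows = (np.arange(new_h) * h) // new_h   # source row for each output row
    cols = (np.arange(new_w) * w) // new_w   # source col for each output col
    return img[rows[:, None], cols]

small = np.array([[1, 2],
                  [3, 4]])
big = resize_nearest(small, 4, 4)
# big:
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```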
Step S22: sequentially traversing pixel points of a video frame image to obtain color values of the pixel points;
A video frame image is a picture, and a picture is composed of pixel points, each carrying certain numerical values, namely color values. During computer processing a picture is in fact a set of numbers, and because a picture is a two-dimensional structure, the color values obtained for its pixel points are stored in a two-dimensional array.
step S23: generating a gray array based on the color values of the pixel points, and confirming a gray image based on the gray array;
there are many formulas for converting color values to gray scale, and the most well-known one of the psychological formulas is: gray ═ R0.299 + G0.587 + B0.114; on the basis of this, there are other algorithms, such as Gray ═ R (R19595 + G38469 + B7472) > >16, which is based on the actual storage structure of the computer and is converted in a manner of matching with the shift; the calculation of the gray level average value of the gray level image is the same as the color value acquisition process, and the calculation is of a double-circulation logic structure. It should be noted that the gray level array is a two-dimensional array, which represents the gray level as a graph in a computer, and the data is processed and converted into a specific picture format, which is the gray level graph.
Fig. 4 shows a third sub-flow diagram of the video image background acquisition method, which details step S3, where the specific steps of generating an optical flow field matrix based on the gray-scale map and calculating corresponding optical flow field data include:
step S31: generating an optical flow field matrix based on the gray scale map;
the generation of the optical flow field matrix is realized by an optical flow algorithm, and the optical flow algorithm has two basic assumption conditions:
first, the brightness is constant. I.e. the brightness of the same object does not change when it moves between different frames. This is an assumption of basic optical flow (all optical flow variants must be satisfied) for obtaining the basic equations of optical flow; second, the temporal continuation or movement is a "small movement". I.e. the temporal variation does not cause a drastic change in the target position, the displacement between adjacent frames is relatively small. It is also an indispensable assumption of the optical flow method. The two points mean that the invention has requirements on the video image to be processed, and at least within a certain error range, the invention meets the above conditions;
step S32: absolute value converting the optical flow field matrix;
the optical flow field matrix obtained after the optical flow calculation contains positive and negative values, wherein the positive and negative values only represent the direction of pixel point motion but not the size, and the absolute value of the optical flow field matrix calculated by the optical flow algorithm needs to be taken to facilitate the judgment of the pixel motion scale in the next step.
Step S33: carrying out noise reduction processing on the optical flow field matrix after absolute value processing;
due to the influence of noise and the limitation of an optical flow algorithm, the obtained optical flow field matrix has noise, the number of the optical flow field matrix smaller than the threshold value is set to be 0 so as to eliminate environmental noise and estimation of the displacement of the static pixel points by the algorithm, and the number of the optical flow field matrix larger than the threshold value is set to be 255 so as to be used as confirmation of a moving object.
Fig. 5 shows a fourth sub-flowchart of the video image background acquisition method, which details step S4, where the specific step of generating a background picture based on the optical flow field data and the foreground object includes:
step S41: confirming the positions of pixel points in the front scene area, and generating a fitting matrix based on the positions of the pixel points;
a foreground object bdbox in the picture is obtained through inference by a target detection algorithm, wherein the foreground object bdbox is an area in the picture, in other words, the foreground object bdbox is a pixel point set, and the pixel point set and the video frame image are subjected to the processing processes such as reduction, gray level conversion, optical flow algorithm and the like. On the basis of the identification of the pixel point positions in the foreground object region, a fitting matrix is generated, which is a mask matrix, i.e. only 0 and 1 in the matrix, which is mathematically also conceivable as a correction matrix for correcting the optical flow field matrix.
Step S42: reading the processed optical flow field matrix, and performing AND operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
step S42 is a specific correction step, and the operation rule is and operation to generate a result matrix.
Step S43: generating a background picture based on the result matrix;
and scaling the result matrix to the size of the original image in the video frame, and then carrying out operation on the result matrix and the original image to obtain a background image. And repeating the above process to continuously superpose the background pictures to realize background accumulation. It is contemplated that the present invention, in terms of programming, is based on a large loop, also referred to as traversal, of the image database.
The core technology of the invention is as follows: optical flow calculation is performed on consecutive frames after scaling and graying; the computed result undergoes 0/255 bipolarization noise reduction; object detection is run on the compressed image frames; after a foreground object is detected, it is combined with the optical flow matrix to generate a 0/1 binary mask; the mask is ANDed with the original video frame image to obtain the background with the detected and moving objects removed; and this process is repeated, accumulating the background through the mask operation until a complete background picture is obtained after a period of time. The implementation can be summarized in the following steps:
(1) scale the video frame image;
(2) perform foreground object detection on the scaled video frame image;
(3) gray the scaled video frame image from step (1) into a single-channel image;
(4) perform optical flow calculation on several consecutive video frame images to obtain an optical flow field matrix;
(5) apply 0/255 bipolarization to the optical flow field matrix from step (4) to obtain a bipolarized optical flow field matrix;
(6) take the modulus (vector length) of the vectors in the bipolarized optical flow field matrix from step (5) to obtain a single-channel matrix;
(7) convert the single-channel matrix from step (6) into a 0/1 binary matrix;
(8) combine the foreground target frame from step (2) with the binarized matrix from step (7) to obtain a mask;
(9) scale the mask to the size of the original image in the video frame and AND it with the original image to generate a background image;
(10) repeat steps (1) to (9), accumulating background images.
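A compact end-to-end sketch of the loop on synthetic data. Simple frame differencing stands in for a real optical flow algorithm and the detector output is hard-coded, so every name and value here is an assumption for illustration, not the patent's implementation:

```python
import numpy as np

def background_accumulate(frames, bboxes, thresh=1):
    """Accumulate a background image from frames and foreground bboxes."""
    acc = np.zeros_like(frames[0])
    for prev, cur, box in zip(frames, frames[1:], bboxes[1:]):
        motion = np.abs(cur.astype(int) - prev.astype(int))  # flow stand-in
        moving = np.where(motion < thresh, 0, 255)           # bipolarize
        static = (moving == 0).astype(np.uint8)              # 255 -> 0, 0 -> 1
        mask = np.ones_like(static)
        if box is not None:                                  # detector output
            x0, y0, x1, y1 = box
            mask[y0:y1, x0:x1] = 0
        keep = static & mask                                 # AND operation
        acc[keep == 1] = cur[keep == 1]                      # accumulate
    return acc

rng = np.random.default_rng(0)
bg = rng.integers(0, 200, size=(8, 12), dtype=np.uint8)      # true background

frames, bboxes = [], []
for x in [0, 3, 6, 9, None, None]:        # square exits after four frames
    f = bg.copy()
    if x is not None:
        f[2:5, x:x + 3] = 255             # moving 3x3 foreground square
        bboxes.append((x, 2, x + 3, 5))
    else:
        bboxes.append(None)
    frames.append(f)

acc = background_accumulate(frames, bboxes)
print(np.array_equal(acc, bg))            # foreground eliminated
```

Each iteration only writes pixels that are both motion-free and outside the detected box, so foreground values never enter the accumulator and the static background fills in over time.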
Example 2
Fig. 6 shows a block diagram of a video image background acquisition system, the system 10 comprising:
a database generation unit 11 configured to receive a video file, and generate an image database based on the video file;
the database generation unit 11 is hardware support of step S1.
A grayscale image conversion unit 12, configured to read a video frame image in the image database, and convert the video frame image into a grayscale image;
the gradation-map converting unit 12 is hardware support of step S2.
A calculating unit 13, configured to generate an optical flow field matrix based on the grayscale map, and calculate corresponding optical flow field data;
the computing unit 13 is hardware support for step S3
A picture generating unit 14, configured to obtain a foreground object region in the video frame image, and generate a background picture based on the optical flow field data and the foreground object region;
the picture generation unit 14 is hardware support of step S4.
Fig. 7 is a block diagram of a database generation unit in a video image background acquisition system, where the database generation unit 11 specifically includes:
a decompression module 111, configured to receive a video file and decompress the video file;
the decompression module 111 is hardware support for step S11.
A time item reading module 112, configured to obtain a video frame image in the decompressed video file, and read a corresponding time item;
the time item reading module 112 is hardware support of step S12.
A sorting module 113, configured to sort the video frame images based on the time items, and generate an image database;
the sorting module 113 is hardware support for step S13.
Fig. 8 is a block diagram of a grayscale map conversion unit in a video image background acquisition system, where the grayscale map conversion unit 12 specifically includes:
a scaling module 121, configured to scale the video frame image;
the scaling module 121 is hardware support for step S21.
The color value obtaining module 122 is configured to sequentially traverse pixel points of a video frame image, and obtain color values of the pixel points;
the color value obtaining module 122 is a hardware support of step S22.
The gray scale image confirmation module 123 is configured to generate a gray scale array based on the color values of the pixel points, and confirm a gray scale image based on the gray scale array;
the grayscale map confirmation module 123 is hardware support of step S22.
Fig. 9 is a block diagram of a computing unit in a video image background acquisition system, where the computing unit 13 specifically includes:
an optical flow field matrix generating module 131, configured to generate an optical flow field matrix based on the grayscale map;
the optical flow field matrix generation module 131 is a hardware support of step S31.
An absolute value module 132, configured to absolute value the optical flow field matrix;
the absolute module 132 is hardware support for step S32.
The noise reduction module 133 is configured to perform noise reduction processing on the optical flow field matrix after absolute value conversion;
the noise reduction module 133 is hardware support for step S33.
Fig. 10 is a block diagram of a picture generating unit in a video image background acquiring system, where the picture generating unit 14 specifically includes:
a fitting matrix generating module 141, configured to determine pixel positions in a foreground object region, and generate a fitting matrix based on the pixel positions;
the fitting matrix generation module 141 is hardware support for step S41.
A result matrix generating module 142, configured to read the processed optical flow field matrix, and perform an and operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
the result matrix generation module 142 is hardware support for step S42.
An output module 143, configured to generate a background picture based on the result matrix;
the output module 143 is hardware support of step S42.
All functions performed by the video image background acquisition system are executed by a computer device comprising one or more processors and one or more memories; at least one program code is stored in the one or more memories and is loaded and executed by the one or more processors to realize the functions of the video image background acquisition system.
The processor fetches instructions from the memory and analyzes them one by one, then completes the corresponding operations according to the instruction requirements, generating a series of control commands that make all parts of the computer act automatically, continuously and in coordination as an organic whole, realizing program input, data input, operation and result output; the arithmetic and logic operations generated in this process are completed by the arithmetic unit. The memory includes a Read-Only Memory (ROM) for storing a computer program, and a protection device is arranged outside the memory.
Illustratively, a computer program can be partitioned into one or more modules, which are stored in memory and executed by a processor to implement the present invention. One or more of the modules may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the terminal device.
Those skilled in the art will appreciate that the above description of the terminal device is merely exemplary and not limiting; the device may include more or fewer components than those described, combine certain components, or use different components, and may, for example, include input/output devices, network access devices, buses, and the like.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor; it is the control center of the terminal device and connects the various parts of the entire user terminal using various interfaces and lines.
The memory may be used to store computer programs and/or modules, and the processor implements the various functions of the terminal device by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the video image background acquisition system, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid-state storage device.
All or part of the modules/units in the system according to the above embodiment may be implemented by instructing the relevant hardware through a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the functions of the system embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A video image background acquisition method is characterized by comprising the following steps:
receiving a video file, and generating an image database based on the video file;
reading a video frame image in the image database, and converting the video frame image into a gray image;
generating an optical flow field matrix based on the gray-scale map, and calculating corresponding optical flow field data;
and acquiring a foreground object region in the video frame image, and generating a background picture based on the optical flow field data and the foreground object region.
2. The method for background capture of video images as claimed in claim 1, wherein the steps of receiving a video file and generating an image database based on the video file comprise:
receiving a video file, and decompressing the video file;
acquiring a video frame image in a decompressed video file, and reading a corresponding time item;
sorting the video frame images based on the time items, and generating an image database.
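Frame decoding itself would be delegated to a video library; the ordering step of claim 2 can then be sketched as sorting decoded frames by their time item (the `FrameRecord` type is an illustrative stand-in for the patent's image database entries, not a structure the patent defines):

```python
from dataclasses import dataclass

@dataclass
class FrameRecord:
    time_ms: float   # the frame's "time item" (presentation timestamp)
    image: object    # decoded frame data, e.g. a NumPy array

def build_image_database(records):
    """Order decoded frames by their time item, oldest first."""
    return sorted(records, key=lambda r: r.time_ms)
```

Sorting by timestamp rather than decode order matters because compressed video may deliver frames out of presentation order.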
3. The method of claim 1, wherein the step of converting the video frame image into a gray-scale image comprises:
scaling the video frame image;
sequentially traversing pixel points of a video frame image to obtain color values of the pixel points;
and generating a gray array based on the color values of the pixel points, and confirming a gray image based on the gray array.
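Claim 3 leaves the grey-value formula open; a common assumption is the BT.601 luma weighting of each pixel's colour values, vectorised here with NumPy rather than an explicit per-pixel loop (the weights are an assumption, not taken from the patent):

```python
import numpy as np

def to_grayscale(frame_rgb):
    """Build the grey array from per-pixel colour values (BT.601 luma)."""
    r = frame_rgb[..., 0].astype(np.float64)
    g = frame_rgb[..., 1].astype(np.float64)
    b = frame_rgb[..., 2].astype(np.float64)
    # Weighted sum approximating perceived brightness.
    gray = np.rint(0.299 * r + 0.587 * g + 0.114 * b)
    return gray.astype(np.uint8)
```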
4. The method for obtaining the background of the video image according to claim 1, wherein the specific steps of generating an optical flow field matrix based on the gray-scale map and calculating the corresponding optical flow field data include:
generating an optical flow field matrix based on the gray scale map;
absolute value converting the optical flow field matrix;
and carrying out noise reduction processing on the optical flow field matrix after absolute value processing.
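Claim 4 does not name the optical flow algorithm. As one illustration, a dense optical flow field matrix can be produced by a basic Lucas-Kanade least-squares fit per window; in practice a library routine such as OpenCV's dense flow would be used, and this pure-NumPy version is only a sketch:

```python
import numpy as np

def dense_flow_lk(prev, curr, win=5):
    """Per-pixel Lucas-Kanade fit over a win x win window."""
    prev = prev.astype(np.float64)
    curr = curr.astype(np.float64)
    Iy, Ix = np.gradient(prev)        # spatial gradients (axis 0 = y)
    It = curr - prev                  # temporal gradient
    h, w = prev.shape
    flow = np.zeros((h, w, 2))        # flow[y, x] = (u, v)
    r = win // 2
    for y in range(r, h - r):
        for x in range(r, w - r):
            ix = Ix[y - r:y + r + 1, x - r:x + r + 1].ravel()
            iy = Iy[y - r:y + r + 1, x - r:x + r + 1].ravel()
            it = It[y - r:y + r + 1, x - r:x + r + 1].ravel()
            A = np.stack([ix, iy], axis=1)
            ata = A.T @ A
            if np.linalg.det(ata) > 1e-6:   # skip untextured windows
                # Brightness constancy: A @ (u, v) = -It, least squares.
                flow[y, x] = -np.linalg.solve(ata, A.T @ it)
    return flow
```

The absolute-value and noise-reduction steps of claim 4 would then operate on the magnitude of this matrix.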
5. The video image background acquisition method according to claim 4, wherein the specific step of generating a background picture based on the optical flow field data and the foreground object region comprises:
confirming the positions of pixel points in the foreground object region, and generating a fitting matrix based on the pixel point positions;
reading the processed optical flow field matrix, and performing AND operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
and generating a background picture based on the result matrix.
6. A video image background acquisition system, the system comprising:
the database generation unit is used for receiving a video file and generating an image database based on the video file;
the grayscale image conversion unit is used for reading the video frame image in the image database and converting the video frame image into a grayscale image;
the calculation unit is used for generating an optical flow field matrix based on the gray-scale image and calculating corresponding optical flow field data;
and the picture generation unit is used for acquiring a foreground object area in the video frame image and generating a background picture based on the optical flow field data and the foreground object area.
7. The video image background acquisition system according to claim 6, wherein the database generation unit specifically includes:
the decompression module is used for receiving the video file and decompressing the video file;
the time item reading module is used for acquiring the video frame image in the decompressed video file and reading the corresponding time item;
and the sorting module is used for sorting the video frame images based on the time items and generating an image database.
8. The video image background acquisition system according to claim 6, wherein the grayscale map conversion unit specifically includes:
a scaling module for scaling the video frame image;
the color value acquisition module is used for sequentially traversing pixel points of the video frame image and acquiring color values of the pixel points;
and the gray level image confirmation module is used for generating a gray level array based on the color values of the pixel points and confirming the gray level image based on the gray level array.
9. The video image background acquisition system according to claim 6, wherein the computing unit specifically includes:
the optical flow field matrix generating module is used for generating an optical flow field matrix based on the gray-scale map;
the absolute module is used for absolute value conversion of the optical flow field matrix;
and the noise reduction module is used for carrying out noise reduction processing on the optical flow field matrix after absolute value conversion.
10. The video image background acquisition system according to claim 9, wherein the picture generation unit specifically includes:
the fitting matrix generating module is used for confirming the positions of pixel points in the foreground object region and generating a fitting matrix based on the pixel point positions;
the result matrix generation module is used for reading the processed optical flow field matrix and carrying out AND operation on the optical flow field matrix and the fitting matrix to generate a result matrix;
and the output module is used for generating a background picture based on the result matrix.
CN202111249178.6A 2021-10-26 2021-10-26 Video image background acquisition method Pending CN113901268A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111249178.6A CN113901268A (en) 2021-10-26 2021-10-26 Video image background acquisition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111249178.6A CN113901268A (en) 2021-10-26 2021-10-26 Video image background acquisition method

Publications (1)

Publication Number Publication Date
CN113901268A true CN113901268A (en) 2022-01-07

Family

ID=79026371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111249178.6A Pending CN113901268A (en) 2021-10-26 2021-10-26 Video image background acquisition method

Country Status (1)

Country Link
CN (1) CN113901268A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023134114A1 (en) * 2022-01-14 2023-07-20 合肥英睿系统技术有限公司 Moving target detection method and detection device, and storage medium


Similar Documents

Publication Publication Date Title
EP3427195B1 (en) Convolutional neural networks, particularly for image analysis
CN111583097A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN109005368B (en) High dynamic range image generation method, mobile terminal and storage medium
CN108961180B (en) Infrared image enhancement method and system
CN108764244B (en) Potential target area detection method based on convolutional neural network and conditional random field
CN109005367B (en) High dynamic range image generation method, mobile terminal and storage medium
US9742992B2 (en) Non-uniform curve sampling method for object tracking
CN109815931B (en) Method, device, equipment and storage medium for identifying video object
CN109377499B (en) Pixel-level object segmentation method and device
CN111968064A (en) Image processing method and device, electronic equipment and storage medium
US20220076119A1 (en) Device and method of training a generative neural network
Patil et al. End-to-end recurrent generative adversarial network for traffic and surveillance applications
CN112435223B (en) Target detection method, device and storage medium
CN111767915A (en) License plate detection method, device, equipment and storage medium
CN111582032A (en) Pedestrian detection method and device, terminal equipment and storage medium
WO2020043296A1 (en) Device and method for separating a picture into foreground and background using deep learning
JP2020038574A (en) Image learning program, image learning method, image recognition program, image recognition method, and image recognition device
CN111563517A (en) Image processing method, image processing device, electronic equipment and storage medium
WO2022227879A1 (en) Logistics management method and system based on qr code, and server and storage medium
CN113901268A (en) Video image background acquisition method
CN111882565A (en) Image binarization method, device, equipment and storage medium
CN111723634A (en) Image detection method and device, electronic equipment and storage medium
Chen et al. Saliency-directed color image interpolation using artificial neural network and particle swarm optimization
CN114519717A (en) Image processing method and device, computer equipment and storage medium
CN116935189B (en) Camouflage target detection method and device based on neural network and storage medium

Legal Events

Date Code Title Description
PB01 Publication