CN115661194A - Moving object extraction method, system, electronic device and medium - Google Patents


Info

Publication number
CN115661194A
CN115661194A (application number CN202211159735.XA)
Authority
CN
China
Prior art keywords
pixel point
frame image
initial frame
target
target pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211159735.XA
Other languages
Chinese (zh)
Inventor
姜金涛
江高勇
王志花
严向华
陈从平
刘闻珂
张俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia Zhicheng Internet Of Things Co ltd
Original Assignee
Inner Mongolia Zhicheng Internet Of Things Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia Zhicheng Internet Of Things Co., Ltd.
Priority claimed from application CN202211159735.XA
Publication of CN115661194A
Legal status: Pending


Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a moving target extraction method, system, electronic device, and medium. The method comprises the following steps: acquiring a motion video and each first initial frame image in the motion video; determining a pixel point set corresponding to each first initial frame image; merging the pixel point sets to obtain a sample point set; for each first initial frame image, taking each pixel point in the image as a first target pixel point and determining the second target pixel points according to the first target pixel points and the sample point set; and, for each first initial frame image, determining the moving target in the image according to the second target pixel points. This addresses the poor accuracy and heavy computational load of existing target detection methods.

Description

Moving object extraction method, system, electronic device and medium
Technical Field
The present invention relates to the field of moving object detection technologies, and in particular, to a moving object extraction method, a moving object extraction system, an electronic device, and a medium.
Background
Moving target detection is key to enabling computer vision systems to lock onto, track, and classify dynamic targets. It is widely applied in intelligent monitoring, intelligent transportation, aviation and navigation, dynamic sorting on agricultural product assembly lines, and other fields.
Among existing approaches to moving target detection, the optical flow method, for example, establishes a velocity vector field using all pixel points of a background frame and completes target detection by locating abrupt-change positions in the vector field.
Disclosure of Invention
The invention provides a moving target extraction method, system, electronic device, and medium, aiming to solve the poor accuracy and large computational load of existing target detection methods.
In a first aspect, to solve the above technical problem, the present invention provides a moving object extracting method, including the following steps:
acquiring a motion video and each first initial frame image in the motion video;
determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, wherein for each pixel point set, the pixel point set comprises a plurality of background pixel points;
merging the pixel point sets to obtain a sample point set;
for each first initial frame image, taking each pixel point in the first initial frame image as a first target pixel point, and determining each second target pixel point according to each first target pixel point and the sample point set;
and for each first initial frame image, determining a moving target in the first initial frame image according to each second target pixel point.
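The five steps above can be sketched end-to-end as follows. This is a minimal, illustrative Python/NumPy sketch for grayscale frames, not the patent's reference implementation; ghost suppression and the adaptive radius formula are omitted, and the parameter defaults (`n`, `R`, `min_matches`) are placeholder assumptions rather than the preferred values given later in the description.

```python
import numpy as np

def extract_moving_targets(frames, n=4, R=20.0, min_matches=2):
    """Compact sketch of steps S1-S5 on grayscale frames.
    Per frame, every pixel's n most similar 24-neighbourhood values are
    collected as background samples (S2); the merged sample set (S3) is
    matched against each pixel by absolute grey-value distance and a
    radius threshold (S4), marking unmatched pixels as moving (S5)."""
    h, w = frames[0].shape
    sample_sets = []
    for f in frames:                                   # S2: per-frame sample sets
        p = np.pad(f.astype(np.int16), 2, mode='edge')
        neigh = np.stack([p[2 + dy:2 + dy + h, 2 + dx:2 + dx + w]
                          for dy in range(-2, 3) for dx in range(-2, 3)
                          if (dy, dx) != (0, 0)], axis=-1)   # 24 neighbours
        sims = np.abs(neigh - f[..., None])            # G(a,b) = |I(a)-I(b)|
        idx = np.argsort(sims, axis=-1)[..., :n]       # n most similar
        sample_sets.append(np.take_along_axis(neigh, idx, axis=-1))
    samples = np.concatenate(sample_sets, axis=-1)     # S3: merged sample set
    masks = []
    for f in frames:                                   # S4-S5: match & segment
        dists = np.abs(samples - f.astype(np.int16)[..., None])
        matches = (dists < R).sum(axis=-1)
        masks.append(matches < min_matches)            # True = moving target pixel
    return masks
```

On a static scene every pixel matches many samples, so the returned masks are empty; pixels whose values differ from all collected background samples are flagged as moving.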
The moving target extraction method provided by the invention has the following beneficial effects: a pixel point set is first obtained for each first initial frame image, determining the background pixel points in that image; all the background pixel points are then combined into a sample point set; finally, the second target pixel points are screened out from each first initial frame image according to its first target pixel points and the sample point set, thereby determining the moving target in each first initial frame image.
On the basis of the technical scheme, the moving object extraction method can be further improved as follows.
Further, the method also includes:
according to each first initial frame image, performing ghost suppression processing on each first initial frame image, and determining a second initial frame image corresponding to each first initial frame image;
the determining a pixel point set corresponding to each first initial frame image according to each first initial frame image includes:
and determining a pixel point set corresponding to each second initial frame image according to each second initial frame image.
The beneficial effect of adopting the above further scheme is: if a moving object exists in the first initial frame image corresponding to the first frame of the moving video, a ghost image will appear in the first initial frame images corresponding to the subsequent frames; to detect a clear moving object, the ghost image needs to be removed.
Further, the above-mentioned performing ghost-suppression processing on each first initial frame image according to each first initial frame image, and determining a second initial frame image corresponding to each first initial frame image, includes:
determining, for each first initial frame image except the first frame first initial frame image, the first foreground target corresponding to that image;
and performing a frame difference operation on every two adjacent first initial frame images, starting from the first initial frame image of the first frame in the motion video, to determine a target area, wherein the target area is either a foreground image or a background image; if a set number of consecutive target areas are all background images, the frame difference operation is stopped, the target area obtained by the last frame difference operation replaces the background image in each first initial frame image, and each replaced first initial frame image is determined as a second initial frame image.
The beneficial effect of adopting the above further scheme is: first, the first foreground targets corresponding to the first initial frame images other than the first frame are obtained; then a target area is determined by the frame difference method; finally, the target area replaces the background image in each first initial frame image and, combined with the first foreground target in each image, yields each second initial frame image with the ghost removed.
Further, in the method, determining, for each first initial frame image except the first frame first initial frame image, the corresponding first foreground target includes:
starting from a second frame first initial frame image to a last frame first initial frame image in the motion video, and regarding each first initial frame image, taking the first initial frame image as a current frame image;
performing differential operation on a current frame image and a previous frame image of the current frame image to determine a first differential image, and performing differential operation on the current frame image and a next frame image of the current frame image to determine a second differential image;
taking pixel points with pixel values larger than a first preset threshold value in the first differential image as third target pixel points, and taking pixel points with pixel values larger than a second preset threshold value in the second differential image as fourth target pixel points;
determining a second foreground target corresponding to the first differential image according to the third target pixel points, and determining a third foreground target corresponding to the second differential image according to the fourth target pixel points;
extracting the edge characteristics of the current frame image to obtain a characteristic image corresponding to the current frame image;
performing logical AND operation on the second foreground target and the characteristic image to determine a fourth foreground target, and performing logical AND operation on the third foreground target and the characteristic image to determine a fifth foreground target;
performing logical OR operation on the fourth foreground target and the fifth foreground target to determine a sixth foreground target;
and removing the foreground spot region with the area smaller than the first preset value in the sixth foreground target, filling the foreground hole region with the area smaller than the second preset value, and determining the first foreground target corresponding to the current frame image.
The beneficial effect of adopting the above further scheme is: the first and second differential images are obtained by differential operations between the current frame image and its previous and next frame images; the third and fourth target pixel points are determined using the OTSU threshold method (yielding the first and second preset threshold values); the second foreground target is determined from the third target pixel points and the first differential image, and the third foreground target from the fourth target pixel points and the second differential image; the edge features of the current frame image are extracted to obtain a feature image; each of the second and third foreground targets is logically ANDed with the feature image, and the two results are logically ORed to obtain the sixth foreground target.
Further, in the method, determining a pixel point set corresponding to each second initial frame image according to each second initial frame image includes:
regarding each second initial frame image, taking each pixel point in the second initial frame image as a fifth target pixel point;
for each fifth target pixel point, taking the fifth target pixel point as a central pixel point, acquiring all adjacent pixel points in the 24-neighborhood of the central pixel point, and determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, wherein each similarity value represents the similarity between the gray value of the fifth target pixel point and the gray value of one corresponding adjacent pixel point;
for each fifth target pixel point, sorting the similarity values corresponding to the fifth target pixel point from small to large, and selecting adjacent pixel points corresponding to the first n similarity values in the sorting as background pixel points;
for each second initial frame image, taking all the acquired background pixel points as a pixel point set;
determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, including:
determining a plurality of similarity values corresponding to the fifth target pixel point through a first formula according to the fifth target pixel point and each adjacent pixel point, wherein the first formula is as follows:
G(a,b)=|I(a)-I(b)|
wherein G(a, b) represents the similarity value, a represents the fifth target pixel point, b represents any one adjacent pixel point, I(a) represents the gray value of the fifth target pixel point, and I(b) represents the gray value of the adjacent pixel point.
The beneficial effect of adopting the further scheme is that: by comparing each fifth target pixel point in each second initial frame image with all adjacent pixel points in the 24-neighborhood of the central pixel point, the similarity between the gray value of each fifth target pixel point and the gray value of each adjacent pixel point can be obtained, and the background pixel points are thereby determined.
Further, in the above method, for each first initial frame image, taking each pixel point in the first initial frame image as a first target pixel point, and determining each second target pixel point according to each first target pixel point and the sample point set, includes:
determining a radius threshold according to the sample point set;
determining the Euclidean distance between a first target pixel point and each background pixel point in a sample point set for each first target pixel point in each first initial frame image;
and for each first target pixel point in each first initial frame image, comparing each Euclidean distance corresponding to the first target pixel point with the radius threshold, counting the number of background pixel points whose Euclidean distance to the first target pixel point is smaller than the radius threshold, and, if the number is smaller than a third preset value, determining the first target pixel point to be a second target pixel point.
The beneficial effect of adopting the further scheme is that: when the first target pixel points are matched against the background pixel points, background disturbance may exist; the radius threshold is therefore generated from the sample point set to keep the background disturbance within a certain range. The Euclidean distance is calculated between each first target pixel point and each background pixel point, the number of background pixel points whose distance is smaller than the radius threshold is counted, and whenever this number is smaller than the third preset value, the corresponding first target pixel point is taken as a second target pixel point.
Further, the determining a radius threshold according to the sample point set in the method includes:
determining a radius threshold value through a second formula according to the sample point set, wherein the second formula is as follows:
D = √( (1/(M·n)) · Σ_{i=1}^{M·n} (x_i − x̄)² )
R(x, y) = R · (1 + α·δ·D)
wherein D represents the dispersion of the sample point set, M represents the total frame number of the motion video, n represents the number of selected similarity values, x_i represents the ith background pixel point, x̄ represents the average value of the background pixel points in the sample point set, α represents an adaptive parameter, δ represents a scale factor, R is the initial value of the radius threshold, and R(x, y) is the radius threshold determined according to the sample point set.
The beneficial effect of adopting the further scheme is that: and by a second formula, a radius threshold can be established through the sample point set, so that the problem of background disturbance is solved.
In a second aspect, the present invention provides a moving object extraction system, including:
the first acquisition module is used for acquiring the motion video and each first initial frame image in the motion video;
the second obtaining module is used for determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, and for each pixel point set, the pixel point set comprises a plurality of background pixel points;
the third acquisition module is used for merging all the pixel point sets to obtain a sample point set;
the fourth acquisition module is used for taking each pixel point in the first initial frame image as a first target pixel point for each first initial frame image, and determining each second target pixel point according to each first target pixel point and the sample point set;
and the fifth acquisition module is used for determining a moving target in each first initial frame image according to each second target pixel point.
In a third aspect, the present invention also provides an electronic device, which includes a memory, a processor, and a program stored in the memory and runnable on the processor; when the processor executes the program, the steps of the moving object extraction method described above are implemented.
In a fourth aspect, the present invention also provides a computer-readable storage medium in which instructions are stored; when the instructions are executed on a terminal device, they cause the terminal device to execute the steps of the moving object extraction method described above.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the present invention is further described below with reference to the accompanying drawings and embodiments.
Fig. 1 is a schematic flow chart of a moving object extraction method according to an embodiment of the present invention;
FIG. 2 is a diagram of any one of the first initial frame images in a motion video;
FIG. 3 is a diagram of a moving object obtained using a conventional background subtraction method;
FIG. 4 is a diagram of a moving object obtained using the moving object extraction method;
FIG. 5 is a diagram of any one of the first initial frame images in the motion video;
FIG. 6 is a diagram of a moving object obtained using a conventional background subtraction method;
FIG. 7 is a diagram of a moving object obtained using the moving object extraction method;
fig. 8 is a schematic structural diagram of a moving object extraction system according to an embodiment of the present invention.
Detailed Description
The following examples are further illustrative and supplementary to the present invention and do not limit the present invention in any way.
A moving object extraction method, a moving object extraction system, an electronic device, and a medium according to embodiments of the present invention are described below with reference to the accompanying drawings.
As shown in fig. 1, a moving object extraction method according to an embodiment of the present invention includes the following steps:
s1, obtaining a motion video and each first initial frame image in the motion video.
And S2, determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, wherein for each pixel point set, the pixel point set comprises a plurality of background pixel points.
Optionally, for each pixel point set corresponding to each first initial frame image, the background pixel point is any one pixel point in the background image in the first initial frame image.
Optionally, in the motion video, if the first initial frame image of the first frame contains a motion target, a ghost may occur in each subsequent initial frame image, so that the accuracy of detecting the motion target is reduced, and therefore the method further includes:
according to each first initial frame image, performing ghost suppression processing on each first initial frame image, and determining a second initial frame image corresponding to each first initial frame image;
determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, including:
and determining a pixel point set corresponding to each second initial frame image according to each second initial frame image.
Optionally, performing ghost suppression processing on each first initial frame image according to each first initial frame image, and determining a second initial frame image corresponding to each first initial frame image, including:
determining, for each first initial frame image except the first frame first initial frame image, the first foreground target corresponding to that image;
and performing a frame difference operation on every two adjacent first initial frame images, starting from the first initial frame image of the first frame in the motion video, to determine a target area, wherein the target area is either a foreground image or a background image; if a set number of consecutive target areas are all background images, the frame difference operation is stopped, the target area obtained by the last frame difference operation replaces the background image in each first initial frame image, and each replaced first initial frame image is determined as a second initial frame image.
The set number is chosen according to the actual situation. If the target areas obtained after the preset number of consecutive frame difference operations are all background images, the target area obtained by the last frame difference operation can serve as a clear background image that contains no (or only a small part of a) foreground image, and it is therefore used as the new background image of each first initial frame image.
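The frame-difference loop for recovering a clean background can be sketched as follows. This is an illustrative NumPy version; the stopping count, the change threshold, and the 1% "quiet region" heuristic are assumptions, not values from the patent.

```python
import numpy as np

def estimate_clean_background(frames, stop_after=5, diff_thresh=15):
    """Sketch of the frame-difference loop: difference consecutive frames
    starting from the first, updating the background estimate from the
    unchanged pixels, and stop once `stop_after` consecutive difference
    results are (almost) entirely background."""
    background = frames[0].astype(np.float32)
    quiet_run = 0
    for prev, curr in zip(frames, frames[1:]):
        diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
        moving = diff > diff_thresh                 # pixels that changed
        if moving.mean() < 0.01:                    # target area is (almost) all background
            quiet_run += 1
        else:
            quiet_run = 0
        background[~moving] = curr[~moving]         # unchanged pixels refresh the background
        if quiet_run >= stop_after:                 # enough consecutive quiet diffs: stop
            break
    return background.astype(np.uint8)
```

The returned image would then replace the background in each first initial frame image to form the second initial frame images.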
Optionally, determining, for each first initial frame image except the first frame first initial frame image, the corresponding first foreground target includes:
from the second frame first initial frame image to the last frame first initial frame image in the motion video, taking each first initial frame image in turn as the current frame image;
performing differential operation on a current frame image and a previous frame image of the current frame image to determine a first differential image, and performing differential operation on the current frame image and a next frame image of the current frame image to determine a second differential image;
taking pixel points with pixel values larger than a first preset threshold value in the first differential image as third target pixel points, and taking pixel points with pixel values larger than a second preset threshold value in the second differential image as fourth target pixel points;
determining a second foreground target corresponding to the first differential image according to the third target pixel points, and determining a third foreground target corresponding to the second differential image according to the fourth target pixel points;
extracting the edge characteristics of the current frame image to obtain a characteristic image corresponding to the current frame image;
performing logical AND operation on the second foreground target and the characteristic image to determine a fourth foreground target, and performing logical AND operation on the third foreground target and the characteristic image to determine a fifth foreground target;
performing logical OR operation on the fourth foreground target and the fifth foreground target to determine a sixth foreground target;
and removing the foreground spot region with the area smaller than the first preset value in the sixth foreground target, filling the foreground hole region with the area smaller than the second preset value, and determining the first foreground target corresponding to the current frame image.
In this embodiment, after the second and third foreground targets are determined, median filtering may be performed on them to remove fine interference points, followed by a dilation operation to make the second and third foreground targets more complete.
In this embodiment, because a foreground target obtained after the difference operation contains foreground holes and foreground blobs, the final sixth foreground target may also contain foreground hole regions and foreground blob regions. Therefore, foreground blob regions whose area is smaller than the first preset value are removed from the sixth foreground target, and foreground hole regions whose area is smaller than the second preset value are filled. The first and second preset values are set according to the actual situation; optionally, the first preset value is 10 and the second preset value is 20.
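The three-frame-difference extraction of the first foreground target described above can be sketched as follows. The fixed thresholds stand in for the OTSU-derived ones, and a simple gradient magnitude stands in for the edge-feature extraction; both are illustrative assumptions, and the median filtering, dilation, and blob/hole cleanup steps are omitted.

```python
import numpy as np

def first_foreground(prev, curr, nxt, t1=15, t2=15):
    """Sketch of the three-frame-difference step with edge gating:
    difference the current frame against its neighbours, threshold,
    AND each result with an edge-feature image, then OR the two."""
    d1 = np.abs(curr.astype(np.int16) - prev.astype(np.int16))  # first differential image
    d2 = np.abs(nxt.astype(np.int16) - curr.astype(np.int16))   # second differential image
    fg2 = d1 > t1          # second foreground target (from third target pixels)
    fg3 = d2 > t2          # third foreground target (from fourth target pixels)
    # edge-feature image: gradient magnitude of the current frame (assumed edge extractor)
    gy, gx = np.gradient(curr.astype(np.float32))
    edges = np.hypot(gx, gy) > 10
    fg4 = fg2 & edges      # logical AND with the feature image
    fg5 = fg3 & edges
    return fg4 | fg5       # logical OR: sixth foreground target
```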
Optionally, determining a pixel point set corresponding to each second initial frame image according to each second initial frame image, including:
regarding each second initial frame image, taking each pixel point in the second initial frame image as a fifth target pixel point;
for each fifth target pixel point, taking the fifth target pixel point as a central pixel point, acquiring all adjacent pixel points in the 24-neighborhood of the central pixel point, and determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, wherein each similarity value represents the similarity between the gray value of the fifth target pixel point and the gray value of one corresponding adjacent pixel point;
for each fifth target pixel point, sorting the similarity values corresponding to the fifth target pixel point from small to large, and selecting adjacent pixel points corresponding to the first n similarity values in the sorting as background pixel points;
for each second initial frame image, taking all the acquired background pixel points as a pixel point set;
determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, including:
determining a plurality of similarity values corresponding to the fifth target pixel point through a first formula according to the fifth target pixel point and each adjacent pixel point, wherein the first formula is as follows:
G(a,b)=|I(a)-I(b)|;
wherein G(a, b) represents the similarity value, a represents the fifth target pixel point, b represents any adjacent pixel point, I(a) represents the gray value of the fifth target pixel point, and I(b) represents the gray value of the adjacent pixel point.
In this embodiment, the value of n is selected according to actual situations, and optionally, n may be selectively set to 4.
In this embodiment, for each fifth target pixel point in each second initial frame image, the similarity value between each fifth target pixel point and each adjacent pixel point is calculated, so that each screened background pixel point has time domain and space domain information, and the robustness of the sample point set is enhanced.
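The 24-neighborhood sampling described above can be sketched as follows, using the first formula G(a, b) = |I(a) − I(b)|. Edge-padding at the image border is an assumption, since the text does not specify border handling.

```python
import numpy as np

def background_samples(gray, n=4):
    """Sketch of picking, for each pixel, the n most similar neighbours in
    its 24-neighborhood (the 5x5 window minus the centre) as background
    sample points. Returns the grey values of the chosen neighbours."""
    h, w = gray.shape
    g = gray.astype(np.int16)
    samples = np.zeros((h, w, n), dtype=np.uint8)
    offsets = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3)
               if (dy, dx) != (0, 0)]                     # 24 neighbours
    padded = np.pad(g, 2, mode='edge')
    for y in range(h):
        for x in range(w):
            centre = g[y, x]
            neigh = np.array([padded[y + 2 + dy, x + 2 + dx]
                              for dy, dx in offsets])
            sims = np.abs(neigh - centre)                 # G(a, b) = |I(a) - I(b)|
            best = np.argsort(sims, kind='stable')[:n]    # smallest similarity values first
            samples[y, x] = neigh[best].astype(np.uint8)
    return samples
```

Merging the per-frame `samples` arrays over all frames yields the sample point set of step S3.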
And S3, merging the pixel point sets to obtain a sample point set.
And S4, regarding each first initial frame image, taking each pixel point in the first initial frame image as a first target pixel point, and determining each second target pixel point according to each first target pixel point and the sample point set.
In this embodiment, each first target pixel point is compared, by Euclidean distance, with every background pixel point in the sample point set to determine the second target pixel points. However, when objects in the background sway, the pixel values of background pixel points change, producing background disturbance; screening second target pixel points by Euclidean distance alone then yields poor precision. Therefore, a radius threshold is introduced, keeping the influence of background disturbance within a controllable range and improving the adaptability of moving target detection to background disturbance.
Optionally, for each first initial frame image, taking each pixel point in the first initial frame image as a first target pixel point, and determining each second target pixel point according to each first target pixel point and the sample point set, including:
determining a radius threshold according to the sample point set;
determining the Euclidean distance between a first target pixel point and each background pixel point in a sample point set for each first target pixel point in each first initial frame image;
and for each first target pixel point in each first initial frame image, comparing each Euclidean distance corresponding to the first target pixel point with the radius threshold, counting the number of background pixel points whose Euclidean distance to the first target pixel point is smaller than the radius threshold, and, if the number is smaller than a third preset value, determining the first target pixel point to be a second target pixel point.
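The per-pixel matching step above can be sketched as follows. For single-channel grey values the Euclidean distance reduces to an absolute difference; `min_matches` mirrors the third preset value (20 in this embodiment).

```python
import numpy as np

def is_moving_pixel(pixel_value, sample_values, radius, min_matches=20):
    """Sketch of the matching step: count background samples whose
    grey-value distance to the pixel is below the radius threshold;
    fewer than `min_matches` matches marks the pixel as a second
    target (moving) pixel."""
    dists = np.abs(np.asarray(sample_values, dtype=np.float32) - float(pixel_value))
    matches = int((dists < radius).sum())
    return matches < min_matches
```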
Optionally, determining the radius threshold according to the sample point set includes:
determining a radius threshold value through a second formula according to the sample point set, wherein the second formula is as follows:
D = √( (1/(M·n)) · Σ_{i=1}^{M·n} (x_i − x̄)² )
R(x, y) = R · (1 + α·δ·D)
wherein D represents the dispersion of the sample point set, M represents the total frame number of the motion video, n represents the number of selected similarity values, x_i represents the ith background pixel point, x̄ represents the average value of the background pixel points in the sample point set, α represents an adaptive parameter, δ represents a scale factor, R is the initial value of the radius threshold, and R(x, y) is the radius threshold determined according to the sample point set. Preferably, R = 20, α = 0.05, δ = 6.
In this embodiment, the third preset value is set according to an actual situation, and optionally, the third preset value is set to 20.
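The exact functional form of the second formula is hard to recover from the text, so the sketch below assumes one plausible reading: the dispersion D is the standard deviation of the M·n sample values, and the radius grows with D as R(x, y) = R·(1 + α·δ·D). The form is an assumption; R = 20, α = 0.05, δ = 6 follow the preferred values stated above.

```python
import numpy as np

def radius_threshold(samples, total_frames, n, R=20, alpha=0.05, delta=6):
    """Illustrative radius threshold: dispersion D of the M*n background
    samples taken as their standard deviation, radius scaled by
    (1 + alpha * delta * D). A uniform sample set gives D = 0 and
    leaves the initial radius R unchanged."""
    x = np.asarray(samples, dtype=np.float32).reshape(total_frames * n)
    D = np.sqrt(np.mean((x - x.mean()) ** 2))     # dispersion of the sample set
    return R * (1.0 + alpha * delta * D)
```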
And S5, for each first initial frame image, determining a moving target in the first initial frame image according to each second target pixel point.
In this embodiment, as shown in figs. 2 to 4, fig. 2 is any one first initial frame image in a motion video. With the conventional background difference method the effect is as shown in fig. 3: the accuracy is clearly poor, and although a plurality of moving targets are detected, they are not distinguished from one another. After the first initial frame image is processed by the moving target extraction method provided in this application, the effect is as shown in fig. 4: all moving targets are detected and each moving target is distinguished, greatly improving the accuracy of moving target extraction.
As shown in figs. 5 to 7, fig. 5 is any one first initial frame image in a motion video. With the conventional background difference method the effect is as shown in fig. 6: because of background disturbance (leaf shaking), background pixel points of the background image are captured together with the target, so the accuracy of the extracted moving target is poor. After the first initial frame image is processed by the moving target extraction method provided in this application, the effect is as shown in fig. 7: the radius threshold keeps the background disturbance within a controllable range, only a few background pixel points of the background image are captured, and the accuracy of moving target extraction is greatly improved.
As shown in fig. 8, a moving object extraction system according to an embodiment of the present invention includes:
a first obtaining module 202, configured to obtain a motion video and each first initial frame image in the motion video;
a second obtaining module 203, configured to determine, according to each first initial frame image, a pixel point set corresponding to each first initial frame image, where, for each pixel point set, the pixel point set includes a plurality of background pixel points;
a third obtaining module 204, configured to combine the pixel point sets to obtain a sample point set;
a fourth obtaining module 205, configured to, for each first initial frame image, take each pixel point in the first initial frame image as a first target pixel point, and determine each second target pixel point according to each first target pixel point and the sample point set;
the fifth obtaining module 206 is configured to, for each first initial frame image, determine a moving target in the first initial frame image according to each second target pixel point.
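Assuming grayscale frames stored as NumPy arrays, the module flow above can be outlined as a small driver loop. This is a minimal sketch: `extract_from_video`, `pixel_set_fn`, and `classify_fn` are illustrative names introduced here, not functions from the specification.

```python
import numpy as np

def extract_from_video(frames, pixel_set_fn, classify_fn):
    """Sketch of modules 202-206: collect a pixel point set per frame,
    merge them all into one sample point set, then classify every
    frame against that merged set."""
    pixel_sets = [pixel_set_fn(f) for f in frames]        # second obtaining module
    sample_set = np.concatenate(pixel_sets, axis=0)       # third obtaining module
    return [classify_fn(f, sample_set) for f in frames]   # fourth/fifth modules
```

The per-frame `pixel_set_fn` and the per-pixel `classify_fn` correspond to the optional modules detailed below.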
Optionally, the system further comprises:
the sixth acquisition module is used for performing ghost suppression processing on each first initial frame image according to each first initial frame image and determining a second initial frame image corresponding to each first initial frame image;
the second obtaining module 203 is further configured to determine a pixel point set corresponding to each second initial frame image according to each second initial frame image.
Optionally, the sixth obtaining module further includes:
the seventh obtaining module is used for determining, for each first initial frame image except the first frame first initial frame image, a first foreground target corresponding to the first initial frame image;
and the eighth acquisition module is used for performing a frame difference operation on every two adjacent first initial frame images, starting from the first frame first initial frame image in the motion video, to determine a target area, wherein the target area is a foreground image or a background image; if a continuously set number of target areas are all background images, the frame difference operation is stopped, the target area obtained by the last frame difference operation replaces the background image in each first initial frame image, and each replaced first initial frame image is determined as a second initial frame image.
Optionally, the seventh obtaining module further includes:
a ninth obtaining module, configured to, for each first initial frame image from the second frame to the last frame of the first initial frame images in the motion video, take the first initial frame image as a current frame image;
the tenth acquisition module is used for carrying out differential operation on the current frame image and the previous frame image of the current frame image to determine a first differential image, and carrying out differential operation on the current frame image and the next frame image of the current frame image to determine a second differential image;
an eleventh obtaining module, configured to use a pixel point in the first difference image whose pixel value is greater than the first preset threshold as a third target pixel point, and use a pixel point in the second difference image whose pixel value is greater than the second preset threshold as a fourth target pixel point;
a twelfth obtaining module, configured to determine, according to each third target pixel point, a second foreground target corresponding to the first difference image, and determine, according to each fourth target pixel point, a third foreground target corresponding to the second difference image;
the thirteenth acquisition module is used for extracting the edge characteristics of the current frame image to obtain a characteristic image corresponding to the current frame image;
a fourteenth acquiring module, configured to perform a logical and operation on the second foreground target and the feature image, determine a fourth foreground target, perform a logical and operation on the third foreground target and the feature image, and determine a fifth foreground target;
a fifteenth obtaining module, configured to perform a logical or operation on the fourth foreground target and the fifth foreground target, and determine a sixth foreground target;
and the sixteenth acquisition module is used for removing the foreground spot region with the area smaller than the first preset value in the sixth foreground target, filling the foreground hole region with the area smaller than the second preset value, and determining the first foreground target corresponding to the current frame image.
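A minimal NumPy sketch of the three-frame-difference steps above (modules 10 to 15). The thresholds and the gradient-magnitude edge extractor here are assumptions: the specification does not fix a particular edge operator, and the blob/hole cleanup of the sixteenth module is omitted.

```python
import numpy as np

def edge_features(img, thresh=10):
    """Simple gradient-magnitude edges, a stand-in for the unspecified
    edge-feature extraction of the thirteenth module."""
    f = img.astype(np.int32)
    gx = np.abs(np.diff(f, axis=1, prepend=f[:, :1]))
    gy = np.abs(np.diff(f, axis=0, prepend=f[:1, :]))
    return (gx + gy) > thresh

def first_foreground(prev, cur, nxt, t1=15, t2=15):
    """Two frame differences (third/fourth target pixel points),
    AND each with the edge image, then OR the two results."""
    d1 = np.abs(cur.astype(np.int32) - prev.astype(np.int32)) > t1
    d2 = np.abs(cur.astype(np.int32) - nxt.astype(np.int32)) > t2
    edges = edge_features(cur)
    # (second AND edges) OR (third AND edges) -> sixth foreground target,
    # before spot removal and hole filling
    return (d1 & edges) | (d2 & edges)
```

On a synthetic moving square, this keeps foreground pixels only where a frame difference coincides with an edge of the current frame, which is what suppresses the ghost left behind by a static difference.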
Optionally, the second obtaining module 203 further includes:
a seventeenth obtaining module, configured to, for each second initial frame image, take each pixel point in the second initial frame image as a fifth target pixel point;
an eighteenth acquiring module, configured to, for each fifth target pixel point, take the fifth target pixel point as a central pixel point, acquire all adjacent pixel points in the 24-neighborhood of the central pixel point, and determine a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, where, for each similarity value, the similarity value represents the similarity between the gray value of the fifth target pixel point and the gray value of one corresponding adjacent pixel point;
a nineteenth obtaining module, configured to, for each fifth target pixel point, sort similarity values corresponding to the fifth target pixel point from small to large, and select an adjacent pixel point corresponding to the top n similarity values in the sort as a background pixel point;
a twentieth acquiring module, configured to, for each second initial frame image, use all acquired background pixel points as a pixel point set;
the eighteenth acquisition module determines a plurality of similarity values corresponding to the fifth target pixel point through a first formula, wherein the first formula is as follows:
G(a,b)=|I(a)-I(b)|;
wherein G(a, b) represents a similarity value, a represents the fifth target pixel point, b represents any one adjacent pixel point, I(a) represents the gray value corresponding to the fifth target pixel point, and I(b) represents the gray value corresponding to the adjacent pixel point.
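The first formula and the top-n selection can be illustrated for a single fifth target pixel point. The grey values and n = 3 below are made-up example data, not values from the specification.

```python
def similarity(i_a, i_b):
    """G(a, b) = |I(a) - I(b)|: smaller means the neighbour's grey
    value is more similar to the centre pixel's grey value."""
    return abs(i_a - i_b)

centre = 120                                          # grey value of the fifth target pixel point
neighbours = [118, 200, 121, 90, 123, 45, 119, 130]   # part of its 24-neighborhood
n = 3

# sort the neighbours by G in ascending order and keep the n most
# similar ones as background pixel points
ranked = sorted(neighbours, key=lambda b: similarity(centre, b))
background = ranked[:n]
print(background)
```

Here the three neighbours with the smallest |I(a) − I(b)| are retained; over all frames these retained pixels form the pixel point sets that are merged into the sample point set.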
Optionally, the fourth obtaining module 205 further includes:
a twenty-first obtaining module, configured to determine a radius threshold according to the sample point set;
a twenty-second obtaining module, configured to determine, for each first target pixel point in each first initial frame image, a euclidean distance between the first target pixel point and each background pixel point in the sample point set;
a twenty-third obtaining module, configured to, for each first target pixel point in each first initial frame image, compare each Euclidean distance corresponding to the first target pixel point with the radius threshold and count the number of Euclidean distances corresponding to the first target pixel point that are smaller than the radius threshold, where if the number is smaller than a third preset value, the first target pixel point is a second target pixel point;
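The comparison performed by the module above can be sketched as follows, using the example values given earlier (radius threshold 20, third preset value 20). The sample point set here is synthetic grey values, so the Euclidean distance reduces to an absolute difference; this is an illustration, not the patented implementation.

```python
import numpy as np

def is_moving(pixel_value, sample_set, radius=20.0, third_preset=20):
    """Count the background samples whose distance to the first target
    pixel point is below the radius threshold; if fewer than the third
    preset value match, the pixel is a second target pixel point
    (i.e. part of the moving target)."""
    d = np.abs(sample_set.astype(np.float64) - pixel_value)
    return int((d < radius).sum()) < third_preset

# synthetic sample point set: background grey values clustered around 100
rng = np.random.default_rng(0)
background_samples = rng.normal(100, 5, size=200)
```

A grey value near the cluster (e.g. 102) matches far more than 20 samples and is classified as background, while a value far from every sample (e.g. 240) matches none and is classified as a moving-target pixel.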
the twenty-first obtaining module determines the radius threshold value through a second formula, wherein the second formula is as follows:
[Second formula: rendered as images in the original publication; it defines the dispersion D of the sample point set and the radius threshold R(x, y).]

wherein D represents the dispersion of the sample point set, M represents the total number of frames of the motion video, n represents the number of selected similarity values, x_i represents the ith background pixel point, and x̄ represents the average value of the background pixel points in the sample point set; α represents the adaptive parameter, δ represents the scale factor, R is the initial value of the radius threshold, and R(x, y) is the radius threshold determined from the sample point set.
The electronic equipment comprises a memory, a processor and a program which is stored on the memory and runs on the processor, wherein the processor executes the program to realize part or all of the steps of the moving object extracting method.
For the parameters and steps of the electronic device of the present invention, reference may be made to the parameters and steps in the above embodiment of the moving object extraction method, which are not described herein again.
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method, or computer program product. Accordingly, the present disclosure may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software, which may be referred to herein generally as a "circuit," "module," or "system." Furthermore, in some embodiments, the invention may also be embodied in the form of a computer program product in one or more computer-readable media having computer-readable program code embodied therein. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A moving object extraction method is characterized by comprising the following steps:
acquiring a motion video and each first initial frame image in the motion video;
determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, wherein for each pixel point set, the pixel point set comprises a plurality of background pixel points;
merging the pixel point sets to obtain a sample point set;
for each first initial frame image, taking each pixel point in the first initial frame image as a first target pixel point, and determining each second target pixel point according to each first target pixel point and the sample point set;
and for each first initial frame image, determining the moving target in the first initial frame image according to each second target pixel point.
2. The method of claim 1, further comprising:
performing ghost suppression processing on each first initial frame image according to each first initial frame image, and determining a second initial frame image corresponding to each first initial frame image;
determining a pixel point set corresponding to each first initial frame image according to each first initial frame image, including:
and determining a pixel point set corresponding to each second initial frame image according to each second initial frame image.
3. The method of claim 2, wherein performing ghost-suppression processing on each of the first initial frame images according to each of the first initial frame images to determine a second initial frame image corresponding to each of the first initial frame images comprises:
determining, for each first initial frame image except the first frame of the first initial frame images, a first foreground target corresponding to the first initial frame image;
performing a frame difference operation on every two adjacent first initial frame images, starting from the first frame of the first initial frame images in the motion video, to determine a target area, wherein the target area is a foreground image or a background image; if a continuously set number of the target areas are all background images, stopping the frame difference operation, replacing the background image in each first initial frame image with the target area obtained by the last frame difference operation, and determining each replaced first initial frame image as a second initial frame image.
4. The method of claim 3, wherein for each of the first initial frame images, determining a first foreground target corresponding to each of the first initial frame images other than the first initial frame image comprises:
starting from a second frame of the first initial frame image in the motion video to a last frame of the first initial frame image, regarding each first initial frame image as a current frame image;
performing a difference operation on the current frame image and a previous frame image of the current frame image to determine a first difference image, and performing a difference operation on the current frame image and a next frame image of the current frame image to determine a second difference image;
taking the pixel points with the pixel values larger than a first preset threshold value in the first differential image as third target pixel points, and taking the pixel points with the pixel values larger than a second preset threshold value in the second differential image as fourth target pixel points;
determining a second foreground target corresponding to the first differential image according to each third target pixel point, and determining a third foreground target corresponding to the second differential image according to each fourth target pixel point;
extracting the edge characteristics of the current frame image to obtain a characteristic image corresponding to the current frame image;
performing logical AND operation on the second foreground target and the characteristic image to determine a fourth foreground target, and performing logical AND operation on the third foreground target and the characteristic image to determine a fifth foreground target;
performing logical OR operation on the fourth foreground target and the fifth foreground target to determine a sixth foreground target;
and removing the foreground spot region with the area smaller than the first preset value in the sixth foreground target, filling the foreground hole region with the area smaller than the second preset value, and determining the first foreground target corresponding to the current frame image.
5. The method of claim 2, wherein said determining a set of pixels corresponding to each of said second initial frame images based on each of said second initial frame images comprises:
for each second initial frame image, taking each pixel point in the second initial frame image as a fifth target pixel point;
for each fifth target pixel point, taking the fifth target pixel point as a central pixel point, acquiring all adjacent pixel points in the 24-neighborhood of the central pixel point, and determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each adjacent pixel point, wherein, for each similarity value, the similarity value represents the similarity between the gray value of the fifth target pixel point and the gray value of one corresponding adjacent pixel point;
for each fifth target pixel point, sorting the similarity values corresponding to the fifth target pixel point from small to large, and selecting adjacent pixel points corresponding to the first n similarity values in the sorting as the background pixel points;
for each second initial frame image, taking all the obtained background pixel points as a pixel point set;
determining a plurality of similarity values corresponding to the fifth target pixel point according to the fifth target pixel point and each of the adjacent pixel points includes:
determining a plurality of similarity values corresponding to the fifth target pixel point through a first formula according to the fifth target pixel point and each adjacent pixel point, wherein the first formula is as follows:
G(a,b)=|I(a)-I(b)|
wherein G(a, b) represents a similarity value, a represents the fifth target pixel point, b represents any adjacent pixel point, I(a) represents the gray value corresponding to the fifth target pixel point, and I(b) represents the gray value corresponding to the adjacent pixel point.
6. The method of claim 5, wherein for each of the first initial frame images, determining each second target pixel point from each of the first target pixel points and the sample point set by using each pixel point in the first initial frame image as a first target pixel point comprises:
determining a radius threshold according to the sample point set;
for each first target pixel point in each first initial frame image, determining the Euclidean distance between the first target pixel point and each background pixel point in the sample point set;
for each first target pixel point in each first initial frame image, comparing the Euclidean distances corresponding to the first target pixel point with the radius threshold, and counting the number of Euclidean distances corresponding to the first target pixel point that are smaller than the radius threshold; if the number is smaller than a third preset value, determining that the first target pixel point is the second target pixel point.
7. The method of claim 6, wherein determining a radius threshold from the set of sample points comprises:
determining a radius threshold value through a second formula according to the sample point set, wherein the second formula is as follows:
[Second formula: rendered as images in the original publication; it defines the dispersion D of the sample point set and the radius threshold R(x, y).]

wherein D represents the dispersion of the sample point set, M represents the total number of frames of the motion video, n represents the number of selected similarity values, x_i represents the ith background pixel point, and x̄ represents the average value of the background pixel points in the sample point set; α represents the adaptive parameter, δ represents the scale factor, R is the initial value of the radius threshold, and R(x, y) is the radius threshold determined from the sample point set.
8. A moving object extraction system, comprising:
the first acquisition module is used for acquiring a motion video and each first initial frame image in the motion video;
a second obtaining module, configured to determine, according to each of the first initial frame images, a pixel point set corresponding to each of the first initial frame images, where, for each of the pixel point sets, the pixel point set includes a plurality of background pixel points;
the third acquisition module is used for merging all the pixel point sets to obtain a sample point set;
a fourth obtaining module, configured to, for each first initial frame image, use each pixel point in the first initial frame image as a first target pixel point, and determine each second target pixel point according to each first target pixel point and the sample point set;
and a fifth obtaining module, configured to, for each first initial frame image, determine the moving target in the first initial frame image according to each second target pixel point.
9. An electronic device comprising a memory, a processor and a program stored on the memory and running on the processor, wherein the steps of a moving object extraction method according to any one of claims 1 to 7 are implemented when the program is executed by the processor.
10. A computer-readable storage medium, having stored therein instructions which, when run on a terminal device, cause the terminal device to perform the steps of a moving object extraction method according to any one of claims 1 to 7.
CN202211159735.XA 2022-09-22 2022-09-22 Moving object extraction method, system, electronic device and medium Pending CN115661194A (en)

Publications (1)

Publication Number: CN115661194A — Publication Date: 2023-01-31


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576490A (en) * 2024-01-16 2024-02-20 口碑(上海)信息技术有限公司 Kitchen environment detection method and device, storage medium and electronic equipment
CN117576490B (en) * 2024-01-16 2024-04-05 口碑(上海)信息技术有限公司 Kitchen environment detection method and device, storage medium and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination