CN112837341B - Self-adaptive time-space domain pedestrian appearance restoration method - Google Patents
- Publication number
- CN112837341B (application CN202110106572.8A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- visual rhythm
- time
- rhythm
- space
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/207—Analysis of motion for motion estimation over a hierarchy of resolutions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Abstract
The invention discloses an adaptive time-space domain pedestrian appearance restoration method comprising the following steps: completing horizontal visual rhythm splicing with a single pixel row as the sampling unit; deriving the relation between the pedestrian trajectory slope and the pedestrian movement speed in the horizontal visual rhythm; and using the pedestrian trajectory slope to determine the vertical visual rhythm sampling width, completing adaptive time-space domain pedestrian appearance restoration. Experimental results show that, compared with the pedestrian appearance formed by the traditional visual rhythm, the appearance restored by this method achieves better contour smoothness and visual perception, and differs from the original spatial-domain pedestrian image by a smaller proportion.
Description
Technical Field
The invention relates to the technical field of image processing methods, in particular to a self-adaptive time-space domain pedestrian appearance restoration method.
Background
The number of surveillance videos is growing at a tremendous rate, and although such footage safeguards people's lives, browsing and watching large volumes of it has become a time-consuming and labor-intensive task. Video abstraction technology can generate a highly compressed video abstract while preserving the target activity information of the original video, allowing content of interest to be retrieved quickly and greatly reducing the time users spend browsing video.
Saeid Bagheri proposed a method that maps a surveillance video to a temporal profile for indexing, effectively visualizing the video content in a two-dimensional temporal image: the temporal profile extracted from the video conveys accurate temporal information while retaining certain spatial features for identification. Bagheri's method achieves good results in both computational efficiency and visual quality. In many cases, however, the temporal width of the target is compressed, because the frame rate is limited relative to close, fast-moving objects, which deforms the contour map in the space-time domain, as shown in fig. 1.
Disclosure of Invention
The invention aims to provide a pedestrian appearance restoration method whose restored pedestrian appearance achieves better contour smoothness and visual perception, and differs from the original spatial-domain pedestrian image by a smaller proportion.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a self-adaptive time-space domain pedestrian appearance restoration method is characterized by comprising the following steps:
completing horizontal visual rhythm splicing by taking a single pixel row as a sampling unit;
researching the relation between the pedestrian track slope and the pedestrian movement speed in the horizontal visual rhythm;
and the pedestrian track slope is used as a vertical vision rhythm sampling width determination basis to finish self-adaptive time-space domain pedestrian appearance restoration.
The further technical scheme is as follows: the slope of the target trajectory in the horizontal visual rhythm is defined as k; A(x1, y1) and B(x2, y2) are two points on the trajectory, where x is the spatial column and y the frame index, and the expressions for k and v are:

k = (y2 - y1) / (x2 - x1),  v = (x2 - x1) / (y2 - y1) = 1/k

The above formulas give the numerical relationship between the slope k of the target's space-time motion trajectory and its horizontal movement speed v; once the slope k of the pedestrian's space-time motion trajectory in the horizontal visual rhythm is obtained, the movement speed v of the pedestrian relative to the hardware device follows.
The further technical scheme is as follows: the sampling width of the vertical visual rhythm within a single frame image is defined as Δx, and the relation between Δx and the target movement speed v is:

Δx = |v·f|

where f denotes one unit frame; since f = 1, it suffices to make the sampling width Δx of the vertical visual rhythm equal to the pedestrian movement speed v for the pedestrian appearance in the space-time domain to retain high resolution;
the relationship between the vertical visual rhythm sampling width Deltax and the slope k of the pedestrian space-time motion trajectory is as follows:
sampling region S of vertical visual rhythm in ith frame of videoiyExpressed as the following equation:
wherein y isi,l jRepresents the position (j, l), l ∈ [1, M ]]The pixel is determined as long as the sampling width value delta x of the vertical visual rhythm and the space-time motion of the target are ensuredThe inverse of the trajectory slope k is consistent, a complete and smooth contour of the target appearance image can be obtained in the 2D spatio-temporal image.
The beneficial effects of the above technical scheme are as follows: the method first completes horizontal visual rhythm splicing with a single pixel row as the sampling unit; it then derives the relation between the pedestrian trajectory slope and the pedestrian movement speed in the horizontal visual rhythm; finally, the trajectory slope is used to determine the vertical visual rhythm sampling width, completing adaptive time-space domain pedestrian appearance restoration. Experimental results show that, compared with the pedestrian appearance formed by the traditional visual rhythm, the appearance restored by this method achieves better contour smoothness and visual perception, and differs from the original spatial-domain pedestrian image by a smaller proportion.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 shows the raw resolution of a time slice, where the temporal resolution of the object is too low for it to be identified;
FIG. 2 is an overall flow chart of the method according to the embodiment of the invention;
FIG. 3 is a schematic diagram of horizontal visual cadence formation in an embodiment of the invention;
FIG. 4 is a composite schematic of vertical visual cadence sample widths according to an embodiment of the invention;
FIG. 5a is a pedestrian image reconstructed in the time-space domain using a conventional visual rhythm method;
FIG. 5b is a pedestrian image reconstructed in the time-space domain using the method proposed in the present application;
FIG. 5c is the pedestrian image in the spatial domain of the original video;
FIG. 6a is a pedestrian image reconstructed in the time-space domain using a conventional visual rhythm method;
FIG. 6b is a pedestrian image reconstructed in the time-space domain using the method described in the present application;
FIG. 7a is a pedestrian image reconstructed in the time-space domain using a conventional visual rhythm method;
FIG. 7b is a pedestrian image reconstructed in the time-space domain using the method described in the present application;
fig. 8 is a graph of experimental results for each test video.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
Improved visual rhythm
The visual rhythm, also called the space-time slice, is an efficient video spatio-temporal analysis method with wide application in video processing. A video is defined as V = {F_1, F_2, ..., F_t}, where F_i (1 ≤ i ≤ t) denotes a video frame of size h × w. Let T(F_i) = S_i be the operation that maps F_i to an n × 1 column vector. The visual rhythm is then defined as the n × t image:

VR(V) = [T(F_1) T(F_2) ... T(F_t)] = [S_1 S_2 ... S_t]  (1)

The vertical visual rhythm S_iy obtained from the i-th frame image is defined as:

S_iy = (y_{i,l}^j)^T = (y_{i,l}^1, y_{i,l}^2, y_{i,l}^3, ..., y_{i,l}^h)^T  (2)

where y_{i,l}^j denotes the pixel at position (j, l), l ∈ [1, w].
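The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation; the function name `vertical_visual_rhythm` and the toy video are assumptions for demonstration.

```python
import numpy as np

def vertical_visual_rhythm(frames, col):
    """Stack one pixel column per frame: VR = [S_1 S_2 ... S_t].

    frames : array of shape (t, h, w), grayscale video
    col    : index l of the sampled column
    Returns an (h, t) spatio-temporal image (eq. 1 with T taking column l).
    """
    return np.stack([f[:, col] for f in frames], axis=1)

# toy video: 5 frames of 4x6 pixels, a bright dot moving right 1 pixel/frame
t, h, w = 5, 4, 6
video = np.zeros((t, h, w), dtype=np.uint8)
for i in range(t):
    video[i, 2, i] = 255

vr = vertical_visual_rhythm(video, col=2)
print(vr.shape)       # (4, 5): one h-pixel column per frame
print(vr[2].tolist()) # [0, 0, 255, 0, 0]: the dot crosses column 2 in frame 2
```

The dot appears in the rhythm image only in the frame where its trajectory crosses the sampled column, which is exactly the information loss the improved method addresses.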
Conventional visual rhythms typically reorganize the two-dimensional spatio-temporal image with a single pixel column (or row) as the basic unit. This simple splicing distorts spatial targets with continuous information in the time-space domain, so their shape differs from what human vision would perceive. Two-dimensional space-time pedestrians thus appear in a form that is hard to recognize, which makes the video abstract harder to understand and poorer in presentation. The present application proposes a new visual rhythm reorganization method that breaks the limitation of traditional single-column (row) pixel reorganization: the number of column (row) pixels that can form the complete appearance of a space-time target is taken as the basic unit of the visual rhythm, completing adaptive reorganization and splicing of the two-dimensional space-time target. Assuming m columns of pixels as the reorganization unit of the vertical visual rhythm, equation (2) can be expressed as:

S_iy = (y_{i,l}^j), j ∈ [1, h], l ∈ [1, m]  (3)

Equation (3) is the improved representation of the vertical visual rhythm, and provides the possibility of restoring the pedestrian appearance in the time-space domain using the visual rhythm.
Studying pedestrian deformation influence factors:
experiments prove that the factors influencing the deformation of the pedestrian in the space-time domain comprise two aspects, namely the frame rate of a hardware device for shooting video and the walking speed of the pedestrian relative to the ground. Assuming that a pedestrian walks at a speed v, the frame rate of a hardware device that captures a video is f1,f2The sampling width of the vertical visual rhythm is w1,w2. If f is1=μ×f2Then, then
Conversely, if the frame rate f of the hardware device remains unchanged, then the pedestrian is present at v respectively1And v2Is traveling at a speed of v and v1=μ×v2At this time, the flow rate of the gas is increased,
w1=μ×w2 (5)
the pedestrian walking speed displayed in the video refers to the horizontal moving speed v of the pedestrian relative to the hardware equipmentp(pixel/frame), the expression of which is shown in equation (6).
vp=α·v+β·f (6)
Where v denotes a horizontal moving speed of the pedestrian with respect to the ground, f denotes a frame rate of the hardware device, and α, β denote weight coefficients.
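A small numerical check of the scaling relations above. It assumes the width swept per frame is w = |v / f|, which is not stated explicitly in the text but is consistent with the surrounding derivation; the helper `width` is illustrative.

```python
# Assumed model: pixels swept per frame is proportional to speed and
# inversely proportional to frame rate.
def width(v, f):
    return abs(v / f)

mu, v, f = 3.0, 6.0, 2.0
w1 = width(v, mu * f)      # device 1: frame rate f1 = mu * f2
w2 = width(v, f)           # device 2: frame rate f2
print(w2 == mu * w1)       # True: higher frame rate narrows the width

w_fast = width(mu * v, f)  # same device, pedestrian speed v1 = mu * v2
print(w_fast == mu * width(v, f))  # True: faster walking widens the width
```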
Through the above analysis, the present application provides a self-adaptive time-space domain pedestrian appearance restoration method, as shown in fig. 2, the method includes the following steps:
completing horizontal visual rhythm splicing by taking a single pixel row as a sampling unit;
researching the relation between the pedestrian track slope and the pedestrian movement speed in the horizontal visual rhythm;
and the pedestrian track slope is used as a vertical vision rhythm sampling width determination basis to finish self-adaptive time-space domain pedestrian appearance restoration.
The method makes the video image mapped onto the two-dimensional time section more vivid, so that pedestrians on the section appear in a form that is easier to recognize. It breaks the limitation of the traditional visual rhythm and uses the pedestrian trajectory information in the horizontal visual rhythm to guide the splicing of the vertical visual rhythm, thereby completing the adaptive restoration of the pedestrian image in the space-time domain and bringing the pedestrian appearance in the space-time domain as close as possible to that in the spatial domain.
Splicing horizontal visual rhythm:
The space-time motion trajectory reflects the state changes of all moving objects, and the horizontal visual rhythm is an effective tool for exploring it. Assume the target moves in a straight line from left to right at a constant horizontal speed of v pixels/frame (p/f). The single pixel row at the target's center position is sampled in each frame; one side of the spliced horizontal visual rhythm equals the horizontal length of an original video frame, while the other side equals the number of sampled frames t. The formation of the horizontal visual rhythm and the target motion trajectory is shown in fig. 3.
Explore the relationship of k to v:
The slope of the target trajectory in the horizontal visual rhythm is defined as k. A(x1, y1) and B(x2, y2) are two points on the trajectory, where x is the spatial column and y the frame index; the expressions for k and v are shown in equation (7):

k = (y2 - y1) / (x2 - x1),  v = (x2 - x1) / (y2 - y1) = 1/k  (7)

Equation (7) gives the numerical relationship between the slope k of the target's space-time motion trajectory and its movement speed v. Once the slope k of the pedestrian's space-time motion trajectory in the horizontal visual rhythm is obtained, the movement speed v of the pedestrian relative to the hardware device follows.
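A minimal sketch of this slope-to-speed computation, assuming (as in the reconstruction above) that x is the spatial column and y the frame index; the function name is illustrative.

```python
def slope_and_speed(a, b):
    """Trajectory slope k and implied horizontal speed v = 1/k (pixels/frame),
    from two trajectory points a = (x1, y1), b = (x2, y2)."""
    (x1, y1), (x2, y2) = a, b
    k = (y2 - y1) / (x2 - x1)
    return k, 1.0 / k

# a target moving 2 pixels per frame passes through (0, 0) and (6, 3):
k, v = slope_and_speed((0, 0), (6, 3))
print(k, v)  # 0.5 2.0
```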
Adaptive time-space domain pedestrian stitching
The width of vertical visual rhythm sampling in a single frame image directly affects the appearance of the target in the spatio-temporal domain. If the sampling width is too narrow, the target shape is compressed and is difficult to recognize; however, if the sampling width is too wide, severe splicing traces can be generated, and the target contour is not smooth. The sampling width of the vertical visual rhythm in the single frame image is defined as Δ x, and the relationship between Δ x and the target movement velocity v can be obtained from fig. 4, as shown in equation (8).
Δx=|v·f| (8)
Where f denotes a unit frame. Since the value of f is 1, to maintain a higher resolution of the appearance of the pedestrian in the spatiotemporal domain, it is only necessary to ensure that the sampling width Δ x of the vertical visual rhythm is equal to the value of the pedestrian moving speed v.
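With f = 1, equation (8) reduces to Δx = |v| = |1/k|, so the width can be computed directly from the trajectory slope. A small illustrative helper (rounding to whole pixels is an added practical assumption, since a sampling width must be an integer number of columns):

```python
def sampling_width(k, f=1):
    """Adaptive vertical-rhythm sampling width: dx = |v * f| with v = 1/k."""
    return abs(f / k)

print(sampling_width(0.5))    # 2.0: slope 0.5 -> 2 pixels/frame
print(sampling_width(-0.25))  # 4.0: direction of motion does not matter
print(max(1, round(sampling_width(0.3))))  # rounded to whole columns
```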
The relationship between the vertical visual rhythm sampling width Δx and the slope k of the pedestrian space-time motion trajectory follows from equations (7) and (8): Δx = |1/k|. The sampling region S_iy of the vertical visual rhythm in the i-th frame of the video is expressed as:

S_iy = (y_{i,l}^j), j ∈ [1, h], l ∈ [1, M]

where y_{i,l}^j denotes the pixel at position (j, l) and M = Δx is the number of sampled columns. As long as the sampling width Δx of the vertical visual rhythm is kept consistent with the reciprocal of the target space-time trajectory slope k, a complete and smooth contour of the target appearance image can be obtained in the 2D spatio-temporal image.
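The adaptive stitching step can be sketched as below. This is a simplified illustration under stated assumptions: the slope k is known and constant, the target moves left to right from a known start column, and the function name `adaptive_rhythm` is invented for the example.

```python
import numpy as np

def adaptive_rhythm(frames, k, x0=0):
    """Per frame, take dx = round(|1/k|) consecutive columns following the
    trajectory, and concatenate the strips into one 2D space-time image."""
    dx = max(1, round(abs(1.0 / k)))
    h, w = frames[0].shape
    strips = []
    for i, f in enumerate(frames):
        start = min(max(x0 + i * dx, 0), w - dx)  # clamp to image bounds
        strips.append(f[:, start:start + dx])
    return np.concatenate(strips, axis=1)

# toy: 3 frames of 4x8 pixels, a vertical stripe moving 2 px/frame (k = 0.5)
frames = np.zeros((3, 4, 8))
for i in range(3):
    frames[i, :, 2 * i] = i + 1

out = adaptive_rhythm(frames, k=0.5)
print(out.shape)  # (4, 6): 3 frames x 2 columns each
```

Because each frame contributes exactly the number of columns the target sweeps per frame, consecutive strips tile the target's appearance without gaps or overlaps.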
Experimental verification
All experiments were performed on a Windows 10 system with an Intel(R) Core(TM) i7-4790 CPU (eight logical cores), an AMD Radeon R7 200 series graphics card, and 16 GB of memory. The video data used in the experiments are self-collected surveillance videos; 10 video segments covering 6 different scenes were tested. The proposed method uses the reciprocal of the slope of the pedestrian's space-time motion trajectory in the horizontal visual rhythm as the basic unit of vertical visual rhythm sampling, reconstructing a complete pedestrian appearance image in the space-time domain. FIGS. 5a-5c, 6a-6b, and 7a-7b show the results for single and multiple targets in the spatio-temporal domain. As can be seen from these figures, although the pedestrian appearance restored in the time-space domain by this method differs slightly from that of the pedestrian in the spatial domain of the original video, it is a great improvement over the traditional visual rhythm reorganization image.
The performance of the proposed method and of the traditional visual rhythm method is compared using the difference ratio as the evaluation index. The difference ratio lies in the range [0, 1]; the smaller, the better. Its formula is shown in equation (11). The experimental results show that the average difference ratio of the traditional visual rhythm is 0.8401, while that of the proposed method is 0.0857. Fig. 8 compares the difference ratios on each test video.
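The patent's equation (11) is not reproduced on this page, so the exact definition of the difference ratio is unknown. As a hedged illustration only, one common choice with the same [0, 1] range and "lower is better" behavior is the fraction of differing pixels:

```python
import numpy as np

def difference_ratio(a, b):
    """Fraction of pixel positions whose values differ between two equally
    sized images. NOTE: this is an assumed stand-in, not the patent's
    equation (11), which is not given in the text."""
    assert a.shape == b.shape
    return float(np.mean(a != b))

restored = np.array([[1, 2], [3, 4]])
original = np.array([[1, 2], [3, 5]])
print(difference_ratio(restored, original))  # 0.25: 1 of 4 pixels differs
```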
Fig. 8 shows that the method provided by the present application has a smaller difference ratio compared with the conventional visual rhythm method, that is, a better pedestrian recovery effect is achieved in the time-space domain.
Claims (1)
1. A self-adaptive time-space domain pedestrian appearance restoration method is characterized by comprising the following steps:
completing horizontal visual rhythm splicing by taking a single pixel row as a sampling unit;
researching the relation between the pedestrian track slope and the pedestrian movement speed in the horizontal visual rhythm;
the pedestrian track slope is used as a vertical vision rhythm sampling width determination basis to finish self-adaptive time-space domain pedestrian appearance restoration;
the sampling width of the vertical visual rhythm within a single frame image is defined as Δx, and the relation between Δx and the target movement speed v is:

Δx = |v·f|  (1)

where f denotes one unit frame; since f = 1, it suffices to make the sampling width Δx of the vertical visual rhythm equal to the pedestrian movement speed v for the pedestrian appearance in the space-time domain to retain high resolution;
the relationship between the vertical visual rhythm sampling width Δx and the slope k of the pedestrian space-time motion trajectory is Δx = |1/k|; the sampling region S_iy of the vertical visual rhythm in the i-th frame of the video is expressed as:

S_iy = (y_{i,l}^j), j ∈ [1, h], l ∈ [1, M]

where y_{i,l}^j denotes the pixel at position (j, l) and M = Δx is the number of sampled columns; as long as the sampling width Δx of the vertical visual rhythm is kept consistent with the reciprocal of the target space-time trajectory slope k, a complete and smooth contour of the target appearance image can be obtained in the 2D space-time image;
assuming a pedestrian walks at speed v, the frame rates of two capturing devices are f1 and f2, and the corresponding vertical visual rhythm sampling widths are w1 and w2; if f1 = μ × f2, then

w2 = μ × w1  (4)

conversely, if the frame rate f of the hardware device remains unchanged and the pedestrian travels at speeds v1 and v2 with v1 = μ × v2, then

w1 = μ × w2  (5)
the pedestrian walking speed displayed in the video is the horizontal moving speed v_p (pixels/frame) of the pedestrian relative to the hardware device, expressed in equation (6):

v_p = α·v + β·f  (6)

where v denotes the horizontal moving speed of the pedestrian relative to the ground, f denotes the frame rate of the hardware device, and α, β denote weight coefficients;
the slope of the target trajectory in the horizontal visual rhythm is defined as k; A(x1, y1) and B(x2, y2) are two points on the trajectory, where x is the spatial column and y the frame index, and the expressions for k and v are:

k = (y2 - y1) / (x2 - x1),  v = (x2 - x1) / (y2 - y1) = 1/k

The above formulas give the numerical relationship between the slope k of the target's space-time motion trajectory and its horizontal movement speed v; once the slope k of the pedestrian's space-time motion trajectory in the horizontal visual rhythm is obtained, the movement speed v of the pedestrian relative to the hardware device follows.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110106572.8A CN112837341B (en) | 2021-01-26 | 2021-01-26 | Self-adaptive time-space domain pedestrian appearance restoration method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112837341A CN112837341A (en) | 2021-05-25 |
CN112837341B true CN112837341B (en) | 2022-05-03 |
Family
ID=75931717
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110106572.8A Active CN112837341B (en) | 2021-01-26 | 2021-01-26 | Self-adaptive time-space domain pedestrian appearance restoration method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112837341B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8489176B1 (en) * | 2000-08-21 | 2013-07-16 | Spectrum Dynamics Llc | Radioactive emission detector equipped with a position tracking system and utilization thereof with medical systems and in medical procedures |
CN106023076A (en) * | 2016-05-11 | 2016-10-12 | 北京交通大学 | Splicing method for panoramic graph and method for detecting defect state of guard railing of high-speed railway |
CN106101487A (en) * | 2016-07-04 | 2016-11-09 | 石家庄铁道大学 | Video spatiotemporal motion track extraction method |
CN106127813A (en) * | 2016-07-04 | 2016-11-16 | 石家庄铁道大学 | The monitor video motion segments dividing method of view-based access control model energy sensing |
CN110020985A (en) * | 2019-04-12 | 2019-07-16 | 广西师范大学 | A kind of video-splicing system and method for Binocular robot |
CN110300977A (en) * | 2017-02-17 | 2019-10-01 | 考吉森公司 | Method for image procossing and video compress |
Also Published As
Publication number | Publication date |
---|---|
CN112837341A (en) | 2021-05-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |