WO2022000838A1 - Markov random field-based method for labeling remote control tower video target - Google Patents

Markov random field-based method for labeling remote control tower video target

Info

Publication number: WO2022000838A1
Application number: PCT/CN2020/118643 (CN2020118643W)
Authority: WIPO (PCT)
Prior art keywords: image, video, background, label, frame
Other languages: French (fr), Chinese (zh)
Inventor: 何亮, 程先峰, 杨恺, 叶鑫鑫, 刘胜新
Original Assignee: 南京莱斯信息技术股份有限公司
Application filed by 南京莱斯信息技术股份有限公司 filed Critical 南京莱斯信息技术股份有限公司
Publication of WO2022000838A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/29: Graphical models, e.g. Bayesian networks
    • G06F18/295: Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/49: Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

Definitions

  • The invention belongs to the technical field of remote towers and specifically relates to a Markov random field-based method for attaching information labels (signage) to remote tower video targets.
  • Remote tower video surveillance can effectively help controllers manage surface traffic, but video surveillance alone provides only image information.
  • An automatic video target labeling function can intuitively and accurately display the flight number, speed, aircraft type and other label information directly in the video, effectively reducing the controller's workload, improving control efficiency and ensuring control safety.
  • Existing methods that fuse video with surveillance data for automatic labeling mainly use background subtraction and the KLT algorithm to detect and track aircraft: the target center point in a single frame is taken as the video position coordinate, and the aircraft latitude and longitude from the surveillance data are coordinate-transformed and mapped to it. This single-frame coordinate-mapping approach, however, suffers from label delay and label loss.
  • One approach builds the background model with a Gaussian mixture model, obtains the aircraft image coordinates by background subtraction, and then selects feature points on the airport map and the video image to establish a mapping, fusing the image tracking data with Automatic Dependent Surveillance-Broadcast (ADS-B) data. This method uses a covariance matrix and homography mapping to correct measurement errors, focusing on reducing the association error between image detection results and radar tracking results while ignoring the errors of the video tracking results themselves, as well as the impact of hardware cost.
  • With single-frame matching and association, every video frame must pass through the full workflow of image target detection, coordinate mapping, error correction, and database lookup to associate surveillance data; limited by system performance, delays or target loss can occur when processing targets over many consecutive frames.
  • Among motion-detection models, motion segmentation classifies pixels by motion pattern. A common example is the KLT method, which uses the vector velocity field of moving targets on the pixel plane to decompose the image into different motion layers according to their motion parameters. This approach requires no prior information, but the computation is complex and the hardware cost is high.
  • The purpose of the present invention is to provide a Markov random field-based method for labeling remote tower video targets. It takes the background as input and uses the autonomous optimization property of the Hopfield network to automatically form an optimal estimate of the foreground target.
  • A Markov random field-based method of the present invention for labeling remote tower video targets comprises the following steps.
  • The background intensity in the sequence is essentially unchanged, so for a continuous video sequence D the background images of its frames are considered linearly correlated; a moving target is treated as the pixels that cannot be absorbed into the background matrix B during the linear decomposition of the video sequence.
  • I is the identity matrix, and x_e I represents white Gaussian noise.
  • The binary label support set S ∈ {0,1}^{m×n} is defined as the image pixel labeling, with elements S_kt = 1 if pixel kt belongs to the foreground and S_kt = 0 otherwise.
  • The parameter α > 0 is a constant related to the sparsity of the coefficient vector x and controls the complexity of the background.
  • Step 2) specifically includes: assuming the optimized support set estimate S has been obtained, equation (7) simplifies to the optimization problem of equation (8).
  • By compressed sampling, the problem of equation (8) is transformed into the L1-norm minimization problem of equation (10).
  • The optimized solution of the foreground label set further refines the background estimate; in subsequent iterations, the current frame y replaces the template in D_{t-1} whose sparse representation coefficient in x is smallest.
  • Step 3) specifically includes: when the sparse coefficient x is given, the energy function of equation (7) is converted into equation (11).
  • The constant C is then also determined. To estimate the support S in equation (11), and thereby obtain the foreground image in each frame, an image segmentation method based on Markov random fields (MRFs) is used.
  • Let G = {(i,j) | 0 ≤ i ≤ h, 0 ≤ j ≤ w} denote the set of all pixel positions in the h × w image of the current frame, with g = (i,j) ∈ G indexing the two-dimensional image.
  • Each pixel position g of the image corresponds to a random value in the label support set S ∈ {0,1}^{m×n}. It is assumed that the local conditional probability of a foreground pixel's label depends only on the state of its neighborhood.
  • The set S of pixel labels, which encodes positional relationships, is therefore a Markov random field with respect to the neighborhood system N; given the observed image data Y, the pixel label values are derived from the Bayesian criterion of equation (12).
  • P(Y) is the prior distribution of the observed data; for a given video frame image it can be regarded as a constant.
  • P(S) is the prior distribution of the label field.
  • The potential function of a given clique c is V_c(l_c), where l_c denotes the labels of the points on clique c; the prior distribution of the label field is fitted by the sum of the potential-function energies over all cliques.
  • The potential function is defined as in the Ising model.
  • P(Y|S) is the likelihood probability. It is usually assumed that the pixels are independent and identically Gaussian-distributed, so the likelihood factorizes into a product over pixels: P(Y|S) = ∏_{g∈G} P(y_g|s_g).
  • The maximum a posteriori (MAP) criterion is chosen as the optimality criterion for the image segmentation, so the optimal solution of the objective function maximizes equation (12).
  • Taking the logarithm of both sides of the posterior probability yields the objective function in the following form.
  • Step 3) specifically also includes: let u_k, v_k be the input and output voltages of the k-th neuron in the recurrent neural network, R_k, C_k its input resistance and input capacitance, I_k the bias current, g_k(u_k) the neuron's transfer function, and ω_jk the connection resistance (i.e., the connection weight) between neuron j and neuron k; the overall energy function of the network then has the usual form of equation (17).
  • Equation (17) decays downward as a whole over time and is simplified accordingly.
  • The energy function converges to its minimum value, so the recurrent neural network realizes autonomous iterative optimization of the input signal.
  • The image labels are taken as the input to the recurrent neural network, and with the network's bias current set accordingly, the energy function of the network follows.
  • Step 4) specifically includes: with the background and foreground estimated as above, tracking of the aircraft target in the video image coordinate system is obtained; the mapping between image pixel coordinates and world coordinates is established, and the corresponding aircraft label information is looked up in the radar tracking results.
  • The conversion between the pixel-plane coordinates of a target point and its world coordinates is obtained as follows.
  • f_x, f_y are parameters representing the focal length.
  • (u_0, v_0)^T is the position of the principal point relative to the image (projection) plane, i.e., the intersection of the principal optical axis with the image plane.
  • z_c is the depth coordinate of the pixel in the camera coordinate system.
  • R is the rotation matrix of the camera
  • T is the translation matrix
  • The world-coordinate positions of the tracked foreground targets are obtained, and the nearest-neighbor method establishes the correspondence between the video tracking coordinates and the ADS-B data, realizing the data association.
  • The label information in the ADS-B data, such as the flight number, is thereby associated with the video, realizing automatic labeling.
  • The background is modeled as a sparse representation of the consecutive video frame sequence, and solving the sparse signal recovery problem with a greedy algorithm reduces the complexity of the background solution.
  • The foreground solution is cast as an image segmentation problem based on the Markov random field.
  • Taking the background layer as input, the autonomous optimization property of the Hopfield network is used to establish the correspondence between the network input and the Markov random field energy function of the foreground model, automatically optimizing the image label set to obtain a smooth foreground target.
  • The foreground target can be fed back into the background solution process, and the number of iterations controls the computational complexity of the overall foreground and background estimation.
  • Moving targets are captured automatically in consecutive video frames.
  • The correspondence between the image coordinates and the ADS-B data is established through coordinate conversion, and the transformation matrix turns the single-frame lookup-table mapping into batch processing.
  • The target image coordinates in consecutive frames are converted to world coordinates and then matched against the ADS-B data in the database by the nearest-neighbor principle, which to some extent reduces the label delay and target loss caused by processing-performance limits.
  • FIG. 1 is a schematic diagram of the method of the present invention.
  • Figure 2 is a diagram of a recurrent neural network neuron model.
  • Compressed sampling: also known as compressive sensing or sparse sampling, it exploits the sparsity of a signal and uses random sampling to obtain discrete samples of the signal at a rate far below the Nyquist rate; the signal is then reconstructed exactly by a nonlinear reconstruction algorithm.
  • Image segmentation: the technique and process of dividing an image into a number of regions with distinctive properties and extracting the objects of interest; a computer vision task of marking designated regions according to image content.
  • Markov random field: a random field with the Markov property. When a value from the phase space is randomly assigned to each position according to some distribution, the whole is called a random field; the Markov property means that, in a sequence of random variables arranged in time order, the distribution at moment N+1 depends only on the value at moment N and not on the values before moment N.
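The pinhole-model coordinate conversion described above (focal lengths f_x, f_y, principal point (u_0, v_0), rotation R, translation T, and depth z_c) can be sketched numerically. All camera parameters below are hypothetical stand-ins for values that would come from calibrating the tower camera; the patent's actual transformation matrix is not reproduced.

```python
import numpy as np

# Hypothetical intrinsic and extrinsic parameters (not from the patent).
fx, fy = 800.0, 800.0
u0, v0 = 320.0, 240.0
K = np.array([[fx, 0.0, u0],
              [0.0, fy, v0],
              [0.0, 0.0, 1.0]])          # intrinsic matrix
R = np.eye(3)                             # camera rotation (world -> camera)
T = np.array([0.0, 0.0, 50.0])            # camera translation

def world_to_pixel(Pw):
    """Pinhole model: z_c * (u, v, 1)^T = K (R Pw + T)."""
    Pc = R @ Pw + T                       # camera coordinates
    zc = Pc[2]                            # depth in the camera frame
    uv1 = K @ Pc / zc
    return uv1[:2], zc

def pixel_to_world(uv, zc):
    """Inverse mapping, possible once the depth z_c is fixed."""
    uv1 = np.array([uv[0], uv[1], 1.0])
    Pc = zc * (np.linalg.inv(K) @ uv1)
    return np.linalg.inv(R) @ (Pc - T)

Pw = np.array([10.0, -5.0, 150.0])        # a world point in front of the camera
uv, zc = world_to_pixel(Pw)
Pw_back = pixel_to_world(uv, zc)
print(uv, np.allclose(Pw_back, Pw))
```

The round trip recovers the world point exactly because z_c is carried along; in the patent's pipeline, z_c comes from the scene geometry rather than being stored per pixel.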

Abstract

A Markov random field-based method for labeling a remote control tower video target, comprising the steps of: 1) establishing a model; 2) using a greedy algorithm to solve a sparse representation of a sequence of consecutive video frames, obtaining a preliminary estimate of the background; 3) using a recurrent neural network to solve an image segmentation problem, obtaining a foreground target tracking result and a background estimate; 4) using the nearest-neighbor method to establish a correspondence between the positions of target coordinate points in the world coordinate system and the Automatic Dependent Surveillance-Broadcast (ADS-B) data, so as to associate the label information in the ADS-B data with the video, thus achieving automatic labeling. The method uses sparse sampling to reduce the data set involved in computation and to lower the complexity of solving for the background; taking the background as input and exploiting the self-optimizing property of the Hopfield network, it automatically forms an optimized estimate of the foreground target.
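The MRF-based foreground segmentation of step 3) can be illustrated with a toy energy minimization. Here iterated conditional modes (ICM) stands in for the Hopfield network's autonomous optimization, and the Ising coupling β, noise scale σ, the fixed foreground cost, and the 6×6 image are all invented for the sketch:

```python
import numpy as np

def ising_energy(S, frame, background, beta=2.0, sigma=10.0, fg_cost=1.0):
    """E(S) = E_smooth (Ising potentials over 4-neighbor pairs) + E_data."""
    smooth = beta * (np.sum(S[1:, :] != S[:-1, :]) + np.sum(S[:, 1:] != S[:, :-1]))
    resid = frame - background
    # data term: background-labelled pixels pay the Gaussian residual,
    # foreground-labelled pixels pay a fixed (invented) cost
    data = np.sum(np.where(S == 0, resid ** 2 / (2 * sigma ** 2), fg_cost))
    return smooth + data

def icm(frame, background, sweeps=5):
    """Iterated conditional modes: greedy label flips that lower the energy."""
    S = np.zeros(frame.shape, dtype=int)
    for _ in range(sweeps):
        for i in range(frame.shape[0]):
            for j in range(frame.shape[1]):
                energies = []
                for lbl in (0, 1):
                    S[i, j] = lbl
                    energies.append(ising_energy(S, frame, background))
                S[i, j] = int(np.argmin(energies))
    return S

bg = np.full((6, 6), 100.0)      # flat background
frame = bg.copy()
frame[2:4, 2:4] = 200.0          # bright moving target covers the background
S = icm(frame, bg)
print(S)
```

The smoothness term penalizes isolated label flips, so the recovered support is the contiguous 2×2 target block rather than scattered pixels, which is the effect the patent attributes to the MRF prior.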

Description

Method for Labeling Remote Tower Video Targets Based on a Markov Random Field

TECHNICAL FIELD
The invention belongs to the technical field of remote towers and specifically relates to a method for labeling remote tower video targets based on a Markov random field.
BACKGROUND ART
At present, with the accelerating pace of life, air travel has become an important mode of transportation, and the construction of general aviation airports is speeding up accordingly; the total number of domestic general aviation airports is expected to exceed 2,000 by 2030. However, general aviation airports have small flight volumes and limited daily revenue, so a tower planned according to traditional airport construction and control standards cannot recoup its construction and operating costs within a regular operating cycle. Moreover, the explosive growth of regional and general aviation airports is bound to drive demand for controllers, and controller training cannot fully keep up with airport construction needs. In addition, the needs of apron control handover and runway expansion further promote the development of remote tower technology.
Remote tower video surveillance can effectively help controllers manage surface traffic, but video surveillance provides only image information; controllers must still determine aircraft label information through systems such as situation displays and electronic flight strips. An automatic video target labeling function can intuitively and accurately display the flight number, speed, aircraft type and other label information in the video, effectively reducing the controller's workload, improving control efficiency and ensuring control safety.
Existing methods for fusing video with surveillance data for automatic labeling mainly use background subtraction and the KLT algorithm to detect and track aircraft: the target center point in a single frame is taken as the video position coordinate, and the aircraft latitude and longitude in the surveillance data are coordinate-transformed and mapped to the video position coordinates. This single-frame coordinate-mapping method, however, suffers from label delay and label loss.
In one approach, a Gaussian mixture model builds the background model, the aircraft image coordinates are obtained by background subtraction, and feature points selected on the airport map and the video image establish the mapping, fusing the image tracking data with Automatic Dependent Surveillance-Broadcast (ADS-B) data. This method uses a covariance matrix and homography mapping to correct measurement errors, focusing on reducing the association error between image detection results and radar tracking results while ignoring the errors of the video tracking results themselves. It also ignores the impact of hardware cost: with single-frame matching and association, each video frame must pass through the full workflow of image target detection, coordinate mapping, error correction, and database lookup to associate surveillance data; limited by system performance, delays or target loss can occur when processing targets over consecutive frames.
Among motion-detection models, motion segmentation classifies pixels by motion pattern. A common example is the KLT method, which uses the vector velocity field of moving targets on the pixel plane to decompose the image into different motion layers according to their motion parameters. This approach requires no prior information, but the computation is complex and the hardware cost is high.
SUMMARY OF THE INVENTION
In view of the above deficiencies of the prior art, the purpose of the present invention is to provide a Markov random field-based method for labeling remote tower video targets. The invention uses sparse sampling to reduce the data set involved in computation and lower the complexity of the background solution; taking the background as input, it uses the autonomous optimization property of the Hopfield network to automatically form an optimal estimate of the foreground target.
To achieve the above object, the technical scheme adopted by the present invention is as follows:
A Markov random field-based method for labeling remote tower video targets of the present invention comprises the following steps:
1) Model building: assume that the background images in consecutive video frames are linearly correlated, and treat a moving target as the pixels that cannot be absorbed into the background matrix during the linear decomposition of the video sequence; by solving for the background estimate and the foreground label set, classify the pixels of each video frame as background or foreground.
2) Use a greedy algorithm to solve the sparse representation of the consecutive video frame sequence and obtain a preliminary estimate of the background.
3) Use a recurrent (Hopfield) neural network to solve the image segmentation problem and obtain an estimate of the foreground label set; use this foreground label set to correct the preliminary background estimate from step 2), obtaining the foreground target tracking result and the background estimate.
4) Use a pinhole perspective model to establish the transformation matrix from the video image coordinate system to the world coordinate system and solve for the world coordinates of the foreground target tracking results in the video frames; use the nearest-neighbor method to establish the correspondence between these world-coordinate target positions and the ADS-B data, so that the label information in the ADS-B data is associated with the video, realizing automatic labeling.
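The four steps above can be sketched end to end as a toy pipeline. The stand-ins below are deliberately simple: plain least squares replaces the greedy sparse solver of step 2), a residual threshold replaces the Hopfield/MRF segmentation of step 3), and the ADS-B reports, flight numbers, and image values are all invented for illustration:

```python
import numpy as np

def estimate_background(frames_prev, frame, S):
    # Step 2 stand-in: least-squares fit of the current frame on earlier frames,
    # using only pixels currently labelled background (S == 0).
    D = frames_prev.reshape(len(frames_prev), -1).T      # m x (t-1) template matrix
    y = frame.ravel()
    mask = S.ravel() == 0
    x, *_ = np.linalg.lstsq(D[mask], y[mask], rcond=None)
    return (D @ x).reshape(frame.shape)

def segment_foreground(frame, background, thresh=20.0):
    # Step 3 stand-in: threshold the residual instead of the Hopfield/MRF solver.
    return (np.abs(frame - background) > thresh).astype(int)

def associate(track_xy, adsb):
    # Step 4: nearest-neighbour association of a world-coordinate track point
    # with (position, flight number) ADS-B reports.
    pts = np.array([pos for pos, _ in adsb])
    return adsb[int(np.argmin(np.linalg.norm(pts - np.asarray(track_xy), axis=1)))][1]

# toy data: three identical background frames, then a frame with a bright target
bg = np.full((8, 8), 100.0)
frames_prev = np.stack([bg, bg, bg])
frame = bg.copy()
frame[2:4, 2:4] = 255.0

S = np.zeros(frame.shape, dtype=int)     # initial labels: all background
for _ in range(2):                       # alternate steps 2) and 3)
    B = estimate_background(frames_prev, frame, S)
    S = segment_foreground(frame, B)

adsb = [((500.0, 40.0), "CES2101"), ((120.0, 30.0), "CSN3302")]   # invented reports
print(S[2, 2], associate((118.0, 29.0), adsb))
```

The alternation mirrors the patent's feedback loop: the foreground labels from step 3) mask out target pixels so that the next background fit in step 2) is cleaner.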
Further, step 1) specifically includes: let I_t ∈ R^m denote the vector formed by stacking the columns of the t-th frame image of the video sequence, the frame containing m pixels; D = [I_1, ..., I_n] ∈ R^{m×n} is the matrix composed of the frame vectors I, representing the entire video sequence of n frames; B ∈ R^{m×n} is a matrix of the same dimensions as D representing the background of the video frames, composed of n frame vectors of m pixels each; the k-th pixel of the t-th frame is denoted kt. Measuring the background intensity by image grayscale, and given that the illumination conditions are essentially unchanged over the observation period, the background intensity in the sequence of consecutive video frames is considered essentially constant. Hence for a continuous video sequence D, the background images of its frames are taken to be linearly correlated, and a moving target is treated as the pixels that cannot be absorbed into the background matrix B during the linear decomposition of the video sequence, denoted the foreground E. The target in the current frame t is viewed as a linear representation in the subspace spanned by the vectors of the previous t-1 frames; writing the matrix of the first t-1 frames as D_{t-1} = [I_1, ..., I_{t-1}], the image of the t-th frame is:
y_t = B + E = D_{t-1} x + E      (1)
The matrix B = D_{t-1} x formed by the backgrounds of the frames is a low-rank matrix, i.e., the background matrix B satisfies rank(B) ≤ K for a predefined constant K, and the coefficient vector x is sparse. Considering the noise in the scene, and assuming white Gaussian noise with mean 0 and variance σ², the video frame signal of equation (1) is expressed as:
[Equation (2)]
where I is the identity matrix and x_e I represents the white Gaussian noise; under the influence of noise, the grayscale value of pixel kt in the t-th video frame is written y_kt = B_kt + e_kt = ψ_kt x + e_kt. Define the binary label support set S ∈ {0,1}^{m×n} as the image pixel labeling, with elements specified as:
S_kt = 1 if pixel kt belongs to the foreground, and S_kt = 0 if it belongs to the background      (3)
The background modeling problem then reduces to solving the optimization problem shown in equation (4):
[Equation (4)]
When S_kt = 1, i.e., pixel kt belongs to the foreground, the background is covered by the foreground and the grayscale of the video frame signal equals that of the foreground, so detecting the target is in fact estimating the foreground label set. Because neighboring pixel labels in an image interact, the image label field is not piecewise smooth; define E_smooth to record the degree to which the label field departs from piecewise smoothness, and E_data to record the error between the labels and the measured data. The estimation of the foreground label set is thus converted into the energy optimization problem for the label field, namely making:
E(S) = E_smooth(S) + E_data(S)      (5)
attain its minimum value.
在支撑集S的线性矩阵空间中定义矩阵X的正交投影:Define an orthogonal projection of a matrix X in the linear matrix space of the support set S:
Figure PCTCN2020118643-appb-000004
Figure PCTCN2020118643-appb-000004
Γ_S⊥(X) is the complement of Γ_S(X), with Γ_S(X) + Γ_S⊥(X) = X. Detecting the dynamic aircraft target in the video frame y then amounts to minimizing the following energy function:
E(x, S) = (1/2)·||Γ_S⊥(y_t − D_{t-1}x)||₂² + α·||x||₁ + E_smooth(S)      (7)
where the parameter α > 0 is a constant related to the sparsity of the coefficient vector x and controls the complexity of the background.
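As an illustration, the energy of equation (7) can be evaluated directly for a toy frame. The sketch below assumes the commonly used form — a data term restricted to the background support, an L1 sparsity term, and an Ising-style smoothness penalty — with parameter names (`alpha`, `beta`) and the 1-D neighbor structure invented for the example.

```python
import numpy as np

def energy(y, D_prev, x, S, alpha=1.0, beta=0.5):
    """Toy evaluation of an equation-(7)-style energy.

    S is a binary vector over pixels (1 = foreground). The projection
    Gamma_{S-perp} keeps only entries with S == 0, so foreground pixels
    do not contribute to the background-fit residual.
    """
    residual = (y - D_prev @ x) * (1 - S)      # Gamma_{S-perp}(y - D_{t-1} x)
    data = 0.5 * np.sum(residual ** 2)         # background data term
    sparsity = alpha * np.sum(np.abs(x))       # alpha * ||x||_1
    smooth = beta * np.sum(S[:-1] != S[1:])    # 1-D stand-in for E_smooth(S)
    return data + sparsity + smooth
```

For a frame that is exactly a linear combination of previous frames and an all-background labeling, only the sparsity term survives, which matches the decomposition of equation (7).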
Further, step 2) specifically includes: assuming an optimized support-set estimate S has been obtained, equation (7) simplifies to the following optimization problem:
min_x ||x||₁  s.t. ||Γ_S⊥(y_t − Ψx)||₂ ≤ ε      (8)
Using a Gaussian random matrix Φ as the RIP (restricted isometry property) matrix, the observation y is compressively sampled:
z = Φy = ΦΨx = Θx      (9)
The problem of equation (8) then becomes the L1-norm minimization problem of equation (10):
min ||x||₁  s.t. ||Φy − Θx||₂ ≤ ε      (10)
At initialization, a short clip at the start of the video is taken as training frames, for which the background complexity is known; the influence of the parameter α is ignored by setting α = 1, and a greedy algorithm is used to solve (10) to obtain the initial background estimate. On this basis, the background estimate is further refined through the optimized solution of the foreground label set, and in subsequent iterations the current frame y replaces the template in D_{t-1} whose sparse-representation coefficient in x is smallest.
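The greedy solve of equation (10) on the compressed samples of equation (9) can be sketched with orthogonal matching pursuit (OMP), one standard greedy recovery algorithm. The dimensions, sparsity level, and random matrices below are toy assumptions, not values fixed by the text.

```python
import numpy as np

def omp(Theta, z, k):
    """Greedy recovery of a k-sparse x with z ~ Theta x (orthogonal matching pursuit)."""
    n = Theta.shape[1]
    x = np.zeros(n)
    support, r = [], z.astype(float).copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(Theta.T @ r)))   # atom most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Theta[:, support], z, rcond=None)
        r = z - Theta[:, support] @ coef          # update residual
    x[support] = coef
    return x

rng = np.random.default_rng(1)
Psi = rng.standard_normal((100, 12))              # stand-in for the basis (e.g. D_{t-1})
x_true = np.zeros(12)
x_true[[2, 9]] = [4.0, -5.0]                      # 2-sparse background coefficients
y = Psi @ x_true                                  # noiseless frame signal
Phi = rng.standard_normal((60, 100))              # Gaussian sampling matrix
z, Theta = Phi @ y, Phi @ Psi                     # z = Phi y = Theta x, equation (9)
x_hat = omp(Theta, z, k=2)
```

In the noiseless toy setting, the greedy selection identifies the true support and the least-squares step then recovers the coefficients exactly.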
Further, step 3) specifically includes: when the sparse coefficient x is given, the energy function of equation (7) becomes:
E(S) = (1/2)·||Γ_S⊥(y_t − D_{t-1}x)||₂² + E_smooth(S) + C      (11)
where C = α·||x||₁; once x is given, the constant C is determined. To estimate the support S in equation (11), and thereby obtain the foreground image in each frame, an image segmentation method based on Markov random fields (MRFs) is adopted.
Let G = {(i,j) | 0 ≤ i ≤ h, 0 ≤ j ≤ w} denote the set of all pixels of the current h×w frame, and let g = (i,j) ∈ G be the pixel in row i, column j of the two-dimensional image. The neighborhood of this pixel is defined as N_g = {f ∈ G | [dist(f,g)]² ≤ r, f ≠ g}, where dist(f,g) is the Euclidean distance between pixel positions. A subset c of the image G in which every pair of distinct elements is mutually adjacent forms a clique, and C denotes the set of all cliques c.
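The neighborhood system N_g and the pairwise cliques can be enumerated directly. The sketch below hardcodes r = 2, which yields the 8-neighborhood used later in the text; the image sizes are toy values.

```python
def neighbors(g, h, w):
    """8-neighborhood N_g: pixels f != g with squared Euclidean distance <= 2."""
    i, j = g
    return [(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0) and 0 <= i + di < h and 0 <= j + dj < w]

def pairwise_cliques(h, w):
    """All 2-element cliques {f, g}: unordered pairs of mutually adjacent pixels."""
    cliques = set()
    for i in range(h):
        for j in range(w):
            for f in neighbors((i, j), h, w):
                cliques.add(frozenset([(i, j), f]))
    return cliques
```

Interior pixels have 8 neighbors and corner pixels 3; on a 3×3 grid the 8-neighborhood yields 20 distinct pairwise cliques (6 horizontal, 6 vertical, 8 diagonal).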
Every pixel position g of the image corresponds to a random value in the label support set S ∈ {0,1}^{m×n}. Assuming that the local conditional probability of a foreground pixel's label varies only with the states of its neighborhood and is independent of changes outside the neighborhood, the set S of pixel labels, which encodes the positional relations, is a Markov random field with respect to the neighborhood system N. Given the observed image data Y, the label value at each pixel follows from the Bayes rule:
P(S|Y) = P(Y|S)·P(S) / P(Y)      (12)
where P(Y) is the prior distribution of the observed data and, for a given video frame image, can be treated as a constant. P(S) is the prior distribution of the label field; by the Hammersley-Clifford theorem, given the clique potential functions V_c(l_c), the prior distribution of the label field is fitted by the Gibbs form P(S) = Z⁻¹·exp(−U(S)), where l_c denotes the labels of the points of clique c and U(S) = Σ_{c∈C} V_c(l_c) is the sum of the potential-function energies over all cliques. The potential function of the Ising model is defined as:
V_c(s_g^t, s_q^t) = −β if s_g^t = s_q^t, and V_c(s_g^t, s_q^t) = +β if s_g^t ≠ s_q^t      (13)
where s_g^t is the label at image pixel g in frame t, q is a point in the neighborhood of g, and β = 1/(kT), in which k is the Boltzmann constant, so that β is a constant at a fixed temperature T. The prior distribution of the label field is then:

P(S) = Z⁻¹·exp(−Σ_{g∈G} Σ_{q∈N_g} V(s_g^t, s_q^t))      (14)
P(Y|S) is the likelihood. It is usually assumed that the pixels are mutually independent and identically Gaussian distributed, so the likelihood is the product of the per-pixel likelihoods, P(Y|S) = ∏_{g∈G} P(y_g | s_g); taking the logarithm gives:
ln P(Y|S) = Σ_{g∈G} [ −(y_g − μ_{s_g})² / (2σ_{s_g}²) − ln(√(2π)·σ_{s_g}) ]      (15)
where μ_{s_g} and σ_{s_g}² are, respectively, the mean and variance of the Gaussian distribution obeyed by each label. Taking the maximum a posteriori (MAP) criterion as the optimality criterion for image segmentation, the optimal solution of the objective function is the solution that maximizes the posterior probability of equation (12); taking logarithms of both sides gives the following objective function:
Ŝ = argmax_S { ln P(Y|S) + ln P(S) } = argmin_S { Σ_{g∈G} [ (y_g − μ_{s_g})²/(2σ_{s_g}²) + ln(√(2π)·σ_{s_g}) ] + Σ_{c∈C} V_c(l_c) }      (16)
The optimal solution of the objective function of equation (16) is obtained by exploiting the autonomous optimization property of a recurrent neural network.
Further, step 3) also specifically includes: let u_k and v_k be the input and output voltages of the k-th neuron of the recurrent neural network, R_k and C_k its input resistance and input capacitance, I_k the bias current, g_k(u_k) the transfer function of the neuron, and ω_jk the connection resistance, i.e. the connection weight, between neuron j and neuron k. The overall energy function of the network then usually has the form:
E = −(1/2)·Σ_j Σ_k ω_jk·v_j·v_k − Σ_k I_k·v_k + Σ_k (1/R_k)·∫₀^{v_k} g_k⁻¹(v) dv      (17)
Differentiating this energy function with respect to time gives:
dE/dt = −Σ_k C_k·(dv_k/dt)²·d[g_k⁻¹(v_k)]/dv_k      (18)
Since C_k > 0, when the sigmoid function g_k(u) = 1/(1 + e^{−u}) is chosen as the transfer function, g⁻¹ is a monotone non-decreasing function and the term (1/R_k)·∫₀^{v_k} g_k⁻¹(v) dv decays; the energy function of equation (17) therefore exhibits an overall decaying, downward trend over time and simplifies to:
E = −(1/2)·Σ_j Σ_k ω_jk·v_j·v_k − Σ_k I_k·v_k      (19)
When the network reaches stability, this energy function converges to a minimum, so the recurrent neural network performs autonomous iterative optimization of its input signal;
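The descent of the equation-(19) energy can be illustrated with a tiny discrete Hopfield network. The Hebbian weights, ±1 states, and the stored pattern below are an assumed toy setup, not the patent's configuration; asynchronous updates never increase the energy, so the network settles in a local minimum.

```python
import numpy as np

def hopfield_energy(W, I, v):
    """E = -1/2 * sum_jk w_jk v_j v_k - sum_k I_k v_k, as in equation (19)."""
    return -0.5 * v @ W @ v - I @ v

def settle(W, I, v, sweeps=10):
    """Asynchronous threshold updates; each update cannot increase the energy."""
    v = v.copy()
    for _ in range(sweeps):
        for k in range(len(v)):
            v[k] = 1.0 if W[k] @ v + I[k] >= 0 else -1.0
    return v

pattern = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
W = np.outer(pattern, pattern)          # Hebbian rule storing one pattern
np.fill_diagonal(W, 0.0)                # no self-connections
I = np.zeros(6)
start = np.array([1.0, -1.0, -1.0, -1.0, 1.0, -1.0])   # pattern with one flipped bit
end = settle(W, I, start)
```

Starting from a one-bit-corrupted state, the dynamics restore the stored pattern, and the final energy is no larger than the initial one.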
Based on the autonomous optimization property of the recurrent neural network, the image labels s_g^t are taken as the inputs of the network, while the bias current of each neuron g is set from the per-pixel data term derived from the Gaussian likelihood of equation (15) (the exact expression is given as an image in the original). By equation (19), the energy function of the network is:
E = −(1/2)·Σ_g Σ_{q∈N_g} ω_gq·s_g^t·s_q^t − Σ_g I_g·s_g^t      (20)
The image is binarized, so that the pixel value at g is equivalent to the label s_g^t. An 8-neighborhood second-order system model is used to model the image label field, and the Ising function of equation (13) is chosen as the potential function, giving the estimate of the foreground labels:
Ŝ = argmin_S { Σ_{g∈G} Σ_{q∈N_g} V(s_g^t, s_q^t) + Σ_{g∈G} (y_g − μ_{s_g})²/(2σ_{s_g}²) + C₀ }      (21)
where C₀ is a constant term. Comparing equations (20) and (21) shows that estimating the foreground labels can be regarded as autonomously optimizing, i.e. solving for, the minimum of the recurrent-neural-network energy function of equation (20).
Further, step 4) specifically includes: from the background and foreground estimates above, tracking and monitoring of the aircraft target in the video image coordinate system is obtained; a mapping from image pixel coordinates to world coordinates is established, and the relevant aircraft tag information is found in the radar tracking results;
Assume that the coordinates of the target point are (u,v)^T in the pixel-plane coordinate system and (x,y,z)^T in the world coordinate system. Using the pinhole perspective model, the conversion from the target point's pixel-plane coordinates to world coordinates is:
z_c·[u, v, 1]^T = [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]]·[R | T]·[x, y, z, 1]^T      (22)
where f_x and f_y are parameters representing the focal length; (u_0, v_0)^T is the position of the principal point relative to the image plane (projection plane), i.e. the intersection of the principal optical axis with the image plane; z_c is the offset of the pixel-plane origin relative to the origin of the camera coordinate system and is a constant; R is the rotation matrix of the camera and T the translation matrix. Denote:
p_i = z_c·[u, v, 1]^T,  p_w = [x, y, z, 1]^T,  K = [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]],  C = [R | T]
Equation (22) then simplifies to:
p_i = K·C·p_w          (23)
Solving with the Markov random field and the sparse background yields the foreground targets of consecutive video frames. Processing in batches, let P_i = [p_i1, p_i2, …, p_it] be the matrix formed by the target pixel-coordinate vectors of t consecutive frames, with corresponding world-coordinate matrix P_w = [p_w1, p_w2, …, p_wt]; then (23) becomes:
P_i = K·C·P_w          (24)
From equation (24), the coordinates of the foreground target tracking results in the world coordinate system are obtained. The nearest-neighbor method is then used to establish the correspondence between these video tracking coordinates and the ADS-B (Automatic Dependent Surveillance–Broadcast) data, realizing the data association, so that the flight-number tag information from ADS-B is attached to the video and automatic tagging is achieved.
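The batch projection of equation (24) and the nearest-neighbor tag association can be sketched as follows. The intrinsic parameters, the identity camera pose, and the ADS-B table are invented toy values, not calibrated data or live reports.

```python
import numpy as np

# Assumed toy intrinsics and pose (the real system uses calibrated values).
fx, fy, u0, v0 = 800.0, 800.0, 320.0, 240.0
K = np.array([[fx, 0.0, u0], [0.0, fy, v0], [0.0, 0.0, 1.0]])
C = np.hstack([np.eye(3), np.zeros((3, 1))])      # [R | T] with identity pose

def project_batch(P_w):
    """P_i = K C P_w for homogeneous world points (4 x t) -> pixel coords (2 x t)."""
    P = K @ C @ P_w
    return P[:2] / P[2]                           # divide by z_c to get (u, v)

def associate(track_xy, adsb):
    """Attach the flight number of the nearest ADS-B report to a track point."""
    names = list(adsb)
    pts = np.array([adsb[n] for n in names])
    d = np.linalg.norm(pts - track_xy, axis=1)
    return names[int(np.argmin(d))]

P_w = np.array([[0.0, 10.0],
                [0.0,  5.0],
                [50.0, 50.0],
                [1.0,  1.0]])                     # two homogeneous world points
P_i = project_batch(P_w)
tag = associate(np.array([10.2, 5.1]),
                {"CES2345": (10.0, 5.0), "CSN880": (40.0, -3.0)})
```

The whole batch of t track points is projected with one matrix product, matching the batch formulation of equation (24), after which each track is tagged by its nearest ADS-B report.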
Beneficial effects of the present invention:
1. The background is modeled as a sparse representation over a sequence of consecutive video frames; solving the sparse-signal recovery problem with a greedy algorithm reduces the complexity of the background solution.
2. The foreground solution is defined as an image segmentation problem based on a Markov random field. On the basis of the obtained background layer, the autonomous optimization property of the Hopfield network is used to establish the correspondence between the network input and the Markov-random-field energy function of the foreground model, automatically optimizing the image label set to obtain a smooth foreground target. The foreground target can be fed back into the background solution process, and the number of iterations controls the computational complexity of the overall foreground-background estimation.
3. After moving targets are automatically captured in consecutive video frames, the correspondence between image coordinates and ADS-B data is established through coordinate conversion, replacing the per-frame table-lookup mapping with batch processing through the transformation matrix: the target image coordinates of consecutive frames are converted to world coordinates in one batch, and the database is then queried by the nearest-neighbor rule to associate the ADS-B data, which to a certain extent alleviates the tag delay and target loss caused by processing-performance limitations.
Description of the drawings
FIG. 1 is a schematic diagram of the method of the present invention.
FIG. 2 is a diagram of the recurrent neural network neuron model.
Detailed description
Terminology:
Sparse: if the linear representation of a real-valued, finite-length one-dimensional discrete signal y ∈ R^N contains only K basis elements, the signal y is said to be K-sparse, and K is called the sparsity of the signal y.
Compressed sampling: also called compressive sensing or sparse sampling. By exploiting the sparsity of the signal, discrete samples of the signal are acquired by random sampling at a rate far below the Nyquist rate, and the signal is then perfectly reconstructed by a nonlinear reconstruction algorithm.
Image segmentation: the technique and process of dividing an image into a number of specific regions with distinctive properties and extracting objects of interest; a computer-vision task of labeling designated regions according to image content.
Markov random field: a random field possessing the Markov property. When every site is randomly assigned a value of the phase space according to some distribution, the whole is called a random field; the Markov property means that when a sequence of random variables is arranged in temporal order, the distribution at time N+1 is independent of the values taken by the random variables before time N.
To facilitate understanding by those skilled in the art, the present invention is further described below with reference to embodiments and the accompanying drawings; the content of the embodiments is not a limitation of the present invention.
Referring to FIG. 1, a Markov-random-field-based method of the present invention for tagging remote tower video targets comprises the following steps:
1) Model building: assume that the background images of consecutive video frames are linearly correlated, while moving targets are regarded as pixels that cannot be absorbed into the background matrix during the linear decomposition of the video sequence; by solving for the background estimate and the foreground label set, the pixels of each video frame image are classified and labeled as background or foreground;
Step 1) specifically includes: let I_t ∈ R^m denote the vector formed by stacking the columns of the image of the t-th frame of the video sequence, the frame containing m pixels; D = [I_1, …, I_n] ∈ R^{m×n} is the matrix composed of the frame vectors I, representing the entire video sequence of n frames; B ∈ R^{m×n} is a matrix of the same dimensions as D, representing the background of the video frames and consisting of n frame vectors of m pixels each; the k-th pixel of frame t is written kt. The intensity of the background is measured by the image gray level; since the illumination conditions are essentially unchanged over the observation period, the background intensity over a sequence of consecutive video frames is considered essentially constant. For a continuous video sequence D, the background images of its component frames are therefore regarded as linearly correlated, while moving targets are regarded as the pixels that cannot be absorbed into the background matrix B during the linear decomposition of the video sequence, denoted foreground E. The target in the current frame t is viewed as a linear representation in the subspace spanned by the vectors of the preceding t−1 frames; writing the matrix of the first t−1 frames as D_{t-1} = [I_1, …, I_{t-1}], the image of frame t is written:
y_t = B + E = D_{t-1}x + E          (1)
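The construction of D and the linear background representation of equation (1) can be sketched on a toy sequence. The frame size, the static background, and the single moving bright pixel standing in for an aircraft are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, n = 8, 8, 5                               # tiny frames, n frames
background = rng.uniform(50, 60, size=(h, w))   # static background image
frames = []
for t in range(n):
    f = background.copy()
    f[t % h, t % w] += 100.0                    # moving bright "target" pixel
    frames.append(f.reshape(-1))                # column-stack -> vector I_t in R^m
D = np.stack(frames, axis=1)                    # D = [I_1, ..., I_n], shape (m, n)

# Current frame y_t = D_{t-1} x + E: least-squares coefficients over the
# previous t-1 frames give the background part; the residual is foreground E.
D_prev = D[:, :-1]                              # D_{t-1}
y_t = D[:, -1]
x, *_ = np.linalg.lstsq(D_prev, y_t, rcond=None)
E = y_t - D_prev @ x                            # residual = candidate foreground
```

The residual E peaks at the target pixel of the current frame (index 4·8 + 4 = 36 here), while the shared background is absorbed by the linear representation.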
2) A greedy algorithm is used to solve the sparse representation of the consecutive-video-frame sequence, obtaining a preliminary estimate of the background;
3) A recurrent (Hopfield) neural network is used to solve the image segmentation problem, obtaining an estimate of the foreground label set; using this foreground label set, the preliminary background estimate obtained in step 2) is corrected, giving the foreground target tracking result and the background estimate;
所述步骤3)具体还包括:参照图2所示,令u k,v k分别为递归神经网络中第k个神经元的输入和输出电压,R k,C k分别为其输入电阻和输入电容,I k为偏置电流,g k(u k)为神经元的传递函数,ω jk为神经元j与神经元k之间的连接电阻,即连接权,则网络的总体能量函数通常具有如下形式: The step 3) specifically further includes: referring to FIG. 2 , let u k , v k be the input and output voltages of the kth neuron in the recurrent neural network, respectively, and R k , C k are their input resistance and input voltage respectively. Capacitance, I k is the bias current, g k (u k ) is the transfer function of the neuron, ω jk is the connection resistance between the neuron j and the neuron k, that is, the connection weight, the overall energy function of the network usually has in the form of:
Figure PCTCN2020118643-appb-000057
Figure PCTCN2020118643-appb-000057
将上述能量函数对时间求导数,有:Taking the derivative of the above energy function with respect to time, we have:
Figure PCTCN2020118643-appb-000058
Figure PCTCN2020118643-appb-000058
由于C k>0,选取Sigmoid函数
Figure PCTCN2020118643-appb-000059
作为传递函数时,g -1是单调非减函数,且
Figure PCTCN2020118643-appb-000060
衰减,此时式(17)所示能量函数随时间推移整体呈现下降衰减趋势,并简化为:
Since C k > 0, the sigmoid function is selected
Figure PCTCN2020118643-appb-000059
As a transfer function, g -1 is a monotone non-decreasing function, and
Figure PCTCN2020118643-appb-000060
At this time, the energy function shown in Equation (17) shows a downward decay trend as a whole over time, and is simplified as:
Figure PCTCN2020118643-appb-000061
Figure PCTCN2020118643-appb-000061
当网络达到稳定时,该能量函数收敛于极小值,故递归神经网络实现对输入信号的自主迭代优化;When the network is stable, the energy function converges to the minimum value, so the recurrent neural network realizes the autonomous iterative optimization of the input signal;
根据递归神经网络的自主优化特性,将图像标号
Figure PCTCN2020118643-appb-000062
作为递归神经网络的输入,同时设置网络的偏置电流
Figure PCTCN2020118643-appb-000063
依据式(19),网络的能量函数为:
According to the autonomous optimization characteristics of the recurrent neural network, the images are labeled
Figure PCTCN2020118643-appb-000062
As an input to a recurrent neural network, while setting the network's bias current
Figure PCTCN2020118643-appb-000063
According to equation (19), the energy function of the network is:
Figure PCTCN2020118643-appb-000064
Figure PCTCN2020118643-appb-000064
对图像进行二值化处理,此时图像上的像素值
Figure PCTCN2020118643-appb-000065
等价于标号
Figure PCTCN2020118643-appb-000066
采用8邻域二阶系统模型建模图像标号场,选取式(13)所示的Ising函数作为势函数,得到对前景标号的估计:
Binarize the image, at this time the pixel value on the image
Figure PCTCN2020118643-appb-000065
Equivalent to label
Figure PCTCN2020118643-appb-000066
The 8-neighborhood second-order system model is used to model the image label field, and the Ising function shown in equation (13) is selected as the potential function to obtain the estimation of the foreground label:
Figure PCTCN2020118643-appb-000067
Figure PCTCN2020118643-appb-000067
where
Figure PCTCN2020118643-appb-000068
is a constant term. Comparing Equations (20) and (21) shows that estimating the foreground labels amounts to the autonomous optimized solution of the minimum of the recurrent-network energy function of Equation (20).
4) A pinhole perspective model is used to establish the transformation matrix from the video image coordinate system to the world coordinate system, and the coordinates of the foreground target tracking results in the world coordinate system are solved; the nearest-neighbor method is used to establish the correspondence between these world-coordinate target positions and Automatic Dependent Surveillance-Broadcast (ADS-B) data, thereby associating the ADS-B tag information with the video and realizing automatic tagging;
From the background and foreground estimates above, tracking of aircraft targets in the video image coordinate system is obtained; a mapping from image pixel coordinates to world coordinates is established, and the corresponding aircraft tag information is found in the radar tracking results;
Let the coordinates of a target point be (u, v)^T in the pixel plane coordinate system and (x, y, z)^T in the world coordinate system. Using the pinhole perspective model, the transformation from the target point's pixel plane coordinates to world coordinates is:
Figure PCTCN2020118643-appb-000069
where f_x and f_y are parameters representing the focal length; (u_0, v_0)^T is the position of the principal point relative to the image plane (projection plane), i.e., the intersection of the principal optical axis with the image plane; z_c, a constant, is the offset of the pixel-plane origin relative to the origin of the camera coordinate system; R is the camera rotation matrix and T the translation matrix. Writing:
Figure PCTCN2020118643-appb-000070
Equation (22) then simplifies to:
p_i = KCp_w           (23)
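The projection relation p_i = KCp_w can be sketched in a few lines. All camera parameters below (focal lengths, principal point, pose) are hypothetical values for illustration, not calibration results from the method.

```python
import numpy as np

# Sketch of the pinhole relation p_i = K C p_w of Equation (23).
# All numeric values (focal lengths, principal point, pose) are hypothetical.

def project(p_w, K, R, T):
    """Map a world point (x, y, z) to pixel coordinates (u, v)."""
    p_c = R @ p_w + T                 # world -> camera coordinates
    uvw = K @ p_c                     # camera -> homogeneous pixel coords
    return uvw[:2] / uvw[2]           # divide by z_c to get (u, v)

fx, fy = 1000.0, 1000.0               # focal-length parameters f_x, f_y
u0, v0 = 640.0, 360.0                 # principal point (u_0, v_0)
K = np.array([[fx, 0.0, u0],
              [0.0, fy, v0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                          # camera aligned with world axes
T = np.array([0.0, 0.0, 0.0])

# A world point 100 m ahead on the optical axis projects to the principal point.
uv = project(np.array([0.0, 0.0, 100.0]), K, R, T)
# uv -> [640., 360.]
```

In practice K, R and T come from camera calibration; here they are chosen so the expected projection can be verified by inspection.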
Solving with the Markov random field and the sparse background yields the foreground targets in consecutive video frames. Processing in batches, let P_i = [p_i1, p_i2, …, p_it] be the matrix formed by the target pixel coordinate vectors over t consecutive frames, and let P_w = [p_w1, p_w2, …, p_wt] be the corresponding matrix in the world coordinate system; Equation (23) then becomes:
P_i = KCP_w         (24)
Equation (24) gives the coordinates of the foreground target tracking results in the world coordinate system. The nearest-neighbor method is used to establish the correspondence between these video tracking coordinates and the ADS-B data, achieving data association; the flight-number tag information from ADS-B is thereby linked to the video, realizing automatic tagging.
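The nearest-neighbor association step can be sketched as follows. The flight numbers, coordinates, and the gating distance are made-up illustrative values; the patent itself does not specify a gate, so that parameter is an added assumption.

```python
import numpy as np

# Sketch of the nearest-neighbor association: each video-derived world
# coordinate is matched to the closest ADS-B report, and that report's
# flight-number tag is attached to the video track. All flight numbers,
# coordinates, and the gating radius below are hypothetical.

def associate(track_xy, adsb_xy, adsb_tags, gate=500.0):
    """Return the tag of the nearest ADS-B report, or None outside the gate."""
    d = np.linalg.norm(adsb_xy - track_xy, axis=1)   # Euclidean distances
    k = int(np.argmin(d))                            # nearest report
    return adsb_tags[k] if d[k] <= gate else None

adsb_xy = np.array([[1200.0, 300.0],                 # ADS-B positions (m)
                    [4000.0, 2500.0],
                    [800.0, 2900.0]])
adsb_tags = ["CES2101", "CSN3544", "CCA1831"]        # made-up flight numbers

tag = associate(np.array([1250.0, 280.0]), adsb_xy, adsb_tags)
# tag -> "CES2101"; this tag is then drawn next to the tracked aircraft.
```

The gate rejects spurious matches when no ADS-B report is plausibly close to the video track.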
The present invention admits many specific applications, and the above is only a preferred embodiment. It should be noted that a person skilled in the art may make several improvements without departing from the principle of the present invention, and such improvements shall also be regarded as falling within the protection scope of the present invention.

Claims (6)

  1. A Markov random field-based method for labeling remote tower video targets, characterized in that the steps are as follows:
    1) Model building: the background images in consecutive video frames are assumed to be linearly correlated, and moving targets are regarded as pixels that cannot be absorbed into the background matrix during the linear decomposition of the video sequence; by solving for the background estimate and the foreground label set, the pixels of the video frame images are classified and labeled as background or foreground;
    2) A greedy algorithm is used to solve the sparse representation of the consecutive video frame sequence, obtaining a preliminary estimate of the background;
    3) A recurrent neural network is used to solve the image segmentation problem, obtaining an estimate of the foreground label set; this foreground label set is used to correct the preliminary background estimate obtained in step 2), yielding the foreground target tracking results and the background estimate;
    4) A pinhole perspective model is used to establish the transformation matrix from the video image coordinate system to the world coordinate system, and the world coordinates of the foreground target tracking results in the video frames are solved; the nearest-neighbor method is used to establish the correspondence between these world-coordinate target positions and Automatic Dependent Surveillance-Broadcast (ADS-B) data, thereby associating the ADS-B tag information with the video and realizing automatic tagging.
  2. The Markov random field-based method for labeling remote tower video targets according to claim 1, characterized in that step 1) specifically comprises: let I_t ∈ R^m denote the vector formed by stacking the columns of the t-th frame image of the video sequence, the frame containing m pixels; D = [I_1, …, I_n] ∈ R^(m×n) is the matrix formed by the frame vectors I, representing the entire video sequence of n frames; B ∈ R^(m×n) is a matrix of the same dimensions as D representing the background in the video frames, consisting of n frame vectors of m pixels each; the k-th pixel of the t-th frame is denoted kt. Image grayscale is used to measure the intensity of the background; provided the illumination conditions remain essentially unchanged over the observation period, the background intensity of a consecutive video frame sequence is considered essentially constant. Hence, for a consecutive video sequence D, the background images of its constituent frames are considered linearly correlated, while moving targets are regarded as pixels that cannot be absorbed into the background matrix B during the linear decomposition of the video sequence, denoted the foreground E. The target in the current frame t is viewed as a linear representation in the subspace spanned by the preceding t-1 frame vectors; denoting the matrix of the first t-1 frames by D_(t-1) = [I_1, …, I_(t-1)], the image of the t-th frame is written:
    y_t = B + E = D_(t-1)x + E    (1)
    The matrix B = D_(t-1)x composed of the backgrounds of the frames is a low-rank matrix, i.e., the background matrix B satisfies rank(B) ≤ K, where K is a predefined constant, and the coefficient vector x is sparse. Taking into account the noise in the scene and assuming Gaussian white noise with mean 0 and variance σ^2, the video frame signal of Equation (1) is expressed as:
    Figure PCTCN2020118643-appb-100001
    where I is the identity matrix and x_e I represents the Gaussian white noise. Under the influence of noise, the grayscale value of a pixel of the t-th video frame is written y_kt = B_kt + e_kt = ψ_kt x + e_kt. The binary label support set S ∈ {0,1}^(m×n) is defined as the image pixel labels, its elements specified as:
    Figure PCTCN2020118643-appb-100002
    The background modeling problem then reduces to solving the optimization problem of Equation (4):
    Figure PCTCN2020118643-appb-100003
    When S_kt = 1, i.e., the pixel kt belongs to the foreground, the background is covered by the foreground and the grayscale of the video frame signal equals that of the foreground, so detecting the target is in fact estimating the foreground label set. The interactions between the labels of neighboring pixels cause the image label field to depart from piecewise smoothness; E_smooth is defined to record the degree of this departure, and E_data records the error between the labels and the measured data. Estimating the foreground label set is thus converted into the label-field energy optimization problem of making:
    E(S) = E_smooth(S) + E_data(S)    (5)
    attain its minimum;
    The orthogonal projection of a matrix X onto the linear matrix space of the support set S is defined as:
    Figure PCTCN2020118643-appb-100004
    Figure PCTCN2020118643-appb-100005
    is the complement of Γ_S(X), so that
    Figure PCTCN2020118643-appb-100006
    The detection of the dynamic aircraft target y in the video frame is then the minimization of the following energy function:
    Figure PCTCN2020118643-appb-100007
    where the parameter α > 0 is a constant related to the sparsity of the coefficient vector x and controls the complexity of the background.
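As an illustrative aside, the decomposition underlying this claim, background as a linear combination of earlier frames plus a sparse foreground residual, can be sketched on toy data. Plain least squares stands in for the sparse solver of step 2), and all sizes, noise levels, and the threshold are arbitrary assumptions.

```python
import numpy as np

# Toy sketch of the model of step 1): the background of the current frame is
# a linear combination D_(t-1) x of earlier frames, and pixels with a large
# residual form the foreground support S. Plain least squares stands in for
# the sparse solver; every numeric value here is an illustrative assumption.

rng = np.random.default_rng(2)
m, n = 64, 5                            # m pixels per frame, n past frames
base = rng.uniform(0.2, 0.4, size=m)    # static background pattern
D = np.stack([base + 0.01 * rng.standard_normal(m) for _ in range(n)], axis=1)

y = base + 0.01 * rng.standard_normal(m)
y[10:14] = 0.95                         # a small bright "target" in frame t

x, *_ = np.linalg.lstsq(D, y, rcond=None)   # y ≈ D x   (background part)
E = y - D @ x                               # residual = candidate foreground
S = (np.abs(E) > 0.2).astype(int)           # binary label support set
# S flags the injected target pixels 10..13 and little else.
```

The MRF smoothing of step 3) would then refine this raw support set; here the threshold alone already isolates the synthetic target.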
  3. The Markov random field-based method for labeling remote tower video targets according to claim 2, characterized in that step 2) specifically comprises: assuming the optimized support set estimate S has been obtained, Equation (7) simplifies to the following optimization problem:
    Figure PCTCN2020118643-appb-100008
    A Gaussian random matrix Φ is used as the RIP matrix to compressively sample the observation y:
    z=Φy=ΦΨx=Θx    (9)z=Φy=ΦΨx=Θx (9)
    The problem of Equation (8) is then converted into the L1-norm minimization problem of Equation (10):
    min ||x||_1  s.t.  ||Φy - Θx||_2 ≤ ε    (10)
    At initialization, a short segment at the start of the video is taken as training frames with known background complexity; the influence of the parameter α is ignored by setting α = 1, and the greedy algorithm is used to solve Equation (10), obtaining the initial background estimate. On this basis, the background estimate is further refined through the optimized solution of the foreground label set, and in subsequent iterations the current frame y replaces the template in D_(t-1) whose corresponding sparse representation coefficient x is smallest.
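As an illustrative aside, the greedy recovery step can be sketched with orthogonal matching pursuit (OMP); the claim does not name a specific greedy algorithm, so OMP is an assumption here, and the sensing-matrix sizes are arbitrary.

```python
import numpy as np

# Sketch of a greedy solver for the sparse recovery step: orthogonal
# matching pursuit (OMP) recovers a k-sparse x from compressed samples
# z = Theta x, as in Equation (10). OMP itself, the sizes, and the sensing
# matrix are illustrative assumptions, not the patent's specified solver.

def omp(Theta, z, k):
    """Greedy recovery of a k-sparse x with z ≈ Theta x."""
    n = Theta.shape[1]
    support, residual = [], z.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(Theta.T @ residual)))   # best-matching atom
        if j not in support:
            support.append(j)
        xs, *_ = np.linalg.lstsq(Theta[:, support], z, rcond=None)
        residual = z - Theta[:, support] @ xs            # re-fit on support
    x = np.zeros(n)
    x[support] = xs
    return x

rng = np.random.default_rng(3)
n, m, k = 50, 40, 3
Theta = rng.standard_normal((m, n)) / np.sqrt(m)    # Gaussian sensing matrix
x_true = np.zeros(n)
x_true[[5, 17, 31]] = [1.0, -2.0, 1.5]
z = Theta @ x_true
x_hat = omp(Theta, z, k)
# For a well-conditioned Gaussian Theta, x_hat matches x_true's support.
```

Each iteration picks the column most correlated with the residual, then re-fits on the selected support, which is what makes the method greedy.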
  4. The Markov random field-based method for labeling remote tower video targets according to claim 3, characterized in that step 3) specifically comprises: when the sparse coefficient x is given, the energy function of Equation (7) becomes:
    Figure PCTCN2020118643-appb-100009
    where
    Figure PCTCN2020118643-appb-100010
    Given x, the constant C is also determined. To estimate the support S of Equation (11) and thereby obtain the foreground image in each frame, an image segmentation method based on the Markov random field is adopted;
    Let G = {(i, j) | 0 ≤ i ≤ h, 0 ≤ j ≤ w} denote the set of all pixels of the h×w image of the current frame, and g = (i, j) ∈ G the pixel in row i, column j of the two-dimensional image. The neighborhood of this pixel is defined as N_g = {f ∈ G | [dist(f, g)]^2 ≤ r, f ≠ g}, where dist(f, g) is the Euclidean distance between pixel positions. A subset c of the image G in which every pair of distinct elements is mutually adjacent constitutes a clique, and C is the set of all cliques c;
    Each pixel position g on the image corresponds to a random value in the label support set S ∈ {0,1}^(m×n). Assuming that the local conditional probability of a foreground pixel's label varies only with the state of its neighborhood, independently of changes outside it, the set S of pixel label values incorporating the positional relationships is a Markov random field with respect to the neighborhood system N. Given the observed image data Y, the label value of each pixel is obtained from the Bayes rule:
    Figure PCTCN2020118643-appb-100011
    where P(Y) is the prior distribution of the observed data and, for a given video frame image, can be regarded as a constant; P(S) is the prior distribution of the label field. By the Hammersley-Clifford theorem, given the clique potential function V_c(l_c),
    Figure PCTCN2020118643-appb-100012
    is used to fit the prior distribution of the label field, where l_c denotes the labels of the points of clique c and
    Figure PCTCN2020118643-appb-100013
    is the sum of the potential-function energies over all cliques. The potential function of the Ising model is defined as:
    Figure PCTCN2020118643-appb-100014
    where
    Figure PCTCN2020118643-appb-100015
    is the label at image pixel g in the t-th frame, q is a point in the neighborhood of g,
    Figure PCTCN2020118643-appb-100016
    k is the Boltzmann constant, and for a fixed temperature T, β is a constant. The prior distribution of the label field is then:
    Figure PCTCN2020118643-appb-100017
    P(Y|S) is the likelihood; the pixels are usually assumed to be independent and identically Gaussian distributed, so the likelihood is the product of the per-pixel likelihoods: P(Y|S) = Π_(g∈G) P(y_g|s_g). Taking its logarithm gives:
    Figure PCTCN2020118643-appb-100018
    where
    Figure PCTCN2020118643-appb-100019
    and
    Figure PCTCN2020118643-appb-100020
    are respectively the mean and variance of the Gaussian distribution obeyed by each label. Taking the maximum a posteriori probability criterion as the optimality criterion for image segmentation, the optimal solution of the objective function is the one that maximizes the posterior probability of Equation (12); taking logarithms on both sides gives the objective function:
    Figure PCTCN2020118643-appb-100021
    The optimal solution of the objective function of Equation (16) is obtained by exploiting the autonomous optimization property of the recurrent neural network.
  5. The Markov random field-based method for labeling remote tower video targets according to claim 4, characterized in that step 3) further comprises: let u_k and v_k be the input and output voltages of the k-th neuron of the recurrent neural network, R_k and C_k its input resistance and input capacitance, I_k the bias current, g_k(u_k) the transfer function of the neuron, and ω_jk the connection resistance, i.e., the connection weight, between neuron j and neuron k; the overall energy function of the network then has the form:
    Figure PCTCN2020118643-appb-100022
    Differentiating this energy function with respect to time gives:
    Figure PCTCN2020118643-appb-100023
    Since C_k > 0, when the sigmoid function
    Figure PCTCN2020118643-appb-100024
    is chosen as the transfer function, g^-1 is a monotone non-decreasing function and
    Figure PCTCN2020118643-appb-100025
    decays; the energy function of Equation (17) therefore exhibits an overall decreasing trend over time and simplifies to:
    Figure PCTCN2020118643-appb-100026
    When the network reaches stability, this energy function converges to a minimum, so the recurrent neural network performs autonomous iterative optimization of the input signal;
    Exploiting this autonomous optimization property of the recurrent neural network, the image labels
    Figure PCTCN2020118643-appb-100027
    are taken as the input of the network, and the network's bias currents are set to
    Figure PCTCN2020118643-appb-100028
    From Equation (19), the energy function of the network is:
    Figure PCTCN2020118643-appb-100029
    The image is binarized, so that the pixel values on the image
    Figure PCTCN2020118643-appb-100030
    are equivalent to the labels
    Figure PCTCN2020118643-appb-100031
    An 8-neighborhood second-order system model is used to model the image label field, and the Ising function of Equation (13) is taken as the potential function, yielding the estimate of the foreground labels:
    Figure PCTCN2020118643-appb-100032
    where
    Figure PCTCN2020118643-appb-100033
    is a constant term. Comparing Equations (20) and (21) shows that estimating the foreground labels amounts to the autonomous optimized solution of the minimum of the recurrent-network energy function of Equation (20).
  6. The Markov random field-based method for labeling remote tower video targets according to claim 5, characterized in that step 4) specifically comprises: from the background and foreground estimates above, tracking of aircraft targets in the video image coordinate system is obtained; a mapping from image pixel coordinates to world coordinates is established, and the corresponding aircraft tag information is found in the radar tracking results;
    Let the coordinates of a target point be (u, v)^T in the pixel plane coordinate system and (x, y, z)^T in the world coordinate system. Using the pinhole perspective model, the transformation from the target point's pixel plane coordinates to world coordinates is:
    Figure PCTCN2020118643-appb-100034
    where f_x and f_y are parameters representing the focal length; (u_0, v_0)^T is the position of the principal point relative to the image plane, i.e., the intersection of the principal optical axis with the image plane; z_c, a constant, is the offset of the pixel-plane origin relative to the origin of the camera coordinate system; R is the camera rotation matrix and T the translation matrix. Writing:
    Figure PCTCN2020118643-appb-100035
    Equation (22) then simplifies to:
    p_i = KCp_w    (23)
    Solving with the Markov random field and the sparse background yields the foreground targets in consecutive video frames. Processing in batches, let P_i = [p_i1, p_i2, …, p_it] be the matrix formed by the target pixel coordinate vectors over t consecutive frames, and let P_w = [p_w1, p_w2, …, p_wt] be the corresponding matrix in the world coordinate system; Equation (23) then becomes:
    P_i = KCP_w    (24)
    Equation (24) gives the coordinates of the foreground target tracking results in the world coordinate system. The nearest-neighbor method is used to establish the correspondence between these video tracking coordinates and the ADS-B data, achieving data association; the flight-number tag information from ADS-B is thereby linked to the video, realizing automatic tagging.
PCT/CN2020/118643 2020-07-03 2020-09-29 Markov random field-based method for labeling remote control tower video target WO2022000838A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010635670.6A CN111814654B (en) 2020-07-03 2020-07-03 Markov random field-based remote tower video target tagging method
CN202010635670.6 2020-07-03

Publications (1)

Publication Number Publication Date
WO2022000838A1 true WO2022000838A1 (en) 2022-01-06

Family

ID=72855204

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118643 WO2022000838A1 (en) 2020-07-03 2020-09-29 Markov random field-based method for labeling remote control tower video target

Country Status (2)

Country Link
CN (1) CN111814654B (en)
WO (1) WO2022000838A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494444A (en) * 2022-04-15 2022-05-13 北京智行者科技有限公司 Obstacle dynamic and static state estimation method, electronic device and storage medium
CN114520920A (en) * 2022-04-15 2022-05-20 北京凯利时科技有限公司 Multi-machine-position video synchronization method and system and computer program product
CN114972440A (en) * 2022-06-21 2022-08-30 江西省国土空间调查规划研究院 Chain tracking method for pattern spot object of ES database for homeland survey
CN114998792A (en) * 2022-05-30 2022-09-02 中用科技有限公司 Safety monitoring method with AI network camera
CN115002409A (en) * 2022-05-20 2022-09-02 天津大学 Dynamic task scheduling method for video detection and tracking
CN115100266A (en) * 2022-08-24 2022-09-23 珠海翔翼航空技术有限公司 Digital airport model construction method, system and equipment based on neural network
CN115412416A (en) * 2022-07-05 2022-11-29 重庆邮电大学 Low-complexity OTFS signal detection method for high-speed mobile scene
CN115457351A (en) * 2022-07-22 2022-12-09 中国人民解放军战略支援部队航天工程大学 Multi-source information fusion uncertainty judgment method
CN115830516A (en) * 2023-02-13 2023-03-21 新乡职业技术学院 Computer neural network image processing method for battery detonation detection
CN116016931A (en) * 2023-03-24 2023-04-25 深圳市聚力得电子股份有限公司 Video encoding and decoding method of vehicle-mounted display
CN116095347A (en) * 2023-03-09 2023-05-09 中节能(临沂)环保能源有限公司 Construction engineering safety construction method and system based on video analysis
CN114998792B (en) * 2022-05-30 2024-05-14 中用科技有限公司 Security monitoring method with AI network camera

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819945B (en) * 2021-01-26 2022-10-04 北京航空航天大学 Fluid reconstruction method based on sparse viewpoint video
CN115019276B (en) * 2022-06-30 2023-10-27 南京慧尔视智能科技有限公司 Target detection method, system and related equipment
CN116468751A (en) * 2023-04-25 2023-07-21 北京拙河科技有限公司 High-speed dynamic image detection method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544852A (en) * 2013-10-18 2014-01-29 中国民用航空总局第二研究所 Method for automatically hanging labels on air planes in airport scene monitoring video
CN108986045A (en) * 2018-06-30 2018-12-11 长春理工大学 A kind of error correction tracking based on rarefaction representation
US10528818B1 (en) * 2013-03-14 2020-01-07 Hrl Laboratories, Llc Video scene analysis system for situational awareness

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103383451B (en) * 2013-06-07 2015-05-06 杭州电子科技大学 Method for optimizing radar weak target detection based on constant side length gradient weighting graph cut
CN103903015B (en) * 2014-03-20 2017-02-22 南京信息工程大学 Cell mitosis detection method
CN108133028B (en) * 2017-12-28 2020-08-04 北京天睿空间科技股份有限公司 Aircraft listing method based on combination of video analysis and positioning information
CN109389605A (en) * 2018-09-30 2019-02-26 宁波工程学院 Dividing method is cooperateed with based on prospect background estimation and the associated image of stepped zone
CN110287819B (en) * 2019-06-05 2023-06-02 大连大学 Moving target detection method based on low rank and sparse decomposition under dynamic background


Also Published As

Publication number Publication date
CN111814654A (en) 2020-10-23
CN111814654B (en) 2023-01-24

Similar Documents

Publication Publication Date Title
WO2022000838A1 (en) Markov random field-based method for labeling remote control tower video target
Sun et al. Research on the hand gesture recognition based on deep learning
CN106897670B (en) Express violence sorting identification method based on computer vision
Kuznetsova et al. Expanding object detector's horizon: Incremental learning framework for object detection in videos
CN110555420B (en) Fusion model network and method based on pedestrian regional feature extraction and re-identification
Chen et al. Corse-to-fine road extraction based on local Dirichlet mixture models and multiscale-high-order deep learning
CN112818905B (en) Finite pixel vehicle target detection method based on attention and spatio-temporal information
CN113158943A (en) Cross-domain infrared target detection method
CN108038515A (en) Unsupervised multi-target detection tracking and its storage device and camera device
CN110458022B (en) Autonomous learning target detection method based on domain adaptation
WO2022218396A1 (en) Image processing method and apparatus, and computer readable storage medium
CN113139896A (en) Target detection system and method based on super-resolution reconstruction
Ren et al. Research on infrared small target segmentation algorithm based on improved mask R-CNN
Li et al. IIE-SegNet: Deep semantic segmentation network with enhanced boundary based on image information entropy
Esfahani et al. DeepDSAIR: Deep 6-DOF camera relocalization using deblurred semantic-aware image representation for large-scale outdoor environments
Wan et al. Automatic moving object segmentation for freely moving cameras
Liu et al. Pseudo-label growth dictionary pair learning for crowd counting
Li et al. Few-shot meta-learning on point cloud for semantic segmentation
CN114758135A (en) Unsupervised image semantic segmentation method based on attention mechanism
Wang et al. Robust visual tracking via discriminative structural sparse feature
Luo et al. Learning scene-specific object detectors based on a generative-discriminative model with minimal supervision
Kamaleswari et al. An Assessment of Object Detection in Thermal (Infrared) Image Processing
Zhu et al. CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras
Zhang et al. A deep learning filter for visual drone single object tracking
Guo et al. Fast Visual Tracking using Memory Gradient Pursuit Algorithm.

Legal Events

Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20942864; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20942864; Country of ref document: EP; Kind code of ref document: A1)