CN109034001A - Cross-modal video saliency detection method based on space-time clues - Google Patents

Cross-modal video saliency detection method based on space-time clues

Info

Publication number
CN109034001A
Authority
CN
China
Prior art keywords
mode
saliency
frame
cross
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810725499.0A
Other languages
Chinese (zh)
Other versions
CN109034001B (en)
Inventor
汤进
范东哲
李成龙
王逍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN201810725499.0A priority Critical patent/CN109034001B/en
Publication of CN109034001A publication Critical patent/CN109034001A/en
Application granted granted Critical
Publication of CN109034001B publication Critical patent/CN109034001B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Studio Devices (AREA)
  • Transforming Light Signals Into Electric Signals (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention discloses a cross-modal video saliency detection method based on space-time clues. The method comprises: obtaining a pair of matched multi-modal video sequence frames and segmenting them into superpixels with the SLIC algorithm; computing the saliency of each pixel of the superpixel segmentation map and selecting the nodes with high similarity as foreground points; constructing a saliency map by combining the saliency values of the previous stage with the weights of the visible-light and thermal-infrared modalities; comparing the saliency values of two adjacent frames and computing the maximum overlap ratio of their spatial positions, so as to find the inherent relation between adjacent frames and obtain a space-time-based multi-modal video saliency result; and solving the model with the Lagrange multiplier method to obtain the final result.

Description

A cross-modal video saliency detection method based on space-time clues
Technical field
The present invention relates to computer vision techniques, and in particular to a cross-modal video saliency detection method based on space-time clues.
Background technique
Video saliency detection is a fundamental task in computer vision. It aims to locate the most attention-grabbing target regions in a video sequence and has wide applications in fields such as video classification, video retrieval, video summarization, scene understanding, and target tracking; it is a basic and key problem of computer vision that has attracted increasing attention from researchers in recent years. Although some progress has been made, video saliency detection under visible light remains very challenging because of factors such as cluttered backgrounds and harsh weather. To cope with these challenges, integrating multiple different but complementary modalities, such as visible-light and thermal-infrared spectral information, may further improve video saliency detection results.
At present, most saliency detection algorithms are based on visible-spectrum information, but visible-light imaging is highly susceptible to environmental conditions such as illumination changes and haze. Some research has therefore introduced data of other modalities, such as thermal infrared. A thermal infrared sensor images the surface temperature of any target above absolute zero and is insensitive to challenging factors such as smoke, low illumination, and rain or snow. Because of its wide application in the military and security-monitoring fields, it has drawn the attention of research institutions at home and abroad, has made significant progress, and is gradually being applied in many other fields.
Combining multiple different complementary cues, such as visible-light and thermal-infrared data, is therefore an effective means of coping with the above challenging scenes and improving saliency detection. Visible-light and thermal-infrared information are complementary and can promote saliency detection in different respects. For example, a thermal infrared sensor is a passive imaging sensor that captures the infrared radiation (wavelength range 0.75~13 um) emitted by any target above absolute zero. Compared with existing sensors, a thermal infrared camera has the following main advantages: long-range imaging capability; insensitivity to light, which avoids interference from different illumination conditions; and strong penetration, e.g., through haze and smoke.
Therefore, under poor illumination and hazy weather, a thermal infrared sensor is more effective than a visible-spectrum camera. As shown in Fig. 1, which illustrates the advantage of thermal infrared data in harsh environments.
Images acquired by visible-light cameras generally have high resolution and contain rich geometric and texture details, but they are sensitive to light, and video image quality declines sharply in complex scenes and environments; Fig. 1(a) and Fig. 1(b) show the imaging results of visible-light and thermal-infrared data under hazy weather and low illumination, respectively. Since a thermal infrared image reflects the surface temperature distribution of objects in the scene, it is insensitive to illumination, penetrates cloud and mist well, and can even identify camouflage. It thus complements visible-light data and yields more robust video saliency detection results.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the influence of factors such as low illumination, haze, and cluttered backgrounds by fusing multiple complementary visual modalities, and to provide a cross-modal video saliency detection method based on space-time clues.
The present invention solves the above technical problem through the following technical scheme, comprising the steps of:
(1) obtaining a pair of matched multi-modal video sequence frames and segmenting them into superpixels with the SLIC algorithm;
(2) computing the saliency of each pixel of the superpixel segmentation map with a multitask manifold ranking algorithm to obtain foreground points; then screening the obtained foreground points, selecting the nodes above a given threshold, comparing all nodes with the nodes retained after screening, and selecting the nodes with high similarity as foreground points;
(3) constructing a saliency map by combining the saliency values of the previous stage with the weights of the visible-light and thermal-infrared modalities;
(4) comparing the saliency values of two adjacent frames and computing the maximum overlap ratio of their spatial positions, so as to find the inherent relation between adjacent frames and obtain a space-time-based multi-modal video saliency result;
(5) solving the model with the Lagrange multiplier method to obtain the final result.
In step (2), the original video sequence is divided into many short windows, each containing five consecutive frames, and the multitask collaborative manifold ranking formula is proposed as follows:
where t denotes the t-th frame, and the squared l2 norm of a vector is the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the visible-light and thermal-infrared modalities is defined through the color feature of each superpixel block, with k indexing the modality (K modalities in total) and γk the scale parameter of the k-th modality;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities.
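The formula itself appears only as an image in the original patent and is not reproduced in this text. Based on the terms listed above and the graph-based manifold ranking model cited later in the description (Yang et al., 2013), one plausible shape of the multitask collaborative objective is the following hedged reconstruction (μ is an assumed fidelity weight; this is a sketch, not the patent's exact formula):

```latex
\min_{s,\,r}\ \sum_{k=1}^{K} r_k \left(
  \frac{1}{2}\sum_{i,j} W^{k}_{ij}
  \left\| \frac{s^{k,t}_i}{\sqrt{D_{ii}}} - \frac{s^{k,t}_j}{\sqrt{D_{jj}}} \right\|^2
  + \mu \left\| s^{k,t} - y^{k,t} \right\|_2^2 \right)
  + \alpha \left\| r \right\|_2^2
  + \beta \sum_{k \ne k'} \left\| s^{k,t} - s^{k',t} \right\|_2^2,
\qquad
W^{k}_{ij} = \exp\!\left( -\frac{\left\| c^{k}_i - c^{k}_j \right\|}{\gamma_k} \right)
```

Here the first term is the graph-smoothness term of manifold ranking weighted by the modality reliability r_k, the second fits the seed indicator y, the third (α‖r‖²) prevents overfitting of r, and the fourth (β-term) is the cross-modal consistency constraint, matching the roles described above.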
In a preferred embodiment of the invention, K takes the value 2 in the formula, i.e., the two modalities of visible light and thermal infrared.
In step (2), the nodes on the top, bottom, left, and right boundaries of the image are taken in turn as background seed points, i.e., query objects. With the queries on one boundary (e.g., the top), the ranking score of each node in the graph relative to the queries is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are combined by elementwise product to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the elementwise (Hadamard) product of vectors, i.e., multiplication of corresponding elements;
and the factors denote, respectively, the ranking value of each superpixel block of the t-th frame under modality k when the nodes of the top, bottom, left, and right image boundaries are used as background seed points.
In step (3), the ranking values produced by the ranking algorithm and the modality weights r are computed as in the previous stage; the obtained ranking values are regularized so that they range between 0 and 1. Finally, by combining the ranking values with the modality weights r, the final saliency value of each superpixel block under modality k is obtained as the product of the modality weight rk and its ranking value, which yields the final saliency map.
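As a concrete illustration of this fusion step, the sketch below (an assumption about the exact arithmetic, since the formula images are not reproduced in this text) regularizes each modality's ranking values into [0, 1] and combines them with the modality weights r:

```python
import numpy as np

def normalize01(x):
    # regularize ranking values so that they range between 0 and 1
    x = np.asarray(x, dtype=float)
    rng = x.max() - x.min()
    return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

def fuse_modalities(rank_visible, rank_thermal, r):
    # final saliency per superpixel: modality weight r_k times the
    # regularized ranking value, summed over the two modalities
    s_v = normalize01(rank_visible)
    s_t = normalize01(rank_thermal)
    return r[0] * s_v + r[1] * s_t
```

With equal weights r = [0.5, 0.5], fuse_modalities([0, 1, 2], [2, 1, 0], r) gives a uniform 0.5 saliency, since the two hypothetical modality rankings disagree completely.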
In step (4), for a given pair of visible-light and thermal-infrared video sequences, the goal is to find the salient object in every frame of the multi-modal video pair, using the formula:
where t denotes the t-th frame, and the squared l2 norm of a vector is the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities is defined through the color feature of each superpixel block, with K the number of modalities and γk the scale parameter of the k-th modality;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α, λ, β are hyperparameters;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities;
the temporal term corrects the saliency estimate of the t-th frame with the (t-1)-th frame, based on the maximum overlap ratio of spatial positions between the t-th and (t-1)-th frames;
its purpose is to uncover the inherent relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found and a space-time-based multi-modal video saliency result is obtained.
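A minimal sketch of the maximum-overlap-ratio idea, under the assumption (not stated explicitly in the text) that salient regions are compared as binary masks and the overlap ratio is intersection over union:

```python
import numpy as np

def overlap_ratio(mask_a, mask_b):
    # overlap ratio of two binary salient-region masks
    # (intersection over union)
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum()) / union if union else 0.0

def max_overlap(prev_regions, cur_region):
    # maximum overlap ratio between the current frame's salient
    # region and candidate regions of the previous frame
    return max(overlap_ratio(r, cur_region) for r in prev_regions)
```

A high maximum overlap suggests the same moving object persists across the two frames, which is the motion cue the temporal term exploits.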
The model is optimized as follows:
where i, j denote different superpixel blocks, and the edge weight between nodes in the video sequences of the visible-light and thermal-infrared modalities is defined through the color feature of each superpixel block, with γk the scale parameter of the k-th modality;
the squared Frobenius norm of a matrix is the sum of the squares of its elements;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities; a space-time matrix Pt,t-1 and an auxiliary variable zk,t are introduced to replace sk,t in the formula of step (6).
Compared with the prior art, the present invention has the following advantages: from the perspective of information fusion, the invention overcomes the influence of factors such as low illumination, haze, and cluttered backgrounds by fusing multiple complementary visual modalities, and introduces a weight for each modality to represent its reliability, thereby achieving adaptive and collaborative fusion of heterogeneous data. In addition, the application incorporates space-time clues into the multitask model to obtain a smoother temporal result. By iteratively solving multiple subproblems, the modality weights and the ranking function are obtained, yielding a more robust video saliency detection result.
Detailed description of the invention
Fig. 1 shows the imaging of thermal infrared data in complex scenes;
Fig. 2 is a flow diagram of the present invention.
Specific embodiment
The embodiments of the present invention are elaborated below. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation and specific operation process are given, but the protection scope of the present invention is not limited to the following embodiments.
As shown in Fig. 2, the specific steps of the present embodiment are as follows:
(1) Given a pair of visible-light and thermal-infrared videos, the thermal-infrared video sequence is treated as one of the video frame channels. Simple SLIC superpixel segmentation is first performed on the provided video frame sequences to produce the superpixels of each frame, thereby preserving the initial structural elements of the video content of the two modalities.
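SLIC itself is available in common libraries (e.g., skimage.segmentation.slic); as a dependency-free illustration of what this step produces, the sketch below uses a regular-grid stand-in for SLIC and computes each superpixel's mean color, the "initial structural element" of a modality:

```python
import numpy as np

def grid_superpixels(height, width, step):
    # stand-in for SLIC: a regular grid of labels; real SLIC refines
    # such a grid by local k-means clustering in color+position space
    rows = np.arange(height) // step
    cols = np.arange(width) // step
    n_cols = -(-width // step)  # ceiling division
    return rows[:, None] * n_cols + cols[None, :]

def mean_colors(image, labels):
    # mean color of each superpixel block (usable later as the color
    # feature of each graph node)
    n_labels = int(labels.max()) + 1
    feats = np.zeros((n_labels, image.shape[2]))
    for k in range(n_labels):
        feats[k] = image[labels == k].mean(axis=0)
    return feats
```

The grid labeling is only a sketch of the segmentation output format (a per-pixel label map); it does not follow image boundaries the way SLIC does.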
(2) Multi-modal information fusion has inherent complexity, including the heterogeneity between modalities, the reliability of each modality, and the noise in the initial seed points of the sequence. Therefore, on the basis of the conventional graph-based manifold ranking algorithm ("Saliency Detection via Graph-Based Manifold Ranking"), the present embodiment introduces modality reliability weights and seed-point optimization to overcome the above problems, and proposes a new collaborative manifold ranking model. The present embodiment divides the original video sequence into many short windows, each containing five consecutive frames, and formulates the multitask collaborative manifold ranking algorithm with the following formula:
where K takes 2 in the present embodiment, i.e., the two modalities of visible light and thermal infrared; t denotes the t-th frame; and the squared l2 norm of a vector is the sum of the squares of its elements.
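The per-modality building block here is the cited graph-based manifold ranking of Yang et al. (2013), whose closed form is f* = (D − αW)⁻¹y; the sketch below shows this single-modality core (the patent's multitask, modality-weighted extension is not reproduced):

```python
import numpy as np

def manifold_rank(W, y, alpha=0.99):
    # graph-based manifold ranking (Yang et al., 2013): rank all
    # nodes against the query indicator y on affinity graph W via
    # the closed form f* = (D - alpha * W)^(-1) y
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # degree matrix
    return np.linalg.solve(D - alpha * W, np.asarray(y, dtype=float))
```

On a toy three-node graph with node 0 as the query, a node strongly connected to the query ranks above a weakly connected one, which is the behavior the boundary-query stage relies on.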
(3) The nodes on the top, bottom, left, and right boundaries of the image are taken in turn as background seed points, i.e., query objects. With the queries on the top boundary of the image, the ranking score of each node in the graph relative to the queries is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are combined by elementwise product to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the elementwise product of vectors, i.e., multiplication of corresponding elements;
and the factors denote, respectively, the ranking value of each superpixel block of the t-th frame under modality k when the nodes of the top, bottom, left, and right image boundaries are used as background seed points.
Taking the right boundary of the t-th frame as an example, the superpixels on this side are used as the labeled query nodes, and the rest serve as unlabeled data.
According to the formula given above, the ranking score is computed by the proposed ranking algorithm and regularized, so that the new ranking score lies between 0 and 1.
By analogy, the nodes on the bottom, left, and top boundaries are taken in turn as background seed points, i.e., query objects, and the ranking score of each node in the graph relative to the queries on that boundary is computed
and subtracted from 1; finally, the foreground vectors obtained for the four directions are combined by elementwise product to obtain the foreground saliency values of the first stage.
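The four-direction combination described above can be sketched as follows (a hedged reading: each boundary yields a background-based ranking score, the foreground cue is one minus that score, and the four cues are multiplied elementwise):

```python
import numpy as np

def first_stage_saliency(score_top, score_bottom, score_left, score_right):
    # each score_* is the per-superpixel ranking score obtained with
    # one image boundary as background queries; 1 - score is the
    # foreground cue, and the four cues are combined by elementwise
    # (Hadamard) product
    cues = [1 - np.asarray(s, dtype=float)
            for s in (score_top, score_bottom, score_left, score_right)]
    return cues[0] * cues[1] * cues[2] * cues[3]
```

A superpixel ranked as background by any single boundary (score near 1) is suppressed by the product, which is the intended effect of the boundary prior.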
(4) According to the formula given in step (2), the proposed ranking algorithm computes the ranking values and the modality weights r. As in the previous stage, the obtained ranking values are regularized; the regularization constrains the parameters to be optimized in order to prevent overfitting, so that the values range between 0 and 1. Finally, combining the ranking values with the modality weights r gives the final saliency value, and hence the final saliency map.
(5) Given a pair of visible-light and thermal-infrared video sequences, the goal is to find the salient object in every frame of the multi-modal video pair. In the formula of step (2), the multitask concept is introduced to make the different modalities cooperate effectively. However, it can only guarantee the spatial smoothness of each frame and ignores temporal clues. Therefore, for each frame, the present embodiment imposes three important requirements: 1. the modalities are kept consistent; 2. the saliency is also kept consistent; 3. the temporal information of adjacent frames is considered. Specifically, the present embodiment gives the following formula:
where the temporal term denotes the saliency estimate of the t-th frame corrected with the (t-1)-th frame; the principle is based on the maximum overlap ratio of spatial positions between the t-th and (t-1)-th frames. The last term serves to uncover the inherent relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found, yielding a space-time-based multi-modal video saliency detection method.
(6) To fuse multi-modal information adaptively and mine the internal relations between image blocks (graph nodes), accurately computing the importance weights of the nodes is essential. Therefore, the present embodiment studies a joint model in which the computation of the graph structure, edge weights, and node weights is fused into a unified optimization framework to boost algorithm performance. However, since the joint model contains multiple optimization variables, it is often very difficult to solve.
The present embodiment therefore studies and proposes an efficient solving algorithm for the joint model, leading to the following optimization model:
where i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities is defined through the color feature of each superpixel block, with γk the scale parameter of the k-th modality; the squared Frobenius norm of a matrix is the sum of the squares of its elements;
D = diag{D11, ..., Dnn} is the degree matrix, where diag denotes the diagonal operation;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities.
To solve the above formula, the present embodiment introduces a space-time matrix Pt,t-1 in the last term and an auxiliary variable zk,t to replace sk,t in the formula of step (5). This optimization model is solved with the Augmented Lagrange Multiplier (ALM) method, which alternately updates each parameter and converges very efficiently. With the other variables fixed each time, every subproblem has its own closed-form solution; in the update process, S and r are updated alternately. The fast convergence of the algorithm is verified in experiments. The solved modality weights, image-block features, importance weights, and the maximum overlap ratio of adjacent frames are fused to form a collaborative feature representation of the target, achieving accurate multi-modal video saliency detection.
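The ALM alternating-update pattern can be illustrated on a toy split-variable problem (a generic sketch of the solver idea, not the patent's actual subproblems): minimize ½‖s − a‖² + λ‖z‖₁ subject to s = z, where z plays the role of the auxiliary variable replacing s in one term:

```python
import numpy as np

def alm_solve(a, lam=0.5, rho=1.0, iters=300):
    # augmented-Lagrangian (ADMM-style) alternating updates: with the
    # other variables fixed, each subproblem has a closed-form solution
    a = np.asarray(a, dtype=float)
    s = np.zeros_like(a)
    z = np.zeros_like(a)
    u = np.zeros_like(a)  # scaled Lagrange multiplier
    for _ in range(iters):
        s = (a + rho * (z - u)) / (1 + rho)                     # quadratic subproblem
        v = s + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0)   # soft-threshold subproblem
        u = u + s - z                                           # multiplier update
    return s
```

The iterates converge to the soft-thresholding of a, the known solution of this toy problem, illustrating how alternating closed-form updates of the primal variables and the multiplier solve a constrained model.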
The present embodiment improves on the original visible-light-based video saliency detection method by adding thermal-infrared frame pairs as input, which yields more effective detection results for video saliency detection in complex scenes. Meanwhile, space-time clues are incorporated into the multitask model: the maximum overlap ratio is obtained by comparing the differences between the two adjacent frames, producing a smoother temporal result.
The above is only a description of the preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent replacements, and improvements made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (7)

1. A cross-modal video saliency detection method based on space-time clues, characterized by comprising the following steps:
(1) obtaining a pair of matched multi-modal video sequence frames and segmenting them into superpixels with the SLIC algorithm;
(2) computing the saliency of each pixel of the superpixel segmentation map with a multitask manifold ranking algorithm to obtain foreground points; then screening the obtained foreground points, selecting the nodes above a given threshold, comparing all nodes with the nodes retained after screening, and selecting the nodes with high similarity as foreground points;
(3) constructing a saliency map by combining the saliency values of the previous stage with the weights of the visible-light and thermal-infrared modalities;
(4) comparing the saliency values of two adjacent frames and computing the maximum overlap ratio of their spatial positions, so as to find the inherent relation between adjacent frames and obtain a space-time-based multi-modal video saliency result;
(5) solving the model with the Lagrange multiplier method to obtain the final result.
2. The cross-modal video saliency detection method based on space-time clues according to claim 1, characterized in that, in step (2), the original video sequence is divided into many short windows, each containing five consecutive frames, and the multitask collaborative manifold ranking formula is proposed as follows:
where t denotes the t-th frame, and the squared l2 norm of a vector is the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the visible-light and thermal-infrared modalities is defined through the color feature of each superpixel block, with K the number of modalities and γk the scale parameter of the k-th modality;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities.
3. The cross-modal video saliency detection method based on space-time clues according to claim 2, characterized in that, in the formula, K takes the value 2, i.e., the two modalities of visible light and thermal infrared.
4. The cross-modal video saliency detection method based on space-time clues according to claim 2, characterized in that, in step (2), the nodes on the top, bottom, left, and right boundaries of the image are taken in turn as background seed points, i.e., query objects; the ranking score of each node in the graph relative to the queries on the top boundary is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are combined by elementwise product to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the elementwise product of vectors, i.e., multiplication of corresponding elements;
and the factors denote, respectively, the ranking value of each superpixel block of the t-th frame under modality k when the nodes of the top, bottom, left, and right image boundaries are used as background seed points.
5. The cross-modal video saliency detection method based on space-time clues according to claim 4, characterized in that, in step (3), the ranking values computed by the ranking algorithm and the modality weights r are obtained as in the previous stage; the obtained ranking values are regularized so that they range between 0 and 1; finally, combining the ranking values with the modality weights r gives the final saliency value of each superpixel block under modality k as the product of the modality weight rk and its ranking value, which yields the final saliency map.
6. The cross-modal video saliency detection method based on space-time clues according to claim 5, characterized in that, in step (4), for a given pair of visible-light and thermal-infrared video sequences, the goal is to find the salient object in every frame of the multi-modal video pair, using the formula:
where t denotes the t-th frame, and the squared l2 norm of a vector is the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities is defined through the color feature of each superpixel block, with K the number of modalities and γk the scale parameter of the k-th modality;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α, λ, β are hyperparameters;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities;
the temporal term denotes the saliency estimate of the t-th frame corrected with the (t-1)-th frame, based on the maximum overlap ratio of spatial positions between the t-th and (t-1)-th frames;
its purpose is to uncover the inherent relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found and a space-time-based multi-modal video saliency result is obtained.
7. The cross-modal video saliency detection method based on space-time clues according to claim 6, characterized in that the model is optimized as follows:
where i, j denote different superpixel blocks, and the edge weight between nodes in the video sequences of the visible-light and thermal-infrared modalities is defined through the color feature of each superpixel block, with γk the scale parameter of the k-th modality;
the squared Frobenius norm of a matrix is the sum of the squares of its elements;
D = diag{D11, ..., Dnn} is the degree matrix;
Γ = [Γ1, ..., ΓK]T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r1, r2, ..., rM]T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coherence between the two modalities; a space-time matrix Pt,t-1 and an auxiliary variable zk,t are introduced to replace sk,t in the formula of step (6).
CN201810725499.0A 2018-07-04 2018-07-04 Cross-modal video saliency detection method based on space-time clues Active CN109034001B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810725499.0A CN109034001B (en) 2018-07-04 2018-07-04 Cross-modal video saliency detection method based on space-time clues

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810725499.0A CN109034001B (en) 2018-07-04 2018-07-04 Cross-modal video saliency detection method based on space-time clues

Publications (2)

Publication Number Publication Date
CN109034001A true CN109034001A (en) 2018-12-18
CN109034001B CN109034001B (en) 2021-06-25

Family

ID=65522401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810725499.0A Active CN109034001B (en) 2018-07-04 2018-07-04 Cross-modal video saliency detection method based on space-time clues

Country Status (1)

Country Link
CN (1) CN109034001B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110097555A (en) * 2019-04-26 2019-08-06 绵阳慧视光电技术有限责任公司 Electronic equipments safety monitoring method based on thermometric dot matrix fusion visible images
CN110188239A (en) * 2018-12-26 2019-08-30 北京大学 A kind of double-current video classification methods and device based on cross-module state attention mechanism
CN111426691A (en) * 2020-03-26 2020-07-17 浙江东瞳科技有限公司 Detection method based on multi-modal visual imaging
CN111783524A (en) * 2020-05-19 2020-10-16 普联国际有限公司 Scene change detection method and device, storage medium and terminal equipment
CN111881915A (en) * 2020-07-15 2020-11-03 武汉大学 Satellite video target intelligent detection method based on multiple prior information constraints
CN113011324A (en) * 2021-03-18 2021-06-22 安徽大学 Target tracking method and device based on feature map matching and super-pixel map sorting
CN116595343A (en) * 2023-07-17 2023-08-15 山东大学 Manifold ordering learning-based online unsupervised cross-modal retrieval method and system


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHENGLONG LI ET AL.: "A unified RGB-T saliency detection benchmark: dataset, baselines, analysis and a novel approach", arXiv *
CHUAN YANG ET AL.: "Saliency Detection via Graph-Based Manifold Ranking", 2013 IEEE Conference on Computer Vision and Pattern Recognition *
WU JIANGUO ET AL.: "Salient object detection in RGB-D images by fusing salient depth features", Journal of Electronics & Information Technology *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188239A (en) * 2018-12-26 2019-08-30 Peking University Two-stream video classification method and device based on cross-modal attention mechanism
CN110188239B (en) * 2018-12-26 2021-06-22 Peking University Two-stream video classification method and device based on cross-modal attention mechanism
CN110097555A (en) * 2019-04-26 2019-08-06 Mianyang Huishi Optoelectronic Technology Co., Ltd. Electronic equipment safety monitoring method based on fusing a thermometric dot matrix with visible-light images
CN111426691A (en) * 2020-03-26 2020-07-17 Zhejiang Dongtong Technology Co., Ltd. Detection method based on multi-modal visual imaging
CN111783524A (en) * 2020-05-19 2020-10-16 Pulian International Ltd. Scene change detection method and device, storage medium and terminal device
CN111783524B (en) * 2020-05-19 2023-10-17 Pulian International Ltd. Scene change detection method and device, storage medium and terminal device
CN111881915A (en) * 2020-07-15 2020-11-03 Wuhan University Intelligent satellite video target detection method based on multiple prior information constraints
CN111881915B (en) * 2020-07-15 2022-07-15 Wuhan University Intelligent satellite video target detection method based on multiple prior information constraints
CN113011324A (en) * 2021-03-18 2021-06-22 Anhui University Target tracking method and device based on feature map matching and superpixel graph ranking
CN116595343A (en) * 2023-07-17 2023-08-15 Shandong University Online unsupervised cross-modal retrieval method and system based on manifold ranking learning
CN116595343B (en) * 2023-07-17 2023-10-03 Shandong University Online unsupervised cross-modal retrieval method and system based on manifold ranking learning

Also Published As

Publication number Publication date
CN109034001B (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN109034001A (en) Cross-modal video saliency detection method based on space-time clues
CN107274419B (en) Deep learning significance detection method based on global prior and local context
KR20150079576A (en) Depth map generation from a monoscopic image based on combined depth cues
Suárez et al. Deep learning based single image dehazing
CN105160649A (en) Multi-target tracking method and system based on kernel function unsupervised clustering
CN109993052B (en) Scale-adaptive target tracking method and system under complex scene
CN106952286A (en) Moving target segmentation method for dynamic backgrounds based on motion saliency map and optical flow vector analysis
CN107527370B (en) Target tracking method based on camshift
CN110544269A (en) Siamese network infrared target tracking method based on feature pyramid
Cvejic et al. The effect of pixel-level fusion on object tracking in multi-sensor surveillance video
Hidayatullah et al. CAMSHIFT improvement on multi-hue and multi-object tracking
CN113592911A (en) Appearance-enhanced deep target tracking method
CN107194949B (en) Interactive video segmentation method and system based on block matching and enhanced OneCut
CN107609571A (en) Adaptive target tracking method based on LARK features
CN113449658A (en) Night video sequence saliency detection method based on spatial, frequency and temporal domains
CN108491857B (en) Multi-camera target matching method with overlapping fields of view
Hui RETRACTED ARTICLE: Motion video tracking technology in sports training based on Mean-Shift algorithm
CN102223545B (en) Rapid multi-view video color correction method
CN108021857A (en) Building detection method based on depth recovery from UAV image sequences
Chen et al. Visual depth guided image rain streaks removal via sparse coding
Zhang et al. Spatiotemporal saliency detection based on maximum consistency superpixels merging for video analysis
CN108171651B (en) Image alignment method based on multi-model geometric fitting and layered homography transformation
Sun et al. Research on cloud computing modeling based on fusion difference method and self-adaptive threshold segmentation
CN109064444A (en) Track slab defect detection method based on saliency analysis
Xu et al. Moving target detection and tracking in FLIR image sequences based on thermal target modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant