CN109034001A - Cross-modal video saliency detection method based on space-time clues - Google Patents
- Publication number
- CN109034001A (application CN201810725499.0A)
- Authority
- CN
- China
- Prior art keywords
- mode
- saliency
- frame
- cross
- super
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
Abstract
The invention discloses a cross-modal video saliency detection method based on space-time clues. The method obtains a pair of matched multi-modal video sequence frame pairs and segments them into superpixels with the SLIC algorithm; computes the saliency of each pixel of the superpixel segmentation map and selects nodes with high similarity as foreground points; constructs the saliency map by combining the saliency values of the previous stage with the weights of the visible-light and thermal-infrared modalities; compares the saliency values of two adjacent frames and computes the maximum overlap ratio of their spatial positions, so as to find the inherent relation between adjacent frames and obtain a space-time-based multi-modal video saliency result; and finally solves the model with the Lagrange multiplier method to obtain the result.
Description
Technical field
The present invention relates to computer vision techniques, and more particularly to a cross-modal video saliency detection method based on space-time clues.
Background technique
Video saliency detection is a fundamental task in computer vision. It aims to locate the most noticeable target regions in a video sequence and is widely applied in fields such as video classification, video retrieval, video summarization, scene understanding and target tracking; it is thus a basic and critical problem of computer vision that has drawn increasing attention from researchers in recent years. Although research in this area has made some progress, video saliency under visible light remains very challenging because of factors such as cluttered backgrounds and harsh weather. To cope with these challenges, integrating multiple different but complementary sources of modal information, such as visible-light and thermal-infrared spectrum information, may further improve video saliency detection results.
At present, most saliency detection algorithms are based on visible-spectrum information, but visible-light imaging is highly susceptible to environmental influences such as illumination variation and haze. Some research work therefore introduces data of other modalities, e.g. thermal infrared data. A thermal infrared sensor images the surface temperature of any target above absolute zero and is insensitive to challenging factors such as smoke, low illumination, rain and snow. Because of its wide application in the military and security-monitoring fields, it has attracted the attention of research institutions at home and abroad, has made significant progress, and is gradually being applied in many other fields.
Combining a variety of different complementary cues, such as visible-light and thermal infrared data, is therefore an effective means to cope with the above challenging scenes and improve saliency detection. Moreover, visible-light and thermal infrared information are complementary and can promote saliency detection in different respects. For example, a thermal infrared sensor is a passive imaging sensor that captures the infrared radiation (wavelength range 0.75–13 μm) emitted by any target above absolute zero. Compared with existing sensors, a thermal infrared camera mainly has the following advantages: long-range imaging capability; insensitivity to light, which avoids interference from different illumination conditions; and strong penetration, e.g. through haze and smoke. Therefore, under poor illumination and hazy weather, a thermal infrared sensor is more effective than a visible-spectrum camera. As shown in Fig. 1, Fig. 1 illustrates the advantage of thermal infrared data in such harsh environments.
Images acquired by visible-light cameras generally have higher resolution and contain rich geometry and texture details, but they are sensitive to light, and video image quality declines sharply in complex scenes and environments; Fig. 1(a) and Fig. 1(b) respectively show the imaging of visible-light and thermal infrared data under hazy weather and low illumination. Because a thermal infrared image reflects the surface temperature distribution of the objects in the scene, it is insensitive to illumination and has good abilities to penetrate cloud and mist and to identify camouflage. It can thus complement visible-light data to obtain more robust video saliency detection results.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the influence of factors such as low illumination, haze and cluttered backgrounds by fusing multiple complementary visual modality data, and to provide a cross-modal video saliency detection method based on space-time clues.
The present invention solves the above technical problem through the following technical scheme, which comprises the following steps:
(1) obtain a pair of matched multi-modal video sequence frame pairs and segment them into superpixels using the SLIC algorithm;
(2) compute the saliency of each pixel of the superpixel segmentation map using a multitask manifold ranking algorithm to obtain foreground points; then screen the obtained foreground points, selecting the nodes above a given threshold, compare all nodes with the nodes obtained after screening, and take the nodes with high similarity as foreground points;
(3) construct the saliency map by combining the saliency values of the previous stage with the weights of the two modalities, visible light and thermal infrared;
(4) compare the saliency values of two adjacent frames and compute the maximum overlap ratio of their spatial positions, so as to find the intrinsic relation between adjacent frames and obtain a space-time-based multi-modal video saliency result;
(5) solve the model using the Lagrange multiplier method to obtain the result.
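As a concrete illustration of step (1), the sketch below is a minimal, hypothetical stand-in: the invention uses the SLIC algorithm, but to keep the example dependency-free a plain grid partition plays the role of SLIC, producing a label map and per-superpixel mean colors (the node features) for a matched frame pair. All array sizes and data are illustrative.

```python
import numpy as np

def grid_superpixels(image, block=8):
    """Partition an HxWx3 image into block x block cells as a simple
    stand-in for SLIC, returning an HxW label map and per-label mean
    colors (the superpixel node feature c_i used later in the graph)."""
    h, w, _ = image.shape
    rows = np.arange(h) // block
    cols = np.arange(w) // block
    n_cols = (w + block - 1) // block
    labels = rows[:, None] * n_cols + cols[None, :]
    n = labels.max() + 1
    colors = np.zeros((n, 3))
    for lab in range(n):
        # Mean color of all pixels assigned to this superpixel.
        colors[lab] = image[labels == lab].mean(axis=0)
    return labels, colors

# A matched visible/thermal frame pair (random stand-ins for real frames).
rng = np.random.default_rng(0)
rgb = rng.random((32, 32, 3))
thermal = rng.random((32, 32, 3))
labels_rgb, c_rgb = grid_superpixels(rgb)
labels_t, c_t = grid_superpixels(thermal)
print(c_rgb.shape)  # (16, 3): a 4x4 grid of superpixels
```

In practice one would substitute a real SLIC implementation for `grid_superpixels`; only the interface (label map plus per-node color features) matters for the later steps.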
In step (2), the original video sequence is divided into many short windows, each window containing five consecutive frames, and the multitask collaborative manifold ranking formula is proposed as follows:
where t denotes the t-th frame; ‖X‖₂² denotes the squared l2 norm of vector X, i.e. the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities, visible light and thermal infrared, is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), where c_i^k is the color feature of each superpixel block, k indexes the modality, and γ_k is the scale parameter of the k-th modality;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities.
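The edge-weight definition can be written directly in NumPy. A minimal sketch, under the assumption that the weight takes the common form w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k); the color matrix and the value of γ_k here are illustrative:

```python
import numpy as np

def edge_weights(colors, gamma):
    """w_ij = exp(-||c_i - c_j|| / gamma): affinity between superpixel
    nodes computed from their color features, one matrix per modality."""
    diff = colors[:, None, :] - colors[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)     # pairwise color distances
    return np.exp(-dist / gamma)

rng = np.random.default_rng(1)
colors_k = rng.random((10, 3))               # 10 superpixels, RGB means
W = edge_weights(colors_k, gamma=0.1)        # gamma_k is illustrative
print(W.shape)  # (10, 10), symmetric, ones on the diagonal
```

Each modality k gets its own matrix W^k by passing its own color features and scale parameter γ_k.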
As one preferred embodiment of the invention, in the formula K takes the value 2, i.e. the two modalities of visible light and thermal infrared.
In step (2), the nodes on the top, bottom, left and right boundaries of the image are used in turn as background seed points, i.e. as query objects. With the query objects at the top boundary of the image, the ranking score of every node in the graph relative to the query objects is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are multiplied element-wise to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the element-wise product of vectors, i.e. corresponding elements are multiplied;
the four ranking vectors denote, respectively with the nodes of the top, bottom, left and right boundaries of the image as background seed points, the ranking values of each superpixel block of frame t under the corresponding modality k.
In step (3), the ranking values computed by the ranking algorithm and the modality weights r are obtained in the same way as in the previous stage; the obtained ranking values are regularized to the range 0 to 1; finally, the final saliency map is obtained by combining the ranking values with the modality weights r, where the final saliency value under modality k is obtained by combining the modality weight r_k with the ranking value of each superpixel block.
In step (4), for a given pair of video sequences of the two modalities, visible light and thermal infrared, the goal is to find the salient object in each frame of the multi-modal video pair, using the formula:
where t denotes the t-th frame; ‖X‖₂² denotes the squared l2 norm of vector X, i.e. the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), where c_i^k is the color feature of each superpixel block, k indexes the modality and γ_k is the scale parameter of the k-th modality;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α, λ, β are hyperparameters;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities;
the fifth term denotes the saliency estimate of frame t corrected by frame t−1, based on the maximum overlap ratio of spatial positions between frame t and frame t−1;
the last term serves to find the intrinsic relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found and the space-time-based multi-modal video saliency result is obtained.
The model is optimized as follows:
where i, j denote different superpixel blocks and the edge weight between nodes in the video sequences of the two modalities, visible light and thermal infrared, is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), with c_i^k the color feature of each superpixel block and γ_k the scale parameter of the k-th modality;
‖X‖_F² denotes the squared Frobenius norm of matrix X, i.e. the sum of the squares of its elements;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities; the space-time matrix P_{t,t−1} and the auxiliary variable z_{k,t} are introduced to replace s_{k,t} in the formula of step (6).
Compared with the prior art, the present invention has the following advantages: from the angle of information fusion, the invention overcomes the influence of factors such as low illumination, haze and cluttered backgrounds by fusing multiple complementary visual modality data, and introduces a weight for each modality to express its reliability, thereby realizing adaptive and collaborative fusion of heterogeneous data. In addition, the application also incorporates space-time clues into the multitask model to obtain a smoother temporal effect. By iteratively solving multiple subproblems, the modality weights and the ranking function are obtained, yielding a more robust video saliency detection effect.
Detailed description of the invention
Fig. 1 shows the imaging of thermal infrared data in complex scenes;
Fig. 2 is the flow diagram of the invention.
Specific embodiment
The embodiments of the present invention are elaborated below. This embodiment is implemented on the premise of the technical scheme of the present invention, and detailed implementation methods and specific operation processes are given, but the protection scope of the present invention is not limited to the following embodiment.
As shown in Fig. 2, the specific steps of this embodiment are as follows:
(1) Given a pair of visible-light and thermal infrared videos, the thermal infrared video sequence is also regarded as one channel of the video frames. Simple SLIC superpixel segmentation is first applied to the provided video frame sequence to produce the superpixels of each frame, preserving the initial structural elements of the video content of the two modalities.
(2) Multi-modal information fusion has a certain complexity, including the heterogeneity between modalities, the reliability of each modality, and the noise in the initial seed points of the sequence. Therefore, on the basis of the conventional graph-based manifold ranking algorithm (Saliency Detection via Graph-based Manifold Ranking), this embodiment introduces modality reliability weights and seed-point optimization respectively to overcome the above problems, and proposes a new collaborative manifold ranking model. This embodiment divides the original video sequence into many short windows, each window containing five consecutive frames, and formulates the multitask collaborative manifold ranking algorithm with the following formula:
where K takes 2 in this embodiment, i.e. the two modalities of visible light and thermal infrared; t denotes the t-th frame; ‖X‖₂² denotes the squared l2 norm of vector X, i.e. the sum of the squares of its elements.
(3) The nodes on the top, bottom, left and right boundaries of the image are used in turn as background seed points, i.e. as query objects. With the query objects at the top boundary of the image, the ranking score of every node in the graph relative to the query objects is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are multiplied element-wise to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the element-wise product of vectors, i.e. corresponding elements are multiplied;
the four ranking vectors denote, respectively with the nodes of the top, bottom, left and right boundaries of the image as background seed points, the ranking values of each superpixel block of frame t under the corresponding modality k.
Taking the right boundary of the picture of frame t as an example, the superpixels on this side are used as the labeled query nodes, and the rest serve as unlabeled data.
According to the formula given above, the ranking score is computed by the proposed ranking algorithm and then regularized, so that a new ranking score with range between 0 and 1 is obtained.
Likewise, the nodes of the bottom, left and top boundaries are used in turn as background seed points (query objects), the ranking score of every node in the graph relative to the query objects is computed and subtracted from 1, and finally the foreground vectors of the four directions are multiplied element-wise to obtain the foreground saliency value of the first stage.
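The boundary-seed procedure can be sketched on a toy graph. The ranking step below uses the closed-form solution f = (D − αW)⁻¹y from the graph-based manifold ranking literature that this embodiment builds on; the 4×4 grid, the random colors and the value of α are illustrative, not the patent's parameters.

```python
import numpy as np

def rank_scores(W, queries, alpha=0.99):
    """Closed-form manifold ranking f = (D - alpha*W)^-1 y, with y = 1
    on the query (boundary seed) nodes and 0 elsewhere."""
    D = np.diag(W.sum(axis=1))
    y = np.zeros(W.shape[0])
    y[queries] = 1.0
    return np.linalg.solve(D - alpha * W, y)

def normalize(v):
    """Regularize a score vector to the range [0, 1]."""
    return (v - v.min()) / (v.max() - v.min() + 1e-12)

# Toy 4x4 grid of superpixels; node indices of each boundary side.
n_side = 4
idx = np.arange(n_side * n_side).reshape(n_side, n_side)
sides = [idx[0], idx[-1], idx[:, 0], idx[:, -1]]  # top, bottom, left, right

rng = np.random.default_rng(2)
colors = rng.random((n_side * n_side, 3))
W = np.exp(-np.linalg.norm(colors[:, None] - colors[None, :], axis=-1) / 0.2)

# Foreground score per direction is 1 - normalized ranking; the four
# directions are combined by element-wise product, as in the text.
saliency = np.ones(n_side * n_side)
for q in sides:
    saliency *= 1.0 - normalize(rank_scores(W, q))
print(saliency.shape)
```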
(4) According to the formula given in step (2), the proposed ranking algorithm is used to compute the ranking values and the modality weights r. Similarly to the previous stage, the obtained ranking values are regularized; the regularization operation restricts the parameters to be optimized in order to prevent overfitting, yielding values with range between 0 and 1. Finally, the final saliency map is obtained by combining the ranking values with the modality weights r.
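The regularize-and-fuse step can be sketched as follows; the modality weight values r below are stand-ins (in the invention they are learned by the optimization):

```python
import numpy as np

def fuse(saliency_per_mode, r):
    """Combine per-modality saliency maps with modality weights r
    (s = sum_k r_k * s_k), after regularizing each map to [0, 1]."""
    out = np.zeros_like(saliency_per_mode[0])
    for s_k, r_k in zip(saliency_per_mode, r):
        s_k = (s_k - s_k.min()) / (s_k.max() - s_k.min() + 1e-12)
        out += r_k * s_k
    return out

rng = np.random.default_rng(3)
s_vis, s_ir = rng.random(16), rng.random(16)  # per-superpixel saliency
r = np.array([0.6, 0.4])                      # illustrative weights, sum 1
s_final = fuse([s_vis, s_ir], r)
print(s_final.min() >= 0 and s_final.max() <= 1)  # True
```

Because the weights are non-negative and sum to 1, the fused map stays in [0, 1] without further normalization.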
(5) Given a pair of video sequences of the two modalities, visible light and thermal infrared, the goal is to find the salient object in each frame of the multi-modal video pair. The formula of step (2) introduces the multitask concept to make the different modalities cooperate effectively. However, it can only guarantee the spatial smoothness of each frame and ignores temporal cues. Therefore, for each frame, this embodiment puts forward three important requirements: 1. the modalities should be consistent; 2. the saliency should also be consistent; 3. the timing information of adjacent frames should be considered. Specifically, this embodiment gives the following formula:
where the fifth term denotes the saliency estimate of frame t corrected by frame t−1; its principle is based on the maximum overlap ratio of spatial positions between frame t and frame t−1. The last term serves to find the intrinsic relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found, and a space-time-based multi-modal video saliency detection method is obtained.
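The maximum overlap ratio can be sketched as an intersection-over-union of the binarized salient regions of adjacent frames. This is an assumed concrete form: the patent text does not spell out the exact overlap measure, only that it compares the spatial positions of frames t−1 and t.

```python
import numpy as np

def max_overlap_ratio(mask_prev, mask_cur):
    """Spatial overlap between the salient regions of adjacent frames:
    intersection over union of the two binary masks."""
    inter = np.logical_and(mask_prev, mask_cur).sum()
    union = np.logical_or(mask_prev, mask_cur).sum()
    return inter / union if union else 0.0

a = np.zeros((8, 8), bool); a[2:6, 2:6] = True  # frame t-1 salient region
b = np.zeros((8, 8), bool); b[3:7, 3:7] = True  # frame t, shifted by one
print(max_overlap_ratio(a, b))  # 9/23, about 0.391
```

A high ratio indicates the salient object barely moved, so the previous frame's saliency can be used to correct the current estimate.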
(6) To adaptively fuse multi-modal information and mine the internal relations between image blocks (graph nodes), accurate calculation of the node importance weights is essential. Therefore, this embodiment studies a joint model in which the computation of the graph structure, edge weights and node weights is fused into one unified optimization framework to improve algorithm performance. However, because the joint model contains multiple optimization variables, solving it is often very difficult. This embodiment therefore studies and proposes an efficient solving algorithm for the joint model, yielding the following optimization model:
where i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities, visible light and thermal infrared, is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), with c_i^k the color feature of each superpixel block and γ_k the scale parameter of the k-th modality; ‖X‖_F² denotes the squared Frobenius norm of matrix X, i.e. the sum of the squares of its elements.
D = diag{D₁₁, ..., D_nn} is the metric matrix, where diag refers to the diagonal operation.
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration.
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T.
α is a balance parameter.
The third term of the formula prevents overfitting of r.
The fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities.
To optimize and solve the above formula, this embodiment introduces a space-time matrix P_{t,t−1} in the last term, and an auxiliary variable z_{k,t} to replace s_{k,t} in the formula of step (5). This optimization model is solved by alternately updating each parameter with the Augmented Lagrange Multiplier (ALM) method, because of its highly efficient convergence. Each time the other variables are fixed, every subproblem has its own closed-form solution. In the update process of S and r, the two can be alternated. The fast convergence of the algorithm is demonstrated experimentally, and the solved modality weights, image-block features, importance weights and maximum overlap ratios of adjacent frames are fused together to form a collaborative feature representation of the target, realizing accurate multi-modal video saliency detection.
This embodiment optimizes the original visible-light-based video saliency detection method by adding thermal infrared frame pairs as input, enabling more effective detection results when coping with video saliency detection in complex scenes. Meanwhile, space-time clues are also incorporated into the multitask model: the maximum overlap ratio is obtained by comparing two adjacent frames, so as to obtain a smoother temporal effect.
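The alternating solution scheme can be sketched as follows. Only the alternation pattern (fix r and update the saliency S, then fix S and refresh the modality weights r) follows the text; the concrete r-update used here is an illustrative inverse-residual rule, not the patent's closed-form ALM step, and the stand-in functions are hypothetical.

```python
import numpy as np

def alternate(rank_fn, residual_fn, n_modes=2, iters=5):
    """Alternating scheme: fix r, solve the ranking (S-update); fix S,
    refresh modality weights r. The r-update below (inverse residual,
    normalized) is illustrative only."""
    r = np.full(n_modes, 1.0 / n_modes)       # start with equal weights
    for _ in range(iters):
        S = [rank_fn(k, r) for k in range(n_modes)]            # S-update
        res = np.array([residual_fn(k, S[k]) for k in range(n_modes)])
        r = 1.0 / (res + 1e-12)
        r /= r.sum()                                           # r-update
    return S, r

# Toy stand-ins: the ranking returns a fixed random map per modality;
# the residual is its variance (purely illustrative).
rng = np.random.default_rng(4)
maps = [rng.random(16), rng.random(16)]
S, r = alternate(lambda k, r: maps[k], lambda k, s: s.var())
print(np.isclose(r.sum(), 1.0))  # True: weights stay normalized
```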
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (7)
1. A cross-modal video saliency detection method based on space-time clues, characterized by comprising the following steps:
(1) obtain a pair of matched multi-modal video sequence frame pairs and segment them into superpixels using the SLIC algorithm;
(2) compute the saliency of each pixel of the superpixel segmentation map using a multitask manifold ranking algorithm to obtain foreground points; then screen the obtained foreground points, selecting the nodes above a given threshold, compare all nodes with the nodes obtained after screening, and take the nodes with high similarity as foreground points;
(3) construct the saliency map by combining the saliency values of the previous stage with the weights of the two modalities, visible light and thermal infrared;
(4) compare the saliency values of two adjacent frames and compute the maximum overlap ratio of their spatial positions, so as to find the intrinsic relation between adjacent frames and obtain a space-time-based multi-modal video saliency result;
(5) solve the model using the Lagrange multiplier method to obtain the result.
2. The cross-modal video saliency detection method based on space-time clues according to claim 1, characterized in that, in step (2), the original video sequence is divided into many short windows, each window containing five consecutive frames, and the multitask collaborative manifold ranking formula is proposed as follows:
where t denotes the t-th frame; ‖X‖₂² denotes the squared l2 norm of vector X, i.e. the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities, visible light and thermal infrared, is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), where c_i^k is the color feature of each superpixel block, k indexes the modality and γ_k is the scale parameter of the k-th modality;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities.
3. The cross-modal video saliency detection method based on space-time clues according to claim 2, characterized in that, in the formula, K takes the value 2, i.e. the two modalities of visible light and thermal infrared.
4. The cross-modal video saliency detection method based on space-time clues according to claim 2, characterized in that, in step (2), the nodes on the top, bottom, left and right boundaries of the image are used in turn as background seed points, i.e. as query objects; with the query objects at the top boundary of the image, the ranking score of every node in the graph relative to the query objects is computed and then subtracted from 1; finally, the foreground vectors obtained for the four directions are multiplied element-wise to compute the initial saliency value of each modality in the first stage:
where the ∘ symbol denotes the element-wise product of vectors, i.e. corresponding elements are multiplied;
the four ranking vectors denote, respectively with the nodes of the top, bottom, left and right boundaries of the image as background seed points, the ranking values of each superpixel block of frame t under the corresponding modality k.
5. The cross-modal video saliency detection method based on space-time clues according to claim 4, characterized in that, in step (3), the ranking values computed by the ranking algorithm and the modality weights r are obtained in the same way as in the previous stage; the obtained ranking values are regularized to the range 0 to 1; finally, the final saliency map is obtained by combining the ranking values with the modality weights r,
where the final saliency value under modality k is obtained by combining the modality weight r_k with the ranking value of each superpixel block.
6. The cross-modal video saliency detection method based on space-time clues according to claim 5, characterized in that, in step (4), for a given pair of video sequences of the two modalities, visible light and thermal infrared, the goal is to find the salient object in each frame of the multi-modal video pair, using the formula:
where t denotes the t-th frame; ‖X‖₂² denotes the squared l2 norm of vector X, i.e. the sum of the squares of its elements;
i, j denote different superpixel blocks, so the edge weight between nodes in the video sequences of the two modalities is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), where c_i^k is the color feature of each superpixel block, k indexes the modality and γ_k is the scale parameter of the k-th modality;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α, λ, β are hyperparameters;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities;
the fifth term denotes the saliency estimate of frame t corrected by frame t−1, based on the maximum overlap ratio of spatial positions between frame t and frame t−1;
the last term serves to find the intrinsic relation between adjacent frames: by computing the maximum overlap ratio of adjacent frames, motion information is found and the space-time-based multi-modal video saliency result is obtained.
7. The cross-modal video saliency detection method based on space-time clues according to claim 6, characterized in that the model is optimized as follows:
where i, j denote different superpixel blocks and the edge weight between nodes in the video sequences of the two modalities, visible light and thermal infrared, is defined as w_ij^k = exp(−‖c_i^k − c_j^k‖/γ_k), with c_i^k the color feature of each superpixel block and γ_k the scale parameter of the k-th modality;
‖X‖_F² denotes the squared Frobenius norm of matrix X, i.e. the sum of the squares of its elements;
D = diag{D₁₁, ..., D_nn} is the metric matrix;
Γ = [Γ₁, ..., Γ_K]^T is an adaptive parameter vector, initialized after the first iteration;
r is the modality weight vector, r = [r₁, r₂, ..., r_M]^T;
α is a balance parameter;
the third term of the formula prevents overfitting of r;
the fourth term is the cross-modal consistency constraint, which balances the coordination between the two modalities; the space-time matrix P_{t,t−1} and the auxiliary variable z_{k,t} are introduced to replace s_{k,t} in the formula of step (6).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810725499.0A CN109034001B (en) | 2018-07-04 | 2018-07-04 | Cross-modal video saliency detection method based on space-time clues |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810725499.0A CN109034001B (en) | 2018-07-04 | 2018-07-04 | Cross-modal video saliency detection method based on space-time clues |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109034001A true CN109034001A (en) | 2018-12-18 |
CN109034001B CN109034001B (en) | 2021-06-25 |
Family
ID=65522401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810725499.0A Active CN109034001B (en) | 2018-07-04 | 2018-07-04 | Cross-modal video saliency detection method based on space-time clues |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109034001B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104240244A (en) * | 2014-09-10 | 2014-12-24 | 上海交通大学 | Salient object detection method based on propagation mode and manifold ranking |
CN106127785A (en) * | 2016-06-30 | 2016-11-16 | 重庆大学 | Image saliency detection method based on manifold ranking and random walk |
CN106997597A (en) * | 2017-03-22 | 2017-08-01 | 南京大学 | Target tracking method based on supervised saliency detection |
CN107610136A (en) * | 2017-09-22 | 2018-01-19 | 中国科学院西安光学精密机械研究所 | Salient object detection method based on ranking of convex-hull-structure center query points |
Worldwide applications: 2018: CN CN201810725499.0A (granted, active)
Non-Patent Citations (3)
Title |
---|
CHENGLONG LI ET AL.: "A unified RGB-T saliency detection benchmark: dataset, baselines, analysis and a novel approach", 《ARXIV》 *
CHUAN YANG ET AL.: "Saliency Detection via Graph-Based Manifold Ranking", 《2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 * |
WU JIANGUO ET AL.: "RGB-D image salient object detection fusing salient depth features", 《Journal of Electronics & Information Technology》 *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110188239A (en) * | 2018-12-26 | 2019-08-30 | 北京大学 | Dual-stream video classification method and device based on cross-modal attention mechanism |
CN110188239B (en) * | 2018-12-26 | 2021-06-22 | 北京大学 | Dual-stream video classification method and device based on cross-modal attention mechanism |
CN110097555A (en) * | 2019-04-26 | 2019-08-06 | 绵阳慧视光电技术有限责任公司 | Electronic equipment safety monitoring method based on fusing a thermometric dot matrix with visible-light images |
CN111426691A (en) * | 2020-03-26 | 2020-07-17 | 浙江东瞳科技有限公司 | Detection method based on multi-modal visual imaging |
CN111783524A (en) * | 2020-05-19 | 2020-10-16 | 普联国际有限公司 | Scene change detection method and device, storage medium and terminal equipment |
CN111783524B (en) * | 2020-05-19 | 2023-10-17 | 普联国际有限公司 | Scene change detection method and device, storage medium and terminal equipment |
CN111881915A (en) * | 2020-07-15 | 2020-11-03 | 武汉大学 | Satellite video target intelligent detection method based on multiple prior information constraints |
CN111881915B (en) * | 2020-07-15 | 2022-07-15 | 武汉大学 | Satellite video target intelligent detection method based on multiple prior information constraints |
CN113011324A (en) * | 2021-03-18 | 2021-06-22 | 安徽大学 | Target tracking method and device based on feature map matching and superpixel graph ranking |
CN116595343A (en) * | 2023-07-17 | 2023-08-15 | 山东大学 | Online unsupervised cross-modal retrieval method and system based on manifold ranking learning |
CN116595343B (en) * | 2023-07-17 | 2023-10-03 | 山东大学 | Online unsupervised cross-modal retrieval method and system based on manifold ranking learning |
Also Published As
Publication number | Publication date |
---|---|
CN109034001B (en) | 2021-06-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109034001A (en) | Cross-modal video saliency detection method based on space-time clues | |
CN107274419B (en) | Deep learning significance detection method based on global prior and local context | |
KR20150079576A (en) | Depth map generation from a monoscopic image based on combined depth cues | |
Suárez et al. | Deep learning based single image dehazing | |
CN105160649A (en) | Multi-target tracking method and system based on kernel function unsupervised clustering | |
CN109993052B (en) | Scale-adaptive target tracking method and system under complex scene | |
CN106952286A | Target segmentation method under dynamic background based on motion saliency map and optical flow vector analysis | |
CN107527370B (en) | Target tracking method based on camshift | |
CN110544269A | Siamese network infrared target tracking method based on feature pyramid | |
Cvejic et al. | The effect of pixel-level fusion on object tracking in multi-sensor surveillance video | |
Hidayatullah et al. | CAMSHIFT improvement on multi-hue and multi-object tracking | |
CN113592911A (en) | Apparent enhanced depth target tracking method | |
CN107194949B | Interactive video segmentation method and system based on block matching and enhanced OneCut | |
CN107609571A | Adaptive target tracking method based on LARK features | |
CN113449658A (en) | Night video sequence significance detection method based on spatial domain, frequency domain and time domain | |
CN108491857B (en) | Multi-camera target matching method with overlapped vision fields | |
Hui | RETRACTED ARTICLE: Motion video tracking technology in sports training based on Mean-Shift algorithm | |
CN102223545B (en) | Rapid multi-view video color correction method | |
CN108021857A | Building detection method based on depth recovery from unmanned aerial vehicle image sequences | |
Chen et al. | Visual depth guided image rain streaks removal via sparse coding | |
Zhang et al. | Spatiotemporal saliency detection based on maximum consistency superpixels merging for video analysis | |
CN108171651B (en) | Image alignment method based on multi-model geometric fitting and layered homography transformation | |
Sun et al. | Research on cloud computing modeling based on fusion difference method and self-adaptive threshold segmentation | |
CN109064444A | Track slab defect detection method based on saliency analysis | |
Xu et al. | Moving target detection and tracking in FLIR image sequences based on thermal target modeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||