CN108846384A - Multi-task collaborative recognition method and system fusing video perception - Google Patents
- Publication number
- CN108846384A (application CN201810744934.4A)
- Authority
- CN
- China
- Prior art keywords
- feature
- task
- video
- collaboration
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides a multi-task collaborative recognition method and system fusing video perception, belonging to the technical field of multi-source heterogeneous video data processing and recognition. Drawing on biological perception mechanisms, it studies a shared semantic description based on the collaborative features of multi-source heterogeneous video data and obtains a generic feature description of that data. Using context-aware computing theory, it establishes a feature-association learning and task-prediction mechanism for task collaboration, realizing a context-aware task interaction prediction mechanism. Combining long-term dependencies, it proposes a context-collaborative visual multi-task deep recognition model with long-term memory, addressing the poor generalization, low robustness, and high computational complexity of video multi-task recognition. The invention proposes an intelligent, generalizable, mobile method for describing video commonality together with a multi-task deep collaborative recognition model, which can promote the development of intelligent information push, personalized control services, and related applications of multi-source heterogeneous video data in smart cities.
Description
Technical field
The present invention relates to the technical field of multi-source heterogeneous video data processing and recognition, and in particular to a multi-task collaborative recognition method and system fusing video perception.
Background art
Supported by the development of big data, cloud computing, and intelligent terminals, artificial intelligence built on deep neural networks is entering a new era of comprehensive development. Facing the urgent need to store and process massive data at ultra-high speed, with mobility and generality, special-purpose artificial intelligence based on single-modality, single-task models has become an important bottleneck restricting the field's development.
Traditional single-task recognition cannot meet the generality requirements of the artificial-intelligence era. Taking the most representative tasks in smart-city construction as an example, face video recognition, human behavior recognition, and vehicle classification are often required simultaneously, while video capture cameras vary widely in type and specification, so the video data is massive and multi-source. Regular, isomorphic video feature description methods and efficient collaborative recognition mechanisms are needed to accurately recognize targets, scenes, behaviors, and abnormal events. A visual recognition mechanism oriented to multi-task deep collaboration can therefore lay an important theoretical foundation for future intelligent information push and personalized control services.
So-called multi-task deep collaborative recognition for multi-source video perception extracts the generic features of multi-source heterogeneous video data based on biological perception mechanisms, performs feature-association learning and task prediction using context-aware theory, and builds a deep collaborative recognition network with long-term memory, realizing context-level multi-task collaborative perception and recognition. For example, in a video clip of "Xiao Ming greets me in a restaurant", multiple visual tasks are recognized simultaneously: the scene (restaurant), the target (Xiao Ming), the behavior (greeting), and the expression (smiling). This avoids building a separate recognition model for each task and outputting results independently, which wastes computing resources, struggles with massive data, and falls short of practical requirements.
In current visual recognition technology, feature extraction based on deep learning shows excellent performance on single recognition tasks such as scene, target, behavior, and expression recognition. However, for massive multi-source data, as user scale grows, scenes change, and time passes, new problems arise:
Generalization bottleneck: data distributions differ significantly across task modalities. Small-scale data tasks are prone to overfitting, while massive-data tasks face high training and labeling costs, so balanced generalization cannot be achieved across tasks, and model performance degrades markedly under changing environments or scenes;
Efficiency bottleneck: deep network models are complex, with huge parameter counts. Although generative adversarial networks, capsule networks, and similar approaches have made good attempts at reducing data requirements and resource consumption, it remains difficult to distribute resources rapidly, evenly, and efficiently across different recognition tasks and heterogeneous network structures;
Migration bottleneck: when scenes change, models cannot make associative predictions from historical data, nor establish selective long-term memory and forgetting mechanisms for context-adaptive transfer learning. For example, when Xiao Ming walks from a classroom to a restaurant, the recognition of the target behavior should migrate from studying to eating.
Therefore, deep collaborative recognition modeling with task-collaborative interaction prediction and context-level collaboration in visual multi-task learning has become a key unsolved problem in current intelligent visual perception.
With the continued growth of communication bandwidth and transmission speed, video data volume increases exponentially, putting immense pressure on limited computing and storage resources. Traditional knowledge-based single-task processing assigns different data to different tasks and finds a separate feature description method for each, leading to low resource utilization and poor data descriptiveness. In conventional visual perception, a learning model generally no longer changes once established; in the intelligent-perception era, however, as scenes evolve over time and space, the original model becomes sub-optimal. Potential association relationships exist between different modalities within a scene. For scenes changing with space and time, a task-modality association-mining mechanism under context-aware perception can learn feature associations under scene change, achieving dynamic prediction and self-labeling of collaborative tasks on massive data and keeping the learned recognition model adaptively optimal. Single-modality recognition cannot perform effective long-term memory reasoning over learned features, nor recognize multiple visual tasks simultaneously as scenes change dynamically. When a sudden task arises or a new target to be recognized appears, the model cannot handle it in an interoperable way, nor balance network lightweighting against high utilization.
Summary of the invention
The purpose of the present invention is to provide a generalizable collaborative feature description mechanism for multi-source heterogeneous data, so that video information obtained from different data sources effectively complements itself, evolving the traditional single-source fixed model into a multi-source elastic model that removes data redundancy while retaining shared semantic information; and to establish a multi-task recognition method and system fusing video perception with a high dynamic admission rate, high resource utilization, and a low network consumption rate, thereby solving the technical problems described in the background art above.
To achieve the above goals, the present invention adopts the following technical solutions. In one aspect, the present invention provides a multi-task recognition method fusing video perception, comprising the following steps:
Step S110: combining biological perception mechanisms, extract the generic features of multi-source heterogeneous video data through a shared semantic mechanism based on the collaborative features of that data;
Step S120: using context-aware computing theory, establish a feature-association learning mechanism for task collaboration, continuously learn with the generic features of the multi-source heterogeneous video data as prior knowledge, and generate a context-aware task interaction prediction model;
Step S130: for long-duration input video streams, establish a long-term-dependency generative memory model from the context-aware task interaction prediction model, build an autonomous semi-supervised continual recognition system based on deep cooperative dynamics, and realize multi-task recognition.
Further, in step S110, the shared semantic mechanism of the collaborative features of the multi-source heterogeneous video data comprises: establishing a three-level feature collaboration mechanism, namely primitive collaboration based on the multi-source heterogeneous video data, dictionary collaboration based on time synchronization, and topic collaboration based on semantic similarity; and, combined with the attributes of the multi-source heterogeneous video data, building a feature collaboration model for that data and determining regular, dimensionally consistent shared semantic association relationships; wherein,
Primitive collaboration based on multi-source heterogeneous video data: train video image primitives with independent component analysis, match each primitive in turn with Gabor functions, estimate the scale and orientation corresponding to each video image primitive, extract the primitive features of the video image, and realize efficient space-time coding of the video image's internal structure;
Dictionary collaboration based on time synchronization: use locally linear coding, with local distance as the regularization term on the sparse basis functions, to compute the best response signal of the original dictionary; use that best response signal to compute a feasible dictionary search direction and complete one dictionary update; establish a coded concept stream for each data channel as the reference semantic coding of complex events; dynamically time-align the newly input low-level feature stream with the reference semantic coding; generate a time translation function; and realize dictionary semantic alignment;
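The dynamic time alignment between a newly input feature stream and the reference semantic coding can be illustrated with classic dynamic time warping. A minimal sketch, with 1-D toy sequences standing in for the coded concept streams (the patent does not specify the alignment algorithm, so DTW here is an assumption):

```python
import numpy as np

def dtw_align(a, b):
    """Dynamic time warping cost between two 1-D concept-code streams."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping moves
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

ref   = [0, 1, 2, 3]   # reference semantic coding
fast  = [0, 2, 3]      # the same pattern, played faster
other = [3, 3, 3, 0]   # an unrelated stream
print(dtw_align(ref, fast) < dtw_align(ref, other))  # -> True
```

The warp path recovered from `D` plays the role of the "time translation function" mentioned above: it says which reference frame each input frame aligns to.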
Topic collaboration based on semantic similarity: use latent semantic analysis to build a co-occurrence matrix between the dictionary and the video image primitive features; embody the semantic concept corresponding to each topic with hidden nodes; realize the description of the mapping relationships among vocabulary, topic nodes, and scenes through probabilistic inference; and, taking the video's conditional probability under the topic distribution as the category similarity, compute the likelihood function of the probability between true vocabulary and scene and the prediction probability.
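The co-occurrence-plus-latent-topic idea can be illustrated with a truncated SVD over a toy word-by-primitive co-occurrence matrix. This is a stand-in for the probabilistic latent semantic analysis named above, and the matrix values are invented purely for illustration:

```python
import numpy as np

# Toy co-occurrence matrix: rows = dictionary words, columns = primitive features.
# The first two words co-occur with one group of primitives, the last two with another.
C = np.array([[4., 5., 0., 1.],
              [5., 4., 1., 0.],
              [0., 1., 5., 4.],
              [1., 0., 4., 5.]])

# Truncated SVD gives each word a coordinate in a latent "topic" space.
U, s, Vt = np.linalg.svd(C)
topics = U[:, :2] * s[:2]   # 2-D latent embedding of the four words

def similarity(i, j):
    a, b = topics[i], topics[j]
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Words from the same scene should land closer in topic space.
print(similarity(0, 1) > similarity(0, 2))  # -> True
```

The hidden SVD dimensions play the role of the topic nodes; in the probabilistic version, topic-conditional word distributions replace the singular vectors.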
Further, establishing the multi-source heterogeneous video data feature collaboration model and determining the regular shared semantic association relationships comprises: assuming C classes of heterogeneous channel features, denote the feature matrix of the n<sub>i</sub> training samples of the i-th channel as X<sub>i</sub>, with data-noise part E and rotation factor Γ, and establish an optimization function under orthogonality constraints, in which λ denotes the sharing-matrix coefficient, <sup>T</sup> denotes matrix transposition, Y<sub>i</sub> denotes the i-th feature class label, F denotes the Frobenius norm, Θ<sub>i</sub><sup>T</sup> the transpose of the projection matrix Θ<sub>i</sub>, α, β, μ<sub>1</sub> and μ<sub>2</sub> multiplier factors, and rank(X) the rank of the feature matrix X;
obtain the low-dimensional manifold subspaces {Θ<sub>i</sub>} of the general semantic features, the semantic sharing matrix W<sub>0</sub> under a unified framework, and the specific feature module matrices {W<sub>i</sub>}, using least squares to solve the joint optimum of the prediction loss function R<sub>1</sub>(W<sub>0</sub>, {W<sub>i</sub>}, {Θ<sub>i</sub>}), the reconstruction loss function R<sub>2</sub>({Θ<sub>i</sub>}), and the regularization function R<sub>3</sub>(W<sub>0</sub>, {W<sub>i</sub>});
and project newly input multi-source heterogeneous video data into the feature subspace to extract high-level, dimensionally consistent generic feature descriptions, establishing the shared semantic association relationships.
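The joint optimization itself is not reproduced in this text, but its intent, that heterogeneous channels sharing a semantic subspace recover consistent weights on the shared coordinates, can be illustrated with plain ridge regression as a stand-in for the regularized joint solver (the data and the solver choice here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "channels" observe the same 3-D latent signal plus private noise dimensions.
shared = rng.normal(size=(100, 3))
X1 = np.hstack([shared, rng.normal(size=(100, 2))])
X2 = np.hstack([shared, rng.normal(size=(100, 4))])
y = shared @ np.array([1.0, -2.0, 0.5])   # labels depend only on the shared part

def ridge(X, y, lam=1e-2):
    """Closed-form ridge regression (stand-in for the regularized joint solver)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

W1, W2 = ridge(X1, y), ridge(X2, y)
# Both channels recover essentially the same weights on the shared coordinates,
# which is the "shared semantic" part that W0 captures in the patent's model.
print(np.allclose(W1[:3], W2[:3], atol=0.1))  # -> True
```

The patent's formulation additionally couples the channels through rank and orthogonality constraints; this sketch only shows why a shared component is recoverable per channel at all.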
Further, in step S120, using context-aware computing theory, establishing the feature-association learning mechanism of task collaboration and generating the context-aware task interaction prediction model comprises:
constructing a mapping function between visual labels and generic features under a low-rank constraint, realizing feature-label collaboration; introducing the nuclear norm to model label correlation and feature correlation, while introducing a graph regularization term to retain the intrinsic structure of the existing data, realizing label prediction for unlabeled features; and establishing an unconstrained function of the form min<sub>g</sub> Q(Y, g(X)) + λΦ(g) + γΛ(g),
where g is the mapping function of feature-association learning; the data fidelity term Q(·) minimizes the loss between the given labels and the task prediction obtained by the function g, fitting the given labels; Φ(g) and Λ(g) are regularization terms based on prior assumptions; and λ and γ are regularization parameters;
The context-aware task interaction prediction model comprises an interactive environment, an environment model, and a loss model.
The environment model learns the dynamic environmental change of the input features; the loss model estimates the environment model's loss and predicts the visual region, target, and task that will need attention next.
The interactive environment defines a state space composed of the generic feature descriptions at times t and t-1; the current state at time t specifies the recognition task a<sub>t</sub> and predicts the task state to be recognized at the next time t+1.
The environment model takes the historical information h ∈ H, a generic-feature history mapping function ξ: H → X, and a ground-truth-label history mapping function η: H → Y, and learns the environment-model mapping ξ(h) → η(h). Denote the environment model ω, with ω(ξ(h)) ∈ Y; at each task prediction, a loss model L<sub>wm</sub>(ω(ξ(h)), η(h)) is introduced. Task prediction involves H = {h = (s<sub>t-k</sub>, a<sub>t-k</sub>, ···, s<sub>t</sub>, a<sub>t</sub>, s<sub>t+1</sub>)}, with ξ(h) = (s<sub>t-k</sub>, a<sub>t-k</sub>, ···, s<sub>t</sub>, a<sub>t</sub>) and η(h) = s<sub>t+1</sub>. An inverse-dynamics prediction mechanism and a softmax cross-entropy loss predict the future state; a neural network model ω<sub>φ</sub> based on stochastic gradient descent encodes the states into a low-dimensional latent space with shared weights to complete visual-attention region extraction and state prediction.
The loss model, given the state s<sub>t</sub> and the suggested next task, predicts the probability distribution over the environment model's R<sub>l</sub> tasks; a softmax cross-entropy loss function encodes the state of the next task as the penalty term.
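The softmax cross-entropy loss used by both the environment model and the loss model has a standard form. A minimal sketch with a hypothetical three-task setup (the task names and logit values are invented for illustration):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

def cross_entropy(logits, target):
    """Softmax cross-entropy between predicted next-task logits and the true task id."""
    return -np.log(softmax(logits)[target])

# Hypothetical 3-task setup: e.g. scene, face, and action recognition.
logits_good = np.array([0.1, 3.0, 0.2])   # model confident the next task is task 1
logits_bad  = np.array([3.0, 0.1, 0.2])   # model confident in the wrong task
true_task = 1
print(cross_entropy(logits_good, true_task) < cross_entropy(logits_bad, true_task))  # -> True
```

Minimizing this loss over task-prediction histories is what drives the environment model toward the correct next-task distribution.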
Further, in step S130, establishing, for long-duration input video streams, the long-term-dependency generative memory model from the context-aware task interaction prediction model comprises:
enhancing the temporal generative model with an external memory system, so that effective information of the generic feature descriptions is stored from the early part of the sequence, and building a sustainable generative memory model over the stored information. Specifically,
the generative memory model comprises the collaborative generic feature description set e<sub>≤T</sub> = {e<sub>1</sub>, e<sub>2</sub>, ···, e<sub>T</sub>} and the task-collaboration latent variable set z<sub>≤T</sub> = {z<sub>1</sub>, z<sub>2</sub>, ···, z<sub>T</sub>}. A translation mapping h<sub>t</sub> = f<sub>h</sub>(h<sub>t-1</sub>, e<sub>t</sub>, z<sub>t</sub>) updates the deterministic hidden state variable h<sub>t</sub> at each time point; a prior mapping function f<sub>z</sub>(h<sub>t-1</sub>) describes the nonlinear dependence between past observations and latent variables and provides the latent distribution parameters; a nonlinear observation mapping function f<sub>e</sub>(z<sub>t</sub>, h<sub>t-1</sub>) provides the likelihood function dependent on the latent variable and the state. The temporal variational autoencoder is modified with external memory, generating a memory context Ψ<sub>t</sub> at each time point. The prior and posterior are, respectively:
prior: p<sub>θ</sub>(z<sub>t</sub> | z<sub>&lt;t</sub>, e<sub>&lt;t</sub>) = N(z<sub>t</sub> | f<sub>z</sub><sup>μ</sup>(Ψ<sub>t-1</sub>), f<sub>z</sub><sup>σ</sup>(Ψ<sub>t-1</sub>))
posterior: q<sub>φ</sub>(z<sub>t</sub> | z<sub>&lt;t</sub>, e<sub>≤t</sub>) = N(z<sub>t</sub> | f<sub>q</sub><sup>μ</sup>(Ψ<sub>t-1</sub>, e<sub>t</sub>), f<sub>q</sub><sup>σ</sup>(Ψ<sub>t-1</sub>, e<sub>t</sub>))
where f<sub>z</sub><sup>μ</sup> and f<sub>z</sub><sup>σ</sup> are the translation mapping functions for the latent variable z's parameters μ and σ, and f<sub>q</sub><sup>μ</sup> and f<sub>q</sub><sup>σ</sup> the corresponding mappings of the posterior q. The prior is a diagonal Gaussian distribution of the memory context through the prior mapping f<sub>z</sub>, and the diagonal Gaussian approximate posterior depends, through the posterior mapping function f<sub>q</sub>, on the associated memory context Ψ<sub>t-1</sub> and the current observation e<sub>t</sub>.
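The diagonal-Gaussian prior and posterior above can be sketched with toy linear maps standing in for the translation functions f<sub>z</sub> and f<sub>q</sub>. The weights, dimensions, and the exponential parameterization of σ here are illustrative assumptions, not the patent's parameterization:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # latent dimension (illustrative)

# Toy stand-ins for the translation maps f_z and f_q: fixed random linear layers.
Wz_mu, Wz_sig = rng.normal(size=(D, D)), rng.normal(size=(D, D))
Wq = rng.normal(size=(2 * D, 2 * D))

def prior(psi_prev):
    """p(z_t | Psi_{t-1}): diagonal Gaussian from the memory context alone."""
    mu = Wz_mu @ psi_prev
    sigma = np.exp(0.5 * (Wz_sig @ psi_prev))   # exp keeps sigma positive
    return mu, sigma

def posterior(psi_prev, e_t):
    """q(z_t | Psi_{t-1}, e_t): also conditions on the current observation e_t."""
    h = Wq @ np.concatenate([psi_prev, e_t])
    return h[:D], np.exp(0.5 * h[D:])

def sample(mu, sigma):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    return mu + sigma * rng.normal(size=mu.shape)

psi, e = rng.normal(size=D), rng.normal(size=D)
mu_p, sig_p = prior(psi)           # prior depends only on the memory context
z = sample(*posterior(psi, e))     # posterior also sees the observation
print(z.shape)  # -> (4,)
```

In training, the KL divergence between this posterior and prior is the regularizer of the variational objective; the sketch only shows the conditioning structure of the two distributions.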
Further, establishing the autonomous semi-supervised continual recognition system based on deep cooperative dynamics and realizing multi-task recognition comprises:
a deep collaborative recognition algorithm based on the generative memory model, which uses the evolutionary process of a collaboration potential-energy function to introduce the memory model into the dynamic process of co-evolution, reduces solving the prototype mode and the accompanying mode to a nonlinear optimization problem, and obtains the optimized contract-network weights.
A long short-term memory network f<sub>rnn</sub> propagates the state history h<sub>t</sub>; the external memory M<sub>t</sub> is generated from the latent variable of the previous moment and the external context information c<sub>t</sub>. The generative model's state update is (h<sub>t</sub>, M<sub>t</sub>) = f<sub>rnn</sub>(h<sub>t-1</sub>, M<sub>t-1</sub>, z<sub>t-1</sub>, c<sub>t</sub>).
To form task-recognition instructions retrieved from the memory M<sub>t</sub>, a set of key values is introduced. Cosine similarity compares each key k<sub>t</sub><sup>r</sup> with every row of the memory M<sub>t-1</sub>, generating task attention weights; the retrieved memory φ<sub>t</sub><sup>r</sup> is obtained as the attention-weighted sum over M<sub>t-1</sub>, realizing multi-task recognition. Here, k<sub>t</sub><sup>r</sup> is the r-th key-value function of the propagated state history, f<sub>att</sub> is the attention mechanism function producing the memory weight w<sub>t</sub><sup>r,i</sup> of the i-th row for the r-th key at time t, φ<sub>t</sub><sup>r</sup> is the result of the retrieval equation, ⊙ denotes element-wise multiplication, the retrieved memory yields a learned relational bias, and σ(·) is the sigmoid function, forming the gating mechanism that informs memory storage and retrieval. Finally, Ψ<sub>t</sub> = [φ<sub>t</sub><sup>1</sup>, φ<sub>t</sub><sup>2</sup>, ···, φ<sub>t</sub><sup>R</sup>, h<sub>t</sub>] is the output of the generative memory model.
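The cosine-similarity memory read described above, comparing a key against each row of M<sub>t-1</sub> and returning an attention-weighted sum, can be sketched as follows. Using a softmax over the raw cosine scores is an assumption; the patent does not specify the normalization:

```python
import numpy as np

def cosine(key, M):
    """Cosine similarity between a key vector and every row of memory M."""
    return (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def read_memory(key, M):
    """Attention-weighted read: phi = sum_i w_i * M[i], w = softmax(cosine scores)."""
    w = softmax(cosine(key, M))
    return w @ M, w

rng = np.random.default_rng(2)
M = rng.normal(size=(5, 8))               # memory M_{t-1}: 5 slots of width 8
key = M[3] + 0.01 * rng.normal(size=8)    # a key close to slot 3
phi, w = read_memory(key, M)
print(w.argmax())  # -> 3
```

Stacking R such reads φ<sub>t</sub><sup>1</sup>, ..., φ<sub>t</sub><sup>R</sup> with the state h<sub>t</sub> gives the memory context Ψ<sub>t</sub> that feeds the prior and posterior of the generative model.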
In another aspect, the present invention also provides a multi-task collaborative recognition system fusing video perception, comprising a generic feature extraction module, a collaborative feature learning module, and a deep collaborative recognition module.
The generic feature extraction module combines biological perception mechanisms and extracts the generic features of multi-source heterogeneous video data through the shared semantic mechanism based on the collaborative features of that data.
The collaborative feature learning module uses context-aware computing theory to establish the feature-association learning mechanism of task collaboration, continuously learns with the generic features of the multi-source heterogeneous video data as prior knowledge, and generates the context-aware task interaction prediction model.
The deep collaborative recognition module, for long-duration input video streams, establishes the long-term-dependency generative memory model from the context-aware task interaction prediction model, builds the autonomous semi-supervised continual recognition system based on deep cooperative dynamics, and realizes multi-task recognition.
Further, the generic feature extraction module comprises a primitive collaboration submodule, a dictionary collaboration submodule, and a topic collaboration submodule.
The primitive collaboration submodule trains video image primitives with independent component analysis, matches each primitive in turn with Gabor functions, estimates the scale and orientation corresponding to each video image primitive, extracts the primitive features of the video image, and realizes efficient space-time coding of the video image's internal structure.
The dictionary collaboration submodule uses locally linear coding, with local distance as the regularization term on the sparse basis functions, to compute the best response signal of the original dictionary; uses that best response signal to compute a feasible dictionary search direction and complete one dictionary update; establishes a coded concept stream for each data channel as the reference semantic coding of complex events; dynamically time-aligns the newly input low-level feature stream with the reference semantic coding; generates a time translation function; and realizes dictionary semantic alignment.
The topic collaboration submodule uses latent semantic analysis to build a co-occurrence matrix between the dictionary and the video image primitive features, embodies the semantic concept corresponding to each topic with hidden nodes, realizes the description of the mapping relationships among vocabulary, topic nodes, and scenes through probabilistic inference, and, taking the video's conditional probability under the topic distribution as the category similarity, computes the likelihood function of the probability between true vocabulary and scene and the prediction probability.
Further, the collaborative feature learning module comprises a feature-association learning submodule and a context-aware task interaction prediction submodule.
The feature-association learning submodule constructs a mapping function between visual labels and generic features under a low-rank constraint, realizing feature-label collaboration.
The context-aware task interaction prediction submodule, using the learned feature-association relationships together with the prior knowledge of visual perception, applies the task-collaboration processing mechanism based on the environment model and the loss function to dynamically and adaptively adjust the tasks to be recognized according to scene changes, completing visual-attention region perception and the dynamic adjustment of task-demand prediction.
Further, the deep collaborative recognition module comprises a long-term-dependency generative memory model submodule and a multi-task deep collaborative recognition submodule.
The long-term-dependency generative memory model submodule, for long-duration input video streams, establishes the long-term-dependency generative memory model from the context-aware task interaction prediction model.
The multi-task deep collaborative recognition submodule establishes the autonomous semi-supervised continual recognition system based on deep cooperative dynamics and realizes multi-task recognition.
Beneficial effects of the present invention: complete and effective extraction of multi-source heterogeneous video data information can be achieved; the time-synchronized dictionary collaboration mechanism reduces temporal uncertainty between visual semantics and improves the model's generalization to scene change; tasks to be recognized can be adjusted dynamically and adaptively according to scene changes, completing visual-attention region perception and the dynamic adjustment of task-demand prediction; an external-memory generative model built around long-range data dependencies enhances network learning performance, reduces model parameter and computation complexity with smaller data storage capacity, promptly extracts useful information, applies to different types of video sequences, and solves the problem that complex, long-range sequence data cannot be selectively memorized and forgotten; recognition features are selected autonomously, recognition of unlabeled data is improved, and the accuracy and robustness of multi-task recognition are continuously enhanced.
Additional aspects and advantages of the present invention will be set forth in part in the following description; they will become apparent from that description, or be learned through practice of the invention.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a block diagram of the multi-task recognition principle of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 2 is a block diagram of the generic feature extraction module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 3 is a block diagram of the collaborative feature learning module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 4 is a block diagram of the deep collaborative recognition module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 5 is a block diagram of the task prediction model of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of the externally-dependent generative memory model of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 7 is a frame structure diagram of the multi-source heterogeneous video data multi-task collaborative recognition verification platform according to an embodiment of the present invention.
Specific embodiments
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the drawings, where identical or similar reference labels throughout denote identical or similar elements, or modules with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the specification of the invention indicates the presence of the stated features, integers, steps, operations, elements, and/or modules, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, modules, and/or groups thereof.
It should be noted that, unless otherwise expressly specified or limited, terms such as "connected" and "fixed" in embodiments of the present invention are to be understood broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediary; or an internal communication between two elements or an interaction relationship between two elements, unless specifically limited otherwise. Those skilled in the art can understand the specific meanings of the above terms in embodiments of the present invention as the case may be.
Those skilled in the art will appreciate that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meanings as commonly understood by those of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless defined as here, will not be interpreted in an idealized or overly formal sense.
To facilitate understanding of the embodiments of the present invention, specific embodiments are further explained below with reference to the drawings; the embodiments do not constitute a limitation of the embodiments of the present invention.
Those of ordinary skill in the art should understand that the drawings are schematic diagrams of one embodiment, and the parts or devices in the drawings are not necessarily required to implement the present invention.
Embodiment one
As shown in Figure 1, Embodiment One of the present invention provides a multi-task collaborative recognition system that fuses video perception. The system includes a generic feature extraction module, a collaborative feature learning module, and a deep collaborative recognition module.

The generic feature extraction module combines biological perception mechanisms to study the shared semantic description of multi-source heterogeneous video data under feature collaboration, obtaining a generic feature description of the multi-source heterogeneous video data.

The collaborative feature learning module uses context-adaptive computing theory to establish task-cooperative feature association learning and task prediction mechanisms, realizing a context-aware task interaction prediction mechanism.

The deep collaborative recognition module combines long-term dependencies to propose a context-collaborative visual multi-task deep collaborative recognition model, realizing a multi-task deep collaborative recognition model with long-term memory and addressing the poor generalization, low robustness, and high computational complexity of video multi-task recognition.
As shown in Figure 2, in Embodiment One the generic feature extraction module includes a primitive collaboration submodule, a time-synchronized dictionary collaboration submodule, and a shared-semantic topic collaboration submodule. Specifically:

Primitive collaboration submodule: multi-source data processing must accurately detect and track the changes of targets and scenes in the spatio-temporal domain, and the visual cortex cells of biological perception exhibit sparsity. To combine the sparse differences of neural signal processing while achieving information completeness for multi-source heterogeneous data, the intrinsic characteristics of multi-source heterogeneous video must be exploited to study a primitive synergy mechanism with scale, translation, and rotation invariance, realizing complete and effective extraction of multi-source heterogeneous data information.

Time-synchronized dictionary collaboration submodule: multi-source heterogeneous data contain latent semantic information, and achieving time synchronization between low-level primitive features and high-level semantics is the primary problem in bridging the semantic gap. It is therefore necessary to combine latent semantic feature representation to establish a time-synchronized dictionary synergy mechanism, reducing the temporal uncertainty between vision and semantics.

Shared-semantic topic collaboration submodule: data from different platforms and modalities in social, information, and physical spaces carry rich natural and social attributes and differ in feature dimension and data distribution, yet the synchronously acquired multi-source video contains a potentially large amount of semantic association information. It is therefore necessary to study the topic and dictionary semantic relation mechanisms of the visual information of different modalities, propose a semantically similar topic collaboration feature analysis method, and establish a dimension-regularized shared-semantic-association generic feature description method.
As shown in Figure 3, in Embodiment One the collaborative feature learning module includes a feature association learning submodule under context-adaptive theory and a visual attention region extraction and task prediction submodule. Specifically:

Feature association learning submodule under context-adaptive theory: in visual perception recognition tasks with scene changes, there is high correlation among the features of different scenes, yet the correlation of the same feature with other features differs considerably across recognition tasks; likewise, there are correlations among the labels of the same feature across different tasks, and large differences among the correlations of different labels. Therefore, when constructing the mapping functions between labels and feature spaces for different tasks, the label and feature correlations must be modeled while preserving the intrinsic structure of the existing data (labeled and unlabeled), improving the model's generalization to scene changes.

Visual attention region extraction and task prediction submodule: after learning the associations between tasks, features, and labels, the visual attention regions must be confirmed and the tasks to be recognized predicted; for example, in a classroom, identity and expression recognition are the main recognition tasks, whereas in outdoor scenes target and behavior recognition are the main tasks. Therefore, using the learned feature association relations together with prior knowledge of visual perception, a task cooperation mechanism based on an environment model and a loss function is proposed, dynamically and adaptively adjusting the tasks to be recognized according to scene changes and completing the dynamic adjustment of visual attention region perception and task demand prediction.
As shown in Figure 4, in Embodiment One the deep collaborative recognition module includes a long-term-dependency generative memory model submodule and a multi-task deep collaborative recognition submodule. Specifically:

Long-term-dependency generative memory model submodule: for long-range, multi-sequence input feature streams, a learning mechanism without memory capacity must constantly label new input data and relearn the network model to perform the recognition task, which is an enormous waste of computation, storage, and human resources. Therefore, an external-memory generative model must be established in combination with long-range data dependencies, enhancing network learning performance, reducing model parameter computation complexity with a smaller data storage footprint, and extracting useful information on demand for different types of video sequences, solving the problem that complex long-range sequence data cannot be selectively memorized and forgotten.

Multi-task deep collaborative recognition submodule: for continuously input unlabeled feature streams, jointly optimal features with minimal intra-class distance and maximal inter-class distance must be learned accurately and efficiently for multi-task recognition; since class annotation information cannot be supplied manually for unlabeled data, losses in recognition performance are otherwise inevitable. Therefore, combining cooperative dynamics principles with long-term memory, a deep continual-learning multi-task recognition mechanism under context collaboration is established, realizing autonomous selection of recognition features, improving the recognition of unlabeled data, and continuously raising the accuracy and robustness of multi-task recognition.
Embodiment two
Embodiment Two of the present invention provides a multi-task recognition method for fused multi-source video perception data using the system described in Embodiment One. The method comprises the following steps:

First, combining biological perception mechanisms, the shared semantic description of multi-source heterogeneous video data under feature collaboration is studied to obtain a generic feature description of the multi-source heterogeneous video data.

Then, using context-adaptive computing theory, task-cooperative feature association learning and task prediction mechanisms are established, realizing a context-aware task interaction prediction mechanism.

Finally, combining long-term dependencies, a context-collaborative visual multi-task deep collaborative recognition model is proposed, realizing a multi-task deep collaborative recognition model with long-term memory and addressing the poor generalization, low robustness, and high computational complexity of video multi-task recognition.
In recent years, the research achievements in feature description, interaction prediction, and collaborative recognition obtained in visual perception multi-task recognition, especially in multi-task scenarios such as face recognition, expression analysis, and behavior understanding, have advanced generic feature description methods for massive multi-source video data, task interaction prediction under context-adaptive perception, and deep collaborative recognition models with long-term memory under continuous video stream input, introducing frontier theories such as shared semantic association description, context-aware feature learning, and semi-supervised continual collaborative recognition, and improving the generalization robustness and long-term intelligence of multi-source visual multi-task collaborative recognition.

In Embodiment Two, addressing the large volume of video data, data are maximally compressed from the standpoint of visual perception mechanisms while the discriminative information of multi-source heterogeneous data is retained. Facing complex and changing scenes, a feature-correlated task interaction prediction mechanism is studied in combination with context-adaptive computing theory, improving the generalization of feature learning. For continuously input video streams, a semi-supervised continual deep collaborative recognition model is introduced to realize the dynamic multi-task recognition demand of temporal memory; a multi-source visual multi-task collaborative recognition verification platform is built to verify the generalization and robustness of the theoretical methods while continuously improving the performance of the proposed methods.
As shown in Figure 2, the generic feature extraction step requires establishing a three-level synergy mechanism of primitive collaboration, dictionary collaboration, and topic collaboration.

Biological perception theory holds that interaction among visual elements is interconnection behavior among visual cells; visual behavior is the processing of the visual cortex neural network and can be divided into a feature layer, a task layer, and a context layer. Therefore, to solve multi-task collaborative recognition of multi-source video information, feature collaboration must come first, i.e., extracting a generic feature description of multi-source heterogeneous data. Although visual perception data differ in source, structure, and storage format, they all contain visual information and semantic information. The key to feature collaboration is how to associate video images with semantics efficiently, realizing a generic feature description mechanism in which the "semantic similarity" of human cognition is consistent with the "visual similarity" of data processing, while respecting the completeness of visual perception primitives, the sparsity of visual cortex simple-cell signal responses, and the low-dimensional manifold of task scenes.
Multi-source heterogeneous video primitive collaboration aims to make the extracted features both satisfy the sparse differences of neural signals and effectively capture the various possible signals in natural scenes. Although traditional global or local image feature representations can locally handle the scale and rotation invariance of video data, they cannot cope with the individual appearance differences generated by targets themselves among similar objects; they are therefore only suitable for single-source, single-task processing mechanisms, cannot provide a complete and effective description space, and are powerless for higher visual perception tasks.
Primitive collaboration first obtains a group of sparse, independent filters for detecting the likelihood that features with different descriptive power appear in the video. Sparsity is usually evaluated with the 0-norm or 1-norm of the primitive coefficients; independence requires that the correlation among primitive vectors be as small as possible. Video image primitives are trained with independent component analysis, and each primitive is matched with a Gabor function to estimate its corresponding scale and orientation. To a certain extent, primitive collaboration reveals the neural processing of the primary visual cortex and realizes efficient spatio-temporal coding of the internal structure of natural video images.
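The Gabor-matching step described above, estimating each trained primitive's orientation and scale, can be sketched numerically. The following is an illustrative example, not the patent's implementation: it builds a synthetic Gabor primitive and recovers its orientation and wavelength from the peak of its 2-D spectrum (all function names and parameter values are assumptions).

```python
import numpy as np

def gabor(size, theta, wavelength, sigma):
    """A synthetic Gabor primitive: a sinusoid at angle theta under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

def estimate_orientation_scale(primitive):
    """Estimate a primitive's dominant orientation and wavelength from its FFT peak."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(primitive)))
    half = primitive.shape[0] // 2
    spectrum[half, half] = 0.0                 # suppress the DC bin
    iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    fy, fx = iy - half, ix - half              # peak frequency in bins
    theta = np.arctan2(fy, fx) % np.pi         # orientation is defined modulo pi
    wavelength = primitive.shape[0] / np.hypot(fx, fy)
    return theta, wavelength

primitive = gabor(size=31, theta=np.pi / 4, wavelength=8.0, sigma=6.0)
theta_hat, wavelength_hat = estimate_orientation_scale(primitive)
```

In a full pipeline the primitives would come from independent component analysis of video patches rather than being generated synthetically; the spectral-peak estimate would then assign each learned filter its approximate scale and direction.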
Dictionary collaboration assumes, based on the appearance consistency of similar targets, that latent semantics refer to local descriptions labeled by an unsupervised clustering process, using appearance similarity as the decision condition for target attributes. Dictionary collaboration is a typical latent semantic feature representation method based on the bag-of-words model. Using locality-constrained linear coding, with local distance as the regularizer of the sparse basis functions, the best response signal of the original dictionary is computed; this optimized signal is then used to compute a feasible dictionary search direction, completing one dictionary update and establishing one coding concept stream for each data channel. As the semantic coding of complex events, all newly input low-level feature streams undergo dynamic time alignment against the reference semantic coding, and a time-shift function is generated to realize dictionary semantic alignment. The method requires large differences between strongly responding dictionary atoms and the others, so that sampled video block representations can be efficiently distinguished, guaranteeing the consistency of similar video primitive sets.
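The locality-constrained coding step can be illustrated with the closed-form solver of the standard LLC scheme (Wang et al., 2010), which matches the description of using local distance to regularize the code; the atom counts and conditioning constant below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def llc_encode(x, dictionary, k=5, beta=1e-4):
    """Locality-constrained linear coding: encode x with its k nearest
    dictionary atoms under a sum-to-one constraint (analytic solution)."""
    dists = np.linalg.norm(dictionary - x, axis=1)
    idx = np.argsort(dists)[:k]          # k nearest atoms form the local basis
    z = dictionary[idx] - x              # shift atoms to the sample's origin
    c = z @ z.T                          # local covariance of the shifted basis
    c += np.eye(k) * beta * np.trace(c)  # conditioning term for stability
    w = np.linalg.solve(c, np.ones(k))
    w /= w.sum()                         # enforce sum-to-one (shift invariance)
    code = np.zeros(len(dictionary))
    code[idx] = w
    return code

rng = np.random.default_rng(0)
dictionary = rng.random((32, 8))         # 32 atoms of dimension 8
x = rng.random(8)
code = llc_encode(x, dictionary, k=5)
```

Each video block would be encoded this way against the current dictionary; the resulting responses then drive the dictionary-update and temporal-alignment steps described above.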
Topic collaboration defines latent semantic topic analysis over the dictionary descriptions and scene co-occurrences, mapping the different appearances under different context environments to certain latent low-dimensional topics and realizing the topic collaborative analysis process, which includes both many-to-many and many-to-one mappings from visual features to category labels. The present invention adopts latent semantic analysis: a co-occurrence matrix between the dictionary and the video images is constructed, the semantic concepts corresponding to topics are embodied by hidden nodes, the mapping relations among vocabulary, topic nodes, and scenes are realized by probabilistic inference, the conditional probability of video under the topic distribution is computed as the category similarity, and the likelihood function between the probability of true vocabulary-scene pairs and the predicted probability completes the construction of the projection matrix.
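The latent semantic analysis over the dictionary-image co-occurrence matrix can be sketched with a plain pLSA EM loop; the dimensions and counts below are toy values, not the patent's data.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Probabilistic latent semantic analysis via EM on a word-document
    co-occurrence matrix; returns P(word|topic) and P(topic|doc)."""
    rng = np.random.default_rng(seed)
    n_words, n_docs = counts.shape
    p_w_z = rng.random((n_words, n_topics)); p_w_z /= p_w_z.sum(0)
    p_z_d = rng.random((n_topics, n_docs)); p_z_d /= p_z_d.sum(0)
    for _ in range(n_iter):
        # E-step: posterior over topics for each (word, doc) pair
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]        # (W, Z, D)
        joint /= joint.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate conditionals from expected counts
        ec = counts[:, None, :] * joint                      # expected counts
        p_w_z = ec.sum(2); p_w_z /= p_w_z.sum(0, keepdims=True) + 1e-12
        p_z_d = ec.sum(0); p_z_d /= p_z_d.sum(0, keepdims=True) + 1e-12
    return p_w_z, p_z_d

# Toy co-occurrence: dictionary words 0-2 appear in scenes 0-1, words 3-5 in scenes 2-3.
counts = np.zeros((6, 4))
counts[:3, :2] = 5.0
counts[3:, 2:] = 5.0
p_w_z, p_z_d = plsa(counts, n_topics=2)
```

Here the hidden topic nodes play the role of the semantic concepts, and the learned conditionals give the vocabulary-topic-scene mapping used as the category similarity.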
In Embodiment Two, according to the semantic similarity between videos of different channels, in order to effectively quantify the shared semantic information of different dimensions across channels and overcome the influence of noise, occlusion, illumination, and the like on feature recognition, the most visually discriminative generic feature description is extracted, increasing intra-task inter-class distance and reducing intra-class distance, and a heterogeneous data feature collaboration model is established. Suppose there are C classes of heterogeneous channel features; each feature type X_i is denoted as the feature matrix of n_i training samples, the data noise part is E, and Γ is the rotation factor. The semantically shared heterogeneous feature collaboration model under multiple tasks aims to learn a projection matrix Θ_i for each X_i, projecting the heterogeneous features to an equal intrinsic dimensionality and reducing data redundancy. Under the orthogonality constraint, the optimization function is expressed as:

min over {Θ_i}, W_0, {W_i} of R_1(W_0, {W_i}, {Θ_i}) + R_2({Θ_i}) + R_3(W_0, {W_i}), subject to Θ_iᵀΘ_i = I (i = 1, ···, C)

The heterogeneous feature collaboration model aims to obtain the low-dimensional manifold subspaces {Θ_i} of generic semantic features, and, under a unified framework, the semantic sharing matrix W_0 and the feature-specific matrices {W_i}; least squares is used to solve for the joint optimal solution of the prediction loss function R_1(W_0, {W_i}, {Θ_i}), the reconstruction loss function R_2({Θ_i}), and the regularization function R_3(W_0, {W_i}). By projecting newly input data into the feature subspace, a high-level generic feature description of equal dimensionality is extracted, establishing shared semantic association relations.
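A minimal stand-in for learning the per-channel projections {Θ_i} is sketched below. It uses per-channel PCA as a surrogate for the full joint objective (the real model would also solve for W_0 and {W_i} under the prediction and regularization terms), so the function names, dimensions, and use of PCA are illustrative assumptions.

```python
import numpy as np

def channel_projections(channels, dim):
    """Learn one orthogonal projection per heterogeneous channel (via PCA,
    a simple surrogate for the reconstruction term R2), mapping every
    channel to a common intrinsic dimensionality."""
    thetas = []
    for x in channels:                   # x: (d_i, n_i) feature matrix
        xc = x - x.mean(axis=1, keepdims=True)
        u, _, _ = np.linalg.svd(xc, full_matrices=False)
        thetas.append(u[:, :dim])        # orthonormal columns: Theta_i^T Theta_i = I
    return thetas

# Three heterogeneous channels with different feature dimensions, 40 samples each.
channels = [np.random.default_rng(i).random((d, 40)) for i, d in enumerate([20, 35, 50])]
thetas = channel_projections(channels, dim=8)
projected = [t.T @ x for t, x in zip(thetas, channels)]   # all now 8 x 40
```

After projection the channels share one dimensionality, which is the precondition for learning the shared semantic matrix over all of them.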
In conclusion primitive collaboration belongs to feature extraction phases, it is therefore intended that acquisition to the greatest extent may be used in the embodiment of the present invention two
It can sparse complete characteristic response signal;Dictionary collaboration belongs to the feature coding stage, it is therefore intended that local video block feature into
Row unsupervised learning obtains the semantic dictionary with local holding capacity;Theme collaboration belongs to the Feature Semantics stage, it is therefore intended that
Its hiding Semantic mapping space is solved by probabilistic framework, and all kinds of similarity in analysis space is realized feature collaboration, established
Perceive the similar generic features describing framework of semanteme of identification mission environment.
The learning ability of human vision is realized through signal response and transmission among visual cells, and a specific visual task requires the mutual synergy of a large number of visual cells. Owing to the parallelism, hierarchy, and feedback among human visual cells, signal transmission also carries different collaborative meanings, and the difference in transmission modes of synergistic signals is the difficulty of its learning task. In visual multi-task perception recognition, scenes are complex and changeable, and collaborative recognition needs to intelligently predict the multiple visual tasks to be recognized; a task prediction mechanism based on generic feature association learning in a context-adaptive environment is therefore proposed, realizing task-layer co-evolution and solving the problem of harmonious connection between visual perception and the natural environment.
As shown in Figure 3, in Embodiment Two the collaborative feature learning includes label collaboration for feature association learning and task prediction under context-adaptive evolution. Specifically:

Label collaboration for feature association learning: in practical application scenes, features are strongly correlated with one another, yet the correlations of the same feature differ greatly across recognition tasks; annotation information closely related to a recognition task is strongly correlated, yet the correlations among different annotations again differ greatly. Synergetics holds that the label assignment process of a sample depends not only on the features of the sample itself but, more importantly, on the spatio-temporal data distribution relations provided by samples in its neighborhood. The same target may correspond to multiple labels, and feature learning must also account for the ambiguity of samples in high-dimensional space, which are often located on the decision boundaries of multiple task classifications.
Feature-label collaboration is realized by constructing the mapping function between visual labels and generic features under a low-rank constraint. The nuclear norm is introduced to model label correlation and feature correlation, while a graph regularizer is introduced to retain the intrinsic structure of the existing data (labeled and unlabeled), realizing label prediction for unlabeled features, improving the generalization ability of the feature learning model, overcoming semantic ambiguity, keeping the model as simple as possible, and reducing computational complexity. The following unconstrained function can thus be established:

min over g of Q(Y, g(X)) + λΦ(g) + γΛ(g)

where g is the mapping function of feature association learning; the data fidelity term Q(·) evaluates and minimizes the loss between the given labels and the task prediction results obtained through g, fitting the given labels; Φ(g) and Λ(g) are regularization terms based on prior assumptions, the former maintaining the low-rank constraint in practical applications and the latter retaining the intrinsic structure; and λ and γ are regularization parameters balancing the contributions of the three terms in the model.
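Low-rank penalties such as the Φ(g) term above are commonly handled with singular value thresholding, the proximal operator of the nuclear norm. A sketch follows; the threshold value and matrix shapes are illustrative.

```python
import numpy as np

def svt(matrix, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm,
    shrinking every singular value by tau and zeroing those below it."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u @ np.diag(np.maximum(s - tau, 0.0)) @ vt

rng = np.random.default_rng(0)
w = rng.random((6, 2)) @ rng.random((2, 8)) + 0.01 * rng.random((6, 8))  # nearly rank 2
w_low = svt(w, tau=0.1)   # the small noise singular values are removed
```

Inside a proximal-gradient loop, a gradient step on the fidelity term Q would alternate with this thresholding step to keep the learned mapping low-rank.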
As shown in Figure 5, in Embodiment Two a task prediction method under context-adaptive evolution is proposed, using a context-adaptive computing theory close to the human cognitive process to form a task evolution prediction model mechanism based on association feature learning. The model consists of an interactive environment, an environment model, and a loss model: the environment model learns the dynamic environmental changes of the input features, and the loss model estimates the environment model's loss and predicts the visual regions, targets, and tasks to be recognized that will require attention in the future. Specifically:

Interactive environment: the state space is composed of the generic feature descriptions at time t and time t-1; the current state at time t specifies the recognition task a_t, and the task state to be recognized at the next time t+1 is predicted.

Environment model: given the history information H, the generic-feature-to-history mapping function ξ: H → X and the ground-truth-label-to-history mapping function η: H → Y are used to learn the environment model mapping ξ(h) → η(h). Denote the environment model by ω, with ω(ξ(h)) ∈ Y. At each task prediction, a loss model L_wm(ω(ξ(h)), η(h)) is introduced. Task prediction involves H = {h = (s_{t-k}, a_{t-k}, ···, s_t, a_t, s_{t+1})}, ξ(h) = (s_{t-k}, a_{t-k}, ···, s_t, a_t) and η(h) = s_{t+1}; an inverse dynamics prediction mechanism and a softmax cross-entropy loss predict the future state, and a stochastic-gradient-descent-based neural network model ω_φ encodes all states into a low-dimensional latent space with shared weights to complete visual attention region extraction and state prediction.

Loss model: given the state s_t and a suggested next task, the loss model predicts the probability distribution over candidate task occurrences for the environment model, and a softmax cross-entropy loss function encoding the state of the suggested task serves as the penalty term, improving the accuracy of task prediction.
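The softmax cross-entropy penalty used by the loss model can be written in a few lines; the logits below are illustrative task scores, not values from the patent.

```python
import numpy as np

def softmax_cross_entropy(logits, target):
    """Cross-entropy between the softmax of predicted task scores and the
    observed task/state index, as the loss model's penalty term."""
    z = logits - logits.max()                 # stabilize the exponentials
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

logits = np.array([2.0, 0.5, -1.0])           # scores for three candidate tasks
loss_likely = softmax_cross_entropy(logits, target=0)    # suggestion matches the score
loss_unlikely = softmax_cross_entropy(logits, target=2)  # suggestion contradicts it
```

A suggested task that the environment model already scores highly incurs a small penalty, while an unlikely suggestion incurs a large one, which is what steers the task prediction.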
By combining label association learning with context-adaptive task prediction evolution, a progressive task prediction mechanism is established; through the layer-by-layer storage of environment-aware transfer knowledge, a valuable loss model is built, the excitation and inhibition elements of recognition tasks in co-evolution are defined, and the recognition tasks currently to be handled are decided, solving the problem of transferring knowledge from simulated environments to real environments and improving the generalization and stability of feature association learning and task prediction.

Associated feature learning is used to realize automatic high-level semantic annotation; according to context-adaptive perception theory and in combination with complex and changeable application environments, challenging region-of-interest extraction and multi-task prediction mechanisms under context-adaptive collaboration are proposed. On this basis, according to the combined constraints of context-adaptive perceptibility, low-rank restriction, attention regionality, and task relevance, optimal association learning over the generic feature description is realized, improving the model's generalization to massive data and multiple tasks. Through prior assumptions, posterior inference, and association optimization design, the theoretical research of the relevant schemes is completed, and the simulation verification of the new schemes is further completed with tools such as algorithm simulation platforms.
As shown in Figure 4, Embodiment Two organically combines biological neural networks with synergetic pattern recognition: visual perception is used to give targets an effective semantic generic feature description, the structural information of the target scene is considered for task prediction, context-layer collaborative analysis of visual tasks is realized, simultaneous learning of the prototype pattern (the task to be recognized) and the accompanying pattern (the single-task recognition result) is achieved, pattern correlation is effectively reduced, and a reduced deep collaborative recognition method is proposed.

Synergetics describes the distribution of target state evolution as a thermodynamic potential function and holds that the signal self-organization process of the human brain memory system is exactly the human associative memory process. In general, the long-range dependencies of a continuously input video stream, observed over past time intervals, allow the recognizable elements of a long time series to be separated from the unrecognizable ones; the uncertainty of the unrecognizable elements is labeled, and rapid recognition can aid the prediction of new future elements. This research uses an external memory system to enhance the temporal generative model, storing memorized feature-description effective information from the early stage of a sequence and efficiently establishing a sustainable generative memory model over the stored information.
The generative memory model comprises the generic feature description set of feature collaboration e_{≤T} = {e_1, e_2, ···, e_T} and the task-collaborative latent variable set z_{≤T} = {z_1, z_2, ···, z_T}. A transition mapping h_t = f_h(h_{t-1}, e_t, z_t) corrects the deterministic hidden state variable h_t at each time point; the prior mapping function f_z(h_{t-1}) describes the nonlinear dependence between past observations and latent variables and provides the latent-variable distribution parameters; and the nonlinear observation mapping function f_e(z_t, h_{t-1}) provides the likelihood function depending on the latent variable and the state. This research modifies the temporal variational autoencoder with external memory, generating a memory text Ψ_t at each time point, whose prior and posterior probabilities are expressed as follows:

Prior: p_θ(z_t | z_{<t}, e_{<t}) = N(z_t | f_z^μ(Ψ_{t-1}), f_z^σ(Ψ_{t-1}))

Posterior: q_φ(z_t | z_{<t}, e_{≤t}) = N(z_t | f_q^μ(Ψ_{t-1}, e_t), f_q^σ(Ψ_{t-1}, e_t))

where the prior is a diagonal Gaussian distribution function of the memory text depending on the prior mapping f_z, and the diagonal Gaussian approximate posterior depends on the memory text Ψ_{t-1} and the current observation e_t associated through the posterior mapping function f_q.
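The diagonal-Gaussian parameterization of the prior and posterior can be sketched as follows, with linear maps standing in for f_z^μ and f_z^σ and a softplus keeping the standard deviation positive; all shapes and weights are illustrative assumptions.

```python
import numpy as np

def diag_gaussian_params(context, w_mu, w_sigma):
    """Map a memory context Psi to the mean and (positive) std of a diagonal
    Gaussian, as the prior N(z_t | f_z_mu(Psi), f_z_sigma(Psi)) requires."""
    mu = context @ w_mu
    sigma = np.log1p(np.exp(context @ w_sigma))   # softplus keeps std positive
    return mu, sigma

def sample_latent(mu, sigma, rng):
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + sigma * rng.standard_normal(mu.shape)

rng = np.random.default_rng(0)
psi = rng.random(16)                               # toy memory context Psi_{t-1}
w_mu, w_sigma = rng.random((16, 4)), rng.random((16, 4))
mu, sigma = diag_gaussian_params(psi, w_mu, w_sigma)
z = sample_latent(mu, sigma, rng)
```

The posterior differs only in that its input concatenates the memory context with the current observation e_t before the two maps are applied.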
As shown in Figure 6, a stochastic computation graph is used as the processing procedure of the memory temporal generative model. In order to give the structure greater versatility and flexibility for different perception tasks, a memory and controller architecture with high-level semantics is introduced to stably store information for future extraction and to perform the corresponding computations to extract usable information on demand.
Deep collaborative recognition improves the collaborative prototype-pattern modification method: from the perspective of simultaneous learning of the prototype pattern and the accompanying pattern, a deep collaborative recognition algorithm based on the generative memory model is proposed using the evolution of the synergetic potential function; the memory model is directly introduced into the dynamic process of co-evolution, and solving the prototype and accompanying patterns is reduced to solving a nonlinear optimization problem, obtaining better contracted network weights. A long short-term memory network f_rnn advances the state history h_t, and the external memory M_t is generated using the latent variable from the previous moment and the external text information c_t; the generative model is as follows:

State update: (h_t, M_t) = f_rnn(h_{t-1}, M_{t-1}, z_{t-1}, c_t)

To form task recognition instructions derived from the memory M_t, the network generates a set of key values; cosine similarity is used to compare each key with every row of the memory M_{t-1}, generating a set of task weights, and the retrieved memory φ_t^r is obtained as the sum of M_{t-1} weighted by the attention weights, realizing a dynamic, sustainable multi-task recognition mechanism.
Key value: k̂_t^r = f_k^r(h_t), r = 1, ···, R

Task weighting: a_t^r(i) = softmax_i(cos(k̂_t^r, M_{t-1}(i)))

Retrieval memory: φ_t^r = Σ_i a_t^r(i) · M_{t-1}(i)

Recognition generation: e_t ~ f_e(z_t, Ψ_t)
where k̂ is the retrieval bias value learned from the memory, σ(·) is the sigmoid function, the external memory M_t stores the latent variables z_t, and the controller forms the recognition mechanism Ψ_t = [φ_t^1, φ_t^2, ···, φ_t^R, h_t] informed by memory storage and retrieval. This is the output of the generative memory model; for visual multi-task collaborative recognition with unknown task content and number, it realizes unsupervised adaptive recognition of continuously input video streams. For new recognition tasks, the hidden-layer states of previously trained models are retained during training and combined with the task cooperation hierarchy; based on the reward bias of each hidden layer in the earlier feature-collaboration network, a context-collaborative deep collaborative recognition mechanism is realized, giving the model prior knowledge of long-term dependencies, forming a complete policy for the recognition tasks, and improving the robustness of recognition.
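The cosine-similarity read described above follows the standard content-based addressing pattern; the sketch below is written under that assumption, with the memory size and key construction chosen for illustration only.

```python
import numpy as np

def retrieve(key, memory):
    """Content-based read: compare a key against every memory row by cosine
    similarity, turn the scores into softmax attention weights, and return
    the attention-weighted sum of rows (the retrieved memory phi_t^r)."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sims = memory @ key / norms                     # cosine similarity per row
    w = np.exp(sims - sims.max()); w /= w.sum()     # softmax attention weights
    return w @ memory, w

rng = np.random.default_rng(1)
memory = rng.standard_normal((8, 16))               # 8 memory rows of dimension 16
key = memory[3] + 0.05 * rng.standard_normal(16)    # a noisy copy of row 3
read, weights = retrieve(key, memory)
```

A key resembling a stored row concentrates the attention on that row, so the controller can recall the matching task evidence from M_{t-1} without any explicit index.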
As shown in Figure 7, Embodiment Two of the present invention provides a multi-task collaborative recognition verification platform that fuses video perception.

With the continuous input of large-scale multi-source heterogeneous video data and the constant growth of context-aware recognition tasks, large amounts of data storage and computing resources are needed. Using distributed, multi-node, multi-GPU intelligent cooperative processing mechanisms from high-performance computing, a multi-source visual multi-task collaborative recognition verification platform is built. Its purpose is to conduct research on multi-source-data multi-visual-task collaborative recognition and to assess the theoretical research results involved in the platform; it is intended to provide researchers with an extensible framework for realizing and testing visual collaborative recognition models, to provide a basic test environment for models together with system performance analysis methods and indices for the related data, and to provide AI developers with tools integrated with the fundamental research. In future smart city construction, it can serve as a valuable research verification platform for the further research and development of intelligent information push, personalized control services, and the like for multi-source heterogeneous data.

Combined with the above intelligent verification demonstration platform, the output of the multi-task collaborative recognition results collected from visual perception data is realized, providing a standard platform for subsequent in-depth research and functionalization. The test method considers features such as the efficiency, dynamics, and intelligence of visual perception multi-task collaborative analysis and, in combination with software engineering design specifications, an easily extensible verification demonstration system is designed using object-oriented programming methodology.
As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present invention.
The above are only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that can easily be conceived by a person skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A multitask recognition method fusing video perception, characterized by comprising the following steps:
Step S110: combining biological perception mechanisms, extract the generic features of multi-source heterogeneous video data based on a shared semantic mechanism of multi-source heterogeneous video data feature collaboration;
Step S120: using context-aware computing theory, establish a task-collaboration feature association learning mechanism, continuously learn from the generic features of the multi-source heterogeneous video data as prior knowledge, and generate a context-aware task interaction prediction model;
Step S130: for a long-duration input video stream, establish a long-term-dependency generation memory model in combination with the context-aware task interaction prediction model, and establish a deep autonomous semi-supervised continuous recognition system based on collaborative dynamics to realize multitask recognition.
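The three steps of claim 1 can be sketched as a minimal pipeline. This is an illustrative sketch only: the random projection standing in for the shared semantic subspace, the softmax task scorer, and the running-average memory are placeholder assumptions, not the patent's actual models.

```python
import numpy as np

def extract_generic_features(frames):
    """Step S110 (sketch): flatten heterogeneous frames and map them through a
    common linear projection standing in for the shared semantic subspace."""
    X = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    X = X - X.mean(axis=0)                       # center the data
    rng = np.random.default_rng(0)
    P = rng.standard_normal((X.shape[1], 8))     # placeholder projection
    return X @ P

def predict_tasks(features, prior=None):
    """Step S120 (sketch): score candidate tasks from features; the prior
    knowledge biases the scores toward previously learned associations."""
    scores = features.mean(axis=1, keepdims=True) + np.arange(3)
    if prior is not None:
        scores = scores + prior
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)      # softmax over tasks

def recognize(stream):
    """Step S130 (sketch): run the stream through both stages, keeping a
    running memory of past predictions as crude long-term context."""
    memory = None
    for frames in stream:
        feats = extract_generic_features(frames)
        probs = predict_tasks(feats, prior=memory)
        memory = probs.mean(axis=0)              # carry context forward
    return memory
```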
2. The multitask recognition method fusing video perception according to claim 1, characterized in that in step S110, the shared semantic mechanism of multi-source heterogeneous video data feature collaboration comprises:
establishing a three-level feature collaboration mechanism consisting of primitive collaboration based on multi-source heterogeneous video data, dictionary collaboration based on time synchronization, and topic collaboration based on semantic similarity; combining the attributes of the multi-source heterogeneous video data to establish a multi-source heterogeneous video data feature collaboration model; and determining dimension-regularized shared semantic association relationships; wherein
the primitive collaboration based on multi-source heterogeneous video data comprises: training video image primitives using independent component analysis, matching the video image primitives one by one using Gabor functions, estimating the scale and direction corresponding to each video image primitive, extracting the primitive features of the video images, and realizing efficient space-time-domain coding of the internal structure of the video images;
the dictionary collaboration based on time synchronization comprises: applying locally linear coding with local distance as the regularization term of the sparse basis functions, computing the best response signal of the original dictionary, using the best response signal to compute a feasible dictionary search direction, and completing one dictionary update; establishing one coded concept stream for each data channel as the reference semantic coding of complex events; performing dynamic time alignment between newly input low-level feature streams and the reference semantic coding; and generating a time translation function to realize dictionary semantic alignment;
the topic collaboration based on semantic similarity comprises: using latent semantic analysis to construct a co-occurrence matrix between the dictionary and the video image primitive features, embodying the semantic concept corresponding to each topic with hidden nodes, realizing the description of the mapping relationships among vocabulary, topic nodes, and scenes by probabilistic inference, and computing the video conditional probability under the topic distribution as the category-specific similarity, i.e. the likelihood function of the true probability and the predicted probability between concept vocabulary and scenes.
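The primitive-collaboration step (primitives matched against Gabor functions to estimate scale and direction) can be illustrated with a small Gabor filter bank. The bank's scales and orientations, the fixed wavelength, and the raw inner-product response are illustrative assumptions, not the patent's actual parameters.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength):
    """Real part of a Gabor filter at one scale (sigma) and orientation
    (theta); wavelength sets the stripe period."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate the grid
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))  # Gaussian envelope
    return g * np.cos(2 * np.pi * xr / wavelength)

def match_primitive(patch, sigmas=(1.0, 2.0),
                    thetas=(0.0, np.pi/4, np.pi/2, 3*np.pi/4)):
    """Return the (sigma, theta) whose Gabor filter responds most strongly
    to the patch, i.e. the estimated scale and direction of the primitive."""
    best, best_resp = None, -np.inf
    for s in sigmas:
        for t in thetas:
            k = gabor_kernel(patch.shape[0], s, t, wavelength=4.0)
            resp = abs(np.sum(patch * k))        # inner-product response
            if resp > best_resp:
                best, best_resp = (s, t), resp
    return best
```

A patch that is itself a Gabor pattern should be matched by the filter with the same scale and orientation (a matched-filter effect).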
3. The multitask recognition method fusing video perception according to claim 2, characterized in that establishing the multi-source heterogeneous video data feature collaboration model and determining the dimension-regularized shared semantic association relationships comprises:
assuming there are C classes of heterogeneous channel features, where for each class i (i = 1, …, C) a feature matrix of n_i training samples is given, the data noise part is E, and Γ is a rotation factor, establishing the optimization function under an orthogonality constraint:
where λ denotes the sharing-matrix coefficient, the superscript T denotes matrix transposition, Y_i denotes the label of the i-th feature class, F denotes the Frobenius norm, Θ_i^T denotes the transpose of the projection matrix Θ_i, α, β, μ_1 and μ_2 are multiplier factors, and rank(X) is the rank of the feature matrix X;
obtaining the low-dimensional manifold subspaces {Θ_i} of the generic semantic features, the semantic sharing matrix W_0 under a unified framework, and the feature-specific module matrices {W_i}; using the least-squares method to solve for the joint optimal solution of the prediction loss function R_1(W_0, {W_i}, {Θ_i}), the reconstruction loss function R_2({Θ_i}), and the regularization function R_3(W_0, {W_i});
projecting newly input multi-source heterogeneous video data into the feature space to extract a dimension-regularized high-level generic feature description, thereby establishing the shared semantic association relationships.
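The projection of heterogeneous channels into low-dimensional subspaces {Θ_i} and the least-squares fit of a shared matrix W_0 can be sketched roughly as follows. The SVD-derived subspace and the single ridge term standing in for the regularization function are simplifying assumptions; the patent's full objective also carries reconstruction, rank, and orthogonality terms not reproduced here.

```python
import numpy as np

def fit_shared_subspace(channels, labels, dim=4, lam=0.1):
    """Project each channel X_i into a low-dimensional subspace Theta_i
    (leading right singular vectors) and fit one shared weight matrix W0
    on the concatenated projections by regularized least squares."""
    thetas, projected = [], []
    for X in channels:
        # SVD of the centered channel gives its manifold basis
        _, _, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
        theta = vt[:dim].T                       # shape: (features, dim)
        thetas.append(theta)
        projected.append(X @ theta)
    Z = np.hstack(projected)                     # common feature description
    Y = np.asarray(labels, dtype=float)
    # ridge-regularized least squares: (Z'Z + lam*I) W0 = Z'Y
    w0 = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ Y)
    return thetas, w0
```

New data from each channel would then be projected through its Θ_i and scored with W_0, mirroring the projection step in the claim.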
4. The multitask recognition method fusing video perception according to claim 3, characterized in that in step S120, using context-aware computing theory to establish the task-collaboration feature association learning mechanism and generate the context-aware task interaction prediction model comprises:
constructing the mapping function between visual labels and generic features under a low-rank constraint to realize feature-label collaboration; introducing kernel functions to model label correlation and feature correlation, while introducing a graph regularization term to retain the intrinsic structure of the existing data, realizing label prediction for unlabeled features; establishing the following unconstrained function:
where g is the mapping function of feature association learning, the data fidelity term Q(·) minimizes the loss function evaluating the error between the given labels and the task prediction results obtained by g, the fitting term fits the given labels, Φ(g) and Λ(g) are regularization terms based on prior assumptions, and λ and γ are regularization parameters;
the context-aware task interaction prediction model comprises an interactive environment, an environment model, and a loss model;
the environment model is used to learn the dynamic environmental changes of the input features, and the loss model is used to estimate the environment model's loss and to predict the visual regions, targets, and tasks that will need attention and recognition in the future;
the interactive environment comprises: defining the state space as composed of the generic feature descriptions at time t and time t−1, the current state at time t specifying the recognition task a_t, and predicting the task state to be recognized at the next time t+1;
the environment model comprises: given the historical information, learning the environment model mapping ξ(h) → η(h) from the generic-feature history mapping function ξ: H → X and the ground-truth-label history mapping function η: H → Y; denoting the environment model as ω with ω(ξ(h)) ∈ Y, and introducing the loss model L_wm(ω(ξ(h)), η(h)) at each task prediction; task prediction involves H = {h = (s_{t−k}, a_{t−k}, …, s_t, a_t, s_{t+1})}, ξ(h) = (s_{t−k}, a_{t−k}, …, s_t, a_t), and η(h) = s_{t+1}; an inverse-dynamics prediction mechanism and a softmax cross-entropy loss predict future states, and a neural network model ω_φ trained by stochastic gradient descent encodes all states, with shared weights, into a low-dimensional latent space to complete visual-attention region extraction and state prediction;
the loss model comprises: given the state s_t and a suggested next task, predicting for the environment model R_l the probability distribution over which task occurs, with a softmax cross-entropy loss function encoding the state of the next task as the penalty term.
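The loss model's softmax cross-entropy penalty over next-task predictions can be sketched as below. The logit vector is assumed to come from the environment model; here it is simply an input array.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def loss_model(task_logits, observed_next_task):
    """Softmax cross-entropy penalty: how surprising the observed next
    task is under the model's predicted task distribution."""
    p = softmax(task_logits)
    return -np.log(p[observed_next_task])

def predict_next_task(task_logits):
    """The task the environment model considers most likely next."""
    return int(np.argmax(softmax(task_logits)))
```

A task the model rates as likely incurs a small penalty; an unlikely one incurs a large penalty, which is what drives the dynamic adjustment of attention.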
5. The multitask recognition method fusing video perception according to claim 4, characterized in that in step S130, for a long-duration input video stream, establishing the long-term-dependency generation memory model in combination with the context-aware task interaction prediction model comprises:
enhancing the sequential generation model with an external memory system, storing the effective information of the generic feature descriptions from the early stage of the sequence, and establishing a sustainable generation memory model over the stored information; specifically,
the generation memory model comprises the feature-collaboration generic feature description set e_{≤T} = {e_1, e_2, …, e_T} and the task-collaboration latent variable set z_{≤T} = {z_1, z_2, …, z_T}; the translation mapping h_t = f_h(h_{t−1}, e_t, z_t) corrects the deterministic hidden state variable h_t at each time point; the prior mapping function f_z(h_{t−1}) describes the nonlinear dependence between past observations and the latent variables and provides the latent-variable distribution parameters; the nonlinear observation mapping function f_e(z_t, h_{t−1}) provides the likelihood function depending on the latent variables and the state; an external memory model corrects the sequential variational autoencoder, generating a memory text Ψ_t at each time point; the prior and posterior are respectively expressed as follows:
Prior: p_θ(z_t | z_{<t}, e_{<t}) = N(z_t | f_z^μ(Ψ_{t−1}), f_z^σ(Ψ_{t−1}))
Posterior: q_φ(z_t | z_{<t}, e_{≤t}) = N(z_t | f_q^μ(Ψ_{t−1}, e_t), f_q^σ(Ψ_{t−1}, e_t))
where f_z^μ is the translation mapping function for the mean μ of latent variable z, f_z^σ is the translation mapping function for its standard deviation σ, f_q^μ is the translation mapping function for the mean μ of the posterior q, and f_q^σ is the translation mapping function for its standard deviation σ; the prior is a diagonal Gaussian distribution over the memory text through the prior mapping f_z, and the diagonal-Gaussian approximate posterior depends on the memory text Ψ_{t−1} associated through the posterior mapping function f_q and on the current observation e_t.
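The structure above, a diagonal Gaussian prior computed from the memory context and a diagonal Gaussian posterior that also conditions on the current observation, can be sketched with placeholder mappings. The tanh/exp functions below merely stand in for the learned mappings f_z and f_q; the KL term is the standard posterior-vs-prior penalty of a variational objective.

```python
import numpy as np

def diag_gauss_kl(mu_q, sig_q, mu_p, sig_p):
    """KL( N(mu_q, diag sig_q^2) || N(mu_p, diag sig_p^2) ), the
    posterior-vs-prior term of the variational objective."""
    return np.sum(np.log(sig_p / sig_q)
                  + (sig_q**2 + (mu_q - mu_p)**2) / (2 * sig_p**2) - 0.5)

def latent_step(memory, obs, rng):
    """One time step: the prior depends only on the memory context; the
    posterior also conditions on the current observation. Sample z by
    reparameterization (placeholder mappings throughout)."""
    mu_p = np.tanh(memory)                        # stand-in for f_z^mu
    sig_p = np.exp(-np.abs(memory)) + 0.1         # stand-in for f_z^sigma
    mu_q = np.tanh(memory + obs)                  # stand-in for f_q^mu
    sig_q = 0.5 * np.ones_like(mu_q)              # stand-in for f_q^sigma
    z = mu_q + sig_q * rng.standard_normal(mu_q.shape)   # reparameterize
    return z, diag_gauss_kl(mu_q, sig_q, mu_p, sig_p)
```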
6. The multitask recognition method fusing video perception according to claim 5, characterized in that establishing the deep autonomous semi-supervised continuous recognition system based on collaborative dynamics to realize multitask recognition comprises:
in a deep collaborative recognizer based on the generation memory model, using the evolution process of a collaboration potential-energy function, introducing the memory model into the dynamic process of co-evolution, reducing the solving of the prototype mode and its companion mode to a nonlinear optimization problem, and obtaining the optimized contraction-network weights;
a long short-term memory network f_rnn is used to advance the state history h_t, and the external memory M_t is generated from the hidden variable of the previous moment and the external text information c_t; the generation model is as follows:
State update: (h_t, M_t) = f_rnn(h_{t−1}, M_{t−1}, z_{t−1}, c_t)
to form a task recognition instruction derived from the memory M_t, a set of key values is introduced; cosine similarity is used to compare the key value with each row of the memory M_{t−1}, generating task attention weights; the retrieved memory is obtained as the attention-weighted sum of the rows of M_{t−1}, realizing multitask recognition; wherein:
Key value:
Task weighting:
Retrieval memory:
Recognition generation:
where the key value function of the r-th item is used to advance the state history, f_att is the attention mechanism function, the memory weight of the i-th point of the r-th item at time t weights the retrieval, the retrieval-memory equation yields the retrieved result, ⊙ denotes element-wise multiplication, the associated bias value is learned from the retrieved memory, and σ(·) is the sigmoid function; together these form the expression mechanism informing memory storage and retrieval, and the result serves as the output of the generation memory model.
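The cosine-similarity retrieval over memory rows with attention weights can be sketched as follows. The softmax normalization of the similarities is an assumption about how the task attention weights are formed; the patent's exact weighting formula is not reproduced in the published text.

```python
import numpy as np

def cosine_sim(key, memory):
    """Cosine similarity between the key and each row of the memory."""
    return (memory @ key) / (np.linalg.norm(memory, axis=1)
                             * np.linalg.norm(key) + 1e-8)

def retrieve(memory, key):
    """Compare the key against every memory row, turn the similarities
    into attention weights, and return the weighted sum of rows as the
    retrieved memory."""
    sims = cosine_sim(key, memory)
    e = np.exp(sims - sims.max())
    w = e / e.sum()                      # task attention weights
    return w @ memory, w                 # retrieved memory and its weights
```

A key that matches one memory row closely concentrates the attention weight on that row, so the retrieved memory approximates the stored content for that task.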
7. A multitask collaborative recognition system fusing video perception, characterized by comprising: a generic feature extraction module, a collaborative feature learning module, and a deep collaborative recognition module;
the generic feature extraction module is used to combine biological perception mechanisms and, based on the shared semantic mechanism of multi-source heterogeneous video data feature collaboration, extract the generic features of multi-source heterogeneous video data;
the collaborative feature learning module is used to establish the task-collaboration feature association learning mechanism using context-aware computing theory, continuously learn from the generic features of the multi-source heterogeneous video data as prior knowledge, and generate a context-aware task association prediction model;
the deep collaborative recognition module is used, for a long-duration input video stream, to establish a long-term-dependency generation memory model in combination with the context-aware task association prediction model, and to establish a deep autonomous semi-supervised continuous recognition system based on collaborative dynamics, realizing multitask recognition.
8. The multitask collaborative recognition system fusing video perception according to claim 7, characterized in that the generic feature extraction module comprises a primitive collaboration submodule, a dictionary collaboration submodule, and a topic collaboration submodule;
the primitive collaboration submodule is used to train video image primitives using independent component analysis, match the video image primitives one by one using Gabor functions, estimate the scale and direction corresponding to each video image primitive, extract the primitive features of the video images, and realize efficient space-time-domain coding of the internal structure of the video images;
the dictionary collaboration submodule is used to apply locally linear coding with local distance as the regularization term of the sparse basis functions, compute the best response signal of the original dictionary, use the best response signal to compute a feasible dictionary search direction, and complete one dictionary update; establish one coded concept stream for each data channel as the reference semantic coding of complex events; perform dynamic time alignment between newly input low-level feature streams and the reference semantic coding; and generate a time translation function to realize dictionary semantic alignment;
the topic collaboration submodule is used to construct, using latent semantic analysis, a co-occurrence matrix between the dictionary and the video image primitive features, embody the semantic concept corresponding to each topic with hidden nodes, realize the description of the mapping relationships among vocabulary, topic nodes, and scenes by probabilistic inference, and compute the video conditional probability under the topic distribution as the category-specific similarity, i.e. the likelihood function of the true probability and the predicted probability between concept vocabulary and scenes.
9. The multitask collaborative recognition system fusing video perception according to claim 8, characterized in that the collaborative feature learning module comprises a feature association learning submodule and a context-aware task interaction prediction submodule;
the feature association learning submodule is used to construct the mapping function between visual labels and generic features under a low-rank constraint, realizing feature-label collaboration;
the context-aware task interaction prediction submodule is used to combine the learned feature association relationships with the prior knowledge of visual perception and, through a task-collaboration processing mechanism based on the environment model and the loss function, dynamically and adaptively adjust the tasks to be recognized according to scene changes, completing the perception of visual-attention regions and the dynamic adjustment of task-requirement prediction.
10. The multitask collaborative recognition system fusing video perception according to claim 9, characterized in that the deep collaborative recognition module comprises a long-term-dependency generation memory model submodule and a multitask deep collaborative recognition submodule;
the long-term-dependency generation memory model submodule is used, for a long-duration input video stream, to establish the long-term-dependency generation memory model in combination with the context-aware task interaction prediction model;
the multitask deep collaborative recognition submodule is used to establish the deep autonomous semi-supervised continuous recognition system based on collaborative dynamics, realizing multitask recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810744934.4A CN108846384A (en) | 2018-07-09 | 2018-07-09 | Merge the multitask coordinated recognition methods and system of video-aware |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108846384A true CN108846384A (en) | 2018-11-20 |
Family
ID=64195944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810744934.4A Withdrawn CN108846384A (en) | 2018-07-09 | 2018-07-09 | Merge the multitask coordinated recognition methods and system of video-aware |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108846384A (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109376964A (en) * | 2018-12-10 | 2019-02-22 | 杭州世平信息科技有限公司 | A kind of criminal case charge prediction technique based on Memory Neural Networks |
CN109376963A (en) * | 2018-12-10 | 2019-02-22 | 杭州世平信息科技有限公司 | A kind of criminal case charge law article unified prediction neural network based |
CN109492059A (en) * | 2019-01-03 | 2019-03-19 | 北京理工大学 | A kind of multi-source heterogeneous data fusion and Modifying model process management and control method |
CN109687845A (en) * | 2018-12-25 | 2019-04-26 | 苏州大学 | A kind of sparse regularization multitask sef-adapting filter network of the cluster of robust |
CN109711411A (en) * | 2018-12-10 | 2019-05-03 | 浙江大学 | A kind of image segmentation and identification method based on capsule neuron |
CN109784399A (en) * | 2019-01-11 | 2019-05-21 | 中国人民解放军海军航空大学 | Based on the multi-source image target association method for improving dictionary learning |
CN109919177A (en) * | 2019-01-23 | 2019-06-21 | 西北工业大学 | Feature selection approach based on stratification depth network |
CN109933788A (en) * | 2019-02-14 | 2019-06-25 | 北京百度网讯科技有限公司 | Type determines method, apparatus, equipment and medium |
CN109977194A (en) * | 2019-03-20 | 2019-07-05 | 华南理工大学 | Text similarity computing method, system, equipment and medium based on unsupervised learning |
CN109992703A (en) * | 2019-01-28 | 2019-07-09 | 西安交通大学 | A kind of credibility evaluation method of the differentiation feature mining based on multi-task learning |
CN110020626A (en) * | 2019-04-09 | 2019-07-16 | 中通服公众信息产业股份有限公司 | A kind of multi-source heterogeneous data personal identification method based on attention mechanism |
CN110147711A (en) * | 2019-02-27 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video scene recognition methods, device, storage medium and electronic device |
CN110245267A (en) * | 2019-05-17 | 2019-09-17 | 天津大学 | Multi-user's video flowing deep learning is shared to calculate multiplexing method |
CN110309861A (en) * | 2019-06-10 | 2019-10-08 | 浙江大学 | A kind of multi-modal mankind's activity recognition methods based on generation confrontation network |
CN110378190A (en) * | 2019-04-23 | 2019-10-25 | 南京邮电大学 | Video content detection system and detection method based on topic identification |
CN110688916A (en) * | 2019-09-12 | 2020-01-14 | 武汉理工大学 | Video description method and device based on entity relationship extraction |
CN110928889A (en) * | 2019-10-23 | 2020-03-27 | 深圳市华讯方舟太赫兹科技有限公司 | Training model updating method, device and computer storage medium |
CN110956105A (en) * | 2019-11-20 | 2020-04-03 | 北京影谱科技股份有限公司 | Gesture recognition method based on semantic probability network |
CN111160443A (en) * | 2019-12-25 | 2020-05-15 | 浙江大学 | Activity and user identification method based on deep multitask learning |
CN111242318A (en) * | 2020-01-13 | 2020-06-05 | 拉扎斯网络科技(上海)有限公司 | Business model training method and device based on heterogeneous feature library |
CN111488840A (en) * | 2020-04-15 | 2020-08-04 | 桂林电子科技大学 | Human behavior classification method based on multi-task learning model |
CN112100256A (en) * | 2020-08-06 | 2020-12-18 | 北京航空航天大学 | Data-driven urban accurate depth image system and method |
CN112527993A (en) * | 2020-12-17 | 2021-03-19 | 浙江财经大学东方学院 | Cross-media hierarchical deep video question-answer reasoning framework |
CN112766470A (en) * | 2019-10-21 | 2021-05-07 | 地平线(上海)人工智能技术有限公司 | Feature data processing method, instruction sequence generation method, device and equipment |
CN113110517A (en) * | 2021-05-24 | 2021-07-13 | 郑州大学 | Multi-robot collaborative search method based on biological elicitation in unknown environment |
CN113128669A (en) * | 2021-04-08 | 2021-07-16 | 中国科学院计算技术研究所 | Neural network model for semi-supervised learning and semi-supervised learning method |
CN113220911A (en) * | 2021-05-25 | 2021-08-06 | 中国农业科学院农业信息研究所 | Agricultural multi-source heterogeneous data analysis and mining method and application thereof |
CN113268818A (en) * | 2021-07-19 | 2021-08-17 | 中国空气动力研究与发展中心计算空气动力研究所 | Pneumatic global optimization method based on topological mapping generation, storage medium and terminal |
CN113285721A (en) * | 2021-06-10 | 2021-08-20 | 北京邮电大学 | Reconstruction and prediction algorithm for sparse mobile sensing data |
CN113411765A (en) * | 2021-05-22 | 2021-09-17 | 西北工业大学 | Mobile intelligent terminal energy consumption optimization method based on multi-sensor cooperative sensing |
CN113438204A (en) * | 2021-05-06 | 2021-09-24 | 中国地质大学(武汉) | Multi-node cooperative identification response method based on block chain |
CN113505611A (en) * | 2021-07-09 | 2021-10-15 | 中国人民解放军战略支援部队信息工程大学 | Training method and system for obtaining better speech translation model in generation of confrontation |
CN113537355A (en) * | 2021-07-19 | 2021-10-22 | 金鹏电子信息机器有限公司 | Multi-element heterogeneous data semantic fusion method and system for security monitoring |
CN113780578A (en) * | 2021-09-08 | 2021-12-10 | 北京百度网讯科技有限公司 | Model training method and device, electronic equipment and readable storage medium |
CN113822048A (en) * | 2021-09-16 | 2021-12-21 | 电子科技大学 | Social media text denoising method based on space-time burst characteristics |
CN113949880A (en) * | 2021-09-02 | 2022-01-18 | 北京大学 | Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method |
CN114694177A (en) * | 2022-03-10 | 2022-07-01 | 电子科技大学 | Fine-grained character attribute identification method based on multi-scale features and attribute association mining |
CN114783022A (en) * | 2022-04-08 | 2022-07-22 | 马上消费金融股份有限公司 | Information processing method and device, computer equipment and storage medium |
CN114898319A (en) * | 2022-05-25 | 2022-08-12 | 山东大学 | Vehicle type recognition method and system based on multi-sensor decision-level information fusion |
CN115632684A (en) * | 2022-12-21 | 2023-01-20 | 香港中文大学(深圳) | Transmission strategy design method of perception and communication integrated system |
CN115985402A (en) * | 2023-03-20 | 2023-04-18 | 北京航空航天大学 | Cross-modal data migration method based on normalized flow theory |
CN116503029A (en) * | 2023-06-27 | 2023-07-28 | 北京中电科卫星导航系统有限公司 | Module data cooperative processing method and system for automatic driving |
CN117292274A (en) * | 2023-11-22 | 2023-12-26 | 成都信息工程大学 | Hyperspectral wet image classification method based on zero-order learning of deep semantic dictionary |
CN111815030B (en) * | 2020-06-11 | 2024-02-06 | 浙江工商大学 | Multi-target feature prediction method based on small amount of questionnaire survey data |
Cited By (67)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109376963B (en) * | 2018-12-10 | 2022-04-08 | 杭州世平信息科技有限公司 | Criminal case and criminal name and criminal law joint prediction method based on neural network |
CN109376963A (en) * | 2018-12-10 | 2019-02-22 | 杭州世平信息科技有限公司 | A kind of criminal case charge law article unified prediction neural network based |
CN109711411A (en) * | 2018-12-10 | 2019-05-03 | 浙江大学 | A kind of image segmentation and identification method based on capsule neuron |
CN109376964A (en) * | 2018-12-10 | 2019-02-22 | 杭州世平信息科技有限公司 | A kind of criminal case charge prediction technique based on Memory Neural Networks |
CN109376964B (en) * | 2018-12-10 | 2021-11-12 | 杭州世平信息科技有限公司 | Criminal case criminal name prediction method based on memory neural network |
CN109687845A (en) * | 2018-12-25 | 2019-04-26 | 苏州大学 | A kind of sparse regularization multitask sef-adapting filter network of the cluster of robust |
CN109492059A (en) * | 2019-01-03 | 2019-03-19 | 北京理工大学 | A kind of multi-source heterogeneous data fusion and Modifying model process management and control method |
CN109492059B (en) * | 2019-01-03 | 2020-10-27 | 北京理工大学 | Multi-source heterogeneous data fusion and model correction process control method |
CN109784399A (en) * | 2019-01-11 | 2019-05-21 | 中国人民解放军海军航空大学 | Based on the multi-source image target association method for improving dictionary learning |
CN109919177B (en) * | 2019-01-23 | 2022-03-29 | 西北工业大学 | Feature selection method based on hierarchical deep network |
CN109919177A (en) * | 2019-01-23 | 2019-06-21 | 西北工业大学 | Feature selection approach based on stratification depth network |
CN109992703A (en) * | 2019-01-28 | 2019-07-09 | 西安交通大学 | A kind of credibility evaluation method of the differentiation feature mining based on multi-task learning |
CN109992703B (en) * | 2019-01-28 | 2022-03-01 | 西安交通大学 | Reliability evaluation method for differentiated feature mining based on multi-task learning |
CN109933788B (en) * | 2019-02-14 | 2023-05-23 | 北京百度网讯科技有限公司 | Type determining method, device, equipment and medium |
CN109933788A (en) * | 2019-02-14 | 2019-06-25 | 北京百度网讯科技有限公司 | Type determines method, apparatus, equipment and medium |
CN110147711A (en) * | 2019-02-27 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video scene recognition methods, device, storage medium and electronic device |
CN110147711B (en) * | 2019-02-27 | 2023-11-14 | 腾讯科技(深圳)有限公司 | Video scene recognition method and device, storage medium and electronic device |
CN109977194A (en) * | 2019-03-20 | 2019-07-05 | 华南理工大学 | Text similarity computing method, system, equipment and medium based on unsupervised learning |
CN109977194B (en) * | 2019-03-20 | 2021-08-10 | 华南理工大学 | Text similarity calculation method, system, device and medium based on unsupervised learning |
CN110020626A (en) * | 2019-04-09 | 2019-07-16 | 中通服公众信息产业股份有限公司 | A kind of multi-source heterogeneous data personal identification method based on attention mechanism |
CN110378190B (en) * | 2019-04-23 | 2022-10-04 | 南京邮电大学 | Video content detection system and detection method based on topic identification |
CN110378190A (en) * | 2019-04-23 | 2019-10-25 | 南京邮电大学 | Video content detection system and detection method based on topic identification |
CN110245267B (en) * | 2019-05-17 | 2023-08-11 | 天津大学 | Multi-user video stream deep learning sharing calculation multiplexing method |
CN110245267A (en) * | 2019-05-17 | 2019-09-17 | 天津大学 | Multi-user's video flowing deep learning is shared to calculate multiplexing method |
CN110309861A (en) * | 2019-06-10 | 2019-10-08 | 浙江大学 | A kind of multi-modal mankind's activity recognition methods based on generation confrontation network |
CN110688916A (en) * | 2019-09-12 | 2020-01-14 | 武汉理工大学 | Video description method and device based on entity relationship extraction |
CN112766470B (en) * | 2019-10-21 | 2024-05-07 | 地平线(上海)人工智能技术有限公司 | Feature data processing method, instruction sequence generating method, device and equipment |
CN112766470A (en) * | 2019-10-21 | 2021-05-07 | 地平线(上海)人工智能技术有限公司 | Feature data processing method, instruction sequence generation method, device and equipment |
CN110928889A (en) * | 2019-10-23 | 2020-03-27 | 深圳市华讯方舟太赫兹科技有限公司 | Training model updating method, device and computer storage medium |
CN110956105A (en) * | 2019-11-20 | 2020-04-03 | 北京影谱科技股份有限公司 | Gesture recognition method based on semantic probability network |
CN111160443B (en) * | 2019-12-25 | 2023-05-23 | 浙江大学 | Activity and user identification method based on deep multitasking learning |
CN111160443A (en) * | 2019-12-25 | 2020-05-15 | 浙江大学 | Activity and user identification method based on deep multitask learning |
CN111242318B (en) * | 2020-01-13 | 2024-04-26 | 拉扎斯网络科技(上海)有限公司 | Service model training method and device based on heterogeneous feature library |
CN111242318A (en) * | 2020-01-13 | 2020-06-05 | 拉扎斯网络科技(上海)有限公司 | Business model training method and device based on heterogeneous feature library |
CN111488840A (en) * | 2020-04-15 | 2020-08-04 | 桂林电子科技大学 | Human behavior classification method based on multi-task learning model |
CN111815030B (en) * | 2020-06-11 | 2024-02-06 | 浙江工商大学 | Multi-target feature prediction method based on small amount of questionnaire survey data |
CN112100256B (en) * | 2020-08-06 | 2023-05-26 | 北京航空航天大学 | Data-driven urban precise depth portrait system and method |
CN112100256A (en) * | 2020-08-06 | 2020-12-18 | 北京航空航天大学 | Data-driven urban accurate depth image system and method |
CN112527993A (en) * | 2020-12-17 | 2021-03-19 | 浙江财经大学东方学院 | Cross-media hierarchical deep video question-answer reasoning framework |
CN113128669A (en) * | 2021-04-08 | 2021-07-16 | 中国科学院计算技术研究所 | Neural network model for semi-supervised learning and semi-supervised learning method |
CN113438204A (en) * | 2021-05-06 | 2021-09-24 | 中国地质大学(武汉) | Multi-node cooperative identification response method based on block chain |
CN113411765A (en) * | 2021-05-22 | 2021-09-17 | 西北工业大学 | Mobile intelligent terminal energy consumption optimization method based on multi-sensor cooperative sensing |
CN113110517A (en) * | 2021-05-24 | 2021-07-13 | 郑州大学 | Multi-robot collaborative search method based on biological elicitation in unknown environment |
CN113220911B (en) * | 2021-05-25 | 2024-02-02 | 中国农业科学院农业信息研究所 | Agricultural multi-source heterogeneous data analysis and mining method and application thereof |
CN113220911A (en) * | 2021-05-25 | 2021-08-06 | 中国农业科学院农业信息研究所 | Agricultural multi-source heterogeneous data analysis and mining method and application thereof |
CN113285721A (en) * | 2021-06-10 | 2021-08-20 | 北京邮电大学 | Reconstruction and prediction algorithm for sparse mobile sensing data |
CN113505611A (en) * | 2021-07-09 | 2021-10-15 | 中国人民解放军战略支援部队信息工程大学 | Training method and system for obtaining a better speech translation model through generative adversarial training |
CN113537355A (en) * | 2021-07-19 | 2021-10-22 | 金鹏电子信息机器有限公司 | Multi-element heterogeneous data semantic fusion method and system for security monitoring |
CN113268818A (en) * | 2021-07-19 | 2021-08-17 | 中国空气动力研究与发展中心计算空气动力研究所 | Pneumatic global optimization method based on topological mapping generation, storage medium and terminal |
CN113949880A (en) * | 2021-09-02 | 2022-01-18 | 北京大学 | Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method |
CN113780578B (en) * | 2021-09-08 | 2023-12-12 | 北京百度网讯科技有限公司 | Model training method, device, electronic equipment and readable storage medium |
CN113780578A (en) * | 2021-09-08 | 2021-12-10 | 北京百度网讯科技有限公司 | Model training method and device, electronic equipment and readable storage medium |
CN113822048B (en) * | 2021-09-16 | 2023-03-21 | 电子科技大学 | Social media text denoising method based on space-time burst characteristics |
CN113822048A (en) * | 2021-09-16 | 2021-12-21 | 电子科技大学 | Social media text denoising method based on space-time burst characteristics |
CN114694177A (en) * | 2022-03-10 | 2022-07-01 | 电子科技大学 | Fine-grained character attribute identification method based on multi-scale features and attribute association mining |
CN114694177B (en) * | 2022-03-10 | 2023-04-28 | 电子科技大学 | Fine-grained character attribute identification method based on multi-scale feature and attribute association mining |
CN114783022B (en) * | 2022-04-08 | 2023-07-21 | 马上消费金融股份有限公司 | Information processing method, device, computer equipment and storage medium |
CN114783022A (en) * | 2022-04-08 | 2022-07-22 | 马上消费金融股份有限公司 | Information processing method and device, computer equipment and storage medium |
CN114898319A (en) * | 2022-05-25 | 2022-08-12 | 山东大学 | Vehicle type recognition method and system based on multi-sensor decision-level information fusion |
CN114898319B (en) * | 2022-05-25 | 2024-04-02 | 山东大学 | Vehicle type recognition method and system based on multi-sensor decision level information fusion |
CN115632684A (en) * | 2022-12-21 | 2023-01-20 | 香港中文大学(深圳) | Transmission strategy design method for an integrated sensing and communication system |
CN115985402B (en) * | 2023-03-20 | 2023-09-19 | 北京航空航天大学 | Cross-modal data migration method based on normalized flow theory |
CN115985402A (en) * | 2023-03-20 | 2023-04-18 | 北京航空航天大学 | Cross-modal data migration method based on normalized flow theory |
CN116503029A (en) * | 2023-06-27 | 2023-07-28 | 北京中电科卫星导航系统有限公司 | Module data cooperative processing method and system for automatic driving |
CN116503029B (en) * | 2023-06-27 | 2023-09-05 | 北京中电科卫星导航系统有限公司 | Module data cooperative processing method and system for automatic driving |
CN117292274B (en) * | 2023-11-22 | 2024-01-30 | 成都信息工程大学 | Hyperspectral wetland image classification method based on zero-shot learning with a deep semantic dictionary |
CN117292274A (en) * | 2023-11-22 | 2023-12-26 | 成都信息工程大学 | Hyperspectral wetland image classification method based on zero-shot learning with a deep semantic dictionary |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108846384A (en) | Merge the multitask coordinated recognition methods and system of video-aware | |
Qin et al. | A dual-stage attention-based recurrent neural network for time series prediction | |
Kaymak et al. | A brief survey and an application of semantic image segmentation for autonomous driving | |
CN108804715A (en) | Merge multitask coordinated recognition methods and the system of audiovisual perception | |
CN109829541A (en) | Deep neural network incremental training method and system based on learning automaton | |
CN111507378A (en) | Method and apparatus for training image processing model | |
CN111860951A (en) | Rail transit passenger flow prediction method based on dynamic hypergraph convolutional network | |
CN116415654A (en) | Data processing method and related equipment | |
CN109102000A (en) | Image recognition method based on hierarchical feature extraction and multilayer spiking neural networks | |
Alshmrany | Adaptive learning style prediction in e-learning environment using levy flight distribution based CNN model | |
CN112417289B (en) | Information intelligent recommendation method based on deep clustering | |
Chen et al. | Binarized neural architecture search for efficient object recognition | |
Gupta et al. | Rv-gan: Recurrent gan for unconditional video generation | |
Qin et al. | [Retracted] Evaluation of College Students’ Ideological and Political Education Management Based on Wireless Network and Artificial Intelligence with Big Data Technology | |
Gao | Application of convolutional neural network in emotion recognition of ideological and political teachers in colleges and universities | |
CN113657272B (en) | Micro video classification method and system based on missing data completion | |
CN113553918B (en) | Machine ticket issuing character recognition method based on pulse active learning | |
Wu et al. | Short-term memory neural network-based cognitive computing in sports training complexity pattern recognition | |
CN113408721A (en) | Neural network structure searching method, apparatus, computer device and storage medium | |
CN117131933A (en) | Multi-mode knowledge graph establishing method and application | |
CN116737897A (en) | Intelligent building knowledge extraction model and method based on multiple modes | |
Ikram | A benchmark for evaluating Deep Learning based Image Analytics | |
Zhang et al. | A fast evolutionary knowledge transfer search for multiscale deep neural architecture | |
Su et al. | Soft regression of monocular depth using scale-semantic exchange network | |
CN112036546A (en) | Sequence processing method and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 2018-11-20 |