CN108846384A - Multi-task collaborative recognition method and system fusing video perception - Google Patents


Info

Publication number
CN108846384A
CN108846384A (application CN201810744934.4A)
Authority
CN
China
Prior art keywords
feature
task
video
collaboration
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810744934.4A
Other languages
Chinese (zh)
Inventor
明悦 (Ming Yue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201810744934.4A
Publication of CN108846384A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a multi-task collaborative recognition method and system fusing video perception, belonging to the technical field of multi-source heterogeneous video data processing and recognition. Combining biological perception mechanisms, it studies a shared semantic description based on the feature collaboration of multi-source heterogeneous video data and obtains a common feature description of that data. Using situated-computing theory, it establishes a task-collaborative feature-association learning and task prediction mechanism, realizing context-aware task-association prediction. Combining long-term dependencies, it proposes a context-collaborative visual multi-task deep collaborative recognition model, realizing a multi-task deep collaborative recognition model with long-term memory and solving the problems of poor generalization, low robustness, and high computational complexity in video multi-task recognition. The present invention proposes an intelligent, generalizable, and mobile common video description method together with a multi-task deep collaborative recognition model, which can promote the development of intelligent information push and personalized control services for multi-source heterogeneous video data in smart cities.

Description

Multi-task collaborative recognition method and system fusing video perception
Technical field
The present invention relates to the technical field of multi-source heterogeneous video data processing and recognition, and in particular to a multi-task collaborative recognition method and system fusing video perception.
Background art
Supported by the development of technologies such as big data, cloud computing, and intelligent terminals, artificial intelligence built on deep neural networks is entering a new era of all-round development. Facing the urgent need for ultra-high-speed, mobile, and generalizable storage and processing of massive data, special-purpose AI based on single-modality, single-task models has become an important bottleneck restricting the field's development.
Traditional single-task recognition cannot satisfy the generalization requirements of the artificial-intelligence era. Taking the most representative tasks involved in smart-city construction as examples (face video recognition, human behavior recognition, and vehicle classification), video capture cameras vary widely in type and specification, so the resulting video data is massive and multi-source. Regularized, isomorphic video feature description methods and efficient collaborative recognition mechanisms are therefore needed to accurately recognize targets, scenes, behaviors, and abnormal events. A visual recognition mechanism oriented toward multi-task deep collaboration can thus lay an important theoretical foundation for the realization of future intelligent information push and personalized control services.
The so-called multi-task deep collaborative recognition of multi-source video perception means: based on biological perception mechanisms, extract the common features of multi-source heterogeneous video data; perform feature-association learning and task prediction in combination with situated-computing theory; and build a deep collaborative recognition network with long-term memory, i.e., realize context-level multi-task collaborative perception and recognition. For example, in a video clip of "Xiao Ming greets me in the dining hall", the goal is to recognize multiple visual tasks simultaneously, i.e., the scene (dining hall), the target (Xiao Ming), the behavior (greeting), and the expression (smiling), rather than building a separate recognition model for each recognition task and outputting the results independently, which not only wastes computing resources but also struggles with massive data and falls short of practical requirements.
In current visual recognition technology, feature extraction methods based on deep learning show superior performance in tests on single recognition tasks such as scene, target, behavior, and expression. However, for massive multi-source data, new problems arise as the user scale grows, scenes change, and time passes:
Generalization bottleneck: data distributions differ significantly across task modalities; small-scale data tasks are prone to overfitting, while massive-data tasks face high training and labeling costs, so balanced generalization cannot be obtained across different tasks, and model generalization performance degrades markedly under changing environments or scenes;
Efficiency bottleneck: deep network models are complex with huge parameter counts; although generative adversarial networks, capsule networks, and the like have made good attempts at reducing data requirements and resource consumption, when facing different recognition tasks and heterogeneous network structures, rapid, balanced, and efficient resource allocation remains difficult to achieve;
Migration bottleneck: when scenes change, the model cannot make associative predictions from the historical information of the data, nor establish selective long-term memory and forgetting mechanisms to realize adaptive, context-migrating learning. For example, when Xiao Ming walks from the classroom to the dining hall, the target-behavior recognition task should migrate from studying to eating.
Therefore, deep collaborative recognition modeling with task-collaborative association-prediction capability and context-level collaboration in visual multi-task learning has become a key problem to be urgently solved in current intelligent visual perception.
With the continuing growth of communication bandwidth and transmission speed, the volume of video data grows exponentially, putting immense pressure on limited computing and storage resources. The traditional knowledge-specific single-task processing mechanism assigns different data to different tasks and finds a separate feature description method for each, causing low resource utilization and poor data descriptiveness. In conventional visual perception recognition tasks, the learning model generally no longer changes once established; but in the era of intelligent perception, with temporal and spatial changes across different scenes, the original model becomes non-optimal, and potential association relationships exist between different modalities within a scene. For scenes changing over space and time, feature associations should be learned under scene change, and the context-aware task-modality association-mining mechanism should achieve dynamic prediction and self-labeling of collaborative tasks over massive data, keeping the learned feature recognition model adaptively optimal. Single-modality recognition cannot perform effective long-term-memory reasoning over the learned features or recognize multiple visual tasks simultaneously according to the dynamic changes of the scene; when a sudden task or target is added, it cannot handle the interaction between the models to be recognized, or achieve the balance between a lightweight recognition network and high utilization.
Summary of the invention
The purpose of the present invention is to provide, for multi-source heterogeneous data, a multi-task recognition method and system fusing video perception that establishes a generalizable collaborative feature description mechanism, makes the video information obtained from different data sources effectively complementary, evolves the traditional single-source fixed model into a multi-source elastic model, effectively removes data redundancy while retaining shared semantic information, and achieves a high dynamic admission rate, high resource utilization, and a low network consumption rate, thereby solving the technical problems in the background art described above.
To achieve the above goals, the present invention adopts the following technical solutions:
In one aspect, the present invention provides a multi-task recognition method fusing video perception, comprising the following steps:
Step S110: combining biological perception mechanisms, extract the common features of multi-source heterogeneous video data based on a shared semantic mechanism of collaborative multi-source heterogeneous video data features;
Step S120: using situated-computing theory, establish the task-collaborative feature-association learning mechanism, continuously learn from the common features of the multi-source heterogeneous video data as prior knowledge, and generate the context-aware task-association prediction model;
Step S130: for long-duration input video streams, build the long-term-dependency generative memory model in combination with the context-aware task-association prediction model, and establish the autonomous, semi-supervised continuous recognition system based on collaborative dynamics to realize multi-task recognition.
Further, in step S110, the shared semantic mechanism of collaborative multi-source heterogeneous video data features includes:
establishing a three-level feature collaboration mechanism (primitive collaboration based on multi-source heterogeneous video data, dictionary collaboration based on time synchronization, and topic collaboration based on semantic similarity) and, in combination with the attributes of the multi-source heterogeneous video data, building a multi-source heterogeneous video data feature collaboration model that determines dimension-regularized shared semantic association relationships; wherein,
primitive collaboration based on multi-source heterogeneous video data: train video image primitives with independent component analysis, match each video image primitive with a Gabor function, estimate the scale and orientation corresponding to each primitive, and extract the primitive features of the video image, realizing efficient spatio-temporal coding of the video image's internal structure;
dictionary collaboration based on time synchronization: using locally linear coding with local distance as the regularization term of the sparse basis functions, compute the best response signal of the original dictionary, then use that signal to compute a feasible dictionary search direction and complete one dictionary update; establish a coded concept stream for each data channel as the reference semantic coding of complex events; dynamically time-align newly input low-level feature streams with the reference semantic coding, generating a time-translation function that realizes dictionary semantic alignment;
topic collaboration based on semantic similarity: using latent semantic analysis, construct the co-occurrence matrix between the dictionary and the video image primitive features, represent the semantic concept corresponding to each topic with a hidden node, realize the mapping description among vocabulary, topic nodes, and scenes through probabilistic inference, and, taking the video's conditional probability under the topic distribution as the category-specific similarity, compute the likelihood between the true vocabulary-scene probability and the predicted probability.
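As an illustrative sketch of the primitive-collaboration step above (not the patented implementation), the fragment below builds a 2-D Gabor function and recovers the scale and orientation of a synthetic image primitive by maximizing normalized correlation over a small parameter grid. The grid values and the fixed wavelength are assumptions made for the example.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength):
    """2-D Gabor function: a Gaussian envelope times a sinusoidal carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)     # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    g = envelope * carrier
    return g / np.linalg.norm(g)                   # unit-norm for correlation

def match_primitive(primitive, sigmas, thetas, wavelength=4.0):
    """Estimate the scale/orientation whose Gabor kernel best matches a primitive."""
    p = primitive / (np.linalg.norm(primitive) + 1e-12)
    return max(((abs(float(np.sum(p * gabor_kernel(p.shape[0], s, t, wavelength)))), s, t)
                for s in sigmas for t in thetas))  # (correlation, sigma, theta)

# Build a synthetic "ICA primitive" from a known Gabor and recover its parameters.
true = gabor_kernel(15, sigma=3.0, theta=np.pi / 4, wavelength=4.0)
score, sigma, theta = match_primitive(true, sigmas=[2.0, 3.0, 5.0],
                                      thetas=[0.0, np.pi / 4, np.pi / 2])
print(round(score, 3), sigma, round(theta, 3))
```

In practice the primitives would come from independent component analysis of video patches; here a Gabor itself stands in, so the grid search recovers the exact scale and orientation.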
Further, establishing the multi-source heterogeneous video data feature collaboration model and determining the dimension-regularized shared semantic association relationships includes:
Assume there are C classes of heterogeneous channel features; denote the feature matrix of the n_i training samples of the i-th class as X_i, the data noise part as E, and the rotation factor as Γ, and establish the optimization function under orthogonality constraints (the joint objective over the prediction loss R_1, the reconstruction loss R_2, and the regularization R_3 defined below);
wherein λ denotes the sharing-matrix coefficient, ᵀ denotes matrix transposition, Y_i denotes the label of the i-th feature class, F denotes the Frobenius norm, Θ_iᵀ denotes the transpose of the projection matrix Θ_i, α, β, μ_1 and μ_2 are multiplier factors, and rank(X) is the rank of the feature matrix X;
obtain the low-dimensional manifold subspaces of general semantic features {Θ_i}, the semantic sharing matrix W_0 under the unified framework, and the feature-specific matrices {W_i}, using least squares to solve for the joint optimal solution of the prediction loss function R_1(W_0, {W_i}, {Θ_i}), the reconstruction loss function R_2({Θ_i}), and the regularization function R_3(W_0, {W_i});
by projecting newly input multi-source heterogeneous video data into the feature subspace, extract a high-level, dimension-regularized common feature description and establish the shared semantic association relationships.
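The project-then-share idea can be sketched in a few lines. This is a toy stand-in, not the patent's joint solver: each channel's orthogonality-constrained subspace Θ_i is taken from an SVD, and a single ridge regression plays the role of the shared matrix W_0; the channel dimensions, sample count, and regularization strength are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonal_basis(X, k):
    """Theta_i: top-k left singular vectors, an orthogonality-constrained projection."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

# Two heterogeneous channels observing the same n samples in different dimensions.
n, k = 40, 3
Z = rng.standard_normal((k, n))                  # shared low-dimensional semantics
channels = [rng.standard_normal((8, k)) @ Z,     # channel 1: 8-dim features
            rng.standard_normal((12, k)) @ Z]    # channel 2: 12-dim features
Y = np.sign(Z[0])                                # a label driven by the shared factor

# Project each channel into its k-dim manifold subspace, then fit one shared
# regression W0 on the stacked projections (ridge stands in for the joint solver).
P = np.vstack([orthogonal_basis(X, k).T @ X for X in channels])   # shape (2k, n)
W0 = np.linalg.solve(P @ P.T + 0.1 * np.eye(2 * k), P @ Y)
acc = np.mean(np.sign(W0 @ P) == Y)
print(acc)
```

Because both channels are linear images of the same latent Z, one shared predictor fit in the projected subspaces recovers the label well; that is the effect the shared semantic subspace {Θ_i} and sharing matrix W_0 aim for.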
Further, in step S120, using situated-computing theory, establishing the task-collaborative feature-association learning mechanism and generating the context-aware task-association prediction model includes:
constructing the mapping function between visual labels and common features under a low-rank constraint, realizing feature-label collaboration; introducing the nuclear norm to model label correlation and feature correlation, while introducing a graph regularization term to retain the intrinsic structure of the existing data, realizing label prediction for unlabeled features; the following unconstrained function is established:
min_g Q(Y, g(X)) + λΦ(g) + γΛ(g)
wherein g is the mapping function of feature-association learning; the data-fidelity term Q(·) minimizes the loss between the given labels and the task prediction produced by the function g, and is used to fit the given labels; Φ(g) and Λ(g) are regularization terms based on prior assumptions; λ and γ are regularization parameters;
the context-aware task-association prediction model comprises an interactive environment, an environment model, and a loss model;
the environment model learns the dynamic environmental changes of the input features, and the loss model estimates the environment model's loss, predicting the visual regions, targets, and tasks to be recognized that will need attention in the future;
the interactive environment defines a state space composed of the common feature descriptions at times t and t-1; the current state at time t specifies the recognition task a_t and predicts the task state to be recognized at the next time t+1;
the environment model: given the historical information h, with the common-feature history mapping function ξ: H → X and the ground-truth-label history mapping function η: H → Y, learn the environment-model mapping ξ(h) → η(h); writing ω for the environment model, ω(ξ(h)) ∈ Y, a loss model L_wm(ω(ξ(h)), η(h)) is introduced at each task prediction; task prediction involves H = {h = (s_{t-k}, a_{t-k}, ···, s_t, a_t, s_{t+1})}, ξ(h) = (s_{t-k}, a_{t-k}, ···, s_t, a_t), and η(h) = s_{t+1}; an inverse-dynamics prediction mechanism and a softmax cross-entropy loss predict future states, and a neural network model ω_φ based on stochastic gradient descent encodes the states into a low-dimensional latent space with shared weights to complete visual-attention-region extraction and state prediction;
the loss model: given the state s_t and the suggested next task, it predicts the probability distribution over the R_l candidate tasks of the environment model, with the softmax cross-entropy loss function encoding the state of the next task as the penalty term.
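The softmax cross-entropy penalty used by the loss model can be written out directly. The three-task logits below are invented for illustration; the point is that the loss is small when the model assigns high probability to the task that actually occurs.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]   # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Penalty term: negative log-probability of the task that actually occurs."""
    return -math.log(softmax(logits)[target])

# R_l = 3 candidate tasks; the environment model scores which one happens next.
logits = [2.0, 0.5, -1.0]          # the model strongly favours task 0
probs = softmax(logits)
loss_correct = cross_entropy(logits, 0)
loss_wrong = cross_entropy(logits, 2)
print([round(p, 3) for p in probs], round(loss_correct, 3), round(loss_wrong, 3))
# → [0.786, 0.175, 0.039] 0.241 3.241
```

Note that the two losses differ by exactly the logit gap of 3.0: cross-entropy penalizes confident wrong predictions linearly in the logit difference.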
Further, in step S130, for long-duration input video streams, building the long-term-dependency generative memory model in combination with the context-aware task-association prediction model includes:
enhancing the temporal generative model with an external memory system, so that the common feature description stores effective information from the early stage of the sequence, and building a sustainable generative memory model over the stored information; specifically,
the generative memory model comprises the collaborative common-feature description set e_{≤T} = {e_1, e_2, ···, e_T} and the task-collaboration latent-variable set z_{≤T} = {z_1, z_2, ···, z_T}; the transition mapping h_t = f_h(h_{t-1}, e_t, z_t) corrects the deterministic hidden state variable h_t at each time point; the prior mapping function f_z(h_{t-1}) describes the nonlinear dependence of the latent variables on past observations and provides the latent-variable distribution parameters; the nonlinear observation mapping function f_e(z_t, h_{t-1}) provides the likelihood function depending on the latent variables and the state; the temporal variational autoencoder is modified with external memory, generating a memory context Ψ_t at each time point; the prior information and posterior information are respectively expressed as follows:
prior information: p_θ(z_t | z_{<t}, e_{<t}) = N(z_t | f_z^μ(Ψ_{t-1}), f_z^σ(Ψ_{t-1}))
posterior information: q_φ(z_t | z_{<t}, e_{≤t}) = N(z_t | f_q^μ(Ψ_{t-1}, e_t), f_q^σ(Ψ_{t-1}, e_t))
wherein f_z^μ and f_z^σ are the transition mapping functions for the latent variable z's parameters μ and σ, and f_q^μ and f_q^σ are the corresponding mapping functions for the posterior probability q; the prior is a diagonal Gaussian distribution depending, through the prior mappings f_z, on the memory context Ψ_{t-1}, while the diagonal-Gaussian approximate posterior depends, through the posterior mapping functions f_q, on the associated memory context Ψ_{t-1} together with the current observation e_t.
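Training such a variational model typically balances reconstruction against the divergence between posterior and prior. As a hedged illustration (the patent does not spell out the training objective), the snippet computes the closed-form KL divergence between two diagonal Gaussians standing in for q_φ and p_θ; the means and standard deviations are made-up toy values.

```python
import math

def kl_diag_gauss(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, diag sigma_q^2) || N(mu_p, diag sigma_p^2) ), summed over dims."""
    kl = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        kl += math.log(sp / sq) + (sq**2 + (mq - mp)**2) / (2 * sp**2) - 0.5
    return kl

# The posterior q(z_t | Psi_{t-1}, e_t) has seen the new observation e_t, so it
# shifts away from the memory-conditioned prior p(z_t | Psi_{t-1}); KL measures
# that shift, which the variational objective penalizes.
prior = ([0.0, 0.0], [1.0, 1.0])
posterior = ([0.5, -0.5], [0.8, 0.8])
gap = kl_diag_gauss(*posterior, *prior)
same = kl_diag_gauss(*prior, *prior)
print(round(gap, 4), same)   # KL is zero only when q matches p
# → 0.3363 0.0
```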
Further, establishing the autonomous, semi-supervised continuous recognition system based on collaborative dynamics to realize multi-task recognition includes:
a deep collaborative recognition algorithm based on the generative memory model: using the evolution process of a collaborative potential-energy function, the memory model is introduced into the dynamic process of co-evolution, and solving the prototype pattern and its companion pattern is reduced to a nonlinear optimization problem, yielding the optimized collaboration-network weights;
the long short-term memory network f_rnn advances the state history h_t, and the external memory M_t is generated using the latent variable from the previous moment and the external context information c_t; the generative model is as follows:
state update: (h_t, M_t) = f_rnn(h_{t-1}, M_{t-1}, z_{t-1}, c_t)
to form task-recognition instructions retrieved from the memory M_t, a set of key values is introduced; cosine similarity compares each key k_t^r with every row of the memory M_{t-1} to generate the task attention weights, and the retrieved memory φ_t^r is obtained as the attention-weighted sum of the rows of M_{t-1}, realizing multi-task recognition; wherein,
key value: k_t^r = f_att^r(h_t)
task weight: w_t^{r,i} ∝ cos(k_t^r, M_{t-1}[i])
retrieved memory: φ_t^r = Σ_i w_t^{r,i} M_{t-1}[i]
recognition generation: Ψ_t = [φ_t^1, φ_t^2, ···, φ_t^R, h_t]
wherein k_t^r is the r-th key-value function for advancing the state history, f_att is the attention mechanism function, w_t^{r,i} is the memory weight of the i-th row for the r-th item at time t, φ_t^r is the result obtained from the retrieval equation, ⊙ denotes element-wise multiplication, a bias value learned from the retrieved memory is also introduced, and σ(·) is the sigmoid function; this forms the expression mechanism informing memory storage and retrieval, and Ψ_t = [φ_t^1, φ_t^2, ···, φ_t^R, h_t] serves as the output of the generative memory model.
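The key-value retrieval step can be sketched directly from the description above: cosine similarity between a key and each memory row, a softmax over the similarities as attention weights, and a weighted-sum read. The memory contents, key, and sharpness factor below are made-up toy values, not the patent's parameters.

```python
import numpy as np

def retrieve(key, memory, sharpness=5.0):
    """Cosine-similarity attention over memory rows, then a weighted-sum read."""
    key_n = key / np.linalg.norm(key)
    rows_n = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    sims = rows_n @ key_n                          # cosine similarity per row
    w = np.exp(sharpness * sims)
    w /= w.sum()                                   # attention weights w_t^{r,i}
    return w, w @ memory                           # phi_t^r = sum_i w_i * M[i]

M = np.array([[1.0, 0.0, 0.0],    # memory M_{t-1}, one stored trace per row
              [0.0, 1.0, 0.0],
              [0.7, 0.7, 0.0]])
key = np.array([1.0, 0.05, 0.0])  # key k_t^r emitted from the state history h_t
w, phi = retrieve(key, M)
print(np.argmax(w), [round(float(x), 2) for x in phi])
```

The read is soft: the key mostly matches the first trace, so the retrieved φ is dominated by it but blends in the partially similar third row, which is what lets several tasks share one memory.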
In another aspect, the present invention also provides a multi-task collaborative recognition system fusing video perception, comprising a common feature extraction module, a collaborative feature learning module, and a deep collaborative recognition module;
the common feature extraction module is used to combine biological perception mechanisms and extract the common features of multi-source heterogeneous video data based on the shared semantic mechanism of collaborative multi-source heterogeneous video data features;
the collaborative feature learning module is used to establish the task-collaborative feature-association learning mechanism using situated-computing theory, continuously learn from the common features of the multi-source heterogeneous video data as prior knowledge, and generate the context-aware task-association prediction model;
the deep collaborative recognition module is used, for long-duration input video streams, to build the long-term-dependency generative memory model in combination with the context-aware task-association prediction model and to establish the autonomous, semi-supervised continuous recognition system based on collaborative dynamics, realizing multi-task recognition.
Further, the common feature extraction module includes a primitive collaboration submodule, a dictionary collaboration submodule, and a topic collaboration submodule;
the primitive collaboration submodule is used to train video image primitives with independent component analysis, match each video image primitive with a Gabor function, estimate the scale and orientation corresponding to each primitive, and extract the primitive features of the video image, realizing efficient spatio-temporal coding of the video image's internal structure;
the dictionary collaboration submodule is used to apply locally linear coding with local distance as the regularization term of the sparse basis functions, compute the best response signal of the original dictionary, use that signal to compute a feasible dictionary search direction and complete one dictionary update, establish a coded concept stream for each data channel as the reference semantic coding of complex events, and dynamically time-align newly input low-level feature streams with the reference semantic coding, generating a time-translation function that realizes dictionary semantic alignment;
the topic collaboration submodule is used to apply latent semantic analysis, construct the co-occurrence matrix between the dictionary and the video image primitive features, represent the semantic concept corresponding to each topic with a hidden node, realize the mapping description among vocabulary, topic nodes, and scenes through probabilistic inference, and, taking the video's conditional probability under the topic distribution as the category-specific similarity, compute the likelihood between the true vocabulary-scene probability and the predicted probability.
Further, the collaborative feature learning module includes a feature-association learning submodule and a context-aware task-association prediction submodule;
the feature-association learning submodule is used to construct the mapping function between visual labels and common features under the low-rank constraint, realizing feature-label collaboration;
the context-aware task-association prediction submodule is used, through the learned feature-association relationships and in combination with the prior knowledge of visual perception, to apply the task-collaboration processing mechanism based on the environment model and the loss function, dynamically and adaptively adjusting the tasks to be recognized according to scene changes and completing visual-attention-region perception and the dynamic adjustment of task-requirement prediction.
Further, the deep collaborative recognition module includes a long-term-dependency generative memory model submodule and a multi-task deep collaborative recognition submodule;
the long-term-dependency generative memory model submodule is used, for long-duration input video streams, to build the long-term-dependency generative memory model in combination with the context-aware task-association prediction model;
the multi-task deep collaborative recognition submodule is used to establish the autonomous, semi-supervised continuous recognition system based on collaborative dynamics, realizing multi-task recognition.
Beneficial effects of the present invention: complete and effective extraction of multi-source heterogeneous video data information can be achieved; the time-synchronized dictionary collaboration mechanism reduces the temporal uncertainty between visual semantics and improves the model's generalization to scene changes; the tasks to be recognized can be adjusted dynamically and adaptively according to scene changes, completing visual-attention-region perception and the dynamic adjustment of task-requirement prediction; an external generative memory model built on long-range data dependencies enhances network learning performance, reduces model parameter and computation complexity with a smaller data storage capacity, extracts useful information immediately, and applies to different types of video sequences, solving the problem that complex, long-range sequence data cannot be selectively remembered and forgotten; autonomous selection of recognition features is realized, recognition of unlabeled data is improved, and the accuracy and robustness of multi-task recognition are continuously enhanced.
Additional aspects and advantages of the present invention will be set forth in part in the following description; they will become apparent from the description, or will be learned through practice of the invention.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a multi-task recognition principle block diagram of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 2 is a principle block diagram of the common feature extraction module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 3 is a principle block diagram of the collaborative feature learning module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 4 is a principle block diagram of the deep collaborative recognition module of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 5 is a principle block diagram of the task prediction model of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of the externally dependent generative memory model of the multi-task collaborative recognition system fusing video perception according to an embodiment of the present invention.
Fig. 7 is a frame structure diagram of the multi-source heterogeneous video data multi-task collaborative recognition verification platform according to an embodiment of the present invention.
Specific embodiments
Embodiments of the present invention are described below in detail, the example of the embodiment is shown in the accompanying drawings, wherein from beginning Same or similar element or module with the same or similar functions are indicated to same or similar label eventually.Below by ginseng The embodiment for examining attached drawing description is exemplary, and for explaining only the invention, and is not construed as limiting the claims.
Those skilled in the art of the present technique are appreciated that unless expressly stated, singular " one " used herein, " one It is a ", " described " and "the" may also comprise plural form.It is to be further understood that being arranged used in specification of the invention Diction " comprising " refer to that there are the feature, integer, step, operation, element and/or modules, but it is not excluded that in the presence of or addition Other one or more features, integer, step, operation, element, module and/or their group.
It should be noted that in embodiment of the present invention unless specifically defined or limited otherwise, term is " even Connect ", " fixation " etc. shall be understood in a broad sense, may be a fixed connection, may be a detachable connection, or is integral, can be machine Tool connection, is also possible to be electrically connected, can be and be directly connected to, be also possible to be indirectly connected with by intermediary, can be two The interaction relationship of connection or two elements inside element, unless having specific limit.For those skilled in the art For, the concrete meaning of above-mentioned term in embodiments of the present invention can be understood as the case may be.
Those skilled in the art will appreciate that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meanings as commonly understood by those of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless defined as herein, will not be interpreted in an idealized or overly formal sense.
To facilitate understanding of the embodiments of the present invention, specific embodiments are further explained below with reference to the accompanying drawings; the embodiments do not constitute limitations on the embodiments of the present invention.
Those of ordinary skill in the art should understand that the accompanying drawings are schematic diagrams of one embodiment, and the components or devices in the drawings are not necessarily required to implement the present invention.
Embodiment 1
As shown in Figure 1, Embodiment 1 of the present invention provides a multitask collaborative recognition system fusing video perception. The system includes a generic feature extraction module, a collaborative feature learning module, and a deep collaborative recognition module.
The generic feature extraction module is configured to study, in combination with biological perception mechanisms, a shared semantic description for feature collaboration of multi-source heterogeneous video data, and to obtain a generic feature description of the multi-source heterogeneous video data.
The collaborative feature learning module is configured to establish, using context-adaptive computing theory, a task-collaborative feature association learning and task prediction mechanism, realizing a context-aware task interaction prediction mechanism.
The deep collaborative recognition module is configured to combine long-term dependencies and propose a context-collaborative visual multitask deep collaborative recognition model, realizing a multitask deep collaborative recognition model with long-term memory and addressing problems such as poor generalization, low robustness, and high computational complexity in video multitask recognition.
As shown in Fig. 2, in Embodiment 1 of the present invention the generic feature extraction module includes a primitive collaboration submodule, a time-synchronized dictionary collaboration submodule, and a shared-semantic-information topic collaboration submodule. Specifically:
Primitive collaboration submodule --- multi-source data processing requires accurate detection and tracking of target and scene changes in the spatio-temporal domain, while the visual cortex cells of biological perception exhibit sparsity. To combine the sparse distinctiveness of neural signal processing and achieve information completeness for multi-source heterogeneous data, the intrinsic characteristics of multi-source heterogeneous video must be combined to study a primitive collaboration mechanism with scale, translation, and rotation invariance, realizing complete and effective extraction of multi-source heterogeneous data information.
Time-synchronized dictionary collaboration submodule --- multi-source heterogeneous data contains latent semantic information, and realizing time synchronization between low-level primitive features and high-level semantics is the primary problem in overcoming the semantic gap. It is therefore necessary to combine latent semantic feature representation to establish a time-synchronized dictionary collaboration mechanism that reduces the temporal uncertainty between vision and semantics.
Shared-semantic-information topic collaboration submodule --- data from different platforms and modalities in social, information, and physical spaces contain rich natural and social attributes, with differing feature dimensions and data distributions, yet the synchronously acquired multi-source video contains a potentially large amount of semantic association information. It is therefore necessary to study the topic and dictionary semantic relation mechanism of visual information across modalities, propose a semantically similar topic collaboration feature analysis method, and establish a dimension-regularized generic feature description method with shared semantic association.
As shown in Figure 3, in Embodiment 1 of the present invention the collaborative feature learning module includes a feature association learning submodule under context-adaptive computing theory and a visual attention region extraction and task prediction submodule. Specifically:
Feature association learning submodule under context-adaptive theory --- in visual perception recognition tasks with scene changes, there is high correlation between features of different scenes, while the correlation between a given feature and other features differs considerably across recognition tasks; likewise, the labels of the same feature across different tasks are correlated, while the correlations between different labels differ considerably. Therefore, when constructing mapping functions between task labels and feature spaces, label and feature correlations must be modeled while preserving the intrinsic structure of existing (labeled and unlabeled) data, improving the model's generalization to scene changes.
Visual attention region extraction and task prediction submodule --- after learning task and feature association labels, the visual attention region must be confirmed and the tasks to be recognized predicted; for example, in a classroom scene, recognizing student status and expression are the main recognition tasks, while in an outdoor scene, recognizing targets and behavior are the main recognition tasks. Therefore, using the learned feature association relations together with the prior knowledge of visual perception, a task collaboration processing mechanism based on an environment model and a loss function is proposed, dynamically and adaptively adjusting the tasks to be recognized according to scene changes and completing the dynamic adjustment of visual attention region perception and task demand prediction.
As shown in Figure 4, in Embodiment 1 of the present invention the deep collaborative recognition module includes a long-term-dependency generative memory model submodule and a multitask deep collaborative recognition submodule. Specifically:
Long-term-dependency generative memory model submodule --- for long-range, multi-sequence input feature streams, a learning mechanism without memory capability must continually label new input data and relearn the network model to perform recognition tasks, which greatly wastes computation, storage, and human resources. It is therefore necessary to combine long-range data dependencies to establish an external-memory generative model, which enhances network learning performance, reduces model parameter computation complexity with smaller data storage, and extracts useful information immediately; it applies to different types of video sequences and solves the problem that complex, long-range sequence data cannot be selectively memorized and forgotten.
Multitask deep collaborative recognition submodule --- for continuously input unlabeled feature streams, it is necessary to accurately and efficiently learn joint optimal features with minimal intra-class distance and maximal inter-class distance for multitask recognition; since unlabeled data cannot be manually annotated with class information, recognition performance loss is otherwise inevitable. It is therefore necessary to combine collaborative dynamics principles having long-term memory to establish a deep continual-learning multitask recognition mechanism under context collaboration, realizing autonomous selection of recognition features, improving recognition of unlabeled data, and continually raising the accuracy and robustness of multitask recognition.
Embodiment 2
Embodiment 2 of the present invention provides a multitask recognition method for fused multi-source video perception data using the system of Embodiment 1. The method comprises the following steps:
First, in combination with biological perception mechanisms, the shared semantic description of multi-source heterogeneous video data feature collaboration is studied, obtaining a generic feature description of the multi-source heterogeneous video data.
Then, using context-adaptive computing theory, a task-collaborative feature association learning and task prediction mechanism is established, realizing a context-aware task interaction prediction mechanism.
Finally, long-term dependencies are combined to propose a context-collaborative visual multitask deep collaborative recognition model, realizing a multitask deep collaborative recognition model with long-term memory and addressing problems such as poor generalization, low robustness, and high computational complexity in video multitask recognition.
In recent years, research achievements in visual perception multitask recognition, especially in feature description, interaction prediction, and collaborative recognition for multitask scenarios such as face recognition, expression analysis, and behavior understanding, have motivated generic feature description methods for massive multi-source video data, task interaction prediction under context-adaptive perception, and long-term-memory deep collaborative recognition models for continuously input video streams. Frontier theories such as shared semantic association description, context-adaptive feature learning, and semi-supervised continual collaborative recognition are introduced to improve the generalization robustness and long-term intelligence of multi-source visual multitask collaborative recognition.
In Embodiment 2 of the present invention, to address the large volume of video data, the data is maximally compressed from the perspective of the visual perception mechanism while the discriminative information of the multi-source heterogeneous data is retained. For complex, changing scenes, a feature-correlation-based task interaction prediction mechanism is studied in combination with context-adaptive computing theory to improve the generalization of feature learning. For continuously input video streams, a semi-supervised continual deep collaborative recognition model is introduced to realize the dynamic multitask recognition demand of temporal memory. A multi-source visual multitask collaborative recognition verification platform is built to verify the generalization and robustness of the theoretical methods while continually improving the performance of the proposed methods.
As shown in Fig. 2, in the generic feature extraction step a three-level collaboration mechanism of primitive collaboration, dictionary collaboration, and topic collaboration must be established.
Biological perception theory holds that the interaction between visual elements is the interconnection behavior between visual cells. Visual behavior is the processing of the visual cortex neural network and can be divided into a feature layer, a task layer, and a context layer. Therefore, to solve multitask collaborative recognition of multi-source video information, feature collaboration must first be performed, i.e., a generic feature description of the multi-source heterogeneous data must be extracted. Although visual perception data differ in source, structure, and storage format, they all contain visual information and semantic information. The key to feature collaboration is how to efficiently associate video images with semantics, realizing a generic feature mechanism in which the "semantic similarity" of human cognition is consistent with the "visual similarity" of data processing. This relies on the completeness of visual perception primitives, the sparsity of visual-cortex simple-cell signal responses, and the low-dimensional manifold of the task scene.
Multi-source heterogeneous video primitive collaboration aims to make the extracted features both preserve the sparse distinctiveness of neural signals and effectively capture the various possible signals in natural scenes. Although traditional global or local image feature representations can partially solve scale and rotation invariance for video data, they cannot cope with the individual appearance differences generated by targets themselves among similar objects; they are therefore only suitable for a single-data-source, single-task processing mechanism, cannot provide a complete and effective description space, and are even less capable of supporting higher-level visual perception tasks.
Primitive collaboration first obtains a set of sparse, independent filter banks for detecting the occurrence probability of features with different descriptive powers in video. Sparsity is generally evaluated by the L0 or L1 norm of the primitive coefficients, and independence requires the correlation between primitive vectors to be as small as possible. Video image primitives are trained by independent component analysis, and each primitive is fitted with a Gabor function to estimate its corresponding scale and orientation. Primitive collaboration thereby reveals, to some extent, the neural processing of the primary visual cortex and can realize efficient spatio-temporal coding of the internal structure of natural video images.
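The paragraph above describes fitting ICA-learned primitives with Gabor functions to estimate each primitive's scale and orientation. As a minimal sketch of the Gabor side of that step, the code below builds a small multi-scale, multi-orientation Gabor filter bank in NumPy; all parameter values (kernel size, scales, aspect ratio) are illustrative assumptions, not values from the patent.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, lam, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel: a Gaussian envelope times a cosine
    carrier, oriented at angle theta with wavelength lam."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)       # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * xr / lam + psi)

def gabor_bank(size=15, scales=(2.0, 4.0), orientations=4):
    """A small multi-scale, multi-orientation bank, one kernel per
    (scale, orientation) pair; wavelength tied to scale for simplicity."""
    thetas = [k * np.pi / orientations for k in range(orientations)]
    return [gabor_kernel(size, s, t, lam=2.0 * s)
            for s in scales for t in thetas]
```

In a full pipeline, each ICA-learned primitive would be matched against such a bank (e.g., by correlation) to read off its best-fitting scale and orientation.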
Dictionary collaboration assumes, based on the appearance consistency of similar targets, that latent semantics refers to labeling local descriptors by an unsupervised clustering process, with appearance similarity as the decision condition for target attributes. Dictionary collaboration is a typical latent semantic feature representation method based on the bag-of-words model. It uses locally linear coding, taking local distance as the regularization term of the sparse basis functions, to compute the best response signal of the original dictionary; the optimized signal is then used to compute a feasible dictionary search direction, completing one dictionary update, and a coding concept stream is established for each data channel. As the semantic coding of complex events, all newly input low-level feature streams are dynamically time-aligned with reference to the semantic coding, and a time translation function is generated to realize dictionary semantic alignment. This method requires large differences between a strongly responding dictionary atom and the others, so that sample video block representations can be efficiently distinguished and the consistency of similar video primitive sets is guaranteed.
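The locally linear coding step can be sketched as follows. This is the standard locality-constrained linear coding (LLC) closed form (k nearest atoms, local covariance, sum-to-one codes), offered as one plausible reading of "local distance as the regularization term", not the patent's exact update rule.

```python
import numpy as np

def llc_encode(x, D, k=5, beta=1e-4):
    """Encode sample x over dictionary D (atoms in rows) using only its
    k nearest atoms; distances act as the locality regulariser."""
    d = np.linalg.norm(D - x, axis=1)       # distance to every atom
    idx = np.argsort(d)[:k]                 # k nearest atoms
    B = D[idx] - x                          # basis shifted to the sample
    C = B @ B.T + beta * np.eye(k)          # regularised local covariance
    w = np.linalg.solve(C, np.ones(k))      # closed-form LLC weights
    w /= w.sum()                            # enforce sum-to-one constraint
    code = np.zeros(len(D))
    code[idx] = w                           # sparse code over full dictionary
    return code
```

A dictionary update would alternate with this encoding step: fix the codes, then move each atom toward the residual of the samples that used it.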
Topic collaboration defines latent semantic topic analysis over the co-occurrence of dictionary descriptions and scenes, mapping different appearances under different context environments to certain latent low-dimensional topics and realizing a topic collaborative analysis process that includes both many-to-many and many-to-one mappings from visual features to category labels. The present invention adopts latent semantic analysis to construct a co-occurrence matrix between the dictionary and video images, embodies the semantic concept corresponding to each topic through hidden nodes, and, through probabilistic inference, describes the mapping relations among vocabulary, topic nodes, and scenes. The conditional probability of video under the topic distribution serves as the category-specific similarity; the likelihood function of the probability between true vocabulary and scene and the prediction probability are computed to complete the construction of the projection matrix.
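The topic-collaboration step builds a co-occurrence matrix between dictionary words and video images and infers latent topics by probabilistic inference. A minimal pLSA-style EM over such a matrix might look like the sketch below; the factorization P(w, d) = Σ_z P(w|z) P(z|d) is a generic choice of latent semantic model, not necessarily the patent's exact estimator.

```python
import numpy as np

def plsa(N, K, iters=50, seed=0):
    """EM for pLSA on a word-document count matrix N (W x D) with K topics.
    Returns P(w|z) of shape (W, K) and P(z|d) of shape (K, D)."""
    rng = np.random.default_rng(seed)
    W, D = N.shape
    p_wz = rng.random((W, K)); p_wz /= p_wz.sum(0)   # P(w|z), columns sum to 1
    p_zd = rng.random((K, D)); p_zd /= p_zd.sum(0)   # P(z|d), columns sum to 1
    for _ in range(iters):
        # E-step: responsibility P(z|w,d) proportional to P(w|z) P(z|d)
        joint = p_wz[:, :, None] * p_zd[None, :, :]          # (W, K, D)
        joint /= joint.sum(1, keepdims=True) + 1e-12
        # M-step: expected counts, then renormalise
        nz = N[:, None, :] * joint                           # (W, K, D)
        p_wz = nz.sum(2); p_wz /= p_wz.sum(0) + 1e-12
        p_zd = nz.sum(0); p_zd /= p_zd.sum(0) + 1e-12
    return p_wz, p_zd
```

The learned P(z|d) columns play the role of the low-dimensional topic description of each video image used for category similarity.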
In Embodiment 2 of the present invention, according to the semantic similarity between videos of different channels, in order to effectively quantify the shared semantic information of different channels and dimensions and overcome the influence of noise, occlusion, illumination, and the like on feature discrimination, generic features with maximal visual discriminability are extracted, increasing inter-class distance and reducing intra-class distance within a task, and a heterogeneous data feature collaboration model is established. Suppose there are C classes of heterogeneous channel features; each feature type X_i (i = 1, ..., C) is denoted as the feature matrix of n_i training samples, the data noise part is E, and Γ is a rotation factor. The semantically shared heterogeneous feature collaboration model under multitask learning aims to learn a projection matrix Θ_i for each X_i, projecting the heterogeneous features to the same intrinsic dimensionality and reducing data redundancy; under orthogonality constraints, the optimization function combines a prediction loss, a reconstruction loss, and a regularization term.
The heterogeneous feature collaboration model aims to obtain a low-dimensional manifold subspace {Θ_i} of generic semantic features and, under a unified framework, a semantically shared matrix W_0 and feature-specific modality matrices {W_i}. Least squares is used to solve the joint optimum of the prediction loss function R_1(W_0, {W_i}, {Θ_i}), the reconstruction loss function R_2({Θ_i}), and the regularization function R_3(W_0, {W_i}). By projecting newly input data into the feature subspace, high-level generic feature descriptions of equal dimensionality are extracted, establishing the shared semantic association relation.
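One illustrative way to satisfy the orthogonality constraint on each per-channel projection Θ_i is to take the top principal directions of that channel, so every channel lands in a common dimensionality before the shared matrix W_0 is learned. This sketch covers only that projection step under stated assumptions (PCA as the orthogonal projector); the patent's joint solver over R_1, R_2, R_3 is not specified here.

```python
import numpy as np

def channel_projection(X, dim):
    """Orthonormal projection Theta_i (d_i x dim) for one channel whose
    columns are samples; Theta^T Theta = I by construction (SVD basis)."""
    Xc = X - X.mean(axis=1, keepdims=True)       # centre the samples
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :dim]

def project_channels(channels, dim):
    """Map each channel's d_i-dimensional features into one common
    dim-dimensional space so they can share a single semantic matrix."""
    return [channel_projection(X, dim).T @ X for X in channels]
```

After this step, all C channels have feature matrices of identical row dimension, which is the precondition the text states for learning W_0 and {W_i} jointly.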
In summary, in Embodiment 2 of the present invention, primitive collaboration belongs to the feature extraction stage, whose purpose is to obtain feature response signals that are as sparse and complete as possible; dictionary collaboration belongs to the feature coding stage, whose purpose is to perform unsupervised learning on local video block features and obtain a semantic dictionary with locality-preserving capability; and topic collaboration belongs to the feature semantics stage, whose purpose is to solve the hidden semantic mapping space through a probabilistic framework and analyze intra-class similarity in that space. Together they realize feature collaboration and establish a semantically similar generic feature description framework for perception recognition task environments.
The learning ability of human vision is realized by signal response and transmission between visual cells, and a specific visual task requires the mutual synergy of a large number of visual cells. Owing to the parallelism, hierarchy, and feedback between human visual cells, signal transmission also carries different collaborative meanings, and the variability of synergistic signal transfer modes is the difficulty of the learning task. In visual multitask perception recognition, scenes are complex and changeable, and collaborative recognition requires intelligent prediction of the various visual tasks to be recognized; a task prediction mechanism based on generic feature association learning in a context-adaptive environment is therefore proposed, realizing co-evolution at the task layer and solving the problem of harmonious connection between visual perception and the natural environment.
As shown in Figure 3, in Embodiment 2 of the present invention, collaborative feature learning includes label collaboration for feature association learning and task prediction under context-adaptive evolution. Specifically:
Label collaboration for feature association learning --- in practical application scenarios there are strong correlations between features, yet the correlation of the same feature differs considerably across recognition tasks; likewise, annotation information closely related to a recognition task is strongly correlated, while the correlations between different annotations differ considerably. Collaboration theory holds that the label assignment process of a sample depends not only on the features of the sample itself but, more importantly, on the spatio-temporal distribution relations provided by samples in the neighboring space. The same target may correspond to multiple labels, and feature learning must also take into account the ambiguity of samples that lie in the decision boundaries of multiple task classifications in high-dimensional space.
Feature-label collaboration is realized by constructing a low-rank-constrained mapping function between visual labels and generic features. A nuclear norm is introduced to model label correlation and feature correlation, while a graph regularization term is introduced to preserve the intrinsic structure of existing (labeled and unlabeled) data, realizing label prediction for unlabeled features, improving the generalization ability of the feature learning model, overcoming semantic ambiguity, keeping the model as simple as possible, and reducing computational complexity. The following unconstrained function can thus be established:
min_g Q(Y, g(X)) + λ·Φ(g) + γ·Λ(g), where g is the mapping function of feature association learning; the data fidelity term Q(·) minimizes the loss between the given labels and the task prediction results obtained by g, fitting the given labels; Φ(g) and Λ(g) are regularization terms based on prior assumptions, the former maintaining the low-rank constraint in practice and the latter preserving the intrinsic structure; and λ and γ are regularization parameters balancing the contributions of the three terms in the model.
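The low-rank term Φ(g) above is typically realized through the nuclear norm, whose proximal operator is singular-value thresholding. The sketch below solves the simplified instance min_W ||XW − Y||_F² + λ||W||_* by proximal gradient descent; the graph term Λ(g) is omitted, and this is a generic solver for the nuclear-norm-regularized mapping, not the patent's full algorithm.

```python
import numpy as np

def svt(G, tau):
    """Singular-value thresholding: the proximal operator of tau*||G||_*."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def low_rank_fit(X, Y, lam=0.1, iters=200):
    """Proximal gradient for min_W ||X W - Y||_F^2 + lam * ||W||_*.
    Step size 1/L with L = 2 * sigma_max(X)^2 (gradient Lipschitz const)."""
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        grad = 2.0 * X.T @ (X @ W - Y)       # gradient of the fidelity term
        W = svt(W - step * grad, step * lam)  # proximal (shrinkage) step
    return W
```

Adding the graph term would only change the gradient of the smooth part; the thresholding step stays the same.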
As shown in Figure 5, in Embodiment 2 of the present invention a task prediction method under context-adaptive evolution is proposed, using a context-adaptive computing theory close to the human cognitive process to form a task prediction evolution model mechanism based on associated feature learning. The model consists of an interactive environment, an environment model, and a loss model: the environment model learns the dynamic environmental changes of the input features, and the loss model estimates the environment model loss, predicting the visual regions, targets, and tasks that will need attention in the future. Specifically:
Interactive environment --- the state space is defined by the generic feature descriptions at times t and t-1; the current state at time t specifies the recognition task a_t and predicts the task state to be recognized at the next time t+1.
Environmental model --- given historical informationGeneric features and history map Function ξ:H → X and true value mark and history mapping function η:H → Y carrys out academic environment model mapping function ξ (h) → η (h).Remember ω For environmental model ω (ξ (h)) ∈ Y.When every subtask is predicted, loss model L is introducedwm(ω(ξ(h)),η(h)).Task prediction It is related to H={ h=(st-k,at-k,···,st,at,st+1), ξ (h)=(st-k,at-k,···,st,at) and η (h)= st+1, inverse kinematics forecasting mechanism and softmax cross entropy loss forecasting state in future, the neural network based on stochastic gradient descent Model ωφ, to encode it is stateful enter the low-dimensional latent space comprising shared weight complete visual attention location extracted region and Status predication.
Loss model --- given state s_t and a suggested next task, the loss model predicts the probability distribution over the R tasks of the environment model; a softmax cross-entropy loss function encodes the state of the suggested task as a penalty term, improving the accuracy of task prediction.
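The softmax cross-entropy scoring used by the loss model above can be written generically as follows; the state encoder that produces the logits is an assumption here, since the patent does not specify it.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def task_prediction_loss(logits, next_task):
    """Cross-entropy penalty for predicting `next_task` among R tasks,
    given the environment model's unnormalised scores (logits)."""
    p = softmax(logits)
    return -np.log(p[next_task] + 1e-12)
```

A confidently correct suggestion yields a near-zero penalty, while a wrong suggestion is heavily penalized, which is what drives the task-selection evolution described in the text.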
By combining label association learning with context-adaptive task prediction evolution, a progressive task prediction mechanism is established. Layer-by-layer storage of environment-aware transferred knowledge builds a valuable loss model, defining the excitation and inhibition elements of recognition tasks in co-evolution and deciding the recognition task to be handled currently; this solves the problem of transferring knowledge from simulated environments to real environments and improves the generalization and stability of feature association learning and task prediction.
Associated feature learning is used to realize automatic high-level semantic annotation. According to context-adaptive perception theory and in combination with complex, changing application environments, a challenging region-of-interest extraction and multitask prediction mechanism under context-adaptive collaboration is proposed. On this basis, the joint constraints of context-adaptive perceptibility, low-rank restriction, attention regionality, task relevance, and the like are used to realize optimal association learning over generic feature descriptions, improving the model's generalization to massive data and multiple tasks. The theoretical research of the relevant scheme is completed through prior assumptions, posterior reasoning, and association optimization design, and simulation verification of the new scheme is further completed through tools such as algorithm simulation platforms.
As shown in Figure 4, Embodiment 2 of the present invention organically combines biological neural networks with synergetic pattern recognition: visual perception is used to give targets an effective semantic generic feature description, the structured information of the target scene is considered for task prediction, context-layer collaborative analysis of visual tasks is realized, the prototype pattern (the task to be recognized) and the accompanying pattern (the single-task recognition result) are learned simultaneously, pattern dependency is effectively reduced, and a reduced deep collaborative recognition method is proposed.
Synergetics describes the distribution of target state evolution as a thermal potential function and holds that the signal self-organization process of the human brain memory system is precisely the human associative memory process. In general, the long-range dependencies of a continuously input video stream, observed over past time intervals, can separate the recognizable elements of a long time series from the unrecognizable ones: uncertainty is marked for the unrecognizable elements, while fast recognition can aid in predicting new future elements. This research uses an external memory system to enhance a temporal generative model, storing the effective information of memory feature descriptions from the early stage of a sequence and efficiently establishing a sustainable generative memory model over the stored information.
The generative memory model comprises the feature-collaboration generic feature description set e_{≤T} = {e_1, e_2, ..., e_T} and the task-collaboration latent variable set z_{≤T} = {z_1, z_2, ..., z_T}. A transition mapping h_t = f_h(h_{t-1}, e_t, z_t) corrects the deterministic hidden state variable h_t at each time point; a prior mapping function f_z(h_{t-1}) describes the nonlinear dependence between past observations and latent variables and provides the latent variable distribution parameters; and a nonlinear observation mapping function f_e(z_t, h_{t-1}) provides the likelihood function depending on the latent variables and state. This research uses external memory to modify the temporal variational autoencoder, generating a memory text Ψ_t at each time point, whose prior and posterior probabilities are expressed as follows:
Prior: p_θ(z_t | z_{<t}, e_{<t}) = N(z_t | f_z^μ(Ψ_{t-1}), f_z^σ(Ψ_{t-1}))
Posterior: q_φ(z_t | z_{<t}, e_{≤t}) = N(z_t | f_q^μ(Ψ_{t-1}, e_t), f_q^σ(Ψ_{t-1}, e_t))
where the prior is a diagonal Gaussian distribution function of the memory text through the prior mapping f_z, and the diagonal Gaussian approximate posterior depends on the memory text Ψ_{t-1}, associated through the posterior mapping function f_q, and the current observation e_t.
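Training a model with the diagonal-Gaussian prior and posterior above penalizes their KL divergence at each time step, as in any variational autoencoder objective. The closed form for two diagonal Gaussians is a standard result, sketched here with log-variance parameterization (an implementation convention assumed by this sketch, not stated in the patent):

```python
import numpy as np

def diag_gauss_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians q = N(mu_q, exp(logvar_q))
    and p = N(mu_p, exp(logvar_p)), summed over dimensions."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )
```

Per time step, q would be the posterior q_φ(z_t | Ψ_{t-1}, e_t) and p the prior p_θ(z_t | Ψ_{t-1}); the reconstruction likelihood from f_e completes the evidence lower bound.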
As shown in Fig. 6, a stochastic computation graph is used as the processing procedure of the memory-timing generative model. To give the structure greater versatility and flexibility for different perception tasks, a high-level semantic memory and controller architecture is introduced to stably store information for future extraction and to perform the corresponding computation to extract usable information immediately.
Deep collaborative recognition improves the collaborative prototype pattern modification method. From the perspective of learning the prototype pattern and the accompanying pattern simultaneously, a deep collaborative recognition algorithm based on the generative memory model is proposed, using the evolution process of the collaborative potential-energy function to introduce the memory model directly into the dynamic process of co-evolution; solving the prototype and accompanying patterns is reduced to solving a nonlinear optimization problem, thereby obtaining better collaborative network weights. A long short-term memory network f_rnn advances the state history h_t, and the external memory M_t is generated using the latent variable from the previous moment and the external text information c_t. The generative model is as follows:
State update: (h_t, M_t) = f_rnn(h_{t-1}, M_{t-1}, z_{t-1}, c_t)
To form task recognition instructions derived from the memory M_t, the network generates a set of key values and uses cosine similarity to compare each key with each row of the memory M_{t-1}, generating task weight sets; the retrieved memory φ_t^r is obtained by the weighted sum of the attention weights and the memory M_{t-1}, realizing a dynamically sustainable multitask recognition mechanism.
Key value: k_t^r, generated by the network from the current state;
Task weighting: w_t^r(i) ∝ cos(k_t^r, M_{t-1}(i)), the cosine similarity between the key and the i-th row of memory;
Retrieved memory: φ_t^r = Σ_i w_t^r(i) · M_{t-1}(i);
Recognition generation: produced through the sigmoid function σ(·) with the bias learned from memory retrieval.
Here the bias value is learned through memory retrieval, σ(·) is the sigmoid function, the external memory M_t stores the latent variables z_t, and the controller forms the recognition mechanism Ψ_t = [φ_t^1, φ_t^2, ..., φ_t^R, h_t] governing memory storage and retrieval. This is the output of the generative memory model; for visual multitask collaborative recognition with unknown task content and number, it realizes unsupervised adaptive recognition of the continuously input video stream. For a new recognition task, the hidden-layer states of the previously trained model are retained during training and combined with the task collaboration hierarchy; based on the reward bias of each hidden layer in the feature collaboration network, a context-collaborative deep collaborative recognition mechanism is realized, giving the model prior knowledge with long-term dependencies, forming a complete policy for the recognition task, and improving the robustness of recognition.
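The key/weight/retrieve cycle described above is the standard content-based read of memory networks. A minimal sketch follows, using a softmax over cosine similarities to form the attention weights; the patent's exact parameterization of keys and gates is not given, so this is an assumed but conventional instantiation.

```python
import numpy as np

def retrieve(key, M):
    """Content-based memory read: compare `key` against each row of
    memory M by cosine similarity, softmax the scores into attention
    weights, and return the weighted sum of rows plus the weights."""
    sims = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-12)
    w = np.exp(sims - sims.max())    # stable softmax over similarities
    w /= w.sum()
    return w @ M, w                  # retrieved vector phi, attention weights
```

In the model above, one such read per task r would produce φ_t^r, and the concatenation [φ_t^1, ..., φ_t^R, h_t] forms the recognition mechanism Ψ_t.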
As shown in Fig. 7, Embodiment 2 of the present invention provides a multitask collaborative recognition verification platform fusing video perception.
With the continuous input of large-scale multi-source heterogeneous video data and the continual increase of context-adaptive perception recognition tasks, large amounts of data storage and computing resources are needed. Using the distributed, multi-node, multi-GPU intelligent collaborative processing mechanism of high-performance computing, a multi-source visual multitask collaborative recognition verification platform is built. Its purpose is to conduct research on multi-source-data multi-visual-task collaborative recognition and to assess the theoretical research results involved in the platform. It aims to provide researchers with an extensible framework for realizing and testing visual collaborative recognition models, a basic test environment for models, and system performance analysis methods and indicators for related data, and to provide AI developers with tools integrated with the fundamental research. In future smart city construction, it can serve as a valuable research verification platform for the further research and development of intelligent information push, personalized control services, and the like oriented to multi-source heterogeneous data.
Combined with the above intelligent verification demonstration platform, the output of multi-task collaborative recognition results collected from visual perception data is realized, providing a standard platform for subsequent in-depth research and productization. The test methods take into account characteristics such as the efficiency, dynamics and intelligence of multi-visual-task collaborative analysis and, combined with software-engineering design specifications, an easily extensible verification demonstration system is designed using object-oriented programming methodology.
As can be seen from the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be realized by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, magnetic disk or optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present invention.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by anyone skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A multi-task recognition method fusing video perception, characterized by comprising the following steps:
Step S110: combining biological perception mechanisms, extract the generic features of multi-source heterogeneous video data based on the shared semantic mechanism of multi-source heterogeneous video data feature collaboration;
Step S120: using context-aware computing theory, establish a feature association learning mechanism for task collaboration, perform continuous learning on the generic features of the multi-source heterogeneous video data as prior knowledge, and generate a context-aware task interaction prediction model;
Step S130: for long-duration input video streams, establish a long-term-dependency generative memory model in combination with the context-aware task interaction prediction model, establish a deep autonomous semi-supervised continuous recognition system based on cooperative dynamics, and realize multi-task recognition.
2. The multi-task recognition method fusing video perception according to claim 1, characterized in that, in step S110, the shared semantic mechanism of multi-source heterogeneous video data feature collaboration comprises:
establishing a three-level feature collaboration mechanism consisting of primitive collaboration based on multi-source heterogeneous video data, dictionary collaboration based on time synchronization, and topic collaboration based on semantic similarity; combining the attributes of the multi-source heterogeneous video data to establish a multi-source heterogeneous video data feature collaboration model; and determining a dimension-regular shared semantic association relationship; wherein:
the primitive collaboration based on multi-source heterogeneous video data comprises: training video image primitives using independent component analysis, matching the video image primitives in turn using Gabor functions, estimating the scale and orientation corresponding to each video image primitive, extracting the primitive features of the video images, and realizing efficient time-space-domain coding of the internal structure of the video images;
the dictionary collaboration based on time synchronization comprises: using local linear coding with local distance as the regularization term of the sparse basis functions, computing the best response signal of the original dictionary, using the best response signal to compute a feasible dictionary search direction, and completing one dictionary update; establishing a coded concept stream for each data channel as the reference semantic coding of complex events; performing dynamic time alignment between the newly input low-level feature stream and the reference semantic coding, and generating a time translation function to realize dictionary semantic alignment;
the topic collaboration based on semantic similarity comprises: using latent semantic analysis to construct the co-occurrence matrix between the dictionary and the video image primitive features, embodying the semantic concept corresponding to each topic with hidden nodes, realizing the description of the mapping relations among vocabulary, topic nodes and scenes by probabilistic inference, and computing the video conditional probability under the topic distribution as the category-specific similarity, i.e. the likelihood function between the true probability and the predicted probability of concept vocabulary and scenes.
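The Gabor-based primitive matching of the primitive-collaboration step can be sketched as follows. This is a hedged illustration, not the patent's method: the filter size, scales, orientations, the σ = scale/2 coupling and the toy stripe patch are all assumptions; the point is only how the strongest filter response estimates a primitive's scale and orientation.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real Gabor filter: a Gaussian envelope times an oriented cosine wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate to orientation theta
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

# Bank of filters over candidate scales (wavelengths) and orientations.
scales = [4.0, 8.0]
thetas = [0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
bank = {(s, t): gabor_kernel(15, s, t, s / 2) for s in scales for t in thetas}

# Toy image primitive: stripes of wavelength 8 varying along x, which the
# theta = 0 filter of matching wavelength is oriented to pick up.
patch = np.cos(2 * np.pi * np.arange(15) / 8.0)[None, :].repeat(15, axis=0)

# Match the primitive against the bank; the best response estimates its
# scale and orientation, as in the primitive-collaboration step.
responses = {k: float(np.abs((kern * patch).sum())) for k, kern in bank.items()}
best_scale, best_theta = max(responses, key=responses.get)
```

Here `best_scale`/`best_theta` recover the wavelength-8, horizontal-frequency structure of the patch.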
3. The multi-task recognition method fusing video perception according to claim 2, characterized in that establishing the multi-source heterogeneous video data feature collaboration model and determining the dimension-regular shared semantic association relationship comprises:
assuming there are C classes of heterogeneous channel features, for the i-th class (i = 1, ..., C) denoting the feature matrix of its ni training samples as Xi, the data noise part as E, and the rotation factor as Γ, and establishing the optimization function under the orthogonality constraint:
wherein λ denotes the sharing matrix coefficient, the superscript T denotes matrix transposition, Yi denotes the label of the i-th feature class, F denotes the Frobenius norm, ΘiT denotes the transpose of the projection matrix Θi, α, β, μ1 and μ2 are multiplier factors, and rank(X) is the rank of the feature matrix X;
obtaining the low-dimensional manifold subspaces {Θi} of generic semantic features, the semantic sharing matrix W0 under the unified framework and the specific feature module matrices {Wi}, and using the least-squares method to solve the joint optimal solution of the prediction loss function R1(W0, {Wi}, {Θi}), the reconstruction loss function R2({Θi}) and the regularization function R3(W0, {Wi});
projecting the newly input multi-source heterogeneous video data into the feature subspace to extract high-level generic feature descriptions of uniform dimension, and establishing the shared semantic association relationship.
4. The multi-task recognition method fusing video perception according to claim 3, characterized in that, in step S120, using context-aware computing theory to establish the feature association learning mechanism for task collaboration and generate the context-aware task interaction prediction model comprises:
constructing the mapping function between visual labels and generic features under a low-rank constraint to realize feature-label collaboration; introducing kernel functions to model label correlations and feature correlations, while introducing graph regularization terms to retain the intrinsic structure of the existing data, realizing label prediction for unlabeled features; and establishing the following unconstrained function:
wherein g is the mapping function of feature association learning, the data fidelity term Q(·) minimizes the loss function evaluating the error between the given labels and the task prediction results obtained by the function g and is used for fitting the given labels, Φ(g) and Λ(g) are regularization terms based on prior assumptions, and λ and γ are regularization parameters;
the context-aware task interaction prediction model comprises an interactive environment, an environment model and a loss model;
the environment model is used to learn the dynamic environmental changes of the input features, and the loss model is used to estimate the loss of the environment model and to predict the visual regions, targets and tasks to be recognized that will need attention in the future;
the interactive environment comprises: defining the state space by the generic feature descriptions at time t and time t-1, the current state at time t specifying the recognition task at, and predicting the task state to be recognized at the next time t+1;
the environment model comprises: given the historical information H = {h = (st-k, at-k, ..., st, at, st+1)}, with the generic-feature history mapping function ξ: H → X, where ξ(h) = (st-k, at-k, ..., st, at), and the ground-truth-label history mapping function η: H → Y, where η(h) = st+1, learning the environment model mapping function ξ(h) → η(h); denoting the environment model as ω with ω(ξ(h)) ∈ Y, and introducing the loss model Lwm(ω(ξ(h)), η(h)) at each task prediction; task prediction involves the inverse dynamics prediction mechanism and softmax cross-entropy loss for predicting future states; a neural network model ωφ based on stochastic gradient descent, containing shared weights, encodes all states into a low-dimensional latent space to complete visual attention region extraction and state prediction;
the loss model comprises: given the state st and the suggested next task, predicting the probability distribution over which task the environment model Rl performs next, with the softmax cross-entropy loss function encoding the state of the next task as the penalty term.
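The softmax cross-entropy penalty used by the loss model to score the next task can be written out directly. The four-task logit values below are invented numbers; the sketch only shows how the penalty behaves for a correct versus an incorrect next-task prediction.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softmax_cross_entropy(logits, target):
    """Loss term of the loss model: -log p(next task = target)."""
    p = softmax(logits)
    return float(-np.log(p[target]))

# Hypothetical illustration: given state s_t, score which of 4 candidate
# tasks should be attended to next.
logits = np.array([2.0, 0.5, -1.0, 0.1])       # unnormalised task scores
probs = softmax(logits)                         # distribution over next tasks
loss_if_correct = softmax_cross_entropy(logits, target=0)
loss_if_wrong = softmax_cross_entropy(logits, target=2)
```

A low-probability (wrong) next-task prediction is penalized much more heavily than the high-probability one, which is exactly what drives the task-demand prediction.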
5. The multi-task recognition method fusing video perception according to claim 4, characterized in that, in step S130, for long-duration input video streams, establishing the long-term-dependency generative memory model in combination with the context-aware task interaction prediction model comprises:
using an external memory system to enhance the timing generative model, storing the effective information of the generic feature descriptions from the early stage of the sequence onward, and establishing a sustainable generative memory model over the stored information; specifically,
the generative memory model comprises the generic feature description set e≤T = {e1, e2, ..., eT} of feature collaboration and the hidden variable set z≤T = {z1, z2, ..., zT} of task collaboration; the translation mapping ht = fh(ht-1, et, zt) corrects the deterministic hidden state variable ht at each time point; the prior mapping function fz(ht-1) describes the nonlinear dependence between past observations and the hidden variables and provides the hidden-variable distribution parameters; the nonlinear observation mapping function fe(zt, ht-1) provides the likelihood function depending on the hidden variables and the state; the external memory model corrects the timing variational autoencoder and generates a memory context ψt at each time point; the prior information and posterior information are expressed respectively as follows:
prior information: pθ(zt | z<t, e<t) = N(zt | fzμ(Ψt-1), fzσ(Ψt-1))
posterior information: qφ(zt | z<t, e≤t) = N(zt | fqμ(Ψt-1, et), fqσ(Ψt-1, et))
wherein fzμ is the translation mapping function of the hidden variable z for the state μ, fzσ is the translation mapping function of the hidden variable z for the state σ, fqμ is the translation mapping function of the posterior probability q for the state μ, and fqσ is the translation mapping function of the posterior probability q for the state σ; the prior information is a diagonal Gaussian distribution function of the memory context relying on the prior mapping fz, and the diagonal Gaussian approximate posterior distribution depends on the memory context Ψt-1 associated by the posterior mapping function fq and the current observation et.
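A minimal numpy sketch of the diagonal-Gaussian prior and posterior above, together with the KL divergence a variational training objective would use. The tanh mappings standing in for fz and fq, the fixed posterior standard deviation, and the dimensions are all assumptions; in the model these mappings are learned from the memory context.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 6

# Toy mapping functions f_z (prior) and f_q (posterior); stand-ins for
# learned networks reading the memory context Psi_{t-1} (and e_t for f_q).
Wz_mu = rng.standard_normal((dim, dim))
Wz_sig = rng.standard_normal((dim, dim))
Wq = rng.standard_normal((2 * dim, dim))

def prior(psi_prev):
    mu = np.tanh(psi_prev @ Wz_mu)
    sigma = np.exp(0.5 * np.tanh(psi_prev @ Wz_sig))  # positive diagonal std
    return mu, sigma

def posterior(psi_prev, e_t):
    h = np.concatenate([psi_prev, e_t])               # memory context + obs
    return np.tanh(h @ Wq), np.full(dim, 0.5)         # fixed diagonal std

def kl_diag_gauss(mu_q, sig_q, mu_p, sig_p):
    """KL(q||p) between diagonal Gaussians: the term pulling the prior
    toward the memory-conditioned posterior during training."""
    return 0.5 * float(np.sum(
        (sig_q / sig_p) ** 2
        + ((mu_p - mu_q) / sig_p) ** 2
        - 1.0
        + 2.0 * np.log(sig_p / sig_q)
    ))

psi_prev = rng.standard_normal(dim)   # memory context Psi_{t-1}
e_t = rng.standard_normal(dim)        # current generic-feature observation
mu_p, sig_p = prior(psi_prev)
mu_q, sig_q = posterior(psi_prev, e_t)
z_t = mu_q + sig_q * rng.standard_normal(dim)   # reparameterised sample
kl = kl_diag_gauss(mu_q, sig_q, mu_p, sig_p)
```

The sampled `z_t` plays the role of the task-collaboration hidden variable; the KL term is always non-negative.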
6. The multi-task recognition method fusing video perception according to claim 5, characterized in that establishing the deep autonomous semi-supervised continuous recognition system based on cooperative dynamics and realizing multi-task recognition comprises:
in the deep collaborative recognition algorithm based on the generative memory model, using the evolution process of the collaborative potential-energy function, introducing the memory model into the dynamic process of co-evolution, reducing the solving of the prototype mode and the companion mode to a nonlinear optimization problem, and obtaining the optimized contract network weights;
the long short-term memory network frnn is used to promote the state history ht; the external memory Mt is generated from the hidden variables of the previous moment and the external context information ct; the generative model is as follows:
state update: (ht, Mt) = frnn(ht-1, Mt-1, zt-1, ct)
in order to form task recognition instructions derived from the memory Mt, a set of key values is introduced; cosine similarity is used to compare each key with every row of the memory Mt-1, generating task attention weights; the retrieved memory is obtained as the weighted sum of the attention weights and the memory Mt-1, realizing multi-task recognition; wherein:
key value:
task attention weight:
retrieval memory:
recognition generation:
wherein the key-value function of the r-th item is used to promote the state history, fatt is the attention mechanism function, the memory weight of the i-th point of the r-th item at time t is generated, the result is obtained by the retrieval memory equation, ⊙ denotes element-wise multiplication, the bias-related value is learned by memory retrieval, and σ(·) is the sigmoid function; from these, the expression mechanism informing memory storage and retrieval is formed, and the output of the generative memory model is obtained as the result.
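The cosine-similarity retrieval in this claim can be sketched as follows: compare a query key against every memory row, turn the similarities into attention weights, and read out the weighted sum. The memory size, the sharpening temperature of 10, and the sigmoid gate at the end are illustrative assumptions, not the patent's exact equations.

```python
import numpy as np

def cosine_similarity(key, mem):
    # Similarity of one query key against every row of memory M_{t-1}.
    num = mem @ key
    den = np.linalg.norm(mem, axis=1) * np.linalg.norm(key) + 1e-8
    return num / den

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(3)
M = rng.standard_normal((10, 8))             # external memory: 10 slots x 8 dims
key = M[4] + 0.01 * rng.standard_normal(8)   # query key close to slot 4

# Task attention weights from cosine similarity (sharpened by a temperature),
# then retrieval as the attention-weighted sum of the memory rows.
weights = softmax(10.0 * cosine_similarity(key, M))
phi = weights @ M                            # retrieved memory

# Sigmoid-gated output of the generative memory model (illustrative gating).
gate = 1.0 / (1.0 + np.exp(-(phi * key).sum() / len(phi)))
```

Because the key nearly matches slot 4, the attention mass concentrates there and the retrieval returns roughly that row.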
7. A multi-task collaborative recognition system fusing video perception, characterized by comprising a generic feature extraction module, a collaborative feature learning module and a deep collaborative recognition module;
the generic feature extraction module is used for combining biological perception mechanisms and extracting the generic features of multi-source heterogeneous video data based on the shared semantic mechanism of multi-source heterogeneous video data feature collaboration;
the collaborative feature learning module is used for establishing the feature association learning mechanism of task collaboration using context-aware computing theory, performing continuous learning on the generic features of the multi-source heterogeneous video data as prior knowledge, and generating the context-aware task interaction prediction model;
the deep collaborative recognition module is used for, with respect to long-duration input video streams, establishing the long-term-dependency generative memory model in combination with the context-aware task interaction prediction model, establishing the deep autonomous semi-supervised continuous recognition system based on cooperative dynamics, and realizing multi-task recognition.
8. The multi-task collaborative recognition system fusing video perception according to claim 7, characterized in that the generic feature extraction module comprises a primitive collaboration submodule, a dictionary collaboration submodule and a topic collaboration submodule;
the primitive collaboration submodule is used for training video image primitives using independent component analysis, matching the video image primitives in turn using Gabor functions, estimating the scale and orientation corresponding to each video image primitive, extracting the primitive features of the video images, and realizing efficient time-space-domain coding of the internal structure of the video images;
the dictionary collaboration submodule is used for local linear coding with local distance as the regularization term of the sparse basis functions, computing the best response signal of the original dictionary, using the best response signal to compute a feasible dictionary search direction, completing one dictionary update, establishing a coded concept stream for each data channel as the reference semantic coding of complex events, performing dynamic time alignment between the newly input low-level feature stream and the reference semantic coding, and generating a time translation function to realize dictionary semantic alignment;
the topic collaboration submodule is used for constructing, by latent semantic analysis, the co-occurrence matrix between the dictionary and the video image primitive features, embodying the semantic concept corresponding to each topic with hidden nodes, realizing the description of the mapping relations among vocabulary, topic nodes and scenes by probabilistic inference, and computing the video conditional probability under the topic distribution as the category-specific similarity, i.e. the likelihood function between the true probability and the predicted probability of concept vocabulary and scenes.
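The dynamic time alignment between a reference semantic coding and a newly input feature stream, as described for the dictionary collaboration submodule, can be sketched with classic dynamic time warping. The scalar sequences below are toy data; a real coding stream would be vector-valued, with a vector distance in place of the absolute difference, and the warping path plays the role of the time-translation function.

```python
import numpy as np

def dtw_alignment(ref, new):
    """Dynamic time warping between a reference coding stream and a newly
    input feature stream; returns the total cost and the warping path."""
    n, m = len(ref), len(new)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(ref[i - 1] - new[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack the optimal path (pairs of aligned time indices).
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(D[n, m]), path[::-1]

# Reference semantic coding vs. the same pattern played back more slowly.
ref = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
new = np.array([0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0])
cost, path = dtw_alignment(ref, new)
```

The two streams are the same event at different tempos, so the warping aligns them at zero cost.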
9. The multi-task collaborative recognition system fusing video perception according to claim 8, characterized in that the collaborative feature learning module comprises a feature association learning submodule and a context-aware task interaction prediction submodule;
the feature association learning submodule is used for constructing the mapping function between visual labels and generic features under the low-rank constraint to realize feature-label collaboration;
the context-aware task interaction prediction submodule is used for dynamically and adaptively adjusting the tasks to be recognized according to scene changes, through the learned feature association relationships, combined with the prior knowledge of visual perception and the task collaboration processing mechanism based on the environment model and the loss function, completing the dynamic adjustment of visual attention region perception and task demand prediction.
10. The multi-task collaborative recognition system fusing video perception according to claim 9, characterized in that the deep collaborative recognition module comprises a long-term-dependency generative memory model submodule and a multi-task deep collaborative recognition submodule;
the long-term-dependency generative memory model submodule is used for, with respect to long-duration input video streams, establishing the long-term-dependency generative memory model in combination with the context-aware task interaction prediction model;
the multi-task deep collaborative recognition submodule is used for establishing the deep autonomous semi-supervised continuous recognition system based on cooperative dynamics and realizing multi-task recognition.
CN201810744934.4A 2018-07-09 2018-07-09 Merge the multitask coordinated recognition methods and system of video-aware Withdrawn CN108846384A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810744934.4A CN108846384A (en) 2018-07-09 2018-07-09 Merge the multitask coordinated recognition methods and system of video-aware

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810744934.4A CN108846384A (en) 2018-07-09 2018-07-09 Merge the multitask coordinated recognition methods and system of video-aware

Publications (1)

Publication Number Publication Date
CN108846384A true CN108846384A (en) 2018-11-20

Family

ID=64195944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810744934.4A Withdrawn CN108846384A (en) 2018-07-09 2018-07-09 Merge the multitask coordinated recognition methods and system of video-aware

Country Status (1)

Country Link
CN (1) CN108846384A (en)

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376964A (en) * 2018-12-10 2019-02-22 杭州世平信息科技有限公司 A kind of criminal case charge prediction technique based on Memory Neural Networks
CN109376963A (en) * 2018-12-10 2019-02-22 杭州世平信息科技有限公司 A kind of criminal case charge law article unified prediction neural network based
CN109492059A (en) * 2019-01-03 2019-03-19 北京理工大学 A kind of multi-source heterogeneous data fusion and Modifying model process management and control method
CN109687845A (en) * 2018-12-25 2019-04-26 苏州大学 A kind of sparse regularization multitask sef-adapting filter network of the cluster of robust
CN109711411A (en) * 2018-12-10 2019-05-03 浙江大学 A kind of image segmentation and identification method based on capsule neuron
CN109784399A (en) * 2019-01-11 2019-05-21 中国人民解放军海军航空大学 Based on the multi-source image target association method for improving dictionary learning
CN109919177A (en) * 2019-01-23 2019-06-21 西北工业大学 Feature selection approach based on stratification depth network
CN109933788A (en) * 2019-02-14 2019-06-25 北京百度网讯科技有限公司 Type determines method, apparatus, equipment and medium
CN109977194A (en) * 2019-03-20 2019-07-05 华南理工大学 Text similarity computing method, system, equipment and medium based on unsupervised learning
CN109992703A (en) * 2019-01-28 2019-07-09 西安交通大学 A kind of credibility evaluation method of the differentiation feature mining based on multi-task learning
CN110020626A (en) * 2019-04-09 2019-07-16 中通服公众信息产业股份有限公司 A kind of multi-source heterogeneous data personal identification method based on attention mechanism
CN110147711A (en) * 2019-02-27 2019-08-20 腾讯科技(深圳)有限公司 Video scene recognition methods, device, storage medium and electronic device
CN110245267A (en) * 2019-05-17 2019-09-17 天津大学 Multi-user's video flowing deep learning is shared to calculate multiplexing method
CN110309861A (en) * 2019-06-10 2019-10-08 浙江大学 A kind of multi-modal mankind's activity recognition methods based on generation confrontation network
CN110378190A (en) * 2019-04-23 2019-10-25 南京邮电大学 Video content detection system and detection method based on topic identification
CN110688916A (en) * 2019-09-12 2020-01-14 武汉理工大学 Video description method and device based on entity relationship extraction
CN110928889A (en) * 2019-10-23 2020-03-27 深圳市华讯方舟太赫兹科技有限公司 Training model updating method, device and computer storage medium
CN110956105A (en) * 2019-11-20 2020-04-03 北京影谱科技股份有限公司 Gesture recognition method based on semantic probability network
CN111160443A (en) * 2019-12-25 2020-05-15 浙江大学 Activity and user identification method based on deep multitask learning
CN111242318A (en) * 2020-01-13 2020-06-05 拉扎斯网络科技(上海)有限公司 Business model training method and device based on heterogeneous feature library
CN111488840A (en) * 2020-04-15 2020-08-04 桂林电子科技大学 Human behavior classification method based on multi-task learning model
CN112100256A (en) * 2020-08-06 2020-12-18 北京航空航天大学 Data-driven urban accurate depth image system and method
CN112527993A (en) * 2020-12-17 2021-03-19 浙江财经大学东方学院 Cross-media hierarchical deep video question-answer reasoning framework
CN112766470A (en) * 2019-10-21 2021-05-07 地平线(上海)人工智能技术有限公司 Feature data processing method, instruction sequence generation method, device and equipment
CN113110517A (en) * 2021-05-24 2021-07-13 郑州大学 Multi-robot collaborative search method based on biological elicitation in unknown environment
CN113128669A (en) * 2021-04-08 2021-07-16 中国科学院计算技术研究所 Neural network model for semi-supervised learning and semi-supervised learning method
CN113220911A (en) * 2021-05-25 2021-08-06 中国农业科学院农业信息研究所 Agricultural multi-source heterogeneous data analysis and mining method and application thereof
CN113268818A (en) * 2021-07-19 2021-08-17 中国空气动力研究与发展中心计算空气动力研究所 Pneumatic global optimization method based on topological mapping generation, storage medium and terminal
CN113285721A (en) * 2021-06-10 2021-08-20 北京邮电大学 Reconstruction and prediction algorithm for sparse mobile sensing data
CN113411765A (en) * 2021-05-22 2021-09-17 西北工业大学 Mobile intelligent terminal energy consumption optimization method based on multi-sensor cooperative sensing
CN113438204A (en) * 2021-05-06 2021-09-24 中国地质大学(武汉) Multi-node cooperative identification response method based on block chain
CN113505611A (en) * 2021-07-09 2021-10-15 中国人民解放军战略支援部队信息工程大学 Training method and system for obtaining better speech translation model in generation of confrontation
CN113537355A (en) * 2021-07-19 2021-10-22 金鹏电子信息机器有限公司 Multi-element heterogeneous data semantic fusion method and system for security monitoring
CN113780578A (en) * 2021-09-08 2021-12-10 北京百度网讯科技有限公司 Model training method and device, electronic equipment and readable storage medium
CN113822048A (en) * 2021-09-16 2021-12-21 电子科技大学 Social media text denoising method based on space-time burst characteristics
CN113949880A (en) * 2021-09-02 2022-01-18 北京大学 Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method
CN114694177A (en) * 2022-03-10 2022-07-01 电子科技大学 Fine-grained character attribute identification method based on multi-scale features and attribute association mining
CN114783022A (en) * 2022-04-08 2022-07-22 马上消费金融股份有限公司 Information processing method and device, computer equipment and storage medium
CN114898319A (en) * 2022-05-25 2022-08-12 山东大学 Vehicle type recognition method and system based on multi-sensor decision-level information fusion
CN115632684A (en) * 2022-12-21 2023-01-20 香港中文大学(深圳) Transmission strategy design method of perception and communication integrated system
CN115985402A (en) * 2023-03-20 2023-04-18 北京航空航天大学 Cross-modal data migration method based on normalized flow theory
CN116503029A (en) * 2023-06-27 2023-07-28 北京中电科卫星导航系统有限公司 Module data cooperative processing method and system for automatic driving
CN117292274A (en) * 2023-11-22 2023-12-26 成都信息工程大学 Hyperspectral wet image classification method based on zero-order learning of deep semantic dictionary
CN111815030B (en) * 2020-06-11 2024-02-06 浙江工商大学 Multi-target feature prediction method based on small amount of questionnaire survey data

Cited By (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376963B (en) * 2018-12-10 2022-04-08 杭州世平信息科技有限公司 Criminal case and criminal name and criminal law joint prediction method based on neural network
CN109376963A (en) * 2018-12-10 2019-02-22 杭州世平信息科技有限公司 A kind of criminal case charge law article unified prediction neural network based
CN109711411A (en) * 2018-12-10 2019-05-03 浙江大学 A kind of image segmentation and identification method based on capsule neuron
CN109376964A (en) * 2018-12-10 2019-02-22 杭州世平信息科技有限公司 A kind of criminal case charge prediction technique based on Memory Neural Networks
CN109376964B (en) * 2018-12-10 2021-11-12 杭州世平信息科技有限公司 Criminal case criminal name prediction method based on memory neural network
CN109687845A (en) * 2018-12-25 2019-04-26 苏州大学 A kind of sparse regularization multitask sef-adapting filter network of the cluster of robust
CN109492059A (en) * 2019-01-03 2019-03-19 北京理工大学 A kind of multi-source heterogeneous data fusion and Modifying model process management and control method
CN109492059B (en) * 2019-01-03 2020-10-27 北京理工大学 Multi-source heterogeneous data fusion and model correction process control method
CN109784399A (en) * 2019-01-11 2019-05-21 中国人民解放军海军航空大学 Based on the multi-source image target association method for improving dictionary learning
CN109919177B (en) * 2019-01-23 2022-03-29 西北工业大学 Feature selection method based on hierarchical deep network
CN109919177A (en) * 2019-01-23 2019-06-21 西北工业大学 Feature selection approach based on stratification depth network
CN109992703A (en) * 2019-01-28 2019-07-09 西安交通大学 A kind of credibility evaluation method of the differentiation feature mining based on multi-task learning
CN109992703B (en) * 2019-01-28 2022-03-01 西安交通大学 Reliability evaluation method for differentiated feature mining based on multi-task learning
CN109933788B (en) * 2019-02-14 2023-05-23 北京百度网讯科技有限公司 Type determining method, device, equipment and medium
CN109933788A (en) * 2019-02-14 2019-06-25 北京百度网讯科技有限公司 Type determines method, apparatus, equipment and medium
CN110147711A (en) * 2019-02-27 2019-08-20 腾讯科技(深圳)有限公司 Video scene recognition methods, device, storage medium and electronic device
CN110147711B (en) * 2019-02-27 2023-11-14 腾讯科技(深圳)有限公司 Video scene recognition method and device, storage medium and electronic device
CN109977194A (en) * 2019-03-20 2019-07-05 华南理工大学 Text similarity computing method, system, equipment and medium based on unsupervised learning
CN109977194B (en) * 2019-03-20 2021-08-10 华南理工大学 Text similarity calculation method, system, device and medium based on unsupervised learning
CN110020626A (en) * 2019-04-09 2019-07-16 中通服公众信息产业股份有限公司 A kind of multi-source heterogeneous data personal identification method based on attention mechanism
CN110378190B (en) * 2019-04-23 2022-10-04 南京邮电大学 Video content detection system and detection method based on topic identification
CN110378190A (en) * 2019-04-23 2019-10-25 南京邮电大学 Video content detection system and detection method based on topic identification
CN110245267B (en) * 2019-05-17 2023-08-11 天津大学 Multi-user video stream deep learning sharing calculation multiplexing method
CN110245267A (en) * 2019-05-17 2019-09-17 天津大学 Multi-user's video flowing deep learning is shared to calculate multiplexing method
CN110309861A (en) * 2019-06-10 2019-10-08 浙江大学 A kind of multi-modal mankind's activity recognition methods based on generation confrontation network
CN110688916A (en) * 2019-09-12 2020-01-14 武汉理工大学 Video description method and device based on entity relationship extraction
CN112766470B (en) * 2019-10-21 2024-05-07 地平线(上海)人工智能技术有限公司 Feature data processing method, instruction sequence generating method, device and equipment
CN112766470A (en) * 2019-10-21 2021-05-07 地平线(上海)人工智能技术有限公司 Feature data processing method, instruction sequence generation method, device and equipment
CN110928889A (en) * 2019-10-23 2020-03-27 深圳市华讯方舟太赫兹科技有限公司 Training model updating method, device and computer storage medium
CN110956105A (en) * 2019-11-20 2020-04-03 北京影谱科技股份有限公司 Gesture recognition method based on semantic probability network
CN111160443B (en) * 2019-12-25 2023-05-23 浙江大学 Activity and user identification method based on deep multitasking learning
CN111160443A (en) * 2019-12-25 2020-05-15 浙江大学 Activity and user identification method based on deep multitask learning
CN111242318B (en) * 2020-01-13 2024-04-26 拉扎斯网络科技(上海)有限公司 Service model training method and device based on heterogeneous feature library
CN111242318A (en) * 2020-01-13 2020-06-05 拉扎斯网络科技(上海)有限公司 Business model training method and device based on heterogeneous feature library
CN111488840A (en) * 2020-04-15 2020-08-04 桂林电子科技大学 Human behavior classification method based on multi-task learning model
CN111815030B (en) * 2020-06-11 2024-02-06 浙江工商大学 Multi-target feature prediction method based on small amount of questionnaire survey data
CN112100256B (en) * 2020-08-06 2023-05-26 北京航空航天大学 Data-driven urban precise depth portrait system and method
CN112100256A (en) * 2020-08-06 2020-12-18 北京航空航天大学 Data-driven urban accurate depth image system and method
CN112527993A (en) * 2020-12-17 2021-03-19 浙江财经大学东方学院 Cross-media hierarchical deep video question-answer reasoning framework
CN113128669A (en) * 2021-04-08 2021-07-16 中国科学院计算技术研究所 Neural network model for semi-supervised learning and semi-supervised learning method
CN113438204A (en) * 2021-05-06 2021-09-24 中国地质大学(武汉) Multi-node cooperative identification response method based on block chain
CN113411765A (en) * 2021-05-22 2021-09-17 西北工业大学 Mobile intelligent terminal energy consumption optimization method based on multi-sensor cooperative sensing
CN113110517A (en) * 2021-05-24 2021-07-13 郑州大学 Multi-robot collaborative search method based on biological elicitation in unknown environment
CN113220911B (en) * 2021-05-25 2024-02-02 中国农业科学院农业信息研究所 Agricultural multi-source heterogeneous data analysis and mining method and application thereof
CN113220911A (en) * 2021-05-25 2021-08-06 中国农业科学院农业信息研究所 Agricultural multi-source heterogeneous data analysis and mining method and application thereof
CN113285721A (en) * 2021-06-10 2021-08-20 北京邮电大学 Reconstruction and prediction algorithm for sparse mobile sensing data
CN113505611A (en) * 2021-07-09 2021-10-15 中国人民解放军战略支援部队信息工程大学 Training method and system for obtaining better speech translation model in generation of confrontation
CN113537355A (en) * 2021-07-19 2021-10-22 金鹏电子信息机器有限公司 Multi-element heterogeneous data semantic fusion method and system for security monitoring
CN113268818A (en) * 2021-07-19 2021-08-17 中国空气动力研究与发展中心计算空气动力研究所 Pneumatic global optimization method based on topological mapping generation, storage medium and terminal
CN113949880A (en) * 2021-09-02 2022-01-18 北京大学 Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method
CN113780578B (en) * 2021-09-08 2023-12-12 北京百度网讯科技有限公司 Model training method, device, electronic equipment and readable storage medium
CN113780578A (en) * 2021-09-08 2021-12-10 北京百度网讯科技有限公司 Model training method and device, electronic equipment and readable storage medium
CN113822048B (en) * 2021-09-16 2023-03-21 电子科技大学 Social media text denoising method based on space-time burst characteristics
CN113822048A (en) * 2021-09-16 2021-12-21 电子科技大学 Social media text denoising method based on space-time burst characteristics
CN114694177A (en) * 2022-03-10 2022-07-01 电子科技大学 Fine-grained character attribute identification method based on multi-scale features and attribute association mining
CN114694177B (en) * 2022-03-10 2023-04-28 电子科技大学 Fine-grained character attribute identification method based on multi-scale feature and attribute association mining
CN114783022B (en) * 2022-04-08 2023-07-21 马上消费金融股份有限公司 Information processing method, device, computer equipment and storage medium
CN114783022A (en) * 2022-04-08 2022-07-22 马上消费金融股份有限公司 Information processing method and device, computer equipment and storage medium
CN114898319A (en) * 2022-05-25 2022-08-12 山东大学 Vehicle type recognition method and system based on multi-sensor decision-level information fusion
CN114898319B (en) * 2022-05-25 2024-04-02 山东大学 Vehicle type recognition method and system based on multi-sensor decision level information fusion
CN115632684A (en) * 2022-12-21 2023-01-20 香港中文大学(深圳) Transmission strategy design method of perception and communication integrated system
CN115985402B (en) * 2023-03-20 2023-09-19 北京航空航天大学 Cross-modal data migration method based on normalized flow theory
CN115985402A (en) * 2023-03-20 2023-04-18 北京航空航天大学 Cross-modal data migration method based on normalized flow theory
CN116503029A (en) * 2023-06-27 2023-07-28 北京中电科卫星导航系统有限公司 Module data cooperative processing method and system for automatic driving
CN116503029B (en) * 2023-06-27 2023-09-05 北京中电科卫星导航系统有限公司 Module data cooperative processing method and system for automatic driving
CN117292274B (en) * 2023-11-22 2024-01-30 成都信息工程大学 Hyperspectral wetland image classification method based on zero-shot learning with a deep semantic dictionary
CN117292274A (en) * 2023-11-22 2023-12-26 成都信息工程大学 Hyperspectral wetland image classification method based on zero-shot learning with a deep semantic dictionary

Similar Documents

Publication Publication Date Title
CN108846384A (en) Multi-task collaborative recognition method and system fusing video perception
Qin et al. A dual-stage attention-based recurrent neural network for time series prediction
Kaymak et al. A brief survey and an application of semantic image segmentation for autonomous driving
CN108804715A (en) Multi-task collaborative recognition method and system fusing audio-visual perception
CN109829541A (en) Deep neural network incremental training method and system based on learning automaton
CN111507378A (en) Method and apparatus for training image processing model
CN111860951A (en) Rail transit passenger flow prediction method based on dynamic hypergraph convolutional network
CN116415654A (en) Data processing method and related equipment
CN109102000A (en) Image recognition method based on hierarchical feature extraction and multilayer spiking neural networks
Alshmrany Adaptive learning style prediction in e-learning environment using levy flight distribution based CNN model
CN112417289B (en) Information intelligent recommendation method based on deep clustering
Chen et al. Binarized neural architecture search for efficient object recognition
Gupta et al. Rv-gan: Recurrent gan for unconditional video generation
Qin et al. [Retracted] Evaluation of College Students’ Ideological and Political Education Management Based on Wireless Network and Artificial Intelligence with Big Data Technology
Gao Application of convolutional neural network in emotion recognition of ideological and political teachers in colleges and universities
CN113657272B (en) Micro video classification method and system based on missing data completion
CN113553918B (en) Machine ticket issuing character recognition method based on pulse active learning
Wu et al. Short-term memory neural network-based cognitive computing in sports training complexity pattern recognition
CN113408721A (en) Neural network structure searching method, apparatus, computer device and storage medium
CN117131933A (en) Multi-mode knowledge graph establishing method and application
CN116737897A (en) Intelligent building knowledge extraction model and method based on multiple modes
Ikram A benchmark for evaluating Deep Learning based Image Analytics
Zhang et al. A fast evolutionary knowledge transfer search for multiscale deep neural architecture
Su et al. Soft regression of monocular depth using scale-semantic exchange network
CN112036546A (en) Sequence processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20181120