CN104133807A - Method and device for learning a common feature representation of cross-platform multi-modal media data

Publication number: CN104133807A (granted as CN104133807B)
Application number: CN201410366722.9A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 徐常胜 (Changsheng Xu), 杨小汕 (Xiaoshan Yang), 张天柱 (Tianzhu Zhang)
Assignee: Institute of Automation, Chinese Academy of Sciences
Legal status: Granted; Active

Abstract

The invention discloses a method and a device for learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder. The method comprises the following steps: S1, an optimization objective function is constructed, in which the media data features of different platforms and different modalities (comprising image features and text features) are reconstructed with a single-layer denoising auto-encoder, a modality-correlation constraint and a cross-platform constraint being taken into account during reconstruction; S2, the analytical solution of the optimization objective function is derived, the global optimum being obtained by finding the point at which the partial derivatives are zero; S3, the analytical solution is evaluated by a marginalization method, in which the random noise added to the feature vectors is marginalized out by means of the weak law of large numbers.

Description

Method and device for learning a common feature representation of cross-platform multi-modal media data
Technical field
The invention belongs to the field of social media analysis and cross-media feature representation, and specifically relates to a method of learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder.
Background art
With the rapid spread of Web 2.0, more and more social media websites (for example Flickr, YouTube, Facebook and Google) allow users to publish and share information. Events happening around people are therefore recorded and propagated ever faster, producing large amounts of media data in different modalities, such as images, text and video. According to statistics, within one minute 3,125 pictures are uploaded to Flickr, 700K messages are posted on Facebook, and 2MM videos are viewed on YouTube. The information uploaded by users is not only enormous in volume but also resides on different platforms and in different modalities. These social multimedia data contain valuable information and have been used in a large number of applications. For example, real-time social media streams (Twitter) have been used for semantic video recommendation, social event prediction and image annotation. Image information on Flickr has been successfully used to predict the 2008 US presidential election, to monitor product marketing and to forecast product sales. Facial expressions in social media images have been used to monitor public opinion during presidential elections.
In the various applications of social media, the key problem is how to extract effective features from massive media data. Most current methods are based on the contextual information of the media, such as time, location and textual descriptions. Such descriptive information is easy to extract, but a large amount of media data does not contain it, and for such data no effective feature representation can be obtained. Content-based social media information extraction can solve this problem, but content-based feature representation faces three difficulties: (1) social media data are multi-modal; for example, a media sample on a social media website is usually represented by an image and text at the same time; (2) social media data are cross-platform; for example, images about a particular social event may be present on Flickr and Facebook simultaneously; (3) traditional hand-crafted features cannot effectively represent the semantic information contained in multimedia data.
Summary of the invention
The object of the invention is, in view of the cross-platform and multi-modal characteristics of social media data, to improve the expressive power of low-level features by means of a denoising auto-encoder, to mine the common semantic features of data of different modalities by maximizing the correlation between modalities, and to learn a common feature representation of multimedia data on different platforms through a cross-platform constraint.
To achieve the above object, the invention provides a method of learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder, the method comprising the following steps:
Step S1: constructing an optimization objective function in which the media data features of different platforms and different modalities are reconstructed with a single-layer denoising auto-encoder, a modality-correlation constraint and a cross-platform constraint being taken into account during reconstruction; the media data features of the different platforms and modalities comprise image features and text features;
Step S2: deriving the analytical solution of the optimization objective function, the global optimum being obtained by finding the point at which the partial derivatives are zero;
Step S3: evaluating the analytical solution by a marginalization method, in which the random noise added to the feature vectors is marginalized out by means of the weak law of large numbers.
The invention also provides a device for learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder, characterized in that the device comprises:
a construction module for constructing an optimization objective function in which the media data features of different platforms and different modalities are reconstructed with a single-layer denoising auto-encoder, a modality-correlation constraint and a cross-platform constraint being taken into account during reconstruction; the media data features of the different platforms and modalities comprise image features and text features;
an analytical-solution module for deriving the analytical solution of the optimization objective function, the global optimum being obtained by finding the point at which the partial derivatives are zero;
a solving module for evaluating the analytical solution by a marginalization method, in which the random noise added to the feature vectors is marginalized out by means of the weak law of large numbers.
Beneficial effects of the invention: the denoising auto-encoder improves the expressive power of low-level features, the modality-correlation constraint helps to find the most strongly correlated features between data of different modalities, and the platform-adaptation (cross-platform) constraint reduces the difference between the distributions of multimedia data features on different platforms.
Brief description of the drawings
Fig. 1 is a schematic diagram of the cross-platform multi-modal auto-encoder of the present invention.
Detailed description of the embodiments
To make the object, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to a specific embodiment and the accompanying drawing.
The present embodiment assumes two modalities, image and text, distributed over two social media platforms, Google and Flickr. We use $X^s$ and $Y^s$ to denote the image and text features of the $n_1$ media data samples on the Google platform, and $X^t$ and $Y^t$ to denote the image and text features of the $n_2$ media data samples on the Flickr platform, where $n_2 = n - n_1$ and $n$ is the total number of media data samples, i.e. the number of image/text feature pairs. In addition, we use $X = [X^s, X^t]$ and $Y = [Y^s, Y^t]$ to denote the image and text features combined over the two platforms, and $\tilde{x}$ and $\tilde{y}$ to denote the image feature vector $x$ and the text feature vector $y$ after noise has been added. $\bar{X}$ is the matrix composed of $m$ copies of $X$, $\bar{Y}$ is the matrix composed of $m$ copies of $Y$, and $\tilde{X}$ and $\tilde{Y}$ are the corresponding noise-corrupted versions, where $m$ is the number of times noise is added. Adding noise means that a randomly chosen subset of the matrix elements is set to 0.
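For illustration only, the data matrices described in this paragraph could be built as in the following NumPy sketch; the dimensions, the random data and the helper name replicate_and_corrupt are assumptions made for the example and are not part of the patent.

import numpy as np

# Illustrative sketch; dimensions, names and data are assumptions, not from the patent.
rng = np.random.default_rng(0)
d_x, d_y, n1, n2, m, p = 64, 32, 100, 80, 5, 0.7
n = n1 + n2

Xs, Xt = rng.normal(size=(d_x, n1)), rng.normal(size=(d_x, n2))
Ys, Yt = rng.normal(size=(d_y, n1)), rng.normal(size=(d_y, n2))
X = np.hstack([Xs, Xt])          # X = [X^s, X^t], all image features
Y = np.hstack([Ys, Yt])          # Y = [Y^s, Y^t], all text features

def replicate_and_corrupt(M, m, p, rng):
    """Return (M_bar, M_tilde): m clean copies of M and their corrupted
    versions, where each element is set to 0 with probability p."""
    M_bar = np.tile(M, (1, m))
    mask = rng.random(M_bar.shape) >= p   # keep an element with probability 1 - p
    return M_bar, M_bar * mask

X_bar, X_tilde = replicate_and_corrupt(X, m, p, rng)
Y_bar, Y_tilde = replicate_and_corrupt(Y, m, p, rng)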
The invention proposes a method of learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder; the method comprises three parts: 1) constructing the optimization objective function, 2) deriving the analytical solution, and 3) fast solving by marginalization. Specifically, the method comprises the following steps:
Step S1: constructing the optimization objective function. To learn a common representation of the media data on different platforms, we first reconstruct the input features with a single-layer denoising auto-encoder. In a single-layer denoising auto-encoder, the input data are mapped to the output layer by a linear mapping matrix, and the output is a reconstruction of the input data to which noise has been added. We use $W_x$ and $W_y$ to denote the linear mapping matrices of the denoising auto-encoders for images and text respectively; these mapping matrices project the image and text feature vectors into the same feature space, so that the data on different platforms no longer exhibit a platform difference. In addition, we use $L_x(W_x)$ and $L_y(W_y)$ to denote the reconstruction objectives of the two modalities, i.e. the reconstruction errors of the image and text feature vectors, and $\Omega_{mc}(W_x, W_y)$ and $\Omega_{cd}(W_x, W_y)$ to denote the modality-correlation constraint and the cross-platform constraint, i.e. the semantic difference between data of different modalities and the distributional difference between data on different platforms. The final optimization objective can be expressed as

$$\min_{W_x, W_y} J(W_x, W_y) = \lambda_x L_x(W_x) + \lambda_y L_y(W_y) + \lambda_c \Omega_{mc}(W_x, W_y) + \lambda_m \Omega_{cd}(W_x, W_y) \qquad (1)$$

where $\lambda_x$, $\lambda_y$, $\lambda_c$ and $\lambda_m$ are regularization parameters. Unlike a traditional denoising auto-encoder, which considers only a single modality, formula (1) not only contains the reconstruction objectives $L_x$ and $L_y$ of the image and text modalities, but also adds the modality-correlation constraint $\Omega_{mc}$ and the cross-platform constraint $\Omega_{cd}$. By solving (1) we map the data of the two modalities, image and text, into a feature space in which the modality correlation is maximized and the platform difference is minimized. The reconstruction objectives of the image data and the text data in (1) are defined respectively as

$$L_x(W_x) = \mathrm{Tr}\big[(\bar{X} - W_x \tilde{X})(\bar{X} - W_x \tilde{X})^T\big] \qquad (2)$$

$$L_y(W_y) = \mathrm{Tr}\big[(\bar{Y} - W_y \tilde{Y})(\bar{Y} - W_y \tilde{Y})^T\big] \qquad (3)$$

where $\mathrm{Tr}[\cdot]$ denotes the trace of a matrix.
For each modality, the samples on the Flickr and Google platforms are combined to construct the reconstruction objective, so that the learned feature representation reduces the distributional difference of data of the same modality across platforms.
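As a concrete illustration of the reconstruction objectives (2) and (3), the following sketch evaluates the reconstruction error for a given mapping matrix; the matrix sizes and the function name reconstruction_loss are assumptions of the example.

import numpy as np

def reconstruction_loss(W, M_bar, M_tilde):
    """L(W) = Tr[(M_bar - W M_tilde)(M_bar - W M_tilde)^T], the squared
    Frobenius norm of the reconstruction residual of the noisy copies."""
    R = M_bar - W @ M_tilde
    return np.trace(R @ R.T)      # equivalently: np.sum(R**2)

# Example: an untrained identity mapping on d-dimensional image features (illustrative).
rng = np.random.default_rng(1)
d, cols = 64, 500
X_bar = rng.normal(size=(d, cols))
X_tilde = X_bar * (rng.random((d, cols)) >= 0.7)
print(reconstruction_loss(np.eye(d), X_bar, X_tilde))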
To take the correlation between data of different modalities into account, we maximize, in the spirit of canonical correlation analysis (CCA), the correlation between the two modalities while reconstructing the image and text data. The modality-correlation constraint $\Omega_{mc}$ is defined as

$$\Omega_{mc}(W_x, W_y) = \mathrm{Tr}\big[W_x C_{xx} W_x^T\big] - 2\,\mathrm{Tr}\big[W_x C_{xy} W_y^T\big] + \mathrm{Tr}\big[W_y C_{yy} W_y^T\big] = \mathrm{Tr}\big[(W_x \tilde{X} - W_y \tilde{Y})(W_x \tilde{X} - W_y \tilde{Y})^T\big] \qquad (4)$$

where $C_{xx} = \tilde{X}\tilde{X}^T$ is the variance matrix of the image data, $C_{yy} = \tilde{Y}\tilde{Y}^T$ is the variance matrix of the text data, and $C_{xy} = \tilde{X}\tilde{Y}^T$ is the covariance matrix between the image and text data. Minimizing (4) maximizes the correlation between the mapped image features $W_x\tilde{X}$ and the mapped text features $W_y\tilde{Y}$.
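A minimal sketch of the modality-correlation constraint (4), assuming (for this illustration only) that image and text features share one dimension; it also checks numerically that the trace form and the residual form of (4) coincide.

import numpy as np

# Illustrative sketch; dimensions, names and data are assumptions, not from the patent.
def modality_correlation(Wx, Wy, X_tilde, Y_tilde):
    """Omega_mc = Tr[(Wx X~ - Wy Y~)(Wx X~ - Wy Y~)^T], equation (4)."""
    D = Wx @ X_tilde - Wy @ Y_tilde
    return np.trace(D @ D.T)

rng = np.random.default_rng(2)
d, cols = 64, 500                       # common feature dimension (assumed)
X_tilde = rng.normal(size=(d, cols))
Y_tilde = rng.normal(size=(d, cols))
Wx, Wy = rng.normal(size=(d, d)), rng.normal(size=(d, d))

Cxx, Cyy, Cxy = X_tilde @ X_tilde.T, Y_tilde @ Y_tilde.T, X_tilde @ Y_tilde.T
trace_form = (np.trace(Wx @ Cxx @ Wx.T) - 2 * np.trace(Wx @ Cxy @ Wy.T)
              + np.trace(Wy @ Cyy @ Wy.T))
assert np.isclose(trace_form, modality_correlation(Wx, Wy, X_tilde, Y_tilde))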
To take the difference between the media data on different platforms into account, we reduce this difference by means of the maximum mean discrepancy (MMD). Specifically, with $\varphi_x(x) = W_x x$ and $\varphi_y(y) = W_y y$, the cross-platform constraint $\Omega_{cd}$ is defined as

$$\Omega_{cd}(W_x, W_y) = \mathrm{Tr}\big[W_x G_x G_x^T W_x^T\big] + \mathrm{Tr}\big[W_y G_y G_y^T W_y^T\big] \qquad (5)$$

where

$$G_x = \frac{1}{m n_1}\sum_{j=1}^{m}\sum_{i=1}^{n_1}\tilde{x}_{ij} - \frac{1}{m n_2}\sum_{j=1}^{m}\sum_{i=n_1+1}^{n}\tilde{x}_{ij}, \qquad G_y = \frac{1}{m n_1}\sum_{j=1}^{m}\sum_{i=1}^{n_1}\tilde{y}_{ij} - \frac{1}{m n_2}\sum_{j=1}^{m}\sum_{i=n_1+1}^{n}\tilde{y}_{ij},$$

i.e. $G_x$ and $G_y$ are the differences between the mean corrupted feature vectors of the two platforms, and (5) measures the squared distance between the platform means after mapping by $\varphi_x$ and $\varphi_y$.
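A sketch of the cross-platform constraint (5), assuming the column layout used in the earlier sketches (the m corrupted copies stacked side by side, with the first n_1 columns of each copy belonging to the first platform); the function names are illustrative.

import numpy as np

# Illustrative sketch; the column convention and names are assumptions, not from the patent.
def platform_mean_gap(M_tilde, n, n1, m):
    """G = mean corrupted feature on platform 1 minus mean on platform 2.
    M_tilde has shape (d, m*n); within each of the m copies, the first n1
    columns are platform-1 samples and the rest are platform-2 samples."""
    d = M_tilde.shape[0]
    copies = M_tilde.reshape(d, m, n)          # split the m corrupted copies
    g1 = copies[:, :, :n1].mean(axis=(1, 2))   # 1/(m*n1) * sum over platform 1
    g2 = copies[:, :, n1:].mean(axis=(1, 2))   # 1/(m*n2) * sum over platform 2
    return g1 - g2

def cross_platform_constraint(Wx, Wy, Gx, Gy):
    """Omega_cd = Tr[Wx Gx Gx^T Wx^T] + Tr[Wy Gy Gy^T Wy^T], equation (5)."""
    return np.sum((Wx @ Gx) ** 2) + np.sum((Wy @ Gy) ** 2)

rng = np.random.default_rng(3)
d, n1, n2, m = 64, 100, 80, 5
n = n1 + n2
X_tilde = rng.normal(size=(d, m * n)) * (rng.random((d, m * n)) >= 0.7)
Gx = platform_mean_gap(X_tilde, n, n1, m)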
Step S2: deriving the analytical solution. The four functions $L_x$, $L_y$, $\Omega_{mc}$ and $\Omega_{cd}$ defined in step S1 are all convex, so (1) is a convex quadratic programming problem. The global optimum of this quadratic programming problem can be obtained by finding the point at which the partial derivatives are zero.
The partial derivative of (1) with respect to $W_x$ can be computed as

$$\frac{\partial J}{\partial W_x} = W_x Q_x - 2\big(\lambda_x \bar{C}_{xx} + \lambda_c W_y C_{xy}^T\big) \qquad (6)$$

where $C_{xx} = \tilde{X}\tilde{X}^T$, $\bar{C}_{xx} = \bar{X}\tilde{X}^T$ and $Q_x = 2\big((\lambda_x + \lambda_c) C_{xx} + \lambda_m G_x G_x^T\big)$. Similarly, the partial derivative with respect to $W_y$ is

$$\frac{\partial J}{\partial W_y} = W_y Q_y - 2\big(\lambda_y \bar{C}_{yy} + \lambda_c W_x C_{xy}\big) \qquad (7)$$

where $C_{yy} = \tilde{Y}\tilde{Y}^T$, $\bar{C}_{yy} = \bar{Y}\tilde{Y}^T$ and $Q_y = 2\big((\lambda_y + \lambda_c) C_{yy} + \lambda_m G_y G_y^T\big)$.

By solving the system $\partial J / \partial W_x = 0$, $\partial J / \partial W_y = 0$, when $Q_x$ and $Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}$ are both invertible, we obtain the analytical solutions of $W_x$ and $W_y$:

$$W_y = 2\big(2\lambda_c\lambda_x \bar{C}_{xx} Q_x^{-1} C_{xy} + \lambda_y \bar{C}_{yy}\big)\big(Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}\big)^{-1} \qquad (8)$$

$$W_x = 2\big(\lambda_c W_y C_{xy}^T + \lambda_x \bar{C}_{xx}\big) Q_x^{-1} \qquad (9)$$
In practical applications, $Q_x$ and $Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}$ are usually invertible. If they are not, an approximate solution is obtained using the pseudoinverse.
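The closed-form solution (8)-(9) maps directly onto standard linear-algebra routines. The sketch below is an illustration under stated assumptions, not a reference implementation: it takes the scatter matrices and platform-gap vectors as inputs, uses the pseudoinverse as the fallback mentioned above, and assumes that image and text features have the same dimensionality, which the formulas implicitly require.

import numpy as np

# Illustrative sketch; argument layout and names are assumptions, not from the patent.
def solve_mappings(Cxx, Cxx_bar, Cyy, Cyy_bar, Cxy, Gx, Gy,
                   lam_x, lam_y, lam_c, lam_m):
    """Closed-form W_x, W_y from equations (8) and (9)."""
    Gx, Gy = Gx.reshape(-1, 1), Gy.reshape(-1, 1)
    Qx = 2 * ((lam_x + lam_c) * Cxx + lam_m * Gx @ Gx.T)
    Qy = 2 * ((lam_y + lam_c) * Cyy + lam_m * Gy @ Gy.T)

    Qx_inv = np.linalg.pinv(Qx)     # pseudoinverse also covers the singular case
    A = Qy - 4 * lam_c**2 * Cxy.T @ Qx_inv @ Cxy
    Wy = 2 * (2 * lam_c * lam_x * Cxx_bar @ Qx_inv @ Cxy
              + lam_y * Cyy_bar) @ np.linalg.pinv(A)          # equation (8)
    Wx = 2 * (lam_c * Wy @ Cxy.T + lam_x * Cxx_bar) @ Qx_inv  # equation (9)
    return Wx, Wy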
Step S3: fast solving by marginalization. A traditional auto-encoder easily learns a meaningless identity matrix. To avoid this, the denoising auto-encoder reconstructs the original feature vectors from input feature vectors to which noise has been added; the noise is added by randomly setting a subset of the elements of the input vector to zero. The main function of the denoising auto-encoder is to predict the lost elements from the elements that have not been lost, and the random noise added to the original feature vectors helps to capture the statistical structure of the input. In the optimization objective introduced in step S1, noise is added to the original feature vectors $m$ times. The larger $m$ is, the more effective the learned linear mapping, but the time cost also grows. Here we therefore marginalize out the random noise of the feature vectors in (1) by means of the weak law of large numbers.
Let $p$ denote the probability that an element of a feature vector is changed (set to zero) when noise is added; the probability that an element is not changed by the noise is therefore $1-p$. In the implementation of the algorithm, $p$ is a parameter that must be set in advance by experiment; in our experiments a value of $p = 0.7$ gives good results. The quantities $C_{xx}$, $\bar{C}_{xx}$, $C_{yy}$, $\bar{C}_{yy}$, $C_{xy}$, $G_x$ and $G_y$ below are replaced by the solutions obtained by marginalizing out the noise.
Let $S = XX^T$. The marginalized solutions of $C_{xx}$ and $\bar{C}_{xx}$ are

$$E(C_{xx}) = \sum_{i=1}^{n} E\big[\tilde{x}_i \tilde{x}_i^T\big] \qquad (10)$$

$$E(\bar{C}_{xx}) = \sum_{i=1}^{n} E\big[x_i \tilde{x}_i^T\big] \qquad (11)$$

where $E[\cdot]$ denotes expectation. As seen in step S2, $C_{xx}$ and $\bar{C}_{xx}$ contain random noise; treating the noise as a random variable, the purpose of the above formulas is to cancel it out by taking the expectation. The element in row $\alpha$ and column $\beta$ of $E(C_{xx})$ and of $E(\bar{C}_{xx})$ can be written directly as

$$E(C_{xx})_{\alpha,\beta} = \begin{cases} S_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ S_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases} \qquad (12)$$

$$E(\bar{C}_{xx})_{\alpha,\beta} = S_{\alpha\beta}(1-p) \qquad (13)$$

where $1-p$ is the probability that an element of a feature vector is not changed by the noise.
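Formulas (12) and (13) can be checked numerically: by the weak law of large numbers, averaging the scatter matrices of many independently corrupted copies converges to the closed-form expectations. A small self-contained check (sizes, seed and variable names are illustrative):

import numpy as np

# Illustrative check; sizes and names are assumptions, not from the patent.
rng = np.random.default_rng(4)
d, n, p, m = 8, 200, 0.7, 2000
X = rng.normal(size=(d, n))
S = X @ X.T

# Closed-form expectations from (12) and (13).
E_Cxx = S * (1 - p) ** 2
np.fill_diagonal(E_Cxx, np.diag(S) * (1 - p))
E_Cxx_bar = S * (1 - p)

# Monte Carlo estimates over m independently corrupted copies of X.
acc, acc_bar = np.zeros((d, d)), np.zeros((d, d))
for _ in range(m):
    X_tilde = X * (rng.random((d, n)) >= p)
    acc += X_tilde @ X_tilde.T       # contributes to C_xx
    acc_bar += X @ X_tilde.T         # contributes to C_xx_bar
print(np.linalg.norm(acc / m - E_Cxx) / np.linalg.norm(E_Cxx))          # shrinks as m grows
print(np.linalg.norm(acc_bar / m - E_Cxx_bar) / np.linalg.norm(E_Cxx_bar))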
Let $R = YY^T$. The marginalized solutions of $C_{yy}$ and $\bar{C}_{yy}$ are

$$E(C_{yy}) = \sum_{i=1}^{n} E\big[\tilde{y}_i \tilde{y}_i^T\big] \qquad (14)$$

$$E(\bar{C}_{yy}) = \sum_{i=1}^{n} E\big[y_i \tilde{y}_i^T\big] \qquad (15)$$

The element in row $\alpha$ and column $\beta$ of $E(C_{yy})$ and of $E(\bar{C}_{yy})$ can be written directly as

$$E(C_{yy})_{\alpha,\beta} = \begin{cases} R_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ R_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases} \qquad (16)$$

$$E(\bar{C}_{yy})_{\alpha,\beta} = R_{\alpha\beta}(1-p) \qquad (17)$$

where $1-p$ is again the probability that an element of a feature vector is not changed by the noise.
Let $U = XY^T$. The marginalized solution of $C_{xy}$ is

$$E(C_{xy}) = \sum_{i=1}^{n} E\big[\tilde{x}_i \tilde{y}_i^T\big] \qquad (18)$$

$$E(C_{xy})_{\alpha,\beta} = \begin{cases} U_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ U_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases} \qquad (19)$$

where $1-p$ is the probability that an element of a feature vector is not changed by the noise.
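Since (12), (16) and (19) share one scaling pattern and (13) and (17) share another, the marginalized second moments can be produced by a single helper. A sketch under the same assumptions as above (the function name and argument layout are not from the patent):

import numpy as np

# Illustrative sketch; the helper and its flag are assumptions, not from the patent.
def marginalize_scatter(A, p, both_noisy=True):
    """Closed-form expectation of a noisy scatter matrix.

    A          : the clean product (S = X X^T, R = Y Y^T or U = X Y^T).
    both_noisy : True  -> pattern of (12)/(16)/(19): off-diagonal entries
                          scale by (1-p)^2, diagonal entries by (1-p);
                 False -> pattern of (13)/(17): every entry scales by (1-p).
    """
    if not both_noisy:
        return A * (1 - p)
    E = A * (1 - p) ** 2
    np.fill_diagonal(E, np.diag(A) * (1 - p))
    return E

# Example with the notation of the text: S = X X^T, R = Y Y^T, U = X Y^T.
rng = np.random.default_rng(5)
d, n, p = 16, 300, 0.7
X, Y = rng.normal(size=(d, n)), rng.normal(size=(d, n))
E_Cxx = marginalize_scatter(X @ X.T, p)                         # (12)
E_Cxx_bar = marginalize_scatter(X @ X.T, p, both_noisy=False)   # (13)
E_Cxy = marginalize_scatter(X @ Y.T, p)                         # (19), as stated in the text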
Let

$$V = \frac{1}{n_1}\sum_{i=1}^{n_1} x_i - \frac{1}{n_2}\sum_{i=n_1+1}^{n} x_i, \qquad Z = \frac{1}{n_1}\sum_{i=1}^{n_1} y_i - \frac{1}{n_2}\sum_{i=n_1+1}^{n} y_i.$$

The marginalized solutions of $G_x$ and $G_y$ are

$$E(G_x) = \frac{1}{n_1}\sum_{i=1}^{n_1} E(\tilde{x}_i) - \frac{1}{n_2}\sum_{i=n_1+1}^{n} E(\tilde{x}_i) \qquad (20)$$

$$E(G_y) = \frac{1}{n_1}\sum_{i=1}^{n_1} E(\tilde{y}_i) - \frac{1}{n_2}\sum_{i=n_1+1}^{n} E(\tilde{y}_i) \qquad (21)$$

$$E(G_x)_{\alpha} = V_{\alpha}(1-p), \qquad E(G_y)_{\alpha} = Z_{\alpha}(1-p)$$

where $1-p$ is the probability that the $\alpha$-th element is not changed by the noise.
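The marginalized platform-gap vectors (20)-(21) reduce to scaling the clean gap by 1-p. A short sketch, reusing the assumption that the first n_1 columns belong to the first platform:

import numpy as np

# Illustrative sketch; the column convention and names are assumptions, not from the patent.
def marginalized_platform_gap(M, n1, p):
    """E(G) = (1 - p) * (mean of platform-1 columns - mean of platform-2 columns),
    the closed form E(G)_a = V_a * (1 - p) of equations (20)-(21)."""
    V = M[:, :n1].mean(axis=1) - M[:, n1:].mean(axis=1)
    return (1 - p) * V

rng = np.random.default_rng(6)
d, n1, n2, p = 16, 100, 80, 0.7
X = rng.normal(size=(d, n1 + n2))
E_Gx = marginalized_platform_gap(X, n1, p)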
Finally, the marginalized form of the solution of the optimization problem (1) of step S1 is obtained as

$$W_y = \big(\lambda_c\lambda_x E(\bar{C}_{xx}) E(Q_x)^{-1} E(C_{xy}) + \lambda_y E(\bar{C}_{yy})\big)\big(E(Q_y) - \lambda_c^2 E(C_{xy})^T E(Q_x)^{-1} E(C_{xy})\big)^{-1} \qquad (22)$$

$$W_x = \big(\lambda_c W_y E(C_{xy})^T + \lambda_x E(\bar{C}_{xx})\big) E(Q_x)^{-1} \qquad (23)$$

where $E(Q_x) = 2\big((\lambda_x + \lambda_c) E(C_{xx}) + \lambda_m E(G_x) E(G_x)^T\big)$ and $E(Q_y) = 2\big((\lambda_y + \lambda_c) E(C_{yy}) + \lambda_m E(G_y) E(G_y)^T\big)$.
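Putting the pieces together, the following sketch computes W_x and W_y directly from the clean data matrices using the marginalized quantities, so the noisy copies never need to be materialized. It is a best-effort illustration under the assumptions already noted (equal image and text feature dimensionality, the diagonal handling of E(C_xy) as stated in (19), illustrative regularization weights), not a reference implementation of the patented method.

import numpy as np

# Illustrative sketch; function names, weights and data are assumptions, not from the patent.
def marginalize_scatter(A, p, both_noisy=True):
    # Closed-form expectations of the noisy scatter matrices, equations (12)-(19).
    if not both_noisy:
        return A * (1 - p)
    E = A * (1 - p) ** 2
    np.fill_diagonal(E, np.diag(A) * (1 - p))
    return E

def learn_common_mappings(X, Y, n1, p, lam_x, lam_y, lam_c, lam_m):
    """Marginalized closed-form solution for W_x, W_y, equations (22)-(23)."""
    # Marginalized second moments from S = X X^T, R = Y Y^T, U = X Y^T.
    E_Cxx = marginalize_scatter(X @ X.T, p)
    E_Cxx_bar = marginalize_scatter(X @ X.T, p, both_noisy=False)
    E_Cyy = marginalize_scatter(Y @ Y.T, p)
    E_Cyy_bar = marginalize_scatter(Y @ Y.T, p, both_noisy=False)
    E_Cxy = marginalize_scatter(X @ Y.T, p)

    # Marginalized platform-gap vectors, equations (20)-(21).
    E_Gx = (1 - p) * (X[:, :n1].mean(axis=1) - X[:, n1:].mean(axis=1))[:, None]
    E_Gy = (1 - p) * (Y[:, :n1].mean(axis=1) - Y[:, n1:].mean(axis=1))[:, None]

    E_Qx = 2 * ((lam_x + lam_c) * E_Cxx + lam_m * E_Gx @ E_Gx.T)
    E_Qy = 2 * ((lam_y + lam_c) * E_Cyy + lam_m * E_Gy @ E_Gy.T)
    E_Qx_inv = np.linalg.pinv(E_Qx)          # pseudoinverse covers singular cases

    A = E_Qy - lam_c**2 * E_Cxy.T @ E_Qx_inv @ E_Cxy
    Wy = (lam_c * lam_x * E_Cxx_bar @ E_Qx_inv @ E_Cxy
          + lam_y * E_Cyy_bar) @ np.linalg.pinv(A)                    # (22)
    Wx = (lam_c * Wy @ E_Cxy.T + lam_x * E_Cxx_bar) @ E_Qx_inv        # (23)
    return Wx, Wy

# Toy usage with random "image" and "text" features of equal dimension.
rng = np.random.default_rng(7)
d, n1, n2 = 32, 120, 90
X = rng.normal(size=(d, n1 + n2))
Y = rng.normal(size=(d, n1 + n2))
Wx, Wy = learn_common_mappings(X, Y, n1, p=0.7,
                               lam_x=1.0, lam_y=1.0, lam_c=0.5, lam_m=0.5)
print(Wx.shape, Wy.shape)   # (32, 32) (32, 32)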
The specific embodiment described above further explains the object, technical solution and beneficial effects of the present invention. It should be understood that the above is only a specific embodiment of the invention and is not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (6)

1. A method of learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder, characterized in that the method comprises the following steps:
Step S1: constructing an optimization objective function in which the media data features of different platforms and different modalities are reconstructed with a single-layer denoising auto-encoder, a modality-correlation constraint and a cross-platform constraint being taken into account during reconstruction; the media data features of the different platforms and modalities comprise image features and text features;
Step S2: deriving the analytical solution of the optimization objective function, the global optimum being obtained by finding the point at which the partial derivatives are zero;
Step S3: evaluating the analytical solution by a marginalization method, in which the random noise added to the feature vectors is marginalized out by means of the weak law of large numbers.
2. The method according to claim 1, characterized in that the optimization objective function in step S1 is constructed with a denoising auto-encoder and can be expressed as

$$\min_{W_x, W_y} J(W_x, W_y) = \lambda_x L_x(W_x) + \lambda_y L_y(W_y) + \lambda_c \Omega_{mc}(W_x, W_y) + \lambda_m \Omega_{cd}(W_x, W_y)$$

where $\lambda_x$, $\lambda_y$, $\lambda_c$ and $\lambda_m$ are regularization parameters; $W_x$ and $W_y$ are the linear mapping matrices of the denoising auto-encoders for the image features and the text features; $L_x(W_x)$ and $L_y(W_y)$ are the reconstruction objectives of the image features and the text features; and $\Omega_{mc}(W_x, W_y)$ and $\Omega_{cd}(W_x, W_y)$ are the modality-correlation constraint and the cross-platform constraint.
3. The method according to claim 2, characterized in that $L_x(W_x)$ and $L_y(W_y)$ are defined respectively as

$$L_x(W_x) = \mathrm{Tr}\big[(\bar{X} - W_x \tilde{X})(\bar{X} - W_x \tilde{X})^T\big], \qquad L_y(W_y) = \mathrm{Tr}\big[(\bar{Y} - W_y \tilde{Y})(\bar{Y} - W_y \tilde{Y})^T\big]$$

where $m$ is the number of times noise is added, $n$ is the number of media data feature vectors, $\bar{X}$ is the matrix composed of $m$ copies of $X$, $\bar{Y}$ is the matrix composed of $m$ copies of $Y$, $X = [X^s, X^t]$ and $Y = [Y^s, Y^t]$ are the image feature matrix and the text feature matrix combined over the two platforms, $X = \{x_i \mid i = 1, \ldots, n\}$, $Y = \{y_i \mid i = 1, \ldots, n\}$, $\tilde{X}$ and $\tilde{Y}$ are the noise-corrupted versions of $\bar{X}$ and $\bar{Y}$, $x_i$ is the feature vector in the $i$-th column of the matrix $X$, and $\tilde{x}_{ij}$ is the $j$-th noisy copy of $x_i$, i.e. a column of $\tilde{X}$;

$\Omega_{mc}(W_x, W_y)$ is defined as

$$\Omega_{mc}(W_x, W_y) = \mathrm{Tr}\big[W_x C_{xx} W_x^T\big] - 2\,\mathrm{Tr}\big[W_x C_{xy} W_y^T\big] + \mathrm{Tr}\big[W_y C_{yy} W_y^T\big]$$

where $C_{xx} = \tilde{X}\tilde{X}^T$ is the variance matrix of the image features, $C_{yy} = \tilde{Y}\tilde{Y}^T$ is the variance matrix of the text features, $C_{xy} = \tilde{X}\tilde{Y}^T$ is the covariance matrix between the image and text features, and $\mathrm{Tr}[\cdot]$ denotes the trace of a matrix;

$\Omega_{cd}(W_x, W_y)$ is defined as

$$\Omega_{cd}(W_x, W_y) = \mathrm{Tr}\big[W_x G_x G_x^T W_x^T\big] + \mathrm{Tr}\big[W_y G_y G_y^T W_y^T\big]$$

where $\varphi_x(x) = W_x x$, $\varphi_y(y) = W_y y$,

$$G_x = \frac{1}{m n_1}\sum_{j=1}^{m}\sum_{i=1}^{n_1}\tilde{x}_{ij} - \frac{1}{m n_2}\sum_{j=1}^{m}\sum_{i=n_1+1}^{n}\tilde{x}_{ij}, \qquad G_y = \frac{1}{m n_1}\sum_{j=1}^{m}\sum_{i=1}^{n_1}\tilde{y}_{ij} - \frac{1}{m n_2}\sum_{j=1}^{m}\sum_{i=n_1+1}^{n}\tilde{y}_{ij},$$

and $n_1$ and $n_2$ are the numbers of media data features on the two platforms.
4. The method according to claim 3, characterized in that in step S2 the global optimum is obtained by finding the point at which the partial derivatives of the optimization objective function are zero, specifically comprising:

the partial derivative of the objective function $J$ with respect to $W_x$ is computed as

$$\frac{\partial J}{\partial W_x} = W_x Q_x - 2\big(\lambda_x \bar{C}_{xx} + \lambda_c W_y C_{xy}^T\big)$$

where $C_{xx} = \tilde{X}\tilde{X}^T$, $\bar{C}_{xx} = \bar{X}\tilde{X}^T$ and $Q_x = 2\big((\lambda_x + \lambda_c) C_{xx} + \lambda_m G_x G_x^T\big)$;

the partial derivative with respect to $W_y$ is

$$\frac{\partial J}{\partial W_y} = W_y Q_y - 2\big(\lambda_y \bar{C}_{yy} + \lambda_c W_x C_{xy}\big)$$

where $C_{yy} = \tilde{Y}\tilde{Y}^T$, $\bar{C}_{yy} = \bar{Y}\tilde{Y}^T$ and $Q_y = 2\big((\lambda_y + \lambda_c) C_{yy} + \lambda_m G_y G_y^T\big)$;

by solving the system of equations $\partial J / \partial W_x = 0$, $\partial J / \partial W_y = 0$, when $Q_x$ and $Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}$ are both invertible, the analytical solutions of $W_x$ and $W_y$ are obtained as

$$W_y = 2\big(2\lambda_c\lambda_x \bar{C}_{xx} Q_x^{-1} C_{xy} + \lambda_y \bar{C}_{yy}\big)\big(Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}\big)^{-1}, \qquad W_x = 2\big(\lambda_c W_y C_{xy}^T + \lambda_x \bar{C}_{xx}\big) Q_x^{-1};$$

when $Q_x$ or $Q_y - 4\lambda_c^2 C_{xy}^T Q_x^{-1} C_{xy}$ is not invertible, an approximate solution is obtained using the pseudoinverse.
5. The method according to claim 1, characterized in that step S3 specifically comprises:

let $p$ denote the probability that each element of a feature vector is changed when noise is added;

let $S = XX^T$; the marginalized solutions of $C_{xx}$ and $\bar{C}_{xx}$ are obtained as

$$E(C_{xx}) = \sum_{i=1}^{n} E\big[\tilde{x}_i \tilde{x}_i^T\big], \qquad E(\bar{C}_{xx}) = \sum_{i=1}^{n} E\big[x_i \tilde{x}_i^T\big]$$

and the element in row $\alpha$ and column $\beta$ of $E(C_{xx})$ and of $E(\bar{C}_{xx})$ can be written directly as

$$E(C_{xx})_{\alpha,\beta} = \begin{cases} S_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ S_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases}, \qquad E(\bar{C}_{xx})_{\alpha,\beta} = S_{\alpha\beta}(1-p)$$

where $X = [X^s, X^t]$ is the image feature matrix combined over the two platforms, $X = \{x_i \mid i = 1, \ldots, n\}$, $x_i$ is the feature vector in the $i$-th column of $X$, $\tilde{x}_i$ is the feature vector obtained by adding noise to $x_i$, $E[\cdot]$ denotes expectation, and $S_{\alpha\beta}$ is the element in row $\alpha$ and column $\beta$ of $S$;

let $R = YY^T$; the marginalized solutions of $C_{yy}$ and $\bar{C}_{yy}$ are obtained as

$$E(C_{yy}) = \sum_{i=1}^{n} E\big[\tilde{y}_i \tilde{y}_i^T\big], \qquad E(\bar{C}_{yy}) = \sum_{i=1}^{n} E\big[y_i \tilde{y}_i^T\big]$$

and the element in row $\alpha$ and column $\beta$ of $E(C_{yy})$ and of $E(\bar{C}_{yy})$ can be written directly as

$$E(C_{yy})_{\alpha,\beta} = \begin{cases} R_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ R_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases}, \qquad E(\bar{C}_{yy})_{\alpha,\beta} = R_{\alpha\beta}(1-p)$$

where $Y = [Y^s, Y^t]$ is the text feature matrix combined over the two platforms, $Y = \{y_i \mid i = 1, \ldots, n\}$, $y_i$ is the feature vector in the $i$-th column of $Y$, $\tilde{y}_i$ is the feature vector obtained by adding noise to $y_i$, and $R_{\alpha\beta}$ is the element in row $\alpha$ and column $\beta$ of $R$;

let $U = XY^T$; the marginalized solution of $C_{xy}$ is obtained as

$$E(C_{xy}) = \sum_{i=1}^{n} E\big[\tilde{x}_i \tilde{y}_i^T\big], \qquad E(C_{xy})_{\alpha,\beta} = \begin{cases} U_{\alpha\beta}(1-p)^2 & \text{if } \alpha \neq \beta \\ U_{\alpha\beta}(1-p) & \text{if } \alpha = \beta \end{cases}$$

where $U_{\alpha\beta}$ is the element in row $\alpha$ and column $\beta$ of $U$;

let

$$V = \frac{1}{n_1}\sum_{i=1}^{n_1} x_i - \frac{1}{n_2}\sum_{i=n_1+1}^{n} x_i, \qquad Z = \frac{1}{n_1}\sum_{i=1}^{n_1} y_i - \frac{1}{n_2}\sum_{i=n_1+1}^{n} y_i;$$

the marginalized solutions of $G_x$ and $G_y$ are obtained as

$$E(G_x) = \frac{1}{n_1}\sum_{i=1}^{n_1} E(\tilde{x}_i) - \frac{1}{n_2}\sum_{i=n_1+1}^{n} E(\tilde{x}_i), \qquad E(G_y) = \frac{1}{n_1}\sum_{i=1}^{n_1} E(\tilde{y}_i) - \frac{1}{n_2}\sum_{i=n_1+1}^{n} E(\tilde{y}_i)$$

$$E(G_x)_{\alpha} = V_{\alpha}(1-p), \qquad E(G_y)_{\alpha} = Z_{\alpha}(1-p)$$

where $V_{\alpha}$ and $Z_{\alpha}$ are the $\alpha$-th elements of the vectors $V$ and $Z$ respectively;

the marginalized form of the solution of the optimization objective function is expressed as

$$W_y = \big(\lambda_c\lambda_x E(\bar{C}_{xx}) E(Q_x)^{-1} E(C_{xy}) + \lambda_y E(\bar{C}_{yy})\big)\big(E(Q_y) - \lambda_c^2 E(C_{xy})^T E(Q_x)^{-1} E(C_{xy})\big)^{-1}$$

$$W_x = \big(\lambda_c W_y E(C_{xy})^T + \lambda_x E(\bar{C}_{xx})\big) E(Q_x)^{-1}$$

where $E(Q_x) = 2\big((\lambda_x + \lambda_c) E(C_{xx}) + \lambda_m E(G_x) E(G_x)^T\big)$ and $E(Q_y) = 2\big((\lambda_y + \lambda_c) E(C_{yy}) + \lambda_m E(G_y) E(G_y)^T\big)$.
6. A device for learning a common feature representation of cross-platform multi-modal media data using a denoising auto-encoder, characterized in that the device comprises:
a construction module for constructing an optimization objective function in which the media data features of different platforms and different modalities are reconstructed with a single-layer denoising auto-encoder, a modality-correlation constraint and a cross-platform constraint being taken into account during reconstruction; the media data features of the different platforms and modalities comprise image features and text features;
an analytical-solution module for deriving the analytical solution of the optimization objective function, the global optimum being obtained by finding the point at which the partial derivatives are zero;
a solving module for evaluating the analytical solution by a marginalization method, in which the random noise added to the feature vectors is marginalized out by means of the weak law of large numbers.
Priority Applications (1)

Application Number: CN201410366722.9A
Priority Date / Filing Date: 2014-07-29
Title: Method and device for learning a common feature representation of cross-platform multi-modal media data
Status: Active; granted as CN104133807B

Publications (2)

CN104133807A (application publication): 2014-11-05
CN104133807B (granted patent): 2017-06-23
Family ID: 51806486

Patent Citations (3)

US20130039569A1 — Olympus Corporation — Method and apparatus of compiling image database for three-dimensional object recognition (priority 2010-04-28, published 2013-02-14)
JP2013105373A — Yahoo Japan Corp — Device, method, and program for data acquisition (priority 2011-11-15, published 2013-05-30)
CN103020221A — Institute of Automation, Chinese Academy of Sciences — Social search method based on multi-modal adaptive social-relation-strength mining (priority 2012-12-12, published 2013-04-03)

Non-Patent Citations (1)

朱明 (Zhu Ming), "基于深度网络的图像处理的研究" [Research on image processing based on deep networks], 《电子技术与软件工程》 [Electronic Technology & Software Engineering]

Cited By (2)

CN106202281A — 广东工业大学 (Guangdong University of Technology) — Multi-modal data representation learning method and system (priority 2016-06-28, published 2016-12-07)
CN114145756A — 电子科技大学中山学院 (University of Electronic Science and Technology of China, Zhongshan Institute) — Cooperative robot control method, apparatus and computer readable storage medium (priority 2021-12-15, published 2022-03-08)



Legal Events

C06 / PB01: Publication
C10 / SE01: Entry into force of request for substantive examination
GR01: Patent grant