CN104767998B - Video-oriented visual feature coding method and device - Google Patents

Video-oriented visual feature coding method and device

Info

Publication number
CN104767998B
Authority
CN
China
Prior art keywords
local feature
frame
information
bit stream
video stream
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510134617.7A
Other languages
Chinese (zh)
Other versions
CN104767998A (en
Inventor
段凌宇
黄章帅
陈杰
黄铁军
高文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Application filed by Peking University
Priority to CN201510134617.7A
Publication of CN104767998A
Application granted
Publication of CN104767998B

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video-oriented visual feature coding method and device. The method includes: obtaining the local features of the current frame in a video stream; determining the reference local feature range of the local features of the current frame in a reference frame of the current frame, where the reference frame is one or more frames adjacent to the current frame; determining, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the local features of the current frame; and obtaining the local feature bitstream of the video stream to be sent according to the local features of each frame in the video stream and their reference local features. The method allows a client to compress feature data quickly when transmitting data, which reduces the amount of transmitted data and improves transmission efficiency.

Description

Video-oriented visual feature coding method and device
Technical field
The present invention relates to computer technology, and in particular to a video-oriented visual feature coding method and device.
Background
With the spread of smart terminals, more and more applications capture video streams in real time through a terminal camera and analyze and mine them in real time. How to mine the video and image information that users need from massive amounts of images and video has therefore become a research hotspot.
In current technology, there are two methods for analyzing live video streams on smart terminals.
In the first, the mobile terminal sends the encoded video stream directly to a server, and the server decodes the received video stream and performs visual analysis. The drawback of this scheme is that, to keep the video quality high enough for visual analysis, the video compression ratio must be low and the bitstream is large, which ultimately consumes a great deal of bandwidth.
In the second, the mobile terminal extracts at least one local visual feature from each frame of the video stream's frame sequence in turn, and then sends the local visual features of each frame to the server in sequence for visual analysis. When extracting local visual features, this scheme applies feature dimensionality reduction and quantization to obtain a lower bit rate, but this affects visual analysis to some extent and cannot support a wider range of visual analysis tasks. In addition, the second scheme does not consider the temporal correlation of local visual features between frames, so the feature data stream contains redundancy. As a result, the client transmits a large amount of data with transmission delay, and the real-time processing requirement of visual analysis tasks cannot be met.
Summary of the invention
In view of the defects in the prior art, the present invention provides a video-oriented visual feature coding method and device that allow a client to compress feature data quickly when transmitting data, reducing the amount of transmitted data.
In a first aspect, the present invention provides a video-oriented visual feature coding method, including:
obtaining the local features of the current frame in a video stream;
determining the reference local feature range of the local features of the current frame in a reference frame of the current frame, where the reference frame of the current frame is one or more frames adjacent to the current frame;
determining, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the local features of the current frame;
obtaining, according to the local features of each frame in the video stream and their reference local features, the local feature bitstream of the video stream to be sent.
Optionally, determining the reference local feature range of the local features of the current frame in the reference frame of the current frame includes:
selecting any frame among the one or more frames adjacent to the current frame as the reference frame,
where the reference local feature range is all local features of the whole reference frame;
or
selecting any frame among the one or more frames adjacent to the current frame as the reference frame,
where the reference local feature range is a subset of the local features in the reference frame, and the metric distance between every local feature in the subset and the local feature of the current frame is less than or equal to a preset metric distance.
Optionally, determining, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the local features of the current frame includes:
obtaining the matching similarity between each local feature of the current frame and the local features within the reference local feature range of the reference frame;
comparing the matching similarity corresponding to each local feature of the current frame with a preset threshold range;
if none of the matching similarities corresponding to a local feature of the current frame satisfies the preset threshold range, determining that this local feature of the current frame has no reference local feature;
if two or more of the matching similarities corresponding to a local feature of the current frame satisfy the preset threshold range, selecting among them, as the reference local feature of that local feature, the local feature with the best matching similarity in the reference frame closest in time to the current frame.
Optionally, the local feature bitstream includes:
a header region and a non-header region;
the header region includes: information on whether reference local features are used, information recording the reference frames and the reference local feature range corresponding to each reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameters;
the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between the local features that have a reference local feature and their reference local features.
Optionally, obtaining the local feature bitstream of the video stream to be sent according to the local features of each frame in the video stream and their reference local features includes:
encoding the local features of each frame that have no reference local feature using a first preset coding mode to obtain a first bitstream;
obtaining the residuals between the local features that have a reference local feature and their reference local features;
encoding the residuals using a second preset coding mode to obtain a second bitstream;
the first bitstream and the second bitstream form the local feature bitstream of the video stream to be sent;
the header region of the local feature bitstream consists of binary code, and the non-header region includes the local features encoded using the first preset coding mode and the residuals encoded using the second preset coding mode.
Optionally, the method further includes:
sending the local feature bitstream of the video stream to a server, so that the server obtains the local features of each frame in the video stream from the local feature bitstream.
In a second aspect, the present invention also provides a video-oriented visual feature decoding method, including:
receiving the local feature bitstream of a video stream sent by a client, where the local feature bitstream includes a header region and a non-header region;
obtaining the local features of each frame in the video stream according to the local feature bitstream;
where the header region includes: information on whether reference local features are used, information recording the reference frames and the reference local feature range corresponding to each reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameters;
the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between the local features that have a reference local feature and their reference local features;
correspondingly, obtaining the local features of each frame in the video stream according to the local feature bitstream includes:
after determining from the header region of the local feature bitstream that reference local features are used, obtaining the number of local features of the current frame, the reference index information of the local features, and the information indicating the quantization parameters of the local features;
decoding the local features from the non-header region according to the reference index information of the local features and the information indicating the quantization parameters, to obtain the local features of each frame in the video stream.
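The decoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented decoder: the header is assumed to be already parsed into a dictionary, the field names (`use_reference`, `ref_index`, `qstep`) are hypothetical, the quantization parameter is assumed to be a single uniform step size, and entropy decoding is omitted.

```python
def decode_frame(header, payload, ref_features):
    """Rebuild the local features of one frame from parsed header fields.

    header: dict with hypothetical keys
        "use_reference" (bool), "ref_index" (list of int, -1 = no reference),
        "qstep" (uniform quantization step, an assumed form of the
        quantization parameter).
    payload: one list of quantized integers per local feature; it holds the
        feature itself when there is no reference, otherwise the residual.
    ref_features: already decoded local features of the reference frame.
    """
    qstep = header["qstep"]
    decoded = []
    for ref_idx, data in zip(header["ref_index"], payload):
        values = [v * qstep for v in data]  # inverse quantization
        if header["use_reference"] and ref_idx >= 0:
            # Feature with a reference: add the residual to the reference.
            ref = ref_features[ref_idx]
            values = [r + v for r, v in zip(ref, values)]
        decoded.append(values)
    return decoded
```

A feature whose reference index is -1 is reconstructed from its own payload, while any other feature is reconstructed as reference plus dequantized residual.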
In a third aspect, the present invention also provides a video-oriented visual feature coding device, including:
a local feature obtaining unit, configured to obtain the local features of the current frame in a video stream;
a determining unit, configured to determine the reference local feature range of the local features of the current frame in a reference frame of the current frame, where the reference frame of the current frame is one or more frames adjacent to the current frame;
a reference local feature determining unit, configured to determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the local features of the current frame;
a local feature bitstream obtaining unit, configured to obtain the local feature bitstream of the video stream to be sent according to the local features of each frame in the video stream and their reference local features.
In a fourth aspect, the present invention also provides a server, including:
a receiving unit, configured to receive the local feature bitstream of a video stream sent by a client, where the local feature bitstream includes a header region and a non-header region;
a local feature recovery unit, configured to obtain the local features of each frame in the video stream according to the local feature bitstream;
where the header region includes: information on whether reference local features are used, information recording the reference frames and the reference local feature range corresponding to each reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameters;
the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between the local features that have a reference local feature and their reference local features;
correspondingly, the local feature recovery unit is specifically configured to:
after determining from the header region of the local feature bitstream that reference local features are used, obtain the number of local features of the current frame, the reference index information of the local features, and the information indicating the quantization parameters of the local features;
decode the local features from the non-header region according to the reference index information of the local features and the information indicating the quantization parameters, to obtain the local features of each frame in the video stream.
In a fifth aspect, an embodiment of the present invention also provides a video processing system, including:
any video-oriented visual feature coding device described above and any server described above, where the video-oriented visual feature coding device sends the obtained local feature bitstream of the video stream to the server, and the server restores the local features of each frame in the video stream from the received local feature bitstream.
As the above technical solution shows, the video-oriented visual feature coding method and device of the invention obtain the local features of the current frame in a video stream, determine the reference local features of the current frame in its reference frame, and then obtain the local feature bitstream of the video stream to be sent. This allows the client to compress feature data quickly when transmitting data, reduces the amount of transmitted data, and improves the transmission efficiency of the client's video stream.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the video-oriented visual feature coding method provided by one embodiment of the invention;
Fig. 2 is a schematic flowchart of the video-oriented visual feature coding method provided by another embodiment of the invention;
Fig. 3 is a schematic flowchart of the video-oriented visual feature decoding method provided by another embodiment of the invention;
Fig. 4 is a schematic structural diagram of the video-oriented visual feature coding device provided by one embodiment of the invention;
Fig. 5 is a schematic structural diagram of the server provided by one embodiment of the invention.
Detailed description
The embodiments of the invention are further described below with reference to the accompanying drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and do not limit its scope of protection. The terms "first" and "second" used in the embodiments serve only to make the description clearer; they have no specific meaning and do not limit any content.
Fig. 1 shows a schematic flowchart of the video-oriented visual feature coding method provided by one embodiment of the invention. As shown in Fig. 1, the video-oriented visual feature coding method of this embodiment is as follows.
101. Obtain the local features of the current frame in the video stream.
For example, a local feature may be a Scale Invariant Feature Transform (SIFT) descriptor, a Speeded Up Robust Features (SURF) descriptor, or a Binary Robust Independent Elementary Features (BRIEF) descriptor. These are examples only and do not limit this embodiment.
Note that SIFT and SURF are local features described by floating-point numbers, whereas BRIEF is a local feature with a binary description. SIFT, SURF, and BRIEF are extracted with existing methods, which are not described in detail here.
102. Determine the reference local feature range of the local features of the current frame in a reference frame of the current frame.
In this embodiment, the reference frame of the current frame is one or more frames adjacent to the current frame. The adjacent frame or frames may be one or more frames before or after the time point of the current frame.
In addition, note that the current frame is the frame (image) containing the local features to be encoded, and a reference frame may be a frame (image) whose local features have already been encoded. In time, the reference frame may be the frame immediately before or after the current frame, or several frames before or after it.
For example, any frame among the one or more frames adjacent to the current frame is selected as the reference frame, and the reference local feature range of the local features of the current frame in the reference frame may be all local features of the whole reference frame;
or
any frame among the one or more frames adjacent to the current frame is selected as the reference frame, and the reference local feature range of the local features of the current frame in the reference frame may be a subset of the reference frame's local features chosen according to a preset rule.
An example of the preset rule is a condition on the image coordinates of two local features (the formula itself is not reproduced in this text),
where p is the m-th local feature of the current frame, with its corresponding coordinates in the current frame image, and q is the n-th local feature of the reference frame, with its corresponding coordinates in the reference frame image.
When the condition of the formula is satisfied, the reference-frame local feature q is a local feature that the current-frame local feature p may reference.
In other words, the reference local feature range may be the subset of local features in each reference frame whose metric distance to the local feature of the current frame is less than or equal to a preset metric distance.
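The two ways of choosing the reference local feature range can be sketched as below. Since the preset rule's formula is not available here, the sketch assumes the metric distance is the Euclidean distance between image coordinates; the function name `reference_range` and the `max_dist` parameter are hypothetical.

```python
import math


def reference_range(cur_xy, ref_features, max_dist=None):
    """Return indices of reference-frame features inside the reference range.

    cur_xy: (x, y) coordinates of the current-frame local feature p.
    ref_features: list of ((x, y), descriptor) pairs for the reference frame.
    max_dist: preset metric distance; None means the range is the whole frame.
    The Euclidean coordinate distance used here is an assumption.
    """
    if max_dist is None:
        # Option 1: all local features of the whole reference frame.
        return list(range(len(ref_features)))
    cx, cy = cur_xy
    # Option 2: only features q whose distance to p is within the preset bound.
    return [
        n
        for n, ((qx, qy), _descriptor) in enumerate(ref_features)
        if math.hypot(qx - cx, qy - cy) <= max_dist
    ]
```

Restricting the range to nearby features keeps the later matching step cheap, at the cost of missing features that moved farther than `max_dist` between frames.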
103. Determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the local features of the current frame.
In a specific application, the local feature within the reference local feature range of the reference frame that best matches the local feature to be encoded may be found and used as the reference local feature.
For example, first, obtain the matching similarity between each local feature of the current frame and the local features within the reference local feature range of the reference frame;
second, compare the matching similarity corresponding to each local feature of the current frame with a preset threshold range;
if none of the matching similarities corresponding to a local feature of the current frame satisfies the preset threshold range, determine that this local feature of the current frame has no reference local feature;
if two or more of the matching similarities corresponding to a local feature of the current frame satisfy the preset threshold range, select among them, as the reference local feature of that local feature, the local feature with the best matching similarity in the reference frame closest in time to the current frame.
In this step, the matching similarity may be defined in terms of the distance between the local feature to be encoded and a candidate local feature, formalized as follows (the symbols and formulas of the original are not reproduced in this text):
the local feature to be encoded is the m-th local feature of the current frame, frame i;
the candidate is the n-th local feature of the reference frame, frame j;
the distance between the two may be the Euclidean distance, the Manhattan distance, the Hamming distance, or the like;
the best match is defined by a condition on Dis_min, Dis_second, and a threshold θ,
where Dis_min is the distance from the local feature to be encoded to its nearest candidate feature, and Dis_second is the distance to its second-nearest candidate feature;
a reference local feature that satisfies this best-match definition is the candidate closest in distance to the local feature to be encoded.
Note that when θ = 1, the candidate local feature closest to the local feature to be encoded is the reference local feature.
In a specific implementation, a local feature of the current frame may have no reference local feature that satisfies the above matching definition, or it may have several. When there are several candidates, the local feature with the smallest distance in the reference frame closest in time is selected as the unique reference local feature.
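The matching step can be illustrated with the sketch below. Because the best-match formula is missing here, the sketch assumes a nearest/second-nearest ratio test, `Dis_min <= theta * Dis_second`, which is consistent with the remark that θ = 1 always accepts the nearest candidate. Accepting a lone candidate without a test, and all function names, are further assumptions.

```python
def euclidean(a, b):
    """Euclidean distance for floating-point descriptors such as SIFT or SURF."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def hamming(a, b):
    """Hamming distance for binary descriptors stored as Python integers."""
    return bin(a ^ b).count("1")


def best_in_frame(cur, candidates, theta, dist=euclidean):
    """Return (index, distance) of the accepted candidate in one frame, or None."""
    ranked = sorted((dist(cur, c), n) for n, c in enumerate(candidates))
    if not ranked:
        return None
    dis_min, idx = ranked[0]
    # Assumed ratio test; a single candidate is accepted without the test.
    if len(ranked) > 1 and dis_min > theta * ranked[1][0]:
        return None
    return idx, dis_min


def find_reference(cur, cur_time, ref_frames, theta=0.8, dist=euclidean):
    """Pick the reference feature across several reference frames.

    ref_frames: list of (time, candidate_descriptors) pairs.
    Returns (frame_position, feature_index), or None if nothing matches.
    Ties are broken by temporal closeness first, then by distance.
    """
    hits = []
    for pos, (t, candidates) in enumerate(ref_frames):
        match = best_in_frame(cur, candidates, theta, dist)
        if match is not None:
            idx, d = match
            hits.append((abs(cur_time - t), d, pos, idx))
    if not hits:
        return None
    _, _, pos, idx = min(hits)
    return pos, idx
```

Passing `dist=hamming` switches the same logic to binary descriptors such as BRIEF.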
104. Obtain the local feature bitstream of the video stream to be sent according to the local features of each frame in the video stream and their reference local features.
For example, the local feature bitstream may include a header region and a non-header region;
the header region includes: information on whether reference local features are used, information recording the reference frames and the reference local feature range corresponding to each reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameters;
the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between the local features that have a reference local feature and their reference local features.
In practical applications, step 104 may include the following sub-steps, which are not shown in the figure:
1041. Encode the local features of each frame that have no reference local feature using the first preset coding mode to obtain a first bitstream;
1042. Obtain the residuals between the local features that have a reference local feature and their reference local features;
1043. Encode the residuals using the second preset coding mode to obtain a second bitstream;
1044. The first bitstream and the second bitstream form the local feature bitstream of the video stream to be sent;
1045. The header region of the local feature bitstream consists of binary code, and the non-header region includes the local features encoded using the first preset coding mode and the residuals encoded using the second preset coding mode, for example obtained by entropy coding the transformed and quantized residual values.
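Sub-steps 1041 to 1045 can be sketched as a simple byte layout. This is only an illustration: the field order, the field widths, and the function name `encode_frame` are assumptions, the header carries only a subset of the listed information, and the transform, quantization, and entropy coding of the residuals are replaced by plain fixed-width packing.

```python
import struct


def encode_frame(features, ref_index, ref_features):
    """Pack one frame's local features into header bytes plus payload bytes.

    features: descriptors as lists of integers in 0..255 (assumed already
        quantized).
    ref_index: per-feature index into ref_features, or -1 for no reference.
    ref_features: descriptors of the reference frame.
    """
    use_ref = any(r >= 0 for r in ref_index)
    # Header: use-reference flag, feature count, then one reference index each.
    header = struct.pack("<BH", int(use_ref), len(features))
    header += struct.pack("<%dh" % len(ref_index), *ref_index)
    body = b""
    for feat, r in zip(features, ref_index):
        if r < 0:
            # "First coding mode" stand-in: raw unsigned bytes.
            body += struct.pack("<%dB" % len(feat), *feat)
        else:
            # "Second coding mode" stand-in: signed 16-bit residuals.
            residual = [a - b for a, b in zip(feat, ref_features[r])]
            body += struct.pack("<%dh" % len(residual), *residual)
    return header + body
```

When consecutive frames are similar, most residuals are near zero, which is what makes a real entropy coder effective on the second part of the stream.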
The above method allows the client to compress feature data quickly when transmitting data, which reduces the amount of transmitted data and improves the transmission efficiency of the client's video stream.
In a specific example, step 101 of the method shown in Fig. 1 may instead be the following step 101':
Step 101': obtain the local features of the current frame in the video stream and the attributes of each local feature in the current frame.
The attributes of each local feature of each frame in the video stream may include information related to the local feature, such as its coordinates and scale.
Correspondingly, step 104 may be the following step 104':
Step 104': obtain the local feature bitstream of the video stream to be sent according to the local features of each frame in the video stream, their reference local features, and the attributes of the local features in each frame.
The bitstream includes a header region and a non-header region. The header region consists of a sequence of 0s and 1s and includes: information on whether reference local features are used, information recording the reference frames and the reference local feature range corresponding to each reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameters;
the non-header region includes: the encoded local features of each frame that have no reference local feature together with the attributes of those local features, and the encoded residuals between the local features that have a reference local feature and their reference local features.
Through prediction, the above method can greatly compress the feature data of the video while preserving the performance of visual analysis tasks and meeting the requirement of real-time processing.
Fig. 2 shows a schematic flowchart of the video-oriented visual feature coding method provided by an embodiment of the invention. As shown in Fig. 2, the video-oriented visual feature coding method of this embodiment is as follows.
201. Obtain the local features of the current frame in the video stream, and pre-process the local features of the current frame.
In this embodiment, a local feature of the current frame may be a local feature descriptor, that is, a local feature description vector. There may be one or more local features in this embodiment.
Note that this embodiment differs from the coding method shown in Fig. 1 in that it additionally pre-processes the local features extracted from the current frame.
For example, dimensionality reduction and/or quantization may be applied to the local features of the current frame.
Specifically, a predetermined dimensionality reduction matrix may be used to reduce the dimension of the local features in the subset formed by the local features of the current frame, yielding the reduced local features. The dimensionality reduction matrix is obtained by training on a preset first image dataset using a dimensionality reduction method.
Note that dimensionality reduction is an optional operation. The dimensionality reduction method may be principal component analysis, linear discriminant analysis, or the like; for principal component analysis, see Jolliffe, I.T. (1986), Principal Component Analysis, Springer-Verlag, p. 487.
Quantization may use scalar quantization or vector quantization to turn local features into compact local features. Quantization is also an optional operation in this embodiment.
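The pre-processing of step 201 can be sketched as a matrix projection followed by uniform scalar quantization. The projection matrix would come from offline training (for example PCA), which is not shown; the function names and the uniform step size are assumptions made for illustration.

```python
def reduce_dim(feature, matrix):
    """Project a descriptor with a pre-trained reduction matrix (rows = axes)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in matrix]


def scalar_quantize(values, step):
    """Uniform scalar quantization: map each value to its nearest step index."""
    return [int(round(v / step)) for v in values]


def dequantize(indices, step):
    """Inverse of scalar_quantize, up to the quantization error."""
    return [i * step for i in indices]
```

For instance, a 4-dimensional descriptor projected by a 2x4 matrix becomes a 2-dimensional vector, and a larger `step` gives fewer distinct values and therefore a lower bit rate at the price of a larger reconstruction error.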
202. Determine the reference local feature range of the local features of the current frame in a reference frame of the current frame.
In this embodiment, the reference frame of the current frame is one or more frames adjacent to the current frame.
203. Determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame for the pre-processed local features of the current frame.
204. Obtain the local feature bitstream of the video stream to be sent according to the pre-processed local features of each frame in the video stream and their reference local features.
In this embodiment, the goal of the above coding method is to encode local features into a bitstream.
The local feature bitstream in this embodiment may include a header region and a non-header region;
the header region consists of a sequence of 0s and 1s, and the information it contains may be the same as the header region information listed in step 104; the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between the local features that have a reference local feature and their reference local features.
Note that the local features in the non-header region may be local features encoded using the first preset coding mode;
the residuals in the non-header region may be residuals encoded using the second preset coding mode, for example obtained by entropy coding the transformed and quantized residual values.
In a specific application, the method shown in Fig. 2 may further include the following step, which is not shown in the figure:
sending the local feature bitstream of the video stream to a server, so that the server obtains the local features of each frame in the video stream from the local feature bitstream.
Thus, the above coding method allows the client to compress feature data quickly when transmitting data, which reduces the amount of transmitted data and improves the transmission efficiency of the client's video stream.
In an optional implementation, if the attributes of each local feature of each frame are also obtained in step 201, and those attributes are coordinate attributes in step 204, then the coordinate attributes of each local feature need to be encoded for use in object localization applications. The specific coding methods are as follows:
Coding mode one: take the difference between the coordinates of the local feature to be encoded and the coordinates of its reference local feature, quantize the residual, and entropy-code the resulting values to obtain the residual encoded with the aforementioned second preset coding mode.
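Coding mode one can be illustrated as below. The sketch covers only the difference and quantization steps with an assumed uniform step size; the entropy coding stage is omitted, and the function names are hypothetical.

```python
def encode_coord(cur_xy, ref_xy, step):
    """Quantize the coordinate residual between a feature and its reference."""
    return tuple(int(round((c - r) / step)) for c, r in zip(cur_xy, ref_xy))


def decode_coord(q, ref_xy, step):
    """Rebuild approximate coordinates from the quantized residual."""
    return tuple(r + v * step for v, r in zip(q, ref_xy))
```

With uniform quantization the reconstructed coordinate differs from the original by at most half a step per axis, so `step` directly trades localization accuracy against bit rate.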
Coding mode two: the coordinate set of the local features of the current frame can be obtained from the coordinate set of the local features of the reference frame through an affine transformation. Given the coordinates to be encoded of the m-th local feature in the current frame i and the coordinates of the n-th reference local feature in reference frame j, an affine matrix A maps the reference coordinates to the current-frame coordinates (the symbols and the transformation formula of the original are not reproduced in this text).
The affine matrix A can be computed using the least squares method.
Therefore, the coordinate coding process only needs to encode this affine transformation matrix, together with the residuals between the original coordinates and the transformed coordinates. To further reduce the bit rate, the elements of the affine matrix and the residuals can be quantized.
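The least-squares fit of the affine matrix can be sketched in plain Python as below. The 2x3 parameterization (u = a11·x + a12·y + a13, v = a21·x + a22·y + a23) is the standard form of a planar affine map and is assumed here because the original formula is unavailable; the function names are hypothetical, and the normal equations are solved with a small Gauss-Jordan routine.

```python
def solve3(M, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    A = [row[:] + [v] for row, v in zip(M, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        if abs(A[piv][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        A[col], A[piv] = A[piv], A[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]


def fit_affine(ref_pts, cur_pts):
    """Least-squares 2x3 affine matrix mapping reference to current coordinates."""
    M = [[0.0] * 3 for _ in range(3)]
    bx, by = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(ref_pts, cur_pts):
        h = (x, y, 1.0)
        for i in range(3):
            bx[i] += h[i] * u
            by[i] += h[i] * v
            for j in range(3):
                M[i][j] += h[i] * h[j]  # accumulate the normal equations
    return [solve3(M, bx), solve3(M, by)]


def apply_affine(A, pt):
    """Map a reference coordinate into the current frame with matrix A."""
    x, y = pt
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])
```

After fitting, only the six matrix elements and the per-feature residuals between `apply_affine(A, ref)` and the true current coordinates would need to be coded; at least three non-collinear point pairs are required for the fit.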
In this case, the bit stream of the local feature coding includes a header region and a non-header region;
The header region consists of a number of 0s and 1s and includes: information indicating whether reference local features are used; information recording the reference frame and the reference local feature range corresponding to that reference frame; information indicating the number of local features; information indicating the reference index information of the local features; information indicating the local feature quantization parameter; information indicating whether local feature coordinate attributes are used; and information indicating the coordinate attribute quantization parameter;
The non-header region includes: the encoded local features of each frame that have no reference local feature, together with the attributes of those local features; the encoded residuals between local features that have a reference local feature and their reference local features; and the encoded residuals between the coordinates of local features that have a reference local feature and the coordinates of their reference local features.
Or
The bit stream of the local feature coding includes a header region and a non-header region; the header region consists of a number of 0s and 1s and includes: information indicating whether reference local features are used; information recording the reference frame and the reference local feature range corresponding to that reference frame; information indicating the number of local features; information indicating the reference index information of the local features; information indicating the local feature quantization parameter; information indicating whether local feature coordinate attributes are used; and information indicating the coordinate transformation matrix and its quantization parameter;
The non-header region includes: the encoded local features of each frame that have no reference local feature, together with the attributes of those local features; the encoded residuals between local features that have a reference local feature and their reference local features; and the encoded residuals between the coordinates of local features that have a reference local feature and the coordinates of their reference local features. It should be noted that the attributes of different local features may also be encoded depending on the local feature and the vision application.
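One way to picture the header region is as a fixed sequence of binary fields. The sketch below is only an assumption-laden illustration: the field names, the field order, and all bit widths (1, 4, 10, and 6 bits) are hypothetical, since the text states only which pieces of information the header carries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Header:
    use_reference: bool               # whether reference local features are used
    reference_frame: int              # which reference frame is recorded
    range_is_subset: bool             # reference range: whole frame (False) or subset (True)
    num_features: int                 # number of local features in the frame
    ref_indices: List[Optional[int]]  # per-feature reference index, None = no reference
    feature_qp: int                   # local feature quantization parameter
    use_coordinates: bool             # whether coordinate attributes are coded
    coordinate_qp: int                # coordinate attribute quantization parameter


def _bits(value, width):
    """Render a non-negative integer as a fixed-width string of 0s and 1s."""
    return format(value, "0{}b".format(width))


def pack_header(h):
    out = [_bits(int(h.use_reference), 1), _bits(h.reference_frame, 4),
           _bits(int(h.range_is_subset), 1), _bits(h.num_features, 10)]
    for idx in h.ref_indices:
        # One flag bit per feature, followed by the index only when a reference exists.
        out.append("0" if idx is None else "1" + _bits(idx, 10))
    out += [_bits(h.feature_qp, 6), _bits(int(h.use_coordinates), 1),
            _bits(h.coordinate_qp, 6)]
    return "".join(out)


def unpack_header(bits):
    pos = 0

    def take(width):
        nonlocal pos
        value = int(bits[pos:pos + width], 2)
        pos += width
        return value

    use_reference = bool(take(1))
    reference_frame = take(4)
    range_is_subset = bool(take(1))
    num_features = take(10)
    ref_indices = [take(10) if take(1) else None for _ in range(num_features)]
    return Header(use_reference, reference_frame, range_is_subset, num_features,
                  ref_indices, take(6), bool(take(1)), take(6))


header = Header(True, 1, False, 3, [0, None, 2], 12, True, 4)
bits = pack_header(header)
```

A decoder that reads the fields back in the same order recovers the same `Header`, after which it knows how to interpret the non-header region.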
In a specific implementation, while the local features of the current frame in the video stream are obtained in step 101 and step 201, a subset of the local features of the current frame is obtained using a local feature selection rule. The local feature selection rule can be illustrated as follows:
For a frame image in this embodiment, taking the extraction of SIFT local features as an example, if more than one SIFT is extracted, a subset containing N SIFTs is chosen from all the SIFTs, where N is greater than 0. In this embodiment N is 300; it should be noted that the value of N can be chosen adaptively according to the application.
It should be noted that when the number of SIFTs extracted from the image is less than N, all SIFTs of the image are chosen as the elements of the subset.
M01: extract local features from a number of matching image pairs and non-matching image pairs, respectively;
Here, a matching image pair is two images containing the same object or the same scene, and a non-matching image pair is two images containing different objects or different scenes. These matching and non-matching image pairs do not include the frame images to be processed in steps 101 and 201.
M02: by statistics, obtain the probability distributions of the different attributes of the local features among correctly matched local features and among mismatched local features;
For SIFT local features, the different attributes may include, for example: scale, orientation, difference-of-Gaussian peak value, distance to the image center, etc.
M03: based on the above probability distributions, calculate, for each local feature of the frame image to be processed in steps 101 and 201, the probability that the local feature is correctly matched when each of its attributes falls within a given value range, and according to this probability choose one or more local features from all the local features as the local features of the frame image.
Here, assuming that the different attributes of a SIFT are statistically independent, the probability that the SIFT is correctly matched is the product of the correct-match probabilities calculated from the individual attributes, and this product is used as the basis for choosing the elements of the SIFT subset.
In practical applications, other local feature selection methods may also be used; selection is not limited to steps M01 to M03 of the above example.
It should be noted that the results of steps M01 and M02 can be obtained in advance, that is, obtained offline and then stored in the device.
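Step M03 can be sketched as a lookup into offline statistics followed by a ranking. Everything concrete below is hypothetical: the bin edges, the probability values, and the attribute names stand in for the distributions that steps M01 and M02 would produce from real matching and non-matching image pairs.

```python
import bisect

# Hypothetical offline statistics (steps M01 and M02): for each attribute, the upper
# edges of its value bins and the probability that a feature in that bin is
# correctly matched.
MATCH_PROBABILITY = {
    "scale": ([2.0, 4.0, float("inf")], [0.2, 0.5, 0.7]),
    "peak": ([0.02, 0.05, float("inf")], [0.1, 0.4, 0.8]),
    "center_distance": ([0.3, 0.6, float("inf")], [0.7, 0.5, 0.3]),
}


def match_probability(feature):
    """Product of per-attribute probabilities, assuming the attributes are independent."""
    p = 1.0
    for name, (edges, probs) in MATCH_PROBABILITY.items():
        p *= probs[bisect.bisect_left(edges, feature[name])]
    return p


def select_features(features, n=300):
    """Keep the n features most likely to be correctly matched (all of them if fewer)."""
    if len(features) <= n:
        return list(features)
    return sorted(features, key=match_probability, reverse=True)[:n]


features = [
    {"scale": 1.0, "peak": 0.01, "center_distance": 0.9},
    {"scale": 5.0, "peak": 0.06, "center_distance": 0.1},
    {"scale": 3.0, "peak": 0.03, "center_distance": 0.5},
]
subset = select_features(features, n=2)
```

Because the tables are computed offline, the per-frame cost on the client is one table lookup per attribute plus a sort.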
Fig. 3 shows a schematic flowchart of the video-oriented visual feature decoding method provided by an embodiment of the present invention. As shown in Fig. 3, the video-oriented visual feature decoding method of this embodiment is as follows.
301. Receive the local feature bit stream of the video stream sent by the client;
302. According to the local feature bit stream, obtain the local features of each frame in the video stream.
For example, after it is determined from the header region of the local feature bit stream that reference local features are used, the number of local features of the current frame, the index information of the reference local features, and the information indicating the quantization parameter of the local features are obtained;
According to the index information of the reference local features and the information indicating the quantization parameter of the local features, the local features are decoded from the non-header region to obtain the local features of each frame in the video stream. In a specific application, if the encoded bit stream includes attributes of the local features, step 302 may specifically include the following sub-steps:
A01: for the bit stream to be decoded, first obtain the flag information indicating whether predictive coding is used, and then obtain the reference local feature range information.
A02: determine the number of local features to be decoded in the current frame and the reference local features in the reference frame; for example, obtain the index information of the reference local features and the local feature quantization parameter information from the header of the local feature bit stream.
A03: perform predictive decoding of the local features to be decoded according to the reference local features; for example, the decoding includes decoding the local features and the attributes related to the local features.
For a local feature to be decoded, first entropy decode the corresponding bit stream of the non-header region to obtain the residual, and then add the residual to the reference local feature to obtain the decoded local feature;
For coordinate information to be decoded, the decoding process corresponding to the aforementioned coding mode one may be: first entropy decode the corresponding bit stream of the non-header region to obtain the residual, and then add the residual to the coordinate of the reference local feature to obtain the decoded coordinate data;
For coordinate information to be decoded, the decoding process corresponding to the aforementioned coding mode two may be: first decode the corresponding bit stream of the non-header region to obtain the transformation matrix A and the residual, then calculate the transformed coordinate from the transformation matrix and the coordinate of the reference local feature, and finally add the transformed coordinate to the residual to obtain the decoded coordinate.
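The decoding process for coding mode two amounts to one matrix application and one addition. The sketch below assumes that the matrix and the quantized residual have already been recovered from the bit stream; the function name, the uniform step `step`, and the sample values are hypothetical.

```python
def decode_mode_two(matrix, ref_coord, q_residual, step):
    """Transform the reference coordinate, then add the dequantized residual."""
    x, y = ref_coord
    transformed = [row[0] * x + row[1] * y + row[2] for row in matrix]
    return tuple(t + q * step for t, q in zip(transformed, q_residual))


A = [[1.0, 0.0, 2.0],    # decoded 2x3 transformation matrix (here a pure translation)
     [0.0, 1.0, -1.0]]
decoded = decode_mode_two(A, ref_coord=(10.0, 20.0), q_residual=(1, -2), step=0.5)
# decoded == (12.5, 18.0): transformed (12.0, 19.0) plus residual (0.5, -1.0)
```

Since one matrix serves every feature of a frame, its cost is amortized, and only the small per-feature residuals grow with the number of features.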
Fig. 4 shows a schematic structural diagram of the video-oriented visual feature coding device provided by an embodiment of the present invention. As shown in Fig. 4, the video-oriented visual feature coding device of this embodiment includes: a local feature acquiring unit 41, a determining unit 42, a reference local feature determining unit 43, and a local feature bit stream acquiring unit 44;
The local feature acquiring unit 41 is used to obtain the local features of the current frame in the video stream;
The determining unit 42 is used to determine the reference local feature range of the local features of the current frame in the reference frame of the current frame, the reference frame of the current frame being one or more frames adjacent to the current frame;
The reference local feature determining unit 43 is used to determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame of the local features of the current frame;
The local feature bit stream acquiring unit 44 is used to obtain, according to the local features and the reference local features of each frame in the video stream, the local feature bit stream of the video stream to be sent.
For example, the local feature bit stream includes a header region and a non-header region;
The header region includes: information indicating whether reference local features are used, information recording the reference frame and the reference local feature range corresponding to the reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameter;
The non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between local features that have a reference local feature and their reference local features.
In a specific application, the local feature bit stream acquiring unit 44 may be specifically used to encode the local features of each frame that have no reference local feature using a first preset coding mode, obtaining a first bit stream;
obtain the residuals between the local features that have a reference local feature and their reference local features;
encode the residuals using a second preset coding mode, obtaining a second bit stream;
The first bit stream and the second bit stream form the local feature bit stream of the video stream to be sent;
The header region of the local feature bit stream consists of binary codes, and the non-header region includes: the local features encoded using the first preset coding mode, and the residuals encoded using the second preset coding mode.
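The split into a first and a second bit stream can be sketched as follows. This is a simplified illustration, not the device's actual coding modes: plain uniform quantization stands in for both preset coding modes, entropy coding is omitted, and all names and values are hypothetical.

```python
def quantize(vec, step):
    """Uniform quantization, standing in for either preset coding mode."""
    return [round(v / step) for v in vec]


def encode_frame(features, references, ref_indices, step):
    """Split a frame's local features into directly coded and residual-coded symbols."""
    first_stream, second_stream = [], []
    for feat, idx in zip(features, ref_indices):
        if idx is None:
            # No reference local feature: code the feature itself.
            first_stream.append(quantize(feat, step))
        else:
            # Reference available: code only the residual against it.
            residual = [f - r for f, r in zip(feat, references[idx])]
            second_stream.append(quantize(residual, step))
    return first_stream, second_stream


def decode_frame(first_stream, second_stream, references, ref_indices, step):
    """Rebuild the features in their original order from the two streams."""
    first, second = iter(first_stream), iter(second_stream)
    out = []
    for idx in ref_indices:
        if idx is None:
            out.append([q * step for q in next(first)])
        else:
            out.append([r + q * step for r, q in zip(references[idx], next(second))])
    return out


references = [[10.0, 20.0, 30.0]]                    # features of the reference frame
features = [[11.0, 19.0, 30.0], [4.0, 8.0, 2.0]]     # features of the current frame
ref_indices = [0, None]                              # carried in the header region

first, second = encode_frame(features, references, ref_indices, step=1.0)
restored = decode_frame(first, second, references, ref_indices, step=1.0)
```

The residual symbols in `second` are small when consecutive frames are similar, which is what makes the second stream cheaper to entropy code than the features themselves.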
In a specific implementation, the aforementioned coding device may further include a preprocessing unit not shown in the figure. The preprocessing unit is located after the local feature acquiring unit 41 and before the reference local feature determining unit 43, and is specifically used to preprocess the local features of the current frame; for example, to perform dimensionality reduction on the local features of the current frame, and/or to quantize the local features of the current frame.
Correspondingly, the reference local feature determining unit 43 may be specifically used to determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame of the preprocessed local features of the current frame. For example, the reference local feature range is all the local features of the entire reference frame; or the reference local feature range is a subset of the local features in each reference frame, where the metric distance between every local feature in the subset and the local feature of the current frame is less than or equal to a preset metric distance.
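The two kinds of reference local feature range, and the choice of a reference within the range, can be sketched as below. Several assumptions are made for illustration: Euclidean distance stands in for both the metric distance and the matching similarity, a single reference frame is considered, and the function names and thresholds are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance, used here as a stand-in metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reference_range(feature, ref_features, max_distance=None):
    """Indices of candidate references; max_distance=None keeps the whole reference frame."""
    if max_distance is None:
        return list(range(len(ref_features)))
    return [i for i, r in enumerate(ref_features) if distance(feature, r) <= max_distance]


def choose_reference(feature, ref_features, candidates, threshold):
    """Best candidate within the threshold, or None if the feature has no reference."""
    best = min(candidates, key=lambda i: distance(feature, ref_features[i]), default=None)
    if best is None or distance(feature, ref_features[best]) > threshold:
        return None
    return best


ref_features = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
feature = [3.0, 3.0]

candidates = reference_range(feature, ref_features, max_distance=5.0)
chosen = choose_reference(feature, ref_features, candidates, threshold=2.0)
```

Restricting the range to a subset trades some matching quality for fewer comparisons and shorter reference indices; a feature for which `choose_reference` returns `None` would be coded without a reference.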
In another specific implementation, the aforementioned coding device may further include a sending unit not shown in the figure. The sending unit is located after the local feature bit stream acquiring unit 44 and is used to send the local feature bit stream of the video stream to be sent to the server, so that the server obtains the local features of each frame in the video stream based on the local feature bit stream.
The video-oriented visual feature coding device of this embodiment can be located in any client, such as a mobile terminal or other smartphone terminal, or a fixed terminal, and can execute the method embodiments described with respect to Fig. 1 and Fig. 2 above, which are not described in detail again here.
The video-oriented visual feature coding device of this embodiment can quickly compress the feature data to be transmitted when the client transmits data, reducing the amount of transmitted data and improving the transmission efficiency of the client's video stream.
Fig. 5 shows a schematic structural diagram of the server provided by an embodiment of the present invention. As shown in Fig. 5, the server of this embodiment includes: a receiving unit 51 and a local feature recovery unit 52;
The receiving unit 51 is used to receive the local feature bit stream of the video stream sent by the client;
The local feature recovery unit 52 is used to obtain, according to the local feature bit stream, the local features of each frame in the video stream.
For example, after determining from the header region of the local feature bit stream that reference local features are used, the local feature recovery unit 52 obtains the number of local features of the current frame, the index information of the reference local features, and the information indicating the quantization parameter of the local features;
According to the index information of the reference local features and the information indicating the quantization parameter of the local features, it decodes the local features from the non-header region to obtain the local features of each frame in the video stream. The server of this embodiment can interact with the client that performs the encoding, so that the local features of each frame in the video stream are recovered in the server. The server can execute all the flows of the decoding method of Fig. 3 above, which are not described in detail again in this embodiment.
The interaction between the server of this embodiment and the video-oriented visual feature coding device solves the prior-art problem that a client cannot quickly compress the feature data to be transmitted and reduce the amount of transmitted data when transmitting data.
In a fifth aspect, an embodiment of the present invention further provides an image processing system, including: the video-oriented visual feature coding device of any of the above embodiments and the server of any of the above embodiments, wherein the video-oriented visual feature coding device sends the obtained local feature bit stream of the video stream to the server, and the server restores the local features of each frame in the video stream according to the received local feature bit stream.
In a specific implementation, the client can obtain the local feature bit stream of each frame in the video stream according to the method of any of the above embodiments and send the obtained local feature bit stream to the server, and the server can decode the local feature bit stream to obtain the local features of the video stream.
The system solves the prior-art problem that a client cannot quickly compress the feature data to be transmitted and reduce the amount of transmitted data when transmitting data.
Numerous specific details are set forth in the specification of the present invention. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment can be combined into one module or unit or component, and can furthermore be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include some features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words can be interpreted as names.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention, and they shall all fall within the scope of the claims and specification of the present invention.

Claims (9)

  1. A video-oriented visual feature coding method, characterized by comprising:
    obtaining the local features of the current frame in a video stream;
    determining the reference local feature range of the local features of the current frame in the reference frame of the current frame, the reference frame of the current frame being one or more frames adjacent to the current frame;
    determining, according to the reference local feature range of the reference frame, the reference local features in the reference frame of the local features of the current frame;
    obtaining, according to the local features and the reference local features of each frame in the video stream, the local feature bit stream of the video stream to be sent;
    wherein the local feature bit stream includes:
    a header region and a non-header region;
    the header region includes: information indicating whether reference local features are used, information recording the reference frame and the reference local feature range corresponding to the reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameter;
    the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between local features that have a reference local feature and their reference local features.
  2. The method according to claim 1, characterized in that determining the reference local feature range of the local features of the current frame in the reference frame of the current frame comprises:
    selecting any frame among the one or more frames adjacent to the current frame as the reference frame;
    the reference local feature range being all the local features of the entire reference frame;
    or
    selecting any frame among the one or more frames adjacent to the current frame as the reference frame;
    the reference local feature range being a subset of the local features in the reference frame, where the metric distance between every local feature in the subset and the local feature of the current frame is less than or equal to a preset metric distance.
  3. The method according to claim 1 or 2, characterized in that determining, according to the reference local feature range of the reference frame, the reference local features in the reference frame of the local features of the current frame comprises:
    obtaining the matching similarity between each local feature of the current frame and the local features within the reference local feature range in the reference frame;
    comparing the matching similarities corresponding to each local feature of the current frame with a preset threshold range;
    if none of the matching similarities corresponding to a local feature of the current frame satisfies the preset threshold range, determining that this local feature of the current frame has no reference local feature;
    if two or more of the matching similarities corresponding to a local feature of the current frame satisfy the preset threshold range, selecting, among those two or more matching similarities, the local feature with the best matching similarity in the reference frame nearest in time to the current frame as the reference local feature of that local feature of the current frame.
  4. The method according to claim 1 or 2, characterized in that obtaining, according to the local features and the reference local features of each frame in the video stream, the local feature bit stream of the video stream to be sent comprises:
    encoding the local features of each frame that have no reference local feature using a first preset coding mode to obtain a first bit stream;
    obtaining the residuals between the local features that have a reference local feature and their reference local features;
    encoding the residuals using a second preset coding mode to obtain a second bit stream;
    the first bit stream and the second bit stream forming the local feature bit stream of the video stream to be sent;
    the header region of the local feature bit stream consisting of binary codes, and the non-header region including: the local features encoded using the first preset coding mode, and the residuals encoded using the second preset coding mode.
  5. The method according to claim 1 or 2, characterized in that the method further comprises:
    sending the local feature bit stream of the video stream to be sent to a server, so that the server obtains the local features of each frame in the video stream based on the local feature bit stream.
  6. A video-oriented visual feature decoding method, characterized by comprising:
    receiving the local feature bit stream of a video stream sent by a client, the local feature bit stream including a header region and a non-header region;
    obtaining, according to the local feature bit stream, the local features of each frame in the video stream;
    wherein the header region includes: information indicating whether reference local features are used, information recording the reference frame and the reference local feature range corresponding to the reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameter;
    the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between local features that have a reference local feature and their reference local features;
    correspondingly, obtaining, according to the local feature bit stream, the local features of each frame in the video stream comprises:
    after determining from the header region of the local feature bit stream that reference local features are used, obtaining the number of local features of the current frame, the index information of the reference local features, and the information indicating the quantization parameter of the local features;
    decoding, according to the index information of the reference local features and the information indicating the quantization parameter of the local features, the local features from the non-header region to obtain the local features of each frame in the video stream.
  7. A video-oriented visual feature coding device, characterized by comprising:
    a local feature acquiring unit, used to obtain the local features of the current frame in a video stream;
    a determining unit, used to determine the reference local feature range of the local features of the current frame in the reference frame of the current frame, the reference frame of the current frame being one or more frames adjacent to the current frame;
    a reference local feature determining unit, used to determine, according to the reference local feature range of the reference frame, the reference local features in the reference frame of the local features of the current frame;
    a local feature bit stream acquiring unit, used to obtain, according to the local features and the reference local features of each frame in the video stream, the local feature bit stream of the video stream to be sent;
    wherein the local feature bit stream includes:
    a header region and a non-header region;
    the header region includes: information indicating whether reference local features are used, information recording the reference frame and the reference local feature range corresponding to the reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameter;
    the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between local features that have a reference local feature and their reference local features.
  8. A server, characterized by comprising:
    a receiving unit, used to receive the local feature bit stream of a video stream sent by a client, the local feature bit stream including a header region and a non-header region;
    a local feature recovery unit, used to obtain, according to the local feature bit stream, the local features of each frame in the video stream;
    wherein the header region includes: information indicating whether reference local features are used, information recording the reference frame and the reference local feature range corresponding to the reference frame, information indicating the number of local features, information indicating the reference index information of the local features, and information indicating the quantization parameter;
    the non-header region includes: the encoded local features of each frame that have no reference local feature, and the encoded residuals between local features that have a reference local feature and their reference local features;
    correspondingly, the local feature recovery unit is specifically used to:
    after determining from the header region of the local feature bit stream that reference local features are used, obtain the number of local features of the current frame, the index information of the reference local features, and the information indicating the quantization parameter of the local features;
    decode, according to the index information of the reference local features and the information indicating the quantization parameter of the local features, the local features from the non-header region to obtain the local features of each frame in the video stream.
  9. A video processing system, characterized by comprising:
    the video-oriented visual feature coding device according to claim 7 and the server according to claim 8, wherein the video-oriented visual feature coding device sends the obtained local feature bit stream of the video stream to the server, and the server restores the local features of each frame in the video stream according to the received local feature bit stream.
CN201510134617.7A 2015-03-25 2015-03-25 A kind of visual signature coding method and device towards video Active CN104767998B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510134617.7A CN104767998B (en) 2015-03-25 2015-03-25 A kind of visual signature coding method and device towards video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510134617.7A CN104767998B (en) 2015-03-25 2015-03-25 A kind of visual signature coding method and device towards video

Publications (2)

Publication Number Publication Date
CN104767998A CN104767998A (en) 2015-07-08
CN104767998B true CN104767998B (en) 2017-12-08

Family

ID=53649565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510134617.7A Active CN104767998B (en) 2015-03-25 2015-03-25 A kind of visual signature coding method and device towards video

Country Status (1)

Country Link
CN (1) CN104767998B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108882020B (en) * 2017-05-15 2021-01-01 北京大学 Video information processing method, device and system
CN107846576B (en) 2017-09-30 2019-12-10 北京大学 Method and system for encoding and decoding visual characteristic data
CN113453017B (en) * 2021-06-24 2022-08-23 咪咕文化科技有限公司 Video processing method, device, equipment and computer program product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226589A (en) * 2012-10-15 2013-07-31 北京大学 Method for obtaining compact global feature descriptors of image and image retrieval method
CN103561264A (en) * 2013-11-07 2014-02-05 北京大学 Media decoding method based on cloud computing and decoder
CN104093030A (en) * 2014-07-09 2014-10-08 天津大学 Distributed video coding side information generating method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040258147A1 (en) * 2003-06-23 2004-12-23 Tsu-Chang Lee Memory and array processor structure for multiple-dimensional signal processing

Non-Patent Citations (1)

Title
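Spatio-temporal coding method of local features for human action recognition; Wang Bin et al.; Journal of Sichuan University (Engineering Science Edition); 2014-03-31; Vol. 46, No. 2; pp. 72-78 *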

Also Published As

Publication number Publication date
CN104767998A (en) 2015-07-08

Similar Documents

Publication Publication Date Title
Qin et al. Semantic communications: Principles and challenges
Duan et al. Compact descriptors for visual search
CN103026368B (en) Use the process identification that increment feature extracts
US20140185949A1 (en) Efficient compact descriptors in visual search systems
Ma et al. Joint feature and texture coding: Toward smart video representation via front-end intelligence
CN104767998B (en) A kind of visual signature coding method and device towards video
CN102148987A Compressed sensing image reconstruction method based on prior model and l0 norm
CN116978011B (en) Image semantic communication method and system for intelligent target recognition
Dost et al. Reduced reference image and video quality assessments: review of methods
CN104767997B (en) A kind of visual signature coding method and device towards video
CN110991298B (en) Image processing method and device, storage medium and electronic device
CN103020138A (en) Method and device for video retrieval
Chen et al. Context-aware image compression optimization for visual analytics offloading
CN114581920A (en) Molecular image identification method for double-branch multi-level characteristic decoding
CN117058595B (en) Video semantic feature and extensible granularity perception time sequence action detection method and device
CN105959685B Compression bit rate prediction method based on video content and cluster analysis
Baroffio et al. Hybrid coding of visual content and local image features
CN104320661B (en) Image coding quality predicting method based on difference entropy and structural similarity
Van Opdenbosch et al. A joint compression scheme for local binary feature descriptors and their corresponding bag-of-words representation
CN116320538A (en) Semantic communication transmission method and system for substation inspection image
Kiran et al. Novel multi-media steganography model using meta-heuristic and deep learning assisted adaptive lifting wavelet transform
Monteiro et al. Coding mode decision algorithm for binary descriptor coding
Pillai et al. Compression based clustering technique for enhancing accuracy in web scale videos
CN111008276A (en) Complete entity relationship extraction method and device
CN104618723B H.264/AVC compressed-domain video matching method based on motion vector projection matrix

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant