CN103843011B - Coding of feature location information - Google Patents
- Publication number
- CN103843011B (application CN201280038785.0A)
- Authority
- CN
- China
- Prior art keywords
- histogram
- hexagonal
- encoded
- feature locations
- cells
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/005—Statistical coding, e.g. Huffman, run length coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
- G06V10/19—Image acquisition by sensing codes defining pattern positions
Abstract
A method and apparatus for coding feature location information are disclosed. In one embodiment, a method of coding feature location information of an image includes: generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells; quantizing feature locations of the image using the hexagonal grid; generating a histogram to record occurrences of feature locations in each hexagonal cell; and encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell. Encoding the histogram includes encoding information of a subsequent hexagonal cell to be encoded in the histogram using context information of neighboring hexagonal cells, wherein the context information includes context information from first-order neighbors of the subsequent hexagonal cell to be encoded and context information from second-order neighbors of the subsequent hexagonal cell to be encoded.
Description
Cross-reference to related applications
This application claims the benefit of U.S. Application No. 13/229,654, "Coding of Feature Location Information," filed September 9, 2011, which in turn claims the benefit of U.S. Provisional Application No. 61/522,171, "Coding of Feature Location Information," filed August 10, 2011. The aforementioned U.S. applications are hereby incorporated by reference in their entirety.
Technical field
The present invention relates to the field of processing digital image data. In particular, it relates to the coding of feature location information of an image.
Background
As camera phones and personal digital assistants (PDAs) have become widely adopted, these devices have become a ubiquitous platform for visual search and mobile augmented reality applications. To support applications that require image comparison, information must either be uploaded from the mobile device to a server or downloaded from the server to the mobile device. The amount of data that must be transmitted and/or received over the wireless network is therefore critical to the performance and usability of these applications.
Conventional feature-based retrieval systems generally code location information with a direct scheme. In these systems, the (x, y) coordinates of each feature are quantized to some fixed resolution, for example 8 bits for each coordinate of a feature location. The quantized (x, y) pairs are then stored and transmitted. For example, for an image with 1,000 features and 8-bit resolution, this scheme requires about 2 KB of data per image (1,000 features x 2 coordinates x 8 bits = 16,000 bits). Such coding schemes produce a large amount of data to be transmitted over the wireless network, which in turn adversely affects the performance and usability of visual search and mobile augmented reality applications.
Therefore, there is a need for systems and methods for coding feature location information that address the above issues of conventional systems.
Summary
The present invention relates to coding feature location information of an image. According to embodiments of the present invention, a method of coding feature location information of an image includes: generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells; quantizing feature locations of the image using the hexagonal grid; generating a histogram to record occurrences of feature locations in each hexagonal cell; and encoding the histogram in accordance with the number of occurrences of feature locations in each hexagonal cell.
Generating the hexagonal grid includes determining the size of the hexagonal cells in accordance with a predetermined quantization level of the feature location information. Quantizing the feature locations includes: performing, for each feature location, a coordinate transformation from a two-dimensional plane to a three-dimensional space; rounding the transformed coordinates to the corresponding nearest integers; and verifying that the transformed coordinates belong to the hexagonal plane in the three-dimensional space. Note that the transformation is reversible. To verify that the transformed coordinates belong to the hexagonal plane in the three-dimensional space, the method computes the sum of the transformed coordinates and verifies that the sum is equal to zero.
Generating the histogram includes generating a histogram map configured to indicate occurrences of feature locations in each hexagonal cell, and generating histogram counts configured to describe the number of occurrences of feature locations in each hexagonal cell. Encoding the histogram can include applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram, wherein the context information includes context information from first-order neighbors of the subsequent hexagonal cell to be encoded and context information from second-order neighbors of the subsequent hexagonal cell to be encoded.
In another embodiment, a mobile device includes: an image module configured to obtain an image; a visual search module configured to generate encoded feature location information of the image; and a controller configured to communicate the encoded feature location information of the image to a server via a wireless network. The visual search module of the mobile device includes: logic for generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells; logic for quantizing feature locations of the image using the hexagonal grid; logic for generating a histogram to record occurrences of feature locations in each hexagonal cell; and logic for encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
Brief description of the drawings
The aforementioned features and advantages of the invention, as well as additional features and advantages thereof, will be more clearly understood after reading the detailed description of embodiments of the invention in conjunction with the following drawings.
Fig. 1a-1b illustrate a method of generating a histogram of feature locations according to some aspects of the present invention.
Fig. 2 illustrates another method of generating a histogram of feature locations according to some aspects of the present invention.
Fig. 3a illustrates a method of representing feature location information with a hexagonal plane in three-dimensional (3D) space according to some aspects of the present invention.
Fig. 3b illustrates properties of a hexagonal cell according to some aspects of the present invention.
Fig. 4a-4b illustrate context configurations applied to adaptive statistical coding of histogram values according to some aspects of the present invention.
Fig. 5 illustrates a comparison of feature location coding schemes using a square grid and a hexagonal grid according to some aspects of the present invention.
Fig. 6a illustrates a block diagram of a mobile device configured to perform visual search according to some aspects of the present invention.
Fig. 6b illustrates a method for image retrieval according to embodiments of the present invention.
Fig. 6c illustrates a method of coding feature location information of an image according to embodiments of the present invention.
Fig. 7a-7b illustrate exemplary implementations of visual search functionality according to embodiments of the present invention.
Fig. 8a-8b illustrate the use of a vocabulary tree and an associated inverted index in indexing feature locations according to embodiments of the present invention.
Detailed description
Embodiments of coding feature location information are disclosed. The following description is presented to enable any person skilled in the art to make and use the invention. Descriptions of specific embodiments and applications are provided only as examples. Various modifications and combinations of the examples described herein will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other examples and applications without departing from the spirit and scope of the invention. Thus, the present invention is not intended to be limited to the examples described and shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Fig. 1a illustrates a method of generating a histogram of feature locations using a rectangular grid, as described by Tsai et al. in "Location coding for mobile image retrieval systems," Proceedings of International Mobile Multimedia Communications Conference, September 2009. This reference is incorporated herein by reference in its entirety. In 102, black dots represent features of an image. In 104, the image is overlaid with a square grid that includes a plurality of square cells. Depending on the application, the size of each square cell can vary from 2x2 pixels to 32x32 pixels. In 106, a histogram map is generated to show the locations of the features of the image. Cells of the histogram map that contain black dots are shown in grey, and cells that contain no black dots are shown in white. When a feature lies on the boundary between two square cells, the cell containing the larger area of the feature is selected. When a feature is split equally between two cells, either cell may be selected. In 108, histogram counts are generated based on the histogram map of 106. The number in each square cell represents the number of features that fall in that cell of the histogram map of 106.
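The square-grid steps of Fig. 1a can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of Tsai et al.: the function name `square_histogram` and the sample coordinates are hypothetical, and each feature is reduced to a single (x, y) point, so the boundary rule described above does not arise.

```python
from collections import Counter

def square_histogram(points, cell_size):
    """Quantize (x, y) feature locations onto a square grid.

    Returns (histogram_map, histogram_counts): the set of occupied cells
    and the number of features that fall in each occupied cell.
    Cells are indexed as (column, row).
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    return set(counts), dict(counts)

# Hypothetical feature locations in a 640x480 image, 32x32-pixel cells.
features = [(5.0, 7.5), (20.0, 30.0), (40.0, 10.0), (639.0, 479.0)]
occupied, counts = square_histogram(features, cell_size=32)
```

Here `occupied` plays the role of the histogram map of 106 and `counts` the role of the histogram counts of 108.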
According to embodiments of the present invention, let n denote the number of features of an image and let m denote the number of cells in the histogram. For a Video Graphics Array (VGA) image with features obtained using the scale-invariant feature transform (SIFT) or speeded-up robust features (SURF), n = 1000 and m = 640*480/w^2, where w is the cell size in pixels.
Note that the histogram of feature location information can be represented using the notion of a multiset, in which members are allowed to appear more than once. The number of times an element belongs to the multiset is the multiplicity of that member. The total number of elements in a multiset, including repeated members, is the cardinality of the multiset. For example, in the multiset {a, a, b, b, b, c}, the multiplicities of the members a, b and c are 2, 3 and 1, respectively, and the cardinality of the multiset is 6.
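The multiset example above maps directly onto Python's `collections.Counter`, which is used here purely as an illustration of multiplicity and cardinality (the variable names are hypothetical).

```python
from collections import Counter

# The multiset {a, a, b, b, b, c} from the text.
multiset = Counter(["a", "a", "b", "b", "b", "c"])

multiplicities = dict(multiset)        # multiplicity of each member
cardinality = sum(multiset.values())   # total count, repeats included
```

A histogram of feature locations is the same kind of object: the members are cells and the multiplicities are the histogram counts.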
In the example shown in Fig. 1a, the number of multisets of cardinality n, with elements taken from a finite set of cardinality m, is the multiset coefficient or multiset number. The number of possible histograms with m cells and a total of n features is given by the multiset coefficient:
$$\left(\!\!\binom{m}{n}\!\!\right) = \binom{m+n-1}{n}$$
Thus, assuming that all histograms are equally probable, encoding one of them takes approximately
$$\log_2 \binom{m+n-1}{n}$$
bits. In the asymptotic form of this expression, O denotes big O notation, which describes the limiting behavior of a function, usually in terms of simpler functions, as its argument tends toward a particular value or infinity. Big O notation characterizes functions by their growth rates, so that different functions with the same growth rate can be represented with the same O notation. In addition, α is a constant and H(·) is the entropy function:
$$H(x) = -x \log x - (1 - x) \log(1 - x)$$
Figures obtained with this formula are shown for n = 1000 and m = 640*480/w^2, where w denotes the block size of the location histogram. Fig. 1b plots the bits per feature location against the histogram block size w. In this plot, when the block size is small (for example, 2 pixels), the rate for coding feature location information is large (about 8 bits/feature). As the block size increases, the rate decreases. When the block size is about 30 pixels, the coding rate is about 1 bit/feature. Fig. 1b also includes the empirical entropy estimates reported in the article by Tsai et al. Note that the empirical entropy estimates do not account for the cost of transmitting model information, and thus they lie slightly below the curve predicted by the above formula. Overall, they follow a similar trend with respect to the location histogram block size w.
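The trend described for Fig. 1b can be checked numerically. The sketch below evaluates the base-2 logarithm of the multiset coefficient per feature for a VGA image with n = 1000. It is an illustration only: the function names are hypothetical, and counting partial cells at the image border as whole cells (the `ceil` calls) is an assumption rather than something stated in the text.

```python
from math import ceil, lgamma, log

def log2_binomial(a, b):
    """log2 of the binomial coefficient C(a, b), computed via log-gamma."""
    return (lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)) / log(2)

def bits_per_feature(n, width, height, w):
    """Bits per feature needed to index one of the C(m+n-1, n) histograms."""
    m = ceil(width / w) * ceil(height / w)   # assumed: partial cells count
    return log2_binomial(m + n - 1, n) / n

# Block sizes in pixels for a 640x480 image with 1000 features.
rates = {w: bits_per_feature(1000, 640, 480, w) for w in (2, 4, 8, 16, 30)}
```

Under these assumptions the rate falls steadily as w grows, in line with the roughly 8 bits/feature at w = 2 and roughly 1 bit/feature at w = 30 quoted above.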
Note also that the distortion (covering radius) introduced by this scheme is proportional to the block size w, and is defined in terms of a given point q and its corresponding reconstructed point q′. Using this relationship, the rate-distortion characteristic of histogram-based location coding can be expressed (for example, for the L2 norm) in terms of W and H, the width and height of the input image, and n, the number of features, with an asymptotic form obtained for the high-fidelity regime (ε → 0).
Fig. 2 illustrates another method of generating a histogram of feature locations according to some aspects of the present invention. In 202, black dots represent features of an image. In 204, the image is overlaid with a hexagonal grid (also referred to as a hexagonal lattice) that includes a plurality of hexagonal cells. In 206, a histogram map can be formed to show the locations of the features of the image. In this example, cells of the histogram map that contain black dots are shown in grey, and cells that contain no black dots are shown in white. When a feature lies on the boundary between two hexagonal cells, the cell containing the larger area of the feature is selected. When a feature is split equally between two cells, either cell may be selected. In 208, histogram counts can be formed based on the histogram map of 206. The number in each hexagonal cell represents the number of features that fall in that cell of the histogram map of 206. Note that hexagonal cells of different sizes can be used to produce different quantization levels of the feature location information, for example 4, 5 or 6 bits per feature. For example, a side of a hexagonal cell in the hexagonal grid can have a size of 2, 4, 8, 16 or 32 pixels. For each cell size, the entropy of the histogram map can correspond to a different bit rate per feature and per image, the entropy of the histogram counts can correspond to a different bit rate per feature, and the average rate can vary from image to image. The histogram map and the histogram counts can be encoded separately, and the spatial relationship between features in neighboring hexagonal cells can be used when coding the histogram map.
The method shown in Fig. 2 replaces the square-lattice partition of the space of feature locations with a hexagonal-lattice partition. With this method, a histogram of the feature locations quantized to the hexagonal lattice is computed, and the result is then encoded. The goal of creating a feature location histogram is to reduce the number of bits needed to encode the location information of each feature. Instead of coding the location information of each feature individually, the location information of the features is converted into a location histogram, and the location histogram is coded. Converting location information into a location histogram and coding the histogram has several benefits. First, it allows the coding method to be independent of the order of the coded items, which reduces the bit rate. In addition, because features may be spatially related points in an image, the spatial structure among features can be exploited during coding.
Fig. 3a illustrates a method of representing feature location information with a hexagonal plane in three-dimensional (3D) space according to some aspects of the present invention. As shown in Fig. 3a, the 3D space is shown as a cube 302 defined by the u, v and w axes. A hexagonal plane 304 can be formed with vertices at 305, 306, 307, 308, 309 and 310, as shown. In this example, the center 312 of the hexagonal plane 304 is also the center of the cube 302 and has coordinates (0.0, 0.0, 0.0).
According to embodiments of the present invention, the method of representing feature location information with a hexagonal plane can project the image feature coordinates (x, y) from two-dimensional (2D) space onto the hexagonal plane 304 in 3D space. A point in 3D space lies in the hexagonal plane when the sum of its u, v and w coordinates satisfies the following condition:
$$u + v + w = 0$$
In one exemplary method, a point in 2D space is transformed to 3D space using a matrix M that satisfies the condition
$$M M^{T} = I$$
This means that the transformation
$$(u, v, w) = (x, y)\, M$$
is reversible:
$$(x, y) = (u, v, w)\, M^{T}$$
An example of this transformation is illustrated in Fig. 3a. The hexagonal lattice in the plane u + v + w = 0 is the set of points with integer coordinates, such as points 314a and 314b:
$$(u, v, w)_{\text{lattice}} \in \mathbb{Z}^{3}$$
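The forward and inverse transforms can be illustrated with a short Python sketch. The matrix used by the patent is not reproduced in this text, so the `M` below is an assumption: one orthonormal 2x3 basis of the plane u + v + w = 0, chosen only because it satisfies the stated properties. The function names `to_3d` and `to_2d` are likewise hypothetical.

```python
import math

# ASSUMPTION: an orthonormal 2x3 basis of the plane u + v + w = 0.
# This is a stand-in; the patent's own matrix M is not available here.
M = (
    (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0),
    (1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)),
)

def to_3d(x, y):
    """Forward transform: (u, v, w) = (x, y) M."""
    return tuple(x * M[0][k] + y * M[1][k] for k in range(3))

def to_2d(u, v, w):
    """Inverse transform: (x, y) = (u, v, w) M^T."""
    p = (u, v, w)
    return tuple(sum(p[k] * M[r][k] for k in range(3)) for r in range(2))

u, v, w = to_3d(12.0, 34.0)   # lands in the plane u + v + w = 0
x, y = to_2d(u, v, w)         # recovers the original (x, y)
```

Any other matrix whose rows are orthonormal and sum to zero would serve equally well for this illustration.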
According to embodiments of the present invention, a method of quantizing a transformed point in 3D space includes the following computations.
Given a point q in 3D space with coordinates
$$q = (u_q, v_q, w_q)$$
define the point
$$q' = (\langle u_q \rangle, \langle v_q \rangle, \langle w_q \rangle)$$
where ⟨x⟩ denotes the integer nearest to the real number x.
Compute the sum and check whether the quantized point lies in the hexagonal plane:
$$\Delta = \langle u_q \rangle + \langle v_q \rangle + \langle w_q \rangle$$
If Δ = 0, the quantized point q′ belongs to the hexagonal plane (u + v + w = 0), so it is a valid lattice point and the process is complete.
Otherwise, compute the errors
$$\delta = (u_q - \langle u_q \rangle,\; v_q - \langle v_q \rangle,\; w_q - \langle w_q \rangle)$$
and sort them so that
$$-\tfrac{1}{2} \le \delta_{(1)} \le \delta_{(2)} \le \delta_{(3)} \le \tfrac{1}{2}$$
If Δ > 0, subtract 1 from the Δ components of q′ with the smallest errors δ_i (the components that were rounded up the most). If Δ < 0, add 1 to the |Δ| components of q′ with the largest errors δ_i (the components that were rounded down the most). To control the coarseness of the mapping of image feature locations (x, y) to lattice points, a scale parameter σ can be introduced. The whole quantization process can then be described as a series of transformations:
$$(x, y) \rightarrow (u, v, w) = \sigma^{-1} (x, y)\, M \rightarrow (u, v, w)_{\text{lattice}}$$
The reconstructed values (x′, y′) are obtained as follows:
$$(u, v, w)_{\text{lattice}} \rightarrow \sigma\, (u, v, w)_{\text{lattice}}\, M^{T} \rightarrow (x', y')$$
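The quantization and reconstruction chain can be sketched as follows. This is a minimal illustration under stated assumptions: `M` is an illustrative orthonormal basis of the plane u + v + w = 0 standing in for the patent's matrix, the names `quantize` and `reconstruct` are hypothetical, and the correction step follows the usual nearest-point rule for this lattice (lower the components that were rounded up the most, raise those that were rounded down the most).

```python
import math

# ASSUMPTION: illustrative orthonormal M, a stand-in for the patent's matrix.
S2, S6 = math.sqrt(2), math.sqrt(6)
M = ((1 / S2, -1 / S2, 0.0), (1 / S6, 1 / S6, -2 / S6))

def quantize(x, y, sigma):
    """Map an image location to the nearest lattice point with u+v+w = 0."""
    q = [(x * M[0][k] + y * M[1][k]) / sigma for k in range(3)]
    lattice = [math.floor(c + 0.5) for c in q]     # nearest integers
    delta = sum(lattice)
    if delta != 0:
        # Component indices sorted by rounding error q_k - <q_k>, ascending.
        order = sorted(range(3), key=lambda k: q[k] - lattice[k])
        if delta > 0:      # rounded up too much: lower the smallest errors
            for k in order[:delta]:
                lattice[k] -= 1
        else:              # rounded down too much: raise the largest errors
            for k in order[delta:]:
                lattice[k] += 1
    return tuple(lattice)

def reconstruct(lattice, sigma):
    """Inverse chain: (x', y') = sigma * (u, v, w) M^T."""
    return tuple(
        sigma * sum(lattice[k] * M[r][k] for k in range(3)) for r in range(2)
    )
```

With this choice of M, the distance between a location and its reconstruction stays within σ·sqrt(2/3), the covering radius of the scaled lattice, so σ directly controls the worst-case location error.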
According to embodiments of the present invention, there are several techniques for enumerating and encoding lattice points. One method follows the order in which hexagonal cells appear in a raster scan of the image coordinates (x, y). Alternatively, the method enumerates hexagonal cells in lexicographic order of their coordinate values.
In some implementations, the method scans the cells containing image coordinates and counts the number of features mapped to each cell. After the histogram is computed, it can be mapped to a unique index and then encoded. As indicated above, the number of possible histograms with m cells and a total of n features is given by the multiset coefficient:
$$\left(\!\!\binom{m}{n}\!\!\right) = \binom{m+n-1}{n}$$
and the rate needed to represent the histogram index is
$$R(m, n) = \left\lceil \log_2 \binom{m+n-1}{n} \right\rceil$$
bits.
According to embodiments of the present invention, various coding techniques can be applied to code the histogram map 206 and the histogram counts 208 of Fig. 2. In one approach, the histogram can be converted into a unique lexicographic index and then encoded using a fixed-length code of R(m, n) bits. As described by Y. A. Reznik in "An Algorithm for Quantization of Discrete Probability Distributions," Proceedings of Data Compression Conference (DCC'11), March 2011, pp. 333-343, which is incorporated herein by reference in its entirety, given a histogram with m bins, a total count of n, and individual counts k_1, ..., k_m in the bins, a unique index I(k_1, ..., k_m) can be obtained by a formula that proceeds by induction (starting with m = 2, 3, ...) and implements a lexicographic enumeration of histograms. For example,
I(0, 0, ..., 0, n) = 0,
I(0, 0, ..., 1, n-1) = 1.
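To make the idea of a lexicographic histogram index concrete, the sketch below ranks a histogram among all histograms with the same number of bins and the same total. It is one standard way of ranking such tuples and is offered as an assumption-laden illustration, not as the formula of the cited paper; the names `count_histograms` and `lex_index` are hypothetical. It does reproduce the two example values given above.

```python
from math import comb

def count_histograms(m, n):
    """Number of histograms with m bins and total count n (multiset coefficient)."""
    return comb(m + n - 1, n)

def lex_index(counts):
    """Lexicographic rank of `counts` among all histograms with the
    same number of bins and the same total."""
    m = len(counts)
    remaining = sum(counts)
    index = 0
    for i, k in enumerate(counts[:-1]):
        bins_left = m - i - 1
        # Every histogram with a smaller value in bin i comes earlier.
        for j in range(k):
            index += count_histograms(bins_left, remaining - j)
        remaining -= k
    return index

first = lex_index((0, 0, 0, 5))    # I(0, ..., 0, n)
second = lex_index((0, 0, 1, 4))   # I(0, ..., 1, n-1)
```

Because the ranks run from 0 to the multiset coefficient minus 1 without gaps, such an index fits in the fixed-length code of R(m, n) bits mentioned above.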
In another approach, empty blocks in the histogram map can be converted into run lengths in raster-scan order. The run lengths are then coded with an entropy coder. The entropy coder can use at least one of Golomb-Rice codes, Huffman codes, or arithmetic codes. In another approach, the method uses a variable-length coding scheme that captures properties of the spatial distribution of keypoints. In yet another approach, the histogram values in several surrounding hexagonal cells are used as context. These context configurations are further described with reference to Fig. 4a and Fig. 4b.
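The run-length approach can be sketched as follows. This is an illustration only: the function names, the choice of Rice parameter k, and the convention that a trailing run of empty cells is not emitted (assuming the decoder knows the total number of cells) are all assumptions, not details taken from the text.

```python
def empty_run_lengths(occupancy):
    """Lengths of the runs of empty cells that precede each occupied cell.

    `occupancy` is the histogram map flattened in raster-scan order
    (truthy = the cell contains at least one feature).
    """
    runs, current = [], 0
    for occupied in occupancy:
        if occupied:
            runs.append(current)
            current = 0
        else:
            current += 1
    return runs

def rice_encode(value, k):
    """Golomb-Rice code: unary quotient, a 0 terminator, k-bit remainder."""
    quotient, remainder = value >> k, value & ((1 << k) - 1)
    suffix = format(remainder, "b").zfill(k) if k else ""
    return "1" * quotient + "0" + suffix

flat_map = [0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0]
runs = empty_run_lengths(flat_map)
bitstream = "".join(rice_encode(r, k=1) for r in runs)
```

A Huffman or arithmetic coder could be substituted for `rice_encode` without changing the run-length extraction.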
Fig. 4a-4b illustrate context configurations applied to adaptive statistical coding of histogram values according to some aspects of the present invention. In Fig. 4a, to encode hexagonal cell X in the hexagonal grid, context information from the first-order neighbors A, B and C can be used to encode the hexagonal histogram map and histogram counts. In this example, the first-order neighbors A, B and C are previously encoded hexagonal cells, and hexagonal cell X is the subsequent hexagonal cell to be encoded. Similarly, in Fig. 4b, to encode hexagonal cell Y in the hexagonal grid, context information from the first-order and second-order neighbors (A, B, C, D, E, F, G, H and I) can be used to encode the hexagonal histogram map and histogram counts. The first-order and second-order neighbors A, B, C, D, E, F, G, H and I are previously encoded hexagonal cells, and hexagonal cell Y is the subsequent hexagonal cell to be encoded.
Note that, compared with a square lattice, a hexagonal lattice provides a better placement of the points that can serve as contexts. For example, in Fig. 4a, three first-order neighboring hexagonal cells A, B and C can serve as contexts. In a square lattice, by contrast, only two such first-order neighbors are available, namely the square cell above and the square cell to the left, assuming a scan direction from left to right and from top to bottom.
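The use of three first-order causal neighbors as context can be illustrated with an adaptive model that estimates how many bits an ideal entropy coder would spend on a binary histogram map. Everything specific here is an assumption made for illustration: the "odd-r" storage layout (odd rows shifted right by half a cell), the treatment of cells outside the grid as empty, the Laplace-style count initialization, and the function names.

```python
import math

def causal_neighbors(row, col):
    """First-order neighbors already visited in a raster scan.

    ASSUMPTION: 'odd-r' offset layout, where odd rows are shifted right by
    half a cell. Returns the left, upper-left and upper-right cells.
    """
    shift = row % 2
    return [(row, col - 1), (row - 1, col - 1 + shift), (row - 1, col + shift)]

def adaptive_code_length(occupancy):
    """Ideal code length in bits of a 0/1 histogram map under an adaptive
    model conditioned on the three causal first-order neighbors."""
    rows, cols = len(occupancy), len(occupancy[0])
    counts = {}                  # context -> [count of 0s, count of 1s]
    total_bits = 0.0
    for r in range(rows):
        for c in range(cols):
            context = tuple(
                occupancy[i][j] if 0 <= i < rows and 0 <= j < cols else 0
                for i, j in causal_neighbors(r, c)
            )
            zeros_ones = counts.setdefault(context, [1, 1])  # uniform prior
            symbol = occupancy[r][c]
            total_bits -= math.log2(zeros_ones[symbol] / sum(zeros_ones))
            zeros_ones[symbol] += 1
    return total_bits
```

An arithmetic coder driven by the same per-context counts would approach this code length; a square grid scanned the same way would supply only the left and upper cells as first-order context.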
Note also that, compared with a square lattice, a hexagonal lattice provides a thinner covering of two-dimensional (2D) space, which improves the accuracy of the feature location representation. As shown in Figs. 4a and 4b, the mapping to hexagonal space is beneficial from the standpoint of context modeling and entropy coding. Note that converting image feature locations to hexagonal space does not change the actual pixel values, which means it can be performed efficiently in terms of computing resources.
The following paragraphs analyze the benefit of using a hexagonal lattice to code feature location information. One approach is to estimate the rate-distortion characteristics of the proposed scheme and compare them with those of a scheme that codes feature location information using a square lattice.
Consider two lattice points, (0, 0, 0) and (0, 1, −1), and convert them back to the pixel domain. Recall that this conversion is performed by the mapping:
$$(u, v, w)_{\text{lattice}} \rightarrow \sigma\, (u, v, w)_{\text{lattice}}\, M^{T} \rightarrow (x', y')$$
where σ is the scale parameter. The distance between the two resulting points in the pixel domain is proportional to σ. Note that the same distance in the lattice domain corresponds to the height of the hexagonal cell shown in Fig. 3b, from which the cell radius in the pixel domain and the area occupied by a single hexagonal cell can be expressed in terms of σ.
For an image with W x H pixels, the minimum number of hexagonal cells needed to cover it is the image area divided by the cell area. In this case, the quantization error based on the L2 norm is equal to the covering radius. This relates the number of cells to the quantization error and yields the rate-distortion function of the hexagonal scheme, which can be compared with the rate-distortion function of the square lattice. The comparison shows that the proposed quantization scheme saves bits per feature point while maintaining the same worst-case accuracy.
Fig. 5 illustrates a comparison of feature location coding schemes using a square lattice and a hexagonal lattice according to some aspects of the present invention. Curve 502 shows bits per feature location versus quantization error for the hexagonal-lattice coding scheme. Curve 504 shows bits per feature location versus quantization error for the square-lattice coding scheme. Both curves are for a VGA image with about 1,000 features. As shown in this example, when location coding operates at a bit rate of 5 bits/feature, the hexagonal-lattice coding scheme offers a bit-rate improvement of about 8.16% over the square-lattice coding scheme.
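The size of the saving can be estimated from cell geometry alone. The sketch below is a back-of-the-envelope check under stated assumptions, not the patent's derivation: both grids are given the same covering radius (worst-case L2 error), and in the high-fidelity regime the per-feature rate is assumed to change by the base-2 logarithm of the ratio of cell counts, which equals the ratio of cell areas. The function names are hypothetical.

```python
import math

def hexagon_area(radius):
    """Area of a regular hexagon with circumradius (covering radius) `radius`."""
    return 3 * math.sqrt(3) / 2 * radius ** 2

def square_area(radius):
    """Area of a square whose covering radius (half-diagonal) is `radius`."""
    return 2 * radius ** 2

eps = 1.0  # any common covering radius; the ratio does not depend on it

# Larger cells at equal worst-case error mean fewer cells and fewer bits.
saving_bits = math.log2(hexagon_area(eps) / square_area(eps))

# ASSUMPTION: the square-grid scheme runs at 5 bits/feature.
hex_rate = 5.0 - saving_bits
relative_improvement = saving_bits / hex_rate
```

Under these assumptions the saving is about 0.38 bits per feature, and expressed relative to the resulting hexagonal rate it comes to roughly 8.2%, close to the figure of about 8.16% quoted for Fig. 5. Which baseline the quoted percentage uses is an inference.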
Embodiments of the present invention describe an improved technique for coding image feature location information. The technique uses a hexagonal lattice for quantization of feature locations, construction of a histogram of occurrences of feature locations in the lattice cells, and encoding of this histogram. The performance of the technique has been analyzed and compared with the performance of histogram coding that uses a square lattice (scalar quantization of the location parameters). The proposed scheme is shown to yield an appreciable improvement in the bit rate of location coding. The technique is suitable for implementation on mobile platforms.
The disclosed methods are applicable to mobile devices in which visual search and augmented reality (AR) systems rely on feature location information to perform a number of tasks. For example, feature location information can be used for 1) geometric verification of matches between images; 2) computing the parameters of a geometric transformation between views of the same object; 3) locating and projecting the boundaries of an object of interest; and 4) augmenting views of recognized objects in a captured image or video with additional information, among other purposes.
In some cases, AR and visual search systems benefit if location information is represented in a compact and easy-to-use form. Compactness is especially important if location information needs to be transmitted over a wireless network. Some loss of precision in the location information may also be acceptable, but only to a certain degree, because it may affect retrieval accuracy as well as the accuracy of the localization of matching regions/objects and of the parameters of geometric transformations.
Fig. 6a illustrates a block diagram of a mobile device configured to perform visual search according to some aspects of the present invention. At the mobile device, antenna 602 receives modulated signals from a base station and provides the received signals to the demodulator (DEMOD) part of modem 604. The demodulator processes (for example, conditions and digitizes) the received signal and obtains input samples. It further performs orthogonal frequency-division multiplexing (OFDM) demodulation on the input samples and provides frequency-domain received symbols for all subcarriers. RX data processor 606 processes (for example, symbol de-maps, de-interleaves and decodes) the frequency-domain received symbols and provides decoded data to controller/processor 608 of the mobile device.
Controller/processor 608 can be configured to control the mobile device to communicate with a server via a wireless network. TX data processor 610 generates signaling symbols, data symbols and pilot symbols, which can be processed by the modulator (MOD) of modem 604 and transmitted via antenna 602 to a base station. In addition, controller/processor 608 directs the operation of the various processing units at the mobile device. Memory 612 can be configured to store program code and data for the mobile device. Image module 616 can be configured to obtain an image. Visual search module 614 can be configured to implement the methods of coding feature location information of an image and the image retrieval methods described below.
According to embodiments of the present invention, content-based image retrieval (CBIR) can use an approach
referred to as "bag of features" (BoF) or "bag of words" (BoW). The BoW approach is derived from text
document retrieval. To find a particular text document, such as a web page, a few well-chosen words are
sufficient. In the database, the document itself can likewise be represented by a "bag" of salient words,
regardless of where those words appear in the document. For images, robust local features that are
characteristic of a particular image take the role of "visual words". As in text retrieval, BoF image
retrieval does not consider where in the image the features occur, at least in the initial stage of the
retrieval pipeline.
Fig. 6b illustrates a method for image retrieval according to an embodiment of the present invention. In
block 622, the method obtains a query image. In block 624, local image features/descriptors are extracted
from the query image. In block 626, these descriptors are then matched against the descriptors of images
stored in database 630. The descriptor matching function can further include matching local image features,
selecting the images with the highest scores, and performing geometric verification. In block 628, the images
that have many features in common with the query image are then selected and listed. The geometric
verification step described below can be used to reject matches whose feature locations cannot be plausibly
explained by a change in viewing position.
The method shown in Fig. 6b can be implemented as a pipeline for large-scale image retrieval. First, local
features (also referred to as descriptors) are extracted from the query image. The set of local features is
used to assess the similarity between the query image and database images. To be useful for mobile
applications, each feature should be robust against the geometric and photometric distortions encountered
when the user takes the query photo from a different viewpoint and under different lighting than the
corresponding database image. Next, the query features are matched against the features of the images stored
in the database. This can be accomplished by using special index structures that allow fast access to lists
of images containing matching features. Based on the number of features they have in common with the query
image, a short list of potentially similar images is selected from the database. Finally, a geometric
verification step is applied to the most similar matches in the database. Geometric verification finds a
coherent spatial pattern between the features of the query image and the features of the candidate database
image to ensure that the match is correct.
Fig. 6c illustrates a method of coding feature location information of an image according to an embodiment
of the present invention. As shown in Fig. 6c, in block 632, the method generates a hexagonal grid comprising
a plurality of hexagonal cells, and determines the size of the hexagonal cells according to a predetermined
quantization level of the feature location information (e.g., 4, 5, or 6 bits per feature).
In block 634, the method quantizes the feature locations of the image using the hexagonal grid. For each
feature location, the method generates transformed coordinates of the feature location from a two-dimensional
plane to a three-dimensional space, rounds the transformed coordinates to the corresponding nearest integers,
and verifies that the transformed coordinates belong to a hexagonal plane in the three-dimensional space. The
transformed coordinates are verified by computing their sum and checking that the sum is equal to zero.
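By way of non-limiting illustration, the quantization of block 634 might be sketched as follows. The sketch assumes the common "cube coordinate" representation of a hexagonal grid, in which every cell lies on the plane a + b + c = 0, and a pointy-top cell orientation; the function name `hex_quantize`, the parameter `size` (centre-to-corner distance of a cell) and the correction applied when the zero-sum check fails are assumptions of this sketch, not details taken from the text above.

```python
import math

def hex_quantize(x, y, size):
    """Map a 2D feature location to the integer cube coordinates of a hexagonal cell."""
    # 2D -> 3D: fractional cube coordinates, which satisfy a + b + c = 0 by construction.
    a = (math.sqrt(3) / 3 * x - y / 3) / size
    c = (2 / 3 * y) / size
    b = -a - c
    # Round every coordinate to its nearest integer.
    ra, rb, rc = round(a), round(b), round(c)
    # Verify that the rounded point still lies on the hexagonal plane (sum equals zero).
    if ra + rb + rc != 0:
        # Assumed correction: recompute the coordinate with the largest rounding error.
        da, db, dc = abs(ra - a), abs(rb - b), abs(rc - c)
        if da > db and da > dc:
            ra = -rb - rc
        elif db > dc:
            rb = -ra - rc
        else:
            rc = -ra - rb
    return ra, rb, rc
```

A call such as `hex_quantize(120.5, 64.0, 8.0)` returns one integer triple per feature location; the triple identifies the hexagonal cell that is subsequently counted in the histogram.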
In block 636, the method generates a histogram to record the occurrences of feature locations in each
hexagonal cell. The histogram includes a histogram map configured to indicate the occurrence of feature
locations in each hexagonal cell, and a histogram count configured to describe the number of occurrences of
feature locations in each hexagonal cell.
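As one possible illustration of block 636, the histogram map and histogram count might be built as shown below. The representation is an assumption of this sketch: the map is a list of 0/1 flags, one per cell in a fixed scan order, and the counts are listed only for occupied cells; the names `build_histogram`, `feature_cells` and `all_cells` are hypothetical.

```python
from collections import Counter

def build_histogram(feature_cells, all_cells):
    """Return (histogram map, histogram count) for quantized feature locations.

    feature_cells: one cell identifier per feature (repeats allowed).
    all_cells: every cell of the grid, in the scan order used by the encoder.
    """
    counts = Counter(feature_cells)
    # Histogram map: 1 if at least one feature falls in the cell, else 0.
    hist_map = [1 if cell in counts else 0 for cell in all_cells]
    # Histogram count: number of features in each occupied cell, in scan order.
    hist_count = [counts[cell] for cell in all_cells if cell in counts]
    return hist_map, hist_count
```

Separating the map from the counts allows the two parts to be encoded with different techniques, which matches the separate treatment of empty cells and occurrence counts described for block 638.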
In block 638, the method encodes the histogram according to the occurrences of feature locations in each
hexagonal cell. The method converts the histogram to a unique lexicographic index, and encodes the unique
lexicographic index using a fixed-length code. In addition, the method converts the empty cells of the
histogram to run lengths in raster scan order, and encodes the run lengths using an entropy coder. The
entropy coder can use Golomb-Rice codes, Huffman codes, or arithmetic codes.
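The run-length step of block 638 might be sketched as follows, using a Golomb-Rice code as the entropy coder. The conventions are assumptions of this sketch: a run is the number of empty cells preceding each occupied cell in scan order, trailing empty cells are not coded, and the Rice parameter `k` is chosen by the caller. The names `empty_run_lengths` and `rice_encode` are hypothetical.

```python
def empty_run_lengths(hist_map):
    """Number of empty cells before each occupied cell, in scan order."""
    runs, run = [], 0
    for occupied in hist_map:
        if occupied:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs  # trailing empty cells are dropped in this sketch

def rice_encode(n, k):
    """Golomb-Rice code of a non-negative integer n with divisor 2**k.

    The quotient is written in unary (ones terminated by a zero),
    followed by the remainder as a k-bit binary number.
    """
    quotient, remainder = n >> k, n & ((1 << k) - 1)
    bits = "1" * quotient + "0"
    if k:
        bits += format(remainder, "0{}b".format(k))
    return bits
```

For example, `"".join(rice_encode(r, 1) for r in empty_run_lengths(hist_map))` yields a bit string for the whole histogram map. Because the divisor is a power of two, the quotient and remainder reduce to a shift and a mask.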
In another approach, encoding the histogram can apply context information of neighboring hexagonal cells to
encode the information of a subsequent hexagonal cell to be encoded in the histogram. The context information
includes context information from the first-order neighboring cells and the second-order neighboring cells of
the subsequent hexagonal cell to be encoded. The context information is used as input to an arithmetic coder.
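One way to gather such context is sketched below, assuming cells are addressed by integer cube coordinates (a, b, c) with a + b + c = 0. The first-order neighbors are the 6 cells at distance 1 and the second-order neighbors are the 12 cells at distance 2. How the two counts are combined into a context for the arithmetic coder is not specified above; returning them as a pair is an assumption, as are the names `hex_ring` and `context_for`.

```python
def hex_ring(cell, radius):
    """All cells at exactly `radius` steps from `cell` (cube coordinates)."""
    a0, b0, c0 = cell
    ring = []
    for da in range(-radius, radius + 1):
        for db in range(-radius, radius + 1):
            dc = -da - db
            # Hexagonal distance is the largest absolute coordinate offset.
            if max(abs(da), abs(db), abs(dc)) == radius:
                ring.append((a0 + da, b0 + db, c0 + dc))
    return ring

def context_for(cell, coded_occupied):
    """Count already-coded occupied cells among first- and second-order neighbors."""
    first = sum(n in coded_occupied for n in hex_ring(cell, 1))
    second = sum(n in coded_occupied for n in hex_ring(cell, 2))
    return first, second
```

Here `coded_occupied` is the set of occupied cells that have already been encoded, so that a decoder can form the same context before decoding the next cell.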
According to embodiments of the present invention, Golomb-Rice coding is a lossless data compression method
that uses a family of data compression codes, where alphabets following a geometric distribution can have a
Golomb-Rice code as a prefix code in an adaptive coding scheme. Golomb-Rice codes have a tunable parameter
(which is a power of two), which makes these codes convenient for use on a computer, since multiplication and
division by two can be implemented more efficiently in binary arithmetic. Huffman coding uses a
variable-length code table to encode source symbols for lossless data compression. The variable-length code
table can be derived from the estimated probability of occurrence of each possible value of the source symbol.
Huffman coding uses a specific method for choosing the representation of each symbol, resulting in a prefix
code that expresses the most common source symbols using shorter bit strings than those used for less common
source symbols. For a set of symbols with a uniform probability distribution and a number of members that is
a power of two, Huffman coding is equivalent to binary block encoding. Arithmetic coding is a form of
variable-length entropy coding used for lossless data compression. A string of characters can be represented
using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic
coding, frequently used characters are stored with fewer bits and less frequently occurring characters are
stored with more bits, so that fewer bits are used in total. Arithmetic coding differs from other forms of
entropy coding (such as Huffman coding) in that, rather than separating the input into component symbols and
replacing each with a code, arithmetic coding encodes the entire message into a single number, a fraction n,
where (0.0≤n<1.0).
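The relationship between symbol probabilities and Huffman code lengths can be illustrated with the short sketch below, which computes only the code length of each symbol (not the bit patterns). The function name `huffman_code_lengths` and the dictionary input format are assumptions made for this illustration.

```python
import heapq
from itertools import count

def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), [sym]) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probabilities}
    while len(heap) > 1:
        # Merge the two least probable groups; every symbol in them gains one bit.
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for sym in group1 + group2:
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), group1 + group2))
    return lengths
```

With four equally likely symbols every code length is 2 bits, which is the binary block code mentioned above; with probabilities 0.5, 0.25, 0.125 and 0.125 the lengths become 1, 2, 3 and 3 bits.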
Figs. 7a-7b illustrate exemplary implementations of visual search functionality according to embodiments of
the present invention. The method of coding feature location information described in the present invention
can be implemented in a client and server environment as shown in Fig. 7a and Fig. 7b.
As shown in Fig. 7a, the system includes a mobile device 702 (e.g., a mobile phone), a visual search
server 704, and a wireless network 706. Mobile device 702 includes an image capture module 703, an image
encoding module 705, and a process-and-display-results module 707. Visual search server 704 includes an image
decoding module 711, a descriptor extraction module 713, a descriptor matching module 715, a search results
module 717, and a database 719. The components of mobile device 702, wireless network 706, and visual search
server 704 are communicatively coupled as shown in the flow diagram of Fig. 7a. Mobile device 702 analyzes
the query image, extracts local image features (descriptors), and transmits the feature data. The retrieval
method runs on visual search server 704, using the transmitted features as the query to perform the search.
In the example shown in Fig. 7b, the system includes a mobile device 722 (shown as a mobile phone), a
visual search server 724, and a wireless network 726. Mobile device 722 includes an image capture module 723,
a descriptor extraction module 725, a descriptor encoding module 727, a descriptor matching module 729, a
decision branch 731, a process-and-display-results module 733, and a local database (D/B) or cache 735.
Visual search server 724 includes a descriptor decoding module 741, a descriptor matching module 743, a
search results module 745, and a database 747. The components of mobile device 722, wireless network 726, and
visual search server 724 are communicatively coupled as shown in the flow diagram of Fig. 7b. Mobile device
722 maintains a cache of the database and performs image matching locally. If no match is found, mobile
device 722 sends a query request to visual search server 724. In this way, the amount of data sent over the
network is further reduced.
In each case of Fig. 7a and Fig. 7b, the retrieval framework is adapted to stringent mobile system
requirements. The processing on the mobile device needs to be fast and economical in terms of power
consumption. The size of the data transmitted over the network needs to be as small as possible, to minimize
network latency and thus provide the best user experience. The method used for retrieval needs to scale to
potentially very large databases and deliver accurate results with low latency. In addition, the retrieval
system needs to be robust, allowing reliable recognition of objects captured under a wide range of
conditions, including different distances, viewing angles, and lighting conditions, or in the presence of
partial occlusion or motion blur.
The feature extraction process identifies salient interest points in the image. For robust image matching,
these interest points need to be repeatable under viewpoint transformations (such as scale changes, rotation,
and translation) and illumination changes. To achieve scale invariance, interest points can be computed at
multiple scales using an image pyramid. To achieve rotation invariance, the patch around each interest point
is oriented in the direction of the dominant gradient. The gradients in each patch are further normalized to
make them robust to illumination changes.
It should be noted that different interest point detectors provide different trade-offs between
repeatability and complexity. For example, the difference-of-Gaussian (DoG) points generated by SIFT may be
slow to compute but are highly repeatable, whereas a corner detector approach may be fast but offers lower
repeatability. Among the various approaches that achieve a good trade-off between repeatability and
complexity is the Hessian-blob detector accelerated with integral images. Using this approach on VGA images,
interest point detection can be carried out in less than approximately one second on some current mobile
phones.
After interest point detection, "visual word" descriptors are computed using small image patches around
these points. One challenge in computing feature descriptors is to make them highly discriminative, that is,
characteristic of one image or a small set of images. Descriptors that occur in almost every image (the
equivalent of the word "and" in a text document) would not be useful for retrieval.
In one embodiment, the process of computing the descriptors is as follows:
● the patch is divided into several (e.g., 5 to 9) spatially localized bins;
● the joint (dx, dy) gradient histogram in each spatial bin is then computed. CHoG histogram binning
exploits the typical skew in the gradient statistics observed for patches extracted around keypoints; and
● the histogram of gradients from each spatial bin is quantized and stored as part of the descriptor.
In the above embodiment for extracting the features of an image, interest points (e.g., corners, blobs)
are extracted at different scales. The patches at the different scales are oriented along the dominant
gradient. The descriptor is computed using the canonically oriented and normalized patch. The patch is
divided into localized spatial bins, which provides robustness to interest point localization error. The
distribution of gradients in each spatial bin is compressed directly to obtain a compact description of the
patch.
The use of histograms allows information distance measures (such as KL divergence) to be used to assess the
degree of mismatch between image features. Histograms also allow simple and efficient encoding. In some
examples, only 50 to 60 bits are needed to turn each patch into a compressed histogram-based descriptor.
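As a minimal sketch of such a measure, the symmetric form of the KL divergence between two histograms might be computed as follows. The smoothing constant `eps`, which avoids taking the logarithm of an empty bin, and the function names are assumptions of this sketch.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) between two normalized histograms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q, eps=1e-9):
    """Symmetric KL divergence D(p || q) + D(q || p) between two histograms."""
    def smooth(h):
        # Add eps to every bin and renormalize so that the bins sum to one.
        total = sum(h) + eps * len(h)
        return [(x + eps) / total for x in h]
    ps, qs = smooth(p), smooth(q)
    return kl_divergence(ps, qs) + kl_divergence(qs, ps)
```

Identical histograms give a value of zero, and the value grows as the two gradient distributions diverge, so a small value indicates a likely match between two descriptors.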
Mobile AR and visual search systems that transmit or store local image features need to encode (and/or
multiplex) sets of features and feature location information efficiently. Feature location information also
needs to be encoded, because it is required for geometric verification. In one approach, at least 500 local
features are typically needed to achieve matching accuracy. These features are usually highly correlated
spatially. As shown in Figs. 2 to 4 above, the feature location information can be encoded by first quantizing
it into a 2D histogram and then using a context-based arithmetic coding technique to exploit the spatial
correlation. This technique can achieve a coding rate of about 5 bits/feature while delivering a sufficiently
precise representation of the feature location information.
The entire set of local features and their corresponding locations can be encoded by first transmitting
the feature location histogram and then transmitting the features in the order in which their locations
appear when the histogram is decoded. For example, if the histogram indicates that block (x, y) includes
three features, the encoder can output the codes of the three corresponding descriptors sequentially in the
bit stream.
Using compact descriptors (such as those described above) together with feature location coding, a query
image with 500 features can be represented in about 4K bytes (500 × (60+5)/8). Considering that a
JPEG-compressed query image typically takes about 40-80K bytes, the disclosed method represents an
order-of-magnitude reduction in bit rate.
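The size estimate above can be reproduced with a few lines of arithmetic. The function name `query_size_bytes` is hypothetical; the inputs are the figures quoted in the text (500 features, about 60 bits per descriptor, about 5 bits per feature location).

```python
def query_size_bytes(num_features, descriptor_bits, location_bits):
    """Approximate size of an encoded query, in bytes."""
    return num_features * (descriptor_bits + location_bits) / 8

size = query_size_bytes(500, 60, 5)       # 4062.5 bytes, i.e. about 4K bytes
ratio_low = 40 * 1024 / size              # compared with a 40K-byte JPEG
ratio_high = 80 * 1024 / size             # compared with an 80K-byte JPEG
```

Under these figures the encoded query is roughly 10 to 20 times smaller than the JPEG-compressed image, which is consistent with the order-of-magnitude reduction stated above.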
To index and match the features of images in a large image database, the disclosed embodiments use a data
structure that returns a short list of the database candidates that are likely to match the query image. The
short list may contain false positives, as long as the correct match is included. Pairwise comparison can
then be performed on only the short list of candidates rather than on the whole database.
Various data structures can be used to index the local features in an image database. One approach uses an
approximate nearest neighbor (ANN) search of SIFT descriptors with a best-bin-first strategy. In addition, a
bag of features (BoF) model can be used. The BoF codebook is constructed by k-means clustering of a training
set of descriptors. During a query, the database images can be scored by using an inverted file index
associated with the BoF codebook. To generate a large codebook, a vocabulary tree (VT) can be created using
hierarchical k-means clustering. Other search techniques, such as locality-sensitive hashing (LSH) and
improvements of traditional tree-based approaches, can also be used.
Fig. 8a illustrates a method of building a vocabulary tree by hierarchical k-means clustering of training
feature descriptors according to an embodiment of the present invention. The vocabulary tree shown in this
example has 2 levels. With a branching factor of k=3, the vocabulary tree has k^2=9 leaf nodes. Fig. 8b
illustrates a vocabulary tree and its associated inverted index according to an embodiment of the present
invention. The inverted index contains lists of images, together with counters indicating the number of
features in each image that follow the same path in the vocabulary tree.
As shown in Figs. 8a-8b, a vocabulary tree (VT) and its associated inverted index structure are used to
index and match image features. As illustrated in Fig. 8a, the VT for a database can be constructed by
performing hierarchical k-means clustering on a set of training feature descriptors that represent the
database. First, k large clusters are generated for all training descriptors. This is done by applying the
k-means algorithm (quantizing into k cells) with an appropriate distance function (such as the L2 norm or a
symmetric form of KL divergence). Then, for each large cluster, k-means clustering is applied to the training
descriptors assigned to that cluster to generate k smaller clusters. This recursive division of the
descriptor space is repeated until there are enough bins to ensure good classification performance. For
example, in practice, a VT design with height 6 and branching factor k=10, producing 1,000,000 (10^6) leaf
nodes, can be used.
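The recursive construction might be sketched as follows. This is a simplified illustration rather than a production implementation: it uses the squared L2 distance, a fixed number of k-means iterations, and a deterministic initialization from evenly spaced samples, all of which are assumptions of the sketch, as are the function names and the dictionary-based node layout.

```python
def sq_dist(p, q):
    """Squared L2 distance between two descriptors (tuples of numbers)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    """Index of the center closest to descriptor p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))

def mean(points):
    """Component-wise mean of a non-empty list of descriptors."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, k, iterations=10):
    """Plain k-means; initial centers are evenly spaced samples (an assumption)."""
    step = max(1, len(points) // k)
    centers = points[::step][:k]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers

def build_tree(points, k, depth):
    """Recursively split the descriptor space into k clusters per level."""
    node = {"center": mean(points), "children": []}
    if depth == 0 or len(points) < k:
        return node
    centers = kmeans(points, k)
    groups = [[] for _ in centers]
    for p in points:
        groups[nearest(p, centers)].append(p)
    node["children"] = [build_tree(g, k, depth - 1) for g in groups if g]
    return node

def count_leaves(node):
    """Number of leaf nodes below (and including) the given node."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])
```

With a branching factor k and a tree of height h, the number of leaf nodes is at most k^h; this gives 3^2 = 9 for the tree of Fig. 8a and 10^6 for the height-6, k=10 design mentioned above.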
The inverted index associated with the VT maintains two lists for each leaf node, as shown in Fig. 8b. For
leaf node x, there is a sorted array of image identifiers {i_x1, ..., i_xNx}, indicating which N_x database
images have features that belong to the cluster associated with this node. Similarly, there is a
corresponding array of counters {C_x1, ..., C_xNx}, indicating the number of features in each corresponding
image that belong to the same cluster.
During a query, the VT is traversed for each feature in the query image, ending at one of the leaf nodes
each time. The corresponding lists of images and frequency counts are then used to compute similarity scores
between these images and the query image. These scores can be computed using a standard term
frequency-inverse document frequency (TF-IDF) weighting scheme. By pulling images from all of these lists and
sorting them according to score, a subset of database images that is likely to contain a true match to the
query image can be derived. Because only a small number of lookups need to be performed for each query
feature, and the lists of all relevant files can be obtained directly from the inverted index, this scheme can
scale to support large databases.
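The scoring step might be sketched as follows. The inverted index is represented here as a dictionary that maps each leaf node to a dictionary of {image identifier: feature count}, which merges the two per-leaf arrays of Fig. 8b into one structure for brevity. The vector normalization used in full TF-IDF scoring is omitted; both simplifications, and the name `rank_images`, are assumptions of this sketch.

```python
import math
from collections import Counter, defaultdict

def rank_images(query_leaves, inverted_index, num_images):
    """Rank database images by a simplified TF-IDF score.

    query_leaves: the leaf node reached by each query feature.
    inverted_index: {leaf: {image_id: number of features of that image in the leaf}}.
    num_images: total number of images in the database.
    """
    scores = defaultdict(float)
    for leaf, query_count in Counter(query_leaves).items():
        postings = inverted_index.get(leaf, {})
        if not postings:
            continue
        # Leaves that occur in few images are more discriminative.
        idf = math.log(num_images / len(postings))
        for image_id, image_count in postings.items():
            scores[image_id] += (query_count * idf) * (image_count * idf)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Only the leaves reached by the query features are visited, so the cost of a query depends on the number of query features and the length of the corresponding lists rather than on the size of the database.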
Geometric verification is performed after feature matching. At this stage, the location information of the
features in the query image and the database image is used to confirm that the feature matches are consistent
with a change in viewpoint between the two images. The geometric transformation between the query image and
the database image is estimated using regression techniques. The transformation is usually represented by the
fundamental matrix, which incorporates 3D geometry, by a homography, or by an affine model.
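As an illustration of the affine case, a least-squares fit of the six affine parameters from matched feature locations, followed by an inlier count, might look as follows. Plain least squares is used here as the simplest regression technique and is not robust to outlying matches; the tolerance `tol` (in pixels) and all function names are assumptions of this sketch.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, v):
    """Solve the 3x3 linear system m * p = v by Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(src, dst):
    """Least-squares affine map: u = a*x + b*y + c, v = d*x + e*y + f."""
    m = [[0.0] * 3 for _ in range(3)]
    vu, vv = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            vu[i] += row[i] * u
            vv[i] += row[i] * v
            for j in range(3):
                m[i][j] += row[i] * row[j]
    # Normal equations: the same 3x3 matrix serves both output coordinates.
    return solve3(m, vu), solve3(m, vv)

def count_inliers(src, dst, tol=3.0):
    """Number of matches whose mapped location lies within tol of its partner."""
    (a, b, c), (d, e, f) = fit_affine(src, dst)
    return sum(
        1 for (x, y), (u, v) in zip(src, dst)
        if math.hypot(a * x + b * y + c - u, d * x + e * y + f - v) <= tol
    )
```

A candidate database image could then be accepted when the inlier count exceeds a chosen threshold; at least three non-collinear matches are needed for the fit to be defined.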
It should be noted that paragraph [0090], Fig. 2, Figs. 6a-6c and their corresponding descriptions provide
means for generating a hexagonal grid comprising a plurality of hexagonal cells, means for quantizing the
feature locations of an image using the hexagonal grid, means for generating a histogram to record the
occurrences of feature locations in each hexagonal cell, and means for encoding the histogram according to the
occurrences of feature locations in each hexagonal cell. Paragraph [0090], Fig. 2, Figs. 3a-3b, Figs. 6a-6c
and their corresponding descriptions provide means for generating transformed coordinates of a feature
location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed
coordinates to the corresponding nearest integers, and means for verifying that the transformed coordinates
belong to a hexagonal plane in the three-dimensional space. Paragraph [0090], Fig. 2, Figs. 6a-6c and their
corresponding descriptions provide means for generating a histogram map configured to indicate the occurrence
of feature locations in each hexagonal cell, and means for generating a histogram count configured to describe
the number of occurrences of feature locations in each hexagonal cell. Paragraph [0090], Figs. 4a-4b,
Figs. 6a-6c and their corresponding descriptions provide means for applying context information of
neighboring hexagonal cells to encode the information of a subsequent hexagonal cell to be encoded in the
histogram.
The methods and mobile devices described herein can be implemented by various means depending on the
application. For example, these methods can be implemented in hardware, firmware, software, or a combination
thereof. For a hardware implementation, the processing units can be implemented within one or more
application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing
devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors,
controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform
the functions described herein, or a combination thereof. Herein, the term "control logic" encompasses logic
implemented by software, hardware, firmware, or a combination thereof.
For a firmware and/or software implementation, the methods can be implemented with modules (e.g.,
procedures, functions, and so on) that perform the functions described herein. Any machine-readable medium
tangibly embodying instructions can be used in implementing the methods described herein. For example,
software code can be stored in a memory and executed by a processing unit. The memory can be implemented
within the processing unit or external to the processing unit. As used herein, the term "memory" refers to
any type of long-term, short-term, volatile, non-volatile, or other storage device, and is not limited to any
particular type or number of memories, or to any particular type of media upon which memory is stored.
If implemented in firmware and/or software, the functions can be stored as one or more instructions or code
on a computer-readable medium. Examples include computer-readable media encoded with a data structure and
computer-readable media encoded with a computer program. Computer-readable media can take the form of an
article of manufacture. Computer-readable media include physical computer storage media. A storage medium can
be any available medium that can be accessed by a computer. By way of example, and not limitation, such
computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk
storage or other magnetic storage devices, or any other medium that can be used to store desired program code
in the form of instructions or data structures and that can be accessed by a computer. Disk and disc, as used
herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and
Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with
lasers. Combinations of the above should also be included within the scope of computer-readable media.
In addition to storage on a computer-readable medium, instructions and/or data can be provided as signals
on transmission media included in a communication apparatus. For example, a communication apparatus can
include a transceiver having signals indicative of instructions and data. The instructions and data are
configured to cause one or more processors to implement the functions outlined in the claims. That is, the
communication apparatus includes transmission media with signals indicative of information to perform the
disclosed functions. At a first time, the transmission media included in the communication apparatus can
include a first portion of the information to perform the disclosed functions, while at a second time the
transmission media included in the communication apparatus can include a second portion of the information
to perform the disclosed functions.
The present invention can be implemented in conjunction with various wireless communication networks such
as a wireless wide area network (WWAN), a wireless local area network (WLAN), a wireless personal area
network (WPAN), and so on. The terms "network" and "system" are often used interchangeably. The terms
"position" and "location" are often used interchangeably. A WWAN can be a code division multiple access
(CDMA) network, a time division multiple access (TDMA) network, a frequency division multiple access (FDMA)
network, an orthogonal frequency division multiple access (OFDMA) network, a single-carrier frequency
division multiple access (SC-FDMA) network, a Long Term Evolution (LTE) network, a WiMAX (IEEE 802.16)
network, and so on. A CDMA network can implement one or more radio access technologies (RATs) such as
cdma2000, Wideband CDMA (W-CDMA), and so on. Cdma2000 includes the IS-95, IS-2000, and IS-856 standards. A
TDMA network can implement Global System for Mobile Communications (GSM), Digital Advanced Mobile Phone System
(D-AMPS), or some other RAT. GSM and W-CDMA are described in documents from a consortium named "3rd
Generation Partnership Project" (3GPP). Cdma2000 is described in documents from a consortium named "3rd
Generation Partnership Project 2" (3GPP2). 3GPP and 3GPP2 documents are publicly available. A WLAN can be an
IEEE 802.11x network, and a WPAN can be a Bluetooth network, an IEEE 802.15x network, or some other type of
network. The techniques can also be implemented in conjunction with any combination of WWAN, WLAN, and/or
WPAN.
A mobile station refers to a device such as a cellular or other wireless communication device, a personal
communication system (PCS) device, a personal navigation device (PND), a personal information manager (PIM),
a personal digital assistant (PDA), a laptop, or another suitable mobile device capable of receiving wireless
communication and/or navigation signals. The term "mobile station" is also intended to include devices that
communicate with a personal navigation device (PND), such as by short-range wireless, infrared, wireline
connection, or other connection, regardless of whether satellite signal reception, assistance data reception,
and/or position-related processing occurs at the device or at the PND. Also, "mobile station" is intended to
include all devices, including wireless communication devices, computers, laptops, etc., that are capable of
communication with a server, such as via the Internet, Wi-Fi, or another network, regardless of whether
satellite signal reception, assistance data reception, and/or position-related processing occurs at the
device, at a server, or at another device associated with the network. Any operable combination of the above
is also considered a "mobile station".
A statement that something is "optimized" or "required", or any other such designation, does not indicate
that the present invention applies only to systems that are optimized, or to systems in which the "required"
elements are present (or to other limitations due to other designations). These designations refer only to
the particular implementation described. Of course, many implementations are possible. The techniques can be
used with protocols other than those discussed herein, including protocols that are in development or yet to
be developed.
Those skilled in the relevant art will recognize that many possible modifications and combinations of the
disclosed embodiments can be used while still employing the same basic underlying mechanisms and methods. The
foregoing description, for purposes of explanation, has been written with reference to specific embodiments.
However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to
the precise forms disclosed. Many modifications and variations are possible in view of the above teachings.
The embodiments were chosen and described to explain the principles of the invention and its practical
applications, and to enable those skilled in the art to best utilize the invention and various embodiments
with various modifications as suited to the particular use contemplated.
Claims (34)
1. A method of coding feature location information of an image, comprising:
generating a hexagonal grid, wherein the hexagonal grid comprises a plurality of hexagonal cells;
quantizing feature locations of the image using the hexagonal grid, wherein quantizing the feature
locations comprises: for each feature location, generating transformed coordinates of the feature location
from a two-dimensional plane to a three-dimensional space, rounding the transformed coordinates to the
corresponding nearest integers, and verifying that the transformed coordinates belong to a hexagonal plane in
the three-dimensional space;
generating a histogram to record occurrences of feature locations in each hexagonal cell; and
encoding the histogram according to the occurrences of feature locations in each hexagonal cell.
2. The method according to claim 1, wherein generating the hexagonal grid comprises:
determining the size of the hexagonal cells according to a predetermined quantization level of the feature
location information.
3. The method according to claim 1, wherein verifying the transformed coordinates comprises:
computing a sum of the transformed coordinates; and
verifying that the sum of the transformed coordinates is equal to zero.
4. The method according to claim 1, wherein generating the histogram comprises:
generating a histogram map configured to indicate the occurrence of feature locations in each hexagonal
cell.
5. The method according to claim 4, further comprising:
generating a histogram count configured to describe the number of occurrences of feature locations in each
hexagonal cell.
6. The method according to claim 1, wherein encoding the histogram comprises:
converting the histogram to a unique lexicographic index; and
encoding the unique lexicographic index using a fixed-length code.
7. The method according to claim 1, wherein encoding the histogram further comprises:
converting empty cells of the histogram to run lengths in raster scan order; and
encoding the run lengths using an entropy coder.
8. The method according to claim 7, wherein the entropy coder uses Golomb-Rice codes.
9. The method according to claim 7, wherein the entropy coder uses Huffman codes.
10. The method according to claim 7, wherein the entropy coder uses arithmetic codes.
11. The method according to claim 1, wherein encoding the histogram further comprises:
applying context information of neighboring hexagonal cells to encode information of a subsequent
hexagonal cell to be encoded in the histogram.
12. The method according to claim 11, wherein the context information comprises:
context information from first-order neighboring cells of the subsequent hexagonal cell to be encoded.
13. The method according to claim 12, wherein the context information further comprises:
context information from second-order neighboring cells of the subsequent hexagonal cell to be encoded.
14. The method according to claim 11, wherein the context information is used as input to an arithmetic
coder.
15. A mobile device, comprising:
an image module configured to obtain an image;
a visual search module configured to generate encoded feature location information of the image; and
a controller configured to communicate the encoded feature location information of the image to a server via a wireless network;
wherein the visual search module is further configured to:
generate a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells;
quantize feature locations of the image using the hexagonal grid, wherein quantizing the feature locations comprises: for each feature location, generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, rounding the transformed coordinates to the corresponding nearest integers, and verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
generate a histogram to record occurrences of feature locations in each hexagonal cell; and
encode the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
16. The mobile device according to claim 15, wherein generating the hexagonal grid comprises:
determining a size of the hexagonal cells in accordance with a predetermined quantization level of the feature location information.
17. The mobile device according to claim 15, wherein verifying the transformed coordinates comprises:
computing a sum of the transformed coordinates; and
verifying that the sum of the transformed coordinates equals zero.
18. The mobile device according to claim 15, wherein generating the histogram comprises:
generating a histogram map configured to include occurrences of feature locations in each hexagonal cell.
19. The mobile device according to claim 18, wherein the visual search module is further configured to:
generate a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
20. The mobile device according to claim 15, wherein encoding the histogram comprises:
converting the histogram to a unique lexicographic index; and
encoding the unique lexicographic index using fixed-length codes.
21. The mobile device according to claim 15, wherein encoding the histogram further comprises:
converting empty blocks of the histogram into run lengths in raster scan order; and
encoding the run lengths using an entropy coder.
22. The mobile device according to claim 15, wherein encoding the histogram further comprises:
applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram.
23. The mobile device according to claim 22, wherein the context information comprises:
context information from first-order neighbors of the subsequent hexagonal cell to be encoded.
24. The mobile device according to claim 23, wherein the context information further comprises:
context information from second-order neighbors of the subsequent hexagonal cell to be encoded.
25. A mobile device, comprising:
an image module configured to obtain an image;
a visual search module configured to generate encoded feature location information of the image; and
a controller configured to communicate the encoded feature location information of the image to a server via a wireless network;
wherein the visual search module comprises:
means for generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells;
means for quantizing feature locations of the image using the hexagonal grid, wherein the means for quantizing the feature locations comprises: for each feature location, means for generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed coordinates to the corresponding nearest integers, and means for verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
means for generating a histogram to record occurrences of feature locations in each hexagonal cell; and
means for encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
26. The mobile device according to claim 25, wherein the means for generating the histogram comprises:
means for generating a histogram map configured to include occurrences of feature locations in each hexagonal cell.
27. The mobile device according to claim 26, further comprising:
means for generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
28. The mobile device according to claim 25, wherein the means for encoding the histogram further comprises:
means for applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram.
29. The mobile device according to claim 28, wherein the context information comprises:
context information from first-order neighbors of the subsequent hexagonal cell to be encoded.
30. The mobile device according to claim 29, wherein the context information further comprises:
context information from second-order neighbors of the subsequent hexagonal cell to be encoded.
31. An apparatus for coding feature location information of an image, comprising:
means for generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells;
means for quantizing feature locations of the image using the hexagonal grid, wherein the means for quantizing the feature locations comprises: for each feature location, means for generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed coordinates to the corresponding nearest integers, and means for verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
means for generating a histogram to record occurrences of feature locations in each hexagonal cell; and
means for encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
32. The apparatus according to claim 31, wherein the means for generating the histogram comprises:
means for generating a histogram map configured to include occurrences of feature locations in each hexagonal cell.
33. The apparatus according to claim 32, further comprising:
means for generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
34. The apparatus according to claim 32, wherein the means for encoding the histogram further comprises:
means for applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram.
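Illustrative sketch (not part of the patent text): the quantization, histogram, and run-length steps recited in the claims above could be prototyped roughly as follows. This is a minimal Python sketch under several assumptions that the claims do not state: the three-dimensional space is taken to be cube coordinates on the plane x + y + z = 0, the hexagons are pointy-top with center-to-corner distance `size`, a failed sum check is repaired by recomputing the component with the largest rounding error, and the Golomb-Rice parameter `k` is chosen by the caller. All function names are hypothetical.

```python
import math
from collections import Counter

def to_cube(px, py, size):
    """Map a 2-D point to fractional cube coordinates (x + y + z = 0).
    Assumes pointy-top hexagons with center-to-corner distance `size`."""
    q = (math.sqrt(3) / 3 * px - py / 3) / size
    r = (2 * py / 3) / size
    return q, -q - r, r

def round_cube(x, y, z):
    """Round each coordinate to the nearest integer, then verify that the
    sum is zero; if not, recompute the component with the largest rounding
    error (an assumed repair step, not recited in the claims)."""
    rx, ry, rz = round(x), round(y), round(z)
    if rx + ry + rz != 0:
        dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
        if dx > dy and dx > dz:
            rx = -ry - rz
        elif dy > dz:
            ry = -rx - rz
        else:
            rz = -rx - ry
    return rx, ry, rz

def build_histogram(points, size):
    """Return (histogram map, histogram count): the set of occupied cells
    and the number of feature locations falling in each cell."""
    count = Counter(round_cube(*to_cube(px, py, size)) for px, py in points)
    return set(count), count

def empty_run_lengths(occupancy):
    """Lengths of the runs of empty cells preceding each occupied cell.
    `occupancy` is a 0/1 sequence already listed in raster-scan order;
    trailing empty cells are not emitted."""
    runs, run = [], 0
    for bit in occupancy:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs

def rice_encode(n, k):
    """Golomb-Rice code of a non-negative integer as a bit string:
    unary quotient (ones, then a zero) followed by a k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + (format(r, f"0{k}b") if k else "")

# Example usage with made-up feature locations and a cell size of 10.
hist_map, hist_count = build_histogram([(0.0, 0.0), (1.0, 1.0), (17.3, 0.0)], size=10)
bits = "".join(rice_encode(n, 2) for n in empty_run_lengths([0, 0, 1, 0, 1, 1, 0]))
```

Here `hist_map` plays the role of the histogram map of claim 4 and `hist_count` the histogram count of claim 5. The sum check in `round_cube` corresponds to the verification of claim 3, and the last two functions sketch the run-length and Golomb-Rice steps of claims 7 and 8.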
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161522171P | 2011-08-10 | 2011-08-10 | |
US61/522,171 | 2011-08-10 | ||
US13/229,654 | 2011-09-09 | ||
US13/229,654 US8571306B2 (en) | 2011-08-10 | 2011-09-09 | Coding of feature location information |
PCT/US2012/049055 WO2013022656A2 (en) | 2011-08-10 | 2012-07-31 | Coding of feature location information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103843011A (en) | 2014-06-04 |
CN103843011B (en) | 2017-05-31 |
Family
ID=46634570
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280038785.0A Expired - Fee Related CN103843011B (en) | 2011-08-10 | 2012-07-31 | Coding of feature location information |
Country Status (6)
Country | Link |
---|---|
US (1) | US8571306B2 (en) |
EP (1) | EP2742486A2 (en) |
JP (1) | JP5911578B2 (en) |
KR (1) | KR101565265B1 (en) |
CN (1) | CN103843011B (en) |
WO (1) | WO2013022656A2 (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9396421B2 (en) | 2010-08-14 | 2016-07-19 | Rujan Entwicklung Und Forschung Gmbh | Producing, capturing and using visual identification tags for moving objects |
CN103858433B (en) * | 2011-08-25 | 2017-08-15 | 汤姆逊许可公司 | Layered entropy encoding and decoding |
US20130114900A1 (en) * | 2011-11-07 | 2013-05-09 | Stanford University | Methods and apparatuses for mobile visual search |
US9412020B2 (en) * | 2011-11-09 | 2016-08-09 | Board Of Regents Of The University Of Texas System | Geometric coding for billion-scale partial-duplicate image search |
JP6168303B2 (en) * | 2012-01-30 | 2017-07-26 | 日本電気株式会社 | Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof |
US9449249B2 (en) * | 2012-01-31 | 2016-09-20 | Nokia Corporation | Method and apparatus for enhancing visual search |
EP2801190B1 (en) * | 2012-04-20 | 2018-08-15 | Huawei Technologies Co., Ltd. | Method for processing an image |
CA2900841C (en) | 2013-01-16 | 2018-07-17 | Huawei Technologies Co., Ltd. | Context based histogram map coding for visual search |
KR102113813B1 (en) * | 2013-11-19 | 2020-05-22 | 한국전자통신연구원 | Apparatus and Method Searching Shoes Image Using Matching Pair |
US10423596B2 (en) * | 2014-02-11 | 2019-09-24 | International Business Machines Corporation | Efficient caching of Huffman dictionaries |
WO2015164724A1 (en) * | 2014-04-24 | 2015-10-29 | Arizona Board Of Regents On Behalf Of Arizona State University | System and method for quality assessment of optical colonoscopy images |
ES2898868T3 (en) * | 2015-06-23 | 2022-03-09 | Torino Politecnico | Method and device for image search |
US10885098B2 (en) | 2015-09-15 | 2021-01-05 | Canon Kabushiki Kaisha | Method, system and apparatus for generating hash codes |
US9727775B2 (en) * | 2015-12-01 | 2017-08-08 | Intel Corporation | Method and system of curved object recognition using image matching for image processing |
CN107341191B (en) * | 2017-06-14 | 2020-10-09 | 童晓冲 | Multi-scale integer coding method and device for three-dimensional space |
CN111818346B (en) | 2019-04-11 | 2023-04-18 | 富士通株式会社 | Image encoding method and apparatus, image decoding method and apparatus |
CN113114272B (en) * | 2021-04-12 | 2023-02-17 | 中国人民解放军战略支援部队信息工程大学 | Method and device for encoding data structure of hexagonal grid with consistent global tiles |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101069401A (en) * | 2004-11-15 | 2007-11-07 | 艾利森电话股份有限公司 | Method and apparatus for header compression with transmission of context information dependent upon media characteristic |
CN102138160A (en) * | 2008-07-15 | 2011-07-27 | 韩国巴斯德研究所 | Method and apparatus for imaging of features on a substrate |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5658368A (en) * | 1979-10-17 | 1981-05-21 | Matsushita Electric Ind Co Ltd | Band compressing method |
JPH03110691A (en) * | 1989-09-25 | 1991-05-10 | Meidensha Corp | Dictionary preparing method |
JPH0746599A (en) * | 1993-07-15 | 1995-02-14 | Kyocera Corp | Motion compensation circuit for moving image |
JPH08149016A (en) * | 1994-11-17 | 1996-06-07 | N T T Ido Tsushinmo Kk | Character string coding method |
DE60117930T2 (en) | 2000-06-06 | 2006-10-05 | Agilent Technologies Inc., A Delaware Corp., Palo Alto | Method and system for the automatic extraction of data from a molecular array |
JP2006121302A (en) * | 2004-10-20 | 2006-05-11 | Canon Inc | Device and method for encoding |
JPWO2007088926A1 (en) * | 2006-02-01 | 2009-06-25 | 日本電気株式会社 | Image processing, image feature extraction, and image collation apparatus, method, and program, and image collation system |
US7876959B2 (en) | 2006-09-06 | 2011-01-25 | Sharp Laboratories Of America, Inc. | Methods and systems for identifying text in digital images |
US7894668B1 (en) | 2006-09-28 | 2011-02-22 | Fonar Corporation | System and method for digital image intensity correction |
US8170101B2 (en) * | 2006-10-27 | 2012-05-01 | Sharp Laboratories Of America, Inc. | Methods and systems for low-complexity data compression |
US8712109B2 (en) | 2009-05-08 | 2014-04-29 | Microsoft Corporation | Pose-variant face recognition using multiscale local descriptors |
US20100303354A1 (en) * | 2009-06-01 | 2010-12-02 | Qualcomm Incorporated | Efficient coding of probability distributions for image feature descriptors |
US9449249B2 (en) * | 2012-01-31 | 2016-09-20 | Nokia Corporation | Method and apparatus for enhancing visual search |
2011
- 2011-09-09: US application US13/229,654, patent US8571306B2 (not active, Expired - Fee Related)
2012
- 2012-07-31: JP application JP2014525051A, patent JP5911578B2 (not active, Expired - Fee Related)
- 2012-07-31: KR application KR1020147006300A, patent KR101565265B1 (not active, IP Right Cessation)
- 2012-07-31: CN application CN201280038785.0A, patent CN103843011B (not active, Expired - Fee Related)
- 2012-07-31: EP application EP12743870.3A, patent EP2742486A2 (not active, Withdrawn)
- 2012-07-31: WO application PCT/US2012/049055, patent WO2013022656A2 (status unknown)
Non-Patent Citations (4)
Title |
---|
Interactive 3-D Video Representation and Coding Technologies; Aljoscha Smolic et al.; 《Proceedings of the IEEE (Volume: 93, Issue: 1)》; 20050630; full text * |
Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard; Anthony Vetro et al.; 《Proceedings of the IEEE (Volume: 99, Issue: 4)》; 20110430; full text * |
三维视频编码技术研究; 杨海涛; 《中国博士学位论文全文数据库信息科技辑》; 20091215; full text * |
基于直方图变换的多光谱图像3D SPIHT压缩编码算法; 陈林杰 et al.; 《光学技术》; 20070228; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN103843011A (en) | 2014-06-04 |
EP2742486A2 (en) | 2014-06-18 |
JP5911578B2 (en) | 2016-04-27 |
US20130039566A1 (en) | 2013-02-14 |
KR101565265B1 (en) | 2015-11-02 |
WO2013022656A2 (en) | 2013-02-14 |
JP2014524693A (en) | 2014-09-22 |
US8571306B2 (en) | 2013-10-29 |
KR20140045585A (en) | 2014-04-16 |
WO2013022656A3 (en) | 2014-03-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103843011B (en) | Coding of feature location information | |
Tsai et al. | Location coding for mobile image retrieval | |
CN105144141B (en) | For using the system and method apart from relevance hashing to media database addressing | |
US9420299B2 (en) | Method for processing an image | |
US8447122B2 (en) | Animated image code, apparatus for generating/decoding animated image code, and method thereof | |
CN111275038A (en) | Image text recognition method and device, computer equipment and computer storage medium | |
US7519221B1 (en) | Reconstructing high-fidelity electronic documents from images via generation of synthetic fonts | |
CN107273458B (en) | Depth model training method and device, and image retrieval method and device | |
WO2016082277A1 (en) | Video authentication method and apparatus | |
CN106503112B (en) | Video retrieval method and device | |
Duan et al. | Optimizing JPEG quantization table for low bit rate mobile visual search | |
CN115443490A (en) | Image auditing method and device, equipment and storage medium | |
JP2008234479A (en) | Image quality improvement device, method, and program | |
CN105229670A (en) | The text representation of image | |
Vázquez et al. | Using normalized compression distance for image similarity measurement: an experimental study | |
CN104115162A (en) | Image analysis | |
Chen et al. | Efficient video hashing based on low‐rank frames | |
CN116258917A (en) | Method and device for classifying malicious software based on TF-IDF transfer entropy | |
Zhang et al. | Blind image quality assessment based on local quantized pattern | |
KR101572330B1 (en) | Apparatus and method for near duplicate video clip detection | |
US9756342B2 (en) | Method for context based encoding of a histogram map of an image | |
Zhang et al. | A probabilistic analysis of sparse coded feature pooling and its application for image retrieval | |
Wang et al. | A novel low bit rate side match vector quantization algorithm based on structed state codebook | |
Wang et al. | PQ-WGLOH: A bit-rate scalable local feature descriptor | |
KR101826039B1 (en) | Method, Device, and Computer-Readable Medium for Optimizing Document Image |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170531; Termination date: 20190731 |