CN103843011B - Coding of feature location information - Google Patents


Info

Publication number
CN103843011B
Authority
CN
China
Prior art keywords
histogram
hexagonal
encoded
feature locations
cells
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201280038785.0A
Other languages
Chinese (zh)
Other versions
CN103843011A (en)
Inventor
Yuriy Reznik
Onur C. Hamsici
Sundeep Vaddadi
John H. Hong
Chong U. Lee
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN103843011A
Application granted
Publication of CN103843011B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/005 Statistical coding, e.g. Huffman, run length coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/19 Image acquisition by sensing codes defining pattern positions

Abstract

The present invention discloses a method and apparatus for coding feature locations. In one embodiment, a method of coding feature location information of an image includes: generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells; quantizing feature locations of the image using the hexagonal grid; generating a histogram to record occurrences of feature locations in each hexagonal cell; and encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell. Encoding the histogram includes applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram, wherein the context information includes context information from first-order neighbors and context information from second-order neighbors of the subsequent hexagonal cell to be encoded.

Description

Coding of feature location information
Cross-reference to related applications
This application claims the benefit of U.S. Application No. 13/229,654, "Coding of Feature Location Information," filed September 9, 2011, which in turn claims the benefit of U.S. Provisional Application No. 61/522,171, "Coding of Feature Location Information," filed August 10, 2011. The aforementioned U.S. applications are hereby incorporated by reference in their entirety.
Technical field
The present invention relates to the field of processing digital image data. In particular, it relates to coding of feature location information of an image.
Background
As camera phones and personal digital assistants (PDAs) become widely adopted in the market, these devices are becoming a ubiquitous platform for visual search and mobile augmented reality applications. To support applications that require image comparison, information must either be uploaded from the mobile device to a server or downloaded from the server to the mobile device. The amount of data that must be transmitted and/or received over the wireless network becomes critical to the performance and usability of these applications.
Conventional feature-based retrieval systems typically code location information with a direct scheme. In these systems, the (x, y) coordinates of each feature are quantized to some fixed resolution, for example 8 bits per coordinate. The quantized (x, y) pairs are then stored and transmitted. For example, for an image with 1,000 features and 8-bit resolution, this scheme requires about 2 KB of data per image. Such coding schemes generate a large amount of data to be transmitted over the wireless network, which in turn degrades the performance and usability of visual search and mobile augmented reality applications.
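The 2 KB figure follows from simple arithmetic. A minimal sketch, assuming each of the two coordinates of a feature is stored with a fixed-length code of the stated resolution (the function name is illustrative and not part of the disclosure):

```python
def direct_coding_bytes(num_features: int, bits_per_coordinate: int) -> float:
    """Bytes needed to send every quantized (x, y) pair with fixed-length codes."""
    bits = num_features * 2 * bits_per_coordinate  # two coordinates per feature
    return bits / 8


print(direct_coding_bytes(1000, 8))  # 2000.0 bytes, i.e. about 2 KB
```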
Accordingly, there is a need for systems and methods for coding feature location information that address the above problems of conventional systems.
Summary of the invention
The present invention relates to coding of feature location information of an image. According to embodiments of the invention, a method of coding feature location information of an image includes: generating a hexagonal grid, the hexagonal grid including a plurality of hexagonal cells; quantizing feature locations of the image using the hexagonal grid; generating a histogram to record occurrences of feature locations in each hexagonal cell; and encoding the histogram in accordance with the number of occurrences of feature locations in each hexagonal cell.
The method of generating the hexagonal grid includes determining the size of the hexagonal cells according to a predetermined quantization level of the feature location information. The method of quantizing the feature locations includes: performing, for each feature location, a coordinate transformation from a two-dimensional plane to a three-dimensional space; rounding the transformed coordinates to the corresponding nearest integers; and verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space. Note that the transformation is reversible. To verify that the transformed coordinates belong to the hexagonal plane in the three-dimensional space, the method computes the sum of the transformed coordinates and verifies that the sum equals zero.
The method of generating the histogram includes generating a histogram map configured to record occurrences of feature locations in each hexagonal cell, and generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell. The method of encoding the histogram may include applying context information of neighboring hexagonal cells to encode information of a subsequent hexagonal cell to be encoded in the histogram, wherein the context information includes context information from first-order neighbors and context information from second-order neighbors of the subsequent hexagonal cell to be encoded.
In another embodiment, a mobile device includes: an image module configured to obtain an image; a visual search module configured to generate encoded feature location information of the image; and a controller configured to transmit the encoded feature location information of the image to a server via a wireless network. The visual search module of the mobile device includes: logic for generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells; logic for quantizing feature locations of the image using the hexagonal grid; logic for generating a histogram to record occurrences of feature locations in each hexagonal cell; and logic for encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
Brief description of the drawings
The foregoing features and advantages of the invention, as well as additional features and advantages, will be more clearly understood after reading the detailed description of embodiments of the invention in conjunction with the following drawings.
FIGS. 1a-1b illustrate a method of generating a histogram of feature locations according to some aspects of the invention.
FIG. 2 illustrates another method of generating a histogram of feature locations according to some aspects of the invention.
FIG. 3a illustrates a method of representing feature location information with a hexagonal plane in three-dimensional (3D) space according to some aspects of the invention.
FIG. 3b illustrates properties of a hexagonal cell according to some aspects of the invention.
FIGS. 4a-4b illustrate context configurations applied to adaptive statistical coding of histogram values according to some aspects of the invention.
FIG. 5 illustrates a comparison of feature location coding schemes using a square grid and a hexagonal grid according to some aspects of the invention.
FIG. 6a illustrates a block diagram of a mobile device configured to perform visual search according to some aspects of the invention.
FIG. 6b illustrates a method for image retrieval according to embodiments of the invention.
FIG. 6c illustrates a method of coding feature location information of an image according to embodiments of the invention.
FIGS. 7a-7b illustrate exemplary implementations of visual search functionality according to embodiments of the invention.
FIGS. 8a-8b illustrate a vocabulary tree and an associated inverted index used in indexing feature locations according to embodiments of the invention.
Detailed description
Embodiments of coding feature location information are disclosed. The following description is presented to enable any person skilled in the art to make and use the invention. Descriptions of specific embodiments and applications are provided only as examples. Various modifications and combinations of the examples described herein will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other examples and applications without departing from the spirit and scope of the invention. Thus, the present invention is not intended to be limited to the examples described and shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
FIG. 1a illustrates a method of generating a histogram of feature locations using a rectangular grid, as described by Tsai et al. in "Location coding for mobile image retrieval systems," Proceedings of the International Mobile Multimedia Communications Conference, September 2009. This reference is incorporated herein by reference in its entirety. In 102, black dots represent features of an image. In 104, the image is overlaid with a square grid that includes a plurality of square cells. Depending on the application, the size of each square cell may vary from 2x2 pixels to 32x32 pixels. In 106, a histogram map is generated to show the locations of the features of the image. Cells of the histogram map that contain a black dot are shown in gray, and cells that contain none are shown in white. When a feature lies on the boundary between two square cells, the cell containing the larger area of the feature is selected. When a feature is split equally between two cells, either cell may be selected. In 108, histogram counts are generated from the histogram map of 106. The number in each square cell is the number of features that fall in that cell of the histogram map of 106.
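The construction of FIG. 1a can be sketched in a few lines. The following is an illustrative sketch only: it treats each feature as a single point, so the boundary rule based on feature area is not modeled, and the function and variable names are assumptions rather than anything defined in the disclosure.

```python
from collections import Counter


def square_grid_histogram(points, cell_size):
    """Return (histogram map, histogram counts) for (x, y) feature locations."""
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    occupied = set(counts)   # histogram map: which cells hold at least one feature
    return occupied, counts  # histogram counts: number of features per cell


features = [(3.0, 4.0), (5.5, 6.0), (20.0, 9.0), (21.0, 10.5)]
occupied, counts = square_grid_histogram(features, cell_size=8)
print(sorted(occupied))  # [(0, 0), (2, 1)]
print(counts[(0, 0)])    # 2
```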
According to embodiments of the invention, let n denote the number of features of an image and let m denote the number of cells in the histogram. Then, for a Video Graphics Array (VGA) image processed with the scale-invariant feature transform (SIFT) or speeded-up robust features (SURF), n = 1000 and m = 640·480/w², where w is the cell size in pixels.
Note that the histogram of feature location information may be represented using the notion of a multiset, in which members may appear more than once. The number of times an element belongs to the multiset is the multiplicity of that member. The total number of elements in the multiset, counting repeated members, is the cardinality of the multiset. For example, in the multiset {a, a, b, b, b, c}, the multiplicities of members a, b, and c are 2, 3, and 1, respectively, and the cardinality of the multiset is 6.
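As a small illustration of these definitions, Python's standard `collections.Counter` behaves like a multiset. The variable names below are illustrative only.

```python
from collections import Counter

multiset = Counter(["a", "a", "b", "b", "b", "c"])

multiplicities = dict(multiset)       # multiplicity of each member
cardinality = sum(multiset.values())  # total number of elements, repeats included
print(multiplicities, cardinality)    # {'a': 2, 'b': 3, 'c': 1} 6
```

A location histogram is the same kind of object: the members are cells, and the multiplicity of a cell is the number of features that fall in it.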
In the example shown in FIG. 1a, the number of multisets of cardinality n, with elements taken from a finite set of cardinality m, is the multiset coefficient (or multiset number). The number of possible histograms with m cells and a total of n is therefore given by the multiset coefficient:
$$\left(\!\!\binom{m}{n}\!\!\right) = \binom{m+n-1}{n}.$$
Thus, assuming that all histograms are equally probable, encoding one takes approximately
$$R(m,n) \approx \log_2 \binom{m+n-1}{n} = (m+n)\,H\!\left(\frac{n}{m+n}\right) + O(\log n) \qquad (m = \alpha n)$$
bits. In the expression above, O denotes big-O notation, which describes the limiting behavior of a function as its argument tends to a particular value or to infinity, usually in terms of simpler functions. Big-O notation characterizes functions by their growth rates, so different functions with the same growth rate can be represented by the same O notation. Here α is a constant and H(·) is the entropy function:
$$H(x) = -x\log_2 x - (1-x)\log_2(1-x).$$
The values obtained with this formula for n = 1000 and m = 640·480/w², where w denotes the block size of the location histogram, are plotted in FIG. 1b as bits per feature location versus the histogram block size w. In this plot, when the block size is small (for example, 2 pixels), the rate for coding feature location information is high (about 8 bits per feature). As the block size increases, the rate decreases. When the block size is about 30 pixels, the coding rate is about 1 bit per feature. FIG. 1b also includes the empirical entropy estimates reported in the article by Tsai et al. Note that the empirical entropy estimates do not take into account the cost of transmitting model information, and they therefore lie slightly below the curve predicted by the formula above. Overall, they follow a similar trend with respect to the location histogram block size w.
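The trend of FIG. 1b can be reproduced numerically. The sketch below assumes that the rate is the base-2 logarithm of the number of histograms with m cells and n features (the multiset coefficient), divided by n; the function name and the integer rounding of m are assumptions made for illustration.

```python
from math import comb, log2


def bits_per_feature(n: int, width: int, height: int, w: int) -> float:
    """Rate of an equiprobable-histogram code, in bits per feature location."""
    m = (width * height) // (w * w)      # number of histogram cells
    num_histograms = comb(m + n - 1, n)  # multiset coefficient
    return log2(num_histograms) / n


for w in (2, 4, 8, 16, 30):
    print(w, round(bits_per_feature(1000, 640, 480, w), 2))
```

For a VGA image with 1,000 features, the computed rate falls from a little under 8 bits per feature at w = 2 to roughly 1 bit per feature at w = 30, in line with the values quoted for FIG. 1b.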
Note also that the distortion (covering radius) introduced by this scheme is proportional to the block size w. Given a point q and its corresponding reconstructed point q′, it is:
$$\varepsilon = \max_{q}\lVert q - q' \rVert_2 = \frac{w}{\sqrt{2}}.$$
Using this relation, the rate-distortion characteristic of histogram location coding can be expressed as follows (for example, for the L2 norm):
$$R(\varepsilon) \approx \frac{1}{n}\log_2\binom{m+n-1}{n}\bigg|_{m = \frac{WH}{2\varepsilon^2}} \sim \log_2\frac{WH}{2\varepsilon^2} - \frac{1}{n}\log_2 n!,$$
where W and H denote the width and height of the input image, n is the number of features, and the asymptotic expression on the right is obtained for the high-fidelity regime (ε → 0).
FIG. 2 illustrates another method of generating a histogram of feature locations according to some aspects of the invention. In 202, black dots represent features of an image. In 204, the image is overlaid with a hexagonal grid (also called a hexagonal lattice) that includes a plurality of hexagonal cells. In 206, a histogram map may be formed to show the locations of the features of the image. In this example, cells of the histogram map that contain a black dot are shown in gray, and cells that contain none are shown in white. When a feature lies on the boundary between two hexagonal cells, the cell containing the larger area of the feature is selected. When a feature is split equally between two cells, either cell may be selected. In 208, histogram counts may be formed from the histogram map of 206. The number in each hexagonal cell is the number of features that fall in that cell of the histogram map of 206. Note that hexagonal cells of different sizes may be used to produce different quantization levels of the feature location information, for example 4, 5, or 6 bits per feature. For example, a side of a hexagonal cell in the grid may be 2, 4, 8, 16, or 32 pixels long. For each cell size, the entropy of the histogram map may have a different bit rate per feature and a different bit rate per image, and the entropy of the histogram counts may have a different bit rate per feature; the average rates may vary from image to image. Likewise, each cell size (2, 4, 8, 16, or 32 pixels) produces a different quantization level of the feature location information. The histogram map and the histogram counts may be encoded separately, and the spatial relationship of features in neighboring hexagonal cells may be exploited when coding the histogram map.
The method shown in FIG. 2 replaces the square-lattice partition of the space of feature locations with a hexagonal-lattice partition. With this approach, a histogram of the feature locations quantized to the hexagonal lattice is computed and then encoded. The goal of creating a histogram of feature locations is to reduce the number of bits needed to encode the location information of each feature. Rather than coding the location of each feature individually, the locations are converted into a location histogram, and the histogram is coded. This conversion has several benefits. First, it makes the coding independent of the order of the coded items, which reduces the bit rate. In addition, because features may be related points in the image, the spatial structure among features can be exploited during coding.
FIG. 3a illustrates a method of representing feature location information with a hexagonal plane in three-dimensional (3D) space according to some aspects of the invention. As shown in FIG. 3a, the 3D space is depicted as a cube 302 defined by a u axis, a v axis, and a w axis. A hexagonal plane 304 is formed with vertices at 305, 306, 307, 308, 309, and 310. In this example, the center 312 of the hexagonal plane 304 is also the center of the cube 302 and has coordinates (0.0, 0.0, 0.0).
According to embodiments of the invention, the method of representing feature location information with a hexagonal plane projects the image feature coordinates (x, y) from two-dimensional (2D) space onto the hexagonal plane 304 in 3D space. A point in 3D space lies in the hexagonal plane when the sum of its u, v, and w coordinates satisfies the following condition:
$$u + v + w = 0.$$
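In one exemplary approach, a point in 2D space is transformed into 3D space using the following matrix:
$$M = \begin{pmatrix} \sqrt{2/3} & -1/\sqrt{6} & -1/\sqrt{6} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix},$$
which satisfies the following condition:
$$M M^{T} = I.$$
This means that the transformation
$$(u, v, w) = (x, y)\,M$$
is reversible:
$$(x, y) = (u, v, w)\,M^{T}.$$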
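The forward and inverse mappings can be checked numerically. This is a sketch under an explicit assumption: the matrix below is one possible choice whose rows are orthonormal and each sum to zero, which is all the reversibility argument requires. The function names are hypothetical.

```python
import math

# Assumed 2x3 matrix: orthonormal rows, each summing to zero.
M = [
    [math.sqrt(2 / 3), -1 / math.sqrt(6), -1 / math.sqrt(6)],
    [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2)],
]


def to_plane(x, y):
    """(u, v, w) = (x, y) M: map a 2D point onto the plane u + v + w = 0."""
    return tuple(x * M[0][k] + y * M[1][k] for k in range(3))


def from_plane(u, v, w):
    """(x, y) = (u, v, w) M^T: inverse mapping back to 2D."""
    p = (u, v, w)
    return tuple(sum(p[k] * M[row][k] for k in range(3)) for row in range(2))


u, v, w = to_plane(3.5, -2.0)
print(abs(u + v + w) < 1e-12)  # True: the point lies in the plane
print(from_plane(u, v, w))     # approximately (3.5, -2.0)
```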
An example of this transformation is illustrated in FIG. 3a. The hexagonal lattice in the plane u + v + w = 0 is the set of points with integer coordinates, such as points 314a and 314b:
$$(u, v, w)_{\text{lattice}} \in \mathbb{Z}^3.$$
According to embodiments of the invention, a method of quantizing a transformed point in 3D space includes the following computations.
Given a point q in 3D space with coordinates
$$q = (u_q, v_q, w_q),$$
define the point
$$q' = (\langle u_q\rangle, \langle v_q\rangle, \langle w_q\rangle),$$
where $\langle x\rangle$ denotes the integer nearest to the real number x.
Compute the sum to verify whether the quantized point lies in the hexagonal plane:
$$\Delta = \langle u_q\rangle + \langle v_q\rangle + \langle w_q\rangle.$$
If Δ = 0, the quantized point lies in the hexagonal plane and the process is complete. In other words, q′ belongs to the hexagonal plane (u + v + w = 0) and is therefore a valid lattice point.
Otherwise, compute the rounding errors
$$\delta = (u_q - \langle u_q\rangle,\; v_q - \langle v_q\rangle,\; w_q - \langle w_q\rangle)$$
and sort them so that
$$\delta_{i_1} \le \delta_{i_2} \le \delta_{i_3}.$$
If Δ > 0, subtract 1 from the Δ components of q′ with the smallest (most negative) errors $\delta_i$, since these are the components that were rounded up the furthest. If Δ < 0, add 1 to the |Δ| components of q′ with the largest errors $\delta_i$. To control the coarseness of the mapping of image feature locations (x, y) to lattice points, a scale parameter σ may be introduced. Note that the entire quantization process can be described as a series of transformations:
$$(x, y) \;\to\; (u, v, w) = \sigma^{-1}(x, y)\,M \;\to\; (u, v, w)_{\text{lattice}}.$$
The reconstructed values (x′, y′) are obtained as follows:
$$(u, v, w)_{\text{lattice}} \;\to\; \sigma\,(u, v, w)_{\text{lattice}}\,M^{T} \;\to\; (x', y').$$
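The quantization and reconstruction steps can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the matrix M is an assumed choice with orthonormal rows that sum to zero, the error is taken as the original coordinate minus its rounded value, and the correction adjusts the components that were rounded furthest in the direction that caused the non-zero sum (the usual nearest-point rule for this lattice). All names are hypothetical.

```python
import math

SQRT2, SQRT6 = math.sqrt(2), math.sqrt(6)
# Assumed matrix with orthonormal rows, each summing to zero.
M = ((2 / SQRT6, -1 / SQRT6, -1 / SQRT6), (0.0, 1 / SQRT2, -1 / SQRT2))


def quantize(x, y, sigma):
    """Map a pixel location to integer lattice coordinates with u + v + w = 0."""
    q = [(x * M[0][k] + y * M[1][k]) / sigma for k in range(3)]
    cell = [math.floor(c + 0.5) for c in q]          # round to nearest integer
    delta = sum(cell)                                # zero if already in the plane
    err = [q[k] - cell[k] for k in range(3)]
    order = sorted(range(3), key=lambda k: err[k])   # ascending error
    if delta > 0:
        for k in order[:delta]:   # components rounded up the furthest
            cell[k] -= 1
    elif delta < 0:
        for k in order[delta:]:   # components rounded down the furthest
            cell[k] += 1
    return tuple(cell)


def reconstruct(cell, sigma):
    """Map lattice coordinates back to a pixel location (x', y')."""
    return tuple(
        sigma * sum(cell[k] * M[row][k] for k in range(3)) for row in range(2)
    )


cell = quantize(10.0, 7.0, sigma=4.0)
print(cell)                    # (2, 0, -2)
print(reconstruct(cell, 4.0))  # centre of the hexagonal cell containing (10, 7)
```

With this convention, the distance between a point and its reconstruction never exceeds σ·sqrt(2/3), the radius of a hexagonal cell.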
According to embodiments of the invention, several techniques may be used to enumerate and encode lattice points. One method follows the order in which hexagonal cells are encountered when the image coordinates (x, y) are raster-scanned. Alternatively, the method enumerates hexagonal cells in lexicographic order of their coordinate values.
In some implementations, the method scans the cells that contain image coordinates and counts the number of features mapped to each cell. After the histogram is computed, it may be mapped to a unique index, which is then encoded. As noted above, the number of possible histograms with m cells and a total of n is given by the multiset coefficient:
$$\left(\!\!\binom{m}{n}\!\!\right) = \binom{m+n-1}{n},$$
and the rate needed to represent the histogram index is
$$R(m,n) = \left\lceil \log_2 \binom{m+n-1}{n} \right\rceil$$
bits.
According to embodiments of the invention, various coding techniques may be applied to the histogram map 206 and the histogram counts 208 of FIG. 2. In one approach, the histogram is converted into a unique lexicographic index, which is then encoded with a fixed-length code of R(m, n) bits. As described by Y. A. Reznik in "An Algorithm for Quantization of Discrete Probability Distributions" (Proceedings of the Data Compression Conference (DCC '11), March 2011, pp. 333-343, incorporated herein by reference in its entirety), given a histogram with m bins, a total count of n, and individual counts $k_1, \dots, k_m$ in the bins, a unique index $I(k_1, \dots, k_m)$ can be obtained as follows:
$$I(k_1,\dots,k_m) = \sum_{j=1}^{m-2}\;\sum_{i=0}^{k_j-1}\binom{n - i - \sum_{l=1}^{j-1}k_l + m - j - 1}{m - j - 1} + k_{m-1}.$$
This formula follows by induction (starting with m = 2, 3, ...) and implements a lexicographic enumeration of types. For example,
$$I(0, 0, \dots, 0, n) = 0,$$
$$I(0, 0, \dots, 1, n-1) = 1,$$
and so on.
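A lexicographic index of this kind can be computed by counting, for each bin, the histograms that share the same prefix but hold a smaller count in that bin. The sketch below is one way to realize such an enumeration, assuming the first bin is the most significant position. It reproduces the two examples above, but it is not claimed to be the exact procedure of the cited paper, and the function name is hypothetical.

```python
from math import comb


def histogram_index(counts):
    """Lexicographic rank of a histogram among all histograms with the same
    number of bins m and the same total n."""
    m, remaining = len(counts), sum(counts)
    index = 0
    for j in range(m - 1):  # the last bin is determined by the others
        bins_after = m - j - 1
        for i in range(counts[j]):
            # histograms with the same prefix and value i in bin j:
            # distribute (remaining - i) items over the bins that follow
            index += comb(remaining - i + bins_after - 1, bins_after - 1)
        remaining -= counts[j]
    return index


print(histogram_index((0, 0, 4)))  # 0
print(histogram_index((0, 1, 3)))  # 1
print(histogram_index((4, 0, 0)))  # 14, the last of comb(6, 2) = 15 histograms
```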
In another approach, the empty blocks of the histogram map are converted into run lengths in raster-scan order. The run lengths are then coded with an entropy coder, which may use at least one of a Golomb-Rice code, a Huffman code, or an arithmetic code. In another approach, the method uses a variable-length coding scheme that captures the characteristics of the spatial distribution of key points. In yet another approach, the histogram values in several surrounding hexagonal cells are used as context. These context configurations are described further with reference to FIGS. 4a and 4b.
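The run-length approach can be illustrated with a short sketch. It assumes the histogram map has already been flattened into a raster-scan sequence of 0/1 occupancy flags and uses a plain Golomb-Rice code with a fixed parameter k; both function names and the choice of k are assumptions for illustration.

```python
def empty_run_lengths(occupancy):
    """Lengths of the runs of empty cells before each occupied cell,
    followed by the length of the trailing run of empty cells."""
    runs, run = [], 0
    for occupied in occupancy:
        if occupied:
            runs.append(run)
            run = 0
        else:
            run += 1
    runs.append(run)
    return runs


def rice_encode(value, k):
    """Golomb-Rice code word: unary quotient, a 0 separator, k-bit remainder."""
    quotient, remainder = value >> k, value & ((1 << k) - 1)
    return "1" * quotient + "0" + (format(remainder, f"0{k}b") if k else "")


runs = empty_run_lengths([0, 0, 1, 0, 1, 1, 0, 0, 0])
print(runs)                               # [2, 1, 0, 3]
print([rice_encode(r, 1) for r in runs])  # ['100', '01', '00', '101']
```

In practice the parameter k would be matched to the typical run length, which depends on the cell size and the number of features.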
FIGS. 4a-4b illustrate context configurations applied to adaptive statistical coding of histogram values according to some aspects of the invention. In FIG. 4a, to encode hexagonal cell X of the hexagonal grid, context information from first-order neighbors A, B, and C may be used to encode the hexagonal histogram map and histogram counts. In this example, the first-order neighbors A, B, and C are previously encoded hexagonal cells, and cell X is the subsequent hexagonal cell to be encoded. Similarly, in FIG. 4b, to encode hexagonal cell Y of the hexagonal grid, context information from first-order and second-order neighbors (A, B, C, D, E, F, G, H, and I) may be used to encode the hexagonal histogram map and histogram counts. The first-order and second-order neighbors A, B, C, D, E, F, G, H, and I are previously encoded hexagonal cells, and cell Y is the subsequent hexagonal cell to be encoded.
Note that, compared with a square lattice, a hexagonal lattice provides a better placement of the points that can serve as context. For example, in FIG. 4a, three first-order neighboring hexagonal cells A, B, and C can serve as context. In a square lattice, by contrast, only two such first-order neighbors are available, namely the square cell above and the square cell to the left, assuming the scan proceeds from left to right and from top to bottom.
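The role of the three first-order neighbors can be sketched with a simple adaptive model. The sketch below makes several assumptions that do not come from the figures: cells are addressed by axial coordinates (q, r) and scanned row by row, so that the previously coded first-order neighbors of (q, r) are (q-1, r), (q, r-1), and (q+1, r-1); the model keeps one pair of counts per context, and the ideal code length -log2(p) stands in for an arithmetic coder.

```python
from math import log2


def context_coded_bits(occupied, q_range, r_range):
    """Ideal adaptive code length, in bits, of a hexagonal occupancy map."""
    stats = {}  # context -> [count of empty, count of occupied], both start at 1
    bits = 0.0
    for r in r_range:
        for q in q_range:
            context = (
                (q - 1, r) in occupied,      # neighbor in the same row
                (q, r - 1) in occupied,      # two neighbors in the row above
                (q + 1, r - 1) in occupied,
            )
            counts = stats.setdefault(context, [1, 1])
            symbol = (q, r) in occupied
            bits += -log2(counts[symbol] / (counts[0] + counts[1]))
            counts[symbol] += 1              # adapt the model to what was seen
    return bits


clustered = {(1, 1), (2, 1), (1, 2), (2, 2)}
print(context_coded_bits(clustered, range(6), range(6)))
```

Because occupied cells tend to cluster, the contexts with occupied neighbors learn a higher probability of occupancy, which is the effect the context configurations are meant to exploit.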
Note that, compared with a square lattice, a hexagonal lattice yields a thinner covering of two-dimensional (2D) space, which improves the accuracy of the feature location representation. As shown in FIGS. 4a and 4b, mapping to hexagonal space is beneficial from the viewpoint of context modeling and entropy coding. Note also that converting image feature locations to hexagonal space does not change actual pixel values, which means it can be performed efficiently in terms of computational resources.
The following paragraphs analyze the benefit of using a hexagonal lattice for coding feature location information. One approach is to estimate the rate-distortion characteristic of the proposed scheme and compare it with that of a scheme that codes feature location information using a square lattice.
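Consider two lattice points, (0, 0, 0) and (0, 1, −1), and convert them back to the pixel domain. Recall that this conversion is carried out by the mapping
$$(u, v, w)_{\text{lattice}} \;\to\; \sigma\,(u, v, w)_{\text{lattice}}\,M^{T} \;\to\; (x', y'),$$
where σ is the scale parameter. This yields
$$(0, 0, 0) \;\to\; (0, 0)$$
and
$$(0, 1, -1) \;\to\; (0, \sigma\sqrt{2}).$$
The distance between these points in the pixel domain is:
$$d = \sigma\sqrt{2}.$$
Note that the same distance in the lattice domain corresponds to the height of the hexagonal cell shown in FIG. 3b:
$$h = \sigma\sqrt{2}.$$
The cell radius in the pixel domain can be expressed as:
$$r = \frac{h}{\sqrt{3}} = \sigma\sqrt{\frac{2}{3}}.$$
Similarly, the area occupied by a single hexagonal cell can be expressed as:
$$A = \frac{3\sqrt{3}}{2}\,r^{2} = \sqrt{3}\,\sigma^{2}.$$
An image of W × H pixels requires at least
$$m = \frac{WH}{A} = \frac{WH}{\sqrt{3}\,\sigma^{2}}$$
hexagonal cells to cover it. In this case, the quantization error in the L2 norm equals the covering radius:
$$\varepsilon = r = \sigma\sqrt{\frac{2}{3}}.$$
This in turn yields the relation
$$m = \frac{2\,WH}{3\sqrt{3}\,\varepsilon^{2}}$$
and the rate-distortion function:
$$R_{\text{hex}}(\varepsilon) \approx \frac{1}{n}\log_2\binom{m+n-1}{n}\bigg|_{m = \frac{2WH}{3\sqrt{3}\varepsilon^2}} \sim \log_2\frac{2\,WH}{3\sqrt{3}\,\varepsilon^{2}} - \frac{1}{n}\log_2 n!.$$
By comparison, the rate-distortion function for the square lattice is:
$$R_{\text{sq}}(\varepsilon) \sim \log_2\frac{WH}{2\,\varepsilon^{2}} - \frac{1}{n}\log_2 n!.$$
Therefore, the proposed quantization scheme saves about
$$\log_2\frac{3\sqrt{3}}{4} \approx 0.3774$$
bits per feature point while maintaining the same worst-case accuracy.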
FIG. 5 illustrates a comparison of feature location coding schemes using a square lattice and a hexagonal lattice according to some aspects of the invention. Curve 502 shows bits per feature location versus quantization error for the hexagonal-lattice coding scheme. Curve 504 shows bits per feature location versus quantization error for the square-lattice coding scheme. Both curves are for a VGA image with about 1,000 features. As shown in this example, when location coding operates at a bit rate of 5 bits per feature, the hexagonal-lattice scheme improves the bit rate by about 8.16% over the square-lattice scheme.
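The size of the gain can be checked from cell geometry alone. The sketch below assumes that both lattices are sized for the same worst-case error ε: a square cell with covering radius ε has area 2ε², and a regular hexagonal cell with covering radius ε has area (3·sqrt(3)/2)·ε². It further assumes that, at high fidelity, the rate per feature differs between the two schemes only by the logarithm of the ratio of cell counts. The reading of the 5 bits per feature figure in the last line is one interpretation that comes close to the quoted improvement, not a statement of how FIG. 5 was produced.

```python
from math import log2, sqrt


def cells_square(width, height, eps):
    """Number of square cells with covering radius eps needed to tile the image."""
    return width * height / (2 * eps**2)


def cells_hexagonal(width, height, eps):
    """Number of hexagonal cells with covering radius eps needed to tile the image."""
    return width * height / (1.5 * sqrt(3) * eps**2)


eps = 4.0
saving = log2(cells_square(640, 480, eps) / cells_hexagonal(640, 480, eps))
print(round(saving, 4))              # 0.3774 bits per feature, independent of eps
print(100 * saving / (5 - saving))   # roughly 8.2 percent relative to the lower rate
```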
Embodiments of the invention describe an improved technique for coding image feature location information. The technique uses a hexagonal lattice to quantize feature locations, constructs a histogram of the occurrences of feature locations in the lattice cells, and encodes this histogram. The performance of the technique is analyzed and compared with that of histogram coding using a square lattice (scalar quantization of the location parameters). The proposed scheme is shown to yield a significant improvement in the bit rate of location coding. The technique is suitable for implementation on mobile platforms.
The disclosed methods are applicable to mobile devices in which visual search and augmented reality (AR) systems rely on feature location information to perform a number of tasks. For example, feature location information may be used for 1) geometric verification of matches between images; 2) computing the parameters of the geometric transformation between views of the same object; 3) locating and projecting the boundaries of an object of interest; and 4) augmenting views of recognized objects in captured images or video with additional information, among other purposes.
In some cases, AR and visual search systems benefit if location information is represented in a compact and easy-to-use form. Compactness is especially important if the location information must be transmitted over a wireless network. Some loss of accuracy in the location information may also be acceptable, but only to a certain degree, because it can affect retrieval accuracy as well as the accuracy of localizing matched regions/objects and of the parameters of the geometric transformation.
FIG. 6a illustrates a block diagram of a mobile device configured to perform visual search according to some aspects of the invention. At the mobile device, antenna 602 receives modulated signals from a base station and provides the received signals to a demodulator (DEMOD) portion of modem 604. The demodulator processes (e.g., conditions and digitizes) the received signal and obtains input samples. It further performs orthogonal frequency-division multiplexing (OFDM) demodulation on the input samples and provides frequency-domain received symbols for all subcarriers. RX data processor 606 processes (e.g., symbol de-maps, de-interleaves, and decodes) the frequency-domain received symbols and provides decoded data to controller/processor 608 of the mobile device.
Controller/processor 608 may be configured to control the mobile device to communicate with a server via a wireless network. TX data processor 610 generates signaling symbols, data symbols, and pilot symbols, which may be processed by a modulator (MOD) of modem 604 and transmitted to the base station via antenna 602. In addition, controller/processor 608 directs the operation of the various processing units at the mobile device. Memory 612 may be configured to store program code and data for the mobile device. Image module 616 may be configured to obtain images. Visual search module 614 may be configured to implement the methods of coding feature location information of an image and the image retrieval methods described below.
According to embodiments of the present invention, content-based image retrieval (CBIR) can use an approach known as "bag of features" (BoF) or "bag of words" (BoW). The BoW approach originates from text document retrieval. To find a particular text document, such as a web page, it suffices to use a few well-chosen words. In the database, the document itself can likewise be represented by a "bag" of salient words, regardless of where in the document these words appear. For images, robust local features that are characteristic of a particular image take the role of "visual words". As in text retrieval, BoF image retrieval does not consider where in the image the features occur, at least in the initial stages of the retrieval pipeline.
Fig. 6b illustrates a method for image retrieval according to an embodiment of the present invention. In block 622, the method obtains a query image. In block 624, local image features/descriptors are extracted from the query image. In block 626, these descriptors are then matched against the descriptors of images stored in database 630. The descriptor matching function may further include matching local image features, selecting the images with the highest scores, and performing geometric verification. In block 628, the images that have many features in common with the query image are then selected and listed. The geometric verification step described below can be used to reject matches whose feature locations cannot be plausibly explained by a change in viewing position.
The method shown in Fig. 6b can be implemented as a pipeline for large-scale image retrieval. First, local features (also referred to as descriptors) are extracted from the query image. The set of local features is used to assess the similarity between the query image and the database images. To be useful for mobile applications, each feature should be robust against the geometric and photometric distortions encountered when the user takes the query photo from a different viewpoint and under different lighting than the corresponding database image. Next, the query features are matched to the features of images stored in the database. This can be accomplished by using special index structures that allow fast access to lists of images containing matching features. Based on the number of features they have in common with the query image, a short list of potentially similar images is selected from the database. Finally, a geometric verification step is applied to the most similar matches in the database. Geometric verification finds a coherent spatial pattern between the features of the query image and the features of the candidate database image to ensure that the match is correct.
Fig. 6c illustrates a method of coding feature location information of an image according to embodiments of the present invention. As shown in Fig. 6c, in block 632, the method generates a hexagonal grid that includes a plurality of hexagonal cells, and determines the size of the hexagonal cells in accordance with a predetermined quantization level of the feature location information (for example, 4, 5, or 6 bits per feature).
In block 634, the method quantizes the feature locations of the image using the hexagonal grid. For each feature location, the method generates transformed coordinates of the feature location from the two-dimensional plane to a three-dimensional space, rounds the transformed coordinates to the corresponding nearest integers, and verifies that the transformed coordinates belong to a hexagonal plane in the three-dimensional space. The transformed coordinates are verified by computing their sum and checking that the sum is equal to zero.
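The quantization step of block 634 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the particular 2D-to-3D transform (pointy-top hexagons with circumradius `size`), the function name `hex_quantize`, and the repair rule applied when the rounded coordinates no longer sum to zero are all assumptions borrowed from the common cube-coordinate convention for hexagonal grids.

```python
import math

def hex_quantize(x, y, size):
    """Return integer cube coordinates (a, b, c) of the hexagonal cell containing (x, y)."""
    # Assumed 2D -> 3D transform; every transformed point lies on the plane a + b + c = 0.
    a = (math.sqrt(3) / 3 * x - y / 3) / size
    c = (2 / 3 * y) / size
    b = -a - c
    # Round each coordinate to its nearest integer.
    ra, rb, rc = round(a), round(b), round(c)
    # Verify that the rounded point still lies on the plane (sum equal to zero).
    if ra + rb + rc != 0:
        # Assumed repair: recompute the coordinate with the largest rounding error.
        da, db, dc = abs(ra - a), abs(rb - b), abs(rc - c)
        if da > db and da > dc:
            ra = -rb - rc
        elif db > dc:
            rb = -ra - rc
        else:
            rc = -ra - rb
    return ra, rb, rc
```

The three integers are redundant (any two determine the third), which is what makes the zero-sum check a cheap consistency test after rounding.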
In block 636, the method generates a histogram to record the occurrences of feature locations in each hexagonal cell. The histogram includes a histogram map configured to indicate the occurrence of feature locations in each hexagonal cell, and a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
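As a rough sketch of block 636, the histogram map and histogram count can be derived from a list of quantized cell identifiers. The function name `build_histogram` and the choice of a set plus a dictionary as the representation are illustrative assumptions, not structures prescribed by the description.

```python
from collections import Counter

def build_histogram(cells):
    """cells: iterable of hashable cell identifiers, one per feature location."""
    counts = Counter(cells)
    hist_map = set(counts)        # which cells contain at least one feature
    hist_count = dict(counts)     # how many features fall in each occupied cell
    return hist_map, hist_count
```

For example, three features falling in cells `(0, 0, 0)`, `(1, -1, 0)` and `(0, 0, 0)` give a map with two occupied cells and a count of 2 for cell `(0, 0, 0)`.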
In block 638, the method encodes the histogram in accordance with the occurrences of feature locations in each hexagonal cell. The method converts the histogram into a unique lexicographic index and encodes the unique lexicographic index using a fixed-length code. In addition, the method converts empty blocks of the histogram into run lengths in raster scan order and encodes the run lengths using an entropy coder. The entropy coder can use Golomb-Rice codes, Huffman codes, or arithmetic codes.
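The run-length step of block 638 can be illustrated with a short sketch. It assumes the histogram map has already been flattened into a sequence of 0/1 flags in raster scan order; the function name `empty_run_lengths` and the convention of returning the trailing run separately are hypothetical choices made for this example.

```python
def empty_run_lengths(occupancy):
    """occupancy: sequence of 0/1 flags in raster scan order (1 = occupied cell)."""
    runs, run = [], 0
    for occupied in occupancy:
        if occupied:
            runs.append(run)   # number of empty cells preceding this occupied cell
            run = 0
        else:
            run += 1
    return runs, run           # run lengths, plus the count of trailing empty cells
```

The resulting run lengths are small non-negative integers, which is the kind of input an entropy coder such as a Golomb-Rice coder handles well.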
In another approach, encoding the histogram can apply context information of neighboring hexagonal cells to encode the information of a subsequent hexagonal cell to be encoded in the histogram. The context information includes context information from the first-order neighboring cells and from the second-order neighboring cells of the subsequent hexagonal cell to be encoded. The context information is used as input to an arithmetic coder.
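One way to picture the context described above is to count the occupied cells in the first and second rings around the cell about to be coded. The sketch below assumes cells are addressed by integer cube coordinates that sum to zero; the function names, and the use of plain occupancy counts as the context, are assumptions for illustration and are not taken from the description.

```python
def ring(cell, radius):
    """Cells at exactly `radius` hexagonal steps from `cell` (cube coordinates)."""
    a, b, c = cell
    out = []
    for da in range(-radius, radius + 1):
        for db in range(-radius, radius + 1):
            dc = -da - db
            if max(abs(da), abs(db), abs(dc)) == radius:
                out.append((a + da, b + db, c + dc))
    return out

def context(cell, coded_occupied):
    """Count occupied, already-coded cells among first- and second-order neighbors."""
    first = sum(n in coded_occupied for n in ring(cell, 1))
    second = sum(n in coded_occupied for n in ring(cell, 2))
    return first, second
```

A hexagonal cell has 6 first-order and 12 second-order neighbors, so the pair returned by `context` could index a small table of adaptive probability models for an arithmetic coder.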
According to embodiments of the present invention, Golomb-Rice coding is a lossless data compression method that uses a family of data compression codes, in which alphabets following a geometric distribution can have a Golomb-Rice code as a prefix code in an adaptive coding scheme. Golomb-Rice codes have a tunable parameter that is a power of two, which makes them convenient for use on a computer, because multiplication and division by two can be implemented more efficiently in binary arithmetic. Huffman coding uses a variable-length code table to encode source symbols for lossless data compression. The variable-length code table can be derived from the estimated probability of occurrence of each possible value of the source symbol. Huffman coding uses a specific method for choosing the representation of each symbol, resulting in a prefix code that expresses the most common source symbols using shorter bit strings than those used for less common source symbols. For a set of symbols with a uniform probability distribution and a number of members that is a power of two, Huffman coding is equivalent to binary block coding. Arithmetic coding is a form of variable-length entropy coding used for lossless data compression. A string of characters can be represented using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic coding, frequently used characters are stored with fewer bits and less frequently occurring characters are stored with more bits, so that fewer bits are used in total. Arithmetic coding differs from other forms of entropy coding (such as Huffman coding) in that, rather than separating the input into component symbols and replacing each component symbol with a code, arithmetic coding encodes the entire message into a single number, namely a fraction n, where 0.0 ≤ n < 1.0.
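To make the Golomb-Rice code concrete, here is a minimal sketch of the Rice variant, in which the tunable parameter is 2^k. The function names and the bit-string representation are illustrative assumptions; a practical coder would pack bits rather than build strings.

```python
def rice_encode(n, k):
    """Encode a non-negative integer n with Rice parameter k (divisor 2**k)."""
    q, r = n >> k, n & ((1 << k) - 1)      # quotient and remainder via shifts and masks
    bits = "1" * q + "0"                   # quotient in unary, terminated by a zero
    if k:
        bits += format(r, "0{}b".format(k))  # remainder in k binary digits
    return bits

def rice_decode(bits, k):
    """Decode a single Rice codeword produced by rice_encode."""
    q = bits.index("0")                    # length of the unary prefix
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r
```

With k = 1, the value 5 splits into quotient 2 and remainder 1, so `rice_encode(5, 1)` yields the codeword `1101`. Only shifts and masks are needed, which reflects the efficiency argument made above for power-of-two parameters.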
Figs. 7a-7b illustrate exemplary implementations of visual search functionality according to embodiments of the present invention. The methods of coding feature location information described in the present invention can be implemented in a client and server environment as shown in Fig. 7a and Fig. 7b.
As shown in Figure 7 a, the system includes mobile device 702 (such as mobile phone), visual search server 704 With wireless network 706.Mobile device 702 includes image capture module 703, image coding module 705 and process and display result Module 707.Visual search server 704 includes image decoder module 711, descriptor extraction module 713, descriptors match module 715th, Search Results module 717 and database 719.The component of mobile device 702, wireless network 706 and visual search server 704 are communicatively coupled, as shown in the flow chart of Fig. 7 a.Mobile device 702 analyzes query image, extracts topography special Levy (descriptor), and emission characteristic data.Search method is being regarded using emitted feature as inquiry with performing search Feel and run on search server 704.
In the example shown in Fig. 7b, the system includes mobile device 722 (shown as a mobile phone), visual search server 724 and wireless network 726. Mobile device 722 includes image capture module 723, descriptor extraction module 725, descriptor encoding module 727, descriptor matching module 729, decision branch 731, process and display results module 733, and local database (D/B) or cache 735. Visual search server 724 includes descriptor decoding module 741, descriptor matching module 743, search results module 745 and database 747. The components of mobile device 722, wireless network 726 and visual search server 724 are communicatively coupled as shown in the flow diagram of Fig. 7b. Mobile device 722 maintains a cache of the database and performs image matching locally. If no match is found, mobile device 722 sends a query request to visual search server 724. In this way, the amount of data sent over the network is further reduced.
In each of the cases of Fig. 7a and Fig. 7b, the retrieval framework is adapted to stringent mobile system requirements. The processing on the mobile device needs to be fast and economical in terms of power consumption. The size of the data transmitted over the network needs to be as small as possible to minimize network latency and thus provide the best user experience. The methods used for retrieval need to scale to potentially very large databases and be capable of delivering accurate results with low latency. In addition, the retrieval system needs to be robust, so that objects captured under a wide range of conditions can be reliably recognized, including different distances, viewing angles and lighting conditions, or in the presence of partial occlusion or motion blur.
The feature extraction process identifies salient interest points in the image. For robust image matching, these interest points need to be repeatable under perspective transformations (such as scale changes, rotation and translation) and lighting variations. To achieve scale invariance, interest points can be computed at multiple scales using an image pyramid. To achieve rotation invariance, the patch around each interest point is oriented in the direction of the dominant gradient. The gradients in each patch are further normalized to make them robust to illumination changes.
It should be noted that different interest point detectors offer different trade-offs between repeatability and complexity. For example, the difference-of-Gaussian (DoG) points generated by SIFT may be slow to compute, but they are highly repeatable, whereas a corner detector approach may be fast but offer lower repeatability. Among the various approaches that can achieve a good trade-off between repeatability and complexity is a Hessian-blob detector accelerated with integral images. Using this approach for VGA images, interest point detection can be carried out in less than about one second on some current mobile phones.
After interest point detection, "visual word" descriptors are computed using small image patches around these points. The challenge in computing feature descriptors is to make them highly discriminative, that is, characteristic of an image or of a small set of images. Descriptors that occur in almost every image (for example, the equivalent of the word "and" in text documents) would not be useful for retrieval.
In one implementation, the process of computing a descriptor is as follows:
● The patch is divided into several (for example, 5 to 9) spatially localized bins;
● The joint (dx, dy) gradient histogram in each spatial bin is then computed. CHoG histogram binning exploits the typical skew in the gradient statistics observed for patches extracted around keypoints; and
● The histogram of gradients from each spatial bin is quantized and stored as part of the descriptor.
In the above implementation for extracting features of an image, interest points (for example, corners and blobs) are extracted at different scales. The patches at different scales are oriented along the dominant gradient. The descriptor is computed using canonically oriented and normalized patches. The patch is divided into localized spatial bins, which provides robustness to interest point localization error. The distribution of gradients in each spatial bin is compressed directly to obtain a compact description of the patch.
The use of histograms allows information distance measures, such as the KL divergence, to be used for evaluating the degree of mismatch between image features. Histograms also allow simple and efficient encoding. In some examples, only 50 to 60 bits are needed to turn each patch into a compressed histogram-based descriptor.
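As a brief illustration of such a distance measure, the KL divergence between two normalized histograms can be computed as below. The sketch assumes both inputs are probability vectors of equal length and that `q` is strictly positive wherever `p` is; the function name is hypothetical, and a practical system would typically smooth the histograms first.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats between two normalized histograms."""
    # Bins where p is zero contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The measure is zero when the two histograms are identical and grows as they diverge. It is not symmetric, so `kl_divergence(p, q) + kl_divergence(q, p)` is one common way to obtain a symmetric variant.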
Mobile AR and visual search systems that transmit or store local image features need to efficiently encode (and/or multiplex) sets of features and feature location information. The feature location information also needs to be encoded, because it is required for geometric verification. In one approach, at least 500 local features are typically needed to achieve matching accuracy. These features are usually highly correlated spatially. As shown in Figs. 2 to 4 above, encoding of the feature location information can be accomplished by first quantizing it into a 2D histogram and then using context-based arithmetic coding techniques to exploit the spatial correlation. This technique can achieve a coding rate of about 5 bits/feature while delivering a sufficiently high precision of representation of the feature location information.
Encoding of the entire set of local features and their corresponding locations can be accomplished by first transmitting the feature location histogram and then transmitting the features in the order in which their locations appear when the histogram is decoded. For example, if the histogram indicates that block (x, y) contains three features, the encoder can output the codes of the three corresponding descriptors sequentially in the bit stream.
Using compact descriptors (such as the descriptors described above) together with feature location coding, a query image with 500 features can be represented by about 4 KB (500 × (60 + 5)/8 bytes). Considering that a JPEG-compressed query image typically takes about 40-80 KB, the disclosed method represents an order-of-magnitude reduction in bit rate.
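The size estimate above can be checked with a few lines of arithmetic. The function name and default arguments are illustrative; the figures themselves (500 features, 60 bits per descriptor, 5 bits per location) come from the text.

```python
def query_size_bytes(num_features=500, descriptor_bits=60, location_bits=5):
    """Total query payload in bytes for descriptors plus coded locations."""
    return num_features * (descriptor_bits + location_bits) / 8
```

With the default values this evaluates to 4062.5 bytes, which is roughly 4 KB, or between one tenth and one twentieth of a 40-80 KB JPEG query image.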
To index and match the features of images in a large image database, the disclosed embodiments use a data structure that returns a short list of the database candidates that are likely to match the query image. The short list may contain false positives, as long as the correct match is included. The slower pairwise comparison can then be performed only on the short list of candidates rather than on the entire database.
Various data structures can be used for indexing the local features in an image database. One approach is to use approximate nearest neighbor (ANN) search of SIFT descriptors with a best-bin-first strategy. In addition, a bag of features (BoF) model can be used. The BoF codebook is constructed by k-means clustering of a training set of descriptors. During a query, scoring of the database images can be performed by using an inverted file index associated with the BoF codebook. To generate a large codebook, hierarchical k-means clustering can be used to create a vocabulary tree (VT). Other search techniques, such as locality-sensitive hashing (LSH) and improvements on traditional tree-based approaches, can also be used.
Fig. 8a illustrates a method of building a vocabulary tree by hierarchical k-means clustering of training feature descriptors according to embodiments of the present invention. The vocabulary tree shown in this example has 2 levels. With a branching factor of k = 3, the vocabulary tree has k^2 = 9 leaf nodes. Fig. 8b illustrates a vocabulary tree and the associated inverted index according to embodiments of the present invention. The inverted index contains lists of images, together with counters that indicate the number of features in each image that follow the same path in the vocabulary tree.
As shown in Figs. 8a-8b, a vocabulary tree (VT) and its associated inverted index structure are used for indexing and matching image features. As illustrated in Fig. 8a, the VT for a database can be built by performing hierarchical k-means clustering on a set of training feature descriptors that is representative of the database. First, k large clusters are generated for all training descriptors. This is done by applying the k-means algorithm (quantizing into k cells) with an appropriate distance function, such as the L2 norm or a symmetric form of the KL divergence. Then, for each large cluster, k-means clustering is applied to the training descriptors assigned to that cluster to generate k smaller clusters. This recursive division of the descriptor space is repeated until there are enough bins to ensure good classification performance. For example, in practice, a VT design with a height of 6 and a branching factor of k = 10, producing one million (10^6) leaf nodes, can be used.
The inverted index associated with the VT maintains two lists for each leaf node, as shown in Fig. 8b. For leaf node x, there is a sorted array of image identifiers $\{i_{x1}, \ldots, i_{xN_x}\}$, which indicates which $N_x$ database images have features that belong to the cluster associated with this node. Similarly, there is a corresponding array of counters $\{c_{x1}, \ldots, c_{xN_x}\}$, which indicates the number of features in each corresponding image that belong to the same cluster.
During a query, the VT is traversed for each feature in the query image, ending each time at one of the leaf nodes. The corresponding lists of images and frequency counts are then used to compute similarity scores between these images and the query image. These scores can be computed using the standard term frequency-inverse document frequency (TF-IDF) weighting scheme. By pulling images from all of these lists and sorting them according to their scores, a subset of database images that is likely to contain a true match to the query image can be derived. Because only a small number of lookups needs to be performed for each query feature, and the lists of all relevant files are available directly from the inverted index, this scheme can scale to support large databases.
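The scoring step can be sketched as follows, under several simplifying assumptions: the tree traversal has already mapped each query feature to a leaf identifier, the inverted index is a dictionary from leaf identifier to per-image feature counts, and no vector normalization is applied. The function name and data layout are hypothetical, and the weighting is one simple TF-IDF variant rather than the specific scheme of any particular system.

```python
import math
from collections import defaultdict

def score_images(query_leaves, inverted_index, num_db_images):
    """Rank database images by an unnormalized TF-IDF score.

    query_leaves: leaf ids reached by the query features.
    inverted_index: {leaf_id: {image_id: feature_count}}.
    """
    query_tf = defaultdict(int)
    for leaf in query_leaves:
        query_tf[leaf] += 1
    scores = defaultdict(float)
    for leaf, q_count in query_tf.items():
        postings = inverted_index.get(leaf, {})
        if not postings:
            continue
        # Leaves seen in few database images are weighted more heavily.
        idf = math.log(num_db_images / len(postings))
        for image_id, count in postings.items():
            scores[image_id] += (q_count * idf) * (count * idf)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Only the posting lists of leaves actually reached by the query are touched, which is why the lookup cost stays small even for a large database.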
Geometric verification is performed after feature matching. At this stage, the location information of the features in the query image and the database images is used to confirm that the feature matches are consistent with a change in viewpoint between the two images. The geometric transformation between the query image and a database image is estimated using regression techniques. The transformation is usually represented by the fundamental matrix, which incorporates 3D geometry, or by a homography or affine model.
It should be noted that paragraph [0090], Fig. 2, Figs. 6a-6c and their corresponding descriptions provide means for generating a hexagonal grid that includes a plurality of hexagonal cells, means for quantizing feature locations of an image using the hexagonal grid, means for generating a histogram to record occurrences of feature locations in each hexagonal cell, and means for encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell. Paragraph [0090], Fig. 2, Figs. 3a-3b, Figs. 6a-6c and their corresponding descriptions provide means for generating transformed coordinates of a feature location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed coordinates to corresponding nearest integers, and means for verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space. Paragraph [0090], Fig. 2, Figs. 6a-6c and their corresponding descriptions provide means for generating a histogram map configured to indicate the occurrence of feature locations in each hexagonal cell, and means for generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell. Paragraph [0090], Figs. 4a-4b, Figs. 6a-6c and their corresponding descriptions provide means for applying context information of neighboring hexagonal cells to encode the information of a subsequent hexagonal cell to be encoded in the histogram.
The methods and mobile devices described herein can be implemented by various means depending on the application. For example, these methods can be implemented in hardware, firmware, software, or a combination thereof. For a hardware implementation, the processing units can be implemented within one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof. Herein, the term "control logic" encompasses logic implemented by software, hardware, firmware, or a combination thereof.
For a firmware and/or software implementation, the methods can be implemented with modules (for example, procedures, functions, and so on) that perform the functions described herein. Any machine-readable medium tangibly embodying instructions can be used in implementing the methods described herein. For example, software code can be stored in a memory and executed by a processing unit. Memory can be implemented within the processing unit or external to the processing unit. As used herein, the term "memory" refers to any type of long-term, short-term, volatile, non-volatile, or other storage device and is not to be limited to any particular type of memory or number of memories, or to any particular type of media upon which memory is stored.
If implemented in firmware and/or software, the functions can be stored as one or more instructions or code on a computer-readable medium. Examples include computer-readable media encoded with a data structure and computer-readable media encoded with a computer program. Computer-readable media may take the form of an article of manufacture. Computer-readable media include physical computer storage media. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
In addition to storage on a computer-readable medium, instructions and/or data can be provided as signals on transmission media included in a communication apparatus. For example, a communication apparatus may include a transceiver having signals indicative of instructions and data. The instructions and data are configured to cause one or more processors to implement the functions outlined in the claims. That is, the communication apparatus includes transmission media with signals indicative of information to perform the disclosed functions. At a first time, the transmission media included in the communication apparatus may include a first portion of the information to perform the disclosed functions, while at a second time the transmission media included in the communication apparatus may include a second portion of the information to perform the disclosed functions.
The present invention can coordinate such as wireless wide area network (WWAN), WLAN (WLAN), Wireless Personal Network (WPAN) Implement etc. various cordless communication networks.Term " network " and " system " usually used interchangeably.Term " position " and " place " are usually Used interchangeably.WWAN can be CDMA (CDMA) network, time division multiple acess (TDMA) network, frequency division multiple access (FDMA) network, OFDM (OFDMA) network, single-carrier frequency division multiple access (SC-FDMA) network, Long Term Evolution (LTE) network, WiMAX (IEEE802.16) network etc..Cdma network can implement one or more radio access technologies (RAT), for example Cdma2000, wideband CDMA (W-CDMA) etc..Cdma2000 includes IS-95, IS2000 and IS-856 standard.TDMA networks can be real Apply global system for mobile communications (GSM), digital advanced mobile phone system (D-AMPS) or a certain other RAT.GSM and W-CDMA It is described in the document of the association from entitled " third generation partner program " (3GPP).Cdma2000 is described in from entitled In the document of the association of " third generation partner program 2 " (3GPP2).3GPP and 3GPP2 documents are publicly available.WLAN Can be IEEE802.11x networks, and WPAN can be blueteeth network, IEEE802.15x or some other type of networks. The technology can also coordinate any combinations of WWAN, WLAN and/or WPAN to implement.
A mobile station refers to a device such as a cellular or other wireless communication device, a personal communication system (PCS) device, a personal navigation device (PND), a personal information manager (PIM), a personal digital assistant (PDA), a laptop computer, or another suitable mobile device capable of receiving wireless communication and/or navigation signals. The term "mobile station" is also intended to include devices that communicate with a personal navigation device (PND), for example by short-range wireless, infrared, wireline, or other connection, regardless of whether satellite signal reception, assistance data reception, and/or position-related processing occurs at the device or at the PND. Also, "mobile station" is intended to include all devices, including wireless communication devices, computers, laptop computers, and so on, that are capable of communicating with a server, for example via the Internet, Wi-Fi, or another network, regardless of whether satellite signal reception, assistance data reception, and/or position-related processing occurs at the device, at a server, or at another device associated with the network. Any operable combination of the above is also considered a "mobile station".
A statement that something is "optimized" or "required", or other such language, does not indicate that the present disclosure applies only to systems that are optimized, or to systems in which the "required" elements are present (or to other limitations due to other such language). Such language refers only to the particular described embodiment. Of course, many implementations are possible. The techniques can be used with protocols other than those discussed herein, including protocols that are in development or yet to be developed.
Those skilled in the relevant art will recognize that many possible modifications and combinations of the disclosed embodiments can be used while still employing the same basic underlying mechanisms and methods. The foregoing description has, for purposes of explanation, been written with reference to specific embodiments. However, the illustrative discussion above is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to explain the principles of the invention and its practical applications, and to enable others skilled in the art to best utilize the invention and the various embodiments with various modifications as suited to the particular use contemplated.

Claims (34)

1. A method of coding feature location information of an image, comprising:
Generating a hexagonal grid, wherein the hexagonal grid includes a plurality of hexagonal cells;
Quantizing feature locations of the image using the hexagonal grid, wherein quantizing the feature locations comprises: for each feature location, generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, rounding the transformed coordinates to corresponding nearest integers, and verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
Generating a histogram to record occurrences of feature locations in each hexagonal cell; and
Encoding the histogram in accordance with the occurrences of feature locations in each hexagonal cell.
2. The method according to claim 1, wherein generating the hexagonal grid comprises:
Determining the size of the hexagonal cells in accordance with a predetermined quantization level of the feature location information.
3. The method according to claim 1, wherein verifying the transformed coordinates comprises:
Computing a sum of the transformed coordinates; and
Verifying that the sum of the transformed coordinates is equal to zero.
4. The method according to claim 1, wherein generating the histogram comprises:
Generating a histogram map configured to indicate the occurrence of feature locations in each hexagonal cell.
5. The method according to claim 4, further comprising:
Generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
6. method according to claim 1, wherein carry out coding to histogram including:
The histogram is converted into unique dictionary formula index;And
Unique dictionary formula is indexed using fixed length code is encoded.
7. The method according to claim 1, wherein encoding the histogram further comprises:
converting empty blocks of the histogram into run lengths in raster scan order; and
encoding the run lengths using an entropy encoder.
8. The method according to claim 7, wherein the entropy encoder uses Golomb-Rice codes.
9. The method according to claim 7, wherein the entropy encoder uses Huffman codes.
10. The method according to claim 7, wherein the entropy encoder uses arithmetic codes.
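The run-length approach of claims 7 and 8 can be sketched as follows. The input is assumed to be a binary occupancy list already arranged in raster scan order, and the Rice parameter `k` is a free choice here; neither the function names nor the bit conventions (unary part written as ones terminated by a zero) come from the claims.

```python
def empty_run_lengths(histogram_map):
    """Lengths of the runs of empty cells before each occupied cell.

    A final entry records the run of empty cells after the last occupied one.
    """
    runs, run = [], 0
    for bit in histogram_map:
        if bit == 0:
            run += 1
        else:
            runs.append(run)
            run = 0
    runs.append(run)
    return runs

def golomb_rice(value, k):
    """Golomb-Rice code word of a non-negative integer with parameter k."""
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    unary = "1" * quotient + "0"
    return unary + (format(remainder, f"0{k}b") if k > 0 else "")

def encode_runs(histogram_map, k):
    """Concatenate the code words of all run lengths."""
    return "".join(golomb_rice(r, k) for r in empty_run_lengths(histogram_map))
```

A small `k` suits densely occupied grids with short runs, while a larger `k` suits sparse grids. A Huffman or arithmetic coder, as in claims 9 and 10, could replace `golomb_rice` without changing the run-length step.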
11. The method according to claim 1, wherein encoding the histogram further comprises:
applying context information of adjacent hexagonal cells to encode the histogram information of a subsequent hexagonal cell to be encoded.
12. The method according to claim 11, wherein the context information comprises:
context information from first-order neighboring cells of the subsequent hexagonal cell to be encoded.
13. The method according to claim 12, wherein the context information further comprises:
context information from second-order neighboring cells of the subsequent hexagonal cell to be encoded.
14. The method according to claim 11, wherein the context information is used as an input to an arithmetic encoder.
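The context-based coding of claims 11 to 14 can be illustrated with a simplified model. The sketch below assumes cells are addressed by three integer coordinates that sum to zero, takes the number of occupied, already-coded first-order neighbors as the context, and reports the ideal code length that an adaptive arithmetic encoder driven by that context would approach. It does not implement the arithmetic encoder itself, and all names are hypothetical.

```python
import math

# Offsets of the six first-order neighbors of a hexagonal cell.
NEIGHBORS = [(1, -1, 0), (1, 0, -1), (0, 1, -1),
             (-1, 1, 0), (-1, 0, 1), (0, -1, 1)]

def context_of(cell, coded):
    """Count occupied neighbors among the cells that were already coded."""
    a, b, c = cell
    return sum(coded.get((a + da, b + db, c + dc), 0) for da, db, dc in NEIGHBORS)

def estimate_bits(cells_in_scan_order, occupied):
    """Ideal code length, in bits, of the occupancy flags under the context model."""
    coded = {}   # cell -> occupancy flag, for cells already processed
    stats = {}   # context -> [count of zeros, count of ones]
    bits = 0.0
    for cell in cells_in_scan_order:
        ctx = context_of(cell, coded)
        zeros_ones = stats.setdefault(ctx, [1, 1])  # start from uniform counts
        bit = 1 if cell in occupied else 0
        probability = zeros_ones[bit] / sum(zeros_ones)
        bits += -math.log2(probability)  # cost charged by an ideal arithmetic coder
        zeros_ones[bit] += 1             # adapt the model for this context
        coded[cell] = bit
    return bits
```

Only previously coded cells contribute to the context, so a decoder can rebuild the same probabilities. Second-order neighbors, as in claim 13, could be added by extending the offset list.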
15. A mobile device, comprising:
an image module configured to obtain an image;
a visual search module configured to generate encoded feature location information of the image; and
a controller configured to transmit the encoded feature location information of the image to a server via a wireless network;
wherein the visual search module is further configured to:
generate a hexagonal mesh, wherein the hexagonal mesh includes a plurality of hexagonal cells;
quantize feature locations of the image using the hexagonal mesh, wherein quantizing the feature locations comprises: for each feature location, generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, rounding the transformed coordinates to the corresponding nearest integers, and verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
generate a histogram to record occurrences of feature locations in each hexagonal cell; and
encode the histogram according to the occurrences of feature locations in each hexagonal cell.
16. The mobile device according to claim 15, wherein generating the hexagonal mesh comprises:
determining the size of the hexagonal cells according to a predetermined quantization level of the feature location information.
17. The mobile device according to claim 15, wherein verifying the transformed coordinates comprises:
calculating the sum of the transformed coordinates; and
verifying that the sum of the transformed coordinates is equal to zero.
18. The mobile device according to claim 15, wherein generating the histogram comprises:
generating a histogram map configured to contain the occurrences of feature locations in each hexagonal cell.
19. The mobile device according to claim 18, wherein the visual search module is further configured to:
generate a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
20. The mobile device according to claim 15, wherein encoding the histogram comprises:
converting the histogram into a unique lexicographic index; and
encoding the unique lexicographic index using a fixed-length code.
21. The mobile device according to claim 15, wherein encoding the histogram further comprises:
converting empty blocks of the histogram into run lengths in raster scan order; and
encoding the run lengths using an entropy encoder.
22. The mobile device according to claim 15, wherein encoding the histogram further comprises:
applying context information of adjacent hexagonal cells to encode the histogram information of a subsequent hexagonal cell to be encoded.
23. The mobile device according to claim 22, wherein the context information comprises:
context information from first-order neighboring cells of the subsequent hexagonal cell to be encoded.
24. The mobile device according to claim 23, wherein the context information further comprises:
context information from second-order neighboring cells of the subsequent hexagonal cell to be encoded.
25. A mobile device, comprising:
an image module configured to obtain an image;
a visual search module configured to generate encoded feature location information of the image; and
a controller configured to transmit the encoded feature location information of the image to a server via a wireless network;
wherein the visual search module comprises:
means for generating a hexagonal mesh, wherein the hexagonal mesh includes a plurality of hexagonal cells;
means for quantizing feature locations of the image using the hexagonal mesh, wherein the means for quantizing the feature locations comprises: for each feature location, means for generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed coordinates to the corresponding nearest integers, and means for verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
means for generating a histogram to record occurrences of feature locations in each hexagonal cell; and
means for encoding the histogram according to the occurrences of feature locations in each hexagonal cell.
26. The mobile device according to claim 25, wherein the means for generating the histogram comprises:
means for generating a histogram map configured to contain the occurrences of feature locations in each hexagonal cell.
27. The mobile device according to claim 26, further comprising:
means for generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
28. The mobile device according to claim 25, wherein the means for encoding the histogram further comprises:
means for applying context information of adjacent hexagonal cells to encode the histogram information of a subsequent hexagonal cell to be encoded.
29. The mobile device according to claim 28, wherein the context information comprises:
context information from first-order neighboring cells of the subsequent hexagonal cell to be encoded.
30. The mobile device according to claim 29, wherein the context information further comprises:
context information from second-order neighboring cells of the subsequent hexagonal cell to be encoded.
31. An apparatus for coding feature location information of an image, comprising:
means for generating a hexagonal mesh, wherein the hexagonal mesh includes a plurality of hexagonal cells;
means for quantizing feature locations of the image using the hexagonal mesh, wherein the means for quantizing the feature locations comprises: for each feature location, means for generating transformed coordinates of the feature location from a two-dimensional plane to a three-dimensional space, means for rounding the transformed coordinates to the corresponding nearest integers, and means for verifying that the transformed coordinates belong to a hexagonal plane in the three-dimensional space;
means for generating a histogram to record occurrences of feature locations in each hexagonal cell; and
means for encoding the histogram according to the occurrences of feature locations in each hexagonal cell.
32. The apparatus according to claim 31, wherein the means for generating the histogram comprises:
means for generating a histogram map configured to contain the occurrences of feature locations in each hexagonal cell.
33. The apparatus according to claim 32, further comprising:
means for generating a histogram count configured to describe the number of occurrences of feature locations in each hexagonal cell.
34. The apparatus according to claim 32, wherein the means for encoding the histogram further comprises:
means for applying context information of adjacent hexagonal cells to encode the histogram information of a subsequent hexagonal cell to be encoded.
CN201280038785.0A 2011-08-10 2012-07-31 The decoding of feature locations information Expired - Fee Related CN103843011B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201161522171P 2011-08-10 2011-08-10
US61/522,171 2011-08-10
US13/229,654 2011-09-09
US13/229,654 US8571306B2 (en) 2011-08-10 2011-09-09 Coding of feature location information
PCT/US2012/049055 WO2013022656A2 (en) 2011-08-10 2012-07-31 Coding of feature location information

Publications (2)

Publication Number Publication Date
CN103843011A CN103843011A (en) 2014-06-04
CN103843011B true CN103843011B (en) 2017-05-31

Family

ID=46634570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280038785.0A Expired - Fee Related CN103843011B (en) 2011-08-10 2012-07-31 The decoding of feature locations information

Country Status (6)

Country Link
US (1) US8571306B2 (en)
EP (1) EP2742486A2 (en)
JP (1) JP5911578B2 (en)
KR (1) KR101565265B1 (en)
CN (1) CN103843011B (en)
WO (1) WO2013022656A2 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9396421B2 (en) 2010-08-14 2016-07-19 Rujan Entwicklung Und Forschung Gmbh Producing, capturing and using visual identification tags for moving objects
CN103858433B (en) * 2011-08-25 2017-08-15 汤姆逊许可公司 Layered entropy encoding and decoding
US20130114900A1 (en) * 2011-11-07 2013-05-09 Stanford University Methods and apparatuses for mobile visual search
US9412020B2 (en) * 2011-11-09 2016-08-09 Board Of Regents Of The University Of Texas System Geometric coding for billion-scale partial-duplicate image search
JP6168303B2 (en) * 2012-01-30 2017-07-26 日本電気株式会社 Information processing system, information processing method, information processing apparatus and control method and control program thereof, communication terminal and control method and control program thereof
US9449249B2 (en) * 2012-01-31 2016-09-20 Nokia Corporation Method and apparatus for enhancing visual search
EP2801190B1 (en) * 2012-04-20 2018-08-15 Huawei Technologies Co., Ltd. Method for processing an image
CA2900841C (en) 2013-01-16 2018-07-17 Huawei Technologies Co., Ltd. Context based histogram map coding for visual search
KR102113813B1 (en) * 2013-11-19 2020-05-22 한국전자통신연구원 Apparatus and Method Searching Shoes Image Using Matching Pair
US10423596B2 (en) * 2014-02-11 2019-09-24 International Business Machines Corporation Efficient caching of Huffman dictionaries
WO2015164724A1 (en) * 2014-04-24 2015-10-29 Arizona Board Of Regents On Behalf Of Arizona State University System and method for quality assessment of optical colonoscopy images
ES2898868T3 (en) * 2015-06-23 2022-03-09 Torino Politecnico Method and device for image search
US10885098B2 (en) 2015-09-15 2021-01-05 Canon Kabushiki Kaisha Method, system and apparatus for generating hash codes
US9727775B2 (en) * 2015-12-01 2017-08-08 Intel Corporation Method and system of curved object recognition using image matching for image processing
CN107341191B (en) * 2017-06-14 2020-10-09 童晓冲 Multi-scale integer coding method and device for three-dimensional space
CN111818346B (en) 2019-04-11 2023-04-18 富士通株式会社 Image encoding method and apparatus, image decoding method and apparatus
CN113114272B (en) * 2021-04-12 2023-02-17 中国人民解放军战略支援部队信息工程大学 Method and device for encoding data structure of hexagonal grid with consistent global tiles

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101069401A (en) * 2004-11-15 2007-11-07 艾利森电话股份有限公司 Method and apparatus for header compression with transmission of context information dependent upon media characteristic
CN102138160A (en) * 2008-07-15 2011-07-27 韩国巴斯德研究所 Method and apparatus for imaging of features on a substrate

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5658368A (en) * 1979-10-17 1981-05-21 Matsushita Electric Ind Co Ltd Band compressing method
JPH03110691A (en) * 1989-09-25 1991-05-10 Meidensha Corp Dictionary preparing method
JPH0746599A (en) * 1993-07-15 1995-02-14 Kyocera Corp Motion compensation circuit for moving image
JPH08149016A (en) * 1994-11-17 1996-06-07 N T T Ido Tsushinmo Kk Character string coding method
DE60117930T2 (en) 2000-06-06 2006-10-05 Agilent Technologies Inc., A Delaware Corp., Palo Alto Method and system for the automatic extraction of data from a molecular array
JP2006121302A (en) * 2004-10-20 2006-05-11 Canon Inc Device and method for encoding
JPWO2007088926A1 (en) * 2006-02-01 2009-06-25 日本電気株式会社 Image processing, image feature extraction, and image collation apparatus, method, and program, and image collation system
US7876959B2 (en) 2006-09-06 2011-01-25 Sharp Laboratories Of America, Inc. Methods and systems for identifying text in digital images
US7894668B1 (en) 2006-09-28 2011-02-22 Fonar Corporation System and method for digital image intensity correction
US8170101B2 (en) * 2006-10-27 2012-05-01 Sharp Laboratories Of America, Inc. Methods and systems for low-complexity data compression
US8712109B2 (en) 2009-05-08 2014-04-29 Microsoft Corporation Pose-variant face recognition using multiscale local descriptors
US20100303354A1 (en) * 2009-06-01 2010-12-02 Qualcomm Incorporated Efficient coding of probability distributions for image feature descriptors
US9449249B2 (en) * 2012-01-31 2016-09-20 Nokia Corporation Method and apparatus for enhancing visual search

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Interactive 3-D Video Representation and Coding Technologies; ALJOSCHA SMOLIC et al.; 《Proceedings of the IEEE (Volume: 93, Issue: 1)》; 20050630; full text *
Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard; Anthony Vetro et al.; 《Proceedings of the IEEE (Volume: 99, Issue: 4)》; 20110430; full text *
Research on three-dimensional video coding technology; 杨海涛; 《中国博士学位论文全文数据库信息科技辑》; 20091215; full text *
3D SPIHT compression coding algorithm for multispectral images based on histogram transformation; 陈林杰 et al.; 《光学技术》; 20070228; full text *

Also Published As

Publication number Publication date
CN103843011A (en) 2014-06-04
EP2742486A2 (en) 2014-06-18
JP5911578B2 (en) 2016-04-27
US20130039566A1 (en) 2013-02-14
KR101565265B1 (en) 2015-11-02
WO2013022656A2 (en) 2013-02-14
JP2014524693A (en) 2014-09-22
US8571306B2 (en) 2013-10-29
KR20140045585A (en) 2014-04-16
WO2013022656A3 (en) 2014-03-13

Similar Documents

Publication Publication Date Title
CN103843011B (en) The decoding of feature locations information
Tsai et al. Location coding for mobile image retrieval
CN105144141B (en) For using the system and method apart from relevance hashing to media database addressing
US9420299B2 (en) Method for processing an image
US8447122B2 (en) Animated image code, apparatus for generating/decoding animated image code, and method thereof
CN111275038A (en) Image text recognition method and device, computer equipment and computer storage medium
US7519221B1 (en) Reconstructing high-fidelity electronic documents from images via generation of synthetic fonts
CN107273458B (en) Depth model training method and device, and image retrieval method and device
WO2016082277A1 (en) Video authentication method and apparatus
CN106503112B (en) Video retrieval method and device
Duan et al. Optimizing JPEG quantization table for low bit rate mobile visual search
CN115443490A (en) Image auditing method and device, equipment and storage medium
JP2008234479A (en) Image quality improvement device, method, and program
CN105229670A (en) The text representation of image
Vázquez et al. Using normalized compression distance for image similarity measurement: an experimental study
CN104115162A (en) Image analysis
Chen et al. Efficient video hashing based on low‐rank frames
CN116258917A (en) Method and device for classifying malicious software based on TF-IDF transfer entropy
Zhang et al. Blind image quality assessment based on local quantized pattern
KR101572330B1 (en) Apparatus and method for near duplicate video clip detection
US9756342B2 (en) Method for context based encoding of a histogram map of an image
Zhang et al. A probabilistic analysis of sparse coded feature pooling and its application for image retrieval
Wang et al. A novel low bit rate side match vector quantization algorithm based on structed state codebook
Wang et al. PQ-WGLOH: A bit-rate scalable local feature descriptor
KR101826039B1 (en) Method, Device, and Computer-Readable Medium for Optimizing Document Image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170531

Termination date: 20190731