EP3278238A1 - Fast orthogonal projection - Google Patents
Fast orthogonal projection
- Publication number
- EP3278238A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- series
- matrix
- element matrices
- matrices
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
- G06F16/532—Query formulation, e.g. graphical querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
Definitions
- linear projections are performed efficiently using a comparatively large structured matrix in order to achieve cost savings with respect to computation time and storage space.
- the comparatively large, structured matrix may be generated based on a series of comparatively small orthogonal element matrices.
- the comparatively large structured matrix may be formed by taking the Kronecker product of the series of comparatively small orthogonal element matrices.
- the Kronecker product, or tensor product, which is denoted by ⊗, is an operation on two matrices of arbitrary size resulting in a larger matrix.
- the Kronecker product is a generalization of the outer product from vectors to matrices, and gives the matrix of the tensor product with respect to a standard choice of basis.
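The operation can be illustrated with a minimal NumPy sketch (the library and the example values are assumptions of this note, not part of the patent): the Kronecker product of an m × n matrix and a p × q matrix is an mp × nq matrix built from scaled copies of the second factor.

```python
import numpy as np

# Kronecker product of two small matrices yields a larger matrix:
# an (m x n) matrix combined with a (p x q) matrix gives (m*p x n*q).
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

C = np.kron(A, B)          # 4 x 4 block matrix [[1*B, 2*B], [3*B, 4*B]]
print(C.shape)             # (4, 4)
```

Each block of the result is a single entry of the first matrix scaled by the whole second matrix, which is what makes the operation a generalization of the outer product from vectors to matrices.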
- the subject matter described in this specification may be embodied in methods that may include the actions of obtaining a plurality of content items. Additional actions may include extracting a plurality of features from each of the plurality of content items, generating a feature vector for each of the extracted features in order to create a search space, generating a series of element matrices based upon the generated feature vectors, wherein each element matrix of the series of element matrices may be associated with one or more relationships, and enhancing the search space at least in part by transforming the series of element matrices into a structured matrix such that the transformation preserves the one or more relationships associated with each element matrix of the series of element matrices.
- Additional actions may include receiving a search object, searching the enhanced search space based on the received search object, and providing one or more links to one or more content items that are responsive to the search object.
- the plurality of content items may comprise high dimensional data.
- the high dimensional data may be selected from the group consisting of: text; an image; a video; a content ad; and map data.
- the relationships associated with the element matrices may include orthogonality.
- the relationships associated with the element matrices may include a Euclidean distance.
- transforming the series of element matrices into the structured matrix may include generating a Kronecker projection that is based in part on the application of a Kronecker product to a series of element matrices.
- the series of element matrices may be randomly generated based, at least in part, on the Euclidean distance of a particular snapshot of a feature vector search space.
- the transformation of the series of element matrices into a structured matrix such that the transformation preserves the one or more predetermined relationships associated with each element matrix of the series of element matrices may be achieved with a storage complexity of O(log d) for d-dimensional data.
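A hedged sketch of why such relationships can survive the transformation (NumPy and the `random_orthogonal` helper are illustrative assumptions, not the patent's implementation): the Kronecker product of orthogonal element matrices is itself orthogonal, so projecting with it leaves Euclidean distances unchanged, while only the M = log₂(d) small factors need to be stored.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n):
    # Orthogonalize a random Gaussian matrix via QR factorization.
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

# Series of M = 3 small 2 x 2 orthogonal element matrices.
elements = [random_orthogonal(2) for _ in range(3)]

# Their Kronecker product is an 8 x 8 structured matrix (d = 2^3).
P = elements[0]
for E in elements[1:]:
    P = np.kron(P, E)

assert np.allclose(P.T @ P, np.eye(8))      # orthogonality is preserved

# Orthogonal projections preserve Euclidean distances between vectors.
x, y = rng.normal(size=8), rng.normal(size=8)
before = np.linalg.norm(x - y)
after = np.linalg.norm(P @ x - P @ y)
assert np.isclose(before, after)

# Storage: only the M small factors are kept, not the full d x d matrix.
stored = sum(E.size for E in elements)      # 12 values instead of 64
```

The storage count makes the O(log d) claim concrete: doubling d adds only one more 2 × 2 factor.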
- the method may include actions for extracting one or more features associated with a search object, generating a search object vector that is representative of the features of the search object, comparing the search object vector against an enhanced search space that includes a structured matrix, and identifying one or more content items that satisfy a predetermined relationship based upon the comparison.
- aspects can be implemented in any convenient form.
- aspects and implementations may be implemented by appropriate computer programs which may be carried on appropriate carrier media which may be tangible carrier media (e.g. disks) or intangible carrier media (e.g. communications signals).
- aspects may also be implemented using suitable apparatus which may take the form of programmable computers running computer programs arranged to implement the invention.
- FIG. 1 is a block diagram of an example system that may be utilized to efficiently perform linear projections, in accordance with at least one aspect of the present disclosure.
- FIG. 2 is a flowchart of an example process that may be utilized to efficiently perform linear projections, in accordance with at least one aspect of the present disclosure.
- FIG. 3 is a block diagram of another example system that may be utilized to efficiently perform linear projections, in accordance with at least another aspect of the present disclosure.
- FIG. 4 is a flowchart of an example process for executing a search query against an enhanced search space, in accordance with at least one aspect of the present disclosure.
- FIG. 1 is a block diagram of an example system 100 that may be utilized to efficiently perform linear projections, in accordance with at least one aspect of the present disclosure.
- System 100 may include, for example, a client 110, a server 120, a remote computer 130, and a network 140.
- a family of structured matrices may be used to efficiently perform orthogonal projections for high-dimensional data that may exist in relation to a variety of complex computer applications such as, for example, computer vision applications.
- the system 100 may provide for the creation of a series of comparatively small orthogonal element matrices. Once the series of comparatively small orthogonal element matrices are obtained, aspects of the present disclosure may transform the series of small orthogonal element matrices into a structured matrix.
- the structured matrix may be formed by taking the Kronecker product of the series of small orthogonal element matrices.
- the present disclosure may achieve advantages over existing systems in both computational complexity and space complexity. For instance, the present disclosure achieves a computational complexity of O(d log d) and a space complexity of O(log d) for d-dimensional data.
- the client 110 of system 100 may include at least a processor 111, a memory 112, and a database 115.
- the memory 112 may provide for the storage of computer program code used to execute one or more applications on client 110.
- the applications may include, for example, a browser 113.
- the client 110 may be able to access one or more web based applications via the network 140 using browser 113.
- Such web based applications may include, for example, a maps application, a video streaming application, a mobile payment system, advertising services, or the like.
- Browser 113 may be configured to receive inputs from a user of client 110 through one or more user interfaces associated with client 110.
- Received inputs may include, among other things, for example, search queries input via a keypad, e.g., physical keyboard, graphically reproduced keyboard via a capacitive touch user interface, or the like, search queries input via a voice command, gestures representative of one or more executable commands, or the like.
- client 110 may utilize processor 111 and memory 112 to store and execute one or more mobile applications 114 stored locally on client 110.
- client 110 may include a content database 115 that may be configured to store local content including, for example, text files, audio files, image files, video files, or combinations thereof.
- one or more mobile applications 114 may provide functionality to facilitate, for example, a local document search, a local audio file search, a local image file search, a local video search, or the like.
- a mobile application 114 may also ensure that any such local search may also be executed remotely against one or more content databases 129, 133 hosted by one or more computers 120, 130 accessible via network 140 to provide a merged list of search results that may include search results from both local and remote content databases.
- mobile applications 114 may include other types of applications that include, for example, handwriting recognition programs. Other types of mobile applications 114 may also fall within the scope of the disclosure provided by this specification.
- Mobile applications 114 may be configured to receive inputs from a user of client 110 in a manner similar to that described above with respect to browser 113.
- one or more mobile applications 114 may be configured to receive different inputs than browser 113 based on the particular functionality provided by the one or more mobile applications 114.
- a handwriting recognition program may be configured to receive inputs in the form of handwritten text input via motions performed by a user using a stylus or the user's finger in combination with a capacitive touch user interface that is either integrated into the client 110 or externally coupled to client 110.
- Client 110 may be representative of one or multiple client devices. Such client devices may include, for example, mobile computing platforms and non-mobile computing platforms. Mobile computing platforms may include, for example, smartphones, tablets, laptop computers, or other thin client devices. Non-mobile computing platforms may include, for example, desktop computers, set top box entertainment systems, or the like. Client 110 may be configured to communicate with server 120 via network 140 using one or more communications protocols.
- Server 120 may be representative of one or multiple server computers.
- Server 120 may include at least a processor 121, memory 122, and content database 129.
- the memory 122 may include a suite of software tools that may be utilized to implement features of the subject matter disclosed by this specification. These software tools may include, for example, a content identification unit 123, a feature extraction unit 124, a feature vector generation unit 125, an element matrix generation unit 126, and a structured matrix generation unit 127.
- the aforementioned software tools may each comprise program instructions that, when executed by processor 121, may perform the exemplary functionality described in this specification to create an enhanced search space that significantly reduces the memory footprint required to facilitate storage, search, and retrieval operations involving high dimensional data.
- High-dimensional data may include data with many dimensions such as, for example, hundreds of dimensions, thousands of dimensions, millions of dimensions, or even more dimensions.
- Content identification unit 123 may be configured to obtain content from one or more of a plurality of different sources. For instance, content identification unit 123 may utilize a web crawler, web spider, or the like that may traverse network 140 to scan and identify content items maintained in database 133 of one or more remote computers 130. Once identified, content identification unit 123 may obtain a copy of the content item, or a portion thereof, from database 133 and store the copy of the content item in content database 129 of server 120.
- the content item may include a variety of different types of content that may be created using a client 110, server 120, or remote computer 130 including, for example, text data, audio data, image data, video data, or any combination thereof.
- content identification unit 123 may be configured to capture portions of content input by a user via one or more user interfaces of client device 110. For instance, content identification unit 123 may be configured to capture handwritten text input via motions performed by a user using a stylus or the user's finger in combination with a capacitive touch user interface that is either integrated into client 110 or externally coupled to client 110. Alternatively, or in addition, content identification unit 123 may be configured to receive one or more content items that may be uploaded via one or more remote computers. For instance, content identification unit 123 may receive one or more content items that one or more users of remote computer 130 wish to add to a library of content items maintained by database 129. Alternatively, or in addition, content identification unit 123 may be configured to obtain content items that were previously stored in database 129 of server 120.
- Content items obtained from one or more of the aforementioned sources may be used to generate a library of content items stored in database 129 that may be made available for access by one or more users of client 110, remote computer 130, or the like.
- server 120 may aggregate a vast amount of location information, geographic information, image information, and the like over a certain period of time that may be used to support a maps application accessible to a user of client 110 via either browser 113 or a mobile application 114 or to a user of similar applications via remote computer 130.
- server 120 may aggregate a vast amount of video files over a certain period of time in order to support a video streaming service accessible to a user of client 110 via either browser 113 or a mobile application 114 or to a user of similar applications via remote computer 130.
- the content items obtained by server 120 may be similarly utilized to support other types of applications accessible to users of client 110 or a remote computer 130.
- Content identification unit 123 may periodically determine that a sufficient number of content items have been collected in order to begin generation of an enhanced search space. This periodic determination may be based upon, for example, the expiration of a predetermined period of time. Alternatively, or in addition, the periodic determination may be made based upon the collection of a predetermined amount of data, e.g., after collecting 100 GB of data, 100 TB of data, or the like.
- the periodic determination may be made based upon the determination that content has been collected from a predetermined amount of content sources, e.g., content captured from a predetermined number of users subscribed to a service, content captured from a predetermined number of users actively using the service, content captured from a predetermined percentage of all known content sources, or the like.
- the content identification unit 123 may trigger the generation of an enhanced search space in response to the receipt of an instruction to generate an enhanced search space from one or more human users.
- Feature extraction unit 124 may be configured to analyze the content obtained by content identification unit 123 in order to identify particular content dependent features, or characteristics, that may be uniquely associated with each particular content item.
- Feature data may include, for example, colors, contours, curves, texture, pixels, or the like that may be associated with, for example, image content.
- feature data may include, for example, document keywords, word use frequency, or the like associated with, for example, text content.
- a particular high definition image may be associated with at least one feature that corresponds to each particular pixel in the image.
- the likelihood that a particular content item can be identified during a search and retrieval process based on features extracted from the content item may increase with the amount of features that are extracted from the content item.
- the content features extracted by feature extraction unit 124 may be stored in memory unit 122 or database 129 for subsequent use by feature vector generation unit 125.
- Feature vector generation unit 125 may be configured to obtain, or otherwise receive, high-dimensional feature data extracted by feature extraction unit 124. Upon receipt of the extracted feature data, feature vector generation unit 125 may generate a plurality of feature vectors that may be used to numerically represent each of the features extracted from the obtained content. The values of a particular feature vector may be expressed in the form of a single row matrix. The collective set of feature vectors generated from the extracted features stored in database 129 may thus create a searchable model of the high-dimensional data obtained by content identification unit 123. Similarity determinations may be made between any two or more feature vectors based on the calculation of a Euclidean distance between the two or more feature vectors.
- the Euclidean distance may be determined between each pair of feature vectors that exists in a particular feature vector search space, for a particular snapshot of the feature vector search space at any particular point in time.
- a particular snapshot of a feature vector search space may be captured, for example, at any particular point in time after a predetermined number of feature vectors have been generated by feature vector generation unit 125.
- an original feature vector search space may include multiple feature vectors that are each separated by an original Euclidean distance.
- the original Euclidean distance may be, for example, the Euclidean distance that exists between each of the feature vectors at the time a snapshot of the feature vector search space is captured.
- Element matrix generation unit 126 may be configured to obtain, or otherwise receive, a plurality of high dimensional feature vectors generated by feature vector generation unit 125. Element matrix generation unit 126 may then organize the obtained feature vectors into a series of M element matrices. Each element matrix in the series of M element matrices may be comparatively smaller than the structured matrix that is described below. For example, in at least one aspect of the subject matter disclosed by the present specification, each element matrix may be of the size 2 x 2. Alternatively, or in addition, each element matrix in the series of M element matrices may be orthogonal. Element matrix generation unit 126 may generate the series of M element matrices by, for example, generating a small, random Gaussian matrix and then performing QR factorization.
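The element-matrix generation step described above can be sketched as follows (a NumPy illustration; the helper name, matrix size, and seed are assumptions of this note): draw a small random Gaussian matrix and orthogonalize it with QR factorization.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_element_matrix(size=2):
    # Generate a small random Gaussian matrix, then keep the orthogonal
    # factor Q of its QR factorization as the element matrix.
    gaussian = rng.normal(size=(size, size))
    q, _ = np.linalg.qr(gaussian)
    return q

# A series of M = 4 element matrices, suitable for d = 2^4 = 16
# dimensional data.
series = [make_element_matrix() for _ in range(4)]

# Each element matrix is orthogonal: E^T E = I.
for E in series:
    assert np.allclose(E.T @ E, np.eye(2))
```

QR factorization of a Gaussian matrix is a standard way to draw a random orthogonal matrix, which matches the generation procedure the passage describes.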
- the random generation of the series of M element matrices may be, for example, based at least in part upon the original Euclidean distance of a particular snapshot of a feature vector search space.
- each element matrix in the series of M element matrices may be randomly generated in order to preserve the original Euclidean distance of the original feature vector search space.
- the series of M element matrices may be configured to utilize a machine learning system to train the element matrices in order to return, for example, particular image results when presented with a particular image.
- Structured matrix generation unit 127 may be configured to obtain, or otherwise receive, a series of M element matrices generated by element matrix generation unit 126.
- the structured matrix generation unit 127 may be configured to transform the series of M element matrices into a structured matrix.
- the structured matrix may be comparatively larger in size than each matrix of the series of M comparatively smaller, element matrices.
- the transformation of the series of M element matrices may occur in a manner that preserves the relationships associated with each element matrix of the series of M element matrices.
- the preserved relationship may include, for example, orthogonality or Euclidean distance.
- the transformation may include generating a linear projection by taking the Kronecker Product of a series of M element matrices.
- the Kronecker Product of the series of M element matrices may be achieved using processes that include, for example, fast Fourier transforms or fast Fourier transform-like calculations.
- the generation of a linear projection using structured matrix generation unit 127 to transform the series of M small element matrices into a comparatively larger structured matrix may result in a significant reduction in computation and space costs compared to the projection of unstructured matrices.
- linear projections generated using structured matrix generation unit 127 may achieve computation costs of O(d log d) and space complexity of O(log d) for d-dimensional data, as opposed to computation costs of O(d²) and space complexity of O(d²) for unstructured matrices.
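The computational saving comes from applying the Kronecker-structured matrix without ever materializing it. The recursive sketch below (illustrative NumPy code; the function name `kron_matvec` is an assumption of this note, not the patent's algorithm) multiplies a d-dimensional vector by the product of M = log₂(d) element matrices of size 2 × 2 using the identity (A ⊗ B)x = row-vec(A X Bᵀ), which gives the O(d log d) cost, versus O(d²) for an explicit matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(1)

def kron_matvec(elements, x):
    # Apply (elements[0] kron elements[1] kron ...) to x without building
    # the full matrix, via the identity (A kron B) x = row-vec(A @ X @ B^T).
    A = elements[0]
    if len(elements) == 1:
        return A @ x
    X = x.reshape(A.shape[1], -1)              # split x into row blocks
    Y = np.stack([kron_matvec(elements[1:], row) for row in X])
    return (A @ Y).reshape(-1)

def random_orthogonal(n):
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

elements = [random_orthogonal(2) for _ in range(4)]    # d = 2^4 = 16
x = rng.normal(size=16)

# Sanity check against the explicitly materialized structured matrix.
full = elements[0]
for E in elements[1:]:
    full = np.kron(full, E)

assert np.allclose(kron_matvec(elements, x), full @ x)
```

Each recursion level touches all d entries once, and there are log₂(d) levels, which is where the FFT-like O(d log d) behavior comes from.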
- the output of structured matrix generation unit 127 may result in an enhanced search space.
- the enhanced search space may be stored in enhanced search space storage area 128.
- although the search space has been enhanced to reduce space complexity to O(log d) from O(d²) for unstructured matrices, the comparatively larger structured matrix may still provide a representation of the feature vector space that may include substantially all of the generated feature vectors for a particular set of content items. Accordingly, neither accuracy nor precision of a search is compromised by utilizing aspects of the present disclosure, as described herein.
- Remote computer 130 may be representative of one or multiple remote computers. Each remote computer 130 may include at least a processor 131, a memory 132, and content database 133. Remote computer 130 may be configured to make one or more content items available for discovery to software tools capable of identifying and obtaining web content such as, for example, content identification unit 123. One or more users of certain remote computers 130 may also be able to search and access content items maintained in content database 129. Remote computer 130 may be configured to communicate with server 120 via network 140.
- Network 140 may be configured to facilitate connectivity between a client 110, a server 120, and/or a remote computer 130.
- Client 110, server 120, and/or remote computer 130 may be connected to network 140 via one or more wired, or wireless, communication links 142a, 142b, and/or 142c, respectively.
- Network 140 may include any combination of one or more types of public and/or private networks including but not limited to a local area network (LAN), wide area network (WAN), the Internet, a cellular data network, or any combination thereof.
- FIG. 2 is a flowchart of an example process 200 that may be utilized to efficiently perform linear projections, in accordance with at least one aspect of the present disclosure.
- Process 200 may begin at 210 by utilizing content identification unit 123 to initiate a scan for content via one or more content items from one or more content sources that may be local, or remote, from server 120.
- the content scan may be performed by, for example, a web crawler, web spider, or the like.
- content identification unit 123 may receive one or more content items from one or more remote computers 130 or one or more client computers 110.
- content identification unit 123 may sample the identified content and store at least a portion of the identified content in content database 129, store at least a portion of the identified content in another portion of main memory 122, or transmit at least a portion of the identified content to feature extraction unit 124.
- Process 200 may continue at 220 where a feature extraction unit 124 may access one or more portions of content identified by content identification unit 123.
- Feature extraction unit 124 may extract one or more features and/or characteristics associated with the obtained content.
- the extracted features may be stored in content database 129, stored in another portion of main memory 122, or transmitted to feature vector generation unit 125.
- Process 200 may continue at 230 where a feature vector generation unit 125 may generate one or more feature vectors based on the content features extracted by feature extraction unit 124.
- the feature vectors may be used to generate a searchable data model of high-dimensional data.
- the searchable model may facilitate similarity determinations based on a comparison of two or more feature vectors. Such comparisons may be based on the evaluation of a Euclidean distance that exists between two or more feature vectors. The smaller the distance that exists between any given pair of feature vectors, the greater the similarity that may exist between the feature vectors.
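The similarity rule above can be sketched in a few lines (an illustrative NumPy example; the vectors are made-up values, not data from the patent): the smaller the Euclidean distance between two feature vectors, the more similar the underlying content items are taken to be.

```python
import numpy as np

# Three toy feature vectors; b is close to a, c is far from a.
a = np.array([1.0, 0.0, 2.0])
b = np.array([1.1, 0.1, 2.0])
c = np.array([5.0, 3.0, -1.0])

dist_ab = np.linalg.norm(a - b)    # small distance -> high similarity
dist_ac = np.linalg.norm(a - c)    # large distance -> low similarity
assert dist_ab < dist_ac           # a is more similar to b than to c
```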
- Generated feature vectors may be stored in content database 129, stored in another portion of main memory 122, or transmitted to element matrix generation unit 126.
- Process 200 may continue at 240 where an element matrix generation unit 126 may generate a series of M element matrices based on a set of a plurality of high dimensional feature vectors generated by feature vector generation unit 125. Each matrix of the series of M element matrices may be orthogonal. The series of M element matrices may be randomly or pseudo-randomly generated based at least in part upon the original Euclidean distance of a particular snapshot of a feature vector search space. Alternatively, or in addition, the series of M element matrices may be trained using one or more machine learning systems, such as those set forth herein below. The generated series of M element matrices may be stored in content database 129, stored in another portion of main memory 122, or transmitted to structured matrix generation unit 127.
- Process 200 may continue at 250 where a structured matrix generation unit 127 may be configured to create a comparatively larger, structured matrix based on a series of M comparatively smaller element matrices.
- the comparatively larger, structured matrix may be created by transforming or rotating the series of M element matrices into the comparatively larger, structured matrix.
- the transformation may be performed such that the transformation preserves the relationships associated with each element matrix of the series of M element matrices.
- the transformation may include generating a linear projection by taking the Kronecker Product of the series of M element matrices.
- the comparatively larger, structured matrix may result in an enhanced search space.
- the space complexity of the comparatively larger, structured matrix may be on the order of O(log d) for d-dimensional data.
- the enhanced search space may be stored in main memory in an enhanced search space storage area 128, stored in content database 129, or the like.
- any efficient transformation of element matrices that preserves one or more relationships associated with the series of element matrices may be utilized in order to generate the large structured matrix from the series of small orthogonal element matrices in accordance with the present disclosure.
- Examples of such relationships that may be preserved in the generated structured matrix include, among other things, for example, orthogonality, Euclidean distance, etc.
- FIG. 3 is a block diagram of an example system 300 that may be utilized to efficiently perform linear projections, in accordance with at least another aspect of the present disclosure.
- System 300 may include, for example, a client 310, a server 320, a remote computer 330, and a network 340.
- Client 310 may include one or multiple client devices that each may be substantially similar to client 110.
- Client 310 may include at least a processor 311, a main memory 312, and a content database 319.
- client 310 may also include a content identification unit 313, a feature extraction unit 314, a feature vector generation unit 315, an element matrix generation unit 316, a structured matrix generation unit 317, and an enhanced search space storage area 318.
- Each of content identification unit 313, feature extraction unit 314, feature vector generation unit 315, element matrix generation unit 316, structured matrix generation unit 317, and enhanced search space storage area 318 may be substantially the same as content identification unit 123, feature extraction unit 124, feature vector generation unit 125, element matrix generation unit 126, structured matrix generation unit 127, and enhanced search space storage area 128 of system 100 of FIG. 1.
- content identification unit 313, feature extraction unit 314, feature vector generation unit 315, element matrix generation unit 316, structured matrix generation unit 317, and enhanced search space storage area 318 may be implemented on client 310 instead of, or in addition to, server 320.
- efficiencies provided by the subject matter of the present specification may facilitate the search and retrieval of high dimensional data on client devices such as, for example, client 310.
- the features of the subject matter described by the present specification may be applied to aspects of one or more mobile applications 114 that may be run on client 310 such as, for example, the generation of an enhanced search space to support local storage, search, and retrieval of text files, audio files, image files, video files, or combinations thereof.
- Features of the present disclosure may also be applicable to the generation of an enhanced search space to improve storage, search, and retrieval operations associated with other types of mobile applications such as, for example, handwriting recognition applications, search and display of content advertisements, or the like.
- the present disclosure may provide significant advantages to search and retrieval techniques including, for example, the approximate nearest neighbor (ANN) search method when utilizing approaches such as, among other things, for example, binary embedding or Cartesian k-means.
- the present disclosure thus solves complex search problems with better accuracy while also requiring significantly less time and memory.
- FIG. 4 is a flowchart of an example process 400 for executing a search query against an enhanced search space, in accordance with at least one aspect of the present disclosure.
- Process 400 may begin at 410 when a computer such as, for example, server 120 or client 310, receives a search object.
- the search object may include a query that includes one or more keywords, an image, a video clip, handwriting strokes input via a stylus or a user's finger, an address, and/or other data that may be associated with a content item maintained by a content database 129 or 319.
- server 120 or client 310 may analyze the search object to extract one or multiple features, or characteristics, associated with the received search object at 420.
- Process 400 may continue at 430 by generating one or multiple search object feature vectors associated with the search object and based upon the search object features extracted at 420.
- the server 120 or client 310 may process the search object feature vectors against a previously generated enhanced search space maintained in enhanced search space storage area 128 or 318. This may include, for example, analyzing the search object feature vectors in view of the linear projection of the structured matrix in order to identify a subset of high-dimensional feature vectors that provide a nearest neighbor match for the search object feature vector.
- stage 440 may include identifying multiple matches that represent the subset of feature vectors that fall within a predetermined threshold distance of the search object feature vector.
- the distance between the search object vector and the feature vectors linearly projected via a structured matrix in the enhanced search space may be a Euclidean distance.
- the process may retrieve one or more content items associated with the subset of feature vectors identified in the enhanced search space as sufficiently matching the search object feature vector.
- one or more links referencing the retrieved content items may be provided to the computer that submitted the search object.
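The project-then-match logic of stages 440 and 450 can be sketched as follows. The orthogonal projection, the random data, and the top-5% distance threshold are illustrative assumptions, not details taken from the disclosure:

```python
import numpy as np

rng = np.random.default_rng(6)

# Illustrative stand-ins: an orthogonal projection R, a database of feature
# vectors, and one query feature vector.
d, n = 4, 100
R, _ = np.linalg.qr(rng.normal(size=(d, d)))
database = rng.normal(size=(n, d))
query = rng.normal(size=d)

# Project everything, then keep database points whose Euclidean distance to
# the projected query falls under a (hypothetical) threshold.
projected = database @ R.T
dists = np.linalg.norm(projected - R @ query, axis=1)
threshold = np.quantile(dists, 0.05)          # assumed: keep the closest ~5%
matches = np.flatnonzero(dists < threshold)
```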
- At least one stage in a method for large-scale search and retrieval of data associated with complex computer applications may be to utilize a linear projection.
- Such linear projections may be followed by, among other things, for example, quantization to convert high dimensional features into compact codes that utilize less memory such as, for example, binary embeddings or product code.
- the compact codes may be binary code or non-binary code.
- Such compact codes may be used to reduce search execution time and storage requirements associated with a variety of complex computer applications such as, for example, image retrieval, feature
- projecting one vector can take 800 ms on a single core.
- the projection matrix may be orthogonal.
- An orthogonal transformation may be beneficial because, among other things, it may preserve the Euclidean distance between points, and it is also known to distribute the variance more evenly across the dimensions. These properties are important to making several well-known techniques perform well on real-world data.
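This distance-preservation property is easy to verify numerically; the random orthogonal matrix below (the Q factor of a QR factorization of a Gaussian matrix) is an illustrative stand-in for any learned orthogonal projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random orthogonal Q: the Q factor of the QR factorization of a Gaussian
# matrix is orthogonal.
d = 8
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
x, y = rng.normal(size=d), rng.normal(size=d)

# Orthogonal transforms preserve Euclidean distances between points.
assert np.isclose(np.linalg.norm(x - y), np.linalg.norm(Q @ x - Q @ y))
```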
- orthogonality may be one approach to satisfying the goal of learning maximally uncorrelated bits while learning data-dependent binary codes.
- One way to achieve this aforementioned goal is by imposing orthogonal, or near orthogonal, constraints on the projections.
- imposing an orthogonality constraint on a projection may achieve improved results when executing an approximate nearest neighbor search.
- the present disclosure provides a method, system, and non-transitory computer readable medium for transforming a series of small element matrices into a structured matrix in a manner that preserves the relationships associated with the original element matrices as the relationships existed prior to transformation of the matrices.
- the structured matrix that results from the transformation may be, for example, a large single matrix.
- the structured matrix may be conceptually representative of a flexible family of orthogonal structured matrices.
- the preserved relationship may be the orthogonality associated with each matrix of the series of element matrices.
- the preserved relationship may be the distance between corresponding feature vectors associated with a matrix.
- the distance may be, for example, the Euclidean distance between corresponding feature vectors associated with a matrix.
- the transformation of the element matrices may be achieved by using a Kronecker product of the small element matrices, leading to substantially reduced space and computational complexity. The flexibility associated with this transformation may facilitate a variation of the number of free parameters in the matrices to adapt to the needs of a given application.
- at least one aspect of the present disclosure may construct a family of orthogonal structured matrices by transforming a series of small orthogonal matrices to form a large structured matrix.
- At least one aspect of the present disclosure facilitates the aforementioned transformation by using the Kronecker product of a series of small orthogonal element matrices.
- the Kronecker projection matrix may satisfy Equation (3):
- a large matrix produced in accordance with the aforementioned transformation may be associated with at least four main advantages. First, the large matrix satisfies the orthogonality constraint and may therefore preserve Euclidean distances in the original space. Second, Fast Fourier Transform-like computations can be used to compute the projection with a time complexity of $O(d \log d)$.
- the resulting large matrix may be associated with a varying number of parameters (degrees of freedom), thus making it easier to control performance-speed trade-off.
- the space complexity of the large matrix is $O(\log d)$, in comparison to $O(d)$ for most other structured matrices.
- the proposed Kronecker projection provides advantages in the approximate nearest neighbor search problem under a variety of different settings that include, for example, binary embedding and vector quantization.
- the approximate nearest neighbors may be retrieved using Hamming distance in the binary code space, which can be computed very efficiently in a variety of ways including, for example, using a table lookup, or the POPCNT instruction on modern computer architectures.
- Locality Sensitive Hashing (LSH) may be used to generate binary codes in a manner that preserves cosine distance, and typically does so using randomized projections.
- using such randomized projections may forego the advantages of learning data-dependent binary codes by optimizing the projection matrix R.
- methods utilizing Iterative Quantization (ITQ) indicate that by using a PCA projection followed by a learned orthogonal projection, the resulting binary embedding may outperform a nonorthogonal or randomized orthogonal projection.
- the projection may be learned by alternating between projecting datapoints and solving for projections via SVD.
- Euclidean distances between q and all datapoints in the database may be computed.
- the Euclidean distances may be approximated by vector-to-quantizer distances.
- quantization may be carried out in subspaces independently. Commonly used subspaces may be identified by chunking the vectors, which may lead to Product Quantization (PQ).
- the distance between the query vector q and a database point x may be set forth with respect to Equation (4):
- Equation (4):

  $\lVert q - x \rVert_2^2 \approx \sum_{i=1}^{m} \lVert q^{(i)} - c_i(x^{(i)}) \rVert_2^2$

  where $m$ is the total number of subspaces, $x^{(i)}$ and $q^{(i)}$ are subvectors, and $c_i(x^{(i)})$ is the quantization function on subspace $i$. Because of its asymmetric nature, only the database points are quantized, and not the query vector. To increase performance, the different subspaces should have similar variance for the given data.
- One way to achieve this is by applying an orthogonal transform $R$ to the data, as set forth in Equation (5):

  $\lVert q - x \rVert_2^2 = \lVert Rq - Rx \rVert_2^2 \approx \sum_{i=1}^{m} \lVert (Rq)^{(i)} - c_i((Rx)^{(i)}) \rVert_2^2$
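A minimal sketch of this asymmetric, subspace-wise distance follows. The per-subspace codebooks are random stand-ins for learned quantizers (in practice they would come from, e.g., k-means per subspace), and the transform R is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)

d, m, h = 8, 2, 4            # dimension, number of subspaces, sub-centers each
sub = d // m
codebooks = rng.normal(size=(m, h, sub))   # illustrative, not learned

def quantize(x):
    """Assign each subvector of x to its nearest sub-center index."""
    parts = x.reshape(m, sub)
    return [min(range(h), key=lambda c: np.linalg.norm(parts[i] - codebooks[i, c]))
            for i in range(m)]

def asym_dist_sq(q, codes):
    """Approximate ||q - x||^2 using only the quantized database point."""
    parts = q.reshape(m, sub)
    return sum(np.linalg.norm(parts[i] - codebooks[i, codes[i]]) ** 2
               for i in range(m))

x, q = rng.normal(size=d), rng.normal(size=d)
approx = asym_dist_sq(q, quantize(x))      # only x is quantized, q is not
```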
- the projection matrix can be learned from given data leading to improved retrieval results.
- methods for facilitating the projection operation in existence prior to the present disclosure can be associated with high resource costs (e.g., processor use, memory use, execution time) in high-dimensional spaces.
- a fast projection that is both orthogonal and efficiently learnable is needed.
- these objectives may be achieved by transforming a series of element matrices into a large structured matrix using a transformation algorithm that preserves the relationships associated with each respective element matrix.
- the transformation algorithm may include, among other things, for example, the use of a Kronecker product to generate the projection.
- the Kronecker product may be associated with a number of properties that facilitate the aforementioned transformation. For instance, let $A_1 \in \mathbb{R}^{k_1 \times d_1}$ and $A_2 \in \mathbb{R}^{k_2 \times d_2}$.
- the Kronecker product of $A_1$ and $A_2$ is $A_1 \otimes A_2 \in \mathbb{R}^{k_1 k_2 \times d_1 d_2}$, defined blockwise as set forth in Equation (6):

  $A_1 \otimes A_2 = \begin{bmatrix} a_1(1,1)A_2 & \cdots & a_1(1,d_1)A_2 \\ \vdots & \ddots & \vdots \\ a_1(k_1,1)A_2 & \cdots & a_1(k_1,d_1)A_2 \end{bmatrix}$

- Equation (6), where $a_1(i,j)$ is the element of the $i$-th row and $j$-th column of $A_1$.
- the Kronecker product may also be referred to as a tensor product or a direct product.
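Both the blockwise definition and the orthogonality-preservation property can be checked numerically with NumPy's `kron` (sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two orthogonal element matrices of different sizes.
A1, _ = np.linalg.qr(rng.normal(size=(2, 2)))
A2, _ = np.linalg.qr(rng.normal(size=(3, 3)))
K = np.kron(A1, A2)                 # 6 x 6

# Blockwise definition: block (i, j) of A1 (x) A2 equals a1(i, j) * A2.
assert np.allclose(K[:3, 3:], A1[0, 1] * A2)

# The Kronecker product of orthogonal matrices is orthogonal.
assert np.allclose(K.T @ K, np.eye(6))
```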
- the Kronecker product may be associated with a plurality of characteristics that facilitate the advantages recited herein. For instance, at least a subset of these characteristics aid in the generation of a fast orthogonal projection, while also preserving the relationships associated with the original element matrices. Two particular characteristics of the
- a Kronecker projection matrix $R \in \mathbb{R}^{k \times d}$ may include a Kronecker product of several element matrices, as set forth below in Equation (7):

  $R = A_1 \otimes A_2 \otimes \cdots \otimes A_M$
- the number of FLOPs for performing a Kronecker projection of a $d$-dimensional vector is $d(2d_e - 1)\log_{d_e} d$.
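That FLOP count comes from never materializing R. Below is a sketch, assuming square element matrices, of applying a Kronecker product to a vector recursively via the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) (row-major vec), verified against the explicitly formed matrix:

```python
import numpy as np

rng = np.random.default_rng(3)

def kron_apply(mats, x):
    """Compute (mats[0] (x) ... (x) mats[-1]) @ x without forming the product."""
    if len(mats) == 1:
        return mats[0] @ x
    A = mats[0]
    X = x.reshape(A.shape[1], -1)            # rows indexed by A's input dim
    # Apply the remaining factors to each row, then A across the rows.
    Y = np.stack([kron_apply(mats[1:], row) for row in X])
    return (A @ Y).reshape(-1)

# Three 2x2 orthogonal element matrices give an 8x8 Kronecker projection.
mats = [np.linalg.qr(rng.normal(size=(2, 2)))[0] for _ in range(3)]
x = rng.normal(size=8)

R = mats[0]
for A in mats[1:]:
    R = np.kron(R, A)
assert np.allclose(kron_apply(mats, x), R @ x)
```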
- the present disclosure has been described with reference to examples wherein the Kronecker projection R and all the element matrices are square. However, the present disclosure need not be so limited. Instead, for example, the present disclosure may also be extended to non-square Kronecker projections and/or non-square element matrices. For instance, the sizes of the element matrices may be chosen by factorizing d and k. Alternatively, or in addition, there may arise instances when d or k cannot be factorized as the product of small numbers. For example, with respect to the input feature, one may alter the dimension by subsampling or padding zeros. Separately, for example, with respect to the output, one may use a longer code and then subsample. The generation of a Kronecker projection will be further discussed below in the context of both a square projection matrix R and a non-square projection matrix.
- a Kronecker projection may also be generated randomly. However, a randomly generated Kronecker projection improves upon the
- the randomized Kronecker projection may be applied in binary embedding and quantization. Such applications of the Kronecker projection may be achieved by replacing the unstructured projection matrix (R in Equation (1) and Equation (5)) with the randomized Kronecker projection matrix.
- the method, system, and computer program described herein may generate M (small) orthogonal element matrices.
- the element matrices may be generated by creating a small random Gaussian matrix and then performing QR factorization.
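A sketch of that generation step (a Gaussian draw followed by QR factorization), assembled into a randomized Kronecker projection of order 8 from three 2x2 element matrices:

```python
import numpy as np

rng = np.random.default_rng(4)

def random_orthogonal(n, rng):
    """Q factor of the QR factorization of a random Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

elements = [random_orthogonal(2, rng) for _ in range(3)]  # M = 3 element matrices
R = elements[0]
for A in elements[1:]:
    R = np.kron(R, A)                                     # order 2^3 = 8

assert R.shape == (8, 8)
assert np.allclose(R.T @ R, np.eye(8))    # the full projection stays orthogonal
```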
- the time complexity of generating a randomized Kronecker projection of order $d$ is only $O(\log d)$. This is a significant benefit because, for example, generating an unstructured random orthogonal matrix of order $d$ has a time complexity of $O(d^3)$.
- the randomized Kronecker projection provides a practical solution for generating randomized projections for high-dimensional data.
- the Kronecker structure is imposed on R.
- a local solution of Equation (3) may be found by alternating minimization.
- B is computed by a straightforward binarization based on definition.
- assume $k = d$ (we will discuss the $k < d$ case below)
- R is found by the orthogonal procrustes problem set forth in Equation (11):
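The orthogonal procrustes problem has a closed-form SVD solution: the orthogonal R minimizing ||B - RX||_F (equivalently, maximizing tr(Bᵀ R X)) is R = U Vᵀ, where U S Vᵀ is the SVD of B Xᵀ. A sketch with illustrative shapes and random data:

```python
import numpy as np

rng = np.random.default_rng(5)

d, N = 4, 50
X = rng.normal(size=(d, N))            # projected data points (columns)
B = np.sign(rng.normal(size=(d, N)))   # target binary codes, entries +/-1

# Closed-form procrustes solution via SVD of B X^T.
U, _, Vt = np.linalg.svd(B @ X.T)
R = U @ Vt

assert np.allclose(R.T @ R, np.eye(d))                    # R is orthogonal
# R does at least as well as the identity on the trace objective.
assert np.trace(B.T @ R @ X) >= np.trace(B.T @ X) - 1e-9
```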
- each subspace may be quantized to h sub-centers.
- In accordance with the example discussed below, a scenario is considered where all the sub-center sets have the same fixed cardinality.
- the present disclosure need not be so limited. For instance, the present disclosure may also be applied in a similar manner for sub-center sets with varying cardinalities.
- an orthogonal procrustes problem may arise.
- the problem may be referred to as a Kronecker procrustes.
- the Kronecker procrustes may be shown below with respect to Equation (15):
- an iterative method may be utilized to update each element matrix sequentially to find a local solution.
- the method may begin by rewriting
- the next step may be to maximize $\operatorname{tr}\big((\bigotimes_{j=1}^{M} A_j)\, X B^T\big)$.
- this trace may be rewritten as set forth in Equation (17):

  $\operatorname{tr}\big(B^T (\bigotimes_{j=1}^{M} A_j) X\big) = \sum_i b_i^T \big(\bigotimes_{j=1}^{M} A_j\big) x_i$

- Equation (17), where $b_i$ and $x_i$ are the $i$-th columns of matrix $B$ and matrix $X$, respectively. This problem may be solved by updating one element matrix at a time, while keeping all others fixed. Without loss of generality, consider updating $A_j$ as shown in
- let the sizes of $A_{\text{pre}}$, $A_{\text{next}}$, and $A_j$ be $k_{\text{pre}} \times d_{\text{pre}}$, $k_{\text{next}} \times d_{\text{next}}$, and $k_j \times d_j$, respectively.
- Equation (19) can be expressed as:
- the computational cost may come from three different sources.
- the first source, referred to here as S1
- the second source, referred to here as S2
- the third source, referred to here as S3, results from performing SVD to get the optimal element matrix.
- the optimization bottleneck may be SVD.
- because the element matrices may be small such as, for example, 2x2, performing SVD can be achieved in approximately constant time.
- the main computational cost therefore comes from S1 ($O(Nd\log d)$) and S2 ($O(Nd)$). Since there are a total of $\log_{d_e} d$ element matrices, the computational complexity of the whole optimization is $O(Nd\log^2 d)$.
- the projection matrix R can be formed by the Kronecker product of non-square row/column orthogonal element matrices.
- the Kronecker product may preserve the row/column orthogonality.
- since $R^T R \neq I$, the second equality in Equation (16) does not hold. Accordingly,
- Various implementations of the systems and techniques described here may be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- the systems and techniques described here may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user may provide input to the computer.
- Other kinds of devices may be used to provide for interaction with a user as well; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
- the systems and techniques described here may be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user may interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
- the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the Internet.
- the computing system may include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Abstract
Description
Claims
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562232258P | 2015-09-24 | 2015-09-24 | |
US201562232238P | 2015-09-24 | 2015-09-24 | |
US14/951,909 US10394777B2 (en) | 2015-09-24 | 2015-11-25 | Fast orthogonal projection |
PCT/US2016/047965 WO2017052874A1 (en) | 2015-09-24 | 2016-08-22 | Fast orthogonal projection |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3278238A1 true EP3278238A1 (en) | 2018-02-07 |
Family
ID=60808567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16760273.9A Withdrawn EP3278238A1 (en) | 2015-09-24 | 2016-08-22 | Fast orthogonal projection |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP3278238A1 (en) |
JP (2) | JP6469890B2 (en) |
KR (1) | KR102002573B1 (en) |
CN (1) | CN107636639B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111611418A (en) * | 2019-02-25 | 2020-09-01 | 阿里巴巴集团控股有限公司 | Data storage method and data query method |
KR102217219B1 (en) * | 2020-07-30 | 2021-02-18 | 주식회사 파란헤움 | Emergency notification system apparatus capable of notifying emergency about indoor space and operating method thereof |
KR102302949B1 (en) * | 2020-08-13 | 2021-09-16 | 주식회사 한컴위드 | Digital content provision service server supporting the provision of digital limited content through linkage with gold bar and operating method thereof |
KR102302948B1 (en) * | 2020-08-13 | 2021-09-16 | 주식회사 한컴위드 | Gold bar genuine product certification server to perform genuine product certification for gold bar and operating method thereof |
KR102417839B1 (en) * | 2020-08-13 | 2022-07-06 | 주식회사 한컴위드 | Cloud-based offline commerce platform server that enables offline commerce based on gold and digital gold token, and operating method thereof |
KR102639404B1 (en) | 2020-10-30 | 2024-02-21 | 가부시키가이샤 페닉스 솔루션 | RFID tag for rubber products and manufacturing method of RFID tag for rubber products |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5479523A (en) * | 1994-03-16 | 1995-12-26 | Eastman Kodak Company | Constructing classification weights matrices for pattern recognition systems using reduced element feature subsets |
US6859802B1 (en) * | 1999-09-13 | 2005-02-22 | Microsoft Corporation | Image retrieval based on relevance feedback |
US20030217052A1 (en) * | 2000-08-24 | 2003-11-20 | Celebros Ltd. | Search engine method and apparatus |
US6826300B2 (en) * | 2001-05-31 | 2004-11-30 | George Mason University | Feature based classification |
US8015350B2 (en) * | 2006-10-10 | 2011-09-06 | Seagate Technology Llc | Block level quality of service data in a data storage device |
US7941442B2 (en) * | 2007-04-18 | 2011-05-10 | Microsoft Corporation | Object similarity search in high-dimensional vector spaces |
US8457409B2 (en) * | 2008-05-22 | 2013-06-04 | James Ting-Ho Lo | Cortex-like learning machine for temporal and hierarchical pattern recognition |
CN100593785C (en) * | 2008-05-30 | 2010-03-10 | 清华大学 | Three-dimensional model search method based on multiple characteristic related feedback |
JP5375676B2 (en) * | 2010-03-04 | 2013-12-25 | 富士通株式会社 | Image processing apparatus, image processing method, and image processing program |
JP5563494B2 (en) * | 2011-02-01 | 2014-07-30 | 株式会社デンソーアイティーラボラトリ | Corresponding reference image search device and method, content superimposing device, system and method, and computer program |
JP5258915B2 (en) * | 2011-02-28 | 2013-08-07 | 株式会社デンソーアイティーラボラトリ | Feature conversion device, similar information search device including the same, coding parameter generation method, and computer program |
US8891878B2 (en) * | 2012-06-15 | 2014-11-18 | Mitsubishi Electric Research Laboratories, Inc. | Method for representing images using quantized embeddings of scale-invariant image features |
CN103389966A (en) * | 2012-05-09 | 2013-11-13 | 阿里巴巴集团控股有限公司 | Massive data processing, searching and recommendation methods and devices |
JP5563016B2 (en) * | 2012-05-30 | 2014-07-30 | 株式会社デンソーアイティーラボラトリ | Information search device, information search method and program |
JP5959446B2 (en) * | 2013-01-30 | 2016-08-02 | Kddi株式会社 | Retrieval device, program, and method for high-speed retrieval by expressing contents as a set of binary feature vectors |
US20140280426A1 (en) * | 2013-03-13 | 2014-09-18 | International Business Machines Corporation | Information retrieval using sparse matrix sketching |
CN103279578B (en) * | 2013-06-24 | 2016-04-06 | 魏骁勇 | A kind of video retrieval method based on context space |
CN103440280A (en) * | 2013-08-13 | 2013-12-11 | 江苏华大天益电力科技有限公司 | Retrieval method and device applied to massive spatial data retrieval |
JP6195365B2 (en) * | 2013-10-18 | 2017-09-13 | Kddi株式会社 | Vector encoding program, apparatus and method |
CN104794733B (en) * | 2014-01-20 | 2018-05-08 | 株式会社理光 | Method for tracing object and device |
CN103984675A (en) * | 2014-05-06 | 2014-08-13 | 大连理工大学 | Orthogonal successive approximation method for solving global optimization problem |
-
2016
- 2016-08-22 JP JP2017556909A patent/JP6469890B2/en active Active
- 2016-08-22 EP EP16760273.9A patent/EP3278238A1/en not_active Withdrawn
- 2016-08-22 CN CN201680028711.7A patent/CN107636639B/en active Active
- 2016-08-22 KR KR1020177031376A patent/KR102002573B1/en active IP Right Grant
-
2019
- 2019-01-15 JP JP2019004165A patent/JP2019057329A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112380494A (en) * | 2020-11-17 | 2021-02-19 | 中国银联股份有限公司 | Method and device for determining object characteristics |
CN112380494B (en) * | 2020-11-17 | 2023-09-01 | 中国银联股份有限公司 | Method and device for determining object characteristics |
Also Published As
Publication number | Publication date |
---|---|
CN107636639B (en) | 2021-01-08 |
JP2019057329A (en) | 2019-04-11 |
JP2018524660A (en) | 2018-08-30 |
JP6469890B2 (en) | 2019-02-13 |
CN107636639A (en) | 2018-01-26 |
KR20170132291A (en) | 2017-12-01 |
KR102002573B1 (en) | 2019-07-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20171101 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: GOOGLE LLC |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20200917 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
|
18W | Application withdrawn |
Effective date: 20230602 |