WO2019127299A1 - Data query method, electronic device, and storage medium - Google Patents

Data query method, electronic device, and storage medium

Publication number
WO2019127299A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
queried
binary tree
tree index
node
Application number
PCT/CN2017/119637
Inventor
彭齐荣
魏运运
Original Assignee
深圳云天励飞技术有限公司
Application filed by 深圳云天励飞技术有限公司
Publication of WO2019127299A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the field of artificial intelligence technologies, and in particular, to a data query method, an electronic device, and a storage medium.
  • artificial intelligence can be applied in many fields, such as monitoring and intelligent transportation, which involve storing a large amount of image data.
  • the current approach is to store all the images in the database associated with the server. As the amount of stored data grows, it becomes impractical to query the required data from all the stored data.
  • a data query method comprising:
  • the binary tree index is traversed according to the feature of the to-be-queried picture to find a first picture set similar to the to-be-queried picture;
  • the query condition includes a shooting time, and determining the to-be-searched area according to the query condition includes:
  • taking the area whose time range matches the shooting time as the to-be-searched area.
  • one leaf node of the binary tree represents picture data of one category, and each node other than a leaf node represents a difference feature of two candidate categories;
  • for any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first candidate category of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second candidate category of the two candidate categories corresponding to the parent node.
  • traversing the binary tree index according to the feature of the to-be-queried picture to find the first picture set similar to the to-be-queried picture includes:
  • each binary tree index is traversed using a priority queue: each traversed node is inserted into the priority queue, and the priority queue is sorted by the distance between the feature of the picture to be queried and the difference feature represented by the traversed node, until the traversal termination condition is reached; the first picture set similar to the picture to be queried is then determined based on the picture data in the categories corresponding to the leaf nodes in the priority queue.
  • outputting a picture similar to the picture to be queried based on the first picture set and/or the second picture set includes one or more of the following combinations:
  • the identifier of each picture, the data associated with the identifier of each picture in the first picture set, and/or the data associated with the identifier of each picture in the second picture set are output to the user.
  • the method further includes:
  • the picture to be stored is stored in the corresponding area according to the shooting time of the picture to be stored.
  • An electronic device comprising a memory and a processor, the memory for storing at least one instruction, the processor for executing the at least one instruction to implement the data query method of any of the embodiments.
  • a computer readable storage medium storing at least one instruction, the at least one instruction being executed by a processor to implement the data query method of any of the embodiments.
  • the present invention obtains a picture to be queried and a query condition; determines a to-be-searched area according to the query condition; extracts a feature of the to-be-queried picture; when it is determined that the to-be-searched area has a binary tree index, traverses the binary tree index according to the feature of the to-be-queried picture to find a first picture set similar to the to-be-queried picture; when it is determined that the to-be-searched area contains pictures for which no binary tree index has been established, compares the feature of the to-be-queried picture with the features of those pictures to find a second picture set similar to the to-be-queried picture; and outputs pictures similar to the to-be-queried picture based on the first picture set and/or the second picture set.
  • in a partition where a binary tree index has been established, the invention searches the binary tree and does not need to compare against each piece of data, which greatly improves the search speed; in a partition where no binary tree index has been established, the search compares the query against each piece of picture data in the partition to obtain pictures whose similarity reaches a certain threshold. Therefore, querying data with the present invention improves the search speed as a whole.
  • FIG. 1 is a flow chart of a preferred embodiment of the data query method of the present invention.
  • FIG. 2 is a schematic diagram of a binary tree index of the present invention.
  • FIG. 3 is a schematic diagram of searching a binary tree according to the present invention.
  • FIG. 4 is a functional block diagram of a preferred embodiment of the data querying apparatus of the present invention.
  • FIG. 5 is a block diagram showing a preferred embodiment of an electronic device in at least one embodiment of the present invention.
  • FIG. 1 is a flow chart of a preferred embodiment of the data query method of the present invention. The order of the steps in the flowchart may be changed according to different requirements, and some steps may be omitted.
  • the electronic device acquires the picture to be queried and the query condition.
  • the electronic device communicates with a plurality of terminal devices, including but not limited to: a mobile phone, a camera, and the like.
  • the terminal device acquires the input image to be queried and the query condition, and sends the to-be-queried image and the query condition to the electronic device.
  • For example, the electronic device stores pictures captured by a plurality of camera devices in a shopping mall. When a user enters a face image on the user interface together with a query condition, such as the time period of July 2017, the electronic device searches the database for pictures from July 2017 that are similar to the face image.
  • the query condition includes, but is not limited to, a shooting time.
  • the electronic device determines a to-be-searched area according to the query condition.
  • dividing the stored picture data into a plurality of areas comprises: acquiring the shooting time data of each picture in the stored picture data; and dividing the stored picture data into multiple areas based on the shooting time data of each picture.
  • For example, picture data with a shooting time in 2016 is divided into the first area, picture data with a shooting time in 2017 is divided into the second area, and so on.
  • the stored picture data is divided into a plurality of areas.
  • determining the to-be-searched area according to the query condition comprises:
  • taking the area whose time range matches the shooting time in the query condition as the to-be-searched area. This avoids querying all partitions and improves efficiency.
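The partitioning and area selection described above can be sketched as follows. This is a minimal illustration, assuming one area per calendar year (as in the 2016/2017 example) and a hypothetical record layout with `id` and `shot_at` fields; the function names are not from the patent.

```python
from collections import defaultdict
from datetime import datetime


def divide_into_areas(pictures):
    """Group picture records into areas keyed by shooting year (assumed granularity)."""
    areas = defaultdict(list)
    for pic in pictures:
        areas[pic["shot_at"].year].append(pic)
    return dict(areas)


def areas_to_search(areas, start, end):
    """Return the keys of the areas whose year overlaps the queried time range."""
    return [year for year in sorted(areas) if start.year <= year <= end.year]


# Hypothetical stored pictures.
pictures = [
    {"id": "p1", "shot_at": datetime(2016, 3, 1)},
    {"id": "p2", "shot_at": datetime(2017, 7, 15)},
    {"id": "p3", "shot_at": datetime(2017, 11, 2)},
]
areas = divide_into_areas(pictures)
# Query condition: shooting time within July 2017.
selected = areas_to_search(areas, datetime(2017, 7, 1), datetime(2017, 7, 31))
```

Only the 2017 area is selected here, so the 2016 area is never touched by the query.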
  • the electronic device extracts features of the to-be-queried picture.
  • the feature of the to-be-queried picture is extracted from the to-be-queried picture by using an image feature extraction technique.
  • the image feature extraction technique is prior art and is not described in detail here.
  • the electronic device traverses the binary tree index according to the feature of the to-be-queried picture, and searches for a first picture set similar to the to-be-queried picture.
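  • after the stored picture data is divided into multiple areas, a binary tree index is generated for each qualifying area, so that when picture data is subsequently queried, the query can be based on the binary tree index.
  • areas whose storage amount is greater than or equal to the first storage threshold are filtered from the plurality of areas, and a binary tree index is generated for the picture data in the filtered areas.
  • for an area whose storage amount is less than the first storage threshold, no binary tree index is generated until the storage amount reaches the first storage threshold.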
  • a binary tree index is established for the data stored in the area, which facilitates the search of subsequent data, and does not need to compare each piece of data, thereby greatly improving the search speed.
  • the electronic device acquires a shooting time of the picture to be stored; and stores the picture to be stored in the corresponding area according to the shooting time of the picture to be stored.
  • for an area in which a binary tree index has already been generated, when a picture needs to be stored, the picture is first stored in that area, but no binary tree index is established for the picture yet.
  • when the storage amount of the picture data for which no binary tree index has been established in that area is greater than or equal to the second threshold, a binary tree index is generated for that picture data.
  • the second threshold may be the same as or different from the first storage threshold.
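The two-threshold storage policy above can be illustrated with a small sketch. The `Area` class and its attribute names are hypothetical, and index construction is stubbed out (pictures are simply moved from an "unindexed" list to an "indexed" list), so this only shows when an index would be built, not how.

```python
class Area:
    """One storage area; building a binary tree index is stubbed as a list move."""

    def __init__(self, first_threshold, second_threshold):
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.indexed = []      # pictures covered by some binary tree index
        self.unindexed = []    # pictures stored but not yet indexed
        self.index_count = 0   # number of binary tree indexes built so far

    def store(self, picture_id):
        # A new picture is always stored first, without an index.
        self.unindexed.append(picture_id)
        # The first threshold applies until the area has its first index;
        # afterwards the second threshold applies to the unindexed pictures.
        threshold = self.first_threshold if self.index_count == 0 else self.second_threshold
        if len(self.unindexed) >= threshold:
            self._build_index()

    def _build_index(self):
        # Placeholder for the actual binary tree construction.
        self.indexed.extend(self.unindexed)
        self.unindexed = []
        self.index_count += 1


area = Area(first_threshold=3, second_threshold=2)
for pid in ["a", "b", "c"]:
    area.store(pid)          # the third picture triggers the first index
after_first = (area.index_count, list(area.unindexed))
area.store("d")              # stored, but below the second threshold
after_d = (area.index_count, list(area.unindexed))
area.store("e")              # reaches the second threshold: a second index is built
```

Under this sketch an area can end up holding several binary tree indexes plus a remainder of unindexed pictures, which is why a query may need both the tree search and the direct comparison.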
  • one leaf node of the binary tree represents picture data of one category, and each node other than a leaf node represents a difference feature of two candidate categories.
  • for any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first candidate category of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second candidate category of the two candidate categories corresponding to the parent node.
  • generating a binary tree index for the image data in any one of the regions by using the approximate nearest neighbor algorithm includes:
  • each group of the at least one group of candidate categories comprises a first candidate category and a second candidate category;
  • the binary tree node corresponding to each group of candidate categories represents the difference feature value between the first candidate category and the second candidate category in that group;
  • the first candidate category and the second candidate category in each group are each iteratively clustered, yielding two candidate categories for the first candidate category and two candidate categories for the second candidate category; the binary tree node representing the difference feature of the two candidate categories obtained from the first candidate category, and the binary tree node representing the difference feature of the two candidate categories obtained from the second candidate category, are each taken as the left or right child node of the binary tree node corresponding to that group;
  • the iterative clustering is repeated until the iteration termination (convergence) condition is reached, yielding the multiple categories of that area, wherein one leaf node of the binary tree represents picture data of one category.
  • difference features between the two candidate categories are represented by differential features between the cluster centers of the two candidate categories.
  • the iterative termination condition includes, but is not limited to, the number of iterations and the like.
  • For example, let Q denote the picture data stored in an area. In the first round of iterative clustering, two points are randomly selected from Q as initial center points, and Q is clustered around them to obtain two candidate categories A1 and A2. Node A represents the difference feature between candidate category A1 and candidate category A2 and is taken as the root node of the binary tree index.
  • Next, candidate category A1 is clustered to obtain candidate categories B1 and B2, and candidate category A2 is clustered to obtain candidate categories C1 and C2. The difference feature between B1 and B2 is represented by node B, and the difference feature between C1 and C2 is represented by node C. Node B is taken as the left child node of the root node A, and node C as the right child node of the root node A.
  • the leaf node D1 represents a candidate category B1
  • the leaf node D2 represents a candidate category B2
  • the leaf node D3 represents a candidate category C1
  • the leaf node D4 represents a candidate category C2.
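The clustering walk-through above (Q split into A1 and A2, then into B1/B2 and C1/C2) can be sketched as a recursive two-way clustering. This is an illustrative sketch only: it assumes a plain 2-means loop, squared Euclidean distance, a depth limit as the termination condition, and a "difference feature" stored as the difference of the two cluster centers. The function names and the dict-based node layout are hypothetical.

```python
import random


def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, rng, rounds=10):
    """Split points into two candidate categories with a basic 2-means loop."""
    c1, c2 = rng.sample(points, 2)  # two randomly selected initial center points
    g1, g2 = [], []
    for _ in range(rounds):
        n1 = [p for p in points if sq_dist(p, c1) <= sq_dist(p, c2)]
        n2 = [p for p in points if sq_dist(p, c1) > sq_dist(p, c2)]
        if not n1 or not n2:
            break  # keep the last valid split, if any
        g1, g2 = n1, n2
        c1, c2 = mean(g1), mean(g2)
    return g1, g2, c1, c2


def build_tree(points, rng, depth=0, max_depth=2, min_size=2):
    """Inner nodes hold a difference feature; leaves hold one category of pictures."""
    if depth >= max_depth or len(points) < min_size:
        return {"leaf": True, "category": points}
    g1, g2, c1, c2 = two_means(points, rng)
    if not g1 or not g2:
        return {"leaf": True, "category": points}
    return {
        "leaf": False,
        "diff": tuple(a - b for a, b in zip(c1, c2)),  # difference of cluster centers
        "left": build_tree(g1, rng, depth + 1, max_depth, min_size),
        "right": build_tree(g2, rng, depth + 1, max_depth, min_size),
    }


def leaves(node):
    """Collect the categories held by the leaf nodes, left to right."""
    if node["leaf"]:
        return [node["category"]]
    return leaves(node["left"]) + leaves(node["right"])


# Hypothetical 2-D feature vectors standing in for the picture data Q.
Q = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0),
     (9.0, 10.0), (10.0, 9.0), (5.0, 5.0), (4.0, 6.0)]
tree = build_tree(Q, random.Random(0))
categories = leaves(tree)
```

With `max_depth=2` the tree has at most four leaves, matching the D1 to D4 example. Because the initial centers are random, the exact split can differ between seeds, but every picture always ends up in exactly one leaf category.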
  • once a binary tree index has been generated for the picture data stored in the area, subsequent queries of picture data can be based on the binary tree index.
  • traversing the binary tree index according to the feature of the to-be-queried picture to find the first picture set similar to the to-be-queried picture includes:
  • each binary tree index is traversed using a priority queue: each traversed node is inserted into the priority queue, and the priority queue is sorted by the distance between the feature of the picture to be queried and the difference feature represented by the traversed node, until the traversal termination condition is reached; the first picture set similar to the picture to be queried is then determined based on the picture data in the categories corresponding to the leaf nodes in the priority queue.
  • a binary tree index is established for the data stored in the area. When performing data search, it is not necessary to compare each piece of data, which greatly improves the search speed.
  • the query is performed based on the distance between the feature of the image to be queried and the differential feature represented by the traversed node, so that traversing all the binary trees can be avoided, and the query speed is improved.
  • the traversal termination condition includes, but is not limited to: the amount of picture data in the categories corresponding to the leaf nodes in the priority queue being greater than or equal to a data threshold.
  • For example, two binary trees are generated in a to-be-searched area, and the feature of the picture to be queried is denoted W. The distance T1 between W and the difference feature represented by the root node H of the first binary tree is calculated, and the distance T2 between W and the difference feature represented by the root node M of the second binary tree is calculated. The root nodes H and M are inserted into the priority queue in ascending order of T1 and T2.
  • If T1 is smaller than T2, the distance T3 between W and the left child node G of the root node H is calculated first, and the two leaf nodes corresponding to node G are inserted into the priority queue; then the distance T4 between W and the right child node K of the root node H is calculated, and the positions of node G and node K in the priority queue are determined according to the sizes of T3 and T4.
  • If T4 is smaller than T3, then, sorted in ascending order, the leaf nodes in the priority queue are ordered K1, K2, G1, G2.
  • when the amount of picture data in the categories indicated by those leaf nodes exceeds the threshold, that picture data is used as the first picture set similar to the picture to be queried.
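The H/M/G/K example above can be reproduced with a best-first traversal over a single priority queue. The sketch below is an interpretation, not the patent's implementation: it assumes Euclidean distance, assumes that a leaf inherits the distance of its parent node (so K1 and K2 precede G1 and G2 when T4 < T3), and uses hypothetical hand-built trees and feature values.

```python
import heapq
import itertools
import math


def leaf(name, pictures):
    return {"name": name, "pictures": pictures, "children": []}


def inner(name, feature, left, right):
    # "feature" stands in for the difference feature represented by the node.
    return {"name": name, "feature": feature, "children": [left, right]}


def search(roots, w, data_threshold):
    """Best-first traversal of several binary tree indexes with one priority queue."""
    tie = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(math.dist(w, r["feature"]), next(tie), r) for r in roots]
    heapq.heapify(heap)
    visited_leaves, first_set = [], []
    # Termination condition: enough picture data has been collected from leaves.
    while heap and len(first_set) < data_threshold:
        d, _, node = heapq.heappop(heap)
        if not node["children"]:
            visited_leaves.append(node["name"])
            first_set.extend(node["pictures"])
            continue
        for child in node["children"]:
            # Assumption: an inner node is ranked by its own feature,
            # while a leaf inherits the distance of its parent.
            cd = math.dist(w, child["feature"]) if child["children"] else d
            heapq.heappush(heap, (cd, next(tie), child))
    return visited_leaves, first_set


# Hypothetical trees chosen so that T1 < T2 and T4 < T3.
G = inner("G", (4.0, 0.0), leaf("G1", ["g1a"]), leaf("G2", ["g2a"]))
K = inner("K", (2.0, 0.0), leaf("K1", ["k1a"]), leaf("K2", ["k2a"]))
H = inner("H", (1.0, 0.0), G, K)
M = inner("M", (9.0, 0.0), leaf("M1", ["m1a"]), leaf("M2", ["m2a"]))
W = (0.0, 0.0)
order, first_set = search([H, M], W, data_threshold=4)
```

Here the leaves are visited in the order K1, K2, G1, G2, and the second tree rooted at M is never expanded because the data threshold is reached first.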
  • the electronic device compares the feature of the to-be-queried picture with the features of the pictures for which no binary tree index has been established, and searches for a second picture set similar to the to-be-queried picture.
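For pictures without a binary tree index, the comparison is a direct scan over every stored feature. A minimal sketch, assuming cosine similarity as the similarity measure and a fixed similarity threshold (the text only says that the similarity must reach a certain threshold); the function names and sample features are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def linear_scan(query_feature, unindexed, threshold):
    """Compare the query with every unindexed picture; keep those at or above the threshold."""
    hits = [(cosine_similarity(query_feature, f), pid) for pid, f in unindexed.items()]
    return [pid for score, pid in sorted(hits, reverse=True) if score >= threshold]


# Hypothetical features of pictures that have no binary tree index yet.
unindexed = {"p1": (1.0, 0.0), "p2": (0.0, 1.0), "p3": (0.9, 0.1)}
second_set = linear_scan((1.0, 0.0), unindexed, threshold=0.9)
```

The cost of this scan grows linearly with the number of unindexed pictures, which is why the second threshold keeps that remainder small.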
  • the electronic device outputs a picture similar to the picture to be queried based on the first picture set and/or the second picture set.
  • when only the first picture set exists, a picture similar to the picture to be queried is output based on the first picture set.
  • when only the second picture set exists, a picture similar to the picture to be queried is output based on the second picture set.
  • when the first picture set and the second picture set exist simultaneously, a picture similar to the to-be-queried picture is output based on the first picture set and the second picture set.
  • outputting a picture similar to the picture to be queried based on the first picture set and/or the second picture set includes:
  • the identifier of each picture, the data associated with the identifier of each picture in the first picture set, and/or the data associated with the identifier of each picture in the second picture set are output to the user.
  • the data associated with the identifier of each picture in the first picture set and the second picture set includes, but is not limited to, the shooting location, the number of the camera device that captured the picture, the picture number, characteristics of the person in the picture, and the like.
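Assembling the output from the two picture sets can be sketched as a simple lookup of the data associated with each picture identifier. The field names (`location`, `camera`) and the `build_output` helper are hypothetical examples of the associated data listed above.

```python
def build_output(first_set, second_set, metadata):
    """Combine both picture sets and attach whatever data is associated with each identifier."""
    results = []
    for pid in list(first_set) + list(second_set):
        record = {"id": pid}
        record.update(metadata.get(pid, {}))
        results.append(record)
    return results


# Hypothetical associated data keyed by picture identifier.
metadata = {
    "p3": {"location": "Entrance A", "camera": "C-01"},
    "p9": {"location": "Floor 2", "camera": "C-07"},
}
output = build_output(["p3"], ["p9", "p10"], metadata)
```

Either set may be empty, which covers the cases where only one of the two picture sets exists.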
  • the present invention obtains a picture to be queried and a query condition; determines a to-be-searched area according to the query condition; extracts a feature of the picture to be queried; when it is determined that the to-be-searched area has a binary tree index, traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the to-be-searched area contains pictures for which no binary tree index has been established, compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set.
  • in a partition where a binary tree index has been established, the invention searches the binary tree and does not need to compare against each piece of data, which greatly improves the search speed; in a partition where no binary tree index has been established, the search compares the query against each piece of picture data in the partition to obtain pictures whose similarity reaches a certain threshold. Therefore, querying data with the present invention improves the search speed as a whole.
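The overall flow summarized above can be sketched as one orchestrating function. Every callable passed in is a hypothetical stand-in for a step of the method (area selection, feature extraction, tree search, direct comparison), and the area layout with `trees` and `unindexed` keys is an assumption made for this illustration.

```python
def query(picture, condition, select_areas, extract, search_index, scan_unindexed):
    """Run the query flow; every callable argument is a hypothetical stand-in."""
    feature = extract(picture)
    first_set, second_set = [], []
    for area in select_areas(condition):
        if area["trees"]:          # the area has at least one binary tree index
            first_set.extend(search_index(area["trees"], feature))
        if area["unindexed"]:      # the area also holds pictures without an index
            second_set.extend(scan_unindexed(area["unindexed"], feature))
    return first_set, second_set


# Toy stand-ins, only to show the control flow.
areas = {
    2017: {"trees": ["tree-1"], "unindexed": ["p9"]},
    2016: {"trees": [], "unindexed": ["p1"]},
}
first, second = query(
    picture="face.jpg",
    condition=2017,
    select_areas=lambda year: [areas[year]],
    extract=lambda pic: (1.0, 0.0),
    search_index=lambda trees, feature: ["p3", "p4"],
    scan_unindexed=lambda pics, feature: list(pics),
)
```

Only the area selected by the query condition is examined, and within it the tree search and the direct comparison each contribute their own picture set.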
  • the data querying apparatus 11 includes an obtaining module 100, a determining module 101, a dividing module 102, an extracting module 103, a querying module 104, a storage module 105, a generating module 106, and an output module 107.
  • A module, as referred to in the present invention, is a series of computer program segments stored in the memory that can be executed by the processor of the data query device 11 and that perform a fixed function. The functions of the respective modules are described in detail in the following embodiments.
  • the obtaining module 100 acquires a picture to be queried and a query condition.
  • the electronic device communicates with a plurality of terminal devices, including but not limited to: a mobile phone, a camera, and the like.
  • the terminal device acquires the input image to be queried and the query condition, and sends the to-be-queried image and the query condition to the electronic device.
  • For example, the electronic device stores pictures captured by a plurality of camera devices in a shopping mall. When a user enters a face image on the user interface together with a query condition, such as the time period of July 2017, the electronic device searches the database for pictures from July 2017 that are similar to the face image.
  • the query condition includes, but is not limited to, a shooting time.
  • the determining module 101 determines the to-be-searched area according to the query condition.
  • the dividing module 102 divides the stored picture data into multiple areas by: acquiring the shooting time data of each picture in the stored picture data; and dividing the stored picture data into multiple areas based on the shooting time data of each picture. For example, picture data with a shooting time in 2016 is divided into the first area, picture data with a shooting time in 2017 is divided into the second area, and so on.
  • the dividing module 102 divides the stored picture data into a plurality of areas.
  • the determining module 101 determining the to-be-searched area according to the query condition includes:
  • taking the area whose time range matches the shooting time in the query condition as the to-be-searched area. This avoids querying all partitions and improves efficiency.
  • the extraction module 103 extracts features of the to-be-queried picture.
  • the feature of the to-be-queried picture is extracted from the to-be-queried picture by using an image feature extraction technique.
  • the image feature extraction technique is prior art and is not described in detail here.
  • the query module 104 traverses the binary tree index according to the feature of the to-be-queried picture and searches for a first picture set similar to the to-be-queried picture.
  • after the stored picture data is divided into multiple areas, the query module 104 generates a binary tree index for each qualifying area, so that when picture data is subsequently queried, the query can be based on the binary tree index.
  • when the stored picture data is divided into multiple areas, the query module 104 filters out, from the plurality of areas, the areas whose storage amount is greater than or equal to the first storage threshold and generates a binary tree index for the picture data in the filtered areas. For an area whose storage amount is less than the first storage threshold, no binary tree index is generated until the storage amount reaches the first storage threshold.
  • a binary tree index is established for the data stored in the area, which facilitates the search of subsequent data, and does not need to compare each piece of data, thereby greatly improving the search speed.
  • the storage module 105 acquires a shooting time of the to-be-stored picture; and stores the to-be-stored picture in the corresponding area according to the shooting time of the to-be-stored picture.
  • for an area in which a binary tree index has already been generated, when a picture needs to be stored, the picture is first stored in that area, but no binary tree index is established for the picture yet.
  • when the storage amount of the picture data for which no binary tree index has been established in that area is greater than or equal to the second threshold, the generating module 106 generates a binary tree index for that picture data.
  • the second threshold may be the same as or different from the first storage threshold.
  • one leaf node of the binary tree represents picture data of one category, and each node other than a leaf node represents a difference feature of two candidate categories.
  • for any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first candidate category of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second candidate category of the two candidate categories corresponding to the parent node.
  • the generating module 106 generates a binary tree index for the picture data in any one area by using an approximate nearest neighbor algorithm, including:
  • each group of the at least one group of candidate categories comprises a first candidate category and a second candidate category;
  • the binary tree node corresponding to each group of candidate categories represents the difference feature value between the first candidate category and the second candidate category in that group;
  • the first candidate category and the second candidate category in each group are each iteratively clustered, yielding two candidate categories for the first candidate category and two candidate categories for the second candidate category; the binary tree node representing the difference feature of the two candidate categories obtained from the first candidate category, and the binary tree node representing the difference feature of the two candidate categories obtained from the second candidate category, are each taken as the left or right child node of the binary tree node corresponding to that group;
  • the iterative clustering is repeated until the iteration termination (convergence) condition is reached, yielding the multiple categories of that area, wherein one leaf node of the binary tree represents picture data of one category.
  • difference feature between the two candidate categories is represented by the difference feature between the cluster centers of the two candidate categories.
  • the iterative termination condition includes, but is not limited to, the number of iterations and the like.
  • For example, let Q denote the picture data stored in an area. In the first round of iterative clustering, two points are randomly selected from Q as initial center points, and Q is clustered around them to obtain two candidate categories A1 and A2. Node A represents the difference feature between candidate category A1 and candidate category A2 and is taken as the root node of the binary tree index.
  • Next, candidate category A1 is clustered to obtain candidate categories B1 and B2, and candidate category A2 is clustered to obtain candidate categories C1 and C2. The difference feature between B1 and B2 is represented by node B, and the difference feature between C1 and C2 is represented by node C. Node B is taken as the left child node of the root node A, and node C as the right child node of the root node A.
  • the leaf node D1 represents a candidate category B1
  • the leaf node D2 represents a candidate category B2
  • the leaf node D3 represents a candidate category C1
  • the leaf node D4 represents a candidate category C2.
  • once a binary tree index has been generated for the picture data stored in the area, subsequent queries of picture data can be based on the binary tree index.
  • the query module 104 traversing the binary tree index according to the feature of the to-be-queried picture to find the first picture set similar to the to-be-queried picture includes:
  • each binary tree index is traversed using a priority queue: each traversed node is inserted into the priority queue, and the priority queue is sorted by the distance between the feature of the picture to be queried and the difference feature represented by the traversed node, until the traversal termination condition is reached; the first picture set similar to the picture to be queried is then determined based on the picture data in the categories corresponding to the leaf nodes in the priority queue.
  • a binary tree index is established for the data stored in the area. When performing data search, it is not necessary to compare each piece of data, which greatly improves the search speed.
  • the query is performed based on the distance between the feature of the image to be queried and the differential feature represented by the traversed node, so that traversing all the binary trees can be avoided, and the query speed is improved.
  • the query module 104 traverses the binary tree index according to the feature of the to-be-queried picture, and searches for a first picture set similar to the to-be-queried picture.
  • the traversal termination condition includes, but is not limited to, the amount of picture data in the categories corresponding to the leaf nodes in the priority queue being greater than or equal to a data threshold.
  • For example, two binary trees are generated in one area to be searched, and the feature of the picture to be queried is denoted by W. The distance T1 between W and the difference feature represented by the root node H of the first binary tree is calculated, as is the distance T2 between W and the difference feature represented by the root node M of the second binary tree.
  • According to the sizes of T1 and T2, the root node H and the root node M are inserted into the priority queue in ascending order of distance.
  • If T1 is smaller than T2, the distance T3 between W and the left child node G of the root node H is calculated first, and the two leaf nodes corresponding to node G are inserted into the priority queue; then the distance T4 between W and the right child node K of the root node H is calculated, and node G and node K are placed in the priority queue according to the sizes of T3 and T4.
  • If T4 is smaller than T3, then after sorting in ascending order the leaf nodes in the priority queue are ordered K1, K2, G1, G2.
  • When the picture data in the categories represented by these leaf nodes exceeds the threshold, that picture data is used as the first picture set similar to the picture to be queried.
  • the query module 104 compares the feature of the picture to be queried with the features of the pictures for which no binary tree index has been established, and searches for a second picture set similar to the picture to be queried.
  • the output module 107 Based on the first picture set and/or the second picture set, the output module 107 outputs a picture similar to the picture to be queried.
  • when there is no first picture set, the output module 107 outputs pictures similar to the picture to be queried based on the second picture set. When there is no second picture set, the output module 107 outputs pictures similar to the picture to be queried based on the first picture set. When both the first picture set and the second picture set exist, the output module 107 outputs pictures similar to the picture to be queried based on both sets.
  • the output module 107 outputs a picture similar to the picture to be queried based on the first picture set and/or the second picture set, including:
  • the identifier of each picture in the first picture set and/or the second picture set is obtained; according to these identifiers, the data associated with the identifier of each picture in the first picture set and/or the second picture set is looked up and output to the user.
  • the data associated with the identifier of each picture in the first picture set and the second picture set includes, but is not limited to, the shooting location, the number of the camera device that captured the picture, the picture number, the identity of the person in the picture, and the like.
  • the present invention acquires a picture to be queried and a query condition; determines an area to be searched according to the query condition; extracts a feature of the picture to be queried; when it is determined that the area to be searched has a binary tree index, traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the area to be searched contains pictures for which no binary tree index has been established, compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set.
  • in a partition for which a binary tree index has been established, the invention searches the binary tree and does not need to compare against every data item, which greatly improves search speed; in a partition without a binary tree index, the search compares against each picture data item in the partition to obtain pictures whose similarity reaches a certain threshold. Querying data with the present invention therefore improves search speed overall.
  • the above-described integrated unit implemented in the form of a software function module can be stored in a computer readable storage medium.
  • the above software functional modules are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform part of the steps of the method of each embodiment of the present invention.
  • the electronic device 3 comprises at least one transmitting device 31, at least one memory 32, at least one processor 33, at least one receiving device 34 and at least one communication bus.
  • the communication bus is used to implement connection communication between these components.
  • the electronic device 3 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded devices, etc.
  • the electronic device 3 may also comprise a network device and/or a user device.
  • the network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer consisting of a group of loosely coupled computers.
  • the electronic device 3 can be, but is not limited to, any electronic product that can interact with a user through a keyboard, a touch pad, a voice control device or the like, such as a tablet, a smart phone, a personal digital assistant (PDA), a smart wearable device, camera equipment, monitoring equipment or another terminal.
  • the network in which the electronic device 3 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
  • the receiving device 34 and the transmitting device 31 may be wired transmission ports, or may be wireless devices, for example, including antenna devices, for performing data communication with other devices.
  • the memory 32 is used to store program code.
  • the memory 32 may be a circuit with a storage function that has no physical form of its own within an integrated circuit, such as a RAM (Random-Access Memory) or a FIFO (First In First Out) buffer.
  • the memory 32 may also be a memory having a physical form, such as a memory stick, a TF card (Trans-flash Card), a smart media card, a secure digital card, a flash memory card (flash card) or another storage device.
  • the processor 33 can include one or more microprocessors or digital processors.
  • the processor 33 can call program code stored in the memory 32 to perform related functions. For example, the units described in FIG. 4 are program code stored in the memory 32 and executed by the processor 33 to implement a data query method.
  • the processor 33, also known as a central processing unit (CPU), is a very large-scale integrated circuit that serves as the computing core (Core) and the control core (Control Unit).
  • an embodiment of the present invention further provides a computer readable storage medium storing computer instructions which, when executed by an electronic device including one or more processors, cause the electronic device to perform the data query method described in the method embodiments above.
  • the memory 32 in the electronic device 3 stores a plurality of instructions to implement a data query method, and the processor 33 can execute the plurality of instructions to implement:
  • the binary tree index is traversed according to the feature of the to-be-queried picture, and the first picture set similar to the to-be-queried picture is searched;
  • the query condition includes a shooting time
  • the plurality of instructions executable by the processor 33 for determining the area to be searched according to the query condition include:
  • according to the shooting time, taking the area whose time matches the shooting time as the area to be searched.
  • each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the difference feature of two candidate categories
  • for any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first of the two candidate categories corresponding to its parent node
  • the two candidate categories corresponding to the sibling node of that node are obtained by classifying the second of the two candidate categories corresponding to its parent node.
  • the plurality of instructions executable by the processor 33 for traversing the binary tree index according to the feature of the picture to be queried and finding a first picture set similar to the picture to be queried include:
  • starting from the root node of each binary tree index, a priority queue is used to traverse each binary tree index, and each traversed node is inserted into the priority queue
  • the priority queue is sorted by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node until the traversal termination condition is reached, and the first picture set similar to the picture to be queried is determined based on the picture data in the categories corresponding to the leaf nodes in the priority queue.
  • in another embodiment, the plurality of instructions executable by the processor 33 for traversing the binary tree index according to the feature of the picture to be queried and finding a first picture set similar to the picture to be queried include:
  • the plurality of instructions executable by the processor 33 for outputting pictures similar to the picture to be queried include one or a combination of the following:
  • the identifier of each picture in the first picture set and/or the second picture set is obtained; according to these identifiers, the data associated with the identifier of each picture in the first picture set and/or the second picture set is looked up and output to the user.
  • the plurality of instructions executable by the processor 33 further includes:
  • the picture to be stored is stored in the corresponding area according to the shooting time of the picture to be stored.
  • the method further includes:
  • the characteristic means of the present invention described above can be implemented by an integrated circuit that controls the functions of the data query method in any of the above embodiments. That is, the integrated circuit of the present invention is installed in the electronic device so that the electronic device performs the following functions:
  • the binary tree index is traversed to find a first picture set similar to the picture to be queried; when it is determined that the area to be searched contains pictures for which no binary tree index has been established, the feature of the picture to be queried is compared with the features of those pictures to find a second picture set similar to the picture to be queried; and pictures similar to the picture to be queried are output based on the first picture set and/or the second picture set.
  • any function that the data query method of any embodiment can implement can be provided by installing the integrated circuit of the present invention in the electronic device, so that the electronic device performs that function; this is not described in further detail here.
  • the disclosed apparatus may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in an actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • the software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes a U disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium that can store program code.

Abstract

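A data query method, comprising: acquiring a picture to be queried and a query condition (S10); determining an area to be searched according to the query condition (S11); extracting a feature of the picture to be queried (S12); when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried (S13); when it is determined that the area to be searched contains pictures for which no binary tree index has been established, comparing the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried (S14); and outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set (S15). An electronic device and a storage medium are also provided. Search speed can be improved.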

Description

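Data query method, electronic device and storage medium
This application claims priority to Chinese patent application No. 201711421701.2, entitled "Data query method, electronic device and storage medium", filed with the Chinese Patent Office on December 25, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of artificial intelligence, and in particular to a data query method, an electronic device and a storage medium.
Background
Artificial intelligence is now applied in many fields, such as surveillance and intelligent transportation, all of which involve storing large amounts of picture data. The current approach is to store all pictures in a database associated with a server. As the amount of stored data grows, the required data can no longer be queried from all the stored data in real time.
Summary of the Invention
In view of the above, it is necessary to provide a data query method, an electronic device and a storage medium that can improve search speed.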
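A data query method, the method comprising:
acquiring a picture to be queried and a query condition;
determining an area to be searched according to the query condition;
extracting a feature of the picture to be queried;
when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried;
when it is determined that the area to be searched contains pictures for which no binary tree index has been established, comparing the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried;
outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set.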
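According to a preferred embodiment of the present invention, the query condition includes a shooting time, and determining the area to be searched according to the query condition includes:
according to the shooting time, taking the area whose time matches the shooting time as the area to be searched.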
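According to a preferred embodiment of the present invention, in the binary tree index, each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the difference feature of two candidate categories. For any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second of the two candidate categories corresponding to its parent node.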
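According to a preferred embodiment of the present invention, when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
starting from the root node of each binary tree index, traversing each binary tree index using a priority queue, inserting the traversed nodes into the priority queue, and sorting the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, until a traversal termination condition is reached; and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.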
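According to a preferred embodiment of the present invention, when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
calculating, in turn, the distance between the feature of the picture to be queried and the difference feature represented by the root node of each binary tree index;
inserting the root node of each binary tree index into a priority queue in turn, and sorting the root nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each root node;
starting from the root node with the smallest distance in the priority queue, traversing in turn the child nodes of each root node in the priority queue, and calculating the distance between the feature of the picture to be queried and the difference feature represented by each child node;
sorting the child nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each child node;
starting from the child node with the smallest distance in the priority queue, recursively traversing each binary tree index using the priority queue until a traversal termination condition is reached, and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.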
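According to a preferred embodiment of the present invention, outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set includes one or a combination of the following:
obtaining the identifier of each picture in the first picture set and/or the identifier of each picture in the second picture set; and, according to those identifiers, looking up the data associated with the identifier of each picture in the first picture set and/or the second picture set and outputting it to the user.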
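According to a preferred embodiment of the present invention, the method further includes:
obtaining the shooting time of a picture to be stored;
storing the picture to be stored in the corresponding area according to its shooting time.
According to a preferred embodiment of the present invention, the method further includes:
when the amount of picture data in an area for which no binary tree index has been established is greater than or equal to a second threshold, generating a binary tree index for that picture data.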
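An electronic device, comprising a memory and a processor, the memory being configured to store at least one instruction and the processor being configured to execute the at least one instruction to implement the data query method of any of the embodiments.
A computer readable storage medium storing at least one instruction which, when executed by a processor, implements the data query method of any of the embodiments.
As can be seen from the above technical solution, the present invention acquires a picture to be queried and a query condition; determines an area to be searched according to the query condition; extracts a feature of the picture to be queried; when it is determined that the area to be searched has a binary tree index, traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the area to be searched contains pictures for which no binary tree index has been established, compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set. In a partition for which a binary tree index has been established, the invention searches the binary tree and does not need to compare against every data item, which greatly improves search speed; in a partition without a binary tree index, the search compares against each picture data item in the partition to obtain pictures whose similarity reaches a certain threshold. Querying data with the present invention therefore improves search speed overall.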
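Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for the description are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a preferred embodiment of the data query method of the present invention.
FIG. 2 is a schematic diagram of the binary tree index of the present invention.
FIG. 3 is a schematic diagram of searching the binary trees in the present invention.
FIG. 4 is a functional module diagram of a preferred embodiment of the data query apparatus of the present invention.
FIG. 5 is a schematic structural diagram of a preferred embodiment of the electronic device in at least one example of the present invention.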
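Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
To make the above objects, features and advantages of the present invention easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present invention.
The terms "first", "second", "third" and the like in the specification, the claims and the drawings are used to distinguish different objects, not to describe a particular order. In addition, the term "include" and any variants of it are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product or device.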
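FIG. 1 is a flowchart of a preferred embodiment of the data query method of the present invention. Depending on requirements, the order of the steps in the flowchart may be changed and some steps may be omitted.
S10: the electronic device acquires a picture to be queried and a query condition.
In a preferred embodiment of the present invention, the electronic device communicates with multiple terminal devices, which include, but are not limited to, mobile phones, camera devices and the like. The terminal device obtains the picture to be queried and the query condition entered on a user interface and sends them to the electronic device. For example, the electronic device stores the pictures captured by multiple camera devices in a shopping mall. A face picture is entered on the user interface together with a query condition, for example the time period July 2017; the electronic device then searches the database for pictures from July 2017 that are similar to the face picture.
Further, the query condition includes, but is not limited to, a shooting time.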
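S11: the electronic device determines the area to be searched according to the query condition.
Preferably, dividing the stored picture data into multiple areas includes: obtaining the shooting time of each picture in the stored picture data; and dividing the stored picture data into multiple areas based on the shooting time of each picture. For example, picture data shot in 2016 is placed in a first area, picture data shot in 2017 is placed in a second area, and so on.
Further, the stored picture data is divided into multiple areas when the amount of stored picture data is greater than or equal to a storage threshold.
Further, the stored picture data is divided into multiple areas when the current time reaches a preset time.
Preferably, determining the area to be searched according to the query condition includes:
according to the shooting time, taking the area whose time matches the shooting time as the area to be searched. This avoids querying every partition and improves efficiency.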
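The partitioning and area-selection steps above can be illustrated with a minimal Python sketch. It assumes, following the 2016/2017 example, that each area covers one calendar year; the function names `partition_by_year` and `areas_to_search` and the sample picture identifiers are hypothetical and do not come from the original text.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_year(pictures):
    """Group (picture_id, shooting_time) pairs into areas keyed by year."""
    areas = defaultdict(list)
    for picture_id, shot_at in pictures:
        areas[shot_at.year].append(picture_id)
    return dict(areas)


def areas_to_search(areas, start, end):
    """Return the keys of the areas whose year overlaps the queried period."""
    return sorted(year for year in areas if start.year <= year <= end.year)


# Hypothetical stored pictures with their shooting times.
pictures = [
    ("p1", datetime(2016, 3, 1)),
    ("p2", datetime(2017, 7, 4)),
    ("p3", datetime(2017, 7, 20)),
]
areas = partition_by_year(pictures)
# Query condition: shooting time within July 2017.
selected = areas_to_search(areas, datetime(2017, 7, 1), datetime(2017, 7, 31))
print(selected)  # [2017]
```

Only the 2017 area is searched here; the 2016 area is skipped entirely, which is the efficiency gain the text describes.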
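S12: the electronic device extracts the feature of the picture to be queried.
In the present invention, the feature of the picture to be queried is extracted from the picture using an image feature extraction technique. Such techniques are prior art and are not described in detail here.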
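S13: when it is determined that the area to be searched has a binary tree index, the electronic device traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried.
In the present invention, after the stored picture data has been divided into multiple areas and a binary tree index has been generated for each area that meets the condition, subsequent queries on the picture data may be performed based on the binary tree index.
Preferably, after the stored picture data has been divided into multiple areas, the areas whose stored amount is greater than or equal to a first storage threshold are selected, and a binary tree index is generated for the picture data in the selected areas. For an area whose stored amount is below the first storage threshold, a binary tree index is generated only once the stored amount reaches the first storage threshold. Establishing a binary tree index for the data stored in an area facilitates subsequent searches: it is not necessary to compare against every data item, which greatly improves search speed.
In the present invention, pictures continue to be stored into every area. Preferably, the electronic device obtains the shooting time of a picture to be stored and stores the picture in the corresponding area according to its shooting time.
In a preferred embodiment of the present invention, when a picture needs to be stored in an area for which a binary tree index has already been generated, the picture is first stored in that area, but no binary tree index is established for it. Preferably, when the amount of picture data in that area for which no binary tree index has been established is greater than or equal to a second threshold, a binary tree index is generated for that picture data. The second threshold may be the same as or different from the first threshold.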
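The two-threshold indexing policy described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the class name `Area`, its attributes, and the use of a plain tuple as a stand-in for a real binary tree index are all hypothetical, and the thresholds are counted in pictures rather than bytes.

```python
class Area:
    """One storage area holding indexed batches and not-yet-indexed pictures."""

    def __init__(self, first_threshold, second_threshold):
        self.first_threshold = first_threshold    # needed before the first index
        self.second_threshold = second_threshold  # needed for each later index
        self.indexes = []    # each entry stands in for one binary tree index
        self.unindexed = []  # pictures stored but not yet covered by an index

    def store(self, picture_id):
        """Store a picture; build an index once enough unindexed data exists."""
        self.unindexed.append(picture_id)
        threshold = self.second_threshold if self.indexes else self.first_threshold
        if len(self.unindexed) >= threshold:
            # Placeholder for generating a real binary tree index.
            self.indexes.append(tuple(self.unindexed))
            self.unindexed = []


area = Area(first_threshold=3, second_threshold=2)
for picture_id in ["a", "b", "c", "d"]:
    area.store(picture_id)
print(len(area.indexes), area.unindexed)  # 1 ['d']
```

After the fourth picture the area holds one index and one unindexed picture, so a query against it would need both the tree traversal of S13 and the direct comparison of S14.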
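In a preferred embodiment of the present invention, when a binary tree index is generated for the picture data in each area, each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the difference feature of two candidate categories. For any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second of the two candidate categories corresponding to its parent node. Once the binary tree index has been established in this way, subsequent searches do not need to compare against every data item, which greatly improves search speed.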
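Further, generating a binary tree index for the picture data in any area using an approximate nearest neighbor algorithm includes:
randomly selecting two points from the area as initial center points;
performing iterative clustering based on the initial center points using a preset algorithm, wherein, for a current iteration whose iteration count is greater than 1, the previous iteration yields at least one group of candidate categories, each group comprising a first candidate category and a second candidate category, and the binary tree node corresponding to each group represents the difference feature value between the first candidate category and the second candidate category of that group;
in the current iteration, separately clustering the first candidate category and the second candidate category of each group to obtain two candidate categories for the first candidate category and two candidate categories for the second candidate category; taking the binary tree node that represents the difference feature of the two candidate categories obtained from the first candidate category as the left (or right) child of the binary tree node corresponding to that group, and taking the binary tree node that represents the difference feature of the two candidate categories obtained from the second candidate category as the right (or left) child of the binary tree node corresponding to that group;
performing multiple clustering iterations until a convergence condition is reached, to obtain the multiple categories of the area, wherein each leaf node of the binary tree represents the picture data of one category.
Further, the difference feature between two candidate categories is represented by the difference feature between the cluster centers of the two candidate categories.
Further, the iteration termination condition includes, but is not limited to, the number of iterations.
For example, to make the above process of generating a binary tree index easier to follow, take two iterations as an example. As shown in FIG. 2, let Q denote the picture data stored in one area. In the first clustering iteration, two points are randomly selected from Q as initial center points and Q is clustered around them to obtain two candidate categories A1 and A2; node A represents the difference feature between candidate category A1 and candidate category A2 and is taken as the root node of the binary tree index. In the second clustering iteration, candidate category A1 is clustered to obtain candidate categories B1 and B2, and candidate category A2 is clustered to obtain candidate categories C1 and C2; the difference feature between B1 and B2 is represented by node B, the difference feature between C1 and C2 is represented by node C, and, according to a preset rule, node B is taken as the left child node of root node A and node C as the right child node of root node A. Leaf node D1 represents candidate category B1, leaf node D2 represents candidate category B2, leaf node D3 represents candidate category C1, and leaf node D4 represents candidate category C2.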
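The index construction described above can be sketched in Python. The text does not name the preset clustering algorithm, so this sketch assumes a plain two-center k-means, takes the difference feature of a node to be the element-wise difference of the two cluster centers, and uses a fixed tree depth as the termination condition. The names `two_means`, `build_index` and `leaf_categories` and the dictionary node layout are hypothetical.

```python
import random


def mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, rng, rounds=10):
    """Split points into two clusters from two randomly chosen initial centers."""
    c1, c2 = rng.sample(points, 2)
    for _ in range(rounds):
        g1 = [p for p in points if sq_dist(p, c1) <= sq_dist(p, c2)]
        g2 = [p for p in points if sq_dist(p, c1) > sq_dist(p, c2)]
        if not g1 or not g2:
            break
        c1, c2 = mean(g1), mean(g2)
    return g1, g2, c1, c2


def build_index(points, rng, depth):
    """Recursively cluster; internal nodes hold a difference feature, leaves a category."""
    if depth == 0 or len(points) < 2:
        return {"leaf": list(points)}
    g1, g2, c1, c2 = two_means(points, rng)
    if not g1 or not g2:  # degenerate split, e.g. duplicate points
        return {"leaf": list(points)}
    return {
        "diff": tuple(a - b for a, b in zip(c1, c2)),
        "left": build_index(g1, rng, depth - 1),
        "right": build_index(g2, rng, depth - 1),
    }


def leaf_categories(node):
    """Collect the picture categories stored at the leaves, left to right."""
    if "leaf" in node:
        return [node["leaf"]]
    return leaf_categories(node["left"]) + leaf_categories(node["right"])


# Q: hypothetical 2-D picture features stored in one area.
q = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
index = build_index(q, random.Random(0), depth=2)  # two iterations, as in FIG. 2
print(len(leaf_categories(index)))
```

With `depth=2` the tree has the same shape as the FIG. 2 example: a root corresponding to node A, up to two internal children corresponding to nodes B and C, and up to four leaf categories.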
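In the present invention, once a binary tree index has been generated for the picture data stored in an area, subsequent queries on the picture data may be performed based on the binary tree index.
Preferably, when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
starting from the root node of each binary tree index, traversing each binary tree index using a priority queue, inserting the traversed nodes into the priority queue, and sorting the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, until a traversal termination condition is reached; and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue. Because a binary tree index is established for the data stored in the area, a search does not need to compare against every data item, which greatly improves search speed. Moreover, because the query is guided by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, traversing all of the binary trees can be avoided, which improves query speed.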
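Further, when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
calculating, in turn, the distance between the feature of the picture to be queried and the difference feature represented by the root node of each binary tree index;
inserting the root node of each binary tree index into a priority queue in turn, and sorting the root nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each root node;
starting from the root node with the smallest distance in the priority queue, traversing in turn the child nodes of each root node in the priority queue, and calculating the distance between the feature of the picture to be queried and the difference feature represented by each child node;
sorting the child nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each child node;
starting from the child node with the smallest distance in the priority queue, recursively traversing each binary tree index using the priority queue until a traversal termination condition is reached, and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.
Further, the traversal termination condition includes, but is not limited to, the amount of picture data in the categories corresponding to the leaf nodes in the priority queue being greater than or equal to a data threshold.
For example, as shown in FIG. 3, two binary trees have been generated in one area to be searched, and the feature of the picture to be queried is denoted by W. The distance T1 between W and the difference feature represented by the root node H of the first binary tree is calculated, as is the distance T2 between W and the difference feature represented by the root node M of the second binary tree; according to the sizes of T1 and T2, root node H and root node M are inserted into the priority queue in ascending order. If, for example, T1 is smaller than T2, the distance T3 between W and the left child node G of root node H is calculated first and the two leaf nodes corresponding to node G are inserted into the priority queue; then the distance T4 between W and the right child node K of root node H is calculated, and node G and node K are placed in the priority queue according to the sizes of T3 and T4. If T4 is smaller than T3, then after sorting in ascending order the leaf nodes in the priority queue are ordered K1, K2, G1, G2. When the picture data in the categories represented by these leaf nodes exceeds the threshold, that picture data is used as the first picture set similar to the picture to be queried.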
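The FIG. 3 walk-through can be reproduced with a short best-first traversal in Python using the standard `heapq` module. This is a sketch under assumptions, not the patented implementation: the distance is taken to be Euclidean, a leaf node is assumed to inherit the distance of its parent (leaves carry no difference feature of their own), and the termination condition is that the collected picture count reaches `data_threshold`. The function name `search`, the node layout and the numeric features are hypothetical.

```python
import heapq
import itertools
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(trees, query, data_threshold):
    """Best-first traversal of several binary tree indexes with one priority queue."""
    order = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = []
    for root in trees:
        d = distance(query, root["diff"]) if "diff" in root else 0.0
        heapq.heappush(heap, (d, next(order), root))
    first_picture_set = []
    while heap and len(first_picture_set) < data_threshold:
        d, _, node = heapq.heappop(heap)
        if "leaf" in node:
            first_picture_set.extend(node["leaf"])
            continue
        for child in (node["left"], node["right"]):
            # Assumption: a leaf inherits the distance of its parent node.
            child_d = distance(query, child["diff"]) if "diff" in child else d
            heapq.heappush(heap, (child_d, next(order), child))
    return first_picture_set


def leaf(*picture_ids):
    return {"leaf": list(picture_ids)}


# Two trees as in FIG. 3; the features are chosen so that T1 < T2 and T4 < T3.
tree_h = {
    "diff": (1.0, 0.0),                                               # T1 = 1
    "left": {"diff": (3.0, 0.0), "left": leaf("g1"), "right": leaf("g2")},   # G, T3 = 3
    "right": {"diff": (2.0, 0.0), "left": leaf("k1"), "right": leaf("k2")},  # K, T4 = 2
}
tree_m = {"diff": (5.0, 0.0), "left": leaf("m1"), "right": leaf("m2")}  # T2 = 5
w = (0.0, 0.0)
print(search([tree_h, tree_m], w, data_threshold=4))  # ['k1', 'k2', 'g1', 'g2']
```

The leaves are visited in the order K1, K2, G1, G2, matching the example, and the second tree M is never expanded because the threshold is reached first. A heap is used here instead of re-sorting a list after every insertion; both give the same visiting order.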
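S14: when it is determined that the area to be searched contains pictures for which no binary tree index has been established, the electronic device compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried.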
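Step S14 amounts to a linear scan over the unindexed pictures. The text does not specify the similarity measure, so the sketch below assumes cosine similarity with a minimum-similarity threshold; the function names and sample features are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def linear_scan(query, unindexed, min_similarity):
    """Compare the query against every unindexed picture and keep the close ones."""
    return [
        picture_id
        for picture_id, feature in unindexed.items()
        if cosine_similarity(query, feature) >= min_similarity
    ]


# Hypothetical features of pictures that have no binary tree index yet.
unindexed = {"u1": (1.0, 0.0), "u2": (0.0, 1.0), "u3": (0.9, 0.1)}
second_picture_set = linear_scan((1.0, 0.0), unindexed, min_similarity=0.9)
print(second_picture_set)  # ['u1', 'u3']
```

Because the second threshold keeps the unindexed portion of an area small, this scan stays cheap relative to comparing against the whole area.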
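S15: the electronic device outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set.
In the present invention, when there is no first picture set, pictures similar to the picture to be queried are output based on the second picture set. When there is no second picture set, pictures similar to the picture to be queried are output based on the first picture set. When both the first picture set and the second picture set exist, pictures similar to the picture to be queried are output based on both sets.
Preferably, outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set includes:
obtaining the identifier of each picture in the first picture set and/or the identifier of each picture in the second picture set; and, according to those identifiers, looking up the data associated with the identifier of each picture in the first picture set and/or the second picture set and outputting it to the user.
Further, the data associated with the identifier of each picture in the first picture set and the second picture set includes, but is not limited to, the shooting location, the number of the camera device that captured the picture, the picture number, the identity of the person in the picture, and other information.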
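The output step can be sketched as merging whichever picture sets exist and looking up the data associated with each picture identifier. The function name `collect_output`, the metadata field names and the sample values below are hypothetical and serve only to illustrate the lookup.

```python
def collect_output(first_set, second_set, metadata):
    """Merge the picture sets (either may be missing) and attach associated data."""
    identifiers = []
    for picture_id in list(first_set or []) + list(second_set or []):
        if picture_id not in identifiers:  # keep order, drop repeats
            identifiers.append(picture_id)
    return [{"id": pid, **metadata.get(pid, {})} for pid in identifiers]


# Hypothetical data associated with picture identifiers.
metadata = {
    "k1": {"location": "entrance", "camera": 3},
    "u1": {"location": "atrium", "camera": 7},
}
results = collect_output(["k1"], ["u1"], metadata)
print(results)
```

Passing `None` or an empty list for either set covers the cases where only the first or only the second picture set exists.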
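The present invention acquires a picture to be queried and a query condition; determines an area to be searched according to the query condition; extracts a feature of the picture to be queried; when it is determined that the area to be searched has a binary tree index, traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the area to be searched contains pictures for which no binary tree index has been established, compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set. In a partition for which a binary tree index has been established, the invention searches the binary tree and does not need to compare against every data item, which greatly improves search speed; in a partition without a binary tree index, the search compares against each picture data item in the partition to obtain pictures whose similarity reaches a certain threshold. Querying data with the present invention therefore improves search speed overall.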
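FIG. 4 is a functional module diagram of a preferred embodiment of the data query apparatus of the present invention. The data query apparatus 11 includes an acquisition module 100, a determination module 101, a division module 102, an extraction module 103, a query module 104, a storage module 105, a generation module 106 and an output module 107. A unit, as the term is used in the present invention, is a series of computer program segments that can be executed by the processor of the data query apparatus 11, that perform a fixed function, and that are stored in the memory. In this embodiment, the functions of the units are described in detail in the following embodiments.
The acquisition module 100 acquires a picture to be queried and a query condition.
In a preferred embodiment of the present invention, the electronic device communicates with multiple terminal devices, which include, but are not limited to, mobile phones, camera devices and the like. The terminal device obtains the picture to be queried and the query condition entered on a user interface and sends them to the electronic device. For example, the electronic device stores the pictures captured by multiple camera devices in a shopping mall. A face picture is entered on the user interface together with a query condition, for example the time period July 2017; the electronic device then searches the database for pictures from July 2017 that are similar to the face picture.
Further, the query condition includes, but is not limited to, a shooting time.
The determination module 101 determines the area to be searched according to the query condition.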
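Preferably, the division module 102 dividing the stored picture data into multiple areas includes: obtaining the shooting time of each picture in the stored picture data; and dividing the stored picture data into multiple areas based on the shooting time of each picture. For example, picture data shot in 2016 is placed in a first area, picture data shot in 2017 is placed in a second area, and so on.
Further, the division module 102 divides the stored picture data into multiple areas when the amount of stored picture data is greater than or equal to a storage threshold.
Further, the division module 102 divides the stored picture data into multiple areas when the current time reaches a preset time.
Preferably, the determination module 101 determining the area to be searched according to the query condition includes:
according to the shooting time, taking the area whose time matches the shooting time as the area to be searched. This avoids querying every partition and improves efficiency.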
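The extraction module 103 extracts the feature of the picture to be queried.
In the present invention, the feature of the picture to be queried is extracted from the picture using an image feature extraction technique. Such techniques are prior art and are not described in detail here.
When it is determined that the area to be searched has a binary tree index, the query module 104 traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried.
In the present invention, after the stored picture data has been divided into multiple areas and a binary tree index has been generated for each area that meets the condition, the query module 104 may perform subsequent queries on the picture data based on the binary tree index.
Preferably, after the stored picture data has been divided into multiple areas, the areas whose stored amount is greater than or equal to a first storage threshold are selected, and a binary tree index is generated for the picture data in the selected areas. For an area whose stored amount is below the first storage threshold, a binary tree index is generated only once the stored amount reaches the first storage threshold. Establishing a binary tree index for the data stored in an area facilitates subsequent searches: it is not necessary to compare against every data item, which greatly improves search speed.
In the present invention, pictures continue to be stored into every area. Preferably, the storage module 105 obtains the shooting time of a picture to be stored and stores the picture in the corresponding area according to its shooting time.
In a preferred embodiment of the present invention, when a picture needs to be stored in an area for which a binary tree index has already been generated, the picture is first stored in that area, but no binary tree index is established for it. Preferably, when the amount of picture data in that area for which no binary tree index has been established is greater than or equal to a second threshold, the generation module 106 generates a binary tree index for that picture data. The second threshold may be the same as or different from the first threshold.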
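In a preferred embodiment of the present invention, when a binary tree index is generated for the picture data in each area, each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the difference feature of two candidate categories. For any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second of the two candidate categories corresponding to its parent node. Once the binary tree index has been established in this way, subsequent searches do not need to compare against every data item, which greatly improves search speed.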
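Further, the generation module 106 generating a binary tree index for the picture data in any area using an approximate nearest neighbor algorithm includes:
randomly selecting two points from the area as initial center points;
performing iterative clustering based on the initial center points using a preset algorithm, wherein, for a current iteration whose iteration count is greater than 1, the previous iteration yields at least one group of candidate categories, each group comprising a first candidate category and a second candidate category, and the binary tree node corresponding to each group represents the difference feature value between the first candidate category and the second candidate category of that group;
in the current iteration, separately clustering the first candidate category and the second candidate category of each group to obtain two candidate categories for the first candidate category and two candidate categories for the second candidate category; taking the binary tree node that represents the difference feature of the two candidate categories obtained from the first candidate category as the left (or right) child of the binary tree node corresponding to that group, and taking the binary tree node that represents the difference feature of the two candidate categories obtained from the second candidate category as the right (or left) child of the binary tree node corresponding to that group;
performing multiple clustering iterations until a convergence condition is reached, to obtain the multiple categories of the area, wherein each leaf node of the binary tree represents the picture data of one category.
Further, the difference feature between two candidate categories is represented by the difference feature between the cluster centers of the two candidate categories.
Further, the iteration termination condition includes, but is not limited to, the number of iterations.
For example, to make the above process of generating a binary tree index easier to follow, take two iterations as an example. As shown in FIG. 2, let Q denote the picture data stored in one area. In the first clustering iteration, two points are randomly selected from Q as initial center points and Q is clustered around them to obtain two candidate categories A1 and A2; node A represents the difference feature between candidate category A1 and candidate category A2 and is taken as the root node of the binary tree index. In the second clustering iteration, candidate category A1 is clustered to obtain candidate categories B1 and B2, and candidate category A2 is clustered to obtain candidate categories C1 and C2; the difference feature between B1 and B2 is represented by node B, the difference feature between C1 and C2 is represented by node C, and, according to a preset rule, node B is taken as the left child node of root node A and node C as the right child node of root node A. Leaf node D1 represents candidate category B1, leaf node D2 represents candidate category B2, leaf node D3 represents candidate category C1, and leaf node D4 represents candidate category C2.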
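In the present invention, once a binary tree index has been generated for the picture data stored in an area, subsequent queries on the picture data may be performed based on the binary tree index.
Preferably, when it is determined that the area to be searched has a binary tree index, the query module 104 traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
starting from the root node of each binary tree index, traversing each binary tree index using a priority queue, inserting the traversed nodes into the priority queue, and sorting the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, until a traversal termination condition is reached; and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue. Because a binary tree index is established for the data stored in the area, a search does not need to compare against every data item, which greatly improves search speed. Moreover, because the query is guided by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, traversing all of the binary trees can be avoided, which improves query speed.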
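Further, when it is determined that the area to be searched has a binary tree index, the query module 104 traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried includes:
calculating, in turn, the distance between the feature of the picture to be queried and the difference feature represented by the root node of each binary tree index;
inserting the root node of each binary tree index into a priority queue in turn, and sorting the root nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each root node;
starting from the root node with the smallest distance in the priority queue, traversing in turn the child nodes of each root node in the priority queue, and calculating the distance between the feature of the picture to be queried and the difference feature represented by each child node;
sorting the child nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each child node;
starting from the child node with the smallest distance in the priority queue, recursively traversing each binary tree index using the priority queue until a traversal termination condition is reached, and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.
Further, the traversal termination condition includes, but is not limited to, the amount of picture data in the categories corresponding to the leaf nodes in the priority queue being greater than or equal to a data threshold.
For example, as shown in FIG. 3, two binary trees have been generated in one area to be searched, and the feature of the picture to be queried is denoted by W. The distance T1 between W and the difference feature represented by the root node H of the first binary tree is calculated, as is the distance T2 between W and the difference feature represented by the root node M of the second binary tree; according to the sizes of T1 and T2, root node H and root node M are inserted into the priority queue in ascending order. If, for example, T1 is smaller than T2, the distance T3 between W and the left child node G of root node H is calculated first and the two leaf nodes corresponding to node G are inserted into the priority queue; then the distance T4 between W and the right child node K of root node H is calculated, and node G and node K are placed in the priority queue according to the sizes of T3 and T4. If T4 is smaller than T3, then after sorting in ascending order the leaf nodes in the priority queue are ordered K1, K2, G1, G2. When the picture data in the categories represented by these leaf nodes exceeds the threshold, that picture data is used as the first picture set similar to the picture to be queried.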
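When it is determined that the area to be searched contains pictures for which no binary tree index has been established, the query module 104 compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried.
The output module 107 outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set.
In the present invention, when there is no first picture set, the output module 107 outputs pictures similar to the picture to be queried based on the second picture set. When there is no second picture set, the output module 107 outputs pictures similar to the picture to be queried based on the first picture set. When both the first picture set and the second picture set exist, the output module 107 outputs pictures similar to the picture to be queried based on both sets.
Preferably, the output module 107 outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set includes:
obtaining the identifier of each picture in the first picture set and/or the identifier of each picture in the second picture set; and, according to those identifiers, looking up the data associated with the identifier of each picture in the first picture set and/or the second picture set and outputting it to the user.
Further, the data associated with the identifier of each picture in the first picture set and the second picture set includes, but is not limited to, the shooting location, the number of the camera device that captured the picture, the picture number, the identity of the person in the picture, and other information.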
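The present invention acquires a picture to be queried and a query condition; determines an area to be searched according to the query condition; extracts a feature of the picture to be queried; when it is determined that the area to be searched has a binary tree index, traverses the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the area to be searched contains pictures for which no binary tree index has been established, compares the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputs pictures similar to the picture to be queried based on the first picture set and/or the second picture set. In a partition for which a binary tree index has been established, the invention searches the binary tree and does not need to compare against every data item, which greatly improves search speed; in a partition without a binary tree index, the search compares against each picture data item in the partition to obtain pictures whose similarity reaches a certain threshold. Querying data with the present invention therefore improves search speed overall.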
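The above integrated unit implemented in the form of a software functional module can be stored in a computer readable storage medium. The software functional module is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform part of the steps of the method of each embodiment of the present invention.
As shown in FIG. 5, the electronic device 3 includes at least one transmitting device 31, at least one memory 32, at least one processor 33, at least one receiving device 34 and at least one communication bus. The communication bus is used to implement connection and communication between these components.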
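The electronic device 3 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), embedded devices, etc. The electronic device 3 may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer consisting of a group of loosely coupled computers.
The electronic device 3 can be, but is not limited to, any electronic product that can interact with a user through a keyboard, a touch pad, a voice control device or the like, for example a tablet, a smart phone, a Personal Digital Assistant (PDA), a smart wearable device, camera equipment, monitoring equipment or another terminal.
The network in which the electronic device 3 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
The receiving device 34 and the transmitting device 31 may be wired transmission ports or wireless devices, for example including antenna devices, for data communication with other devices.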
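The memory 32 is used to store program code. The memory 32 may be a circuit with a storage function that has no physical form of its own within an integrated circuit, such as a RAM (Random-Access Memory) or a FIFO (First In First Out) buffer. Alternatively, the memory 32 may be a memory having a physical form, such as a memory stick, a TF card (Trans-flash Card), a smart media card, a secure digital card, a flash memory card (flash card) or another storage device.
The processor 33 may include one or more microprocessors or digital processors. The processor 33 can call the program code stored in the memory 32 to perform related functions. For example, the units described in FIG. 4 are program code stored in the memory 32 and executed by the processor 33 to implement a data query method. The processor 33, also known as a Central Processing Unit (CPU), is a very large-scale integrated circuit that serves as the computing core (Core) and the control core (Control Unit).
An embodiment of the present invention further provides a computer readable storage medium storing computer instructions which, when executed by an electronic device including one or more processors, cause the electronic device to perform the data query method described in the method embodiments above.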
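With reference to FIG. 1, the memory 32 in the electronic device 3 stores a plurality of instructions to implement a data query method, and the processor 33 can execute the plurality of instructions to implement:
acquiring a picture to be queried and a query condition;
determining an area to be searched according to the query condition;
extracting a feature of the picture to be queried;
when it is determined that the area to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried;
when it is determined that the area to be searched contains pictures for which no binary tree index has been established, comparing the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried;
outputting pictures similar to the picture to be queried based on the first picture set and/or the second picture set.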
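According to a preferred embodiment of the present invention, the query condition includes a shooting time, and the plurality of instructions executable by the processor 33 for determining the area to be searched according to the query condition include:
according to the shooting time, taking the area whose time matches the shooting time as the area to be searched.
According to a preferred embodiment of the present invention, in the binary tree index, each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the difference feature of two candidate categories. For any node other than the leaf nodes and the root node, the two candidate categories corresponding to that node are obtained by classifying the first of the two candidate categories corresponding to its parent node, and the two candidate categories corresponding to its sibling node are obtained by classifying the second of the two candidate categories corresponding to its parent node.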
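According to a preferred embodiment of the present invention, when it is determined that the area to be searched has a binary tree index, the plurality of instructions executable by the processor 33 for traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried include:
starting from the root node of each binary tree index, traversing each binary tree index using a priority queue, inserting the traversed nodes into the priority queue, and sorting the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each traversed node, until a traversal termination condition is reached; and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.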
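According to a preferred embodiment of the present invention, when it is determined that the area to be searched has a binary tree index, the plurality of instructions executable by the processor 33 for traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried include:
calculating, in turn, the distance between the feature of the picture to be queried and the difference feature represented by the root node of each binary tree index;
inserting the root node of each binary tree index into a priority queue in turn, and sorting the root nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each root node;
starting from the root node with the smallest distance in the priority queue, traversing in turn the child nodes of each root node in the priority queue, and calculating the distance between the feature of the picture to be queried and the difference feature represented by each child node;
sorting the child nodes in the priority queue by the distance between the feature of the picture to be queried and the difference feature represented by each child node;
starting from the child node with the smallest distance in the priority queue, recursively traversing each binary tree index using the priority queue until a traversal termination condition is reached, and determining the first picture set similar to the picture to be queried from the picture data in the categories corresponding to the leaf nodes in the priority queue.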
根据本发明优选实施例,基于所述第一图片集及/或所述第二图片集,所述处理器33可执行的所述多个指令用于输出与所述待查询图片相似的图片包括以下一种或者多种的组合:
获取所述第一图片集中每张图片的标识及/或所述第二图片集中每张图片的 标识;根据所述第一图片集中每张图片的标识及/或所述第二图片集中每张图片的标识,查找所述第一图片集中每张图片的标识关联的数据及/或所述第二图片集中每张图片的标识关联的数据输出给用户。
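According to a preferred embodiment of the present invention, the plurality of instructions executable by the processor 33 further include:
acquiring the capture time of a picture to be stored;
storing the picture to be stored in the corresponding partition according to its capture time.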
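A minimal sketch of time-based storage, assuming one partition per calendar day; the granularity is not specified in the source, and the class name `PartitionedStore` is hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, List, Tuple


class PartitionedStore:
    """Keeps pictures grouped by the calendar day on which they were captured."""

    def __init__(self) -> None:
        # Partition key (day) -> list of (picture id, feature).
        self.partitions: Dict[date, List[Tuple[str, tuple]]] = defaultdict(list)

    def store(self, picture_id: str, feature: tuple, captured_at: datetime) -> date:
        key = captured_at.date()  # assumed granularity: one partition per day
        self.partitions[key].append((picture_id, feature))
        return key
```

Storing by capture time is what makes the time-based query condition cheap: a query only needs to open the partitions whose key matches the requested time.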
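According to a preferred embodiment of the present invention, the method further comprises:
when the amount of stored picture data in a partition for which no binary tree index has been built is greater than or equal to a second threshold, generating a binary tree index for that picture data in the partition.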
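The trigger can be sketched as below. The storage amount is assumed here to be a picture count (the source could equally mean a data size), and `Partition`, `maybe_build_index`, and the `build_index` callable are hypothetical names; the actual index construction is passed in rather than shown.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Partition:
    unindexed: List[Any] = field(default_factory=list)  # pictures with no index yet
    indexes: List[Any] = field(default_factory=list)    # binary tree indexes built so far


def maybe_build_index(partition: Partition, threshold: int,
                      build_index: Callable[[List[Any]], Any]) -> bool:
    """Build an index over the unindexed pictures once their count reaches the threshold."""
    if len(partition.unindexed) < threshold:
        return False
    partition.indexes.append(build_index(partition.unindexed))
    partition.unindexed = []  # these pictures are now covered by an index
    return True
```

Under this reading a partition may accumulate several indexes over time, which is consistent with the traversal steps that refer to "each binary tree index".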
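The characteristic means of the present invention described above may be implemented by an integrated circuit, which controls and implements the functions of the data query method of any of the above embodiments. That is, the integrated circuit of the present invention is installed in the electronic device and causes the electronic device to perform the following functions:
acquiring a picture to be queried and a query condition; determining, according to the query condition, a partition to be searched; extracting a feature of the picture to be queried; when it is determined that the partition to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried; when it is determined that the partition to be searched has pictures for which no binary tree index has been built, comparing the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried; and outputting, based on the first picture set and/or the second picture set, pictures similar to the picture to be queried.
Any function that the data query method of any embodiment can implement can be provided by installing the integrated circuit of the present invention in the electronic device, so that the electronic device performs that function; details are not repeated here.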
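It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions. A person skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. A person skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.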
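In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.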
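If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.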

Claims (10)

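  1. A data query method, characterized in that the method comprises:
    acquiring a picture to be queried and a query condition;
    determining, according to the query condition, a partition to be searched;
    extracting a feature of the picture to be queried;
    when it is determined that the partition to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find a first picture set similar to the picture to be queried;
    when it is determined that the partition to be searched has pictures for which no binary tree index has been built, comparing the feature of the picture to be queried with the features of those pictures to find a second picture set similar to the picture to be queried;
    outputting, based on the first picture set and/or the second picture set, pictures similar to the picture to be queried.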
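  2. The data query method according to claim 1, characterized in that the query condition includes a capture time, and determining, according to the query condition, the partition to be searched comprises:
    taking, according to the capture time, the partition whose time matches the capture time as the partition to be searched.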
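  3. The data query method according to claim 1, characterized in that, in the binary tree index, each leaf node of the binary tree represents the picture data of one category, and every node other than a leaf node represents the differential feature of two candidate categories; for any node other than a leaf node or the root node, its two candidate categories are obtained by classifying the first candidate category of the two candidate categories of its parent node, and the two candidate categories of its sibling node are obtained by classifying the second candidate category of the two candidate categories of that parent node.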
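  4. The data query method according to any one of claims 1 to 3, characterized in that, when it is determined that the partition to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried comprises:
    starting from the root node of each binary tree index, traversing each binary tree index using a priority queue: inserting the traversed nodes into the priority queue, and sorting the priority queue based on the distance between the feature of the picture to be queried and the differential feature represented by each traversed node, until a traversal termination condition is reached; and determining, based on the picture data in the categories corresponding to the leaf nodes in the priority queue, the first picture set similar to the picture to be queried.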
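  5. The data query method according to claim 4, characterized in that, when it is determined that the partition to be searched has a binary tree index, traversing the binary tree index according to the feature of the picture to be queried to find the first picture set similar to the picture to be queried comprises:
    calculating, in turn, the distance between the feature of the picture to be queried and the differential feature represented by the root node of each binary tree index;
    inserting the root node of each binary tree index into a priority queue in turn, and sorting the root nodes in the priority queue based on the distance between the feature of the picture to be queried and the differential feature represented by the root node of each binary tree index;
    starting from the root node with the smallest distance in the priority queue, traversing in turn the child nodes of each root node in the priority queue, and calculating the distance between the feature of the picture to be queried and the differential feature represented by each of those child nodes;
    sorting the child nodes in the priority queue based on the distance between the feature of the picture to be queried and the differential feature represented by the child nodes of each root node;
    starting from the child node with the smallest distance in the priority queue, recursively traversing each binary tree index using the priority queue until a traversal termination condition is reached, and determining, based on the picture data in the categories corresponding to the leaf nodes in the priority queue, the first picture set similar to the picture to be queried.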
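  6. The data query method according to claim 1, characterized in that outputting, based on the first picture set and/or the second picture set, pictures similar to the picture to be queried comprises one or a combination of the following:
    acquiring the identifier of each picture in the first picture set and/or the identifier of each picture in the second picture set; and, according to those identifiers, looking up the data associated with the identifier of each picture in the first picture set and/or the data associated with the identifier of each picture in the second picture set, and outputting that data to the user.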
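  7. The data query method according to claim 1, characterized in that the method further comprises:
    acquiring the capture time of a picture to be stored;
    storing the picture to be stored in the corresponding partition according to its capture time.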
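  8. The data query method according to claim 1 or 7, characterized in that the method further comprises:
    when the amount of stored picture data in a partition for which no binary tree index has been built is greater than or equal to a second threshold, generating a binary tree index for that picture data in the partition.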
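  9. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory being configured to store at least one instruction, and the processor being configured to execute the at least one instruction to implement the data query method according to any one of claims 1 to 8.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction which, when executed by a processor, implements the data query method according to any one of claims 1 to 8.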
PCT/CN2017/119637 2017-12-25 2017-12-29 数据查询方法、电子设备及存储介质 WO2019127299A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711421701.2 2017-12-25
CN201711421701.2A CN110019876B (zh) 2017-12-25 2017-12-25 数据查询方法、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2019127299A1 (zh) 2019-07-04

Family

ID=67064331

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119637 WO2019127299A1 (zh) 2017-12-25 2017-12-29 数据查询方法、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN110019876B (zh)
WO (1) WO2019127299A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177440A (zh) * 2019-12-20 2020-05-19 北京旷视科技有限公司 目标图像的检索方法、装置、计算机设备和存储介质
CN111797256A (zh) * 2020-05-29 2020-10-20 平安普惠企业管理有限公司 基于多色图片的查询结果发送方法、装置和计算机设备
CN112069129A (zh) * 2020-09-18 2020-12-11 浪潮云信息技术股份公司 基于国产cpu的配置文件格式转换方法、装置及介质
CN112241672A (zh) * 2019-07-19 2021-01-19 杭州海康威视数字技术股份有限公司 一种身份数据的关联方法、装置、电子设备及存储介质
CN112364246A (zh) * 2020-11-19 2021-02-12 中国平安人寿保险股份有限公司 数据搜索方法、装置、计算机设备及存储介质
CN113268485A (zh) * 2021-06-02 2021-08-17 电信科学技术第十研究所有限公司 数据表关联分析方法、装置、设备及存储介质
CN116541420A (zh) * 2023-07-07 2023-08-04 上海爱可生信息技术股份有限公司 向量数据的查询方法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111159443B (zh) * 2019-12-31 2022-03-25 深圳云天励飞技术股份有限公司 一种图像特征值的搜索方法、装置及电子设备
CN111311094B (zh) * 2020-02-14 2022-05-24 上海东普信息科技有限公司 供应链项目分析方法、装置、设备及存储介质
CN113568956B (zh) * 2021-08-05 2024-06-18 武汉美之修行信息科技有限公司 一种近似产品查询方法、装置及计算机可读存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102436491A (zh) * 2011-11-08 2012-05-02 张三明 一种基于BigBase的海量图片搜索系统及方法
CN103092848A (zh) * 2011-10-28 2013-05-08 浙江大华技术股份有限公司 一种图片存储与检索方法
CN103164402A (zh) * 2011-12-08 2013-06-19 青岛海尔电子有限公司 文件显示的方法
CN105183746A (zh) * 2015-07-08 2015-12-23 西安交通大学 从多相关图片中挖掘显著特征实现图像检索的方法

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102902826B (zh) * 2012-11-08 2016-07-06 公安部第三研究所 一种基于基准图像索引的图像快速检索方法
CN103324682B (zh) * 2013-05-31 2017-06-06 百度在线网络技术(北京)有限公司 一种基于图片来搜索视频的方法及装置
CN105354273A (zh) * 2015-10-29 2016-02-24 浙江高速信息工程技术有限公司 一种快速检索高速公路逃费车辆高相似度图像的方法
US10210180B2 (en) * 2015-12-04 2019-02-19 Intel Corporation Hybrid nearest neighbor search tree with hashing table
CN106874513A (zh) * 2017-03-07 2017-06-20 维沃移动通信有限公司 一种照片查找方法及移动终端

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092848A (zh) * 2011-10-28 2013-05-08 浙江大华技术股份有限公司 一种图片存储与检索方法
CN102436491A (zh) * 2011-11-08 2012-05-02 张三明 一种基于BigBase的海量图片搜索系统及方法
CN103164402A (zh) * 2011-12-08 2013-06-19 青岛海尔电子有限公司 文件显示的方法
CN105183746A (zh) * 2015-07-08 2015-12-23 西安交通大学 从多相关图片中挖掘显著特征实现图像检索的方法

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112241672B (zh) * 2019-07-19 2024-05-03 杭州海康威视数字技术股份有限公司 一种身份数据的关联方法、装置、电子设备及存储介质
CN112241672A (zh) * 2019-07-19 2021-01-19 杭州海康威视数字技术股份有限公司 一种身份数据的关联方法、装置、电子设备及存储介质
CN111177440B (zh) * 2019-12-20 2023-11-07 北京旷视科技有限公司 目标图像的检索方法、装置、计算机设备和存储介质
CN111177440A (zh) * 2019-12-20 2020-05-19 北京旷视科技有限公司 目标图像的检索方法、装置、计算机设备和存储介质
CN111797256A (zh) * 2020-05-29 2020-10-20 平安普惠企业管理有限公司 基于多色图片的查询结果发送方法、装置和计算机设备
CN112069129B (zh) * 2020-09-18 2024-01-19 浪潮云信息技术股份公司 基于国产cpu的配置文件格式转换方法、装置及介质
CN112069129A (zh) * 2020-09-18 2020-12-11 浪潮云信息技术股份公司 基于国产cpu的配置文件格式转换方法、装置及介质
CN112364246A (zh) * 2020-11-19 2021-02-12 中国平安人寿保险股份有限公司 数据搜索方法、装置、计算机设备及存储介质
CN112364246B (zh) * 2020-11-19 2024-02-06 中国平安人寿保险股份有限公司 数据搜索方法、装置、计算机设备及存储介质
CN113268485A (zh) * 2021-06-02 2021-08-17 电信科学技术第十研究所有限公司 数据表关联分析方法、装置、设备及存储介质
CN113268485B (zh) * 2021-06-02 2024-02-09 电信科学技术第十研究所有限公司 数据表关联分析方法、装置、设备及存储介质
CN116541420A (zh) * 2023-07-07 2023-08-04 上海爱可生信息技术股份有限公司 向量数据的查询方法
CN116541420B (zh) * 2023-07-07 2023-09-15 上海爱可生信息技术股份有限公司 向量数据的查询方法

Also Published As

Publication number Publication date
CN110019876B (zh) 2023-07-28
CN110019876A (zh) 2019-07-16

Similar Documents

Publication Publication Date Title
WO2019127299A1 (zh) 数据查询方法、电子设备及存储介质
US11727053B2 (en) Entity recognition from an image
CN109101620B (zh) 相似度计算方法、聚类方法、装置、存储介质及电子设备
WO2017137000A1 (zh) 对描述同一实体的不同实例进行合并的方法、装置及设备
US8874597B2 (en) Method and system for text filtering based on semantic matching
US11409789B2 (en) Determining identity in an image that has multiple people
WO2021082100A1 (zh) 知识图谱的实体对齐方法、装置、设备及存储介质
WO2019137185A1 (zh) 一种图片筛选方法及装置、存储介质、计算机设备
JP6608972B2 (ja) ソーシャルネットワークに基づいてグループを探索する方法、デバイス、サーバ及び記憶媒体
US20180330190A1 (en) Anomaly detection using non-target clustering
CN109697641A (zh) 计算商品相似度的方法和装置
CN109992766B (zh) 提取目标词的方法和装置
CN111400504A (zh) 企业关键人的识别方法和装置
WO2019192120A1 (zh) 轨迹查询方法、电子设备及存储介质
CN108241646B (zh) 一种搜索匹配方法和装置、推荐方法和装置
CN111767420B (zh) 一种生成服饰搭配数据的方法和装置
WO2019127300A1 (zh) 数据存储方法、电子设备及存储介质
CN114444619B (zh) 样本生成方法、训练方法、数据处理方法以及电子设备
CN111667018B (zh) 一种对象聚类的方法、装置、计算机可读介质及电子设备
WO2017065795A1 (en) Incremental update of a neighbor graph via an orthogonal transform based indexing
CN110209895B (zh) 向量检索方法、装置和设备
CN116975018A (zh) 一种数据处理方法、装置、计算机设备以及可读存储介质
CN114547476B (zh) 一种基于二部图的社区搜索方法、装置以及处理设备
WO2014117296A1 (en) Generating a hint for a query
CN114238576A (zh) 数据匹配方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.10.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17936041

Country of ref document: EP

Kind code of ref document: A1