CN110865992A - Retrieval library management method, retrieval device and retrieval medium - Google Patents

Retrieval library management method, retrieval device and retrieval medium

Info

Publication number
CN110865992A
Authority
CN
China
Prior art keywords
data
hot
cold
search library
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911044479.8A
Other languages
Chinese (zh)
Inventor
李明耀
韦跃明
严石伟
蒋楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd
Priority to CN201911044479.8A
Publication of CN110865992A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a search library management method, a retrieval method, a device and a medium. The search library management method comprises the following steps: acquiring attribute parameters of each piece of hot data when the total number of pieces of data in the hot search library is greater than a first preset scale threshold; migrating the hot data whose attribute parameters meet the hot data management condition to a cold search library; calculating first matching degrees between each piece of cold data stored in the cold search library and each piece of hot data; and migrating cold data meeting the cold data management condition to the hot search library. By separating the cold and hot data of the search library and triggering the upgrade and downgrade logic between the cold and hot libraries, the method keeps the scale of the hot search library within the threshold range, which improves the validity of the search data as well as retrieval reliability, accuracy and efficiency.

Description

Retrieval library management method, retrieval device and retrieval medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, and a medium for managing a search library.
Background
At present, identity recognition and retrieval based on artificial intelligence technology have already been deployed and popularized to a certain extent in application scenarios such as intelligent retail, intelligent communities, intelligent finance and public security.
In these application scenarios, in order to improve retrieval efficiency, the search library used for identity recognition is generally kept resident in a processor with good concurrency capability and memory access speed. Taking a face retrieval scenario as an example, the face database is usually stored in Graphics Processing Unit (GPU) memory. However, in these application scenarios, as new identity users are added, a large amount of high-performance and expensive processor (e.g., GPU) resources is consumed, and the storage cost of retrieval increases greatly. In addition, the retrieval reliability, accuracy and efficiency of existing retrieval methods need to be further improved.
Disclosure of Invention
The application provides a retrieval library management method, a retrieval device and a retrieval medium, which are used for solving at least one technical problem.
In one aspect, the present application provides a method for managing a search library, the search library including a hot search library and a cold search library, the method including:
acquiring attribute parameters of each piece of hot data stored in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library;
calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively;
and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library.
In another aspect, the present application further provides a retrieval method, which performs retrieval using a data-managed search library, where the search library includes a hot search library and a cold search library, and the method includes:
acquiring characteristic data to be retrieved of an object;
retrieving the characteristic data to be retrieved in the managed hot search library to obtain a retrieval result;
returning the retrieval result;
the search library performs data management by the following search library management method, which comprises the following steps:
acquiring attribute parameters of each piece of hot data in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library;
calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively;
and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library.
In another aspect, the present application further provides a search library management apparatus, the search library including a hot search library and a cold search library, the apparatus including:
the attribute acquisition module is used for acquiring attribute parameters of each piece of hot data stored in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
the first migration module is used for migrating the hot data meeting the hot data management condition to the cold search library if the attribute parameters of the hot data meet the hot data management condition;
the first calculation module is used for calculating a first matching degree between each cold data stored in the cold search library and each hot data in the hot search library;
and the second migration module is used for migrating the cold data meeting the cold data management condition to the hot search library if the first matching degree is determined to meet the cold data management condition.
In another aspect, the present application further provides a retrieval apparatus, including:
the characteristic management module is used for acquiring characteristic data to be retrieved of the object;
the above-mentioned search library management device is used for carrying out data management on a search library, and the search library comprises a hot search library and a cold search library;
the retrieval module is used for retrieving the characteristic data to be retrieved in a data-managed hot retrieval library to obtain a retrieval result;
and the returning module is used for returning the retrieval result.
Another aspect further provides a search library management apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by the processor to implement any of the above described search library management methods.
Another aspect further provides a retrieval device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which are loaded and executed by the processor to implement any one of the retrieval methods described above.
Yet another aspect provides a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions for being loaded by a processor and executing a search library management method as described in any of the above.
A further aspect provides a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes or set of instructions for being loaded by a processor and executing a retrieval method as described in any of the above.
The retrieval library management method, the retrieval device and the retrieval medium have the following technical effects:
the search library comprises a hot search library and a cold search library, and the attribute parameters of each piece of hot data stored in the hot search library are acquired under the condition that the total number of the hot data stored in the hot search library is greater than a first preset scale threshold; if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library; calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively; and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library. Therefore, by separating the cold and hot data of the search library, and under the condition that the total number of the hot data stored in the hot search library is larger than the first preset scale threshold, the cold and hot library upgrading and downgrading logic is triggered, the scale of the hot search library is ensured to be within the threshold range, the effectiveness of the hot data in the hot search library is improved, and the search reliability, the accuracy and the search efficiency of the managed hot search library are improved.
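The management pass summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `manage_search_library`, the dictionary-based libraries, and the three caller-supplied callables (`should_demote`, `match`, `should_promote`) are all hypothetical, because the hot and cold data management conditions are left configurable in the text.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def manage_search_library(
    hot: Dict[str, Vector],
    cold: Dict[str, Vector],
    attrs: Dict[str, dict],
    size_threshold: int,
    should_demote: Callable[[dict], bool],
    match: Callable[[Vector, Vector], float],
    should_promote: Callable[[List[float]], bool],
) -> None:
    """Run one management pass over the hot and cold libraries, in place."""
    # Trigger: only manage when the hot library exceeds the size threshold.
    if len(hot) <= size_threshold:
        return

    # Assumption: snapshot the cold entries first, so that items demoted in
    # this pass are not immediately reconsidered for promotion.
    cold_before = list(cold.items())

    # Demotion: move hot entries whose attribute parameters meet the
    # (caller-defined) hot data management condition to the cold library.
    for key in [k for k in hot if should_demote(attrs[k])]:
        cold[key] = hot.pop(key)

    # Promotion: compute the first matching degrees of each cold entry
    # against every remaining hot entry, then apply the (caller-defined)
    # cold data management condition.
    for key, vec in cold_before:
        degrees = [match(vec, h) for h in hot.values()]
        if should_promote(degrees):
            hot[key] = cold.pop(key)
```

Passing the conditions as callables keeps the sketch neutral about what "meets the condition" means; a caller could plug in a timestamp-and-activity rule for demotion and a similarity rule for promotion.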
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a search library management method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating the steps, provided by an embodiment of the present application, for migrating hot data to a cold search library;
FIG. 4 is a schematic flowchart for migrating hot data to a cold search library according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating steps of data management for a hot search library according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating the database cleaning step of the cold search library provided in an embodiment of the present application;
FIG. 7 is a flowchart illustrating steps of data management for a cold search library according to an embodiment of the present application;
FIG. 8 is a block diagram illustrating a structure of a search library management apparatus according to an embodiment of the present application;
FIG. 9 is a schematic flowchart of a retrieval method provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of a framework for implementing a retrieval method based on a face retrieval scene according to the present application;
FIG. 11 is a block diagram illustrating a structure of a retrieval apparatus according to an embodiment of the present application;
FIG. 12 is a hardware structural diagram of an apparatus for implementing the method provided by the embodiment of the present application.
Detailed Description
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
First, technical terms that may be involved in the present invention are briefly described:
Face recognition: the process of extracting face features and comparing their similarity. Face recognition is mainly divided into 1:1 face verification (Face Verification), 1:N face recognition (Face Recognition) and face retrieval (Face Retrieval).
1:N face recognition: finding, in a large-scale face database, the one or more faces with the highest similarity to the face to be retrieved.
SDK: a Software Development Kit (SDK) is generally a collection of development tools used by software engineers to build application software for a particular software package, software framework, hardware platform, service system, etc.
CPU/GPU: the CPU is the central processing unit and the GPU is the graphics processing unit; both are computational resources for video multimedia.
At present, identity recognition and retrieval have been deployed and popularized to a certain extent in application scenarios such as intelligent retail, intelligent communities, intelligent finance and public security. In these application scenarios, in order to improve retrieval efficiency, the search library used for identity recognition is generally kept resident in a processor with good concurrency capability and memory access speed. Existing retrieval methods are mainly suitable for application scenarios with a fixed search library, that is, scenarios in which the search library is registered in advance (such as access control and attendance). For scenarios in which the search library is not fixed (such as intelligent retail and intelligent communities), the scale of the search library keeps growing as large numbers of new identity users are added, which poses a great challenge to existing retrieval methods.
Taking a face retrieval scenario as an example, the face database is usually stored in Graphics Processing Unit (GPU) memory. However, in these application scenarios, as new identity users are added, a large amount of high-performance processor (e.g., GPU) resources is consumed, and the storage cost of retrieval increases greatly. In addition, the retrieval reliability, accuracy and efficiency of existing retrieval methods need to be further improved.
In addition, the inventors have also found the following. In the identity recognition process, identity retrieval is usually performed in a search library by using a retrieval algorithm, but the retrieval accuracy of the algorithm depends on the scale of the search library. The larger the search library, the higher the requirements on the model precision and computing performance of the retrieval algorithm. For example, in a face retrieval scenario, as the number of user identities in the search library increases, the accuracy of a general face recognition algorithm may decrease significantly. Table 1 below shows the first-hit rate for faces obtained with the same retrieval algorithm model at different search library sizes, at an error rate of one in a thousand.
TABLE 1 Relationship between search accuracy and search library size
Search library size (number of people)    100     500     1000
Accuracy (%)                              100     99.2    78.4
In order to solve at least one of the above technical problems, the present application provides a search library management method, a search method, an apparatus, and a medium.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, a schematic diagram of an implementation environment provided by an embodiment of the present application is shown. The implementation environment may include: a terminal 10, and a server 20 connected to the terminal 10 through a network.
The terminal 10 may specifically include software running in a physical device, such as an application installed on the device, and may also include at least one of a smart phone, a desktop computer, a tablet computer, a notebook computer, a digital assistant, a smart wearable device, and the like, which are installed with the application. Specifically, the terminal 10 runs an operating system, which may be a desktop operating system such as a Windows (Windows) operating system, a Linux operating system, or a Mac OS (apple desktop operating system), or a mobile operating system such as an iOS (apple mobile terminal operating system) or an Android (Android) operating system.
The server 20 may be an independent server, a server cluster composed of a plurality of independent servers, a distributed server, or a cloud server providing basic cloud computing services such as cloud computing, cloud databases and cloud storage. The distributed server may specifically have a blockchain structure, and any node in the blockchain structure may execute or participate in executing the search library management method or the retrieval method.
It should be understood that the implementation environment shown in FIG. 1 is only one application environment of the present application and does not limit the application environment of the present application; other application environments may include more or fewer computer devices than shown in the drawing, or a different network connection relationship between the computer devices.
A specific embodiment of the search library management method of the present application is described below. FIG. 2 is a flowchart of a search library management method according to an embodiment of the present application. The present specification provides the method steps as described in the embodiment or the flowchart, but more or fewer steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only order of execution. As shown in FIG. 2, the execution subject of the method may be the server in the above application environment, the search library includes a hot search library and a cold search library, and the method may include:
S201: acquiring the attribute parameters of each piece of hot data stored in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold.
In the present embodiment, the hot search library is a search base library for performing a search. The hot search library stores a large amount of hot data for searching comparison, and the hot data includes but is not limited to the identity of the user and basic data of the user. The basic data of the user includes object data such as a face image, a face feature image, a voiceprint, an iris, and the like.
For example, taking a face detection scenario: if user A enters an intelligent retail store, an acquisition device captures a face image of user A, and the features of the face image are then retrieved in the hot search library to determine whether corresponding face image features already exist there. If so, user A has visited before. If not, user A is a new visitor; an identity ID needs to be registered for the user, and the registered identity ID and the corresponding face image features are stored in the hot search library in association with each other as new hot data.
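The lookup-or-register flow in this example can be sketched as below. It is only an illustration under assumptions that the text does not fix: features are plain float vectors, matching uses cosine similarity with a hypothetical acceptance `threshold` of 0.8, and new identity IDs are random UUIDs. The names `cosine` and `lookup_or_register` are hypothetical.

```python
import math
import uuid
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def lookup_or_register(
    hot: Dict[str, List[float]],
    feature: Sequence[float],
    threshold: float = 0.8,  # assumed acceptance threshold
) -> Tuple[str, bool]:
    """Return (identity_id, is_new) for a captured face feature."""
    best_id, best_score = None, -1.0
    for identity_id, stored in hot.items():
        score = cosine(feature, stored)
        if score > best_score:
            best_id, best_score = identity_id, score

    # A sufficiently similar stored feature means a returning visitor.
    if best_id is not None and best_score >= threshold:
        return best_id, False

    # Otherwise register a new identity and store it as new hot data.
    new_id = uuid.uuid4().hex
    hot[new_id] = list(feature)
    return new_id, True
```

The linear scan is chosen for readability; a production system would more likely use a GPU-resident or approximate nearest-neighbor index.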
The cold search library is used for storing cold data to be managed. When a preset management condition is met, the cold data in the cold search library and the hot data in the hot search library are managed, where the management includes but is not limited to data migration, data deletion, data merging, dumping data to disk, data synchronization and the like.
In application scenarios such as intelligent retail, intelligent communities, intelligent finance and public security, there are large numbers of users who keep visiting. Identifying these users generally involves running a retrieval algorithm against a database. As the inventors have found, the accuracy of the retrieval algorithm is related to the size of the database. When the amount of data in the search library exceeds the first preset scale threshold, it exceeds the computing capacity of the retrieval algorithm and the accuracy of the retrieval result is low. Therefore, to improve retrieval accuracy, the first preset scale threshold is adjusted adaptively according to the theoretical accuracy of the retrieval algorithm, the theoretical precision of the retrieval model and the theoretical computing capacity.
When the total number of pieces of hot data stored in the hot search library is greater than the first preset scale threshold, management of the hot search library is triggered. At this point, the attribute parameters of each piece of hot data stored in the hot search library need to be acquired in order to screen out the hot data that needs to be managed. The attribute parameters of the hot data include at least one of a recording time of the hot data and a user activity associated with the hot data. For example, the recording time may be the recorded visit time of the user associated with the hot data, and the activity may be based on the number of times the user visits within a preset time period (e.g., a week, a month, half a year, a year, etc.).
It should be noted that the value of the first preset size threshold is not fixed, and may be adaptively configured according to a search algorithm, so as to match different search algorithms and improve search performance.
In practical application, taking a face detection scenario as an example, if the rated base library size at which the specified face retrieval algorithm meets the required retrieval accuracy is 10,000, the first preset scale threshold may be configured to be 10,000. Then, when the total number of pieces of hot data stored in the current hot search library is detected to be greater than the first preset scale threshold, database management of the hot search library is triggered. During database management, the attribute parameters (such as the storage time and activity of the hot data) of each piece of hot data stored in the hot search library can be acquired outside normal working hours (such as at night, in the early morning or in other periods with few retrieval requests), and the hot data meeting the preset conditions are then managed based on the attribute parameters. This avoids affecting the normal services for adding, deleting and modifying face features during working hours, and makes full use of idle machine resources at night.
In other embodiments, if the total number of pieces of hot data stored in the hot search library is determined to be less than the first preset scale threshold k1, deep data management of the hot search library may not be triggered for the time being; instead, it is further determined whether the total number of pieces of hot data stored in the hot search library is greater than a third preset scale threshold k3 (k3 is smaller than k1), and if so, data management of the hot search library is triggered. This reduces the total amount of data in the hot search library and further improves the efficiency and accuracy of retrieval based on the hot search library.
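The two-threshold trigger and the off-peak scheduling described above might be expressed as in the sketch below. The function names, the labels "deep" and "regular", and the default 01:00 to 05:00 window are assumptions made for illustration; the text only names the thresholds k1 and k3 and mentions night or early-morning hours. The case where the total equals k1 is not specified in the text and is treated here as not exceeding k1.

```python
from datetime import time
from typing import Optional


def management_level(total: int, k1: int, k3: Optional[int] = None) -> Optional[str]:
    """Decide which kind of management pass, if any, should be triggered."""
    if k3 is not None and k3 >= k1:
        raise ValueError("k3 must be smaller than k1")
    if total > k1:
        return "deep"      # hot library exceeds the first threshold k1
    if k3 is not None and total > k3:
        return "regular"   # at or below k1 but above the third threshold k3
    return None            # no management needed


def in_off_peak(now: time, start: time = time(1, 0), end: time = time(5, 0)) -> bool:
    """True if `now` falls in the off-peak window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps around midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end
```

With k1 = 10,000 as in the example above, `management_level(10001, 10000)` selects a deep pass, while a library of 9,000 entries triggers a pass only if a lower k3 has been configured.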
S203: and if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library.
After data management of the hot search library is triggered and the attribute parameters of each piece of hot data in the hot search library are acquired, the hot data that needs to be managed is screened out from all the hot data, and corresponding data management is then performed on the screened hot data.
The attribute parameters of the hot data include at least one of a last recording timestamp of the hot data and a user activity associated with the hot data.
In the present embodiment, as shown in fig. 3 and 4, the attribute parameters of the hot data include a last recording time stamp of the hot data and a user activity associated with the hot data. If it is determined that the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library, including:
S301: if the time interval between the last recording timestamp of the hot data and a first preset time is greater than or equal to a first preset threshold, and the user activity associated with the hot data is less than a second preset threshold, determining that the attribute parameters of the hot data meet the hot data management condition.
The last recording timestamp is the time at which the most recent visit of the user identity associated with the hot data was recorded. The first preset time includes, but is not limited to, the current management time, some other fixed time, and the like. The first preset time can be configured according to how users visit in the actual application scenario. For example, in a scenario where visits are frequent, such as a smart community scenario in which users need to enter and leave the community frequently, the first preset time may be configured to be a small value, such as half a month, a quarter, or a year. In a scenario where visits are infrequent, such as an intelligent retail scenario in which users only occasionally enter the retail store, the first preset time may be configured to be a large value, such as a quarter, half a year, a year, and the like.
The user activity is an activity index of the user corresponding to the hot data within a second preset time period. Specifically, the user activity may be determined according to the user's number of active times and the second preset time period. The number of active times can be the cumulative number of visits, the cumulative number of uses, and so on. The second preset time period and the number of active times can be configured according to how users visit in the actual application scenario. For example, in a scenario with relatively frequent visits, such as a smart community scenario in which users need to enter and leave the community frequently, the second preset time period may be configured to be a smaller value and the number of active times a larger value; for instance, the second preset time period may be the last three days, the last week, the last half month, and the like, and the corresponding number of active times can be set as appropriate, for example to 5-20 times. In a scenario with infrequent visits, such as a smart retail scenario in which users only occasionally enter the retail store, the second preset time period may be configured to be a larger value, such as the last half month or the last month; the corresponding number of active times can be set as appropriate, for example to 3-8 times.
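One simple reading of the user activity just described is a count of visits inside a trailing time window. The sketch below assumes that reading; the function name `user_activity` and the representation of visits as a list of timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable


def user_activity(visits: Iterable[datetime], now: datetime, window: timedelta) -> int:
    """Count the visits that fall within the trailing window (now - window, now]."""
    start = now - window
    return sum(1 for t in visits if start < t <= now)
```

For a smart community the window might be a few days, and for a retail store a month, in line with the ranges suggested above.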
For example, in a smart retail scenario, within a preset management period, the last recording timestamp of the hot data corresponding to the user identifier of user B in the hot search library is t1, and the cumulative number of visits by user B to the smart retail store in the last month is m. If the time interval between t1 and the current query is greater than a first preset threshold (for example, one year) and m is less than a second preset threshold (for example, 5 times), it is determined that the attribute parameters of the hot data corresponding to user B satisfy the hot data management condition.
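The check in this example can be sketched in Python as follows. The record layout (`HotRecord`), the function name, and the default thresholds are illustrative assumptions. The sketch uses "greater than or equal to" for the time interval, as in the unit description of the apparatus embodiment, whereas the example above says "greater than"; either comparison could be substituted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HotRecord:
    user_id: str
    last_recorded: datetime  # last recording timestamp (t1 in the example)
    visits_in_window: int    # visits in the second preset time period (m)


def meets_hot_management_condition(
    record: HotRecord,
    reference_time: datetime,                   # first preset time, e.g. the current management time
    max_idle: timedelta = timedelta(days=365),  # first preset threshold (assumed: one year)
    min_visits: int = 5,                        # second preset threshold (assumed: 5 visits)
) -> bool:
    """Return True when the record is both stale and inactive."""
    idle = reference_time - record.last_recorded
    return idle >= max_idle and record.visits_in_window < min_visits
```

Records for which the function returns True correspond to the hot data to be managed, that is, the candidates for migration to the cold search library.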
In other embodiments, the attribute parameters of the hot data may include only the last recording timestamp of the hot data, or only the user activity associated with the hot data. For example, in a security scenario, the attribute parameters of the hot data may include only the last recording timestamp.
The hot search library screens the identities corresponding to the hot data by the last recording timestamp of the hot data and/or the activity of the associated user, which keeps the retrieval results timely and reliable.
S303: Determining the hot data that meets the hot data management condition as hot data to be managed.
S305: Migrating the hot data to be managed to the cold search library.
When the preset management period is reached, the hot data to be managed is migrated to the cold search library. The preset management period includes, but is not limited to, one day, one week, one month, and so on.
In one embodiment, migrating the hot data to be managed to the cold search library includes:
S3051: Calculating second matching degrees between the hot data to be managed and each piece of cold data in the cold search library.
The second matching degree represents the degree of similarity between the hot data to be managed and each piece of cold data in the cold search library. There are a plurality of second matching degrees, which can be determined by calculating the similarity between the hot data to be managed and every piece of cold data. The similarity includes, but is not limited to, cosine similarity.
S3053: Judging whether the maximum of the plurality of second matching degrees corresponding to the hot data to be managed is greater than or equal to a third preset threshold.
S3055: If so, merging the hot data to be managed with the cold data whose maximum second matching degree is greater than or equal to the third preset threshold;
S3057: If not, storing the hot data to be managed, whose maximum second matching degree is smaller than the third preset threshold, in the cold search library.
The third preset threshold is a preset value and may be configured to, but is not limited to, any value from 0.8 to 1. Data merging refers to merging the hot data to be managed into the storage location of the corresponding cold data. If the maximum of the plurality of second matching degrees corresponding to the hot data to be managed is greater than or equal to the third preset threshold, the cold search library already stores cold data similar to the hot data to be managed; the hot data to be managed is then merged with that cold data and deleted from the hot search library. Correspondingly, if the maximum second matching degree is smaller than the third preset threshold, the hot data to be managed is new cold data and is stored directly in the cold search library.
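Steps S3051 to S3057 can be sketched in Python as below, assuming that each library is an in-memory mapping from identity ID to feature vector and that the matching degree is cosine similarity. The function names, the 0.9 default threshold, and the merge policy (the migrated feature overwrites the feature stored at the matched cold location) are assumptions for illustration; the application does not specify how merged features are combined.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def demote_to_cold(
    hot_id: str,
    hot_lib: Dict[str, Vector],
    cold_lib: Dict[str, Vector],
    threshold: float = 0.9,  # third preset threshold (assumed value in 0.8-1)
) -> str:
    """Move one hot entry to the cold library; return the cold ID it ends up under."""
    feature = hot_lib.pop(hot_id)  # the entry leaves the hot library in both branches

    # S3051/S3053: find the cold entry with the maximum second matching degree.
    best_id: Optional[str] = None
    best_score = -1.0
    for cold_id, cold_feature in cold_lib.items():
        score = cosine_similarity(feature, cold_feature)
        if score > best_score:
            best_id, best_score = cold_id, score

    if best_id is not None and best_score >= threshold:
        # S3055: merge into the matched cold location (assumed policy: overwrite).
        cold_lib[best_id] = feature
        return best_id

    # S3057: no sufficiently similar cold entry, so store as new cold data.
    cold_lib[hot_id] = feature
    return hot_id
```

The linear scan is used only for clarity; a production system would presumably use a vector index or batched similarity computation.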
In the present application, the hot library demotion strategy is triggered by the first preset scale threshold of the hot search library, and identities with poor timeliness sink to the cold search library. This keeps the scale of the hot search library controlled and its identities current, which improves retrieval accuracy and speed. Because the demotion strategy ensures that the hot data in the retrieval base library has already been filtered by recency and activity, the returned retrieval results are the object information that corresponds most closely to the query identity.
S205: Calculating first matching degrees between each piece of cold data stored in the cold search library and each piece of hot data in the hot search library.
The first matching degree represents the degree of similarity between each piece of cold data stored in the cold search library and each piece of hot data in the hot search library. There may be a plurality of first matching degrees, which can be determined by calculating the similarity between all cold data and all hot data. The similarity includes, but is not limited to, cosine similarity.
S207: If it is determined that the first matching degree meets the cold data management condition, migrating the cold data that meets the cold data management condition to the hot search library.
The identities in the cold search library are used to search the hot search library: the first matching degrees between all cold data in the cold search library and all hot data in the hot search library are calculated, and if any cold data meets the cold data management condition, library promotion management is performed on that cold data.
In this embodiment, as shown in fig. 5, migrating the cold data that meets the cold data management condition to the hot search library if it is determined that the first matching degree meets the cold data management condition includes:
S401: If, among the plurality of first matching degrees corresponding to a piece of cold data, the maximum first matching degree is greater than or equal to a fourth preset threshold, determining that the first matching degree meets the cold data management condition.
S403: Determining the cold data that meets the cold data management condition as cold data to be managed.
S405: Migrating the cold data to be managed to the hot search library.
The fourth preset threshold may be a preset value and may be configured to, but is not limited to, any value from 0.8 to 1. The identities in the cold search library are used to search the hot search library. If, among the plurality of first matching degrees corresponding to a piece of cold data, the maximum first matching degree is greater than or equal to the fourth preset threshold, the hot search library stores hot data similar to that cold data; that is, the user corresponding to the identity of the cold data has visited recently, so the user's identity information has been registered in the hot search library. The cold data that meets the cold data management condition is then determined as cold data to be managed and migrated to the hot search library. For example, in one case, the user identifier in the cold search library may be directly overwritten with the user identifier in the hot search library, and the cold data corresponding to that identity may be deleted from the cold search library. In another case, the cold data corresponding to the identity may be deleted from the cold search library directly.
Correspondingly, if the maximum first matching degree is smaller than the fourth preset threshold, the user corresponding to the identity of the cold data has not visited again recently. In this case, the cold data is not migrated and remains stored in the cold search library.
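Steps S401 to S405 can be sketched in Python as follows, again assuming in-memory mappings from identity ID to feature vector and cosine similarity as the matching degree. The sketch follows the second case described above: a matched cold entry is deleted from the cold search library, and the returned mapping tells the caller which hot identity it matched so that identifiers can be overwritten if desired. All names and the 0.9 default threshold are illustrative assumptions.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def promote_from_cold(
    cold_lib: Dict[str, Vector],
    hot_lib: Dict[str, Vector],
    threshold: float = 0.9,  # fourth preset threshold (assumed value in 0.8-1)
) -> Dict[str, str]:
    """Remove matched cold entries; return {cold_id: matched_hot_id}."""
    matched: Dict[str, str] = {}
    if not hot_lib:
        return matched  # nothing to match against

    for cold_id in list(cold_lib):  # copy the keys because entries are deleted below
        cold_feature = cold_lib[cold_id]
        # First matching degrees of this cold entry against every hot entry.
        scores = {h: cosine_similarity(cold_feature, f) for h, f in hot_lib.items()}
        best_hot_id = max(scores, key=scores.get)
        if scores[best_hot_id] >= threshold:  # S401: condition met
            del cold_lib[cold_id]             # S405: the identity now lives in the hot library
            matched[cold_id] = best_hot_id
    return matched
```

Cold entries whose best match stays below the threshold are left untouched, which corresponds to the case where the user has not visited again recently.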
The search library comprises a hot search library and a cold search library, and the attribute parameters of each piece of hot data stored in the hot search library are acquired under the condition that the total number of the hot data stored in the hot search library is greater than a first preset scale threshold; if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library; calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively; and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library. Therefore, by separating the cold and hot data of the search library, and under the condition that the total number of the hot data stored in the hot search library is greater than the first preset scale threshold, the cold and hot library upgrading and downgrading logic is triggered, the scale of the hot search library is ensured to be within the threshold range, the effectiveness of the hot data in the hot search library is improved, and the search reliability, the accuracy and the search efficiency of the managed hot search library are improved.
In some embodiments, as shown in fig. 6, the method may further comprise:
S501: Detecting the total number of cold data stored in the cold search library;
S503: If the detected total number of cold data is greater than a second preset scale threshold, taking the difference between the detected total number and the second preset scale threshold as the number N to be transferred, where N is a positive integer;
S505: Sorting all cold data in the cold search library by last recording time;
S507: Transferring the first N pieces of cold data in the sorted order to a disk.
The second preset scale threshold can be determined according to the storage performance of the cold search library. By way of example, the second preset scale threshold includes, but is not limited to, 100,000, 500,000, 1,000,000, and so on.
Specifically, after the cold library promotion process is completed, the total number of cold data currently in the cold search library is checked. If the total reaches the second preset scale threshold (which can be configured to, for example, 1,000,000), the cold library cleaning module is triggered: the identities remaining in the cold search library are sorted by last visit time, and the cold search library is cleaned. Cleaning mainly consists of writing the cold data whose last visit is the oldest (for example, the top N) from memory to disk and deleting it from memory, which keeps the scale of the cold search library controllable. The cold library information written to disk serves mainly as a backup; the service can select the required identity IDs by time or other dimensions for offline processing, or reactivate specific IDs so that they re-enter the hot search library.
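Steps S501 to S507 can be sketched in Python as below. The sketch assumes that each cold entry is a dictionary holding a numeric `last_recorded` timestamp and a `feature` list, and that the disk backup is a JSON Lines file; the entry layout, the file format, and the function name are illustrative assumptions rather than details taken from the application.

```python
import json
from pathlib import Path
from typing import Dict, List


def clean_cold_library(cold_lib: Dict[str, dict], max_size: int, backup_path: Path) -> List[str]:
    """Evict the oldest entries so that at most max_size remain; return the evicted IDs."""
    n = len(cold_lib) - max_size  # S503: number N to be transferred
    if n <= 0:
        return []

    # S505: sort identities by last recording time, oldest first.
    oldest_first = sorted(cold_lib, key=lambda k: cold_lib[k]["last_recorded"])
    evicted = oldest_first[:n]

    # S507: append the evicted entries to the disk backup and drop them from memory.
    with backup_path.open("a", encoding="utf-8") as f:
        for identity in evicted:
            entry = cold_lib.pop(identity)
            f.write(json.dumps({"id": identity, **entry}) + "\n")
    return evicted
```

Because the backup is appended to rather than overwritten, earlier evictions stay available for offline processing or reactivation.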
In some embodiments, the hot search library is stored in a memory corresponding to an image processor or a sound processor for matching an object to be searched, and the cold search library is stored in a memory corresponding to a central processor;
the object to be retrieved includes but is not limited to at least one of the following: face, audio, and iris.
For example, taking a face retrieval scenario, if there are N face identities in the search library, the M face identities that meet the timeliness requirement may be stored in the video memory of the graphics processing unit (GPU), and the remaining N-M face identities may be stored as cold data in the memory of the central processing unit (CPU). Storing cold and hot data separately in this way greatly reduces the dependence on, and demand for, high-performance GPUs and lowers the storage cost of the search library.
In the present application, the search library is separated into hot and cold libraries, so online requests are not affected during library management. Time resources are fully used: the hot search library serves online retrieval during the day, and the cold search library is used to search the hot search library at night. Meanwhile, the hot search library, which has higher query heat, is stored in GPU memory, and the cold search library, which holds inactive identities, is stored in CPU memory, with the scale of both libraries kept within specified thresholds. When the scale of the hot search library reaches its upper threshold, the upgrade and downgrade flow between the cold and hot libraries is triggered; when the scale of the cold search library reaches its upper threshold, the library cleaning logic is triggered. The CPU, GPU, and disk resources of the machine are thus used to the fullest, the demand for costly GPUs is reduced, and the retrieval cost can be lowered.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Please refer to fig. 8, which shows a block diagram of a search library management apparatus according to an embodiment of the present application. The device has the function of realizing the server side in the above method example, and the function can be realized by hardware or by hardware executing corresponding software. The search repositories include a hot search repository and a cold search repository, and the apparatus 60 may include:
an attribute obtaining module 601, configured to obtain an attribute parameter of each piece of thermal data stored in the thermal search library when a total number of pieces of thermal data stored in the thermal search library is greater than a first preset size threshold;
a first migration module 602, configured to, if it is determined that the attribute parameter of the hot data meets the hot data management condition, migrate the hot data meeting the hot data management condition to the cold repository;
a first calculating module 603, configured to calculate a first matching degree between each piece of cold data stored in the cold repository and each piece of hot data in the hot repository;
a second migration module 604, configured to migrate the cold data meeting the cold data management condition to the hot repository if it is determined that the first matching degree meets the cold data management condition.
In some embodiments, the attribute parameters of the thermal data include at least one of a last recording timestamp of the thermal data and a user activity associated with the thermal data.
In some embodiments, where the attribute parameters of the hot data include a last recording timestamp of the hot data and a user activity associated with the hot data, the first migration module comprises:
the first determining unit is used for determining that the attribute parameters of the hot data meet hot data management conditions if the time interval between the latest recording time stamp of the hot data and first preset time is judged to be greater than or equal to a first preset threshold value and the user activity degree associated with the hot data is judged to be less than a second preset threshold value;
a second determination unit configured to determine thermal data satisfying a thermal data management condition as thermal data to be managed;
the first migration unit is used for migrating the hot data to be managed to the cold search library.
In some embodiments, the first migration unit comprises:
the calculating subunit is used for calculating a second matching degree between the hot data to be managed and each cold data in the cold search library;
the judging unit is used for judging whether the maximum second matching degree is larger than or equal to a third preset threshold value or not in a plurality of second matching degrees corresponding to the thermal data to be managed;
the data merging subunit is used for merging the hot data to be managed and the cold data of which the maximum second matching degree is greater than or equal to a third preset threshold value if the judgment result is positive;
and the storage subunit is used for storing the hot data to be managed, of which the maximum second matching degree is smaller than a third preset threshold value, to the cold search library if the judgment result is negative.
In some embodiments, the second migration module comprises:
a third determining unit, configured to determine that the first matching degree meets the cold data management condition if, among the plurality of first matching degrees corresponding to each piece of cold data, the maximum first matching degree is greater than or equal to a fourth preset threshold;
a fourth determination unit configured to determine cold data satisfying a cold data management condition as cold data to be managed;
and the second migration unit is used for migrating the cold data to be managed to the hot search library.
In some embodiments, the apparatus further comprises:
the detection module is used for detecting the total number of cold data stored in the cold search library;
the quantity determining module is used for taking the difference value between the total number of the detected cold data and a second preset scale threshold value as the quantity N to be transferred if the total number of the detected cold data is larger than the second preset scale threshold value, wherein N is a positive integer;
the sorting module is used for sorting all cold data in the cold search library by last recording time;
and the unloading module is used for transferring the first N pieces of cold data in the sorted order to a disk.
In some embodiments, the hot search library is stored in a memory corresponding to an image processor or a sound processor for matching an object to be searched, and the cold search library is stored in a memory corresponding to a central processor;
the object to be retrieved comprises at least one of the following: face, audio, and iris.
The embodiment of the present application provides a retrieval library management apparatus, which may include a processor and a memory, where the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the retrieval library management method provided by the above method embodiment.
The present application further provides a computer-readable storage medium, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored, and the at least one instruction, at least one program, a set of codes, or a set of instructions is loaded by a processor and executes any one of the above-mentioned retrieval library management methods.
Specific embodiments of a retrieval method of the present application are described below. Fig. 9 is a flow chart of a retrieval method provided by an embodiment of the present application. The present specification provides the method operation steps as described in the embodiments or the flow chart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. As shown in fig. 9, the executing subject of the method may be a server in the application environment, and retrieval is performed using a data-managed search library, where the search library includes a hot search library and a cold search library. The method may include:
S701: Acquiring the feature data to be retrieved of an object.
In this embodiment, the object may be a face, an audio, an iris, and the like of the user.
In an example, taking face retrieval as an example, a face image of a user is acquired through acquisition equipment, and then the face image is subjected to image processing to obtain feature data to be retrieved.
The sources of the face image include, but are not limited to, face snapshots captured by a camera, identity card self-photographs, and original video frames. The image processing includes, but is not limited to, image preprocessing, face detection and registration, and face feature extraction. The image preprocessing includes picture noise reduction and the like. Face detection obtains a face frame, and face registration obtains the five-point information of the face. Face feature extraction can use a face feature extraction SDK to calculate features from the five face points, yielding the feature data to be retrieved.
S703: Searching for the feature data to be retrieved in a data-managed hot search library to obtain a retrieval result.
S705: Returning the retrieval result.
The search library is data-managed by at least one of the search library management methods described above.
In some embodiments, the retrieval method may further include the step of managing a database, and the managing the database may include:
acquiring attribute parameters of each piece of thermal data in the thermal search library under the condition that the total number of the pieces of thermal data stored in the thermal search library is greater than a first preset scale threshold;
if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library;
calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively;
and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library.
It should be noted that specific contents and advantageous effects of managing the database are similar to those of the above embodiments, and are not described herein again.
Illustratively, after the feature data to be retrieved is determined, a similarity search is performed on it in the data-managed hot search library using a conventional retrieval algorithm model trained by machine learning. If hot data whose similarity reaches a preset threshold is found, that hot data is returned as the retrieval result. If no hot data whose similarity reaches the preset threshold is found, the user corresponding to the feature data to be retrieved is a new-identity user, and the new identity and its corresponding feature data are inserted into the hot search library.
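The retrieve-or-enroll behaviour described here can be sketched in Python as follows. In place of the trained retrieval algorithm model mentioned above, the sketch substitutes a brute-force cosine-similarity scan over an in-memory hot library; the function names, the caller-supplied `new_id`, and the 0.9 default threshold are assumptions for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_or_enroll(
    query: Vector,
    hot_lib: Dict[str, Vector],
    new_id: str,             # identity to assign if no match is found (hypothetical parameter)
    threshold: float = 0.9,  # preset similarity threshold (assumed value)
) -> Tuple[str, bool]:
    """Return (identity_id, is_new). Only the hot library is searched."""
    best_id: Optional[str] = None
    best_score = -1.0
    for identity, feature in hot_lib.items():
        score = cosine_similarity(query, feature)
        if score > best_score:
            best_id, best_score = identity, score

    if best_id is not None and best_score >= threshold:
        return best_id, False  # existing identity found in the hot library

    hot_lib[new_id] = list(query)  # new-identity user: insert into the hot library
    return new_id, True
```

The cold search library is deliberately not consulted on this path, in line with the description that retrieval uses only the data-managed hot search library.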
It should be noted that the data management of the search base by any of the above-described search base management methods is performed periodically.
When the scale of the hot search library reaches the set threshold, the data management strategy is triggered to sink hot data out of the hot search library, which keeps the scale of the hot search library controllable and the hot data in it current. The cold search library receives the hot data that sinks from the hot search library and searches the hot search library at a regular (configurable) time every day, merging with the hot search library those identities that match above a high retrieval threshold. When the scale of the cold search library reaches its set threshold, the library cleaning logic is triggered, which keeps the scale of the cold search library controllable.
In the retrieval process, the hot retrieval library subjected to data management is utilized for retrieval, and the cold retrieval library is not used for retrieval, so that the retrieval efficiency can be greatly improved. The cold search library is used as a transition library and stored in the same search library with the hot search library, so that data reading and writing between the cold search library and the hot search library are facilitated, and efficient management of hot data in the hot search library is facilitated. In the database management process, cold and hot data of the search library are separated, and under the condition that the total number of the hot data stored in the hot search library is larger than a first preset scale threshold, cold and hot library upgrading and downgrading logic is triggered, so that the scale of the hot search library serving as a search base library is ensured to be within the threshold range, the effectiveness of the hot data in the hot search library is improved, and the search reliability, accuracy and search efficiency of the managed hot search library are also improved.
The following specifically describes the search method of the present application by taking a face search scene as an example. Fig. 10 is a schematic frame diagram of a retrieval method for implementing a scene based on face retrieval according to the present application. As shown in fig. 10, the framework may include three parts of face input, face preprocessing and face retrieval.
Wherein, 1) the human face input part can acquire a human face picture through the acquisition module. The acquisition module includes but is not limited to a camera for capturing a face photograph, an identity card for self-photographing, a video original frame and the like.
2) The face preprocessing part can be realized by a face picture preprocessing module, a face detection/registration module and a face feature extraction module. The face image preprocessing module is used for preprocessing the acquired face image, and the preprocessing includes but is not limited to noise reduction processing and the like. The face detection/registration module is used for detecting the preprocessed face picture to obtain a face frame, and performing face registration based on the face frame to obtain face five-point information. The face feature extraction module is used for calculating face features according to the acquired face five-point information. In practical applications, the face preprocessing can be performed by running a Software Development Kit (SDK).
3) The face retrieval part can comprise a retrieval library, the retrieval library comprises a hot retrieval library and a cold retrieval library, wherein the hot retrieval library is stored in a video memory of the GPU, and the hot retrieval library can comprise a face feature management module, a retrieval module and a hot retrieval library management module. The cold search library is stored in the memory of the CPU, and can comprise a cold search library management module and a cold search library cleaning module. The hot search library management module, the cold search library management module and the cold search library cleaning module are independent from each other and can be synchronously performed in a concurrent mode, so that the face search service can fully utilize the computing resources of the GPU and the CPU to perform deep learning computation, and the overall operational capability of the equipment is greatly improved.
The face feature management module performs feature management operations on the search library (such as adding, deleting, modifying, and querying features). The hot search library serves as the retrieval base library and stores the hot data used for face retrieval. The hot search library management module manages the hot data stored in the hot search library and, when the hot data management condition is met, migrates hot data to the cold search library, implementing the strategy of demoting hot library data to the cold library. The retrieval module retrieves, in the hot search library, hot data similar to the face features to be retrieved and returns a retrieval result based on the similarity.
The cold search library management module manages the cold data stored in the cold search library and, when the cold data management condition is met, migrates cold data to the hot search library, implementing the strategy of promoting cold library data to the hot library. The cold search library cleaning module cleans the cold data stored in the cold search library and, when the data cleaning condition is met, transfers cold data to a disk.
The retrieval method of the present application can be applied to a 1:N face retrieval framework, where it plays an important role in identity filing, identity retrieval, identity merging, and the like in scenarios such as smart retail, smart communities, and security, and ensures the timeliness and retrieval performance of identity retrieval.
The retrieval method, based on the upgrade and downgrade strategy for the hot and cold search libraries, solves a series of problems caused by an oversized retrieval base library in current retrieval. The strategy mainly comprises a timeliness-based strategy for sinking hot library data to the cold library and a strategy, based on searching the hot library, for raising cold library data to the hot library. The two strategies run concurrently, each according to its own rules, which keeps the scale of the hot search library controllable and its information timely. A retrieval base library designed on this strategy helps improve the performance of the retrieval algorithm and the utilization of resources.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 11, a block diagram of a retrieval apparatus according to an embodiment of the present application is shown. The device has the function of realizing the server side in the above method example, and the function can be realized by hardware or by hardware executing corresponding software. The apparatus 80 may include:
the characteristic management module 801 is used for acquiring characteristic data to be retrieved of an object;
a search library management means 802 for performing data management on a search library, the search library including a hot search library and a cold search library;
the retrieval module 803 is configured to retrieve the feature data to be retrieved in a data-managed thermal retrieval library to obtain a retrieval result;
a returning module 804, configured to return the retrieval result.
In one embodiment, the search library management apparatus is any one of the search library management apparatuses described above, and the search library management apparatus may include:
the attribute acquisition module is used for acquiring attribute parameters of each piece of thermal data stored in the thermal search library under the condition that the total number of the pieces of thermal data stored in the thermal search library is greater than a first preset scale threshold;
the first migration module is used for migrating the hot data meeting the hot data management condition to the cold search library if the attribute parameters of the hot data meet the hot data management condition;
the first calculation module is used for calculating a first matching degree between each cold data stored in the cold search library and each hot data in the hot search library;
and the second migration module is used for migrating the cold data meeting the cold data management condition to the hot search library if the first matching degree is determined to meet the cold data management condition.
In some embodiments, the search library management means comprises a hot search library management module and a cold search library management module,
the hot search library management module comprises the attribute acquisition module and the first migration module;
the cold search library management module includes the first computing module and the second migration module.
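The two management modules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `Record` layout, the class and parameter names, cosine similarity as the matching degree, and the use of the current time `now` as the reference time for aging are all hypothetical. The demotion condition follows claim 2 (age at or above a threshold and activity below a threshold), and the promotion condition follows claim 4 (maximum first matching degree at or above a threshold).

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List


@dataclass
class Record:
    feature: List[float]
    last_seen: float   # latest recording timestamp, in seconds (assumed unit)
    activity: float    # user activity associated with the record


def matching_degree(a: List[float], b: List[float]) -> float:
    """Cosine similarity, used here as an assumed matching degree."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HotLibraryManager:
    """Attribute acquisition module plus first migration module."""

    def __init__(self, max_hot: int, min_age: float, min_activity: float):
        self.max_hot = max_hot            # first preset scale threshold
        self.min_age = min_age            # first preset threshold
        self.min_activity = min_activity  # second preset threshold

    def run(self, hot: Dict[str, Record], cold: Dict[str, Record],
            now: float) -> List[str]:
        if len(hot) <= self.max_hot:
            return []  # hot library is within scale, nothing to do
        stale = [k for k, r in hot.items()
                 if now - r.last_seen >= self.min_age
                 and r.activity < self.min_activity]
        for k in stale:
            cold[k] = hot.pop(k)  # sink the record into the cold library
        return stale


class ColdLibraryManager:
    """First calculation module plus second migration module."""

    def __init__(self, promote_threshold: float):
        self.promote_threshold = promote_threshold  # fourth preset threshold

    def run(self, hot: Dict[str, Record], cold: Dict[str, Record]) -> List[str]:
        if not hot:
            return []
        # Decide all promotions against the same hot snapshot, then migrate.
        promoted = [
            k for k, r in cold.items()
            if max(matching_degree(r.feature, h.feature)
                   for h in hot.values()) >= self.promote_threshold
        ]
        for k in promoted:
            hot[k] = cold.pop(k)  # raise the record into the hot library
        return promoted
```

Note that this sketch moves a demoted record into the cold library unchanged; whether it should instead be merged with an existing cold record is a separate decision governed by the second matching degree.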
In a possible embodiment, the retrieval module may be further configured to calculate the first matching degree between each cold data stored in the cold search library and each hot data in the hot search library, and to calculate a second matching degree between the hot data to be managed and each cold data in the cold search library. The calculation subunit and the first calculation module in the search library management apparatus can then be omitted: the matching function of the retrieval module is reused, no additional feature matching modules need to be configured, and the retrieval cost is reduced.
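How the second matching degree is used when a hot record sinks into the cold library (claim 3) can be sketched as below. The matcher is passed in as a parameter to reflect the idea of reusing the retrieval module's matching function. The function name `sink_to_cold`, the vector layout, and in particular the merge rule (element-wise averaging of the two feature vectors) are assumptions for illustration; the text does not say how merging is performed.

```python
from math import sqrt
from typing import Callable, Dict, List

Vector = List[float]
Matcher = Callable[[Vector, Vector], float]


def cosine(a: Vector, b: Vector) -> float:
    """Assumed matching function, e.g. the one the retrieval module exposes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sink_to_cold(key: str, feature: Vector, cold: Dict[str, Vector],
                 match: Matcher, merge_threshold: float) -> str:
    """Move one hot record into the cold library.

    Returns the key of the cold entry that holds the data afterwards.
    """
    if cold:
        # Second matching degree against every cold record; keep the maximum.
        best_key = max(cold, key=lambda k: match(feature, cold[k]))
        if match(feature, cold[best_key]) >= merge_threshold:
            # Assumed merge rule: average the two feature vectors.
            cold[best_key] = [(x + y) / 2
                              for x, y in zip(cold[best_key], feature)]
            return best_key
    # No sufficiently similar cold record: store as a new cold entry.
    cold[key] = list(feature)
    return key
```

Here `merge_threshold` plays the role of the third preset threshold.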
In some embodiments, the hot search library is stored in a memory corresponding to an image processor or sound processor used for matching objects to be retrieved, and the cold search library is stored in a memory corresponding to a central processor;
the object to be retrieved comprises at least one of the following: face, audio, and iris.
Embodiments of the present application provide a retrieval device, which may include a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the retrieval method provided by the above method embodiments.
Embodiments of the present application further provide a computer-readable storage medium, in which at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded by a processor and executes any one of the above-mentioned retrieval methods.
Further, fig. 12 is a schematic hardware structural diagram of an apparatus for implementing the search library management method or the retrieval method provided in the embodiments of the present application. The apparatus may be a computer terminal, a mobile terminal, or another device, and may also form part of, or include, the apparatus provided in the embodiments of the present application. As shown in fig. 12, the computer terminal 10 may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. In addition, the computer terminal 10 may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 12 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in fig. 12, or have a different configuration than shown in fig. 12.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as the program instructions/data storage devices corresponding to the methods described in the embodiments of the present application. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, so as to implement the search library management method or the retrieval method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network interface controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 can be a radio frequency (RF) module, which is used to communicate with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted that the order of the embodiments of the present application is for description only and does not indicate the relative merits of the embodiments. Specific embodiments have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present application are described in a progressive manner; the same or similar parts of the embodiments may be referred to across embodiments, and each embodiment focuses on its differences from the others. In particular, because the device and server embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding parts of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of search library management, wherein the search libraries comprise a hot search library and a cold search library, the method comprising:
acquiring attribute parameters of each piece of hot data in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library;
calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively;
and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library.
2. The method of claim 1, wherein the attribute parameters of the hot data include at least one of a last recording timestamp of the hot data and a user activity associated with the hot data;
in a case that the attribute parameters of the hot data include a last recording timestamp of the hot data and a user activity associated with the hot data, the migrating, if the attribute parameters of the hot data meet the hot data management conditions, the hot data meeting the hot data management conditions to the cold search library comprises:
if the time interval between the last recording timestamp of the hot data and the first preset time is judged to be greater than or equal to a first preset threshold value, and the user activity associated with the hot data is judged to be smaller than a second preset threshold value, determining that the attribute parameters of the hot data meet the hot data management conditions;
determining the hot data meeting the hot data management conditions as hot data to be managed;
and migrating the hot data to be managed to the cold search library.
3. The method of claim 2, wherein the migrating the hot data to be managed to the cold search library comprises:
calculating a second matching degree between the hot data to be managed and each cold data in the cold search library respectively;
judging whether the maximum second matching degree is greater than or equal to a third preset threshold value or not in a plurality of second matching degrees corresponding to the hot data to be managed;
if so, merging the hot data to be managed and the cold data of which the maximum second matching degree is greater than or equal to a third preset threshold;
and if the judgment result is negative, storing the hot data to be managed, of which the maximum second matching degree is smaller than a third preset threshold value, into the cold search library.
4. The method of claim 1, wherein the migrating, if the first matching degree is determined to meet the cold data management condition, the cold data meeting the cold data management condition to the hot search library comprises:
if the maximum first matching degree is judged to be greater than or equal to a fourth preset threshold value in a plurality of first matching degrees corresponding to each cold data, the first matching degree is determined to meet the cold data management condition;
determining cold data meeting cold data management conditions as cold data to be managed;
migrating the cold data to be managed to the hot search library.
5. The method of claim 1, further comprising:
detecting a total number of cold data stored in the cold search library;
if the total number of the detected cold data is larger than a second preset scale threshold value, taking the difference value of the total number of the detected cold data and the second preset scale threshold value as the number N to be transferred, wherein N is a positive integer;
sorting all cold data in the cold search library in order of latest recording time;
and transferring the N cold data which are ranked at the top into a magnetic disk.
6. The method of claim 1, wherein the hot search library is stored in a memory corresponding to an image processor or a sound processor for matching an object to be retrieved, and the cold search library is stored in a memory corresponding to a central processor;
the object to be retrieved comprises at least one of the following: face, audio, and iris.
7. A retrieval method, wherein a data-managed search library is used for retrieval, the search library includes a hot search library and a cold search library, and the method includes:
acquiring feature data to be retrieved of an object;
retrieving the feature data to be retrieved in the data-managed hot search library to obtain a retrieval result;
returning the retrieval result;
the search library performs data management by the following search library management method, and the search library management method comprises the following steps:
acquiring attribute parameters of each piece of hot data in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
if the attribute parameters of the hot data meet the hot data management conditions, migrating the hot data meeting the hot data management conditions to the cold search library;
calculating first matching degrees of each cold data stored in the cold search library and each hot data in the hot search library respectively;
and if the first matching degree is determined to meet the cold data management condition, migrating the cold data meeting the cold data management condition to the hot search library.
8. A search library management apparatus, wherein the search library includes a hot search library and a cold search library, the apparatus comprising:
the attribute acquisition module is used for acquiring attribute parameters of each piece of hot data stored in the hot search library under the condition that the total number of pieces of hot data stored in the hot search library is greater than a first preset scale threshold;
the first migration module is used for migrating the hot data meeting the hot data management condition to the cold search library if the attribute parameters of the hot data meet the hot data management condition;
the first calculation module is used for calculating a first matching degree between each cold data stored in the cold search library and each hot data in the hot search library;
and the second migration module is used for migrating the cold data meeting the cold data management condition to the hot search library if the first matching degree is determined to meet the cold data management condition.
9. A search apparatus for performing a search using a data-managed search library including a hot search library and a cold search library, the apparatus comprising:
the feature management module is used for acquiring feature data to be retrieved of the object;
the search library management apparatus of claim 8, for data management of search libraries, the search libraries including a hot search library and a cold search library;
the retrieval module is used for retrieving the characteristic data to be retrieved in a data-managed hot retrieval library to obtain a retrieval result;
and the returning module is used for returning the retrieval result.
10. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions that is loaded by a processor and that performs a method of managing a search library as claimed in any one of claims 1 to 6 or a method of searching as claimed in claim 7.
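The cold-library overflow step of claim 5 can be illustrated with a short sketch. It rests on assumptions that the claim leaves open: records are sorted oldest first, so the N records with the earliest latest recording time are the ones moved; the cold library is modeled as a mapping from record key to latest recording timestamp; and a plain dictionary named `disk` stands in for disk storage. The function name `evict_to_disk` is hypothetical.

```python
from typing import Dict, List


def evict_to_disk(cold: Dict[str, float], disk: Dict[str, float],
                  max_cold: int) -> List[str]:
    """Move the overflow of the cold library to disk.

    cold maps record key -> latest recording timestamp.
    max_cold is the second preset scale threshold.
    """
    n = len(cold) - max_cold  # number N of records to transfer
    if n <= 0:
        return []
    # Assumed ordering: oldest latest-recording-time first.
    oldest_first = sorted(cold, key=cold.get)
    moved = oldest_first[:n]
    for key in moved:
        disk[key] = cold.pop(key)
    return moved
```

With a cold library of four records and `max_cold=2`, the two records with the oldest timestamps are moved and the cold library is left at exactly the threshold size.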
CN201911044479.8A 2019-10-30 2019-10-30 Retrieval library management method, retrieval device and retrieval medium Pending CN110865992A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911044479.8A CN110865992A (en) 2019-10-30 2019-10-30 Retrieval library management method, retrieval device and retrieval medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911044479.8A CN110865992A (en) 2019-10-30 2019-10-30 Retrieval library management method, retrieval device and retrieval medium

Publications (1)

Publication Number Publication Date
CN110865992A (en) 2020-03-06

Family

ID=69652998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911044479.8A Pending CN110865992A (en) 2019-10-30 2019-10-30 Retrieval library management method, retrieval device and retrieval medium

Country Status (1)

Country Link
CN (1) CN110865992A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459939A (en) * 2020-03-31 2020-07-28 中国银行股份有限公司 Data processing method and device
CN111858604A (en) * 2020-07-24 2020-10-30 平安证券股份有限公司 Data storage method and device, electronic equipment and storage medium
CN111858520A (en) * 2020-07-21 2020-10-30 杭州溪塔科技有限公司 Method and device for separately storing block link point data
CN112380217A (en) * 2020-11-17 2021-02-19 安徽鸿程光电有限公司 Data processing method, device, equipment and medium
CN112416929A (en) * 2020-11-17 2021-02-26 四川长虹电器股份有限公司 Retrieval library management and data retrieval method based on mysql and java
CN116401212A (en) * 2023-06-07 2023-07-07 东营市第二人民医院 Personnel file quick searching system based on data analysis
CN118012851A (en) * 2024-04-08 2024-05-10 浪潮通信信息系统有限公司 Scene data management method and device, electronic equipment and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111459939A (en) * 2020-03-31 2020-07-28 中国银行股份有限公司 Data processing method and device
CN111459939B (en) * 2020-03-31 2023-09-19 中国银行股份有限公司 Data processing method and device
CN111858520A (en) * 2020-07-21 2020-10-30 杭州溪塔科技有限公司 Method and device for separately storing block link point data
CN111858520B (en) * 2020-07-21 2024-03-22 杭州溪塔科技有限公司 Method and device for separately storing block chain node data
CN111858604A (en) * 2020-07-24 2020-10-30 平安证券股份有限公司 Data storage method and device, electronic equipment and storage medium
CN111858604B (en) * 2020-07-24 2022-11-04 平安证券股份有限公司 Data storage method and device, electronic equipment and storage medium
CN112380217A (en) * 2020-11-17 2021-02-19 安徽鸿程光电有限公司 Data processing method, device, equipment and medium
CN112416929A (en) * 2020-11-17 2021-02-26 四川长虹电器股份有限公司 Retrieval library management and data retrieval method based on mysql and java
CN112380217B (en) * 2020-11-17 2024-04-12 安徽鸿程光电有限公司 Data processing method, device, equipment and medium
CN116401212A (en) * 2023-06-07 2023-07-07 东营市第二人民医院 Personnel file quick searching system based on data analysis
CN116401212B (en) * 2023-06-07 2023-08-11 东营市第二人民医院 Personnel file quick searching system based on data analysis
CN118012851A (en) * 2024-04-08 2024-05-10 浪潮通信信息系统有限公司 Scene data management method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110865992A (en) Retrieval library management method, retrieval device and retrieval medium
US10169485B2 (en) Dynamic partitioning of graph databases based on edge sampling
CN112052387B (en) Content recommendation method, device and computer readable storage medium
WO2015081915A1 (en) File recommendation method and device
US9396247B2 (en) Method and device for processing a time sequence based on dimensionality reduction
CN110674144A (en) User portrait generation method and device, computer equipment and storage medium
CN106933511B (en) Space data storage organization method and system considering load balance and disk efficiency
CN110347724A (en) Abnormal behaviour recognition methods, device, electronic equipment and medium
WO2021063037A1 (en) Person database partitioning method, and device
WO2021073260A1 (en) Object management method and apparatus, computer device, and storage medium
EP2864906A2 (en) Searching for events by attendants
CN110609952A (en) Data acquisition method and system and computer equipment
JP6079270B2 (en) Information provision device
CN114741544B (en) Image retrieval method, retrieval library construction method, device, electronic equipment and medium
CN110309143A (en) Data similarity determines method, apparatus and processing equipment
CN113220904A (en) Data processing method, data processing device and electronic equipment
CN110968564A (en) Data processing method and training method of data state prediction model
CN103595747A (en) User-information recommending method and system
CN115982346A (en) Question-answer library construction method, terminal device and storage medium
CN107526741B (en) User label generation method and device
CN113821657A (en) Artificial intelligence-based image processing model training method and image processing method
CN110019870B (en) Image retrieval method and system based on memory image cluster
CN108170693B (en) Hot word pushing method and device
CN116628042A (en) Data processing method, device, equipment and medium
KR102242042B1 (en) Method, apparatus and computer program for data labeling

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40021537; Country of ref document: HK)

SE01 Entry into force of request for substantive examination