CN114385891A - Data searching method and device, electronic equipment and storage medium - Google Patents

Data searching method and device, electronic equipment and storage medium

Info

Publication number
CN114385891A
Authority
CN
China
Prior art keywords
data
searched
search
mark information
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210286230.3A
Other languages
Chinese (zh)
Other versions
CN114385891B (en)
Inventor
何文松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wenjingsong Technology Co ltd
Original Assignee
Beijing Wenjingsong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wenjingsong Technology Co ltd
Priority to CN202210286230.3A
Publication of CN114385891A
Application granted
Publication of CN114385891B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a data searching method and device, electronic equipment, and a storage medium. The method comprises the following steps: acquiring search request data, determining a search data set corresponding to the search request data, and determining, according to the search data set, a data subset to be searched corresponding to the search request data, wherein the data subset to be searched comprises at least one piece of data to be searched; for each piece of data to be searched contained in the data subset to be searched, determining whether search mark information corresponding to that data exists, and marking the data if no such information exists, wherein the search mark information indicates that the data to be searched has already been searched; and, when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the search request data and each piece of data to be searched. Repeated searching during the data search is thereby avoided, which improves data search efficiency.

Description

Data searching method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a data searching method and device, electronic equipment and a storage medium.
Background
In the prior art, searching a data set for data corresponding to search request data usually requires multiple search operations on the data set. In practical applications, the data determined by these operations is likely to overlap; for example, the data determined in a first search operation and the data determined in a second search operation may contain the same items. Conventional data searching methods therefore have the technical problem of repeatedly searching the same data in a data set.
Disclosure of Invention
Embodiments of the present invention provide a data search method, an apparatus, an electronic device, and a storage medium, so as to implement fast search of data, and avoid a situation of repeated search in a data search process, thereby improving data search efficiency.
In a first aspect, an embodiment of the present invention provides a data search method, where the method includes:
acquiring request search data, determining a search data set corresponding to the request search data, and determining a data subset to be searched corresponding to the request search data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched;
determining whether search mark information corresponding to the data to be searched exists or not for each piece of data to be searched contained in the data subset to be searched, and marking the data to be searched if the search mark information does not exist, wherein the search mark information is used for indicating that the data to be searched has been searched;
and when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the request search data and each piece of data to be searched.
In a second aspect, an embodiment of the present invention further provides a data search apparatus, where the apparatus includes:
a to-be-searched data subset determining module, configured to acquire search request data, determine a search data set corresponding to the search request data, and determine, according to the search data set, a to-be-searched data subset corresponding to the search request data, where the to-be-searched data subset includes at least one piece of to-be-searched data;
a data marking module, configured to determine, for each piece of data to be searched included in the subset of data to be searched, whether search mark information corresponding to the data to be searched exists, and if not, mark the data to be searched, where the search mark information is used to indicate that the data to be searched has been searched;
and the search result obtaining module is used for obtaining a search result based on the similarity between the request search data and each piece of data to be searched when all the data to be searched in the data subset to be searched has corresponding search mark information.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data search method provided by any embodiment of the invention.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data search method provided in any embodiment of the present invention.
According to the technical scheme of the embodiment of the invention, search request data is acquired, a search data set corresponding to the search request data is determined, and a data subset to be searched corresponding to the search request data is determined according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched. For each piece of data to be searched contained in the data subset to be searched, it is determined whether corresponding search mark information exists, and the data is marked if it does not, wherein the search mark information indicates that the data to be searched has already been searched. When all the data to be searched in the data subset to be searched has corresponding search mark information, a search result is obtained based on the similarity between the search request data and each piece of data to be searched. This scheme solves the prior-art technical problem of repeatedly searching the same data in a data set, enables fast data search, and avoids repeated searching during the search process, thereby improving data search efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, a brief description is given below of the drawings used in describing the embodiments. It should be clear that the described figures are only views of some of the embodiments of the invention to be described, not all, and that for a person skilled in the art, other figures can be derived from these figures without inventive effort.
Fig. 1 is a schematic flow chart of a data searching method according to a first embodiment of the present invention;
Fig. 2 is a schematic flow chart of a data searching method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data search apparatus according to a fourth embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a schematic flowchart of a data search method according to a first embodiment of the present invention. This embodiment is applicable to searching data in a data set. The method may be executed by a data search apparatus, which may be implemented in software and/or hardware and may be integrated in an electronic device such as a computer or a server.
As shown in fig. 1, the method of the present embodiment includes:
S110, obtaining the search request data, and determining a search data set corresponding to the search request data.
The search request data may be data obtained by parsing a data search request input by a user. The data search request may be a request for data search that the user enters through an input device. The input device may be a physical input device and/or a touch device. Physical input devices may include, but are not limited to, a mouse, a keyboard, and the like. Touch devices may include, but are not limited to, a virtual keyboard, a handwriting area, and the like. The search request data may be of various data types, such as video, audio, picture, or text, and the type is not specifically limited. A search data set can be understood as a data set on which data search can be performed; it supplies the data that the search operations examine. The data type of the search data set may or may not be the same as the data type of the search request data.
Specifically, a data search request input by a user through an input device is received and parsed to obtain the search request data. A search data set corresponding to the search request data may then be determined based on the search request data. For example, feature extraction may be performed on the search request data to obtain its feature data, a data set corresponding to that feature data may be determined, and this data set may be used as the search data set corresponding to the search request data.
It should be noted that, in the embodiment of the present invention, the search request data may be obtained in various ways. For example, when a data search request is received, the request may be parsed to obtain the search request data; alternatively, data that the user enters through the input device for the purpose of data search, such as a search keyword, may be received directly.
S120, determining a data subset to be searched corresponding to the search request data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched.
The data subset to be searched can be understood as a set of the data to be searched that corresponds to the search request data. It may contain one, two, or more pieces of data to be searched; in practical applications it generally contains multiple pieces.
In the embodiment of the present invention, the data subset to be searched may be determined from the search data set in various ways. As an alternative embodiment of the present invention, after the search data set is determined, initial search data corresponding to the search request data may be determined in the search data set, and the data subset to be searched may be constructed from the initial search data. The initial search data can be understood as the first piece of data in the search data set whose similarity to the search request data is compared.
The initial search data may be determined in various ways. For example, it may be preset according to the actual needs of the user, or any piece of data to be searched in the search data set, for instance one selected at random, may be used as the initial search data.
As another alternative embodiment of the present invention, after the search data set is determined, initial search data corresponding to the search request data may be determined in the search data set. The associated search data corresponding to the initial search data may then be determined according to the similarity between the initial search data and the remaining data to be searched in the search data set, and the data subset to be searched may be constructed from the initial search data and the associated search data. The remaining data to be searched can be understood as the data to be searched in the search data set other than the initial search data. The associated search data is determined based on the similarity between the initial search data and this remaining data. There may be one, two, or more pieces of associated search data; in practical applications there are usually multiple.
Specifically, after the initial search data is determined, the similarity between the initial search data and the remaining data to be searched in the search data set is determined, the associated search data is determined based on that similarity, and the data subset to be searched is constructed from the initial search data and the associated search data. Here, the similarity between the initial search data and the remaining data to be searched can be understood as the data distance between them.
Optionally, the similarity between the initial search data and the remaining data to be searched in the search data set is determined as follows:
The similarity is determined by traversing a data table that stores the similarity between pieces of data to be searched in the search data set. This data table may be constructed in advance based on those similarities. The advantage is that the efficiency of the data search can be improved, since stored similarities are looked up rather than computed during the search.
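The construction of the data subset to be searched from initial search data and associated search data can be sketched as follows. This is a minimal illustration, not the patented implementation: the table layout, the function name `build_subset`, and the `max_associated` limit are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical pre-built table: item id -> list of (other item id, similarity).
SimilarityTable = Dict[str, List[Tuple[str, float]]]


def build_subset(initial_id: str, table: SimilarityTable, max_associated: int) -> List[str]:
    """Return the initial search data followed by its most similar neighbours."""
    # Look up the stored similarities instead of recomputing them.
    neighbours = sorted(table.get(initial_id, []), key=lambda pair: pair[1], reverse=True)
    associated = [other_id for other_id, _ in neighbours[:max_associated]]
    return [initial_id] + associated


table: SimilarityTable = {
    "a": [("b", 0.9), ("c", 0.4), ("d", 0.7)],
    "b": [("a", 0.9), ("d", 0.2)],
}
subset = build_subset("a", table, max_associated=2)
print(subset)  # ['a', 'b', 'd']
```

Here a larger number means more similar. If the table stored distances instead, the sort direction would be reversed.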
S130, for each piece of data to be searched contained in the data subset to be searched, determining whether search mark information corresponding to the data to be searched exists, and marking the data to be searched if the search mark information does not exist.
The search mark information may include, but is not limited to, a data identifier and/or a data value of the data to be searched. The search mark information is used to indicate that the data to be searched has been searched.
In one embodiment, if the data subset to be searched contains only the initial search data, it may be determined whether search mark information of the initial search data exists. If not, the initial search data may be marked.
In another embodiment, if the data subset to be searched contains the initial search data and at least one piece of associated search data corresponding to the initial search data, it is first determined whether search mark information of the initial search data exists, and the initial search data may be marked if it does not. After the initial search data is marked, it may be determined, for each piece of associated search data, whether corresponding search mark information exists; if not, that associated search data may be marked.
Optionally, the data to be searched is marked in one of the following manners:
Mode one: when the similarity between the data to be searched and the search request data is queried, the associated search data is marked.
Mode two: when the similarity between the data to be searched and the search request data is queried, the associated search data is marked.
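The check-then-mark logic of S130 can be illustrated with a short sketch. It assumes that the search mark information is simply the data identifier kept in a Python `set`; the function name `mark_unsearched` is hypothetical.

```python
from typing import Iterable, List, Set


def mark_unsearched(subset: Iterable[str], marks: Set[str]) -> List[str]:
    """Mark every item that has no mark yet and return the newly marked ones."""
    newly_marked = []
    for item_id in subset:
        if item_id in marks:      # search mark information already exists
            continue              # this item was searched before, so skip it
        marks.add(item_id)        # mark the item as searched
        newly_marked.append(item_id)
    return newly_marked


marks: Set[str] = set()
first = mark_unsearched(["a", "b", "d"], marks)
second = mark_unsearched(["b", "d", "e"], marks)
print(first)   # ['a', 'b', 'd']
print(second)  # ['e']
```

In the second call, the items shared with the first subset are skipped, which is the repeated work the method aims to avoid.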
S140, when all the data to be searched in the data subset to be searched have corresponding search mark information, obtaining a search result based on the similarity between the data requested to be searched and each piece of data to be searched.
The search result is determined based on the similarity between the search request data and each piece of data to be searched. It can be understood as the data contained in the data subset to be searched that satisfies the data search request, and it may include at least one piece of data.
Specifically, when all the data to be searched in the data subset to be searched has corresponding search mark information, the data satisfying the data search request, that is, the search result, may be obtained based on the similarity between the search request data and each piece of data to be searched in the subset. In the embodiment of the invention, the obtained search result may be stored in a storage queue for search results, which facilitates subsequent updates to the search result.
According to the technical scheme of the embodiment of the invention, search request data is acquired, a search data set corresponding to the search request data is determined, and a data subset to be searched corresponding to the search request data is determined according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched. For each piece of data to be searched contained in the data subset to be searched, it is determined whether corresponding search mark information exists, and the data is marked if it does not, wherein the search mark information indicates that the data to be searched has already been searched. When all the data to be searched in the data subset to be searched has corresponding search mark information, a search result is obtained based on the similarity between the search request data and each piece of data to be searched. This scheme solves the prior-art technical problem of repeatedly searching the same data in a data set, enables fast data search, and avoids repeated searching during the search process, thereby improving data search efficiency.
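Steps S110 to S140 can be combined into one small end-to-end sketch. Several parts are assumptions made only for illustration: the toy numeric data, the neighbour table, the `similarity` function, and the choice to build a new subset around each newly marked item so that overlapping subsets actually occur. The text does not prescribe any of these.

```python
import heapq
from collections import deque
from typing import Deque, Dict, List, Set, Tuple

# Toy data set (id -> value) and a hypothetical neighbour table.
DATA: Dict[str, float] = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 2.5, "e": 9.0}
NEIGHBOURS: Dict[str, List[str]] = {
    "a": ["b", "d"], "b": ["a", "d", "c"], "d": ["b", "c"],
    "c": ["d", "e"], "e": ["c"],
}


def similarity(query: float, value: float) -> float:
    """Assumed similarity: larger when the two values are closer."""
    return -abs(query - value)


def search(query: float, initial_id: str, top_k: int) -> Tuple[List[str], int]:
    marks: Set[str] = set()                    # search mark information
    scored: List[Tuple[float, str]] = []       # (similarity, id) of searched items
    frontier: Deque[str] = deque([initial_id])
    comparisons = 0
    while frontier:
        current = frontier.popleft()
        subset = [current] + NEIGHBOURS.get(current, [])
        for item_id in subset:
            if item_id in marks:               # already searched: skip it
                continue
            marks.add(item_id)                 # mark before comparing
            scored.append((similarity(query, DATA[item_id]), item_id))
            comparisons += 1
            frontier.append(item_id)
    best = heapq.nlargest(top_k, scored)       # most similar items first
    return [item_id for _, item_id in best], comparisons


result, comparisons = search(query=2.2, initial_id="a", top_k=2)
print(result, comparisons)  # ['b', 'd'] 5
```

Although the subsets built around `a`, `b`, and `d` overlap heavily, each of the five items is compared with the query exactly once.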
Example two
Fig. 2 is a schematic flow chart of a data search method according to a second embodiment of the present invention. On the basis of the foregoing embodiment, optionally, determining whether search mark information corresponding to the data to be searched exists includes: determining space state information of a storage space for storing the search mark information corresponding to the data to be searched, and determining, based on the space state information, whether the search mark information corresponding to the data to be searched exists. Technical terms that are the same as or correspond to those in the above embodiment are not explained again here.
As shown in fig. 2, the method of the embodiment may specifically include:
S210, obtaining the search request data, and determining a search data set corresponding to the search request data.
S220, determining a data subset to be searched corresponding to the search request data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched.
S230, for each piece of data to be searched contained in the data subset to be searched, determining the space state information of the storage space for storing the search mark information corresponding to the data to be searched.
The space state information identifies whether the storage space for storing the search mark information corresponding to the data to be searched has been initialized. Its value is either uninitialized or initialized.
Specifically, the current data to be searched may be determined from all the data to be searched contained in the data subset to be searched. After the current data to be searched is determined, the space state information of the storage space for storing its search mark information is determined, and the next piece of data to be searched is then taken as the current data to be searched.
It should be noted that checking the space state information makes it possible to determine quickly whether the storage space for the search mark information of a piece of data to be searched has been initialized. If it has not, only that storage space needs to be initialized, which shortens the time spent on initialization and improves data search efficiency.
Optionally, the current data to be searched is determined from all the data to be searched contained in the data subset to be searched as follows:
The similarity between the search request data and each piece of data to be searched in the data subset to be searched is determined. The similarities are then sorted in a preset order, either from largest to smallest or from smallest to largest, and the current data to be searched is determined based on the sorted similarities. The next data to be searched after the current one may likewise be determined based on the sorted similarities.
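A minimal sketch of this ordering step follows. The numeric items and the similarity function passed in are assumptions for illustration; only the idea of visiting items in sorted-similarity order comes from the text.

```python
from typing import Callable, Iterator, Sequence


def iterate_by_similarity(
    query: float,
    subset: Sequence[float],
    similarity: Callable[[float, float], float],
    descending: bool = True,
) -> Iterator[float]:
    """Yield the items of the subset in the preset similarity order."""
    ranked = sorted(subset, key=lambda item: similarity(query, item), reverse=descending)
    # The first yielded item is the current data to be searched,
    # and each following item is the next data to be searched.
    yield from ranked


def closeness(q: float, x: float) -> float:
    return -abs(q - x)  # assumed similarity: closer values score higher


order = list(iterate_by_similarity(2.2, [1.0, 5.0, 2.0], closeness))
print(order)  # [2.0, 1.0, 5.0]
```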
S240, determining whether search mark information corresponding to the data to be searched exists based on the space state information.
Specifically, if the space state information is uninitialized, it is determined that no search mark information corresponding to the data to be searched exists. If the space state information is initialized, it is determined that search mark information corresponding to the data to be searched exists.
Uninitialized means that, at the current time, the storage space for storing the search mark information corresponding to the data to be searched has not been initialized and cannot yet store search mark information. Initialized means that the storage space has been initialized and can store search mark information.
S250, if the search mark information corresponding to the data to be searched does not exist, marking the data to be searched.
In one embodiment, the search mark information may be a data identifier of the data to be searched. If it is determined that no search mark information corresponding to the data to be searched exists, a storage space for storing the search mark information may be determined. The space state information of that storage space may then be updated from uninitialized to initialized, and the search mark information may be written into the storage space.
Specifically, if no search mark information corresponding to the data to be searched exists, the space state information of the corresponding storage space is uninitialized. The storage space is located based on its space state information and initialized, after which the search mark information corresponding to the data to be searched is stored in the initialized storage space.
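Steps S230 to S250 can be sketched with a store whose per-item space is initialized only on first use. The class `MarkStore`, the `SpaceState` enum, and the use of dictionaries as the backing storage are assumptions made for this illustration, not details given in the text.

```python
from enum import Enum
from typing import Dict


class SpaceState(Enum):
    UNINITIALIZED = 0
    INITIALIZED = 1


class MarkStore:
    """Per-item mark storage that is initialized only when an item is marked."""

    def __init__(self) -> None:
        self._state: Dict[int, SpaceState] = {}   # space state information
        self._marks: Dict[int, str] = {}          # search mark information

    def state_of(self, slot: int) -> SpaceState:
        # A slot that was never touched counts as uninitialized.
        return self._state.get(slot, SpaceState.UNINITIALIZED)

    def has_mark(self, slot: int) -> bool:
        # A mark exists exactly when its storage space has been initialized.
        return self.state_of(slot) is SpaceState.INITIALIZED

    def mark(self, slot: int, data_id: str) -> bool:
        """Mark the item; return False if it was already marked."""
        if self.has_mark(slot):
            return False
        self._state[slot] = SpaceState.INITIALIZED   # uninitialized -> initialized
        self._marks[slot] = data_id                  # write the data identifier
        return True


store = MarkStore()
print(store.has_mark(7))         # False
print(store.mark(7, "item-7"))   # True
print(store.mark(7, "item-7"))   # False
```

No storage is set aside for items that are never reached during the search, which is the saving over allocating mark space for the whole data set in advance.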
It should be noted that this processing solves the prior-art technical problem that storage space must be allocated all at once, before the data search begins, for the search mark information of the data to be searched in the data set. In practical applications, a data set contains a large amount of data to be searched, and the amount of data that needs to be marked grows as the search proceeds. Allocating storage space for all the search mark information at once therefore requires a large storage space and wastes storage.
It should be further noted that, if the search flag information of the data to be searched is the data identifier, at this time, the relationship between the space bit in the storage space for storing the search flag information and the data identifier may be one-to-one or many-to-one. In the embodiment of the present invention, the space bit in the storage space for storing the search flag information may have a one-to-one relationship with the data identifier. Here, the space bit may be understood as a storage location for storing search flag information of a single data to be searched.
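The difference between a one-to-one and a many-to-one relationship can be made concrete with a small sketch. It assumes integer data identifiers, a single bit as the space bit, and a modulo mapping for the many-to-one case; none of these choices is stated in the text.

```python
class BitMarks:
    """Mark storage with one bit per space bit, mapped from integer identifiers."""

    def __init__(self, n_bits: int, many_to_one: bool = False) -> None:
        self.n_bits = n_bits
        self.many_to_one = many_to_one
        self.bits = bytearray((n_bits + 7) // 8)

    def _position(self, data_id: int) -> int:
        if self.many_to_one:
            return data_id % self.n_bits      # several identifiers share one bit
        if not 0 <= data_id < self.n_bits:
            raise IndexError("identifier outside the one-to-one range")
        return data_id                        # each identifier has its own bit

    def mark(self, data_id: int) -> None:
        pos = self._position(data_id)
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def has_mark(self, data_id: int) -> bool:
        pos = self._position(data_id)
        return bool(self.bits[pos >> 3] & (1 << (pos & 7)))


exact = BitMarks(16)
exact.mark(3)
print(exact.has_mark(3), exact.has_mark(11))    # True False

shared = BitMarks(8, many_to_one=True)
shared.mark(3)
print(shared.has_mark(3), shared.has_mark(11))  # True True
```

With the many-to-one mapping, identifier 11 shares a bit with identifier 3 and is reported as marked although it was never searched. A one-to-one relationship avoids such collisions at the cost of more storage.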
Optionally, the storage space for storing the search mark information is determined as follows:
The space state information of a storage space for storing data to be searched is determined and parsed to obtain the space address information it contains. The storage space for storing the search mark information may then be determined based on the space address information. The space address information may be the address information of a storage space for storing the search mark information of at least one piece of data to be searched.
In another embodiment, the search mark information may be the data to be searched itself. If it is determined that no search mark information corresponding to the data to be searched exists, the data to be searched may be written into the storage space corresponding to its search mark information, thereby marking it.
It should be noted that, in this case, the number of space bits in the storage space for storing the search mark information may be less than or equal to the number of space bits for storing the data to be searched. In the embodiment of the present invention, the number of space bits for the search mark information may be less than the number of space bits for the data to be searched.
On this basis, when the number of space bits for the search mark information is less than the number of space bits for the data to be searched, the storage space may fill up so that further search mark information cannot be stored. This may be handled as follows. As an optional implementation manner of the embodiment of the present invention, if the storage space for storing the search mark information is full, the data to be searched that has already been written is kept as the search mark information, and the search mark information that has not been written is discarded.
As another optional implementation manner of the embodiment of the present invention, if the storage space for storing the search mark information is full, the search mark information that has not been written may overwrite search mark information already written in the storage space.
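The two ways of handling a full mark store can be compared in a short sketch. The class `BoundedMarks` and the policy names are hypothetical, and overwriting the oldest entry first is an assumption, since the text does not say which written entry is overwritten.

```python
from typing import List


class BoundedMarks:
    """Fixed-capacity mark store with a 'discard' or 'overwrite' policy when full."""

    def __init__(self, capacity: int, policy: str = "discard") -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if policy not in ("discard", "overwrite"):
            raise ValueError("policy must be 'discard' or 'overwrite'")
        self.capacity = capacity
        self.policy = policy
        self.slots: List[str] = []
        self._next = 0                      # slot to overwrite next (oldest first)

    def has_mark(self, item: str) -> bool:
        return item in self.slots

    def mark(self, item: str) -> bool:
        """Try to store the mark; return True if it is stored afterwards."""
        if item in self.slots:
            return True
        if len(self.slots) < self.capacity:
            self.slots.append(item)
            return True
        if self.policy == "discard":
            return False                    # store is full: the new mark is dropped
        self.slots[self._next] = item       # store is full: replace a written mark
        self._next = (self._next + 1) % self.capacity
        return True


keep = BoundedMarks(2, "discard")
print([keep.mark(x) for x in "abc"], keep.slots)   # [True, True, False] ['a', 'b']

over = BoundedMarks(2, "overwrite")
print([over.mark(x) for x in "abc"], over.slots)   # [True, True, True] ['c', 'b']
```

Under either policy some items lose their marks once the store is full, so such items may be searched again. That is the trade-off for using less mark storage than there is data.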
S260, when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the requested search data and each piece of data to be searched.
On this basis, after the search result is obtained, the embodiment of the present invention may clear the space state information of the storage space used for storing the search mark information. Alternatively, when a new search request input by a user is received, the space state information of the storage space that stored the search mark information for the previous search request may be updated from initialized to uninitialized, so that the current data search operation can proceed.
On the basis of the above embodiment, after the search data set corresponding to the requested search data is determined, the embodiment of the present invention further includes: predicting the size of the storage space that the search mark information of the data to be searched will occupy, allocating a storage space of the predicted size, and initializing the allocated storage space so that it can store the search mark information of the data to be searched. Alternatively, the total number of marks needed for the search mark information of the data to be searched is predicted, a storage space corresponding to that total is allocated, and the allocated storage space is initialized so that it can store the search mark information of the data to be searched. This approach improves the utilization of the storage space and avoids wasting storage space on search mark information.
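As a rough illustration of sizing the storage from a predicted number of marks, the following Python sketch allocates one bit per predicted mark. The function name `allocate_mark_storage`, the one-bit-per-mark layout, and the item counts are assumptions for illustration only; the text does not say how the prediction is made or how a mark is encoded.

```python
import math


def allocate_mark_storage(predicted_marks: int, bits_per_mark: int = 1) -> bytearray:
    """Allocate and zero-initialize storage sized for the predicted marks only."""
    if predicted_marks < 0 or bits_per_mark < 1:
        raise ValueError("predicted_marks must be >= 0 and bits_per_mark >= 1")
    total_bits = predicted_marks * bits_per_mark
    return bytearray(math.ceil(total_bits / 8))


# Illustrative numbers (assumed): the data set holds 1,000,000 items, but only
# about 1,000 of them are expected to be visited by this search.
full = allocate_mark_storage(1_000_000)
predicted = allocate_mark_storage(1_000)
print(len(full), len(predicted))  # 125000 125
```

The sketch covers sizing only. Mapping an item to a position inside the smaller allocation would need an additional scheme that is not shown here.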
According to the technical scheme of the embodiment of the invention, the space state information of the storage space used for storing the search mark information corresponding to the data to be searched is determined, and whether search mark information corresponding to the data to be searched exists is then determined based on the space state information. This solves the technical problem in the prior art that storage space must be allocated all at once for the search mark information of the data to be searched in the data set before a data search is carried out, and thus avoids allocating a large storage space for search mark information during a data search operation.
Example three
The third embodiment of the present invention provides an alternative embodiment of a data search method, and specific implementation manners thereof may be found in the following embodiments. The technical terms that are the same as or corresponding to the above embodiments are not repeated herein.
Before the embodiments of the present invention are described, it should be noted that the primary bitmap block can be understood as a storage space for storing data flag information of data to be searched. The secondary bitmap may be understood as space state information of a storage space for storing data flag information of data to be searched.
The method of the embodiment specifically comprises the following steps:
1. Acquire the requested search data and determine the search data set corresponding to the requested search data.
2. Determine, in the search data set, the data to be searched corresponding to the requested search data, and determine the secondary bitmap of the storage space used for storing the data mark information of the data to be searched.
3. If the secondary bitmap of the primary bitmap block used for storing the data mark information of the data to be searched is uninitialized, change the secondary bitmap from uninitialized to initialized, initialize the primary bitmap block corresponding to the secondary bitmap, and store the data mark information of the data to be searched in the initialized primary bitmap block.
Initializing the primary bitmap block corresponding to the secondary bitmap can be understood as initializing the previously uninitialized space used for storing the data mark information of the data to be searched.
4. If the secondary bitmap of the primary bitmap block used for storing the data mark information of the data to be searched is already initialized, store the data mark information of the data to be searched directly in that initialized primary bitmap block.
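Steps 3 and 4 can be sketched in Python as a two-level bitmap with lazy initialization. The class name `TwoLevelBitmap`, the block size of 1024 bits, and the use of integer item IDs are assumptions for illustration. Python cannot hold truly uninitialized memory, so creating the zeroed block on first use stands in for initializing it. The `reset` method illustrates the clearing of space state information described earlier in the document.

```python
class TwoLevelBitmap:
    """Primary bitmap blocks hold the marks; the secondary bitmap records
    which primary blocks have been initialized."""

    def __init__(self, capacity: int, block_bits: int = 1024):
        self.block_bits = block_bits
        n_blocks = (capacity + block_bits - 1) // block_bits
        # Secondary bitmap: one flag per primary block (False = uninitialized).
        self.secondary = [False] * n_blocks
        # Primary blocks are not zeroed up front; None models uninitialized space.
        self.primary = [None] * n_blocks

    def is_marked(self, item_id: int) -> bool:
        block, offset = divmod(item_id, self.block_bits)
        if not self.secondary[block]:
            return False  # uninitialized block: no mark can exist in it
        return bool(self.primary[block][offset // 8] & (1 << (offset % 8)))

    def mark(self, item_id: int) -> None:
        block, offset = divmod(item_id, self.block_bits)
        if not self.secondary[block]:
            # Step 3: flip the state flag, then initialize the primary block.
            self.secondary[block] = True
            self.primary[block] = bytearray((self.block_bits + 7) // 8)
        # Steps 3 and 4: store the mark in the initialized primary block.
        self.primary[block][offset // 8] |= 1 << (offset % 8)

    def reset(self) -> None:
        # Only the state flags are cleared; stale primary blocks are
        # re-initialized on their next first use.
        self.secondary = [False] * len(self.secondary)


bitmap = TwoLevelBitmap(capacity=10_000)
bitmap.mark(5000)
print(bitmap.is_marked(5000), bitmap.is_marked(5001), sum(bitmap.secondary))
# True False 1
```

Only the block that actually receives a mark is initialized, so the initialization cost follows the number of blocks touched by a search rather than the size of the whole data set.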
The technical scheme of the embodiment of the invention solves the technical problem in the prior art that data in a data set is searched repeatedly: it realizes rapid data search and avoids repeated searches during the data search process, thereby improving data search efficiency. It also solves the technical problem in the prior art that, before a data search is carried out in a data set, storage space must be allocated and initialized all at once for the search mark information of the data to be searched, and thus avoids allocating a large storage space for search mark information during a data search operation.
Example four
Fig. 3 is a schematic structural diagram of a data search apparatus according to a fourth embodiment of the present invention, where the data search apparatus includes: a data subset to be searched determining module 410, a data marking module 420 and a search result obtaining module 430.
The to-be-searched data subset determining module 410 is configured to obtain search request data, determine a search data set corresponding to the search request data, and determine a to-be-searched data subset corresponding to the search request data according to the search data set, where the to-be-searched data subset includes at least one piece of to-be-searched data; a data marking module 420, configured to determine, for each piece of data to be searched included in the subset of data to be searched, whether search mark information corresponding to the data to be searched exists, and if not, mark the data to be searched, where the search mark information is used to indicate that the data to be searched has been searched; a search result obtaining module 430, configured to, when corresponding search mark information exists in all the data to be searched in the subset of data to be searched, obtain a search result based on a similarity between the requested search data and each piece of data to be searched.
According to the technical scheme of the embodiment of the invention, the to-be-searched data subset determining module acquires the requested search data, determines the search data set corresponding to the requested search data, and determines, according to the search data set, the data subset to be searched corresponding to the requested search data, where the data subset to be searched includes at least one piece of data to be searched. The data marking module determines, for each piece of data to be searched in the data subset to be searched, whether search mark information corresponding to that data exists, and marks the data if it does not; the search mark information indicates that the data to be searched has already been searched. The search result obtaining module obtains a search result based on the similarity between the requested search data and each piece of data to be searched once all the data to be searched in the subset has corresponding search mark information. The technical scheme of the embodiment of the invention solves the technical problem in the prior art that data in a data set is searched repeatedly, realizes rapid data search, and avoids repeated searches during the data search process, thereby improving data search efficiency.
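The interplay of the marking and result-obtaining modules can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the function name `search`, the use of a plain `set` as the mark store, and the toy similarity (negative absolute difference between numbers) are all assumptions.

```python
from __future__ import annotations


def search(query: float, subset: list[int], data: dict[int, float],
           top_k: int = 3) -> list[int]:
    """Score each item in `subset` at most once, then rank by similarity."""
    marked: set[int] = set()        # stands in for the search mark information
    scores: dict[int, float] = {}
    for item_id in subset:
        if item_id in marked:
            continue                # already searched: skip the repeat
        # Toy similarity (assumption): closer values are more similar.
        scores[item_id] = -abs(data[item_id] - query)
        marked.add(item_id)         # mark the item as searched
    # Every item in the subset now has a mark, so the result can be formed.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


data = {1: 0.0, 2: 5.0, 3: 4.5, 4: 10.0}
print(search(5.2, [1, 2, 3, 2, 4, 3], data, top_k=2))  # [2, 3]
```

Items 2 and 3 appear twice in the subset but are scored only once each, which is the repeated work the marking step is meant to avoid.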
Optionally, the data tagging module 420 is configured to tag the associated search data when the similarity between the associated search data and the requested search data is found.
Optionally, the data tagging module 420 is configured to determine space state information of a storage space for storing search tag information corresponding to the data to be searched, and determine whether search tag information corresponding to the data to be searched exists based on the space state information.
Optionally, the spatial state information includes uninitialized and initialized; a data marking module 420, configured to determine that there is no search marking information corresponding to the data to be searched if the spatial state information is not initialized; and if the space state information is initialized, determining that the search mark information corresponding to the data to be searched exists.
Optionally, the data marking module 420 is configured to determine a storage space for storing the search mark information, update the space state information of the storage space from un-initialized to initialized, and write the search mark information into the storage space.
Optionally, the data marking module 420 is configured to write the data to be searched into a storage space corresponding to the search marking information of the data to be searched, so as to mark the data to be searched.
Optionally, the apparatus further comprises a data discarding module, configured to, if the storage space for storing the search mark information of the data to be searched is full, use the portion of the data to be searched that has already been written as the search mark information and discard the portion that has not been written.
Optionally, the apparatus further comprises a space state information emptying module, configured to clear, after the search result is obtained, the space state information of the storage space for storing the search mark information.
Optionally, the to-be-searched data subset determining module 410 is configured to determine initial search data corresponding to the search request data in the search data set; determining associated search data corresponding to the initial search data according to the similarity between the initial search data and the rest of data to be searched in the search data set; and constructing a data subset to be searched according to the initial search data and the associated search data.
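The construction of the data subset to be searched can be sketched in Python as below. The function name `build_subset`, the choice of the item closest to the query as the initial search data, the fixed number of associated items, and the numeric toy data are assumptions for illustration; the text does not state how the initial search data is selected or how many associated items are kept.

```python
from __future__ import annotations


def build_subset(query: float, data: dict[int, float],
                 n_associated: int = 2) -> list[int]:
    """Return the initial item followed by its most similar remaining items."""
    if not data:
        return []
    # Initial search data (assumption): the item closest to the query.
    initial = min(data, key=lambda i: abs(data[i] - query))
    # Associated search data: remaining items ranked by similarity to the
    # initial item, not to the query.
    rest = [i for i in data if i != initial]
    rest.sort(key=lambda i: abs(data[i] - data[initial]))
    return [initial] + rest[:n_associated]


data = {1: 0.0, 2: 5.0, 3: 4.5, 4: 9.0}
print(build_subset(5.2, data))  # [2, 3, 4]
```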
The device can execute the data search method provided by any embodiment of the invention, and has the corresponding functional module and the beneficial effect of executing the data search method.
It should be noted that, the units and modules included in the data search apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
Example five
Fig. 4 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing any of the embodiments of the present invention. The electronic device 12 shown in fig. 4 is only an example and should not bring any limitation to the function and the scope of use of the embodiment of the present invention. The device 12 is typically an electronic device that undertakes the processing of configuration information.
As shown in FIG. 4, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples the various components (including the memory 28 and the processing unit 16).
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer-readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer-readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown, but commonly referred to as a "hard drive"). Although not shown, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product 40, with program product 40 having a set of program modules 42 configured to carry out the functions of embodiments of the invention. Program product 40 may be stored, for example, in memory 28, and such program modules 42 include, but are not limited to, one or more application programs, other program modules, and program data; each of these, or some combination of them, may include an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, mouse, camera, or display), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., a network card or modem) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. The electronic device 12 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Array of Independent Disks (RAID) devices, tape drives, and data backup storage devices.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the memory 28, for example, implementing the data search method provided by the above-described embodiment of the present invention, the method including:
acquiring request search data, determining a search data set corresponding to the request search data, and determining a data subset to be searched corresponding to the request search data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched; determining whether search mark information corresponding to the data to be searched exists or not for each piece of data to be searched contained in the data subset to be searched, and marking the data to be searched if the search mark information does not exist, wherein the search mark information is used for indicating that the data to be searched has been searched; and when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the request search data and each piece of data to be searched.
Of course, those skilled in the art can understand that the processor can also implement the technical solution of the data searching method provided in any embodiment of the present invention.
Example six
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements, for example, the data searching method provided in the foregoing embodiments of the present invention, the method including:
acquiring request search data, determining a search data set corresponding to the request search data, and determining a data subset to be searched corresponding to the request search data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched; determining whether search mark information corresponding to the data to be searched exists or not for each piece of data to be searched contained in the data subset to be searched, and marking the data to be searched if the search mark information does not exist, wherein the search mark information is used for indicating that the data to be searched has been searched; and when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the request search data and each piece of data to be searched.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method of searching data, comprising:
acquiring request search data, determining a search data set corresponding to the request search data, and determining a data subset to be searched corresponding to the request search data according to the search data set, wherein the data subset to be searched comprises at least one piece of data to be searched;
determining whether search mark information corresponding to the data to be searched exists or not for each piece of data to be searched contained in the data subset to be searched, and marking the data to be searched if the search mark information does not exist, wherein the search mark information is used for indicating that the data to be searched has been searched;
and when all the data to be searched in the data subset to be searched has corresponding search mark information, obtaining a search result based on the similarity between the request search data and each piece of data to be searched.
2. The method of claim 1, wherein the marking the data to be searched comprises:
marking the data to be searched when the similarity between the data to be searched and the requested search data is queried.
3. The method of claim 1, wherein the determining whether search flag information corresponding to the data to be searched exists comprises:
determining space state information of a storage space for storing search mark information corresponding to the data to be searched, and determining whether the search mark information corresponding to the data to be searched exists based on the space state information.
4. The method of claim 3, wherein the spatial state information comprises uninitialized and initialized; the determining whether search flag information corresponding to the data to be searched exists based on the spatial state information includes:
if the space state information is not initialized, determining that search mark information corresponding to the data to be searched does not exist;
and if the space state information is initialized, determining that the search mark information corresponding to the data to be searched exists.
5. The method of claim 4, wherein the marking the data to be searched comprises:
determining a storage space for storing the search mark information, updating the space state information of the storage space from uninitialized to initialized, and writing the search mark information into the storage space.
6. The method of claim 5, wherein the marking the data to be searched comprises:
writing the data to be searched into a storage space corresponding to the search mark information of the data to be searched, so as to mark the data to be searched.
7. The method of claim 6, further comprising:
if the storage space for storing the search mark information of the data to be searched is full, taking the portion of the data to be searched that has already been written as the search mark information, and discarding the portion of the data to be searched that has not been written.
8. The method of claim 3, wherein after the obtaining search results, the method further comprises:
performing emptying processing on the space state information of the storage space for storing the search mark information.
9. The method of claim 1, wherein determining a subset of data to be searched corresponding to the requested search data from the search data set comprises:
determining initial search data corresponding to the search request data in the search data set;
determining associated search data corresponding to the initial search data according to the similarity between the initial search data and the rest of data to be searched in the search data set;
and constructing a data subset to be searched according to the initial search data and the associated search data.
10. A data search apparatus, comprising:
a to-be-searched data subset determining module, configured to acquire search request data, determine a search data set corresponding to the search request data, and determine, according to the search data set, a to-be-searched data subset corresponding to the search request data, where the to-be-searched data subset comprises at least one piece of to-be-searched data;
a data marking module, configured to determine, for each piece of data to be searched included in the subset of data to be searched, whether search mark information corresponding to the data to be searched exists, and if not, mark the data to be searched, where the search mark information is used to indicate that the data to be searched has been searched;
and a search result obtaining module, configured to obtain a search result based on the similarity between the request search data and each piece of data to be searched when all the data to be searched in the data subset to be searched has corresponding search mark information.
CN202210286230.3A 2022-03-23 2022-03-23 Data searching method and device, electronic equipment and storage medium Active CN114385891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210286230.3A CN114385891B (en) 2022-03-23 2022-03-23 Data searching method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114385891A true CN114385891A (en) 2022-04-22
CN114385891B CN114385891B (en) 2022-06-28

Family

ID=81204880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210286230.3A Active CN114385891B (en) 2022-03-23 2022-03-23 Data searching method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114385891B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070005573A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Automatic filtering and scoping of search results
CN101963966A (en) * 2009-07-24 2011-02-02 李占胜 Method for sorting search results by adding labels into search results
US20130173655A1 (en) * 2012-01-04 2013-07-04 International Business Machines Corporation Selective fetching of search results
CN105786969A (en) * 2016-02-01 2016-07-20 百度在线网络技术(北京)有限公司 Information display method and apparatus
CN109344336A (en) * 2018-12-25 2019-02-15 北京时光荏苒科技有限公司 Searching method, search set creation method, device, medium, terminal and server
CN111183422A (en) * 2017-08-31 2020-05-19 深圳市欢太科技有限公司 Information processing method and related product
CN111209431A (en) * 2020-01-13 2020-05-29 上海极链网络科技有限公司 Video searching method, device, equipment and medium
CN111260193A (en) * 2020-01-09 2020-06-09 江苏满运软件科技有限公司 Vehicle and goods matching search system, method, computer device and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303124A (en) * 2023-03-29 2023-06-23 浙江正泰仪器仪表有限责任公司 Data searching method and device, electronic equipment and storage medium
CN116303124B (en) * 2023-03-29 2024-01-30 浙江正泰仪器仪表有限责任公司 Data searching method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114385891B (en) 2022-06-28

Similar Documents

Publication Publication Date Title
CN111090628B (en) Data processing method and device, storage medium and electronic equipment
CN110390054B (en) Interest point recall method, device, server and storage medium
CN107038157B (en) Artificial intelligence-based recognition error discovery method and device and storage medium
CN111258966A (en) Data deduplication method, device, equipment and storage medium
CN109471851B (en) Data processing method, device, server and storage medium
CN111104347B (en) Heap memory block searching method, device, equipment and storage medium
CN111061740B (en) Data synchronization method, device and storage medium
CN109376173A (en) A kind of data query method, apparatus, electronic equipment and storage medium
CN110688434B (en) Method, device, equipment and medium for processing interest points
CN114373460A (en) Instruction determination method, device, equipment and medium for vehicle-mounted voice assistant
CN114116811B (en) Log processing method, device, equipment and storage medium
CN114385891B (en) Data searching method and device, electronic equipment and storage medium
US9213759B2 (en) System, apparatus, and method for executing a query including boolean and conditional expressions
CN107729944A (en) A kind of recognition methods, device, server and the storage medium of vulgar picture
CN113807416A (en) Model training method and device, electronic equipment and storage medium
CN113760894A (en) Data calling method and device, electronic equipment and storage medium
CN112328630B (en) Data query method, device, equipment and storage medium
CN111400282B (en) Data processing strategy adjustment method, device, equipment and storage medium
CN111061744B (en) Graph data updating method and device, computer equipment and storage medium
CN114077858A (en) Vector data processing method, device, equipment and storage medium
CN108280139B (en) POI data processing method, device, equipment and computer readable storage medium
CN111782834A (en) Image retrieval method, device, equipment and computer readable storage medium
CN112560459A (en) Sample screening method, device, equipment and storage medium for model training
CN110750569A (en) Data extraction method, device, equipment and storage medium
CN110716946A (en) Method and device for updating feature rule matching library, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant