CN109977334B - Search speed optimization method - Google Patents


Info

Publication number
CN109977334B
Authority
CN
China
Prior art keywords
data
search
storage
writing
search item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910231353.5A
Other languages
Chinese (zh)
Other versions
CN109977334A (en
Inventor
潘杰
曹建军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Duyan Information Technology Co ltd
Original Assignee
Zhejiang Duyan Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Duyan Information Technology Co ltd
Priority to CN201910231353.5A
Publication of CN109977334A
Application granted
Publication of CN109977334B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a search speed optimization method in which a plurality of corresponding storage files are established for each search item, and each piece of data under the search item is written into the corresponding storage file by a plurality of corresponding threads; for search items whose data follow a regular order, the corresponding storage files are combined into one storage file in which the data are arranged in sequence; and searching is performed on the combined storage file according to the input search condition. Writing data with multiple threads and multiple files improves the writing efficiency. Combining the storage files of regularly ordered search items into one sorted storage file, and searching on that file, ensures that the gain in writing efficiency does not come at the cost of searching efficiency.

Description

Search speed optimization method
Technical Field
The application belongs to the field of information retrieval, and particularly relates to a search speed optimization method.
Background
With the advent of the big data age, how to quickly and accurately retrieve the needed data from huge databases using search engines has become a research hotspot. Currently, a search engine generally comprises four main parts, namely capturing, recording, screening and displaying, and conventional search engines suffer from low writing efficiency and low searching efficiency.
Disclosure of Invention
The application provides a search speed optimization method, which aims to solve the problem of low writing and searching efficiency of the existing search engine.
According to a first aspect of an embodiment of the present application, there is provided a search speed optimization method, including:
for each search item, respectively establishing a plurality of corresponding storage files, and writing each piece of data under the search item into the corresponding storage files by adopting a plurality of corresponding threads;
for search items whose data follow a regular order, combining the plurality of corresponding storage files into one storage file, and arranging the data in sequence in the combined storage file; and searching according to the input search condition based on the combined storage file.
In an optional implementation manner, for a search item in text format, word segmentation processing is performed on the first text data stored under the search item to generate a plurality of first words, word segmentation processing is performed on the second text data in an input search condition to generate a plurality of second words, and each second word is matched against the plurality of first words, so that the first text data matching the second text data in the search condition is obtained.
In another alternative implementation, after the search condition is input, field condition combination or field logic value combination is performed on the search condition;
and searching according to the combined searching conditions.
In another alternative implementation, for a search item that occupies little space, the data under the search item is preloaded into memory prior to searching.
In another alternative implementation, the storage format type of the data under each search item is set according to the characteristics of the data under the search item.
In another optional implementation manner, when writing a complete piece of data information, for the data of that piece of data information under each search item, it is first judged whether an idle thread exists under the search item; if so, any idle thread is called, the data is written into the storage file corresponding to the called thread, a separator is written after the data writing is completed, and the called thread is set to idle.
In another alternative implementation manner, for each search item that allows a longer input field, the threads are divided into main threads and standby threads, and a temporary storage file corresponding to the standby threads is established; such a search item has a larger total number of threads than a search item that allows a shorter input field;
when writing a complete piece of data information, for the data of that piece of data information under each search item, it is first judged whether an idle main thread exists under the search item; if so, any idle main thread is called, the data is written into the storage file corresponding to the called main thread, a separator is written after the data writing is finished, and the called main thread is set to idle; if no idle main thread exists under the search item, a write request is inserted at the rear end of the corresponding queue and a corresponding timer is started; it is judged whether the timer exceeds the corresponding duration, and if so, an idle standby thread is started, the standby thread writes the data into the corresponding temporary storage file, a separator is written after the data is written, and the standby thread is set to idle.
In another alternative implementation manner, the number of standby threads and/or the corresponding duration are/is adjusted to regulate the writing duration of the whole piece of data information, so that the updating speed of the database is regulated.
In another alternative implementation manner, an address association table is established for each piece of data information, the address association table comprises storage address information of data of the piece of data information under each search item, and an index table is established for each search item, the index table comprises storage address information of data of each piece of data information under the search item and a pointing address corresponding to each storage address information, the storage address information comprises a storage file address of corresponding data and an address of the data under a storage file of the data, and the pointing address is used for pointing to a position of a first storage address information in the address association table corresponding to the storage address information of the data;
when searching under the corresponding search item, searching according to the input search condition based on the data stored in the storage file under the search item; when the search result is displayed, firstly, the storage address information of the data matched with the input search condition is determined, then the pointing address corresponding to the determined storage address information is searched from the index table corresponding to the search item, the position of the first storage address information in the address association table corresponding to the determined storage address information is located according to the searched pointing address, the address association table corresponding to the determined storage address information is obtained, and the storage address of the data information under each search item is obtained, so that complete data information is obtained.
In another optional implementation manner, the search items include a name, an age, a self-introduction, a province, a deposit and a birth date, wherein the storage format type of the name is set as a text type, the storage format type of the age is set as an int type, the storage format type of the self-introduction is set as a text type, the storage format type of the province is set as a keyword type, the storage format type of the deposit is set as a long type, and the storage format type of the birth date is set as a date type.
The beneficial effects of the application are as follows:
1. according to the application, by adopting a multi-thread, multi-file writing mode, the data corresponding to a search item in the data information is written into the corresponding storage files, and even if the field length occupied by the data corresponding to the search item in each piece of data information is uncertain, the corresponding data in a plurality of pieces of data information can be written in parallel, so that the writing efficiency can be improved; by combining the corresponding storage files into one storage file for search items whose data follow a regular order, and arranging the data in sequence in the combined storage file, the writing efficiency can be improved by the multi-thread, multi-file writing mode while the searching efficiency is still ensured;
2. according to the application, word segmentation processing is carried out on the text data stored under a search item in text format and on the text data input in the search condition, and the two pieces of segmented text are matched word by word rather than character by character, so that the search efficiency can be greatly improved;
3. the application can improve the searching efficiency by carrying out the field condition combination or the field logic value combination on the searching condition and then searching;
4. according to the application, aiming at the retrieval items with less space occupation, the data under the retrieval items are preloaded into the memory before retrieval, so that the retrieval efficiency can be further improved;
5. the application can ensure the accuracy of the search by accurately setting the format type of the data of each search item;
6. according to the application, aiming at the search items with longer allowed input field length in each search item, a standby thread and a temporary storage file are additionally arranged, when the writing data does not have an idle main thread, a timer is additionally arranged, and when the writing request exceeds the corresponding duration, the idle standby thread is called to write the data into the temporary storage file, so that the data writing time length of the search items with longer allowed input field length can be dynamically adjusted, the writing time length of the whole piece of data information is controllable, and a designer can conveniently regulate and control the updating speed of a database;
7. according to the application, the address association table is established for each piece of data information, and the index table is established for each search item, so that the data of the data information stored under each search item can be accurately associated, and the complete data information can be quickly found out when the data information is displayed, thereby improving the accuracy and speed of search.
Drawings
FIG. 1 is a flow chart of one embodiment of the search rate optimization method of the present application.
Detailed Description
In order to better understand the technical solution in the embodiments of the present application and make the above objects, features and advantages of the embodiments of the present application more comprehensible, the technical solution in the embodiments of the present application is described in further detail below with reference to the accompanying drawings.
In the description of the present application, unless otherwise specified and defined, it should be noted that the term "connected" should be interpreted broadly, and for example, it may be a mechanical connection or an electrical connection, or may be a connection between two elements, or may be a direct connection or may be an indirect connection through an intermediary, and it will be understood to those skilled in the art that the specific meaning of the term may be interpreted according to the specific circumstances.
Referring to FIG. 1, a flowchart of one embodiment of a search rate optimization method of the present application is shown. The method is applied to a processing device (such as a computer, a server and the like), and the processing device realizes the search speed optimization, and the method can comprise the following steps:
step S101, establishing a plurality of corresponding storage files for each search item, and writing each data under the search item into the corresponding storage file by adopting a plurality of corresponding threads.
In a conventional search engine, when the acquired data information is stored item by item according to the search items, generally only one thread is used for each search item and only one storage file is built. Only after the thread has written the data corresponding to the search item in the previous piece of data information into the storage file can the data corresponding to the search item in the next piece of data information be written, so the writing efficiency of the data information is low. In addition, the field length of the data corresponding to a given search item may differ from one piece of data information to the next. Even if multi-thread writing were adopted, the next thread would have to wait for the previous thread to finish writing the corresponding data of the previous piece of data information into the storage file before it could write the corresponding data of the next piece; if writing of the next piece started before the previous piece was completely written, the data would be disordered. The writing efficiency of the data information therefore cannot be improved by multi-thread writing alone. For this reason, the application proposes a multi-thread, multi-file writing mode. Taking the search item age as an example, the data information comprises age data, and 5 storage files are established for the search item age; correspondingly, 5 threads can store the age data of 5 pieces of data information into the corresponding storage files respectively, so that the age data of 5 pieces of data information is stored in parallel and the writing efficiency is improved.
By adopting the multi-thread, multi-file writing mode, the application writes the data corresponding to a search item in the data information into the corresponding storage files, and even if the field length of the data corresponding to the search item in each piece of data information is uncertain, the corresponding data in a plurality of pieces of data information can be written in parallel, thereby improving the writing efficiency.
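The multi-thread, multi-file writing of step S101 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the file naming scheme `age_0.dat` ... `age_4.dat`, the use of a shared work queue and the newline separator are hypothetical and are not taken from the application.

```python
import os
import queue
import threading

SEPARATOR = "\n"  # assumed separator written after each value


def write_item_parallel(values, directory, item="age", n_files=5):
    """Write the values of one search item with n_files threads, one file per thread."""
    pending = queue.Queue()
    for value in values:
        pending.put(value)

    paths = [os.path.join(directory, f"{item}_{i}.dat") for i in range(n_files)]

    def worker(path):
        # Each thread owns exactly one storage file, so a write never has
        # to wait for a write into a different file to finish.
        with open(path, "w", encoding="utf-8") as f:
            while True:
                try:
                    value = pending.get_nowait()
                except queue.Empty:
                    return
                f.write(f"{value}{SEPARATOR}")

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths


def read_values(path):
    """Read back the integer values stored in one storage file."""
    with open(path, encoding="utf-8") as f:
        return [int(line) for line in f if line.strip()]
```

Which value lands in which file depends on thread scheduling, which is why the stored order is not predictable and a later merge step is needed for ordered retrieval.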
Step S102, for search items whose data follow a regular order, combining the plurality of corresponding storage files into one storage file, and arranging the data in sequence in the combined storage file; and searching according to the input search condition based on the combined storage file.
While multi-thread, multi-file writing may improve writing efficiency, it may make retrieval less efficient. For example, suppose the 5 storage files holding ages contain "2,3", "1,5", "5,6", "2,7" and "1,2" respectively. To query the maximum age, the maximum age in each of the 5 storage files must first be found, and the 5 maxima must then be compared to obtain the required maximum age. For this reason, for search items whose data follow a regular order, the application combines the storage files into one storage file at search time, with the data arranged in sequence in the combined storage file. Such search items can include name, age, province, deposit, birth date and the like: the name and the province can be ordered by the pinyin initial of the first character, the age and the deposit can be ordered from small to large, and the birth date can be ordered chronologically. Taking the 5 storage files for age as an example, containing "2,3", "1,5", "5,6", "2,7" and "1,2" respectively, when searching for the maximum age the application first combines the 5 storage files into one storage file "1,1,2,2,2,3,5,5,6,7" and can then search directly from the end of the combined storage file. By combining the corresponding storage files into one storage file for search items whose data follow a regular order, so that the data are arranged in order in the combined storage file, the application improves the writing efficiency through the multi-thread, multi-file writing mode while still ensuring the searching efficiency.
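The merging step can be illustrated with the five age files from the example. The sketch below assumes each storage file has already been read into a Python list; `merge_storage_files` is a hypothetical helper name, and the use of a k-way merge from the standard library is one possible way to produce the ordered file, not necessarily the one used in the application.

```python
import heapq


def merge_storage_files(files):
    """Combine the per-thread value lists into one ascending list."""
    # Each file is sorted first; heapq.merge then performs a k-way merge
    # of the sorted inputs.
    return list(heapq.merge(*(sorted(f) for f in files)))


age_files = [[2, 3], [1, 5], [5, 6], [2, 7], [1, 2]]
merged = merge_storage_files(age_files)

# With the data in ascending order, the maximum age is simply the last entry.
max_age = merged[-1]
```

For the five example files the merged list is 1,1,2,2,2,3,5,5,6,7 and the maximum age is read from its end without comparing per-file maxima.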
Specifically, when writing a complete piece of data information, for the data of that piece of data information under each search item, it is first judged whether an idle thread exists under the search item. If so, any idle thread is called, the data is written into the storage file corresponding to the called thread, a separator is written after the data writing is finished, and the called thread is set to idle. If not, a write request is inserted at the rear end of the corresponding queue and waits until it reaches the head of the queue and an idle thread exists. The field lengths that the search items allow are usually different. For example, among the search items name, age, self-introduction, province, deposit and date of birth, the self-introduction allows a far longer input field than the other five search items, so its writing time is correspondingly longer. If the self-introduction used the same number of threads and storage files as the other five search items, the data corresponding to the other five search items could already be completely written while the data corresponding to the self-introduction would still be waiting to be written, and the waiting time would be very long.
In order to improve the overall writing efficiency of the data information, for the search items that allow a longer input field (for example, a search item can be determined to allow a longer input field when the difference between its allowed input field length and the average allowed input field length of the remaining search items is greater than a corresponding preset length), the threads are divided into main threads and standby threads, and a temporary storage file corresponding to the standby threads is established. There may be several main threads and there is at least one standby thread, and the total number of threads is larger than for a search item that allows a shorter input field. When writing a complete piece of data information, for the data of that piece of data information under each search item, it is first judged whether an idle main thread exists under the search item. If so, any idle main thread is called, the data is written into the storage file corresponding to the called main thread, a separator is written after the data writing is finished, and the called main thread is set to idle. If no idle main thread exists under the search item, a write request is inserted at the rear end of the corresponding queue and a corresponding timer is started. It is then judged whether the timer exceeds the corresponding duration; if so, an idle standby thread is started, the standby thread writes the data into the corresponding temporary storage file, a separator is written after the data is written, and the standby thread is set to idle.
For the search items that allow a longer input field, the application adds standby threads and a temporary storage file, starts a timer when no idle main thread is available for a write, and calls an idle standby thread to write the data into the temporary storage file when the write request has waited longer than the corresponding duration. The data writing time of search items with a longer allowed input field can thus be adjusted dynamically, the writing time of the whole piece of data information becomes controllable, and a designer can conveniently regulate the updating speed of the database. The application regulates the writing time of the whole piece of data information by adjusting the number of standby threads and/or the corresponding duration, thereby regulating the updating speed of the database. In addition, after a standby thread has written data into the corresponding temporary storage file, it is judged whether an idle main thread is in a long-term idle state (for example, idle for longer than a preset time period); if so, the idle main thread is called, the data in the temporary storage file is written into the corresponding storage file, a separator is written after the data is written, and the temporary storage file is emptied. Writing the data in the temporary storage file into the storage file corresponding to a main thread when that main thread has been idle for a long time means, on the one hand, that retrieval only needs to be carried out in the storage files corresponding to the main threads, which makes the data easy to collect, and on the other hand, that the temporary storage file is emptied and ready for the next data write.
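The fallback from main threads to a standby thread can be sketched as below. This is a simplified model under several assumptions, not the application's implementation: a semaphore count stands in for the number of idle main threads, the semaphore's acquire timeout stands in for the queue plus timer, and in-memory lists stand in for the storage files and the temporary storage file. The class and attribute names are hypothetical.

```python
import threading


class ItemWriter:
    """Hypothetical writer for one search item with a long allowed input field."""

    def __init__(self, n_main=2, timeout=0.05):
        self.main_slots = threading.Semaphore(n_main)  # number of idle main threads
        self.timeout = timeout    # the "corresponding duration" of the timer
        self.storage = []         # stands in for the main storage files
        self.temp = []            # stands in for the temporary storage file
        self.lock = threading.Lock()

    def write(self, value):
        # Wait up to `timeout` seconds for a main thread to become idle.
        if self.main_slots.acquire(timeout=self.timeout):
            try:
                with self.lock:
                    self.storage.append(value)
            finally:
                self.main_slots.release()  # the main thread is idle again
            return "main"
        # The timer expired: the standby path writes to the temporary file.
        with self.lock:
            self.temp.append(value)
        return "standby"

    def flush_temp(self):
        """Move temporary data into main storage once a main thread is idle."""
        with self.main_slots, self.lock:
            self.storage.extend(self.temp)
            self.temp.clear()
```

In this model, lowering `timeout` or raising the number of standby writers shortens the worst-case write time of a record, which corresponds to the tuning knobs described above.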
Because the data information is written in a multi-thread, multi-file manner, the storage positions of the data of each piece of data information under the search items are disordered and follow no regular pattern. In order to accurately present the corresponding data in the data information, an address association table is established for each piece of data information, and an index table is established for each search item. The address association table comprises the storage address information of the data of that piece of data information under each search item. The index table comprises the storage address information of the data of each piece of data information under the search item and a pointing address corresponding to each piece of storage address information. The storage address information comprises the address of the storage file holding the data and the address of the data within that storage file, and the pointing address points to the position of the first storage address information in the address association table to which the storage address information belongs.
When searching under the corresponding search item, searching according to the input search condition based on the data stored in the storage file under the search item; when the search result is displayed, firstly, the storage address information of the data matched with the input search condition is determined, then the pointing address corresponding to the determined storage address information is searched from the index table corresponding to the search item, the position of the first storage address information in the address association table corresponding to the determined storage address information is located according to the searched pointing address, the address association table corresponding to the determined storage address information is obtained, and the storage address of the data information under each search item is obtained, so that complete data information is obtained. According to the application, the address association table is established for each piece of data information, and the index table is established for each search item, so that the data of the data information stored under each search item can be accurately associated, and the complete data information can be quickly found out when the data information is displayed, thereby improving the accuracy and speed of search.
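The relationship between the two tables can be illustrated with plain dictionaries. All data, file names and the helper `full_record` below are hypothetical; in particular, the pointing address is modeled simply as the position of a record's address association table in a list, which is an assumption made for brevity.

```python
# Storage address = (storage file name, offset inside that file); both hypothetical.
storage = {
    ("name_0", 0): "Zhang San",
    ("name_1", 0): "Li Si",
    ("age_2", 0): 25,
    ("age_0", 0): 31,
}

# One address association table per piece of data information:
# search item -> storage address of that record's value under the item.
association_tables = [
    {"name": ("name_0", 0), "age": ("age_2", 0)},
    {"name": ("name_1", 0), "age": ("age_0", 0)},
]

# One index table per search item: storage address -> pointing address
# (here, the position of the record's address association table).
index_tables = {
    item: {table[item]: pos for pos, table in enumerate(association_tables)}
    for item in ("name", "age")
}


def full_record(item, address):
    """Rebuild the complete record from one matched storage address."""
    table = association_tables[index_tables[item][address]]
    return {field: storage[addr] for field, addr in table.items()}
```

A match found under the age item at address `("age_0", 0)` is thereby resolved to the whole record, including the name stored in a different file.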
As can be seen from the above embodiments, by adopting the multi-thread, multi-file writing mode, the application writes the data corresponding to a search item in the data information into the corresponding storage files, and even if the field length occupied by the data corresponding to the search item in each piece of data information is uncertain, the corresponding data in a plurality of pieces of data information can be written in parallel, thereby improving the writing efficiency; for search items whose data follow a regular order, the application combines the corresponding storage files into one storage file in which the data are arranged in order, thereby improving the writing efficiency through the multi-thread, multi-file writing mode while still ensuring the searching efficiency.
In addition, a search item in text format is usually searched by splitting the text. For example, a self-introduction reads "I am Zhang San, I like to play basketball, I am a lively little boy." By default a search engine stores this sentence with each character as one unit; the sentence has nineteen characters in the original Chinese, glossed here as "I/am/Zhang/San/I/happy/cheerful/play/basket/ball/I/am/a/live/ly/(particle)/small/male/child". Suppose a query requires the self-introduction field to match "basketball". The condition is likewise broken down into the two characters 'basket/ball'. The character 'basket' is matched first, scanning from 'I' until it is found at the ninth character; then the character 'ball' is matched, scanning from 'I' until it is found at the tenth character. Nineteen comparisons are made in total, so the matching efficiency is low. Therefore, the application introduces a word segmentation system: word segmentation processing is carried out on the first text data stored under a search item in text format to generate a plurality of first words, word segmentation processing is carried out on the second text data in the input search condition to generate a plurality of second words, and each second word is matched against the plurality of first words, thereby obtaining the first text data that matches the second text data in the search condition. For example, the self-introduction becomes "I/am/Zhang San/I/like/play/basketball/I/am a/lively/little/boy" and the query condition becomes "basketball", so the result is found after only 7 comparisons, from 'I' to 'basketball'. In this case word segmentation makes the query more than 2 times as efficient, and the longer the text, the more obvious the improvement.
According to the application, word segmentation processing is carried out on the text data stored under the search item in text format and on the text data input in the search condition, and the two pieces of segmented text are matched word by word rather than character by character, so that the search efficiency can be greatly improved.
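The comparison counts in the basketball example can be reproduced with a small sketch. It assumes the text has already been split into units; no real word segmenter is called. The English tokens below are stand-ins, one per character or per word of the original sentence, and the function name is hypothetical.

```python
def comparisons_until_found(stored_units, query_units):
    """Count unit comparisons when each query unit is scanned from the start."""
    total = 0
    for q in query_units:
        for unit in stored_units:
            total += 1
            if unit == q:
                break  # stop scanning once this query unit is found
    return total


# One stand-in token per character of the original sentence (19 units).
by_character = ["I", "am", "Zhang", "San", "I", "li", "ke", "play", "basket",
                "ball", "I", "am", "a", "live", "ly", "of", "little", "bo", "y"]

# The same sentence after word segmentation (12 units).
by_word = ["I", "am", "Zhang San", "I", "like", "play", "basketball",
           "I", "am a", "lively", "little", "boy"]

char_cost = comparisons_until_found(by_character, ["basket", "ball"])
word_cost = comparisons_until_found(by_word, ["basketball"])
```

Character-level matching costs 9 + 10 = 19 comparisons here, whereas word-level matching costs 7, in line with the figures in the text.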
Three logical relationships are defined in the search: MUST, SHOULD and MUST_NOT. The two most representative ones are taken here to explain the combination of search conditions. MUST: the conditions must all be satisfied, whether there are multiple conditions or a single condition. SHOULD: a single condition must be satisfied; when there are multiple conditions, it is sufficient for any one of them to be satisfied. During searching, a user may enter search conditions that are relatively complex and arranged in no particular order. For example, the search conditions are:
MUST: the name contains 'Zhang';
SHOULD: the age is greater than or equal to 20 and less than or equal to 25;
MUST: the province is Zhejiang;
MUST: the name contains 'Guo';
SHOULD: the age is greater than or equal to 25 and less than or equal to 30;
In these conditions, matching of the name and matching of the age each occur several times. If the search is performed in order, the name is matched first, then the age, then the province, then the name again, and then the age again. In fact these conditions can be combined into a more compact set. The conditions after merging are:
MUST: the name contains 'Zhang' and 'Guo';
MUST: the age is greater than or equal to 20 and less than or equal to 30;
MUST: the province is Zhejiang;
With the merged conditions, the name field and the age field are each matched only once, and the remaining conditions are matched afterwards. It should be noted that the condition merging here includes two kinds of merging: one is simple field condition merging, as for the name, and the other is logical value merging of a field, as for the age, where 'greater than or equal to 20 and less than or equal to 25' and 'greater than or equal to 25 and less than or equal to 30' are merged into 'greater than or equal to 20 and less than or equal to 30'. After the search condition is input, the application performs field condition combination or field logic value combination on the search condition and searches according to the combined search conditions, whereby the search efficiency can be further improved.
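The two kinds of merging can be sketched as follows. The tuple layout `(logic, field, operator, value)`, the operator names and the function `merge_conditions` are assumptions made for illustration; the sketch only handles MUST "contains" conditions and SHOULD range conditions on the same field, and relabels a lone remaining SHOULD as MUST following the rule that a single SHOULD condition must be satisfied.

```python
def merge_conditions(conditions):
    """Merge 'contains' terms and overlapping ranges; pass other conditions through."""
    contains = {}  # field -> substrings that must all be present
    ranges = {}    # field -> list of (low, high) intervals
    others = []
    for logic, field, op, value in conditions:
        if logic == "MUST" and op == "contains":
            contains.setdefault(field, []).append(value)
        elif logic == "SHOULD" and op == "between":
            ranges.setdefault(field, []).append(value)
        else:
            others.append((logic, field, op, value))

    # Field condition merging: one combined clause per field.
    merged = [("MUST", f, "contains_all", tuple(v)) for f, v in contains.items()]

    # Logical value merging: union of overlapping or touching intervals.
    for field, intervals in ranges.items():
        intervals.sort()
        out = [intervals[0]]
        for low, high in intervals[1:]:
            if low <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], high))
            else:
                out.append((low, high))
        merged += [("SHOULD", field, "between", iv) for iv in out]

    result = merged + others
    should_positions = [i for i, c in enumerate(result) if c[0] == "SHOULD"]
    if len(should_positions) == 1:  # a single SHOULD condition must be satisfied
        i = should_positions[0]
        result[i] = ("MUST",) + result[i][1:]
    return result


example = [
    ("MUST", "name", "contains", "Zhang"),
    ("SHOULD", "age", "between", (20, 25)),
    ("MUST", "province", "equals", "Zhejiang"),
    ("MUST", "name", "contains", "Guo"),
    ("SHOULD", "age", "between", (25, 30)),
]
merged_example = merge_conditions(example)
```

For the five example conditions this produces three clauses: the name must contain both 'Zhang' and 'Guo', the age must lie between 20 and 30, and the province must be Zhejiang.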
Memory is clearly superior to disk in file read-write efficiency: even a solid state disk reaches only 500 MB/s, whereas ordinary memory can reach 7000 MB/s, so using memory sensibly can improve the search speed. Search engines search files on disk by default because the data volume of a search engine is often very large, up to several hundred GB; a server with several hundred GB of memory differs greatly in price from a server with several hundred GB of disk, and for even larger data volumes, servers with that much memory cannot be found on the market. However, some fields occupy relatively little space, such as age and province, and can be loaded into memory in advance. With these data held in memory, the search speed is greatly improved when these two fields are matched. That is, for search items that occupy little space, the application preloads the data under the search item into memory before searching, so as to improve the searching efficiency.
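Preloading a small search item can be sketched as below. The size threshold, the one-integer-per-line file layout, and the names `PreloadedItem` and `preload_if_small` are assumptions made for illustration only.

```python
import os


class PreloadedItem:
    """Holds all values of one small search item in memory."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as f:
            self.values = [int(line) for line in f if line.strip()]

    def match_range(self, low, high):
        """Return the positions of values within [low, high], without touching disk."""
        return [i for i, v in enumerate(self.values) if low <= v <= high]


def preload_if_small(path, max_bytes):
    """Preload the file only when it occupies at most max_bytes on disk."""
    if os.path.getsize(path) <= max_bytes:
        return PreloadedItem(path)
    return None  # too large: keep searching this item on disk
```

A loader like this would run once before searching, so that later range matches on fields such as age are served from memory.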
The data under different search items has different characteristics, and accurately defining the format type for each kind of data is the basis for accurate searching. The application therefore sets the storage format type of the data under each search item according to the characteristics of that data. For example, the search items include name, age, self-introduction, province, deposit, and date of birth, and the common format types for fields are:
int type (represents integers; searching and sorting are fast),
long type (represents integers longer than int; searching and sorting are fast, but it occupies more space than int),
date type (represents dates),
text type (text; searching and sorting are slow; the text is split for matching, so word-segmented matching is supported, but aggregate queries are not),
keyword type (text; searching is fast; the text is matched as an indivisible whole, so word-segmented matching is not supported, but aggregate queries are).
Next, each field of the batch of data mentioned in the precondition is analyzed.
The name is in text format, so this field must be set to the text or keyword type. When searching, we may need to find everyone with a given surname, or everyone whose name contains a given character, so the text must support matching after splitting. The type of this field is therefore text.
Age is an integer, so either the int or the long type can be selected. The exact sizes of int and long differ between machines of different word sizes and between programming languages, but in general int suits shorter integers and saves space. The type of this field is int.
Deposit, like age, is an integer, but its value may grow beyond the upper limit of int. The type of this field is long.
Self-introduction is a large block of text, and split-text matching is needed during searching, for example to find the records whose self-introduction contains the word 'playing'. This field is set to the text type.
Province is also text, but it is a special case: its field value does not need to be split for matching. Given 'Zhejiang' and 'Jiangsu', matching 'Jiangsu' means entering 'Jiangsu'; there is no need for an input of 'Jiang' to match every value containing 'Jiang'. This field is therefore set to keyword.
The date of birth is set to date type.
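The field-by-field choices above can be summarized in a small schema, sketched here under stated assumptions: the field keys, the `matches` and `integer_type` helpers, and the 32-bit bound for int are all illustrative, and plain substring matching stands in for real word-segmented matching.

```python
# Storage format type chosen for each search item in the example.
SCHEMA = {
    "name": "text",
    "age": "int",
    "deposit": "long",
    "self_introduction": "text",
    "province": "keyword",
    "date_of_birth": "date",
}

INT32_MAX = 2**31 - 1  # assumption: int is a 32-bit signed integer


def integer_type(max_expected):
    """Pick int when the largest expected value fits, otherwise long."""
    return "int" if max_expected <= INT32_MAX else "long"


def matches(field, stored, query):
    """Compare a query with a stored value according to the field's type."""
    kind = SCHEMA[field]
    if kind == "text":
        # Stand-in for split (word-segmented) matching: part of the text may match.
        return query in stored
    if kind == "keyword":
        # The value is indivisible: only the whole value matches.
        return query == stored
    raise ValueError(f"{field} is not a text-like field")
```

With this sketch, `matches("province", "Jiangsu", "Jiang")` is false while `matches("name", "Zhang Guo", "Zhang")` is true, which mirrors the distinction drawn between keyword and text.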
In addition, search engines use an operation called an aggregate query, similar to COUNT + GROUP BY in MySQL: for one or more fields, the rows whose value or values are the same are counted. Examples include querying how many people have age = 20, how many people have the province 'Zhejiang', or even which ten ages or which ten provinces have the most people. As mentioned in the field descriptions, the text type cannot be used in aggregate queries, while the keyword type can. The business may need statistics on the number of people in Zhejiang province, but it will not need to know how many people share the same self-introduction, which confirms that setting self-introduction to the text type and province to the keyword type is reasonable. By accurately setting the format type of the data of each search item, the application can ensure the accuracy of the search.
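An aggregate query of this kind amounts to counting rows per distinct field value. The sketch below uses Python's `collections.Counter`; the `aggregate` helper and the sample data are assumptions for illustration only.

```python
from collections import Counter


def aggregate(values, top=None):
    """Count rows per distinct value, like COUNT(*) with GROUP BY on one field.

    With `top` set, only the `top` most frequent values are returned.
    """
    return Counter(values).most_common(top)


provinces = ["Zhejiang", "Jiangsu", "Zhejiang", "Zhejiang", "Jiangsu", "Anhui"]
ages = [20, 21, 20, 35]

by_province = aggregate(provinces, top=2)  # the two most common provinces
people_aged_20 = Counter(ages)[20]         # how many people have age = 20
```

Counting works here because each province is treated as one indivisible value, which is the property the keyword type provides.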
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (7)

1. A search rate optimization method, comprising:
for each search item, a plurality of corresponding storage files are respectively established, and a plurality of corresponding threads are adopted to write each data under the search item into the corresponding storage files, so that when the field of the data corresponding to the search item in each piece of data information is uncertain, the parallel writing of the corresponding data in the plurality of pieces of data information is realized;
for the search items with longer allowed input fields among the search items, the threads are divided into main threads and standby threads, a temporary storage file corresponding to the standby thread is established, and the total number of threads is larger than for the search items with shorter allowed input fields; the search items with longer allowed input fields comprise self-introduction, and the search items with shorter allowed input fields comprise names and ages;
when writing a piece of complete data information, for the data of the data information under each search item, first judging whether an idle main thread exists under the search item; if so, calling any idle main thread, writing the data of the data information under the search item into the storage file corresponding to the called main thread, writing a separator after the data writing is finished, and setting the called main thread back to idle; if no idle main thread exists under the search item, inserting a write request at the rear end of the corresponding queue and starting a corresponding timer; judging whether the timer exceeds the corresponding duration, and if so, starting an idle standby thread, using the standby thread to write the data of the data information under the search item into the corresponding temporary storage file, writing a separator after the data is written, and setting the standby thread to idle; after the data of the data information under the search item has been written into the corresponding temporary storage file by the standby thread, judging whether an idle main thread is in a long-term idle state, and if so, calling the idle main thread, writing the data in the temporary storage file into the corresponding storage file, writing a separator after the data is written, and emptying the temporary storage file;
the writing time length of the whole piece of data information is regulated and controlled by regulating the number of the standby threads and the corresponding time length, so that the updating speed of the database is regulated and controlled;
combining a plurality of corresponding storage files into one storage file aiming at regularly and circularly searched items, and arranging data in sequence in the combined storage files; and searching according to the input searching condition based on the combined storage file.
2. The search rate optimizing method according to claim 1, characterized by further comprising:
for a search term in text format, performing word segmentation processing on first text data stored under the search term to generate a plurality of first words, performing word segmentation processing on second text data in an input search condition to generate a plurality of second words, and for each second word, performing matching search on the second word and the plurality of first words, so as to obtain first text data matched with the second text data in the search condition.
3. The search rate optimizing method according to claim 1 or 2, characterized by further comprising:
after the search condition is input, carrying out field condition combination or field logic value combination on the search condition;
and searching according to the combined searching conditions.
4. The search rate optimizing method according to claim 3, characterized by further comprising:
for a search term with small space occupation, the data under the search term is preloaded into the memory before searching.
5. The search rate optimizing method according to claim 4, characterized by further comprising:
and setting the storage format type of the data under each search item according to the characteristics of the data under the search item.
6. The search speed optimizing method according to claim 1, wherein for each piece of data information, an address association table is established, the address association table including storage address information of data of the piece of data information under each search item, and for each search item, an index table is established, the index table including storage address information of data of each piece of data information under the search item and a pointing address corresponding to each storage address information, the storage address information including a storage file address where the corresponding data is located and an address where the data is located under its storage file, the pointing address being used to point to a location where first storage address information is located in the address association table corresponding to the storage address information thereof;
when searching under the corresponding search item, searching according to the input search condition based on the data stored in the storage file under the search item; when the search result is displayed, firstly, the storage address information of the data matched with the input search condition is determined, then the pointing address corresponding to the determined storage address information is searched from the index table corresponding to the search item, the position of the first storage address information in the address association table corresponding to the determined storage address information is located according to the searched pointing address, the address association table corresponding to the determined storage address information is obtained, and the storage address of the data information under each search item is obtained, so that complete data information is obtained.
7. The search speed optimizing method according to claim 5, wherein the search items include a name, an age, a self-introduction, a province, a deposit, and a date of birth, wherein a storage format type of the name is set to a text type, a storage format type of the age is set to an int type, a storage format type of the self-introduction is set to a text type, a storage format type of the province is set to a keyword type, a storage format type of the deposit is set to a long type, and a storage format type of the date of birth is set to a date type.
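The final step recited in claim 1, combining the storage files of one search item into a single ordered file and searching on it, can be sketched as follows. This is only an illustration under assumptions: in-memory lists stand in for storage files, and the `(value, record_id)` entry layout and the function names are hypothetical.

```python
import bisect


def merge_storage(files):
    """Combine per-thread lists of (value, record_id) into one sorted list."""
    merged = [entry for entries in files for entry in entries]
    merged.sort()
    return merged


def range_search(merged, low, high):
    """Return record ids whose value lies in [low, high], using binary search."""
    keys = [value for value, _ in merged]
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return [record_id for _, record_id in merged[start:end]]


# Three "files" written in parallel for the age search item, in arrival order.
age_files = [[(31, 1), (22, 4)], [(25, 2)], [(19, 3), (28, 5)]]
merged = merge_storage(age_files)
hits = range_search(merged, 20, 30)
```

Because the combined data is ordered, a range condition is answered by locating two boundaries rather than scanning every file.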
CN201910231353.5A 2019-03-26 2019-03-26 Search speed optimization method Active CN109977334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910231353.5A CN109977334B (en) 2019-03-26 2019-03-26 Search speed optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910231353.5A CN109977334B (en) 2019-03-26 2019-03-26 Search speed optimization method

Publications (2)

Publication Number Publication Date
CN109977334A CN109977334A (en) 2019-07-05
CN109977334B true CN109977334B (en) 2023-10-20

Family

ID=67080577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910231353.5A Active CN109977334B (en) 2019-03-26 2019-03-26 Search speed optimization method

Country Status (1)

Country Link
CN (1) CN109977334B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104373B (en) * 2019-12-24 2023-09-19 天地伟业技术有限公司 Database performance optimization method
CN115587115B (en) * 2022-12-12 2023-02-28 西南石油大学 Database query optimization method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101087210A (en) * 2007-05-22 2007-12-12 网御神州科技(北京)有限公司 High-performance Syslog processing and storage method
CN101739293A (en) * 2009-12-24 2010-06-16 航天恒星科技有限公司 Method for scheduling satellite data product production tasks in parallel based on multithread
CN103729442A (en) * 2013-12-30 2014-04-16 华为技术有限公司 Method for recording event logs and database engine
CN104461915A (en) * 2014-11-17 2015-03-25 苏州阔地网络科技有限公司 Method and device for dynamically allocating internal storage in online class system
CN105069149A (en) * 2015-08-24 2015-11-18 电子科技大学 Structured line data-oriented distributed parallel data importing method
CN107368362A (en) * 2017-06-29 2017-11-21 上海阅文信息技术有限公司 A kind of multithreading/multi-process for disk read-write data is without lock processing method and system
CN108139938A (en) * 2015-07-31 2018-06-08 华为技术有限公司 For assisting the device of main thread executing application task, method and computer program using secondary thread
CN108694187A (en) * 2017-04-07 2018-10-23 北京国双科技有限公司 The storage method and device of real-time streaming data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008077283A1 (en) * 2006-12-27 2008-07-03 Intel Corporation Pointer renaming in workqueuing execution model
CN103856558B (en) * 2014-01-22 2017-07-14 北京京东尚科信息技术有限公司 A kind of data processing method and device for terminal applies
WO2017013701A1 (en) * 2015-07-17 2017-01-26 株式会社日立製作所 Computer system and database management method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Large-Scale Speculation for the Thread-Level Parallelization;Yuki Shoji等;2015 3rd International Conference on Applied Computing and Information Technology/2nd International Conference on Computational Science and Intelligence;全文 *
基于HTML5的浏览器端多线程下载技术;任双君;周旭;任勇毛;李灵玲;;计算机系统应用(11);全文 *
天文大数据存储管理关键技术研究;过汇卿;《优秀硕士论文全文数据库 信息科技辑》;20160515(第05期);正文第3.2,4.1-4.3,5.2小节,图1-2,图5-1 *
孙丽云等.数据结构(C语言版).《数据结构(C语言版)》.华中科技大学出版社,2017,第232页. *

Also Published As

Publication number Publication date
CN109977334A (en) 2019-07-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant