CN108062418A - Data search method, device and server - Google Patents

Data search method, device and server

Publication number
CN108062418A
Authority
CN
China
Prior art keywords
index data
data
search
target pages
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810011936.2A
Other languages
Chinese (zh)
Other versions
CN108062418B (en)
Inventor
高大陆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810011936.2A
Publication of CN108062418A
Application granted
Publication of CN108062418B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

An embodiment of the present invention provides a data search method, device and server. The data search method is applied to a server and includes: receiving a search request; searching first-type index data for index data matching the search request, where the first-type index data is data stored in a non-cache region of memory; if the first-type index data contains no index data matching the search request, searching second-type index data for index data matching the search request, where the second-type index data is data stored in a cache region of memory; and, if the second-type index data contains index data matching the search request, obtaining the search result from the second-type index data. The technical solution provided by the embodiments of the present invention can shorten the time the server takes to search index data and reduce the time a user waits when searching for information.

Description

Data search method, device and server
Technical field
The present invention relates to the field of search engine technology, and in particular to a data search method, a data search device, and a server.
Background
A search engine is a website on the Internet dedicated to providing search services. The server behind such a website collects page information from websites across the Internet to local storage, using web search software or website registration. To make it easier for users to search for information, the server builds an index database for the collected page information and stores the index data of that database on disk.
In the prior art, when the server starts, it loads into memory the index data corresponding to page information with a high access frequency. After receiving a user's search request, the server must first find the index data matching the search request and then provide the user with the required information according to the index data found. When looking for matching index data, the server searches memory first; if no matching index data is found in memory, it searches the disk and loads the matching index data from disk into memory.
However, in the course of implementing the present invention, the inventor found that the prior art has at least the following problem:
If the server did not place the index data corresponding to a search request in memory at startup, then after receiving the search request it must perform two steps. First, it looks up the index data matching the search request among the large amount of index data stored on disk. Second, it loads the matching index data into memory. Both steps take a relatively long time, so the server takes longer to search the index data, and the user in turn waits longer when searching for information.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a data search method, device and server that shorten the time the server takes to search index data and thereby reduce the time a user waits when searching for information. The specific technical solution is as follows:
In a first aspect, an embodiment of the present invention provides a data search method applied to a server, the method including:
receiving a search request;
searching first-type index data for index data matching the search request, where the first-type index data is data stored in a non-cache region of memory;
if the first-type index data contains no index data matching the search request, searching second-type index data for index data matching the search request, where the second-type index data is data stored in a cache region of memory, and the data of the cache region is index data that is loaded from disk according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability;
if the second-type index data contains index data matching the search request, obtaining the search result from the second-type index data.
Optionally, the access probability of target page information is calculated in one of the following ways, where the target page information is any page information:
calculating the access probability of the target page information from the number of user visits to the target page information within a first preset duration;
or calculating the access probability of the target page information according to the features of the target page information.
Optionally, calculating the access probability of the target page information according to the features of the target page information includes:
extracting the features of the target page information;
estimating the access frequency corresponding to each extracted feature from the access probability of the page information corresponding to that feature, where the page information corresponding to a feature is the page information in a candidate page information set that has that feature;
calculating the access probability of the target page information from the estimated access frequencies.
Optionally, the page information in the candidate page information set is:
page information whose visit count is greater than a preset visit count, selected from all page information stored on disk according to the number of user visits to the page information within a second preset duration;
or
all page information stored on disk.
Optionally, the features of the target page information include at least one of the following: popularity, publication time, source, type, and title features.
In a second aspect, an embodiment of the present invention provides a data search device applied to a server, the device including:
a request receiving module, configured to receive a search request;
a first data search module, configured to search first-type index data for index data matching the search request, where the first-type index data is data stored in a non-cache region of memory;
a second data search module, configured to search second-type index data for index data matching the search request if the first-type index data contains no index data matching the search request, where the second-type index data is data stored in a cache region of memory, and the data of the cache region is index data that is loaded from disk according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability;
a search result acquisition module, configured to obtain the search result from the second-type index data if the second-type index data contains index data matching the search request.
Optionally, the device further includes:
an access probability calculation module, configured to calculate the access probability of target page information in one of the following ways, where the target page information is any page information:
calculating the access probability of the target page information from the number of user visits to the target page information within a first preset duration;
or calculating the access probability of the target page information according to the features of the target page information.
Optionally, the access probability calculation module is specifically configured to:
extract the features of the target page information;
estimate the access frequency corresponding to each extracted feature from the access probability of the page information corresponding to that feature, where the page information corresponding to a feature is the page information in a candidate page information set that has that feature;
calculate the access probability of the target page information from the estimated access frequencies.
Optionally, the candidate page information set is:
page information whose visit count is greater than a preset visit count, selected from all page information stored on disk according to the number of user visits to the page information within a second preset duration;
or
all page information stored on disk.
Optionally, the features of the target page information include at least one of the following: popularity, publication time, source, type, and title features.
In a third aspect, an embodiment of the present invention provides a server including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory communicate with one another via the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement any data search method described in the first aspect when executing the program stored in the memory.
In another aspect of the present invention, a computer-readable storage medium is also provided. Instructions are stored in the computer-readable storage medium which, when run on a computer, cause the computer to perform any data search method described in the first aspect.
In another aspect of the present invention, an embodiment of the present invention also provides a computer program product containing instructions which, when run on a computer, cause the computer to perform any data search method described in the first aspect.
Compared with the prior art, in the technical solution provided by the embodiments of the present invention, after receiving a search request the server first searches the non-cache region of memory for index data matching the search request. If no matching index data is found in the non-cache region, the server searches the cache region of memory for index data matching the search request. The index data stored in the cache region is loaded from disk by the server according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability, so the probability of finding matching index data in the cache region is high. This shortens the time the server takes to search index data and reduces the time a user waits when searching for information.
Description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below.
Fig. 1 is a schematic flowchart of a data search method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a data search device provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a server provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below with reference to the drawings in the embodiments of the present invention.
To solve the technical problem in the prior art that the server takes a long time to search index data and the user therefore waits a long time when searching for information, the embodiments of the present invention provide a data search method, device and server that shorten the time the server takes to search index data and thereby shorten the time a user waits when searching for information.
To describe the embodiments of the present invention more clearly and completely, the concepts involved in the embodiments are introduced first.
Non-cache region: a part of memory, used to store the first-type index data;
Cache region: defined relative to the non-cache region; also a part of memory, used to store the second-type index data;
First-type index data: may be the index data, corresponding to page information with a high access frequency, that is loaded from disk into memory when the server providing the search service starts; it may also be index data that the running server loads from disk into memory after receiving a search request for which no corresponding index data is found in memory;
Second-type index data: index data that the server loads from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability;
The preset data loading rule may be a preset time interval. For example, if the preset time interval is two days, the server determines, at the current time, the page information whose access probability over the preceding two days is greater than the preset access probability, and loads the index data corresponding to that page information into the cache. Of course, the present invention does not specifically limit the preset data loading rule.
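The interval-based loading rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Page`, `select_for_cache` and `refresh_cache` are hypothetical, the disk is simulated with a dictionary, the access probabilities are assumed to be already computed for the last interval, and replacing the whole cache on each refresh is a choice made here that the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_id: str
    access_probability: float  # assumed precomputed for the last interval


def select_for_cache(pages, preset_probability):
    """Return ids of pages whose access probability exceeds the preset value."""
    return [p.page_id for p in pages if p.access_probability > preset_probability]


def refresh_cache(cache, disk_index, pages, preset_probability):
    """Reload the cache region from the (simulated) disk index.

    Clearing the cache first is an assumption of this sketch.
    """
    cache.clear()
    for page_id in select_for_cache(pages, preset_probability):
        cache[page_id] = disk_index[page_id]
    return cache


pages = [Page("a", 0.9), Page("b", 0.2), Page("c", 0.6)]
disk_index = {"a": "index-a", "b": "index-b", "c": "index-c"}
cache = refresh_cache({}, disk_index, pages, preset_probability=0.5)
```

In a running server, `refresh_cache` would be invoked once per preset interval (two days in the example above) by whatever scheduler the deployment uses.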
With the concepts involved in the embodiments of the present invention introduced, the data search method, data search device, server and so on provided by the embodiments are described below.
In a first aspect, a data search method provided by an embodiment of the present invention is introduced.
As shown in Fig. 1, a data search method provided by an embodiment of the present invention includes the following steps:
S101, receiving a search request;
S102, searching first-type index data for index data matching the search request, where the first-type index data is data stored in a non-cache region of memory;
S103, if the first-type index data contains no index data matching the search request, searching second-type index data for index data matching the search request, where the second-type index data is data stored in a cache region of memory, and the data of the cache region is index data that is loaded from disk according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability;
S104, if the second-type index data contains index data matching the search request, obtaining the search result from the second-type index data.
With the data search method shown in Fig. 1, after receiving a search request the server first searches the non-cache region of memory for index data matching the search request. If no matching index data is found in the non-cache region, the server searches the cache region of memory for index data matching the search request. The index data stored in the cache region is loaded from disk by the server according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability, so the probability of finding matching index data in the cache region is high.
In the prior art, by contrast, the server loads the index data corresponding to page information with a high access frequency into memory only at startup, and the index data in memory is not updated automatically afterwards. If the index data corresponding to a search request belongs to page information that has a high access frequency but is stored only on disk, the server has to look up the matching index data among the large amount of index data stored on disk and load it into memory. Both the lookup and the load take a relatively long time, so the server takes longer to search the index data, and the user in turn waits longer when searching for information.
It can be seen that the technical solution provided by the embodiments of the present invention can shorten the time the server takes to search index data and reduce the time a user waits when searching for information.
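The order of steps S101 to S104 can be sketched as a two-tier lookup. This is an illustrative sketch only: the function name `search`, the dictionary-based regions and the returned `(source, results)` pair are assumptions, and the behaviour on a miss in both regions is not specified by these steps, so the sketch simply reports an empty result.

```python
def search(query, non_cache_region, cache_region):
    """Return (source, results) for a query, following the S101-S104 order."""
    # S102: search the first-type index data (non-cache region) first.
    results = non_cache_region.get(query)
    if results:
        return "non_cache", results
    # S103: fall back to the second-type index data (cache region).
    results = cache_region.get(query)
    if results:
        # S104: obtain the search result from the second-type index data.
        return "cache", results
    # Miss in both regions: not covered by S101-S104, reported as empty here.
    return None, []


non_cache_region = {"latest news": ["page-1", "page-2"]}
cache_region = {"new film": ["page-7"]}

hit_first = search("latest news", non_cache_region, cache_region)
hit_second = search("new film", non_cache_region, cache_region)
miss = search("unknown", non_cache_region, cache_region)
```

Both lookups stay in memory, which is the point of the scheme: the disk is consulted only by the periodic loading rule, not on the request path shown here.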
The data search method provided by the embodiments of the present invention is described in detail below.
S101, receiving a search request;
A search engine provides a page containing a search box. When a user needs to search for information, the user enters a search term in the search box and clicks the search button; at that point the server receives a search request.
For example, if the search engine is Baidu and the user wants to view the latest news, the user enters "latest news" in the search box and clicks the Baidu search button; at that point the server receives a search request.
It should be noted that the search request is used to search for page information, and searching for page information requires first finding the index data corresponding to that page information. In the embodiments of the present invention, therefore, the search request can be used to search for index data.
S102, searching first-type index data for index data matching the search request, where the first-type index data is data stored in a non-cache region of memory;
Because the first-type index data can be the index data, loaded at server startup, that corresponds to page information with a high access frequency, the server, after receiving a search request, first searches the first-type index data for index data matching the search request. If the first-type index data contains matching index data, the search result can be obtained from the first-type index data. If no index data matching the search request is found in the first-type index data, step S103 is performed.
It can be understood that one or more pieces of index data may match a search request; usually there are several. For example, if the search engine is Baidu and the user enters "latest news" in the Baidu search box and clicks the Baidu search button, the index data matching the search request may be "latest sports news", "latest stock information", "latest entertainment information", and so on.
S103, if the first-type index data contains no index data matching the search request, searching second-type index data for index data matching the search request, where the second-type index data is data stored in a cache region of memory, and the data of the cache region is index data that is loaded from disk according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability;
In this step, if the first-type index data contains no index data matching the search request, that is, the server did not find matching index data in the first-type index data, the server searches the second-type index data stored in the cache region of memory for index data matching the search request. The index data stored in the cache region is loaded from disk by the server according to a preset data loading rule and corresponds to page information whose access probability is greater than a preset probability, so the probability of finding matching index data in the cache region is high. In the prior art, by contrast, if the server does not find matching index data among the index data it loaded into memory at startup, it looks up the matching index data among the large amount of index data stored on disk and loads the index data found into memory, which increases the time spent searching index data and lengthens the time a user waits when searching for information.
It should be emphasized that, in practical applications, after receiving a search request the server can search the first-type index data and the second-type index data for matching index data at the same time. The server can also search the second-type index data first and, if it contains no index data matching the search request, then search the first-type index data; both are reasonable. In other words, the embodiments of the present invention do not specifically limit the order in which the first-type index data and the second-type index data are searched for index data matching the search request.
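Searching both regions at the same time could look like the sketch below. It is an assumption-laden illustration: the helper names `lookup` and `search_both` are hypothetical, the regions are plain dictionaries, threads from the standard library stand in for whatever concurrency the server actually uses, and concatenating the two result lists is a choice made here rather than something the source prescribes.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup(region, query):
    """Return the index data matching the query in one region, or an empty list."""
    return region.get(query, [])


def search_both(query, non_cache_region, cache_region):
    """Search the non-cache and cache regions concurrently and merge the matches."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(lookup, non_cache_region, query)
        second = pool.submit(lookup, cache_region, query)
        # result() blocks until each lookup has finished.
        return first.result() + second.result()


merged = search_both("latest news",
                     {"latest news": ["page-1"]},
                     {"latest news": ["page-9"]})
```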
It can be understood that the disk mentioned above is any disk that stores index data. It may be a disk of the server itself or another disk that stores index data; both are reasonable.
It should be noted that the second-type index data is the index data corresponding to page information whose access probability is greater than the preset probability. How the access probability of page information is calculated is described in detail below.
In one embodiment, the access probability of the target page information can be calculated from the number of user visits to the target page information within a first preset duration, where the target page information is any page information stored on disk.
The first preset duration is variable. It can be 30 days, 15 days, 7 days, 3 days, 2 days, 1 day, 3 hours, 1 hour, and so on; all of these are reasonable, and the embodiments of the present invention do not specifically limit the length of the first preset duration. The following description takes a first preset duration of 2 days as an example.
The server records the number of user visits to the target page information within 2 days. After obtaining the visit count of the target page information, the server judges whether the visit count is greater than or equal to a preset visit count. If it is, the target page information is determined to be page information with a high access frequency; if the visit count is less than the preset visit count, the target page information is determined to be page information with a low access frequency.
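The threshold comparison described above amounts to a single test. The sketch below is illustrative only; the function name `classify_by_visits` and the `"high"`/`"low"` labels are assumptions introduced here.

```python
def classify_by_visits(visit_count, preset_visits):
    """Label a page by comparing its visit count with the preset visit count.

    A count equal to the preset value counts as high, matching the
    "greater than or equal to" wording of the description.
    """
    return "high" if visit_count >= preset_visits else "low"


labels = {page: classify_by_visits(count, preset_visits=100)
          for page, count in {"a": 250, "b": 100, "c": 99}.items()}
```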
Of course, since the target page information is any page information stored on disk, the access level of page information can also be determined by ranking. The server sorts the page information stored on disk in descending order of visit count, determines the page information whose rank is less than or equal to a preset rank to be page information with a high access probability, and determines the page information whose rank is greater than the preset rank to be page information with a low access probability. Similarly, the server can sort the page information stored on disk in ascending order of visit count, determine the page information whose rank is less than or equal to a preset rank to be page information with a low access probability, and determine the page information whose rank is greater than the preset rank to be page information with a high access probability.
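The descending-order variant of the ranking approach can be sketched as follows. The function name `classify_by_rank` and the label strings are hypothetical, and how ties in visit count are ordered is not addressed by the source; this sketch keeps the input order for ties.

```python
def classify_by_rank(visits, preset_rank):
    """Rank pages by visit count (rank 1 = most visited) and label them.

    visits: dict mapping page id to visit count.
    Pages with rank <= preset_rank are labelled "high", the rest "low".
    """
    ordered = sorted(visits, key=visits.get, reverse=True)
    return {page: ("high" if rank <= preset_rank else "low")
            for rank, page in enumerate(ordered, start=1)}


ranked = classify_by_rank({"a": 50, "b": 10, "c": 30}, preset_rank=2)
```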
In another embodiment, the access probability of the target page information can be calculated according to the features of the target page information, where the target page information is any page information stored on disk.
In this embodiment, the features of the target page information are extracted first. The features can be publication time, popularity, source, type, title features, and so on. Specifically, the publication time can be the last month, the last week, the last three days, the last day, and so on; the popularity can be a popularity value; the source can be in-site or off-site, or uploaded by an ordinary user or by an editor; the type can take many values, for example TV series, film or variety show; the title features can be the length, wording or semantics of the title. Of course, any feature that can characterize page information can serve as a feature of the target page information, and the embodiments of the present invention do not specifically limit the features of the target page information.
After the features of the target page information are extracted, for each feature the access probability of users to the page information having that feature is looked up in the candidate page information set, and the access frequency corresponding to that feature is estimated from it. A machine learning method can be used to estimate the access frequency corresponding to each feature, which improves the accuracy of the estimate.
It should be noted that the page information in the candidate page information set can be all page information stored on disk, or page information whose visit count is greater than a preset visit count, selected from all page information stored on disk according to the number of user visits to the page information within a second preset duration. The second preset duration is likewise variable, and the embodiments of the present invention do not specifically limit its length.
For example, suppose the feature extracted from the target page information is the publication time, and the page information in the candidate page information set is all page information stored on disk. In this case the server extracts the publication time and access probability of the page information stored on disk other than the target page information, and estimates the access frequency of each publication time from the extracted publication times and access probabilities by machine learning. It can be understood that if the access probability of users to page information published in the last week is high, the estimated access frequency for a publication time of the last week is also high.
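The publication-time example can be illustrated with a deliberately simple estimator. Note the substitution: the description calls for machine learning, whereas this sketch merely averages the access probability of the candidate pages that share a feature value. The function name, the dictionary layout and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def estimate_feature_frequency(candidates, feature):
    """Average access probability of candidate pages, grouped by a feature's value.

    This plain average stands in for the machine learning estimate
    mentioned in the description.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for page in candidates:
        value = page[feature]
        totals[value] += page["access_probability"]
        counts[value] += 1
    return {value: totals[value] / counts[value] for value in totals}


candidates = [
    {"publish_time": "last_week", "access_probability": 0.8},
    {"publish_time": "last_week", "access_probability": 0.6},
    {"publish_time": "last_month", "access_probability": 0.2},
]
freq = estimate_feature_frequency(candidates, "publish_time")
```

With these made-up numbers, pages published in the last week receive a higher estimated access frequency than pages published in the last month, which mirrors the observation at the end of the example above.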
With the method described above, the access frequency of each feature of the target page information can be estimated, and the access probability of users to the target page information can then be calculated from the estimated access frequencies. It can be understood that the access probability can be obtained by summing the estimated access frequencies of the features. Alternatively, each estimated access frequency can be multiplied by a weighting coefficient and the weighted access frequencies summed to obtain the access probability of users to the target page information; both are reasonable. The embodiments of the present invention do not specifically limit how the access probability of the target page information is calculated.
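Both the plain sum and the weighted sum can be written as one small function. This is a sketch under assumptions: the function name, the feature names and the numeric values are hypothetical, and the source does not say whether or how the result is normalized, so no normalization is applied here.

```python
def access_probability_from_features(frequencies, weights=None):
    """Combine per-feature access frequencies into one access probability.

    frequencies: dict mapping feature name to estimated access frequency.
    weights: optional dict of weighting coefficients with the same keys;
             when omitted, the frequencies are simply summed.
    """
    if weights is None:
        return sum(frequencies.values())
    return sum(weights[name] * value for name, value in frequencies.items())


frequencies = {"publish_time": 0.25, "type": 0.5}
plain = access_probability_from_features(frequencies)
weighted = access_probability_from_features(
    frequencies, weights={"publish_time": 2.0, "type": 1.0})
```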
In another embodiment, the access probability of the target page information may be calculated using both the number of user accesses to the target page information within the first preset duration and the features of the target page information.
In this embodiment, the number of user accesses to the target page information within the first preset duration may be used to calculate a first access probability of the target page information; the features of the target page information may be used to calculate a second access probability of the target page information; finally, the first access probability and the second access probability are used to calculate the access probability of the target page information.
It should be noted that the access probability of the target page information may be calculated by any of the following three algorithms.
First algorithm: the weight corresponding to the first access probability is a first weight, the weight corresponding to the second access probability is a second weight, and the first weight is equal to the second weight. In this case, the first access probability and the second access probability may be summed to obtain the access probability of the target page information;
Second algorithm: the weight corresponding to the first access probability is a first weight, the weight corresponding to the second access probability is a second weight, and the first weight is greater than the second weight. In this case, the first access probability is multiplied by the first weight to obtain a first probability, the second access probability is multiplied by the second weight to obtain a second probability, and the first probability and the second probability are summed to obtain the access probability of the target page information;
Third algorithm: the weight corresponding to the first access probability is a first weight, the weight corresponding to the second access probability is a second weight, and the first weight is less than the second weight. In this case, the first access probability is multiplied by the first weight to obtain a first probability, the second access probability is multiplied by the second weight to obtain a second probability, and the first probability and the second probability are summed to obtain the access probability of the target page information.
In practical applications, any one of the three algorithms above may be selected according to actual conditions to calculate the access probability of the target page information. The embodiment of the present invention does not specifically limit the algorithm used to calculate the access probability of the target page information.
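The three algorithms differ only in how the two weights compare, so they can be illustrated with a single weighted sum. The sketch below is an assumption-laden example: the function name `combine_access_probabilities` and the default weights of 1.0 (which reduce the equal-weight case to the plain sum of the first algorithm) are chosen for illustration and do not appear in the original text.

```python
def combine_access_probabilities(
    first_access_probability: float,
    second_access_probability: float,
    first_weight: float = 1.0,
    second_weight: float = 1.0,
) -> float:
    """Combine the visit-based and feature-based access probabilities.

    first_weight == second_weight == 1.0 gives the plain sum (first algorithm);
    first_weight > second_weight favors the visit-based value (second algorithm);
    first_weight < second_weight favors the feature-based value (third algorithm).
    """
    if first_weight < 0 or second_weight < 0:
        raise ValueError("weights must be non-negative")
    first_probability = first_access_probability * first_weight
    second_probability = second_access_probability * second_weight
    return first_probability + second_probability
```

For example, `combine_access_probabilities(0.3, 0.2, 0.7, 0.3)` weights the visit-based probability more heavily, as in the second algorithm. Note that the result is a score rather than a normalized probability unless the caller chooses weights that keep it within [0, 1].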
S104: if index data matching the search request exists in the second-type index data, obtain the search result from the second-type index data.
If index data matching the search request exists in the second-type index data, the search result can be obtained from the second-type index data. The second-type index data is index data that the server loads from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability; therefore, the probability of finding index data that matches the search request in the second-type index data is relatively high.
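The loading rule for the cache region can be sketched as a simple threshold filter. The example below is only an illustration under stated assumptions: the on-disk index is modeled as a dictionary keyed by a hypothetical page identifier, and pages without a known access probability are treated as having probability 0.0.

```python
from typing import Dict, List, Mapping


def load_cache_region(
    disk_index: Mapping[str, List[str]],
    access_probabilities: Mapping[str, float],
    preset_probability: float,
) -> Dict[str, List[str]]:
    """Return the index data to place in the cache region of memory.

    Only index data whose page has an access probability strictly greater
    than preset_probability is loaded. Pages with no known probability are
    treated as 0.0 (an assumption for this sketch).
    """
    return {
        page_id: index_data
        for page_id, index_data in disk_index.items()
        if access_probabilities.get(page_id, 0.0) > preset_probability
    }
```

The comparison is strict because the text requires the access probability to be greater than the preset probability.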
Compared with the prior art, in the technical solution provided by the embodiment of the present invention, after receiving a search request the server first searches the non-cache region of memory for index data that matches the search request; if no matching index data is found in the non-cache region, the server then searches the cache region of memory for index data that matches the search request. The index data stored in the cache region of memory is index data that the server loads from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability. The probability of finding index data that matches the search request in the cache region is therefore relatively high, which shortens the time the server spends searching for index data and reduces the time users wait when searching for information.
In a second aspect, an embodiment of the present invention further provides a data search device applied to a server. As shown in Fig. 2, the device includes:
a request receiving module 210, configured to receive a search request;
a first data search module 220, configured to search first-type index data for index data matching the search request, wherein the first-type index data is data stored in a non-cache region of memory;
a second data search module 230, configured to, if no index data matching the search request exists in the first-type index data, search second-type index data for index data matching the search request, wherein the second-type index data is data stored in a cache region of memory, and the data in the cache region is index data that is loaded from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability;
a search result obtaining module 240, configured to, if index data matching the search request exists in the second-type index data, obtain a search result from the second-type index data.
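The cooperation of modules 210 to 240 can be sketched as one small class. This is a hypothetical illustration, not the device itself: the class name `DataSearchDevice`, the modeling of each index as a dictionary from a query string to a list of page identifiers, returning the match directly when the first-type index data contains one, and returning `None` when neither region matches are all assumptions, since this section does not spell out those branches.

```python
from typing import Dict, List, Optional


class DataSearchDevice:
    """Toy model of the request, search, and result modules."""

    def __init__(
        self,
        non_cache_index: Dict[str, List[str]],
        cache_index: Dict[str, List[str]],
    ) -> None:
        self.non_cache_index = non_cache_index  # first-type index data
        self.cache_index = cache_index          # second-type index data

    def search_first_type(self, query: str) -> Optional[List[str]]:
        # First data search module: look in the non-cache region.
        return self.non_cache_index.get(query)

    def search_second_type(self, query: str) -> Optional[List[str]]:
        # Second data search module: look in the cache region.
        return self.cache_index.get(query)

    def handle_request(self, query: str) -> Optional[List[str]]:
        # Request receiving module: entry point for a search request.
        matched = self.search_first_type(query)
        if matched is None:
            # Fall back to the cache region only when the first search misses.
            matched = self.search_second_type(query)
        # Search result obtaining module: return whatever was matched.
        return matched
```

A call such as `DataSearchDevice({"cat": ["p1"]}, {"dog": ["p2"]}).handle_request("dog")` misses in the non-cache region and is answered from the cache region.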
Optionally, the device further includes:
an access probability calculation module, configured to calculate the access probability of target page information in the following manner, wherein the target page information is any page information:
calculating the access probability of the target page information using the number of user accesses to the target page information within a first preset duration;
or, calculating the access probability of the target page information according to the features of the target page information.
Optionally, the access probability calculation module is specifically configured to:
extract the features of the target page information;
estimate the access frequency corresponding to each extracted feature using the access probabilities of the page information corresponding to that feature, wherein the page information corresponding to a feature is the page information in a candidate page information set that has that feature; and calculate the access probability of the target page information using the estimated access frequencies.
Optionally, the candidate page information set is:
page information whose access count is greater than a preset access count, selected from all page information stored on disk according to the number of user accesses to the page information within a second preset duration;
or,
all page information stored on disk.
Optionally, the features of the target page information include at least one of the following: popularity, publication time, source, type, and title features.
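The candidate-set selection and per-feature estimation can be sketched as below. This section does not restate how a feature's access frequency is derived from the access probabilities of the pages that share it, so the example assumes a simple mean; the function names, the string encoding of features, and the return value of 0.0 for a feature that no candidate page has are likewise assumptions.

```python
from typing import Mapping, Set


def select_candidate_pages(
    access_counts: Mapping[str, int], preset_access_count: int
) -> Set[str]:
    """Keep pages whose access count in the second preset duration
    is greater than the preset access count."""
    return {
        page_id
        for page_id, count in access_counts.items()
        if count > preset_access_count
    }


def estimate_feature_frequency(
    feature: str,
    page_features: Mapping[str, Set[str]],
    page_probabilities: Mapping[str, float],
    candidates: Set[str],
) -> float:
    """Estimate a feature's access frequency as the mean access probability
    of the candidate pages that have the feature (mean is an assumption)."""
    probabilities = [
        page_probabilities[page_id]
        for page_id in candidates
        if feature in page_features.get(page_id, set())
        and page_id in page_probabilities
    ]
    if not probabilities:
        return 0.0  # no candidate page carries this feature
    return sum(probabilities) / len(probabilities)
```

Using all page information stored on disk as the candidate set, the other option in the text, amounts to passing every page identifier as `candidates`.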
In a third aspect, an embodiment of the present invention further provides a server. As shown in Fig. 3, the server includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304, wherein the processor 301, the communication interface 302, and the memory 303 communicate with one another through the communication bus 304;
the memory 303 is configured to store a computer program;
the processor 301 is configured to implement the data search method described in the above method embodiments when executing the program stored in the memory 303.
The communication bus mentioned for the above server may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the bus is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the above server and other devices.
The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the data search method described in the above method embodiments.
In another embodiment provided by the present invention, a computer program product containing instructions is further provided. When the computer program product runs on a computer, it causes the computer to perform the data search method described in the above method embodiments.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the device, server, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims (11)

1. A data search method, characterized in that the method is applied to a server and comprises:
receiving a search request;
searching first-type index data for index data matching the search request, wherein the first-type index data is data stored in a non-cache region of memory;
if no index data matching the search request exists in the first-type index data, searching second-type index data for index data matching the search request, wherein the second-type index data is data stored in a cache region of memory, and the data in the cache region is index data that is loaded from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability;
if index data matching the search request exists in the second-type index data, obtaining a search result from the second-type index data.
2. The method according to claim 1, characterized in that the access probability of target page information is calculated in the following manner, wherein the target page information is any page information:
calculating the access probability of the target page information using the number of user accesses to the target page information within a first preset duration;
or, calculating the access probability of the target page information according to features of the target page information.
3. The method according to claim 2, characterized in that calculating the access probability of the target page information according to the features of the target page information comprises:
extracting the features of the target page information;
estimating the access frequency corresponding to each extracted feature using the access probabilities of the page information corresponding to that feature, wherein the page information corresponding to a feature is the page information in a candidate page information set that has that feature;
calculating the access probability of the target page information using the estimated access frequencies.
4. The method according to claim 3, characterized in that the page information in the candidate page information set is:
page information whose access count is greater than a preset access count, selected from all page information stored on disk according to the number of user accesses to the page information within a second preset duration;
or,
all page information stored on disk.
5. The method according to claim 2 or 3, characterized in that the features of the target page information include at least one of the following: popularity, publication time, source, type, and title features.
6. A data search device, characterized in that the device is applied to a server and comprises:
a request receiving module, configured to receive a search request;
a first data search module, configured to search first-type index data for index data matching the search request, wherein the first-type index data is data stored in a non-cache region of memory;
a second data search module, configured to, if no index data matching the search request exists in the first-type index data, search second-type index data for index data matching the search request, wherein the second-type index data is data stored in a cache region of memory, and the data in the cache region is index data that is loaded from disk according to a preset data loading rule and that corresponds to page information whose access probability is greater than a preset probability;
a search result obtaining module, configured to, if index data matching the search request exists in the second-type index data, obtain a search result from the second-type index data.
7. The device according to claim 6, characterized in that the device further comprises:
an access probability calculation module, configured to calculate the access probability of target page information in the following manner, wherein the target page information is any page information:
calculating the access probability of the target page information using the number of user accesses to the target page information within a first preset duration;
or, calculating the access probability of the target page information according to features of the target page information.
8. The device according to claim 7, characterized in that the access probability calculation module is specifically configured to:
extract the features of the target page information;
estimate the access frequency corresponding to each extracted feature using the access probabilities of the page information corresponding to that feature, wherein the page information corresponding to a feature is the page information in a candidate page information set that has that feature;
calculate the access probability of the target page information using the estimated access frequencies.
9. The device according to claim 8, characterized in that the candidate page information set is:
page information whose access count is greater than a preset access count, selected from all page information stored on disk according to the number of user accesses to the page information within a second preset duration;
or,
all page information stored on disk.
10. The device according to claim 7 or 8, characterized in that the features of the target page information include at least one of the following: popularity, publication time, source, type, and title features.
11. A server, characterized in that the server comprises a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 5 when executing the program stored in the memory.
CN201810011936.2A 2018-01-05 2018-01-05 Data searching method and device and server Active CN108062418B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810011936.2A CN108062418B (en) 2018-01-05 2018-01-05 Data searching method and device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810011936.2A CN108062418B (en) 2018-01-05 2018-01-05 Data searching method and device and server

Publications (2)

Publication Number Publication Date
CN108062418A true CN108062418A (en) 2018-05-22
CN108062418B CN108062418B (en) 2022-07-22

Family

ID=62141361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810011936.2A Active CN108062418B (en) 2018-01-05 2018-01-05 Data searching method and device and server

Country Status (1)

Country Link
CN (1) CN108062418B (en)

Also Published As

Publication number Publication date
CN108062418B (en) 2022-07-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant