CN108804477A - Dynamic Truncation method, apparatus and server - Google Patents


Info

Publication number
CN108804477A
Authority
CN
China
Prior art keywords
chain
document
memory
disk
heap
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710311708.2A
Other languages
Chinese (zh)
Inventor
代俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Guangdong Shenma Search Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Shenma Search Technology Co Ltd filed Critical Guangdong Shenma Search Technology Co Ltd
Priority to CN201710311708.2A priority Critical patent/CN108804477A/en
Publication of CN108804477A publication Critical patent/CN108804477A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a dynamic truncation method, apparatus and server. In one embodiment, the dynamic truncation method includes: identifying a document newly stored in memory and obtaining a keyword of the newly added document; and obtaining an index chain corresponding to the keyword, sorting all documents in memory according to a set rule, and selecting, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.

Description

Dynamic Truncation method, apparatus and server
Technical field
The present invention relates to the field of retrieval technology, and in particular to a dynamic truncation method, apparatus and server based on a real-time search system.
Background
In many retrieval scenarios, real-time retrieval is required so that the most recent data can be searched. In general, a real-time retrieval system keeps the data of the last several hours, or even the last few days, so the volume of data to be searched is very large and retrieval efficiency is very low.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a dynamic truncation method, apparatus and server.
A dynamic truncation method provided by an embodiment of the present invention is applied to a server that provides a real-time search service. The method includes:
identifying a document newly stored in memory and obtaining a keyword of the newly added document; and
obtaining an index chain corresponding to the keyword, sorting all documents in memory according to a set rule, and selecting, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
An embodiment of the present invention also provides a dynamic truncation apparatus applied to a server that provides a real-time search service. The apparatus includes:
a keyword acquisition module, configured to identify a document newly stored in memory and obtain a keyword of the newly added document; and
a truncation chain generation module, configured to obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
An embodiment of the present invention also provides a server, including:
a memory;
a processor; and
a dynamic truncation apparatus installed in or stored in the memory and executed by the processor, the apparatus including:
a keyword acquisition module, configured to identify a document newly stored in memory and obtain a keyword of the newly added document; and
a truncation chain generation module, configured to obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
Compared with the prior art, the dynamic truncation method and apparatus of the present invention establish an auxiliary truncation chain in memory, which reduces memory usage and improves the efficiency of real-time index retrieval.
To make the above objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should not be regarded as limiting its scope. Those of ordinary skill in the art may derive other related drawings from these drawings without creative effort.
Fig. 1 is a block diagram of a server provided by a preferred embodiment of the present invention.
Fig. 2 is a functional block diagram of a dynamic truncation apparatus provided by a preferred embodiment of the present invention.
Fig. 3 is a flowchart of a dynamic truncation method provided by a first preferred embodiment of the present invention.
Fig. 4 is a flowchart of a dynamic truncation method provided by a second preferred embodiment of the present invention.
Fig. 5 is a flowchart of a dynamic truncation method provided by a third preferred embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. The components of the embodiments, as generally described and illustrated in the drawings, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that similar labels and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In addition, in the description of the present invention, the terms "first", "second" and the like are used only to distinguish descriptions and should not be understood as indicating or implying relative importance.
As shown in Fig. 1, which is a block diagram of a server 100 provided by a preferred embodiment of the present invention, the server 100 includes a memory 102, a processor 104 and a network module 106. Those skilled in the art will appreciate that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the server 100. For example, the server 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. In this embodiment, the server 100 may be a retrieval server that provides a real-time retrieval function, for example, a back-end server of a retrieval service provider such as Google or Baidu.
The memory 102 may be used to store software programs and modules. The processor 104 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 102. The memory 102 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 102 may further include memory located remotely from the processor 104, and such remote memory may be connected to the server 100 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor 104 may be an integrated circuit chip with signal processing capability. The processor 104 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The network module 106 is used to receive and transmit network signals. The network signals may include wireless signals or wired signals. In one example, the network signal is a wired network signal, in which case the network module 106 may include elements such as a processor, random access memory, a converter and a crystal oscillator.
The software programs and modules described above include an operating system 108 and a dynamic truncation apparatus 110. The operating system 108 may be, for example, LINUX, UNIX, WINDOWS or another operating system; it may include various software components and/or drivers for managing system tasks (such as memory management, storage device control and power management) and can communicate with various hardware or software components to provide a running environment for other software components. The dynamic truncation apparatus 110 runs on top of the operating system 108 and is used to implement tasks such as the real-time retrieval function provided by the server 100 and the processing of documents in memory, which are described in detail below.
As shown in Fig. 2, which is a functional block diagram of the dynamic truncation apparatus 110 provided by a preferred embodiment of the present invention, the dynamic truncation apparatus 110 includes: a keyword acquisition module 1101, a truncation chain generation module 1102, a heap establishing module 1103, a key value comparison module 1104, a heap re-sorting module 1105, a memory data judgment module 1106, a data dump module 1107, a truncation chain judgment module 1108, a truncation chain merging module 1109 and a retrieval chain synthesis module 1110.
Three embodiments of the dynamic truncation method are described below with reference to the flowcharts of Figs. 3 to 5, together with a detailed description of each function module included in the dynamic truncation apparatus 110 described above.
First embodiment
Referring to Fig. 3, Fig. 3 is a flowchart of a dynamic truncation method provided by a preferred embodiment of the present invention and applied to the server 100 shown in Fig. 1. The detailed process shown in Fig. 3 is described below.
Step S101: identify a document newly stored in memory and obtain a keyword of the newly added document. In a preferred embodiment, the process described in step S101 may be executed and implemented by the keyword acquisition module 1101.
In this embodiment, the keyword may be a word, phrase or sentence that represents the main content or idea of the newly added document. In one example, the newly added document may be a document related to the latest information and stored in real time, for use by the server 100 in providing a search service according to real-time search requests from users.
Step S102: obtain an index chain corresponding to the keyword, sort the documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain. In a preferred embodiment, the process described in step S102 may be executed and implemented by the truncation chain generation module 1102.
In this embodiment, the newly added document may contain multiple keywords, and a keyword can represent the content that the newly added document expresses. For example, if the newly added document is the latest news about a celebrity A, the keyword may be the name of celebrity A, or an abbreviation of an event related to celebrity A. The keyword may also be a combination of multiple words, for example, the name of celebrity A together with the date on which the latest news occurred.
In one embodiment, each node in the index chain stores the storage address of its corresponding document; that is, the index chain may be formed from the storage addresses of multiple documents. In another embodiment, the index chain may be formed from the titles of multiple documents.
In this embodiment, the set rule may be sorting according to the relevance between a document and the keyword. The relevance may be based on the length of the interval between the time the newly added document was generated and the time the keyword was formed; for example, the smaller the interval, the higher the relevance, or, alternatively, the larger the interval, the higher the relevance; that is, the closer the document generation time is to the current time, the higher the relevance. The relevance may also be based on the frequency with which the keyword appears in the document; for example, the higher the frequency of the keyword in the document, the higher the relevance. Of course, the set rule may also be another rule, and those skilled in the art can set the rule as needed.
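As a concrete illustration of one possible set rule, the sketch below combines recency and keyword frequency into a single relevance score. It is only a minimal example of the kind of scoring this embodiment describes; the weights, the decay constant and the function name are assumptions for illustration, not part of the patent.

```python
import math
import time

def relevance_score(doc_text: str, doc_created_at: float, keyword: str,
                    recency_half_life: float = 3600.0,
                    recency_weight: float = 0.7,
                    frequency_weight: float = 0.3) -> float:
    """Score a document against a keyword: newer documents and documents
    that mention the keyword more often receive higher scores."""
    # Recency component: decays exponentially with the age of the document.
    age_seconds = max(0.0, time.time() - doc_created_at)
    recency = math.exp(-age_seconds / recency_half_life)
    # Frequency component: keyword occurrences normalized by document length.
    tokens = doc_text.lower().split()
    freq = tokens.count(keyword.lower()) / max(1, len(tokens))
    return recency_weight * recency + frequency_weight * freq
```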
In this embodiment, the index chain is formed after the documents in memory are sorted by their relevance to the keyword. In one embodiment, the closer a node is to the front of the index chain, the higher the relevance between its corresponding document and the keyword. The truncation chain is a part of the index chain. In this embodiment, the truncation chain is composed of a specified number of the front-most nodes of the index chain; for example, the first one hundred nodes of the index chain may form the truncation chain.
After the server 100 receives a retrieval request, it can preferentially generate and return retrieval results according to the truncation chain. If the server 100 receives more retrieval requests, it can also generate further retrieval results from the content corresponding to the nodes of the index chain other than those in the truncation chain.
According to the method in this embodiment, a truncation chain is formed in memory, so that when the server 100 receives a retrieval request whose query condition is satisfied by the truncated index, the documents corresponding to the truncation chain can be obtained quickly through the truncation chain, which improves the efficiency of real-time index retrieval.
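The following sketch shows how an in-memory truncation chain of this kind might be built and queried: the postings of a keyword's index chain are sorted by relevance score and the front-most preset number of entries are kept as the truncation chain. The class and field names (InMemoryIndex, truncation_size) and the use of precomputed scores are illustrative assumptions, not identifiers from the patent.

```python
from collections import defaultdict

class InMemoryIndex:
    """Per-keyword index chains plus a truncated head of each chain."""

    def __init__(self, truncation_size: int = 100):
        self.truncation_size = truncation_size  # preset number of indexes kept
        self.index_chains = defaultdict(list)   # keyword -> [doc_id, ...]
        self.truncation_chains = {}             # keyword -> top-N doc_ids
        self.scores = {}                        # (keyword, doc_id) -> relevance

    def add_document(self, doc_id: str, keywords_with_scores: dict):
        """Register a newly added document under each of its keywords."""
        for keyword, score in keywords_with_scores.items():
            self.index_chains[keyword].append(doc_id)
            self.scores[(keyword, doc_id)] = score
            self._rebuild_truncation_chain(keyword)

    def _rebuild_truncation_chain(self, keyword: str):
        chain = sorted(self.index_chains[keyword],
                       key=lambda d: self.scores[(keyword, d)], reverse=True)
        self.truncation_chains[keyword] = chain[:self.truncation_size]

    def search(self, keyword: str):
        """Serve requests from the truncation chain first; fall back to the
        rest of the index chain only if more results are needed."""
        return self.truncation_chains.get(keyword, [])
```

Re-sorting the whole index chain on every insertion is deliberately naive here; the second embodiment below replaces this step with a min-heap so that only a single comparison against the heap-top document is needed.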
Second embodiment
This embodiment provides a dynamic truncation method. The method in this embodiment is similar to that of the first embodiment; the difference is that, as shown in Fig. 4, the method in this embodiment includes:
Step S201: identify a document newly stored in memory and obtain a keyword of the newly added document.
Step S202: obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
Step S203: establish in memory a heap for storing the documents corresponding to the truncation chain. In a preferred embodiment, the process described in step S203 may be executed and implemented by the heap establishing module 1103.
In this embodiment, the heap for storing the documents corresponding to the truncation chain is a min-heap. A min-heap is a complete binary tree kept in order such that the key value of any non-leaf node is no greater than the key values of its left and right child nodes; that is, the key value at the top of the min-heap is the smallest. In this embodiment, the key value may be the relevance score between the document corresponding to a heap node and the keyword. For example, the higher the relevance between the document of a node and the keyword, the higher the relevance score of that document, and the larger the key value of that node.
In one example, after the server 100 receives a retrieval request, it can preferentially generate retrieval results according to the truncation chain, and the retrieval results include the documents in the heap.
Step S204: compare the key value of the newly added document with the key value of the heap-top document of the heap. If the key value of the newly added document is larger than the key value of the heap-top document of the heap, step S205 is executed; if the key value of the newly added document is smaller than the key value of the heap-top document, the flow ends. In a preferred embodiment, the process described in step S204 may be executed and implemented by the key value comparison module 1104. In this embodiment, the key value of the newly added document is its relevance score with respect to the keyword. In detail, after the newly added document replaces the heap-top document and is stored in the heap, its key value also serves as the key value of a node in the heap.
Step S205: replace the heap-top document with the newly added document, and then re-sort the documents in the heap. In a preferred embodiment, the process described in step S205 may be executed and implemented by the heap re-sorting module 1105.
In this embodiment, when the key value of the newly added document is larger than the key value of the heap-top document, the heap top may first be deleted and the newly added document then stored at the heap top, replacing the original heap-top document. The heap containing the newly added document is re-sorted to form a new min-heap. In detail, because the re-sorted heap is a min-heap, each time a new document is added to the server 100 it only needs to be compared with the heap-top document to guarantee that the heap holds the preset number (e.g., the top 100) of documents that sort to the front of the sequence corresponding to the truncation chain.
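A minimal sketch of this top-N maintenance with a min-heap is given below, using Python's heapq. The heap stores (key value, document id) pairs keyed by relevance score, so the heap top is always the lowest-scoring document currently retained, and a newly added document is admitted only if its key value beats the heap top. The class name and the capacity parameter are illustrative assumptions.

```python
import heapq

class TruncationHeap:
    """Keep the N highest-scoring documents of a truncation chain in a min-heap."""

    def __init__(self, capacity: int = 100):
        self.capacity = capacity
        self._heap = []  # list of (key_value, doc_id); _heap[0] has the smallest key

    def offer(self, doc_id: str, key_value: float) -> bool:
        """Try to add a newly added document; return True if it was retained."""
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, (key_value, doc_id))
            return True
        # Compare only against the heap-top document (the current minimum key).
        if key_value > self._heap[0][0]:
            # Replace the heap top and let heapq restore the heap order.
            heapq.heapreplace(self._heap, (key_value, doc_id))
            return True
        return False

    def documents(self):
        """Documents of the truncation chain, highest relevance first."""
        return [doc for _, doc in sorted(self._heap, reverse=True)]
```

heapq.heapreplace pops the minimum and pushes the new pair in a single O(log N) step, which matches the point above that only the heap-top comparison is ever needed when a new document arrives.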
Steps S201 and S202 in this embodiment are similar to steps S101 and S102 in the first embodiment; for a detailed description of steps S201 and S202, refer to the first embodiment, which is not repeated here.
According to the method in this embodiment, by storing the documents of the truncation chain in a min-heap, the documents in the truncation chain can be found quickly during retrieval, which improves retrieval efficiency. In addition, because the documents of the truncation chain are stored as a min-heap, a newly added document only needs to be compared with the heap-top document of the min-heap, instead of with every document in the heap, to decide whether it should be retained. This ensures that the heap holds the preset number (e.g., the top 100) of documents that sort to the front of the sequence corresponding to the truncation chain, which facilitates subsequent index retrieval.
Third embodiment
This embodiment provides a dynamic truncation method. This embodiment is similar to the first embodiment; the difference is that, as shown in Fig. 5, the method in this embodiment includes:
Step S301: identify a document newly stored in memory and obtain a keyword of the newly added document.
Step S302: obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
Step S303: judge whether the data stored in memory has reached a threshold. In a preferred embodiment, the process described in step S303 may be executed and implemented by the memory data judgment module 1106.
If the data stored in memory has reached the threshold, step S304 is executed; if the data stored in memory has not reached the threshold, step S307 is executed or the flow ends.
In this embodiment, the threshold may be the memory capacity of the server 100, or a certain proportion of the memory capacity of the server 100, for example 90 percent of the memory capacity. The threshold may also be the upper limit of what the memory can store.
Step S304: dump the data in memory to disk as a data segment. The data segment includes the documents stored in memory, the index chains and the truncation chains. In a preferred embodiment, the process described in step S304 may be executed and implemented by the data dump module 1107.
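The sketch below illustrates this threshold check and segment dump. The serialization format (JSON files in a segment directory), the threshold expressed in bytes, and the function names are assumptions made for illustration; the embodiment only requires that the in-memory documents, index chains and truncation chains be written to disk as one segment once the threshold is reached.

```python
import json
import os
import time

MEMORY_THRESHOLD_BYTES = 512 * 1024 * 1024  # stand-in for "90% of memory capacity"

def estimated_size_bytes(documents: dict) -> int:
    # Rough estimate of how much memory the stored documents occupy.
    return sum(len(text.encode("utf-8")) for text in documents.values())

def dump_segment(documents: dict, index_chains: dict, truncation_chains: dict,
                 segment_root: str) -> str:
    """Write the in-memory documents, index chains and truncation chains to
    disk as one data segment and return the segment directory."""
    segment_dir = os.path.join(segment_root, f"segment-{int(time.time())}")
    os.makedirs(segment_dir, exist_ok=True)
    for name, payload in [("documents", documents),
                          ("index_chains", index_chains),
                          ("truncation_chains", truncation_chains)]:
        with open(os.path.join(segment_dir, f"{name}.json"), "w") as f:
            json.dump(payload, f)
    return segment_dir

def maybe_dump(documents, index_chains, truncation_chains, segment_root):
    # Steps S303/S304: dump only when the stored data reaches the threshold.
    if estimated_size_bytes(documents) >= MEMORY_THRESHOLD_BYTES:
        return dump_segment(documents, index_chains, truncation_chains, segment_root)
    return None
```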
Step S305: judge whether a disk truncation chain exists on the disk. In a preferred embodiment, the process described in step S305 may be executed and implemented by the truncation chain judgment module 1108.
If a disk truncation chain exists on the disk, step S306 is executed; if no disk truncation chain exists on the disk, step S307 is executed or the flow ends.
In this embodiment, the disk truncation chain contains the indexes corresponding to a preset number of documents on the disk. For example, the preset number of indexes may be the front-most preset number of indexes in the index chain formed by sorting all documents on the disk according to the set rule; that is, all documents on the disk may be sorted according to the set rule, and a preset number of indexes selected from the index chain according to the sorting result to form the disk truncation chain. For example, the disk truncation chain may be the index chain formed by the first one hundred documents sorted by their relevance scores with respect to the keyword. In other embodiments, the disk truncation chain need not be an index chain sorted by relevance score, and those skilled in the art can set, as needed, the rule by which the disk truncation chain is formed on the disk.
Step S306: merge the truncation chain newly stored to the disk with the existing disk truncation chain to generate a new disk truncation chain that replaces the original disk truncation chain. In a preferred embodiment, the process described in step S306 may be executed and implemented by the truncation chain merging module 1109.
In one embodiment, the index chain is sorted by the relevance score between each document and the keyword. If a disk truncation chain exists on the disk, the nodes of the truncation chain newly stored to the disk and the nodes of the disk truncation chain are compared and sorted by their relevance scores, and the front-most specified number of nodes are taken as the new disk truncation chain, thereby merging the truncation chains. If no disk truncation chain exists on the disk, the truncation chain newly stored to the disk is stored on the disk as the disk truncation chain. In other embodiments, the index chain may be formed by sorting documents according to another set rule; in that case, merging the newly stored truncation chain with the disk truncation chain may also consist of sorting the nodes of the two truncation chains according to that set rule to form an index chain, and then selecting a preset number of indexes from that index chain according to the sorting result to form the disk truncation chain.
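Below is a minimal sketch of this merge step, assuming each truncation chain is represented as a list of (relevance score, document id) pairs already sorted in descending score order. The use of heapq.merge, the duplicate check and the preset_size parameter are implementation assumptions; the embodiment only requires that the two chains be combined, re-sorted by the set rule, and truncated to the preset number of entries.

```python
import heapq
from typing import List, Tuple

Chain = List[Tuple[float, str]]  # (relevance score, doc_id), highest score first

def merge_truncation_chains(new_chain: Chain, disk_chain: Chain,
                            preset_size: int = 100) -> Chain:
    """Merge a newly dumped truncation chain with the existing disk truncation
    chain and keep only the front-most preset number of entries."""
    if not disk_chain:
        # No disk truncation chain yet: the new chain simply becomes it.
        return new_chain[:preset_size]
    # Both inputs are sorted by descending score, so a linear merge suffices.
    merged = heapq.merge(new_chain, disk_chain, key=lambda e: e[0], reverse=True)
    result: Chain = []
    seen = set()
    for score, doc_id in merged:
        if doc_id in seen:  # the same document may appear in both chains
            continue
        seen.add(doc_id)
        result.append((score, doc_id))
        if len(result) == preset_size:
            break
    return result
```

The same merge can be reused in step S307 below to combine the disk truncation chain loaded from disk with the in-memory truncation chain into the global truncation chain.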
Step S307: when the real-time search system receives a search instruction and performs retrieval, load the disk truncation chain from the disk into memory and merge it with the truncation chain stored in memory to obtain a global truncation chain. In a preferred embodiment, the process described in step S307 may be executed and implemented by the retrieval chain synthesis module 1110.
In one embodiment, if the truncation chains are index chains sorted by the relevance score between each document and the keyword, the nodes of the in-memory truncation chain and the nodes of the disk truncation chain are sorted by their relevance scores, and the front-most specified number of nodes form the global truncation chain. In other embodiments, the nodes of the in-memory truncation chain and the disk truncation chain may be sorted according to a preset rule to form an index chain, and a preset number of indexes then selected from that index chain according to the sorting result to form the global truncation chain.
Step S308: the real-time search system of the server 100 performs retrieval according to the global truncation chain to obtain retrieval results.
In one example, the server 100 may return links to the documents corresponding to the global truncation chain to the terminal that sent the retrieval request for display. In other examples, the retrieval results may be all the documents corresponding to the global truncation chain.
Steps S301 and S302 in this embodiment are similar to steps S101 and S102 in the first embodiment; for a detailed description of steps S301 and S302, refer to the first embodiment, which is not repeated here.
Further, the method in this embodiment may also include steps S203 to S205 of the second embodiment, in which case steps S203 to S205 are executed after step S302. For a detailed description of steps S203 to S205, refer to the second embodiment, which is not repeated here.
In some existing real-time retrieval schemes, implementing real-time retrieval must take both the volume of data to be retrieved and memory performance into account, and it is often difficult to optimize both memory performance and retrieval efficiency at the same time. In contrast, according to the method in the above embodiments, when the amount of data stored in memory reaches the threshold, the data in memory is dumped to the disk, and when the server 100 receives a retrieval request, the disk truncation chain is loaded into memory, so that the final retrieval result is the merged result of the documents on disk and the documents in memory. This brings the retrieval results closer to the user's needs without requiring a large amount of memory, reduces memory usage, and improves memory performance while also improving retrieval efficiency, achieving the effect of taking both memory performance and retrieval efficiency into account.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of the apparatuses, methods and computer program products of various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the function modules in the embodiments of the present invention may be integrated to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can easily be conceived by those familiar with the technical field, within the technical scope disclosed by the present invention, shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (21)

1. A dynamic truncation method, applied to a server that provides a real-time search service, characterized in that the method includes:
identifying a document newly stored in memory and obtaining a keyword of the newly added document; and
obtaining an index chain corresponding to the keyword, sorting all documents in memory according to a set rule, and selecting, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
2. The dynamic truncation method according to claim 1, characterized in that the method further includes:
establishing in memory a heap for storing the documents corresponding to the truncation chain.
3. The dynamic truncation method according to claim 2, characterized in that the method further includes:
comparing the key value of the newly added document with the key value of the heap-top document of the heap; and
if the key value of the newly added document is larger than the key value of the heap-top document of the heap, replacing the heap-top document with the newly added document and then re-sorting the documents in the heap.
4. The dynamic truncation method according to any one of claims 1 to 3, characterized in that the method further includes:
judging whether the data stored in memory has reached a threshold; and
if the data in memory has reached the threshold, dumping the data in memory to disk as a data segment, the data segment including the documents stored in memory, the index chains and the truncation chains.
5. The dynamic truncation method according to claim 4, characterized in that the method further includes:
judging whether a disk truncation chain exists on the disk; and
if a disk truncation chain exists, merging the truncation chain newly stored to the disk with the disk truncation chain to generate a new disk truncation chain that replaces the original disk truncation chain.
6. The dynamic truncation method according to claim 5, characterized in that the disk truncation chain includes the indexes corresponding to a preset number of documents on the disk.
7. The dynamic truncation method according to claim 5, characterized in that, when the real-time search system receives a search instruction and performs retrieval, the method further includes:
loading the disk truncation chain from the disk into memory and merging it with the truncation chain in memory to obtain a global truncation chain; and
performing retrieval according to the global truncation chain to obtain retrieval results.
8. A dynamic truncation apparatus, applied to a server that provides a real-time search service, characterized in that the apparatus includes:
a keyword acquisition module, configured to identify a document newly stored in memory and obtain a keyword of the newly added document; and
a truncation chain generation module, configured to obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
9. The dynamic truncation apparatus according to claim 8, characterized in that the apparatus further includes:
a heap establishing module, configured to establish in memory a heap for storing the documents corresponding to the truncation chain.
10. The dynamic truncation apparatus according to claim 9, characterized in that the apparatus further includes:
a key value comparison module, configured to compare the key value of the newly added document with the key value of the heap-top document of the heap; and
a heap re-sorting module, configured to, when the key value of the newly added document is larger than the key value of the heap-top document of the heap, replace the heap-top document with the newly added document and then re-sort the documents in the heap.
11. The dynamic truncation apparatus according to any one of claims 8 to 10, characterized in that the apparatus further includes:
a memory data judgment module, configured to judge whether the data stored in memory has reached a threshold; and
a data dump module, configured to, when the data in memory has reached the threshold, dump the data in memory to disk as a data segment, the data segment including the documents stored in memory, the index chains and the truncation chains.
12. The dynamic truncation apparatus according to claim 11, characterized in that the apparatus further includes:
a truncation chain judgment module, configured to judge whether a disk truncation chain exists on the disk; and
a truncation chain merging module, configured to, when a disk truncation chain exists, merge the truncation chain newly stored to the disk with the disk truncation chain to generate a new disk truncation chain that replaces the original disk truncation chain.
13. The dynamic truncation apparatus according to claim 12, characterized in that the disk truncation chain includes the indexes corresponding to a preset number of documents on the disk.
14. The dynamic truncation apparatus according to claim 12, characterized in that the apparatus further includes:
a retrieval chain synthesis module, configured to, when the real-time search system receives a search instruction and performs retrieval, load the disk truncation chain from the disk into memory and merge it with the truncation chain in memory to obtain a global truncation chain, so that the real-time search system obtains index retrieval results according to the global truncation chain.
15. A server, characterized by including:
a memory;
a processor; and
a dynamic truncation apparatus installed in or stored in the memory and executed by the processor, the apparatus including:
a keyword acquisition module, configured to identify a document newly stored in memory and obtain a keyword of the newly added document; and
a truncation chain generation module, configured to obtain an index chain corresponding to the keyword, sort all documents in memory according to a set rule, and select, according to the sorting result, a preset number of indexes from the index chain to form an in-memory truncation chain.
16. The server according to claim 15, characterized in that the apparatus further includes:
a heap establishing module, configured to establish in memory a heap for storing the documents corresponding to the truncation chain.
17. The server according to claim 16, characterized in that the dynamic truncation apparatus of the server further includes:
a key value comparison module, configured to compare the key value of the newly added document with the key value of the heap-top document of the heap; and
a heap re-sorting module, configured to, when the key value of the newly added document is larger than the key value of the heap-top document of the heap, replace the heap-top document with the newly added document and then re-sort the documents in the heap.
18. The server according to any one of claims 15 to 17, characterized in that the dynamic truncation apparatus of the server further includes:
a memory data judgment module, configured to judge whether the data stored in memory has reached a threshold; and
a data dump module, configured to, when the data in memory has reached the threshold, dump the data in memory to disk as a data segment, the data segment including the documents stored in memory, the index chains and the truncation chains.
19. The server according to claim 18, characterized in that the dynamic truncation apparatus of the server further includes:
a truncation chain judgment module, configured to judge whether a disk truncation chain exists on the disk; and
a truncation chain merging module, configured to, when a disk truncation chain exists, merge the truncation chain newly stored to the disk with the disk truncation chain to generate a new disk truncation chain that replaces the original disk truncation chain.
20. The server according to claim 19, characterized in that the disk truncation chain includes the indexes corresponding to a preset number of documents on the disk.
21. The server according to claim 19, characterized in that the server includes a real-time search system, and when the real-time search system receives a search instruction and performs retrieval, the dynamic truncation apparatus further includes:
a retrieval chain synthesis module, configured to load the disk truncation chain from the disk into memory and merge it with the truncation chain in memory to obtain a global truncation chain, so that the real-time search system obtains index retrieval results according to the global truncation chain.
CN201710311708.2A 2017-05-05 2017-05-05 Dynamic Truncation method, apparatus and server Pending CN108804477A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710311708.2A CN108804477A (en) 2017-05-05 2017-05-05 Dynamic Truncation method, apparatus and server


Publications (1)

Publication Number Publication Date
CN108804477A true CN108804477A (en) 2018-11-13

Family

ID=64054824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710311708.2A Pending CN108804477A (en) 2017-05-05 2017-05-05 Dynamic Truncation method, apparatus and server

Country Status (1)

Country Link
CN (1) CN108804477A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2432357A1 (en) * 2000-12-29 2002-07-11 International Business Machines Corporation Lossy index compression
WO2010019873A1 (en) * 2008-08-15 2010-02-18 Pindar Corporation Systems and methods utilizing a search engine
CN101996246A (en) * 2010-11-09 2011-03-30 中国电信股份有限公司 Method and system for instant indexing
CN102737133A (en) * 2012-06-27 2012-10-17 北京城市网邻信息技术有限公司 Real-time searching method
US20160070735A1 (en) * 2012-08-07 2016-03-10 International Business Machines Corporation Incremental dynamic document index generation
US20140114942A1 (en) * 2012-10-23 2014-04-24 International Business Machines Corporation Dynamic Pruning of a Search Index Based on Search Results
CN104361009A (en) * 2014-10-11 2015-02-18 北京中搜网络技术股份有限公司 Real-time indexing method based on reverse index

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
单栋栋: "Research on Index Pruning in Search Engines" (搜索引擎中索引剪枝的研究), Wanfang Data *
樊重俊 et al.: "Big Data Analysis and Application" (大数据分析与应用), 31 January 2016, Lixin Accounting Press *
马旸 et al.: "Research on Lucene Performance Optimization Methods in a Big Data Environment" (大数据环境下 Lucene 性能优化方法研究), Journal of Nanjing University of Science and Technology *

Similar Documents

Publication Publication Date Title
JP5661104B2 (en) Method and system for search using search engine indexing and index
US9014511B2 (en) Automatic discovery of popular landmarks
CN108520002A (en) Data processing method, server and computer storage media
CN102890714B (en) Method and device for indexing data
US9405784B2 (en) Ordered index
US7225186B2 (en) Binary search tree system and method
CN103345521A (en) Method and device for processing key values in hash table database
CN106407303A (en) Data storage method and apparatus, and data query method and apparatus
CN105210061A (en) Tagged search result maintenance
CN108572958A (en) Data processing method and device
CN105224560A (en) Data cached lookup method and device
CN105224532A (en) Data processing method and device
US20090276437A1 (en) Suggesting long-tail tags
CN104011713A (en) Search device, searching method, search program and recording medium
CN102360389A (en) Method for implementing data set specific management policies with respect to data
CN111752664A (en) Terminal multi-window popup management method and device
US7197498B2 (en) Apparatus, system and method for updating a sorted list
CN108804477A (en) Dynamic Truncation method, apparatus and server
CN102129454A (en) Method and system for processing encyclopaedia data based on cloud storage
US20140149380A1 (en) Methods and apparatuses for document processing at distributed processing nodes
CN108062326A (en) A kind of update recording method of data message and device
CN109376174A (en) A kind of method and apparatus selecting database
CN110399451B (en) Full-text search engine caching method, system and device based on nonvolatile memory and readable storage medium
CN104965839B (en) A kind of searching method and device of same category information
CN106202412A (en) Data retrieval method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200526

Address after: 310051 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 510000 Guangdong city of Guangzhou province Whampoa Tianhe District Road No. 163 Xiping Yun Lu Yun Ping square B radio tower 13 layer self unit 01 (only for office use)

Applicant before: Guangdong Shenma Search Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20181113