CN109408613A - Index structure operating method, device and system - Google Patents


Publication number
CN109408613A
Authority
CN
China
Prior art keywords
row
index structure
index
pointer
item
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810924287.5A
Other languages
Chinese (zh)
Inventor
吕文先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Guangdong Shenma Search Technology Co Ltd
Application filed by Guangdong Shenma Search Technology Co Ltd
Priority to CN201810924287.5A
Publication of CN109408613A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An index structure operating method, device, and system are disclosed. The method includes: writing continuously acquired updates of an index structure into memory; and copying the written index structure, so that reads and writes of the index structure are performed simultaneously, each on one of two currently valid copies of the index structure. The dynamic in-memory index structure of the invention localizes index reads by dynamically updating a local in-memory index from a remote source, which resolves the latency problem of index updates and index reads. Separating reads from writes through double buffering further improves the efficiency of index operations.

Description

Index structure operating method, device and system
Technical field
The present invention relates to the Internet field, and in particular to an index structure operating method, device, and system.
Background
In a search system, it is necessary to retrieve the articles that match a given category or keyword (the mapping from a keyword to its list of articles is called the inverted index) and to obtain the attributes of those articles for personalized ranking (obtaining document attribute content from a document ID is called the forward index).
Because the number of documents corresponding to each keyword changes constantly, the inverted table and the forward table must be updated in real time. A search system usually has a dedicated index server that builds and maintains the index lists. However, if a service server reads the index remotely from the index server whenever it performs a search-related task, search latency becomes high, which makes it hard to provide a high-quality retrieval service. In an information-feed recommender system, which must perform many inverted and forward lookups, the latency caused by remote index reads is even more severe.
An index read scheme that solves this latency problem is therefore needed.
Summary of the invention
To solve at least one of the above problems, the invention proposes a dynamic in-memory index structure that localizes index reads by dynamically updating a local in-memory index from a remote source, thereby resolving the latency problem of index updates and index reads.
According to one aspect of the invention, an index structure operating method is proposed, comprising: writing continuously acquired updates of an index structure into memory; and copying the written index structure, so that reads and writes of the index structure are performed simultaneously, each on one of two currently valid copies of the index structure. Dynamic updating combined with read-write separation thus yields an efficient local index.
The index structure may include an inverted table and a forward table. The inverted table consists of inverted entries, each including a keyword ID and an inverted pointer that points to a document pointer vector; the forward table consists of forward entries, each including a document ID and a forward pointer that points to the document content. Preferably, the inverted pointer and the forward pointer are smart pointers, which makes access to the index content reliable.
Preferably, copying the index structure copies only the IDs and pointers. In other words, the two currently valid copies of the index structure share the same document attribute content, so the double-buffering mechanism used for read-write separation does not significantly increase memory usage.
Preferably, writing the continuously acquired updates of the index structure into memory may include: dividing one of the two currently valid copies of the index structure into multiple parallel write regions; and performing index structure update writes on the multiple parallel write regions in parallel. Partitioning thus allows parallel writes that do not interfere with one another, which improves the efficiency of index structure updates.
Preferably, the index structure operation of the invention may further include performing highly concurrent read operations on the other of the two currently valid copies of the index structure.
Specifically, performing highly concurrent read operations on the other currently valid copy may include at least one of the following: based on an input keyword ID, returning the inverted pointer to a document pointer vector from the inverted entry of the inverted table that corresponds to the keyword ID; based on an input document ID, returning the forward pointer to the document content from the forward entry of the forward table that corresponds to the document ID; and based on an input set of document IDs, returning a pointer to a document vector that contains the corresponding forward pointers.
Preferably, the index structure operation of the invention may further include operations on the index structure itself, such as obtaining an inverted table and a forward table to construct an initial index structure. Writing the continuously acquired updates of the index structure into memory may include: collecting the inverted entries and forward entries that were read within a predetermined period; and obtaining updates for those inverted entries and forward entries and writing them into memory, for example obtaining the updates once the inverted entries and forward entries have reached their write validity period. Preferably, the index structure operation of the invention may further include deleting entries in the index structure that have exceeded a predetermined eviction time.
According to another aspect of the invention, an index structure operating system is proposed, comprising: an index server that maintains, continuously updates, and issues the index structure used for retrieval; and multiple service servers, each of which is configured to: write continuously acquired updates of the index structure into memory; and copy the written index structure, so that reads and writes of the index structure are performed simultaneously, each on one of two currently valid copies of the index structure.
Having a single index server maintain and continuously issue index structure updates thus makes indexing efficient across the whole system.
Preferably, the index structure update on a service server may include: dividing one of the two currently valid copies of the index structure into multiple parallel write regions; and performing index structure update writes on the multiple parallel write regions in parallel.
Preferably, a service server may be configured to perform highly concurrent read operations on the other of the two currently valid copies of the index structure, the read operations including at least one of the following: based on an input keyword ID, returning the inverted pointer to a document pointer vector from the inverted entry of the inverted table that corresponds to the keyword ID; based on an input document ID, returning the forward pointer to the document content from the forward entry of the forward table that corresponds to the document ID; and based on an input set of document IDs, returning a pointer to a document vector that contains the corresponding forward pointers.
Preferably, a service server is further configured for at least one of the following: pulling an inverted table and a forward table from the index server to construct an initial index structure; collecting the inverted entries and forward entries that were read within a predetermined period, and obtaining updates for those entries and writing them into memory; and deleting entries in the index structure that have exceeded a predetermined eviction time.
The index structure operating system may be a recommender system or a part of one, and a service server may generate recommendation results based on the search results read from the index structure.
According to yet another aspect of the invention, an index structure operating device is proposed, comprising: a writing unit for writing continuously acquired updates of an index structure into memory; and a copying unit for copying the written index structure, so that reads and writes of the index structure are performed simultaneously, each on one of two currently valid copies of the index structure, wherein the index structure includes an inverted table and a forward table, the inverted table consists of inverted entries each including a keyword ID and an inverted pointer that points to a document pointer vector, and the forward table consists of forward entries each including a document ID and a forward pointer that points to the document content.
Preferably, the writing unit may be further configured to: divide one of the two currently valid copies of the index structure into multiple parallel write regions; and perform index structure update writes on the multiple parallel write regions in parallel.
Preferably, the index structure operating device may further include a reading unit for performing highly concurrent read operations on the other of the two currently valid copies of the index structure. The read operations may include at least one of the following: based on an input keyword ID, returning the inverted pointer to a document pointer vector from the inverted entry of the inverted table that corresponds to the keyword ID; based on an input document ID, returning the forward pointer to the document content from the forward entry of the forward table that corresponds to the document ID; and based on an input set of document IDs, returning a pointer to a document vector that contains the corresponding forward pointers.
Preferably, the index structure operating device may further include at least one of the following: a construction unit for pulling an inverted table and a forward table from the index server to construct an initial index structure; an update unit for collecting the inverted entries and forward entries that were read within a predetermined period, and for obtaining updates for those entries and writing them into memory; and a deletion unit for deleting entries in the index structure that have exceeded a predetermined eviction time.
According to a further aspect of the invention, a computing device is proposed, comprising: a processor; and a memory storing executable code that, when executed by the processor, causes the processor to perform the index structure operating method according to any one of the above.
According to a further aspect of the invention, a non-transitory machine-readable storage medium is proposed that stores executable code which, when executed by a processor of an electronic device, causes the processor to perform the index structure operating method according to any one of the above.
The index structure operation scheme of the invention keeps index data timely and up to date through automatic dynamic index updates. Introducing double buffering makes reads lock-free and highly concurrent; the write buffer is split into multiple blocks to reduce write conflicts and improve write performance; and because copying involves only pointers and IDs, the overhead of the double-buffering scheme is small.
Brief description of the drawings
The above and other objects, features, and advantages of the disclosure will become more apparent from the following more detailed description of exemplary embodiments of the disclosure taken in conjunction with the accompanying drawings, in which the same reference numerals generally denote the same components.
Fig. 1 shows a flow diagram of an index structure operating method according to one embodiment of the invention.
Fig. 2 shows a flow diagram of an index structure operating method according to another embodiment of the invention.
Fig. 3 shows a schematic diagram of a system in which the index structure operating method of the invention can be implemented.
Fig. 4 shows a structural schematic diagram of an index structure operating device according to one embodiment of the invention.
Fig. 5 shows a structural schematic diagram of a computing device that can be used to implement the above index structure operating method according to an embodiment of the invention.
Detailed description
Preferred embodiments of the disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the scope of the disclosure to those skilled in the art.
In a search system, it is necessary to retrieve the articles that match a given category or keyword (the mapping from a keyword to its list of articles is called the inverted index) and to obtain the attributes of those articles for personalized ranking (obtaining document attribute content from a document ID is called the forward index). In an information-feed recommender system, the recommendation module must perform these operations many times, so the inverted and forward tables are the most critical data in the recommender system. In the prior art, because the inverted and forward tables reside in a separate index service, the repeated reads by the recommendation module cause high recommendation latency. To address this, the invention proposes a new dynamic index operation scheme that dynamically updates the index structure itself from a remote source and performs index lookups locally on the index structure stored on the local machine, which resolves the latency problem of index reads.
Fig. 1 shows a flow diagram of an index structure operating method according to one embodiment of the invention. As shown in Fig. 1, in step S110, continuously acquired updates of the index structure are written into memory. In step S120, the written index structure is copied, so that reads and writes of the index structure are performed simultaneously, each on one of two currently valid copies of the index structure. In other words, timely copying ensures that two identical currently valid copies of the index structure exist, one used for reading and the other for writing, so that reads and writes are separated and do not interfere with each other.
Continuously writing updates to the index structure thus lets the locally stored index structure cope with the constantly changing nature of the inverted and forward tables. Further, because copying the index structure separates the memory used for reading from the memory used for writing, data reads need not be locked while data is written, which improves the efficiency of both reading and writing.
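The interplay of steps S110 and S120 can be illustrated with a minimal C++ sketch. It is not the patented implementation: the class name `DoubleBufferedIndex`, the methods `write`, `publish`, and `snapshot`, and the use of a short mutex around the pointer swap are all assumptions made for illustration. Lookups on a snapshot take no lock; only fetching the snapshot pointer does.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical table type: ID -> smart pointer to content.
using Table = std::unordered_map<int, std::shared_ptr<const std::string>>;

class DoubleBufferedIndex {
public:
    DoubleBufferedIndex() : read_copy_(std::make_shared<const Table>()) {}

    // Step S110 (sketch): apply an update to the write copy only.
    void write(int id, std::string content) {
        std::lock_guard<std::mutex> guard(write_mu_);
        write_copy_[id] = std::make_shared<const std::string>(std::move(content));
    }

    // Step S120 (sketch): copy the write copy and publish it as the new read copy.
    void publish() {
        std::shared_ptr<const Table> fresh;
        {
            std::lock_guard<std::mutex> guard(write_mu_);
            fresh = std::make_shared<const Table>(write_copy_);
        }
        std::lock_guard<std::mutex> guard(swap_mu_);
        read_copy_ = std::move(fresh);
    }

    // Readers take the current read copy; lookups on it need no lock.
    std::shared_ptr<const Table> snapshot() const {
        std::lock_guard<std::mutex> guard(swap_mu_);
        return read_copy_;
    }

private:
    mutable std::mutex swap_mu_;  // guards only the pointer swap
    std::mutex write_mu_;         // guards the write copy
    Table write_copy_;
    std::shared_ptr<const Table> read_copy_;
};
```

A reader that took a snapshot before `publish()` keeps seeing the old copy until it asks for a new snapshot, so a write in progress never changes data a reader is traversing.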
Here, the index structure may include an inverted table and a forward table, each of which includes multiple entries. These entries change continuously as articles of all kinds are published on and deleted from the Internet. Therefore, to obtain updates in time, step S110 must continuously acquire updates and write them into local memory. Here, continuous acquisition may mean that the local machine (for example, the service server described below) obtains updates to individual entries of the inverted and forward tables from a remote source (for example, the index server described below) without interruption, or that the local machine obtains updates from the remote source in batches at a certain time interval. An acquired update may be an entry that already existed but whose content has changed, for example an inverted entry that now links to more documents, or a newly created entry, for example a new forward entry representing a newly crawled document. It should also be understood that the acquired entries may be referred to as the updated index structure, as updates of the index structure, or as the index structure itself; the invention places no restriction on this.
In one embodiment, the inverted table consists of inverted entries, each including a keyword ID and an inverted pointer that points to a document pointer vector. In other words, each inverted entry includes one keyword ID and one inverted pointer to a document pointer vector. Specifically, the document pointer vector may include multiple elements, each of which is a pointer, for example a smart pointer, to one document (that is, a document containing the keyword). The inverted pointer may in turn be a smart pointer to that vector, for example a shared_ptr. Because a smart pointer has the property that the data it points to is not modified while it is referenced and is deleted automatically once no external reference remains, this prevents access errors that would otherwise occur when data inside the index structure is destroyed after a caller outside the index structure has obtained it.
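The lifetime guarantee described above can be shown with a small C++ sketch. The function name `read_after_erase` and the simplified table type are illustrative assumptions; the point is only that a `std::shared_ptr` held by a reader keeps the data alive after the index drops its own entry.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Returns the content a reader still sees after the index drops its entry.
inline std::string read_after_erase() {
    std::unordered_map<int, std::shared_ptr<const std::string>> forward;
    forward[1] = std::make_shared<const std::string>("doc body");

    // A reader obtains the pointer, which raises the reference count.
    std::shared_ptr<const std::string> held = forward.at(1);

    // The index evicts the entry while the reader still holds the pointer.
    forward.erase(1);

    // The string is still alive because `held` keeps the count above zero.
    return *held;
}
```

The content is released only when the last holder, here the reader, lets go of its pointer.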
Alternatively or additionally, the forward table may consist of forward entries, each including a document ID and a forward pointer that points to the content of that document. The document ID may be a unique identifier assigned to a specific document, for example when a crawler fetches it, and may be, for example, a 64-bit unsigned integer. The forward pointer may be a smart pointer to the document content.
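One possible in-memory layout for the two tables is sketched below in C++. The type names (`Document`, `PostingList`, `IndexStructure`), the 64-bit keyword ID, and the helper `make_example_index` are assumptions for illustration; the source specifies only that document IDs may be 64-bit unsigned integers and that the pointers may be smart pointers such as shared_ptr.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical document record; the field names are illustrative only.
struct Document {
    std::uint64_t id;     // 64-bit unsigned document ID
    std::string content;  // document attribute content
};

using DocPtr = std::shared_ptr<const Document>;             // forward pointer
using PostingList = std::vector<DocPtr>;                    // document pointer vector
using PostingListPtr = std::shared_ptr<const PostingList>;  // inverted pointer

// Inverted table: keyword ID -> smart pointer to a vector of document pointers.
using InvertedTable = std::unordered_map<std::uint64_t, PostingListPtr>;
// Forward table: document ID -> smart pointer to the document content.
using ForwardTable = std::unordered_map<std::uint64_t, DocPtr>;

struct IndexStructure {
    InvertedTable inverted;
    ForwardTable forward;
};

// Builds a tiny index in which two documents both match keyword 7.
inline IndexStructure make_example_index() {
    auto d1 = std::make_shared<const Document>(Document{101, "sports article"});
    auto d2 = std::make_shared<const Document>(Document{102, "sports video"});
    IndexStructure idx;
    idx.forward[d1->id] = d1;
    idx.forward[d2->id] = d2;
    idx.inverted[7] = std::make_shared<const PostingList>(PostingList{d1, d2});
    return idx;
}
```

In this layout the forward entry and the posting list share the same `Document` object, so each document's content is stored once.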
Copying the written structure in step S120 ensures that the operation by which the local machine obtains the updated index structure from the remote source (that is, writes content) and the operation by which the local machine performs search tasks and reads index content are separated from each other and do not interfere (that is, reads and writes are separated), which improves the efficiency and accuracy of both write and read operations. In other words, the continuous copying of updated content in step S120 ensures that the subsequent update step S110 proceeds smoothly without interfering with read operations.
Step S110 may include performing write operations on one of the two currently valid copies of the index structure obtained by copying. More specifically, it may include: dividing that copy into multiple parallel write regions; and performing index structure update writes on the multiple parallel write regions in parallel. Here, the acquired index structure may be a hash table containing the forward and inverted entries. The hash table used for writing may be divided into multiple regions, each with its own lock and eviction mechanism, so that write operations can be performed on multiple regions of the index structure at the same time without affecting one another.
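A write-side table split into independently locked regions might look like the following C++ sketch. The class name `PartitionedTable`, the region count `kRegions`, and the hash-modulo assignment of IDs to regions are assumptions; the source states only that each region has its own lock and eviction mechanism (eviction is omitted here).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

constexpr std::size_t kRegions = 8;  // hypothetical region count

class PartitionedTable {
public:
    // Maps an ID to the region that owns it.
    static std::size_t region_of(std::uint64_t id) {
        return std::hash<std::uint64_t>{}(id) % kRegions;
    }

    // Locks only the owning region, so writes to other regions can proceed.
    void put(std::uint64_t id, std::string content) {
        Region& r = regions_[region_of(id)];
        std::lock_guard<std::mutex> guard(r.mu);
        r.entries[id] = std::make_shared<const std::string>(std::move(content));
    }

    // Total number of entries across all regions.
    std::size_t size() {
        std::size_t n = 0;
        for (Region& r : regions_) {
            std::lock_guard<std::mutex> guard(r.mu);
            n += r.entries.size();
        }
        return n;
    }

private:
    struct Region {
        std::mutex mu;  // independent lock per region
        std::unordered_map<std::uint64_t, std::shared_ptr<const std::string>> entries;
    };
    std::array<Region, kRegions> regions_;
};
```

Several update threads can call `put` concurrently; two calls contend only when their IDs fall into the same region, which is why a larger region count lowers the chance of write conflicts.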
Alternatively or additionally, the index structure operating method of the invention may further include an index structure read step, that is, performing highly concurrent read operations on the other of the two currently valid copies of the index structure.
Fig. 2 shows a flow diagram of an index structure operating method according to another embodiment of the invention. As shown in Fig. 2, the index structure update step S210 and the copy step S220 run continuously, while the updated copy of the index structure is used relatively independently for the read operation S230. Here, the index structure used for reading may be, for example, one complete hash table (for example, an unordered_map), which enables a lock-free, highly concurrent read mechanism.
It should be made clear that the index structure copied in the invention may be forward and inverted tables that involve only IDs and pointers, without the document attribute content that occupies a large amount of memory. In other words, the two currently valid copies of the index structure used for reading and writing respectively may share the document attribute content. The double-buffering mechanism of the invention therefore copies only IDs and pointers, and its memory overhead remains small.
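The claim that copying touches only IDs and pointers can be checked with a short C++ sketch. The helper names `shallow_copy` and `shares_content` are illustrative assumptions; copying an `unordered_map` of `shared_ptr` values duplicates the keys and the pointers but not the pointed-to content.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

using ForwardTable =
    std::unordered_map<std::uint64_t, std::shared_ptr<const std::string>>;

// Copies IDs and pointers only; the pointed-to content is not duplicated.
inline ForwardTable shallow_copy(const ForwardTable& src) {
    return ForwardTable(src);
}

// True when both tables point at the very same content object for `id`.
inline bool shares_content(const ForwardTable& a, const ForwardTable& b,
                           std::uint64_t id) {
    return a.at(id).get() == b.at(id).get();
}
```

After the copy, each content object is referenced once from the write copy and once from the read copy, so the extra memory is proportional to the number of entries rather than to the size of the documents.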
In one embodiment, step S230 of performing highly concurrent read operations on the other currently valid copy of the index structure may include at least one of the following: based on an input keyword ID, returning the inverted pointer to a document pointer vector from the inverted entry of the inverted table that corresponds to the keyword ID; based on an input document ID, returning the forward pointer to the document content from the forward entry of the forward table that corresponds to the document ID; and based on an input set of document IDs, returning a pointer to a document vector that contains the corresponding forward pointers.
The invention may provide index read interfaces, which may include an inverted read interface, a single forward read interface, and a batch forward read interface.
The input parameter of the inverted read interface may be the inverted key, that is, the keyword ID, and may include read-mode options, for example whether the result should be stored in the local in-memory index, whether posting-list truncation is allowed, and a timeout setting. The returned result may be the inverted posting list, that is, a pointer to a document pointer vector, for example a smart pointer (shared_ptr) to a vector. Each element of the vector is a smart pointer to one document.
The input parameter of the single forward read interface may be a document ID, for example a 64-bit unsigned integer. The returned result may be a smart pointer to the content of that document (the forward data).
The input parameter of the batch forward read interface may be a set of document IDs. The returned result may be similar to that of the inverted read interface, that is, a smart pointer to a document vector. Each element of the document vector is a smart pointer to one document in the input set of document IDs.
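The three read interfaces could be sketched in C++ as follows. The struct `ReadIndex` and its method names are assumptions, documents are reduced to strings, and the read-mode options (local caching, truncation, timeout) are left out. Returning a null pointer for a missing key and skipping missing IDs in the batch read are choices made for this sketch, not behavior stated in the source.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using DocPtr = std::shared_ptr<const std::string>;
using DocVector = std::vector<DocPtr>;
using DocVectorPtr = std::shared_ptr<const DocVector>;

struct ReadIndex {
    std::unordered_map<std::uint64_t, DocVectorPtr> inverted;  // keyword ID -> posting list
    std::unordered_map<std::uint64_t, DocPtr> forward;         // document ID -> content

    // Inverted read: keyword ID in, pointer to the document pointer vector out.
    DocVectorPtr read_inverted(std::uint64_t keyword_id) const {
        auto it = inverted.find(keyword_id);
        if (it == inverted.end()) return nullptr;
        return it->second;
    }

    // Single forward read: document ID in, pointer to the document content out.
    DocPtr read_forward(std::uint64_t doc_id) const {
        auto it = forward.find(doc_id);
        if (it == forward.end()) return nullptr;
        return it->second;
    }

    // Batch forward read: document IDs in, pointer to a vector of content pointers out.
    // IDs that are not in the index are skipped in this sketch.
    DocVectorPtr read_forward_batch(const std::vector<std::uint64_t>& doc_ids) const {
        auto out = std::make_shared<DocVector>();
        for (std::uint64_t id : doc_ids) {
            if (DocPtr d = read_forward(id)) out->push_back(std::move(d));
        }
        return out;
    }
};
```

Because every result is a `shared_ptr`, callers can keep using it after the index replaces or evicts the underlying entry.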
In one embodiment, before the update, copy, and read operations on the index structure are performed, an initial index structure may be constructed on the local machine, for example by obtaining an inverted table and a forward table from the remote source. That is, the local machine may be warmed up; warm-up means that when the index object is constructed, part of the inverted and forward data is fetched from the remote source in advance. Preferably, the index interfaces described above may also include an index initialization interface used to warm up the local machine. The index initialization interface may include the following parameters: the IP and port list of the remote index data sources; the read timeout; the maximum length of posting-list data read from the remote source in a single read; the data update interval; the number of update threads; the expiry time; the file of inverted keys whose posting lists are to be warmed up; the number of data partitions (the index data is divided into multiple regions, and the number of regions affects the likelihood of write conflicts and therefore performance); and the truncated posting-list length, that is, the maximum length of an inverted posting list that is queried from the remote source in real time. Using this interface with the relevant parameters set, the inverted keys that need warming up can be written to the file, and once the index object has been constructed, the inverted data can be read efficiently on the local machine.
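The initialization parameters listed above could be grouped into an options struct, as in this C++ sketch. All field names and every default value are placeholders invented for illustration; the source names the parameters but gives no identifiers, types, or values. The `is_valid` check is likewise an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical remote data source address.
struct Endpoint {
    std::string ip;
    unsigned short port;
};

// Hypothetical options for the index initialization interface.
struct IndexInitOptions {
    std::vector<Endpoint> remote_sources;             // remote index data source IPs and ports
    std::chrono::milliseconds read_timeout{50};       // read timeout (placeholder value)
    std::size_t max_remote_list_length = 10000;       // max posting-list length per remote read
    std::chrono::seconds update_interval{60};         // data update interval
    unsigned update_threads = 4;                      // number of update threads
    std::chrono::seconds expiry{600};                 // expiry time
    std::string warmup_key_file;                      // file of inverted keys to warm up
    std::size_t partitions = 16;                      // number of data partitions (regions)
    std::size_t truncate_list_length = 1000;          // truncated posting-list length
};

// Minimal sanity check before constructing the index object.
inline bool is_valid(const IndexInitOptions& o) {
    return !o.remote_sources.empty() && o.update_threads > 0 && o.partitions > 0;
}
```

Keeping the partition count and thread count in one place makes the trade-off mentioned above explicit: more partitions reduce write conflicts at the cost of more locks to manage.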
To adapt to constantly changing index data, the invention provides a dynamic in-memory index mechanism that covers additions, modifications, and deletions. It should be understood that additions to and modifications of the index structure may be regarded as part of the index structure update operation of steps such as S110 and S210.
In one embodiment, writing the continuously acquired updates of the index structure into memory may include: collecting the inverted entries and forward entries that were read within a predetermined period; and obtaining updates for those inverted entries and forward entries and writing them into memory. Obtaining the updates and writing them into memory may include: obtaining the updates for the inverted entries and forward entries once they have reached their write validity period.
Alternatively or additionally, the index structure operating method of the invention may further include deleting entries in the index structure that have exceeded a predetermined eviction time.
Specifically, for data additions, because a real-time index loads data continually, data updates may be implemented by periodically starting multiple threads that pull data from the remote data source. The recently accessed inverted keys and forward IDs may first be collected as update objects. When pulling, it is checked whether the time since the data was last loaded exceeds the validity period of the update object; if it does, the updated data is pulled. Preferably, an update object remains readable for some time after its validity period has been exceeded (for example, within 3 times the validity period), to prevent data from becoming inaccessible when a pull fails.
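The two time checks described above can be written as small predicates, as in this C++ sketch. The names `EntryTimes`, `needs_refresh`, and `still_readable` are illustrative, and measuring the 3x grace window from the last load time (rather than from the moment of expiry) is an assumption, since the source does not say which is meant.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

struct EntryTimes {
    Clock::time_point loaded_at;  // when the entry was last loaded from the remote source
};

// An entry is due for a refresh once its validity period has elapsed.
inline bool needs_refresh(const EntryTimes& e, Clock::time_point now,
                          Clock::duration validity) {
    return now - e.loaded_at > validity;
}

// A stale entry stays readable within a grace window of 3x the validity period,
// measured here from the last load (an assumption of this sketch).
inline bool still_readable(const EntryTimes& e, Clock::time_point now,
                           Clock::duration validity) {
    return now - e.loaded_at <= 3 * validity;
}
```

Between the end of the validity period and the end of the grace window an entry is both due for refresh and still readable, which is what keeps reads working when a pull fails.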
When data needs to be modified, the modification may be turned into a data update to avoid affecting reads, that is, a new piece of data is created to replace the old one. The data from before the modification is then deleted, for example after a period of time.
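Turning a modification into a replacement can be sketched in C++ as below. The function name `replace_entry` and the simplified table type are assumptions; the sketch builds a new object, repoints the entry to it, and hands back the previous object so that it can be dropped later.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

using Content = std::shared_ptr<const std::string>;
using Table = std::unordered_map<int, Content>;

// Modifies an entry by building a new object and repointing the entry to it.
// Returns the previous object (null if there was none) so it can be released later.
inline Content replace_entry(Table& table, int id, std::string new_content) {
    Content fresh = std::make_shared<const std::string>(std::move(new_content));
    Content previous = table[id];
    table[id] = std::move(fresh);
    return previous;
}
```

A reader that fetched the old pointer before the replacement continues to see the old content unchanged, while new lookups see the new content.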
For data deletion, the modification time is preferably recorded when data is added to the index. A thread inside the index may then check the modification time periodically. If the time since modification exceeds the configured data eviction time, the corresponding data is deleted, which prevents the index from growing without bound, consuming memory, and crashing the machine.
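One pass of the periodic eviction check might look like this C++ sketch. The names `TimedEntry`, `TimedTable`, and `evict_expired` are illustrative assumptions, and the background thread that would call the function on a timer is omitted.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

struct TimedEntry {
    std::shared_ptr<const std::string> content;
    Clock::time_point modified_at;  // recorded when the entry is added or replaced
};

using TimedTable = std::unordered_map<std::uint64_t, TimedEntry>;

// Removes entries whose last modification is older than `ttl`.
// Returns the number of entries removed.
inline std::size_t evict_expired(TimedTable& table, Clock::time_point now,
                                 Clock::duration ttl) {
    std::size_t removed = 0;
    for (auto it = table.begin(); it != table.end();) {
        if (now - it->second.modified_at > ttl) {
            it = table.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Erasing an entry only drops the index's own reference; content still held by a reader through a `shared_ptr` is released when that reader finishes.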
In search system or including the system of function of search, such as in recommender system, individual index service will use Device generates and safeguards index structure.Fig. 3 shows the signal that can implement the system of index structure operating method of the invention Figure.It should be understood that index structure operating system 300 shown in Fig. 3 can be recommender system, in other words, it can be and push away Recommend system realizes index structure operation scheme of the invention while being recommended.Service server 310 in figure in addition to After the index structure for executing the machine updates and replicates operation, it is also based on the search result that such as index structure is read and generates Recommendation results.Above-mentioned recommendation results for example can be distributed to corresponding client via the distribution server (Fig. 3 is not shown).
As shown in figure 3, system 300 may include index server 310 and multiple service servers 320.Index server It 310 can safeguard, continuous updating and issue index structure for being retrieved.Each service server 320 then can be used for by Memory is written in the update of the index structure persistently obtained;And the index structure of duplication write-in, to ensure to tie for the index Structure reads and writees for two parts of currently valid index structures respectively while carrying out.
Similarly, index structure herein may include inverted list and positive row's table, inverted list can by include keyword ID and The row's of falling item composition of the row's of falling pointer of direction text pointer vector, positive table of arranging then can be by including document id and direction document content Positive row's item of positive row's pointer is constituted.
In one embodiment, service server 320 can for that will continue the update write-in memory of the index structure obtained To include: that the currently valid index structure of portion in two parts of currently valid index structures is divided into multiple be written in parallel to Region;And it is written in parallel to region for multiple, it executes parallel index structure and updates write operation.
In one embodiment, service server 320 can be used for in two parts of currently valid index structures Another currently valid index structure carry out the read operation of high concurrent, the read operation includes at least one of following: Keyword ID based on input returns to the direction document in the inverted list in the row's of falling item corresponding with the keyword ID and refers to The row's of falling pointer of needle vector;Document id based on input returns to positive row's item corresponding with the document id in positive row's table The middle positive row's pointer for being directed toward document content;And the document id set based on input, return, which is directed toward just to arrange including multiple correspondences, to be referred to The pointer of the document vector of needle.
Further, service server 320 may also be used for at least one of the following: pulling the inverted list and forward list from index server 310 to construct an initial index structure; collecting the inverted entries and forward entries that were read within a predetermined time period, and obtaining updates for those inverted entries and forward entries to write into memory; and deleting items in the index structure that exceed a predetermined deletion time.
In other words, service server 320 can serve as the machine that executes the index structure operating method described above with reference to Figs. 1-2.
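The maintenance duties of the service server (collecting recently read entries, refreshing them, and deleting stale items) might look like the single-threaded sketch below. Everything here is an assumption made for illustration: the class `MaintainedTable`, the injectable `clock`, the `fetch` callback standing in for a request to the index server, and the reading of "deletion time" as time elapsed since an entry was last written.

```python
import time
from typing import Callable, Dict, Optional, Set


class MaintainedTable:
    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.values: Dict[int, str] = {}
        self.written_at: Dict[int, float] = {}
        self.recently_read: Set[int] = set()

    def put(self, key: int, value: str) -> None:
        self.values[key] = value
        self.written_at[key] = self.clock()

    def get(self, key: int) -> Optional[str]:
        self.recently_read.add(key)  # remember what was read in this window
        return self.values.get(key)

    def refresh(self, fetch: Callable[[int], Optional[str]]) -> None:
        """Fetch updates only for the keys read since the last refresh."""
        keys, self.recently_read = self.recently_read, set()
        for key in keys:
            value = fetch(key)
            if value is not None:
                self.put(key, value)

    def evict_expired(self) -> int:
        """Delete entries whose last write is older than the TTL."""
        now = self.clock()
        expired = [k for k, t in self.written_at.items() if now - t > self.ttl]
        for k in expired:
            del self.values[k]
            del self.written_at[k]
        return len(expired)


now = [0.0]  # fake clock so the example is deterministic
table = MaintainedTable(ttl_seconds=60, clock=lambda: now[0])
table.put(1, "old-1")
table.put(2, "old-2")
table.get(1)                          # only key 1 is read in this window
now[0] = 50.0
table.refresh(lambda k: f"new-{k}")   # key 1 is rewritten at t=50
now[0] = 100.0
evicted = table.evict_expired()       # key 2 is 100 s old and is removed
```

Refreshing only what was actually read keeps the local index small, while the expiry pass removes entries that are no longer requested.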
In one embodiment, the present invention may also be implemented as an index structure operating device. Fig. 4 shows a structural schematic diagram of an index structure operating device according to one embodiment of the invention.
As shown in Fig. 4, index structure operating device 400 may include a writing unit 410 and a copying unit 420. Writing unit 410 may be used to write the continuously obtained updates of the index structure into memory. Copying unit 420 may then be used to replicate the written index structure, so that reads and writes of the index structure are carried out at the same time, each against one of two currently valid copies of the index structure.
Specifically, writing unit 410 may be used to divide one of the two currently valid copies of the index structure into multiple parallel-write regions, and to perform index structure update write operations in parallel across the multiple parallel-write regions.
In one embodiment, index structure operating device 400 may optionally include a reading unit 430, which may be used to perform high-concurrency read operations on the other currently valid copy of the index structure. The read operations may include at least one of the following: based on an input keyword ID, returning the inverted pointer that points to the document pointer vector in the inverted entry of the inverted list corresponding to that keyword ID; based on an input document ID, returning the forward pointer that points to the document content in the forward entry of the forward list corresponding to that document ID; and, based on an input set of document IDs, returning a pointer to a document vector that contains the multiple corresponding forward pointers.
In other embodiments, index structure operating device 400 may also optionally include other functional units, for example: a construction unit for pulling the inverted list and forward list from the index server to construct an initial index structure; an updating unit for collecting the inverted entries and forward entries that were read within a predetermined time period, and obtaining updates for those inverted entries and forward entries to write into memory; and a deletion unit for deleting items in the index structure that exceed a predetermined deletion time. In one embodiment, the updating unit may be part of writing unit 410. In yet another embodiment, writing unit 410 may include the functions of the construction unit and the deletion unit, so as to serve as the modification unit of the index structure.
In another embodiment, the index structure operation scheme of the invention may also be implemented by a computing device. Fig. 5 shows a structural schematic diagram of a computing device that can be used to implement the above index structure operating method according to an embodiment of the present invention.
Referring to Fig. 5, computing device 500 includes memory 510 and processor 520.
Processor 520 may be a multi-core processor, or may include multiple processors. In some embodiments, processor 520 may include a general-purpose main processor and one or more special-purpose coprocessors, such as a graphics processing unit (GPU) or a digital signal processor (DSP). In some embodiments, processor 520 may be implemented using customized circuits, such as an application-specific integrated circuit (ASIC) or a field programmable gate array (FPGA).
Memory 510 may include various types of storage units, such as system memory, read-only memory (ROM), and a permanent storage device. The ROM may store static data or instructions needed by processor 520 or by other modules of the computer. The permanent storage device may be a readable and writable storage device. It may be a non-volatile storage device that does not lose the stored instructions and data even after the computer is powered off. In some embodiments, a mass storage device (such as a magnetic or optical disk, or flash memory) is used as the permanent storage device. In other embodiments, the permanent storage device may be a removable storage device (such as a floppy disk or an optical drive). The system memory may be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory. The system memory may store some or all of the instructions and data that the processor needs at runtime. In addition, memory 510 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic disks and/or optical disks may also be used. In some embodiments, memory 510 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density optical disc, a flash memory card (e.g., SD card, mini SD card, Micro-SD card), or a magnetic floppy disk. Computer-readable storage media do not include carrier waves or transient electronic signals transmitted wirelessly or by wire.
Executable code is stored on memory 510; when the executable code is processed by processor 520, it can cause processor 520 to execute the index structure operating method described above.
The index structure operating method, system, and device according to the present invention have been described in detail above with reference to the accompanying drawings. The index structure operation scheme of the invention keeps data timely and up to date in real time through automatic updating of the dynamic index. Because data updates can be coordinated automatically with the remote index library, for example by using a library (lib), data maintenance costs are significantly reduced. In addition, lock-free reading and high-concurrency read performance are achieved by introducing double buffering. For the write buffer, dividing it into multiple blocks reduces write conflicts and improves write performance; and because replication involves only pointers and IDs, the overhead of the double-buffering scheme is small.
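The point that replication involves only pointers and IDs can be made concrete with a small sketch. The `replicate` function and the dictionary layout are hypothetical; the sketch copies the ID-to-reference maps and the reference vectors but deliberately leaves the document objects shared between the two copies.

```python
documents = {1: {"title": "red shoes"}, 2: {"title": "red hat"}}
write_copy = {
    "forward": {1: documents[1], 2: documents[2]},
    "inverted": {10: [documents[1], documents[2]]},
}


def replicate(index):
    """Copy the ID -> reference maps and reference vectors, not the documents."""
    return {
        "forward": dict(index["forward"]),
        "inverted": {kw: list(vec) for kw, vec in index["inverted"].items()},
    }


read_copy = replicate(write_copy)

# A structural change on the write side does not leak into the read copy,
# while the document content itself is still shared by both copies.
write_copy["inverted"][10].append({"title": "red scarf"})
```

The replica costs memory proportional to the number of IDs and references rather than to the size of the documents, which is why the double buffer stays cheap in this sketch.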
In addition, the method according to the present invention may also be implemented as a computer program or computer program product that includes computer program code instructions for executing the steps defined in the above method of the invention.
Alternatively, the present invention may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) on which executable code (or a computer program, or computer instruction code) is stored; when the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or computing device, server, etc.), the processor is caused to execute each step of the above method according to the present invention.
Those skilled in the art will also understand that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the drawings show the architecture, functions, and operations of possible implementations of systems and methods according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Various embodiments of the present invention have been described above. The description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or their improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (23)

1. An index structure operating method, comprising:
writing continuously obtained updates of an index structure into memory; and
replicating the written index structure, so that reads and writes of the index structure are carried out at the same time, each against one of two currently valid copies of the index structure.
2. The method of claim 1, wherein the index structure includes an inverted list and a forward list, the inverted list consisting of inverted entries that each include a keyword ID and an inverted pointer pointing to a document pointer vector, and the forward list consisting of forward entries that each include a document ID and a forward pointer pointing to document content.
3. The method of claim 2, wherein the inverted pointer and the forward pointer are smart pointers.
4. The method of claim 2, wherein writing the continuously obtained updates of the index structure into memory comprises:
dividing one of the two currently valid copies of the index structure into multiple parallel-write regions; and
performing index structure update write operations in parallel across the multiple parallel-write regions.
5. The method of claim 4, further comprising:
performing high-concurrency read operations on the other of the two currently valid copies of the index structure.
6. The method of claim 6, wherein performing high-concurrency read operations on the other of the two currently valid copies of the index structure comprises at least one of the following:
based on an input keyword ID, returning the inverted pointer that points to the document pointer vector in the inverted entry of the inverted list corresponding to the keyword ID;
based on an input document ID, returning the forward pointer that points to the document content in the forward entry of the forward list corresponding to the document ID; and
based on an input set of document IDs, returning a pointer to a document vector that contains the multiple corresponding forward pointers.
7. The method of claim 1, further comprising:
obtaining an inverted list and a forward list to construct an initial index structure.
8. The method of claim 2, wherein writing the continuously obtained updates of the index structure into memory comprises:
collecting the inverted entries and forward entries that were read within a predetermined time period; and
obtaining updates for the inverted entries and forward entries to write into memory.
9. The method of claim 8, wherein obtaining updates for the inverted entries and forward entries to write into memory comprises:
when the inverted entries and forward entries reach their write validity period, obtaining updates for the inverted entries and forward entries to write into memory.
10. The method of claim 1, further comprising:
deleting items in the index structure that exceed a predetermined deletion time.
11. The method of claim 1, wherein the two currently valid copies of the index structure share document attribute content.
12. An index structure operating system, comprising:
an index server that maintains, continuously updates, and publishes an index structure used for retrieval; and
multiple service servers, each service server being used to:
write continuously obtained updates of the index structure into memory; and
replicate the written index structure, so that reads and writes of the index structure are carried out at the same time, each against one of two currently valid copies of the index structure.
13. The system of claim 12, wherein the index structure includes an inverted list and a forward list, the inverted list consisting of inverted entries that each include a keyword ID and an inverted pointer pointing to a document pointer vector, and the forward list consisting of forward entries that each include a document ID and a forward pointer pointing to document content.
14. The system of claim 13, wherein the service server being used to write the continuously obtained updates of the index structure into memory comprises:
dividing one of the two currently valid copies of the index structure into multiple parallel-write regions; and
performing index structure update write operations in parallel across the multiple parallel-write regions.
15. The system of claim 13, wherein the service server is used to:
perform high-concurrency read operations on the other of the two currently valid copies of the index structure, the read operations including at least one of the following:
based on an input keyword ID, returning the inverted pointer that points to the document pointer vector in the inverted entry of the inverted list corresponding to the keyword ID;
based on an input document ID, returning the forward pointer that points to the document content in the forward entry of the forward list corresponding to the document ID; and
based on an input set of document IDs, returning a pointer to a document vector that contains the multiple corresponding forward pointers.
16. The system of claim 12, wherein the service server is also used for at least one of the following:
pulling the inverted list and forward list from the index server to construct an initial index structure;
collecting the inverted entries and forward entries that were read within a predetermined time period, and obtaining updates for the inverted entries and forward entries to write into memory; and
deleting items in the index structure that exceed a predetermined deletion time.
17. The system of claim 12, wherein the index structure operating system is a recommender system, and the service server generates recommendation results based on the search results read from the index structure.
18. An index structure operating device, comprising:
a writing unit for writing continuously obtained updates of an index structure into memory; and
a copying unit for replicating the written index structure, so that reads and writes of the index structure are carried out at the same time, each against one of two currently valid copies of the index structure, wherein
the index structure includes an inverted list and a forward list, the inverted list consisting of inverted entries that each include a keyword ID and an inverted pointer pointing to a document pointer vector, and the forward list consisting of forward entries that each include a document ID and a forward pointer pointing to document content.
19. The device of claim 18, wherein the writing unit is further used for:
dividing one of the two currently valid copies of the index structure into multiple parallel-write regions; and
performing index structure update write operations in parallel across the multiple parallel-write regions.
20. The device of claim 18, further comprising:
a reading unit for performing high-concurrency read operations on the other of the two currently valid copies of the index structure, the read operations including at least one of the following:
based on an input keyword ID, returning the inverted pointer that points to the document pointer vector in the inverted entry of the inverted list corresponding to the keyword ID;
based on an input document ID, returning the forward pointer that points to the document content in the forward entry of the forward list corresponding to the document ID; and
based on an input set of document IDs, returning a pointer to a document vector that contains the multiple corresponding forward pointers.
21. The device of claim 18, further comprising at least one of the following:
a construction unit for pulling the inverted list and forward list from the index server to construct an initial index structure;
an updating unit for collecting the inverted entries and forward entries that were read within a predetermined time period, and obtaining updates for the inverted entries and forward entries to write into memory; and
a deletion unit for deleting items in the index structure that exceed a predetermined deletion time.
22. A computing device, comprising:
a processor; and
a memory on which executable code is stored, wherein the executable code, when executed by the processor, causes the processor to execute the method of any one of claims 1-11.
23. A non-transitory machine-readable storage medium on which executable code is stored, wherein the executable code, when executed by a processor of an electronic device, causes the processor to execute the method of any one of claims 1-11.
CN201810924287.5A 2018-08-14 2018-08-14 Index structure operating method, device and system Pending CN109408613A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810924287.5A CN109408613A (en) 2018-08-14 2018-08-14 Index structure operating method, device and system

Publications (1)

Publication Number Publication Date
CN109408613A true CN109408613A (en) 2019-03-01

Family

ID=65464304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810924287.5A Pending CN109408613A (en) 2018-08-14 2018-08-14 Index structure operating method, device and system

Country Status (1)

Country Link
CN (1) CN109408613A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200811

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 510627 Guangdong city of Guangzhou province Whampoa Tianhe District Road No. 163 Xiping Yun Lu Yun Ping square B radio tower 13 layer self unit 01

Applicant before: Guangdong Shenma Search Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20190301