CN106528623B - Search engine acceleration method and device - Google Patents
Search engine acceleration method and device
- Publication number: CN106528623B
- Application number: CN201610878061.7A
- Authority
- CN
- China
- Prior art keywords
- document
- search
- address
- search engine
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the invention disclose a search engine acceleration method and device, including: a search engine receives a search keyword input by a user and outputs the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, N being a positive integer of massive scale; the document address IDs are represented with a bitmap data structure, the internal data sequence numbers of the bitmap data structure correspond to the order of the search results, and a single bit indicates whether the document at the corresponding sequence number is stored. The technical solution provided by the invention has the advantages of fast search speed and low memory usage.
Description
Technical field
This application relates to the technical field of data processing, and in particular to a search engine acceleration method and device.
Background technology
Many search engines build an inverted index around keywords and use it to search documents, in order to speed up keyword search. However, some fields are not suitable for an inverted index, because the documents returned for a keyword query often also require complex computation over certain fields in those documents; such fields cannot usefully be placed in an inverted index, and if the system continues to be designed in the conventional way, the achievable throughput can be very limited.
Take picture search as an example. The usual processing flow of picture search is: cameras capture a massive number of pictures, and the corresponding feature values are generated and stored into the search engine; at search time a picture is uploaded, its feature value A is generated, and similar pictures are retrieved from the search engine using feature value A together with other filter conditions, completing search-by-image. For example, on a single machine (CPU: Intel(R) Xeon(R) CPU E5-2520 v3 @ 2.40GHz, memory: 64G, hard disk: 1T SATA; the machines referred to below all use this configuration), with a keyword-search retrieval scale of 2,000,000 returned documents, a design involving complex mathematical operations is very difficult to convert into a pure keyword-based inverted index; together with the other complex computations, the overall response time is about 3 s. As the number of samples returned by retrieval grows, the response time increases sharply, and when the concurrency reaches 10 requests the whole system collapses.
Face search is another example within picture search. Because the feature value of a human face is a very large array, it must be computed with specific algorithms. The industry currently uses tools such as LIRE (Lucene Image Retrieval) to search for similar pictures, but their search accuracy for faces is not very high, so industrial requirements cannot be met.
When documents contain complex fields that are not suitable for an inverted index, and massive result sets and high-concurrency requests must be handled at the same time, engine search easily suffers from excessive memory consumption and CPU load, leading to low processing efficiency, long response times, and low search accuracy, as well as the risk of memory overflow.
Summary of the invention
The embodiments of the present application provide a search engine acceleration method and device, so that the search engine can meet the demands of massive result sets, high concurrency, and low latency, thereby improving the processing efficiency and speed of the search engine.
A first aspect of the embodiments of the present application provides a search engine acceleration method, including:
a search engine receiving a search keyword input by a user, and the search engine outputting the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, the N being a positive integer of massive scale;
the document address IDs being represented with a bitmap data structure, the internal data sequence numbers of the bitmap data structure corresponding to the order of the search results, and a single bit indicating whether the document is stored at the position corresponding to the sequence number.
Optionally, the method further includes:
creating multithreaded tasks to process the document address IDs, allocating an independent data space for each thread, and having the computing tasks reuse the data space.
Optionally, allocating an independent data space for each thread and having the computing tasks reuse the data space specifically includes:
allocating an independent data space for each thread, clearing the computation objects after the computing task of a thread is completed, and starting the next computing task.
Optionally, allocating an independent data space for each thread specifically includes:
allocating a data space of the same size for each thread.
Optionally, the method further includes:
obtaining all document information using a non-blocking I/O model; when the non-blocking I/O model finishes receiving the document information, notifying the central processing unit (CPU) by interrupt to process it; and, when the central processing unit has processed all of the document information, storing the document information into memory.
A second aspect provides a search engine device, including:
a receiving unit, configured to receive a search keyword input by a user;
a search unit, configured to output the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, the N being a positive integer of massive scale;
the document address IDs being represented with a bitmap data structure, the internal data sequence numbers of the bitmap data structure corresponding to the order of the search results, and a single bit indicating whether the document is stored at the position corresponding to the sequence number.
Optionally, the device further includes:
a creating unit, configured to create multithreaded tasks to process the document address IDs;
an allocation unit, configured to allocate an independent data space for each thread and have the computing tasks reuse the data space.
Optionally, the allocation unit is specifically configured to allocate an independent data space for each thread, clear the computation objects after the computing task of a thread is completed, and start the next computing task.
Optionally, the allocation unit is specifically configured to allocate a data space of the same size for each thread.
Optionally, the device further includes:
an acquiring unit, configured to obtain all document information using a non-blocking I/O model;
a storage unit, configured to, when the non-blocking I/O model finishes receiving the document information, notify the central processing unit (CPU) by interrupt to process it, and, when the central processing unit has processed all of the document information, store the document information into memory.
The technical solution provided by the invention has the advantage of fast search speed.
Description of the drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required by the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a search engine acceleration method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a search engine acceleration method provided by another embodiment of the present invention;
Fig. 3 is a schematic flowchart of a search engine acceleration method provided by a further embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a search engine device provided by an embodiment of the present invention;
Fig. 5 is a schematic hardware structural diagram of a search device provided by an embodiment of the present invention.
Specific embodiment
It should be mentioned that, before the exemplary embodiments are discussed in greater detail, some exemplary embodiments are described as processes depicted as flowcharts or methods. Although a flowchart describes the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.
The term "computer device" used in this context, also referred to as a "computer", refers to an intelligent electronic device that can perform predetermined processes such as numerical computation and/or logical computation by running preset programs or instructions. It may include a processor and a memory, where the processor executes instructions pre-stored in the memory to perform the predetermined processes, or the predetermined processes may be performed by hardware such as an ASIC, FPGA, or DSP, or by a combination of the two. Computer devices include, but are not limited to, servers, personal computers, laptops, tablet computers, smartphones, and the like.
The methods discussed below (some of which are illustrated by flowcharts) may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented with software, firmware, middleware, or microcode, the program code or code segments that perform the necessary tasks may be stored in a machine-readable or computer-readable medium (for example, a storage medium). One or more processors may perform the necessary tasks.
The specific structural and functional details disclosed herein are merely representative and serve the purpose of describing exemplary embodiments of the present invention. The present invention may, however, be embodied in many alternative forms and should not be construed as limited only to the embodiments set forth herein.
It should be appreciated that, although the terms "first", "second" and so on may be used herein to describe units, these units should not be limited by these terms. These terms are used only to distinguish one unit from another. For example, without departing from the scope of the exemplary embodiments, a first unit could be termed a second unit, and similarly a second unit could be termed a first unit. The term "and/or" used herein includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing specific embodiments only and is not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" used herein are also intended to include the plural. It should also be understood that the terms "comprising" and/or "including" used herein specify the presence of the stated features, integers, steps, operations, units, and/or components, and do not preclude the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.
It should further be mentioned that, in some alternative implementations, the functions/actions mentioned may occur in an order different from that indicated in the drawings. For example, depending on the functions/actions involved, two figures shown in succession may in fact be performed substantially at the same time, or may sometimes be performed in the reverse order.
The present invention is described in further detail below with reference to the accompanying drawings.
According to one aspect of the present invention, a search engine acceleration method is provided.
In one embodiment, the above method can be used in a smart device. It should be noted that the smart device is only an example; other existing or future network devices and user equipment, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Referring first to Fig. 1, Fig. 1 is a schematic flowchart of a search engine acceleration method provided by an embodiment of the present invention. As shown in Fig. 1, the above method can be applied in an intelligent terminal or a computer device; the above intelligent terminal includes, but is not limited to, devices such as a mobile phone, tablet computer, computer, or server, and may of course also be another device such as a smart watch or smart bracelet. As shown in Fig. 1, the method includes the following steps:
Step S101: the search engine receives a search keyword input by a user.
There are many ways to receive the search keyword input by the user in step S101. For example, in a preferred embodiment of the present invention, step S101 obtains the search keyword input by the user by means of keyboard input. The above keyboard can of course take different forms on different devices; for example, on a computer (including but not limited to a desktop computer and a notebook computer) the keyboard can be a physical keyboard, while on a tablet computer or a mobile phone the keyboard can be a virtual keyboard generated by software; this application does not limit the specific form of the keyboard. In another preferred embodiment of the present invention, the search keyword input by the user can be obtained in step S101 by means of voice input. In practical applications, the voice input can be obtained through a built-in microphone, or, in another practical application, through a microphone of a device connected to the smart device. Of course, in practical applications step S101 may also receive input in other ways, which are not listed one by one here.
Step S102: the search engine outputs the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, N being a positive integer of massive scale. The document address IDs are represented with a bitmap data structure; the internal data sequence numbers of the bitmap data structure correspond to the order of the search results, and a single bit indicates whether the document is stored at the position corresponding to that sequence number. A positive integer of massive scale generally refers to a positive integer greater than 1,000,000.
In step S102 the search results are the results obtained by the search engine; different search engines may produce different results. For example, the results returned by Baidu and by Google for the same query may differ.
In the above step the traditional document address ID is replaced by a bitmap data structure. An existing document address ID occupies 64 bits, whereas in this application each document address occupies only one bit: the position of the bit in the bitmap data structure indicates which document is meant, and the bit value indicates whether that document is stored, where 1 means stored and 0 means not stored. A concrete example illustrates the implementation. Since space is limited, 15 document addresses are taken as an example; in practical applications the number of document address IDs may reach millions or even hundreds of millions. For example, for the bit string 0111000111110101, the corresponding meaning is that the documents with sequence numbers 2, 3, 4, 8, 9, 10, 11, 12, 14, 15 are stored, and the documents with the other sequence numbers are not stored.
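To make this representation concrete, the following minimal sketch (not part of the claimed method; the class and method names are chosen only for this illustration) uses Java's standard BitSet as the bitmap of document addresses:

```java
import java.util.BitSet;

// Minimal sketch: document address IDs held as a bitmap rather than as 64-bit IDs.
// The bit position plays the role of the sequence number described above.
public class DocAddressBitmap {
    private final BitSet bits;

    public DocAddressBitmap(int capacity) {
        this.bits = new BitSet(capacity);
    }

    // Mark the document at this sequence number as stored (bit = 1).
    public void markStored(int sequenceNumber) {
        bits.set(sequenceNumber);
    }

    // A single bit answers "is the document at this sequence number stored?"
    public boolean isStored(int sequenceNumber) {
        return bits.get(sequenceNumber);
    }

    public static void main(String[] args) {
        DocAddressBitmap bitmap = new DocAddressBitmap(16);
        // Reproduce the example from the text: sequence numbers 2, 3, 4, 8, 9, 10, 11, 12, 14, 15 are stored.
        for (int seq : new int[] {2, 3, 4, 8, 9, 10, 11, 12, 14, 15}) {
            bitmap.markStored(seq);
        }
        System.out.println(bitmap.isStored(3));  // true
        System.out.println(bitmap.isStored(5));  // false
    }
}
```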
From the above description, because the search engine changes 64 bits to 1 bit per document address, the memory footprint is greatly reduced. Search result sets are typically on the order of millions; even counted at 1,000,000 results, the 64-bit IDs alone would occupy about 6.4×10^7 bits, so replacing them with single bits saves roughly 6.3×10^7 bits of memory. A large amount of memory is therefore saved, which in turn improves the speed of search.
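As a rough check of the saving claimed here (assuming, as stated above, that a conventional document address ID occupies 64 bits and the bitmap uses one bit per document):

```latex
% Memory saved for N = 10^6 search results when each 64-bit document
% address ID is replaced by a single bit in the bitmap:
\[
  \Delta = N \times (64 - 1)\,\text{bit}
         = 10^{6} \times 63\,\text{bit}
         = 6.3 \times 10^{7}\,\text{bit}
         \approx 7.9\,\text{MB}.
\]
```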
The above search engine (Search Engine) refers to a system that, according to certain strategies and using specific computer programs, collects information from the Internet, organizes and processes that information, provides retrieval services to users, and displays the information relevant to the user's search. Search engines include full-text indexes, directory indexes, meta search engines, vertical search engines, combined search engines, portal search engines, free link lists, and so on.
A search engine is composed of four parts: a crawler, an indexer, a retriever, and a user interface. The function of the crawler is to roam the Internet, discovering and collecting information. The function of the indexer is to understand the information collected by the crawler, extract index entries from it to represent the documents, and generate the index table of the document library. The function of the retriever is to quickly look up documents in the index library according to the user's query, evaluate the relevance between the documents and the query, rank the results to be output, and implement some user relevance feedback mechanism. The function of the user interface is to accept the user's query, display the query results, and provide the user relevance feedback mechanism.
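A minimal sketch of this four-part decomposition follows; the interface and method names are illustrative assumptions rather than terms taken from the patent:

```java
import java.util.List;

// Illustrative decomposition of a search engine into the four parts described above.
interface Crawler {
    List<String> collect();                              // roam the network, discover and gather documents
}

interface Indexer {
    void buildIndex(List<String> documents);             // extract index entries and build the index table
}

interface Retriever {
    List<SearchResult> query(String keyword, int topN);  // look up and rank matching documents
}

interface UserInterface {
    void showResults(List<SearchResult> results);        // display results and accept relevance feedback
}

// One search result as described in step S102: matching score, document address ID, shard.
record SearchResult(double matchingScore, long documentAddressId, int shard) { }
```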
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a search engine acceleration method provided by another embodiment of the present invention. As shown in Fig. 2, the above method can be applied in an intelligent terminal or a computer device; the above intelligent terminal includes, but is not limited to, devices such as a mobile phone, tablet computer, computer, or server, and may of course also be another device such as a smart watch or smart bracelet. As shown in Fig. 2, the method includes the following steps:
Step S201: the search engine receives a search keyword input by a user.
There are many ways to receive the search keyword input by the user in step S201. For example, in a preferred embodiment of the present invention, step S201 obtains the search keyword input by the user by means of keyboard input. The above keyboard can of course take different forms on different devices; for example, on a computer (including but not limited to a desktop computer and a notebook computer) the keyboard can be a physical keyboard, while on a tablet computer or a mobile phone the keyboard can be a virtual keyboard generated by software; this application does not limit the specific form of the keyboard. In another preferred embodiment of the present invention, the search keyword input by the user can be obtained in step S201 by means of voice input. In practical applications, the voice input can be obtained through a built-in microphone, or, in another practical application, through a microphone of a device connected to the smart device. Of course, in practical applications step S201 may also receive input in other ways, which are not listed one by one here.
Step S202: the search engine outputs the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, N being a positive integer of massive scale. The document address IDs are represented with a bitmap data structure; the internal data sequence numbers of the bitmap data structure correspond to the order of the search results, and a single bit indicates whether the document is stored at the position corresponding to that sequence number.
In step S202 the search results are the results obtained by the search engine; different search engines may produce different results. For example, the results returned by Baidu and by Google for the same query may differ.
In the above step the traditional document address ID is replaced by a bitmap data structure. An existing document address ID occupies 64 bits, whereas in this application each document address occupies only one bit: the position of the bit in the bitmap data structure indicates which document is meant, and the bit value indicates whether that document is stored, where 1 means stored and 0 means not stored. As in the example above, for 15 document addresses and the bit string 0111000111110101, the documents with sequence numbers 2, 3, 4, 8, 9, 10, 11, 12, 14, 15 are stored, and the documents with the other sequence numbers are not stored.
From the above description, because the search engine changes 64 bits to 1 bit per document address, the memory footprint is greatly reduced. Search result sets are typically on the order of millions; even counted at 1,000,000 results, replacing the 64-bit IDs with single bits saves roughly 6.3×10^7 bits of memory, so a large amount of memory is saved and the speed of search is improved.
Step S203: multithreaded tasks are created to process the document address IDs; an independent data space is allocated for each thread, and the computing tasks reuse that data space.
The implementation of step S203 may specifically be as follows:
An independent data space is allocated for each thread; after the computing task of a thread is completed, the computation objects are cleared and the next computing task is started. In addition, optionally a data space of the same size is allocated for each thread, so that the number of documents processed by each thread in the multithreaded processing is essentially the same; this avoids an uneven distribution of documents across threads and further improves speed.
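A minimal sketch of this scheme follows, assuming a fixed thread pool whose worker threads each hold an equally sized, reusable scratch buffer; the pool size, buffer size, and per-document processing are illustrative assumptions:

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DocIdWorkers {
    private static final int WORKSPACE_BYTES = 1 << 20;   // same-sized data space per thread (assumed: 1 MB)

    // One independent workspace per worker thread, reused by successive computing tasks.
    private static final ThreadLocal<byte[]> WORKSPACE =
            ThreadLocal.withInitial(() -> new byte[WORKSPACE_BYTES]);

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);   // multithreaded processing of document address IDs

        long[][] docIdBatches = new long[16][];                   // illustrative batches of document address IDs
        for (int i = 0; i < docIdBatches.length; i++) {
            docIdBatches[i] = new long[] {i, i + 1, i + 2};
        }

        for (long[] batch : docIdBatches) {
            pool.submit(() -> {
                byte[] scratch = WORKSPACE.get();                 // reuse this thread's data space
                process(batch, scratch);                          // run the computing task
                Arrays.fill(scratch, (byte) 0);                   // clear the completed computation, then take the next task
            });
        }
        pool.shutdown();
    }

    private static void process(long[] docIds, byte[] scratch) {
        // Placeholder for the per-document computation described in the text.
    }
}
```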
The technical solution of this other embodiment of the present invention, because 64 bits are changed to 1 bit per document address, greatly reduces the memory footprint; for result sets on the order of millions, even counted at 1,000,000 search results, roughly 6.3×10^7 bits of memory can be saved, so a large amount of memory is saved and the speed of search is improved. In addition, processing the documents with multiple threads improves the search speed further, so this embodiment has the additional advantage of further acceleration.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of a search engine acceleration method provided by a further embodiment of the present invention. As shown in Fig. 3, the above method can be applied in an intelligent terminal or a computer device; the above intelligent terminal includes, but is not limited to, devices such as a mobile phone, tablet computer, computer, or server, and may of course also be another device such as a smart watch or smart bracelet. As shown in Fig. 3, the method includes the following steps:
Step S301: the search engine receives a search keyword input by a user.
There are many ways to receive the search keyword input by the user in step S301. For example, in a preferred embodiment of the present invention, step S301 obtains the search keyword input by the user by means of keyboard input. The above keyboard can of course take different forms on different devices; for example, on a computer (including but not limited to a desktop computer and a notebook computer) the keyboard can be a physical keyboard, while on a tablet computer or a mobile phone the keyboard can be a virtual keyboard generated by software; this application does not limit the specific form of the keyboard. In another preferred embodiment of the present invention, the search keyword input by the user can be obtained in step S301 by means of voice input. In practical applications, the voice input can be obtained through a built-in microphone, or, in another practical application, through a microphone of a device connected to the smart device. Of course, in practical applications step S301 may also receive input in other ways, which are not listed one by one here.
Step S302: the search engine outputs the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, N being a positive integer of massive scale. The document address IDs are represented with a bitmap data structure; the internal data sequence numbers of the bitmap data structure correspond to the order of the search results, and a single bit indicates whether the document is stored at the position corresponding to that sequence number.
In step S302 the search results are the results obtained by the search engine; different search engines may produce different results. For example, the results returned by Baidu and by Google for the same query may differ.
In the above step the traditional document address ID is replaced by a bitmap data structure. An existing document address ID occupies 64 bits, whereas in this application each document address occupies only one bit: the position of the bit in the bitmap data structure indicates which document is meant, and the bit value indicates whether that document is stored, where 1 means stored and 0 means not stored. As in the example above, for 15 document addresses and the bit string 0111000111110101, the documents with sequence numbers 2, 3, 4, 8, 9, 10, 11, 12, 14, 15 are stored, and the documents with the other sequence numbers are not stored.
Step S303: all document information is obtained using a non-blocking I/O model; when the non-blocking I/O model finishes receiving document information, the central processing unit (CPU) is notified to process it, and when the CPU has processed all of the document information, the document information is stored into memory.
In step S303, non-blocking I/O allows the I/O-intensive operations and the CPU-intensive operations to be separated, which reduces the time the CPU spends waiting for I/O and thereby speeds up processing.
Under the blocking I/O model, if the amount of data that can be read from the network stream is less than the requested size, blocking I/O blocks right there. For example, it is known that 10 bytes of data are being sent, but only 8 bytes have arrived so far; the current thread then simply waits for the next byte to arrive, doing nothing else, until all 10 bytes have been read, and only then is the block released.
Under the non-blocking I/O model, if the amount of data that can be read from the network stream is less than the requested size, non-blocking I/O returns immediately. For example, it is known that 10 bytes of data are being sent, but only 8 bytes have arrived so far; the current thread reads those 8 bytes, returns immediately after reading them, and comes back to read again once the remaining two bytes have arrived.
From the above it can be seen that blocking I/O performs poorly. To build a web server with blocking I/O, a thread must be started to handle each request. With non-blocking I/O, by contrast, one or two threads are essentially enough, because the threads never block: a thread can receive some data for request A, then some data for request B, and so on, continuously moving between connections until all the data have been received.
The technical solution of this further embodiment of the present invention, because 64 bits are changed to 1 bit per document address, greatly reduces the memory footprint; for result sets on the order of millions, even counted at 1,000,000 search results, roughly 6.3×10^7 bits of memory can be saved, so a large amount of memory is saved and the speed of search is improved. In addition, this embodiment further improves the speed of search by using the non-blocking I/O model.
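A minimal sketch of such a non-blocking read loop using Java NIO follows; the host, port, and buffer size are illustrative assumptions, and the interrupt-driven CPU notification of step S303 is approximated here by a readiness selector:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class NonBlockingFetch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);                          // non-blocking I/O model
        channel.connect(new InetSocketAddress("127.0.0.1", 9200)); // assumed document source
        channel.register(selector, SelectionKey.OP_CONNECT);

        ByteBuffer buffer = ByteBuffer.allocate(8192);
        while (selector.select() > 0) {                            // wait until the channel is ready
            for (SelectionKey key : selector.selectedKeys()) {
                SocketChannel ch = (SocketChannel) key.channel();
                if (key.isConnectable() && ch.finishConnect()) {
                    key.interestOps(SelectionKey.OP_READ);         // connected; now only wait for readable data
                }
                if (key.isValid() && key.isReadable()) {
                    // Read whatever has arrived (possibly fewer bytes than requested) and return at once,
                    // instead of blocking until the full payload is available.
                    int n = ch.read(buffer);
                    if (n > 0) {
                        buffer.flip();                             // hand the received bytes to CPU-side processing here
                        buffer.clear();
                    } else if (n < 0) {                            // stream finished; store the document information
                        ch.close();
                        selector.close();
                        return;
                    }
                }
            }
            selector.selectedKeys().clear();
        }
    }
}
```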
Referring to Fig. 4, Fig. 4 shows a search engine device provided by an embodiment of the present invention. The device is shown in Fig. 4; for the definitions of the technical terms in the embodiment shown in Fig. 4, refer to the embodiments shown in Figs. 1, 2 and 3. The device includes:
A receiving unit 401, configured to receive a search keyword input by a user.
A search unit 402, configured to output the top N search results that satisfy the search keyword, the top N search results including: the matching score of each document, the document address ID, and the search engine shard, N being a positive integer of massive scale.
The document address IDs are represented with a bitmap data structure; the internal data sequence numbers of the bitmap data structure correspond to the order of the search results, and a single bit indicates whether the document is stored at the position corresponding to that sequence number.
Optionally, the device further includes:
A creating unit 403, configured to create multithreaded tasks to process the document address IDs.
An allocation unit 404, configured to allocate an independent data space for each thread and have the computing tasks reuse that data space.
The allocation unit 404 is specifically configured to allocate an independent data space for each thread, clear the computation objects after the computing task of a thread is completed, and start the next computing task.
The allocation unit 404 is specifically configured to allocate a data space of the same size for each thread.
Optionally, the above device further includes:
An acquiring unit 405, configured to obtain all document information using a non-blocking I/O model.
A storage unit 406, configured to, when the non-blocking I/O model finishes receiving the document information, notify the central processing unit (CPU) to process it, and, when the CPU has processed all of the document information, store the document information into memory.
Referring to Fig. 5, Fig. 5 is a schematic hardware structural diagram of a search device provided by an embodiment of the present invention. The above search device may specifically be a device such as a server, computer, or smartphone. As shown in Fig. 5, the search device 50 includes: a processor 501, a memory 502, a transceiver 503, and a bus 504. The transceiver 503 is used to send data to and receive data from external devices. The number of processors 501 in the search device 50 may be one or more. In some embodiments of the present application, the processor 501, the memory 502, and the transceiver 503 may be connected by a bus or in other ways. The memory 502 is used to store program code, and the processor 501 is used to call the program code stored in the memory 502 to implement the functions shown in Fig. 1, Fig. 2 and Fig. 3. For the meanings and examples of the terms involved in this embodiment, refer to the corresponding embodiments of Figs. 1, 2 and 3; details are not repeated here. It should be noted that the processor 501 here may be a single processing element or a collective term for multiple processing elements. For example, the processing element may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more digital signal processors (DSP) or one or more field-programmable gate arrays (FPGA).
The memory 503 may be a single storage device or a collective term for multiple storage elements, and is used to store executable program code or the parameters, data and the like required for the operation of the application running device. The memory 503 may include random-access memory (RAM) and may also include non-volatile memory, such as a magnetic disk memory or flash memory (flash).
The bus 504 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 5, but this does not mean that there is only one bus or only one type of bus.
The modules or sub-modules in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU, or by an ASIC (Application Specific Integrated Circuit).
It should be noted that, for the foregoing method embodiments, for the sake of brevity they are all expressed as a series of action combinations, but those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in a certain embodiment, refer to the related descriptions of the other embodiments.
The steps in the embodiments of the present invention can be reordered, combined, and deleted according to actual needs.
The units in the user terminal of the embodiments of the present invention can be combined, divided, and deleted according to actual needs.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and when executed it may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The search engine acceleration method and device disclosed by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core ideas. At the same time, for those of ordinary skill in the art, there will be changes in the specific implementations and the scope of application according to the ideas of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.
Claims (2)
1. A search engine acceleration method, characterized by comprising:
a search engine receiving a search keyword input by a user, and the search engine outputting the top N search results that satisfy the search keyword, the top N search results comprising: the matching score of each document, the document address ID, and the search engine shard, the N being a positive integer of massive scale, the massive scale being a positive integer whose value is greater than 1,000,000;
the document address IDs being represented with a bitmap data structure, the internal data sequence numbers of the bitmap data structure corresponding to the order of the search results, a single bit indicating whether the document is stored at the position corresponding to the sequence number, the document address ID occupying only one bit;
wherein the method further comprises:
obtaining all document information using a non-blocking I/O model; when the non-blocking I/O model finishes receiving the document information, notifying the central processing unit by interrupt to process it; when the central processing unit has processed all of the document information, storing the document information into memory, the non-blocking I/O being used to separate I/O-intensive operations from CPU-intensive operations;
the method further comprising:
creating multithreaded tasks to process the document address IDs, allocating an independent data space for each thread, and having the computing tasks reuse the data space, specifically:
allocating an independent data space of the same size for each thread, clearing the computing task after the computing task of a thread is completed, and starting the next computing task.
2. A search engine device, characterized by comprising:
a receiving unit, configured to receive a search keyword input by a user;
a search unit, configured to output the top N search results that satisfy the search keyword, the top N search results comprising: the matching score of each document, the document address ID, and the search engine shard, the N being a positive integer of massive scale, the massive scale being a positive integer whose value is greater than 1,000,000;
the document address IDs being represented with a bitmap data structure, the internal data sequence numbers of the bitmap data structure corresponding to the order of the search results, a single bit indicating whether the document is stored at the position corresponding to the sequence number, the document address ID occupying only one bit;
wherein the device further comprises:
an acquiring unit, configured to obtain all document information using a non-blocking I/O model;
a storage unit, configured to, when the non-blocking I/O model finishes receiving the document information, notify the central processing unit (CPU) by interrupt to process it, and, when the central processing unit has processed all of the document information, store the document information into memory, the non-blocking I/O being used to separate I/O-intensive operations from CPU-intensive operations;
a creating unit, configured to create multithreaded tasks to process the document address IDs;
an allocation unit, configured to allocate an independent data space for each thread and have the computing tasks reuse the data space, specifically: allocating an independent data space of the same size for each thread, clearing the computing task after the computing task of a thread is completed, and starting the next computing task.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201610858742 | 2016-09-28 | |
CN2016108587427 | 2016-09-28 | |
Publications (2)
Publication Number | Publication Date
---|---
CN106528623A | 2017-03-22
CN106528623B | 2018-05-22
Family
ID=58331772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201610878061.7A (granted as CN106528623B, Active) | Search engine acceleration method and device | 2016-09-28 | 2016-10-08
Country Status (1)
Country | Link
---|---
CN | CN106528623B
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108121815B (en) * | 2017-12-28 | 2022-03-11 | 深圳开思时代科技有限公司 | Automobile part query method, device and system, electronic equipment and medium |
- 2016-10-08: Application CN201610878061.7A filed in China (CN); granted as CN106528623B, status Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101295323A (en) * | 2008-06-30 | 2008-10-29 | 腾讯科技(深圳)有限公司 | Processing method and system for index updating |
CN104636407A (en) * | 2013-11-15 | 2015-05-20 | 腾讯科技(深圳)有限公司 | Parameter choice training and search request processing method and device |
Non-Patent Citations (2)
Title
---|
"Sharded bitmap index: an auxiliary index mechanism suitable for cloud data management" (分片位图索引：一种适用于云数据管理的辅助索引机制); Meng Biping et al.; Chinese Journal of Computers (计算机学报); 2012-11-15; Vol. 35, No. 11; pp. 2306-2316 *
"Research on vertical data sharding technology in a database acceleration engine" (数据库加速引擎中数据垂直分片技术研究); Huang He et al.; Computer Engineering (计算机工程); 2006-08-31; Vol. 32, No. 16; pp. 34-35, 51 *
Also Published As
Publication number | Publication date |
---|---|
CN106528623A (en) | 2017-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10585915B2 (en) | Database sharding | |
KR101661000B1 (en) | Systems and methods to enable identification of different data sets | |
CN108255958A (en) | Data query method, apparatus and storage medium | |
CN107704202B (en) | Method and device for quickly reading and writing data | |
CN110275864A (en) | Index establishing method, data query method and calculating equipment | |
CN104965826B (en) | Search method and retrieval device based on browser | |
WO2020250064A1 (en) | Context-aware data mining | |
CN109359237A (en) | It is a kind of for search for boarding program method and apparatus | |
CN112035529B (en) | Caching method, caching device, electronic equipment and computer readable storage medium | |
US20180129736A1 (en) | System to organize search and display unstructured data | |
CN112148701A (en) | File retrieval method and equipment | |
CN111367870A (en) | Method, device and system for sharing picture book | |
CN108171189A (en) | Video coding method, video coding device and electronic equipment | |
JP2021535473A (en) | Token matching in a large document corpus | |
CN112070550A (en) | Keyword determination method, device and equipment based on search platform and storage medium | |
CN107590248B (en) | Search method, search device, search terminal and computer-readable storage medium | |
CN106528623B (en) | A kind of search engine accelerating method and device | |
CN109614478A (en) | Construction method, key word matching method and the device of term vector model | |
CN112148865B (en) | Information pushing method and device | |
CN110287284B (en) | Semantic matching method, device and equipment | |
US20210034704A1 (en) | Identifying Ambiguity in Semantic Resources | |
CN110688223A (en) | Data processing method and related product | |
CN114741489A (en) | Document retrieval method, document retrieval device, storage medium and electronic equipment | |
CN111783440B (en) | Intention recognition method and device, readable medium and electronic equipment | |
CN111666449B (en) | Video retrieval method, apparatus, electronic device, and computer-readable medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 | Publication |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |