CN107341217A - Data acquisition method and device - Google Patents
Data acquisition method and device
- Publication number
- CN107341217A (application CN201710501301.6A)
- Authority
- CN
- China
- Prior art keywords
- elasticsearch
- data
- search engine
- methods
- data acquisition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- User Interface Of Digital Computer (AREA)
- Stored Programmes (AREA)
Abstract
The present invention provides a data acquisition method and device. The method includes: providing the configured data query conditions, together with the rules for parsing the data returned by ElasticSearch, to a predefined data acquisition component; and calling the data acquisition component to initiate a scroll query request to the ElasticSearch search engine and obtaining the parsed result that ElasticSearch returns for that request. By calling a custom data acquisition component to obtain large volumes of data from ElasticSearch, the proposed method and device make data acquisition more reliable, ordered, real-time, and free of duplicates than using the ElasticSearch Scroll API directly.
Description
Technical field
The present invention relates to the field of software engineering, and more particularly to a data acquisition method and device.
Background
ElasticSearch is an excellent open-source distributed search engine. Besides search, ElasticSearch is also a powerful tool for log storage and for offline data analysis and mining. With ElasticSearch, the logs that online applications write to disk while running can be collected in real time and stored in an ElasticSearch cluster.
Logs stored in an ElasticSearch cluster serve two application scenarios. In the first, developers use a purpose-built central log platform: by setting search conditions on the platform they retrieve the various logs output by online applications, which helps them understand how those applications are running and quickly locate problems. In the second, a Storm cluster pulls logs from the ElasticSearch cluster in bulk and in real time to perform complex aggregate computations, such as distributed call chain computation. Both scenarios require obtaining large amounts of data from the ElasticSearch cluster quickly, continuously, and in real time. ElasticSearch provides the Scroll API (scroll search) so that it can execute large-batch data queries quickly and efficiently.
However, the Scroll API (scroll search) is suited to processing large amounts of data and not to real-time user requests. Moreover, whenever an application initiates a new Scroll API call, ElasticSearch returns data from the beginning, so the client receives duplicate data. Using the Scroll API provided by ElasticSearch directly therefore causes the following problem for an application: the application cannot be guaranteed to obtain large batches of data reliably, in order, in real time, and without duplicates.
Summary of the invention
To overcome the problem that directly using the Scroll API provided by ElasticSearch cannot obtain large batches of data reliably, in order, in real time, and without duplicates, the present invention provides a data acquisition method and device.
According to one aspect of the present invention, a data acquisition method is provided, including:
S1, providing the configured data query conditions, and the rules for parsing the data returned by ElasticSearch, to a predefined data acquisition component;
S2, calling the data acquisition component to initiate a scroll query request to the ElasticSearch search engine, and obtaining the parsed result returned by ElasticSearch for the scroll query request.
The method further includes, before step S1:
S0, implementing the data acquisition component on the basis of the ElasticSearch Scroll API.
The data acquisition component specifically includes a prepare-query interface class and a scroll query component class.
The prepare-query interface class includes a prepare method and a parseResult method. The prepare method provides the data acquisition component with the query conditions set by the developer, and the parseResult method provides the data acquisition component with the developer-defined rules for parsing the data obtained from ElasticSearch.
The scroll query component class includes a doScrollSearch method, which obtains data from ElasticSearch by way of the ElasticSearch Scroll API. The input parameter of the doScrollSearch method is an instance of the prepare-query interface class.
Step S1 further includes:
S11, instantiating the prepare-query interface class to obtain an instance object of it;
S12, passing the instance object to the doScrollSearch method of the scroll query component.
Step S2 further includes:
S21, calling back the prepare method inside the doScrollSearch method to obtain the query conditions set by the developer, and initiating a scroll query request to ElasticSearch;
S22, upon obtaining the result returned by ElasticSearch for the scroll query request, calling back the parseResult method to parse the result, obtaining parsed ElasticSearch data;
S23, returning the parsed ElasticSearch data.
The scroll query request includes: the query conditions set by the developer, the request context ID, the offset query parameter, and the index accessed last time.
After the step in S21 of initiating the scroll query request to ElasticSearch, the method further includes:
having ElasticSearch sort the data in ascending order by the offset field.
Step S22 further includes:
if it is determined that ElasticSearch can no longer obtain data for the request context identified by the request context ID, initiating a new scroll query request to ElasticSearch.
According to another aspect of the present invention, a data acquisition device is provided, including a memory, a processor, and a bus.
The processor and the memory communicate with each other through the bus.
The memory stores program instructions executable by the processor, and the processor calls the program instructions in the memory to perform the data acquisition method described above.
According to a further aspect of the present invention, a non-transitory computer-readable storage medium is provided. The storage medium stores computer instructions that cause a computer to perform the data acquisition method described above.
By calling a custom data acquisition component to obtain large volumes of data from ElasticSearch, the data acquisition method and device proposed by the present invention make data acquisition more reliable, ordered, real-time, and free of duplicates than using the ElasticSearch Scroll API directly.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the data acquisition method provided according to one embodiment of the present invention;
Fig. 2 is a schematic flowchart of step S2 of Fig. 1, provided according to another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data acquisition device provided by another embodiment of the present invention.
Detailed description
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following embodiments illustrate the present invention and do not limit its scope.
Fig. 1 is a schematic flowchart of the data acquisition method provided according to one embodiment of the present invention. The method includes:
S1, providing the configured data query conditions, and the rules for parsing the data returned by ElasticSearch, to a predefined data acquisition component;
S2, calling the data acquisition component to initiate a scroll query request to the ElasticSearch search engine, and obtaining the parsed result returned by ElasticSearch for the scroll query request.
Specifically, to obtain large batches of data from ElasticSearch quickly, continuously, and in real time for a specific business scenario, the application calls a predefined data acquisition component, which initiates the scroll query request (Scroll API request) to ElasticSearch. The data acquisition component, rather than a direct Scroll API call as in the prior art, interacts with ElasticSearch to obtain the data that satisfies the query conditions, and it can parse the obtained ElasticSearch data into the type of object that the business scenario requires. Before calling the predefined data acquisition component, the developer only needs to provide it with the configured data query conditions and the rules for parsing the data returned by ElasticSearch; calling the component then yields the parsed ElasticSearch data.
The scroll query request that the data acquisition component sends to ElasticSearch consists of the developer's query condition, of type QueryBuilder, to which the component adds the offset (data offset value) query parameter, the scrollId parameter (request context ID), and the index accessed last time. By contrast, a scroll query request sent through the Scroll API in the existing approach generally contains only the query conditions set by the developer. The data acquisition component relies on the scrollId parameter to keep data acquisition real-time and free of duplicates, and on the offset mechanism to keep it in order.
By calling a custom data acquisition component to obtain large volumes of data from ElasticSearch, the data acquisition method provided by this embodiment makes data acquisition more reliable, ordered, real-time, and free of duplicates than using the ElasticSearch Scroll API directly.
In another embodiment of the present invention, on the basis of the above embodiment, the method further includes, before step S1:
S0, implementing the data acquisition component on the basis of the ElasticSearch Scroll API.
The data acquisition component specifically includes a prepare-query interface class and a scroll query component class.
The prepare-query interface class includes a prepare method and a parseResult method. The prepare method provides the data acquisition component with the query conditions set by the developer, and the parseResult method provides the data acquisition component with the developer-defined rules for parsing the data obtained from ElasticSearch.
The scroll query component class includes a doScrollSearch method, which obtains data from ElasticSearch by way of the ElasticSearch Scroll API. The input parameter of the doScrollSearch method is an instance of the prepare-query interface class.
Specifically, the data acquisition component is built on the ElasticSearch Scroll API. Implementing it involves constructing the prepare-query interface class (the IPrepareSearch<T> interface) and constructing the scroll query component class (the ScrollSearchComponent class).
1) Constructing the IPrepareSearch<T> interface
This interface lets the application (that is, the developer) set the query conditions and the rules for parsing the data obtained from ElasticSearch.
The IPrepareSearch<T> interface is defined as follows:
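The interface listing itself is not reproduced in this text. A minimal sketch consistent with the surrounding description might look like the following. The method signatures are inferred from the prose, and `SearchRequestVo` appears here only as an empty placeholder; both are assumptions rather than the patent's actual listing.

```java
import java.util.Map;

/** Placeholder for the request object described in the text; its fields are omitted here. */
class SearchRequestVo {
}

/**
 * Sketch of the prepare-query interface. T is the business type that the
 * developer wants each ElasticSearch record converted into.
 */
interface IPrepareSearch<T> {

    /** Called back by the component to obtain the developer's query settings. */
    SearchRequestVo prepare();

    /** Called back once per record; source is the raw field map of that record. */
    T parseResult(Map<String, Object> source);
}
```

Keeping both callbacks on one interface means the component needs only a single argument to learn both what to query and how to convert each record.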
The interface consists of two methods. The first is the prepare method, through which the application developer supplies the prepared query conditions to the data acquisition component. A SearchRequestVo class is defined to describe the query data that the developer provides to the component. The second is the parseResult method, through which the developer sets the rules for parsing the data obtained from ElasticSearch. Its input parameter, source, represents one record that the data acquisition component has obtained from ElasticSearch. Its return type is generic and is supplied by the developer when implementing the IPrepareSearch<T> interface. The type of source is Map<String,Object>, a fairly raw data type, whereas what the developer ultimately needs is an object of the type required by the specific business scenario; the ElasticSearch data therefore has to be parsed by parseResult.
SearchRequestVo is defined as follows:
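The class listing is missing from this text. The sketch below is a reconstruction under stated assumptions: the text names the four fields but gives no types except for queryBuilder, so `String` and `long` are guesses, and `QueryBuilder` is a local stand-in interface so that the example compiles without the ElasticSearch client library.

```java
/** Stand-in for org.elasticsearch.index.query.QueryBuilder (assumption: lets the sketch compile alone). */
interface QueryBuilder {
}

/** Sketch of the request object returned by prepare(). */
class SearchRequestVo {
    /** ID of the request context created by ElasticSearch; null before the first call. */
    private String scrollId;
    /** Lifetime of the request context in milliseconds; 180000 is the example value from the text. */
    private long scrollWindow = 180000L;
    /** Offset of the last record obtained under the previous request context (type assumed). */
    private long offset;
    /** Query condition constructed by the developer. */
    private QueryBuilder queryBuilder;

    public String getScrollId() { return scrollId; }
    public void setScrollId(String scrollId) { this.scrollId = scrollId; }

    public long getScrollWindow() { return scrollWindow; }
    public void setScrollWindow(long scrollWindow) { this.scrollWindow = scrollWindow; }

    public long getOffset() { return offset; }
    public void setOffset(long offset) { this.offset = offset; }

    public QueryBuilder getQueryBuilder() { return queryBuilder; }
    public void setQueryBuilder(QueryBuilder queryBuilder) { this.queryBuilder = queryBuilder; }
}
```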
Here, scrollId is the ID of the request context that ElasticSearch creates for each scroll query request (Scroll API request). When an application calls the ElasticSearch Scroll API for the first time, ElasticSearch creates a request context for it. The request context is time-limited: it expires after a specified period. While the request context is still valid, if the application passes the scrollId to ElasticSearch when it calls the Scroll API again, ElasticSearch continues from the result of the previous query and returns the remaining data, which ensures that no data is obtained twice.
scrollWindow is the expiration time that ElasticSearch sets for the request context, in milliseconds. The value 180000 is the expiration duration set by the developer; it is only an example and can be set to other values as needed. scrollWindow has a default value, which applies if the developer does not set it.
offset is the offset value of the last record that the application accessed during the validity period of the previous request context. The offset plays an important role: it ensures that the application does not obtain data from ElasticSearch out of order. ElasticSearch does not automatically add an offset field to the data stored in it. Application developers must therefore ensure that the data they store in the ElasticSearch cluster carries an offset field, and that the field is globally unique and monotonically increasing.
The type of queryBuilder is defined by a class in the Java client library provided by ElasticSearch; it represents the query conditions constructed by the developer.
2) Constructing the ScrollSearchComponent class
The ScrollSearchComponent class is the core class of the data acquisition component. Through its doScrollSearch method, the developer ultimately obtains the data in the ElasticSearch cluster by way of the ElasticSearch Scroll API. The signature of the doScrollSearch method is:
public <T> SearchResponseVo<T> doScrollSearch(IPrepareSearch<T> prepareSearch)
The input parameter of this method is of the IPrepareSearch<T> interface type. Inside doScrollSearch, the prepare method of prepareSearch is called back to determine the query parameters of the current request, and the parseResult method of prepareSearch is called automatically to parse each record obtained from ElasticSearch. Finally, the result of the query is returned to the application. The query result is defined by SearchResponseVo, whose content is as follows:
Here scrollId, offset, and scrollWindow have the same meaning as in the SearchRequestVo class, and content holds the data obtained by calling the parseResult method of prepareSearch on the ElasticSearch data.
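The SearchResponseVo listing is likewise absent from this text. A minimal sketch built from the four field names mentioned above is given here; the field types, and in particular `List<T>` for content, are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the result object returned by doScrollSearch. */
class SearchResponseVo<T> {
    /** Request context ID to send with the next call. */
    private String scrollId;
    /** Lifetime of the request context, in milliseconds. */
    private long scrollWindow;
    /** Offset of the last record in content (type assumed). */
    private long offset;
    /** Records after parseResult has been applied (List<T> is assumed). */
    private List<T> content = new ArrayList<>();

    public String getScrollId() { return scrollId; }
    public void setScrollId(String scrollId) { this.scrollId = scrollId; }

    public long getScrollWindow() { return scrollWindow; }
    public void setScrollWindow(long scrollWindow) { this.scrollWindow = scrollWindow; }

    public long getOffset() { return offset; }
    public void setOffset(long offset) { this.offset = offset; }

    public List<T> getContent() { return content; }
    public void setContent(List<T> content) { this.content = content; }
}
```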
In another embodiment of the present invention, on the basis of the above embodiment, step S1 further includes:
S11, instantiating the prepare-query interface class to obtain an instance object of it;
S12, passing the instance object to the doScrollSearch method of the scroll query component class.
Specifically, once the data acquisition component has been implemented, and before calling it to obtain data from ElasticSearch, the developer must implement the prepare-query interface class (the IPrepareSearch<T> interface) and instantiate it to obtain an instance, thereby implementing the prepare and parseResult methods of IPrepareSearch<T>. Through the prepare method, the developer provides the data acquisition component with a query condition, of type QueryBuilder, written according to the business rules. Through the parseResult method, the developer tells the component how to convert data of type Map<String,Object> into the domain model type that the developer wants.
Step S12 means passing the instance object as the input parameter of the doScrollSearch method of the scroll query component class (the ScrollSearchComponent class).
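Step S11 can be illustrated with a small implementation of the interface. Everything below is a hedged sketch: `LogEntry` and `LogEntryPrepareSearch` are hypothetical names, the compact `SearchRequestVo` uses plain fields for brevity, and the query condition is a placeholder string where real code would hold a QueryBuilder from the ElasticSearch Java client.

```java
import java.util.Map;

// Minimal stand-ins for the types described in the text (assumed shapes).
class SearchRequestVo {
    String scrollId;
    long scrollWindow = 180000L;
    long offset;
    Object queryBuilder; // a QueryBuilder from the ElasticSearch Java client in real use
}

interface IPrepareSearch<T> {
    SearchRequestVo prepare();
    T parseResult(Map<String, Object> source);
}

/** Hypothetical domain type for one log record. */
class LogEntry {
    final long offset;
    final String message;

    LogEntry(long offset, String message) {
        this.offset = offset;
        this.message = message;
    }
}

/** S11: a concrete implementation whose instance is later handed to doScrollSearch (S12). */
class LogEntryPrepareSearch implements IPrepareSearch<LogEntry> {
    private final SearchRequestVo request = new SearchRequestVo();

    @Override
    public SearchRequestVo prepare() {
        // Placeholder condition; real code would build this with the client's query builders.
        request.queryBuilder = "level:ERROR";
        return request;
    }

    @Override
    public LogEntry parseResult(Map<String, Object> source) {
        // Convert the raw field map into the domain type the business code wants.
        long offset = ((Number) source.get("offset")).longValue();
        String message = String.valueOf(source.get("message"));
        return new LogEntry(offset, message);
    }
}
```

With such a class in place, step S12 amounts to passing `new LogEntryPrepareSearch()` as the argument of doScrollSearch.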
Fig. 2 is a schematic flowchart of step S2 in another embodiment of the present invention, on the basis of the above embodiment. Step S2 includes:
S21, calling back the prepare method inside the doScrollSearch method to obtain the query conditions set by the developer, and initiating a scroll query request to ElasticSearch;
S22, upon obtaining the result returned by ElasticSearch for the scroll query request, calling back the parseResult method to parse the result, obtaining parsed ElasticSearch data;
S23, returning the parsed ElasticSearch data.
Specifically, the step of calling the predefined data acquisition component to initiate a scroll query request to ElasticSearch, and obtaining the parsed result returned by ElasticSearch for that request, proceeds as follows.
The data acquisition component obtains the query request set by the developer by calling back the prepare method, and then initiates the scroll query request to ElasticSearch from inside the component. As described in the above embodiment, the return type of prepare is SearchRequestVo, which contains four parameters: the request context ID scrollId; the expiration duration scrollWindow of the request context identified by that ID; the offset value offset of the last record obtained; and the query condition queryBuilder set by the developer. By calling back prepare, the data acquisition component obtains all of these parameters. Once the query conditions are ready, the component initiates the scroll query request to ElasticSearch.
When the data acquisition component initiates a scroll query request to ElasticSearch for the first time, scrollId and offset are empty or hold default values. ElasticSearch creates a time-limited request context and returns the scrollId associated with it. When ElasticSearch finishes the query and returns data to the data acquisition component, the component retains the offset field of the last record obtained. When the component sends the same scroll query request to the search engine again (with the same scrollId value as the previous request), and the scrollId is still valid, ElasticSearch starts returning data according to the offset field, which ensures that the data returned across multiple scroll query requests is continuous and free of duplicates. If the request context corresponding to the scrollId is no longer valid, ElasticSearch notifies the data acquisition component that the scrollId is invalid.
The scrollId and offset values used by the prepare method are updated automatically through the interaction between the data acquisition component and ElasticSearch. That interaction is carried out through the Java client provided by ElasticSearch, which accesses the port of the ElasticSearch server over TCP.
When the data acquisition component obtains the result returned by ElasticSearch for the request just issued, the data type of the result is Map<String,Object>. The component calls back the parseResult method, passing it the unparsed result returned by ElasticSearch, and the result is parsed according to the specific parsing rules set by the developer. Finally, the component returns the ElasticSearch data parsed by parseResult to the caller, that is, the developer.
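The callback flow of steps S21 to S23 can be sketched without a running cluster. In the example below, `ScrollBackend` and `RawPage` are hypothetical seams standing in for the ElasticSearch Java client, the value objects use plain fields for brevity, and the automatic update of scrollId and offset is modeled by writing back into the request object, which assumes that prepare() returns the same object on every call. It is an illustration of the described flow, not the patent's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SearchRequestVo { String scrollId; long scrollWindow = 180000L; long offset; Object queryBuilder; }

class SearchResponseVo<T> { String scrollId; long scrollWindow; long offset; List<T> content = new ArrayList<>(); }

interface IPrepareSearch<T> {
    SearchRequestVo prepare();
    T parseResult(Map<String, Object> source);
}

/** Hypothetical raw page: what one scroll round trip yields before parsing. */
class RawPage {
    final String scrollId;
    final List<Map<String, Object>> hits;

    RawPage(String scrollId, List<Map<String, Object>> hits) {
        this.scrollId = scrollId;
        this.hits = hits;
    }
}

/** Hypothetical seam that hides the ElasticSearch Java client. */
interface ScrollBackend {
    RawPage scroll(SearchRequestVo request);
}

class ScrollSearchComponent {
    private final ScrollBackend backend;

    ScrollSearchComponent(ScrollBackend backend) {
        this.backend = backend;
    }

    public <T> SearchResponseVo<T> doScrollSearch(IPrepareSearch<T> prepareSearch) {
        // S21: call back prepare() to obtain the developer's settings, then issue the request.
        SearchRequestVo request = prepareSearch.prepare();
        RawPage page = backend.scroll(request);

        // S22: call back parseResult() for every raw record and track the last offset seen.
        SearchResponseVo<T> response = new SearchResponseVo<>();
        response.scrollId = page.scrollId;
        response.scrollWindow = request.scrollWindow;
        response.offset = request.offset;
        for (Map<String, Object> source : page.hits) {
            response.content.add(prepareSearch.parseResult(source));
            // Every stored record is required to carry an offset field.
            response.offset = ((Number) source.get("offset")).longValue();
        }

        // Carry the context ID and last offset into the next call.
        request.scrollId = response.scrollId;
        request.offset = response.offset;

        // S23: return the parsed data to the caller.
        return response;
    }
}
```

Hiding the client behind a small interface is a design choice of this sketch; it lets the callback order be exercised with an in-memory backend.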
The whole of step S2 takes place inside the doScrollSearch method of the ScrollSearchComponent class.
Based on the above embodiment, the scroll query request includes: the query conditions set by the developer, the request context ID, the offset query parameter, and the index accessed last time.
The scroll query request is also of type QueryBuilder. Its content is the developer's query condition queryBuilder, with the offset query parameter, the scrollId parameter, and the last-accessed index added. ElasticSearch uses the content of this request to find the data on its servers that satisfies the query conditions.
Based on the above embodiment, after the step in S21 of initiating the scroll query request to ElasticSearch, the method further includes:
having ElasticSearch sort the data in ascending order by the offset field.
Specifically, each time an application initiates a new Scroll API call, ElasticSearch returns data from the beginning, which causes the client to receive duplicate data. To solve this problem, the data acquisition component requires the application developer to ensure that the data stored in ElasticSearch contains an offset field that is globally unique and monotonically increasing (many schemes for meeting this requirement exist in the industry). Each time the application issues a new Scroll API request, the data acquisition component asks ElasticSearch to sort the data in ascending order by the offset field, and it retains the offset field of the last record in the data set obtained. This ensures that each new round of Scroll API requests continues from where the previous round left off.
Asking ElasticSearch to sort the data by offset on every new Scroll API request guarantees the ordering of the data. The data acquisition component relies on ElasticSearch and the offset mechanism to guarantee continuity and the absence of duplicates.
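The ordering rule can be illustrated in memory. The sketch below sorts records in ascending order by their offset field and returns only those after the last offset seen; `OffsetPager` is a hypothetical helper, and offsets are assumed to be positive so that 0 can mean "nothing read yet". Against a real cluster the same effect would presumably come from a range condition on offset plus an ascending sort in the request, which is an assumption about how the offset query parameter is applied.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** In-memory illustration of "sort ascending by offset and resume after the last offset seen". */
class OffsetPager {

    /** Returns up to pageSize records whose offset is greater than lastOffset, in ascending order. */
    static List<Map<String, Object>> nextPage(List<Map<String, Object>> stored, long lastOffset, int pageSize) {
        // Sort a copy so the caller's list is left untouched.
        List<Map<String, Object>> sorted = new ArrayList<>(stored);
        sorted.sort(Comparator.comparingLong(OffsetPager::offsetOf));

        List<Map<String, Object>> page = new ArrayList<>();
        for (Map<String, Object> doc : sorted) {
            if (offsetOf(doc) > lastOffset && page.size() < pageSize) {
                page.add(doc);
            }
        }
        return page;
    }

    /** Reads the offset field, which every stored record is required to carry. */
    static long offsetOf(Map<String, Object> doc) {
        return ((Number) doc.get("offset")).longValue();
    }
}
```

Because the offset is globally unique and monotonically increasing, remembering only the last offset is enough to resume without gaps or repeats.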
Based on the above embodiment, step S22 further includes:
if it is determined that ElasticSearch can no longer obtain data for the request context identified by the request context ID, initiating a new scroll query request to ElasticSearch.
Specifically, the results returned by ElasticSearch Scroll API requests reflect the state of the index at the time the initial search request was made. It works like a snapshot: subsequent changes to documents (insertions, updates, or deletions) affect only later requests. In other words, once ElasticSearch has created a request context for a new Scroll API request, any data subsequently added to, deleted from, or updated in ElasticSearch has no effect on the scroll requests issued under that request context. So that the data acquisition component can also obtain, in real time, the data added after the scroll request context was created, the component is implemented as follows: if it determines that ElasticSearch can no longer obtain data for the request context through the scrollId (which indicates that all of the data has been fetched), the component initiates a new round of scroll query requests to ElasticSearch.
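The "new round when the context is exhausted" rule can be sketched against a fake index that freezes a snapshot whenever a context is opened. `FakeIndex` and `RollingReader` are hypothetical names, records are reduced to bare offsets, and offsets are assumed to be positive; the code only illustrates the snapshot behavior and the re-open step described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for an index with snapshot-style scroll contexts. */
class FakeIndex {
    private final List<Long> offsets = new ArrayList<>();             // stored records, by offset
    private final Map<String, List<Long>> contexts = new HashMap<>(); // scrollId -> frozen snapshot
    private int nextContext = 1;

    void add(long offset) {
        offsets.add(offset);
    }

    /** Opens a new context: freezes the records with offset > lastOffset, ascending. */
    String open(long lastOffset) {
        List<Long> snapshot = new ArrayList<>();
        for (long o : offsets) {
            if (o > lastOffset) {
                snapshot.add(o);
            }
        }
        Collections.sort(snapshot);
        String id = "ctx-" + nextContext++;
        contexts.put(id, snapshot);
        return id;
    }

    /** Returns up to size records from the context; records added after open() are not visible. */
    List<Long> scroll(String scrollId, int size) {
        List<Long> snapshot = contexts.get(scrollId);
        List<Long> page = new ArrayList<>();
        while (!snapshot.isEmpty() && page.size() < size) {
            page.add(snapshot.remove(0));
        }
        return page;
    }
}

/** Client-side state: opens a new context whenever the current one is exhausted. */
class RollingReader {
    private final FakeIndex index;
    private String scrollId;
    private long lastOffset;

    RollingReader(FakeIndex index) {
        this.index = index;
    }

    List<Long> read(int size) {
        if (scrollId == null) {
            scrollId = index.open(lastOffset);
        }
        List<Long> page = index.scroll(scrollId, size);
        if (page.isEmpty()) {
            // Context exhausted: start a new round so records added since the snapshot become visible.
            scrollId = index.open(lastOffset);
            page = index.scroll(scrollId, size);
        }
        if (!page.isEmpty()) {
            lastOffset = page.get(page.size() - 1);
        }
        return page;
    }
}
```

Opening the new context from the retained lastOffset is what keeps the new round from replaying records that were already delivered.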
Fig. 3 is a schematic structural diagram of a data acquisition device provided by another embodiment of the present invention. The device includes a memory 31, a processor 32, and a bus 33.
The processor 32 and the memory 31 communicate with each other through the bus 33.
The memory 31 stores program instructions executable by the processor 32, and the processor 32 calls the program instructions in the memory 31 to perform the data acquisition method described in the above embodiments, for example: providing the configured data query conditions, and the rules for parsing the data returned by ElasticSearch, to a predefined data acquisition component; and calling the data acquisition component to initiate a scroll query request to ElasticSearch and obtaining the parsed result returned by ElasticSearch for the scroll query request.
A further embodiment of the present invention provides a non-transitory computer-readable storage medium. The storage medium stores computer instructions that cause a computer to perform the data acquisition method described in the above embodiments, for example: providing the configured data query conditions, and the rules for parsing the data returned by ElasticSearch, to a predefined data acquisition component; and calling the data acquisition component to initiate a scroll query request to ElasticSearch and obtaining the parsed result returned by ElasticSearch for the scroll query request.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
The embodiments of the data acquisition device described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the description of the embodiments above, a person skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the essence of the above technical solution, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in parts of the embodiments.
By calling a custom data acquisition component to obtain large volumes of data from ElasticSearch, the data acquisition method and device proposed in the above embodiments make data acquisition more reliable, ordered, real-time, and free of duplicates than using the ElasticSearch Scroll API directly.
Finally, the methods described above are only preferred embodiments and are not intended to limit the scope of protection of the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
- 1. A data acquisition method, characterized by comprising: S1, providing the set data query conditions and the rules for parsing the data returned by ElasticSearch to a predefined data acquisition component; S2, calling the data acquisition component to initiate a scroll query request to the search engine ElasticSearch, and obtaining the parsed result returned by the search engine ElasticSearch for the scroll query request.
- 2. The method according to claim 1, characterized by further comprising, before step S1: S0, implementing the data acquisition component based on the ElasticSearch Scroll API.
- 3. The method according to claim 2, characterized in that the data acquisition component specifically comprises a prepare query interface class and a scroll query component class; the prepare query interface class includes a prepare method and a parseResult method, the prepare method being used to provide the data acquisition component with the query conditions set by the developer, and the parseResult method being used to provide the data acquisition component with the rules, set by the developer, for parsing the data obtained from the search engine ElasticSearch; the scroll query component class includes a doScrollSearch method, the doScrollSearch method being used to obtain the data in the search engine ElasticSearch by way of the ElasticSearch Scroll API, and the input parameter of the doScrollSearch method being an instance of the prepare query interface class.
- 4. The method according to claim 3, characterized in that step S1 further comprises: S11, instantiating the prepare query interface class to obtain an instance object of the prepare query interface class; S12, passing the instance object to the doScrollSearch method of the scroll query component.
- 5. The method according to claim 3, characterized in that step S2 further comprises: S21, calling back the prepare method within the doScrollSearch method to obtain the query conditions set by the developer, and initiating a scroll query request to the search engine ElasticSearch; S22, upon obtaining the result returned by the search engine ElasticSearch for the scroll query request, calling back the parseResult method to parse the returned result, obtaining the parsed ElasticSearch data; S23, returning the parsed ElasticSearch data.
- 6. The method according to claim 5, characterized in that the scroll query request includes: the query conditions set by the developer, a request context ID, an offset query parameter, and the index last accessed.
- 7. The method according to claim 5, characterized by further comprising, after the step in S21 of initiating the scroll query request to the search engine ElasticSearch: causing the search engine ElasticSearch to sort the data in ascending order by the offset field.
- 8. The method according to claim 6, characterized in that step S22 further comprises: if it is learned that the search engine ElasticSearch cannot obtain the data corresponding to the request context according to the request context ID, initiating a new scroll query request to the search engine ElasticSearch.
- 9. A data acquisition device, characterized by comprising a memory, a processor, and a bus, wherein the processor and the memory communicate with each other via the bus; the memory stores program instructions executable by the processor, and the processor calls the program instructions in the memory to perform the method according to any one of claims 1 to 8.
- 10. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions, the computer instructions causing a computer to perform the method according to any one of claims 1 to 8.
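The recovery behaviour in claims 6 to 8 can be read as a resume-on-expiry loop, sketched below in Python under stated assumptions. The claims only say that the request carries an offset query parameter, that data is sorted in ascending order by offset, and that a new scroll request is issued when the request context can no longer be found. Resuming strictly after the last offset already received is an interpretation made here, and `ContextExpired`, `FakeEngine`, and `scroll_all` are hypothetical names, not ElasticSearch APIs.

```python
from typing import Any, Dict, List


class ContextExpired(Exception):
    """Raised by the fake engine when a request context ID is no longer known."""


class FakeEngine:
    """In-memory stand-in for the search engine; contexts expire after a few pages."""

    def __init__(self, docs: List[Dict[str, Any]], page_size: int,
                 pages_per_context: int) -> None:
        # Ascending sort on the offset field, as in claim 7.
        self.docs = sorted(docs, key=lambda d: d["offset"])
        self.page_size = page_size
        self.pages_per_context = pages_per_context
        self.contexts: Dict[str, Dict[str, Any]] = {}
        self.started = 0  # number of scroll requests initiated

    def start_scroll(self, conditions: Dict[str, Any], after_offset: int):
        # `conditions` is accepted but ignored by this fake engine.
        self.started += 1
        scroll_id = f"ctx-{self.started}"
        remaining = [d for d in self.docs if d["offset"] > after_offset]
        self.contexts[scroll_id] = {"docs": remaining, "pos": 0, "served": 0}
        return self.next_page(scroll_id)

    def next_page(self, scroll_id: str):
        ctx = self.contexts.get(scroll_id)
        if ctx is None or ctx["served"] >= self.pages_per_context:
            self.contexts.pop(scroll_id, None)
            raise ContextExpired(scroll_id)
        batch = ctx["docs"][ctx["pos"]:ctx["pos"] + self.page_size]
        ctx["pos"] += len(batch)
        ctx["served"] += 1
        return {"scroll_id": scroll_id, "hits": batch}


def scroll_all(engine: FakeEngine, conditions: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Fetch every hit once, in offset order, surviving expired contexts."""
    results: List[Dict[str, Any]] = []
    last_offset = -1  # assumed sentinel: all real offsets are >= 0
    page = engine.start_scroll(conditions, after_offset=last_offset)
    while page["hits"]:
        for hit in page["hits"]:
            results.append(hit)
            last_offset = hit["offset"]
        try:
            page = engine.next_page(page["scroll_id"])
        except ContextExpired:
            # Context lost: issue a new scroll request that resumes after the
            # last offset seen, so nothing is skipped or repeated.
            page = engine.start_scroll(conditions, after_offset=last_offset)
    return results


engine = FakeEngine([{"offset": n} for n in (3, 0, 6, 1, 5, 2, 4)],
                    page_size=2, pages_per_context=2)
results = scroll_all(engine, {"level": "error"})
print([d["offset"] for d in results])  # [0, 1, 2, 3, 4, 5, 6]
```

Because the data is sorted on a monotonically increasing field, the last offset received is enough state to restart a scroll without tracking individual document IDs.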
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710501301.6A CN107341217B (en) | 2017-06-27 | 2017-06-27 | Data acquisition method and equipment |
PCT/CN2017/120216 WO2019000897A1 (en) | 2017-06-27 | 2017-12-29 | Data acquisition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710501301.6A CN107341217B (en) | 2017-06-27 | 2017-06-27 | Data acquisition method and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341217A (en) | 2017-11-10 |
CN107341217B (en) | 2020-02-07 |
Family
ID=60221638
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710501301.6A Active CN107341217B (en) | 2017-06-27 | 2017-06-27 | Data acquisition method and equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107341217B (en) |
WO (1) | WO2019000897A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110457189A (en) * | 2019-07-02 | 2019-11-15 | 平安科技(深圳)有限公司 | A kind of blog management method and system, relevant device of application program |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103399887A (en) * | 2013-07-19 | 2013-11-20 | 蓝盾信息安全技术股份有限公司 | Query and statistical analysis system for mass logs |
US20160203548A1 (en) * | 2007-02-09 | 2016-07-14 | Xcira, Inc. | Integrated auctioning environment platform |
CN106126731A (en) * | 2016-07-01 | 2016-11-16 | 百势软件(北京)有限公司 | A kind of method and device obtaining Elasticsearch paged data |
CN106528797A (en) * | 2016-11-10 | 2017-03-22 | 上海轻维软件有限公司 | DSL query method based on Elasticsearch |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107341217B (en) * | 2017-06-27 | 2020-02-07 | 武汉斗鱼网络科技有限公司 | Data acquisition method and equipment |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019000897A1 (en) * | 2017-06-27 | 2019-01-03 | 武汉斗鱼网络科技有限公司 | Data acquisition method and device |
CN113407785A (en) * | 2021-06-11 | 2021-09-17 | 西北工业大学 | Data processing method and system based on distributed storage system |
CN113407785B (en) * | 2021-06-11 | 2023-02-28 | 西北工业大学 | Data processing method and system based on distributed storage system |
Also Published As
Publication number | Publication date |
---|---|
WO2019000897A1 (en) | 2019-01-03 |
CN107341217B (en) | 2020-02-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104838377B (en) | It is handled using mapping reduction integration events | |
EP3502928A1 (en) | Intelligent natural language query processor | |
US7958104B2 (en) | Context based data searching | |
Bosco et al. | Discovering automatable routines from user interaction logs | |
US8443346B2 (en) | Server evaluation of client-side script | |
US8418142B2 (en) | Architecture for data validation | |
EP2469420A1 (en) | CEP engine and method for processing CEP queries | |
US10922282B2 (en) | On-demand collaboration user interfaces | |
US10599654B2 (en) | Method and system for determining unique events from a stream of events | |
EP1811447A1 (en) | Declarative adaptation of software entities stored in an object repository | |
US7836429B2 (en) | Data synchronization mechanism for change-request-management repository interoperation | |
CN110990447B (en) | Data exploration method, device, equipment and storage medium | |
CN109656963A (en) | Metadata acquisition methods, device, equipment and computer readable storage medium | |
Alchin | Pro Django | |
CN107341217A (en) | A kind of data capture method and equipment | |
Gilmore | Beginning PHP and MySQL 5: From novice to professional | |
CN109299913B (en) | Employee salary scheme generation method and device | |
CN108268468A (en) | The analysis method and system of a kind of big data | |
CN107085613A (en) | Enter the filter method and device of library file | |
CN108733543A (en) | A kind of method, apparatus of log analysis, electronic equipment and readable storage medium storing program for executing | |
Orlovskyi et al. | Enterprise architecture modeling support based on data extraction from business process models. | |
CN109344173A (en) | Data managing method and device, data structure | |
US10983989B2 (en) | Issue rank management in an issue tracking system | |
US7499932B2 (en) | Accessing data in an interlocking trees data structure using an application programming interface | |
US8875137B2 (en) | Configurable mass data portioning for parallel processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||