CN110245208B - Retrieval analysis method, device and medium based on big data storage - Google Patents
- Publication number
- CN110245208B (application number CN201910362509.3A)
- Authority
- CN
- China
- Prior art keywords
- field
- events
- searchable
- information
- subset
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a retrieval analysis method, device and medium based on big data storage, relating to the technical field of data storage analysis. The method comprises: receiving field query information; searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems; generating a subset of the corresponding field-searchable events; identifying fields in one or more events in the subset and determining the number of unique values of the fields; and displaying the selected field name and the number of unique values, the field name being the name of a field that can be referenced in the query information. According to conditions entered by a user, a set of searchable events is queried against specific query criteria over the massive data stored in each data center, and the queried field results and the number of unique values are displayed, thereby greatly improving retrieval speed and the user experience of searching large databases.
Description
Technical Field
The invention relates to the technical field of data storage analysis, in particular to a retrieval analysis method, a retrieval analysis device and a retrieval analysis medium based on big data storage.
Background
In a big data environment, quickly and accurately retrieving the information a user cares about, according to conditions the user provides, is a basic and important part of big data applications. With the continuous development of computer technology and the increasing degree of informatization, data volumes are growing rapidly, mass data storage and its applications are developing quickly, and big data is being applied ever more widely. For example, in network security, big data technology is used to analyze network attack behavior; in e-commerce, it is used to analyze users' shopping preferences or most-favored products; in urban construction, it is used to build smart cities that make travel more convenient. Big data technology therefore plays a positive role in building a conservation-oriented society, improving production efficiency, and so on.
However, as data volumes keep growing and big data applications keep developing, more and more data centers are used to store data at different service sites or provincial sites. In mass data analysis applications, data can currently be extracted from only a single data center, while the need to treat the data of all data centers as one integral data set and perform simple analyses on it, such as grouping, statistics and sorting, is increasingly evident. In big data applications, analyzing the massive data stored in the various data centers as a whole is therefore a necessary capability.
Disclosure of Invention
In view of the problems described in the background, the invention provides a retrieval analysis method, device and medium based on big data storage, so as to improve retrieval speed and the user experience of searching large databases.
In order to achieve the above object, the present invention provides a retrieval analysis method based on big data storage, and the method includes:
receiving field query information;
searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems;
generating a subset of the corresponding field-searchable events;
identifying fields in one or more events in the subset and determining the number of unique values of the fields;
displaying the selected field name and the number of the unique values, wherein the field name is a field name of a field that can be referenced in the query information.
Preferably, the field query information includes, but is not limited to: a value criterion for a field, and/or a keyword criterion, and/or a time range.
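The three kinds of criteria can be illustrated with a minimal Python sketch. The `FieldQuery` class, the `matches` function and the event layout (a dictionary with `timestamp`, `raw` and `fields` keys) are hypothetical names chosen for illustration and are not defined by the patent; equality matching for the value criterion and substring matching for the keyword criterion are likewise assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class FieldQuery:
    """Hypothetical container for the three optional criteria."""
    field_values: dict[str, str] = field(default_factory=dict)  # value criterion per field
    keyword: Optional[str] = None                                # keyword criterion
    time_range: Optional[tuple[datetime, datetime]] = None       # start inclusive, end exclusive


def matches(event: dict, query: FieldQuery) -> bool:
    """Return True if the event satisfies every criterion that is set."""
    # Value criterion: the field must be present in the event and equal the value.
    for name, wanted in query.field_values.items():
        if event["fields"].get(name) != wanted:
            return False
    # Keyword criterion: the raw event text must contain the keyword.
    if query.keyword is not None and query.keyword not in event["raw"]:
        return False
    # Time range: the event timestamp must fall inside the range.
    if query.time_range is not None:
        start, end = query.time_range
        if not (start <= event["timestamp"] < end):
            return False
    return True


# Illustrative event and query (made-up values).
event = {
    "timestamp": datetime(2019, 4, 30, 12, 0),
    "raw": "sshd login failed user=root status=failed",
    "fields": {"user": "root", "status": "failed"},
}
query = FieldQuery(
    field_values={"status": "failed"},
    keyword="sshd",
    time_range=(datetime(2019, 4, 30), datetime(2019, 5, 1)),
)
print(matches(event, query))
```

Because each criterion is optional, a query that sets none of them matches every event in this sketch.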
Preferably, the stored data includes a plurality of sets of field-searchable events, wherein:
each event in each set of field-searchable events is associated with a timestamp;
each event in each set of field-searchable events includes machine data reflecting activity in one or more information technology systems;
at least one event in each set of field-searchable events includes log data reflecting activity in one or more information technology systems;
at least one event in each set of field-searchable events includes unstructured data.
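As a rough illustration of such an event, the sketch below models a timestamped event that carries raw machine data and the fields found in it. The `Event` class, the `parse_event` helper and the assumed log layout (an ISO timestamp followed by free text with optional `key=value` pairs) are illustrative assumptions; the patent does not prescribe a log format or an extraction rule.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime

# Assumed log layout: "<ISO timestamp> <free text with optional key=value pairs>".
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


@dataclass
class Event:
    timestamp: datetime                                    # every event has a timestamp
    raw: str                                               # machine/log data, possibly unstructured
    fields: dict[str, str] = field(default_factory=dict)  # fields found in the raw text


def parse_event(line: str) -> Event:
    """Split a raw line into timestamp and text, then extract key=value fields."""
    stamp, _, text = line.partition(" ")
    return Event(
        timestamp=datetime.fromisoformat(stamp),
        raw=text,
        fields=dict(KV_PATTERN.findall(text)),
    )


event = parse_event("2019-04-30T12:00:00 sshd login failed user=root src=10.0.0.5")
print(event.fields)
```

A line with no `key=value` pairs still yields a valid event with an empty `fields` dictionary, which corresponds to an event that holds only unstructured data.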
Preferably, the method further comprises:
displaying information about two or more events in the subset that contain the field, wherein the two or more events are displayed in a sort order corresponding to the field.
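One possible reading of this ordered display is sketched below: events lacking the field are skipped and the rest are sorted by the field's value. The function name `events_with_field` and the dictionary event layout are hypothetical, and plain string comparison is an assumed ordering, since the patent does not specify one.

```python
def events_with_field(subset: list[dict], name: str) -> list[dict]:
    """Keep events that contain the field, ordered by that field's value."""
    having = [e for e in subset if name in e["fields"]]
    # Values are compared as strings here; a numeric field would need a cast.
    return sorted(having, key=lambda e: e["fields"][name])


# Illustrative subset (made-up values).
subset = [
    {"id": 1, "fields": {"status": "500", "host": "web-2"}},
    {"id": 2, "fields": {"host": "web-1"}},
    {"id": 3, "fields": {"status": "200", "host": "web-3"}},
]
ordered = events_with_field(subset, "status")
print([e["id"] for e in ordered])
```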
Preferably, the value criterion applies to a field that is present in one or more events in a set of field-searchable events.
Preferably, the keyword criterion is specifically a criterion requiring matching events to contain a particular keyword.
The invention also provides a retrieval analysis device based on big data storage, which comprises:
a query receiver for receiving field query information;
a query executor for searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems; generating a subset of the corresponding field-searchable events; and identifying fields in one or more events in the subset and determining the number of unique values of the fields;
a data store for storing a plurality of sets of field-searchable events;
and a field selector for selecting and displaying a field name and the number of unique values, wherein the field name is the name of a field that can be referenced in the query information.
Preferably, the device further comprises: a data store for storing a plurality of sets of field-searchable events.
Preferably, the field selector is further configured to display information about one or more events in the subset of field-searchable events.
The present invention also proposes a computer-readable storage medium on which computer program instructions are stored, which, when executed by a processor, implement the big data storage based retrieval analysis method.
The invention provides a retrieval analysis method, device and medium based on big data storage. According to conditions entered by a user, a set of searchable events is queried against specific query criteria over the massive data stored in each data center, and the queried field results and the number of unique values are displayed, thereby greatly improving retrieval speed and the user experience of searching large databases.
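The patent does not say how results from several data centers are combined. The sketch below shows one way it could be done, under the assumption that each data center returns, per field, the set of values it observed. The function name `merge_unique_values` and the sample data are hypothetical.

```python
from collections import defaultdict


def merge_unique_values(per_center: list[dict[str, set[str]]]) -> dict[str, int]:
    """Union per-data-center value sets, then count unique values per field."""
    merged: dict[str, set[str]] = defaultdict(set)
    for center_result in per_center:
        for name, values in center_result.items():
            merged[name] |= values
    return {name: len(values) for name, values in merged.items()}


# Illustrative per-center results (made-up values); "alice" appears in both.
center_a = {"user": {"root", "alice"}, "status": {"failed"}}
center_b = {"user": {"alice", "bob"}, "status": {"failed", "ok"}}
counts = merge_unique_values([center_a, center_b])
print(counts)
```

The union is the important step: adding the per-center counts together would report four users instead of three, because a value seen in two data centers would be counted twice.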
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from the structures shown in them without creative effort.
FIG. 1 is a flow chart of a big data storage based retrieval analysis method in an embodiment of the present invention;
FIG. 2 is a diagram illustrating a field searchable event in accordance with one embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a search analysis apparatus based on big data storage according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a computer storage medium according to an embodiment of the present invention;
The implementation, functional features and advantages of the present invention are further explained below with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that any directional indications (such as up, down, left, right, front and back) in the embodiments of the present invention are used only to explain the relative positions and movements of the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
In addition, descriptions such as "first" and "second" in the embodiments of the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the technical features indicated. A feature defined as "first" or "second" may therefore explicitly or implicitly include at least one such feature. The technical solutions of the various embodiments may be combined with each other, provided that the combination can be realized by a person skilled in the art; where a combination is contradictory or cannot be realized, it should be considered not to exist and falls outside the protection scope of the present invention.
System architecture: the execution device of the invention may be a device on a local area network or a device on the Internet, and may be a personal computer, a server, a workstation, or the like.
The invention provides a retrieval analysis method based on big data storage.
In a first preferred embodiment of the present invention, as shown in FIG. 1, the method comprises:
S100, receiving field query information;
In this embodiment, the execution device receives the target field query information entered by a user, where the field query information includes, but is not limited to: a value criterion for a field, and/or a keyword criterion, and/or a time range. Specifically: the value criterion applies to a field that is present in one or more events in a set of field-searchable events; the keyword criterion requires matching events to contain a particular keyword; and the query used to search the set of field-searchable events is associated with a time range within which matching events must fall;
S200, searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems;
In this embodiment, events are retrieved from the big data stored in each data center.
In this embodiment, a plurality of sets of field-searchable events are stored, wherein each event in each set of field-searchable events is associated with a timestamp; each event in each set includes machine data reflecting activity in one or more information technology systems; at least one event in each set includes log data reflecting activity in one or more information technology systems; and at least one event in each set includes unstructured data;
For example, as shown in FIG. 2, the stored set of searchable events includes events 1 through N, each of the N events being associated with a timestamp. Event 1 includes fields, machine data, log data and unstructured data; event N includes fields, machine data and unstructured data. By associating events with timestamps and setting a time range in the query, key information of the events is extracted from each subset, which enables targeted querying and searching and improves retrieval speed;
S300, generating a subset of the corresponding field-searchable events;
S400, identifying fields in one or more events in the subset, and determining the number of unique values of the fields;
S500, displaying the selected field name and the number of unique values, wherein the field name is the name of a field that can be referenced in the query information.
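Steps S200 through S500 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `field_summary` and `run_query`, the dictionary event layout, the predicate standing in for the field query information, and the `name (count)` display format are all hypothetical.

```python
from collections import defaultdict


def field_summary(subset):
    """S400: map each field name in the subset to its number of unique values."""
    values = defaultdict(set)
    for event in subset:
        for name, value in event["fields"].items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


def run_query(stored_events, predicate):
    """S200-S500: search, build the subset, count unique values, format for display."""
    subset = [e for e in stored_events if predicate(e)]    # S200 and S300
    summary = field_summary(subset)                        # S400
    lines = [f"{name} ({count})" for name, count in sorted(summary.items())]  # S500
    return subset, lines


# Illustrative stored events (made-up values).
stored = [
    {"raw": "login failed", "fields": {"user": "root", "src": "10.0.0.5"}},
    {"raw": "login failed", "fields": {"user": "alice", "src": "10.0.0.5"}},
    {"raw": "login ok", "fields": {"user": "root"}},
]
subset, lines = run_query(stored, lambda e: "failed" in e["raw"])
print(lines)
```

The counts are computed over the subset only, so the third event, which does not match the query, does not contribute to the totals.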
In an embodiment of the present invention, information about two or more events in the subset that contain the field is displayed, the two or more events being displayed in a sort order corresponding to the field. The number of unique values corresponding to the field is displayed together with the field name by which the field can be referenced in a query (for specifying a field criterion that further filters the subset of events), and information about one or more events in the subset is also displayed. When a user's selection of a field name is received, at least one previously displayed event that does not contain the selected field is hidden.
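The hiding behavior after a field name is selected might look like the sketch below, which splits the currently displayed events into those that stay visible and those that are hidden. The function name `apply_field_selection` and the event layout are hypothetical.

```python
def apply_field_selection(displayed: list[dict], selected_field: str) -> tuple[list[dict], list[dict]]:
    """Return (still_visible, hidden) after the user selects a field name."""
    still_visible = [e for e in displayed if selected_field in e["fields"]]
    hidden = [e for e in displayed if selected_field not in e["fields"]]
    return still_visible, hidden


# Illustrative displayed events (made-up values).
displayed = [
    {"id": 1, "fields": {"user": "root", "src": "10.0.0.5"}},
    {"id": 2, "fields": {"user": "alice"}},
    {"id": 3, "fields": {"src": "10.0.0.9"}},
]
visible, hidden = apply_field_selection(displayed, "src")
print([e["id"] for e in visible], [e["id"] for e in hidden])
```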
In an embodiment of the invention, a histogram is displayed that indicates how many events in the subset are associated with a timestamp falling within each of a plurality of time ranges.
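The histogram counts can be computed by assigning each event timestamp to a bucket. The sketch below assumes equal-width, consecutive time ranges starting at a given instant; the function name `timestamp_histogram` and its parameters are hypothetical, and the patent does not state how the time ranges are chosen.

```python
from collections import Counter
from datetime import datetime, timedelta


def timestamp_histogram(subset, start, bucket, n_buckets):
    """Count events whose timestamp falls in each of n_buckets consecutive ranges."""
    counts = Counter()
    for event in subset:
        # Floor division of two timedeltas gives the integer bucket index.
        index = (event["timestamp"] - start) // bucket
        if 0 <= index < n_buckets:
            counts[index] += 1
    return [counts[i] for i in range(n_buckets)]


# Illustrative subset (made-up timestamps); the last event lies outside the three buckets.
start = datetime(2019, 4, 30, 0, 0)
subset = [
    {"timestamp": datetime(2019, 4, 30, 0, 5)},
    {"timestamp": datetime(2019, 4, 30, 0, 40)},
    {"timestamp": datetime(2019, 4, 30, 2, 15)},
    {"timestamp": datetime(2019, 4, 30, 5, 0)},
]
histogram = timestamp_histogram(subset, start, timedelta(hours=1), 3)
print(histogram)
```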
In an embodiment of the invention, at least a second number is displayed, indicating how many unique values a second field, present in one or more events of the subset, has within the subset.
The invention also provides a retrieval analysis device based on big data storage, whose hardware structure includes a display, a processor and a storage.
In a second preferred embodiment of the present invention, as shown in FIG. 3, the device comprises:
a query receiver for receiving field query information;
a query executor for searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems; generating a subset of the corresponding field-searchable events; and identifying fields in one or more events in the subset and determining the number of unique values of the fields;
a data store for storing a plurality of sets of field-searchable events;
and a field selector for selecting and displaying a field name and the number of unique values, wherein the field name is the name of a field that can be referenced in the query information.
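The division of work among the four components can be pictured with the following sketch. The class names mirror the components named above, but every method name, the event layout and the output format are hypothetical, and the query receiver is reduced to a single keyword criterion to keep the example short.

```python
from collections import defaultdict


class DataStore:
    """Holds a plurality of sets of field-searchable events."""
    def __init__(self, event_sets):
        self.event_sets = event_sets

    def all_events(self):
        for event_set in self.event_sets:
            yield from event_set


class QueryReceiver:
    """Turns user input into a predicate (only a keyword criterion is modeled)."""
    def receive(self, keyword):
        return lambda event: keyword in event["raw"]


class QueryExecutor:
    """Searches the store, builds the subset and counts unique field values."""
    def __init__(self, store):
        self.store = store

    def execute(self, predicate):
        subset = [e for e in self.store.all_events() if predicate(e)]
        unique = defaultdict(set)
        for event in subset:
            for name, value in event["fields"].items():
                unique[name].add(value)
        return subset, {name: len(values) for name, values in unique.items()}


class FieldSelector:
    """Formats field names and unique-value counts for display."""
    def render(self, counts):
        return [f"{name}: {count} unique value(s)" for name, count in sorted(counts.items())]


# Illustrative wiring (made-up events).
store = DataStore([
    [{"raw": "cpu high host=a", "fields": {"host": "a"}}],
    [{"raw": "cpu high host=b", "fields": {"host": "b"}},
     {"raw": "disk ok host=a", "fields": {"host": "a"}}],
])
predicate = QueryReceiver().receive("cpu")
subset, counts = QueryExecutor(store).execute(predicate)
lines = FieldSelector().render(counts)
print(lines)
```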
In the embodiments of the present invention, the technical details of each specific device have been set forth in the first preferred embodiment, and will not be repeated here;
The invention also provides a computer-readable storage medium.
In a third preferred embodiment of the present invention, as shown in FIG. 4, computer program instructions are stored on the medium which, when executed by a processor, implement the retrieval analysis method based on big data storage, for example:
S100, receiving field query information;
S200, searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems;
S300, generating a subset of the corresponding field-searchable events;
S400, identifying fields in one or more events in the subset, and determining the number of unique values of the fields;
S500, displaying the selected field name and the number of unique values, wherein the field name is the name of a field that can be referenced in the query information.
In the embodiments of the present invention, the technical details of each specific device have been set forth in the first preferred embodiment, and will not be repeated here;
in describing embodiments of the present invention, it should be noted that any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and that the scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processing module-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory, a read only memory, an erasable programmable read only memory, an optical fiber device, and a portable compact disc read only memory. Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
The above description is only a preferred embodiment of the present invention and is not intended to limit its scope. All equivalent structural changes made using the contents of this specification and drawings, and all direct or indirect applications in other related technical fields, fall within the protection scope of the present invention.
Claims (9)
1. A retrieval analysis method based on big data storage is characterized by comprising the following steps:
receiving field query information;
searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems;
generating a subset of the corresponding field-searchable events;
identifying fields in one or more events in the subset and determining the number of unique values of the fields;
displaying the selected field name and the number of the unique values, wherein the field name is a field name of a field that can be referenced in the query information;
the stored data comprises a plurality of sets of field-searchable events, wherein:
each event in each set of field-searchable events is associated with a timestamp;
each event in each set of field-searchable events includes machine data reflecting activity in one or more information technology systems;
at least one event in each set of field-searchable events includes log data reflecting activity in one or more information technology systems;
at least one event in each set of field-searchable events includes unstructured data.
2. The big data storage based retrieval analysis method of claim 1, wherein the field query information includes, but is not limited to: a value criterion for a field, and/or a keyword criterion, and/or a time range.
3. The big data storage based retrieval analysis method according to claim 1, further comprising:
displaying information about two or more events in the subset that contain the field, wherein the two or more events are displayed in a sort order corresponding to the field.
4. The big-data-storage-based retrieval analysis method of claim 2, wherein the value criterion applies to a field that exists in one or more events in a set of field-searchable events.
5. The retrieval analysis method based on big data storage according to claim 2, wherein the criteria of the keyword are specifically: criteria requiring matching events to have a particular keyword.
6. A retrieval analysis apparatus based on big data storage, applied to the retrieval analysis method according to any one of claims 1 to 5, characterized by comprising:
a query receiver for receiving field query information;
a query executor for searching the stored data for corresponding field-searchable events according to the field query information, wherein the field-searchable events are activity events relating to the security or performance of one or more information technology systems; generating a subset of the corresponding field-searchable events; and identifying fields in one or more events in the subset and determining the number of unique values of the fields;
and a field selector for selecting and displaying a field name and the number of unique values, wherein the field name is the name of a field that can be referenced in the query information.
7. The big data storage based retrieval analysis device of claim 6, further comprising: a data store for storing a plurality of sets of field searchable events.
8. The big-data-storage-based retrieval analysis device of claim 6, wherein the field selector is further configured to display information about one or more events in the subset of field-searchable events.
9. A computer-readable storage medium on which computer program instructions are stored, which, when executed by a processor, implement the method of any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910362509.3A CN110245208B (en) | 2019-04-30 | 2019-04-30 | Retrieval analysis method, device and medium based on big data storage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910362509.3A CN110245208B (en) | 2019-04-30 | 2019-04-30 | Retrieval analysis method, device and medium based on big data storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110245208A | 2019-09-17 |
CN110245208B | 2022-05-24 |
Family
ID=67883581
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201910362509.3A (CN110245208B) | Active | 2019-04-30 | 2019-04-30 | Retrieval analysis method, device and medium based on big data storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110245208B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20240143482A1 (en) * | 2022-10-31 | 2024-05-02 | Bitdrift, Inc | Systems and methods for providing a timeline view of log information for a client application |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102081669A (en) * | 2011-01-24 | 2011-06-01 | 哈尔滨工业大学 | Hierarchical retrieval method for multi-source remote sensing resource heterogeneous databases |
CN104636468A (en) * | 2015-02-10 | 2015-05-20 | 广州供电局有限公司 | Data query analysis method and system |
CN107122358A (en) * | 2016-02-24 | 2017-09-01 | 阿里巴巴集团控股有限公司 | Mix querying method and equipment |
CN108062384A (en) * | 2017-12-13 | 2018-05-22 | 阿里巴巴集团控股有限公司 | The method and apparatus of data retrieval |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8375032B2 (en) * | 2009-06-25 | 2013-02-12 | University Of Tennessee Research Foundation | Method and apparatus for predicting object properties and events using similarity-based information retrieval and modeling |
US20130124496A1 (en) * | 2011-11-11 | 2013-05-16 | Microsoft Corporation | Contextual promotion of alternative search results |
US9516052B1 (en) * | 2015-08-01 | 2016-12-06 | Splunk Inc. | Timeline displays of network security investigation events |
CN108874990A (en) * | 2018-06-12 | 2018-11-23 | 亓富军 | A kind of method and system extracted based on power technology journal article unstructured data |
CN108984718A (en) * | 2018-07-10 | 2018-12-11 | 四川汇源吉迅数码科技有限公司 | A kind of digital content interactive system and exchange method based on big data technology |
CN109033387B (en) * | 2018-07-26 | 2021-09-24 | 广州大学 | Internet of things searching system and method fusing multi-source data and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | CP01 | Change in the name or title of a patent holder | Address after: 510000 building 13, 100 martyrs Road, Yuexiu District, Guangzhou, Guangdong. Patentee after: Institute of intelligent manufacturing, Guangdong Academy of Sciences. Address before: 510000 building 13, 100 martyrs Road, Yuexiu District, Guangzhou, Guangdong. Patentee before: GUANGDONG INSTITUTE OF INTELLIGENT MANUFACTURING |