CN111695000A - Multi-source big data loading method and system - Google Patents
- Publication number
- CN111695000A
- Authority
- CN
- China
- Prior art keywords
- loading
- storage
- rule
- storage server
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a multi-source big data loading method and system. The method uses a first recording unit, a classifier and an extractor. The first recording unit records the storage paths of all storage nodes in a storage server and the attribute information of the storage entities. The classifier links at least one relationship between the storage paths in the first recording unit and the attribute information of the storage entities. The extractor extracts expressions based on the relationship, from which a first loading rule and a second loading rule are established. A cloud database determines the attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so as to search for the corresponding entity object in the original data set of the storage server. The invention also provides a system and a matching method. By comparing the content loaded by the user with the first loading rule and the second loading rule obtained according to the storage rule, the corresponding entity object can be accurately searched for in the original data set of the storage server.
Description
Technical Field
The invention relates to the field of computer technology, in particular to big data loading, and specifically to a multi-source big data loading method and system.
Background
With the fusion of big data, data types have become diverse, and storing a storage object first requires that the data be stored correctly and be easy to retrieve. Storing customer data, loading production plans, financial statements, company plans and production orders, searching data and so on all depend on big data loading, and the loaded data needs to reflect the searched results directly. At present, most big data loading approaches load only the terms the user searched for, so when the user's terms are wrong or deviate from the intended ones, the results the customer needs cannot be returned in time. Meanwhile, on the stored-data side, tree-based storage can only show a single tree result under traditional search, and cannot fully meet users' search needs.
Disclosure of Invention
The invention aims to provide a multi-source big data loading method and system that solve the problems described in the Background.
To achieve this purpose, the invention provides the following technical solution:
A multi-source big data loading method, characterized by comprising the following steps:
Setting at least one storage server, wherein the storage server comprises at least one first recording unit, the first recording unit is used for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
providing a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
providing an extractor, the extractor extracting the expression based on the relationship,
selecting a screening rule to identify repeated items in the expression according to the expression, deleting the repeated items to establish a first loading rule,
analyzing the similar expression rules in the first loading rule, deleting the similar expression rules to establish a second loading rule,
the first loading rule and the second loading rule are saved in memory,
the memory is saved to a cloud database,
the user data terminal inputs and receives the user's operation command,
the loading server obtains the loading command from the user data terminal,
the load command is transmitted to the cloud database,
the cloud database determines attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so that a corresponding entity object is searched in an original data set of the storage server.
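The rule-building steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record format, the expression string format, the use of exact string equality as the screening rule, and the 0.8 similarity threshold are all assumptions, since the source does not define them. "Deleting the similar expression rules" is read here as keeping one representative of each group of similar expressions.

```python
from difflib import SequenceMatcher

# Hypothetical (storage path, attribute information) pairs, as the first
# recording unit might hold them. The source does not define a record format.
records = [
    ("/finance/2020/report.xlsx", "financial statement"),
    ("/finance/2020/report.xlsx", "financial statement"),      # exact repeat
    ("/finance/2020/report_v2.xlsx", "financial statements"),  # near repeat
    ("/production/orders/0001.csv", "production order"),
]

def extract_expressions(records):
    """Extractor: turn each path/attribute relationship into one expression."""
    return [f"{attr} -> {path}" for path, attr in records]

def build_first_rule(expressions):
    """Assumed screening rule: exact equality. Repeats are dropped, order kept."""
    return list(dict.fromkeys(expressions))

def build_second_rule(first_rule, threshold=0.8):
    """Keep an expression only if it is not similar to one already kept."""
    kept = []
    for expr in first_rule:
        if all(SequenceMatcher(None, expr, k).ratio() < threshold for k in kept):
            kept.append(expr)
    return kept

first_rule = build_first_rule(extract_expressions(records))
second_rule = build_second_rule(first_rule)
```

Under these assumptions the first loading rule holds three expressions and the second holds two, because the `report_v2.xlsx` expression is treated as similar to the `report.xlsx` one.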
Preferably, the information searched by the user, which is received by the user data terminal, is split into one or more of connection words, specific terms and pictures, and is stored in the cloud database.
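As a rough sketch of this splitting step, the snippet below sorts whitespace-separated tokens into the three categories. The connection-word list, the file-extension test for pictures, and the names `SplitQuery` and `split_query` are hypothetical; the source does not say how each category is recognized.

```python
import re
from dataclasses import dataclass, field

# Assumed, illustrative list of connection words; not taken from the source.
CONNECTION_WORDS = {"and", "or", "of", "for", "in", "with", "the"}
# Assumption: a picture is referenced by a file name with an image extension.
PICTURE_PATTERN = re.compile(r"\.(png|jpe?g|gif|bmp)$", re.IGNORECASE)

@dataclass
class SplitQuery:
    connection_words: list = field(default_factory=list)
    specific_terms: list = field(default_factory=list)
    pictures: list = field(default_factory=list)

def split_query(text):
    """Split the user's search text into the three categories."""
    result = SplitQuery()
    for token in text.split():
        if PICTURE_PATTERN.search(token):
            result.pictures.append(token)
        elif token.lower() in CONNECTION_WORDS:
            result.connection_words.append(token.lower())
        else:
            result.specific_terms.append(token)
    return result

q = split_query("production orders and financial statements for 2020 logo.png")
```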
Preferably, the method for searching the entity object in the original data set of the storage server is as follows:
the cloud database compares the one or more connection words, specific terms and pictures into which the user's search was split with the established second loading rule; if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server,
if no identical or similar loading item exists after the comparison with the second loading rule, the cloud database compares the split connection words, specific terms and pictures with the established first loading rule; if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server,
and if no identical or similar loading item exists after the comparison with the first loading rule, the corresponding entity object in the original data set of the storage server is loaded by fuzzy query, based on the recording habits and loading-related history recorded by the cloud database.
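The three-stage search described above (second loading rule, then first loading rule, then fuzzy query) might look like the sketch below. Representing each rule and the history as a dictionary from loading item to entity key is an assumption, as are the cutoffs and the use of Python's `difflib.get_close_matches` as a stand-in for the "same or similar" test.

```python
from difflib import get_close_matches

def find_load_item(term, rule, cutoff=0.8):
    """Return the closest same-or-similar loading item in a rule, or None."""
    matches = get_close_matches(term, list(rule), n=1, cutoff=cutoff)
    return matches[0] if matches else None

def lookup(terms, second_rule, first_rule, history, dataset):
    """Try the second rule, then the first rule, then a looser fuzzy query."""
    for tier, rule in (("second rule", second_rule), ("first rule", first_rule)):
        for term in terms:
            item = find_load_item(term, rule)
            if item is not None:
                return tier, dataset[rule[item]]
    # Fallback: fuzzy query over the recorded history, with a lower cutoff.
    for term in terms:
        item = find_load_item(term, history, cutoff=0.5)
        if item is not None:
            return "fuzzy query", dataset[history[item]]
    return "no match", None

# Hypothetical data: entity keys, rules and history are illustrative only.
dataset = {"e1": "2020 financial statement", "e2": "production order 0001",
           "e3": "customer list"}
second_rule = {"financial statement": "e1"}
first_rule = {"financial statement": "e1", "financial statements": "e1",
              "production order": "e2"}
history = {"customer data": "e3"}

r1 = lookup(["financial statment"], second_rule, first_rule, history, dataset)
r2 = lookup(["production order"], second_rule, first_rule, history, dataset)
r3 = lookup(["customer"], second_rule, first_rule, history, dataset)
```

In this sketch the misspelled "financial statment" is resolved by the second rule, "production order" falls through to the first rule, and "customer" is only found by the fuzzy query over the history.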
Preferably, the display mode of the entity object is as follows:
acquiring a corresponding entity object in an original data set of a storage server as a first display result by using a second loading rule;
acquiring a corresponding entity object in an original data set of a storage server as a second display result by using a first loading rule;
and acquiring a corresponding entity object in the original data set of the storage server by using the recording habit and the loading related history recorded by the cloud database as a third display result.
Preferably, the expression further includes an internal node address of the storage data under the corresponding storage node determined based on the relationship.
Preferably, the expressions are sequentially extracted in a parent-level encoding and child-level data set mode.
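The source does not spell out what "parent-level encoding and child-level data set" means. One possible reading, sketched here purely as an assumption, is that each node carries a code prefixed by its parent's code, and that a parent is extracted before the data sets of its children. The node layout and the name `extract_in_order` are hypothetical.

```python
def extract_in_order(node, parent_code=""):
    """Yield (code, name, datasets) for a node first, then for its children."""
    code = parent_code + node["code"]
    yield code, node["name"], list(node.get("datasets", []))
    for child in node.get("children", []):
        # Child codes are prefixed with the parent's full code.
        yield from extract_in_order(child, parent_code=code + ".")

# Hypothetical storage tree; codes and names are illustrative only.
tree = {
    "code": "1", "name": "finance", "datasets": [],
    "children": [
        {"code": "1", "name": "2020", "datasets": ["report.xlsx"]},
        {"code": "2", "name": "2021", "datasets": ["budget.xlsx"]},
    ],
}

expressions = list(extract_in_order(tree))
```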
The invention also provides a multi-source big data loading system, which comprises
A first recording unit for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
an extractor for extracting an expression based on the relationship,
a filter for selecting a filtering rule to identify a duplicate term in the expression according to the expression,
a first loading tag for deleting the repeated items to establish a first loading rule,
a second loading tag for analyzing the similar expression rules in the first loading rule and deleting the similar expression rules to establish a second loading rule,
a memory storing the first loading tag and the second loading tag,
a user data terminal for inputting and receiving the operation command of the user,
the loading server acquires a loading command from the user data terminal and transmits the loading command to the cloud database;
and the cloud database receives the loading command, starts a loading actuator to determine the attribute information corresponding to the loaded data through at least one of the first loading tag and the second loading tag so as to search the corresponding entity object in the original data set of the storage server.
Preferably, the classifier links the storage paths in a tree network.
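A simple way to picture storage paths linked as a tree is a nested dictionary keyed by path segment. The sketch below is only an illustration under that assumption; the function name `build_path_tree` and the slash-separated path format are not from the source.

```python
def build_path_tree(paths):
    """Link slash-separated storage paths into a nested-dictionary tree."""
    root = {}
    for path in paths:
        node = root
        for part in path.strip("/").split("/"):
            # Reuse an existing branch or create a new one for this segment.
            node = node.setdefault(part, {})
    return root

# Hypothetical storage paths, for illustration only.
path_tree = build_path_tree([
    "/finance/2020/report.xlsx",
    "/finance/2021/budget.xlsx",
    "/production/orders/0001.csv",
])
```

Paths that share a prefix, such as the two under `finance`, end up under the same branch, so shared parents are stored once.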
Compared with the prior art, the invention has the beneficial effects that:
In the invention, the content loaded by the user is split into one or more of connection words, specific terms and pictures and compared with the first loading rule and the second loading rule obtained according to the storage rule, so that the corresponding entity object can be accurately searched for in the original data set of the storage server.
When the user fails to provide accurate loading elements, fuzzy query can still load the results the user wants, based on the user's recording habits and loading-related history.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a general block diagram of the system of the present invention.
Detailed Description
Refer to FIGS. 1-2. The invention also provides a multi-source big data loading system, which comprises
A first recording unit for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
an extractor for extracting an expression based on the relationship,
a filter for selecting a filtering rule to identify a duplicate term in the expression according to the expression,
a first loading tag for deleting the repeated items to establish a first loading rule,
a second loading tag for analyzing the similar expression rules in the first loading rule and deleting the similar expression rules to establish a second loading rule,
a memory storing the first loading tag and the second loading tag,
a user data terminal for inputting and receiving the operation command of the user,
the loading server acquires a loading command from the user data terminal and transmits the loading command to the cloud database;
and the cloud database receives the loading command, starts a loading actuator to determine the attribute information corresponding to the loaded data through at least one of the first loading tag and the second loading tag so as to search the corresponding entity object in the original data set of the storage server.
The classifier links the storage paths in a tree network.
Example 1
The invention provides a multi-source big data loading method which comprises the following steps
Setting at least one storage server, wherein the storage server comprises at least one first recording unit, the first recording unit is used for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
providing a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
providing an extractor, the extractor extracting the expression based on the relationship,
selecting a screening rule to identify repeated items in the expression according to the expression, deleting the repeated items to establish a first loading rule,
analyzing the similar expression rules in the first loading rule, deleting the similar expression rules to establish a second loading rule,
the first loading rule and the second loading rule are saved in memory,
the memory is saved to a cloud database,
the user data terminal inputs and receives the user's operation command,
the loading server obtains the loading command from the user data terminal,
the load command is transmitted to the cloud database,
the cloud database determines attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so that a corresponding entity object is searched in an original data set of the storage server.
The method for searching the entity object in the original data set of the storage server is as follows: the cloud database compares the one or more connection words, specific terms and pictures into which the user's search was split with the established second loading rule, and if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server. The entity object obtained by using the second loading rule is taken as the first display result.
Example 2
The invention provides a multi-source big data loading method which comprises the following steps
Setting at least one storage server, wherein the storage server comprises at least one first recording unit, the first recording unit is used for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
providing a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
providing an extractor, the extractor extracting the expression based on the relationship,
selecting a screening rule to identify repeated items in the expression according to the expression, deleting the repeated items to establish a first loading rule,
analyzing the similar expression rules in the first loading rule, deleting the similar expression rules to establish a second loading rule,
the first loading rule and the second loading rule are saved in memory,
the memory is saved to a cloud database,
the user data terminal inputs and receives the user's operation command,
the loading server obtains the loading command from the user data terminal,
the load command is transmitted to the cloud database,
the cloud database determines attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so that a corresponding entity object is searched in an original data set of the storage server.
The method for searching the entity object in the original data set of the storage server comprises the following steps:
if no identical or similar loading item exists after the comparison with the second loading rule in Example 1, the cloud database compares the split connection words, specific terms and pictures with the established first loading rule; if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server, and the entity object obtained by using the first loading rule is taken as the second display result.
Example 3
The invention provides a multi-source big data loading method which comprises the following steps
Setting at least one storage server, wherein the storage server comprises at least one first recording unit, the first recording unit is used for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
providing a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
providing an extractor, the extractor extracting the expression based on the relationship,
selecting a screening rule to identify repeated items in the expression according to the expression, deleting the repeated items to establish a first loading rule,
analyzing the similar expression rules in the first loading rule, deleting the similar expression rules to establish a second loading rule,
the first loading rule and the second loading rule are saved in memory,
the memory is saved to a cloud database,
the user data terminal inputs and receives the user's operation command,
the loading server obtains the loading command from the user data terminal,
the load command is transmitted to the cloud database,
the cloud database determines attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so that a corresponding entity object is searched in an original data set of the storage server.
The method for searching the entity object in the original data set of the storage server comprises the following steps:
if no identical or similar loading item exists after the comparison with the first loading rule in Example 2, the corresponding entity object in the original data set of the storage server is loaded by fuzzy query, based on the recording habits and loading-related history recorded by the cloud database. The entity object obtained in this way is taken as the third display result.
As can be seen from Examples 1 to 3, the content loaded by the user is split into one or more of connection words, specific terms and pictures and compared with the first loading rule and the second loading rule obtained according to the storage rule, so that the corresponding entity object can be accurately searched for in the original data set of the storage server. When the user fails to provide accurate loading elements, fuzzy query can still load the result the user wants, based on the user's recording habits and loading-related history.
The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method of the present invention and its core concepts. The foregoing is only a preferred embodiment of the present invention. It should be noted that, because written description is limited while the possible specific structures are effectively unlimited, those skilled in the art may make a number of modifications, refinements or changes without departing from the principle of the present invention, and the technical features described above may be combined in any suitable manner; such modifications, variations, combinations or adaptations, made within the spirit and scope of the invention as defined by the claims, may be directed to other uses and embodiments.
Claims (8)
1. The multi-source big data loading method is characterized by comprising the following steps
Setting at least one storage server, wherein the storage server comprises at least one first recording unit, the first recording unit is used for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
providing a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
providing an extractor, the extractor extracting the expression based on the relationship,
selecting a screening rule to identify repeated items in the expression according to the expression, deleting the repeated items to establish a first loading rule,
analyzing the similar expression rules in the first loading rule, deleting the similar expression rules to establish a second loading rule,
the first loading rule and the second loading rule are saved in memory,
the memory is saved to a cloud database,
the user data terminal inputs and receives the user's operation command,
the loading server obtains the loading command from the user data terminal,
the load command is transmitted to the cloud database,
the cloud database determines attribute information corresponding to the loaded data through at least one of the first loading rule and the second loading rule, so that a corresponding entity object is searched in an original data set of the storage server.
2. The multi-source big data loading method according to claim 1, wherein the information searched by the user, as received by the user data terminal, is split into one or more of connection words, specific terms and pictures, and is stored in the cloud database.
3. The multi-source big data loading method according to claim 1, wherein the method for searching the entity object in the original data set of the storage server is as follows:
the cloud database compares the one or more connection words, specific terms and pictures into which the user's search was split with the established second loading rule; if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server,
if no identical or similar loading item exists after the comparison with the second loading rule, the cloud database compares the split connection words, specific terms and pictures with the established first loading rule; if an identical or similar loading item exists, that loading item is executed to obtain the corresponding entity object in the original data set of the storage server,
and if no identical or similar loading item exists after the comparison with the first loading rule, the corresponding entity object in the original data set of the storage server is loaded by fuzzy query, based on the recording habits and loading-related history recorded by the cloud database.
4. The multi-source big data loading method according to claim 1 or 3, wherein the entity object is displayed in a manner that:
acquiring a corresponding entity object in an original data set of a storage server as a first display result by using a second loading rule;
acquiring a corresponding entity object in an original data set of a storage server as a second display result by using a first loading rule;
and acquiring a corresponding entity object in the original data set of the storage server by using the recording habit and the loading related history recorded by the cloud database as a third display result.
5. The multi-source big data loading method according to claim 1, wherein the expression further includes an internal node address of the storage data under the corresponding storage node determined based on the relationship.
6. The multi-source big data loading method according to claim 1, wherein the expressions are sequentially extracted in a parent-level encoding and child-level data set manner.
7. A multi-source big data loading system is characterized by comprising
A first recording unit for recording the storage paths of all storage nodes in the storage server and the attribute information of the storage entities,
a classifier for linking at least one relationship between the storage path within the first recording unit and the attribute information of the storage entity,
an extractor for extracting an expression based on the relationship,
a filter for selecting a filtering rule to identify a duplicate term in the expression according to the expression,
a first loading tag for deleting the repeated items to establish a first loading rule,
a second loading tag for analyzing the similar expression rules in the first loading rule and deleting the similar expression rules to establish a second loading rule,
a memory storing the first loading tag and the second loading tag,
a user data terminal for inputting and receiving the operation command of the user,
the loading server acquires a loading command from the user data terminal and transmits the loading command to the cloud database;
and the cloud database receives the loading command, starts a loading actuator to determine the attribute information corresponding to the loaded data through at least one of the first loading tag and the second loading tag so as to search the corresponding entity object in the original data set of the storage server.
8. The multi-source big data loading system according to claim 7, wherein the classifier links the storage paths in a tree network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010551553.1A CN111695000B (en) | 2020-06-16 | 2020-06-16 | Multi-source big data loading method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111695000A (en) | 2020-09-22 |
CN111695000B (en) | 2021-04-27 |
Family
ID=72481408
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202010551553.1A | Active | CN111695000B (en) | 2020-06-16 | 2020-06-16 | Multi-source big data loading method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111695000B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101026627A (en) * | 2007-03-15 | 2007-08-29 | 上海交通大学 | Multi-source data fusion system based on rule and certainty factor |
CN105893526A (en) * | 2016-03-30 | 2016-08-24 | 上海坤士合生信息科技有限公司 | Multi-source data fusion system and method |
CN107193858A (en) * | 2017-03-28 | 2017-09-22 | 福州金瑞迪软件技术有限公司 | Towards the intelligent Service application platform and method of multi-source heterogeneous data fusion |
CN107341215A (en) * | 2017-06-07 | 2017-11-10 | 北京航空航天大学 | A kind of vertical knowledge mapping classification ensemble querying method of multi-source based on Distributed Computing Platform |
CN107609154A (en) * | 2017-09-23 | 2018-01-19 | 浪潮软件集团有限公司 | Method and device for processing multi-source heterogeneous data |
CN108804613A (en) * | 2018-05-30 | 2018-11-13 | 国网山东省电力公司经济技术研究院 | A kind of Various database real time fusion system and its fusion method |
US10146954B1 (en) * | 2012-06-11 | 2018-12-04 | Quest Software Inc. | System and method for data aggregation and analysis |
CN109710667A (en) * | 2018-11-27 | 2019-05-03 | 中科曙光国际信息产业有限公司 | A kind of shared realization method and system of the multisource data fusion based on big data platform |
CN110334133A (en) * | 2019-07-11 | 2019-10-15 | 京东城市(北京)数字科技有限公司 | Rule digging method and device, electronic equipment and computer readable storage medium |
CN110990351A (en) * | 2019-12-05 | 2020-04-10 | 南方电网数字电网研究院有限公司 | Unstructured data acquisition method, device and system and computer equipment |
CN110990390A (en) * | 2019-12-02 | 2020-04-10 | 东莞中国科学院云计算产业技术创新与育成中心 | Data cooperative processing method and device, computer equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
田卫东 等: "一种精简的关联规则表示模型", 《计算机应用研究》 * |
Also Published As
Publication number | Publication date |
---|---|
CN111695000B (en) | 2021-04-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||