CN113051332A - Multi-source data integration method and system based on big data technology

Publication number
CN113051332A
Authority
CN
China
Prior art keywords
data
query request
request
processing device
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110422153.5A
Other languages
Chinese (zh)
Other versions
CN113051332B (en)
Inventor
章志容
李实�
彭添才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan Mengda Data Technology Co ltd
Dongguan Mengda Group Co ltd
Original Assignee
Dongguan Mengda Plasticizing Science & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan Mengda Plasticizing Science & Technology Co ltd filed Critical Dongguan Mengda Plasticizing Science & Technology Co ltd
Priority to CN202110422153.5A priority Critical patent/CN113051332B/en
Publication of CN113051332A publication Critical patent/CN113051332A/en
Application granted granted Critical
Publication of CN113051332B publication Critical patent/CN113051332B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24539 Query rewriting; Transformation using cached or materialised query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of big data, and in particular discloses a multi-source data integration method and system based on big data technology. A user initiates a data query request through a service end processing device. An interface processing device receives and analyzes the data query request, determines the corresponding data acquisition channel, acquires the corresponding data through that channel, sends the data to a big data cluster device, and establishes and stores a data association relationship. The big data cluster device executes data streams in real time, forms and stores the corresponding data results, and performs data association. A task monitoring device monitors the data state in real time and triggers execution of the corresponding processing mechanism according to preset trigger conditions and their associated processing mechanisms. The invention can effectively combine a plurality of service ends with a big data system while integrating a plurality of third-party data interfaces, forming an effective system through which users can query data from their service ends.

Description

Multi-source data integration method and system based on big data technology
Technical Field
The invention relates to the technical field of big data, in particular to a multi-source data integration method and a multi-source data integration system based on big data technology.
Background
Big data technology is needed in many fields, where a wide variety of data is required to inform decision making. At the same time, services are becoming more fine-grained and specialized, and each specialized service forms its own correspondingly specialized data.
Therefore, when a user needs to query related data, different data must be queried through multiple channels and then integrated, which is inconvenient and inefficient for the user.
Disclosure of Invention
In order to solve the problems in the prior art, the present invention aims to provide a multi-source data integration method and system based on big data technology, which can effectively combine a plurality of service terminals with a big data system, and simultaneously integrate a plurality of third-party data interfaces to form an effective system for querying data at a user terminal.
In order to achieve the above purpose, the present invention adopts the following scheme.
A multi-source data integration method based on big data technology comprises the following steps:
a user initiates a data query request through a service end processing device;
the interface processing device receives and analyzes the data query request, judges a corresponding data acquisition channel, acquires corresponding data through the corresponding channel and sends the corresponding data to the big data cluster device; establishing a data association relation and storing the data association relation;
the big data cluster device executes data streams in real time, forms and stores corresponding data results, and performs data association;
the task monitoring device monitors the data state in real time and triggers and executes a corresponding processing mechanism according to a preset triggering condition and a processing mechanism thereof.
As a preferred embodiment, analyzing the data query request and establishing the data association relationship includes: determining whether a data result has already been generated for the data query request; if so, directly calling the corresponding data result and associating it with the data query request; if not, determining whether an identical data query request is already associated in the data retrieval request table; if so, associating the data query request with that identical request in the data retrieval request table; and if not, generating a new data retrieval request corresponding to the data query request in the data retrieval request table.
Through these steps, identical query requests are associated with one another so that only one data retrieval request is generated. A single data retrieval therefore satisfies every data query request that is identical to it, which reduces the waste of data retrieval resources.
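The deduplication decision described above can be sketched as follows. This is only an illustrative sketch: the class names, the QueryKey type, and the three return labels are assumptions introduced here and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical key: a query is identified by its target plus its parameters.
QueryKey = Tuple[str, Tuple[Tuple[str, str], ...]]


@dataclass
class RetrievalRequest:
    """One row of a (hypothetical) data retrieval request table."""
    key: QueryKey
    query_ids: List[int] = field(default_factory=list)  # associated query requests
    result: Optional[dict] = None                        # set once a data result exists


class InterfaceProcessor:
    def __init__(self) -> None:
        self.retrieval_table: Dict[QueryKey, RetrievalRequest] = {}

    def handle_query(self, query_id: int, key: QueryKey) -> str:
        """Associate a query with a result or a retrieval request; report the path taken."""
        existing = self.retrieval_table.get(key)
        if existing is not None and existing.result is not None:
            # A data result already exists: associate it with this query.
            existing.query_ids.append(query_id)
            return "result_reused"
        if existing is not None:
            # An identical request is pending: share its retrieval request.
            existing.query_ids.append(query_id)
            return "request_shared"
        # No identical request: create a new retrieval request.
        self.retrieval_table[key] = RetrievalRequest(key, [query_id])
        return "request_created"
```

In this sketch, any number of identical query requests end up attached to a single row of the table, so the external data would be retrieved only once.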
Further, establishing the data association relationship includes, before determining whether a data result has been generated for the data query request, analyzing whether the user has previously initiated the same data query request. If so, the user is prompted to choose between initiating a new data query request and viewing the data result of the historical query: if the user chooses to initiate a new data query request, the method proceeds to determine whether a data result has been generated for the data query request; if the user chooses to view the data result of the historical query, retrieval of that data result is triggered. If the user has not previously initiated the same request, the method likewise proceeds to determine whether a data result has been generated for the data query request.
As a preferred embodiment, calling the data result of the historical query includes triggering the service end processing device to determine whether the data result of the historical query is stored at the service end; if so, the data result is called from the service end and displayed; if not, the interface processing device is triggered to request the data result of the historical query from the big data cluster device.
The service end stores the data results called from the big data cluster device, so that the service end can read them quickly and the big data cluster device does not waste resources re-executing the data stream.
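A minimal sketch of this local-first lookup is given below. The ServiceEnd class, the fetch_from_cluster callback, and the cluster_calls counter are hypothetical names used only for illustration; the patent does not specify how the service end stores results.

```python
from typing import Callable, Dict, Optional


class ServiceEnd:
    """Hypothetical service end that keeps a local copy of results it has received."""

    def __init__(self, fetch_from_cluster: Callable[[str], Optional[dict]]) -> None:
        self._local: Dict[str, dict] = {}
        self._fetch_from_cluster = fetch_from_cluster
        self.cluster_calls = 0  # counts requests forwarded toward the cluster

    def historical_result(self, query_key: str) -> Optional[dict]:
        # Serve from the service end when the result is already stored there.
        if query_key in self._local:
            return self._local[query_key]
        # Otherwise ask the cluster (via the interface processing device) and keep a copy.
        self.cluster_calls += 1
        result = self._fetch_from_cluster(query_key)
        if result is not None:
            self._local[query_key] = result
        return result
```

Under these assumptions, a second lookup of the same historical result is served from the service end without touching the cluster.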
Further, establishing the data association relationship includes: associating the data query request table of the service end processing device with the data retrieval request table of the interface processing device; associating the acquired data with the corresponding data retrieval request in the data retrieval request table; associating the data result with the corresponding data retrieval request in the data retrieval request table; and associating the data result with the corresponding data query request through the data retrieval request table.
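One way to picture this chain of associations is as two relational tables joined by a key. The sketch below uses Python's standard sqlite3 module with an in-memory database; the table layout, column names, and sample rows are assumptions made for illustration, since the patent does not define a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE retrieval_request (      -- kept by the interface processing device
    id       INTEGER PRIMARY KEY,
    status   TEXT NOT NULL,
    raw_data TEXT,                    -- acquired data associated with the request
    result   TEXT                     -- data result associated with the request
);
CREATE TABLE query_request (          -- kept by the service end processing device
    id           INTEGER PRIMARY KEY,
    user_name    TEXT NOT NULL,
    retrieval_id INTEGER REFERENCES retrieval_request(id)
);
""")
conn.execute(
    "INSERT INTO retrieval_request VALUES (1, 'generated data result', 'raw rows', 'summary')"
)
conn.executemany(
    "INSERT INTO query_request VALUES (?, ?, ?)",
    [(10, "alice", 1), (11, "bob", 1)],  # two identical queries share one retrieval request
)


def result_for_query(query_id: int):
    """Resolve a query request to its data result through the retrieval request table."""
    row = conn.execute(
        "SELECT r.result FROM query_request q "
        "JOIN retrieval_request r ON r.id = q.retrieval_id WHERE q.id = ?",
        (query_id,),
    ).fetchone()
    return row[0] if row else None
```

Here both query requests resolve to the same stored result because they point at the same retrieval request row.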
Preferably, analyzing the data query request further includes determining whether the user has obtained authorization from the object of the data query request, associating the authorized data with the corresponding data query request in the data query request table, and, through the data query request table, associating it with the corresponding data retrieval request in the data retrieval request table.
Preferably, the big data cluster device selects the data retrieval requests whose data acquisition succeeded, executes a data stream on the associated data in real time, forms and stores the corresponding data result, and associates the data result with the corresponding data retrieval request.
In addition, the real-time monitoring and triggering performed by the task monitoring device specifically includes: the task monitoring device monitors, in real time, the data states marked in the data query request table of the service end processing device and in the data retrieval request table of the interface processing device and the big data cluster device, and triggers the corresponding device to execute the corresponding processing mechanism according to the preset trigger conditions and their associated processing mechanisms.
Meanwhile, the invention also provides a corresponding system for realizing the multi-source data integration method.
Specifically, a multi-source data integration system based on big data technology is provided, comprising a service end processing device, an interface processing device, a big data cluster device, a task monitoring device, and an external data source:
service end processing device: used by a user to initiate a data query request;
interface processing device: receives and analyzes the data query request, determines the corresponding data acquisition channel, acquires the corresponding data through that channel, and sends the data to the big data cluster device; establishes and stores the data association relationship;
big data cluster device: executes data streams in real time, forms and stores the corresponding data results, and performs and stores data association;
task monitoring device: monitors the data state in real time and triggers execution of the corresponding processing mechanism according to preset trigger conditions and their associated processing mechanisms;
external data source: provides external data acquisition channels.
Further, the task monitoring device monitors, in real time, the data states marked in the data query request table of the service end processing device and in the data retrieval request table of the interface processing device and the big data cluster device, and triggers the corresponding device to execute the corresponding processing mechanism according to the preset trigger conditions and their associated processing mechanisms.
The beneficial effects of the invention are as follows. The invention provides a multi-source data integration method based on big data technology that effectively combines a plurality of service ends with a big data system and integrates a plurality of third-party data interfaces, forming an effective system through which users can query data from their service ends. The scheme is suitable for a variety of service scenarios. When a user needs to query various kinds of data, the user only has to initiate a data query request through the user's own service end. The interface processing device then processes the data query requests in a unified manner, selects the optimal acquisition channel, obtains the corresponding data, and associates the data with the corresponding data query request and data retrieval request. The big data cluster device analyzes and processes the data, performing data analysis, mining, integration, association, and various kinds of ETL on service data, and the data result is returned through the interface processing device to the user's service end for display. This forms a genuine big data system and facilitates subsequent direct calling of the related data.
Drawings
Fig. 1 is a schematic diagram of a multi-source data integration system based on big data technology according to an embodiment of the present invention.
Detailed Description
To facilitate understanding by those skilled in the art, the present invention is further described below with reference to the embodiments and the drawing, which are not intended to limit the present invention.
The embodiment of the invention provides a multi-source data integration system based on big data technology. As shown in Fig. 1, the system comprises a service end processing device 1, an interface processing device 2, a big data cluster device 3, a task monitoring device 4, and an external data source 5.
Specifically, the service end processing device 1 may comprise a plurality of service ends, each serving as the data interface of a specific service. A user initiates a data query request through a service end and may attach one or more parameters to the request. The service end processing device 1 may also generate a data query request table and record the data query request initiated by the user in that table; it then sends the data query request to the interface processing device 2.
The interface processing device 2: receives the data query request from the service end and analyzes it; determines from the analysis result (deduplication is required) whether to generate a new data retrieval request for the data query request and, if so, records the new request in the data retrieval request table; analyzes the association conditions of the data retrieval request; determines the corresponding data interface or data crawling channel; initiates the corresponding data interface request or data crawling request; and starts the interface data acquisition program and/or the data crawling program to acquire the corresponding data.
Different data are acquired through different channels. Some external data sources open data interfaces to the big data integration system, and the interface processing device 2 can connect directly to the database of such an external data source through the interface to acquire the corresponding data. Other external data sources publish public data, which must be collected by a crawler.
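The choice between an interface channel and a crawling channel can be sketched as a simple registry lookup. Everything named here (Channel, ChannelRegistry, the data type strings, and the stub fetchers) is hypothetical; a real system would call an actual data interface or crawler in place of the stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Channel:
    """A hypothetical acquisition channel for one kind of external data."""
    name: str
    kind: str  # "interface" (opened data interface) or "crawler" (public data)
    fetch: Callable[[dict], List[dict]]


class ChannelRegistry:
    """Maps a data type named in the query to its acquisition channel."""

    def __init__(self) -> None:
        self._channels: Dict[str, Channel] = {}

    def register(self, data_type: str, channel: Channel) -> None:
        self._channels[data_type] = channel

    def acquire(self, data_type: str, params: dict) -> List[dict]:
        channel = self._channels.get(data_type)
        if channel is None:
            raise LookupError(f"no acquisition channel registered for {data_type!r}")
        return channel.fetch(params)


# Stub fetchers stand in for a real interface call and a real crawler.
def fetch_via_interface(params: dict) -> List[dict]:
    return [{"source": "interface", **params}]


def fetch_via_crawler(params: dict) -> List[dict]:
    return [{"source": "crawler", **params}]


registry = ChannelRegistry()
registry.register("credit_record", Channel("partner-api", "interface", fetch_via_interface))
registry.register("public_notice", Channel("notice-crawler", "crawler", fetch_via_crawler))
```

Adding a new external data source in this sketch only requires registering one more channel; the callers of acquire do not change.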
After acquiring the corresponding data, the interface processing device 2 associates the data, according to the records in the data retrieval request table, with the corresponding data retrieval request in that table, indicating that the data was acquired for that data retrieval request. It also reads the rule conditions relevant to the data retrieval request from the task monitoring device 4, establishes associations among the data according to those rule conditions, and stores the data association relationship. The related data can then be called directly through the data association relationship without repeatedly acquiring external data, and can also be used for subsequent data analysis, mining, and integration, which improves the reusability and availability of the data and reduces data redundancy.
After establishing the data association, the interface processing device 2 sends the acquired data to the big data cluster device 3 for analysis and processing, to obtain a data result that the service end requires and can accept.
That is, after acquiring the relevant data from the external data source 5, the interface processing device 2 only stores the data temporarily; it does not store the data long-term or analyze it. It only establishes the data relationship associations and then sends the data to the big data cluster device 3 for in-depth analysis and processing. The temporary storage space is released after the data has been held for a period of time, which keeps the interface processing device 2 lightly loaded.
After the big data cluster device 3 analyzes the data and forms a data result, it associates the data result with the corresponding data retrieval request in the data retrieval request table. When the task monitoring device 4, monitoring in real time, detects that the monitored conditions and states meet the preset conditions, it triggers the interface processing device 2 to request the data result from the big data cluster device 3. After receiving the data result, the interface processing device 2 associates it with the corresponding data query request according to the data association relationship recorded in the data retrieval request table, and sends the data result to the corresponding service end. After receiving the data result, the service end processing device 1 associates it with the corresponding data query request in the data query request table, and may also notify the user and display the data result.
The big data cluster device 3: as described above, the big data cluster device 3 interfaces directly with the interface processing device 2. It receives the data and the data retrieval request table from the interface processing device 2, analyzes and processes the data to obtain a data result that the service end requires and can accept, stores the data result, records the result state in the data retrieval request table, and associates the data result with the corresponding data retrieval request in that table.
The task monitoring device 4 monitors the data states in the data retrieval request table in real time, for example that a data result has been formed, that data acquisition succeeded, or that data acquisition failed, and triggers the corresponding processing mechanism for each. For example, it triggers the interface processing device 2 to request the data result so that the big data cluster device 3 returns it, triggers the interface processing device 2 to send data to the big data cluster device 3 for processing, or triggers the interface processing device 2 to acquire the data again.
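The state-to-mechanism mapping described above can be sketched as a small rule table. The state labels mirror those used in this description, but the TaskMonitor class, its methods, and the logged actions are assumptions for illustration only; a real monitor would also need to avoid firing twice for the same state, which this sketch omits.

```python
from typing import Callable, Dict, List

# State labels follow the description; the mechanism functions are hypothetical.
RESULT_GENERATED = "generated data result"
ACQUISITION_SUCCESS = "data acquisition success"
ACQUISITION_FAILURE = "data acquisition failure"

Mechanism = Callable[[int], None]


class TaskMonitor:
    def __init__(self) -> None:
        self._rules: Dict[str, Mechanism] = {}

    def register(self, state: str, mechanism: Mechanism) -> None:
        """Store a trigger condition (a data state) with its processing mechanism."""
        self._rules[state] = mechanism

    def scan(self, retrieval_table: Dict[int, str]) -> int:
        """Run the mechanism registered for each request's state; return how many fired."""
        fired = 0
        for request_id, state in retrieval_table.items():
            mechanism = self._rules.get(state)
            if mechanism is not None:
                mechanism(request_id)
                fired += 1
        return fired


log: List[str] = []
monitor = TaskMonitor()
monitor.register(RESULT_GENERATED, lambda rid: log.append(f"request result for {rid}"))
monitor.register(ACQUISITION_SUCCESS, lambda rid: log.append(f"send data for {rid} to cluster"))
monitor.register(ACQUISITION_FAILURE, lambda rid: log.append(f"re-acquire data for {rid}"))
```

Calling monitor.scan on a mapping of request identifiers to states would run one registered mechanism per matching row and ignore rows whose state has no rule.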
In addition, the big data cluster device 3, acting as a big data warehouse, can actively extract various data from the interface processing device 2 or the service end, in real time or on a schedule, perform deep learning, analysis, mining, and integration on the data, and classify the processing results for subsequent calling or further mining.
In general, the big data cluster device 3 processes the data of the service end and the interface processing device 2 in an integrated manner. It automatically executes data streams in real time according to pre-established data analysis models, data mining models, data integration logic models, service association logic models, and the like, and according to preset task flows and data processing mechanisms, and it forms and stores the corresponding data results. Related interface data can then be called directly without repeatedly acquiring external data, and can also be used for subsequent data analysis, mining, and integration, which improves the reusability and availability of the data and reduces data redundancy.
The task monitoring device 4: the task monitoring device 4 stores the pre-set trigger conditions defined in the processing programs of the service end processing devices 1, the interface processing device 2, the big data cluster device 3, and so on, such as time conditions, operation conditions, data states, system states, and operation states, together with the corresponding processing mechanisms. The task monitoring device 4 monitors the relevant trigger conditions in real time and sends the corresponding processing mechanism to the corresponding processing device. After receiving it, the service end processing device 1, the interface processing device 2, or the big data cluster device 3 executes the relevant program, such as data result display, data acquisition, data ETL, data stream execution, or data output. Data ETL refers to extracting data from a source end, transforming it, and loading it into a destination end.
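As a small illustration of the extract, transform, and load steps mentioned above, the following sketch moves records from an in-memory source to an in-memory destination. The function names, record fields, and sample rows are hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def run_etl(
    extract: Callable[[], Iterable[Record]],
    transform: Callable[[Record], Record],
    load: Callable[[Record], None],
) -> int:
    """Extract records from the source, transform each one, load it; return the count loaded."""
    count = 0
    for record in extract():
        load(transform(record))
        count += 1
    return count


# Hypothetical source rows and an in-memory destination.
source_rows: List[Record] = [
    {"name": " Alice ", "city": "dongguan"},
    {"name": "Bob", "city": "shenzhen"},
]
destination: List[Record] = []


def normalize(record: Record) -> Record:
    """Example transform: trim names and capitalize city names."""
    return {"name": record["name"].strip(), "city": record["city"].title()}
```

A call such as run_etl(lambda: source_rows, normalize, destination.append) would load the cleaned rows into destination.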
For example, when a user clicks a query button at a service end, a task is in effect triggered. After detecting the task, the task monitoring device 4, according to the pre-stored trigger conditions and processing mechanisms, triggers the service end processing device 1 to first determine whether the user has previously initiated the same data query request. Suppose the task monitoring device 4 detects that the query request is a repeated one (that is, the same query request was initiated before), that a corresponding data result was generated for the earlier request, and that the user now wants to view that data result again, but the service end does not hold it because the data result is stored in the big data cluster device 3 and the service end stores only basic data. In that case the task monitoring device 4 triggers the interface processing device 2 to request the data result from the big data cluster device 3; the big data cluster device 3 retrieves the data result and sends it to the interface processing device 2, which forwards it to the corresponding service end.
As another example, as described above, when the big data cluster device 3 analyzes data to generate a data result and records the state of the data result in the corresponding data retrieval request table, and the task monitoring device 4 detects that the state is "generated data result", the task monitoring device 4 triggers the interface processing device 2 to request the data result from the big data cluster device 3 and to send it to the corresponding service end according to the data association relationship in the data retrieval request table.
The big data cluster device 3 actively extracts data from the interface processing device 2 or the service end processing device 1 for analysis and processing, but does not actively send data to the interface processing device 2. The task monitoring device 4 therefore monitors the data states and conditions in the big data cluster device 3 in real time and triggers the corresponding execution mechanism according to the monitoring result.
External data source 5: provides external data to the interface processing device 2, including data from a plurality of interface channels and a plurality of crawling channels. The interface processing device 2 interfaces with both the external data source 5 and the service end processing device 1; after receiving a data query request from the service end processing device 1, it analyzes the parameters and association conditions of the request to determine through which data interface or data crawling channel the data should be acquired.
In an application scenario, the interface processing device 2 may acquire the required external data through one or more interface channels and/or a plurality of crawling channels.
Meanwhile, the interface processing device 2 records the completeness of external data acquisition in the data retrieval request table: if the acquired data is complete, the request is marked "data acquisition success"; if it is incomplete, the request is marked "data acquisition failure". When the task monitoring device 4 reads a "data acquisition failure" state in the data retrieval request table, it triggers the interface processing device 2, according to the preset processing mechanism, to continue acquiring the remaining data so that the complete data is eventually obtained.
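The completeness marking and the retry of the remaining data can be sketched as below. The function name, the notion of "required fields", and the max_attempts limit are assumptions added for illustration; the patent only states that incomplete acquisitions are marked as failed and that the remaining data is acquired afterwards.

```python
from typing import Callable, Dict, Iterable, List, Tuple

SUCCESS = "data acquisition success"
FAILURE = "data acquisition failure"


def acquire_until_complete(
    required: Iterable[str],
    fetch: Callable[[List[str]], Dict[str, object]],
    max_attempts: int = 3,
) -> Tuple[str, Dict[str, object]]:
    """Fetch only the still-missing fields on each attempt and report the final state."""
    required = list(required)
    collected: Dict[str, object] = {}
    for _ in range(max_attempts):
        missing = [name for name in required if name not in collected]
        if not missing:
            break
        # Ask the channel only for what is still missing, then merge what came back.
        collected.update(fetch(missing))
    complete = all(name in collected for name in required)
    return (SUCCESS if complete else FAILURE), collected
```

Because each attempt requests only the missing fields, data that was already acquired is not fetched again.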
In the multi-source data integration system based on big data technology described above, the interface processing device 2 effectively links the service end processing device 1, the big data cluster device 3, and the external data source 5 into a genuine and effective big data integration system that is suitable for a variety of service scenarios. When a new service end or service system needs to join the data integration system, only one new interface to the new service end processing device 1 has to be developed in the interface processing device 2, which reduces interface cost. The big data cluster device 3 needs to provide only one unified interface for exchanging data with the interface processing device 2 and does not connect directly to the various service ends, which greatly reduces the risk to the big data cluster device 3. The integration system is therefore compatible with a variety of service systems, saves data retrieval resources, and protects the security of the big data cluster device 3.
In addition, regarding data acquisition and processing, the interface processing device 2 acquires data according to the data retrieval request and sends the acquired data to the big data cluster device 3 for analysis and processing. That is, the interface processing device 2 is responsible for acquiring data, establishing data relationship associations, and releasing the data after temporary storage, while the big data cluster device 3 is mainly responsible for in-depth processing of the data, forming the data result required by the service end, and storing that result. The two cooperate to form the big data system. Because the interface processing device 2 only relays data and does not store it long-term, it can operate under light load even when connected to many service ends at once, places low demands on system hardware, and reduces the related cost.
Correspondingly, the embodiment of the invention provides a multi-source data integration method based on a big data technology, and the method specifically comprises the following steps.
A user initiates a data query request through the service end processing device 1; the request may carry one or more parameters. At the same time, a data query request table may be generated, in which the data query request initiated by the user is recorded. The data query request is then sent to the interface processing device 2.
The interface processing device 2 receives the data query request from the service end processing device 1 and analyzes it. According to the analysis result (which requires deduplication), it determines whether a new data retrieval request should be generated for the data query request and recorded in the data retrieval request table. It then analyzes the association conditions of the data retrieval request, determines the corresponding data interface or data crawling channel, initiates the corresponding data interface request or data crawling request, and starts an interface data acquisition program and/or a data crawling program to acquire the corresponding data.
After acquiring the corresponding data, the interface processing device 2 associates the data with the corresponding data retrieval request according to the records of the data retrieval request table, indicating that the data was acquired for that data retrieval request, and marks the data retrieval request in the table as successfully acquired. At the same time, it reads the rule conditions relevant to the data retrieval request from the task monitoring device 4, establishes associations among the data according to those rule conditions, and stores the data association relationships. The relevant data can then be called directly through the data association relationships without repeatedly acquiring external data. The association relationships can also be used for subsequent data analysis, mining, integration and the like, which enhances the reusability and high availability of the data and reduces data redundancy.
After the interface processing device 2 establishes the preliminary association of the data, it sends the acquired data together with the data retrieval request table to the big data cluster device 3, where the data is analyzed and processed into data results that the service end requires and is able to receive.
The big data cluster device 3 analyzes and processes the data, forms and stores data results, and establishes data association between the data results and corresponding data calling requests in the data calling request table. The big data cluster device 3 can further analyze, mine and integrate the data to enhance the reusability and high availability of the data and reduce the redundancy of the data; subsequent direct calls to the relevant data can also be facilitated without repeated retrieval of external data.
Specifically, the analysis of the data query request by the interface processing device 2 includes determining whether a data result has already been generated for the data query request. If so, the corresponding data result is called directly and associated with the data query request. If not, it is determined whether the data retrieval request table already contains a data retrieval request for the same query; if so, the data query request is associated with that existing data retrieval request; if not, a new data retrieval request corresponding to the data query request is generated in the data retrieval request table.
In addition, as a preferred embodiment, before determining whether a data result has been generated for the data query request, the interface processing device 2 may further analyze whether the user has previously initiated the same data query request. If so, the user is prompted to choose between initiating the latest data query request and viewing the data result of the historical query. If the user chooses to initiate the latest data query request, the determination of whether a data result has been generated for the data query request proceeds. If the user chooses to view the data result of the historical query, retrieval of that data result is triggered. If the user has not previously initiated the same data query request, the determination of whether a data result has been generated likewise proceeds.
Specifically, establishing the data association relationships includes: associating the data query request table of the service end processing device with the data retrieval request table of the interface processing device; associating the acquired data with the corresponding data retrieval request in the data retrieval request table; associating the data result with the corresponding data retrieval request in the data retrieval request table; and associating the data result with the corresponding data query request through the data retrieval request table.
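The decision order described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the in-memory dictionary standing in for the data retrieval request table, the class RetrievalRequest, the function handle_query and the returned labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RetrievalRequest:
    """One row of a hypothetical in-memory data retrieval request table."""
    key: str                                              # what is retrieved, e.g. "enterprise A"
    query_ids: List[str] = field(default_factory=list)    # associated data query requests
    result: Optional[str] = None                          # data result identifier, once generated


def handle_query(table: Dict[str, RetrievalRequest], query_id: str, key: str) -> str:
    """Associate a data query request with an existing result, an existing
    retrieval request, or a newly created retrieval request (in that order)."""
    row = table.get(key)
    if row is not None and row.result is not None:
        row.query_ids.append(query_id)      # a result already exists: reuse it
        return "result_reused"
    if row is not None:
        row.query_ids.append(query_id)      # same retrieval already pending: join it
        return "joined_pending"
    table[key] = RetrievalRequest(key=key, query_ids=[query_id])
    return "new_request"                    # nothing found: record a new retrieval request
```

With this structure, any number of identical queries share one row, so the external data source is contacted once per distinct retrieval.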
The task monitoring device 4 monitors in real time whether data conditions and states meet preset conditions and, according to the preset trigger conditions and their processing mechanisms, triggers the corresponding device to execute the corresponding processing mechanism.
For example, when the task monitoring device 4 detects in real time that a data query request has been recorded in the data query request table of the service end processing device 1, it triggers the service end processing device 1 to determine an attribute of the data query request, for example whether it is a new data query request or an old one. If it is a new data query request, the service end processing device 1 is triggered to send the data query request to the interface processing device 2, which then processes it.
As another example, the task monitoring device 4 monitors in real time the data acquisition state that the interface processing device 2 marks for each data retrieval request in the data retrieval request table. If the data acquisition succeeded, the task monitoring device 4 triggers the interface processing device 2 to send the data to the big data cluster device 3 to be analyzed and processed into the data result required by the service end. If the interface processing device 2 cannot acquire the corresponding data, the corresponding data retrieval request in the data retrieval request table is marked as a data acquisition failure; on detecting this state, the task monitoring device 4 may trigger the interface processing device 2 to act according to a preset rule condition, for example to attempt the acquisition again when a preset time is reached.
As another example, after the big data cluster device 3 analyzes the received data, it generates a data result, associates the data result with the corresponding data retrieval request in the data retrieval request table, and marks a data result generation state for that data retrieval request, for example that the data result was generated successfully. When the task monitoring device 4 detects this state, it triggers the interface processing device 2 to request the big data cluster device 3 to send the data result. If the data result generation state indicates that generation failed, the task monitoring device 4 triggers the big data cluster device 3 to analyze the reason for the failure and then triggers the next processing flow according to that reason. For example, if generation failed because the interface processing device 2 acquired only part of the interface data successfully, the interface processing device 2 is triggered to acquire again the interface data that it failed to acquire; and so on.
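The monitoring behaviour in these examples amounts to a set of stored rules, each pairing a trigger condition with a processing mechanism. A minimal sketch is given below under that reading; the class TaskMonitor, the field names "acquisition" and "result", and the action strings are assumptions made for illustration and do not come from the embodiment.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Rule = Tuple[Callable[[Record], bool], Callable[[Record], str]]


class TaskMonitor:
    """Holds (condition, action) rules and reports which actions a record triggers."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def add_rule(self, condition: Callable[[Record], bool],
                 action: Callable[[Record], str]) -> None:
        self.rules.append((condition, action))

    def check(self, record: Record) -> List[str]:
        # Evaluate every stored rule against the record's current state.
        return [action(record) for condition, action in self.rules if condition(record)]


# Hypothetical rules mirroring the examples in the text.
monitor = TaskMonitor()
monitor.add_rule(lambda r: r.get("acquisition") == "success",
                 lambda r: "send data to big data cluster device")
monitor.add_rule(lambda r: r.get("acquisition") == "failed",
                 lambda r: "retry acquisition at preset time")
monitor.add_rule(lambda r: r.get("result") == "generated",
                 lambda r: "request data result from big data cluster device")
```

Keeping the rules in one place matches the statement elsewhere in the description that the trigger conditions and processing mechanisms of the other devices are stored in the task monitoring device 4.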
In practical applications, a user may need to query various kinds of data in different service scenarios. With the scheme provided by the embodiment of the present invention, the user can query these different kinds of data through the user's own service end alone. The scheme of the invention is further explained below, taking as an example a user A who needs to query enterprise data.
1. When enterprise data needs to be queried, user A can initiate a data query request through user A's own service end. The request contains a query object, such as enterprise A.
It should be noted that user A may be an individual user or an enterprise user. In this embodiment, user A is an enterprise user, and query operations by the main account or any sub-account of the enterprise are all treated as query operations of the same enterprise user.
When user A initiates a data query request through the service end, a task has been triggered. The task monitoring device 4 monitors the data query request task in real time and, through the triggering device, triggers the service end processing device 1 to determine whether the query request is a new query request or an old one (an old query request is one that the user has already made before and is now making again).
The service end processing device 1 may analyze the association between user A and the queried object (enterprise A) in the data query request and, according to the records of the data query request table, determine whether user A has queried the data of enterprise A before (within a certain time limit), whether a corresponding query result has been generated, and so on.
2. In an actual service scenario, the analysis result of the data query request falls into one of several situations, as follows.
2.1. Scenario 1: according to the records of the data query request table of the service end processing device 1, user A has queried enterprise A before and a data result has been generated. When the task monitoring device 4 detects this analysis result in real time, it may trigger the service end processing device 1 to return a prompt to the user, letting user A choose between "initiate a latest query request" and "view a data result of a historical query".
If user A chooses to initiate the latest query request, the task monitoring device 4 triggers the service end processing device 1 to send a data query request to the interface processing device 2 and to record the data query request in the data query request table.
After receiving the data query request, the interface processing device 2 determines whether an associated record exists for the currently queried enterprise A, for example whether a data result is associated with enterprise A (in practical applications, this may be limited to data results generated within one month of the current time point, and may include data results for enterprise A that were requested by other users).
If such a data result exists, no new data result needs to be generated: the interface processing device 2 sends a data result retrieval request to the big data cluster device 3; the big data cluster device 3 looks up the data result in its database according to the association records of the data retrieval request table and sends it to the interface processing device 2; the interface processing device 2 forwards the data result to the corresponding service end processing device 1; and the service end processing device 1 displays the data result to the user.
If no such data result exists, a new data result needs to be generated. The interface processing device 2 further analyzes whether a data retrieval request needs to be generated for the currently queried enterprise A and recorded in the data retrieval request table.
Specifically, the interface processing device 2 determines whether a data retrieval request concerning enterprise A is already recorded in the current, latest data retrieval request table (other users may also have initiated a data query request for enterprise A, in which case the interface processing device 2 has already generated a data retrieval request for it and recorded it in the data retrieval request table). If the data retrieval request table already records a data retrieval request for enterprise A, it is not recorded again, so as to avoid repeated queries, and the current data query request for enterprise A is directly associated with the existing data retrieval request for enterprise A in the data retrieval request table. If no such record exists, a data retrieval request for enterprise A is recorded in the current, latest data retrieval request table.
That is, identical query requests are associated with a single data retrieval request. Completing the data retrieval once therefore satisfies every query request that asks for the same data, which reduces the waste of data retrieval resources.
Meanwhile, the interface processing device 2 associates the data retrieval request for enterprise A in the data retrieval request table with the corresponding data query request in the data query request table of the service end processing device 1, so that the subsequently generated data result can readily be associated with both the data retrieval request table and the data query request table.
If user A chooses to view the data result of the historical query, the task monitoring device 4 triggers the service end processing device 1 to check whether the data result of the historical query exists at the service end.
If it does, the service end processing device 1 can call and display it directly from the service end. Data results retrieved from the big data cluster device are stored by the service end, so that the service end can read them quickly and the big data cluster device wastes fewer resources on executing the data flow.
If it does not, the service end processing device 1 sends a data query request to the interface processing device 2; the interface processing device 2 requests the big data cluster device 3 to retrieve and send the data result of the historical query according to the association records of the data retrieval request table and, after receiving the data result, forwards it to the corresponding service end processing device 1 for display.
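The historical-query branch behaves like a local cache in front of the big data cluster device. The sketch below illustrates that reading only; the function view_history, the dictionary local_store standing in for the service end's own storage, and the callable fetch_from_cluster standing in for the path through the interface processing device 2 are hypothetical.

```python
from typing import Callable, Dict, Tuple


def view_history(local_store: Dict[str, str], key: str,
                 fetch_from_cluster: Callable[[str], str]) -> Tuple[str, str]:
    """Return (source, result). The service end's own store is checked first;
    on a miss the result is fetched from the cluster and kept locally."""
    if key in local_store:
        return "service_end", local_store[key]
    result = fetch_from_cluster(key)
    local_store[key] = result          # store so later reads stay at the service end
    return "big_data_cluster", result
```

A second request for the same key is then served from the service end without another call to the cluster.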
In addition, in view of actual service application scenarios, the case in which user A chooses to initiate the latest data query request may be further extended in this embodiment as follows: the interface processing device 2 further determines whether user A has obtained the authorization of the queried object; if so, it may further determine whether user A has paid; and if so, it carries out the subsequent data acquisition processing.
2.2. Scenario 2: according to the data association records, user A has never queried the currently queried enterprise A. The task monitoring device 4 triggers the service end processing device 1 to send a data query request to the interface processing device 2 and to record the data query request in the data query request table.
After receiving the data query request, the interface processing device 2 may preferably first determine whether user A has obtained the authorization of the queried enterprise A and has paid the fee.
If user A has been authorized and has paid the fee, the interface processing device 2 continues with the processing flow, which is the same as the flow in scenario 1 in which user A initiates the latest data query request and is not described again here.
2.3. Scenario 3: according to the data association records, user A has previously requested to query the data of enterprise A, but the previous query request did not obtain the authorization of enterprise A and no data was acquired. Specifically, the data query request of user A for enterprise A recorded in the data query request table of the service end processing device 1 is marked: user A has not obtained the authorization of enterprise A. The task monitoring device 4 may trigger the service end processing device 1 to return to the user a notification prompt that the authorization of enterprise A must be obtained first. When user A has obtained the authorization of enterprise A and then issues a query request, the processing flow follows scenario 1 and scenario 2 and is not described again here.
In addition, as an example, the process by which user A obtains the authorization of enterprise A may follow the processing flow of 3.3 below. After user A obtains the authorization of enterprise A and the authorization data is retrieved to the service end through the interface processing device 2, the task monitoring device 4 triggers the service end processing device 1 to associate the authorization data with the corresponding data query request in the data query request table and to update the state of that data query request to: user A has obtained the authorization of enterprise A. Then, when user A again issues a data query request for enterprise A, the service end processing device 1 sends the data query request together with the authorization status to the interface processing device 2, so that the interface processing device 2 can carry out the next analysis step accordingly.
2.4. Scenario 4: according to the data association records, user A has queried the data of the current enterprise A and obtained the authorization of enterprise A, but the data was not acquired because the fee deduction failed owing to an insufficient balance. Specifically, the data query request of user A for enterprise A recorded in the data query request table of the service end processing device 1 is marked: user A has obtained the authorization of enterprise A, but user A has not paid. The task monitoring device 4 may trigger the service end processing device 1 to return to the user a notification prompt that the account must first be recharged and the fee paid. When user A has paid the fee successfully and then issues the query request, the processing flow follows scenario 1 and scenario 2 and is not described again here.
2.5. Scenario 5: according to the data association records, user A has queried the data of the current enterprise A and the data result is still being generated. The task monitoring device 4 may trigger the service end processing device 1 to return a corresponding notification prompt, so that the user can query the data result after waiting for a certain time.
The above describes the processing operations that this embodiment performs for different service scenarios in practical applications. Those skilled in the art will understand that practical applications are not limited to these operations, which may be adjusted according to the actual situation.
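The five scenarios above can be summarised as a classification of the state recorded for the user's earlier query. The following sketch is one possible reading; the function classify and the flags "authorized", "paid" and "result_ready" are assumed names, and the order in which the flags are checked is an assumption rather than something the embodiment prescribes.

```python
from typing import Dict, Optional


def classify(record: Optional[Dict[str, bool]]) -> str:
    """Map the stored state of a user's earlier query to the next step."""
    if record is None:                               # user never queried this object
        return "scenario 2: send new query"
    if not record.get("authorized", False):          # queried before, no authorization
        return "scenario 3: prompt for authorization"
    if not record.get("paid", False):                # authorized, fee deduction failed
        return "scenario 4: prompt for payment"
    if not record.get("result_ready", False):        # result still being generated
        return "scenario 5: prompt to wait"
    return "scenario 1: offer latest query or history"
```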
3. The interface processing device 2 records the retrieval request for the currently queried enterprise a in the data retrieval request table, and further analyzes the association condition of the data retrieval request to determine the corresponding retrieval channel, that is, from which interface or crawling channel the data should be retrieved.
3.1. For example, if the retrieval request needs to obtain the basic industrial and commercial (business) information of enterprise A, the data is preferably obtained from channel one (a certain data platform); if the retrieval request needs to obtain the relationship nodes in the enterprise relationship network of enterprise A together with their enterprise business information, the data is preferably obtained from channel two (another data platform); if the user needs to obtain the authorization of enterprise A, the authorization can be obtained from channel three (yet another data platform); and so on.
The specific data acquisition process is further explained below.
3.1.1. When the user wants to query the business information of enterprise A, the data query request carries the relevant parameters, such as enterprise A, business information, and so on. After the request is recorded in the data retrieval request table, the interface processing device 2 analyzes the retrieval request and determines, according to the request parameters, that the business information of enterprise A can be acquired from channel one.
The interface processing device 2 initiates a data retrieval request to interface channel one of the external data source 5.
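The channel selection described here can be pictured as a lookup from the kind of data requested to a preferred channel. The sketch below assumes a fixed mapping; the keys "business_info", "relationship_network" and "authorization", and the function select_channel, are hypothetical labels chosen for illustration.

```python
# Hypothetical mapping from requested data type to preferred acquisition channel.
CHANNELS = {
    "business_info": "channel one",
    "relationship_network": "channel two",
    "authorization": "channel three",
}


def select_channel(data_type: str) -> str:
    """Return the preferred channel for a data type, or raise if none is configured."""
    try:
        return CHANNELS[data_type]
    except KeyError:
        raise ValueError(f"no channel configured for {data_type!r}") from None
```

Supporting an additional data source then only requires a new entry in the mapping.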
After the relevant data is successfully acquired, the interface processing device 2 reads the relevant rule conditions from the task monitoring device 4 (the trigger conditions, processing mechanisms and other rule conditions of the service end processing device 1, the interface processing device 2 and the big data cluster device 3 are all stored in the task monitoring device 4) and establishes a preliminary association for the data according to those rule conditions. For example, it records the serial number of the acquired data under the corresponding data retrieval request in the data retrieval request table, to indicate that the data was acquired for that data retrieval request, and marks that data retrieval request as successfully retrieved.
The task monitoring device 4 then triggers the interface processing device 2 to send the retrieved data to the big data cluster device 3 for analysis and processing. The big data cluster device 3 analyzes and processes the data, establishes data associations for the resulting data result, associates the data result with the corresponding data retrieval request in the data retrieval request table, and marks that data retrieval request as having a generated data result.
When the task monitoring device 4 detects in real time that a data result has been generated, it triggers the interface processing device 2 to request the big data cluster device 3 to send the data result.
After receiving this request, the big data cluster device 3 sends the data result to the interface processing device 2 according to the association records of the data retrieval request table.
After receiving the data result, the interface processing device 2 associates it with the corresponding data query request according to the data association relationships recorded in the data retrieval request table and sends it to the corresponding service end processing device 1. After receiving the data result, the service end processing device 1 associates it with the corresponding data query request in the data query request table and may, at the same time, send a notification to the user and display the data result.
If the data acquisition fails, the data can be retrieved again. In this embodiment, the retrieval is preferably repeated up to 5 times, and a retrieval time can also be set: the task monitoring device 4 monitors the time and, once the preset time is reached, triggers the interface processing device 2 to retrieve the data. If the data still cannot be acquired after 5 attempts, the interface processing device 2 records a data retrieval failure for the corresponding data retrieval request in the data retrieval request table, so that the management background can subsequently trigger the abnormal interface manually to acquire the data.
If the interface processing device 2 did not retrieve the data for a reason such as authorization or cost, it marks in the data retrieval request table that the data was not retrieved, together with the reason (such as authorization or cost).
After detecting this marked state in real time, the task monitoring device 4 triggers the relevant devices to carry out the next processing step according to the rule conditions stored in advance. For example, the task monitoring device 4 may trigger the interface processing device 2 to return to the service end processing device 1 a notification that the data was not retrieved, with the reason, such as unauthorized or unpaid. It then waits for the service end processing device 1 to respond, for example by initiating an authorization data query request, and triggers the next processing mechanism according to the processing result.
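A bounded retry of this kind might look like the sketch below. It treats the limit of 5 as the total number of attempts, which is an assumption, since the text does not say whether the first call counts; it also omits the wait for the preset time between attempts. The function acquire_with_retry and the row fields "attempts" and "status" are hypothetical.

```python
from typing import Callable, Dict, Optional

MAX_ATTEMPTS = 5  # the embodiment's preferred limit, read here as total attempts


def acquire_with_retry(fetch: Callable[[], Optional[str]], row: Dict[str, object],
                       max_attempts: int = MAX_ATTEMPTS) -> Optional[str]:
    """Call fetch until it returns data or the attempt limit is reached,
    recording the outcome on the retrieval request row."""
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
        except Exception:
            data = None                 # treat any interface error as a failed attempt
        row["attempts"] = attempt
        if data is not None:
            row["status"] = "success"
            return data
    row["status"] = "failed"            # left for manual triggering by the management background
    return None
```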
3.1.2. If the retrieved data result contains list page data, then when the relevant data needs to be displayed to the user, the interface processing device 2 in this embodiment preferably requests the data in real time from the interface of external data channel one, preferably requesting 10 pages. After the data is successfully acquired, the interface processing device 2 likewise reads the relevant rule conditions from the task monitoring device 4, establishes a preliminary association for the data according to those rule conditions, and marks the corresponding data retrieval request in the data retrieval request table as successfully acquired.
If the data acquisition fails, the data can be retrieved again, preferably up to 5 times. If it still fails after 5 attempts, the interface processing device 2 records in the data retrieval request table that acquisition of the current interface data failed for that retrieval request, so that the abnormal interface can subsequently be triggered manually to acquire the data.
3.1.3. For certain detail items in the enterprise information, such as the details of lawsuits, court announcements and delivery announcements, when the user clicks the details of a lawsuit, court announcement or delivery announcement in the queried enterprise information (that is, the corresponding detail interface data needs to be obtained), the service end processing device 1 first determines whether the corresponding detail page data exists in the interface database of the service end. If it does, the service end processing device 1 displays the data directly at the service end. If it does not, the service end processing device 1 requests the interface processing device 2 to obtain the detail interface data from the corresponding channel; after receiving the request, the interface processing device 2 obtains the data from the corresponding interface, following an acquisition flow similar to the one described above.
After the data is successfully acquired, the interface processing device 2 likewise reads the relevant rule conditions from the task monitoring device 4, establishes a preliminary association for the data according to those rule conditions, and marks the corresponding data retrieval request in the data retrieval request table as successfully acquired.
If acquisition from an interface fails, the interface can be called again, preferably up to 5 times. If it still fails after 5 attempts, the interface processing device 2 records in the data retrieval request table that acquisition of the current interface data failed for that retrieval request, so that the abnormal interface can subsequently be triggered manually to acquire the data.
The same request operation as described above may be performed for other business lists in the data retrieval request table.
3.2. After obtaining the business information and the enterprise relationship network information of the queried enterprise, a user who needs to further query the information of a relationship node in the enterprise relationship network can initiate a request to the interface processing device 2 through the service end processing device 1 to query the business information of that relationship node.
The interface processing device 2 analyzes the data query request and determines, according to the data association records, whether the interface database of the big data cluster device 3 holds business information of the required relationship node stored within a certain time (for example, within one month).
If the interface database of the big data cluster device 3 does not hold business information of the required relationship node from within that time, the interface processing device 2 needs to acquire the data again; it determines that the business information of the relationship node is preferably acquired from channel two, and the acquisition process is the same as the corresponding process described above.
If it does, the interface processing device 2 directly requests the big data cluster device 3 to retrieve the required business information from the interface database.
After the data is successfully acquired, the interface processing device 2 likewise reads the relevant rule conditions from the task monitoring device 4, establishes a preliminary association for the data according to those rule conditions, and marks the corresponding data retrieval request in the data retrieval request table as successfully acquired.
If acquisition from an interface fails, the interface can be called again, preferably up to 5 times. If it still fails after 5 attempts, the interface processing device 2 records in the data retrieval request table that acquisition of the current interface data failed for that retrieval request, so that the abnormal interface can subsequently be triggered manually to acquire the data.
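The choice between reusing stored data and acquiring it again comes down to a freshness check on the stored record. A minimal sketch follows; it approximates "one month" as 30 days, which is an assumption, and the function needs_refetch and its parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

FRESHNESS = timedelta(days=30)  # assumption: "within one month" approximated as 30 days


def needs_refetch(stored_at: Optional[datetime], now: datetime,
                  max_age: timedelta = FRESHNESS) -> bool:
    """True when no stored copy exists or the stored copy is older than max_age."""
    return stored_at is None or now - stored_at > max_age
```

When the function returns False, the stored business information is retrieved from the interface database; otherwise the data is acquired again from channel two.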
3.3. When a user needs to acquire authorization data, an authorization data query request can be initiated through the service end. The interface processing device 2 receives and analyzes the data query request, determines that it is an authorization data query request, and further determines that the relevant authorization data can be obtained through interface channel three. It then obtains the enterprise authorization status information from the authorization interface of channel three in real time and matches it with the corresponding request record. Specifically, the request record can be matched using three condition parameters: the enterprise name of the user (the name of the enterprise being granted authorization), the queried enterprise (the name of the enterprise granting authorization), and the authorization status record.
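Matching on the three condition parameters can be illustrated as a simple lookup over request records. The sketch below is one interpretation only: the record fields "user_enterprise", "queried_enterprise" and "status" and the function match_authorization are assumed names, and an exact match on all three fields is an assumption about how the parameters are combined.

```python
from typing import Dict, List, Optional


def match_authorization(records: List[Dict[str, str]], user_enterprise: str,
                        queried_enterprise: str, status: str) -> Optional[Dict[str, str]]:
    """Return the first request record whose three fields all match, else None."""
    for rec in records:
        if (rec["user_enterprise"] == user_enterprise
                and rec["queried_enterprise"] == queried_enterprise
                and rec["status"] == status):
            return rec
    return None
```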
Further, for the enterprise list in the data retrieval request table, an interface data request can be initiated to channel three in real time.
In an application scenario, the enterprise list in the data retrieval request table may preferably be set to obtain data from the interface of channel three after 12:00 and after 18:00 each day. When the task monitoring device 4 detects that a set time has been reached, it triggers the interface processing device 2 to acquire data from the interface of channel three.
After the data is successfully acquired, the interface processing device 2 likewise reads the relevant rule conditions from the task monitoring device 4, establishes a preliminary association for the data according to those rule conditions, and marks the corresponding data retrieval request in the data retrieval request table as successfully acquired.
If acquisition from an interface fails, the interface can be called again, preferably up to 5 times. If it still fails after 5 attempts, the interface processing device 2 records in the data retrieval request table that acquisition of the current interface data failed for that retrieval request, so that the abnormal interface can subsequently be triggered manually to acquire the data.
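The time-based trigger can be sketched as a check of which scheduled times have passed since the monitor last looked. This is an illustration under assumptions: the monitor is taken to poll more often than once a day, and the names TRIGGER_TIMES and due_triggers are hypothetical.

```python
from datetime import datetime, time
from typing import List

TRIGGER_TIMES = [time(12, 0), time(18, 0)]  # the preferred daily times from the text


def due_triggers(last_check: datetime, now: datetime) -> List[time]:
    """Return the scheduled times on now's date that fall in (last_check, now]."""
    due = []
    for t in TRIGGER_TIMES:
        scheduled = datetime.combine(now.date(), t)
        if last_check < scheduled <= now:
            due.append(t)
    return due
```

Each returned time would correspond to one trigger of the interface processing device 2 to acquire data from channel three.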
4. According to the data retrieval request table, the big data cluster device 3 preferably screens out the data retrieval requests whose data was retrieved successfully and executes the data flow in real time on the associated, successfully retrieved data. Preferably, the big data cluster device 3 may generate a data result report for the acquired data; correspondingly, once the data has been acquired successfully and the data result report generated, the data result report status of the corresponding data retrieval request in the data retrieval request table is updated to "generated". After the task monitoring device 4 detects this data result report status in real time, it triggers the interface processing device 2 to request the big data cluster device 3 to send the data result report and to forward the report to the corresponding service end processing device 1 according to the rule conditions read from the task monitoring device 4.
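The screening and status update in this step can be sketched as a pass over the data retrieval request table. The row fields "key", "acquisition" and "report_status", the function generate_reports and the callable build_report (standing in for the cluster's data flow) are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def generate_reports(table: List[Dict[str, str]],
                     build_report: Callable[[str], str]) -> Dict[str, str]:
    """Build a report for every successfully retrieved row and mark it 'generated'."""
    reports: Dict[str, str] = {}
    for row in table:
        if row.get("acquisition") != "success":
            continue                              # skip rows without retrieved data
        reports[row["key"]] = build_report(row["key"])
        row["report_status"] = "generated"        # the state the task monitor watches for
    return reports
```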
The service end processing device 1 displays data for the corresponding user through the record of the data query request table.
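Step 4 can be illustrated with a short sketch, assuming an in-memory list in place of the data retrieval request table. The class `RetrievalRequest`, the functions `generate_reports` and `dispatch_generated`, and the `delivered` flag are hypothetical names introduced only for illustration; the source specifies only the "generated" status value and the order of the triggers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RetrievalRequest:
    """One row of a hypothetical data retrieval request table."""
    request_id: str
    service_end: str                # service end that should receive the report
    acquired: bool = False          # set once interface data was acquired
    report_status: str = "pending"  # becomes "generated" once a report exists
    delivered: bool = False         # assumption: avoids sending a report twice


def generate_reports(
    table: List[RetrievalRequest],
    build_report: Callable[[RetrievalRequest], str],
) -> Dict[str, str]:
    """Cluster step: build reports only for successfully acquired requests."""
    reports: Dict[str, str] = {}
    for req in table:
        if req.acquired and req.report_status != "generated":
            reports[req.request_id] = build_report(req)
            req.report_status = "generated"
    return reports


def dispatch_generated(
    table: List[RetrievalRequest],
    reports: Dict[str, str],
    send: Callable[[str, str], None],
) -> List[str]:
    """Monitor step: forward each generated, not yet delivered report."""
    sent: List[str] = []
    for req in table:
        if req.report_status == "generated" and not req.delivered:
            send(req.service_end, reports[req.request_id])
            req.delivered = True
            sent.append(req.request_id)
    return sent
```

In a real deployment the monitor would poll or subscribe to status changes rather than scan a list; the scan is used here only to keep the example self-contained.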
The above embodiment takes a user's request to query enterprise-related data as an example to illustrate the solution of the present invention. However, it should be clear to those skilled in the art that the solution can be applied to various business scenarios, such as querying a user's credit data, tax payment data, and so on. The specific enterprise-data query steps above are therefore provided only to aid understanding of the solution and should not be construed as limiting its concept.
The multi-source data integration method based on big data technology provided by this embodiment, and the system formed by the corresponding devices, can effectively combine multiple service ends with a big data system and integrate multiple third-party data interfaces, forming an effective system through which users query data. The solution suits a variety of service scenarios: when a user needs to query various data, the user only has to initiate a data query request through the user's own service end. The interface processing device 2 then processes the request uniformly and selects an optimal acquisition channel; after the corresponding data is obtained, the data is associated with the corresponding data retrieval request and data query request. The big data cluster device 3 further processes the data uniformly, performing data analysis, mining, integration, and association to obtain the various ETL-processed service data, and the data result is returned to the user's service end through the interface processing device 2. This forms a true big data system and makes subsequent direct retrieval of the related data convenient. The big data system can enhance the reusability and availability of data and reduce data redundancy. It also saves data retrieval resources and reduces interface cost: because the interface processing device 2 interfaces uniformly with the various service ends, the big data cluster device 3 only needs to open one interface, namely to the interface processing device 2. Moreover, the user can initiate a query request through the user's own service end alone, which saves query resources for the user.
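The data reuse summarized above rests on the decision sequence recited in claim 2: reuse an existing data result if there is one, otherwise attach the query to an identical retrieval request already in the table, otherwise create a new retrieval request. The sketch below shows that sequence under stated assumptions. The function `handle_query`, the use of a `query_key` string to decide that two requests are "the same", the dictionaries standing in for the tables, and the `ret-N` identifiers are all hypothetical.

```python
from typing import Any, Dict, Tuple


def handle_query(
    query_id: str,
    query_key: str,
    results: Dict[str, Any],
    retrieval_table: Dict[str, str],
    links: Dict[str, Tuple[str, str]],
) -> str:
    """Associate a data query request with a result or a retrieval request.

    results:         query_key -> stored data result (hypothetical store)
    retrieval_table: query_key -> retrieval request id (hypothetical table)
    links:           query_id  -> (kind, reference) association record
    """
    # 1. A data result already exists: associate it with the query directly.
    if query_key in results:
        links[query_id] = ("result", query_key)
        return "reused_result"
    # 2. The same request is already in the retrieval table: attach to it.
    if query_key in retrieval_table:
        links[query_id] = ("retrieval", retrieval_table[query_key])
        return "joined_existing_retrieval"
    # 3. Otherwise create a new retrieval request for this query.
    new_id = "ret-" + str(len(retrieval_table) + 1)
    retrieval_table[query_key] = new_id
    links[query_id] = ("retrieval", new_id)
    return "new_retrieval"
```

Because two users asking for the same data share one retrieval request, the third-party interface is called once rather than once per user, which is the interface-cost saving the paragraph above describes.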
It should be noted that those skilled in the art will understand that all or part of the steps of the above method may be carried out by hardware under the direction of program instructions. The program instructions may be stored in a computer-readable storage medium or storage device and, when executed, perform the steps of the multi-source data integration method based on big data technology. Such storage media or storage devices include, but are not limited to, various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.
Accordingly, an embodiment of the present invention further provides a computer-readable storage device storing a computer program which, when executed by a processor, implements the above multi-source data integration method based on big data technology.
Further, the present invention also provides a corresponding mobile terminal and system to implement the above multi-source data integration method based on big data technology, specifically:
a mobile terminal, comprising:
a processor adapted to execute program instructions;
and the storage device is suitable for storing program instructions which are suitable for being loaded and executed by a processor to realize the multi-source data integration method based on the big data technology.
A multi-source data integration system based on big data technology comprises a server; the server comprises a processor and a storage device;
a processor adapted to execute program instructions;
and the storage device is suitable for storing program instructions which are suitable for being loaded and executed by a processor to realize the multi-source data integration method based on the big data technology.
The above description is only a preferred embodiment of the present invention; for those skilled in the art, the content of this description should not be construed as limiting the present invention.

Claims (10)

1. A multi-source data integration method based on big data technology is characterized by comprising the following steps:
a user initiates a data query request through a service end processing device;
the interface processing device receives and analyzes the data query request, judges a corresponding data acquisition channel, acquires corresponding data through the corresponding channel and sends the corresponding data to the big data cluster device; establishing a data association relation and storing the data association relation; the big data cluster device executes data streams in real time, forms and stores corresponding data results, and performs data association; the task monitoring device monitors the data state in real time and triggers and executes a corresponding processing mechanism according to a preset triggering condition and a processing mechanism thereof.
2. The multi-source data integration method of claim 1, wherein analyzing a data query request and establishing a data association relationship comprises determining whether the data query request has generated a data result, if so, directly retrieving a corresponding data result, and associating the data result with the data query request; if not, judging whether the same data query request is associated in the data calling request table, and if so, associating the data query request with the same data query request in the data calling request table; and if not, generating a new data calling request corresponding to the data query request in the data calling request table.
3. The multi-source data integration method of claim 2, wherein before determining whether the data query request has generated a data result, further comprising analyzing whether the user has initiated the same data query request, and if so, triggering the user to select to initiate a latest data query request or view a data result of a historical query; if the user selects to initiate the latest data query request, judging whether the data query request generates a data result; if the user selects to check the data result of the historical query, triggering to call the data result of the historical query; if not, continuously judging whether the data query request generates a data result or not.
4. The multi-source data integration method of claim 3, wherein retrieving data results of historical queries comprises triggering a service-side processing device to determine whether the data results of the historical queries exist at a service side, and if so, triggering retrieval of the data results from the service side and displaying the data results; if not, triggering the interface processing device to request the big data cluster device to call the data result of the historical query.
5. The multi-source data integration method according to claim 1, wherein the establishing of the data association relationship comprises associating a data query request table of the business-side processing device with a data retrieval request table of the interface processing device; establishing association between the acquired data and corresponding data calling requests in a data calling request table; establishing association between the data result and the corresponding data calling request in the data calling request table; and establishing association between the data result and the corresponding data query request through a data retrieval request table.
6. The multi-source data integration method of claim 1, wherein analyzing the data query request further comprises determining whether a user obtains authorization for an object of the data query request, associating authorized data with a corresponding data query request in the data query request table, and associating with a corresponding data retrieval request in the data retrieval request table through the data query request table.
7. The multi-source data integration method according to claim 1, further comprising the step of screening out data retrieval requests with successful data retrieval by the big data cluster device, executing data streams on associated data in real time to form corresponding data results and store the data results, and associating the data results with the corresponding data retrieval requests.
8. The multi-source data integration method of claim 1, wherein the task monitoring device monitoring the data status in real time and triggering and executing the corresponding processing mechanism according to the preset triggering condition and the processing mechanism thereof comprises: the task monitoring device monitoring, in real time, the data query request table of the service-side processing device and the data states marked in the data retrieval request tables of the interface processing device and the big data cluster device, and triggering the corresponding device to execute the corresponding processing mechanism according to the preset triggering condition and the processing mechanism thereof.
9. A multi-source data integration system based on big data technology is characterized by comprising a business end processing device, an interface processing device, a big data cluster device, a task monitoring device and an external data source;
the service end processing device: through which a user initiates a data query request;
the interface processing device: receiving and analyzing the data query request, judging a corresponding data acquisition channel, acquiring corresponding data through the corresponding channel and sending the data to the big data cluster device; establishing a data association relation and storing the data association relation;
the big data cluster device: executing data streams in real time, forming and storing corresponding data results, and performing data association and storage;
the task monitoring device: monitoring the data state in real time, and triggering and executing a corresponding processing mechanism according to a preset triggering condition and a processing mechanism thereof;
the external data source: providing an external data acquisition channel.
10. The multi-source data integration system of claim 9, wherein the task monitoring device monitors the data query request table of the service-side processing device and the data states marked in the data retrieval request tables of the interface processing device and the big data cluster device in real time, and triggers the corresponding device to execute the corresponding processing mechanism according to the preset triggering condition and the processing mechanism thereof.
CN202110422153.5A 2021-04-20 2021-04-20 Multi-source data integration method and system based on big data technology Active CN113051332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110422153.5A CN113051332B (en) 2021-04-20 2021-04-20 Multi-source data integration method and system based on big data technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110422153.5A CN113051332B (en) 2021-04-20 2021-04-20 Multi-source data integration method and system based on big data technology

Publications (2)

Publication Number Publication Date
CN113051332A true CN113051332A (en) 2021-06-29
CN113051332B CN113051332B (en) 2023-04-28

Family

ID=76519730

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110422153.5A Active CN113051332B (en) 2021-04-20 2021-04-20 Multi-source data integration method and system based on big data technology

Country Status (1)

Country Link
CN (1) CN113051332B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102279849A (en) * 2010-06-09 2011-12-14 中兴通讯股份有限公司 Method and system for big data query
WO2012125166A1 (en) * 2011-03-17 2012-09-20 Hewlett-Packard Development Company L.P. Data source analytics
CN105183834A (en) * 2015-08-31 2015-12-23 上海电科智能系统股份有限公司 Ontology library based transportation big data semantic application service method
US20170262495A1 (en) * 2012-08-13 2017-09-14 Aria Solutions, Inc. Enhanced high performance real-time relational database system and methods for using same
CN109582659A (en) * 2018-12-04 2019-04-05 郑州云海信息技术有限公司 Request recording method, system, device and the readable storage medium storing program for executing of processing links
CN109656958A (en) * 2018-12-18 2019-04-19 北京小米移动软件有限公司 Data query method and system
US20200320075A1 (en) * 2019-04-02 2020-10-08 International Business Machines Corporation Method of Extracting Relationships from a Nosql Database
CN111767311A (en) * 2020-06-05 2020-10-13 浙江大搜车软件技术有限公司 Variable request method, system, computer device and computer readable storage medium
CN112527848A (en) * 2020-12-22 2021-03-19 苏州科达科技股份有限公司 Multi-data-source-based report data query method, device, system and storage medium
CN112632493A (en) * 2020-12-18 2021-04-09 中国建设银行股份有限公司 Authorization verification management method and system based on client privacy protection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
耿玉水,寇纪淞: "云计算下异构数据集成模型的构建", 《济南大学学报(自然科学版)》 *
耿玉水: "面向集团企业的数据集成模型构建方法研究", 《中国博士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN113051332B (en) 2023-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230403

Address after: 523000 room 1301, unit 2, building 4, Tian'an Digital City, No. 1, Huangjin Road, Nancheng street, Dongguan City, Guangdong Province

Applicant after: Dongguan Mengda Group Co.,Ltd.

Applicant after: Dongguan Mengda Data Technology Co.,Ltd.

Address before: Room 701-703, 7th floor, Goldman Sachs technology building, phase II, Goldman Sachs Technology Park, 5 Longxi Road, Zhouxi, Nancheng District, Dongguan City, Guangdong Province, 523000

Applicant before: DONGGUAN MENGDA PLASTICIZING SCIENCE & TECHNOLOGY CO.,LTD.

GR01 Patent grant