CN110502662A - Heterogeneous data processing system and method - Google Patents

Heterogeneous data processing system and method

Info

Publication number
CN110502662A
Authority
CN
China
Prior art keywords
data
server
layer
heterogeneous
acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910781410.7A
Other languages
Chinese (zh)
Inventor
周会群
王玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xinyida Computing Technology Co Ltd
Original Assignee
Nanjing Xinyida Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles

Abstract

The present invention relates to the technical field of data processing, and in particular to a heterogeneous data processing system and method. The system comprises a heterogeneous data acquisition layer, a data analysis layer, an application layer, a data acquisition server, a real-time database and a database. The heterogeneous data acquisition layer comprises a data acquisition terminal and a video acquisition terminal; the data analysis layer comprises an application server and a video server; the application layer is a remote client; and the database uses SQL Server. The architecture can accommodate information management subsystems that are located in different regions, run on different network structures and are functionally relatively independent, and it can tightly integrate the information shared by the existing subsystems without affecting production. By matching the parameters of the heterogeneous data layer, the corresponding multi-data-domain processing unit is found and used to generate the business object. This simplifies the processing scenarios of the heterogeneous data layer, keeps the handling of multiple data domains out of the data-layer development phase, and safeguards development efficiency.

Description

Heterogeneous data processing system and method
Technical field
The present invention relates to the technical field of data processing, and specifically to a heterogeneous data processing system and method.
Background art
With the continuous development of information technology, governments and enterprises are becoming increasingly information-driven. The business systems of each department generate ever larger volumes of data, data structures are becoming more complex, and the number of data sources keeps growing. Operating on a heterogeneous data layer is therefore a common problem. A heterogeneous data layer is a set of data operation objects that follow the same interface but have identical or different implementations. In the prior art, operations on the heterogeneous data layer are implemented in the framework layer, and processing code is usually written separately for each data layer. As this code accumulates, logical boundaries become blurred, mutual calls become disordered, and development efficiency is low.
Summary of the invention
The purpose of the present invention is to provide a heterogeneous data processing system and method that remedies one or more of the defects identified in the background art above.
To achieve this purpose, the present invention provides the following technical solution:
A heterogeneous data processing system comprises a heterogeneous data acquisition layer, a data analysis layer, an application layer, a data acquisition server, a real-time database and a database. The heterogeneous data acquisition layer comprises a data acquisition terminal and a video acquisition terminal, the data analysis layer comprises an application server and a video server, the application layer is a remote client, and the database uses SQL Server.
Preferably, the heterogeneous data acquisition layer processes the real-time data arriving from the different acquisition terminals and integrates it as heterogeneous data. The data acquisition terminal and the video acquisition terminal consist of multi-function hardware devices, namely various sensors and video capture cards, and obtain real-time digitized information from the site.
Preferably, the data analysis layer further performs data filtering. The data filtering method comprises:
S1: read the heterogeneous big data and split it according to data structure to obtain standardized data;
S2: calculate the error rate of the standardized data;
S3: delete from the standardized data the abnormal data whose error rate is greater than the error threshold, to obtain filtered data;
S4: sort the filtered data by error rate;
S5: delete the leading 10% and trailing 10% of the sorted queue to obtain the filter result;
S6: output the filter result and deliver it to the application server.
The sub-steps for calculating the error rate of the standardized data are:
S1: let x_1, x_2, x_3, ..., x_n be the data values of n standardized data items; the arithmetic mean X' is then
$$X' = \frac{1}{n}\sum_{i=1}^{n} x_i$$
S2: the error rate s of the standardized data is calculated from the arithmetic mean X' by the formula [formula omitted in the source text]
where n is a positive integer with no upper limit, i ranges from 1 to n, and x_i is the data value of the i-th standardized data item.
Preferably, the error threshold is calculated as follows: let S_1, S_2, S_3, ..., S_n be the error rates of n standardized data items; the error threshold S' is then [formula omitted in the source text]
Preferably, heterogeneous data integration is carried out using ontology technology. Ontology technology establishes semantic matching between heterogeneous data through ontology mapping, which masks semantic heterogeneity and thereby resolves it. Ontology technology includes the single-ontology approach, the multiple-ontology approach and the hybrid approach.
Preferably, the hardware of the data acquisition terminal comprises several sensors, a slave (lower-level) computer and a serial port card. The real-time data captured by each sensor and the real-time data of the slave computer are transferred to the data acquisition terminal through the serial port card, and the data acquisition terminal sends them over the network to the data acquisition server of the system. The data passing through the serial port card is heterogeneous real-time data of various types.
Preferably, data is transmitted between the application server, the video server and the remote client over the network, using an application server and client based on the TCP/IP or UDP protocol.
Preferably, the real-time database serves as a data knowledge base providing efficient data storage. The basic unit of the real-time database is the real-time transaction, and the real-time database comprises a transaction model, concurrency control and an in-memory database.
Preferably, the in-memory database redesigns the algorithms and data structures used for query processing, concurrency control and recovery.
In another aspect, the present invention also provides a heterogeneous data processing method that uses the heterogeneous data processing system described above and comprises the following steps:
S1: the data acquisition terminal and the video acquisition terminal in the heterogeneous data acquisition layer acquire data and video information;
S2: the data acquisition terminal delivers the acquired information to the data acquisition server for processing, after which it is sent to the real-time database;
S3: the real-time database delivers the data to the application server for processing, while the video acquisition terminal transmits the video information directly to the video server for storage;
S4: after processing, the application server sends the data to the remote client over the network and stores it in the database as a backup, while the video server sends the video information to the remote client.
Compared with the prior art, the beneficial effects of the present invention are:
1. The architecture of the heterogeneous data processing system and method can accommodate information management subsystems that are located in different regions, run on different network structures and are functionally relatively independent, and it can tightly integrate the information shared by the existing subsystems without affecting production. By matching the parameters of the heterogeneous data layer, the corresponding multi-data-domain processing unit is found and used to generate the business object. This simplifies the processing scenarios of the heterogeneous data layer and keeps the handling of multiple data domains out of the data-layer development phase, so developers can work as flexibly and easily as when developing a single-data-layer application, undisturbed by the details that a multi-domain scenario brings, which safeguards development efficiency.
2. By filtering heterogeneous data, the heterogeneous data processing system and method can normalize data of different structures with a unified processing method, reducing the error rate of heterogeneous data and improving its logical consistency.
Brief description of the drawings
Fig. 1 is a structural diagram of the system of the present invention;
Fig. 2 is a schematic diagram of the data acquisition terminal of the present invention;
Fig. 3 is a structural diagram of the single-ontology approach of the present invention;
Fig. 4 is a structural diagram of the multiple-ontology approach of the present invention;
Fig. 5 is a structural diagram of the hybrid-ontology approach of the present invention.
In the figures: heterogeneous data acquisition layer 1; data analysis layer 2; application layer 3; data acquisition server 4; real-time database 5; database 6.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment 1
A heterogeneous data processing system, as shown in Fig. 1, comprises a heterogeneous data acquisition layer 1, a data analysis layer 2, an application layer 3, a data acquisition server 4, a real-time database 5 and a database 6. The heterogeneous data acquisition layer 1 comprises a data acquisition terminal and a video acquisition terminal, the data analysis layer 2 comprises an application server and a video server, the application layer 3 is a remote client, and the database 6 uses SQL Server.
Further, as shown in Fig. 2, the hardware of the data acquisition terminal comprises several sensors, a slave (lower-level) computer and a serial port card. The real-time data captured by each sensor and the real-time data of the slave computer are transferred to the data acquisition terminal through the serial port card, and the data acquisition terminal sends them over the network to the data acquisition server 4 of the system. The data passing through the serial port card is heterogeneous real-time data of various types.
Specifically, data is transmitted between the application server, the video server and the remote client over the network, using an application server and client based on the TCP/IP or UDP protocol.
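As a rough illustration of the UDP option mentioned here, the following Python sketch sends one sensor reading as a JSON datagram over the loopback interface and reads it back. The message layout, the sensor name "temp-1" and the function names are assumptions made for the example and do not come from the patent; a TCP variant would use SOCK_STREAM with connect/accept instead.

```python
import json
import socket


def send_reading_udp(reading, address):
    """Serialize a reading as JSON and send it as a single UDP datagram."""
    payload = json.dumps(reading).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def receive_reading_udp(sock):
    """Block until one datagram arrives on sock and decode it from JSON."""
    data, _sender = sock.recvfrom(4096)
    return json.loads(data.decode("utf-8"))


def demo():
    """Bind a receiver on loopback, send one reading to it, and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2.0)         # avoid blocking forever if the datagram is lost
        send_reading_udp({"sensor": "temp-1", "value": 21.5}, server.getsockname())
        return receive_reading_udp(server)
```

UDP gives no delivery guarantee, so a deployment that cannot tolerate lost readings would more likely choose the TCP/IP option for the data path.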
In addition, the real-time database 5 serves as a data knowledge base providing efficient data storage. The basic unit of the real-time database 5 is the real-time transaction, and the real-time database comprises a transaction model, concurrency control and an in-memory database.
It is worth noting that the in-memory database redesigns the algorithms and data structures used for query processing, concurrency control and recovery.
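The patent does not specify how the real-time transaction or the concurrency control is implemented. The sketch below is one minimal interpretation in Python, assuming that a real-time transaction is a batch of updates that must commit before a deadline and that concurrency control is a single lock around an in-memory dictionary. The class name RealTimeStore and its methods are hypothetical.

```python
import threading
import time


class RealTimeStore:
    """In-memory key-value store with lock-based concurrency control (illustrative only)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def run_transaction(self, updates, deadline):
        """Apply all updates atomically unless the deadline has already passed.

        deadline is a time.monotonic() timestamp. Returns True on commit,
        False if the transaction missed its deadline and was discarded.
        """
        with self._lock:
            if time.monotonic() > deadline:
                return False
            self._data.update(updates)
            return True

    def read(self, key):
        """Return the current value for key, or None if it is absent."""
        with self._lock:
            return self._data.get(key)
```

A single global lock is the simplest possible scheme; a production real-time database would typically use finer-grained or priority-aware concurrency control.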
In another aspect, the invention also provides a heterogeneous data processing method, which comprises the following steps:
S1: the data acquisition terminal and the video acquisition terminal in the heterogeneous data acquisition layer 1 acquire data and video information;
S2: the data acquisition terminal delivers the acquired information to the data acquisition server 4 for processing, after which it is sent to the real-time database 5;
S3: the real-time database 5 delivers the data to the application server for processing, while the video acquisition terminal transmits the video information directly to the video server for storage;
S4: after processing, the application server sends the data to the remote client over the network and stores it in the database 6 as a backup, while the video server sends the video information to the remote client.
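The data flow of steps S1 to S4 can be sketched in Python as follows. This is an in-process illustration only: the servers and databases are stood in for by plain lists, and the class name Pipeline, the record layout and the "processed" status flag are assumptions made for the example rather than details from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Stand-ins for the components in Fig. 1 (all hypothetical list-backed stores)."""
    realtime_db: list = field(default_factory=list)   # real-time database 5
    database: list = field(default_factory=list)      # database 6 (backup)
    video_store: list = field(default_factory=list)   # video server storage
    client_inbox: list = field(default_factory=list)  # remote client

    def acquire(self, readings, frames):
        """S1: the acquisition terminals collect sensor readings and video frames."""
        return list(readings), list(frames)

    def collect(self, readings):
        """S2: the acquisition server processes readings and sends them to the real-time database."""
        processed = [{"sensor": name, "value": float(raw)} for name, raw in readings]
        self.realtime_db.extend(processed)

    def process(self, frames):
        """S3: the application server processes the data; video goes straight to the video server."""
        results = [dict(record, status="processed") for record in self.realtime_db]
        self.video_store.extend(frames)
        return results

    def deliver(self, results):
        """S4: results go to the remote client and the backup database; video goes to the client."""
        self.client_inbox.extend(results)
        self.database.extend(results)
        self.client_inbox.extend(self.video_store)
```

Calling acquire, collect, process and deliver in that order reproduces the S1 to S4 sequence; note that the video path bypasses the real-time database, as in step S3.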
The architecture of the heterogeneous data processing system of this embodiment can accommodate information management subsystems that are located in different regions, run on different network structures and are functionally relatively independent, and it can tightly integrate the information shared by the existing subsystems without affecting production. By matching the parameters of the heterogeneous data layer, the corresponding multi-data-domain processing unit is found and used to generate the business object. This simplifies the processing scenarios of the heterogeneous data layer and keeps the handling of multiple data domains out of the data-layer development phase, so developers can work as flexibly and easily as when developing a single-data-layer application, undisturbed by the details that a multi-domain scenario brings, which safeguards development efficiency.
Embodiment 2
As a second embodiment of the present invention, the heterogeneous data acquisition layer 1 processes the real-time data arriving from the different acquisition terminals and integrates it as heterogeneous data. The data acquisition terminal and the video acquisition terminal consist of multi-function hardware devices, namely various sensors and video capture cards, and obtain real-time digitized information from the site.
Further, the data analysis layer 2 also performs data filtering. The data filtering method comprises:
S1: read the heterogeneous big data and split it according to data structure to obtain standardized data;
S2: calculate the error rate of the standardized data;
S3: delete from the standardized data the abnormal data whose error rate is greater than the error threshold, to obtain filtered data;
S4: sort the filtered data by error rate;
S5: delete the leading 10% and trailing 10% of the sorted queue to obtain the filter result;
S6: output the filter result and deliver it to the application server.
The sub-steps for calculating the error rate of the standardized data are:
S1: let x_1, x_2, x_3, ..., x_n be the data values of n standardized data items; the arithmetic mean X' is then
$$X' = \frac{1}{n}\sum_{i=1}^{n} x_i$$
S2: the error rate s of the standardized data is calculated from the arithmetic mean X' by the formula [formula omitted in the source text]
where n is a positive integer with no upper limit, i ranges from 1 to n, and x_i is the data value of the i-th standardized data item.
The error threshold is calculated as follows: let S_1, S_2, S_3, ..., S_n be the error rates of n standardized data items; the error threshold S' is then [formula omitted in the source text]
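The filtering steps S2 to S6 can be sketched in Python as follows. Because the formulas for the error rate and the error threshold are not reproduced in this text, the sketch substitutes two explicit assumptions: the error rate of a value is its relative deviation from the arithmetic mean, and the error threshold is the arithmetic mean of the error rates. Neither should be read as the patent's actual formula, and the function names are hypothetical.

```python
from statistics import fmean


def error_rates(values):
    """ASSUMED definition: relative deviation of each value from the arithmetic mean X'."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("arithmetic mean is zero; relative deviation is undefined")
    return [abs(x - mean) / abs(mean) for x in values]


def filter_records(values, trim_fraction=0.10):
    """Apply steps S2 to S6 to a list of numeric data values."""
    rates = error_rates(values)                    # S2: error rate per value
    threshold = fmean(rates)                       # ASSUMED threshold: mean of the error rates
    kept = [(r, x) for r, x in zip(rates, values) if r <= threshold]  # S3: drop abnormal data
    kept.sort(key=lambda pair: pair[0])            # S4: sort by error rate
    k = int(len(kept) * trim_fraction)             # S5: number of items to trim from each end
    trimmed = kept[k:len(kept) - k] if k else kept
    return [x for _, x in trimmed]                 # S6: filter result for the application server
```

For example, with twenty values of 10 and one value of 100, the outlier is removed in S3 and two items are trimmed from each end in S5, leaving sixteen values. Step S1 (splitting raw heterogeneous input by data structure) is outside the scope of this sketch.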
By filtering heterogeneous data, the heterogeneous data processing system of this embodiment can normalize data of different structures with a unified processing method, reducing the error rate of heterogeneous data and improving its logical consistency.
Embodiment 3
As a third embodiment of the present invention, heterogeneous data integration is carried out using ontology technology. Ontology technology establishes semantic matching between heterogeneous data through ontology mapping, which masks semantic heterogeneity and thereby resolves it. Ontology technology includes the single-ontology approach, the multiple-ontology approach and the hybrid approach.
As shown in Fig. 3, the single-ontology approach defines one globally shared ontology as the standard, which provides a shared vocabulary for the semantics to be expressed. All data sources are associated with the global ontology. This approach suits situations in which the data sources change little. Its drawback is sensitivity to changes in the data sources: a change in a data source changes the domain concepts expressed by the ontology. This drawback led to the multiple-ontology approach.
As shown in Fig. 4, in the multiple-ontology approach each data source is described by its own local ontology, and mapping relations must be established between the ontologies of the data sources. The advantage of this approach is that each source ontology is built independently of the other data sources, so adding, changing or deleting a data source changes the ontologies very little. The approach can simplify the integration task and supports changes in the data sources. Its drawback is that, because there is no unified global ontology, the mapping relations between local ontologies are complex.
As shown in Fig. 5, the hybrid-ontology approach combines the two preceding approaches. As in the multiple-ontology approach, each data source has its own local ontology, but there is also a global ontology that serves as a shared vocabulary. The advantage of the hybrid approach is that adding or updating a data source does not require extensive changes to the mappings or the shared vocabulary.
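To make the hybrid approach concrete, the Python sketch below gives each data source a local ontology (a mapping from its own field names to shared terms) and keeps one shared vocabulary. The source names "plant_a" and "plant_b", the field names and the function to_shared are invented for illustration; real ontology mapping would also have to reconcile units and relationships, which this sketch ignores.

```python
# Global shared vocabulary (the "global ontology" of the hybrid approach).
SHARED_VOCABULARY = {"temperature", "pressure"}

# One local ontology per data source: local term -> shared term.
LOCAL_ONTOLOGIES = {
    "plant_a": {"temp_c": "temperature", "press_kpa": "pressure"},
    "plant_b": {"T": "temperature", "P": "pressure"},
}


def to_shared(source, record):
    """Translate a record from a source's local terms into the shared vocabulary.

    Fields with no mapping into the shared vocabulary are dropped.
    """
    mapping = LOCAL_ONTOLOGIES[source]
    result = {}
    for local_term, value in record.items():
        shared_term = mapping.get(local_term)
        if shared_term in SHARED_VOCABULARY:
            result[shared_term] = value
    return result
```

Adding a new data source only requires a new entry in LOCAL_ONTOLOGIES; the existing mappings and the shared vocabulary stay untouched, which is the advantage the text attributes to the hybrid approach.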
The architecture of the heterogeneous data processing system and method of the present invention can accommodate information management subsystems that are located in different regions, run on different network structures and are functionally relatively independent, and it can tightly integrate the information shared by the existing subsystems without affecting production. By matching the parameters of the heterogeneous data layer, the corresponding multi-data-domain processing unit is found and used to generate the business object. This simplifies the processing scenarios of the heterogeneous data layer and keeps the handling of multiple data domains out of the data-layer development phase, so developers can work as flexibly and easily as when developing a single-data-layer application, undisturbed by the details that a multi-domain scenario brings, which safeguards development efficiency. By filtering heterogeneous data, data of different structures can be normalized with a unified processing method, reducing the error rate of heterogeneous data and improving its logical consistency.
The basic principles, main features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited to the above embodiments; the embodiments and the description merely set out preferred examples of the invention and are not intended to limit it. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (10)

1. A heterogeneous data processing system, characterized by comprising a heterogeneous data acquisition layer (1), a data analysis layer (2), an application layer (3), a data acquisition server (4), a real-time database (5) and a database (6), wherein the heterogeneous data acquisition layer (1) comprises a data acquisition terminal and a video acquisition terminal, the data analysis layer (2) comprises an application server and a video server, the application layer is a remote client, and the database (6) uses SQL Server.
2. The heterogeneous data processing system according to claim 1, characterized in that the heterogeneous data acquisition layer (1) processes the real-time data arriving from the different acquisition terminals and integrates it as heterogeneous data, and the data acquisition terminal and the video acquisition terminal consist of multi-function hardware devices, namely various sensors and video capture cards, and obtain real-time digitized information from the site.
3. The heterogeneous data processing system according to claim 1, characterized in that the data analysis layer (2) further performs data filtering, and the data filtering method comprises:
S1: read the heterogeneous big data and split it according to data structure to obtain standardized data;
S2: calculate the error rate of the standardized data;
S3: delete from the standardized data the abnormal data whose error rate is greater than the error threshold, to obtain filtered data;
S4: sort the filtered data by error rate;
S5: delete the leading 10% and trailing 10% of the sorted queue to obtain the filter result;
S6: output the filter result and deliver it to the application server;
the sub-steps for calculating the error rate of the standardized data being:
S1: let x_1, x_2, x_3, ..., x_n be the data values of n standardized data items; the arithmetic mean X' is then
$$X' = \frac{1}{n}\sum_{i=1}^{n} x_i$$
S2: the error rate s of the standardized data is calculated from the arithmetic mean X' by the formula [formula omitted in the source text]
where n is a positive integer with no upper limit, i ranges from 1 to n, and x_i is the data value of the i-th standardized data item.
4. The heterogeneous data processing system according to claim 1, characterized in that the error threshold is calculated as follows: let S_1, S_2, S_3, ..., S_n be the error rates of n standardized data items; the error threshold S' is then [formula omitted in the source text]
5. The heterogeneous data processing system according to claim 1, characterized in that heterogeneous data integration is carried out using ontology technology, the ontology technology establishes semantic matching between heterogeneous data through ontology mapping, masking semantic heterogeneity and thereby resolving it, and the ontology technology includes the single-ontology approach, the multiple-ontology approach and the hybrid approach.
6. The heterogeneous data processing system according to claim 1, characterized in that the hardware of the data acquisition terminal comprises several sensors, a slave (lower-level) computer and a serial port card, the real-time data captured by each sensor and the real-time data of the slave computer are transferred to the data acquisition terminal through the serial port card, the data acquisition terminal sends them over the network to the data acquisition server (4) of the system, and the data passing through the serial port card is heterogeneous real-time data of various types.
7. The heterogeneous data processing system according to claim 6, characterized in that data is transmitted between the application server, the video server and the remote client over the network, using an application server and client based on the TCP/IP or UDP protocol.
8. The heterogeneous data processing system according to claim 1, characterized in that the real-time database (5) serves as a data knowledge base providing efficient data storage, the basic unit of the real-time database (5) is the real-time transaction, and the real-time database comprises a transaction model, concurrency control and an in-memory database.
9. The heterogeneous data processing system according to claim 8, characterized in that the in-memory database redesigns the algorithms and data structures used for query processing, concurrency control and recovery.
10. A heterogeneous data processing method using the heterogeneous data processing system according to any one of claims 1-9, characterized by comprising the following steps:
S1: the data acquisition terminal and the video acquisition terminal in the heterogeneous data acquisition layer (1) acquire data and video information;
S2: the data acquisition terminal delivers the acquired information to the data acquisition server (4) for processing, after which it is sent to the real-time database (5);
S3: the real-time database (5) delivers the data to the application server for processing, while the video acquisition terminal transmits the video information directly to the video server for storage;
S4: after processing, the application server sends the data to the remote client over the network and stores it in the database (6) as a backup, while the video server sends the video information to the remote client.
CN201910781410.7A 2019-08-23 2019-08-23 Heterogeneous data processing system and method Pending CN110502662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910781410.7A CN110502662A (en) 2019-08-23 2019-08-23 Heterogeneous data processing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910781410.7A CN110502662A (en) 2019-08-23 2019-08-23 Heterogeneous data processing system and method

Publications (1)

Publication Number Publication Date
CN110502662A (en) 2019-11-26

Family

ID=68589010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910781410.7A Pending CN110502662A (en) 2019-08-23 2019-08-23 Heterogeneous data processing system and method

Country Status (1)

Country Link
CN (1) CN110502662A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656445A (en) * 2021-08-26 2021-11-16 五八同城信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN113806513A (en) * 2021-09-30 2021-12-17 中国人民解放军国防科技大学 Question-answering system construction method and system based on knowledge graph in military field

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855280A (en) * 2012-07-31 2013-01-02 北京壹人壹本信息科技有限公司 Heterogeneous data processing method and device
CN108121778A (en) * 2017-12-14 2018-06-05 浙江航天恒嘉数据科技有限公司 A kind of heterogeneous database exchange and cleaning system and method
CN108804533A (en) * 2018-05-04 2018-11-13 佛山科学技术学院 A kind of filter method and device of isomery big data information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
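谢雄程: "Research on design methods for heterogeneous real-time data processing systems" (异构实时数据处理系统的设计方法研究), 《现代计算机(专业版)》 (Modern Computer, Professional Edition) *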

Similar Documents

Publication Publication Date Title
CN110688495B (en) Method and device for constructing knowledge graph model of event information and storage medium
CN107545047B (en) The querying method and terminal device of user right data
US10237295B2 (en) Automated event ID field analysis on heterogeneous logs
US10579827B2 (en) Event processing system to estimate unique user count
Srivastava et al. Operator placement for in-network stream query processing
US10621180B2 (en) Attribute-based detection of anomalous relational database queries
EP3913997B1 (en) Generating wireless network access point models using clustering techniques
CN109842628A (en) A kind of anomaly detection method and device
US9992269B1 (en) Distributed complex event processing
CN110502662A (en) Heterogeneous data processing system and method
CN104378370A (en) Secure use method of privacy data in cloud computation
CN112600697B (en) QoS prediction method and system based on federal learning, client and server
Preuveneers et al. SAMURAI: A batch and streaming context architecture for large-scale intelligent applications and environments
WO2022057525A1 (en) Method and device for data retrieval, electronic device, and storage medium
CN112968873B (en) Encryption method and device for private data transmission
CN108491499B (en) Data acquisition method, data acquisition platform, client and business server
CN106202456B (en) Send the method and device of picture
US11494408B2 (en) Asynchronous row to object enrichment of database change streams
CN114066636A (en) Financial information system based on big data and operation method
US11327969B2 (en) Term vector modeling of database workloads
CN114756301A (en) Log processing method, device and system
WO2022036165A1 (en) Universal blockchain data model
KR101935249B1 (en) Method for processing based on flow using stored procedure, system and apparatus
CN112817938A (en) General data service construction method and system based on data productization
US11818087B1 (en) User-to-user messaging-based software troubleshooting tool

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination