CN113901117A - Multi-source test data ingestion processing method - Google Patents
Multi-source test data ingestion processing method
- Publication number
- CN113901117A (application number CN202111130413.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- ingestion
- database
- structured
- local database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/81—Indexing, e.g. XML tags; Data structures therefor; Storage structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention provides a multi-source test data ingestion processing method comprising the following steps: establishing a data connection, identifying the data type, calling the ingestion module for that data type, and starting the ingestion processing flow. The invention provides a uniform interface for source data; an adapter automatically identifies the data type and calls the corresponding ingestion model. Ingestion models are established for structured, semi-structured, unstructured, API-interface and real-time message data, meeting the requirements for acquisition, integration and storage of test data.
Description
Technical Field
The invention belongs to the technical field of test data management, and in particular relates to a multi-source test data ingestion processing method.
Background
Test data are the "first-hand data" that timely, accurately and truthfully record important activities such as equipment testing, training and research. They are high-value data of concern to the parties responsible for equipment qualification and finalization, demonstration and research, and use and improvement; they are an important basic strategic resource supporting equipment construction and development, with enormous value still to be mined.
Part of the test data comes from research-and-development tests, verification tests, performance tests and simulation tests; another part comes from project-approval demonstration reports, general development requirements from the scheme-design stage, daily training, exercises, and data on similar equipment. With the rising informatization level of equipment testing and advances in testing technology, the volume of test data acquired during tests keeps growing, and the data exhibit characteristics such as large volume, multi-source heterogeneity, diverse types and high real-time processing requirements.
Constrained by factors such as concepts, technology and institutional systems, test data have long been scattered across units such as development departments, test bases and naval troops, in a state of fragmented management and dispersed storage. A test data acquisition and access method therefore needs to be established to open data channels among the units involved in equipment testing, improve the efficiency of test data construction, and support efficient management and analysis of test data resources.
Existing data import technology is mainly aimed at structured data: data are first collected, then transmitted to a data storage unit, standardized, and finally warehoused and archived to form a data resource pool. Test data, however, include not only structured data but also unstructured and semi-structured data, mainly images, numeric data, text, video, audio and similar data generated during tests, as well as special types such as interface data and message data. Their scale and complexity exceed what conventional technology can process and analyze.
Disclosure of Invention
In order to solve the above problems, the present invention provides a multi-source test data ingestion processing method comprising the following steps:
performing data ingestion;
identifying the data type;
calling the ingestion model for the corresponding data type;
starting the ingestion processing flow.
Further, the ingestion processing flow includes: a semi-structured data ingestion flow, a structured data ingestion flow, an unstructured data ingestion flow, an API interface data ingestion flow and a real-time message data ingestion flow.
Further, the semi-structured data ingestion flow includes the following steps:
connecting a first data source library using a first search engine;
acquiring a collection list from the first data source library according to specific service requirements;
ingesting the collection list into a first local database;
wherein the first search engine is built on Python by integrating a MongoDB database operation component, the semi-structured data of the first data source library are stored mainly in MongoDB, and the first local database is a MongoDB database.
Further, the structured data ingestion flow includes the following steps:
connecting a second data source library using a second search engine;
acquiring a data table list from the second data source library according to specific service requirements;
ingesting the data table list into a second local database using a field conversion engine;
wherein the second search engine is built on Python by integrating operation components for various conventional databases, the structured data of the second data source library reside in conventional databases such as Oracle, MySQL, SQL Server, PostgreSQL and the domestic Dameng database, and the second local database is a MySQL database.
Further, the unstructured data ingestion flow includes the following steps:
ingesting data via FTP;
selecting a file classification directory;
uploading the file data to the selected directory;
storing metadata about the file data in a third local database;
uploading the file data to a server via the FTP service;
wherein the third local database is a MySQL database.
Further, the ingestion of unstructured data is divided into an offline mode and an online mode:
offline mode: a user selects a file classification directory and uploads the file data directly;
online mode: the files to be ingested and their storage locations are configured through the provided service connection information, the folder is monitored dynamically for new files, and new files are uploaded automatically.
Further, the API interface data ingestion flow includes the following steps:
acquiring data through an interface engine;
dividing the acquired data into structured data and semi-structured data;
for structured data, selecting a matching parsing component from a first parsing component library, parsing, and ingesting into a fourth local database;
storing semi-structured data directly into the fourth local database;
wherein the first parsing component library is developed in Python for the different interface contents, and the fourth local database comprises a MySQL database and a MongoDB database.
Further, the real-time message data ingestion flow includes the following steps:
acquiring data via UDP;
selecting a matching parsing component from a second parsing component library, parsing, and ingesting into a fifth local database;
wherein the second parsing component library is developed in Python for the different message formats, and the fifth local database is a MySQL database.
The invention has the following beneficial effects: it provides a multi-source test data ingestion processing method that offers a uniform interface for source data; an adapter automatically identifies the data type and calls the corresponding ingestion model. Ingestion models are established for structured, semi-structured, unstructured, API-interface and real-time message data, meeting the requirements for acquisition, integration and storage of test data.
Drawings
FIG. 1 is a schematic flow chart of the present invention,
FIG. 2 is a schematic diagram of the semi-structured data ingestion flow of the present invention,
FIG. 3 is a schematic diagram of the structured data ingestion flow of the present invention,
FIG. 4 is a schematic diagram of the unstructured data ingestion flow of the present invention,
FIG. 5 is a schematic diagram of the API interface data ingestion flow of the present invention,
FIG. 6 is a schematic diagram of the real-time message data ingestion flow of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples, which are provided for the purpose of illustrating the general inventive concept and are not intended to limit the scope of the invention.
As shown in FIG. 1, the invention provides a multi-source test data ingestion processing method that uses the adapter pattern to provide a uniform data interface for multi-source heterogeneous test data access. During ingestion, the data pass through the adapter, which automatically identifies the data type, calls the ingestion model for that type, and starts the ingestion processing flow, completing the ingestion of multi-source heterogeneous data. Ingestion models are designed by data type and mainly comprise ingestion processing models for structured data, semi-structured data, unstructured data, API interface data and real-time message data.
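The adapter described above can be sketched as follows. This is a minimal illustration, not the patent's actual implementation: the source-descriptor fields (`kind`, `engine`) and all function names are assumptions made for the example.

```python
# Sketch of the adapter pattern: inspect a source descriptor, identify the
# data type, and dispatch to the matching ingestion model.

def identify_data_type(source: dict) -> str:
    """Classify a source descriptor into one of the five ingestion types."""
    kind = source.get("kind")
    if kind == "udp":
        return "realtime_message"
    if kind == "http_api":
        return "api_interface"
    if kind == "ftp":
        return "unstructured"
    if kind == "database":
        # MongoDB holds the semi-structured sources; relational engines
        # (Oracle, MySQL, SQL Server, PostgreSQL, Dameng) hold structured ones.
        return "semi_structured" if source.get("engine") == "mongodb" else "structured"
    raise ValueError(f"unknown source kind: {kind!r}")


class IngestionAdapter:
    """Uniform entry point: one registered handler per ingestion model."""

    def __init__(self):
        self._models = {}

    def register(self, data_type, handler):
        self._models[data_type] = handler

    def ingest(self, source: dict):
        data_type = identify_data_type(source)
        return self._models[data_type](source)


adapter = IngestionAdapter()
adapter.register("structured", lambda s: f"structured ingestion of {s['engine']}")
adapter.register("semi_structured", lambda s: "semi-structured ingestion via MongoDB")

result = adapter.ingest({"kind": "database", "engine": "oracle"})
```

New data types can then be supported by registering one more handler, without changing the callers — which is the point of routing everything through the single adapter interface.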
The semi-structured data ingestion flow is shown in FIG. 2, in which:
The semi-structured source data are stored mainly in a MongoDB database. A database engine is built in Python by integrating a MongoDB operation component, and supports direct database connection via information such as the database address, user and password. Once the connection test succeeds, the collection list in the database is obtained. Creating the ingestion model: a MongoDB database is built in the local resource pool; the user selects the required collections and the data are ingested directly, with no field conversion needed. The ingestion mode is configured as either incremental ingestion or overwrite ingestion. The frequency of the ingestion task is configured, supporting both scheduled and one-off ingestion. Ingestion log: while the ingestion task runs, data that fail processing are logged, the causes of failure are investigated, and the ingestion procedure is adjusted to guarantee the quality of the ingested data.
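The two ingestion modes mentioned above (incremental vs. overwrite) can be sketched like this. In production the collections would live in MongoDB and be accessed via a driver such as pymongo; here they are simulated as dicts keyed by document id so the mode logic itself stays visible. All names are illustrative assumptions.

```python
# Sketch of incremental vs. overwrite ingestion for a collection, with the
# source and local collections simulated as {id: document} dicts.

def ingest_collection(source: dict, local: dict, mode: str = "incremental") -> dict:
    """Copy documents from a source collection into the local resource pool."""
    if mode == "overwrite":
        # Overwrite ingestion: the local collection is replaced wholesale.
        local.clear()
        local.update(source)
    elif mode == "incremental":
        # Incremental ingestion: only documents not yet present are added;
        # existing local documents are left untouched.
        for doc_id, doc in source.items():
            local.setdefault(doc_id, doc)
    else:
        raise ValueError(f"unknown ingestion mode: {mode!r}")
    return local


source = {1: {"value": "new"}, 2: {"value": "b"}}
local = {1: {"value": "old"}}
ingest_collection(source, local, mode="incremental")  # keeps the old doc 1, adds doc 2
```

The same mode switch applies unchanged to the structured and API flows, which also offer incremental and overwrite ingestion.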
The structured data ingestion flow is shown in FIG. 3, in which:
The structured sources include conventional databases such as Oracle, MySQL, SQL Server, PostgreSQL and the domestic Dameng database. A database engine is built in Python by integrating the operation components of these databases, and direct connection is achieved via information such as the database address, user and password. When the connection test succeeds, the data table list in the database can be obtained. Creating the ingestion model: the user selects the required data tables (some or all) according to specific service requirements, determining the external source data to ingest. The ingestion target library is created as a new database through local raw-data management and serves as the output target for the ingested source data. The ingestion mode is configured as either incremental ingestion or overwrite ingestion. The frequency of the ingestion task is configured, supporting both scheduled and one-off ingestion. Field conversion engine: because the source data reside in different databases whose supported field types differ, the system builds, in Python, rules for converting the field types of the various databases into the corresponding MySQL field types, achieving seamless data migration and guaranteeing data accuracy. Ingestion log: while the ingestion task runs, data that fail processing are logged, the causes of failure are investigated, and the ingestion procedure is adjusted to guarantee the quality of the ingested data.
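The field conversion engine described above can be sketched as a type-mapping table. The mapping entries here are a small illustrative assumption — a real engine would cover many more types per source database (Oracle NUMBER precision, Dameng types, and so on).

```python
# Sketch of the field conversion engine: map source-database column types to
# the corresponding MySQL types before migrating a table.

TYPE_MAP = {
    ("oracle", "VARCHAR2"): "VARCHAR",
    ("oracle", "NUMBER"): "DECIMAL",
    ("oracle", "CLOB"): "LONGTEXT",
    ("postgresql", "TEXT"): "LONGTEXT",
    ("postgresql", "BOOLEAN"): "TINYINT(1)",
    ("sqlserver", "NVARCHAR"): "VARCHAR",
    ("sqlserver", "DATETIME2"): "DATETIME",
}


def to_mysql_type(source_db: str, column_type: str) -> str:
    """Convert one source column type to its MySQL equivalent."""
    # Types MySQL shares with the source pass through unchanged.
    return TYPE_MAP.get((source_db.lower(), column_type.upper()), column_type)


def convert_schema(source_db, columns):
    """columns: list of (name, type) pairs from the source data table."""
    return [(name, to_mysql_type(source_db, ctype)) for name, ctype in columns]


mysql_cols = convert_schema("oracle", [("id", "NUMBER"), ("remark", "CLOB")])
```

Keeping the rules in a data table rather than in code is what lets one engine serve Oracle, SQL Server, PostgreSQL and Dameng alike: supporting a new source database means adding rows, not rewriting the migration logic.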
The unstructured data ingestion flow is shown in FIG. 4, in which:
The unstructured data mainly comprise reports, audio, video and similar data generated during tests; storage and management are implemented via FTP. Unstructured data entry is divided into an offline mode and an online mode (FTP, HTTP, etc.). Offline mode: the user selects a file classification directory and uploads the file data directly. Online mode: the files to be ingested and their storage locations are configured through the provided service connection information, the folder is monitored dynamically for new files, and new files are uploaded and warehoused automatically. Managing the file classification directory: the user creates file storage directories according to their service requirements and selects the files to upload under a directory. File information storage: a dedicated entity-file information table in the local MySQL database records information such as each file's source, name, type, size and storage path. The entity file itself is uploaded to the designated directory on the server via the FTP service.
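Two pieces of that bookkeeping can be sketched directly: building the row for the entity-file information table, and the online-mode check that spots new files by comparing directory snapshots. The record's field names are illustrative assumptions; the actual upload would go through `ftplib` and the row into MySQL.

```python
# Sketch of unstructured-data bookkeeping: a metadata record per file, plus
# snapshot diffing for the online monitoring mode.
import os
from datetime import datetime, timezone


def file_info_record(path: str, source: str, category_dir: str) -> dict:
    """Metadata row for the entity-file information table."""
    name = os.path.basename(path)
    return {
        "source": source,
        "name": name,
        "type": os.path.splitext(name)[1].lstrip(".") or "unknown",
        "size": os.path.getsize(path) if os.path.exists(path) else None,
        "storage_path": f"{category_dir.rstrip('/')}/{name}",
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }


def detect_new_files(previous: set, current: set) -> set:
    """Online mode: files present now but absent from the last snapshot."""
    return current - previous


new = detect_new_files({"a.pdf"}, {"a.pdf", "b.wav"})
record = file_info_record("reports/b.wav", source="test-range-3", category_dir="/data/audio")
```

In the online mode, `detect_new_files` would run on a timer: each pass lists the monitored folder, diffs against the previous snapshot, uploads the difference via FTP, and inserts one `file_info_record` row per new file.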
The API interface data ingestion flow is shown in FIG. 5, in which:
The API interfaces are provided by other business systems within the test system and are used to acquire relevant data. An API interface engine is built in Python and supports access modes such as POST and GET. The returned result is in JSON format and is divided into structured and semi-structured data. Semi-structured JSON data: stored directly into the MongoDB database of the local resource pool, with a database collection built for each interface through local data management. Structured JSON data: parsing components are developed in Python for the different interface contents, forming a parsing component library that parses and stores the data. For the ingestion target table, the database and data table are created through local data management. The ingestion mode is configured as either incremental ingestion or overwrite ingestion. The frequency of the ingestion task is configured, supporting both scheduled and one-off ingestion. Ingestion log: while the ingestion task runs, data that fail processing are logged, the causes of failure are investigated, and the ingestion procedure is adjusted to guarantee the quality of the ingested data.
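The structured/semi-structured split described above can be sketched as a routing step. The flat-vs-nested heuristic and all names here are assumptions for illustration; in production the payload would arrive via a POST/GET request (e.g. through `requests`) and the sinks would be MySQL and MongoDB rather than in-memory lists.

```python
# Sketch of API-interface routing: classify each returned JSON record, then
# send structured records through a parsing component into the relational
# sink and semi-structured records straight to the document sink.

def is_structured(record: dict) -> bool:
    """Treat flat JSON objects (scalar values only) as structured."""
    return all(not isinstance(v, (dict, list)) for v in record.values())


def route_records(records, parse_component, mysql_rows, mongo_docs):
    """Structured records are parsed into rows; the rest are stored as-is."""
    for record in records:
        if is_structured(record):
            mysql_rows.append(parse_component(record))
        else:
            mongo_docs.append(record)


mysql_rows, mongo_docs = [], []
payload = [
    {"id": 1, "speed": 12.5},                          # flat -> structured
    {"id": 2, "track": {"lat": 31.2, "lon": 121.5}},   # nested -> semi-structured
]
route_records(payload, parse_component=lambda r: tuple(r.values()),
              mysql_rows=mysql_rows, mongo_docs=mongo_docs)
```

The `parse_component` argument stands in for the per-interface parsing components: each interface registers its own function in the component library, and the router stays generic.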
The real-time message data ingestion flow is shown in FIG. 6, in which:
The real-time message data are provided by systems related to the test system and are generally sent via UDP. Parsing components are developed in Python for the different message-format data, forming a parsing component library that connects to, receives and parses the data into structured data sets. Ingestion configuration: according to the structure of the parsed result objects, the ingestion target library creates the database and data tables through local data management, and the correspondence between object attributes and table fields is configured at ingestion time. Ingestion log: events such as connection failures and parsing errors are logged.
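One such parsing component can be sketched with the standard-library `struct` module. The message layout used here (big-endian: uint32 sequence, uint32 Unix timestamp, float64 measurement) is an invented example — each real format would get its own component in the library. Receiving would use a UDP socket (`socket.SOCK_DGRAM`); the sketch packs a datagram locally instead of binding one.

```python
# Sketch of a real-time message parsing component for one fixed binary
# message format, carried over UDP.
import struct

MESSAGE_FORMAT = ">IId"  # sequence, timestamp, measurement (big-endian)
MESSAGE_SIZE = struct.calcsize(MESSAGE_FORMAT)


def parse_message(datagram: bytes) -> dict:
    """Turn one UDP datagram into a structured row for the target table."""
    if len(datagram) != MESSAGE_SIZE:
        raise ValueError(f"expected {MESSAGE_SIZE} bytes, got {len(datagram)}")
    seq, ts, value = struct.unpack(MESSAGE_FORMAT, datagram)
    # Object-attribute -> table-field correspondence, configured at ingestion.
    return {"sequence": seq, "timestamp": ts, "measurement": value}


datagram = struct.pack(MESSAGE_FORMAT, 7, 1_632_614_400, 3.25)
row = parse_message(datagram)
```

The length check doubles as the hook for the ingestion log: a malformed datagram raises, and the caller records it as a parsing error instead of silently warehousing bad data.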
The invention is not limited to the specific embodiments and examples described above; all equivalent variations and modifications fall within the scope of the invention as defined by the claims and the specification.
Claims (8)
1. A multi-source test data ingestion processing method, characterized by comprising the following steps:
performing data ingestion;
identifying the data type;
calling the ingestion model for the corresponding data type;
starting the ingestion processing flow.
2. The multi-source test data ingestion processing method according to claim 1, wherein the ingestion processing flow comprises: a semi-structured data ingestion flow, a structured data ingestion flow, an unstructured data ingestion flow, an API interface data ingestion flow and a real-time message data ingestion flow.
3. The multi-source test data ingestion processing method according to claim 2, wherein the semi-structured data ingestion flow comprises the following steps:
connecting a first data source library using a first search engine;
acquiring a collection list from the first data source library according to specific service requirements;
ingesting the collection list into a first local database;
wherein the first search engine is built on Python by integrating a MongoDB database operation component, the semi-structured data of the first data source library are stored mainly in MongoDB, and the first local database is a MongoDB database.
4. The multi-source test data ingestion processing method according to claim 2, wherein the structured data ingestion flow comprises the following steps:
connecting a second data source library using a second search engine;
acquiring a data table list from the second data source library according to specific service requirements;
ingesting the data table list into a second local database using a field conversion engine;
wherein the second search engine is built on Python by integrating operation components for various conventional databases, the structured data of the second data source library reside in conventional databases such as Oracle, MySQL, SQL Server, PostgreSQL and the domestic Dameng database, and the second local database is a MySQL database.
5. The multi-source test data ingestion processing method according to claim 2, wherein the unstructured data ingestion flow comprises the following steps:
ingesting data via FTP;
selecting a file classification directory;
uploading the file data to the selected directory;
storing metadata about the file data in a third local database;
uploading the file data to a server via the FTP service;
wherein the third local database is a MySQL database.
6. The multi-source test data ingestion processing method according to claim 5, wherein the ingestion of unstructured data is divided into an offline mode and an online mode:
offline mode: a user selects a file classification directory and uploads the file data directly;
online mode: the files to be ingested and their storage locations are configured through the provided service connection information, the folder is monitored dynamically for new files, and new files are uploaded automatically.
7. The multi-source test data ingestion processing method according to claim 2, wherein the API interface data ingestion flow comprises the following steps:
acquiring data through an interface engine;
dividing the acquired data into structured data and semi-structured data;
for structured data, selecting a matching parsing component from a first parsing component library, parsing, and ingesting into a fourth local database;
storing semi-structured data directly into the fourth local database;
wherein the first parsing component library is developed in Python for the different interface contents, and the fourth local database comprises a MySQL database and a MongoDB database.
8. The multi-source test data ingestion processing method according to claim 2, wherein the real-time message data ingestion flow comprises the following steps:
acquiring data via UDP;
selecting a matching parsing component from a second parsing component library, parsing, and ingesting into a fifth local database;
wherein the second parsing component library is developed in Python for the different message formats, and the fifth local database is a MySQL database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111130413.8A CN113901117A (en) | 2021-09-26 | 2021-09-26 | Multi-source test data ingestion processing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113901117A true CN113901117A (en) | Multi-source test data ingestion processing method | 2022-01-07 |
Family
ID=79029527
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111130413.8A Pending CN113901117A (en) | Multi-source test data ingestion processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113901117A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114866541A (en) * | 2022-07-11 | 2022-08-05 | 太极计算机股份有限公司 | Data transmission method, device and system |
CN114866541B (en) * | 2022-07-11 | 2022-09-23 | 太极计算机股份有限公司 | Data transmission method, device and system |
CN116455678A (en) * | 2023-06-16 | 2023-07-18 | 中国电子科技集团公司第十五研究所 | Network security log tandem method and system |
CN116455678B (en) * | 2023-06-16 | 2023-09-05 | 中国电子科技集团公司第十五研究所 | Network security log tandem method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||