CN105893392A - Data batch-loading method for intelligent multi-variable data processing - Google Patents

Data batch-loading method for intelligent multi-variable data processing

Info

Publication number
CN105893392A
Authority
CN
China
Prior art keywords
data
coordinate
loading
loaded
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510038598.8A
Other languages
Chinese (zh)
Inventor
曾仲大
陈爱明
邓毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Chemdatasolution Information Technology Co Ltd
Original Assignee
Dalian Chemdatasolution Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Chemdatasolution Information Technology Co Ltd
Priority to CN201510038598.8A
Publication of CN105893392A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a data batch-loading method for intelligent multi-variable data processing and belongs to the field of chemometrics in analytical chemistry. Through integrated analysis of data files under irregular and complicated conditions, the method addresses the problems involved in batch loading, including: data file types and data delimiters, data and characters, data headers, data transposition, data coordinates and responses, equalization of data coordinates and lengths, and the various cases that arise when data have already been loaded. It thereby achieves folder-based batch loading of data and supports intelligent analysis, information extraction and mining of "three high" data (high dimensionality, high throughput and high complexity). The method has broad application prospects.

Description

A data batch-loading method for intelligent multi-variable data processing
Technical field
The present invention proposes a data batch-loading method for intelligent multi-variable data processing and belongs to the field of chemometrics within analytical chemistry. Specifically, it concerns the batch loading of complex, massive, multi-variable data that need to be analyzed and processed. It provides automatic handling for a range of complicated cases: differing data types and data delimiters; files that mix data and characters; the presence or absence of data coordinates; multiple sample responses within a single data file; loading multiple data sets of different lengths and coordinates; and loading into a data set that already exists. Automatic batch loading of data is thereby achieved, which opens up the key first step of intelligent multi-variable data processing and has good application prospects.
Background art
Processing complex, variable, massive data and mining information from them depend heavily on mathematics, statistics, artificial intelligence, chemistry and bioinformatics, and on the application and development of chemometric methods. Processing the "big data" of chemistry, biology and related fields in particular requires basic algorithms that compute quickly and intelligently, give accurate and reliable results, and suit the analysis of "three high" data (high dimensionality, high throughput and high complexity); this is also the key to data processing. Data processing itself is often tedious. Taking data from analytical instruments such as chromatographs and spectrometers as an example, the raw data formats acquired by instruments from different manufacturers usually differ, so researchers find it difficult to integrate complex data collected on different instruments for use in research and industrial practice.
Integrating different types of data and batch-loading them into a single multi-variable data list for analysis is a key step in intelligent data processing. Traditionally, however, batch loading is time-consuming and laborious and is hard to automate, especially the case of specifying a folder and then quickly loading all the data under it. On the one hand, data in various formats must be converted manually into data files of a uniform type. When the volume of data and files is large, format conversion consumes a great deal of time and effort and is prone to human error, and for some file formats a conversion method is hard to find at all. Having data that cannot be analyzed effectively and conveniently to extract their value is regrettable and also lowers research efficiency. On the other hand, the lengths of the X-axis values in different files are often inconsistent, which causes batch loading to fail. When this happens, each data file usually has to be cropped, zero-padded or interpolated by hand; with big data, such manual treatment markedly reduces efficiency and increases cost. Beyond these, the data themselves pose many further difficulties, including direct loading of data under a folder, data types and data delimiters, data and characters, data coordinates, and multiple data sets present in one data file, all of which greatly increase the difficulty of batch loading.
Intelligent automatic batch loading of data can be widely applied to analyzing data produced by analytical instruments (such as chromatography, mass spectrometry and spectroscopy) and mining information from them. Usage scenarios span many industries, including pharmaceuticals, tobacco, brewing, agriculture, food, petrochemicals, environment, quality supervision and biology; the range of application is wide and the prospects are good.
Summary of the invention
The invention provides a data batch-loading method for intelligent multi-variable data processing (hereinafter, the data batch-loading method). With this method, batch data loading is achieved quickly: the user only needs to specify, as required, the folder containing the files to be analyzed. The method imposes no strict requirements on the data format, the specific content of the data, the number of samples and the coordinate information, or whether data have already been loaded, so data can be batch-loaded intelligently. Its core points are: 1) identifying files of different format types, including txt, Excel, Mat, SPC and JCAMP-DX; 2) batch-loading file data of different format types and integrating them into one file to facilitate analysis; 3) when data lengths in different files are inconsistent, cropping, zero-padding or interpolating the data so that they remain consistent and can be loaded; 4) saving the current data-loading workflow so that processing can be repeated with one click; 5) handling different data types and data delimiters, data and characters, data coordinates, and multiple data sets present in one data file, so as to achieve folder-based batch loading.
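As an illustration of core point 5, the following Python sketch shows one possible way to guess a data delimiter and to separate header characters from numeric data. It is only a sketch under stated assumptions: the patent does not disclose its algorithm, and the candidate delimiter list and the names `detect_delimiter` and `parse_text` are hypothetical.

```python
# Hypothetical candidate delimiters; the patent does not list them.
CANDIDATES = [",", ";", "\t", " "]


def is_number(token):
    """Return True if the token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def split_line(line, delimiter):
    """Split one line; runs of spaces count as a single separator."""
    if delimiter == " ":
        return line.split()
    return [t.strip() for t in line.split(delimiter)]


def detect_delimiter(lines):
    """Pick the delimiter that yields the most all-numeric, multi-column rows."""
    def score(delim):
        count = 0
        for line in lines:
            tokens = split_line(line, delim)
            if len(tokens) > 1 and all(is_number(t) for t in tokens):
                count += 1
        return count
    # On a tie (for example a single-column file) the first candidate wins.
    return max(CANDIDATES, key=score)


def parse_text(text):
    """Return (delimiter, header_lines, numeric_rows) for the text of one file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    delim = detect_delimiter(lines)
    header, rows = [], []
    for line in lines:
        tokens = split_line(line, delim)
        if tokens and all(is_number(t) for t in tokens):
            rows.append([float(t) for t in tokens])
        else:
            header.append(line)  # any line with non-numeric tokens is treated as header
    return delim, header, rows
```

Scoring delimiters by the number of fully numeric rows keeps header text from influencing the choice, at the cost of not handling formats such as decimal commas.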
Compared with traditional data loading methods, the invention has clear advantages. First, the method automatically identifies all data files and, as required, loads one or more files into the data list at the same time, avoiding the traditional conversion of unrecognized data files as well as manual data loading and integration. Second, for data of different lengths, the invention provides cropping, zero-padding, interpolation and similar processing, so that all loaded data can directly undergo subsequent complex analysis. In particular, the invention achieves intelligent folder-based batch loading and one-click batch loading, which has so far been one of the difficulties of intelligent data analysis.
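The length-equalization operations named above (cropping, zero-padding, interpolation) can be sketched with NumPy as follows. This is an illustrative assumption rather than the patent's implementation; the function names are hypothetical, and `np.interp` requires each sample's X values to be increasing.

```python
import numpy as np


def align_by_interpolation(x, y, x_ref):
    """Resample y(x) onto x_ref by linear interpolation.

    Points of x_ref outside the measured range are filled with zeros,
    which combines interpolation with zero-padding.
    """
    return np.interp(x_ref, x, y, left=0.0, right=0.0)


def crop_or_pad(y, length):
    """Crop y to `length` samples, or append zeros if it is shorter."""
    y = np.asarray(y, dtype=float)
    if y.size >= length:
        return y[:length]
    return np.concatenate([y, np.zeros(length - y.size)])


def stack_samples(samples, x_ref):
    """Build a samples-by-variables matrix from (x, y) pairs of unequal length."""
    return np.vstack([align_by_interpolation(x, y, x_ref) for x, y in samples])
```

Once every sample shares the reference axis `x_ref`, the rows can be stacked into one matrix, which is the form that multivariate methods typically expect.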
The data batch-loading method provided by the invention for intelligent multi-variable data processing has good application prospects in mining information from complex, high-throughput data.
Brief description of the drawings
Fig. 1 shows the flow of the data batch-loading method for intelligent multi-variable data processing.
Fig. 2 shows a typical implementation (program interface) of the data batch-loading method for intelligent multi-variable data processing. In the figure, region 1 is used to browse and search for the data file location; region 2 displays the data files of the corresponding format for the user to select; region 3 shows the current file name, and clicking the dialog box switches the data file; region 4 holds other parameter settings for simple processing of the data; region 5 holds the coordinate-axis processing and data-interpolation settings; region 6 previews the data of the current sample file; region 7 holds the basic parameter settings.
Fig. 3 shows multiple data sets loaded by the method of the invention, displayed in the software into which they were loaded. These data differ in data type, data delimiter, data and characters, and data coordinates. The method of the invention loads them automatically in batch.
Fig. 4 shows the data loaded in Fig. 3 as a visualized plot, used to evaluate the effect and accuracy of data loading.
Detailed description of the invention
Embodiment: taking the loading of typical real spectral data from a folder as an example, the use and results of the data batch-loading method of the invention for intelligent multi-variable data processing are illustrated.
Fig. 1 is the flow of the data batch-loading method for intelligent multi-variable data processing; a program can implement the data batch-loading method of the invention based on this flow.
Fig. 2 is a program-interface implementation of the method of the invention, used to exemplify and illustrate the usability of the method and to verify the loading results. In the figure, region 1 locates the data after the user enters or loads a folder path; region 2 then displays all data files in the selected folder, and the required data files can be chosen as needed; region 4 configures parameters as required, for example transposing the selected data, deleting the header, retaining the header, or defining a custom delimiter; region 5 sets the coordinate-axis processing and data interpolation; region 6 previews the data of the current sample file so that the user can check whether the data match the actual situation; region 7 holds the basic settings: if "apply the above parameters to all sample files" is selected, all data receive the same processing, which gives one-click processing, saves time and improves efficiency. After loading completes, the data table shown in Fig. 3 and the visualized result shown in Fig. 4 are obtained. Batch loading succeeds and data loading for intelligent multi-variable data processing is achieved. The loaded data can be used for subsequent analysis, including data preprocessing, variable selection, exploratory analysis, classification analysis and regression analysis.
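The "apply the above parameters to all sample files" behaviour, together with recording the parameters for later reuse, might be sketched as below. The `LoadParams` fields, the JSON file used to store them and the `*.txt` pattern are assumptions for illustration; the patent does not specify how parameters are stored.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class LoadParams:
    """Hypothetical per-file settings, loosely mirroring regions 4 and 7 of Fig. 2."""
    delimiter: str = ","
    skip_header: int = 0      # number of header lines to delete
    transpose: bool = False   # swap rows and columns after reading


def load_file(path, params):
    """Read one delimited text file into a list of float rows."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()[params.skip_header:]
    rows = [[float(t) for t in ln.split(params.delimiter)] for ln in lines if ln.strip()]
    if params.transpose:
        rows = [list(col) for col in zip(*rows)]
    return rows


def load_folder(folder, params, pattern="*.txt"):
    """Apply the same parameters to every matching file in the folder."""
    return {p.name: load_file(p, params) for p in sorted(Path(folder).glob(pattern))}


def save_params(params, path):
    """Record the settings so that a later session can reuse them."""
    Path(path).write_text(json.dumps(asdict(params)), encoding="utf-8")


def restore_params(path):
    """Reload previously recorded settings."""
    return LoadParams(**json.loads(Path(path).read_text(encoding="utf-8")))
```

Keeping the settings in a small serializable object means a one-click reload reduces to `load_folder(folder, restore_params(path))`.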

Claims (8)

1. A data batch-loading method for intelligent multi-variable data processing, characterized by comprising the following steps:
A. setting the folder location where the files are stored; determining, according to usage requirements, the types of data files to be loaded and selecting the corresponding loading mode; automatically identifying the readable files during batch loading; and adding all the data files to be loaded to the data sequence;
B. automatically matching and identifying the data delimiter during batch loading, and accurately reading and distinguishing the data and characters contained in each file;
C. according to the identified data and characters, deleting the header, and, according to the information represented by the data rows and columns, performing operations such as transposing the data;
D. automatically determining during batch loading the form of the coordinate and response lists of the data being loaded, that is, which of the loaded values are coordinates and which are responses, thereby determining the coordinate values to be extracted and the data to be loaded, and performing operations on the coordinates, including adding, modifying and deleting;
E. handling during batch loading multiple data sets of different lengths or of different start and end points, and setting data parameters to ensure that the data have equal coordinates and equal lengths;
F. automatically determining during batch loading whether previously loaded data already exist, and either appending the new data to the loaded data or loading them as a new data set;
G. automatically identifying during batch loading whether the data contain coordinates and, if so, intelligently identifying the coordinate characters and passing out the coordinate values for use as a reference by the data being loaded;
H. automatically recording during batch loading the parameter settings of past data loads and, where necessary, passing them to the data currently being loaded.
2. The method according to claim 1, wherein the file storage path can be entered directly or selected by browsing to its location, and all files under the folder are read automatically, achieving intelligent batch loading of data.
3. The method according to claim 1, wherein the data file types cover the loading of various different data types, including txt, Excel, Mat, SPC and JCAMP-DX.
4. The method according to claim 1, wherein the automatic identification of data and characters relies on the different representations of data and characters to find the characters in the header and distinguish them from the actual response data.
5. The method according to claim 1, wherein the data coordinates include chemical coordinates and mathematical coordinates, that is, coordinates with actual chemical meaning and mathematical sequence-number (index) coordinates, respectively.
6. The method according to claim 1, wherein the intelligent identification of data coordinates relies on the continuous, equidistant and linearly varying character of coordinates to distinguish them from actual data responses, which vary nonlinearly.
7. The method according to claim 1, wherein the equal-coordinate, equal-length processing, when file data of different lengths are loaded, crops, zero-pads or linearly interpolates the data to ensure that the different data have consistent lengths.
8. The data batch-loading method according to claim 1, for intelligent multi-variable data analysis and processing, characterized in that data are loaded in batch with little manual intervention, completing the initial but critical step of data analysis and processing.
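The coordinate test described in claim 6 (coordinates are continuous, equidistant and linearly varying, whereas responses are not) can be illustrated with the following Python sketch. The tolerance value and the function names are hypothetical, and the fallback to a 0..n-1 index is only one reading of the "mathematical coordinate" of claim 5.

```python
import numpy as np


def looks_like_coordinate(values, rel_tol=1e-3):
    """Return True if the values change by a (nearly) constant, non-zero step."""
    v = np.asarray(values, dtype=float)
    if v.size < 3:
        return False  # too few points to judge equidistance
    steps = np.diff(v)
    mean_step = steps.mean()
    if mean_step == 0:
        return False
    return bool(np.all(np.abs(steps - mean_step) <= rel_tol * abs(mean_step)))


def split_coordinate(columns):
    """Return (coordinate, responses) for a list of equally long columns.

    The first column that passes the equidistance test is taken as the
    coordinate. If none passes, an index coordinate 0..n-1 is generated
    and every column is treated as a response.
    """
    for i, col in enumerate(columns):
        if looks_like_coordinate(col):
            responses = [c for j, c in enumerate(columns) if j != i]
            return np.asarray(col, dtype=float), responses
    n = len(columns[0])
    return np.arange(n, dtype=float), list(columns)
```

A relative tolerance is used because instrument axes are often equidistant only up to rounding; a strictly constant step would reject many real files.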
CN201510038598.8A 2015-01-26 2015-01-26 Data batch-loading method for intelligent multi-variable data processing Pending CN105893392A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510038598.8A CN105893392A (en) 2015-01-26 2015-01-26 Data batch-loading method for intelligent multi-variable data processing

Publications (1)

Publication Number Publication Date
CN105893392A true CN105893392A (en) 2016-08-24

Family

ID=56999229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510038598.8A Pending CN105893392A (en) 2015-01-26 2015-01-26 Data batch-loading method for intelligent multi-variable data processing

Country Status (1)

Country Link
CN (1) CN105893392A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101169711A (en) * 2006-10-27 2008-04-30 鸿富锦精密工业(深圳)有限公司 Data conversion system and method
CN103399848A (en) * 2013-06-21 2013-11-20 西安航天动力试验技术研究所 Engine test data standardized specific format leading-in processing method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832348A (en) * 2017-10-19 2018-03-23 江苏省邮电规划设计院有限责任公司 A kind of processing method of the network data flow based on collecting terminal to cloud
CN107832348B (en) * 2017-10-19 2020-01-21 中通服咨询设计研究院有限公司 Method for processing network data stream from intelligent acquisition terminal to cloud
CN107977349A (en) * 2017-11-23 2018-05-01 郑州云海信息技术有限公司 It is a kind of toward the method and system for adding polytype file in Excel in batches
CN108572271A (en) * 2018-01-26 2018-09-25 深圳市鼎阳科技有限公司 A kind of cache information sweep-out method and digital oscilloscope for oscillograph

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160824