CN105893392A - Data batch-loading method for intelligent multi-variable data processing - Google Patents
- Publication number
- CN105893392A (application number CN201510038598.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- coordinate
- loading
- loaded
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention discloses a data batch-loading method for intelligent multivariate data processing and belongs to the field of chemometrics within analytical chemistry. By jointly analyzing data files under irregular and complicated conditions, the method solves the problems that arise in batch loading: data file types and data delimiters, mixed data and characters, file headers, data transposition, data coordinates and responses, equalization of coordinates and lengths, and the handling of data that have already been loaded. It thereby realizes folder-based batch loading and supports intelligent analysis, information extraction and mining of "three-high" data (high dimensionality, high throughput and high complexity). The method has broad application prospects.
Description
Technical field
The present invention provides a data batch-loading method for intelligent multivariate data processing and belongs to the field of chemometrics within analytical chemistry. Specifically, it concerns the batch loading of complex, massive, multivariate data that need to be analyzed and processed. It provides automatic handling for each of the complications involved: differing data types and data delimiters; whether a file mixes data with characters; whether it contains data coordinates; whether a single data file contains the responses of multiple samples; whether multiple data sets of different lengths and coordinates are loaded; and whether data are loaded into data that already exist. Automatic batch loading is thereby realized, which opens up the critical first link of intelligent multivariate data processing and has good application prospects.
Background technology
Processing and mining large volumes of complex, variable data depends heavily on mathematics, statistics, artificial intelligence, chemistry and bioinformatics, and on the application and development of chemometric methods. Processing the "big data" of chemistry, biology and related fields in particular requires basic algorithms that compute quickly and intelligently, give accurate and reliable results, and are suited to the analysis of "three-high" data (high dimensionality, high throughput and high complexity); this is the crux of data processing. At present, data processing is very tedious. Taking data from analytical instruments such as chromatographs and spectrometers as an example, the raw data formats produced by instruments from different manufacturers usually differ, so researchers find it difficult to integrate complex data collected on different instruments for use in research and industrial practice.
Integrating different types of data and batch-loading them into a single multivariate data list for analysis is a key link in intelligent data processing. Traditionally, however, batch loading is time-consuming and laborious and is difficult to automate, especially the task of entering a folder and then quickly loading all the data in it. First, data in various formats must be converted manually into files of a uniform format. When the volume of data and files is large, format conversion consumes a great deal of time and effort and is prone to human error, and for some file formats a conversion method is hard to find at all. Having data that cannot be analyzed effectively and conveniently to obtain their value is regrettable and lowers research efficiency. Second, the lengths of the X-axis values in different files are often inconsistent, which causes batch loading to fail. When this happens, the data in each file usually have to be cut, zero-padded or interpolated by hand; with big data, such manual processing markedly reduces efficiency and increases cost. Finally, the data themselves pose many difficult problems, including direct loading of the data in a folder, data types and data delimiters, mixed data and characters, data coordinates, and multiple data sets present in one data file, all of which greatly increase the difficulty of batch loading.
Intelligent automatic batch loading can be widely applied to the analysis, processing and information mining of data produced by analytical instruments (such as chromatography, mass spectrometry and spectroscopy). Application scenarios span many industries, including pharmaceuticals, tobacco, brewing, agriculture, food, petrochemicals, environment, quality supervision and biology, so the scope of application is broad and the prospects are good.
Summary of the invention
The present invention provides a data batch-loading method for intelligent multivariate data processing (hereinafter referred to as the data batch-loading method). With this method, batch loading can be achieved quickly: the user only needs to specify, according to need, the folder containing the files to be analyzed. There are no strict requirements on the data format, the specific content of the data, the number of samples, the coordinate information, or whether data have already been loaded, so genuinely intelligent batch loading is achieved. Its core points are:
1) identifying files of different format types, including txt, excel, Mat, SPC and JCAMP-DX;
2) batch-loading the file data of different format types and integrating them into one file, which facilitates analysis and processing;
3) when data lengths differ between files, cutting, zero-padding or interpolating the data so that they remain consistent and can be loaded;
4) saving the workflow of the current data loading so that one-key processing is possible;
5) solving the problems of differing data types and data delimiters, mixed data and characters, data coordinates, and multiple data sets present in one data file, so as to realize folder-based batch loading.
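Core points 1) and 2) can be illustrated with a minimal sketch, which is not the patented implementation. It scans one folder and labels each file by extension. The function name `scan_folder`, the table `KNOWN_FORMATS` and the extension-to-format mapping are assumptions made for illustration; the patent names the formats but not the extensions.

```python
from pathlib import Path

# Assumed mapping from file extension to format label (hypothetical;
# the patent lists txt, excel, Mat, SPC and JCAMP-DX without extensions).
KNOWN_FORMATS = {
    ".txt": "text",
    ".xls": "excel",
    ".xlsx": "excel",
    ".mat": "mat",
    ".spc": "spc",
    ".dx": "jcamp-dx",
    ".jdx": "jcamp-dx",
}


def scan_folder(folder):
    """Return (path, format) pairs for every recognised file in `folder`.

    Files are visited in sorted order; unrecognised extensions and
    sub-folders are skipped rather than reported as errors.
    """
    found = []
    for path in sorted(Path(folder).iterdir()):
        fmt = KNOWN_FORMATS.get(path.suffix.lower())
        if path.is_file() and fmt is not None:
            found.append((path, fmt))
    return found
```

Each returned pair could then be passed to a format-specific reader, so that all files end up in one data list regardless of their origin.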
Compared with traditional data loading methods, the present invention has clear advantages. First, the method automatically identifies all data files and, as required, loads one or more files into the data list simultaneously, which avoids the traditional need to convert unrecognized data files and to load and integrate data manually. Second, for data of different lengths, the invention provides cutting, zero-padding and interpolation, so that all loaded data can go directly into subsequent complex analysis. In particular, the invention achieves genuinely intelligent folder-based batch loading and one-key batch loading, which has so far been one of the difficulties of intelligent data analysis.
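The cutting, zero-padding and interpolation mentioned above can be sketched as follows. This is an illustrative sketch in plain Python; the function names `interpolate` and `cut_or_pad` and the choice of zero as the fill value outside the measured range are assumptions, not details given in the patent.

```python
from bisect import bisect_right


def interpolate(x, y, x_common, fill=0.0):
    """Linearly interpolate (x, y) onto the shared axis `x_common`.

    `x` must be strictly increasing. Points of `x_common` that fall
    outside [x[0], x[-1]] receive `fill`, i.e. they are zero-padded.
    """
    out = []
    for xc in x_common:
        if xc < x[0] or xc > x[-1]:
            out.append(fill)
            continue
        i = bisect_right(x, xc) - 1
        if i == len(x) - 1:  # xc equals the last measured coordinate
            out.append(float(y[-1]))
            continue
        t = (xc - x[i]) / (x[i + 1] - x[i])
        out.append(y[i] + t * (y[i + 1] - y[i]))
    return out


def cut_or_pad(y, length, fill=0.0):
    """Truncate `y` to `length`, or append `fill` until it reaches `length`."""
    return list(y[:length]) + [fill] * max(0, length - len(y))
```

Interpolation is the better choice when files carry real coordinates with different spacing; cutting or padding is enough when only the number of points differs.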
The data batch-loading method provided by the invention for intelligent multivariate data processing has good application prospects in information mining from complex, high-throughput data.
Brief description of the drawings
Fig. 1 is the flow chart of the data batch-loading method for intelligent multivariate data processing.
Fig. 2 is a typical implementation (program interface) of the data batch-loading method for intelligent multivariate data processing. In the figure, region 1 is used to browse and search for the location of data files; region 2 lists the data files of the corresponding format for the user to select; region 3 shows the current file name, and clicking the dialog box switches the data file; region 4 holds other parameter settings for simple processing of the data; region 5 holds the coordinate-axis processing and data interpolation settings; region 6 previews the data of the current sample file; region 7 holds the basic parameter settings.
Fig. 3 shows multiple data sets loaded with the method of the invention, displayed in the loading software. These data differ in data type, data delimiter, data and characters, and data coordinates. The method of the invention loads them automatically in batch.
Fig. 4 is a visualization of the data loaded in Fig. 3, used to evaluate the effect and accuracy of the loading.
Detailed description of the invention
Embodiment: loading a set of typical real spectral data from a folder is taken as an example to illustrate the use and results of the data batch-loading method for intelligent multivariate data processing according to the invention.
Fig. 1 is the flow chart of the data batch-loading method for intelligent multivariate data processing; a program can implement the data batch-loading method of the invention on the basis of this flow.
Fig. 2 is one program-interface implementation of the method of the invention, used to exemplify and demonstrate the usability of the method and to verify the loading results. In the figure, region 1 shows the data found after the user enters or loads a folder path; region 2 then displays all data files in the selected folder, from which the required data files can be chosen; in region 4, parameters are configured as needed, for example transposing the selected data, deleting the header, retaining the header, or defining a custom delimiter; in region 5, coordinate-axis processing and data interpolation are set; in region 6, the data of the current sample file are previewed to check whether they match the actual situation; in region 7, basic settings are made: if "apply the above parameters to all sample files" is selected, all data are processed identically, which gives one-key processing, saves time and improves efficiency. After loading is complete, the data table shown in Fig. 3 and the visualization shown in Fig. 4 are obtained. Batch loading succeeds, realizing data loading for intelligent multivariate data processing, and the loaded data can be used for various subsequent analyses, including data preprocessing, variable selection, exploratory analysis, classification analysis and regression analysis.
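The delimiter and header handling configured in region 4 can be sketched in a few lines. This is a hedged illustration rather than the program of Fig. 2: the candidate delimiter list and the helper names `is_number`, `detect_delimiter` and `split_header` are assumptions, and a line is treated as header text whenever any of its fields fails to parse as a number.

```python
# Assumed set of delimiters to try; the patent does not enumerate them.
CANDIDATE_DELIMITERS = [",", ";", "\t", " "]


def is_number(token):
    """Return True if `token` parses as a floating-point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def _fields(line, delim):
    """Split a line on `delim` and drop empty fields (e.g. repeated spaces)."""
    return [f for f in line.strip().split(delim) if f != ""]


def detect_delimiter(lines):
    """Pick the candidate that yields the most rows of two or more numeric fields."""
    def score(delim):
        return sum(
            1
            for line in lines
            if len(_fields(line, delim)) >= 2
            and all(is_number(f) for f in _fields(line, delim))
        )
    return max(CANDIDATE_DELIMITERS, key=score)


def split_header(lines, delim):
    """Separate character (header) lines from numeric data rows."""
    header, rows = [], []
    for line in lines:
        fields = _fields(line, delim)
        if fields and all(is_number(f) for f in fields):
            rows.append([float(f) for f in fields])
        elif line.strip():
            header.append(line.strip())
    return header, rows
```

Transposing the numeric rows, another option named for region 4, is then a one-liner such as `[list(col) for col in zip(*rows)]`.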
Claims (8)
1. A data batch-loading method for intelligent multivariate data processing, characterized by comprising the following steps:
A. setting the location of the folder in which the files are stored; determining, according to need, the types of data file to be loaded and selecting the corresponding loading mode; the batch loading automatically identifies the files that can be read and adds all data files to be loaded to the data sequence;
B. the batch loading automatically matches and identifies the data delimiter, and accurately reads and distinguishes the data and characters contained in a file;
C. according to the identified data and characters, deleting the header and, according to the information represented by the data rows and columns, performing operations such as transposing the data;
D. the batch loading automatically determines the coordinate and response layout of the loaded data, i.e. which of the loaded values are coordinates and which are responses, and thereby determines the coordinate values to be extracted and the data to be loaded, enabling operations on the coordinates, including adding, modifying and deleting;
E. the batch loading can handle multiple data sets of different lengths or with different start and end points; data parameters are set to ensure that the data have equal coordinates and equal length;
F. the batch loading automatically determines whether previously loaded data exist, and either adds the new data to the loaded data or loads them as new data;
G. the batch loading automatically identifies whether coordinates exist in the data; if coordinates exist, it intelligently identifies the coordinate characters and passes out the coordinate values for use as a reference by the data being loaded;
H. the batch loading automatically records the parameter settings of previous data loads and, where necessary, passes them to the data currently being loaded.
2. The setting of the file storage path according to claim 1, wherein the path may be entered directly or selected by location, and all files in the folder are read automatically, realizing intelligent batch loading of data.
3. The data file types according to claim 1, characterized by comprising the loading of various different data types, including txt, excel, Mat, SPC and JCAMP-DX.
4. The automatic identification of data and characters according to claim 1, characterized in that the characters in the header are found from the different representations of data and characters, so as to distinguish them from the actual response data.
5. The data coordinates according to claim 1, characterized by including chemical coordinates and mathematical coordinates, i.e. coordinates with actual chemical meaning and mathematical sequence-number coordinates, respectively.
6. The intelligent identification of data coordinates according to claim 1, characterized in that coordinates are distinguished from the nonlinearly varying actual data responses by their continuous, equidistant and linearly varying character.
7. The equalization of coordinates and lengths according to claim 1, characterized in that, when file data of different lengths are loaded, the data can be cut, zero-padded or linearly interpolated to ensure that the different data have consistent lengths.
8. The data batch-loading method according to claim 1, used to realize intelligent multivariate data analysis, characterized in that data are loaded in batches with little manual intervention, completing the initial yet key step of data analysis.
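The coordinate identification of claims 5 and 6 can be illustrated with a short sketch. It assumes the coordinate, if present, is the first column, and it tests only for equal, nonzero steps; the function names `looks_like_coordinate` and `split_coordinate` and the tolerance value are hypothetical choices, not taken from the patent.

```python
def looks_like_coordinate(values, rel_tol=1e-6):
    """Return True if `values` change in equal, nonzero steps (a linear axis)."""
    if len(values) < 3:
        return False
    steps = [b - a for a, b in zip(values, values[1:])]
    first = steps[0]
    if first == 0:
        return False
    return all(abs(s - first) <= rel_tol * abs(first) for s in steps)


def split_coordinate(columns):
    """Split loaded columns into (coordinate, responses).

    If the first column is equidistant it is taken as the coordinate with
    chemical meaning; otherwise a 1-based sequence number is generated as
    a mathematical coordinate and every column is kept as a response.
    """
    if columns and looks_like_coordinate(columns[0]):
        return list(columns[0]), [list(c) for c in columns[1:]]
    n = len(columns[0]) if columns else 0
    return list(range(1, n + 1)), [list(c) for c in columns]
```

A response column that happens to be perfectly linear would be misclassified by this test, so a practical implementation would likely combine it with header information or user confirmation.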
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038598.8A CN105893392A (en) | 2015-01-26 | 2015-01-26 | Data batch-loading method for intelligent multi-variable data processing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038598.8A CN105893392A (en) | 2015-01-26 | 2015-01-26 | Data batch-loading method for intelligent multi-variable data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105893392A (en) | 2016-08-24 |
Family
ID=56999229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510038598.8A Pending CN105893392A (en) | 2015-01-26 | 2015-01-26 | Data batch-loading method for intelligent multi-variable data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105893392A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101169711A (en) * | 2006-10-27 | 2008-04-30 | 鸿富锦精密工业(深圳)有限公司 | Data conversion system and method |
CN103399848A (en) * | 2013-06-21 | 2013-11-20 | 西安航天动力试验技术研究所 | Engine test data standardized specific format leading-in processing method |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832348A (en) * | 2017-10-19 | 2018-03-23 | 江苏省邮电规划设计院有限责任公司 | A kind of processing method of the network data flow based on collecting terminal to cloud |
CN107832348B (en) * | 2017-10-19 | 2020-01-21 | 中通服咨询设计研究院有限公司 | Method for processing network data stream from intelligent acquisition terminal to cloud |
CN107977349A (en) * | 2017-11-23 | 2018-05-01 | 郑州云海信息技术有限公司 | It is a kind of toward the method and system for adding polytype file in Excel in batches |
CN108572271A (en) * | 2018-01-26 | 2018-09-25 | 深圳市鼎阳科技有限公司 | A kind of cache information sweep-out method and digital oscilloscope for oscillograph |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sokolov et al. | Enhanced process understanding and multivariate prediction of the relationship between cell culture process and monoclonal antibody quality | |
CN109408102B (en) | Version comparison method and device, household electrical appliance and network equipment | |
US20190155797A1 (en) | Systems and methods for providing data quality management | |
AU2011224139B2 (en) | Analysis of object structures such as benefits and provider contracts | |
EP3165984B1 (en) | An event analysis apparatus, an event analysis method, and an event analysis program | |
EP2648152A1 (en) | Data solutions system | |
CN109635292B (en) | Work order quality inspection method and device based on machine learning algorithm | |
CN105868310A (en) | Data processing method and device and electronic device | |
CN102317877A (en) | Program analysis support device | |
CN105893392A (en) | Data batch-loading method for intelligent multi-variable data processing | |
CN113688288B (en) | Data association analysis method, device, computer equipment and storage medium | |
CN110706750B (en) | Dynamic interactive microbiology online analysis cloud platform and generation method thereof | |
CN105488089A (en) | Automatic generation method and system of quality evaluation report | |
CN102067117B (en) | Method for displaying and operating table | |
JP2019074889A (en) | System, method, and program for automating business process involving operation of web browser | |
EP3126979A1 (en) | Specific risk toolkit | |
Hares et al. | Rapid DNA for crime scene use: Enhancements and data needed to consider use on forensic evidence for State and National DNA Databasing–An agreed position statement by ENFSI, SWGDAM and the Rapid DNA Crime Scene Technology Advancement Task Group | |
CN112585547A (en) | Analysis device, analysis method, and analysis program | |
CN104933077B (en) | Rule-based multifile information analysis method | |
CN102637244B (en) | Biological sequence analysis platform and using method thereof | |
US10437847B1 (en) | Transformation based sampling for preprocessing big data | |
BRZOZOWSKA et al. | DATA ENGINEERING IN CRISP-DM PROCESS PRODUCTION DATA–CASE STUDY | |
CN106688002B (en) | Simulation system, simulation method, and simulation program | |
US20230102127A1 (en) | Systems and methods for identifying samples of interest by comparing aligned time-series measurements | |
CN115409541A (en) | Cigarette brand data processing method based on data blood relationship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20160824 |
|
WD01 | Invention patent application deemed withdrawn after publication |