CN107222583A - A kind of data transmission method of fusion structure data and unstructured data - Google Patents
A kind of data transmission method of fusion structure data and unstructured data
- Publication number
- CN107222583A
- Authority
- CN
- China
- Prior art keywords
- data
- structural
- unstructured
- file
- structural data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a data transmission method that fuses structured data and unstructured data. Structured data and unstructured data are merged into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, which avoids the out-of-sync data that can arise when they are sent separately. During transmission and processing, structured and unstructured data that have been associated by the designed method can be processed in the application system in a unified, synchronous manner; this avoids the complexity of asynchronous processing and also avoids inconsistencies caused by the loss of either the structured or the unstructured data. In addition, a parsing procedure matching the transmitted structure is designed, so that received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission.
Description
Technical field
The present invention relates to a data transmission method fusing structured data and unstructured data, and belongs to the technical field of data transmission, data exchange, and data sharing.
Background technology
In the mobile Internet era, the data volume of every industry is growing geometrically. Data is an asset, and big data technology has emerged to collect, store, and mine the value in this data. As big data has risen, one demand has become particularly urgent: data integration, that is, extracting and processing in a unified way the data scattered across fields and systems, and storing it centrally.
Data integration inevitably involves data transmission, data exchange, and data sharing, so each industry is formulating its own standards for them. Data transmission generally involves two classes of data: structured data and unstructured data. The two are usually handled separately during transmission. In many cases, however, structured and unstructured data are closely related and strongly correlated, and transmitting and processing them separately causes many problems. We have therefore designed a new data transmission method that fuses structured and unstructured data so that they are transmitted and processed synchronously, which greatly facilitates data transmission, data exchange, and data sharing.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data transmission method fusing structured data and unstructured data that is simple in design and efficiently ensures that structured and unstructured data are transmitted synchronously.
To solve this technical problem, the present invention adopts the following technical solution: a data transmission method fusing structured data and unstructured data, which transmits data comprising structured data and unstructured data and includes the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data and, at the same time, convert the structured data into a preset data encoding format; then proceed to step 002.
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data; then proceed to step 003, where N ≥ 1.
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data. Each field in the structured data that has an extension field forms, together with that extension field, a compound field. Then proceed to step 004.
Step 004. Obtain the length of the structured data and the length of the unstructured data, and combine the structured data length, the unstructured data length, and the preset data encoding format of the structured data into a file header; then proceed to step 005.
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data comprising structured data and unstructured data.
As a preferred technical solution of the present invention, the method further includes the following steps after step 005; once step 005 has been executed, proceed to step 006:
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length, the unstructured data length, and the preset data encoding format of the structured data; then proceed to step 007.
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then proceed to step 008.
Step 008. Parse the structured data according to the structured data length and the preset data encoding format of the structured data to obtain the parsed structured data; then proceed to step 009.
Step 009. According to the extension field in each compound field of the structured data, extract each file in the unstructured data of the semi-structured data one by one.
As a preferred technical solution of the present invention, in step 001 the preset attribute information includes the file name, file type, and file size.
As a preferred technical solution of the present invention, in step 001 the structured data is converted into the JSON data encoding format.
Compared with the prior art, the data transmission method fusing structured data and unstructured data of the present invention, using the above technical solution, has the following technical effects:
(1) The method merges structured data and unstructured data into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, which avoids the out-of-sync data that can arise when they are sent separately. During transmission and processing, structured and unstructured data associated by the designed method can be processed in the application system in a unified, synchronous manner, which avoids the complexity of asynchronous processing and also avoids inconsistencies caused by the loss of either the structured or the unstructured data. In addition, a parsing procedure matching the transmitted structure is designed, so that received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission.
(2) In the method, the structured data is specifically converted into the JSON data encoding format. First, JSON is simple and clear and, compared with XML, is smaller, faster, and easier to parse. Second, JSON is a language-independent standard with broad support: essentially all mainstream programming languages have libraries for parsing JSON data. This improves the working efficiency of the method in practical applications.
Brief description of the drawings
Fig. 1 is a flow diagram of the data transmission method fusing structured data and unstructured data designed by the present invention;
Fig. 2 is a schematic diagram of converting structured data into the JSON data encoding format in the design of the present invention;
Fig. 3 is a schematic diagram of the structure of the file header in the design of the present invention.
Detailed description
A specific embodiment of the present invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, in practical application the data transmission method fusing structured data and unstructured data designed by the present invention transmits data comprising structured data and unstructured data and includes the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data, including the file name, file type, and file size, and at the same time convert the structured data into the JSON data encoding format; then proceed to step 002.
The JSON data encoding format is used here for the structured data for two reasons. First, JSON is simple and clear and, compared with XML, is smaller, faster, and easier to parse. Second, JSON is a language-independent standard with broad support: essentially all mainstream programming languages have libraries for parsing JSON data. This improves the working efficiency of the method in practical applications.
In practical application, the structured data is converted into the JSON data encoding format as shown in Fig. 2. The top-level structure is named record and represents one record of structured data. The level below record holds the individual fields. Each field is a key-value pair in which the key is the field name and the value is the field value; ordinary fields are converted directly. If a field is associated with unstructured data, its value part can be further extended into a reference to the unstructured data. The reference carries the information extracted in the previous step when the unstructured data was generated, including the file name, the file type, whether the file is binary, and the file size.
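The record layout described above can be illustrated with a minimal sketch. Fig. 2 is not reproduced in this text, so the field names (`id`, `name`, `photo`) and the reference keys (`fileName`, `fileType`, `isBinary`, `fileSize`) are assumptions chosen for illustration, not names taken from the patent.

```python
import json

# Hypothetical unstructured file content (stands in for a real image).
photo_bytes = b"\xff\xd8\xff\xe0 demo image bytes"

# Top-level "record" holds one structured record. Ordinary fields map
# directly to key-value pairs; the "photo" field is extended into a
# reference to a file in the unstructured data (key names are assumed).
record = {
    "record": {
        "id": "1001",
        "name": "Zhang San",
        "photo": {
            "fileName": "photo.jpg",
            "fileType": "JPG",
            "isBinary": True,
            "fileSize": len(photo_bytes),
        },
    }
}

# Serialize the structured data to JSON, encoded here as UTF-8 bytes.
structured = json.dumps(record, ensure_ascii=False).encode("utf-8")
print(structured.decode("utf-8"))
```

Recording the file size in the reference is what later lets a receiver cut the right number of bytes out of the unstructured part.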
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data; then proceed to step 003, where N ≥ 1.
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data. Each field in the structured data that has an extension field forms, together with that extension field, a compound field. Then proceed to step 004.
Step 004. Obtain the length of the structured data and the length of the unstructured data, and combine the structured data length, the unstructured data length, and the preset data encoding format of the structured data into a file header; then proceed to step 005.
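Steps 002 to 004 can be sketched as follows. This is only one possible reading: the function name `attach_references`, the shape of the `files` mapping, and the key names inside the compound field are hypothetical, and the patent does not say whether the original field value is kept alongside the extension field (this sketch keeps it under `value`).

```python
import json

def attach_references(fields: dict, files: dict) -> dict:
    """Turn each file-bearing field into a compound field.

    fields: ordinary structured fields, {field_name: value}.
    files:  {field_name: (file_name, content_bytes)}, one entry per file,
            so the N files map one-to-one onto N fields.
    """
    if len(files) < 1:
        raise ValueError("N must be at least 1")
    missing = set(files) - set(fields)
    if missing:
        raise KeyError(f"no structured field for: {sorted(missing)}")

    fused = dict(fields)
    for field_name, (file_name, content) in files.items():
        # Field value plus extension field (the file's attributes).
        fused[field_name] = {
            "value": fields[field_name],
            "fileName": file_name,
            "fileType": file_name.rsplit(".", 1)[-1].upper(),
            "fileSize": len(content),
        }
    return {"record": fused}

fields = {"id": "1001", "contract": "C-2017-08"}
files = {"contract": ("contract.pdf", b"%PDF-demo")}

record = attach_references(fields, files)
structured = json.dumps(record, ensure_ascii=False).encode("utf-8")

# Step 004 inputs: the two lengths that go into the file header.
structured_len = len(structured)
unstructured_len = sum(len(content) for _, content in files.values())
```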
Based on the above, in practical application the file header can be designed as shown in Fig. 3. The header has a fixed total length of 24 bytes and consists of three parts: the structured data length, the unstructured data length, and the preset data encoding format of the structured data. The structured data length occupies 4 bytes, is an unsigned integer in binary form stored in big-endian order, and can express lengths from 0 to 4294967295. The unstructured data length likewise occupies 4 bytes, is an unsigned integer in binary form stored in big-endian order, and can express lengths from 0 to 4294967295. The preset data encoding format of the structured data occupies 16 bytes and is represented as a character string in which all English letters are uppercase; the string is written from left to right and the remaining bytes are padded with hexadecimal 0x00. For example, if the character encoding is UTF-8, it occupies 5 bytes from the left and the remaining 11 bytes are padded with 0x00.
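The 24-byte header layout described above maps naturally onto Python's standard `struct` module. The sketch below is illustrative only; the function names `pack_header` and `unpack_header` are hypothetical, and the ASCII restriction on the format name is an assumption.

```python
import struct

# ">" selects big-endian with no padding; "I" is a 4-byte unsigned
# integer; "16s" is a 16-byte string field. 4 + 4 + 16 = 24 bytes.
HEADER_FORMAT = ">II16s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def pack_header(structured_len: int, unstructured_len: int, fmt_name: str) -> bytes:
    """Build the file header; the format name is uppercased as described."""
    name = fmt_name.upper().encode("ascii")
    if len(name) > 16:
        raise ValueError("format name does not fit in 16 bytes")
    # struct pads a short "16s" value on the right with 0x00 bytes.
    return struct.pack(HEADER_FORMAT, structured_len, unstructured_len, name)

def unpack_header(header: bytes):
    """Return (structured_len, unstructured_len, fmt_name)."""
    s_len, u_len, raw = struct.unpack(HEADER_FORMAT, header)
    return s_len, u_len, raw.rstrip(b"\x00").decode("ascii")

header = pack_header(100, 2048, "utf-8")
```

Because each length is a 4-byte unsigned integer, `struct.pack` raises `struct.error` for values outside 0 to 4294967295, which matches the stated range.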
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data comprising structured data and unstructured data.
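A minimal sketch of the concatenation in step 005 follows. It assumes that the files are laid out in the unstructured part in the same order as their fields appear in the record, and that the format field of the header carries the string "JSON"; neither detail is stated in the patent, and `build_message` is a hypothetical name.

```python
import json
import struct
from typing import Sequence

def build_message(record: dict, files: Sequence[bytes], fmt_name: str = "JSON") -> bytes:
    """Concatenate header + structured data + unstructured data."""
    structured = json.dumps(record, ensure_ascii=False).encode("utf-8")
    # Assumption: files are appended back to back, in field order.
    unstructured = b"".join(files)
    header = struct.pack(
        ">II16s",                     # big-endian: uint32, uint32, 16-byte string
        len(structured),
        len(unstructured),
        fmt_name.upper().encode("ascii"),
    )
    return header + structured + unstructured

message = build_message({"record": {"id": 1}}, [b"abc", b"de"])
```

The resulting byte string can then be written to a single file or sent over any transport, so the structured and unstructured parts always travel together.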
Correspondingly, after the receiving end receives the above semi-structured data, it parses the semi-structured data using the following steps:
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length, the unstructured data length, and the preset data encoding format of the structured data; then proceed to step 007.
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then proceed to step 008.
Step 008. Parse the structured data according to the structured data length and the preset data encoding format of the structured data to obtain the parsed structured data; then proceed to step 009.
Step 009. According to the extension field in each compound field of the structured data, extract each file in the unstructured data of the semi-structured data one by one.
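The receiving-side steps can be sketched as below. As with the sending side, several details are assumptions rather than facts from the patent: the header format field is taken to hold "JSON", compound fields are recognized by a `fileSize` key, and files are assumed to be stored back to back in field order so that the recorded sizes give their offsets. The name `parse_message` is hypothetical.

```python
import json
import struct

HEADER = struct.Struct(">II16s")  # 24-byte big-endian header

def parse_message(message: bytes):
    """Split semi-structured data into the record and its files."""
    # Step 006: read the two lengths and the format name from the header.
    s_len, u_len, raw_fmt = HEADER.unpack_from(message, 0)
    fmt_name = raw_fmt.rstrip(b"\x00").decode("ascii")
    if fmt_name != "JSON":
        raise ValueError(f"unsupported structured format: {fmt_name}")

    # Steps 007-008: slice out and parse the structured data.
    start = HEADER.size
    structured = message[start:start + s_len]
    unstructured = message[start + s_len:start + s_len + u_len]
    if len(structured) != s_len or len(unstructured) != u_len:
        raise ValueError("message is truncated")
    record = json.loads(structured.decode("utf-8"))

    # Step 009: walk the compound fields and cut out each file in turn.
    files = {}
    offset = 0
    for value in record["record"].values():
        if isinstance(value, dict) and "fileSize" in value:
            size = value["fileSize"]
            files[value["fileName"]] = unstructured[offset:offset + size]
            offset += size
    return record, files
```

Checking the slice lengths against the header values gives the receiver a cheap way to detect a truncated transfer before any file is extracted.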
Based on the above technical solution, in practical application the data transmission method fusing structured data and unstructured data designed by the present invention merges structured data and unstructured data into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, which avoids the out-of-sync data that can arise when they are sent separately. During transmission and processing, structured and unstructured data associated by the designed method can be processed in the application system in a unified, synchronous manner, which avoids the complexity of asynchronous processing and also avoids inconsistencies caused by the loss of either the structured or the unstructured data. In addition, a parsing procedure matching the transmitted structure is designed, so that received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission.
Embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to these embodiments. Within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the concept of the present invention.
Claims (4)
1. A data transmission method fusing structured data and unstructured data, for transmitting data comprising structured data and unstructured data, characterized by comprising the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data and, at the same time, convert the structured data into a preset data encoding format; then proceed to step 002.
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data; then proceed to step 003, where N ≥ 1.
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data, wherein each field in the structured data that has an extension field forms, together with that extension field, a compound field; then proceed to step 004.
Step 004. Obtain the length of the structured data and the length of the unstructured data, and combine the structured data length, the unstructured data length, and the preset data encoding format of the structured data into a file header; then proceed to step 005.
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data comprising structured data and unstructured data.
2. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that the method further comprises the following steps after step 005; once step 005 has been executed, proceed to step 006:
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length, the unstructured data length, and the preset data encoding format of the structured data; then proceed to step 007.
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then proceed to step 008.
Step 008. Parse the structured data according to the structured data length and the preset data encoding format of the structured data to obtain the parsed structured data; then proceed to step 009.
Step 009. According to the extension field in each compound field of the structured data, extract each file in the unstructured data of the semi-structured data one by one.
3. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that in step 001 the preset attribute information includes the file name, file type, and file size.
4. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that in step 001 the structured data is converted into the JSON data encoding format.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710671366.5A CN107222583A (en) | 2017-08-08 | 2017-08-08 | A kind of data transmission method of fusion structure data and unstructured data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710671366.5A CN107222583A (en) | 2017-08-08 | 2017-08-08 | A kind of data transmission method of fusion structure data and unstructured data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107222583A true CN107222583A (en) | 2017-09-29 |
Family
ID=59954723
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710671366.5A Pending CN107222583A (en) | 2017-08-08 | 2017-08-08 | A kind of data transmission method of fusion structure data and unstructured data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107222583A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108513141A (en) * | 2018-03-26 | 2018-09-07 | 深圳市景阳信息技术有限公司 | A kind of receiving/transmission method of data, device and equipment |
CN111611011A (en) * | 2020-04-13 | 2020-09-01 | 中国科学院计算机网络信息中心 | JSON syntax extension method and analysis method and device supporting Blob data types |
CN112422510A (en) * | 2020-10-22 | 2021-02-26 | 山东浪潮通软信息科技有限公司 | Data transmission method and system based on DMZ zone |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103186610A (en) * | 2011-12-30 | 2013-07-03 | 金蝶软件(中国)有限公司 | Data synchronization method and device |
CN104899261A (en) * | 2015-05-20 | 2015-09-09 | 杜晓通 | Device and method for constructing structured video image information |
WO2015175548A1 (en) * | 2014-05-12 | 2015-11-19 | Diffeo, Inc. | Entity-centric knowledge discovery |
CN106993041A (en) * | 2017-04-01 | 2017-07-28 | 国网福建省电力有限公司 | A kind of power marketing moves work data synchronous method |
-
2017
- 2017-08-08 CN CN201710671366.5A patent/CN107222583A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11568144B2 (en) | Calculating structural differences from binary differences in publish subscribe system | |
CN107222583A (en) | A kind of data transmission method of fusion structure data and unstructured data | |
CN107561564B (en) | A kind of compression implementation method of big-dipper satellite information transmission | |
CN102103605A (en) | Method and system for intelligently extracting document structure | |
CN109492177B (en) | web page blocking method based on web page semantic structure | |
CN102799592A (en) | Parsing method and system of rich text document | |
CN101950312A (en) | Method for analyzing webpage content of internet | |
US7318194B2 (en) | Methods and apparatus for representing markup language data | |
CN103902918B (en) | Method and device for rapidly extracting text from Word document | |
CN105808262B (en) | A kind of name matching process based on json formatted datas | |
CN102411602B (en) | Extensive makeup language (XML) parallel speculation analysis method realized on basis of field programmable gate array (FPGA) | |
CN108664546A (en) | Xml data structure conversion method and device | |
CN101388731B (en) | Low rate equivalent speech water sound communication technique | |
CN103188267A (en) | Protocol analyzing method based on DFA (Deterministic Finite Automaton) | |
CN108366050A (en) | A kind of common communication protocol processing method | |
CN102663108B (en) | Medicine corporation finding method based on parallelization label propagation algorithm for complex network model | |
CN106874240A (en) | Digital publishing method and system | |
CN102487353A (en) | Data transmission method | |
CN105740292B (en) | A kind of coding/decoding method and device | |
CN106777061B (en) | Information hiding system and method based on webpage text and image and extraction method | |
US12056434B2 (en) | Generating tagged content from text of an electronic document | |
CN109857958B (en) | Method for searching http input point | |
CN101686568B (en) | Methods and terminals for transmitting and displaying text information | |
CN112783836A (en) | Information exchange method, device and computer storage medium | |
CN105354021A (en) | Implementation method for integrating command lines in Linux kernel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170929 |