CN107222583A - A data transmission method fusing structured data and unstructured data - Google Patents

Info

Publication number
CN107222583A
Authority
CN
China
Prior art keywords
data
structural
unstructured
file
structural data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710671366.5A
Other languages
Chinese (zh)
Inventor
吉琨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu City Science And Technology Co
Original Assignee
Jiangsu City Science And Technology Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu City Science And Technology Co
Priority to CN201710671366.5A
Publication of CN107222583A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a data transmission method that fuses structured data and unstructured data. Structured data and unstructured data are merged into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, avoiding the out-of-sync data that can arise when they are transmitted separately. During transmission and processing, once the structured and unstructured data have been associated by the designed method, an application system can process them in a unified, synchronous way; this avoids the complexity of asynchronous processing and also avoids the inconsistencies caused by losing part of the structured or unstructured data. Based on the structure designed for transmission, a corresponding parsing procedure is designed so that the received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission.

Description

A data transmission method fusing structured data and unstructured data
Technical field
The present invention relates to a data transmission method fusing structured data and unstructured data, and belongs to the technical field of data transmission, data exchange, and data sharing.
Background
In the mobile Internet era, data volumes in every industry are growing geometrically. Data is an asset, and big data technology has emerged to collect, store, and mine the value in that data. As big data has risen, one demand has become particularly urgent: data integration. Industry after industry has called for data scattered across fields and systems to be extracted and processed in a unified way and stored centrally.
Data integration inevitably involves data transmission, data exchange, and data sharing, so each industry is formulating its own standards for them. Data transmission generally involves two classes of data: structured data and unstructured data. The two are usually handled separately during transmission. In many cases, however, structured and unstructured data are closely related and strongly correlated, and transmitting and processing them separately causes many problems. We have therefore designed a new data transmission method that fuses structured and unstructured data so that they are transmitted and processed synchronously, which greatly facilitates data transmission, data exchange, and data sharing.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data transmission method fusing structured data and unstructured data that is simple in design and efficiently ensures that structured data and unstructured data are transmitted synchronously.
To solve the above technical problem, the present invention adopts the following technical solution. The present invention provides a data transmission method fusing structured data and unstructured data, for transmitting data that includes both structured data and unstructured data, comprising the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data and, at the same time, convert the structured data into a preset data encoding format; then go to step 002;
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data, where N ≥ 1; then go to step 003;
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data; each field in the structured data that has an extension field forms, together with that extension field, a compound field; then go to step 004;
Step 004. Obtain the length information of the structured data and the length information of the unstructured data, then combine the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data to form a file header; then go to step 005;
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data that includes both structured data and unstructured data.
As a preferred technical solution of the present invention, the method further comprises the following steps after step 005; once step 005 has been executed, go to step 006:
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data; then go to step 007;
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then go to step 008;
Step 008. According to the structured data length information and the preset data encoding format of the structured data, parse the structured data to obtain the parsed structured data; then go to step 009;
Step 009. According to the extension field in each compound field of the structured data, extract one by one each file of the unstructured data in the semi-structured data.
As a preferred technical solution of the present invention, in step 001 the preset attribute information includes the file name, file type, and file size.
As a preferred technical solution of the present invention, in step 001 the structured data is converted into the JSON data encoding format.
Compared with the prior art, the data transmission method fusing structured data and unstructured data of the present invention, using the above technical solution, has the following technical effects:
(1) The method merges structured data and unstructured data into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, avoiding the out-of-sync data that can arise when they are transmitted separately. During transmission and processing, once the structured and unstructured data have been associated by the designed method, an application system can process them in a unified, synchronous way; this avoids the complexity of asynchronous processing and also avoids the inconsistencies caused by losing part of the structured or unstructured data. Based on the structure designed for transmission, a corresponding parsing procedure is designed so that the received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission;
(2) In the method, the structured data is specifically converted into the JSON data encoding format, for two reasons. First, JSON is simple and clear and, compared with XML, is smaller, faster, and easier to parse. Second, JSON is a standard that is language-independent yet widely supported: essentially every mainstream programming language has a library for parsing JSON data. This improves the working efficiency of the method in practical applications.
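The size claim in effect (2) can be checked informally on a small record. The sketch below is only an illustration under assumed field names and values (none of them come from the patent); it serializes the same three fields as JSON and as XML using the Python standard library and compares the byte counts. It says nothing about parse speed, and results will differ for other data shapes.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the field names and values are illustrative only.
fields = {"id": "1001", "name": "Zhang San", "city": "Nanjing"}

# JSON encoding with a top-level "record" object.
json_bytes = json.dumps({"record": fields}).encode("utf-8")

# Equivalent XML encoding: one child element per field.
root = ET.Element("record")
for key, value in fields.items():
    ET.SubElement(root, key).text = value
xml_bytes = ET.tostring(root, encoding="utf-8")

print(len(json_bytes), len(xml_bytes))
```

For this record the JSON form is shorter because XML repeats every field name in a closing tag.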
Brief description of the drawings
Fig. 1 is a flow diagram of the data transmission method fusing structured data and unstructured data designed by the present invention;
Fig. 2 is a schematic diagram of structured data converted into the JSON data encoding format in the present design;
Fig. 3 is a schematic diagram of the file header structure in the present design.
Detailed description
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the data transmission method fusing structured data and unstructured data designed by the present invention is used, in practical application, to transmit data that includes both structured data and unstructured data, and comprises the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data, including the file name, file type, and file size, and at the same time convert the structured data into the JSON data encoding format; then go to step 002.
The present design uses the JSON data encoding format for the structured data for two reasons. First, JSON is simple and clear and, compared with XML, is smaller, faster, and easier to parse. Second, JSON is a standard that is language-independent yet widely supported: essentially every mainstream programming language has a library for parsing JSON data. This improves the working efficiency of the method in practical applications.
In practical application, the structured data is converted into the JSON data encoding format as shown in Fig. 2. The top-level structure is named record and represents one record of structured data. The level below record holds the individual fields: each field is a key-value pair in which the key is the field name and the value is the field value, and ordinary fields are converted directly. If a field is associated with unstructured data, its value part is extended into a reference to that unstructured data. The reference carries the attribute information extracted from the unstructured data in the previous step, including the file name, file type, whether the file is binary, and file size.
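Since Fig. 2 is not reproduced in this text, the following is only a sketch of what such a record might look like. The top-level name record and the four reference attributes come from the description above; the concrete key names (file_name, file_type, is_binary, file_size) and all values are assumptions made for illustration.

```python
import json

# Hypothetical record; key names and values are illustrative assumptions.
record = {
    "record": {
        # Ordinary fields are converted directly to key-value pairs.
        "id": "1001",
        "name": "Zhang San",
        # Compound field: the value is extended into a reference to a file
        # carried in the unstructured part of the same transmission.
        "photo": {
            "file_name": "photo.jpg",
            "file_type": "image/jpeg",
            "is_binary": True,
            "file_size": 20480,
        },
    }
}

# Serialize to bytes; this is the "structured data" part of the message.
structured_bytes = json.dumps(record, ensure_ascii=False).encode("utf-8")
print(structured_bytes.decode("utf-8"))
```

A receiver can tell ordinary fields from compound fields here because only compound fields have an object as their value.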
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data, where N ≥ 1; then go to step 003.
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data; each field in the structured data that has an extension field forms, together with that extension field, a compound field; then go to step 004.
Step 004. Obtain the length information of the structured data and the length information of the unstructured data, then combine the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data to form a file header; then go to step 005.
Based on the above, the file header can in practice be designed as shown in Fig. 3. The file header has a fixed total length of 24 bytes and consists of three parts: the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data. The structured data length information occupies 4 bytes, stored as an unsigned binary integer in big-endian order, and can express lengths from 0 to 4294967295. The unstructured data length information likewise occupies 4 bytes, stored as an unsigned binary integer in big-endian order, and can express lengths from 0 to 4294967295. The preset data encoding format of the structured data occupies 16 bytes and is stored as a character string with all English letters in uppercase, written from left to right, with the remaining bytes filled with 0x00 (hexadecimal). For example, if the character encoding is UTF-8, the string occupies the first 5 bytes and the remaining 11 bytes are filled with 0x00.
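The 24-byte layout described above maps directly onto Python's struct module. The sketch below is one possible encoding, not the patent's own implementation; the function name pack_header and the constant names are hypothetical. The format string ">II16s" means big-endian, two unsigned 32-bit integers, then a 16-byte string that struct pads with 0x00 when the value is shorter.

```python
import struct

HEADER_FORMAT = ">II16s"                      # big-endian: uint32, uint32, 16 bytes
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 4 + 16 = 24 bytes

def pack_header(structured_len, unstructured_len, encoding="UTF-8"):
    """Build the 24-byte file header (hypothetical helper)."""
    name = encoding.upper().encode("ascii")   # letters are stored in uppercase
    if len(name) > 16:
        raise ValueError("encoding name does not fit in 16 bytes")
    # struct right-pads the 16s field with 0x00 bytes automatically.
    return struct.pack(HEADER_FORMAT, structured_len, unstructured_len, name)

header = pack_header(10, 300, "utf-8")
print(header.hex())
```

Each length field is limited to 4294967295 because it is an unsigned 32-bit integer; struct.pack raises struct.error for values outside that range.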
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data that includes both structured data and unstructured data.
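Steps 002 to 005 on the sending side can be sketched as a single function. This is a minimal illustration under several assumptions that the text does not spell out: the function name build_message, the JSON key names, and the choice to concatenate the files in the same order as their compound fields are all hypothetical.

```python
import json
import struct

def build_message(fields, attachments, encoding="UTF-8"):
    """Fuse one record and its files into header + structured + unstructured bytes.

    fields:      ordinary field name -> value
    attachments: field name -> (file_name, file_type, payload bytes)
    """
    record = dict(fields)
    payloads = []
    for field_name, (file_name, file_type, payload) in attachments.items():
        # Steps 002-003: one compound field per file, holding its attributes.
        record[field_name] = {
            "file_name": file_name,
            "file_type": file_type,
            "file_size": len(payload),
        }
        payloads.append(payload)

    structured = json.dumps({"record": record}, ensure_ascii=False).encode(encoding)
    unstructured = b"".join(payloads)  # assumed: files in compound-field order

    # Step 004: 24-byte header (two big-endian uint32 lengths + 16-byte name).
    header = struct.pack(">II16s", len(structured), len(unstructured),
                         encoding.upper().encode("ascii"))
    # Step 005: concatenate header, structured data, unstructured data.
    return header + structured + unstructured

message = build_message({"id": "1001"},
                        {"photo": ("p.jpg", "image/jpeg", b"\xff\xd8\xff")})
```

Because the header carries both lengths, the result is self-delimiting and can be written to a file or sent over any byte stream as one unit.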
Correspondingly, after the receiving end receives the above semi-structured data, it parses the semi-structured data using the following steps.
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data; then go to step 007;
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then go to step 008;
Step 008. According to the structured data length information and the preset data encoding format of the structured data, parse the structured data to obtain the parsed structured data; then go to step 009;
Step 009. According to the extension field in each compound field of the structured data, extract one by one each file of the unstructured data in the semi-structured data.
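The receiving steps 006 to 009 can be sketched in the same style. The text does not say how individual files are located inside the unstructured block, so this sketch assumes they are concatenated in the order of their compound fields and uses each file_size to find the boundaries. The function name parse_message and the JSON key names are likewise hypothetical.

```python
import json
import struct

def parse_message(message):
    """Split a fused message back into its record and its files."""
    # Step 006: parse the fixed 24-byte header.
    s_len, u_len, raw_encoding = struct.unpack(">II16s", message[:24])
    encoding = raw_encoding.rstrip(b"\x00").decode("ascii")

    # Step 007: slice out the structured and unstructured parts.
    structured = message[24:24 + s_len]
    unstructured = message[24 + s_len:24 + s_len + u_len]

    # Step 008: decode the structured data with the declared encoding.
    record = json.loads(structured.decode(encoding))

    # Step 009: walk the compound fields and cut out each file by its size.
    files = {}
    offset = 0
    for value in record["record"].values():
        if isinstance(value, dict) and "file_size" in value:
            size = value["file_size"]
            files[value["file_name"]] = unstructured[offset:offset + size]
            offset += size
    return record, files
```

A production parser would also verify that the sizes in the compound fields sum to the unstructured length in the header, which gives a cheap consistency check against truncated transfers.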
Based on the above technical solution, the data transmission method fusing structured data and unstructured data designed by the present invention, in practical application, merges structured data and unstructured data into a single file for transmission, so that the two stay synchronized when they are transmitted, exchanged, and shared over a network, avoiding the out-of-sync data that can arise when they are transmitted separately. During transmission and processing, once the structured and unstructured data have been associated by the designed method, an application system can process them in a unified, synchronous way; this avoids the complexity of asynchronous processing and also avoids the inconsistencies caused by losing part of the structured or unstructured data. Based on the structure designed for transmission, a corresponding parsing procedure is designed so that the received structured and unstructured data are parsed and processed uniformly, which greatly improves parsing efficiency during data transmission.
Embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to these embodiments; within the knowledge possessed by those of ordinary skill in the art, various changes can be made without departing from the concept of the present invention.

Claims (4)

1. A data transmission method fusing structured data and unstructured data, for transmitting data that includes both structured data and unstructured data, characterized by comprising the following steps:
Step 001. Obtain the preset attribute information of each file in the unstructured data and, at the same time, convert the structured data into a preset data encoding format; then go to step 002;
Step 002. According to the number N of files in the unstructured data, obtain the N fields in the structured data that correspond one-to-one to the N files in the unstructured data, where N ≥ 1; then go to step 003;
Step 003. Add the preset attribute information of each file in the unstructured data as an extension field of the corresponding field in the structured data, forming a reference to the corresponding file in the unstructured data; each field in the structured data that has an extension field forms, together with that extension field, a compound field; then go to step 004;
Step 004. Obtain the length information of the structured data and the length information of the unstructured data, then combine the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data to form a file header; then go to step 005;
Step 005. Concatenate the file header, the structured data, and the unstructured data in that order to form semi-structured data, and transmit it, thereby transmitting the data that includes both structured data and unstructured data.
2. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that the method further comprises the following steps after step 005; once step 005 has been executed, go to step 006:
Step 006. The receiving end receives the semi-structured data and parses the file header to obtain the structured data length information, the unstructured data length information, and the preset data encoding format of the structured data; then go to step 007;
Step 007. Extract the structured data from the semi-structured data, and further extract each compound field in it; then go to step 008;
Step 008. According to the structured data length information and the preset data encoding format of the structured data, parse the structured data to obtain the parsed structured data; then go to step 009;
Step 009. According to the extension field in each compound field of the structured data, extract one by one each file of the unstructured data in the semi-structured data.
3. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that in step 001 the preset attribute information includes the file name, file type, and file size.
4. The data transmission method fusing structured data and unstructured data according to claim 1, characterized in that in step 001 the structured data is converted into the JSON data encoding format.
CN201710671366.5A 2017-08-08 2017-08-08 A kind of data transmission method of fusion structure data and unstructured data Pending CN107222583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710671366.5A CN107222583A (en) 2017-08-08 2017-08-08 A kind of data transmission method of fusion structure data and unstructured data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710671366.5A CN107222583A (en) 2017-08-08 2017-08-08 A kind of data transmission method of fusion structure data and unstructured data

Publications (1)

Publication Number Publication Date
CN107222583A (en) 2017-09-29

Family

ID=59954723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710671366.5A Pending CN107222583A (en) 2017-08-08 2017-08-08 A kind of data transmission method of fusion structure data and unstructured data

Country Status (1)

Country Link
CN (1) CN107222583A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108513141A (en) * 2018-03-26 2018-09-07 深圳市景阳信息技术有限公司 A kind of receiving/transmission method of data, device and equipment
CN111611011A (en) * 2020-04-13 2020-09-01 中国科学院计算机网络信息中心 JSON syntax extension method and analysis method and device supporting Blob data types
CN112422510A (en) * 2020-10-22 2021-02-26 山东浪潮通软信息科技有限公司 Data transmission method and system based on DMZ zone

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186610A (en) * 2011-12-30 2013-07-03 金蝶软件(中国)有限公司 Data synchronization method and device
CN104899261A (en) * 2015-05-20 2015-09-09 杜晓通 Device and method for constructing structured video image information
WO2015175548A1 (en) * 2014-05-12 2015-11-19 Diffeo, Inc. Entity-centric knowledge discovery
CN106993041A (en) * 2017-04-01 2017-07-28 国网福建省电力有限公司 A kind of power marketing moves work data synchronous method

Similar Documents

Publication Publication Date Title
US11568144B2 (en) Calculating structural differences from binary differences in publish subscribe system
CN107222583A (en) A kind of data transmission method of fusion structure data and unstructured data
CN107561564B (en) A kind of compression implementation method of big-dipper satellite information transmission
CN102103605A (en) Method and system for intelligently extracting document structure
CN109492177B (en) web page blocking method based on web page semantic structure
CN102799592A (en) Parsing method and system of rich text document
CN101950312A (en) Method for analyzing webpage content of internet
US7318194B2 (en) Methods and apparatus for representing markup language data
CN103902918B (en) Method and device for rapidly extracting text from Word document
CN105808262B (en) A kind of name matching process based on json formatted datas
CN102411602B (en) Extensive makeup language (XML) parallel speculation analysis method realized on basis of field programmable gate array (FPGA)
CN108664546A (en) Xml data structure conversion method and device
CN101388731B (en) Low rate equivalent speech water sound communication technique
CN103188267A (en) Protocol analyzing method based on DFA (Deterministic Finite Automaton)
CN108366050A (en) A kind of common communication protocol processing method
CN102663108B (en) Medicine corporation finding method based on parallelization label propagation algorithm for complex network model
CN106874240A (en) Digital publishing method and system
CN102487353A (en) Data transmission method
CN105740292B (en) A kind of coding/decoding method and device
CN106777061B (en) Information hiding system and method based on webpage text and image and extraction method
US12056434B2 (en) Generating tagged content from text of an electronic document
CN109857958B (en) Method for searching http input point
CN101686568B (en) Methods and terminals for transmitting and displaying text information
CN112783836A (en) Information exchange method, device and computer storage medium
CN105354021A (en) Implementation method for integrating command lines in Linux kernel

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170929