CN103853775A - Method for converting data storage format based on multimedia data - Google Patents

Publication number
CN103853775A
Authority
CN
China
Prior art keywords
data
xml
vrml
unstructured
storage format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210512403.5A
Other languages
Chinese (zh)
Inventor
刘海亮
杨艾琳
罗笑南
苏航
曾坤
王炫盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute of Sun Yat Sen University
Original Assignee
Shenzhen Research Institute of Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-12-04
Filing date
2012-12-04
Publication date
2014-06-11
Application filed by Shenzhen Research Institute of Sun Yat Sen University
Priority to CN201210512403.5A
Publication of CN103853775A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/258: Data format conversion from or to a database
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method for identifying and converting data storage format based on multimedia data. The method comprises the following steps: receiving input of unstructured data based on multimedia data; determining the data format of the unstructured data; if the data format is identified as initialized plain text, organizing the acquired initialized plain text into an XML stream using an XML (Extensible Markup Language) library; if the data format is identified as VRML (Virtual Reality Modeling Language) data, converting the data format of the acquired VRML data using an X3D converter; and storing the organized XML stream and/or the VRML data converted by the X3D converter in a relational database. By implementing the method, unstructured data is converted into data that can be represented in a relational database, so that applications built on a relational database can use it.

Description

A method for converting data storage format based on multimedia data
Technical field
The present invention relates to the technical field of the digital home, and in particular to a method for converting data storage format based on multimedia data.
Background art
At present, most information is unstructured, and unstructured data accounts for the bulk of the information in digital home applications. Such data is widely used, difficult to process, and governed by many standards. Unstructured data is also the principal form of heterogeneous data, and handling it is one of the major problems the contemporary digital home urgently needs to solve.
Interactive multimedia in the digital home refers to data, carrying information such as text and images, with which users can interact. This information is generally unstructured and includes text, images, and the like. Processing such unstructured data is not easy, however. Mature data storage technology today is still based on relational databases, which offer simple, standardized operation and make data analysis and mining easier than unstructured data does. How to convert unstructured multimedia data into structured relational data has therefore become an important question.
Several methods currently exist for converting plain text into a relational database. These methods first make the unstructured data semi-structured, and then further process the semi-structured data into structured data that fits a relational database. In the semi-structuring stage, XML is the commonly used approach; XML is itself a semi-structured data storage format, and it can be good at
Existing conversion and storage methods work well for plain text, but they do not support graphics data and image data well. Graphics play an important role in digital home entertainment and are generally represented as meshes; images are very important in home healthcare and are usually accompanied by data such as image features. Both kinds of data need special handling, which current methods do not provide.
Summary of the invention
The object of the present invention is to provide a conversion and storage method for the multimedia data produced by the massive number of interactive applications in a digital home environment, so that this unstructured data is stored in a standard relational database where subsequent applications can readily use it.
An embodiment of the present invention provides a method for identifying and converting data storage format based on multimedia data, the method comprising:
receiving input of unstructured data based on multimedia data;
determining the data format of the unstructured data;
if the data format identified in the unstructured data is initialized plain text, organizing the acquired initialized plain text into an XML stream using an Extensible Markup Language (XML) library;
if the data format identified in the unstructured data is Virtual Reality Modeling Language (VRML) data, converting the data format of the acquired VRML data using an Extensible 3D (X3D) converter;
storing the organized XML stream and/or the VRML data converted by the X3D converter in a relational database.
The initialized plain text is text saved as a TXT text file.
Organizing the acquired initialized plain text into an XML stream using the XML library comprises:
making an XML template, inputting the identification characters, the separator characters, and the source data, and then generating an XML file.
Converting the data format of the acquired VRML data using the X3D converter comprises:
semantically annotating the graphics and image data;
performing classification-statistics and logic-building preprocessing on the annotated information;
classifying the nodes;
formatting the classified X3D data into the corresponding schema, table, and table-entry types required by the relational database, and importing it into the database.
The relational database is MS-SQL or MySQL.
By implementing the present invention, unstructured data is converted into data that can be represented in a relational database, so that applications built on a relational database can use it. Elements in the XML are mapped to the schema of the relational database, and similar elements are grouped into one class, corresponding to the tables, columns, and tuples of the relational database, which helps in mining the key information in the multimedia. An element type with many instances becomes a table with many tuples, so simply querying the tables yields the key feature data and key elements of the multimedia.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the method for identifying and converting data storage format based on multimedia data in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of converting text data into an XML document in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of converting X3D data into relational database data in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the mapping between X3D elements and the relational database in an embodiment of the present invention;
Fig. 5 is a schematic flowchart of counting elements of the same class in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the deployment structure of the system for converting data storage format based on multimedia data in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the present invention.
In general, the data produced or collected by the different applications on a digital home terminal is unstructured, and different applications inevitably design their data differently. Much of this data is not stored in a relational database, which creates difficulties for subsequent applications built on relational databases. It is therefore necessary to convert the data for storage. The flow of the system is shown in Fig. 1: the system receives input of unstructured data based on multimedia data; determines the data format of the unstructured data; if the identified data format is initialized plain text, organizes the acquired plain text into an XML stream using an Extensible Markup Language (XML) library; if the identified data format is Virtual Reality Modeling Language (VRML) data, converts the data format of the acquired VRML data using an Extensible 3D (X3D) converter; and stores the organized XML stream and/or the VRML data converted by the X3D converter in a relational database. This embodiment first takes unstructured data as input and automatically identifies the acquired data. The identified data includes initialized plain text, for example files saved as TXT, or VRML data; VRML is a general-purpose modeling language for virtual reality scene models and three-dimensional worlds. The identified plain text and/or VRML data are then processed separately: plain text is organized into an XML stream by an XML library, and VRML data passes through the X3D converter. Finally the converted data is imported into a relational database such as MS-SQL or MySQL.
There are many choices of XML library: Microsoft's MSXML can be used, as can XML libraries for other languages such as Java. XML libraries are widely available and offer essentially the same functionality, so the library can be chosen to suit the platform. X3D is a successor technology to VRML; it not only implements all the functionality of VRML but also packages it into a lighter, extensible core. X3D treats a VRML script as a structured data set and maps its elements to corresponding nodes in an XML document. Graphics and image data described in VRML can thus be processed as a standard XML document.
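The identification and routing step of Fig. 1 can be sketched as follows. This is only an illustrative sketch, not the detection logic of the invention, which the description does not specify. It assumes that VRML input can be recognized by the "#VRML" header line that VRML files begin with, and that plain text is recognized by a .txt file name. The function names detect_format and route, and the labels they return, are hypothetical.

```python
def detect_format(filename: str, data: bytes) -> str:
    """Classify an input as "vrml", "text", or "unknown" (hypothetical labels)."""
    # Decode only the start of the file and drop a UTF-8 byte-order mark if present.
    head = data[:64].decode("utf-8", errors="replace").lstrip("\ufeff")
    if head.startswith("#VRML"):
        return "vrml"
    # Assumption: initialized plain text is identified by its .txt extension.
    if filename.lower().endswith(".txt"):
        return "text"
    return "unknown"


def route(filename: str, data: bytes) -> str:
    """Name the processing branch an input would take in the flow of Fig. 1."""
    kind = detect_format(filename, data)
    if kind == "text":
        return "xml_library"      # plain text is organized into an XML stream
    if kind == "vrml":
        return "x3d_converter"    # VRML goes through the X3D converter
    raise ValueError(f"unsupported input: {filename}")
```

Checking the content before the file name means a VRML scene saved with an unexpected extension is still sent to the X3D branch.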
The system handles plain text data and VRML data separately. Text data is converted to an XML file as shown in Fig. 2. The conversion mainly comprises three parts: making an XML template; inputting the identification characters, separator characters, and source data; and generating the XML file. The XML template is made by hand, because the data structures of different applications differ considerably and the logical structure the user wants is not necessarily the same as that of the source data. The user therefore customizes an XML template according to the desired logic; then, following this template and using the specified identification characters and separator marks, the source data is converted into XML format and finally saved in an XML document.
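A minimal sketch of this template-driven conversion, using Python's standard xml.etree.ElementTree as the XML library. The description does not define the template format or the exact role of the identification character, so the following are assumptions made for illustration: the template is an ordered list of field names, the identification character marks the lines that carry a record, and the separator splits a record into fields. The function text_to_xml and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def text_to_xml(source, template, id_char="#", sep=",",
                root_tag="records", record_tag="record"):
    """Convert delimited text into an XML string.

    template: ordered field names, assumed to be valid XML element names.
    id_char:  assumed meaning: only lines starting with it hold a record.
    sep:      character separating the fields of a record.
    """
    root = ET.Element(root_tag)
    for line in source.splitlines():
        if not line.startswith(id_char):
            continue  # lines without the identification character are ignored
        values = [v.strip() for v in line[len(id_char):].split(sep)]
        record = ET.SubElement(root, record_tag)
        # Pair each value with the field name at the same position in the template.
        for field, value in zip(template, values):
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")
```

For example, text_to_xml("#tv, living room", ["device", "room"]) yields one record element with a device child and a room child. Values beyond the length of the template are dropped in this sketch.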
Graphics and image data is generally described with VRML scripts, and this script data must then be converted into the format the relational database requires and imported into it. The conversion is performed by the X3D converter, and the process is shown in Fig. 3. An X3D stream or X3D text is fed into the X3D converter, and the result is finally imported into the relational database. The X3D converter mainly comprises four steps: semantic annotation, initial statistics, node classification, and X3D stream conversion, as follows:
Step 1: Semantic annotation. The graphics and image data is annotated semantically. This usually requires feature processing of the graphic or image, for example applying edge processing to an image and then extracting its features, which serve as the semantic representation of the image;
Step 2: Initial statistics. The annotated information undergoes classification-statistics and logic-building preprocessing; only with this statistical information can the next step classify the nodes.
Step 3: Node classification. In these scripts the node is the most basic unit and the fundamental element that makes up a three-dimensional scene. A large scene may itself be called a node, so nodes have classes and hierarchical relationships. Different nodes represent different elements; their attributes differ, and the level and logic at which they sit also differ.
Step 4: X3D stream conversion. Finally, the classified X3D data is formatted into the corresponding schema, table, and table-entry types required by the relational database, and then imported into the database.
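The node classification of Step 3 could look roughly like the sketch below. The description does not say how similarity between nodes is measured, so this sketch makes a simplifying assumption: two nodes belong to the same class when they have the same element name and the same set of attribute names. The function classify_nodes is hypothetical, and the X3D input is treated as ordinary XML.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def classify_nodes(x3d_text):
    """Group X3D elements into classes keyed by (tag, sorted attribute names)."""
    classes = defaultdict(list)
    for elem in ET.fromstring(x3d_text).iter():
        # Assumed similarity rule: same tag and same attribute names.
        key = (elem.tag, tuple(sorted(elem.attrib)))
        classes[key].append(dict(elem.attrib))
    return dict(classes)
```

Each key describes one class and each list entry is one instance of it, which is the shape needed when the classes are later turned into tables and tuples.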
Converting X3D data into relational data means deciding how to turn this XML-represented data into entries in a relational database. There are two existing approaches: one structure-based, the other model-based. The former maps the DTD (Document Type Definition) of the XML to a relational schema; the latter stores all XML documents in one fixed relational schema. Neither suits the multimedia data of digital home interactive services, because the relationships between nodes turn into various foreign keys between tables. This is very complex for a relational database: the table relationships are complicated and the logical relationships become confused and unclear, which makes subsequent processing very difficult. In fact not all relationships need to be preserved, but the logical relationships of certain levels do need to be kept.
Note that the conversion from XML to the relational database should follow the hierarchy, and it can be stipulated that only the top four levels are converted, for example scene, group, node, and attribute. Nodes that share many similar attributes are regarded as one class, and each node is an instance of that class. In the relational database, a class corresponds to a table, the attributes are the columns of the table, and each instance is a tuple, as shown in Fig. 4.
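The class-to-table mapping can be illustrated with a short sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the MS-SQL or MySQL databases named in the description, groups nodes by element name only, and ignores the four-level limit for brevity; all of these are assumptions. The function x3d_to_tables is hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def x3d_to_tables(conn, x3d_text):
    """Create one table per element name: attributes become columns, nodes become rows."""
    by_tag = {}
    for elem in ET.fromstring(x3d_text).iter():
        if elem.attrib:  # nodes without attributes carry no column data
            by_tag.setdefault(elem.tag, []).append(elem.attrib)
    for tag, rows in by_tag.items():
        columns = sorted({name for row in rows for name in row})
        # Identifiers cannot be bound as SQL parameters, so validate them first.
        for ident in [tag, *columns]:
            if not ident.isidentifier():
                raise ValueError(f"unsafe identifier: {ident!r}")
        col_sql = ", ".join(f'"{c}" TEXT' for c in columns)
        conn.execute(f'CREATE TABLE "{tag}" ({col_sql})')
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f'INSERT INTO "{tag}" VALUES ({placeholders})',
            [tuple(row.get(c) for c in columns) for row in rows],
        )
```

With this layout, a query such as SELECT COUNT(*) on one table answers how many instances of that class a scene contains.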
The class mentioned in Fig. 4 is an abstraction of what a group of elements has in common. Building it requires counting the elements in the X3D document, recognizing which elements are of the same type, and grouping them as different instances of one class before performing the mapping shown in Fig. 4. An X3D document is in essence an XML document, and the present invention provides a method for counting such similar instances and then generalizing them into a class. Its flow is shown in Fig. 5 and is as follows:
S1: Take the root node of the XML as the current node, current = root; set the level to 0, depth = 0; go to S2;
S2: Determine whether the current level depth is less than the prescribed level limit, for example the four levels specified above; if it exceeds this threshold, finish and exit the procedure; otherwise go to S3;
S3: Check whether the name of the current node equals the query name and whether the current depth equals the query depth; if both hold, go to S4; otherwise go to S5;
S4: Increment the counter, count = count + 1; go to S5;
S5: Get the list of child nodes of the current node and check whether it is empty; if it is empty, output the count value; otherwise go to S6;
S6: Move the current element to an element of the list, current = list element; increase the query level by 1, depth = depth + 1; go to S2.
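Steps S1 to S6 can be sketched as follows. This is an illustrative reading rather than a literal transcription: the steps describe moving to "an element of the list", and this sketch assumes that every child is visited in turn, using an explicit stack. It also assumes that depths 0 to max_depth - 1 are the levels inspected. The function count_elements and its parameter names are hypothetical.

```python
import xml.etree.ElementTree as ET


def count_elements(root, query_name, query_depth, max_depth=4):
    """Count elements named query_name at depth query_depth, root being depth 0."""
    count = 0
    stack = [(root, 0)]                       # S1: start at the root, depth 0
    while stack:
        node, depth = stack.pop()
        if depth >= max_depth:                # S2: stay within the level limit
            continue
        if node.tag == query_name and depth == query_depth:   # S3
            count += 1                        # S4
        # S5/S6: descend into every child, one level deeper
        stack.extend((child, depth + 1) for child in node)
    return count
```

An element type that yields a high count at a given level is a natural candidate for a class of its own, and hence for a table.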
Finally, the deployment of the system is shown in Fig. 6. It comprises a server and clients. The server can be a set-top box or home gateway, or a community-level server; the client can be any information device in the home environment, such as a handheld device, mobile phone, tablet computer, or an entertainment device such as an Xbox. The server comprises a presentation layer, a logic layer, and a database, whose deployment and correspondence are shown in Fig. 6. The client converts its data to XML and sends it to the server. On receiving this request, the logic layer of the server starts two threads. The first thread uses XSLT (Extensible Stylesheet Language Transformations) to convert the XML to HTML and returns it to the client through the presentation layer, to display the converted data format. The second thread stores the converted data in the relational database.
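The two-thread handling in the logic layer might be sketched as below. Several parts are stand-ins and not the described implementation: Python's standard library has no XSLT processor, so render_html builds the HTML directly with ElementTree instead of applying a stylesheet, and sqlite3 replaces the MS-SQL or MySQL database. The names render_html, store, handle_request, and the documents table are hypothetical.

```python
import html
import sqlite3
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def render_html(xml_text):
    """Stand-in for the XSLT step: show each child element as a list item."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{html.escape(child.tag)}: {html.escape(child.text or '')}</li>"
        for child in root
    )
    return f"<ul>{items}</ul>"


def store(db_path, xml_text):
    """Persist the converted document; the connection is opened in the worker thread."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS documents (body TEXT)")
            conn.execute("INSERT INTO documents VALUES (?)", (xml_text,))
    finally:
        conn.close()


def handle_request(db_path, xml_text):
    """Run rendering and storage concurrently and return the HTML for the client."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        page = pool.submit(render_html, xml_text)    # first thread: presentation
        saved = pool.submit(store, db_path, xml_text)  # second thread: storage
        saved.result()   # surface any storage error to the caller
        return page.result()
```

Opening the database connection inside the worker keeps each sqlite3 connection on the thread that created it.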
In summary, unstructured data is converted via XML into data that can be represented in a relational database, so that applications built on a relational database can use it. Elements in the XML are mapped to the schema of the relational database, and similar elements are grouped into one class, corresponding to the tables, columns, and tuples of the relational database, which helps in mining the key information in the multimedia. An element type with many instances becomes a table with many tuples, so simply querying the tables yields the key feature data and key elements of the multimedia.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and so on.
The method for reading digital home content data based on distributed storage provided by the embodiments of the present invention has been described in detail above. Specific examples are used herein to explain the principles and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and scope of application. In summary, the contents of this description should not be construed as limiting the present invention.

Claims (5)

1. A method for identifying and converting data storage format based on multimedia data, characterized in that the method comprises:
receiving input of unstructured data based on multimedia data;
determining the data format of the unstructured data;
if the data format identified in the unstructured data is initialized plain text, organizing the acquired initialized plain text into an XML stream using an Extensible Markup Language (XML) library;
if the data format identified in the unstructured data is Virtual Reality Modeling Language (VRML) data, converting the data format of the acquired VRML data using an Extensible 3D (X3D) converter;
storing the organized XML stream and/or the VRML data converted by the X3D converter in a relational database.
2. The method for converting data storage format based on multimedia data according to claim 1, characterized in that the initialized plain text is text saved as a TXT text file.
3. The method for converting data storage format based on multimedia data according to claim 2, characterized in that organizing the acquired initialized plain text into an XML stream using the XML library comprises:
making an XML template, inputting the identification characters, the separator characters, and the source data, and then generating an XML file.
4. The method for converting data storage format based on multimedia data according to claim 1, characterized in that converting the data format of the acquired VRML data using the X3D converter comprises:
semantically annotating the graphics and image data;
performing classification-statistics and logic-building preprocessing on the annotated information;
classifying the nodes;
formatting the classified X3D data into the corresponding schema, table, and table-entry types required by the relational database, and importing it into the database.
5. The method for converting data storage format based on multimedia data according to any one of claims 1 to 4, characterized in that the relational database is MS-SQL or MySQL.
CN201210512403.5A 2012-12-04 2012-12-04 Method for converting data storage format based on multimedia data Pending CN103853775A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210512403.5A CN103853775A (en) 2012-12-04 2012-12-04 Method for converting data storage format based on multimedia data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210512403.5A CN103853775A (en) 2012-12-04 2012-12-04 Method for converting data storage format based on multimedia data

Publications (1)

Publication Number Publication Date
CN103853775A true CN103853775A (en) 2014-06-11

Family

ID=50861441

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210512403.5A Pending CN103853775A (en) 2012-12-04 2012-12-04 Method for converting data storage format based on multimedia data

Country Status (1)

Country Link
CN (1) CN103853775A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105683945A (en) * 2014-07-01 2016-06-15 非凡全球控股有限公司 Computer implemented system and method for collating and presenting multi-format information
CN106202468A (en) * 2016-04-29 2016-12-07 河南大学 A kind of ancient books structuring method for sorting based on XML
CN106649648A (en) * 2016-12-09 2017-05-10 无锡云汇科技有限公司 Non-structural data processing system and processing method
CN110851586A (en) * 2019-10-22 2020-02-28 陈华 Bank operation data processing system and method, equipment and storage medium
CN112685601A (en) * 2021-01-31 2021-04-20 重庆渝高科技产业(集团)股份有限公司 Data extraction method and system for engineering measurement list
CN113407782A (en) * 2021-07-23 2021-09-17 重庆交通大学 MapReduce-based distributed XSLT processing method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1893430A (en) * 2005-07-05 2007-01-10 年代数位媒体股份有限公司 Content integration method with format and protocol conversion system
CN101247589A (en) * 2007-07-04 2008-08-20 华为技术有限公司 Mobile terminal data conversion/backup method, device and system
CN101826094A (en) * 2009-08-24 2010-09-08 张艳红 Hand-held wireless video search engine combining mobile streaming medium and 3G technology
CN102646125A (en) * 2012-02-28 2012-08-22 中国标准化研究院 Structured digital content extraction and reorganization method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
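熊江: "Extraction and Conversion of Multimedia Information in SQL Server Databases", Journal of Chongqing Three Gorges University *
黄蓓蓓: "Research on XML-Based Multimedia Data Conversion and Storage", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology Series *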

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105683945A (en) * 2014-07-01 2016-06-15 非凡全球控股有限公司 Computer implemented system and method for collating and presenting multi-format information
CN105683945B (en) * 2014-07-01 2020-04-07 非凡全球控股有限公司 Computer-implemented system for collating and presenting multi-format information
CN106202468A (en) * 2016-04-29 2016-12-07 河南大学 A kind of ancient books structuring method for sorting based on XML
CN106202468B (en) * 2016-04-29 2019-04-16 河南大学 A kind of ancient books structuring method for sorting based on XML
CN106649648A (en) * 2016-12-09 2017-05-10 无锡云汇科技有限公司 Non-structural data processing system and processing method
CN110851586A (en) * 2019-10-22 2020-02-28 陈华 Bank operation data processing system and method, equipment and storage medium
CN110851586B (en) * 2019-10-22 2022-10-11 陈华 Bank operation data processing system, method, equipment and storage medium
CN112685601A (en) * 2021-01-31 2021-04-20 重庆渝高科技产业(集团)股份有限公司 Data extraction method and system for engineering measurement list
CN113407782A (en) * 2021-07-23 2021-09-17 重庆交通大学 MapReduce-based distributed XSLT processing method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140611