CN102184265A - Electronic newspaper data conversion method - Google Patents

Publication number
CN102184265A
Authority
CN
China
Prior art keywords
file
files
picture
xml
electronic newspaper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110164572
Other languages
Chinese (zh)
Inventor
程卫健
王仁胤
卢雪辉
张震宁
刘涛
周晓晨
孙军
戴杨
管云峰
黄雯琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI HIGH DEFINITION DIGITAL TECHNOLOGY INDUSTRIAL Co Ltd
Original Assignee
SHANGHAI HIGH DEFINITION DIGITAL TECHNOLOGY INDUSTRIAL Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI HIGH DEFINITION DIGITAL TECHNOLOGY INDUSTRIAL Co Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an electronic newspaper data conversion method that screens and converts original files in various formats to reduce the data volume and make the information easier to distribute. The picture files, video files and XML (Extensible Markup Language) files in the original data are identified and screened; the selected files are converted into files of a unified format according to predetermined requirements and then integrated to form the complete electronic newspaper data.

Description

Electronic newspaper data conversion method
Technical field
The present invention relates to a data conversion method, and in particular to a method for converting the files of an electronic newspaper required by a distributed electronic newspaper reading system.
Background art
With the development of electronic newspaper technology, electronic newspapers have increasingly entered people's lives and become a widely accepted way of reading. However, the source files of an electronic newspaper (typically from major media such as Xinhua Daily or People's Daily) are prepared for the publisher's website and for printing. They contain a great deal of redundant information and come in various formats, such as JPG, PDF, XML and TIF files. The data volume is generally between 100 MB and 500 MB, so the source files are not suitable for existing distributed electronic newspaper reading systems that distribute data over data broadcasting, IP networks or 3G networks.
Summary of the invention
The object of the present invention is to provide an electronic newspaper data conversion method that solves the problem that existing electronic newspaper source files are unsuitable for distributed electronic newspaper reading systems that distribute data over data broadcasting, IP networks or 3G networks.
To achieve this object, the electronic newspaper data conversion method of the present invention comprises the following steps:
obtaining the source files of the electronic newspaper;
identifying and extracting the picture files, video files and XML files in the source files;
converting the picture files that exceed a set threshold into files of a unified format, converting the format and resolution of the video files, and merging and converting the XML files so that they are linked to the converted picture files and video files.
Preferably, the picture files include BMP, PNG, PDF and TIF files; these picture files are uniformly converted into JPG files, and the resolution of the converted JPG files is 868*1247.
Preferably, the source files comprise a plurality of named folders. In identifying and extracting the picture files, all files in each folder are searched and the picture files are screened out first. The largest picture file is then determined; if its size exceeds the set threshold, for example 2 MB, and its file name corresponds to the name of the folder, the picture file is extracted, copied to a designated folder, and renamed after the folder it came from. The picture files are screened in the order JPG, BMP, PNG, PDF: if no JPG file satisfies both conditions, the BMP, PNG and PDF files are screened in turn.
Preferably, the method further comprises searching all files in the source files, screening out all picture files, and determining the dimensions of these picture files; those that satisfy a preset dimension requirement, for example a width of 400 pixels, are copied to another folder and then JPEG-encoded.
Preferably, when identifying and extracting the video files, all files in each folder of the source files are searched, the wmv, mpg, avi, mp4, 3gp, mov, rm and rmvb files are screened out and copied to a preset folder, and they are then uniformly converted into flv files, for example at a resolution of 480*360.
Preferably, to identify and extract the XML files, all files in each folder of the source files are first searched and the XML files are screened out. If a folder contains two or more XML files, their names are checked and the XML file whose name corresponds to the folder is selected. It is copied to a predetermined folder and renamed after the folder it came from. All XML files in the predetermined folder are then read in turn, the useful data are extracted, and a new XML file is generated to replace each original one. The converted XML file contains not only the text data but also the corresponding text position information, information on the picture files, information on the corresponding video files, the display font format, size and color, the page title, and so on.
Compared with the prior art, the present invention identifies and extracts the picture files, video files and XML files in the source files, converts the picture files that exceed a set threshold into files of a unified format, converts the format and resolution of the video files, and merges and converts the XML files so that they are linked to the converted picture files and video files. The resulting electronic newspaper data are therefore small, usually between 1 MB and 15 MB, and are suitable for distributed electronic newspaper reading systems that distribute data over data broadcasting, IP networks or 3G networks.
Brief description of the drawings
Fig. 1 is a flowchart of the electronic newspaper data conversion method of the present invention.
Fig. 2 shows the file storage directory structure of the source files of an existing electronic newspaper.
Fig. 3 shows the file storage directory structure of the electronic newspaper after conversion.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the accompanying drawings.
Referring to Fig. 1, a flowchart of the electronic newspaper data conversion method of the present invention, the method comprises the following steps:
Step 1: obtaining the source files of the electronic newspaper;
Step 2: identifying and extracting the picture files, video files and XML files in the source files;
Step 3: converting the picture files that exceed a set threshold into files of a unified format, converting the format and resolution of the video files, and merging and converting the XML files so that they are linked to the converted picture files and video files.
Preferably, the picture files include BMP, PNG, PDF and TIF files; these picture files are uniformly converted into JPG files, and the resolution of the converted JPG files is 868*1247.
Preferably, the source files comprise a plurality of named folders. In identifying and extracting the picture files, all files in each folder are searched and the picture files are screened out first. The largest picture file is then determined; if its size exceeds the set threshold, for example 2 MB, and its file name corresponds to the name of the folder, the picture file is extracted, copied to a designated folder, and renamed after the folder it came from. The picture files are screened in the order JPG, BMP, PNG, PDF: if no JPG file satisfies both conditions, the BMP, PNG and PDF files are screened in turn.
Preferably, the method further comprises searching all files in the source files, screening out all picture files, and determining the dimensions of these picture files; those that satisfy a preset dimension requirement, for example a width of 400 pixels, are copied to another folder and then JPEG-encoded.
Preferably, when identifying and extracting the video files, all files in each folder of the source files are searched, the wmv, mpg, avi, mp4, 3gp, mov, rm and rmvb files are screened out and copied to a preset folder, and they are then uniformly converted into flv files, for example at a resolution of 480*360. After conversion, all non-flv files in that folder are deleted.
Preferably, to identify and extract the XML files, all files in each folder of the source files are first searched and the XML files are screened out. If a folder contains two or more XML files, their names are checked and the XML file whose name corresponds to the folder is selected. It is copied to a predetermined folder and renamed after the folder it came from. All XML files in the predetermined folder are then read in turn, the useful data are extracted, and a new XML file is generated to replace each original one. The converted XML file contains not only the text data but also the corresponding text position information, information on the picture files, information on the corresponding video files, the display font format, size and color, the page title, and so on.
The above process is described below with a concrete example.
Fig. 2 shows the file storage directory structure of the source files of an existing electronic newspaper. The source files are usually stored in a date folder, such as the 2011-05-31 folder. The folder generally contains 4 to 24 subfolders (the number of subfolders matches the number of pages in that day's newspaper). Each subfolder contains one JPG file of the full-page image, one JPG file of the page thumbnail, one full-page PDF file, one page-data XML file, several JPG files of news illustrations, and several TIF files of news illustrations.
After conversion, three folders are generated as shown in Fig. 3: pic, news and TV (TV stores video). The pic folder stores the converted JPG files of the full-page images, named 01 to 24; the news folder stores the converted and optimized XML files and the small news JPG images; the TV folder stores the flv video files.
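The input and output layout just described can be sketched as follows. This is a minimal illustration only: the folder names pic, news and TV come from the example, while the function names `make_output_folders` and `page_subfolders` are hypothetical.

```python
from pathlib import Path

# Folder names taken from the example in the text; everything else is illustrative.
OUTPUT_FOLDERS = ("pic", "news", "TV")


def make_output_folders(root):
    """Create the pic, news and TV folders under root and return them by name."""
    folders = {}
    for name in OUTPUT_FOLDERS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        folders[name] = folder
    return folders


def page_subfolders(date_folder):
    """Return the per-page subfolders (01, 02, ...) of a date folder, sorted by name."""
    return sorted(p for p in Path(date_folder).iterdir() if p.is_dir())
```

A caller would pass the date folder (for example 2011-05-31) to `page_subfolders` and process each returned subfolder in turn.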
The conversion process for each kind of file is as follows:
Picture conversion covers both the full-page images (that is, the picture files exceeding the preset threshold) and the small news pictures in all subfolders of the source files.
First, all files in each subfolder of the source files are searched and the JPG files are screened out. The largest JPG is then determined; if it exceeds 2 MB and its file name corresponds to the subfolder, it is taken. If either condition is not met, JPG screening is abandoned for that subfolder and the BMP files are searched instead; failing that, the PNG files and then the PDF files are screened, until the correct full-page image is found. It is copied to the designated directory (such as the pic folder) and renamed after its subfolder (for example, the JPG full-page image in folder 01 is named 01.JPG after copying).
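The screening order above can be sketched in a few lines. The text does not define how a file name "corresponds" to its subfolder, so the rule used here (the file stem ends with the subfolder name) is an assumption, as are the function names and the reading of 2M as 2 MiB.

```python
import shutil
from pathlib import Path

SIZE_THRESHOLD = 2 * 1024 * 1024                    # "2M" in the text, read here as 2 MiB
SCREENING_ORDER = (".jpg", ".bmp", ".png", ".pdf")  # order given in the text


def name_matches_folder(path, folder):
    """Assumed rule: the file stem ends with the subfolder name (e.g. ...B001 -> 01)."""
    return path.stem.endswith(folder.name)


def find_full_page_image(subfolder, threshold=SIZE_THRESHOLD):
    """Return the full-page image of one subfolder, or None if no format qualifies."""
    files = [p for p in subfolder.rglob("*") if p.is_file()]
    for ext in SCREENING_ORDER:
        candidates = [p for p in files if p.suffix.lower() == ext]
        if not candidates:
            continue
        largest = max(candidates, key=lambda p: p.stat().st_size)
        if largest.stat().st_size > threshold and name_matches_folder(largest, subfolder):
            return largest
        # Either condition failed: give up on this format and try the next one.
    return None


def extract_full_page_image(subfolder, pic_folder, threshold=SIZE_THRESHOLD):
    """Copy the full-page image into pic_folder, renamed after the subfolder."""
    source = find_full_page_image(subfolder, threshold)
    if source is None:
        return None
    pic_folder.mkdir(parents=True, exist_ok=True)
    target = pic_folder / (subfolder.name + source.suffix.upper())
    shutil.copy2(source, target)
    return target
```

Only the largest file of each format is tested, which mirrors the text: a failed check moves on to the next format rather than to the next-largest file of the same format.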
The extracted full-page image may be a JPG, BMP, PNG or PDF file, so each format is checked in turn. JPG files are first compressed and optimized to produce a JPG file of the same name at the specified size (for example, a copied 06.JPG of 2522*3587 is converted into a 06.JPG of 868*1247, compressing as much as possible while keeping the image quality at 70% or above). BMP, PNG and PDF files are then checked in turn and all converted into JPG files, following the same process as for JPG files.
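The text does not name a tool for this rescaling and recompression step. As one possible sketch, the command line below uses ImageMagick, which is an assumption; reading "quality at 70% or above" as a JPEG quality setting of 70 is also an assumption, and `build_page_convert_command` is a hypothetical helper name.

```python
from pathlib import Path

TARGET_SIZE = (868, 1247)   # page size given in the text
JPEG_QUALITY = 70           # assumption: the 70% figure read as a JPEG quality setting


def build_page_convert_command(source, target, size=TARGET_SIZE, quality=JPEG_QUALITY):
    """Build an ImageMagick command line that rescales a page image and writes a JPG."""
    width, height = size
    return [
        "magick", str(source),
        "-resize", f"{width}x{height}!",   # "!" forces the exact size, ignoring aspect ratio
        "-quality", str(quality),
        str(Path(target).with_suffix(".JPG")),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ImageMagick is installed; PDF input additionally needs a PDF delegate such as Ghostscript.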
For the small images, all files in the subfolders of the source files are likewise searched and all JPG files are screened out. Their dimensions are then checked, and those meeting the dimension requirement (for example, a width of 400 pixels) are copied to the corresponding directory (such as the news folder). The extracted small images are then compressed and optimized, reducing the file size as much as possible while keeping the quality at 70% or above.
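The width check can be sketched with the standard library alone by reading the frame header of each JPG file. This is a simplified reader that ignores rare cases such as padding bytes between segments; treating the requirement as "width equals 400 pixels" is an assumption, and the function names are hypothetical.

```python
import shutil
import struct
from pathlib import Path

# SOF markers carry the frame size; 0xC4, 0xC8 and 0xCC are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(path):
    """Return (width, height) read from the JPEG frame header, or None if not found."""
    with open(path, "rb") as f:
        if f.read(2) != b"\xff\xd8":           # every JPEG starts with the SOI marker
            return None
        while True:
            header = f.read(4)                 # 0xFF, marker, 2-byte segment length
            if len(header) < 4 or header[0] != 0xFF:
                return None
            marker = header[1]
            (length,) = struct.unpack(">H", header[2:4])
            if length < 2:
                return None
            if marker in SOF_MARKERS:
                data = f.read(5)               # precision, height, width
                if len(data) < 5:
                    return None
                height, width = struct.unpack(">HH", data[1:5])
                return width, height
            f.seek(length - 2, 1)              # skip the rest of this segment


def copy_small_images(source_root, news_folder, width=400):
    """Copy every JPG under source_root whose width equals `width` into news_folder."""
    news_folder.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in (".jpg", ".jpeg"):
            size = jpeg_size(path)
            if size is not None and size[0] == width:
                copied.append(Path(shutil.copy2(path, news_folder / path.name)))
    return copied
```

Files with the same name in different subfolders would overwrite each other in this sketch; a real implementation would need a naming rule that the text does not specify.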
Video conversion means converting the format and resolution of the videos present in the source files. The process is as follows:
First, all files in each subfolder of the source files are searched; the wmv, mpg, avi, mp4, 3gp, mov, rm and rmvb files are screened out and copied to the corresponding folder (such as the TV folder). If no video file is detected, the search is abandoned. Once the video files have been copied to the corresponding folder, they are converted to flv files of a uniform size (for example, a resolution of 480*360). After conversion, all non-flv files in that folder are deleted.
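The collect, convert and clean-up sequence can be sketched as below. The text does not name a converter; ffmpeg is used here as an assumption, and the sketch only builds the command line rather than running it. All function names are hypothetical.

```python
import shutil
from pathlib import Path

# Extensions listed in the text.
VIDEO_EXTENSIONS = {".wmv", ".mpg", ".avi", ".mp4", ".3gp", ".mov", ".rm", ".rmvb"}


def collect_videos(source_root, tv_folder):
    """Copy every video file found under source_root into tv_folder."""
    videos = [p for p in sorted(source_root.rglob("*"))
              if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS]
    if not videos:
        return []                      # nothing detected: abandon the search
    tv_folder.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(p, tv_folder / p.name)) for p in videos]


def build_flv_command(video, size="480x360"):
    """Build an ffmpeg command line that writes an flv copy next to the input."""
    video = Path(video)
    return ["ffmpeg", "-y", "-i", str(video), "-s", size, str(video.with_suffix(".flv"))]


def delete_non_flv(tv_folder):
    """Remove every file in tv_folder that is not an flv file."""
    for path in tv_folder.iterdir():
        if path.is_file() and path.suffix.lower() != ".flv":
            path.unlink()
```

In use, each command from `build_flv_command` would be run (for example with `subprocess.run(cmd, check=True)`) before `delete_non_flv` is called, so that the originals are removed only after their flv copies exist.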
XML conversion means converting the data of all XML files in the source files. The process is as follows:
First, all files in each subfolder of the source package are searched and the XML files are screened out. If a folder contains two or more XML files, their names are checked and the XML file whose name corresponds to the folder is selected. It is copied to the corresponding folder (such as the news folder) and renamed after its subfolder (for example, RMRB20110531B001.XML in folder 01 is copied to news and renamed 01.XML). All XML files in the corresponding folder (such as the news folder) are then read in turn, the useful data are extracted, and a new XML file is generated to replace each original one. The converted XML file contains not only the text data of the news but also information on the corresponding news pictures, information on the corresponding videos, the news font format, size and color, the page title, and so on. Finally, the newspaper reading program reads the converted XML files and thereby integrates the scattered full-page JPG images, small images and videos into a complete electronic newspaper reading system.
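The selection, renaming and regeneration of the XML files can be sketched as follows. The text gives neither the source schema nor the output schema, so every element and attribute name in `build_page_xml` is hypothetical, as is the matching rule (file stem ends with the subfolder name) and the function names.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def pick_page_xml(subfolder):
    """Return the page XML of a subfolder; with several candidates, match by name."""
    xml_files = sorted(p for p in subfolder.rglob("*")
                       if p.is_file() and p.suffix.lower() == ".xml")
    if len(xml_files) <= 1:
        return xml_files[0] if xml_files else None
    # Assumed rule: the file stem ends with the subfolder name (...B001 -> 01).
    matching = [p for p in xml_files if p.stem.endswith(subfolder.name)]
    return matching[0] if matching else None


def copy_page_xml(subfolder, news_folder):
    """Copy the page XML into news_folder, renamed after the subfolder (01.XML, ...)."""
    source = pick_page_xml(subfolder)
    if source is None:
        return None
    news_folder.mkdir(parents=True, exist_ok=True)
    target = news_folder / (subfolder.name + ".XML")
    shutil.copy2(source, target)
    return target


def build_page_xml(page_title, articles):
    """Build a reduced page description; all tag and attribute names are hypothetical."""
    page = ET.Element("page", title=page_title)
    for article in articles:
        node = ET.SubElement(page, "article", position=article["position"])
        text = ET.SubElement(node, "text", font=article["font"],
                             size=str(article["size"]), color=article["color"])
        text.text = article["text"]
        for picture in article.get("pictures", []):
            ET.SubElement(node, "picture", file=picture)
        for video in article.get("videos", []):
            ET.SubElement(node, "video", file=video)
    return ET.ElementTree(page)
```

The tree returned by `build_page_xml` can be written over the copied file with `tree.write(target, encoding="utf-8", xml_declaration=True)`, which corresponds to replacing the original XML with the regenerated one.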
In the above method, picture files are uniformly converted into JPG files. JPG is short for JPEG; a JPEG picture stores a single raster image in 24-bit color. JPEG is a platform-independent format that supports the highest level of compression. Its advantages are: advanced compression for photographic or realistic works; control of file size through a variable compression ratio; support for interlacing (for progressive JPEG files); and wide support as an Internet standard. Because of their small size, JPG files are widely used on the World Wide Web to store and transmit photos. These advantages make JPG suitable in the present invention for displaying the full-page layout of the electronic newspaper. The characteristics of XML, in turn, make it suitable in the present invention for describing the specific display content of the electronic newspaper.
In summary, the present invention identifies and extracts the picture files, video files and XML files in the source files, converts the picture files that exceed a set threshold into files of a unified format, converts the format and resolution of the video files, and merges and converts the XML files so that they are linked to the converted picture files and video files. The resulting electronic newspaper data are therefore small, usually between 1 MB and 15 MB, and are suitable for distributed electronic newspaper reading systems that distribute data over data broadcasting, IP networks or 3G networks.
It will be understood that those of ordinary skill in the art may make equivalent replacements or changes according to the technical solution and inventive concept of the present invention, and all such changes or replacements shall fall within the protection scope of the appended claims.

Claims (11)

1. An electronic newspaper data conversion method, comprising the steps of:
obtaining the source files of the electronic newspaper;
identifying and extracting the picture files, video files and XML files in the source files;
converting the picture files that exceed a set threshold into files of a unified format, converting the format and resolution of the video files, and merging and converting the XML files so that they are linked to the converted picture files and video files.
2. The electronic newspaper data conversion method according to claim 1, characterized in that: the picture files include BMP, PNG, PDF and TIF files, and these picture files are uniformly converted into JPG files.
3. The electronic newspaper data conversion method according to claim 2, characterized in that: the resolution of the converted JPG files is 868*1247.
4. The electronic newspaper data conversion method according to claim 1, characterized in that: the source files comprise a plurality of named folders; in identifying and extracting the picture files in the source files, all files in each folder are searched and the picture files are screened out first; the largest picture file is then determined; if its size exceeds the set threshold and its file name corresponds to the name of the folder, the picture file is extracted, copied to a designated folder, and renamed after the folder it came from.
5. The electronic newspaper data conversion method according to claim 4, characterized in that: the picture files are screened in the order JPG file, BMP file, PNG file, PDF file.
6. The electronic newspaper data conversion method according to claim 5, characterized in that: the threshold set for judging the picture file size is 2 MB.
7. The electronic newspaper data conversion method according to claim 1, characterized in that: the method further comprises searching all files in the source files, screening out all picture files, and determining the dimensions of these picture files; those that satisfy a preset dimension requirement are copied to another folder and then JPEG-encoded.
8. The electronic newspaper data conversion method according to claim 7, characterized in that: the preset dimension requirement is a width of 400 pixels.
9. The electronic newspaper data conversion method according to claim 1, characterized in that: when identifying and extracting the video files in the source files, all files in each folder of the source files are searched, the wmv, mpg, avi, mp4, 3gp, mov, rm and rmvb files are screened out and copied to a preset folder, and they are then uniformly converted into flv files.
10. The electronic newspaper data conversion method according to claim 9, characterized in that: the resolution of the flv files is 480*360.
11. The electronic newspaper data conversion method according to claim 1, characterized in that: to identify and extract the XML files in the source files, all files in each folder of the source files are first searched and the XML files are screened out; if a folder contains two or more XML files, their names are checked and the XML file whose name corresponds to the folder is selected; it is copied to a predetermined folder and renamed after the folder it came from; all XML files in the predetermined folder are then read in turn, the useful data are extracted, and a new XML file is generated to replace each original one; the converted XML file contains not only the text data but also the corresponding text position information, information on the picture files, information on the corresponding video files, the display font format, size and color, and the page title.
CN 201110164572 2011-06-17 2011-06-17 Electronic newspaper data conversion method Pending CN102184265A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110164572 CN102184265A (en) 2011-06-17 2011-06-17 Electronic newspaper data conversion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110164572 CN102184265A (en) 2011-06-17 2011-06-17 Electronic newspaper data conversion method

Publications (1)

Publication Number Publication Date
CN102184265A true CN102184265A (en) 2011-09-14

Family

ID=44570442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110164572 Pending CN102184265A (en) 2011-06-17 2011-06-17 Electronic newspaper data conversion method

Country Status (1)

Country Link
CN (1) CN102184265A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127116A (en) * 2016-06-16 2016-11-16 新浪网技术(中国)有限公司 A kind of method and apparatus of picture/mb-type identification
CN106844453A (en) * 2016-12-20 2017-06-13 江苏瀚远科技股份有限公司 A kind of electronic document format conversion method
CN108021661A (en) * 2017-12-04 2018-05-11 北京锐安科技有限公司 A kind of conversion method of data format and system
CN109241470A (en) * 2018-09-27 2019-01-18 北京小米移动软件有限公司 Page display method, apparatus and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020063681A1 (en) * 2000-06-04 2002-05-30 Lan Hsin Ting Networked system for producing multimedia files and the method thereof
CN1613105A (en) * 2002-09-05 2005-05-04 奥帕拉软件公司 Presenting html content on a small screen terminal display
CN101526953A (en) * 2009-01-19 2009-09-09 北京跳网无限科技发展有限公司 WWW transformation technology

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127116A (en) * 2016-06-16 2016-11-16 新浪网技术(中国)有限公司 A kind of method and apparatus of picture/mb-type identification
CN106127116B (en) * 2016-06-16 2019-05-28 新浪网技术(中国)有限公司 A kind of method and apparatus of picture type identification
CN106844453A (en) * 2016-12-20 2017-06-13 江苏瀚远科技股份有限公司 A kind of electronic document format conversion method
CN108021661A (en) * 2017-12-04 2018-05-11 北京锐安科技有限公司 A kind of conversion method of data format and system
CN109241470A (en) * 2018-09-27 2019-01-18 北京小米移动软件有限公司 Page display method, apparatus and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110914