CN102063416B - Method and system for embedding double-byte fonts into PDF file - Google Patents


Publication number
CN102063416B
Authority
CN
China
Prior art keywords
font
character
identifier
embedded
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009102381327A
Other languages
Chinese (zh)
Other versions
CN102063416A (en)
Inventor
刘佳峰
姚磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN2009102381327A
Publication of CN102063416A
Application granted
Publication of CN102063416B
Legal status: Expired - Fee Related

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a method and a system for embedding double-byte fonts into a Portable Document Format (PDF) file. The method comprises the following steps: determining the double-byte fonts that are used by the PDF file in which fonts are to be embedded but are not embedded in that file, together with the font description information of those double-byte fonts; determining all characters output as text using the double-byte fonts in the PDF file, together with their character identifiers or glyph identifiers, and acquiring the glyph description information corresponding to those identifiers from the font files of the double-byte fonts; and generating a PDF file in which the double-byte fonts are embedded, according to the acquired font description information and the acquired glyph description information. The method and the system address the error-proneness and low efficiency that arise in the prior art from generating a PostScript (PS) data stream as an intermediate step when embedding double-byte fonts into a PDF file.

Description

Method and system for embedding double-byte font into PDF file
Technical Field
The invention relates to the technical field of typesetting, in particular to a method and a system for embedding double-byte fonts into a PDF file.
Background
PDF (Portable Document Format) is an electronic file format developed by Adobe. The format is independent of the operating system platform, i.e., PDF files can be used alike on Windows, Unix, and Mac OS. This makes it an ideal document format for electronic document distribution and for disseminating digital information over the Internet. More and more electronic books, product descriptions, company literature, web materials, and e-mail use PDF files, and PDF has become a de facto industry standard for digitized information.
PDF is intended to support cross-platform, integrated publishing and distribution of information, and to this end it has many advantages over other electronic document formats. A PDF file can encapsulate text, fonts, formatting, colors, and device- and resolution-independent graphics and images in a single file. A PDF file can also contain electronic information such as hypertext links, sound, and moving images, supports very long documents, and offers a high level of integration, security, and reliability.
Font embedding is an important branch of PDF technology. A PDF file with embedded fonts does not depend on the font environment of the presentation program when it is displayed, which is extremely important for keeping the presented content stable. Therefore, a large number of PDF applications recommend or even require PDF files whose fonts are fully or partially embedded.
Currently, methods for embedding fonts into a PDF file generate a PS (PostScript) data stream as an intermediate step: the PDF file in which fonts are to be embedded is converted into a PS stream, and the PS stream is then converted into a PDF file with embedded fonts. The font embedding is performed during the conversion from the PS stream to the PDF file.
The main problems of this method are as follows: the conversion between PDF and PS is quite complex and easily introduces errors, so that the content of the resulting PDF may differ from that of the original PDF; and the two-step conversion process is relatively inefficient.
Disclosure of Invention
The embodiments of the invention provide a method and a system for embedding double-byte fonts into a PDF (Portable Document Format) file, which address the high error probability and low efficiency caused in the prior art by generating a PS (PostScript) data stream as an intermediate step when embedding fonts into a PDF file.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
a method for embedding double-byte fonts into a PDF file, comprising:
determining the double-byte fonts that are used by a PDF file in which fonts are to be embedded but are not embedded in that PDF file, and the font description information of the double-byte fonts;
determining, in the PDF file in which fonts are to be embedded, all characters output as text using the double-byte fonts and their character identifiers or glyph identifiers, and acquiring glyph description information corresponding to the identifiers according to the font files of the double-byte fonts;
and generating a PDF file in which the double-byte fonts to be embedded are embedded, according to the acquired font description information and the acquired glyph description information.
A system for embedding double-byte fonts into a PDF file, comprising:
a font description information determining module, configured to determine the double-byte fonts that are used by a PDF file in which fonts are to be embedded but are not embedded in that PDF file, and the font description information of the double-byte fonts;
a glyph description information acquiring module, configured to determine, in the PDF file in which fonts are to be embedded, all characters output as text using the double-byte fonts and their character identifiers or glyph identifiers, and to acquire glyph description information corresponding to the identifiers according to the font files of the double-byte fonts;
and a PDF file generating module, configured to generate a PDF file in which the double-byte fonts to be embedded are embedded, according to the acquired font description information and the acquired glyph description information.
In the above embodiments of the present invention, the double-byte fonts to be embedded are determined from the double-byte fonts that the PDF file uses but does not embed, and the font description information of those fonts is obtained; all characters output using the double-byte fonts to be embedded, together with their character identifiers or glyph identifiers, are determined in the PDF file, and the glyph description information corresponding to those identifiers is then acquired from the font files of the double-byte fonts to be embedded; finally, a PDF file in which the double-byte fonts are embedded is generated according to the acquired font description information and the acquired glyph description information. Because determining the double-byte fonts to be embedded, obtaining the font description information, and acquiring the glyph description information can all be accomplished by parsing the PDF file in which fonts are to be embedded, the PS data stream conversion of the prior art is omitted. This simplifies the flow of embedding fonts in a PDF file, reduces the probability of errors introduced by PS data stream conversion, and improves the efficiency of font embedding.
Drawings
FIG. 1 is a schematic flow chart illustrating embedding a double-byte font into a PDF file according to an embodiment of the present invention;
FIG. 2 is a second flowchart illustrating a process of embedding a double-byte font into a PDF file according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating the encoding mapping of Type1(CID) fonts according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating encoding mapping of TrueType (CID) fonts according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a system for embedding a double-byte font into a PDF file according to an embodiment of the present invention.
Detailed Description
In order to solve the problems in the prior art, embodiments of the present invention provide a method and a system for embedding double-byte fonts into a PDF file, so that a target PDF file (i.e., a PDF file with embedded fonts) is directly generated while a PDF file with to-be-embedded fonts is parsed, and font descriptions of the double-byte fonts are embedded in a process of generating the target file. Compared with the prior art, the method avoids the use of the intermediate format, thereby better ensuring the correctness of the target file and improving the efficiency of the embedded operation.
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
For an original PDF file with a font to be embedded as an input (hereinafter referred to as an original file), in order to generate a PDF file with an embedded font as an output (hereinafter referred to as a target file) based thereon, in the embodiment of the present invention, the target file is generated according to the steps shown in fig. 1:
Step 101, parsing the original PDF file, determining the fonts that the file uses but does not embed, and determining the double-byte fonts to be embedded (such as Simplified Chinese, Traditional Chinese, Japanese, and Korean fonts); obtaining the font description information of each double-byte font to be embedded, wherein the font description information may include the font's encoding mode and the font name;
Step 102, determining, by parsing the content streams of the original PDF file, all characters output using a double-byte font to be embedded, applying encoding mapping to these characters according to their font type and encoding mode to obtain the corresponding identifiers (character identifiers or glyph identifiers), and acquiring the glyph description information corresponding to those identifiers from the font file of the double-byte font to be embedded;
Step 103, organizing the acquired glyph description information into a font program data stream conforming to the PDF specification, and using the font program data stream together with the obtained font description object as the font file data embedded in the PDF file, thereby generating the target PDF file.
In the above process, the determined double-byte fonts to be embedded may be all the double-byte fonts used by the original PDF file but not embedded, or a part of the double-byte fonts. The above-mentioned process can be implemented by a corresponding software system.
The detailed flow by which the software system of the embodiment of the present invention embeds into a PDF file all double-byte fonts that the file uses but does not embed is described below with reference to FIG. 2.
To facilitate implementation of the embodiments of the present invention, the following sets may be used as data storage modules for intermediate data during the process:
Font set to be embedded: a simple set of font objects containing all the double-byte fonts to be embedded. While the original file is parsed, each time a double-byte font to be embedded (present in the form of a font object) is found, one record is added to the set. Duplicates are not counted (i.e., only one record is stored when the same font object is used multiple times);
Font description set to be embedded: a simple set of font description objects containing the font descriptions of all the double-byte fonts to be embedded. Duplicates are not counted (i.e., only one record is stored when the same font description object is referenced multiple times);
Character set to be embedded: a set indexed by font object that contains all the characters of that font used in the original file. In this set, characters are recorded in the form of a character code (Char Code), a character identifier (CID), or a glyph identifier (Glyph ID). Duplicates are not counted (i.e., only one record is stored when the same character is used multiple times).
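The three sets above can be sketched as ordinary in-memory collections. The following is a minimal illustration only, not the patent's implementation: the class name `EmbeddingState`, its method names, and the string object identifiers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EmbeddingState:
    """Hypothetical container for the three intermediate sets."""
    fonts_to_embed: set = field(default_factory=set)        # font object ids
    descriptors_to_embed: set = field(default_factory=set)  # font description object ids
    # font object id -> set of character codes, CIDs, or glyph IDs
    chars_to_embed: dict = field(default_factory=lambda: defaultdict(set))

    def record_font(self, font_id, descriptor_id):
        # Sets ignore duplicates, so repeated use stores one record only.
        self.fonts_to_embed.add(font_id)
        self.descriptors_to_embed.add(descriptor_id)

    def record_char(self, font_id, identifier):
        self.chars_to_embed[font_id].add(identifier)


state = EmbeddingState()
state.record_font("F1", "FD1")
state.record_font("F1", "FD1")   # same font used again: no new record
for cid in (17, 42, 17):         # CID 17 is used twice
    state.record_char("F1", cid)
```

Using sets keeps the "duplicates are not counted" rule implicit rather than requiring an explicit membership check at each insertion.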
As shown in fig. 2, the process of embedding double-byte fonts in a PDF file by the software system of the embodiment of the present invention includes:
step 201, parsing the original file, obtaining Font objects (Font objects) of all non-embedded double-byte fonts used by the original file, and storing the objects in a Font set to be embedded.
In general, a Font object in a PDF exists in a PDF file in the form of a PDF dictionary object, and a double-byte Font used by the PDF file but not embedded in the PDF file can be determined by looking up the PDF dictionary object of the original PDF file. The Font object contains important information about the Font, such as the Font name and the encoding method.
Step 202, for the font objects of all double-byte fonts in the font set to be embedded, looking up the corresponding descendant fonts (DescendantFonts), looking up in the descendant fonts the font description objects (FontDescriptor) corresponding to the double-byte fonts to be embedded, and storing the font description objects found in the font description set to be embedded.
In general, descendant fonts exist in a PDF file in the form of PDF dictionary objects. The FontDescriptor object of the corresponding font is referenced from the descendant font and also exists in the PDF file in the form of a dictionary object.
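Steps 201 and 202 can be illustrated with PDF dictionaries modelled as plain Python dicts. This is a sketch under assumptions: the function name and the sample data are hypothetical, and a real implementation would resolve indirect references through a PDF parser. The dictionary keys (`Subtype`, `DescendantFonts`, `FontDescriptor`, `FontFile`, `FontFile2`, `FontFile3`) are those defined by the PDF specification, where an embedded font program is attached to the font descriptor under one of the three `FontFile*` keys.

```python
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def find_unembedded_cid_fonts(font_objects):
    """Return (font name, descriptor) pairs for composite fonts
    whose descriptor carries no embedded font program."""
    result = []
    for font in font_objects:
        if font.get("Subtype") != "Type0":
            continue  # not a composite (double-byte) font
        for descendant in font.get("DescendantFonts", []):
            descriptor = descendant.get("FontDescriptor", {})
            if not any(key in descriptor for key in FONT_FILE_KEYS):
                result.append((font["BaseFont"], descriptor))
    return result


# Hypothetical sample data: one unembedded and one embedded composite
# font, plus a single-byte font that is ignored.
fonts = [
    {"Subtype": "Type0", "BaseFont": "SimSun", "Encoding": "Identity-H",
     "DescendantFonts": [{"Subtype": "CIDFontType2",
                          "FontDescriptor": {"FontName": "SimSun"}}]},
    {"Subtype": "Type0", "BaseFont": "KozMin", "Encoding": "Identity-H",
     "DescendantFonts": [{"Subtype": "CIDFontType0",
                          "FontDescriptor": {"FontName": "KozMin",
                                             "FontFile3": b"..."}}]},
    {"Subtype": "Type1", "BaseFont": "Helvetica"},
]
missing = find_unembedded_cid_fonts(fonts)
```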
Step 203, parsing all content streams in the original PDF file, obtaining the font used by each instruction related to text output and the codes of the characters it outputs, obtaining the character identifier or glyph identifier of each character output using a double-byte font to be embedded according to the font type and encoding mode of that character, and storing the obtained character identifiers or glyph identifiers in the character set to be embedded, with the font description as the index.
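As a rough illustration of extracting text-output instructions from a content stream, the sketch below tracks the `Tf` (set font) operator and collects the hex strings shown by `Tj`. It is deliberately simplified and rests on stated assumptions: the input contains only hex strings (no literal `(...)` strings), `TJ` arrays and other text operators are not handled, and the function name `shown_strings` is hypothetical.

```python
import re

# Tokens: /Name, <hex string>, operator keyword, or number.
TOKEN = re.compile(
    r"/(?P<name>[^\s/<>\[\]()]+)"
    r"|<(?P<hex>[0-9A-Fa-f\s]*)>"
    r"|(?P<op>[A-Za-z*]+)"
    r"|(?P<num>[-+]?\d*\.?\d+)"
)


def shown_strings(content):
    """Yield (font resource name, raw bytes) for each hex string shown with Tj."""
    current_font = None
    operands = []
    for match in TOKEN.finditer(content):
        kind = match.lastgroup
        if kind == "op":
            op = match.group("op")
            if op == "Tf" and operands:
                current_font = operands[0]      # operands: /Name size
            elif op == "Tj" and operands:
                yield current_font, operands[-1]
            operands = []                       # operands belong to one operator
        elif kind == "name":
            operands.append(match.group("name"))
        elif kind == "hex":
            operands.append(bytes.fromhex(match.group("hex")))
        else:
            operands.append(float(match.group("num")))


runs = list(shown_strings("BT /F1 12 Tf <00110102> Tj ET"))
```

The raw bytes still have to be split into character codes and mapped to identifiers, which depends on the font type and encoding mode.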
In this step, for a font of Type1(CID) type, the characters are encoding-mapped to obtain character identifiers (CID); for a font of TrueType (CID) type, the characters are encoding-mapped to obtain their Unicode values, and the glyph table in the TrueType font file is then queried to obtain the corresponding glyph identifiers (Glyph ID). The encoding mapping finds the character name corresponding to a character code by searching an encoding mapping table, which is an attribute contained in each font description object. The specific steps are as follows:
As shown in FIG. 3, for a font of Type1(CID) type, if the encoding mode is Identity-H or Identity-V, the character identifier (CID) of the character can be parsed directly from the content stream (see steps 301-302); for other encoding modes, the character code (Char Code) of the character is parsed from the content stream, and the corresponding character identifier (CID) is then obtained by mapping the character code (see steps 301, 303 and 304). After the character identifier (CID) is obtained, the glyph description corresponding to the character can be found in the corresponding font file according to the character identifier. If the character also contains sub-characters, the character identifiers corresponding to the sub-characters must be obtained in the same way.
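The two branches of FIG. 3 can be sketched as follows. The function names and the `cmap` dictionary are hypothetical, and the sketch assumes fixed two-byte codes; real CMaps other than Identity-H and Identity-V may use variable-length codes, which this example does not handle. With Identity-H or Identity-V, each two-byte big-endian code is itself the CID.

```python
def codes_from_string(data):
    """Split a shown string into 2-byte big-endian character codes."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def cids_for_type1_cid(data, encoding, cmap=None):
    """Map the bytes of a shown string to CIDs for a Type1(CID) font."""
    codes = codes_from_string(data)
    if encoding in ("Identity-H", "Identity-V"):
        return codes                      # the code is the CID (steps 301-302)
    if cmap is None:
        raise ValueError("a code-to-CID mapping is required for " + encoding)
    return [cmap[code] for code in codes]  # steps 303-304


identity_cids = cids_for_type1_cid(b"\x00\x11\x01\x02", "Identity-H")
# "Example-H" and its single mapping entry are made up for illustration.
mapped_cids = cids_for_type1_cid(b"\xb0\xa1", "Example-H", {0xB0A1: 1234})
```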
As shown in FIG. 4, for a font of TrueType (CID) type, if the encoding mode is Identity-H or Identity-V, the character identifier (CID) of the character is parsed from the content stream, the Unicode value corresponding to the character identifier is obtained from a mapping table from character identifiers to Unicode values, and the glyph identifier (Glyph ID) corresponding to the Unicode value is obtained from a mapping table from Unicode values to glyph identifiers (see steps 401, 402, 406 and 407); for a Unicode encoding mode, the Unicode value of the character is parsed from the content stream, and the corresponding glyph identifier (Glyph ID) is then obtained from the mapping table from Unicode values to glyph identifiers (see steps 401, 403 and 407); for other encoding modes, the character code (Char Code) of the character is parsed from the content stream, the character identifier (CID) corresponding to the character code is obtained through the intermediate mapping table of the font description object, the mapping table from CIDs to Unicode values is then queried with that CID to obtain the corresponding Unicode value, and the corresponding glyph identifier (Glyph ID) is obtained from it (see steps 401, 404, 405, 406 and 407). After the glyph identifier (Glyph ID) is obtained, the corresponding glyph description can be found in the corresponding font file according to the glyph identifier. If the character also contains sub-characters, the glyph identifiers corresponding to the sub-characters must be obtained in the same way.
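The three branches of FIG. 4 differ only in how the Unicode value is reached, after which they share the final Unicode-to-glyph lookup. The sketch below assumes the mapping tables are already available as dictionaries; the function name, the `encoding_kind` labels, and every table entry are hypothetical values chosen for illustration.

```python
def glyph_ids_for_truetype_cid(codes, encoding_kind, cid_to_unicode,
                               unicode_to_gid, code_to_cid=None):
    """Map parsed codes to glyph IDs for a TrueType (CID) font.

    encoding_kind is "identity" (Identity-H/V), "unicode", or "other".
    """
    gids = []
    for code in codes:
        if encoding_kind == "identity":
            unicode_value = cid_to_unicode[code]        # code is a CID
        elif encoding_kind == "unicode":
            unicode_value = code                        # code is already Unicode
        else:
            cid = code_to_cid[code]                     # char code -> CID
            unicode_value = cid_to_unicode[cid]         # CID -> Unicode
        gids.append(unicode_to_gid[unicode_value])      # Unicode -> glyph ID
    return gids


# Made-up tables: CID 100 and char code 0xD6D0 both denote U+4E2D,
# which this imaginary font draws with glyph 7.
cid_to_unicode = {100: 0x4E2D}
unicode_to_gid = {0x4E2D: 7}
code_to_cid = {0xD6D0: 100}

via_identity = glyph_ids_for_truetype_cid([100], "identity", cid_to_unicode, unicode_to_gid)
via_unicode = glyph_ids_for_truetype_cid([0x4E2D], "unicode", cid_to_unicode, unicode_to_gid)
via_other = glyph_ids_for_truetype_cid([0xD6D0], "other", cid_to_unicode, unicode_to_gid, code_to_cid)
```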
Step 204, a font program data stream is constructed. Constructing a CFF font program data stream if the double-byte font to be embedded is of Type1(CID) Type; constructing a TrueType (CID) font program data stream if the double-byte font to be embedded is a TrueType (CID) type; if the double-byte font to be embedded includes both Type1(CID) and TrueType (CID) types, a CFF font program data stream and a TrueType (CID) font program data stream are constructed.
Step 205, selecting a font description from the font description set to be embedded and using it as an index into the character set to be embedded, reading all character identifiers or glyph identifiers recorded for that font description, looking up the corresponding glyph description information for each of them in the font file, and writing the glyph description information found into the corresponding font program data stream.
In this step, if the font corresponding to the current font description is of Type1(CID) type, the following steps are performed:
in the character set to be embedded, traversing each character identifier (CID) under the current font description, using the current font description as the index; looking up the corresponding glyph description information in the Type1(CID) font file according to each character identifier (CID), and, if the character identifier (CID) includes sub-character identifiers, also looking up the sub-character glyph description information corresponding to those sub-character identifiers; then storing the glyph description information found into the previously constructed CFF data stream according to the CFF font program specification;
if the font corresponding to the current font description is of TrueType (CID) type, the following steps are performed:
in the character set to be embedded, traversing each glyph identifier (Glyph ID) under the current font description, using the current font description as the index; looking up the corresponding glyph description information in the TrueType (CID) font file according to each glyph identifier, and, if the glyph identifier includes glyph identifiers of sub-characters, also looking up the sub-character glyph description information corresponding to those glyph identifiers; then storing the glyph description information obtained into the previously constructed TrueType (CID) data stream according to the TrueType (CID) font program specification.
Step 206, determining whether the glyph description information corresponding to the character identifiers and glyph identifiers of all font descriptions in the character set to be embedded has been written into the font program data streams; if yes, step 207 is executed; otherwise, the flow returns to step 205.
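The handling of sub-characters in steps 205 and 206 amounts to a transitive closure: every requested identifier is included, along with any identifiers its glyph refers to. The sketch below models this with dictionaries; the function names, the `components` table, and the string glyph descriptions are hypothetical stand-ins for data that a real implementation would read from the CFF or TrueType font file.

```python
def collect_glyphs(requested, components):
    """Return the requested identifiers plus all sub-character
    identifiers they reference, directly or indirectly."""
    needed = set()
    stack = list(requested)
    while stack:
        gid = stack.pop()
        if gid in needed:
            continue                      # already handled, avoid repeats
        needed.add(gid)
        stack.extend(components.get(gid, ()))
    return needed


def build_font_program(requested, components, glyph_descriptions):
    """Gather the glyph descriptions that would go into the font program stream."""
    return {gid: glyph_descriptions[gid]
            for gid in sorted(collect_glyphs(requested, components))}


# Glyph 7 is built from glyphs 3 and 4; glyph 4 in turn uses glyph 2.
components = {7: [3, 4], 4: [2]}
glyph_descriptions = {gid: "outline-%d" % gid for gid in range(10)}
program = build_font_program({7}, components, glyph_descriptions)
```

Because only the collected identifiers are written, the result is a subset of the font rather than the whole font file.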
And step 207, writing the font program data stream into the target PDF file, and writing the font description objects recorded in the font description set to be embedded into the target PDF file according to the PDF specification.
The font description object of a Type1(CID) font is written into the target PDF file after the necessary modification (mainly adding a reference to the generated CFF data stream), following the rules for embedding CFF fonts in the PDF specification; the font description object of a TrueType (CID) font is written into the target PDF file after the necessary modification (mainly adding a reference to the generated TrueType (CID) data stream), following the rules for embedding TrueType (CID) fonts in the PDF specification.
And step 208, traversing the objects in the original PDF file, and storing all other objects into the target PDF file without modification except the font description objects written into the target PDF file through the steps.
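Steps 207 and 208 can be sketched on a toy object table in which integer object numbers map to dictionaries. The function name `write_target` and the table layout are hypothetical. The sketch relies on the PDF specification's convention that an embedded TrueType program is referenced from the font descriptor under `FontFile2` and a CFF program under `FontFile3`; here the reference is modelled simply as the new stream's object number.

```python
def write_target(original_objects, font_streams):
    """Build the target object table.

    font_streams maps a font descriptor's object number to
    (font file key, font program bytes).
    """
    target = {}
    next_id = max(original_objects) + 1
    for obj_id, obj in original_objects.items():
        if obj_id in font_streams:
            key, data = font_streams[obj_id]
            target[next_id] = {"stream": data}        # new font program object
            target[obj_id] = {**obj, key: next_id}    # modified copy of the descriptor
            next_id += 1
        else:
            target[obj_id] = obj                      # copied without modification
    return target


original = {
    1: {"Type": "Page"},
    2: {"Type": "FontDescriptor", "FontName": "SimSun"},
}
out = write_target(original, {2: ("FontFile2", b"glyf-data")})
```

Copying the descriptor before adding the reference leaves the original object table untouched.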
In step 202 of the above flow, the font description information obtained for a font to be embedded may optionally include the character set used by that font in the original PDF file (for example, an identifier or name of the character set). When glyph description information is acquired later in the embedding process, it then only needs to be taken from the corresponding character set in the font file and written into the target PDF file. In this way, only a minimized subset of the font is embedded, containing just the characters of that font used by the original PDF file, which reduces the data volume of the target PDF file.
Based on the same technical concept, an embodiment of the present invention further provides a system capable of embedding a double-byte font into a PDF file. As shown in FIG. 5, the system includes: a font description information determining module 501, a glyph description information acquiring module 502, and a PDF file generating module 503; wherein,
the font description information determining module 501 is configured to determine the double-byte fonts that are used by a PDF file in which fonts are to be embedded but are not embedded in that PDF file, and the font description information of the double-byte fonts;
the glyph description information acquiring module 502 is configured to determine, in the PDF file in which fonts are to be embedded, all characters output as text using the double-byte fonts and their character identifiers or glyph identifiers, and to acquire, according to the font files of the double-byte fonts, the glyph description information corresponding to the identifiers;
the PDF file generating module 503 is configured to generate a PDF file in which the double-byte fonts to be embedded are embedded, according to the acquired font description information and the acquired glyph description information.
The font description information determining module 501 may include:
a file parsing submodule 5011, configured to parse the PDF file in which fonts are to be embedded;
a to-be-embedded font and description determining submodule 5012, configured to determine the double-byte fonts that the PDF file uses but does not embed, and the font description information of those double-byte fonts, according to the PDF dictionary objects of the PDF file parsed by the file parsing submodule 5011.
The glyph description information acquiring module 502 may include:
a content stream parsing submodule 5021, configured to parse the content streams of the PDF file in which fonts are to be embedded to obtain all instructions related to text output;
a character and identifier acquiring submodule 5022, configured to determine the characters output using the double-byte fonts to be embedded according to the instructions parsed by the content stream parsing submodule 5021, and to acquire the character identifier or glyph identifier of each output character according to the font type to which the character belongs and the corresponding encoding mode. If the font type to which the output character belongs is Type1, the character identifier is acquired; if it is TrueType, the glyph identifier is acquired. The process of acquiring the character identifier or glyph identifier according to the font type and the encoding mode is as described above;
a glyph description information acquiring submodule 5023, configured to acquire the glyph description information corresponding to the identifier according to the font file of the double-byte font.
The glyph description information acquiring module 502 may further include a font file loading submodule 5024, configured to load the corresponding font file according to the font description information of the double-byte font determined by the to-be-embedded font and description determining submodule 5012. The glyph description information acquiring submodule 5023 then acquires the glyph description information corresponding to the identifier from the loaded font file.
The PDF file generating module 503 may include:
a font program data stream constructing submodule 5031, configured to construct a corresponding font program data stream according to the font type to which a double-byte font to be embedded belongs;
a font program data stream writing submodule 5032, configured to store the acquired glyph description information into the corresponding font program data stream;
a PDF file writing submodule 5033, configured to write the font program data stream in which the glyph description information is stored, together with the font description information of the double-byte font to be embedded, into the target PDF file, where the target PDF file is the PDF file in which the double-byte font to be embedded is embedded.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A method for embedding double-byte fonts into a PDF file is characterized by comprising the following steps:
determining the double-byte fonts that are used by a PDF file in which fonts are to be embedded but are not embedded in that PDF file, and the font description information of the double-byte fonts;
determining, in the PDF file in which fonts are to be embedded, all characters output as text using the double-byte fonts and their character identifiers or glyph identifiers, and acquiring glyph description information corresponding to the character identifiers or glyph identifiers according to the font files of the double-byte fonts; wherein determining, in the PDF file in which fonts are to be embedded, all characters output as text using the double-byte fonts to be embedded and their character identifiers or glyph identifiers specifically includes: parsing the content stream of the PDF file to obtain all instructions related to text output, and determining, according to the instructions, the codes of the characters output using the double-byte fonts to be embedded; and acquiring the character identifier or glyph identifier of each output character according to the font type to which the output character belongs and the code of the output character;
generating a PDF file in which the double-byte fonts to be embedded are embedded, according to the acquired font description information and the acquired glyph description information; which specifically comprises: constructing a corresponding font program data stream according to the font type to which the double-byte font to be embedded belongs; storing the acquired glyph description information into the corresponding font program data stream; and writing the font program data stream in which the glyph description information is stored, together with the font description information of the double-byte fonts to be embedded, into a target PDF file, wherein the target PDF file is the PDF file in which the double-byte fonts to be embedded are embedded.
2. The method according to claim 1, wherein when the font type of the output character is Type1 and the corresponding encoding is Identity-H or Identity-V, the character identifier of the output character is obtained, specifically by: parsing the character identifier of the output character from the content stream of the PDF file in which fonts are to be embedded;
when the font type of the output character is Type1 and the corresponding encoding is an encoding other than Identity-H and Identity-V, the character identifier of the output character is obtained, specifically by: parsing the character code of the output character from the content stream of the PDF file in which fonts are to be embedded, and obtaining the character identifier corresponding to that character code according to the mapping between character codes and character identifiers;
when the font type of the output character is TrueType and the corresponding encoding is Identity-H or Identity-V, the glyph identifier of the output character is obtained, specifically by: parsing the character identifier of the output character from the content stream of the PDF file in which fonts are to be embedded, obtaining the Unicode value corresponding to that character identifier according to the mapping between character identifiers and Unicode values, and obtaining the glyph identifier corresponding to that Unicode value according to the mapping between Unicode values and glyph identifiers;
when the font type of the output character is TrueType and the corresponding encoding is an encoding other than Identity-H and Identity-V, the glyph identifier of the output character is obtained, specifically by: parsing the character identifier or character code of the output character from the content stream of the PDF file in which fonts are to be embedded, and obtaining the glyph identifier corresponding to that character identifier or character code according to the mapping between character identifiers or character codes and glyph identifiers.
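The four cases of claim 2 amount to a dispatch on font type and encoding. The sketch below is one possible reading, not the patented implementation: the function name `resolve_identifier`, the parameter names, and every mapping value are invented for illustration, and the mappings are modeled as plain dictionaries rather than real CMap or `cmap` table lookups.

```python
IDENTITY_ENCODINGS = {"Identity-H", "Identity-V"}


def resolve_identifier(font_type, encoding, value, *, code_to_cid=None,
                       cid_to_unicode=None, unicode_to_gid=None,
                       value_to_gid=None):
    """Return a character identifier (Type1) or a glyph identifier (TrueType).

    `value` is what was parsed from the content stream: a character
    identifier under an Identity encoding, otherwise a character code.
    """
    identity = encoding in IDENTITY_ENCODINGS
    if font_type == "Type1":
        if identity:
            return value                 # parsed value already is the identifier
        return code_to_cid[value]        # character code -> character identifier
    if font_type == "TrueType":
        if identity:
            # character identifier -> Unicode value -> glyph identifier
            return unicode_to_gid[cid_to_unicode[value]]
        return value_to_gid[value]       # identifier or code -> glyph identifier
    raise ValueError(f"unsupported font type: {font_type!r}")


# Toy mappings; the numbers are made up and not taken from any real font.
gid = resolve_identifier("TrueType", "Identity-H", 7,
                         cid_to_unicode={7: 0x4E2D},
                         unicode_to_gid={0x4E2D: 42})
```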
3. The method of claim 1, wherein obtaining glyph description information corresponding to the character identifier or glyph identifier from a font file of the double-byte font comprises:
loading the corresponding font file according to the font description information of the double-byte font;
and acquiring the glyph description information corresponding to the character identifier or glyph identifier from the loaded font file.
4. The method according to claim 3, wherein the font description information of the double-byte font comprises information on the character set used by that font in the PDF file in which fonts are to be embedded;
and when the glyph description information corresponding to the character identifier or glyph identifier is acquired from the loaded font file, it is acquired only from the corresponding character set in the loaded double-byte font file, according to the character set information carried in the font description information of the double-byte font.
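Claims 3 and 4 together describe a lookup restricted to one character set of the loaded font file. A minimal sketch follows, assuming the loaded font file can be modeled as a dictionary from character set name to a table of glyph descriptions; the function name `glyphs_for_charset`, that data model, and the character set names and byte strings are all hypothetical.

```python
def glyphs_for_charset(font_file, charset, identifiers):
    """Look up glyph descriptions in a single character set of the font file.

    Returns (found, missing): `found` maps each identifier to its glyph
    description, `missing` lists identifiers absent from that character set.
    """
    table = font_file.get(charset, {})
    found = {i: table[i] for i in identifiers if i in table}
    missing = sorted(set(identifiers) - found.keys())
    return found, missing


# Toy font file with two character sets; contents are placeholders.
font_file = {
    "GB1": {1: b"glyph-1", 2: b"glyph-2"},
    "Japan1": {1: b"other-glyph"},
}
found, missing = glyphs_for_charset(font_file, "GB1", [1, 2, 9])
```

Restricting the lookup this way means an identifier that exists only in another character set is reported as missing rather than resolved to the wrong glyph.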
5. The method of claim 1, wherein if the character identifier of a character comprises a character identifier or glyph identifier of a sub-character, acquiring the glyph description information further comprises: acquiring, from the font file of the double-byte font to be embedded, the glyph description information corresponding to the character identifier or glyph identifier of the sub-character; and storing the glyph description information in the corresponding font program data stream further comprises: storing the glyph description information of the sub-character in the corresponding font program data stream.
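The sub-character rule of claim 5 can be read as taking the closure of the required glyph set over component references, so that a composite glyph never points at data that was left out. The sketch below assumes the component relationships are available as a dictionary; the name `with_sub_glyphs` and the identifiers are illustrative only.

```python
def with_sub_glyphs(identifiers, components):
    """Return the identifiers plus every sub-character they reference.

    `components` maps an identifier to the identifiers of its
    sub-characters; identifiers without an entry have none.
    """
    needed = set()
    stack = list(identifiers)
    while stack:
        gid = stack.pop()
        if gid in needed:
            continue  # already collected; also guards against reference cycles
        needed.add(gid)
        stack.extend(components.get(gid, ()))
    return needed


# Glyph 5 is built from 2 and 3, and glyph 3 is itself built from 1.
needed = with_sub_glyphs([5], {5: [2, 3], 3: [1]})
```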
6. A system for embedding double-byte fonts into a PDF file, comprising:
a font description information determining module, configured to determine the double-byte fonts that are used but not embedded in the PDF file in which fonts are to be embedded, and the font description information of those double-byte fonts;
a glyph description information acquisition module, configured to determine all characters output using the double-byte font in the PDF file in which fonts are to be embedded, together with their character identifiers or glyph identifiers, and to acquire the glyph description information corresponding to the character identifiers or glyph identifiers from the font file of the double-byte font; the glyph description information acquisition module specifically comprises: a content stream parsing submodule, configured to parse the content stream of the PDF file in which fonts are to be embedded to obtain all instructions related to character output; a character and identifier obtaining submodule, configured to determine, according to the instructions, the encoding of the characters output using the double-byte font to be embedded, and to acquire the character identifier or glyph identifier of each output character according to the determined font type of the output character and its encoding; and a glyph description information acquisition submodule, configured to acquire the glyph description information corresponding to the character identifier or glyph identifier from the font file of the double-byte font;
a PDF file generating module, configured to generate, according to the acquired glyph description information and the font description information, a PDF file in which the double-byte font to be embedded is embedded; the PDF file generating module specifically comprises: a font program data stream construction submodule, configured to construct a font program data stream corresponding to the font type of the double-byte font to be embedded; a font program data stream writing submodule, configured to store the acquired glyph description information in the corresponding font program data stream; and a PDF file writing submodule, configured to write the font program data stream containing the glyph description information, together with the font description information of the double-byte font to be embedded, into a target PDF file, the target PDF file being the PDF file in which the double-byte font is embedded.
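To make the content stream parsing submodule of claim 6 more concrete, the sketch below collects the character codes shown with each font from a decoded content stream. It is a deliberately narrow illustration under stated assumptions: only the `Tf` (select font) and `Tj` (show string) operators are handled, only hexadecimal strings are read, and every code is assumed to be exactly two bytes, which holds for Identity-H and Identity-V but not for every CMap. The function name `codes_by_font` and the sample stream are invented.

```python
import re

# Matches "/Name size Tf" (font selection) or "<hex> Tj" (show a hex string).
TOKEN = re.compile(
    rb"/([^\s/\[\]<>()]+)\s+-?[\d.]+\s+Tf|<([0-9A-Fa-f\s]*)>\s*Tj"
)


def codes_by_font(content):
    """Map each font resource name to the 2-byte codes shown with it."""
    result = {}
    current = None
    for match in TOKEN.finditer(content):
        if match.group(1) is not None:
            current = match.group(1).decode("ascii")
            continue
        if current is None:
            continue  # string shown before any font was selected
        hex_digits = re.sub(rb"\s+", b"", match.group(2))
        if len(hex_digits) % 2:
            hex_digits += b"0"  # PDF pads an odd final hex digit with 0
        raw = bytes.fromhex(hex_digits.decode("ascii"))
        # Split into big-endian 2-byte codes; a trailing single byte is dropped.
        codes = [int.from_bytes(raw[i:i + 2], "big")
                 for i in range(0, len(raw) - 1, 2)]
        result.setdefault(current, []).extend(codes)
    return result


stream = b"BT /F1 12 Tf <4E2D6587> Tj /F2 10 Tf <0041> Tj ET"
codes = codes_by_font(stream)
```

A fuller parser would also handle literal strings, `TJ` arrays, the `'` and `"` operators, and variable-length codes defined by the font's CMap.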
7. The system of claim 6, wherein the character and identifier obtaining submodule acquiring the character identifier or glyph identifier of the output character according to the determined font type of the output character and its encoding comprises:
when the font type of the output character is Type1 and the corresponding encoding is an encoding other than Identity-H and Identity-V, obtaining the character identifier of the output character, specifically by: parsing the character code of the output character from the content stream of the PDF file in which fonts are to be embedded, and obtaining the character identifier corresponding to that character code according to the mapping between character codes and character identifiers;
when the font type of the output character is TrueType and the corresponding encoding is Identity-H or Identity-V, obtaining the glyph identifier of the output character, specifically by: parsing the character identifier of the output character from the content stream of the PDF file in which fonts are to be embedded, obtaining the Unicode value corresponding to that character identifier according to the mapping between character identifiers and Unicode values, and obtaining the glyph identifier corresponding to that Unicode value according to the mapping between Unicode values and glyph identifiers;
when the font type of the output character is TrueType and the corresponding encoding is an encoding other than Identity-H and Identity-V, obtaining the glyph identifier of the output character, specifically by: parsing the character identifier or character code of the output character from the content stream of the PDF file in which fonts are to be embedded, and obtaining the glyph identifier corresponding to that character identifier or character code according to the mapping between character identifiers or character codes and glyph identifiers.
8. The system of claim 6, wherein the glyph description information acquisition module further comprises a font file loading submodule, configured to load the corresponding font file according to the font description information of the double-byte font;
and the glyph description information acquisition submodule, when acquiring the glyph description information, acquires the glyph description information corresponding to the character identifier or glyph identifier from the loaded font file.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102381327A CN102063416B (en) 2009-11-16 2009-11-16 Method and system for embedding double-byte fonts into PDF file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102381327A CN102063416B (en) 2009-11-16 2009-11-16 Method and system for embedding double-byte fonts into PDF file

Publications (2)

Publication Number Publication Date
CN102063416A CN102063416A (en) 2011-05-18
CN102063416B true CN102063416B (en) 2012-07-25

Family

ID=43998697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102381327A Expired - Fee Related CN102063416B (en) 2009-11-16 2009-11-16 Method and system for embedding double-byte fonts into PDF file

Country Status (1)

Country Link
CN (1) CN102063416B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103186513B * 2011-12-31 2016-04-27 北大方正集团有限公司 Method and device for document format conversion
CN103631968B (en) * 2013-12-17 2017-01-18 天津书生软件技术有限公司 Method and device for realizing font imbedding of document
CN105224509A * 2014-05-30 2016-01-06 北大方正集团有限公司 Method and device for generating a font format
CN108664457A (en) * 2017-04-01 2018-10-16 北大方正集团有限公司 Pdf document processing method and processing device
CN110826005B (en) * 2019-11-13 2022-12-16 北大方正集团有限公司 File generation method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0815517B1 (en) * 1995-03-21 1999-05-06 The Dialog Corporation plc Image data transfer
US6966029B1 (en) * 1999-12-08 2005-11-15 Koninklijke Philips Electronics N.V. Script embedded in electronic documents as invisible encoding
CN101187939A (en) * 2007-11-22 2008-05-28 北大方正集团有限公司 A font file built-in method and device

Also Published As

Publication number Publication date
CN102063416A (en) 2011-05-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220627

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District Cheng Fu Road 298, founder building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120725