CN103412905A - PDF (Portable document format) file comparison method and system - Google Patents


Publication number
CN103412905A
Authority
CN
China
Prior art keywords
paragraph
PDF file
presentation carrier
page
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013103299006A
Other languages
Chinese (zh)
Inventor
张树坤
周剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GLODON SOFTWARE Co Ltd
Original Assignee
GLODON SOFTWARE Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GLODON SOFTWARE Co Ltd filed Critical GLODON SOFTWARE Co Ltd
Priority to CN2013103299006A priority Critical patent/CN103412905A/en
Publication of CN103412905A publication Critical patent/CN103412905A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a PDF (Portable Document Format) file comparison method and system, and relates to the field of computers. The method includes: extracting a remaining paragraph from a first PDF file as a target paragraph; judging whether a remaining paragraph exists within a preset range of a second PDF file; if so, matching within the preset range to obtain the paragraph most similar to the target paragraph, and removing the target paragraph and the most similar paragraph; if not, removing the target paragraph; marking the text that is identical between the target paragraph and the most similar paragraph in the computer presentation carrier converted from the page containing the target paragraph; marking the same identical text in the computer presentation carrier converted from the page containing the most similar paragraph; judging whether a remaining paragraph still exists in the first PDF file; if so, returning to the first step; if not, outputting the computer presentation carriers corresponding to the first PDF file and the second PDF file. The method and system make PDF file comparison more efficient and more accurate.

Description

PDF file comparison method and system
Technical field
The present invention relates to the field of computer technology, and in particular to a PDF (Portable Document Format) file comparison method and system.
Background art
Collusive bidding and bid rigging by bidders often occur in the bidding process for construction projects. In bid rigging, a single bidder, in order to win the bid evaluation, produces multiple bid documents under the identities of several bidders to improve its own chance of winning. Bid rigging is opportunistic behavior. It usually pushes the winning bid price above the normal range, seriously undermines the fairness and seriousness of the tendering process, and harms the interests of the tenderee and of the other bidders. In the construction sector, however, bid rigging is harder to detect than other unlawful behavior. In electronic bid evaluation systems, bid documents are generally PDF files, so bid documents can be compared by comparing PDF files.
At present, PDF bid documents are compared mainly by hand. Manual comparison is practical only for bid documents with little content. When the bid documents are long, it greatly increases the staff's workload, takes a long time, and yields low comparison efficiency and low comparison accuracy.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the present invention is how to provide a PDF file comparison method and system that improve comparison efficiency.
(2) Technical solution
To solve the above technical problem, the present invention provides a PDF file comparison method, comprising:
110: extracting one remaining paragraph from a first PDF file as a target paragraph;
120: judging whether a remaining paragraph exists within a preset range of a second PDF file; if so, matching within the preset range to obtain the paragraph most similar to the target paragraph, and removing the target paragraph and the most similar paragraph; otherwise, removing the target paragraph;
130: judging whether the page containing the target paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the target paragraph into a computer presentation carrier and then marking the identical text in it;
140: judging whether the page containing the most similar paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the most similar paragraph into a computer presentation carrier and then marking the identical text in it;
150: judging whether a remaining paragraph exists in the first PDF file; if so, performing step 110; otherwise, outputting the computer presentation carriers corresponding to the first PDF file and the second PDF file.
The method further comprises: outputting a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file.
The similarity value L is calculated as follows:
L = S/(A+B-S);
where S denotes the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file, A denotes the quantity of text in the first PDF file, and B denotes the quantity of text in the second PDF file.
The preset range is [F_min, F_max], where F_min and F_max are calculated as follows:
F_min = P_m - Y;
F_max = P_m + Y;
where F_min denotes the lower-limit page number of the preset range in the second PDF file, F_max denotes the upper-limit page number of the preset range in the second PDF file, P_m denotes the page number of the page containing the target paragraph in the first PDF file, and Y is a constant.
Y equals 3 or 5.
Matching within the preset range to obtain the paragraph most similar to the target paragraph specifically comprises:
matching the target paragraph against each paragraph in the preset range in turn, to obtain, for each paragraph in the preset range, the quantity of text it has in common with the target paragraph;
taking the paragraph in the preset range that has the largest quantity of text in common with the target paragraph as the most similar paragraph of the target paragraph.
Before step 150, the method further comprises:
judging whether a remaining paragraph exists in the target page, that is, the page of the first PDF file containing the target paragraph; if so, performing step 150; otherwise, outputting the computer presentation carrier corresponding to the target page.
The present invention also provides a PDF file comparison system, comprising:
an extraction unit, configured to extract one remaining paragraph from a first PDF file as a target paragraph;
a first judging unit, configured to judge whether a remaining paragraph exists within a preset range of a second PDF file; if so, to match within the preset range to obtain the paragraph most similar to the target paragraph, and to remove the target paragraph and the most similar paragraph; otherwise, to remove the target paragraph;
a second judging unit, configured to judge whether the page containing the target paragraph has already been converted into a computer presentation carrier; if so, to mark, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the target paragraph into a computer presentation carrier and then mark the identical text in it;
a third judging unit, configured to judge whether the page containing the most similar paragraph has already been converted into a computer presentation carrier; if so, to mark, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the most similar paragraph into a computer presentation carrier and then mark the identical text in it;
a fourth judging unit, configured to judge whether a remaining paragraph exists in the first PDF file; if so, to notify the extraction unit to extract one remaining paragraph from the first PDF file as the target paragraph; otherwise, to output the computer presentation carriers corresponding to the first PDF file and the second PDF file.
The system further comprises: a similarity unit, configured to output a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file.
The system further comprises: an intermediate output unit, configured to judge whether a remaining paragraph exists in the target page, that is, the page of the first PDF file containing the target paragraph, and, if none exists, to output the computer presentation carrier corresponding to the target page.
(3) Beneficial effects
The PDF file comparison method and system of the embodiments of the present invention automatically compare two PDF files (the first PDF file and the second PDF file) paragraph by paragraph, convert the pages containing paragraphs with identical text into computer presentation carriers, mark in those carriers the text that is identical in the two PDF files, and then output the computer presentation carriers corresponding to the two PDF files, which significantly improves the efficiency and accuracy of PDF file comparison.
Brief description of the drawings
Fig. 1 is a flowchart of the PDF file comparison method of Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the PDF file comparison method of Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of the module structure of the PDF file comparison system of Embodiment 3 of the present invention;
Fig. 4 is a schematic diagram of the module structure of the PDF file comparison system of Embodiment 4 of the present invention.
Detailed description of the embodiments
The specific embodiments of the present invention are described in further detail below with reference to the drawings and embodiments. The following embodiments illustrate the present invention but do not limit its scope.
Embodiment 1
Fig. 1 is a flowchart of the PDF file comparison method of Embodiment 1 of the present invention. As shown in Fig. 1, the method comprises:
110: extracting one remaining paragraph from a first PDF file as a target paragraph.
Specifically, remaining paragraphs can be extracted from the first PDF file one at a time, in the order in which they originally appear in the file. A remaining paragraph here means a paragraph that is still left after the subsequent steps have removed earlier target paragraphs. The first PDF file may be a PDF bid document.
120: judging whether a remaining paragraph exists within a preset range of a second PDF file; if so, matching within the preset range to obtain the paragraph most similar to the target paragraph, and removing the target paragraph and the most similar paragraph; otherwise, removing the target paragraph.
Specifically, the preset range can be expressed as [F_min, F_max], where F_min and F_max are calculated as follows:
F_min = P_m - Y;
F_max = P_m + Y;
where F_min denotes the lower-limit page number of the preset range in the second PDF file, F_max denotes the upper-limit page number of the preset range in the second PDF file, P_m denotes the page number of the page containing the target paragraph in the first PDF file, and Y is a constant that can generally be set to 3, 5, or the like. In other words, the preset range covers the Y pages before and after page P_m, the page containing the target paragraph. Suppose Y is 3: when the page number P_m is 10, the preset range is pages 7 to 13; when P_m is 1, no pages precede it, so the preset range is pages 1 to 4.
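The page-window calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preset_page_range` and the `total_pages` parameter are hypothetical, and clamping the upper limit to the last page of the second file is an assumption (the text above only describes clamping at the first page).

```python
def preset_page_range(target_page: int, total_pages: int, y: int = 3) -> tuple[int, int]:
    """Return the inclusive page window [F_min, F_max] in the second PDF file.

    target_page is P_m (1-based), y is the constant Y, and total_pages is the
    page count of the second file (a hypothetical parameter used for clamping).
    """
    if target_page < 1 or total_pages < 1 or y < 0:
        raise ValueError("pages are 1-based and Y must be non-negative")
    f_min = max(1, target_page - y)            # F_min = P_m - Y, clamped to page 1
    f_max = min(total_pages, target_page + y)  # F_max = P_m + Y, clamped to the last page (assumption)
    return f_min, f_max


print(preset_page_range(10, 50))  # (7, 13)
print(preset_page_range(1, 50))   # (1, 4)
```

If the target page lies more than Y pages beyond the end of the second file, the returned lower limit exceeds the upper limit, which a caller can treat as an empty range.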
Matching within the preset range to obtain the paragraph most similar to the target paragraph specifically comprises:
matching the target paragraph against each paragraph in the preset range in turn, to obtain, for each paragraph in the preset range, the quantity of text it has in common with the target paragraph;
taking the paragraph in the preset range that has the largest quantity of text in common with the target paragraph as the most similar paragraph of the target paragraph.
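The selection rule above can be illustrated as follows. The patent does not say how the quantity of identical text is counted, so this sketch assumes one plausible metric: the total length of the matching blocks reported by Python's standard `difflib.SequenceMatcher`. The function names `same_text_count` and `most_similar` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence, Tuple


def same_text_count(a: str, b: str) -> int:
    """Count the characters a and b share in matching blocks (an assumed metric)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def most_similar(target: str, candidates: Sequence[str]) -> Optional[Tuple[int, int]]:
    """Return (index, shared count) of the best candidate, or None if there are none."""
    if not candidates:
        return None
    counts = [same_text_count(target, c) for c in candidates]
    best = max(range(len(candidates)), key=counts.__getitem__)
    return best, counts[best]
```

Ties are resolved in favor of the earliest candidate, which is a design choice of this sketch and not something the text specifies.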
130: judging whether the page containing the target paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the target paragraph into a computer presentation carrier and then marking the identical text in it.
Specifically, the computer presentation carrier in this step is any form in which a computer can present the comparison result, such as a picture or a document. Preferably, the computer presentation carrier is a picture. The picture format is not fixed and may be .png, .bmp, .jpg, or the like. Using a picture makes it convenient to mark identical text and can improve loading speed when the comparison result is shown to the user. When the identical text of the target paragraph and the most similar paragraph is marked in the computer presentation carrier converted from the page containing the target paragraph, it can be marked by highlighting, or by a border, a predetermined color, or the like.
140: judging whether the page containing the most similar paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the most similar paragraph into a computer presentation carrier and then marking the identical text in it.
Specifically, the format of the computer presentation carrier converted from the page containing the most similar paragraph is generally the same as the format of the carrier converted from the page containing the target paragraph. When the identical text of the target paragraph and the most similar paragraph is marked in the carrier converted from the page containing the most similar paragraph, it can likewise be marked by highlighting, a border, a predetermined color, or the like, generally in the same way as in the carrier converted from the page containing the target paragraph.
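Steps 130 and 140 share one pattern: convert a page only the first time it is needed, then add marks to the already converted carrier. A minimal sketch of that pattern follows. The `Carrier` class is a stand-in for a rendered page image, and `CarrierCache` and its `render` callback are hypothetical names; real rendering and highlighting would depend on whichever PDF and imaging libraries are used.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Carrier:
    """Stand-in for a rendered page (e.g. a picture) plus the text marked on it."""
    page: int
    marks: List[str] = field(default_factory=list)


class CarrierCache:
    """Converts each page at most once and accumulates marks on it."""

    def __init__(self, render: Callable[[int], Carrier] = Carrier):
        self._render = render                  # hypothetical page-to-carrier converter
        self._carriers: Dict[int, Carrier] = {}
        self.render_calls = 0

    def mark(self, page: int, same_text: str) -> Carrier:
        carrier = self._carriers.get(page)
        if carrier is None:                    # page not yet converted: convert it now
            carrier = self._render(page)
            self._carriers[page] = carrier
            self.render_calls += 1
        carrier.marks.append(same_text)        # record the identical text on the carrier
        return carrier
```

One cache per PDF file would be used, so that marks for the first and second files end up on separate carriers.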
150: judging whether a remaining paragraph exists in the first PDF file; if so, performing step 110; otherwise, outputting the computer presentation carriers corresponding to the first PDF file and the second PDF file.
Specifically, when no remaining paragraph exists in the first PDF file, every paragraph of the first PDF file has been compared with the second PDF file and the comparison is complete. The comparison result can then be output on a display screen, that is, the computer presentation carriers corresponding to the first PDF file and the second PDF file are output. Control buttons can be shown at the bottom of the display screen so that the user can select and view the computer presentation carriers corresponding to different pages.
The method of this embodiment automatically compares two PDF files paragraph by paragraph, converts the pages containing paragraphs with identical text into computer presentation carriers, marks in those carriers the text that is identical in the two PDF files, and then outputs the computer presentation carriers corresponding to the two PDF files, which significantly improves the efficiency and accuracy of PDF file comparison.
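The loop formed by steps 110 to 150 can be sketched end to end as below. This is an illustrative reading, not the patented implementation: paragraphs are assumed to be already extracted as (page, text) pairs, identical text is found with `difflib.SequenceMatcher` (an assumed metric), and each carrier is represented simply as the list of text fragments marked on a page. The names `Paragraph`, `shared_blocks`, and `compare` are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Paragraph:
    page: int   # 1-based page number
    text: str


def shared_blocks(a: str, b: str) -> List[str]:
    """Return the text fragments of a that also occur, in order, in b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [a[blk.a:blk.a + blk.size] for blk in matcher.get_matching_blocks() if blk.size]


def compare(first: List[Paragraph], second: List[Paragraph],
            y: int = 3) -> Tuple[Dict[int, List[str]], Dict[int, List[str]]]:
    """Return per-page marks for the first and the second file."""
    remaining_second = list(second)
    marks_first: Dict[int, List[str]] = {}
    marks_second: Dict[int, List[str]] = {}
    for target in first:  # steps 110 and 150: take remaining paragraphs in order
        # Step 120: candidates are remaining paragraphs within Y pages of the target.
        window = [p for p in remaining_second if abs(p.page - target.page) <= y]
        if not window:
            continue  # no candidate in range: the target is simply removed
        best = max(window,
                   key=lambda p: sum(len(s) for s in shared_blocks(target.text, p.text)))
        remaining_second.remove(best)
        same = shared_blocks(target.text, best.text)
        marks_first.setdefault(target.page, []).extend(same)   # step 130
        marks_second.setdefault(best.page, []).extend(same)    # step 140
    return marks_first, marks_second
```

Because each matched paragraph is removed from the second file's remaining list, a paragraph in the second file is paired with at most one target paragraph.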
Embodiment 2
Fig. 2 is a flowchart of the PDF file comparison method of Embodiment 2 of the present invention. As shown in Fig. 2, the method is substantially the same as that of Embodiment 1, and differs in the following:
Before step 150, the method further comprises:
210: judging whether a remaining paragraph exists in the target page, that is, the page of the first PDF file containing the target paragraph; if so, performing step 150; otherwise, outputting the computer presentation carrier corresponding to the target page.
With this step, the pages of the first PDF file whose comparison has already been completed can be shown to the user while the two PDF files are still being compared, so that the user can get a rough idea of how similar the two PDF files are before the comparison finishes.
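The early-output check of step 210 can be sketched as a generator that reports a page as soon as its last paragraph has been processed. This is a simplified illustration: the input is assumed to be the page number of each paragraph of the first file in processing order, and the function name `finished_pages` is hypothetical.

```python
from typing import Iterator, List


def finished_pages(paragraph_pages: List[int]) -> Iterator[int]:
    """Yield each page number once no unprocessed paragraph remains on that page."""
    for i, page in enumerate(paragraph_pages):
        # Steps 110-140 for paragraph i would run here.
        remaining_on_page = page in paragraph_pages[i + 1:]
        if not remaining_on_page:
            yield page  # step 210: this page's carrier can be shown to the user now


print(list(finished_pages([1, 1, 2, 3, 3, 3])))  # [1, 2, 3]
```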
In addition, in this embodiment, the method further comprises, after step 150, step 220: outputting a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file.
The similarity value L is calculated as follows:
L = S/(A+B-S);
where S denotes the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file, A denotes the quantity of text in the first PDF file, and B denotes the quantity of text in the second PDF file.
Outputting the similarity value of the first PDF file and the second PDF file makes the degree to which the two PDF files are the same clearer and more intuitive.
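The formula L = S/(A+B-S) has the same shape as a Jaccard index: shared text divided by the total distinct text of both files. A small worked sketch follows. The function name `similarity_value`, the range check on S, and the choice to return 0.0 for two empty files are assumptions of this sketch and are not stated in the text.

```python
def similarity_value(s: int, a: int, b: int) -> float:
    """Compute L = S / (A + B - S) for marked count S and text counts A and B."""
    if min(s, a, b) < 0 or s > min(a, b):
        # Sanity check (assumption): shared text cannot exceed either file's text.
        raise ValueError("expected 0 <= S <= min(A, B)")
    denominator = a + b - s
    return s / denominator if denominator else 0.0  # two empty files: 0.0 by convention


print(similarity_value(100, 100, 100))  # 1.0
print(similarity_value(0, 10, 20))      # 0.0
```

For example, two files of 100 characters each with 50 characters marked as identical give L = 50/150, roughly 0.33.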
Embodiment 3
Fig. 3 is a schematic diagram of the module structure of the PDF file comparison system of Embodiment 3 of the present invention. As shown in Fig. 3, the system 300 comprises: an extraction unit 310, a first judging unit 320, a second judging unit 330, a third judging unit 340, and a fourth judging unit 350.
The extraction unit 310 is configured to extract one remaining paragraph from the first PDF file as a target paragraph.
The first judging unit 320 is configured to judge whether a remaining paragraph exists within a preset range of the second PDF file; if so, to match within the preset range to obtain the paragraph most similar to the target paragraph, and to remove the target paragraph and the most similar paragraph; otherwise, to remove the target paragraph.
The second judging unit 330 is configured to judge whether the page containing the target paragraph has already been converted into a computer presentation carrier; if so, to mark, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the target paragraph into a computer presentation carrier and then mark the identical text in it.
The third judging unit 340 is configured to judge whether the page containing the most similar paragraph has already been converted into a computer presentation carrier; if so, to mark, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the most similar paragraph into a computer presentation carrier and then mark the identical text in it.
The fourth judging unit 350 is configured to judge whether a remaining paragraph exists in the first PDF file; if so, to notify the extraction unit 310 to extract one remaining paragraph from the first PDF file as the target paragraph; otherwise, to output the computer presentation carriers corresponding to the first PDF file and the second PDF file.
Embodiment 4
Fig. 4 is a schematic diagram of the module structure of the PDF file comparison system of Embodiment 4 of the present invention. As shown in Fig. 4, the system of this embodiment is substantially the same as that of Embodiment 3, and differs in that the system 300 of this embodiment further comprises: an intermediate output unit 410 and a similarity unit 420.
The intermediate output unit 410 is configured to judge whether a remaining paragraph exists in the target page, that is, the page of the first PDF file containing the target paragraph, and, if none exists, to output the computer presentation carrier corresponding to the target page.
The similarity unit 420 is configured to output a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file.
The PDF file comparison method and system of the embodiments of the present invention automatically compare two PDF files paragraph by paragraph, convert the pages containing paragraphs with identical text into computer presentation carriers, mark in those carriers the text that is identical in the two PDF files, and then output the computer presentation carriers corresponding to the two PDF files, which significantly improves the efficiency and accuracy of PDF file comparison. In addition, the method and system can output comparison results before the comparison is fully complete and, once it is complete, output the similarity value of the two PDF files, so that the user can view the comparison result sooner and more clearly.
The above embodiments only illustrate the present invention and do not limit it. A person of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the present invention, so all equivalent technical solutions also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims (10)

1. A Portable Document Format (PDF) file comparison method, characterized by comprising:
110: extracting one remaining paragraph from a first PDF file as a target paragraph;
120: judging whether a remaining paragraph exists within a preset range of a second PDF file; if so, matching within the preset range to obtain the paragraph most similar to the target paragraph, and removing the target paragraph and the most similar paragraph; otherwise, removing the target paragraph;
130: judging whether the page containing the target paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the target paragraph into a computer presentation carrier and then marking the identical text in it;
140: judging whether the page containing the most similar paragraph has already been converted into a computer presentation carrier; if so, marking, in the computer presentation carrier converted from that page, the text that is identical between the target paragraph and the most similar paragraph; otherwise, converting the page containing the most similar paragraph into a computer presentation carrier and then marking the identical text in it;
150: judging whether a remaining paragraph exists in the first PDF file; if so, performing step 110; otherwise, outputting the computer presentation carriers corresponding to the first PDF file and the second PDF file.
2. The method of claim 1, characterized in that the method further comprises: outputting a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file.
3. The method of claim 2, characterized in that the similarity value L is calculated as follows:
L = S/(A+B-S);
where S denotes the quantity of identical text marked in the computer presentation carriers corresponding to the first PDF file, A denotes the quantity of text in the first PDF file, and B denotes the quantity of text in the second PDF file.
4. The method of claim 1, characterized in that the preset range is [F_min, F_max], where F_min and F_max are calculated as follows:
F_min = P_m - Y;
F_max = P_m + Y;
where F_min denotes the lower-limit page number of the preset range in the second PDF file, F_max denotes the upper-limit page number of the preset range in the second PDF file, P_m denotes the page number of the page containing the target paragraph in the first PDF file, and Y is a constant.
5. The method of claim 4, characterized in that Y equals 3 or 5.
6. The method of claim 1, characterized in that matching within the preset range to obtain the paragraph most similar to the target paragraph specifically comprises:
matching the target paragraph against each paragraph in the preset range in turn, to obtain, for each paragraph in the preset range, the quantity of text it has in common with the target paragraph;
taking the paragraph in the preset range that has the largest quantity of text in common with the target paragraph as the most similar paragraph of the target paragraph.
7. The method of claim 1, characterized by further comprising, before step 150:
judging whether a remaining paragraph exists in the target page, that is, the page of the first PDF file containing the target paragraph; if so, performing step 150; otherwise, outputting the computer presentation carrier corresponding to the target page.
8. A PDF file comparison system, characterized by comprising:
an extraction unit, configured to extract a residual paragraph from a first PDF file as a target paragraph;
a first judging unit, configured to judge whether a residual paragraph exists within a preset range in a second PDF file; if so, to match within the preset range to obtain the paragraph most similar to the target paragraph, and to remove the target paragraph and the most similar paragraph; otherwise, to remove the target paragraph;
a second judging unit, configured to judge whether the page containing the target paragraph has been converted into a computer representing carrier; if so, to mark, in the computer representing carrier converted from the page containing the target paragraph, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the target paragraph into a computer representing carrier and to mark, in that carrier, the text that is identical between the target paragraph and the most similar paragraph;
a third judging unit, configured to judge whether the page containing the most similar paragraph has been converted into a computer representing carrier; if so, to mark, in the computer representing carrier converted from the page containing the most similar paragraph, the text that is identical between the target paragraph and the most similar paragraph; otherwise, to convert the page containing the most similar paragraph into a computer representing carrier and to mark, in that carrier, the text that is identical between the target paragraph and the most similar paragraph;
a fourth judging unit, configured to judge whether a residual paragraph exists in the first PDF file; if so, to notify the extraction unit to extract a residual paragraph from the first PDF file as the target paragraph; otherwise, to output the computer representing carriers corresponding to the first PDF file and the second PDF file.
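The interaction of the units in claim 8 can be sketched as a single loop. This is an illustration under several assumptions rather than the claimed implementation: paragraphs are modelled as `(page_number, text)` tuples, a "computer representing carrier" is modelled as a per-page list of marked spans, identical text is found with `difflib.SequenceMatcher`, and all names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

Paragraph = Tuple[int, str]                 # (page number, paragraph text)
Marks = List[Tuple[str, List[Tuple[int, int]]]]  # (text, [(start, size), ...])


def matching_blocks(a: str, b: str):
    """Non-empty matching blocks between a and b (assumed notion of identical text)."""
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    return [m for m in blocks if m.size > 0]


def compare(first: List[Paragraph], second: List[Paragraph], y: int = 3):
    remaining = list(range(len(second)))    # residual paragraphs of the second file
    carrier_first: Dict[int, Marks] = {}    # page -> marks, created on first use
    carrier_second: Dict[int, Marks] = {}
    for page, target in first:              # extraction unit: next residual paragraph
        in_range = [i for i in remaining if page - y <= second[i][0] <= page + y]
        if not in_range:
            continue                        # first judging unit: drop the target only
        best = max(in_range,
                   key=lambda i: sum(m.size for m in matching_blocks(target, second[i][1])))
        remaining.remove(best)              # remove the most similar paragraph as well
        other_page, other = second[best]
        blocks = matching_blocks(target, other)
        # Second and third judging units: convert the page if needed, then mark.
        carrier_first.setdefault(page, []).append(
            (target, [(m.a, m.size) for m in blocks]))
        carrier_second.setdefault(other_page, []).append(
            (other, [(m.b, m.size) for m in blocks]))
    # Fourth judging unit: no residual paragraphs left, output both carriers.
    return carrier_first, carrier_second
```

The `setdefault` calls stand in for the "has this page already been converted" check: a page's carrier is created the first time a paragraph on that page is marked and reused afterwards.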
9. The system of claim 8, characterized in that the system further comprises: a similarity unit, configured to output a similarity value of the first PDF file and the second PDF file according to the quantity of identical text marked in the computer representing carrier corresponding to the first PDF file.
10. The system of claim 8, characterized in that the system further comprises: an intermediate output unit, configured to judge whether a residual paragraph exists in the target page containing the target paragraph in the first PDF file and, if none exists, to output the computer representing carrier corresponding to the target page.
CN2013103299006A 2013-07-31 2013-07-31 PDF (Portable document format) file comparison method and system Pending CN103412905A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013103299006A CN103412905A (en) 2013-07-31 2013-07-31 PDF (Portable document format) file comparison method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013103299006A CN103412905A (en) 2013-07-31 2013-07-31 PDF (Portable document format) file comparison method and system

Publications (1)

Publication Number Publication Date
CN103412905A true CN103412905A (en) 2013-11-27

Family

ID=49605917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013103299006A Pending CN103412905A (en) 2013-07-31 2013-07-31 PDF (Portable document format) file comparison method and system

Country Status (1)

Country Link
CN (1) CN103412905A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070294610A1 (en) * 2006-06-02 2007-12-20 Ching Phillip W System and method for identifying similar portions in documents
KR100788440B1 (en) * 2006-06-29 2007-12-24 중앙대학교 산학협력단 A document copy detection system based on plagiarism patterns
CN101404037A (en) * 2008-11-18 2009-04-08 西安交通大学 Method for detecting and positioning electronic text contents plagiary
CN102915295A (en) * 2011-03-31 2013-02-06 百度在线网络技术(北京)有限公司 Document detecting method and document detecting device
CN103049467A (en) * 2011-10-12 2013-04-17 杨纯青 Chinese digital anti-plagiarism detection and comparison system and method

Similar Documents

Publication Publication Date Title
CN108734110B (en) Text paragraph identification and comparison method and system based on longest public subsequence
CN104636428B (en) A kind of trade mark recommends method and device
CN103838875A (en) Information collecting system based on two-dimensional bar code and method of information collecting system
CN105139041A (en) Method and device for recognizing languages based on image
CN103279846A (en) Project acceptance method and system based on BIM model
CN105320734A (en) Web page core content extraction method
CN104123608A (en) Method and device for creating accounting records
CN110765739A (en) Method for extracting table data and chapter structure from PDF document
CN112416331A (en) Page adaptation method and device, electronic equipment and computer readable storage medium
CN103186880B (en) Generate the method and apparatus of thumbnail
CN103258021B (en) The character terminal characteristic extracting method that a kind of Behavior-based control is analyzed
CN113627132B (en) Data deduplication marking code generation method, system, electronic equipment and storage medium
CN103412904A (en) PDF (portable document format) file comparison method and PDF file comparison system
CN106354731A (en) Document inspection method and device
CN110598623B (en) Method and device for cutting and extracting picture, computer equipment and storage medium
CN103412905A (en) PDF (Portable document format) file comparison method and system
JP2011123825A (en) Character recognition method, character recognition device, and character recognition program
CN102200966A (en) Method for extracting and processing layout information
CN107861931B (en) Template file processing method and device, computer equipment and storage medium
CN106406560A (en) Method and system for outputting vector fonts of mechanical engineering characters in desktop operation system
CN104268496A (en) Handheld terminal recognizing system and method for eliminating out-dated energy-consuming devices or products
CN105302776A (en) Data proofreading platform server
CN104091353A (en) Method for extracting image color labels
CN205139915U (en) Calculator for express fee
CN103699482A (en) Method and device for testing reasonableness of controls

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20131127