CN111783384A - Method, device, server and storage medium for determining position on PDF document - Google Patents

Info

Publication number
CN111783384A
Authority
CN
China
Prior art keywords
coordinate
pdf document
page
determining
document page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010606221.9A
Other languages
Chinese (zh)
Other versions
CN111783384B (en)
Inventor
霍筱宁
苏洲
席桐鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinmao Digital Technology Co ltd
Original Assignee
Jinmao Investment Management Tianjin Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinmao Investment Management Tianjin Co ltd
Priority to CN202010606221.9A
Publication of CN111783384A
Application granted
Publication of CN111783384B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • G06F40/106Display of layout of documents; Previewing
    • G06F40/117Tagging; Marking up; Designating a block; Setting of attributes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application discloses a method for determining a position on a PDF document, which comprises the following steps: receiving a PDF document, and determining a general coordinate of a certain position of a page of the PDF document; acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment, and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm; and acquiring the general coordinate corresponding to the first coordinate, and determining a second coordinate in a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm. The method enables a position on the PDF document to be determined and obtained consistently across the different analysis environments of different terminals, operating systems and application program interfaces.

Description

Method, device, server and storage medium for determining position on PDF document
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a server, and a storage medium for determining a location on a PDF document.
Background
With the continuous development of electronic commerce and electronic government technology, various common official documents, contract documents, industry reports and the like are generally displayed in a PDF format. When a user processes various PDF documents, the key content at the key position needs to be labeled and confirmed. In addition, when multi-party operation is involved or electronic signatures are performed by two parties of a contract, multi-party cooperative operation needs to be performed under different terminals, operating systems and application program interfaces. Therefore, the position needs to be determined on the PDF document under different parsing environments, and the operation on the PDF document is guaranteed to have consistency.
In the prior art, the ISO 32000-1:2008 standard defines for PDF a device-independent coordinate system that always has the same relationship to the current page, regardless of the output device on which the page is printed or displayed. This device-independent coordinate system is called User Space. The User Space of each page of the document is initialized to a default state. The CropBox entry in the page dictionary specifies the rectangular area that is visible on the output medium (i.e., the display window or the printed page). Typically, the positive x-axis extends horizontally to the right and the positive y-axis extends vertically upward (this may be changed by the Rotate entry in the page dictionary). The unit length along the x and y axes is set by the UserUnit entry in the page dictionary; if the entry is not present or not supported, a default value of 1/72 inch is used. This coordinate system is referred to as the default user coordinate system.
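The unit rule described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the UserUnit value acts as a multiplier on the default unit of 1/72 inch; the function names are hypothetical and chosen only for this illustration.

```python
POINTS_PER_INCH = 72.0  # default user space unit is 1/72 inch
MM_PER_INCH = 25.4


def user_units_to_inches(length: float, user_unit: float = 1.0) -> float:
    """Convert a length in user space units to inches.

    user_unit is the page's UserUnit multiplier (in units of 1/72 inch);
    it is taken as 1.0 when the entry is absent or unsupported.
    """
    return length * user_unit / POINTS_PER_INCH


def user_units_to_mm(length: float, user_unit: float = 1.0) -> float:
    """Convert a length in user space units to millimetres."""
    return user_units_to_inches(length, user_unit) * MM_PER_INCH


# A page 612 units wide and 792 units high with the default unit
# measures 8.5 x 11 inches.
print(user_units_to_inches(612), user_units_to_inches(792))
```

The physical size of a page therefore follows from its User Space extent alone, independent of any display resolution.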
However, the user space coordinate system is an absolute coordinate system that presupposes ideal conditions (e.g., all users' display devices have the same resolution and show the page in a maximized window), and the User Space in the standard does not provide a way to obtain coordinate positions in that coordinate system. Factors such as the resolution of each user's display device and whether the window is scaled affect the positioning and use of the coordinate system.
In practice, the following two methods are generally used for determining a position on a PDF document: first, dividing the PDF text into a nine-square (3 × 3) grid within a system, where each system defines the grid differently, and then acquiring approximate position information according to the self-defined grid; and second, placing invisible character marks on the PDF and using them for position location. In both methods, everything from acquiring the position information of the PDF text to using that information is completed within a single system, so cross-platform operation cannot be realized. In addition, positioning by a nine-square grid or by character marks yields only approximate positions, so the precision is low.
Therefore, it is desirable to provide a method for determining a position on a PDF document, so that positions on the PDF document can be determined and obtained consistently across the different analysis environments of different terminals, operating systems and application program interfaces.
Disclosure of Invention
The embodiment of the application provides a method for determining a position on a PDF document. Specifically, the method for determining the position on the PDF document comprises the following steps:
receiving a PDF document, and determining a general coordinate of a certain position of a page of the PDF document;
acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment, and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
and acquiring a general coordinate corresponding to the first coordinate, and determining a second coordinate in a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
Further, in a preferred embodiment provided by the present application, receiving a PDF document, and determining a general coordinate of a certain position of a page of the PDF document specifically includes:
receiving a PDF document, and selecting a certain page of the PDF document;
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page on the selected PDF document page, determining the range of the selected PDF document page, and establishing a coordinate plane;
defining an origin, an x axis, a y axis and a unit length of a Cartesian coordinate system on the coordinate plane, wherein the unit length of the Cartesian coordinate system is determined according to the maximum horizontal width or the maximum vertical height of the selected PDF document page range and according to N equal divisions, wherein N is a positive integer;
and determining the coordinate value of a certain position of the selected PDF document page according to the Cartesian coordinate system.
Further, in a preferred embodiment provided in the present application, in the selected PDF document page, the maximum horizontal width or the maximum vertical height of the selected PDF document page is obtained, the range of the selected PDF document page is determined, and a coordinate plane is established, which specifically includes:
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page range according to the User Space and the maximum values of the x axis and the y axis of the display component CropBox, and determining the selected PDF document page range;
and establishing a coordinate plane in the selected PDF document page range.
Further, in a preferred embodiment provided by the present application, on the coordinate plane, an origin, an x-axis, a y-axis, and a unit length of a cartesian coordinate system are defined, which specifically include:
according to the User Space and the display component CropBox, defining the lower left corner of the User Space as the origin of the Cartesian coordinate system, defining the x axis as a horizontal axis and the y axis as a longitudinal axis;
the Unit length of the cartesian coordinate system is defined according to a User Unit, which is set by default to 1/72 inch.
Further, in a preferred embodiment provided by the present application, determining a coordinate value of a certain position of the selected PDF document page according to the cartesian coordinate system specifically includes:
acquiring the page number of the selected PDF document page;
and determining the coordinate value of a certain position on the PDF document according to the page number of the selected PDF document page and the coordinate value of the certain position of the selected PDF document page.
Further, in a preferred embodiment provided in the present application, the first mapping algorithm is:
[Formula images BDA0002559190960000031 and BDA0002559190960000032 (first mapping algorithm) are not reproduced in this text.]
wherein (x1, y1) is the first coordinate of a certain position of the PDF document page in the first analysis environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X1a, X1b] and [Y1a, Y1b] are the horizontal coordinate interval and the vertical coordinate interval of the PDF document page in the first analysis environment, respectively.
Further, in a preferred embodiment provided in the present application, the second mapping algorithm is:
[Formula images BDA0002559190960000041 and BDA0002559190960000042 (second mapping algorithm) are not reproduced in this text.]
wherein (x2, y2) is the second coordinate of a certain position of the PDF document page in the second analysis environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X2a, X2b] and [Y2a, Y2b] are the horizontal coordinate interval and the vertical coordinate interval of the PDF document page in the second analysis environment, respectively.
An embodiment of the present application further provides a device for determining a location on a PDF document, including:
the universal coordinate determination module is used for receiving the PDF document and determining a universal coordinate of a certain position of a page of the PDF document;
the first coordinate conversion module is used for acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
and the second coordinate conversion module is used for acquiring the general coordinate corresponding to the first coordinate and determining a second coordinate under a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
An embodiment of the present application further provides a server, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the computer program to implement the steps of the method for determining a position on a PDF document according to any one of claims 1 to 7.
Embodiments of the present application further provide a computer-readable storage medium, which stores a computer program, and the computer program is executed by a processor to implement the steps of the method for determining a position on a PDF document according to any one of claims 1 to 7.
The method for determining the position on the PDF document provided by the embodiment of the application enables a position on the PDF document to be determined and obtained consistently across the different analysis environments of different terminals, operating systems and application program interfaces.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a method for determining a location on a PDF document according to an embodiment of the present application;
FIG. 2 is a diagram illustrating a Cartesian coordinate system defined according to User Space and a display component CropBox according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an apparatus for determining a position on a PDF document according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical scheme of the embodiment of the invention relates to a method for determining a position on a PDF document.
Referring to fig. 1, a method for determining a position on a PDF document provided in the embodiment of the present application specifically includes the following steps:
S100: receiving the PDF document, and determining the universal coordinate of a certain position of the page of the PDF document.
The universal coordinates in the embodiment of the application are absolute coordinates of the PDF document established according to a custom Cartesian coordinate system. The coordinate system is established on the basis of the PDF document itself: the coordinates of any point in the document are determined by the distance of the point from the horizontal and vertical boundaries of the page, and do not change with the analysis environment, such as the terminal, operating system or application program interface. The PDF document may be uploaded by a user through a front-end application program and received by the server back end, or it may be transmitted by other users over a network; neither the manner in which the PDF document is transmitted and received nor the determination of the universal coordinate is affected by the terminal, operating system or application program interface.
Further, in a preferred embodiment provided by the present application, receiving a PDF document, and determining a general coordinate of a certain position of a page of the PDF document specifically includes:
receiving a PDF document, and selecting a certain page of the PDF document;
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page on the selected PDF document page, determining the range of the selected PDF document page, and establishing a coordinate plane;
defining an origin, an x axis, a y axis and a unit length of a Cartesian coordinate system on the coordinate plane, wherein the unit length of the Cartesian coordinate system is determined according to the maximum horizontal width or the maximum vertical height of the selected PDF document page range and according to N equal divisions, wherein N is a positive integer;
and determining the coordinate value of a certain position of the selected PDF document page according to the Cartesian coordinate system.
In the embodiment of the application, the universal coordinate of a certain position of a PDF document page is determined separately for each page of the PDF document. Therefore, the coordinate plane and coordinate system of each page should be initialized. Under the current page, a coordinate plane is established according to the page range of the PDF document, and the unit length of the Cartesian coordinate system is determined as one of N equal parts of the maximum horizontal width or the maximum vertical height of the page range. The coordinate value of each coordinate point therefore depends only on its relative position within the whole page range, so an absolute coordinate system can be established in the page range of the PDF document independently of the analysis environment, such as the terminal, operating system or application program interface.
The coordinate origin can be selected at any point in the current page range of the PDF document. The lower left corner of the current page range is usually selected as the coordinate origin, so that the maximum horizontal width and maximum vertical height of the current page range are exactly equal to the maximum coordinate values of the page range in the Cartesian coordinate system, which facilitates calculation and conversion. The establishment of the Cartesian coordinate system and the definition of the origin, x-axis and y-axis are not limited to the above manner, nor even to a planar rectangular coordinate system. An oblique coordinate system or another type of Cartesian coordinate system may also be established according to different needs.
It should be particularly noted that the unit length is determined by dividing the maximum horizontal width or the maximum vertical height of the page range of the PDF document into N equal parts, where N is a positive integer; the larger the value of N, the finer the resolution of the coordinate system. The value of N is generally set by comprehensive evaluation of the display environment and the operation accuracy required in the PDF document job. The value of N should be large enough to ensure the required coordinate precision, but it cannot be arbitrarily large, because it is limited by the software and hardware environments of the devices and operating systems.
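The N-equal-division idea can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment's implementation: the function name grid_coordinate and the sample sizes are hypothetical, and the origin is assumed to lie at the lower left corner of the page range.

```python
def grid_coordinate(offset: float, page_extent: float, n: int) -> float:
    """Express an offset from the origin in units of page_extent / n.

    offset      -- distance of the point from the origin along one axis
    page_extent -- maximum width (or height) of the page range, in the
                   same units as offset
    n           -- number of equal divisions N (positive integer)
    """
    if n <= 0:
        raise ValueError("N must be a positive integer")
    if page_extent <= 0:
        raise ValueError("page extent must be positive")
    return offset * n / page_extent


# The same relative position (a quarter of the way across the page) is
# displayed 600 units wide in one environment and 1200 wide in another.
N = 1000
print(grid_coordinate(150, 600, N), grid_coordinate(300, 1200, N))
```

Both calls yield the same value because the coordinate depends only on the point's position relative to the page extent, which is the property the paragraph above relies on.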
Further, in a preferred embodiment provided in the present application, in the selected PDF document page, the maximum horizontal width or the maximum vertical height of the selected PDF document page is obtained, the range of the selected PDF document page is determined, and a coordinate plane is established, which specifically includes:
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page range according to the User Space and the maximum values of the x axis and the y axis of the display component CropBox, and determining the selected PDF document page range;
and establishing a coordinate plane in the selected PDF document page range.
Using the User Space and the display component CropBox defined in the ISO 32000-1:2008 standard to determine the page range of the PDF document and to establish the coordinate plane makes it convenient to interface with existing systems: existing programming languages and function libraries can be called directly, or an independent call interface can be assembled from them, so that the related information can be acquired quickly, flexibly and conveniently.
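A minimal sketch of deriving the page range from a CropBox rectangle is given below. It assumes the rectangle is available as lower-left and upper-right corner values in User Space units; in a real system these values would be read from the page dictionary through a PDF library. The Rect type and the function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """A rectangle given by its lower-left and upper-right corners."""
    llx: float
    lly: float
    urx: float
    ury: float


def page_range(crop_box: Rect) -> Tuple[float, float]:
    """Return (maximum horizontal width, maximum vertical height)."""
    width = crop_box.urx - crop_box.llx
    height = crop_box.ury - crop_box.lly
    if width <= 0 or height <= 0:
        raise ValueError("CropBox must have positive width and height")
    return width, height


def in_page_range(x: float, y: float, crop_box: Rect) -> bool:
    """Check whether a point lies inside the page range."""
    return (crop_box.llx <= x <= crop_box.urx
            and crop_box.lly <= y <= crop_box.ury)


crop_box = Rect(0, 0, 320, 640)  # illustrative values only
print(page_range(crop_box))
```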
Further, in a preferred embodiment provided by the present application, on the coordinate plane, an origin, an x-axis, a y-axis, and a unit length of a cartesian coordinate system are defined, which specifically include:
according to the User Space and the display component CropBox, defining the lower left corner of the User Space as the origin of the Cartesian coordinate system, defining the x axis as a horizontal axis and the y axis as a longitudinal axis;
the Unit length of the cartesian coordinate system is defined according to a User Unit, which is set by default to 1/72 inch.
A coordinate plane is established according to the User Space and the display component CropBox, and a Cartesian coordinate system is defined. Referring to FIG. 2, the lower left corner of the User Space is defined as the origin of the Cartesian coordinate system, the User Unit of 1/72 inch is defined as the unit length of the Cartesian coordinate system, and the maximum values of the x axis and the y axis of the display component CropBox are the maximum horizontal width and the maximum vertical height of the PDF document page. Specifically, the format of the textPosition parameter is defined as {recPara, x×y}, wherein recPara represents the numerical values of the x axis and the y axis of the CropBox, and x×y represents the specific location coordinates.
For example, {"320×640","189×230"} indicates the position x = 189, y = 230 in a CropBox whose x axis is 320 and whose y axis is 640.
Using the User Space and the display component CropBox defined in the ISO 32000-1:2008 standard to define the origin, the x axis and the y axis of the Cartesian coordinate system, and defining the unit length through the User Unit, makes it convenient to interface with existing systems, allows the related information to be acquired quickly, flexibly and conveniently, and also makes the related coordinate conversion and mapping calculations more concise and efficient.
Further, in a preferred embodiment provided by the present application, determining a coordinate value of a certain position of the selected PDF document page according to the cartesian coordinate system specifically includes:
acquiring the page number of the selected PDF document page;
and determining the coordinate value of a certain position on the PDF document according to the page number of the selected PDF document page and the coordinate value of the certain position of the selected PDF document page.
When an operation involves the contents of a plurality of pages of the PDF document, a coordinate plane and a Cartesian coordinate system are established separately for each page, the coordinate value of a certain position of the current page is determined, the page number of the current page is acquired, and the coordinate value of the position on the PDF document is thereby determined. Specifically, the format of the textPosition parameter is defined as {recPara, p-x×y}, wherein p represents the page number, recPara represents the numerical values of the x axis and the y axis of the CropBox, and x×y represents the specific location coordinates.
For example, {"320×640","1-189×230"} indicates the position x = 189, y = 230 on the first page of the PDF document, in a CropBox whose x axis is 320 and whose y axis is 640.
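The textPosition parameter could be serialized and parsed along the following lines. This is only a sketch: the exact separator and quoting are garbled in this text, so the string layout used here (quoted fields, "×" or "*" between the two numbers, an optional "p-" page prefix) is an assumption, and the TextPosition type and both function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class TextPosition(NamedTuple):
    crop_width: int      # x-axis value of the CropBox (recPara)
    crop_height: int     # y-axis value of the CropBox (recPara)
    page: Optional[int]  # page number p, or None when omitted
    x: int
    y: int


# Assumed layout: {"W×H","p-x×y"}; the "p-" prefix is optional and
# either "×" or "*" is accepted as the separator.
_PATTERN = re.compile(
    r'^\{\s*"(\d+)\s*[×*]\s*(\d+)"\s*,\s*'
    r'"(?:(\d+)-)?(\d+)\s*[×*]\s*(\d+)"\s*\}$'
)


def parse_text_position(text: str) -> TextPosition:
    """Parse a textPosition string into its numeric fields."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a textPosition string: {text!r}")
    w, h, page, x, y = match.groups()
    return TextPosition(int(w), int(h),
                        int(page) if page else None, int(x), int(y))


def format_text_position(pos: TextPosition) -> str:
    """Serialize a TextPosition back into the assumed string layout."""
    point = f"{pos.x}×{pos.y}"
    if pos.page is not None:
        point = f"{pos.page}-{point}"
    return f'{{"{pos.crop_width}×{pos.crop_height}","{point}"}}'


print(parse_text_position('{"320×640","1-189×230"}'))
```

Carrying the CropBox values alongside the point lets a receiver rescale the coordinates without consulting the PDF file again.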
S200: acquiring a first coordinate of a certain position of the PDF document page in a first analysis environment, and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm.
Further, in a preferred embodiment provided in the present application, the first mapping algorithm is:
[Formula images BDA0002559190960000081 and BDA0002559190960000082 (first mapping algorithm) are not reproduced in this text.]
wherein (x1, y1) is the first coordinate of a certain position of the PDF document page in the first analysis environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X1a, X1b] and [Y1a, Y1b] are the horizontal coordinate interval and the vertical coordinate interval of the PDF document page in the first analysis environment, respectively.
The following describes a mapping relationship between a certain position of a PDF document page and a general coordinate in different parsing environments through a specific application scenario.
Specifically, the first parsing environment is a web page end. An area is fixed at the web page end, a PDF document page is displayed in the area, and a transparent canvas is created in the area, the size of the transparent canvas being equal to the display size of the PDF document. Mouse click events in the area are monitored, and when the mouse clicks a certain position of the PDF document page, the pixel coordinates (x1, y1) of that position are recorded, wherein the pixel coordinates of the lower left corner and the upper right corner of the transparent canvas are (X1a, Y1a) and (X1b, Y1b), respectively. Then, according to the first mapping algorithm, the general coordinates (x', y') corresponding to the position of the PDF document clicked by the mouse at the web page end can be calculated.
S300: acquiring the general coordinate corresponding to the first coordinate, and determining a second coordinate in a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
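The formula of the first mapping algorithm is given only as an image that is not reproduced in this text. The sketch below therefore assumes a plain linear interpolation over the coordinate intervals, which is one natural reading of the variable definitions and not necessarily the exact formula of the application. The function name to_general and the general coordinate extents general_width and general_height are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def to_general(x1: float, y1: float,
               x_interval: Interval, y_interval: Interval,
               general_width: float,
               general_height: float) -> Tuple[float, float]:
    """Map a first-environment coordinate to a general coordinate.

    x_interval = (X1a, X1b) and y_interval = (Y1a, Y1b) are the pixel
    coordinates of the page's lower-left and upper-right corners.
    Assumption: the mapping is linear along each axis.
    """
    (xa, xb), (ya, yb) = x_interval, y_interval
    if xa == xb or ya == yb:
        raise ValueError("coordinate intervals must not be empty")
    x = (x1 - xa) / (xb - xa) * general_width
    y = (y1 - ya) / (yb - ya) * general_height
    return x, y


# Browser pixel y grows downward, so the lower-left corner (100, 900)
# has a larger y than the upper-right corner (500, 100).
print(to_general(300, 500, (100, 500), (900, 100), 320, 640))
```

Because the interval endpoints are the lower-left and upper-right corners, the same expression also handles environments whose pixel y axis points downward.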
Further, in a preferred embodiment provided in the present application, the second mapping algorithm is:
[Formula images BDA0002559190960000091 and BDA0002559190960000092 (second mapping algorithm) are not reproduced in this text.]
wherein (x2, y2) is the second coordinate of a certain position of the PDF document page in the second analysis environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X2a, X2b] and [Y2a, Y2b] are the horizontal coordinate interval and the vertical coordinate interval of the PDF document page in the second analysis environment, respectively.
The following describes a mapping relationship between a general coordinate and a certain position of a PDF document page in different parsing environments through a specific application scenario.
Specifically, the second parsing environment is a background server that receives the PDF document and processes it. The web page end transmits to the background server the general coordinates (x', y') corresponding to the position of the PDF document clicked by the mouse, so that the background server can determine, according to the general coordinates (x', y') and the second mapping algorithm, the position (x2, y2) in the background server environment that corresponds to the clicked position (x1, y1).
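As with the first mapping, the second mapping formula is available only as an image that is not reproduced in this text. The following sketch assumes it is a linear mapping from the general coordinate extents onto the second environment's coordinate intervals; the function name from_general and the parameters general_width and general_height are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def from_general(xg: float, yg: float,
                 x_interval: Interval, y_interval: Interval,
                 general_width: float,
                 general_height: float) -> Tuple[float, float]:
    """Map a general coordinate to a second-environment coordinate.

    x_interval = (X2a, X2b) and y_interval = (Y2a, Y2b) are the
    coordinate intervals of the page in the second environment.
    Assumption: the mapping is linear along each axis.
    """
    if general_width <= 0 or general_height <= 0:
        raise ValueError("general coordinate extents must be positive")
    (xa, xb), (ya, yb) = x_interval, y_interval
    x2 = xa + xg / general_width * (xb - xa)
    y2 = ya + yg / general_height * (yb - ya)
    return x2, y2


# A server that works with the page on a 640 x 1280 grid places the
# general coordinate (160, 320) of a 320 x 640 page at its centre.
print(from_general(160, 320, (0, 640), (0, 1280), 320, 640))
```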
By determining the universal coordinates, a position on the PDF document can be determined and obtained consistently across different analysis environments under different terminals, operating systems and application program interfaces. The first analysis environment and the second analysis environment are not limited to a web page end and a background server, and can be other specific application scenarios based on different terminals, operating systems and application program interfaces, for example, a front-end APP interface on a mobile phone and a background server, or an application program interface on a PC and a background server. As a further example, the first parsing environment is user A at a mobile phone terminal, whose operation interface is the Safari browser under the iOS system, and the second analysis environment is user B at a PC terminal, whose operation interface is the IE browser under the Windows system. Users A and B collaboratively mark up a PDF document over a network, and an operation by either party on the PDF document can be synchronously displayed on the terminal interface of the other party. In this application scenario, the universal coordinate and the coordinate mapping and conversion algorithms in the embodiment of the application can conveniently and quickly realize consistent operation in different analysis environments.
An embodiment of the present application further provides a device for determining a location on a PDF document, including:
the general coordinate determination module 310 is configured to receive a PDF document and determine a general coordinate of a certain position of a page of the PDF document;
the first coordinate conversion module 320 is configured to obtain a first coordinate of a certain position of the PDF document page in a first parsing environment, and determine a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
the second coordinate conversion module 330 is configured to obtain the general coordinate corresponding to the first coordinate, and determine a second coordinate in a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
Further, in an embodiment provided herein, the universal coordinate determination module 310 is specifically configured to:
receiving a PDF document, and selecting a certain page of the PDF document;
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page on the selected PDF document page, determining the range of the selected PDF document page, and establishing a coordinate plane;
defining an origin, an x axis, a y axis and a unit length of a Cartesian coordinate system on the coordinate plane, wherein the unit length of the Cartesian coordinate system is determined according to the maximum horizontal width or the maximum vertical height of the selected PDF document page range and according to N equal divisions, wherein N is a positive integer;
and determining the coordinate value of a certain position of the selected PDF document page according to the Cartesian coordinate system.
Further, in an embodiment provided by the present application, the universal coordinate determination module 310 further includes a coordinate plane determination module, specifically configured to:
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page range according to the User Space and the maximum values of the x axis and the y axis of the display component CropBox, and determining the selected PDF document page range;
and establishing a coordinate plane in the selected PDF document page range.
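As a rough sketch of this step, the page range can be derived from a CropBox rectangle expressed in User Space. The function name `page_range` and the returned dictionary layout are assumptions for illustration; the CropBox values are assumed to have been read already by whatever PDF library the parsing environment uses.

```python
from typing import Dict, Sequence, Tuple, Union

Number = Union[int, float]


def page_range(crop_box: Sequence[Number]) -> Dict[str, Union[float, Tuple[float, float]]]:
    """Hypothetical helper: derive the page range from a CropBox given as
    four numbers describing two diagonally opposite corners."""
    if len(crop_box) != 4:
        raise ValueError("CropBox must contain exactly four numbers")
    x0, y0, x1, y1 = (float(v) for v in crop_box)
    # Normalise, because the two corners are not guaranteed to be
    # listed as lower-left followed by upper-right.
    left, right = min(x0, x1), max(x0, x1)
    bottom, top = min(y0, y1), max(y0, y1)
    return {
        "x_interval": (left, right),
        "y_interval": (bottom, top),
        "width": right - left,    # maximum horizontal width
        "height": top - bottom,   # maximum vertical height
    }


letter = page_range([0, 0, 612, 792])
print(letter["width"], letter["height"])  # 612.0 792.0
```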
Further, in an embodiment provided by the present application, the universal coordinate determination module 310 further includes a cartesian coordinate system definition module, specifically configured to:
according to the User Space and the display component CropBox, defining the lower left corner of the User Space as the origin of the Cartesian coordinate system, the x axis as the horizontal axis and the y axis as the vertical axis;
the unit length of the Cartesian coordinate system is defined according to a User Unit, which is set by default to 1/72 inch.
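To make the default unit concrete, the sketch below converts a User Space length into physical units. The function names and the `user_unit` parameter (a multiplier of 1/72 inch, 1.0 by default) are assumptions chosen for this example.

```python
def user_space_to_inches(value: float, user_unit: float = 1.0) -> float:
    """Convert a User Space length to inches.

    `user_unit` is the size of one User Space unit in multiples of
    1/72 inch; the default of 1.0 matches the 1/72 inch default above.
    """
    if user_unit <= 0:
        raise ValueError("user_unit must be positive")
    return value * user_unit / 72.0


def user_space_to_mm(value: float, user_unit: float = 1.0) -> float:
    """Convert a User Space length to millimetres (1 inch = 25.4 mm)."""
    return user_space_to_inches(value, user_unit) * 25.4


print(user_space_to_inches(72.0))  # 1.0
print(user_space_to_mm(72.0))      # 25.4
```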
Further, in an embodiment provided by the present application, the universal coordinate determination module 310 further includes a page and coordinate determination module, specifically configured to:
acquiring the page number of the selected PDF document page;
and determining the coordinate value of a certain position on the PDF document according to the page number of the selected PDF document page and the coordinate value of the certain position of the selected PDF document page.
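One possible way to represent such a document-level position is a page number paired with the in-page coordinate. The sketch below is illustrative only; the type `DocumentPosition`, the helper `locate`, the 1-based page numbering and the JSON serialisation are assumptions, not part of the described method.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class DocumentPosition:
    """Hypothetical record: a position on the whole PDF document."""

    page: int   # page number of the selected page (assumed 1-based)
    x: float    # x coordinate on that page, in the page coordinate system
    y: float    # y coordinate on that page, in the page coordinate system


def locate(page_number: int, page_xy: Tuple[float, float]) -> DocumentPosition:
    """Combine a page number with an in-page coordinate."""
    if page_number < 1:
        raise ValueError("page numbers are assumed to start at 1")
    x, y = page_xy
    return DocumentPosition(page_number, float(x), float(y))


pos = locate(3, (120, 540.5))
# A serialisable form is convenient when the position has to travel
# between two parsing environments.
print(json.dumps(asdict(pos)))  # {"page": 3, "x": 120.0, "y": 540.5}
```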
Further, in an embodiment provided by the present application, the first coordinate conversion module 320 obtains a first coordinate of a certain position of the PDF document page in a first parsing environment and determines the general coordinate corresponding to the first coordinate according to a preset first mapping algorithm, which is specifically:
Figure BDA0002559190960000111
Figure BDA0002559190960000112
wherein (x1, y1) is the first coordinate of a certain position of the PDF document page in the first parsing environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X1a, X1b] and [Y1a, Y1b] are, respectively, the horizontal and vertical coordinate intervals of the PDF document page in the first parsing environment.
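The formulas of the first mapping algorithm appear only as images and are not reproduced in this text. The sketch below therefore shows one plausible reading, assuming a simple linear mapping from the intervals [X1a, X1b] and [Y1a, Y1b] onto the intervals of the general coordinate system. The function names, the general-coordinate intervals, and the convention of writing a downward-growing screen axis as a reversed interval are all assumptions.

```python
from typing import Tuple

Interval = Tuple[float, float]


def map_interval(value: float, src: Interval, dst: Interval) -> float:
    """Linearly map `value` from the interval `src` onto the interval `dst`."""
    s0, s1 = src
    d0, d1 = dst
    if s1 == s0:
        raise ValueError("source interval must have non-zero length")
    return d0 + (value - s0) * (d1 - d0) / (s1 - s0)


def first_to_general(
    x1: float,
    y1: float,
    x_interval_1: Interval,   # [X1a, X1b]
    y_interval_1: Interval,   # [Y1a, Y1b]
    general_x: Interval,      # assumed x interval of the general coordinates
    general_y: Interval,      # assumed y interval of the general coordinates
) -> Tuple[float, float]:
    """Assumed linear form of the first mapping: (x1, y1) -> (x', y')."""
    return (
        map_interval(x1, x_interval_1, general_x),
        map_interval(y1, y_interval_1, general_y),
    )


# Example: a web view 800 wide and 1000 high whose y axis grows downward,
# so the page bottom is y = 1000 and the page top is y = 0.
xp, yp = first_to_general(
    400.0, 250.0,
    x_interval_1=(0.0, 800.0),
    y_interval_1=(1000.0, 0.0),
    general_x=(0.0, 612.0),
    general_y=(0.0, 792.0),
)
print(xp, yp)  # 306.0 594.0
```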
Further, in an embodiment provided by the present application, the second coordinate conversion module 330 obtains the general coordinate corresponding to the first coordinate and determines the second coordinate in the second parsing environment corresponding to the first coordinate according to a preset second mapping algorithm, which is specifically:
Figure BDA0002559190960000121
Figure BDA0002559190960000122
wherein (x2, y2) is the second coordinate of a certain position of the PDF document page in the second parsing environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X2a, X2b] and [Y2a, Y2b] are, respectively, the horizontal and vertical coordinate intervals of the PDF document page in the second parsing environment.
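As with the first mapping, the formulas here are images that this text does not reproduce. Assuming the second mapping is the same kind of linear interval mapping, applied from the general coordinate system onto [X2a, X2b] and [Y2a, Y2b], it might look like the sketch below. All names and the example intervals are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def rescale(value: float, src: Interval, dst: Interval) -> float:
    """Linearly map `value` from `src` onto `dst` (assumed mapping form)."""
    s0, s1 = src
    d0, d1 = dst
    if s1 == s0:
        raise ValueError("source interval must have non-zero length")
    return d0 + (value - s0) * (d1 - d0) / (s1 - s0)


def general_to_second(
    xp: float,
    yp: float,
    general_x: Interval,      # assumed x interval of the general coordinates
    general_y: Interval,      # assumed y interval of the general coordinates
    x_interval_2: Interval,   # [X2a, X2b]
    y_interval_2: Interval,   # [Y2a, Y2b]
) -> Tuple[float, float]:
    """Assumed linear form of the second mapping: (x', y') -> (x2, y2)."""
    return (
        rescale(xp, general_x, x_interval_2),
        rescale(yp, general_y, y_interval_2),
    )


# Example: a second environment that renders the same page at twice the
# size of the general coordinate system, with the y axis pointing upward.
x2, y2 = general_to_second(
    306.0, 594.0,
    general_x=(0.0, 612.0),
    general_y=(0.0, 792.0),
    x_interval_2=(0.0, 1224.0),
    y_interval_2=(0.0, 1584.0),
)
print(x2, y2)  # 612.0 1188.0
```

Under this assumption the two mappings are inverses of the same form, so a position survives a round trip through the general coordinates up to floating-point error.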
It should be noted that the information interaction and execution processes between the above devices/units are based on the same concept as the method embodiments of the present application; for their specific functions and technical effects, reference may be made to the method embodiments, which are not repeated here.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides a server. Fig. 4 is a schematic structural diagram of the server provided in the embodiment of the present application. As shown in fig. 4, the server 4 comprises a processor 40, a memory 41 and a computer program 42 stored in the memory 41 and executable on the processor 40, for example a program for determining a position on a PDF document. When executing the computer program 42, the processor 40 implements the steps of the above method embodiment for determining a position on a PDF document, such as steps 100 to 300 shown in fig. 1.
Alternatively, the processor 40, when executing the computer program 42, implements the functionality of the various modules/units in the above-described apparatus embodiment for determining a position on a PDF document, such as the functionality of the modules 310 to 330 shown in fig. 3.
Illustratively, the computer program 42 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to perform the steps of the method of determining a position on a PDF document of the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 42 in the server 4. For example, the computer program 42 may be divided into a general coordinate determination module, a first coordinate conversion module, and a second coordinate conversion module, each module having the following specific functions:
the universal coordinate determination module is used for receiving the PDF document and determining a universal coordinate of a certain position of a page of the PDF document;
the first coordinate conversion module is used for acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
and the second coordinate conversion module is used for acquiring the general coordinate corresponding to the first coordinate and determining a second coordinate under a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of communication units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods in the above embodiments can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, and so on. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, under legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of determining a location on a PDF document comprising the steps of:
receiving a PDF document, and determining a general coordinate of a certain position of a page of the PDF document;
acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment, and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
and acquiring a general coordinate corresponding to the first coordinate, and determining a second coordinate in a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
2. The method of claim 1, wherein receiving the PDF document and determining the universal coordinates of a certain position of a page of the PDF document specifically comprises:
receiving a PDF document, and selecting a certain page of the PDF document;
acquiring the maximum horizontal width or the maximum vertical height of the selected PDF document page, determining the range of the selected PDF document page, and establishing a coordinate plane;
defining an origin, an x axis, a y axis and a unit length of a Cartesian coordinate system on the coordinate plane, wherein the unit length of the Cartesian coordinate system is determined by dividing the maximum horizontal width or the maximum vertical height of the selected PDF document page range into N equal parts, N being a positive integer;
and determining the coordinate value of a certain position of the selected PDF document page according to the Cartesian coordinate system.
3. The method according to claim 2, wherein the step of obtaining the maximum width in the horizontal direction or the maximum height in the vertical direction of the selected PDF document page on the selected PDF document page, determining the range of the selected PDF document page, and establishing a coordinate plane specifically comprises:
acquiring the transverse maximum width or the longitudinal maximum height of the selected PDF document page range according to the User Space and the maximum values of the x axis and the y axis of the display component CropBox, and determining the selected PDF document page range;
and establishing a coordinate plane in the selected PDF document page range.
4. The method of claim 3, wherein defining an origin, an x-axis, a y-axis, and a unit length of a Cartesian coordinate system on the coordinate plane comprises:
according to the User Space and the display component CropBox, defining the lower left corner of the User Space as the origin of the Cartesian coordinate system, the x axis as the horizontal axis and the y axis as the vertical axis;
the unit length of the Cartesian coordinate system is defined according to a User Unit, wherein the User Unit defaults to 1/72 inch.
5. The method of claim 2, wherein determining the coordinate value of a position on the selected PDF document page according to the cartesian coordinate system further comprises:
acquiring the page number of the selected PDF document page;
and determining the coordinate value of a certain position on the PDF document according to the page number of the selected PDF document page and the coordinate value of the certain position of the selected PDF document page.
6. The method of claim 2, wherein the first mapping algorithm is:
Figure FDA0002559190950000021
Figure FDA0002559190950000022
wherein (x1, y1) is the first coordinate of a certain position of the PDF document page in the first parsing environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X1a, X1b] and [Y1a, Y1b] are, respectively, the horizontal and vertical coordinate intervals of the PDF document page in the first parsing environment.
7. The method of claim 2, wherein the second mapping algorithm is:
Figure FDA0002559190950000023
Figure FDA0002559190950000024
wherein (x2, y2) is the second coordinate of a certain position of the PDF document page in the second parsing environment; (x', y') is the general coordinate corresponding to the first coordinate; and [X2a, X2b] and [Y2a, Y2b] are, respectively, the horizontal and vertical coordinate intervals of the PDF document page in the second parsing environment.
8. An apparatus for determining a location on a PDF document, comprising:
the universal coordinate determination module is used for receiving the PDF document and determining a universal coordinate of a certain position of a page of the PDF document;
the first coordinate conversion module is used for acquiring a first coordinate of a certain position of a PDF document page in a first analysis environment and determining a general coordinate corresponding to the first coordinate according to a preset first mapping algorithm;
and the second coordinate conversion module is used for acquiring the general coordinate corresponding to the first coordinate and determining a second coordinate under a second analysis environment corresponding to the first coordinate according to a preset second mapping algorithm.
9. A server comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor when executing the computer program realizes the steps of the method of determining a position on a PDF document according to any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method of determining a position on a PDF document according to any one of claims 1 to 7.
CN202010606221.9A 2020-06-29 2020-06-29 Method, device, server and storage medium for determining position on PDF document Active CN111783384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010606221.9A CN111783384B (en) 2020-06-29 2020-06-29 Method, device, server and storage medium for determining position on PDF document

Publications (2)

Publication Number Publication Date
CN111783384A true CN111783384A (en) 2020-10-16
CN111783384B CN111783384B (en) 2024-05-03

Family

ID=72760958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010606221.9A Active CN111783384B (en) 2020-06-29 2020-06-29 Method, device, server and storage medium for determining position on PDF document

Country Status (1)

Country Link
CN (1) CN111783384B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109446760A (en) * 2018-09-17 2019-03-08 江苏敏行信息技术有限公司 Electronic Signature localization method in a kind of webpage PDF
CN110297224A (en) * 2019-08-01 2019-10-01 深圳前海达闼云端智能科技有限公司 Laser radar positioning method and device, robot and computing equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GRIDMIX: "PDF坐标系统", pages 1, Retrieved from the Internet <URL:https://blog.51cto.com/gridmix/1339122> *
高书东: "测绘常用坐标系转换框架设计分析", 住宅与房地产, pages 205 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112861821A (en) * 2021-04-06 2021-05-28 刘羽 Map data reduction method based on PDF file analysis
CN112861822A (en) * 2021-04-06 2021-05-28 刘羽 Map data processing method based on PDF file analysis
CN112861822B (en) * 2021-04-06 2024-03-12 刘羽 Map data processing method based on PDF file analysis
CN112861821B (en) * 2021-04-06 2024-04-19 刘羽 Map data reduction method based on PDF file analysis
CN113918059A (en) * 2021-10-26 2022-01-11 国电南瑞科技股份有限公司 Signature position positioning method and device of electronic cloud signature
CN113918059B (en) * 2021-10-26 2023-11-28 国电南瑞科技股份有限公司 Signature position positioning method and device for electronic cloud signature

Also Published As

Publication number Publication date
CN111783384B (en) 2024-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201118

Address after: Area a166, 4th floor, building 4, Baitai Industrial Park, Yazhou Bay science and Technology City, Yazhou District, Sanya City, Hainan Province, 572022

Applicant after: Jinmao Digital Technology Co.,Ltd.

Address before: No.4 building-201-16, Hengsheng Plaza, north of Helan road and east of Europe Road, Jinghai Free Trade Zone, Tianjin

Applicant before: Jinmao Investment Management (Tianjin) Co.,Ltd.

GR01 Patent grant