WO2021210912A1 - Method for outputting descriptions of patent drawing reference numerals, and apparatus and system therefor - Google Patents
- Publication number
- WO2021210912A1 (PCT/KR2021/004706)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reference number
- description
- reference numerals
- recognizing
- recognized
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/54—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04845—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/0485—Scrolling or panning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/451—Execution arrangements for user interfaces
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/15—Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/19007—Matching; Proximity measures
- G06V30/19013—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19173—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/30—Character recognition based on the type of data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/42—Document-oriented image-based pattern recognition based on the type of document
- G06V30/422—Technical drawings; Geographical maps
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/11—Patent retrieval
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/912—Applications of a database
- Y10S707/923—Intellectual property
- Y10S707/937—Intellectual property intellectual property searching
Definitions
- The present specification proposes a method for outputting descriptions of patent drawing reference numerals, and an apparatus and system therefor.
- Knowledge information contents such as papers and patent documents are generally composed of a large number of pages.
- Such content includes drawings, equations, and the text that explains them. Because of format constraints, drawings and their related text are frequently placed on different pages. The reader therefore has to flip back and forth repeatedly to understand the description of a drawing or formula, and spends more time and effort than necessary to acquire the knowledge.
- The technical problem to be solved by the present invention is to provide a solution for efficiently analyzing knowledge information content, centered on drawing information.
- Another technical problem to be solved by the present invention is to provide a method of searching for a drawing based on text, as well as searching for text (in particular, the description of a reference numeral) based on a reference number in the drawing.
- Another technical problem to be solved by the present invention is to provide a drawing-oriented content analysis method that filters all drawings containing a reference number based on the text (in particular, the reference number description) linked to that reference number.
- Another technical problem to be solved by the present invention is to provide intuitive drawing-related information to a user by placing the reference number description matching each reference number in the area of the drawing indicated by that reference number.
- Another technical problem to be solved by the present invention is to provide a solution for placing/moving the reference number description in an appropriate area in response to changes in the state of the drawing.
- A patent reference number recognition method comprising: learning from a plurality of patent drawing samples to build a reference number position recognition model and a reference number recognition model; receiving a patent drawing that is the target of reference number recognition; recognizing the position of each reference number included in the patent drawing by using the reference number position recognition model; cutting the reference number at the recognized position out of the patent drawing as an image fragment; and recognizing the reference number contained in the image fragment by using the reference number recognition model.
- learning a plurality of patent drawing samples to build a reference number position recognition model and a reference number recognition model may include.
- the drawing is performed in various analysis environments. Convenience of analyzing knowledge information content may be provided.
- FIG. 1 is a diagram illustrating an embodiment of providing a patent drawing according to an embodiment of the present invention.
- FIG. 2 is a flowchart illustrating a method for recognizing reference numerals according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a method of constructing a reference mark recognition model according to an embodiment of the present invention.
- FIG. 4 is a flowchart of a method for extracting reference numeral descriptions according to an embodiment of the present invention.
- FIG. 5 is a diagram illustrating an example of extracting reference numeral descriptions according to an embodiment of the present invention.
- FIG. 6 is a flowchart illustrating a method of outputting reference numeral descriptions corresponding to reference numerals according to an embodiment of the present invention.
- FIG. 7 is a flowchart illustrating a method for outputting reference numeral descriptions using a Scalable Vector Graphics (SVG) image according to an embodiment of the present invention.
- FIG. 8 is a diagram illustrating a method for outputting reference numeral descriptions using SVG images according to an embodiment of the present invention.
- FIG. 9 is a diagram illustrating a method for outputting reference numeral descriptions using SVG images according to an embodiment of the present invention.
- FIG. 10 is a diagram illustrating a patent document according to an embodiment of the present invention.
- FIG. 11 is a diagram illustrating a drawing interface in which reference numerals and reference numeral descriptions are interlocked according to an embodiment of the present invention.
- FIG. 12 is a diagram illustrating a drawing interface in which reference numerals and reference numeral descriptions are interlocked according to an embodiment of the present invention.
- FIG. 13 illustrates a keyword setting interface according to an embodiment of the present invention.
- FIG. 14 is a diagram illustrating an example of inter-category linkage using reference numerals as a medium according to an embodiment of the present invention.
- FIG. 15 is a diagram illustrating an example of inter-category interworking using reference numerals as a medium according to an embodiment of the present invention.
- FIG. 16 is a diagram illustrating a patent information retrieval system according to an embodiment of the present invention.
- FIG. 17 is a block diagram of a web server according to an embodiment of the present invention.
- Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms, which are used only to distinguish one component from another.
- For example, a first component may be named a second component, and similarly a second component may be named a first component, without departing from the scope of the technology described below. The term 'and/or' includes a combination of a plurality of related listed items or any one of a plurality of related listed items.
- 'A and/or B' may be interpreted as meaning 'at least one of A or B'.
- '/' may be interpreted as 'and' or 'or'.
- The constituent units in this specification are distinguished only by the main function each constituent unit is responsible for. That is, two or more constituent units described below may be combined into one, or one constituent unit may be divided into two or more according to more finely subdivided functions.
- Each constituent unit described below may additionally perform some or all of the functions of other constituent units in addition to its own main function, and some of the main functions of a constituent unit may, of course, be carried out exclusively by another constituent unit.
- each process constituting the method may occur differently from the specified order unless a specific order is clearly described in context. That is, each process may occur in the same order as specified, may be performed substantially simultaneously, or may be performed in the reverse order.
- FIG. 1 is a diagram illustrating an embodiment of providing a patent drawing according to an embodiment of the present invention.
- FIG. 1(a) shows a conventional way of providing a patent drawing.
- FIG. 1(b) shows a way of providing a patent drawing in which reference numerals are replaced with reference numeral descriptions according to an embodiment of the present invention.
- Knowledge information contents such as papers and patent documents often use drawings as a means to explain information more easily and intuitively to users.
- the user can grasp the content of the knowledge information content more easily and efficiently by simultaneously grasping the drawing and the description of the drawing.
- However, because knowledge information content generally consists of a large number of pages, the user has had to flip back and forth between pages to take in a drawing and its description at the same time. This has been a major obstacle to grasping the content easily and efficiently.
- Each reference number is directly replaced with its corresponding reference number description, so that the user can understand the invention more easily and efficiently through the drawings.
- Techniques 1 to 3 described above may be integrated into one technique and performed sequentially, or may be performed or adopted independently as individual techniques, depending on the embodiment. Techniques 1 to 3 are described in detail below with reference to the drawings.
- FIG. 2 is a flowchart illustrating a method for recognizing reference numerals according to an embodiment of the present invention.
- The web server may learn from a plurality of patent drawing samples to build a reference number position recognition model and a reference number recognition model (S201). Both models can be built based on deep learning technology.
- The web server may recognize the positions of reference numbers included in a plurality of patent drawing samples based on deep learning technology.
- A Fully Convolutional Network (FCN) may be used for this purpose.
- An FCN is a deep learning model useful for checking the presence or absence of characters at image pixels, and is a variant derived from Convolutional Neural Networks (CNN).
- An FCN uses only convolutional layers in place of fully connected layers (i.e., convolutionalization). Because of this, unlike a CNN with fully connected layers, an FCN does not lose the location information of the image, so it is very useful for recognizing the location of an object (especially a character) included in an image.
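To illustrate why a convolution-only network keeps position information, the following minimal sketch slides one hand-written 3x3 filter over a tiny binary image. It is not the FCN of the specification: the `convolve2d_same` helper, the all-ones filter, and the toy image are assumptions made for illustration, whereas a real FCN learns many such filters from the drawing samples.

```python
def convolve2d_same(image, kernel):
    """Cross-correlate image with kernel, zero-padding so the output keeps the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(kh):
                for dx in range(kw):
                    yy, xx = y + dy - ph, x + dx - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx] * kernel[dy][dx]
            out[y][x] = total
    return out


# Toy 6x6 "drawing" with one 3x3 blob of ink centred at row 4, column 1.
image = [[0] * 6 for _ in range(6)]
for y in range(3, 6):
    for x in range(0, 3):
        image[y][x] = 1

# Hypothetical filter that responds most strongly to a filled 3x3 region.
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
heatmap = convolve2d_same(image, kernel)

# The strongest response sits at the blob's own coordinates, so position survives.
peak = max(((y, x) for y in range(6) for x in range(6)),
           key=lambda p: heatmap[p[0]][p[1]])
```

Because the output map has the same height and width as the input, the coordinates of a strong response can be read back directly as a position in the drawing.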
- The web server may recognize the positions of reference numbers included in a plurality of patent drawing samples using such an FCN, and may extract common features from the recognized positions. For example, the web server may extract, as common features, that reference numbers are not located at the center of the drawing, are not located outside the drawing, and do not overlap one another. The web server may also extract various other common features as a result of learning the positions of reference numbers; the features are not limited to those listed above. The web server may build a reference number position recognition model based on the common features extracted in this way. The reference number position recognition model may receive a patent drawing as input, and may recognize and output the positions of the reference numbers included in that drawing based on the extracted common features.
- In other words, the web server may extract common features by learning the positions of reference numbers included in a plurality of patent drawing samples, and build the reference number position recognition model based on those features.
- Having built the reference number position recognition model, the web server may use it to build the reference number recognition model, as described in more detail below with reference to FIG. 3.
- FIG. 3 is a diagram illustrating a method of constructing a reference mark recognition model according to an embodiment of the present invention.
- In order to build a reference number recognition model, the web server may first recognize the positions of reference numbers 10 to 14, 16, and 18 included in the patent drawing sample(s) 301 using the pre-built reference number position recognition model.
- The web server may cut the reference numbers 10 to 14, 16, and 18 at the recognized positions out of the patent drawing sample(s) 301 and collect them as image fragments 302.
- The web server may recognize the reference number contained in each of the collected image fragments 302 using image character recognition technology (e.g., deep learning technology (in particular, C-RNN) and/or optical character recognition (OCR)).
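The cutting step can be sketched as follows, assuming the position recognition model returns axis-aligned boxes. The `crop_fragments` helper, the `(top, left, bottom, right)` box format, and the nested-list image are hypothetical choices for illustration; the specification does not state how positions are encoded.

```python
def crop_fragments(image, boxes):
    """Cut each (top, left, bottom, right) box out of a row-major image.

    bottom and right are exclusive, matching Python slice semantics.
    """
    fragments = []
    for top, left, bottom, right in boxes:
        fragments.append([row[left:right] for row in image[top:bottom]])
    return fragments


# Toy drawing: the 1s and 2s stand in for the pixels of two reference numbers.
drawing = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
# Assumed output of the position recognition step.
boxes = [(1, 1, 3, 3), (3, 4, 5, 6)]
fragments = crop_fragments(drawing, boxes)
```

Each fragment can then be passed on its own to whichever character recognizer is used.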
- The web server may construct a reference number recognition model by extracting common features from the reference numbers 10 to 14, 16, and 18 recognized in this way.
- As common features, the web server may derive, for example, that a reference number is composed of digits, English letters, or a combination thereof, and that a reference number has a length of five characters or less, but the features are not limited thereto.
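The two example features above can be written down as a simple plausibility check. This is only a hand-coded sketch: in the specification these features are learned by a model, and the `looks_like_reference_number` name and the exact pattern are assumptions.

```python
import re

# Digits, English letters, or a combination of both, at most five characters long.
REFERENCE_NUMBER = re.compile(r"[0-9A-Za-z]{1,5}")


def looks_like_reference_number(text):
    """Return True if text satisfies both example features."""
    return REFERENCE_NUMBER.fullmatch(text) is not None
```

A check like this could serve as a cheap filter on recognizer output, for instance to reject a six-character string before any further matching.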
- the web server can build a reference mark recognition model based on the common features extracted in this way.
- the reference number recognition model may recognize and output reference numbers included in each image fragment based on the extracted common features.
- the web server may extract common features by learning the reference signs included in the image fragment, and build a reference mark recognition model based on the extracted common features.
- the reference number position recognition model and reference number recognition model constructed in this way are used to recognize reference numbers in the patent drawing selected/input by the user.
- The web server may receive the input/selection of a patent drawing that is the target of reference number recognition (S202). More specifically, the web server may receive a selection/input of a specific patent drawing from a user device acting as a client device.
- The web server may recognize the position of each reference number included in the input/selected patent drawing using the reference number position recognition model built in step S201 (S203), and may cut the reference number at the recognized position out of the patent drawing to obtain an image fragment containing the reference number (S204).
- The web server may recognize the reference number contained in the image fragment obtained in this way using the reference number recognition model built in step S201 (S205).
- The web server may generate one image by collecting the image fragments in units of a preset number (e.g., 200), and may recognize the plurality of image fragments included in the generated image using image character recognition technology (e.g., deep learning technology (in particular, C-RNN) and/or optical character recognition (OCR)).
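Grouping fragments in units of a preset number can be sketched as below. The `batch_fragments` helper is hypothetical, and how the fragments in one group are tiled into a single image is not specified in the text, so that step is left out.

```python
def batch_fragments(fragments, batch_size=200):
    """Split a list of fragments into consecutive groups of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [fragments[i:i + batch_size]
            for i in range(0, len(fragments), batch_size)]


# 450 placeholder fragments become groups of 200, 200, and 50.
batches = batch_fragments(list(range(450)))
```

The last group is simply smaller when the number of fragments is not a multiple of the preset number.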
- FIG. 4 is a flowchart of a method for extracting reference numeral descriptions according to an embodiment of the present invention.
- the web server may extract reference numeral descriptions from the patent specification.
- A patent document can be broadly divided into the patent specification and the drawings, and reference numeral descriptions can be extracted from the patent specification. The structure of a patent document is described in detail below with reference to FIG. 10.
- The web server may extract from the patent specification the reference number descriptions corresponding to the reference numbers recognized according to the embodiments proposed in FIGS. 2 and 3.
- the web server may first establish a rule for extracting reference numeral descriptions (S401).
- The reference number description extraction rule can be established in various ways. Noting that the format of a patent document differs by filing country, the present specification proposes the following examples, established based on text mining technology.
- The web server may first classify a plurality of patent specification samples by filing country, and extract common features for each country. The common features may be extracted based on at least one of the relative position of the reference number description with respect to the reference number, the format applied to the reference number description or the reference number, and the filing year.
- For example, for Korea, the common features that the reference number description is located before the reference number and that the reference number is enclosed in parentheses may be extracted. For the United States, the common features that the reference number description is located before the reference number, that bold font is applied to the reference number, and that, unlike in Korea, the reference number is not enclosed in parentheses may be extracted.
- In addition, the web server can extract various other features common to each filing country.
- After establishing a reference number description extraction rule (or model) based on the common features extracted in this way, the web server may use it to extract reference number descriptions from the patent specification (S402).
- The web server may search for the reference numbers included in the patent specification. The web server may then estimate the position of the reference number description corresponding to each found reference number according to the established relative position rule, and extract the text at the expected position as the reference number description.
- For example, the web server may search for reference number 16 in a patent specification and then extract the word 'bolt' written immediately before reference number 16 as the reference number description for reference number 16.
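The country-specific rules and the 'bolt' example can be sketched with regular expressions. The `RULES` table, the `extract_description` helper, and the sample sentences are hypothetical. The sketch extracts only the single word before the number and ignores the bold-font feature, which is not visible in plain text.

```python
import re

# Hypothetical per-country patterns; group 1 is the word just before the number.
RULES = {
    "KR": r"(\w+)\s*\(\s*{num}\s*\)",  # description, then number in parentheses
    "US": r"(\w+)\s+{num}\b",          # description, then bare number
}


def extract_description(text, number, country):
    """Return the word preceding the reference number, or None if not found."""
    pattern = RULES[country].format(num=re.escape(number))
    match = re.search(pattern, text)
    return match.group(1) if match else None


kr_text = "The bolt (16) is fastened to the nut (18)."
us_text = "The bolt 16 is fastened to the nut 18."
kr_result = extract_description(kr_text, "16", "KR")
us_result = extract_description(us_text, "16", "US")
```

Keeping one pattern per filing country mirrors the idea of classifying specification samples by country before extracting their common features.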
- FIG. 5 is a diagram illustrating an example of extracting reference numeral descriptions according to an embodiment of the present invention.
- The web server may search the patent specification for each of a plurality of extracted reference number description candidates, and then determine the most frequently found candidate as the final reference number description. Taking this drawing as an example, if the web server searches the patent specification for each of 'display', 'light emitting display', and 'organic light emitting display', and 'organic light emitting display' is found three times, 'light emitting display' zero times, and 'display' once, then 'organic light emitting display' may be extracted as the final reference number description.
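A minimal sketch of this selection follows. It assumes the counts are exclusive, meaning that an occurrence of a longer candidate is not counted again for a shorter candidate contained in it; that reading is inferred from the 3 / 0 / 1 figures in the example and is not stated in the text. The helper names and the sample text are hypothetical.

```python
import re


def count_exclusive(text, candidates):
    """Count candidates longest first, blanking out each match so that a
    shorter candidate is not counted inside a longer one."""
    counts = {}
    remaining = text
    for phrase in sorted(candidates, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        remaining, found = re.subn(pattern, " ", remaining)
        counts[phrase] = found
    return counts


def pick_final_description(text, candidates):
    """Return the candidate with the highest exclusive count."""
    counts = count_exclusive(text, candidates)
    return max(candidates, key=lambda c: counts[c])


candidates = ["display", "light emitting display", "organic light emitting display"]
text = ("The organic light emitting display 100 includes a panel. "
        "The organic light emitting display 100 is thin. "
        "In the organic light emitting display 100, the display 100 is bright.")
counts = count_exclusive(text, candidates)
final = pick_final_description(text, candidates)
```

With plain substring counting, 'display' would always score at least as high as any longer candidate that contains it, which is why the sketch counts exclusively.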
- the number of words extracted by the web server as reference number description candidates may be limited to a preset number based on reference number description data accumulated so far.
- The web server may build a patent drawing search database by storing the recognized reference numbers and reference number descriptions in a database.
- The web server may compute statistics on the number of words in the compound terms extracted as reference number descriptions, based on the reference number description data accumulated so far in the database, and set as the limit the word count beyond which the extraction ratio/probability is statistically low.
- For example, the web server may limit the number of words that can be extracted as a reference number description to four. In this case, when extracting a reference number description according to the reference number description extraction rule, the web server may extract up to four words as the reference number description.
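One way to derive such a limit from accumulated data is sketched below. The rule used here (keep the largest word count whose share of accumulated descriptions is at least `min_share`) and the 5% threshold are assumptions, as are the helper names and the sample data; the specification only says that word counts with a statistically low extraction ratio are excluded.

```python
from collections import Counter


def word_limit(descriptions, min_share=0.05):
    """Largest word count that still accounts for at least min_share of the
    accumulated descriptions. Expects a non-empty list."""
    counts = Counter(len(d.split()) for d in descriptions)
    total = sum(counts.values())
    frequent = [n for n, c in counts.items() if c / total >= min_share]
    return max(frequent)


def candidate_phrases(words_before_number, limit):
    """Candidate descriptions built from the last 1..limit words before a number."""
    tail = words_before_number[-limit:]
    return [" ".join(tail[i:]) for i in range(len(tail) - 1, -1, -1)]


# Hypothetical accumulated data: 1 to 4 words are common, 6 words are rare.
accumulated = (["bolt"] * 40 + ["light source"] * 30
               + ["light emitting display"] * 20
               + ["organic light emitting display"] * 9
               + ["thin film transistor array substrate unit"] * 1)
limit = word_limit(accumulated)
phrases = candidate_phrases(
    ["in", "the", "organic", "light", "emitting", "display"], limit)
```

The candidates produced this way can then be ranked by how often each one appears in the specification.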
- the web server may establish an error extraction rule for determining whether there is an error in the extracted reference symbol description in order to further improve the extraction accuracy of the reference symbol description, and correct/complement the found error can do.
- the web server may extract reference number descriptions from the patent specification sample(s) based on the established reference number description extraction rules, and retrieve the extracted reference number descriptions from the patent specification.
- the web server may classify the reference reference description as an error in the reference number description.
- the web server may establish error extraction rules (or models) by extracting (ie, learning) common features from such error-prone reference numerals.
- the web server may use the established error extraction rule to determine whether there is an error in the reference numeral description extracted from the patent specification.
- Examples of common features include the digit 0 being incorrectly extracted as the letter o, O, or D; the digit 9 being incorrectly extracted as the digit 0; a part of speech such as an adjective (for example, 'to do'), a connective, or an adverb being extracted; and symbols such as !, @, #, $, %, ^, (, ) being extracted.
- The extracted reference number description may be supplemented/corrected according to a preset method. For example, if the web server finds that a digit has been misrecognized as a letter (or vice versa), it may replace the character with the corresponding correct one; if it finds an unwanted part of speech or symbol, it may delete it.
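The correction step can be illustrated with a small rule-based sketch. The confusion table only covers the o/O/D-for-0 example given above, the "mostly digits" test is an assumed heuristic, and the function names are hypothetical. Reference numbers with genuine letter suffixes would be damaged by such a rule, which is one reason a cross-check against the specification is useful.

```python
import re

# Letters that the text says are confused with the digit 0.
LETTER_TO_DIGIT = str.maketrans({"o": "0", "O": "0", "D": "0"})
# Stray symbols listed in the text as typical extraction errors.
STRAY_SYMBOLS = re.compile(r"[!@#$%^()]")

def correct_reference_number(token):
    """Replace confusable letters with digits when the token is mostly digits."""
    digits = sum(ch.isdigit() for ch in token)
    if digits * 2 >= len(token):  # assumed heuristic, not from the source
        return token.translate(LETTER_TO_DIGIT)
    return token

def correct_description(text):
    """Delete stray symbols and collapse the whitespace they leave behind."""
    cleaned = STRAY_SYMBOLS.sub("", text)
    return re.sub(r"\s+", " ", cleaned).strip()
```

For instance, `correct_reference_number("1O0")` yields `"100"`, while a word such as `"Door"` is left unchanged because it contains no digits.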
- The web server trains the established reference number description extraction rule and error extraction rule using deep learning technology (a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), or a combination thereof), so that the performance of the reference number description extraction model is improved.
- The web server may use the set of correct reference number and reference number description pairs accumulated so far in the already-built patent drawing search database to build a reference number description extraction model with a very high recognition rate and accuracy, and may use that model to extract reference number descriptions.
- The web server may determine whether reference numerals have been correctly extracted from each of the specification and the drawings, and may supplement any reference numerals determined not to have been correctly extracted. Through this, the matching accuracy between reference numbers and reference number descriptions can be further improved.
- The web server may improve the accuracy of recognizing reference numbers in drawings by matching reference numbers extracted from the drawings with reference numbers included in the patent specification. More specifically, the web server may search the patent specification corresponding to a patent drawing for the reference numbers recognized in that drawing according to the above-described embodiment. If a reference number recognized in the patent drawing is found in the patent specification, the web server may judge it to be an appropriate reference number and determine it as the final reference number. Conversely, if the recognized reference number is not found in the patent specification, the web server judges it to be an inappropriate reference number, searches the patent specification for characters/words/terms whose shape similarity to the reference number is greater than or equal to a predetermined ratio, and determines the result as the final reference number.
- the web server may determine 360 having the highest shape similarity to 36D as the final reference number.
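The fallback just described (a recognized number such as 36D that is absent from the other source is replaced by the most similar-looking known number) can be sketched as below. The patent does not define how shape similarity is computed; the per-character confusion table, its scores, and the 0.8 threshold are assumptions made purely for illustration.

```python
# Hypothetical similarity scores for visually confusable character pairs.
CONFUSABLE = {
    frozenset("0D"): 0.8, frozenset("0O"): 0.9, frozenset("0o"): 0.8,
    frozenset("1l"): 0.9, frozenset("1I"): 0.9, frozenset("5S"): 0.8,
    frozenset("8B"): 0.8, frozenset("09"): 0.6,
}

def char_similarity(a, b):
    if a == b:
        return 1.0
    return CONFUSABLE.get(frozenset((a, b)), 0.0)

def shape_similarity(a, b):
    """Mean per-character similarity; strings of different length score 0."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(char_similarity(x, y) for x, y in zip(a, b)) / len(a)

def resolve(recognized, known_numbers, threshold=0.8):
    """Return the final reference number, or None if nothing is similar enough."""
    if recognized in known_numbers:
        return recognized  # found as-is: judged appropriate
    best = max(known_numbers,
               key=lambda k: shape_similarity(recognized, k), default=None)
    if best is not None and shape_similarity(recognized, best) >= threshold:
        return best
    return None
```

Calling `resolve("36D", ["16", "36", "100", "360"])` returns `"360"` under these assumed scores.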
- The web server may improve the reference number recognition accuracy for the specification by matching reference numbers recognized in the specification with reference numbers recognized in the drawings. More specifically, the web server may search for the reference numbers recognized in the patent specification according to the above-described embodiment among the reference numbers recognized from the patent drawings corresponding to that specification. If a reference number recognized in the patent specification is found among the reference numbers recognized from the patent drawings, the web server may judge it to be an appropriate reference number and determine it as the final reference number. Conversely, if the recognized reference number is not found among the reference numbers recognized from the patent drawings, the web server judges it to be an inappropriate reference number, searches the reference numbers recognized from the patent drawings for a character/word/term whose shape similarity to the reference number is greater than or equal to a predetermined ratio, and determines the result as the final reference number.
- the web server may determine 360, which has the highest shape similarity to 36D among the reference numbers recognized in the patent drawing, as the final reference number.
- the web server can supplement the reference numerals recognized in the patent drawing by matching them against the reference numerals recognized in the patent specification.
- the web server can supplement the reference numerals recognized in the patent specification by matching them against the reference numerals recognized in the patent drawings.
- first and second embodiments may be selectively used or used in combination depending on the purpose and effect.
- when the two embodiments are combined, only reference numbers that match each other between the patent drawing and the patent specification can be determined/extracted/confirmed as the final reference numbers, and the probability of a reference number recognition error can be significantly reduced compared with applying the first and second embodiments selectively.
- Unlike the prior art, the reference number description output method proposed in the present specification keeps the relative position within the drawing the same regardless of any state change of the drawing itself (e.g., movement, rotation, enlargement, or reduction), so the reference number description can accurately track the position of the reference number and replace it.
- FIG. 6 is a flowchart illustrating a method of outputting reference number descriptions corresponding to reference numerals according to an embodiment of the present invention.
- the web server may first recognize the size of the patent drawing and the position of the reference number included in the patent drawing to obtain the relative positional coordinates of the reference number in the patent drawing ( S601 ). In other words, the web server may obtain the relative positional coordinates of the reference numbers relative to the size of the patent drawing.
- the reason for obtaining the relative position coordinates in this way is to accurately track the position of the reference numeral even if the state of the drawing is changed as described above.
- the web server may set the relative position coordinates obtained in the previous step as the relative position coordinates of the reference number description corresponding to the reference number (S602).
- the web server may output a reference number description to the set relative position coordinates (S603).
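Steps S601 to S603 rely on coordinates expressed relative to the drawing size. A minimal sketch, assuming the relative coordinates are simple fractions of the drawing's width and height (the function names and the optional offset parameters are illustrative, not from the patent):

```python
def to_relative(x, y, width, height):
    """S601: position of a reference number as a fraction of the drawing size."""
    return (x / width, y / height)

def description_position(rel, dx=0.0, dy=0.0):
    """S602: reuse the number's relative position, optionally shifted."""
    rx, ry = rel
    return (rx + dx, ry + dy)

def to_absolute(rel, width, height):
    """S603: pixel position for the drawing at its current displayed size."""
    rx, ry = rel
    return (rx * width, ry * height)

rel = to_relative(200, 150, 800, 600)      # number found at (200, 150) px
pos = description_position(rel)            # same relative position
enlarged = to_absolute(pos, 1600, 1200)    # drawing shown at twice the size
```

Because only the fractions are stored, the same `rel` value stays valid when the drawing is enlarged or reduced.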
- The reference number description may be output in the form of an icon/GUI (Graphical User Interface) having an opaque background color, and because it is output at the same relative position coordinates within the drawing, it covers at least a part of the reference number. That is, because the reference number description is output at the same position as the reference number in the drawing, the reference number description is output instead of the reference number.
- all reference numerals in the drawing of FIG. 1(a) are replaced with reference number descriptions, as in FIG. 1(b).
- the reference number description is output so as to cover at least a part of the reference number
- the reference number description may be output anywhere at a position corresponding to the reference number (for example, a position above, below, to the left of, to the right of, diagonal to, or neighboring the reference number, including a position that covers at least part of it).
- the web server may move the relative position coordinates of the reference number obtained in step S601 in a predetermined direction and/or by a predetermined distance, and set/assign the result as the relative position coordinates of the reference number description.
- For convenience of description, a detailed embodiment in which reference numerals are replaced by reference number descriptions will be described based on the above-described embodiment of FIG. 6.
- FIGS. 7 to 9 illustrate a method for outputting reference number descriptions using SVG images according to an embodiment of the present invention.
- FIG. 7 is a flowchart illustrating a method for outputting reference number descriptions using a Scalable Vector Graphics (SVG) image according to an embodiment of the present invention.
- FIGS. 8 and 9 are diagrams exemplifying a method of outputting reference number descriptions using SVG images according to an embodiment of the present invention.
- First, the web server generates an SVG image 802 of the same size as the patent drawing 801, overlays the SVG image 802 on the patent drawing 801, and fixes it to the patent drawing 801 (S701).
- After generating an SVG image 802 that is transparent and the same size as the patent drawing 801, the web server can fix it to the patent drawing 801 so that it covers the entire patent drawing 801.
- the SVG image 802 is unrecognizable from the user's point of view, but has a characteristic that the state changes in the same manner as the state of the patent drawing 801 changes.
- the SVG image is an XML (Extensible Markup Language)-based image file format for expressing two-dimensional vector graphics, and has a characteristic that the quality does not deteriorate even when the state of the image changes (particularly, enlargement).
- the web server may engrave/add/display/assign (903) a reference number description 902 at the preset relative position coordinates 901 within the SVG image 802 (S702).
- the preset relative position coordinates 901 may refer to the relative position coordinates of the reference number and of the reference number description obtained in steps S601 and S602 of FIG. 6.
- The reference number description 902 may be engraved/added/displayed/assigned within the SVG image 802 in the form of an icon/GUI with an opaque background color.
- a reference number description 902 is engraved/added/displayed/assigned 903 to a position 901 corresponding to the reference number.
- the web server may superimpose on the drawing 801 the SVG image 903 on which the reference number description 902 has been engraved/added/displayed/assigned; as a result, the reference number in the drawing 801 is covered by the reference number description 902 on the SVG image 903 (S703).
- if the reference number description 902 for reference numeral '16' in the drawing 801 is 'bolt', an SVG image 903 in which 'bolt' 902 is engraved/added/displayed/assigned at the same position as reference numeral '16' may be generated, and in the output that results from superimposing the SVG image 903 on the drawing 801, reference numeral '16' is covered by 'bolt' 902.
- the reference numeral '16' therefore appears to be replaced with 'bolt' 902.
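Steps S701 to S703 can be illustrated by generating the overlay markup. The sketch below builds a transparent SVG with the same coordinate space as the drawing and places an opaque labelled box at each relative position. The function name `build_overlay`, the box sizing, and the font size are assumptions; how the overlay is stacked on the drawing in the page is left out.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def build_overlay(width, height, labels):
    """labels: list of (rel_x, rel_y, text). Returns SVG markup as a string."""
    ET.register_namespace("", SVG_NS)
    # viewBox matches the drawing size, so children scale with the drawing.
    svg = ET.Element(f"{{{SVG_NS}}}svg", {
        "viewBox": f"0 0 {width} {height}",
        "width": "100%",
        "height": "100%",
    })
    for rel_x, rel_y, text in labels:
        x, y = rel_x * width, rel_y * height
        box_w, box_h = 8 * len(text) + 8, 18  # rough box size (assumption)
        # Opaque background so the underlying reference number is covered.
        ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
            "x": f"{x - box_w / 2:g}", "y": f"{y - box_h / 2:g}",
            "width": str(box_w), "height": str(box_h), "fill": "white",
        })
        label = ET.SubElement(svg, f"{{{SVG_NS}}}text", {
            "x": f"{x:g}", "y": f"{y:g}", "font-size": "14",
            "text-anchor": "middle", "dominant-baseline": "middle",
        })
        label.text = text
    return ET.tostring(svg, encoding="unicode")

# 'bolt' placed where reference numeral 16 was recognized (position is made up).
markup = build_overlay(800, 600, [(0.25, 0.5, "bolt")])
```

Because the label coordinates live inside the SVG's own `viewBox`, enlarging or reducing the overlay together with the drawing needs no recalculation of positions.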
- the web server generates an SVG image 903 in which a reference number description 902 is engraved/added/assigned/displayed at a position 901 corresponding to a reference numeral, and the SVG image 903 thus generated can be output matched/interlocked with the drawing 801.
- the reference numerals in the drawing 801 are replaced with the reference number descriptions 902 in the output.
- the state can be changed freely without deterioration in quality
- the reference number description is an object/component constituting the SVG image. Since the coordinates are scaled, there is no need to recalculate the position coordinates.
- reference numbers are also included as image objects constituting the drawing image
- reference number descriptions in the SVG image are likewise included as image objects constituting the SVG image, and are displayed (or engraved) at positions corresponding to the reference numbers.
- the position coordinates of the reference numbers and of the reference number descriptions are scaled automatically, and as a result of the scaling their moved positions remain the same as each other.
- the SVG image fixed on the patent drawing changes state together with the patent drawing, and as a result the reference number descriptions engraved/added on the SVG image change state as well. Even if the SVG image is moved, rotated, enlarged, or reduced, the relative positions within the SVG image do not change and remain fixed.
- a specific format for example, yellow highlighting, etc.
- Reference numerals may be displayed later as a tool tip using HTML or the like.
- the reference number description replacement speed is very fast compared to the existing method of allocating reference number descriptions by tracking/recalculating the positions of reference numbers each time according to a change in the state of the drawing.
- in the existing method, the replacement speed is very slow because the position of the reference number has to be tracked/recalculated every time the state of the drawing changes, so the reference number description is replaced/output for only one reference number at a time
- in the proposed method, the replacement speed is very fast, so it is possible to replace/output all the reference numbers (i.e., a plurality of reference numbers) included in one drawing at once.
- When reference number descriptions overlap, the web server may engrave/display/add/allocate them on the SVG image after adjusting the relative position coordinates of at least one of the overlapping reference number descriptions in a direction in which they no longer overlap. For example, when a first reference number description and a second reference number description overlap each other, the web server may move the first reference number description in a first direction and the second reference number description in a second direction opposite to the first direction, each by a predetermined length.
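One way to picture this overlap adjustment is with axis-aligned label boxes that are pushed apart in opposite vertical directions. The `Label` type, the choice of the vertical axis, and the step size are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Label:
    text: str
    x: float  # relative centre coordinates within the overlay
    y: float
    w: float  # relative width and height of the label box
    h: float

def overlaps(a, b):
    """True if the two axis-aligned label boxes intersect."""
    return abs(a.x - b.x) * 2 < a.w + b.w and abs(a.y - b.y) * 2 < a.h + b.h

def separate(a, b, step=0.01, max_steps=100):
    """Move a and b in opposite vertical directions until they stop overlapping."""
    direction = -1.0 if a.y <= b.y else 1.0  # push each label away from the other
    for _ in range(max_steps):
        if not overlaps(a, b):
            break
        a.y += direction * step
        b.y -= direction * step
    return a, b

first = Label("bolt", 0.50, 0.50, 0.10, 0.04)
second = Label("nut", 0.52, 0.51, 0.10, 0.04)
was_overlapping = overlaps(first, second)
separate(first, second)
```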
- an 'SVG image' has been described as a representative embodiment as an image used for outputting reference numerals, but the present disclosure is not limited thereto, and images in various formats may be utilized. Therefore, in the present specification, the SVG image may be referred to as/replaced as an 'image', and the 'image' in this case may refer to an image of various formats, such as an SVG image, in which quality is not deteriorated even when a state changes.
- When a reference number description for each reference number is already included in the drawing, as in a block diagram or a flowchart, the web server may apply a highlighting format to the reference numbers instead of replacing them with reference number descriptions. To this end, the web server may perform in advance an operation for recognizing the characters included in the drawing.
- each category of the patent document can be interlocked/synchronized with each other.
- the user is able to selectively search/search for desired information by using reference numerals and/or descriptions of reference numerals, thereby enabling a more efficient grasp of the invention. Examples of interworking/synchronization for each category of patent literature will be described below with reference to FIGS. 11 to 15 , and before examining them, the categories of patent literature defined in the present specification will be briefly reviewed.
- FIG. 10 is a diagram illustrating a patent document according to an embodiment of the present invention.
- the patent document 1000 may be divided into a plurality of categories 1001 and 1002 .
- the patent document 1000 can be largely divided into a patent specification 1001 and a patent drawing 1002, and the patent specification 1001 may in turn be divided into claims 1001-1, a detailed description of the invention 1001-2, and/or a description of the reference symbols (not shown).
- the plurality of categories 1001 and 1002 divided in this way may be output by being divided into a plurality of areas/windows.
- the patent specification 1001 and the patent drawing 1002 may be output separately in different first and second regions within one window.
- the patent specification 1001 and the patent drawing 1002 may be output separately in first and second windows different from each other.
- the patent specification 1001 and the patent drawing 1002 may be output separately for each area in one window, and the patent specification 1001 or the patent drawing 1002 may be additionally output as a separate window.
- the reason for dividing output by region/window in this way is to provide convenience so that all categories 1001 and 1002 can enter the user's field of view at once, so that the user can more easily and efficiently grasp information.
- a plurality of categories 1001 and 1002 may be linked with each other using reference numerals and/or reference number descriptions as a medium.
- the web server can search for reference numbers in the plurality of/all categories 1001 and 1002. Furthermore, the web server may output all of the reference numbers found in the plurality of/all categories 1001 and 1002 with a highlighting format applied.
- the web server can highlight all the reference numbers found by applying a preset format (e.g., underline, bold text, a different text color, or highlighting) to them.
- the web server can apply a preset format (e.g., underline, bold text, a different text color, or highlighting) to all the reference numbers found, or to all the reference number descriptions that are output in place of those reference numbers, to highlight them.
- if reference numeral '16' is selected in the patent specification category 1001, a drawing including reference numeral '16' is automatically selected/output from the patent drawing category 1002 and reference numeral '16' is highlighted in it; or, if reference numeral '16' is selected in the patent drawing category 1002, all occurrences of reference numeral '16' in the patent specification category 1001 are highlighted and, at the same time, the view may automatically scroll to and output a sentence/paragraph containing reference numeral '16'.
- reference numbers in the patent specification category 1001 may be output in a hyperlink format, and reference numbers (or reference number descriptions) in the patent drawing category 1002 may also be output in a selectable GUI/icon form.
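A minimal sketch of rendering reference numbers in specification text as selectable, highlightable links. The CSS class names, the `#ref-` anchor scheme, and the function name are hypothetical; only the idea of hyperlink output with a highlight on the selected number comes from the description.

```python
import html
import re

def link_reference_numbers(paragraph, numbers, selected=None):
    """Wrap each known reference number in <a>, marking the selected one."""
    if not numbers:
        return html.escape(paragraph)
    # Longest first so that "160" is not split into "16" + "0".
    alternatives = "|".join(
        re.escape(n) for n in sorted(numbers, key=len, reverse=True))
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    parts, last = [], 0
    for match in pattern.finditer(paragraph):
        parts.append(html.escape(paragraph[last:match.start()]))
        number = html.escape(match.group(1))
        css = "ref selected" if match.group(1) == selected else "ref"
        parts.append(f'<a class="{css}" href="#ref-{number}">{number}</a>')
        last = match.end()
    parts.append(html.escape(paragraph[last:]))
    return "".join(parts)

out = link_reference_numbers(
    "The bolt 16 fastens the plate 160.", ["16", "160"], selected="16")
```

A stylesheet could then render `.ref.selected` with underline, bold text, a different color, or a highlight, as listed above.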
- FIGS. 11 and 12 are diagrams illustrating a drawing interface in which reference numerals and reference number descriptions are interlocked according to an embodiment of the present invention.
- the drawing interface proposed in this specification may be output in the drawing category as a user interface that provides various functions related to drawings to the user.
- a drawing interface can basically output a drawing, as shown in the drawings.
- the drawing interface can provide user convenience through various functions, such as a preview and shortcut function for all the drawings included in the patent document, a function for changing the drawing state (e.g., rotating, moving, enlarging, and reducing the drawing), and a function for replacing reference numbers with reference number descriptions.
- the function of replacing reference numerals may correspond to functions to which the above-described embodiments are applied.
- the drawing interface may provide a function of recognizing all reference numbers in the selected drawing, extracting all the reference number descriptions corresponding to them, and presenting all the extracted reference number descriptions to the user as a list.
- reference numerals and their corresponding reference number descriptions may be output as matched pairs.
- the above-described embodiments may be applied to the reference number recognition and reference number description extraction.
- when the web server receives a user's selection input for at least one reference number description (or reference number) from the reference number description list output through the drawing interface, it can selectively output only the selected reference number descriptions (or reference numbers).
- the web server may output only 'bolt', the reference number description for reference numeral 16.
- the user can selectively view only the desired reference numerals or reference number descriptions, so that the drawing/invention can be understood easily and efficiently.
- FIG. 13 illustrates a keyword setting interface according to an embodiment of the present invention.
- the drawing interface proposed in this specification may provide a keyword setting interface.
- the keyword setting interface corresponds to a user interface provided through the drawing category in order to set at least some of the reference number descriptions as keywords.
- the keyword setting interface may be composed of an input window for receiving a reference number description to be set as a keyword, a color setting window for setting the accent color of the set keyword, and/or a keyword indicator showing the keywords set so far. However, the keyword setting interface is not limited thereto, and various functions may be added or at least some of the above-described functions may be excluded.
- the user can register/set specific reference number descriptions as keywords through this keyword setting interface, and the web server may highlight the registered/set keywords within the drawing so that the user can find them easily, or may highlight drawings that contain the keywords.
- the web server may set/register the bolt as a keyword and output a keyword indicator indicating that the bolt is set/registered as a keyword in a predetermined area of the keyword setting interface.
- the web server may search for a bolt in the drawing category, and apply an emphasis color to a keyword in the drawing currently selected and being enlarged and outputted.
- the web server may output an indicator of the same color as the keyword highlighting color for a drawing for which a keyword is searched among drawings that are being previewed.
- FIGS. 14 and 15 are diagrams illustrating an example of inter-category interworking using reference numerals as a medium according to an embodiment of the present invention.
- a selection window for selecting at least one function may be output.
- the function 1403 provided there may be a function of searching for the selected reference number description 1402 in the patent specification.
- the web server can search for the selected reference number description 1402, or the reference number corresponding to it, in all categories, and may output the retrieved reference number description 1402 with a highlight display 1502 applied, as shown in FIG. 15. The highlight display may use a preset format, for example underline, bold text, a different text color, or highlighting.
- the web server may automatically scroll the web page to the sentence/paragraph containing the reference number description or reference number 1506-1 that is located topmost in the patent specification category (in particular, the claim category and the detailed description category of the invention) among the reference number descriptions or reference numbers found.
- the web server may output a first indicator 1505 at each scroll bar position (or in the neighborhood of the scroll bar) corresponding to every reference number description or reference number found, within the entire scroll bar area provided in the patent specification category (especially the claim category and the detailed description category). That is, by displaying the page areas where the reference number descriptions or reference numbers are located in the scroll bar area in the form of a mini map through the first indicators 1505, the web server lets the user move the scroll bar to the position of a first indicator 1505 and reach the desired information more easily. Furthermore, through the first indicators output in the form of a mini map, the user can grasp at a glance how the reference number descriptions and/or reference numbers are distributed across the categories, which can be useful for judging the importance of components.
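The mini-map indicators can be approximated by mapping each occurrence's offset in the text onto the scroll bar track. This sketch assumes that character offset is roughly proportional to rendered vertical position, which a real page would replace with measured element positions; the function name is hypothetical.

```python
import re

def indicator_positions(text, term, track_height):
    """Scroll bar offsets (in pixels) for every whole-word occurrence of term."""
    if not text or not term:
        return []
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    return [round(m.start() / len(text) * track_height)
            for m in pattern.finditer(text)]

# A 100-character text with 'bolt' at offsets 0 and 50, on a 200 px track.
sample = "bolt" + " " * 46 + "bolt" + " " * 46
marks = indicator_positions(sample, "bolt", 200)
```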
- the web server can provide/output search windows 1504-1 and 1504-2 for searching for the reference number description or reference number within the patent specification category (especially the claim category and the detailed description category of the invention). Accordingly, the user can directly search, through the search windows 1504-1 and 1504-2, for the paragraphs, pages, and sentences that include the reference number description or reference number of interest.
- the web server may also search the patent drawing category for the selected reference number description or the reference number corresponding to it, and may provide/output a second indicator 1507 on each drawing, among the drawings being previewed, that includes the reference number description or the corresponding reference number. Accordingly, by selecting a drawing marked with the second indicator 1507, the user can immediately find a drawing that includes the reference number description of interest.
- the web server may automatically select a drawing corresponding to a paragraph/sentence part currently being read/searched by the user from among the patent specification categories, and perform an operation to enlarge and output the drawing.
- the web server can analyze the contents of the patent specification category (especially the detailed description category of the invention), divide them into areas (e.g., paragraphs, sentences, pages) according to which drawing each part describes, and automatically select, enlarge, and output the drawing corresponding to the area that currently occupies the largest proportion of the web page screen.
- the web server may automatically select FIG. 1 to enlarge and output the image within the drawing category.
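Choosing the drawing for the area that occupies most of the screen reduces to an interval-overlap comparison. In this sketch the per-drawing regions are assumed to be already known as vertical page ranges; the function name and the tuple layout are illustrative.

```python
def visible_figure(regions, view_top, view_bottom):
    """regions: list of (figure_id, top, bottom) in page coordinates.

    Returns the figure whose description overlaps the viewport the most,
    or None if no region is visible.
    """
    best_fig, best_overlap = None, 0.0
    for fig, top, bottom in regions:
        overlap = min(bottom, view_bottom) - max(top, view_top)
        if overlap > best_overlap:
            best_fig, best_overlap = fig, overlap
    return best_fig

# Hypothetical layout: FIG. 1 is described from 0-1000 px, FIG. 2 from 1000-2500 px.
layout = [("FIG. 1", 0, 1000), ("FIG. 2", 1000, 2500)]
```

With this layout, a viewport covering 300 to 1100 px selects FIG. 1, and one covering 900 to 1700 px selects FIG. 2.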
- the web server may automatically select the drawing corresponding to the selected specific area from the drawing category and output it enlarged. Furthermore, the web server may perform an operation of automatically replacing the reference numerals included in the selected specific area with reference number descriptions.
- the web server can build a patent drawing search database by storing at least one piece of information acquired/recognized through the above-described operations/methods/embodiments in a database (that is, by accumulating it as data and storing it separately in the database).
- the web server may group a patent document, a patent drawing, the size of the patent drawing, the reference numbers included in the patent drawing, the reference number description corresponding to each reference number, and/or the relative position coordinates of each reference number into one data record and store it in the database, thereby building the patent drawing search database.
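The grouped record can be pictured as one table row per reference number. The sketch below uses an in-memory SQLite database; the table and column names, the helper functions, and the sample rows are all hypothetical and only mirror the fields listed above.

```python
import sqlite3

def build_database():
    """Create an in-memory table with one row per recognized reference number."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE reference (
               document TEXT, drawing TEXT, width INTEGER, height INTEGER,
               number TEXT, description TEXT, rel_x REAL, rel_y REAL)"""
    )
    return db

def add_reference(db, row):
    db.execute("INSERT INTO reference VALUES (?, ?, ?, ?, ?, ?, ?, ?)", row)

def find_drawings(db, keyword):
    """Drawings whose reference number descriptions contain the keyword."""
    cur = db.execute(
        "SELECT DISTINCT document, drawing FROM reference "
        "WHERE description LIKE ? ORDER BY document, drawing",
        (f"%{keyword}%",),
    )
    return cur.fetchall()

db = build_database()
# Made-up sample rows for illustration only.
add_reference(db, ("DOC-1", "FIG. 1", 800, 600, "16", "bolt", 0.25, 0.5))
add_reference(db, ("DOC-1", "FIG. 2", 800, 600, "17", "nut", 0.6, 0.4))
hits = find_drawings(db, "bolt")
```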
- the user can easily obtain information about a patent drawing to be searched among patent drawings worldwide by inputting a search word related to a patent document through the patent drawing search database constructed in this way.
- the web server can improve recognition accuracy/speed by updating various models/rules described above by learning various patent drawing data in real time/periodically through the patent drawing search database constructed in this way.
- FIG. 16 is a diagram illustrating a patent information retrieval system according to an embodiment of the present invention.
- the patent information retrieval system proposed in the present specification may include a web server and a user device.
- the web server 1601 and the user device 1602 are mainly interconnected through an Internet connection to perform communication, and may provide/receive a patent information search service through a web service/page.
- the web server 1601 may correspond to a server/device including at least one software and hardware component designed to perform the embodiments proposed herein.
- the web server 1601 may provide the patent information retrieval service proposed in the present specification to a user device, which is a client device, through an Internet web page.
- the user device 1602 may correspond to a client device that receives a patent information search service provided through a web server.
- the user device 1602 may receive a patent information search service provided by a web server through an Internet web page.
- the patent information retrieval system consists of the web server 1601 and the user device 1602 is exemplified, but the present invention is not limited thereto. can be described by replacing
- Although the execution subject of the embodiments is described as the web server 1601 in the present specification, it is not limited thereto; the web server 1601 may be replaced with a program or application designed to implement the above-described embodiments, and the operations of the web server 1601 can be interpreted as the functions of that program or application.
- FIG. 17 is a block diagram of a web server according to an embodiment of the present invention.
- the web server may include a processor 1710 , a memory unit 1720 , and a communication unit 1730 .
- the processor 1710 may communicate with or control the other components in order to perform the embodiments proposed in this specification, execute various programs and/or applications stored in the memory unit 1720, and process internal data.
- the processor 1710 may be configured to include at least one of a central processing unit (CPU), a micro processor unit (MPU), a micro controller unit (MCU), an application processor (AP), or any other form of processor well known in the art. Accordingly, in this specification, the web server may be described as being replaced with a processor.
- the memory unit 1720 refers not only to an embeddable digital data storage space such as a flash memory, a hard disk drive (HDD), or a solid state drive (SSD), but also to an external storage space that can store data through a communication connection, such as a cloud. Accordingly, the memory unit 1720 may store various digital data such as video, audio, photos, moving pictures, images, text, applications, and programs.
- the memory unit 1720 proposed in this specification can store data for various knowledge information content (particularly, patent documents), and can store a patent document search database and/or a patent drawing search database 1720-1 built by the processor 1710.
- the processor 1710 may load various data from a patent document and/or a patent drawing search database stored in the memory unit 1720 to perform data processing/output operations, and the like.
- the communication unit 1730 may transmit/receive data by performing communication using at least one wired/wireless communication protocol.
- Embodiments according to the present invention may be implemented by various means, for example, hardware, firmware, software, or a combination thereof.
- an embodiment of the present invention may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, and the like.
- an embodiment of the present invention is implemented in the form of a module, procedure, function, etc. that performs the functions or operations described above, and is stored in a recording medium readable through various computer means.
- the recording medium may include a program command, a data file, a data structure, etc. alone or in combination.
- the program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.
- the recording medium includes magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as a compact disk read only memory (CD-ROM) and a digital video disk (DVD); magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions may include not only machine language code such as that generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
- Such hardware devices may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.
- the device or terminal according to the present invention may be driven by a command that causes one or more processors to perform the functions and processes described above.
- such instructions may include interpreted instructions such as script instructions (for example, JavaScript or ECMAScript instructions), executable code, or other instructions stored on a computer-readable medium.
- the device according to the present invention may be implemented in a distributed manner over a network, such as a server farm, or may be implemented in a single computer device.
- a computer program (also known as a program, software, software application, script, or code) that is installed on the device according to the present invention and executes the method according to the present invention can be written in any form of programming language, including compiled or interpreted languages and declarative or procedural languages, and can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program does not necessarily correspond to a file in a file system.
- a program may be stored in a single file dedicated to the program in question, in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code), or in a portion of a file that holds other programs or data (e.g., one or more scripts stored within a markup language document).
- the computer program may be deployed to be executed on a single computer or multiple computers located at one site or distributed over a plurality of sites and interconnected by a communication network.
- the present invention can be applied to various patent search systems/devices/methods.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Library & Information Science (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
Claims (20)
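- A method for recognizing patent drawing reference numerals, the method comprising: building a reference numeral position recognition model and a reference numeral recognition model by learning a plurality of patent drawing samples; receiving a patent drawing that is a target of reference numeral recognition; recognizing the positions of reference numerals included in the patent drawing using the reference numeral position recognition model; cropping the reference numerals at the recognized positions from the patent drawing as image fragments; and recognizing the reference numerals included in the image fragments using the reference numeral recognition model.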
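- The method of claim 1, wherein building the reference numeral position recognition model comprises: recognizing the positions of reference numerals included in the plurality of patent drawing samples using a fully convolutional network (FCN); extracting common features from the recognized positions of the reference numerals; and building the reference numeral position recognition model based on the extracted common features.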
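- The method of claim 2, wherein building the reference numeral recognition model comprises: recognizing the positions of reference numerals included in the plurality of patent drawing samples using the reference numeral position recognition model; cropping the reference numerals at the recognized positions from the plurality of patent drawing samples as image fragments; recognizing the reference numeral included in each of the cropped image fragments using a convolutional recurrent neural network (C-RNN); extracting common features from the recognized reference numerals; and building the reference numeral recognition model based on the extracted common features.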
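- The method of claim 3, wherein recognizing the reference numerals comprises: generating a single image by collecting image fragments in units of a preset number; and recognizing the plurality of reference numerals included in the single image using the C-RNN.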
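- The method of claim 1, being a method for outputting reference numeral descriptions for patent drawing reference numerals, comprising: extracting, from a patent specification corresponding to the patent drawing, a reference numeral description corresponding to the recognized reference numeral; obtaining relative position coordinates of the reference numeral within the patent drawing by recognizing the size of the patent drawing and the position of the reference numeral included in the patent drawing; generating an image of the same size as the patent drawing; overlapping the image on the patent drawing and then fixing it to the patent drawing; assigning, to the reference numeral description, relative position coordinates of a position corresponding to the obtained relative position coordinates; displaying the reference numeral description on the image at the relative position coordinates assigned to the reference numeral description; and outputting the image on which the reference numeral description is displayed.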
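- The method of claim 5, wherein extracting the reference numeral description comprises: establishing a reference numeral description extraction rule using a plurality of patent specification samples, based on text mining technology; and extracting the reference numeral description from the patent specification based on the established reference numeral description extraction rule.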
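- The method of claim 6, wherein establishing the reference numeral description extraction rule comprises: classifying the plurality of patent specification samples by country of filing; extracting common features for each classified country of filing; and establishing the reference numeral description extraction rule based on the extracted features.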
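- The method of claim 7, wherein extracting the common features comprises extracting the common features based on at least one of: the relative position of the reference numeral description with respect to the reference numeral, the formatting applied to the reference numeral description, and the filing year.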
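- The method of claim 8, wherein, when the reference numeral description extraction rule is established based on the relative position of the reference numeral description with respect to the reference numeral, extracting the reference numeral description comprises: searching the patent specification for the recognized reference numeral; predicting, according to the reference numeral description extraction rule, the relative position of the reference numeral description with respect to the found reference numeral; and extracting the characters at the predicted position as the reference numeral description.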
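- The method of claim 9, further comprising: when a plurality of reference numeral descriptions are extracted, searching the patent specification for the plurality of extracted reference numeral descriptions; and determining the most frequently found reference numeral description as the final reference numeral description.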
- The method of claim 6, further comprising supplementing the extracted reference numeral description in order to increase the extraction accuracy of the reference numeral description.
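- The method of claim 11, further comprising: searching the patent specification for the reference numeral descriptions extracted using the plurality of patent specification samples based on the established reference numeral description extraction rule; classifying reference numeral descriptions that are not found in the patent specification as erroneous reference numeral descriptions; establishing an error extraction rule by extracting common features from the classified reference numeral descriptions; and determining, based on the established error extraction rule, whether the reference numeral description extracted from the patent specification contains an error.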
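- The method of claim 12, wherein extracting the common features comprises extracting the common features based on at least one of: whether a digit was extracted as a letter or a letter was extracted as a digit, whether the reference numeral description contains a preset part of speech, and whether the reference numeral description contains a symbol.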
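- The method of claim 12, wherein, when the extracted reference numeral description is determined to contain an error, supplementing the extracted reference numeral description comprises deleting the error from the extracted reference numeral description or replacing it with another character.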
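- The method of claim 12, further comprising building a reference numeral description extraction model by learning the established reference numeral description extraction rule and the established error extraction rule.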
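- The method of claim 1, comprising: searching a patent specification corresponding to the patent drawing for the recognized reference numeral; when the recognized reference numeral is found in the patent specification, determining the recognized reference numeral as the final reference numeral; and when the recognized reference numeral is not found in the patent specification, determining, as the final reference numeral, a character in the patent specification whose shape similarity to the recognized reference numeral is at or above a preset ratio.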
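- The method of claim 1, further comprising: recognizing reference numerals in a patent specification corresponding to the patent drawing; determining whether any of the reference numerals recognized from the patent drawing matches a reference numeral recognized in the patent specification; when there is a matching reference numeral, determining the reference numeral recognized in the patent specification as the final reference numeral; and when there is no matching reference numeral, determining, as the final reference numeral, a reference numeral among those recognized from the patent drawing whose shape similarity to the reference numeral recognized in the patent specification is at or above a preset ratio.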
- The method of claim 5, further comprising building a patent drawing search database by matching the recognized reference numeral with the recognized reference numeral description and storing them in a database.
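- A web server for recognizing patent drawing reference numerals, the web server comprising: a communication unit that performs communication using at least one communication protocol; a memory unit that stores data; and a processor that controls the communication unit and the memory unit, wherein the processor: builds a reference numeral position recognition model and a reference numeral recognition model by learning a plurality of patent drawing samples; receives a selection of a patent drawing that is a target of reference numeral recognition; recognizes the positions of reference numerals included in the patent drawing using the reference numeral position recognition model; crops the reference numerals at the recognized positions from the patent drawing as image fragments; and recognizes the reference numerals included in the image fragments using the reference numeral recognition model.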
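- A method for recognizing patent drawing reference numerals and outputting reference numeral descriptions, the method comprising: receiving a patent drawing; recognizing the positions of reference numerals included in the patent drawing; recognizing the reference numerals at the recognized reference numeral positions; obtaining relative position coordinates of the reference numerals within the patent drawing; generating an image of the same size as the patent drawing; overlapping the image on the patent drawing and then fixing it to the patent drawing; assigning the coordinates of the position corresponding to the obtained relative position coordinates to the reference numeral description corresponding to the reference numeral, as relative position coordinates with respect to the image; displaying the reference numeral description on the image at the relative position coordinates assigned to the reference numeral description; and outputting the image on which the reference numeral description is displayed.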
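Three of the steps recited in the claims above are simple enough to sketch in code: converting a recognized numeral's position into relative coordinates for the overlay image, choosing the most frequently found description when several are extracted, and falling back to a similar numeral when the recognized one does not appear in the specification. The following is a minimal Python sketch under stated assumptions, not the applicant's implementation. All function names, the bounding-box format, and the 0.6 threshold are hypothetical. The claims call for shape similarity between characters; plain string similarity from the standard library's `difflib` is swapped in here only as a stand-in.

```python
from collections import Counter
from difflib import SequenceMatcher


def relative_position(box, drawing_size):
    """Return the centre of a pixel box (left, top, right, bottom) as
    fractions of the drawing's width and height (hypothetical format)."""
    left, top, right, bottom = box
    width, height = drawing_size
    return ((left + right) / 2 / width, (top + bottom) / 2 / height)


def overlay_position(relative, overlay_size):
    """Map a relative coordinate back to pixel coordinates on an overlay image."""
    width, height = overlay_size
    return (round(relative[0] * width), round(relative[1] * height))


def pick_description(candidates, specification):
    """Keep the candidate description that occurs most often in the
    specification text. Assumes at least one candidate is given."""
    counts = Counter({c: specification.count(c) for c in candidates})
    return counts.most_common(1)[0][0]


def resolve_numeral(recognized, specification_numerals, threshold=0.6):
    """Accept an exact match; otherwise return the most similar numeral
    whose similarity is at or above the threshold, or None.

    Assumption: string similarity stands in for the shape similarity
    described in the claims.
    """
    if recognized in specification_numerals:
        return recognized
    scored = [
        (SequenceMatcher(None, recognized, n).ratio(), n)
        for n in specification_numerals
    ]
    best_score, best = max(scored, default=(0.0, None))
    return best if best_score >= threshold else None
```

Because the positions are stored as fractions rather than pixels, a description placed with `overlay_position` stays aligned with its numeral even if the drawing and its overlay are later displayed at a different size.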
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/918,670 US20230351791A1 (en) | 2020-04-14 | 2021-04-14 | Method, device, and system for outputting description of patent reference sign |
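JP2022562581A JP2023523575A (ja) | 2020-04-14 | 2021-04-14 | Method for outputting description of patent drawing reference signs, and device and system therefor |
CN202180028853.4A CN115427944A (zh) | 2020-04-14 | 2021-04-14 | Method for outputting description of patent drawing reference signs, and device and system therefor |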
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2020-0045054 | 2020-04-14 | ||
KR20200045054 | 2020-04-14 | ||
KR10-2020-0045051 | 2020-04-14 | ||
KR20200045051 | 2020-04-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021210912A1 (ko) | 2021-10-21 |
Family
ID=78084800
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2021/004706 WO2021210912A1 (ko) | 2020-04-14 | 2021-04-14 | Method, device, and system for outputting description of patent reference sign |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230351791A1 (ko) |
JP (1) | JP2023523575A (ko) |
KR (2) | KR102601980B1 (ko) |
CN (1) | CN115427944A (ko) |
WO (1) | WO2021210912A1 (ko) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150309969A1 (en) * | 2008-02-15 | 2015-10-29 | Edyt Inc. | Methods and Apparatus for Improved Navigation Among Controlled Terms in One or More User Documents |
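KR20160125931A (ko) * | 2016-10-11 | 2016-11-01 | 이현엽 | System and method for providing a service that improves the drawing readability of patent documents |
KR20180106517A (ko) * | 2017-03-20 | 2018-10-01 | (주)광개토연구소 | Method and apparatus for mapping descriptions of reference numerals onto patent drawing images containing reference numerals, using machine learning based on artificial intelligence technology |
KR20200013130A (ko) * | 2018-07-12 | 2020-02-06 | (주)광개토연구소 | Method and apparatus for processing description data of reference numerals corresponding to the reference numerals of patent drawing images, using machine learning based on artificial intelligence technology |
KR20200038006A (ko) * | 2018-10-02 | 2020-04-10 | 경북대학교 산학협력단 | Digital drawing providing method and digital drawing providing system |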
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4183311B2 (ja) * | 1997-12-22 | 2008-11-19 | 株式会社リコー | Document annotation method, annotation device, and recording medium |
US8041739B2 (en) * | 2001-08-31 | 2011-10-18 | Jinan Glasgow | Automated system and method for patent drafting and technology assessment |
US7417645B2 (en) * | 2003-03-27 | 2008-08-26 | Microsoft Corporation | Markup language and object model for vector graphics |
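JP2008181174A (ja) * | 2007-01-23 | 2008-08-07 | Silent Technology Co Ltd | Method for preparing drawing manuscripts for patent applications or utility model registration applications |
KR20140046333A (ko) * | 2012-10-10 | 2014-04-18 | 삼성테크윈 주식회사 | Apparatus and method for providing digital drawings |
KR20180107707A (ko) * | 2017-03-22 | 2018-10-02 | (주)광개토연구소 | Method and apparatus for mapping so that descriptions of reference numerals are displayed on patent drawing images, using machine learning based on artificial intelligence technology |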
2021
- 2021-04-14 JP JP2022562581A patent/JP2023523575A/ja active Pending
- 2021-04-14 US US17/918,670 patent/US20230351791A1/en active Pending
- 2021-04-14 CN CN202180028853.4A patent/CN115427944A/zh active Pending
- 2021-04-14 KR KR1020210048365A patent/KR102601980B1/ko active IP Right Grant
- 2021-04-14 WO PCT/KR2021/004706 patent/WO2021210912A1/ko active Application Filing
2023
- 2023-11-08 KR KR1020230153547A patent/KR20230161381A/ko active Application Filing
Also Published As
Publication number | Publication date |
---|---|
US20230351791A1 (en) | 2023-11-02 |
KR20210127637A (ko) | 2021-10-22 |
KR20230161381A (ko) | 2023-11-27 |
KR102601980B1 (ko) | 2023-11-14 |
CN115427944A (zh) | 2022-12-02 |
JP2023523575A (ja) | 2023-06-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018062580A1 (ko) | | Method for translating characters and apparatus therefor |
WO2015030461A1 (en) | | User device and method for creating handwriting content |
JPH1055371A (ja) | | Document search and retrieval system |
CN103970475A (zh) | | Dictionary information display device, method, and system, and server device and terminal device |
WO2010137814A2 (en) | | Method of providing by-viewpoint patent map and system thereof |
WO2014035199A1 (en) | | User interface apparatus in a user terminal and method for supporting the same |
CN111859856A (zh) | | Information display method and apparatus, electronic device, and storage medium |
WO2017026655A1 (ko) | | User terminal device and control method thereof |
WO2015037815A1 (ko) | | Semantic search system and search method in a smart device |
WO2019146951A1 (en) | | Electronic apparatus and control method thereof |
WO2021210912A1 (ko) | | Method, device, and system for outputting description of patent reference sign |
US9690393B2 (en) | | Information processing device, program, recording medium, and information processing system |
WO2012165847A2 (ko) | | User annotation processing apparatus, and e-book service system and method therefor |
EP3039512A1 (en) | | User device and method for creating handwriting content |
WO2016072772A1 (ko) | | Data visualization method and system using a reference semantic map |
JP5672357B2 (ja) | | Electronic device and program |
WO2022131723A1 (ko) | | Method for providing drawing reading and search functions, and apparatus and system therefor |
CN107590140B (zh) | | Method for processing entries omitted from a document translation |
JP4491389B2 (ja) | | Electronic device, program, and recording medium storing the program |
AU2018100324B4 (en) | | Image Analysis |
JP2008225676A (ja) | | Dictionary search device and control program therefor |
WO2016200194A1 (ko) | | Method and device for providing question content |
WO2022191427A1 (ko) | | System for displaying drawings of patent documents using margins |
WO2019117567A1 (en) | | Method and apparatus for managing navigation of web content |
JP5515571B2 (ja) | | Electronic device and program |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21789385; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2022562581; Country of ref document: JP; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 06/12/2022) |
122 | Ep: pct application non-entry in european phase | Ref document number: 21789385; Country of ref document: EP; Kind code of ref document: A1 |