WO2015136692A1 - Electronic image document editing system (Système d'édition de document image numérique)


Info

Publication number
WO2015136692A1
WO2015136692A1
Authority
WO
WIPO (PCT)
Prior art keywords
character string
electronic image
image document
recognized
editing system
Prior art date
Application number
PCT/JP2014/056927
Other languages
English (en)
Japanese (ja)
Inventor
久雄 間瀬
義行 小林
新庄 広
竜治 嶺
高橋 寿一
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2014/056927 (WO2015136692A1)
Priority to JP2016507228A (JPWO2015136692A1)
Publication of WO2015136692A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/40 Processing or translation of natural language
    • G06F40/53 Processing of non-Latin text
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/42 Document-oriented image-based pattern recognition based on the type of document
    • G06V30/422 Technical drawings; Geographical maps
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet, of Kanji, Hiragana or Katakana characters

Definitions

  • the present invention relates to an electronic image document editing system.
  • Editing a document includes creating a new document, updating an existing document (addition, correction, deletion, etc.), proofreading character information in the document, and translating the character information.
  • design documents of past products often remain only in the form of electronic image documents.
  • To edit such documents, character strings must first be recognized from the design document, which is an electronic image document, and the recognized character strings must then be edited.
  • In Patent Document 1, when recognition processing is set, original image data read by the scanner unit 1 is input to the recognition processing unit 102 via the image processing unit 3, and character recognition is performed.
  • A predetermined number of lines' worth of the ratio of translated words to the total number of recognized words (the hit count) is stored, and when the hit count of the target line and the preceding and following lines is equal to or less than a predetermined number, or when the character code of the recognized word matches a certain pattern, the line in question is not drawn.
  • In order for a system that edits an electronic image document (hereinafter referred to as an electronic image document editing system) to edit character information in the electronic image document, it must first perform character recognition processing. That is, the electronic image document editing system needs to specify the regions in which character information is described in the electronic image document and to specify the character content described in those regions.
  • For efficient editing, the electronic image document editing system needs to accurately specify the description locations and the amount of character information in the electronic image document.
  • In practice, however, the character recognition process causes erroneous recognition due to factors such as the quality of the scanned paper, the resolution of the electronic image, and the font type and size of the written characters.
  • The technique of Patent Document 1 applies predetermined rules to a character string recognized by character recognition processing and to its translation result (translated word search result), and does not output the recognized character string and its translation result when a rule matches.
  • Patent Document 1 mentions the following two rules.
  • The first rule is: the ratio of translated words to the total number of recognized words in a line (the hit count) is stored for a predetermined number of lines, and when the hit count of the target line and the preceding and following lines is less than a predetermined number, drawing of the target line is stopped.
  • The second rule is: it is determined whether or not the character code of each recognized word matches a certain pattern, such as being a non-character code or the same code repeated a predetermined number of times, and when the ratio of pattern matches to the total number of recognized words in a line is equal to or greater than a predetermined value, drawing of the target line is stopped.
  • A recognized noise character string is, for example, a character string composed of one or more characters such as kanji, hiragana, katakana, numbers, alphabetic characters, and symbols that results from erroneous recognition of non-character information.
  • With such rules alone, noise character strings cannot be identified with high accuracy, and as a result many noise character strings remain as drawing targets.
  • The technique described in Patent Document 1 uses the translation result (translated word search result) to identify whether or not a recognized character string is a drawing target character string. That is, when the first rule is used, translation processing must be performed even on character strings for which no translated word will be output, which increases the processing load. For example, when the first rule is applied to an electronic image document editing system used for editing work other than translation, a translation function that is unnecessary for the original editing work must be installed in the system, which increases the cost burden.
  • an object of the present invention is to identify and remove a noise character string with high accuracy from a recognized character string in an electronic image document. Another object of the present invention is to specify a noise character string from a recognized character string without performing a character string editing process.
  • An electronic image document editing system for editing a character string recognized from an electronic image document, the system including a processor and a storage device, wherein the storage device stores one or more character string determination criteria for determining whether or not a character string made up of one or more characters is a character string to be edited; the processor accepts an input of an electronic image document, recognizes character strings made up of one or more characters of a plurality of character types from the input electronic image document, and determines that a recognized character string is an edit target character string when the recognized character string satisfies the character string determination criteria; and the character string determination criteria include at least one of a first determination criterion that the recognized character string is composed of a first threshold (an integer of 2 or more) or more characters, a second determination criterion that the recognized character string includes a partial character string composed of a second threshold (an integer of 2 or more) or more consecutive characters of a first type group that is a part of the plurality of types, a third determination criterion that the recognized character string includes a character of a second type group that is a part of the plurality of types, and a fourth determination criterion that the recognized character string includes a content word.
  • a noise character string can be specified with high accuracy from a character string recognized by character recognition processing from an electronic image document.
  • a noise character string can be specified from a recognized character string with high accuracy without performing editing processing on the recognized character string.
  • FIG. 1 shows an example of the system configuration of an electronic image document editing system.
  • FIG. 2 shows an example of the hardware configuration of an electronic image document editing system.
  • FIG. 3A shows an example of input electronic image document data.
  • FIG. 3B shows an example of the electronic image document data after translation processing.
  • FIG. 4A shows an example of the electronic image document data before character recognition processing.
  • FIG. 4B shows an example of the electronic image document data after character recognition processing.
  • FIG. 5A shows a first example of a character string determination criterion.
  • FIG. 5B shows a second example of a character string determination criterion.
  • FIG. 5C shows a third example of a character string determination criterion.
  • FIG. 6 shows an example of a character string information table.
  • FIG. 7A shows an example of a flowchart of the character string determination process by the character string determination unit when the value of Judge is the number of matching items.
  • FIG. 7B shows an example of a flowchart of the character string determination process by the character string determination unit when the value of Judge is the weight sum of matching items.
  • FIG. 8 shows an example of a list output screen of character strings determined to be translation target character strings.
  • FIG. 9 shows an example of a screen for changing a character string determination criterion.
  • An example of an output screen after the translation process is re-executed following a change of the character string determination criterion is also shown.
  • the electronic image document editing system accepts an input of an electronic image document of a design drawing and performs an editing process on the input electronic image document.
  • the electronic image document editing system supports the work of translating a Japanese character string described in an electronic image document into an English character string as an example of editing processing.
  • the character string in this embodiment is composed of one or more characters.
  • the electronic image document editing system recognizes characters in an electronic image document such as a design drawing and extracts candidates for locations where character strings are described.
  • the electronic image document editing system identifies a part where a character string is actually described from among candidates for a part where the character string is described by a character string determination process described later.
  • the electronic image document editing system performs translation processing on the Japanese character string among the specified character strings, and presents translation candidates for the Japanese character string.
  • the electronic image document editing system generates a translation object for the translation selected by the user, corrects the layout of the translation object, and pastes it at an appropriate position on the document.
  • the design drawing is used as an example of the input electronic image document.
  • an electronic image diagram included in a text or a paper that is converted into an electronic image may be used as the input electronic image document.
  • The electronic image document editing system in this embodiment is mainly described in terms of the work of recognizing Japanese character strings and translating the recognized Japanese character strings into English character strings, but the languages handled are not particularly restricted.
  • the work of translating a document is described, but the present invention can also be applied to other editing work such as document update and document proofreading.
  • FIG. 1 shows a configuration example of the electronic image document editing system of this embodiment.
  • The electronic image document editing system includes an input processing unit 1, an output processing unit 2, a character recognition processing unit 4, a character string determination unit 7, a translation processing unit 10, a translated word object generating unit 13, a translated word object editing unit 14, and a character string information management unit 16. Each of these units is a program.
  • The electronic image document editing system also includes a translation target image document 3, a character recognition dictionary 5, a translation target image document 6 with character recognition results, a character string determination criterion 8, a character string information table 9, a translation dictionary 11, a translated word candidate table 12, an image document 15 with character recognition results and translation results, and a word/character dictionary 17.
  • The input processing unit 1 accepts various data and operations designated or instructed by the user via input means such as a keyboard, a mouse, a touch panel, or a touch pen. As examples of specific data or operation instructions, the input processing unit 1 accepts selection of an electronic image document to be translated, an instruction to perform character recognition, changes to the contents of the character string determination criteria, designation of a character string to be translated, selection and input of translated words, editing of translated word objects, and so on.
  • the output processing unit 2 outputs various data and processing results to the user via output means such as a display.
  • As examples of specific data or processing results, the output processing unit 2 outputs the image document to be translated, the translation target image document with character recognition results, the character string determination criteria, a list of character string information to be translated, translated word candidates, and the image document with character recognition results and translation results.
  • When using the electronic image document editing system of this embodiment, the user first selects an electronic image document to be translated from among the electronic image documents input to the electronic image document editing system.
  • the content of the selected electronic image document is displayed to the user via a display or the like and is stored in the translation target image document 3.
  • the character recognition processing unit 4 extracts electronic image document data from the translation target image document 3 and refers to a character recognition dictionary 5 that stores data relating to individual characters, rules for character recognition, and the like. Perform character recognition.
  • the character recognition process includes a character string area specifying process, a character cut-out process from the character string area, and a cut-out character recognition process. Since many character recognition algorithms used for character recognition processing are already widely known, description of character recognition processing is omitted. Note that the character recognition processing unit 4 may perform the character recognition process using any character recognition algorithm.
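  • As an illustration only, the three stages above can be chained as in the following minimal Python sketch; the helper functions find_text_regions, segment_characters, and classify_character are hypothetical stubs standing in for whatever character recognition algorithm is used, and are not part of the embodiment.
```python
# Hypothetical skeleton of the three-stage character recognition process
# described above: (1) specify character string regions, (2) cut out
# individual characters, (3) recognize each cut-out character.
# The helper functions are stubs; any OCR algorithm may be substituted.

from typing import List, Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom) in image coordinates


def find_text_regions(image) -> List[BBox]:
    """Stage 1: return candidate rectangles that may contain character strings."""
    return []  # stub


def segment_characters(image, region: BBox) -> List[BBox]:
    """Stage 2: cut the region into per-character boxes."""
    return []  # stub


def classify_character(image, char_box: BBox) -> str:
    """Stage 3: recognize one cut-out character using a character recognition dictionary."""
    return "?"  # stub


def recognize_document(image) -> List[Tuple[str, BBox]]:
    """Return (recognized character string, description location) pairs,
    mirroring what is stored in the character string information table."""
    results = []
    for region in find_text_regions(image):
        text = "".join(classify_character(image, box)
                       for box in segment_characters(image, region))
        results.append((text, region))
    return results
```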
  • The character string recognized by the character recognition processing unit 4 is stored in the character string information table 9 together with its description location (coordinate position in the document image). The recognized character string is also stored in the translation target image document 6 with character recognition results, in a form embedded in the document description portion of the translation target image document 3.
  • the character string determination unit 7 analyzes the character string recognized by the character recognition processing unit 4 and determines whether or not the recognized character string is a character string to be translated.
  • the character string determination unit 7 analyzes the character string with reference to a word / character dictionary 17 in which a list of characters and attributes, and a headline and attributes of words are stored.
  • the character string determination unit 7 refers to a character string determination criterion item stored in the character string determination criterion 8 and determines whether the recognized character string is a character string to be translated. Details of the processing by the character string determination unit 7 and the character string determination reference 8 will be described later.
  • the determination result by the character string determination unit 7 is stored in the character string information table 9.
  • the user looks at the displayed translation target image document 6 with the character recognition result, designates a description portion corresponding to the character string via a mouse, a touch pen, etc., and instructs execution of translation.
  • Any method of designating the description location may be used, for example clicking the description location, dragging over it, or selecting a rectangle of the range that includes it.
  • the translation processing unit 10 extracts a character string corresponding to the description location (coordinates) designated by the user from the character string information table 9.
  • the translation processing unit 10 refers to the translation dictionary 11, extracts translation word candidates corresponding to the character string, and presents them to the user.
  • The translation processing unit 10 searches for translated words by matching the character string against the translation dictionary 11.
  • Alternatively, the character string may be divided into words by morphological analysis, and a translated word for each word may be retrieved from the translation dictionary 11 and presented. Further, the translation processing unit 10 may pass the character string to a machine translation system and present the translation result produced by that system.
  • the electronic image document editing system of the present embodiment may use any translation dictionary search algorithm and machine translation algorithm when performing translation processing.
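  • A minimal sketch of the two lookup strategies described above is given below; TRANSLATION_DICT is a toy stand-in for the translation dictionary 11 and split_into_words is a placeholder for morphological analysis, both of which are assumptions made for illustration.
```python
# Minimal sketch of the translated word lookup described above.
# TRANSLATION_DICT is a toy stand-in for the translation dictionary 11;
# split_into_words() stands in for morphological analysis.

TRANSLATION_DICT = {
    "抵抗": "resistor",
    "豆電球": "miniature bulb",
    "乾電池": "dry cell",
}


def split_into_words(text: str) -> list[str]:
    # Placeholder for morphological analysis; a real system would use
    # a Japanese morphological analyzer here.
    return [text]


def translation_candidates(text: str) -> list[str]:
    # 1) try to match the whole character string against the dictionary
    if text in TRANSLATION_DICT:
        return [TRANSLATION_DICT[text]]
    # 2) otherwise split into words and translate word by word
    words = split_into_words(text)
    per_word = [TRANSLATION_DICT.get(w, w) for w in words]
    return [" ".join(per_word)]


print(translation_candidates("抵抗"))  # ['resistor']
```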
  • the translation result is stored in the translated word candidate table 12.
  • the translation candidate table 12 temporarily stores the correspondence between Japanese character strings and translation candidates.
  • the translation object generation unit 13 transmits the translation word candidates stored in the translation word candidate table 12 to the output processing unit 2, and the output processing unit 2 presents the received translation word candidates to the user.
  • the user selects a correct translation from the presented translation candidates. If there is no correct translation in the presented translation candidates, the user inputs the correct translation directly from the keyboard or the like. If there is an error in the recognized character string, the user corrects the recognized character string and instructs re-execution of translation. The user selects a correct translation from the translation candidates presented again.
  • When the translated word is confirmed by the user inputting or selecting a correct translation, the translated word object generating unit 13 generates a translated word object consisting of the translated character string and displays it on the translation target image document 3. Further, the character string information management unit 16 stores the corrected character string and the confirmed translation result in the character string information table 9.
  • The translated word object editing unit 14 adjusts the object size of the displayed translated word object, the font size of its text, and the like, and performs editing processing that prompts the user to move the object to an appropriate position on the document and paste it there.
  • the translated object editing unit 14 may automatically adjust the object size of the translated object, the font size of the text, and the like according to the character string length before and after translation.
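  • The embodiment does not specify the adjustment rule; one possible heuristic, sketched below under the assumption of a fixed average glyph-width ratio, is to shrink the font size until the translated text fits the width of the original rectangle.
```python
# Hypothetical heuristic for adjusting the font size of a translated word
# object so that the translation fits the rectangle of the original string.
# AVG_CHAR_WIDTH_RATIO (average glyph width as a fraction of the font size)
# is an assumed constant, not a value given in the embodiment.

AVG_CHAR_WIDTH_RATIO = 0.6


def fit_font_size(translated_text: str, box_width_px: int,
                  max_font_px: int = 20, min_font_px: int = 6) -> int:
    """Return the largest font size (px) at which the text fits the box width."""
    for size in range(max_font_px, min_font_px - 1, -1):
        estimated_width = len(translated_text) * size * AVG_CHAR_WIDTH_RATIO
        if estimated_width <= box_width_px:
            return size
    return min_font_px


# e.g. fitting "Resistor 100Ω" into a 140 px wide rectangle
print(fit_font_size("Resistor 100Ω", 140))
```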
  • the electronic image document data at that time is stored in the image document 15 with character recognition result / translation result.
  • the character string information management unit 16 manages the character string to be translated and the translation processing status of the character string. Specifically, the character string information management unit 16 analyzes the character string information table 9, calculates the number of character strings to be translated and the number of characters in the electronic image document, and holds them. In addition, the character string information management unit 16 manages the editing work status such as whether or not each translation target character string has been translated in cooperation with the translated word object generating unit 13 and the translated word object editing unit 14.
  • When the character string information management unit 16 receives information from the translated word object editing unit 14 that a translated word object has been pasted at predetermined coordinates, it regards the translation of the translation target character string corresponding to those coordinates as completed. At this time, the character string information management unit 16 stores 1 in the translation work completion flag (described later) of the character string information table 9.
  • the electronic image document editing system can manage to what extent the translation work is completed at a certain point of time by the translation work management by the character string information management unit 16 and can present the translation work status to the user.
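  • For example, the translation work status could be derived from the character string information table roughly as follows; the row layout below is illustrative and simply mirrors the translation target flag and translation status flag described later.
```python
# Sketch of the translation work management performed by the character
# string information management unit 16: counting translation targets and
# completed translations from rows of the character string information table.
# The dictionaries below are illustrative rows, not the actual table format.

rows = [
    {"recognized": "抵抗100Ω", "translation_target": 1, "translation_done": 1},
    {"recognized": "乾電池6V", "translation_target": 1, "translation_done": 0},
    {"recognized": "teW", "translation_target": 0, "translation_done": None},
]

targets = [r for r in rows if r["translation_target"] == 1]
done = [r for r in targets if r["translation_done"] == 1]

print(f"translation targets: {len(targets)}, completed: {len(done)}")
print(f"progress: {100 * len(done) // len(targets)}%")
```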
  • FIG. 2 shows a hardware configuration example of the electronic image document editing system of the present embodiment.
  • the electronic image document editing system includes a processing device 50, an input device 30, an output device 40, and a storage device 60, and is connected to a network 90.
  • the processing device 50 includes a processor and / or a logic circuit that operates according to a program, inputs / outputs data, reads / writes data, and executes each program shown in FIG.
  • the program is executed by the processor to perform a predetermined process using a storage device and a communication port (communication device). Therefore, in the present embodiment and other embodiments, the description with the program as the subject may be the description with the processor as the subject. Alternatively, the process executed by the program is a process performed by a computer and a computer system on which the program operates.
  • the processor operates as a functional unit that realizes a predetermined function by operating according to a program.
  • the processor functions as the character recognition processing unit 4 by operating according to the character recognition processing program, and functions as the character string determination unit 7 by operating according to the character string determination program.
  • the processor also operates as a functional unit that realizes each of a plurality of processes executed by each program.
  • a computer and a computer system are an apparatus and a system including these functional units.
  • the input device 30 is a device that accepts an operation content or data input from a user.
  • the input device 30 includes a keyboard 31 and a mouse 32.
  • the input device 30 may include a touch pen, a touch panel, or the like instead of or in addition to the keyboard 31 and the mouse 32.
  • the output device 40 is a device that outputs calculation processing results and the like to the user.
  • the output device 40 includes an output monitor 41.
  • the electronic image document editing system transmits / receives input / output data via the network 90 when the input / output data is exchanged with another computer.
  • the storage device 60 stores the program and data shown in FIG.
  • the storage device 60 includes a working area 61 that temporarily stores processing data generated by the processing device 50 when the program is executed.
  • The storage device 60 includes areas for storing the data shown in FIG. 1: a translation target image document storage area 62, a character recognition dictionary storage area 64, a storage area 65 for the translation target image document with character recognition results, a character string determination criterion storage area 67, a character string information table storage area 68, a translation dictionary storage area 70, a translated word candidate table storage area 71, a storage area 74 for the image document with character recognition results and translation results, and a word/character dictionary storage area 75.
  • The storage device 60 also includes areas for storing the units shown in FIG. 1: a character recognition processing unit storage area 63, a character string determination unit storage area 66, a translation processing unit storage area 69, a translated word object generating unit storage area 72, and a translated word object editing unit storage area 73.
  • the electronic image document editing system has a configuration in which all data and processing are aggregated in one computer, but the data and processing may be distributed and arranged in a plurality of computers.
  • a character recognition server which is another computer storing the character recognition processing unit 4 and the character recognition dictionary 5, and a computer having a function other than character recognition may exchange data with each other via the network 90.
  • a translation server which is another computer storing the translation processing unit 10 and the translation dictionary 11 and a computer having a function other than translation may exchange data with each other via the network 90.
  • FIG. 3A shows an example of an input electronic image document before translation.
  • a simple electric circuit diagram is used as an example of an electronic image document.
  • In practice, an electronic image document that contains a large amount of character information together with drawing information, including non-character information, is often input to the electronic image document editing system.
  • the electric circuit diagram in the electronic image document 301 before translation shows a circuit including a 6V dry cell, a miniature bulb, a transistor, and a resistor.
  • a character string representing the content and explanation of each symbol is described. That is, character information and non-character information such as a symbol representing a circuit and wiring are mixed in the electric circuit diagram.
  • FIG. 3B shows an example of the electronic image document of FIG. 3A translated by the electronic image document editing system.
  • the portion of the figure (non-character division) in the translated electronic image document 302 is not edited, the contents of FIG. 3A are displayed as they are, and only the Japanese character string is translated into English.
  • character strings having the same notation and meaning in Japanese and English, such as “100 ⁇ ” and “6V”, are not translated, and the contents in the electronic image document 301 before translation are displayed as they are.
  • The user may also perform editing processing on the translated electronic image document, such as adjusting the character font, inserting line breaks so that a character string spans multiple lines, or adjusting the description position.
  • FIG. 4A shows an example of an electronic image document before character recognition by the electronic image document editing system.
  • the electronic image document 401 before character recognition is the same as the electronic image document 301 before translation described in FIG. 3A.
  • FIG. 4B shows an example of a character recognition result for the electronic image document of FIG. 4A by the electronic image document editing system.
  • the character string in the character recognition result 402 is associated with the description location (coordinates) of the character string.
  • In FIG. 4B, the character recognition result is displayed overwriting the document data for convenience of explanation. In actuality, however, the character strings obtained as the character recognition result are arranged behind the document data and are not visible to the user.
  • Character strings “resistance 100 ⁇ ”, “resistance 200 ⁇ ”, “bean bulb”, and “dry battery 6V” in the character recognition result 402 are correctly recognized.
  • On the other hand, the character string “NPN transistor” is erroneously recognized, with a single character recognized incorrectly.
  • Similarly, the partial character string “input” is misrecognized as “entered”, the partial character string “change” is misrecognized as “weird”, and the partial character string “)” is misrecognized as a different symbol.
  • Here, a partial character string of a character string of length n is a contiguous character string from the i-th character to the j-th character (1 ≤ i ≤ j ≤ n) of that character string.
  • In addition, the circuit symbol of the miniature bulb is recognized as the character string “te W”, the circuit symbol of the NPN transistor as the character string “six”, the circuit symbol of the resistor as the character string “-VV-”, and the circuit symbol of the dry cell as the character string “state”.
  • These recognized character strings are all noise character strings in which non-character information is erroneously recognized as character information.
  • FIG. 5A shows a first example of the character string determination criterion 8.
  • the character string determination criterion 8 includes a plurality of determination criterion items. Each determination criterion item includes an ID 501 for identifying the determination criterion item, a determination criterion item content 502 describing the specific content of the determination criterion item, a weight value 503 representing the importance (reliability) of each determination criterion item, and An application flag 504 that indicates by 1/0 whether or not to apply the criterion item is included. Further, the character string determination criterion 8 includes a determination method 505 that defines a determination method using one or more determination criterion items.
  • the determination criterion item content 502 is described using a variable that can be recognized by the character string determination unit 7.
  • S_length represents the number of characters constituting the character string.
  • C_type represents the types of the characters constituting the character string, for example kanji, hiragana, katakana, symbol, numeral suffix (n_suffix), number, alphabet, non-Joyo kanji (non_j_kanji, i.e. kanji outside the common-use set), and other character type values recognized by the character recognition processing unit 4.
  • The character recognition processing unit 4 recognizes characters used in the original language and in the translation target language.
  • the number may indicate an Arabic numeral, or may indicate a numeral including a numeral other than an Arabic numeral (for example, a Roman numeral or a Greek numeral).
  • the number suffix is an example of a classifier and represents a word that is a suffix among the classifiers.
  • the classifier is a concept including a unit of measurement.
  • the character recognition processing unit 4 identifies one type of each recognized character.
  • the type of the letter “A” may be an alphabet or a numeric suffix (ampere) representing a unit of current.
  • the character recognition processing unit 4 identifies one type of the character “A” from the relationship with the characters before and after the character “A”.
  • C_type_seq represents the number of consecutive characters defined by C_type.
  • C_word represents the number of independent words included in the character string.
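  • As a rough illustration, the variables S_length, C_type, C_type_seq, and C_word can be computed from a recognized character string as sketched below; classifying character types by Unicode range and the tiny independent word list are assumptions made for the sketch, whereas the embodiment consults the word/character dictionary 17.
```python
# Illustrative computation of the variables used in the determination
# criterion items: S_length, C_type, C_type_seq and C_word.
# Character types are approximated by Unicode ranges and the independent
# word list is a toy stand-in for the word/character dictionary 17.

def char_type(ch: str) -> str:
    if "\u3040" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff":
        return "katakana"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    if ch.isdigit():
        return "number"
    if ch.isalpha():
        return "alphabet"
    return "symbol"

INDEPENDENT_WORDS = {"抵抗", "電球", "電池"}  # toy dictionary

def features(s: str) -> dict:
    types = [char_type(c) for c in s]
    # longest run of consecutive kanji/hiragana/katakana characters
    longest = run = 0
    for t in types:
        run = run + 1 if t in ("kanji", "hiragana", "katakana") else 0
        longest = max(longest, run)
    return {
        "S_length": len(s),      # number of characters
        "C_type": types,         # type of each character
        "C_type_seq": longest,   # longest run of the illustrative type group
        "C_word": sum(w in s for w in INDEPENDENT_WORDS),  # independent words contained
    }

print(features("抵抗100Ω"))
```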
  • the determination criterion item Rule_1 indicates that “the number of characters constituting the recognized character string is two or more characters”.
  • the numerical value 2 in Rule_1 may be an integer of 3 or more. Since a character string with a small number of characters is unlikely to constitute a word, it is highly likely that it is a noise character string.
  • the character string determination unit 7 can exclude such a noise character string from the edit target character string by applying Rule_1 to the recognized character string.
  • the character string determination unit 7 can perform determination processing using Rule_1 by counting the number of characters in the character string without performing morphological analysis and without referring to various dictionaries. Therefore, the character string determination unit 7 can perform determination processing using Rule_1 at high speed.
  • In Japanese, character strings to be edited often consist of only two characters. Therefore, by performing determination using Rule_1 of the present embodiment, in which character strings of two or more characters are edit targets, the character string determination unit 7 removes a large number of noise character strings while keeping small the number of genuine edit target character strings that are excluded from editing. Note that the character string determination unit 7 can apply Rule_1 directly to the recognized character string even when the original language is other than Japanese.
  • the criterion item Rule_2 indicates that “the recognized character string includes a partial character string in which two or more kanji characters, hiragana characters, or katakana characters continue”.
  • the numerical value 2 in Rule_2 may be an integer of 3 or more.
  • a character string that does not include a partial character string in which a predetermined type of characters continues for a certain number of characters is highly likely to be a noise character string.
  • a character string that does not satisfy Rule_2 having the predetermined type as kanji, hiragana or katakana is not likely to include a partial character string that is a Japanese (original) word, and thus may be a noise character string in particular. Is expensive.
  • the character string determination unit 7 can exclude such a noise character string from the editing target by applying Rule_2 to the recognized character string.
  • the character string determination unit 7 determines Rule_2 by determining the type of characters constituting the character string and counting the number of characters in the character string without performing morphological analysis and referring to various dictionaries. The used determination process can be performed. Therefore, the character string determination unit 7 can perform determination processing using Rule_2 at high speed.
  • Many noise character strings do not include a partial character string in which two or more kanji, hiragana, or katakana characters are consecutive.
  • On the other hand, many character strings to be edited include a partial character string in which two kanji, hiragana, or katakana characters are consecutive but do not include a partial character string in which three or more such characters are consecutive. Therefore, by performing determination using Rule_2 of the present embodiment, in which character strings containing two or more consecutive characters of these types are edit targets, the character string determination unit 7 removes a large number of noise character strings while keeping small the number of genuine edit target character strings that are excluded from editing.
  • Rule_2 may be, for example, “the recognized character string includes a partial character string in which two or more characters in the first type group that is a part of the character type recognized by the character recognition processing unit 4 are continuous”. .
  • the type group represents one or a plurality of types. Therefore, in Rule_2, for example, the first type group can be hiragana or katakana. Also in this case, the numerical value 2 in Rule_2 may be an integer of 3 or more.
  • Rule_2 can be, for example, “a recognized character string includes a partial character string in which two or more alphabets are continuous”. Further, when the original language is Chinese, Rule_2 can be, for example, “the recognized character string includes a partial character string in which two or more kanji characters are continuous”.
  • The determination criterion item Rule_3 indicates that “the recognized character string includes one or more characters other than symbols, numeral suffixes, numbers, alphabetic characters, and non-Joyo kanji”.
  • A recognized character string that does not include one or more characters other than these predetermined types is highly likely to be a noise character string.
  • In particular, a character string that does not satisfy Rule_3, in which the predetermined types are symbols, numeral suffixes, numbers, alphabetic characters, and non-Joyo kanji, is unlikely to contain a Japanese (original language) word and is therefore highly likely to be a noise character string.
  • the character string determination unit 7 can exclude such a noise character string from the editing target by applying Rule_3 to the recognized character string.
  • Rule_3 may be, for example, “the recognized character string includes one or more characters in the second type group that is a part of the character type recognized by the character recognition processing unit 4”.
  • Rule_3 can be, for example, “the recognized character string includes one or more characters other than symbols, classifiers, numbers, kanji, hiragana, katakana, and non-Joyo kanji”.
  • Rule_3 can also be, for example, “the recognized character string includes one or more characters other than symbols, classifiers, numbers, hiragana, katakana, alphabetic characters, and non-Joyo kanji”.
  • Judgment criteria item Rule_4 indicates that “the recognized character string includes one or more independent words”. Since a character string that does not include an independent word is a character string that does not have a word content, there is a high possibility that the character string is a noise character string.
  • the character string determination unit 7 can exclude the noise character string from editing targets by applying Rule_4 to the recognized character string.
  • Rule_4 may be, for example, “the recognized character string includes one or more content words”.
  • a content word is a word having a specific meaning content other than a grammatical role, such as a noun, a verb, and an adjective.
  • An independent word is an example of a content word in Japanese.
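  • A minimal sketch of how Rule_1 to Rule_4 could be checked against a recognized character string is shown below, under the same illustrative assumptions as before (Unicode-range character typing and a toy content word list); it is a sketch, not the embodiment's implementation.
```python
# Sketch of the four determination criterion items Rule_1 to Rule_4.
# Character typing by Unicode range and the content word list are
# illustrative assumptions, not part of the patent text.

def char_type(ch: str) -> str:
    if "\u3040" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff":
        return "katakana"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    if ch.isdigit():
        return "number"
    if ch.isalpha():
        return "alphabet"
    return "symbol"

CONTENT_WORDS = {"抵抗", "電球", "電池", "トランジスタ"}     # toy dictionary
FIRST_TYPE_GROUP = {"kanji", "hiragana", "katakana"}        # type group for Rule_2
SECOND_TYPE_GROUP = {"kanji", "hiragana", "katakana"}       # type group for Rule_3

def rule_1(s: str, threshold: int = 2) -> bool:
    # the string consists of `threshold` or more characters
    return len(s) >= threshold

def rule_2(s: str, threshold: int = 2) -> bool:
    # the string contains `threshold` or more consecutive characters of the first type group
    run = 0
    for ch in s:
        run = run + 1 if char_type(ch) in FIRST_TYPE_GROUP else 0
        if run >= threshold:
            return True
    return False

def rule_3(s: str) -> bool:
    # the string contains at least one character of the second type group
    return any(char_type(ch) in SECOND_TYPE_GROUP for ch in s)

def rule_4(s: str) -> bool:
    # the string contains at least one content word (independent word)
    return any(w in s for w in CONTENT_WORDS)

for candidate in ["抵抗100Ω", "teW", "-VV-"]:
    print(candidate, [r(candidate) for r in (rule_1, rule_2, rule_3, rule_4)])
```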
  • Since the electronic image document editing system does not use any information related to the editing result when performing determination using the determination criterion items Rule_1 to Rule_4, it can determine whether or not a recognized character string is an edit target character string without performing editing processing on the recognized character string.
  • the electronic image document editing system provides two types of determination methods 505.
  • FIG. 5A shows that a determination method based on the number of matching determination criterion items (Num_of_items) is designated as the determination method 505, and that the threshold (Num_of_items_threshold) for the number of determination criterion items is 3.
  • That is, when the recognized character string satisfies three or more of the determination criterion items to be applied, the character string determination unit 7 determines that the recognized character string is a translation target character string; when the recognized character string satisfies only two or fewer of the determination criterion items, the character string determination unit 7 determines that the recognized character string is not a translation target character string.
  • FIG. 5B shows a second example of the character string criterion 8.
  • In FIG. 5B, the determination method 505 indicates that the threshold (Num_of_items_threshold) for the number of determination criterion items is 2. That is, this is an AND determination in which the character string determination unit 7 determines that the recognized character string is a translation target character string only when it satisfies both of the two determination criterion items to be applied (those whose application flag 504 is 1). Further, if the threshold for the number of determination criterion items were 1, the determination method 505 would indicate an OR determination in which the character string determination unit 7 determines that the recognized character string is a translation target character string when it satisfies at least one of the two determination criterion items.
  • FIG. 5C shows a third example of the character string criterion 8.
  • the determination method 505 indicates that a method (Sum_of_weights) for determining whether or not the recognized character string is a character string to be translated is specified based on the sum of the weight values 503 of the matching determination criterion items.
  • the weight sum threshold (Sum_of_weights_threshold) is 3.0.
  • The determination method 505 indicates that, when the sum of the weight values of the items satisfied by the recognized character string among the four determination criterion items to be applied (those whose application flag 504 is 1) is 3.0 or more, the character string determination unit 7 determines that the recognized character string is a translation target character string.
  • the determination method 505 indicates that, when the sum of the weight values is less than 3.0, the character string determination unit 7 determines that the recognized character string is not a character string to be translated.
  • When the weight values 503 of all the determination criterion items are equal, the determination using FIG. 5C is equivalent to the determination based on the number of matched items shown in FIG. 5A. That is, determination based on the number of matching items is an example of determination based on weight values.
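  • The two determination methods can be expressed over the per-item collation results and weights roughly as follows; the weights and thresholds in the sketch are illustrative values, not those of FIGS. 5A to 5C.
```python
# Sketch of the two determination methods 505: Num_of_items (count of
# matched criterion items vs. a threshold) and Sum_of_weights (weighted
# sum of matched items vs. a threshold). Weights and thresholds are
# illustrative values.

criteria = [
    # (matched?, weight, apply_flag)
    (True,  1.0, 1),   # Rule_1
    (True,  1.5, 1),   # Rule_2
    (False, 1.0, 1),   # Rule_3
    (True,  0.5, 1),   # Rule_4
]

def judge_num_of_items(results, threshold):
    matches = sum(1 for matched, _w, applied in results if applied and matched)
    return matches >= threshold

def judge_sum_of_weights(results, threshold):
    total = sum(w for matched, w, applied in results if applied and matched)
    return total >= threshold

print(judge_num_of_items(criteria, 3))      # True: 3 of 4 items matched
print(judge_sum_of_weights(criteria, 3.0))  # True: 1.0 + 1.5 + 0.5 = 3.0
```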
  • the user can change the contents of the character string judgment standard 8. That is, the user can select the determination criterion items to be applied, and can add, delete, or change parameters, threshold values, and the like. Details of the change of the contents of the character string determination criterion 8 will be described later.
  • FIG. 6 shows an example of the character string information table 9.
  • the character string information table 9 holds data relating to the determination result by the character string determination unit 7 and the translation result by the translation processing unit 10.
  • FIG. 6 shows the character string information table 9 holding the results of the processing performed by the character recognition processing unit 4 and the character string determination unit 7 of the electronic image document editing system on the input document of FIG. 4A.
  • the character string information table 9 includes a recognized character string 601, a description position 602, a determination reference collation result 607, a translation target flag 610, a modified character string 611, a translated character string 612, and a translation status flag 613.
  • the recognized character string 601 holds a character string recognized by the character recognition processing unit 4.
  • the description position 602 holds information on a rectangular area where the recognized character string 601 is displayed.
  • the description position 602 includes an upper left X coordinate 603 and an upper left Y coordinate 604 that are coordinates of the upper left vertex of the rectangular area where the recognition character string 601 is displayed, and a lower right X coordinate that is a coordinate of the lower right vertex of the rectangular area. 605 and the lower right Y coordinate 606.
  • the determination reference collation result 607 holds the determination reference collation result in the character string determination unit 7.
  • the judgment reference matching result 607 includes a column that holds the matching result for each judgment criterion item, the number of matches 608 that holds the number of matching judgment criterion items, and the weight that holds the sum of the weight values of the matching judgment criterion items. Total 609.
  • the translation target flag 610 is a flag for identifying whether or not the recognized character string 601 is a translation target character string.
  • When whether or not the recognized character string is a translation target is determined based on the number of matched items, as in the character string determination criterion 8 shown in FIG. 5A, the translation target flag 610 holds 1 if the number of matches 608 is equal to or greater than the threshold and 0 if the number of matches is less than the threshold.
  • When whether or not the recognized character string is a translation target is determined based on the sum of the weights of the matched items, as in the character string determination criterion 8 shown in FIG. 5C, the translation target flag 610 holds 1 if the weight sum 609 is equal to or greater than the threshold and 0 if the weight sum 609 is less than the threshold.
  • the corrected character string 611 holds a character string corrected by the user when there is an error in the recognized character string 601 whose translation target flag 610 is 1.
  • the translated character string 612 holds the translation result for the recognized character string 601 or the corrected character string 611.
  • the translation status flag 613 is a flag for identifying whether or not the translation work of the recognized character string 601 or the corrected character string 611 having the translation target flag 610 of 1 is completed.
  • the translation status flag 613 holds 1 when the translation result is held in the translated character string 612, 0 when it is not held, and NULL when the recognized character string 601 is not a translation target.
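  • One row of the character string information table 9 can be modeled as a record such as the following; the field names are paraphrases of the columns described above, not identifiers taken from the embodiment.
```python
# Illustrative record mirroring one row of the character string
# information table 9 (FIG. 6). Field names paraphrase the columns
# described above; they are not literal identifiers from the patent.

from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class StringInfoRow:
    recognized: str                        # recognized character string 601
    top_left: Tuple[int, int]              # (upper left X 603, upper left Y 604)
    bottom_right: Tuple[int, int]          # (lower right X 605, lower right Y 606)
    criterion_results: Dict[str, float] = field(default_factory=dict)  # per-item results 607
    num_matches: int = 0                   # number of matches 608
    weight_sum: float = 0.0                # weight sum 609
    translation_target: int = 0            # translation target flag 610
    corrected: Optional[str] = None        # corrected character string 611
    translated: Optional[str] = None       # translated character string 612
    translation_done: Optional[int] = None # translation status flag 613 (None if not a target)

row = StringInfoRow("抵抗100Ω", (160, 30), (300, 50))
```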
  • the recognized character string “resistance 100 ⁇ ” is described in a rectangular area having the coordinates (160, 30) as the upper left vertex and the coordinates (300, 50) as the lower right vertex.
  • the translation processing unit 10 translates the recognized character string “resistance 100 ⁇ ” into “Resistor 100 ⁇ ” and stores the translation result in the translated character string 612. Since the translation of the recognized character string “resistance 100 ⁇ ” by the translation processing unit 10 has been completed, the corresponding translation status flag 613 holds “1”.
  • the recognition character string “NPN transistor” is described in a rectangular area having coordinates (250, 250) as the upper left vertex and coordinates (390, 270) as the lower right vertex.
  • The recognized character string “NPN transistor” contains a character recognition error. When the user corrects the recognized character string, the translated word object generating unit 13 stores the correction result in the corrected character string 611.
  • the translation processing unit 10 translates the modified character string “NPN transistor” and stores the translation result in the translated character string 612. Since the translation of the corrected character string “NPN transistor” by the translation processing unit 10 has been completed, the corresponding translation status flag 613 holds “1”.
  • The recognized character string “Te W” is described in a rectangular area having the coordinates (160, 240) as the upper left vertex and the coordinates (200, 260) as the lower right vertex.
  • the recognition character string “dry battery 6V” is described in a rectangular area having coordinates (335, 410) as the upper left vertex and coordinates (460, 430) as the lower right vertex.
  • the character string determining unit 7 determines that the character string is to be translated. Accordingly, the corresponding translation target flag 610 holds “1”. However, since the translation work for the recognized character string “dry battery 6V” has not been performed yet, that is, the translation character string 612 does not hold the translation result, the corresponding translation status flag 613 holds 0.
  • the character string determination unit 7 checks the determination method defined in the character string determination standard 8. That is, the character string determination unit 7 checks whether the value of the variable Judge of the determination method 505 is the number of matching items (Num_of_items) or the weight sum of matching items (Sum_of_weights). When the value of Judge is the number of matching items, the character string determination unit 7 performs the process shown in FIG. 7A. If the Judge value is the weighted sum of the matching items, the character string determination unit 7 performs the process shown in FIG. 7B.
  • FIG. 7A shows an example of a translation target character string determination process performed by the character string determination unit 7 when the Judge value is the number of matching items.
  • the character string determination unit 7 acquires the value of the threshold value S1 defined in the determination method 505 of the character string determination criterion 8 (step 702). That is, the character string determination unit 7 holds the value of the variable Num_of_items_threshold as the threshold value S1.
  • the character string determination unit 7 determines whether or not there is an undetermined recognized character string in the character string information table 9 (step 703). If there is no undetermined recognized character string (step 703: No), the character string determination unit 7 ends the process.
  • If there is an undetermined recognized character string (step 703: Yes), the character string determination unit 7 analyzes the recognized character string (step 704).
  • the character string determination unit 7 refers to the word / character dictionary 17 and, for example, the number of characters constituting the recognized character string, the type of characters constituting the recognized character string, and the independent character included in the recognized character string. Information necessary for determination of the determination criterion item content 502 such as a word is extracted.
  • Next, the character string determination unit 7 determines whether or not there is a determination criterion item content 502 that has not yet been applied to the recognized character string (step 705).
  • If there is an unapplied determination criterion item content 502 (step 705: Yes), the character string determination unit 7 collates the recognized character string against the unapplied determination criterion item content 502 (step 706) and determines whether or not the recognized character string satisfies that determination criterion item content 502 (step 707).
  • If the recognized character string does not satisfy the determination criterion item (step 707: No), the character string determination unit 7 stores the value 0 in the corresponding determination criterion item of the determination criterion collation result 607 of the character string information table 9 (step 708) and returns to step 705.
  • If the recognized character string satisfies the determination criterion item (step 707: Yes), the character string determination unit 7 stores the value 1 in the corresponding determination criterion item of the determination criterion collation result 607 of the character string information table 9 (step 709) and returns to step 705.
  • If there is no unapplied determination criterion item (step 705: No), the character string determination unit 7 sums the values stored for each determination criterion item in the determination criterion collation result 607 of the character string information table 9 and stores the total in the number of matches 608 (step 710). Next, the character string determination unit 7 determines whether or not the total stored in the number of matches 608 is equal to or greater than the threshold S1 acquired in step 702 (step 711).
  • If the total is less than the threshold S1 (step 711: No), the character string determination unit 7 stores the value 0 in the translation target flag 610 of the recognized character string in the character string information table 9 (step 712) and returns to step 703. If the total is equal to or greater than the threshold S1 (step 711: Yes), the character string determination unit 7 stores the value 1 in the translation target flag 610 of the recognized character string in the character string information table 9 (step 713) and returns to step 703.
  • FIG. 7B shows an example of translation target character string determination processing by the character string determination unit 7 when the Judge value is the weighted sum of matching items.
  • the character string determination unit 7 acquires the value of the threshold value S2 defined in the determination method 505 of the character string determination criterion 8 (step 714). That is, the character string determination unit 7 holds the value of the variable Sum_of_weights_threshold as the threshold value S2.
  • the character string determination unit 7 determines whether or not there is an undetermined recognized character string in the character string information table 9 (step 715).
  • If there is no undetermined recognized character string (step 715: No), the character string determination unit 7 ends the process. If there is an undetermined recognized character string (step 715: Yes), the character string determination unit 7 analyzes the recognized character string (step 716). Since this analysis is the same as the analysis performed in step 704, its description is omitted.
  • Next, the character string determination unit 7 determines whether or not there is a determination criterion item content 502 that has not yet been applied to the recognized character string (step 717). If there is an unapplied determination criterion item content 502 (step 717: Yes), the character string determination unit 7 collates the recognized character string against the unapplied determination criterion item content 502 (step 718) and determines whether or not the recognized character string satisfies that determination criterion item content 502 (step 719).
  • If the recognized character string does not satisfy the determination criterion item (step 719: No), the character string determination unit 7 stores the value 0 in the corresponding determination criterion item of the determination criterion collation result 607 of the character string information table 9 (step 720) and returns to step 717.
  • If the recognized character string satisfies the determination criterion item (step 719: Yes), the character string determination unit 7 stores the weight value 503 of the corresponding determination criterion item of the character string determination criterion 8 in the corresponding determination criterion item of the determination criterion collation result 607 of the character string information table 9 (step 721) and returns to step 717.
  • If there is no unapplied determination criterion item (step 717: No), the character string determination unit 7 sums the values stored for each determination criterion item in the determination criterion collation result 607 of the character string information table 9 and stores the total in the weight sum 609 (step 722). Next, the character string determination unit 7 determines whether or not the total stored in the weight sum 609 is equal to or greater than the threshold S2 acquired in step 714 (step 723).
  • If the total is less than the threshold S2 (step 723: No), the character string determination unit 7 stores the value 0 in the translation target flag 610 of the recognized character string in the character string information table 9 (step 724) and returns to step 715. If the total is equal to or greater than the threshold S2 (step 723: Yes), the character string determination unit 7 stores the value 1 in the translation target flag 610 of the recognized character string in the character string information table 9 (step 725) and returns to step 715.
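  • The flows of FIGS. 7A and 7B therefore differ only in the value accumulated per matched criterion item (1 versus the weight value 503) and in the threshold compared against (S1 versus S2); a compact sketch of both, using simplified stand-ins for the criterion checks, is given below.
```python
# Compact sketch of the flows of FIGS. 7A and 7B. Each undetermined row of
# the character string information table is collated against the applied
# criterion items; FIG. 7A accumulates 1 per matched item and compares the
# match count with threshold S1, FIG. 7B accumulates the weight value and
# compares the weight sum with threshold S2. The rule lambdas below are
# simplified stand-ins for the real criterion checks.

RULES = {  # name -> (check function, weight, apply flag)
    "Rule_1": (lambda s: len(s) >= 2, 1.0, 1),
    "Rule_2": (lambda s: any(c >= "\u3040" for c in s), 1.5, 1),  # crude "contains Japanese"
    "Rule_4": (lambda s: any(w in s for w in ("抵抗", "電池")), 0.5, 1),
}

def determine(rows, judge="Num_of_items", s1=2, s2=2.0):
    for row in rows:                      # steps 703/715: loop over undetermined strings
        results = {}
        for name, (check, weight, applied) in RULES.items():   # steps 705-709 / 717-721
            if not applied:
                continue
            matched = check(row["recognized"])
            if judge == "Num_of_items":
                results[name] = 1 if matched else 0
            else:
                results[name] = weight if matched else 0
        total = sum(results.values())     # steps 710 / 722
        threshold = s1 if judge == "Num_of_items" else s2
        row["translation_target"] = 1 if total >= threshold else 0   # steps 711-713 / 723-725
        row["criterion_results"] = results

table = [{"recognized": "抵抗100Ω"}, {"recognized": "teW"}]
determine(table, judge="Sum_of_weights", s2=2.0)
print([(r["recognized"], r["translation_target"]) for r in table])
```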
  • FIG. 8 shows an example of a list output screen of character strings determined as translation target character strings.
  • FIG. 8 illustrates a list output screen that is output and displayed based on the data in the character string information table 9 shown in FIG.
  • the character string list output screen 800 includes an output image sub-screen 801 that outputs the electronic image document and the translation results of character strings in the electronic image document, and a translation status sub-screen 802 that outputs a list of translation target character strings and their translation results.
  • the translation status sub-screen 802 includes a status 803 that displays the translation work status of each character string, a translation target character string 804 that is a recognized character string to be translated, and a translation result 805 for the translation target character string 804.
  • the user can sort the list in ascending or descending order of the selected item by selecting any of the item headings of the status 803, the translation target character string 804, and the translation result 805. Thereby, for example, the user can easily find character strings that have not yet been translated, or easily check whether the translation results of the same character string vary.
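As a small illustration of this sorting behavior, the sketch below orders hypothetical rows of the translation status sub-screen 802 by whichever column heading (status 803, translation target character string 804, or translation result 805) is selected, in ascending or descending order. The row contents are placeholder data, not values from the embodiment.

    # Placeholder rows: (status 803, translation target character string 804, translation result 805)
    rows = [
        ("done",    "string C", "translation C"),
        ("pending", "string A", ""),
        ("done",    "string B", "translation B"),
    ]

    def sort_rows(rows, column, descending=False):
        """Sort by the selected column: 0 = status, 1 = target string, 2 = result."""
        return sorted(rows, key=lambda row: row[column], reverse=descending)

    # Sorting by status groups the not-yet-translated ("pending") strings together.
    for row in sort_rows(rows, column=0):
        print(row)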
  • the translation target character string 804 is linked with the description position on the output image sub-screen 801.
  • the output image sub-screen 801 displays a description portion of the designated character string.
  • the character string information management unit 16 refers to the description position 602 of the character string information table 9 to obtain the description position of the designated translation target character string 804.
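A minimal sketch of this linkage, assuming the character string information table 9 is held in memory as a list of records carrying the description position 602, could look as follows; the record layout, the OutputImageSubscreen class, and its jump_to method are hypothetical stand-ins for the actual screen handling.

    class OutputImageSubscreen:
        """Stand-in for the output image sub-screen 801 (illustration only)."""
        def jump_to(self, position):
            print(f"scrolling to description position {position}")

    # Hypothetical in-memory form of the character string information table 9.
    char_string_table = [
        {"recognized": "string A", "description_position": (120, 340)},
        {"recognized": "string B", "description_position": (480, 610)},
    ]

    def show_description_position(selected_string, table, subscreen):
        """When a translation target character string 804 is selected, look up its
        description position 602 and move the output image sub-screen 801 to it."""
        for record in table:
            if record["recognized"] == selected_string:
                subscreen.jump_to(record["description_position"])
                return record["description_position"]
        return None

    show_description_position("string B", char_string_table, OutputImageSubscreen())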
  • the user can refer to the list of character strings to be translated on the translation status sub-screen 802 and display it in conjunction with the output image sub-screen 801. Therefore, translation omissions for the character strings in the electronic image document can be reduced.
  • the translation status sub-screen 802 displays the number of character strings to be translated and the total number of characters at the top. In the example of FIG. 8, the translation status sub-screen 802 displays that the number of character strings to be translated is six and that the total number of characters is 40. These values are calculated by the character string information management unit 16 from the recognized character strings 601 whose translation target flag is 1 in the character string information table 9. From this display, the user can grasp, together with the list of character strings, the amount of character strings to be translated in the electronic image document, and can thus estimate the amount of translation work required.
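These counts can be derived directly from the rows whose translation target flag 610 is 1, as in the sketch below; the in-memory table layout is an assumption for illustration.

    def count_translation_targets(table):
        """Return (number of translation target character strings, total character count)
        for rows of the character string information table 9 whose translation
        target flag 610 equals 1."""
        targets = [row["recognized"] for row in table if row["translation_target_flag"] == 1]
        return len(targets), sum(len(s) for s in targets)

    # Hypothetical table contents.
    table = [
        {"recognized": "NPN transistor", "translation_target_flag": 1},
        {"recognized": "string A",       "translation_target_flag": 1},
        {"recognized": "string B",       "translation_target_flag": 0},
    ]
    num_strings, num_chars = count_translation_targets(table)
    print(f"{num_strings} strings, {num_chars} characters")   # -> 2 strings, 22 characters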
  • FIG. 9 shows an example of a screen for changing the character string determination criterion 8.
  • a determination criterion change screen 900 for changing the character string determination criterion 8 is displayed.
  • the user changes the constituent elements and values (values enclosed in [] in the determination criterion change screen 900) for each determination criterion item constituting the character string determination criterion 8 on the determination criterion change screen 900.
  • FIG. 9 shows an example in which the threshold value (Num_of_items_threshold) 902 of the character string determination criterion 8 shown in FIG. 5A is changed from 3 to 4 (that is, a recognized character string is determined to be a translation target character string only when it satisfies all four kinds of criterion items).
  • When the user presses the update button 903 after entering the determination criterion changes, the content displayed on the determination criterion change screen 900 immediately before the pressing is updated and reflected in the character string determination criterion 8.
  • When the user presses the cancel button 904, the content is not reflected in the character string determination criterion 8.
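The update/cancel behavior can be sketched as follows: the edits made on the determination criterion change screen 900 are kept in a working copy and are written back to the character string determination criterion 8 only when the update button is pressed. The dictionary representation of the criterion is an assumption for illustration.

    import copy

    # Hypothetical representation of the character string determination criterion 8.
    criterion = {"Num_of_items_threshold": 3, "Sum_of_weights_threshold": 4}

    def apply_change(criterion, edited_values, pressed_button):
        """Reflect the edits only when the update button 903 is pressed;
        the cancel button 904 leaves the criterion unchanged."""
        if pressed_button == "update":
            updated = copy.deepcopy(criterion)
            updated.update(edited_values)
            return updated
        return criterion  # cancel: nothing is reflected

    # Raising the item-count threshold from 3 to 4, as in the FIG. 9 example.
    criterion = apply_change(criterion, {"Num_of_items_threshold": 4}, "update")
    print(criterion)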
  • FIG. 10 shows an example of an output screen after the translation process is re-executed following a change to the character string determination criterion 8.
  • the character string determination unit 7 re-executes the determination process using the updated character string determination criterion 8.
  • the re-execution result is stored in the character string information table 9. Then, based on the information in the character string information table 9 storing the re-execution result, the translation status sub-screen 802 is re-displayed.
  • the recognized character string “NPN transistor” is no longer regarded as a character string to be translated under the updated determination criterion, and is excluded from the character strings output and displayed on the translation status sub-screen 802.
  • Since the recognized character string “NPN transistor” is no longer regarded as a translation target, the number of translation target character strings and the number of translation target characters are also reduced.
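Re-execution after a criterion change amounts to re-running the determination for every recognized character string with the updated threshold and re-deriving the displayed list and counts, roughly as in the self-contained sketch below; the count-based decision, table layout, and per-string collation results are hypothetical.

    def reexecute_determination(table, matches_per_string, num_of_items_threshold):
        """Re-run the count-based determination with an updated threshold and
        rebuild the list shown on the translation status sub-screen 802."""
        for row, matched in zip(table, matches_per_string):
            row["translation_target_flag"] = 1 if sum(matched) >= num_of_items_threshold else 0
        targets = [row["recognized"] for row in table if row["translation_target_flag"] == 1]
        return targets, len(targets), sum(len(s) for s in targets)

    table = [{"recognized": "NPN transistor"}, {"recognized": "string A"}]
    matches = [[1, 1, 1, 0], [1, 1, 1, 1]]   # hypothetical collation results per string
    # With the threshold raised to 4, "NPN transistor" (3 matching items) drops out of the list.
    print(reexecute_determination(table, matches, 4))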
  • Since the contents of the character string determination criterion 8 can be updated, the electronic image document editing system allows the user to set a character string determination criterion 8 suited to the electronic image document to be edited, so that the character strings to be edited can be extracted with high accuracy.
  • the electronic image document editing system of the present embodiment can thus specify, with high accuracy, the character information to be edited from among the character strings recognized by character recognition processing from an electronic image document in which figures (non-character information) and character information are mixed, such as a design drawing.
  • As a result, the user can easily and accurately grasp the description location and the amount of the character information to be edited, which in turn improves the efficiency and quality of the editing work.
  • The present invention is not limited to the above-described embodiment, and includes various modifications.
  • the above-described embodiments are described in detail for easy understanding of the present invention, and are not necessarily limited to those having all the configurations described.
  • a part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
  • each of the above-described configurations, functions, processing units, processing means, and the like may be realized by hardware by designing a part or all of them with, for example, an integrated circuit.
  • Each of the above-described configurations, functions, and the like may also be realized by software, by having a processor interpret and execute a program that realizes each function.
  • Information such as programs, tables, and files for realizing each function can be stored in a memory, a hard disk, a recording device such as an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, or a DVD.

Abstract

The present invention relates to an electronic image document editing system which: accepts input of an electronic image document; recognizes, in the input electronic image document, a character string formed from one or more characters belonging to a plurality of character categories; and, if the recognized character string satisfies a character string determination rule, determines that the recognized character string is a character string to be edited. The character string determination rules include at least one of: a first determination rule in which the recognized character string is formed from a number of characters equal to or greater than a first threshold (an integer greater than one); a second determination rule in which the recognized character string includes a partial character string whose number of characters belonging to a first category group, which is a subset of the plurality of categories, is equal to or greater than a second threshold (an integer greater than one); a third determination rule in which the recognized character string includes a character belonging to a second category group, which is a subset of the plurality of categories; and a fourth determination rule in which the recognized character string includes a content word.
PCT/JP2014/056927 2014-03-14 2014-03-14 Système d'édition de document image numérique WO2015136692A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2014/056927 WO2015136692A1 (fr) 2014-03-14 2014-03-14 Système d'édition de document image numérique
JP2016507228A JPWO2015136692A1 (ja) 2014-03-14 2014-03-14 電子イメージ文書編集システム

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/056927 WO2015136692A1 (fr) 2014-03-14 2014-03-14 Système d'édition de document image numérique

Publications (1)

Publication Number Publication Date
WO2015136692A1 true WO2015136692A1 (fr) 2015-09-17

Family

ID=54071169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/056927 WO2015136692A1 (fr) 2014-03-14 2014-03-14 Système d'édition de document image numérique

Country Status (2)

Country Link
JP (1) JPWO2015136692A1 (fr)
WO (1) WO2015136692A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108241594A (zh) * 2016-12-26 2018-07-03 卡西欧计算机株式会社 文字编辑方法、电子设备以及记录介质
US11568659B2 (en) * 2019-03-29 2023-01-31 Fujifilm Business Innovation Corp. Character recognizing apparatus and non-transitory computer readable medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05266074A (ja) * 1992-03-24 1993-10-15 Ricoh Co Ltd 対訳画像形成装置
JP2009205209A (ja) * 2008-02-26 2009-09-10 Fuji Xerox Co Ltd 文書画像処理装置、及び文書画像処理プログラム

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HIROKI TAKAHASHI ET AL.: "Extraction of Hangul Text from Scenery Images by Using Hangul Structure", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, vol. J88-D-II, no. 9, 1 September 2005 (2005-09-01), pages 1808 - 1816, XP008171336 *

Also Published As

Publication number Publication date
JPWO2015136692A1 (ja) 2017-04-06

Similar Documents

Publication Publication Date Title
EP0686286B1 (fr) Systeme de translitteration de textes entres
US5640587A (en) Object-oriented rule-based text transliteration system
US8843815B2 (en) System and method for automatically extracting metadata from unstructured electronic documents
US10671805B2 (en) Digital processing and completion of form documents
US9384389B1 (en) Detecting errors in recognized text
US20120136647A1 (en) Machine translation apparatus and non-transitory computer readable medium
US11520835B2 (en) Learning system, learning method, and program
JPWO2008146583A1 (ja) 辞書登録システム、辞書登録方法および辞書登録プログラム
WO2015136692A1 (fr) Système d'édition de document image numérique
JP2019179470A (ja) 情報処理プログラム、情報処理方法、および情報処理装置
JP2011238159A (ja) 計算機システム
CN104794140B (zh) 一种文本高亮显示的方法和装置
US10049107B2 (en) Non-transitory computer readable medium and information processing apparatus and method
JP2011039576A (ja) 特定情報検出装置、特定情報検出方法および特定情報検出プログラム
JP2010026718A (ja) 文字入力装置および方法
TWM491194U (zh) 資料校對平台伺服器
JP4466241B2 (ja) 文書処理手法及び文書処理装置
JP2012108893A (ja) 手描き入力方法
JP2943791B2 (ja) 言語識別装置,言語識別方法および言語識別のプログラムを記録した記録媒体
JP2011090524A (ja) 書籍掲載文書の差異検出表示システムおよび書籍掲載文書の差異検出表示プログラム
Xiang et al. Recovering semantic relations from web pages based on visual cues
JP2019087233A (ja) 情報処理装置、情報処理方法及び情報処理プログラム
JP6303508B2 (ja) 文書分析装置、文書分析システム、文書分析方法およびプログラム
CN114417871B (zh) 模型训练及命名实体识别方法、装置、电子设备及介质
US20220198127A1 (en) Enhancement aware text transition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14885758

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016507228

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14885758

Country of ref document: EP

Kind code of ref document: A1