WO2015045680A1 - Translation device and control program therefor - Google Patents

Translation device and control program therefor

Info

Publication number
WO2015045680A1
WO2015045680A1
Authority
WO
WIPO (PCT)
Prior art keywords
translation
image
character
result
unit
Prior art date
Application number
PCT/JP2014/071716
Other languages
English (en)
Japanese (ja)
Inventor
健文 大塚
慎哉 佐藤
梅津 克彦
Original Assignee
シャープ株式会社 (Sharp Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by シャープ株式会社 (Sharp Corporation)
Publication of WO2015045680A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/1444 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G06V30/1456 Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields based on user interactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/26 Techniques for post-processing, e.g. correcting the recognition result
    • G06V30/262 Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Definitions

  • the present invention relates to a translation device that translates a character or character string that has been character-recognized, and a control program thereof.
  • One method for solving such a problem is the method disclosed in Patent Document 1.
  • In the method of Patent Document 1, an evaluation value related to character recognition is calculated for each of a plurality of images, namely an image containing the character or character string to be recognized and the images before and after it, and the image whose evaluation value exceeds a certain threshold, or the image with the highest evaluation value among the images within a range up to a certain threshold, is selected.
  • the present invention has been made in view of the above problems, and an object thereof is to provide a translation apparatus and the like that can improve the translation accuracy in a specific image region.
  • In order to solve the above problem, a translation apparatus according to one aspect of the present invention includes: an image acquisition unit that acquires a captured image and at least one still image recorded before or after the captured image is taken; a character recognition unit that recognizes characters or character strings included in each acquired image; a translation unit that translates the recognized characters or character strings; a translation evaluation unit that evaluates the quality of the translation results; and a display control unit that performs control to display, among the translation results for the respective images, the translation result of the image whose translation result within a specific image region is evaluated best.
  • FIG. 5 is an explanatory diagram for explaining the operation of the translation apparatus, wherein (a) and (b) show states before and after imaging by the translation apparatus, and (c) to (e) show examples of through images.
  • Embodiments of the present invention will be described below with reference to FIGS. 1 to 7.
  • FIG. 1 is a block diagram showing a configuration of translation apparatus 1 according to an embodiment of the present invention.
  • the translation device 1 includes a control unit 2, a storage unit 3, an imaging unit (image acquisition unit) 4, an operation unit 5, and a display unit 6.
  • The translation device 1 is an apparatus that performs character recognition processing and translation processing not only on the captured image obtained at the time of imaging but also on each of at least one through image (still image) recorded before and after the capture, together with processing for evaluating the translation results and processing for displaying them.
  • the control unit 2 controls the entire translation apparatus 1 in an integrated manner, and can be configured by, for example, a CPU (Central Processing Unit).
  • the control unit 2 controls each control block of the storage unit 3, the imaging unit 4, the operation unit 5, and the display unit 6.
  • the detailed configuration of the control unit 2 will be described later.
  • The storage unit 3 stores the various data that are read when the control unit 2 executes the control program of each unit.
  • the storage unit 3 is configured by a nonvolatile storage device such as a flash memory.
  • The storage unit 3 also has a work area, configured by a volatile storage device such as a RAM (Random Access Memory), for temporarily storing the various data generated while the control unit 2 executes the above programs.
  • The storage unit 3 does not necessarily have to be provided in the translation device 1; it may be configured as an external storage device attachable to and detachable from the translation device 1, or as an external storage device on a network with which the translation device 1 can communicate.
  • In particular, the storage unit 3 stores various data such as through image data 31 (still or recorded images), captured image data 32 (captured images), analysis area setting information 33, a character/character string DB (database) 34, a dictionary DB 35, and weighting setting information 36.
  • The imaging unit 4 has the function of an ordinary camera that captures an imaging target such as a signboard or a menu as the subject, based on, for example, a user operation received by the operation unit 5. The imaging unit 4 outputs the image acquired by the imaging to the control unit 2. In addition, the imaging unit 4 has a function of periodically or continuously recording through images showing the state of the subject before and after the captured image is taken, as sketched below.
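  • (Editorial illustration, not part of the original disclosure.) The periodic recording of through images can be pictured as a small ring buffer of recent frames. The following Python sketch is a minimal reading of that behavior; the class name, the method names, and the capacity of eight frames are all assumptions.

```python
from collections import deque

class ThroughImageBuffer:
    """Holds the most recent through images (still frames recorded
    periodically before and after the shutter press), standing in for
    the through image data 31 kept in the storage unit 3."""

    def __init__(self, capacity=8):
        # Oldest frames are discarded automatically once the buffer is full.
        self._frames = deque(maxlen=capacity)

    def record(self, frame):
        """Called periodically or continuously while previewing."""
        self._frames.append(frame)

    def candidates(self, captured_image):
        """The captured image plus the buffered through images form the
        candidate set for character recognition and translation."""
        return [captured_image] + list(self._frames)
```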
  • the operation unit 5 receives an operation of the user of the translation apparatus 1 and is typically a physical key, a keyboard, an imaging button, a touch panel, and the like.
  • The control unit 2 includes an operation unit IF (interface) 21, a mode setting unit 22, an imaging unit IF 23, an image processing unit 24, a character recognition unit 25, a translation unit 26, a display layout setting unit 27, and a display control unit 28.
  • The display unit 6 is a display device that, based on instructions from the control unit 2, displays images, the results of character recognition of characters or character strings included in an image, and the results of translating recognized characters or character strings. A liquid crystal display panel, an EL (Electro Luminescence) display panel, or the like can be used as the display unit 6, and the display unit 6 may be a touch panel combining the functions of image display and operation input. In the translation apparatus 1, the display unit 6 displays captured images, through images, recognized characters or character strings, the translation results of recognized characters or character strings, and the like.
  • The operation unit IF 21 converts user operation signals input via the operation unit 5 into digital data that can be processed, and transmits the data to each unit of the control unit 2.
  • the mode setting unit 22 sets or changes the operation mode of the translation apparatus 1 in accordance with a user operation via the operation unit 5, and notifies each unit of the control unit 2 that the operation mode is set or changed to a specific operation mode.
  • Operation modes include a character recognition mode for recognizing characters or character strings, a translation mode for translating recognized characters or character strings into a predetermined language, and various display modes (a captured image display mode, a through image display mode, a recognized character display mode, a translation result display mode, etc.).
  • The imaging unit IF 23 receives the image captured by the user with the imaging unit 4 and stores it in the storage unit 3 as captured image data 32 (hereinafter simply the "captured image"). It also receives the through images captured periodically or continuously by the imaging unit 4 before and after the captured image and stores them in the storage unit 3 as through image data 31 (hereinafter simply "through images"). Further, the imaging unit IF 23 reads the captured image and the through images from the storage unit 3 and passes them to the image processing unit 24.
  • Note that the captured image is not always taken at the same angle or inclination; it is taken at a particular angle or inclination that changes from moment to moment. Here, the angle or inclination can be defined with respect to the vertical direction of the memo paper (the same as the vertical direction of the characters it contains).
  • FIGS. 5(a) and 5(b) respectively show states before and after imaging by the translation apparatus 1. FIG. 5(a) shows the moment at which the user, viewing the camera preview of the subject (memo paper) on the display unit 6, is about to press the imaging button. FIG. 5(b) shows a state in which, owing to hand shake caused by pressing the imaging button, the image has been captured with the vertical direction of the subject (the same as the vertical direction of the characters it contains) tilted slightly clockwise with respect to the longitudinal direction of the display screen. Until the user shoots, the tilt and focus of the subject change from moment to moment due to camera shake.
  • For this reason, not only the instantaneous captured image taken by the user but also the through images are stored periodically or continuously.
  • As a result, the user can take the captured image without being conscious of environmental factors such as angle and shadow when imaging a signboard, menu, or the like to be translated; if characters can be recognized in the preceding or following through images, the translation result the user expects can still be displayed. For example, the through images A and B shown in FIGS. 5(c) and 5(d) were captured with the vertical direction of the subject slightly tilted clockwise with respect to the longitudinal direction of the display screen, or with the subject not tilted but out of focus. On the other hand, the through image C shown in FIG. 5(e) happens to have been captured with the subject neither tilted with respect to the longitudinal direction of the display screen nor out of focus. In such a case, a correct result is more likely to be obtained by processing the through image C.
  • The image processing unit 24 performs image processing on the captured image and the through images received from the imaging unit IF 23. The image processing unit 24 includes an analysis area setting unit 241 and a character cutout unit 242, and transmits the captured image to the display layout setting unit 27 when the captured image display mode is currently set.
  • The analysis area setting unit 241 automatically sets an area of a specific size including the vicinity of the center of each image (the captured image and each through image) as the analysis area (specific image region), or sets the image area specified (or selected) by the user via the operation unit 5 as the analysis area, as sketched below.
  • the analysis area setting unit 241 stores information related to the set analysis area in the storage unit 3 as analysis area setting information 33.
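  • (Editorial illustration.) A minimal sketch of how the two kinds of analysis area described above might be derived; the function name and the 50% default size are assumptions, not values given in this disclosure.

```python
def center_analysis_area(img_width, img_height, scale=0.5):
    """Rectangle of a specific size around the image center, i.e. the
    automatically set analysis area. Returns (x0, y0, x1, y1)."""
    w, h = int(img_width * scale), int(img_height * scale)
    x0 = (img_width - w) // 2
    y0 = (img_height - h) // 2
    return (x0, y0, x0 + w, y0 + h)

# Automatically chosen area for a 640x480 image: (160, 120, 480, 360).
auto_area = center_analysis_area(640, 480)
# A user-designated area (selected via the operation unit 5) would simply
# replace it; the coordinates below are a hypothetical selection.
user_area = (200, 150, 420, 260)
```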
  • The character cutout unit 242 generates cutout images (see, for example, (b) and (f) of FIG. 6) by cutting out the image portions containing a character or character string from the captured image and the through images, and passes them to the character recognition unit 25. Since a conventional method can be used to generate the cutout images, its description is omitted.
  • the character recognition unit 25 compares, for example, a character included in the cut-out image received from the character cut-out unit 242 and a character model recorded in association with the character code in the character / character string DB 34. When the similarity between the character included in the clipped image and the character model exceeds a preset threshold, the character included in the clipped image is recognized as a character having a character code corresponding to the similar character model.
  • the recognition rate for character recognition may be determined based on, for example, the degree of similarity between characters included in the cut-out image and the character model.
  • the character recognition method is not limited to such a method.
  • For example, a feature amount may be extracted from the character included in the cutout image, and the character may be recognized as the character whose code is associated with a similar feature amount.
  • When recognizing a character string, the character recognition unit 25 recognizes the characters constituting the string one by one and checks whether a character string in which the recognized characters are arranged exists in the character/character string DB 34. At this time, for example, the average of the characters' recognition rates may be used as the recognition rate of the recognized character string.
  • The method of recognizing a character string is not limited to the above; other known methods can be used. For example, the character string included in the cutout image may be compared directly with character string models recorded in association with character string codes in the character/character string DB 34.
  • The character recognition unit 25 includes a recognition result evaluation unit 251, which determines that character recognition has succeeded when the recognition rate of the character or character string exceeds a predetermined threshold.
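  • (Editorial illustration.) The matching and thresholding described above can be sketched as follows. The pixel-overlap similarity measure, the database layout, and the 0.8 threshold are assumptions; the description only requires some similarity measure against the character/character string DB 34 and a threshold test.

```python
def similarity(glyph, model):
    # Toy similarity: fraction of matching pixels between two equally
    # sized binary bitmaps (lists of 0/1 rows).
    total = sum(len(row) for row in glyph)
    same = sum(a == b for g_row, m_row in zip(glyph, model)
                      for a, b in zip(g_row, m_row))
    return same / total

def recognize_character(glyph, character_db, threshold=0.8):
    """Return (character_code, recognition_rate) for the best-matching
    character model if it clears the threshold, else None."""
    best_code, best_sim = None, 0.0
    for code, model in character_db.items():
        sim = similarity(glyph, model)
        if sim > best_sim:
            best_code, best_sim = code, sim
    return (best_code, best_sim) if best_sim > threshold else None

def string_recognition_rate(char_rates):
    # Per the description, a recognized string's recognition rate may be
    # the average of its characters' recognition rates.
    return sum(char_rates) / len(char_rates)
```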
  • the character recognition unit 25 transmits the result of character recognition to the display layout setting unit 27 described later.
  • The translation unit 26 translates a character or character string successfully recognized by the character recognition unit 25 into a specific language (for example, from English into Japanese). More specifically, the translation unit 26 checks whether a translation corresponding to the recognized character or character string exists in the dictionary DB 35. At this time, the evaluation value indicating the quality of the translation result, that is, the translation accuracy, may be determined based on, for example, the recognition rate of the character or character string.
  • The translation evaluation method is not limited to the above. For example, the translation result may be judged best when the largest number of the characters or character strings contained in the captured image is successfully translated; in that case the evaluation value can be computed as: evaluation value = (total number of characters or character strings successfully translated) / (total number of characters or character strings included in the captured image).
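  • (Editorial illustration.) The evaluation value above reduces to a simple ratio; the helper name is an assumption.

```python
def translation_evaluation_value(num_translated, num_total):
    """Evaluation value = (characters or character strings successfully
    translated) / (all characters or character strings in the image)."""
    return num_translated / num_total if num_total else 0.0

# e.g. 3 of 4 strings translated -> 0.75, consistent with the 75% and
# 25% (1 of 4) cases discussed for FIG. 7 below.
```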
  • the translation unit 26 of the present embodiment includes a translation evaluation unit 261 and a weighting setting unit 262.
  • the translation evaluation unit 261 calculates an evaluation value indicating the degree of quality of the translation result for each of the captured image and the through image, and evaluates the quality of the translation result based on the magnitude of the evaluation value.
  • The weighting setting unit 262 sets weights for the evaluation values according to position within each image. When translating the characters contained in a target image, there are cases where the user does not want every character in the image translated with equal importance. In such cases, since the user is considered to often frame the characters he or she most wants translated at the center of the screen, the translation result that covers the characters near the center of the screen, as in FIG. 7, is considered to be the result the user expects.
  • For example, when the translation result shown in FIG. 7(b) is evaluated at 75% and another translation result shown in FIG. 7 at 25%, the former is judged to be the better result. Therefore, when "Rest Today", present in the vicinity of the center of the screen, can be translated, points can be added so that the translation result shown in (c) of FIG. 7 is displayed. Such added points are called weighting.
  • The place to be weighted is not limited to the center of the screen; the user may be allowed to designate a place that he or she particularly wants translated.
  • Specifically, the weighting setting unit 262 may refer to the weighting setting information 36 recorded in advance in the storage unit 3 and set the weight of the evaluation value calculated for the analysis area of each image (the captured image and each through image) larger than the weight of the evaluation value calculated for areas other than the analysis area.
  • The weighting method is not limited to the above. For example, the weighting setting unit 262 may refer to the weighting setting information 36 recorded in advance in the storage unit 3 and reduce the weight according to the distance from the center position of each image; for instance, the weight may be set to 1 / (number of pixels from the center of the image), according to how many pixels a position is away from the image center.
  • The translation evaluation unit 261 calculates, for each image, the weighted sum of the evaluation values using the weights set by the weighting setting unit 262, evaluates the quality of the translation results based on the magnitude of the weighted sum, and identifies the translation result of the image whose weighted sum of evaluation values is largest (see the sketch below).
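  • (Editorial illustration.) A sketch of the positional weighting and the weighted-sum comparison. The 1 / (pixels from the center) weight is the example given above; the data layout, the one-pixel floor at the center, and all names are assumptions.

```python
def position_weight(pos, center):
    # Weight of an evaluation value by position: 1 / (number of pixels
    # from the image center), floored at 1 pixel so the exact center
    # does not divide by zero (the floor is an assumption).
    dist = ((pos[0] - center[0]) ** 2 + (pos[1] - center[1]) ** 2) ** 0.5
    return 1.0 / max(dist, 1.0)

def best_image(per_image_results):
    """per_image_results: {image_id: (center, [(pos, eval_value), ...])},
    one (position, evaluation value) pair per translated string.
    Returns the id of the image whose weighted sum is largest, i.e. the
    image whose translation result should be displayed."""
    def weighted_sum(entry):
        center, results = entry
        return sum(position_weight(p, center) * ev for p, ev in results)
    return max(per_image_results,
               key=lambda image_id: weighted_sum(per_image_results[image_id]))
```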
  • the translation unit 26 transmits the specified translation result to the display layout setting unit 27 described later.
  • Alternatively, the translation evaluation unit 261 may identify the image that maximizes the sum of the evaluation values calculated for the analysis area, and the translation unit 26 may transmit the translation result of the identified image to the display layout setting unit 27.
  • The character or character string that the user wants translated is often near the center of the image rather than at its edge. Therefore, when evaluating the translation results of multiple images, the evaluation values are weighted according to position within the image, and the result from the image with a high evaluation value near the center, or within the image area selected by the user, is displayed. This makes it easier for the user to obtain the desired translation result.
  • the translation unit 26 does not necessarily include the weight setting unit 262.
  • Based on the captured image received from the image processing unit 24, the character recognition results received from the character recognition unit 25, and the translation results received from the translation unit 26, the display layout setting unit 27 integrates these various data according to the currently set display mode and generates the display image data to be displayed on the display unit 6.
  • the display layout setting unit 27 passes the generated display image data to the display control unit 28.
  • the display control unit 28 controls the drive of the display unit 6 to display the display image on the display screen using the display image data. If the display control unit 28 is currently set to the captured image display mode, the display control unit 28 causes the display unit 6 to display the captured image. If the character recognition mode is set, the character recognition result is displayed on the display unit 6.
  • the character recognition result may be displayed together with the character recognition target character or character string, or may be displayed by replacing the character recognition target character or character string with the character recognition result.
  • the display control unit 28 causes the display unit 6 to display the translation result if the translation result display mode is set.
  • The translation result may be displayed together with the character or character string being recognized (see FIG. 6(d), FIG. 7(c), and FIG. 7(f)) or together with the character recognition result; alternatively, the character or character string being recognized may be replaced by the translation result (see FIG. 6(h), FIG. 7(b), and FIG. 7(e)).
  • Conventionally, the user had to take the image while considering environmental factors such as the angle and tilt of the imaging target, and the expected result could not be obtained unless the image was captured in accordance with those factors.
  • With the translation device 1 of the present embodiment, by contrast, not only the instantaneous image captured by the user but also the through images before and after the capture are analyzed, so character recognition and translation results can be displayed with high accuracy using whichever image obtained up to the moment of capture is best suited for analysis. That is, with the translation device 1, the user can often obtain the expected translation result without considering environmental factors when imaging the target.
  • Furthermore, the translation apparatus 1 gives importance to the evaluation of the analysis area and displays to the user the translation result with the best evaluation there; displaying the result from the image whose translation is evaluated best within the analysis area makes it easier for the user to obtain the desired translation result.
  • FIG. 2 is a flowchart showing an aspect of the operation of the translation apparatus 1.
  • First, the translation apparatus 1 is powered on and starts operating. In S11 (hereinafter, "step" is omitted), the user operates the operation unit 5 to set the operation mode to, for example, the translation mode (or the translation result display mode) and activates the imaging unit 4; the process then proceeds to S12.
  • In S12, the imaging unit IF 23 starts saving the through images captured periodically or continuously by the imaging unit 4, storing them in the storage unit 3, and the process proceeds to S13. In S13, the imaging unit 4 takes a captured image in response to a shutter operation by the user via the operation unit 5, stores it in the storage unit 3, and the process proceeds to S14.
  • In S14, the image processing unit 24 checks whether an unanalyzed image exists among the captured image and the through images (still images) taken before and after it. If an unanalyzed image exists, the process proceeds to S15; if not, it returns to S12. At this time, the image processing unit 24, for example, automatically sets as the analysis area a circular area bounded by the locus of points at a fixed distance from the center coordinates of the unanalyzed image, or sets the analysis area according to the user's designation via the operation unit 5.
  • In S15, the character recognition unit 25 performs character recognition processing only within the analysis area (near the center of the image, or the region selected by the user), and the process proceeds to S16. In S16, if at least one character has been successfully recognized, the process proceeds to S17; otherwise it returns to S14. The details of character recognition are as described above.
  • In S17, the translation unit 26 performs translation processing on the successfully recognized character or character string, and the process proceeds to S18. The details of the translation processing are as described above.
  • In S18, the translation unit 26 determines whether the evaluation value (translation accuracy) of the translation result is equal to or greater than a predetermined threshold. If so, the process proceeds to S19, and the translation unit 26 transmits to the display layout setting unit 27 the translation result of the image with the best translation evaluation among the images processed; if not, the process returns to S14. In S19, the display layout setting unit 27 has the display control unit 28 drive the display unit 6 to display the translation result of the image with the best translation evaluation, and the process ends ("END").
  • In this way, not the entire image but only the analysis area is used when evaluating each frame image. This raises the recognition rate of characters or character strings within the analysis area and also shortens the processing time per frame, so more images can be used for evaluation without keeping the user waiting, reducing the burden on the user during imaging. The overall flow is condensed in the sketch below.
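  • (Editorial illustration.) The S11 to S19 loop reduces to roughly the following; recognize, translate, evaluate, and display stand in for the units 25, 26, 26/261, and 28, the 0.7 threshold is an assumption, and the single pass over a fixed image list simplifies the return paths to S12 and S14.

```python
def translation_flow(captured, through_images, recognize, translate,
                     evaluate, display, threshold=0.7):
    """One pass over the captured image and its surrounding through
    images, keeping the best-evaluated translation result and showing
    it once it clears the threshold."""
    best_score, best_result = 0.0, None
    for image in [captured] + list(through_images):   # S14: next unanalyzed image
        text = recognize(image)      # S15: recognize inside the analysis area only
        if not text:                 # S16: nothing recognized, try the next image
            continue
        result = translate(text)     # S17: translate the recognized text
        score = evaluate(result)     # S18: evaluation value (translation accuracy)
        if score > best_score:
            best_score, best_result = score, result
    if best_result is not None and best_score >= threshold:
        display(best_result)         # S19: show the best image's result ("END")
```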
  • FIG. 6 is an explanatory diagram for explaining the operation of the translation apparatus 1 described above.
  • FIGS. 6(a) to 6(d) show the flow of operation when the vicinity of the center (analysis area) of the captured image is cut out. In FIG. 6(a), a rectangular image area A1 near the center of the target image is set as the analysis area. When the processing range is limited to the analysis area in this way, the processing time is shorter than when the entire image is processed.
  • The shape of the analysis area is not particularly limited; it may be a circle, a rectangle, or the like.
  • In FIG. 6(b), the character cutout unit 242 first cuts out only the vicinity of the center of the target image, and the character string "Rest Today" is cut out (extracted).
  • FIG. 6(c) shows a state where the translation unit 26 has translated "Rest Today" and obtained the translation result "Today's holiday". In FIG. 6(d), the character string "Today's holiday" is displayed on the display unit 6 as the translation result TR. Here, the character string "Rest Today" that the user most wanted translated is also displayed as the translation target UR, and the display is devised so that it can be seen at a glance that the translation result TR corresponds to the translation target UR.
  • FIGS. 6(e) to 6(h) show the operation flow of the translation apparatus 1 when an image area designated by the user is cut out. In FIG. 6(e), the image area A2 designated by the user, whose translation is to be given importance, is set as the analysis area.
  • In FIG. 6(f), the character cutout unit 242 first cuts out only the analysis area portion of the target image, and the character string "Rest Today" is cut out.
  • FIG. 6(g) shows a state where the translation unit 26 has translated "Rest Today" and obtained the translation result "Today's holiday".
  • FIG. 6(h) shows an example in which the character string "Today's holiday", as the translation result TR, is displayed on the display unit 6 in place of "Rest Today", the translation target UR.
  • FIG. 3 is a flowchart showing another aspect of the operation of the translation apparatus 1. Since the operations of S21 to S24, S26, S27, and S29 are substantially the same as the operations of S11 to S14, S16, S17, and S19 described above, description thereof is omitted here.
  • In S25, the character recognition unit 25 performs character recognition processing not only on the analysis area but on the entire image.
  • In S28, the translation evaluation unit 261 determines whether a translation result with an accuracy (evaluation value) equal to or higher than the threshold has been obtained within the analysis area. If so, the process proceeds to S29; if not, it returns to S24.
  • In this case, since the character recognition range is the entire image, the processing time per frame cannot be shortened, but the recognition rate of characters or character strings within the analysis area can still be increased.
  • FIG. 4 is a flowchart showing still another aspect of the operation of the translation apparatus 1. Since each operation of S31 to S37 is almost the same as each operation of S21 to S27 described above, description thereof is omitted here.
  • the weight setting unit 262 weights the evaluation value of the translation result according to the position in the image. The details of the weighting method are as described above.
  • The translation evaluation unit 261 calculates, for each image, the weighted sum of the evaluation values using the weights set by the weighting setting unit 262, and evaluates the quality of the translation results based on the weighted sum.
  • The display control unit 28 then drives the display unit 6 to display the translation result corresponding to the image with the largest weighted sum of evaluation values, and the process ends ("END").
  • The control blocks of the translation apparatus 1 (or the control unit 2) (in particular, the imaging unit IF 23, the image processing unit 24, the character recognition unit 25, the translation unit 26, the translation evaluation unit 261, the weighting setting unit 262, and the like) may be realized by logic circuits (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU.
  • In the latter case, the translation apparatus 1 includes a CPU that executes the instructions of the program, that is, the software realizing each function, a ROM (Read Only Memory) or storage device (referred to here as the "recording medium") on which the program and various data are recorded so as to be readable by a computer (or CPU), a RAM for loading the program, and so on. The object of the present invention is achieved by the computer (or CPU) reading the program from the recording medium and executing it. As the recording medium, a "non-transitory tangible medium" such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used.
  • the program may be supplied to the computer via any transmission medium (such as a communication network or a broadcast wave) that can transmit the program.
  • The present invention can also be realized in the form of a data signal, embedded in a carrier wave, in which the program is embodied by electronic transmission.
  • A translation device (1) according to aspect 1 of the present invention includes: an image acquisition unit (imaging unit 4) that acquires a captured image and at least one still image (through image) recorded before or after the captured image is taken; a character recognition unit (25) that recognizes characters or character strings included in each acquired image; a translation unit (26) that translates the recognized characters or character strings; a translation evaluation unit (261) that evaluates the quality of the translation results of the characters or character strings; and a display control unit (28) that performs control to display, among the translation results for the respective images, the translation result of the image whose translation result within a specific image region is evaluated best.
  • According to the above configuration, character recognition processing, translation processing, evaluation of the translation results, and display of the translation results are performed not only on the captured image but also on at least one still image recorded before or after it. Therefore, when imaging a signboard, menu, or the like to be translated, if character recognition and translation succeed in any of the still images before or after the capture, the translation result the user expects can be displayed without the user having to be conscious of environmental factors such as angle and shadow. Moreover, the translation accuracy in a specific image region can be improved by displaying to the user the result from the image with the best translation evaluation in that region.
  • In a translation device according to aspect 2 of the present invention, in aspect 1 above, the translation evaluation unit may calculate, for each image, an evaluation value indicating the degree of quality of the translation result of the character or character string and evaluate the quality of the translation result based on the magnitude of the evaluation value, and the display control unit may perform control to display the translation result of the image in which the sum of the evaluation values calculated for the specific image region is largest. According to this configuration, the translation result of the image in which the sum of the evaluation values calculated for the specific image region is largest, that is, the image whose translation results in the specific image region are evaluated highest, is displayed, so the translation accuracy in the specific image region can be improved.
  • In a translation device according to aspect 3 of the present invention, in aspect 1 above, the translation evaluation unit may calculate, for each image, an evaluation value indicating the degree of quality of the translation result of the character or character string, and may include a weighting setting unit that sets the weight of the evaluation value calculated for the specific image region larger than the weight of the evaluation value calculated for regions other than that image region; the translation evaluation unit may calculate, for each image, the weighted sum of the evaluation values using the weights set by the weighting setting unit and evaluate the quality of the translation results based on the magnitude of the weighted sum; and the display control unit may perform control to display the translation result of the image in which the weighted sum of the evaluation values of the translation results is largest. The character or character string the user wants translated often lies near the center of the image rather than at its edge. According to this configuration, the weight of the evaluation value of the translation result calculated for the specific image region is set larger than that calculated for other regions, the weighted sum of the evaluation values of the translation results is calculated for each image, and the translation result of the image with the largest weighted sum is displayed, so the translation accuracy in the specific image region can be improved.
  • In a translation device according to aspect 4 of the present invention, in any of aspects 1 to 3 above, the character recognition unit may recognize only the characters or character strings included in the specific image region. There are cases where the user does not want every character in the image translated with equal importance, and the user is considered to often frame the characters he or she most wants translated near the center of the screen. Therefore, for example, character recognition may be performed only near the center of the image (the specific image region) without recognizing characters in the entire image.
  • a control program for causing a computer to execute processing in the translation apparatus according to any one of aspects 1 to 4 of the present invention and a computer-readable recording medium on which the control program is recorded also fall within the scope of the present invention.
  • The present invention can be used in information processing apparatuses equipped with a character recognition function for recognizing characters or character strings included in a captured image and a translation function for translating the recognized characters or character strings, for example various information processing apparatuses such as a PC (Personal Computer), a mobile phone, a smartphone, a tablet PC, an electronic dictionary, a digital camera, and a game machine.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The translation device according to the invention comprises the following elements: a character recognition unit (25) that performs character recognition on text and the like in a captured image and a still image; a translation unit (26) that translates the recognized text or the like; a translation evaluation unit (261) that evaluates the acceptability of each translation result; and a display control unit (28) that controls the translation device so as to display the translation result from the image that gave the best translation result for the area under analysis.
PCT/JP2014/071716 2013-09-27 2014-08-20 Translation device and control program therefor WO2015045680A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2013-202483 2013-09-27
JP2013202483A JP6144168B2 (ja) 2013-09-27 2013-09-27 Translation device and control program therefor

Publications (1)

Publication Number Publication Date
WO2015045680A1 (fr) 2015-04-02

Family

ID=52742834

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/071716 WO2015045680A1 (fr) 2013-09-27 2014-08-20 Translation device and control program therefor

Country Status (2)

Country Link
JP (1) JP6144168B2 (fr)
WO (1) WO2015045680A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000023012A (ja) * 1998-07-06 2000-01-21 Olympus Optical Co Ltd 翻訳機能付カメラ
JP2003178067A (ja) * 2001-12-10 2003-06-27 Mitsubishi Electric Corp 携帯端末型画像処理システム、携帯端末およびサーバ
JP2003242440A (ja) * 2002-02-20 2003-08-29 Fujitsu Ltd 文字認識方法及びその装置
JP2007052613A (ja) * 2005-08-17 2007-03-01 Fuji Xerox Co Ltd 翻訳装置、翻訳システムおよび翻訳方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008054236A (ja) * 2006-08-28 2008-03-06 Nikon Corp 撮像装置


Also Published As

Publication number Publication date
JP2015069366A (ja) 2015-04-13
JP6144168B2 (ja) 2017-06-07

Similar Documents

Publication Publication Date Title
US11113523B2 (en) Method for recognizing a specific object inside an image and electronic device thereof
US10803367B2 (en) Method and apparatus for recognizing characters
US9477138B2 (en) Autofocus
US10291843B2 (en) Information processing apparatus having camera function and producing guide display to capture character recognizable image, control method thereof, and storage medium
JP2008205774A (ja) Imaging operation guidance system, imaging operation guidance method, and imaging operation guidance program
CN108694400A (zh) Information processing apparatus, control method therefor, and storage medium
JPWO2015163118A1 (ja) Character specifying device and control program
US10373329B2 (en) Information processing apparatus, information processing method and storage medium for determining an image to be subjected to a character recognition processing
CN112822394A (zh) Display control method and apparatus, electronic device, and readable storage medium
CN111784604A (zh) Image processing method, apparatus, and device, and computer-readable storage medium
JP2010217997A (ja) Character recognition device, character recognition program, and character recognition method
JP6144168B2 (ja) Translation device and control program therefor
WO2015045679A1 (fr) Information device and control program
JP6251075B2 (ja) Translation device
KR102071975B1 (ko) Card payment apparatus and method using optical character recognition
US10321089B2 (en) Image preproduction apparatus, method for controlling the same, and recording medium
JP2005055973A (ja) Portable information terminal
US20230215311A1 (en) Automatic user interface reconfiguration based on external capture
US20090244002A1 (en) Method, Device and Program for Controlling Display, and Printing Device
KR20200069869A (ko) System and method for automatically translating characters in images
CN113495836A (zh) Page detection method and apparatus, and apparatus for page detection
CN116745673A (zh) Image processing device, imaging device, image processing method, and program
US9742955B2 (en) Image processing apparatus, image processing method, and storage medium
KR20140112919A (ko) Image processing apparatus and method
JP2006331216A (ja) Image processing apparatus, method for specifying a processing target range in the image processing apparatus, image processing range specifying program, and recording medium recording the image processing range specifying program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14849814

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14849814

Country of ref document: EP

Kind code of ref document: A1