WO2021018110A1 - Translation pen and control method therefor - Google Patents
Translation pen and control method therefor
- Publication number
- WO2021018110A1 (PCT/CN2020/105026)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- translated
- image
- pen
- processor
- Prior art date
Classifications
- G06V10/17—Image acquisition using hand-held instruments
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
- G06F40/40—Processing or translation of natural language
- G06F40/47—Machine-assisted translation, e.g. using translation memory
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/235—Image preprocessing by selection of a specific region based on user input or interaction
- G06V30/142—Image acquisition using hand-held instruments; Constructional details of the instruments
- G06V30/1444—Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
- G06V30/1456—Selective acquisition, locating or processing of specific regions based on user interactions
- G06V30/148—Segmentation of character regions
- G06V30/158—Segmentation of character regions using character size, text spacings or pitch estimation
- G06V30/242—Division of the character sequences into groups prior to recognition; Selection of dictionaries
- G06V30/10—Character recognition
Definitions
- the present disclosure relates to the technical field of translation devices, in particular to a translation pen and a control method thereof.
- A scanning translation pen scans printed text into the text recognition system inside the pen, so that the system can recognize the scanned text and then translate the recognized text using the translation software in the pen.
- Such a scanning translation pen has a pen tip with a brush-like appearance; the tip is pressed against the page and brushed along the line direction of the printed text to scan the text for translation.
- In the scanning process, the above scanning translation pen scans line by line, and the text to be scanned on the same line is occluded, which affects the reader's perception.
- Moreover, such a scanning method cannot control the amount of text scanned well; it is easy to scan too much, which causes inaccurate translation results.
- a translation pen which includes a pen body, a pointing component, an image collector, and a first processor.
- the pen body includes a pen tip end.
- the indicating component is arranged at the tip end of the pen body.
- the image collector is arranged on the pen body; the image collector is configured to collect an image of the text to be translated according to the position indicated by the indicating component, and send the collected image of the text to be translated.
- the first processor is arranged in the pen body and is electrically connected to the image collector; the first processor is configured to receive the image of the text to be translated sent by the image collector and to recognize the text to be translated in that image.
- the pen body further has a pen tail end opposite to the pen tip end.
- the image collector is arranged on an outer side of the pen body; along the direction from the pen tip end to the pen tail end, the image collector is farther from the pen tip than the indicating component.
- the indicating component is arranged within the viewing angle range of the image collector.
- the indicating member can be used to paint colors on external objects.
- the indicating member has a tip.
- the first processor includes: a detection component electrically connected to the image collector, and a locking component electrically connected to the detection component.
- the detection component is configured to detect the image of the text to be translated to form at least one text box, and the text box contains part of the text in the image of the text to be translated.
- the locking component is configured to lock a text box that meets the set requirements in at least one of the formed text boxes, and use the text in the locked text box as the text to be translated.
- the setting requirement is that, along the column direction of the text, the row where the text box is located is the row closest to the indicating component; and, along the row direction of the text, the center line of the text box is closest to the center line of the image of the text to be translated.
- alternatively, the setting requirement is that, along the column direction of the text, the row where the text box is located is the row closest to the indicating component, and at least part of the area in the text box is painted.
- the first processor further includes: an adaptive word segmentation component electrically connected to the locking component.
- the adaptive word segmentation component is configured to calculate a threshold value of the separation distance between adjacent letters in the text to be translated according to the size of the locked text to be translated, and to segment the locked text to be translated according to the threshold.
- the translation pen further includes: a second processor disposed in the pen body, and the second processor is electrically connected to the first processor.
- the second processor is configured to receive the text to be translated recognized by the first processor, translate the text to be translated, and generate and send a translation result.
- an observation window is provided on the pen body.
- the translation pen further includes: a display screen arranged in the observation window of the pen body, and the display screen is electrically connected with the second processor.
- the display screen is configured to receive the translation result sent by the second processor, and display the translation result.
- a control method applied to the above translation pen includes: the image collector of the translation pen collecting an image of the text to be translated according to the position indicated by the indicating component of the translation pen; and the first processor recognizing the text to be translated in the image of the text to be translated.
- the first processor recognizing the text to be translated in the image of the text to be translated includes: detecting the image of the text to be translated to form at least one text box.
- the text box contains part of the text in the image of the text to be translated; a text box that meets the set requirements is locked among the at least one text box formed, and the text in the locked text box is used as the text to be translated.
- locking a text box that meets a set requirement in the formed at least one text box includes: selecting the text line that is closest to the indicating component along the column direction of the text; then, among the text boxes in the selected line, determining the center line of each text box and the center line of the image of the text to be translated, and locking the text box whose center line is closest to the center line of the image of the text to be translated.
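The two-step locking rule above (nearest row along the column direction, then nearest center line along the row direction) can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the (x, y, w, h) box format and the 5-pixel row tolerance are assumptions.

```python
def lock_text_box(boxes, image_width, indicated_y):
    """Pick the text box to translate.

    boxes: list of (x, y, w, h) text boxes; indicated_y: the y coordinate
    of the position indicated by the pen's indicating component.
    """
    if not boxes:
        return None
    # Step 1: along the column direction, keep the row nearest the indicator.
    nearest_row_y = min(boxes, key=lambda b: abs(b[1] - indicated_y))[1]
    row = [b for b in boxes if abs(b[1] - nearest_row_y) < 5]  # assumed tolerance
    # Step 2: along the row direction, lock the box whose vertical center
    # line is closest to the image's vertical center line.
    image_center = image_width / 2
    return min(row, key=lambda b: abs((b[0] + b[2] / 2) - image_center))
```

With boxes in two rows, only the boxes in the row nearest the indicated position compete on center-line distance.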
- alternatively, locking a text box that meets a set requirement in the formed at least one text box includes: selecting the text line closest to the indicating component along the column direction of the text; and locking the text box in which at least part of the area is colored.
- the method further includes: adaptively segmenting the text to be translated.
- the adaptive segmentation of the text to be translated includes: obtaining the sub-text boxes circumscribing each letter in the text to be translated; determining a reference text box according to the area of each sub-text box; calculating the separation distance threshold between two adjacent letters according to the width of the reference text box; obtaining the actual separation distance value between every two adjacent letters; and segmenting the text to be translated according to the actual separation distance value between every two adjacent letters and the separation distance threshold.
- determining the reference text box according to the area of each sub-text box includes: selecting, according to the areas of the sub-text boxes, a sub-text box whose area value lies in a middle range, and using it as the reference text box.
- the difference between the lower limit of the middle range and the minimum area value of the sub-text boxes is equal to, or approximately equal to, the difference between the upper limit of the middle range and the maximum area value of the sub-text boxes.
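The "middle range" selection above can be sketched as picking the sub-text box whose area is closest to the midpoint between the minimum and maximum areas, which satisfies the equal-distance property just described; treating the midpoint as the center of the middle range is an assumption.

```python
def reference_box(sub_boxes):
    """sub_boxes: list of (w, h) boxes circumscribing individual letters."""
    areas = [w * h for w, h in sub_boxes]
    # Midpoint of [min, max]: equally far from the smallest and largest areas.
    mid = (min(areas) + max(areas)) / 2
    # Reference box: the one whose area is nearest that midpoint.
    return min(sub_boxes, key=lambda b: abs(b[0] * b[1] - mid))
```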
- calculating the separation distance threshold between two adjacent letters according to the width of the reference text box includes calculating the threshold according to a formula (not reproduced in this text), where N is the separation distance threshold between two adjacent letters and W is the ratio of the width of the reference text box to the width of a unit pixel in the image.
- segmenting the text to be translated according to the actual separation distance value between every two adjacent letters and the separation distance threshold includes: comparing the actual separation distance value between two adjacent letters with the separation distance threshold; if the actual separation distance value is greater than the separation distance threshold, the two adjacent letters belong to two adjacent words; if the actual separation distance value is less than or equal to the separation distance threshold, the two adjacent letters belong to the same word.
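The comparison rule above can be sketched as a grouping pass over letter boxes ordered left to right. Since the patent's threshold formula is not reproduced here, the threshold below (half the reference character width) is an assumed stand-in; only the comparison logic follows the text.

```python
def segment_words(letter_boxes, ref_width):
    """letter_boxes: list of (x, w) per letter, ordered left to right."""
    threshold = ref_width / 2  # assumed stand-in for the patent's formula
    words, current = [], [letter_boxes[0]]
    for prev, cur in zip(letter_boxes, letter_boxes[1:]):
        gap = cur[0] - (prev[0] + prev[1])  # actual separation distance
        if gap > threshold:
            # Greater than the threshold: letters belong to adjacent words.
            words.append(current)
            current = [cur]
        else:
            # Less than or equal: letters belong to the same word.
            current.append(cur)
    words.append(current)
    return words
```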
- when the translation pen further includes a second processor and a display screen, after the first processor recognizes the text to be translated in the image of the text to be translated, the method further includes: the second processor translating the text to be translated recognized by the first processor to generate a translation result; and the display screen displaying the translation result.
- the first processor may re-recognize the text to be translated in the image of the text to be translated, and the re-recognized text to be translated cannot be the previously recognized text to be translated.
- Figure 1 is a structural diagram of a translation pen according to some embodiments of the present disclosure.
- Figure 2 is a schematic diagram of the connection of various components of the translation pen according to some embodiments of the present disclosure.
- Figure 3 is another schematic diagram of the connection of various components of the translation pen according to some embodiments of the present disclosure.
- Figure 4 is a schematic diagram of the use of a translation pen according to some embodiments of the present disclosure.
- Figure 5 is a schematic diagram of a detection image of a translation pen according to some embodiments of the present disclosure.
- Figure 6 is a schematic diagram of a translation pen locking a text to be translated according to some embodiments of the present disclosure.
- Figure 7 is another schematic diagram of locking the text to be translated by the translation pen according to some embodiments of the present disclosure.
- Figure 8 is a schematic diagram of invalid text collected by a translation pen according to some embodiments of the present disclosure.
- Figure 9 is a flowchart of a control method applied to a translation pen according to some embodiments of the present disclosure.
- Figure 10 is a flowchart of adaptive word segmentation in a control method according to some embodiments of the present disclosure.
- Figure 11 is a flowchart of acquiring sub-text boxes in a control method according to some embodiments of the present disclosure.
- Figures 12 to 15 are schematic diagrams of various steps of adaptive word segmentation according to some embodiments of the present disclosure.
- first and second are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of indicated technical features. Thus, the features defined with “first” and “second” may explicitly or implicitly include one or more of these features. In the description of the embodiments of the present disclosure, unless otherwise specified, “plurality” means two or more.
- the expressions “coupled” and “connected” and their extensions may be used.
- the term “connected” may be used when describing some embodiments to indicate that two or more components are in direct physical or electrical contact with each other.
- the term “coupled” may be used when describing some embodiments to indicate that two or more components have direct physical or electrical contact.
- the term “coupled” or “communicatively coupled” may also mean that two or more components are not in direct contact with each other, but still cooperate or interact with each other.
- the embodiments disclosed herein are not necessarily limited to the content herein.
- a and/or B includes the following three combinations: A only, B only, and the combination of A and B.
- the translation pen 100 includes a pen body 1 serving as an enclosure, an indicating component 2 provided on the pen body 1, an image collector 3, and a first processor 4 (see Fig. 2) arranged in the pen body.
- the pen body 1 of the translation pen 100 has a pen tip 11 and a pen tail 12 opposite to the pen tip 11.
- the indicating member 2 is disposed on the pen tip 11 of the pen body 1.
- the pen tip 11 has a mounting hole, and the indicating member 2 is disposed on the pen tip 11 through the mounting hole.
- the indicating component 2 is configured to indicate the position of the image of the text to be translated.
- the image collector 3 is configured to collect an image of the text to be translated according to the position indicated by the indicating component 2 and send the collected image of the text to be translated.
- the image collector 3 may use an electronic device with an image acquisition function such as a camera.
- the first processor 4 is electrically connected to the image collector 3.
- the first processor 4 is configured to receive the image of the text to be translated sent by the image collector 3, and to recognize the text to be translated in the image of the text to be translated.
- the first processor 4 may also be arranged outside the pen body 1.
- the first processor 4 may be a computer terminal or a computer server.
- the "text to be translated” can be the text printed on a book page or the text on an electronic display device. Moreover, the "text to be translated” may be English letters and/or words, and the embodiments of the present disclosure are not limited thereto.
- the indicating member 2 is arranged on the tip end 11 of the pen body 1
- the image collector 3 is arranged on the pen body 1
- the indicating member 2 and the image collector 3 are arranged separately, which prevents the pen tip end 11 of the pen body 1 from blocking the image of the text to be translated collected by the image collector 3.
- the position of the image of the text to be translated is indicated by the indicating component 2
- the image collector 3 collects the image of the text to be translated according to the position indicated by the indicating component 2
- the first processor 4 recognizes the text to be translated in the collected image. This can improve the accuracy of the image collected by the image collector 3 and avoid collecting redundant images, thereby helping the first processor 4 accurately recognize the text to be translated and improving the translation accuracy of the translation pen 100.
- the image collector 3 is arranged on a side outside the pen body 1, for example, the image collector 3 is fixed on the side wall of the pen body 1.
- the image collector 3 is farther from the pen tip end 11 than the indicating component 2, and the indicating component 2 is set within the viewing angle range of the image collector 3. In this way, it can be ensured that the image collector 3 collects the image of the text to be translated indicated by the indicating component 2.
- the pen tip end 11 of the pen body 1 is a quadrangular pyramid or approximately a quadrangular pyramid; one end of the quadrangular pyramid is integrally formed with the pen body 1, and the other end is a pointed end.
- the indicating component 2 is fixed at the tip of the quadrangular pyramid, and the lens of the image collector 3 is located at or near the junction of the quadrangular pyramid and the main part of the pen body 1. This ensures that the indicating component 2 is within the viewing angle range of the image collector 3, so that the image collector 3 collects an image of the text to be translated according to the position indicated by the indicating component 2.
- there is a distance between the image collector 3 and the indicating component 2.
- this distance ensures that the size of the image collected by the image collector 3 meets the size requirement.
- the pen body 1 of the translation pen 100 is also provided with a switch button or a controller electrically connected to the first processor 4, which can be used to control the image collector 3 to perform image collection work.
- the indicating member 2 can be used to paint colors on external objects.
- the indicating component 2 can be used to color the text to be translated, so that the image collector 3 can accurately collect the image of the text to be translated.
- the indicating member 2 can be the tip of a highlighter pen or a colored pen, and can perform the same operations of drawing, coloring and marking as a conventional pen.
- the color painted by the indicating component 2 (the color painted in the area where "recurrent” is located in the figure) is different from the printing color of the text to be translated.
- the indicating member 2 has a tip. Compared with the brush-like pen tip of the scanning translation pen in the related art, the indicating member 2 mounted on the translation pen 100 in the embodiments of the present disclosure has a sharp tip of relatively small size, so when the indicating member 2 indicates the location of the image of the text to be translated, it does not block the image of the text to be translated.
- the first processor 4 includes a detection component 41 and a locking component 42.
- the detection component 41 is electrically connected to the image collector 3, and the image collector 3 sends the collected image of the text to be translated to the detection component 41.
- the detection component 41 is configured to detect the image of the text to be translated and to form at least one text box in the image; each text box contains part of the text in the image of the text to be translated.
- Fig. 5 shows a situation where multiple text boxes are formed.
- the locking component 42 is electrically connected to the detecting component 41.
- the locking component 42 is configured to lock, among the formed text boxes, a text box that meets the setting requirements, and to use the text in the locked text box as the text to be translated.
- the above setting requirement is that, along the column direction Y of the text, the row where the text box T is located is the row closest to the indicating component 2; and, along the row direction X of the text, the center line L2 of the text box T is closest to the center line L1 of the image P of the text to be translated.
- the text box T where "recurrent” is located in Figure 6 meets the set requirements, so "recurrent” is the text to be translated.
- alternatively, the setting requirement is that, along the column direction Y of the text to be translated, the row where the text box T is located is the row closest to the indicating component 2, and at least part of the area in the text box T is painted.
- the text box T where "recurrent” is located in Figure 7 meets the set requirements, so "recurrent” is the text to be translated.
- the center line L2 of the text box T and “the center line L1 of the image P of the text to be translated” are the center lines of the text box T and the image P of the text to be translated along the column direction Y, respectively.
- the size of "at least part of the area” in "at least part of the area in the text box T is painted” can be set according to actual requirements, which is not specifically limited in the embodiment of the present disclosure.
- the area of the colored area in the text box T may account for 70% to 100% of the total area of the text box T, for example, 70%, 80%, 90% or 100%. That is to say, for the text box T in the row closest to the indicating component 2 along the column direction Y, the colored area may account for, for example, 70% of the total area of the text box T.
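The coverage criterion above can be sketched by measuring what fraction of a text box's pixels carry the highlight color. Representing the paint-detection result as a 0/1 mask is an assumption.

```python
def painted_fraction(mask):
    """mask: 2-D list of 0/1 values, 1 where the highlight color was detected."""
    total = sum(len(row) for row in mask)
    painted = sum(sum(row) for row in mask)
    return painted / total

def meets_paint_requirement(mask, ratio=0.7):
    # 70% is one of the example thresholds given above.
    return painted_fraction(mask) >= ratio
```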
- the first processor 4 further includes an adaptive word segmentation component 43 electrically connected to the locking component 42.
- the adaptive word segmentation component 43 is configured to calculate a threshold value of the separation distance between adjacent letters in the text to be translated according to the size of the locked text to be translated, and to segment the locked text to be translated according to the threshold.
- CTPN is short for Connectionist Text Proposal Network.
- YOLO is short for You Only Look Once.
- the adaptive word segmentation component 43 in the translation pen 100 provided in the above-mentioned embodiment of the present disclosure can calculate the threshold of the separation distance between adjacent letters in the text to be translated according to the font size of the text to be translated in the image of the text to be translated.
- in this way, the function of adaptively adjusting the separation distance threshold is realized, and the locked text to be translated is segmented according to the threshold, making the translation pen 100 suitable for translating text of various font sizes.
- the translation pen 100 further includes a second processor 5 arranged in the pen body, and the second processor 5 is electrically connected to the first processor 4.
- the second processor 5 is configured to receive the text to be translated recognized by the first processor 4, translate the text to be translated, and generate and send the translation result.
- the second processor 5 may be arranged in the pen body 1.
- the second processor 5 and the first processor 4 may be combined into one processor, or may be arranged separately.
- an observation window 7 is provided on the pen body 1.
- the translation pen 100 further includes a display screen 6 arranged in the observation window 7 of the pen body 1, and the display screen 6 is electrically connected to the second processor 5.
- the display screen 6 is configured to receive the translation result sent by the second processor 5 and display the translation result.
- the observation window 7 is an opening that penetrates the side wall on one side of the pen body 1, and the display screen 6 is embedded in the observation window 7 to ensure the stable installation of the display screen 6.
- The display side of the display screen 6 is exposed outside the pen body 1 so that the translation results displayed on the display screen 6 can be read easily.
- Some embodiments of the present disclosure also provide a control method applied to the aforementioned translation pen 100, as shown in FIG. 9, including the following S1 to S2:
- S1: The image collector 3 of the translation pen 100 collects an image P of the text to be translated according to the position indicated by the indicating component 2 of the translation pen 100.
- S2: The first processor 4 of the translation pen 100 recognizes the text to be translated in the image P of the text to be translated.
- the indicating component 2 indicates the position of the image P of the text to be translated
- the image collector 3 collects the text to be translated according to the position indicated by the indicating component 2
- the first processor 4 recognizes the text to be translated in the collected image P of the text to be translated, which can improve the accuracy of the image collected by the image collector 3, thereby helping the first processor 4 accurately recognize the text to be translated and improving the translation accuracy of the translation pen 100.
- the first processor 4 recognizes the text to be translated in the image of the text to be translated, including the following S21 to S23:
- S21: Detect the image of the text to be translated to form at least one text box; the text box contains part of the text in the image of the text to be translated.
- the text to be translated in the image P of the text to be translated contains multiple letters and has multiple text lines.
- the image P of the text to be translated is detected by the detection component 41, and a plurality of text boxes T are formed in the image P.
- each text box T contains part of the text in the image P of the text to be translated, thereby facilitating the subsequent locking of the text box T where the text to be translated is located.
- S22: Lock a text box that meets the setting requirements among the formed text boxes, and use the text in the locked text box as the text to be translated.
- the locking component 42 of the first processor 4 locks the text box T that meets the setting requirements, and uses the text in the locked text box T as the text to be translated.
- the above setting requirement is that, along the column direction Y of the text, the row where the text box T is located is the row closest to the indicating component 2, and, along the row direction X of the text, the center line L2 of the text box T is closest to the center line L1 of the image P of the text to be translated.
- alternatively, the setting requirement is that, along the column direction Y of the text, the row where the text box T is located is the row closest to the indicating component 2, and at least part of the area in the text box T is painted.
- if the setting requirement is that, along the column direction Y of the text, the row where the text box T is located is the row closest to the indicating component 2, and, along the row direction X of the text, the center line L2 of the text box T is closest to the center line L1 of the image P of the text to be translated, then locking a text box T that meets the setting requirement among the formed at least one text box T (S22) includes the following steps:
- the text line that is closest to the indicator 2 along the column direction Y of the text is selected, that is, the text line where "s recurrent la" is located in the figure.
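The row-then-center-line selection described above can be sketched in Python as follows. This is a minimal illustration only; the function name `lock_text_box`, the `(x, y, w, h)` box format, and the `row_tol` row-grouping tolerance are assumptions, not taken from the patent.

```python
def lock_text_box(boxes, image_width, indicator_y, row_tol=5):
    """Lock the text box in the row nearest the indicating component whose
    center line is closest to the image's center line.

    boxes: list of (x, y, w, h) rectangles; y grows toward the image edge
    where the indicating component appears (at y = indicator_y).
    """
    if not boxes:
        return None
    # Step 1: along the column direction, pick the text row closest to the
    # indicating component (boxes whose top edges differ by <= row_tol
    # pixels are treated as the same row).
    closest = min(boxes, key=lambda b: abs(indicator_y - (b[1] + b[3])))
    row = [b for b in boxes if abs(b[1] - closest[1]) <= row_tol]
    # Step 2: along the row direction, lock the box whose center line is
    # nearest the center line of the image.
    mid = image_width / 2
    return min(row, key=lambda b: abs((b[0] + b[2] / 2) - mid))
```

With four boxes in two rows, the box in the lower row (closer to the indicator) whose center is nearest the image's vertical center line is returned.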
- if the setting requirement is that, along the column direction Y of the text, the row where the text box T is located is the row closest to the indicating component 2, and at least part of the area in the text box T is painted, then locking the text box T that meets the setting requirement among the formed at least one text box T (S22) includes the following steps:
- the text line that is closest to the indicator 2 along the column direction Y of the text is selected, that is, the text line where "s recurrent la" is located in the figure.
- the control method applied to the above-mentioned translation pen 100 further includes: S23, performing adaptive word segmentation on the text to be translated.
- S23 includes the following steps:
- the sub-text box circumscribed by each letter in the text to be translated can be obtained through the following S2311 to S2315:
- the OpenCV function cv::threshold() can be used to binarize the collected image of the text to be translated, obtaining the binarized image of the text to be translated as shown in FIG. 12.
- the erosion image processing method cv::erode() can be used to remove tiny connections between letters of the text to be translated, caused by ink printing problems, in the binarized image of the text to be translated.
- Figure 11 shows that there is a tiny connection between the two letters "N" in "CNN".
- after erosion, the tiny connection between the two letters "N" is removed.
- the dilation image processing method cv::dilate() can be used to expand the letters in the text to be translated, so as to restore the original thickness of the letters, that is, the thickness before the letters were eroded.
- Figure 13 shows the result of removing the tiny connection between the two letters "N" and of the morphological transformation of the letters in the text to be translated (restoring their original thickness).
- the function cv::findContours() can be used to find the contour of each letter in the text to be translated, and the function cv::convexHull() can be used to obtain the convex hull 21 of each contour; the figure shows the convex hulls 21 of the outlines of the two letters "N".
- the minimum circumscribed rectangle of the convex hull 21 of the outline of each letter is calculated; this minimum circumscribed rectangle serves as the sub-text box 22 of that letter.
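The binarize/erode/dilate steps (S2311 to S2313) can be imitated without OpenCV using plain NumPy. The helpers below are hedged counterparts of cv::threshold(), cv::erode() and cv::dilate() with a fixed 3x3 kernel; the function names, the kernel size, and the dark-ink-on-light-paper convention are assumptions for illustration, not the patent's implementation.

```python
import numpy as np

def binarize(gray, thresh=128):
    # Counterpart of cv::threshold(): dark ink -> 1, light paper -> 0.
    return (gray < thresh).astype(np.uint8)

def erode(img):
    # 3x3 erosion (cv::erode counterpart): a pixel survives only if its
    # entire 3x3 neighborhood is set; one-pixel ink bridges disappear.
    p = np.pad(img, 1)
    h, w = img.shape
    out = np.ones_like(img)
    for dy in range(3):
        for dx in range(3):
            out &= p[dy:dy + h, dx:dx + w]
    return out

def dilate(img):
    # 3x3 dilation (cv::dilate counterpart): a pixel is set if any
    # neighbor is set, restoring the original letter thickness.
    p = np.pad(img, 1)
    h, w = img.shape
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out
```

Applying `dilate(erode(img))` (a morphological "opening") to two 3x3 blobs joined by a one-pixel bridge removes the bridge while restoring the blobs, which is exactly the effect described for the two letters "N" in "CNN".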
- the sub-text boxes 22 are sorted by area, a sub-text box 22 whose area value lies in the middle range is selected, and the selected sub-text box 22 serves as the reference text box.
- the difference between the lower limit of the "middle range" and the minimum area value of the sub-text boxes 22 is equal to, or approximately equal to, the difference between the upper limit of the middle range and the maximum area value of the sub-text boxes 22.
- the "middle range" can also be a fixed intermediate value; in this case, the difference between the intermediate value and the minimum area value of the sub-text boxes 22 is equal to the difference between the intermediate value and the maximum area value of the sub-text boxes 22.
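The fixed-intermediate-value variant above can be sketched as follows. This is illustrative only; the name `pick_reference_box` and the reading that the intermediate value is the midpoint of the minimum and maximum areas are assumptions based on the description.

```python
def pick_reference_box(sub_boxes):
    """Choose the reference text box: the sub-text box whose area is
    closest to the midpoint between the minimum and maximum areas, so it
    is (approximately) equally far from both extremes.

    sub_boxes: list of (x, y, w, h) letter bounding boxes.
    """
    areas = [w * h for (_, _, w, h) in sub_boxes]
    mid = (min(areas) + max(areas)) / 2
    i = min(range(len(sub_boxes)), key=lambda k: abs(areas[k] - mid))
    return sub_boxes[i]
```

Using a middle-area box as the reference makes the later distance threshold robust against unusually narrow (e.g. "i") or wide (e.g. "W") letters.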
- S233 Calculate the threshold of the separation distance between two adjacent letters according to the width of the reference text box.
- the threshold of the separation distance between two adjacent letters is calculated according to the following formula:
- N is the separation distance threshold between two adjacent letters
- W is the ratio of the width of the reference text box (the size of the reference text box along the line direction of the text) to the width of the unit pixel in the image
- 0.6 is an empirical coefficient obtained by the inventors of the present application through many experiments; 16 is the width of 16 unit pixels set along the row direction X by the commonly used text detection networks CTPN or YOLO.
- S235 Perform word segmentation on the text to be translated according to the actual separation distance value and the separation distance threshold between every two adjacent letters.
- if the actual separation distance value is greater than the separation distance threshold, it is determined that the two adjacent letters belong to two adjacent words respectively; if the actual separation distance value is less than or equal to the separation distance threshold, it is determined that the two adjacent letters belong to the same word. In this way, the letters of the text to be translated are divided into at least one word, and word segmentation is realized.
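Steps S234 to S235 can be sketched as a simple gap comparison. The separation distance threshold N is taken here as an input parameter, since the patent's formula image is not reproduced in this text; the function name and the `(x, y, w, h)` box format are assumptions.

```python
def segment_words(letter_boxes, gap_threshold):
    """Split letters into words by comparing horizontal gaps to a threshold.

    letter_boxes: (x, y, w, h) boxes for the letters of one text row.
    A gap greater than gap_threshold starts a new word (S235).
    """
    if not letter_boxes:
        return []
    boxes = sorted(letter_boxes, key=lambda b: b[0])  # left to right
    words, current = [], [boxes[0]]
    for prev, cur in zip(boxes, boxes[1:]):
        gap = cur[0] - (prev[0] + prev[2])  # actual separation distance
        if gap > gap_threshold:
            words.append(current)  # letters belong to adjacent words
            current = [cur]
        else:
            current.append(cur)    # letters belong to the same word
    words.append(current)
    return words
```

With four letter boxes whose middle gap exceeds the threshold, two words of two letters each are produced.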
- when the translation pen 100 further includes the second processor 5 and the display screen 6, after the first processor 4 recognizes the text to be translated in the image of the text to be translated, that is, after S2, the above-mentioned control method applied to the translation pen 100 further includes the following steps:
- the second processor 5 translates the text to be translated recognized by the first processor 4 to generate a translation result, and the display screen 6 displays the translation result.
- if the second processor 5 cannot generate a translation result, the first processor 4 re-recognizes the text to be translated in the image of the text to be translated, and the re-recognized text to be translated cannot be the text to be translated that was recognized last time.
- when the image collector 3 collects images, it may be affected by external factors, such as trembling of the operator's hand, which causes only part of the text to be translated to be captured in the locked text box, so that the text in the locked text box is invalid text ("recurrent" in the locked text box in FIG. 8 is such invalid text), resulting in the translation pen 100 being unable to generate a translation result.
- in this case, the first processor 4 re-recognizes the text to be translated in the image of the text to be translated. For example, along the column direction Y of the text, excluding the row where "recurrent" is located, the row closest to the indicating component 2 is locked, that is, the row where the text boxes of "Network not" and "o" are located; and, along the row direction X of the text, according to the principle that the center line of the text box is closest to the center line of the image of the text to be translated, the "Network not" text box is locked, and "Network not" is the re-recognized text to be translated.
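The fallback described above, which discards the row that produced invalid text and re-locks from the remaining rows, can be sketched as follows (a hypothetical helper; box format, tolerance, and the exclude-whole-row reading are assumptions based on this passage):

```python
def relock_text_box(boxes, image_width, indicator_y, previous, row_tol=5):
    """Re-lock a text box after a failed translation, never returning a
    box from the row that produced the invalid text (e.g. "recurrent")."""
    # Exclude every box in the same row as the previously locked box.
    candidates = [b for b in boxes if abs(b[1] - previous[1]) > row_tol]
    if not candidates:
        return None
    # Among the remaining rows, lock as in S22: nearest row to the
    # indicating component, then nearest center line to the image center.
    closest = min(candidates, key=lambda b: abs(indicator_y - (b[1] + b[3])))
    row = [b for b in candidates if abs(b[1] - closest[1]) <= row_tol]
    mid = image_width / 2
    return min(row, key=lambda b: abs((b[0] + b[2] / 2) - mid))
```

In the example from the description, the "recurrent" row is skipped and the box corresponding to "Network not" in the next-closest row is locked instead.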
Claims (20)
- A translation pen, comprising: a pen body having a pen tip end; an indicating component disposed at the pen tip end of the pen body; an image collector disposed on the pen body, the image collector being configured to collect an image of text to be translated according to the position indicated by the indicating component, and to send the collected image of the text to be translated; and a first processor disposed inside the pen body, the first processor being electrically connected to the image collector; the first processor is configured to receive the image of the text to be translated sent by the image collector, and to recognize the text to be translated in the image of the text to be translated.
- The translation pen according to claim 1, wherein the pen body further has a pen tail end opposite to the pen tip end; the image collector is disposed on one side outside the pen body; in the direction from the pen tip end to the pen tail end of the pen body, the image collector is farther from the pen tip end than the indicating component; and the indicating component is located within the viewing angle range of the image collector.
- The translation pen according to claim 2, wherein there is a spacing between the image collector and the indicating component.
- The translation pen according to any one of claims 1 to 3, wherein the indicating component can be used to paint color on an external object.
- The translation pen according to any one of claims 1 to 4, wherein the indicating component has a tip.
- The translation pen according to any one of claims 1 to 5, wherein the first processor comprises: a detection component electrically connected to the image collector, the detection component being configured to detect the image of the text to be translated to form at least one text box, the text box containing part of the text in the image of the text to be translated; and a locking component electrically connected to the detection component, the locking component being configured to lock, among the formed at least one text box, a text box that meets a setting requirement, and to use the text in the locked text box as the text to be translated; wherein the setting requirement is that, along the column direction of the text, the row where the text box is located is the row closest to the indicating component, and, along the row direction of the text, the center line of the text box is closest to the center line of the image of the text to be translated; or, in a case where the indicating component can be used to paint color on an external object, the setting requirement is that, along the column direction of the text, the row where the text box is located is the row closest to the indicating component, and at least part of the area in the text box is painted.
- The translation pen according to claim 6, wherein the first processor further comprises: an adaptive word-segmentation component electrically connected to the locking component; the adaptive word-segmentation component is configured to calculate, according to the size of the locked text to be translated, a threshold of the separation distance between adjacent letters in the text to be translated, and to segment the locked text to be translated into words according to the threshold.
- The translation pen according to any one of claims 1 to 7, further comprising: a second processor disposed inside the pen body, the second processor being electrically connected to the first processor; the second processor is configured to receive the text to be translated recognized by the first processor, translate the text to be translated, and generate and send a translation result.
- The translation pen according to claim 8, wherein an observation window is provided on the pen body; the translation pen further comprises: a display screen disposed in the observation window of the pen body, the display screen being electrically connected to the second processor; the display screen is configured to receive the translation result sent by the second processor and to display the translation result.
- A control method applied to the translation pen according to any one of claims 1 to 9, comprising: the image collector of the translation pen collecting an image of text to be translated according to the position indicated by the indicating component of the translation pen; and the first processor of the translation pen recognizing the text to be translated in the image of the text to be translated.
- The control method according to claim 10, wherein the first processor recognizing the text to be translated in the image of the text to be translated comprises: detecting the image of the text to be translated to form at least one text box, the text box containing part of the text in the image of the text to be translated; and locking, among the formed at least one text box, a text box that meets a setting requirement, and using the text in the locked text box as the text to be translated.
- The control method according to claim 11, wherein locking, among the formed at least one text box, the text box that meets the setting requirement comprises: selecting the text row closest to the indicating component along the column direction of the text; determining the center line of each text box in the selected text row and the center line of the image of the text to be translated; and locking the text box whose center line is closest to the center line of the image of the text to be translated.
- The control method according to claim 11, wherein, in a case where the indicating component can be used to paint color on an external object, locking, among the formed at least one text box, the text box that meets the setting requirement comprises: selecting the text row closest to the indicating component along the column direction of the text; and locking a text box in which at least part of the area is painted.
- The control method according to claim 11, wherein, after using the text in the locked text box as the text to be translated, the method further comprises: performing adaptive word segmentation on the text to be translated.
- The control method according to claim 14, wherein performing adaptive word segmentation on the text to be translated comprises: obtaining a sub-text box circumscribed by each letter in the text to be translated; determining a reference text box according to the areas of the sub-text boxes; calculating a separation distance threshold between two adjacent letters according to the width of the reference text box; obtaining an actual separation distance value between every two adjacent letters; and segmenting the text to be translated into words according to the actual separation distance value between every two adjacent letters and the separation distance threshold.
- The control method according to claim 15, wherein determining the reference text box according to the areas of the sub-text boxes comprises: selecting, according to the areas of the sub-text boxes, a sub-text box whose area value is in a middle range, and using the selected sub-text box as the reference text box; wherein the difference between the lower limit value of the middle range and the minimum area value of the sub-text boxes is equal to, or approximately equal to, the difference between the upper limit value of the middle range and the maximum area value of the sub-text boxes.
- The control method according to any one of claims 15 to 17, wherein segmenting the text to be translated according to the actual separation distance value between every two adjacent letters and the separation distance threshold comprises: comparing the actual separation distance value between two adjacent letters with the separation distance threshold; if the actual separation distance value is greater than the separation distance threshold, the two adjacent letters belong to two adjacent words respectively; and if the actual separation distance value is less than or equal to the separation distance threshold, the two adjacent letters belong to the same word.
- The control method according to any one of claims 10 to 18, wherein, in a case where the translation pen further comprises a second processor and a display screen, after the first processor recognizes the text to be translated in the image of the text to be translated, the method further comprises: the second processor translating the text to be translated recognized by the first processor to generate a translation result; and the display screen displaying the translation result.
- The control method according to claim 19, wherein, if the second processor cannot generate a translation result, the first processor re-recognizes the text to be translated in the image of the text to be translated, and the re-recognized text to be translated cannot be the text to be translated that was recognized last time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/423,413 US20220076042A1 (en) | 2019-07-29 | 2020-07-28 | Translation pen and control method therefor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910690400.2 | 2019-07-29 | ||
CN201910690400.2A CN112308063B (zh) | 2019-07-29 | 2019-07-29 | Character recognition device, translation pen, image translation method and image translation device
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021018110A1 true WO2021018110A1 (zh) | 2021-02-04 |
Family
ID=74230176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/105026 WO2021018110A1 (zh) | 2019-07-29 | 2020-07-28 | Translation pen and control method therefor |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220076042A1 (zh) |
CN (1) | CN112308063B (zh) |
WO (1) | WO2021018110A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113627417A (zh) * | 2021-08-16 | 2021-11-09 | 广州番禺职业技术学院 | Text review device for foreign language translation and implementation method thereof |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11978267B2 (en) | 2022-04-22 | 2024-05-07 | Verkada Inc. | Automatic multi-plate recognition |
US11557133B1 (en) * | 2022-04-22 | 2023-01-17 | Verkada Inc. | Automatic license plate recognition |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102169541A (zh) * | 2011-04-02 | 2011-08-31 | 郝震龙 | Character recognition input system using optical positioning and method thereof |
CN107220242A (zh) * | 2017-04-19 | 2017-09-29 | 广东小天才科技有限公司 | Translation method, device and system based on a translation pen |
CN109263362A (zh) * | 2018-10-29 | 2019-01-25 | 广东小天才科技有限公司 | Smart pen and control method thereof |
US20190182402A1 (en) * | 2017-12-07 | 2019-06-13 | Nedal Shriesher | Print scanner and translator |
CN110874957A (zh) * | 2018-08-30 | 2020-03-10 | 朱笑笑 | Translation pen |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6704699B2 (en) * | 2000-09-05 | 2004-03-09 | Einat H. Nir | Language acquisition aide |
IT1390595B1 (it) * | 2008-07-10 | 2011-09-09 | Universita' Degli Studi Di Brescia | Dispositivo di ausilio nella lettura di un testo stampato |
US9251144B2 (en) * | 2011-10-19 | 2016-02-02 | Microsoft Technology Licensing, Llc | Translating language characters in media content |
US9519641B2 (en) * | 2012-09-18 | 2016-12-13 | Abbyy Development Llc | Photography recognition translation |
US9836456B2 (en) * | 2015-01-12 | 2017-12-05 | Google Llc | Techniques for providing user image capture feedback for improved machine language translation |
US20170177189A1 (en) * | 2015-12-19 | 2017-06-22 | Radean T. Anvari | Method and System for Capturing Data on Display Using Scanner Pen |
CN105718930A (zh) * | 2016-01-26 | 2016-06-29 | 北京纽思曼教育科技有限公司 | Multifunctional translation pen and translation method thereof |
CN107992867A (zh) * | 2016-10-26 | 2018-05-04 | 深圳超多维科技有限公司 | Method, device and electronic equipment for gesture-pointing translation |
US10127673B1 (en) * | 2016-12-16 | 2018-11-13 | Workday, Inc. | Word bounding box detection |
- 2019-07-29: CN application CN201910690400.2A filed; patent CN112308063B (status: Active)
- 2020-07-28: US application US17/423,413 filed; publication US20220076042A1 (status: Pending)
- 2020-07-28: WO application PCT/CN2020/105026 filed; publication WO2021018110A1 (status: Application Filing)
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113627417A (zh) * | 2021-08-16 | 2021-11-09 | 广州番禺职业技术学院 | Text review device for foreign language translation and implementation method thereof |
CN113627417B (zh) * | 2021-08-16 | 2023-11-24 | 广州番禺职业技术学院 | Text review device for foreign language translation and implementation method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN112308063B (zh) | 2022-07-29 |
CN112308063A (zh) | 2021-02-02 |
US20220076042A1 (en) | 2022-03-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021018110A1 (zh) | Translation pen and control method therefor | |
USRE47889E1 (en) | System and method for segmenting text lines in documents | |
JP5379085B2 (ja) | スキャンされた文書画像内の前景画素群の連結グループをマーキング種類に基づき分類する方法及びシステム | |
Dongre et al. | Devnagari document segmentation using histogram approach | |
US6014450A (en) | Method and apparatus for address block location | |
US8027539B2 (en) | Method and apparatus for determining an orientation of a document including Korean characters | |
CN103310211B (zh) | 一种基于图像处理的填注标记识别方法 | |
CN111695555B (zh) | 一种基于题号的精准框题方法、装置、设备和介质 | |
US11823497B2 (en) | Image processing system and an image processing method | |
CN112419260A (zh) | 一种pcb文字区域缺陷检测方法 | |
Mullick et al. | An efficient line segmentation approach for handwritten Bangla document image | |
CN115588208A (zh) | 一种基于数字图像处理技术的全线表结构识别方法 | |
CN108062548B (zh) | 一种盲文方自适应定位方法及系统 | |
Naz et al. | Challenges in baseline detection of cursive script languages | |
Munir et al. | Automatic character extraction from handwritten scanned documents to build large scale database | |
Dongre et al. | Segmentation of printed Devnagari documents | |
JP2006107534A (ja) | 文字認識方法および文字認識装置 | |
JP4492258B2 (ja) | 文字・図形の認識方法および検査方法 | |
Kleber et al. | Document reconstruction by layout analysis of snippets | |
JP3914119B2 (ja) | 文字認識方法および文字認識装置 | |
Tikader et al. | Edge based directional features for English-Bengali script recognition | |
Mandal et al. | Slant Estimation and Correction for Online Handwritten Bengali Words | |
Saroui et al. | Recognition of handwritten mathematical characters on whiteboards using colour images | |
CN106408021A (zh) | 一种基于笔画粗细的手写体与印刷体的鉴别算法 | |
CN117711004A (zh) | 一种基于图像识别的表格文档信息抽取方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20846771 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20846771 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.02.2023) |
|