WO2023005813A1 - Image direction adjustment method and apparatus, and storage medium and electronic device - Google Patents

Publication number
WO2023005813A1
WO2023005813A1 PCT/CN2022/107240 CN2022107240W WO2023005813A1 WO 2023005813 A1 WO2023005813 A1 WO 2023005813A1 CN 2022107240 W CN2022107240 W CN 2022107240W WO 2023005813 A1 WO2023005813 A1 WO 2023005813A1
Authority
WO
WIPO (PCT)
Prior art keywords
preview image
target
text line
target object
image
Prior art date
Application number
PCT/CN2022/107240
Other languages
French (fr)
Chinese (zh)
Inventor
郑侠松
赵佳鹏
Original Assignee
广州视源电子科技股份有限公司
广州视睿电子科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州视源电子科技股份有限公司 and 广州视睿电子科技有限公司
Publication of WO2023005813A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/63 Control of cameras or camera modules by using electronic viewfinders
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects

Definitions

  • the present application relates to the field of computer technology, and in particular to an image orientation adjustment method, device, storage medium and electronic equipment.
  • when a mobile terminal takes pictures of a subject, the camera is only responsible for capturing the picture, and the shooting application does not consider whether the mobile terminal has been rotated or respond accordingly. As a result, when the mobile terminal is rotated by 90°, 180°, 270° or a similar angle, the image does not follow the rotation; or, even if the camera faces the subject and the mobile terminal is not rotated, the captured image is still rotated by 90°, 180°, 270° or another angle. In either case the direction of the characters in the acquired image does not match the actual direction of the characters on the subject, which does not conform to the user's usage habits and causes inconvenience to the user.
  • Embodiments of the present application provide an image orientation adjustment method, device, storage medium, and electronic device, which can make the text orientation in the captured image the orientation required by the user, so as to conform to the user's usage habits and facilitate the user's use.
  • The technical solution is as follows:
  • an embodiment of the present application provides a method for adjusting an image direction, the method including:
  • a target direction of the preview image is determined based on each of the directions, and a display direction of the preview image is adjusted based on the target direction of the preview image.
  • an embodiment of the present application provides a method for adjusting an image direction, the method including:
  • the embodiment of the present application provides an image direction adjustment device, the device includes:
  • An image acquisition module configured to acquire a preview image collected by the camera for the subject
  • a text line acquisition module configured to identify a target object in the preview image, and acquire at least one text line in the target object
  • an orientation determination module configured to determine the orientation of each of the at least one text lines
  • the angle adjustment module is configured to determine a target direction of the preview image based on each of the directions, and adjust a display direction of the preview image based on the target direction of the preview image.
  • the embodiment of the present application provides an image orientation adjustment device, the device comprising:
  • the camera opening module is configured to receive a shooting instruction, and turn on the camera in response to the shooting instruction;
  • An image display module configured to display a preview image collected by the camera for the subject
  • a direction determination module configured to determine the direction of each text line determined in the preview image
  • the direction adjustment module is configured to determine the target direction of the preview image based on each direction, adjust the display direction of the preview image based on the target direction of the preview image, and display the adjusted preview image.
  • an embodiment of the present application provides a computer storage medium, where a plurality of instructions are stored in the computer storage medium, and the instructions are adapted to be loaded by a processor and execute the method steps of the first aspect above.
  • an embodiment of the present application provides an electronic device, which may include a processor and a memory, wherein the memory stores a computer program, and the computer program is adapted to be loaded by the processor and execute the method steps of the first aspect above.
  • the embodiment of the present application provides an image orientation adjustment program.
  • When the program is executed, operations related to the image orientation adjustment method described in the first aspect or the second aspect can be realized.
  • the mobile terminal acquires a preview image collected by the camera for the object to be photographed and, when the preview image includes an identified object, recognizes the target object in the preview image, acquires at least one text line in the target object, determines the direction of each text line in the at least one text line, determines the target direction of the preview image based on each direction, and adjusts the display direction of the preview image based on the target direction of the preview image. Even if the shooting device is rotated or the captured preview image is rotated arbitrarily, the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by that rotation angle, so that the direction of the text in the captured image matches the direction required by the user, which conforms to the user's usage habits and facilitates the user's use.
  • FIG. 1 is a schematic flow chart of an image orientation adjustment method provided in an embodiment of the present application
  • Fig. 2a is a system architecture diagram of an image direction adjustment method provided by an embodiment of the present application.
  • Fig. 2b is a system architecture diagram of an image direction adjustment method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a preview image provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of an image orientation adjustment method provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a prompt message provided by an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a target object marking provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a text line mark provided by an embodiment of the present application.
  • Fig. 8 is an exemplary schematic diagram of a first pixel distribution histogram provided by an embodiment of the present application.
  • FIG. 9 is an exemplary schematic diagram of a second pixel distribution histogram provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a method for adjusting an image direction provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a trigger shooting function provided by an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of a method for adjusting an image direction provided by an embodiment of the present application.
  • Fig. 13 is a schematic diagram of prompt information provided by the embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of an image orientation adjustment device provided in an embodiment of the present application.
  • Fig. 15 is a schematic structural diagram of a text line acquisition module provided by an embodiment of the present application.
  • Fig. 16 is a schematic structural diagram of a direction determination module provided by an embodiment of the present application.
  • Fig. 17 is a schematic structural diagram of a direction determining unit provided in an embodiment of the present application.
  • Fig. 18 is a schematic structural diagram of a direction adjustment module provided by an embodiment of the present application.
  • FIG. 19 is a schematic structural diagram of an image orientation adjustment device provided in an embodiment of the present application.
  • Fig. 20 is a schematic structural diagram of an image orientation adjustment device provided by an embodiment of the present application.
  • Fig. 21 is a schematic structural diagram of a direction adjustment module provided by an embodiment of the present application.
  • Fig. 22 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the method can be implemented relying on a computer program, and can run on an image orientation adjustment device based on the von Neumann system.
  • the computer program can be integrated in the application, or run as an independent utility application.
  • the image orientation adjustment device in the embodiment of the present application may be a mobile terminal, including but not limited to: a personal computer, tablet computer, handheld device, vehicle-mounted device, wearable device, computing device or other processing device connected to a wireless modem, etc.
  • User terminals can be called by different names in different networks, for example: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication equipment, user agent or user device, cellular phone, cordless phone, personal digital assistant (PDA), terminal equipment in a 5G network or a future evolved network, etc.
  • This method includes but is not limited to application to student learning machines.
  • This learning machine refers to a tablet computer that is generally used for students to take online classes or conduct other learning projects. It has a camera set on the top of the learning machine and has a shooting function.
  • FIG. 1 provides a schematic flowchart of a method for adjusting an image direction according to an embodiment of the present application.
  • the embodiment of the present application is described by taking the mobile terminal side as an example, and the image direction adjustment method may include the following steps:
  • the camera can be a device that communicates with the mobile terminal in a wireless or wired manner. The wireless manner includes, but is not limited to, a cellular network, a wireless local area network, an infrared network, a near field communication network or a Bluetooth network; the wired manner includes, but is not limited to, Universal Serial Bus (USB).
  • the camera may also be a part of the mobile terminal, that is, a device installed on the mobile terminal.
  • the shooting object may be any object containing text, which is not limited here.
  • the preview image may be a pre-browsing image captured by the camera based on the shooting picture, or may be a pre-browsing image obtained based on the capturing of the shooting picture by the camera installed on the mobile terminal in response to a shooting instruction of the mobile terminal.
  • the preview image can be displayed on the screen.
  • the user triggers the camera to turn on, and the camera collects images of the subject to obtain a preview image and sends the preview image to the mobile terminal.
  • the acquisition of the preview image by the mobile terminal can be based on a preset timing mechanism, for example, acquisition in the 3rd second after the camera is turned on, or acquisition at a preset interval (e.g., once every 2 seconds).
  • the preview image is a preview image that has been rotated by a certain angle or an image that does not follow the rotation of the device.
  • As shown in Figure 3, the orientation of the subject in the figure is the direction before rotation, and the preview image in the mobile terminal is a preview image rotated by a certain angle.
  • a text line refers to a specific line of text, and the direction of the text can be any direction.
  • the target object is any object containing text information, such as books, certificates, paper, and the like.
  • the recognition object is the same as the target object, and can be any object containing text, such as books, certificates, paper, etc.
  • a binary classification algorithm is used to classify the preview image.
  • algorithms that can realize binary classification include support vector machines, decision trees, neural networks, k-nearest neighbors, classification based on association rules, etc.
  • the binary classification algorithm in this application can be ResNet-18 from the ResNet (Deep Residual Network) series of networks. The ResNet series is a family of algorithms in the field of image classification; the number 18 in ResNet-18 represents the network depth, that is, 18 layers with weights, including convolutional layers and fully connected layers and excluding pooling layers and BN (Batch Normalization) layers.
  • the prompt information is used to remind the user that the preview image does not include the identified object, and the prompt information can be output in the form of audio, text or animation. Multiple prompting methods can be used in parallel. The specific method and content are not limited.
  • a feasible prompting method may be a voice prompt, and the voice outputs "the book is not recognized in the current image, please shoot again".
  • when the preview image includes one recognition object, that recognition object is used as the target object; when the preview image includes multiple recognition objects, the selected recognition object among the multiple recognition objects is used as the target object.
  • Any text line in the text line can be a text line in the vertical direction, a text line in the horizontal direction, or a text line in any other direction, and there are no restrictions on the character shape, size, language, font and other characteristics contained in the text line.
  • the Advanced EAST algorithm (EAST: An Efficient and Accurate Scene Text Detector) is used to identify lines of text on the target object.
  • Advanced EAST is an algorithm for text detection in scene images; it is essentially a model that detects the direction and the area of a target simultaneously. It is mainly based on the EAST algorithm and improves EAST's detection of long texts so that the prediction of long texts is more accurate.
  • when the target object includes one text line, that text line is obtained as the at least one text line; when the target object includes multiple text lines, acquiring at least one text line in the target object may mean obtaining all text lines in the target object as the at least one text line, or randomly obtaining a part of all text lines in the target object as the at least one text line.
  • the characters in the at least one text line are cut by using a projection method.
  • the projection method is an algorithm in the field of text segmentation, including horizontal projection method and vertical projection method.
  • the horizontal projection method can be understood as a beam of light irradiating the image from the horizontal direction, where each ray corresponds to one row of the image; the black pixels of the image on each row are counted, so that the characters in the text line can be segmented.
  • the vertical projection method can be understood as a beam of light irradiating the image from the vertical direction, where each ray corresponds to one column of the image; the black pixels of the image on each column are counted, so that the characters in the text line can be segmented.
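  • As an illustration of the two projection methods described above, the following Python sketch counts black pixels per row and per column of a small binarized image and splits a projection profile into runs separated by empty positions. The function names (`horizontal_projection`, `vertical_projection`, `split_runs`) and the convention that 0 means a black text pixel are assumptions made for this example; they are not taken from the application.

```python
from typing import List, Tuple

# Rows of pixel values. Assumption for this sketch: 0 = black (text), 255 = white.
Image = List[List[int]]


def horizontal_projection(img: Image) -> List[int]:
    """Count the black pixels on each row of the image."""
    return [sum(1 for p in row if p == 0) for row in img]


def vertical_projection(img: Image) -> List[int]:
    """Count the black pixels in each column of the image."""
    return [sum(1 for p in col if p == 0) for col in zip(*img)]


def split_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of consecutive non-zero counts."""
    runs = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a run of text pixels begins
        elif count == 0 and start is not None:
            runs.append((start, i))        # an empty position closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


# Two small "characters" separated by one blank column.
B, W = 0, 255
image = [
    [B, B, W, B],
    [B, W, W, B],
    [B, B, W, B],
]
rows = horizontal_projection(image)
cols = vertical_projection(image)
char_spans = split_runs(cols)
```

  • In this toy image the column profile has a single empty column, so `split_runs` yields two spans, one per character. A real text line would need noise handling that this sketch omits.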
  • the orientation of each of the at least one text line is determined based on the result of character cutting.
  • S104 Determine a target direction of the preview image based on each of the directions, and adjust a display direction of the preview image based on the target direction of the preview image.
  • the target orientation of the preview image is a preset orientation conforming to user habits or an image orientation set by the user.
  • the quantity ratio of the text lines in each direction is calculated, the direction indicated by the highest quantity ratio is obtained, the display direction of the preview image is adjusted based on the direction indicated by the highest quantity ratio, and the preview image after display adjustment is finally obtained.
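  • The selection of the direction with the highest quantity ratio can be sketched as a simple majority vote over the per-line directions. The function name `target_direction` and the string labels used for directions are hypothetical; the application does not specify how directions are encoded.

```python
from collections import Counter
from typing import Iterable


def target_direction(line_directions: Iterable[str]) -> str:
    """Return the direction shared by the largest number of text lines."""
    counts = Counter(line_directions)
    if not counts:
        raise ValueError("at least one text line direction is required")
    # most_common(1) gives the (direction, count) pair with the highest count.
    # Equal counts keep first-encountered order, an arbitrary tie-break here.
    direction, _ = counts.most_common(1)[0]
    return direction


directions = ["horizontal", "horizontal", "vertical", "horizontal"]
result = target_direction(directions)
```

  • Because every text line contributes one vote, the direction with the highest count is also the direction with the highest quantity ratio.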
  • the mobile terminal acquires a preview image collected by the camera for the object to be photographed and, when the preview image includes the identified object, recognizes the target object in the preview image, acquires at least one text line in the target object, determines the direction of each text line in the at least one text line, determines the target direction of the preview image based on each of the directions, and adjusts the display direction of the preview image based on the target direction of the preview image. Even if the shooting device is rotated or the captured preview image is rotated arbitrarily, the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by that rotation angle, so that the direction of the text in the captured preview image matches the direction required by the user, which conforms to the user's usage habits and facilitates the user's use.
  • FIG. 4 provides a schematic flowchart of a method for adjusting an image direction according to an embodiment of the present application.
  • the embodiment of the present application is described by taking the mobile terminal side as an example, and the image direction adjustment method may include the following steps:
  • the preview image collected by the camera on the mobile terminal for the subject is shown in Figure 3. The preview image has been rotated and does not match the actual direction of the user, which is inconvenient for the user.
  • a prompt message is output.
  • the prompt message is used to remind the user that the preview image contains multiple recognized objects and that one of them needs to be selected as the target object. The prompt information can be output in the form of audio, text or animation. Multiple prompting methods can be used in parallel. The specific method and content are not limited.
  • a feasible prompting method may be a voice prompt, in which the voice outputs "the current image includes multiple recognized objects, please select one of the recognized objects for image direction adjustment".
  • a feasible prompting method can be a text box prompt: as shown in Figure 5, a text box pops up, and the content of the text box is "The current image includes multiple recognized objects, please select one of the recognized objects for image orientation adjustment!".
  • the user's selection signal for the target object is received, and the target object is determined.
  • the Advanced East algorithm is used to identify all text lines in the target object.
  • the target object includes one text line or multiple text lines. When the target object includes one text line, that text line is obtained as the at least one text line. When the target object includes multiple text lines, all text lines in the target object may be obtained as the at least one text line, or any number or any proportion of the text lines in the target object may be obtained.
  • for example, a book in the figure is the target object. When there are 6 text lines in the target object, several (for example, 3) text lines are randomly obtained as the at least one text line; alternatively, a certain proportion (for example, 50%) of the 6 text lines is obtained, that is, 3 of the text lines are obtained as the at least one text line.
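  • The two selection strategies in this example, a fixed number of lines or a proportion of lines, can be sketched as follows. The helper `select_text_lines` and its parameters `count`, `ratio` and `rng` are hypothetical names introduced for illustration; rounding the proportion up and always keeping at least one line are also assumptions.

```python
import math
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_text_lines(
    lines: Sequence[T],
    count: Optional[int] = None,
    ratio: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Pick all lines, a fixed number of lines, or a proportion of lines at random."""
    if rng is None:
        rng = random.Random()
    if not lines:
        return []
    if count is None and ratio is None:
        return list(lines)                       # use every text line
    if count is None:
        count = math.ceil(len(lines) * ratio)    # proportion -> number of lines
    count = max(1, min(count, len(lines)))       # keep between 1 and len(lines)
    return rng.sample(list(lines), count)


lines = ["line1", "line2", "line3", "line4", "line5", "line6"]
by_count = select_text_lines(lines, count=3, rng=random.Random(0))
by_ratio = select_text_lines(lines, ratio=0.5, rng=random.Random(0))
```

  • With 6 text lines, `count=3` and `ratio=0.5` both select 3 lines, matching the example in the text. Passing a seeded `random.Random` only makes the sketch reproducible.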
  • the binarized image is processed by the horizontal projection method to obtain the first pixel distribution histogram
  • the binarized image is processed by a vertical projection method to obtain a second pixel distribution histogram
  • binarization of the image sets the gray value of each pixel of the image to 0 (lowest brightness) or 255 (highest brightness), so that the entire image is presented with a purely black-and-white visual effect. That is, a grayscale image with 256 brightness levels is converted, by selecting an appropriate threshold, into a binary image that can still reflect the overall and local features of the image.
  • the binarized image can also be processed further. After binarization, the content of the image depends only on the points with a pixel value of 0 or 255 and their positions, which simplifies subsequent image processing.
  • Image binarization processing methods include:
  • the selected threshold should not only preserve the image information as much as possible, but also minimize the interference of background and noise;
  • when the gray value of a pixel in the preview image is equal to or higher than the threshold, the gray value is set to 255;
  • when the gray value of a pixel in the preview image is lower than the threshold, the gray value is set to 0;
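  • The thresholding rule listed above can be written in a few lines of Python. The function name `binarize` and the default threshold of 128 are assumptions for illustration; the application does not state how the threshold is chosen.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map each gray value to 255 if it is >= threshold, otherwise to 0."""
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


gray = [
    [12, 200, 128],
    [127, 255, 0],
]
binary = binarize(gray, threshold=128)
```

  • Note that a pixel exactly equal to the threshold (128 here) maps to 255, as the first rule in the list requires.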
  • the binarized image is processed by the horizontal projection method to obtain the first pixel distribution histogram, as shown in Figure 8.
  • the binarized image is processed by the vertical projection method to obtain the second pixel distribution histogram, as shown in Figure 9.
  • the first pixel distribution histogram and the second pixel distribution histogram, obtained by binarizing the preview image and then processing the binarized image with the horizontal projection method and the vertical projection method respectively, are unique and stable.
  • the characters in the target text line are segmented to determine the direction of each character.
  • the target direction refers to the direction indicated by the highest number ratio among the number ratios.
  • the direction of the target text line is determined by calculating the quantity ratio of the characters in each direction and obtaining the direction indicated by the highest quantity ratio. For example, there are 10 characters in the target text line, and the quantity ratio of the 10 characters in each direction is calculated. If the number of characters in direction A : the number of characters in direction B : the number of characters in direction C is 5:4:1, then the direction indicated by the highest quantity ratio, that is, direction A, is taken as the target direction and used as the direction of the currently traversed target text line.
  • alternatively, when there are 10 characters in the target text line, any number (for example, 6) or a certain percentage (for example, 60%) of the characters in the target text line is randomly obtained, and the quantity ratio in each direction is calculated for those 6 characters. If the number of characters in direction A : the number of characters in direction B : the number of characters in direction C among the 6 characters is 3:2:1, then the direction indicated by the highest quantity ratio is direction A; direction A is taken as the target direction and used as the direction of the currently traversed target text line.
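  • The 5:4:1 example can be reproduced with a short Python sketch that computes the quantity ratio of each character direction and picks the largest. The function names `direction_ratios` and `line_direction` and the labels "A", "B" and "C" are illustrative assumptions; how the direction of a single character is detected is outside the scope of this sketch.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, Sequence


def direction_ratios(char_directions: Sequence[str]) -> Dict[str, Fraction]:
    """Return the share of characters found in each direction."""
    counts = Counter(char_directions)
    total = sum(counts.values())
    return {d: Fraction(n, total) for d, n in counts.items()}


def line_direction(char_directions: Sequence[str]) -> str:
    """Return the direction with the highest quantity ratio."""
    ratios = direction_ratios(char_directions)
    if not ratios:
        raise ValueError("the text line contains no characters")
    return max(ratios, key=ratios.get)


# 10 characters: 5 in direction A, 4 in direction B, 1 in direction C.
chars = ["A"] * 5 + ["B"] * 4 + ["C"]
ratios = direction_ratios(chars)
line_dir = line_direction(chars)

# A sampled subset of 6 characters with a 3:2:1 split gives the same answer.
subset_dir = line_direction(["A"] * 3 + ["B"] * 2 + ["C"])
```

  • Exact fractions are used only to keep the ratios readable; comparing raw counts would select the same direction.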
  • Each text line refers to each text line in at least one text line.
  • the direction of each text line in the at least one text line is determined, and the quantity ratio of the text lines in each direction is calculated. For example, as shown in Figure 7, there are 4 text lines in the at least one text line, and the calculated ratio is 3 text lines in the horizontal direction to 1 text line in the vertical direction.
  • the direction indicated by the highest quantity ratio is obtained; for example, for the at least one text line shown in Figure 7, that direction is the horizontal direction. The display direction of the preview image is therefore adjusted based on the horizontal direction: if the rotation angle is not 0, the display direction of the preview image is rotated by the rotation angle to obtain the preview image after display adjustment.
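  • Rotating the displayed image by the rotation angle can be illustrated on a small pixel grid. This is a minimal sketch that assumes the rotation angle is a multiple of 90° and is measured clockwise; the function name `rotate_clockwise` is hypothetical, and a real terminal would normally use its platform's image or display API instead.

```python
from typing import List

Grid = List[List[int]]


def rotate_clockwise(img: Grid, angle: int) -> Grid:
    """Rotate a pixel grid clockwise by a multiple of 90 degrees."""
    if angle % 90 != 0:
        raise ValueError("this sketch only supports multiples of 90 degrees")
    turns = (angle // 90) % 4
    out = [list(row) for row in img]
    for _ in range(turns):
        # One 90-degree clockwise turn: reverse the row order, then transpose.
        out = [list(col) for col in zip(*out[::-1])]
    return out


image = [
    [1, 2, 3],
    [4, 5, 6],
]
rot0 = rotate_clockwise(image, 0)
rot90 = rotate_clockwise(image, 90)
rot180 = rotate_clockwise(image, 180)
rot270 = rotate_clockwise(image, 270)
```

  • A rotation angle of 0 returns a copy of the grid unchanged, which corresponds to the case where no adjustment is needed.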
  • a binary classification algorithm is used to identify the preview image. If the preview image includes multiple identification objects, a prompt message is output to prompt the user to select one of the identification objects as the target object. After the Advanced EAST algorithm is applied, all text lines in the target object are identified.
  • when the target object includes multiple text lines, any number of text lines or any proportion of text lines in the target object can be obtained as the at least one text line; compared with determining the direction from all text lines, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character direction.
  • FIG. 10 provides a schematic flowchart of a method for adjusting an image direction according to an embodiment of the present application.
  • the embodiment of the present application is described on the mobile terminal side, and the image direction adjustment method may include the following steps:
  • the camera can be a device that communicates with the mobile terminal wirelessly or wiredly, and as shown in Figure 2b, the camera can also be a part of the mobile terminal, that is, a device installed on the mobile terminal.
  • taking the camera as a device installed on the mobile terminal as an example, as shown in Figure 11, the user triggers the mobile terminal to start the shooting function of the camera (application-camera), that is, sends a shooting instruction. When the mobile terminal receives the shooting instruction, it turns on the shooting function of the camera in response to the shooting instruction, and the camera shoots the object to be shot.
  • the preview image refers to the pre-browsing image captured by the camera from the shooting picture in response to the start-shooting command sent by the mobile terminal. It can be understood that, as shown in Figure 3, before the shooting angle is adjusted, the preview image obtained by the mobile terminal is randomly rotated.
  • the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by this rotation angle, so that the text direction in the captured image matches the direction required by the user, which conforms to the user's usage habits and facilitates the user's use.
  • FIG. 12 provides a schematic flowchart of a method for adjusting an image direction according to an embodiment of the present application.
  • the embodiment of the present application is described on the mobile terminal side, and the image direction adjustment method may include the following steps:
  • the prompt information is used to ask the user whether to enable direction detection for the target object.
  • the prompt information can be output in the form of audio, text or animation. Multiple prompting methods can be used in parallel. The specific method and content are not limited.
  • a feasible prompting method can be a text box prompt, and a text box pops up, and the content of the text box is "whether to enable direction detection for books?".
  • the mobile terminal receives the confirmation instruction for the prompt information, and responds to the confirmation instruction to enable the direction detection function of the target object.
  • when the preview image includes multiple recognized objects, prompt information containing the multiple recognized objects is displayed;
  • when the preview image includes one recognition object, that recognition object is used as the target object; when the preview image includes multiple recognition objects, the selected recognition object among the multiple recognition objects is used as the target object.
  • prompt information including the multiple recognized objects is displayed. Refer to S102 and S202 for the manner of displaying the prompt information, which will not be repeated here.
  • the target object is determined in response to the selection instruction.
  • For the identification of the target object, reference may be made to S102 and S202, which will not be repeated here.
  • the process of determining the direction indicated by the highest number ratio among the number ratios can refer to S104 and S208, and will not be repeated here.
  • the preview image is rotated by the rotation angle to display the adjusted preview image.
  • a direction is preset, which is the direction required by the user. If the direction indicated by the highest quantity ratio is inconsistent with the preset direction, that is, the rotation angle is not 0, the preview image is rotated by this rotation angle so that the direction indicated by the highest quantity ratio becomes consistent with the preset direction, and the adjusted preview image is finally obtained and displayed.
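  • One way to derive the rotation angle from the detected direction and the preset direction is sketched below. It assumes that both directions are encoded as clockwise angles in degrees (0, 90, 180 or 270) relative to upright text; this encoding and the function name `rotation_angle` are assumptions for illustration and are not specified by the application.

```python
def rotation_angle(detected_deg: int, preset_deg: int = 0) -> int:
    """Clockwise angle, in degrees, that brings the detected direction onto the preset one."""
    return (preset_deg - detected_deg) % 360


no_change = rotation_angle(0)              # detected direction already matches the preset
quarter_turn_back = rotation_angle(90)     # text appears turned 90 degrees clockwise
upside_down = rotation_angle(180)
custom_preset = rotation_angle(270, preset_deg=180)
```

  • A result of 0 means the detected direction already matches the preset direction and the preview image is displayed as is; any other result is the angle by which the preview image would be rotated.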
  • when the preview image includes multiple recognized objects, prompt information containing the multiple recognized objects is displayed to prompt the user to select any one of the recognized objects as the target object. All text lines in the target object are then recognized.
  • when the target object includes multiple text lines, any number of text lines or any proportion of text lines in the target object can be obtained as the at least one text line. Compared with obtaining all text lines for direction determination, this reduces the calculation workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character direction.
  • FIG. 14 shows a schematic structural diagram of an image orientation adjustment device provided by an exemplary embodiment of the present application.
  • the device for adjusting the image direction can be implemented as all or a part of the terminal through software, hardware or a combination of the two.
  • the device 1 includes an image acquisition module 11 , a text line acquisition module 12 , a direction determination module 13 and an angle adjustment module 14 .
  • the image acquisition module 11 is configured to acquire a preview image collected by the camera for the subject
  • the text line acquisition module 12 is configured to identify the target object in the preview image, and acquire at least one text line in the target object;
  • a direction determining module 13 configured to determine the direction of each text line in the at least one text line
  • the direction adjusting module 14 is configured to determine a target direction of the preview image based on each of the directions, and adjust a display direction of the preview image based on the target direction of the preview image.
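The cooperation of the four modules listed above can be sketched as a simple pipeline. This is an assumption-laden illustration: each module is modelled as a plain callable, and the class name `OrientationAdjuster` and its field names are hypothetical rather than taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class OrientationAdjuster:
    """Hypothetical wiring of modules 11 to 14 as pluggable callables."""
    acquire_image: Callable[[], object]                     # module 11
    acquire_text_lines: Callable[[object], Sequence]        # module 12
    determine_direction: Callable[[object], int]            # module 13
    adjust_display: Callable[[object, List[int]], object]   # module 14

    def run(self):
        preview = self.acquire_image()
        lines = self.acquire_text_lines(preview)
        # One direction per acquired text line.
        directions = [self.determine_direction(line) for line in lines]
        return self.adjust_display(preview, directions)
```

Keeping each module behind a callable mirrors the statement that the device may be implemented in software, hardware or a combination of the two.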
  • the text line acquisition module 12 includes:
  • the target object recognition unit 121 is configured to use a binary classification algorithm to recognize the target object in the preview image
  • a text line identification unit 122 configured to identify all text lines in the target object
  • the text line selection unit 123 is configured to select at least one text line from all the text lines.
  • the direction determining module 13 includes:
  • a text line traversing unit 131 configured to traverse each text line in the at least one text line
  • the direction determining unit 132 is configured to perform character cutting processing on the currently traversed target text line, and determine the direction of the target text line;
  • the traversal end unit 133 is configured to wait until the traversal ends.
  • the direction determining unit 132 includes:
  • the image acquisition subunit 1321 is configured to perform binarization processing on the currently traversed target text line to obtain a binarized image
  • the first histogram acquisition subunit 1322 is configured to process the binarized image using a horizontal projection method to obtain a first pixel distribution histogram
  • the second histogram acquisition subunit 1323 is configured to process the binarized image using a vertical projection method to obtain a second pixel distribution histogram
  • the character cutting subunit 1324 is configured to cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determine the direction of the target text line.
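The binarization, projection and cutting steps performed by these subunits can be sketched as follows. The sketch makes several assumptions that the text does not confirm: the text line is a grayscale grid with dark text on a light background, a fixed threshold of 128 is used, horizontal projection means per-row counts and vertical projection means per-column counts, and all function names are hypothetical. It stops at character cutting and does not attempt the direction decision itself.

```python
def binarize(gray, threshold=128):
    """Map dark pixels (assumed text) to 1 and light pixels to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def horizontal_projection(binary):
    """First histogram: number of foreground pixels in each row."""
    return [sum(row) for row in binary]


def vertical_projection(binary):
    """Second histogram: number of foreground pixels in each column."""
    return [sum(col) for col in zip(*binary)]


def runs(histogram):
    """(start, end) pairs of consecutive non-zero bins, end exclusive."""
    spans, start = [], None
    for i, value in enumerate(histogram):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(histogram)))
    return spans


def cut_characters(binary):
    """Return (top, bottom, left, right) boxes, one per column run."""
    rows = runs(horizontal_projection(binary))
    if not rows:
        return []
    top, bottom = rows[0][0], rows[-1][1]
    return [(top, bottom, left, right)
            for left, right in runs(vertical_projection(binary))]
```

Empty bins in the column histogram mark the gaps between characters, while the row histogram bounds the line vertically; using both is what the text refers to as combining the two projection methods.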
  • the character cutting subunit 1324 is specifically configured as:
  • the target direction of the character is used as the direction of the target text line.
  • the character cutting subunit 1324 is specifically configured as:
  • the characters in the target text line are cut based on the first pixel distribution histogram and the second pixel distribution histogram, the direction of any one character in the target text line is determined, and the direction of that character is used as the direction of the target text line.
  • the direction adjustment module 14 includes:
  • the ratio calculation unit 141 is configured to calculate the quantity ratio of each of the text lines in each direction;
  • the direction adjusting unit 142 is configured to acquire the direction indicated by the highest number ratio among the number ratios, and adjust the display direction of the preview image based on the direction indicated by the highest number ratio.
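The quantity-ratio calculation can be sketched as a majority vote over the per-line directions. This is a minimal sketch under assumptions: directions are represented as integers, ties are broken in favour of the direction seen first (the text does not say how ties are handled), and `target_direction` is a hypothetical name.

```python
from collections import Counter


def target_direction(line_directions):
    """Return the direction with the highest quantity ratio and all ratios."""
    if not line_directions:
        raise ValueError("at least one text line direction is required")
    counts = Counter(line_directions)
    total = len(line_directions)
    ratios = {direction: n / total for direction, n in counts.items()}
    # max() keeps the first direction encountered when ratios are equal.
    best = max(ratios, key=ratios.get)
    return best, ratios
```

For example, five text lines with directions 0, 90, 90, 180 and 90 give direction 90 a ratio of 3/5, so 90 becomes the direction used to adjust the preview image.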
  • the text line acquisition module 12 is specifically configured as:
  • the selected recognized object among the multiple recognized objects is used as the target object.
  • a binary classification algorithm is used to identify the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of the recognized objects as the target object. After the Advanced East algorithm is applied, all text lines in the target object are recognized.
  • when the target object includes multiple text lines, any number or any proportion of the text lines in the target object can be obtained as the at least one text line; compared with determining the direction from all text lines, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character direction.
  • the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by this rotation angle, so that the direction of the text in the captured image matches the direction required by the user, which conforms to the user's usage habits and is convenient for the user.
  • FIG. 19 shows a schematic structural diagram of an image orientation adjustment device provided by an exemplary embodiment of the present application.
  • the device 2 includes a camera opening module 21 , an image display module 22 , a direction determining module 23 and a direction adjusting module 24 .
  • the camera opening module 21 is configured to receive a shooting instruction, and open the camera in response to the shooting instruction;
  • the image display module 22 is configured to display the preview image collected by the camera for the subject
  • a direction determining module 23 configured to determine the direction of each text line determined in the preview image
  • the direction adjustment module 24 is configured to determine the target direction of the preview image based on each of the directions, adjust the display direction of the preview image based on the target direction of the preview image, and display the adjusted preview image.
  • the device 2 further includes:
  • the detection prompt display module 25 is configured to display prompt information asking whether to enable direction detection for the target object;
  • the detection enabling module 26 is configured to receive a confirmation instruction for the prompt information, and respond to the confirmation instruction to enable the function of detecting the direction of the target object.
  • the device 2 further includes:
  • the information prompt display module 27 is configured to display prompt information including the multiple recognized objects when the preview image includes multiple recognized objects;
  • the target object determination module 28 is configured to receive a selection instruction for the target object in the prompt information, and determine the target object in response to the selection instruction.
  • the direction adjustment module 24 includes:
  • the ratio calculation unit 241 is configured to calculate the quantity ratios of each of the text lines in each direction, and determine the direction indicated by the highest quantity ratio among the quantity ratios;
  • an angle output unit 242 configured to calculate and output a rotation angle based on the direction indicated by the highest number ratio
  • the direction adjusting unit 243 is configured to adjust the display direction of the preview image based on the rotation angle, and display the adjusted preview image.
  • the direction adjusting unit 243 is specifically configured as:
  • the display direction of the preview image is rotated by the rotation angle to obtain the adjusted display direction of the preview image.
  • a binary classification algorithm is used to identify the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of the recognized objects as the target object. After the Advanced East algorithm is applied, all text lines in the target object are recognized.
  • when the target object includes multiple text lines, any number or any proportion of the text lines in the target object can be obtained as the at least one text line; compared with determining the direction from all text lines, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character direction.
  • the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by this rotation angle, so that the direction of the text in the captured image matches the direction required by the user, which conforms to the user's usage habits and is convenient for the user.
  • when the image orientation adjustment device provided in the above embodiments executes the image orientation adjustment method, the division of the above functional modules is used only as an example for illustration.
  • in practical applications, the above functions can be assigned to different functional modules as required; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the image orientation adjustment device and the image orientation adjustment method embodiment provided by the above embodiment belong to the same idea, and the implementation process thereof is detailed in the method embodiment, and will not be repeated here.
  • the embodiment of the present application also provides a computer storage medium. The computer storage medium can store a plurality of instructions, and the instructions are suitable for being loaded by a processor to execute the method steps of the embodiments shown in FIGS. 1-13. For the specific execution process, refer to the description of the embodiments shown in FIGS. 1-13; details are not repeated here.
  • the present application also provides a computer program product. The computer program product stores at least one instruction, and the at least one instruction is loaded by the processor to execute the method steps of the embodiments shown in FIGS. 1-13. For the specific execution process, refer to the description of the embodiments shown in FIGS. 1-13; details are not repeated here.
  • the mobile terminal 1000 may include: at least one processor 1001 , at least one network interface 1004 , a user interface 1003 , a memory 1005 , and at least one communication bus 1002 .
  • the communication bus 1002 is used to realize connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the processor 1001 may include one or more processing cores.
  • the processor 1001 uses various interfaces and lines to connect the parts of the entire electronic device 1000, and executes the various functions of the electronic device 1000 and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 1005 and by calling the data stored in the memory 1005.
  • the processor 1001 may be implemented in at least one hardware form among digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA).
  • the processor 1001 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU) and a modem.
  • the CPU mainly handles the operating system, user interface and application programs, etc.
  • the GPU is used to render and draw the content that needs to be displayed on the display screen
  • the modem is used to handle wireless communication. It can be understood that the modem may also not be integrated into the processor 1001 and may instead be implemented by a separate chip.
  • the memory 1005 may include a random access memory (RAM), and may also include a read-only memory (ROM).
  • optionally, the memory 1005 includes a non-transitory computer-readable storage medium. The memory 1005 may be used to store instructions, programs, codes, code sets or instruction sets.
  • the memory 1005 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), instructions for implementing the above method embodiments, and the like; the data storage area may store the data involved in the above method embodiments.
  • the memory 1005 may also be at least one storage device located away from the aforementioned processor 1001 .
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and an image orientation adjustment application program.
  • the user interface 1003 is mainly used to provide the user with an input interface and to obtain the data input by the user; the processor 1001 can be used to call the image orientation adjustment application program stored in the memory 1005 and specifically perform the following operations:
  • a target direction of the preview image is determined based on each of the directions, and a display direction of the preview image is adjusted based on the target direction of the preview image.
  • when the processor 1001 recognizes the target object in the preview image and acquires at least one text line in the target object, it specifically performs the following operations:
  • At least one text line is selected from all the text lines.
  • when the processor 1001 determines the direction of each text line in the at least one text line, it specifically performs the following operations:
  • when the processor 1001 performs character cutting processing on the currently traversed target text line and determines the direction of the target text line, it specifically performs the following operations:
  • the characters in the target text line are cut to determine the direction of the target text line.
  • when the processor 1001 cuts the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram to determine the direction of the target text line, it specifically performs the following operations:
  • the target direction of the character is used as the direction of the target text line.
  • when all characters in the target text line have the same direction and the processor 1001 cuts the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram to determine the direction of the target text line, it specifically performs the following operations:
  • the characters in the target text line are cut based on the first pixel distribution histogram and the second pixel distribution histogram, the direction of any one character in the target text line is determined, and the direction of that character is used as the direction of the target text line.
  • when the processor 1001 determines the target direction of the preview image based on each of the directions, adjusts the display direction of the preview image based on the target direction of the preview image, and displays the adjusted preview image, it specifically performs the following operations:
  • the processor 1001 when the processor 1001 recognizes the target object in the preview image, the following operations are specifically performed:
  • the selected recognized object among the multiple recognized objects is used as the target object.
  • processor 1001 also performs the following operations:
  • the processor 1001 also performs the following operations:
  • a confirmation instruction for the prompt information is received, and a function of detecting the direction of the target object is turned on in response to the confirmation instruction.
  • when the processor 1001 displays the preview image captured by the camera for the subject, it further performs the following operations:
  • when the preview image includes multiple recognized objects, prompt information containing the multiple recognized objects is displayed;
  • a selection instruction for the target object in the prompt information is received, and marking information for the target object is displayed in response to the selection instruction.
  • when the processor 1001 determines the target direction of the preview image based on each of the directions, adjusts the display direction of the preview image based on the target direction of the preview image, and displays the adjusted preview image, it specifically performs the following operations:
  • the processor 1001 when the processor 1001 adjusts the display direction of the preview image based on the rotation angle, it specifically performs the following operations:
  • the display direction of the preview image is rotated by the rotation angle.
  • a binary classification algorithm is used to identify the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of the recognized objects as the target object. After the Advanced East algorithm is applied, all text lines in the target object are recognized.
  • when the target object includes multiple text lines, any number or any proportion of the text lines in the target object can be obtained as the at least one text line; compared with determining the direction from all text lines, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character direction.
  • the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by this rotation angle, so that the direction of the text in the captured image matches the direction required by the user, which conforms to the user's usage habits and is convenient for the user.
  • the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware, and the program can be stored in a computer-readable storage medium. When executed, the program may include the processes of the above method embodiments.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory or a random access memory, and the like.

Abstract

Provided in the present application are an image direction adjustment method and apparatus, and a storage medium and an electronic device. The method comprises: acquiring a preview image, which is collected by a camera for a photographed object; recognizing a target object in the preview image, and acquiring at least one text row in the target object; determining the direction of each text row in the at least one text row; and determining a target direction of the preview image on the basis of the directions, and adjusting a display direction of the preview image on the basis of the target direction of the preview image. By means of the present disclosure, the direction of a photographed preview image can conform to the direction required by a user, so as to meet the usage habits of the user, thereby facilitating the usage of the user.

Description

Image orientation adjustment method, apparatus, storage medium and electronic device
This application claims priority to Chinese patent application No. 202110879167.X, entitled "Image orientation adjustment method, apparatus, storage medium and electronic device", filed with the China Patent Office on July 30, 2021, the entire content of which is incorporated herein by reference.
Technical field
The present application relates to the field of computer technology, and in particular to an image orientation adjustment method, apparatus, storage medium and electronic device.
Background
When a mobile terminal captures an image of a subject, the camera is only responsible for capturing, and the shooting application does not consider whether the mobile terminal has been rotated and respond accordingly. As a result, the mobile terminal may be rotated by an angle such as 90°, 180° or 270° after shooting while the image does not rotate with it; or, even if the camera directly faces the subject and the mobile terminal is not rotated after shooting, the captured image may still be rotated by an angle such as 90°, 180° or 270°. In either case the direction of the text in the acquired image does not match the actual direction of the text on the subject, which does not conform to the user's usage habits and is inconvenient for the user.
Summary of the invention
Embodiments of the present application provide an image orientation adjustment method, apparatus, storage medium and electronic device, which can make the text direction in the captured image the direction required by the user, so as to conform to the user's usage habits and be convenient for the user. The technical solution is as follows:
In a first aspect, an embodiment of the present application provides an image orientation adjustment method, the method including:
acquiring a preview image collected by a camera for a subject;
recognizing a target object in the preview image, and acquiring at least one text line in the target object;
determining the direction of each text line in the at least one text line;
determining a target direction of the preview image based on each of the directions, and adjusting a display direction of the preview image based on the target direction of the preview image.
In a second aspect, an embodiment of the present application provides an image orientation adjustment method, the method including:
receiving a shooting instruction, and turning on a camera in response to the shooting instruction;
displaying a preview image collected by the camera for a subject;
determining the direction of each text line determined in the preview image;
determining a target direction of the preview image based on each of the directions, adjusting a display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image.
In a third aspect, an embodiment of the present application provides an image orientation adjustment device, the device including:
an image acquisition module, configured to acquire a preview image collected by a camera for a subject;
a text line acquisition module, configured to recognize a target object in the preview image and acquire at least one text line in the target object;
a direction determination module, configured to determine the direction of each text line in the at least one text line;
an angle adjustment module, configured to determine a target direction of the preview image based on each of the directions, and adjust a display direction of the preview image based on the target direction of the preview image.
In a fourth aspect, an embodiment of the present application provides an image orientation adjustment device, the device including:
a camera opening module, configured to receive a shooting instruction and turn on a camera in response to the shooting instruction;
an image display module, configured to display a preview image collected by the camera for a subject;
a direction determination module, configured to determine the direction of each text line determined in the preview image;
a direction adjustment module, configured to determine a target direction of the preview image based on each of the directions, adjust a display direction of the preview image based on the target direction of the preview image, and display the adjusted preview image.
In a fifth aspect, an embodiment of the present application provides a computer storage medium, the computer storage medium storing a plurality of instructions, the instructions being adapted to be loaded by a processor to execute the method steps of the first aspect above.
In a sixth aspect, an embodiment of the present application provides an electronic device, which may include a processor and a memory, wherein the memory stores a computer program, and the computer program is adapted to be loaded by the processor to execute the method steps of the first aspect above.
In a seventh aspect, an embodiment of the present application provides an image orientation adjustment program which, when executed, can implement operations related to the image orientation adjustment method described in the first aspect or the second aspect.
The beneficial effects brought by the technical solutions provided by some embodiments of the present application include at least the following:
In the embodiments of the present application, the mobile terminal acquires a preview image collected by the camera for the subject; when the preview image includes a recognized object, it recognizes the target object in the preview image, acquires at least one text line in the target object, determines the direction of each text line in the at least one text line, determines the target direction of the preview image based on each of the directions, and adjusts the display direction of the preview image based on the target direction of the preview image. Even if the shooting device is rotated or the captured preview image is randomly rotated, the display direction of the preview image is rotated according to the target direction of the preview image. Specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by this rotation angle, so that the direction of the text in the captured image matches the direction required by the user, which conforms to the user's usage habits and is convenient for the user.
Description of drawings
In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 2a is a system architecture diagram of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 2b is a system architecture diagram of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a preview image provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of prompt information provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of target object marking provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of text line marking provided by an embodiment of the present application;
FIG. 8 is an example schematic diagram of a first pixel distribution histogram provided by an embodiment of the present application;
FIG. 9 is an example schematic diagram of a second pixel distribution histogram provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of triggering the shooting function provided by an embodiment of the present application;
FIG. 12 is a schematic flowchart of an image orientation adjustment method provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of prompt information provided by an embodiment of the present application;
FIG. 14 is a schematic structural diagram of an image orientation adjustment device provided by an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a text line acquisition module provided by an embodiment of the present application;
FIG. 16 is a schematic structural diagram of a direction determination module provided by an embodiment of the present application;
FIG. 17 is a schematic structural diagram of a direction determining unit provided by an embodiment of the present application;
FIG. 18 is a schematic structural diagram of a direction adjustment module provided by an embodiment of the present application;
FIG. 19 is a schematic structural diagram of an image orientation adjustment device provided by an embodiment of the present application;
FIG. 20 is a schematic structural diagram of an image orientation adjustment device provided by an embodiment of the present application;
FIG. 21 is a schematic structural diagram of a direction adjustment module provided by an embodiment of the present application;
FIG. 22 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Where the following description refers to the accompanying drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application, as detailed in the appended claims.
In the description of the present application, it should be understood that the terms "first", "second", and the like are used for descriptive purposes only and should not be understood as indicating or implying relative importance. Those of ordinary skill in the art can understand the specific meanings of these terms in the present application according to the specific situation. In addition, in the description of the present application, unless otherwise specified, "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
The present application is described in detail below with reference to specific embodiments.
The method can be implemented by a computer program and can run on an image direction adjustment apparatus based on the von Neumann architecture. The computer program can be integrated into an application or run as an independent utility application. The image direction adjustment apparatus in the embodiments of the present application may be a mobile terminal, including but not limited to a personal computer, a tablet computer, a handheld device, a vehicle-mounted device, a wearable device, a computing device, or another processing device connected to a wireless modem. A user terminal may be called by different names in different networks, for example: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent or user apparatus, cellular phone, cordless phone, personal digital assistant (PDA), or terminal device in a 5G network or a future evolved network.
Applications of the method include, but are not limited to, student learning machines. A learning machine here is a tablet computer generally used by students for online classes or other learning activities; it has a camera mounted at its top and supports shooting.
Referring to FIG. 1, a schematic flowchart of an image direction adjustment method is provided for an embodiment of the present application. The embodiment is described from the mobile terminal side, and the image direction adjustment method may include the following steps:
S101, acquiring a preview image captured by a camera for a shooting subject;
Specifically, as shown in FIG. 2a, the camera may be a device that communicates with the mobile terminal wirelessly or by wire. Wireless communication between the mobile terminal and the camera unit includes, but is not limited to, a cellular network, a wireless local area network, an infrared network, a near-field communication network, or a Bluetooth network; wired communication includes, but is not limited to, Universal Serial Bus (USB). As shown in FIG. 2b, the camera may also be part of the mobile terminal, that is, a component of the camera unit installed on the mobile terminal.
The shooting subject may be any object containing text, which is not limited here.
The preview image may be a preview captured by the camera unit from the shooting frame, or a preview obtained when the camera installed on the mobile terminal captures the shooting frame in response to a shooting instruction from the mobile terminal. The preview image can be displayed on the screen.
For example, as shown in FIG. 2a, the user turns on the camera unit, whose camera captures images of the shooting subject to obtain a preview image and sends the preview image to the mobile terminal. The mobile terminal may acquire the preview image according to a preset timing mechanism, for example at the third second after the camera unit is turned on, or at a preset interval (for example, once every 2 seconds).
It can be understood that, before the image direction is adjusted, the preview image is either rotated by some angle or has not followed the rotation of the device. One possible case is shown in FIG. 3: the shooting subject in the figure is in its direction before rotation, while the preview image on the mobile terminal is rotated by some angle.
S102, identifying a target object in the preview image, and acquiring at least one text line in the target object;
A text line is a single line of text, and its text may run in any direction.
Specifically, the target object is any object containing text information, such as a book, a certificate, or paper.
To identify the target object in the preview image, a recognition object is preset first, and the recognition object is then detected in the preview image. Like the target object, the recognition object may be any object containing text, such as a book, a certificate, or paper.
Specifically, determining whether the preview image contains the recognition object is a binary classification task, so a binary classification algorithm is used to classify the preview image. Algorithms capable of binary classification include support vector machines, decision trees, neural networks, k-nearest neighbors, and association-rule-based classification.
The binary classification algorithm in the present application may be ResNet-18 from the ResNet (Deep Residual Network) family, a family of image classification networks. The number 18 in ResNet-18 denotes the depth of the network: 18 layers with weights, counting the convolutional and fully connected layers and excluding the pooling and BN (Batch Normalization) layers. With ResNet-18, preview images can be classified into those that include the recognition object and those that do not.
When the preview image does not include the recognition object, prompt information is output to tell the user so. The prompt information may be output as audio, text, or animation, and several prompt forms may be used in parallel. The specific form and content are not limited.
For example, when the preset recognition object is a book, one feasible prompt is a voice prompt that says "No book was recognized in the current image, please shoot again".
When the preview image includes one recognition object, that recognition object is taken as the target object; when the preview image includes multiple recognition objects, the recognition object selected from among them is taken as the target object.
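The three cases above (no recognition object, exactly one, several) can be sketched as a small decision function. This is only an illustrative sketch: the function name `choose_target`, the representation of recognition objects as a list, and the `selected` index standing in for the user's selection are assumptions, not part of the application.

```python
from typing import Optional, Sequence, Tuple

def choose_target(recognized: Sequence[str],
                  selected: Optional[int] = None) -> Tuple[Optional[str], Optional[str]]:
    """Return (target_object, prompt). Exactly one of the two is None.

    `recognized` is a hypothetical list of recognition objects found in the
    preview image; `selected` is the index the user picked, if any.
    """
    if len(recognized) == 0:
        # No recognition object: output prompt information instead of a target.
        return None, "No recognition object in the current image, please shoot again."
    if len(recognized) == 1:
        # Exactly one recognition object becomes the target object.
        return recognized[0], None
    if selected is None:
        # Several recognition objects and no selection yet: ask the user to choose.
        return None, "The current image includes multiple recognition objects, please select one."
    return recognized[selected], None
```

For instance, `choose_target(["book", "certificate"], selected=1)` yields the certificate as the target object, while calling it without `selected` yields only a prompt.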
Any of the text lines may be vertical, horizontal, or in any other direction. The shape, size, language, font, and other characteristics of the text contained in a text line are not limited.
Because the text lines on the target object may be vertical, horizontal, or in any other direction, the Advanced EAST algorithm is used to identify them (EAST: An Efficient and Accurate Scene Text Detector). Advanced EAST is an algorithm for text detection in scene images; in essence it is a model that detects the direction and the region of a target at the same time. It builds on the EAST algorithm and improves its long-text detection, making predictions for long text more accurate.
After all text lines in the target object have been identified with the Advanced EAST algorithm, at least one text line is selected from them.
When the target object includes one text line, that text line is taken as the at least one text line. When the target object includes multiple text lines, either all of the text lines in the target object are taken as the at least one text line, or a randomly chosen subset of them is.
S103, determining the direction of each text line in the at least one text line;
After the at least one text line in the target object has been acquired, the characters in the at least one text line are cut using the projection method.
The projection method is an algorithm from the field of text segmentation and comprises the horizontal projection method and the vertical projection method. The horizontal projection method can be pictured as a beam of light shining across the image horizontally: each ray corresponds to one row of the image, and the black pixels in each row are counted, which allows the characters in a text line to be segmented. Similarly, the vertical projection method can be pictured as a beam of light shining across the image vertically: each ray corresponds to one column of the image, and the black pixels in each column are counted, which likewise allows the characters in a text line to be segmented.
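The two projections can be illustrated with a minimal sketch. It assumes the image is a binarized list of rows in which 0 is black (text) and 255 is white (background); the function names are hypothetical and not taken from the application.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; 0 = black, 255 = white (assumed)

def horizontal_projection(img: Image) -> List[int]:
    """Count the black pixels in each row of the image."""
    return [sum(1 for p in row if p == 0) for row in img]

def vertical_projection(img: Image) -> List[int]:
    """Count the black pixels in each column of the image."""
    width = len(img[0]) if img else 0
    return [sum(1 for row in img if row[x] == 0) for x in range(width)]

example = [
    [255,   0, 255, 255,   0],
    [255,   0, 255, 255,   0],
    [255, 255, 255, 255,   0],
]
print(horizontal_projection(example))  # [2, 2, 1]
print(vertical_projection(example))    # [0, 2, 0, 0, 3]
```

In the vertical profile, the zero-valued columns are the blank gaps between the two strokes, which is what makes cutting along them possible.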
The direction of each text line in the at least one text line is determined based on the result of the character cutting.
S104, determining a target direction of the preview image based on the directions, and adjusting the display direction of the preview image based on the target direction of the preview image.
The target direction of the preview image is a preset direction that conforms to user habits, or an image direction set by the user.
Specifically, the quantity ratio of the text lines across the directions is calculated; the direction indicated by the highest quantity ratio is obtained; the display direction of the preview image is adjusted based on that direction; and the adjusted preview image is obtained.
There are two ways to adjust the display direction of the preview image based on the direction indicated by the highest quantity ratio: the display direction may be adjusted to coincide with that direction, or to a direction that differs from it by a certain angle.
In this embodiment of the present application, the mobile terminal acquires the preview image captured by the camera for the shooting subject; when the preview image includes the recognition object, it identifies the target object in the preview image, acquires at least one text line in the target object, determines the direction of each text line in the at least one text line, determines the target direction of the preview image based on these directions, and adjusts the display direction of the preview image based on the target direction. Even if the shooting device is rotated, or the captured preview image is rotated arbitrarily, the display direction of the preview image is rotated according to its target direction: if the rotation angle is not 0, the display direction is rotated by that angle. The text direction in the captured preview image therefore matches the direction the user needs, which conforms to the user's habits and is convenient to use.
Referring to FIG. 4, a schematic flowchart of an image direction adjustment method is provided for an embodiment of the present application. The embodiment is described from the mobile terminal side, and the image direction adjustment method may include the following steps:
S201, acquiring a preview image captured by a camera for a shooting subject;
Taking the case where the camera is a component of the camera unit installed on the mobile terminal as an example, the preview image captured by the camera on the mobile terminal for the shooting subject is shown in FIG. 3. The preview image has been rotated, so it does not match the direction the user actually needs and is inconvenient to use.
For details of acquiring the preview image captured by the camera for the shooting subject, see S101; they are not repeated here.
S202, identifying the target object in the preview image using a binary classification algorithm; when the preview image includes multiple recognition objects, taking the recognition object selected from among them as the target object;
After the binary classification algorithm has been applied to the preview image, if the preview image includes multiple recognition objects, prompt information is output to tell the user that the preview image includes multiple recognition objects and that one of them must be selected as the target object. The prompt information may be output as audio, text, or animation, and several prompt forms may be used in parallel. The specific form and content are not limited.
For example, one feasible prompt is a voice prompt that says "The current image includes multiple recognition objects, please select one of them for image direction adjustment".
As another example, one feasible prompt is a text box: as shown in FIG. 5, a text box pops up with the content "The current image includes multiple recognition objects, please select one of them for image direction adjustment!".
As shown in FIG. 6, the user's selection signal for the target object is received, and the target object is determined.
S203, identifying all text lines in the target object, and selecting at least one text line from all the text lines;
Specifically, the Advanced EAST algorithm is used to identify all text lines in the target object.
Specifically, the target object may include one text line or multiple text lines. When the target object includes one text line, that text line is taken as the at least one text line. When the target object includes multiple text lines, all of them may be taken as the at least one text line, or any number or any proportion of them may be taken. For example, as shown in FIG. 7, the book in the figure is the target object. When the target object has 6 text lines, a number of them (3) may be randomly taken as the at least one text line; alternatively, a certain proportion (50%) of the 6 text lines, that is, 3 text lines, may be taken as the at least one text line.
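The selection by number or by proportion can be sketched as follows. The function name `sample_text_lines`, its parameters, and the choice to round a proportion up are assumptions made for illustration; the application does not specify how a fractional count is rounded.

```python
import math
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

def sample_text_lines(lines: Sequence[T],
                      count: Optional[int] = None,
                      ratio: Optional[float] = None,
                      rng: Optional[random.Random] = None) -> List[T]:
    """Pick the 'at least one text line' used for direction detection.

    With neither `count` nor `ratio`, all text lines are returned.
    """
    if rng is None:
        rng = random.Random()
    if len(lines) <= 1 or (count is None and ratio is None):
        return list(lines)
    if count is None:
        # Rounding up is an assumption; it guarantees at least one line.
        count = math.ceil(len(lines) * ratio)
    count = max(1, min(count, len(lines)))
    return rng.sample(list(lines), count)

# Six detected text lines, as in the FIG. 7 example.
lines = ["line1", "line2", "line3", "line4", "line5", "line6"]
print(len(sample_text_lines(lines, count=3)))    # 3
print(len(sample_text_lines(lines, ratio=0.5)))  # 3
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when testing.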
S204, traversing each text line in the at least one text line;
binarizing the currently traversed target text line to obtain a binarized image;
processing the binarized image with the horizontal projection method to obtain a first pixel distribution histogram;
processing the binarized image with the vertical projection method to obtain a second pixel distribution histogram;
Binarizing an image means setting the grayscale value of each pixel to either 0 (lowest brightness) or 255 (highest brightness), so that the whole image shows a distinct black-and-white appearance. In other words, a grayscale image with 256 brightness levels is converted, through a suitably chosen threshold, into a binarized image that still reflects the global and local features of the image.
The binarized image can be processed further. After binarization, the content of the image depends only on the positions of the pixels whose value is 0 or 255, which simplifies subsequent image processing.
Image binarization is performed as follows:
selecting an appropriate threshold, which should preserve as much image information as possible while minimizing interference from background and noise;
if the grayscale value of a pixel in the preview image is equal to or higher than the threshold, setting the grayscale value to 255;
if the grayscale value of a pixel in the preview image is lower than the threshold, setting the grayscale value to 0.
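The thresholding rule above can be written as a short sketch. The fixed default threshold of 128 is only a placeholder assumption; the application leaves the choice of threshold open, and the function name `binarize` is hypothetical.

```python
from typing import List

def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map each grayscale pixel to 255 if it is >= threshold, otherwise to 0."""
    return [[255 if p >= threshold else 0 for p in row] for row in gray]

print(binarize([[0, 127, 128, 255]]))  # [[0, 0, 255, 255]]
```

A pixel exactly at the threshold is set to 255, matching the "equal to or higher than" rule.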
The binarized image is processed with the horizontal projection method to obtain the first pixel distribution histogram. FIG. 8 shows an example: for the binarized image shown in the figure, horizontal projection yields the first pixel distribution histogram shown.
As shown in FIG. 9, the binarized image is processed with the vertical projection method to obtain the second pixel distribution histogram.
It can be understood that, for a given preview image, the first and second pixel distribution histograms obtained by binarizing the image and then applying the horizontal and vertical projection methods are unique and fixed.
Based on the first pixel distribution histogram and the second pixel distribution histogram, the characters in the target text line are cut and the direction of each character is determined.
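One common way to cut characters from a projection histogram is to split it at empty bins; the sketch below shows that step only. It is an assumed technique for illustration: the application does not state the exact cutting rule, nor how the direction of each cut character is then determined, so neither is claimed here. The name `split_runs` is hypothetical.

```python
from typing import List, Tuple

def split_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of consecutive non-zero bins.

    Applied to a vertical projection, each run is a candidate character span
    in a horizontal text line; applied to a horizontal projection, each run
    is a candidate span in a vertical text line.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                    # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))      # a blank bin closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the image border
    return runs

print(split_runs([0, 2, 3, 0, 0, 1, 1]))  # [(1, 3), (5, 7)]
```

Real scans usually need a small noise tolerance (for example, treating bins below a few pixels as empty); that refinement is omitted here.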
S205, calculating the quantity ratio of the characters across the directions, and obtaining the target direction of the characters indicated by the highest quantity ratio;
taking the target direction of the characters as the direction of the target text line;
The target direction is the direction indicated by the highest quantity ratio.
Because the characters in a traversed text line may not all have the same direction, the direction of the target text line is determined by calculating the quantity ratio of the characters across the directions and taking the direction indicated by the highest quantity ratio. For example, suppose the target text line contains 10 characters and the ratio is calculated over all 10. If the number of characters in direction A, direction B, and direction C is 5:4:1, the direction indicated by the highest quantity ratio is direction A; direction A is taken as the target direction and then as the direction of the currently traversed target text line. As another example, suppose the target text line contains 10 characters and an arbitrary number (6) or a certain proportion (60%) of them is randomly taken, so that the ratio is calculated over 6 of the 10 characters. If the number of characters in direction A, direction B, and direction C among these 6 is 3:2:1, the direction indicated by the highest quantity ratio is again direction A; direction A is taken as the target direction and then as the direction of the currently traversed target text line.
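The vote described in this step amounts to picking the most frequent direction. A minimal sketch follows, with directions represented as arbitrary labels such as "A", "B", and "C"; the function name is hypothetical, and how ties are broken is not specified in the application (here the direction encountered first wins, which is how `Counter.most_common` orders equal counts).

```python
from collections import Counter
from typing import Hashable, Sequence

def dominant_direction(directions: Sequence[Hashable]) -> Hashable:
    """Return the direction that accounts for the largest share of the items."""
    if not directions:
        raise ValueError("at least one direction is required")
    counts = Counter(directions)
    direction, _ = counts.most_common(1)[0]
    return direction

# The 5:4:1 example over all 10 characters.
print(dominant_direction(["A"] * 5 + ["B"] * 4 + ["C"] * 1))  # A
# The 3:2:1 example over 6 sampled characters.
print(dominant_direction(["A"] * 3 + ["B"] * 2 + ["C"] * 1))  # A
```

The same function applies unchanged when the items are text lines rather than characters.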
S206, when all characters in the target text line have the same direction, cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, determining the direction of any one character in the target text line, and taking the direction of that character as the direction of the target text line;
When all characters in the target text line have the same direction, the computational workload can be reduced as follows: after the characters in the target text line have been cut, any one character is randomly taken from the target text line, its direction is determined, and that direction is taken as the direction of the target text line.
The above steps are repeated until the traversal ends.
S207, calculating the quantity ratio of the text lines across the directions;
Here, "the text lines" refers to each of the at least one text line. After the traversal ends, the direction of each of the at least one text line has been determined, and the quantity ratio of the text lines across the directions is calculated. For example, as shown in FIG. 7, when 4 text lines in the figure form the at least one text line, the ratio of these 4 text lines across the directions is calculated: 3 horizontal text lines to 1 vertical text line.
S208, obtaining the direction indicated by the highest quantity ratio, and adjusting the display direction of the preview image based on the direction indicated by the highest quantity ratio.
Based on the quantity ratio of the text lines across the directions, the direction indicated by the highest quantity ratio is obtained. For example, among the at least one text line shown in FIG. 7, the direction indicated by the highest quantity ratio is the horizontal direction, so the display direction of the preview image is adjusted based on the horizontal direction: if the rotation angle is not 0, the display direction of the preview image is rotated by that angle, and the adjusted preview image is obtained.
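The final rotation can be sketched under a simplifying assumption that directions are expressed as clockwise angles in degrees restricted to multiples of 90. The application does not fix such an encoding, and the names `rotation_angle` and `rotate_cw` are hypothetical.

```python
from typing import List

def rotation_angle(current_deg: int, target_deg: int) -> int:
    """Clockwise angle needed to bring the current direction to the target direction."""
    return (target_deg - current_deg) % 360

def rotate_cw(img: List[List[int]], angle: int) -> List[List[int]]:
    """Rotate a row-major image clockwise by a multiple of 90 degrees."""
    if angle % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported in this sketch")
    for _ in range((angle // 90) % 4):
        # Reversing the rows and transposing is one 90-degree clockwise turn.
        img = [list(row) for row in zip(*img[::-1])]
    return img

image = [[1, 2],
         [3, 4]]
angle = rotation_angle(current_deg=90, target_deg=0)  # 270
if angle != 0:  # rotate only when the rotation angle is not 0
    image = rotate_cw(image, angle)
print(image)  # [[2, 4], [1, 3]]
```

When the dominant text direction already equals the target direction, `rotation_angle` returns 0 and the preview image is left unchanged.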
In this embodiment of the present application, a binary classification algorithm is applied to the preview image. If the preview image includes multiple recognition objects, prompt information is output to tell the user to select one of them as the target object. The Advanced EAST algorithm then identifies all text lines. When the target object includes multiple text lines, any number or any proportion of them may be taken as the at least one text line, which reduces the computational workload compared with determining the direction of every text line. Cutting characters with the horizontal and vertical projection methods combined improves the accuracy of the determined character directions. Within a traversed target text line, an arbitrary number or a preset proportion of the characters may be randomly taken and the direction of the target text line determined from them; or, when all characters in the target text line have the same direction, any one character may be randomly taken and its direction determined. Both reduce the computational workload. Calculating the quantity ratio of the text lines across the directions and adjusting the display direction of the preview image based on the direction indicated by the highest quantity ratio avoids the problem that, when the text lines in the target object have inconsistent directions, the adjustment has no definite basis and the adjusted direction is inconsistent. Even if the shooting device is rotated, or the captured preview image is rotated arbitrarily, the display direction of the preview image is rotated according to its target direction: if the rotation angle is not 0, the display direction is rotated by that angle. The text direction in the captured image therefore matches the direction the user needs, which conforms to the user's habits and is convenient to use.
Please refer to FIG. 10, which is a schematic flowchart of an image direction adjustment method provided by an embodiment of the present application.
This embodiment is described from the mobile terminal side. The image direction adjustment method may include the following steps:
S301: Receive a shooting instruction and, in response to the shooting instruction, turn on the camera;
As shown in FIG. 2a, the camera may be a device that communicates with the mobile terminal in a wireless or wired manner; as shown in FIG. 2b, the camera may also be part of the mobile terminal, that is, a device installed on the mobile terminal.
Taking a camera installed on the mobile terminal as an example, as shown in FIG. 11, when the user triggers the mobile terminal to start the shooting function of the camera (the Camera application), a shooting instruction is sent. When the mobile terminal receives the shooting instruction, it turns on the shooting function of the camera in response, and the camera shoots the subject it is aimed at.
S302: Display the preview image collected by the camera for the subject;
The preview image is the image shown for advance viewing, obtained by the camera from the captured picture in response to the start-shooting instruction sent by the mobile terminal. It can be understood that, as shown in FIG. 3, before the shooting angle is adjusted, the preview image obtained by the mobile terminal may be randomly rotated.
The preview image collected by the camera for the subject is acquired and displayed. For details of acquiring the preview image, see S106; they are not repeated here.
S303: Determine the direction of each text line determined in the preview image;
For determining the direction of each text line determined in the preview image, see S103 and S204 to S207; details are not repeated here.
S304: Determine the target direction of the preview image based on the directions, adjust the display direction of the preview image based on the target direction, and display the adjusted preview image.
For determining the target direction of the preview image, adjusting the display direction of the preview image based on it, and displaying the adjusted preview image, see S104 and S208; details are not repeated here.
In the embodiments of the present application, even if the shooting device is rotated or the captured preview image is randomly rotated, the display direction of the preview image is rotated according to the target direction of the preview image: if the rotation angle is not 0, the display direction is rotated by that rotation angle. The text direction in the captured image therefore matches the direction the user needs, which conforms to the user's usage habits and is convenient for the user.
Please refer to FIG. 12, which is a schematic flowchart of an image direction adjustment method provided by an embodiment of the present application.
This embodiment is described from the mobile terminal side. The image direction adjustment method may include the following steps:
S401: Receive a shooting instruction and, in response to the shooting instruction, turn on the camera;
Display prompt information asking whether to enable direction detection for the target object;
Receive a confirmation instruction for the prompt information and, in response to the confirmation instruction, enable the function of detecting the direction of the target object;
For receiving the shooting instruction and turning on the camera in response to it, see S301; details are not repeated here.
The prompt information asks the user whether to enable direction detection for the target object. It may be output as audio, text, or animation, and several of these prompting manners may be used in parallel. The specific manner and content are not limited.
As shown in FIG. 13, if the preset recognized object is a book, one feasible prompting manner is a text box prompt: a text box pops up with the content "Enable direction detection for books?".
If the user clicks the "Yes" button, the user agrees to enable direction detection for books. The mobile terminal receives the confirmation instruction for the prompt information and, in response to the confirmation instruction, enables the function of detecting the direction of the target object.
S402: Display the preview image collected by the camera for the subject;
When the preview image includes multiple recognized objects, display prompt information containing the multiple recognized objects;
When the preview image includes one recognized object, that recognized object is taken as the target object; when the preview image includes multiple recognized objects, the one selected from among them is taken as the target object.
As shown in FIG. 5, when the preview image includes multiple recognized objects, prompt information containing the multiple recognized objects is displayed. For the manner of displaying the prompt information, see S102 and S202; details are not repeated here.
S403: Receive a selection instruction for a target object in the prompt information and, in response to the selection instruction, determine the target object.
As shown in FIG. 6, when the user performs a touch operation on the mobile terminal and selects any one of the recognized objects as the target object, the target object is determined in response to the selection instruction. For the recognition of the target object, see S102 and S202; details are not repeated here.
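One way to resolve a touch operation to a recognized object is a simple hit test against bounding boxes. The sketch below is purely illustrative: the source does not describe how the selection instruction is mapped to an object, and the names `select_target`, the `(label, box)` tuples, and the pixel coordinates are hypothetical.

```python
def select_target(objects, tap_x, tap_y):
    """Return the label of the first recognized object whose box contains the tap.

    objects: list of (label, (left, top, right, bottom)) in preview pixels.
    Returns None when the tap falls outside every box.
    """
    for label, (left, top, right, bottom) in objects:
        if left <= tap_x <= right and top <= tap_y <= bottom:
            return label
    return None


# Two hypothetical recognized objects in the preview image.
recognized = [
    ("book", (0, 0, 100, 100)),
    ("card", (150, 0, 250, 80)),
]
print(select_target(recognized, 200, 40))  # tap inside the second box
```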
S404: Determine the direction of each text line determined in the preview image;
For determining the direction of each text line determined in the preview image, see S103 and S204 to S207; details are not repeated here.
S405: Calculate the quantity ratio of the text lines in each direction, and determine the direction indicated by the highest quantity ratio;
For the process of determining the direction indicated by the highest quantity ratio, see S104 and S208; details are not repeated here.
S406: Calculate and output a rotation angle based on the direction indicated by the highest quantity ratio;
If the rotation angle is not 0, rotate the preview image by the rotation angle and display the adjusted preview image.
Specifically, a direction is preset as the direction required by the user. If the direction indicated by the highest quantity ratio is inconsistent with the preset direction, that is, the rotation angle is not 0, the preview image is rotated by this rotation angle so that the direction indicated by the highest quantity ratio coincides with the preset direction. The adjusted preview image is then obtained and displayed.
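The relation between the dominant text direction, the preset direction, and the rotation angle can be sketched as below. The source does not fix a representation for directions, so this sketch assumes each direction is an angle in degrees measured in one consistent rotational sense; `rotation_angle` and its parameters are hypothetical names.

```python
def rotation_angle(detected_deg, preset_deg=0):
    """Angle (0 <= angle < 360) that maps the detected direction onto the preset one.

    Both arguments are directions in degrees, measured in the same sense
    (an assumption; the source only says the two directions are compared).
    """
    return (preset_deg - detected_deg) % 360


def needs_rotation(detected_deg, preset_deg=0):
    """True when the preview image has to be rotated at all."""
    return rotation_angle(detected_deg, preset_deg) != 0


print(rotation_angle(90))   # text rotated by 90 degrees, preset is 0
print(needs_rotation(0))    # directions already match
```

Taking the result modulo 360 keeps the angle non-negative, so a zero test is enough to decide whether the preview image is left unchanged.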
In the embodiments of the present application, when the preview image includes multiple recognized objects, prompt information containing the multiple recognized objects is displayed to prompt the user to select any one of them as the target object. A binary classification algorithm is used to recognize the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of them as the target object. The Advanced East algorithm then recognizes all text lines. When the target object includes multiple text lines, any number of them, or any proportion of them, can be taken as the at least one text line; compared with determining the direction of every text line, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method improves the accuracy of the determined character directions. Within each traversed target text line, any number or a preset proportion of the characters can be randomly selected and the direction of the line determined from them; alternatively, when all characters in the target text line have the same direction, any single character can be randomly selected and its direction determined. Both options reduce the computational workload. Calculating the quantity ratio of the text lines in each direction and adjusting the image direction of the camera based on the direction indicated by the highest quantity ratio avoids the problem that, when the text lines in the target object have inconsistent directions, the adjustment of the image direction has no specific basis and becomes chaotic. Even if the shooting device is rotated or the captured preview image is randomly rotated, the display direction of the preview image is rotated according to the target direction of the preview image: if the rotation angle is not 0, the display direction is rotated by that rotation angle. The text direction in the captured image therefore matches the direction the user needs, which conforms to the user's usage habits and is convenient for the user.
The following are apparatus embodiments of the present application, which can be used to carry out the method embodiments of the present application. For details not disclosed in the apparatus embodiments, please refer to the method embodiments.
Please refer to FIG. 14, which shows a schematic structural diagram of an image direction adjustment apparatus provided by an exemplary embodiment of the present application. The image direction adjustment apparatus can be implemented as all or part of a terminal through software, hardware, or a combination of the two. The apparatus 1 includes an image acquisition module 11, a text line acquisition module 12, a direction determination module 13, and a direction adjustment module 14.
The image acquisition module 11 is configured to acquire the preview image collected by the camera for the subject;
The text line acquisition module 12 is configured to recognize the target object in the preview image and acquire at least one text line in the target object;
The direction determination module 13 is configured to determine the direction of each text line in the at least one text line;
The direction adjustment module 14 is configured to determine the target direction of the preview image based on the directions, and to adjust the display direction of the preview image based on the target direction of the preview image.
Optionally, as shown in FIG. 15, the text line acquisition module 12 includes:
a target object recognition unit 121, configured to recognize the target object in the preview image using a binary classification algorithm;
a text line recognition unit 122, configured to recognize all text lines in the target object;
a text line selection unit 123, configured to select at least one text line from all the text lines.
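The selection step performed by the text line selection unit can be sketched as a random sample taken either by count or by proportion. This is an assumption-laden sketch: the source only says that any number or any proportion of text lines may be taken, and `select_text_lines`, its parameters, and the rounding rule are hypothetical.

```python
import math
import random


def select_text_lines(all_lines, count=None, ratio=None, rng=None):
    """Pick a subset of recognized text lines for direction determination.

    count: number of lines to keep; ratio: share of lines to keep (0..1).
    If neither is given, all lines are kept. At least one line is always
    returned when the input is non-empty (rounding up is an assumption).
    """
    if not all_lines:
        return []
    rng = rng or random.Random()
    if count is None:
        ratio = 1.0 if ratio is None else ratio
        count = math.ceil(len(all_lines) * ratio)
    count = max(1, min(count, len(all_lines)))
    return rng.sample(all_lines, count)


lines = [f"line{i}" for i in range(10)]
print(select_text_lines(lines, ratio=0.5, rng=random.Random(0)))
```

Sampling fewer lines trades some robustness of the later vote for less character cutting work, which is the workload reduction the embodiment refers to.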
Optionally, as shown in FIG. 16, the direction determination module 13 includes:
a text line traversal unit 131, configured to traverse each text line in the at least one text line;
a direction determination unit 132, configured to perform character cutting on the currently traversed target text line and determine the direction of the target text line;
a traversal end unit 133, configured to continue the above processing until the traversal ends.
Optionally, as shown in FIG. 17, the direction determination unit 132 includes:
an image acquisition subunit 1321, configured to binarize the currently traversed target text line to obtain a binarized image;
a first histogram acquisition subunit 1322, configured to process the binarized image using the horizontal projection method to obtain a first pixel distribution histogram;
a second histogram acquisition subunit 1323, configured to process the binarized image using the vertical projection method to obtain a second pixel distribution histogram;
a character cutting subunit 1324, configured to cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and to determine the direction of the target text line.
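The two projection histograms and the cutting step can be illustrated with a small pure-Python sketch. It assumes the common convention that the horizontal projection counts foreground pixels per row and the vertical projection counts them per column, and that characters are separated by empty rows or columns; the function names are hypothetical, and the sketch stops at cutting without attempting to determine a direction.

```python
def horizontal_projection(binary):
    """Foreground pixel count of each row (first pixel distribution histogram)."""
    return [sum(row) for row in binary]


def vertical_projection(binary):
    """Foreground pixel count of each column (second pixel distribution histogram)."""
    return [sum(col) for col in zip(*binary)]


def nonzero_runs(hist):
    """Return [start, end) index ranges where the histogram is non-zero."""
    spans, start = [], None
    for i, value in enumerate(hist):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(hist)))
    return spans


# A binarized text line (1 = foreground) holding two 2x2 "characters".
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
row_spans = nonzero_runs(horizontal_projection(image))
col_spans = nonzero_runs(vertical_projection(image))
print(row_spans, col_spans)
```

Each pair of a row span and a column span bounds one candidate character box; a real text line would additionally need noise handling and a rule for touching characters.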
Optionally, the character cutting subunit 1324 is specifically configured to:
cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determine the direction of each character;
calculate the quantity ratio of the characters in each direction, and obtain the target character direction indicated by the highest quantity ratio;
take the target character direction as the direction of the target text line.
Optionally, when all the characters in the target text line have the same direction, the character cutting subunit 1324 is specifically configured to:
cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, determine the direction of any one character in the target text line, and take the direction of that character as the direction of the target text line.
Optionally, as shown in FIG. 18, the direction adjustment module 14 includes:
a ratio calculation unit 141, configured to calculate the quantity ratio of the text lines in each direction;
a direction adjustment unit 142, configured to obtain the direction indicated by the highest quantity ratio and adjust the display direction of the preview image based on that direction.
Optionally, the text line acquisition module 12 is specifically configured to:
when the preview image includes multiple recognized objects, take the recognized object selected from among them as the target object.
In the embodiments of the present application, a binary classification algorithm is used to recognize the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of them as the target object. The Advanced East algorithm then recognizes all text lines. When the target object includes multiple text lines, any number of them, or any proportion of them, can be taken as the at least one text line; compared with determining the direction of every text line, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method improves the accuracy of the determined character directions. Within each traversed target text line, any number or a preset proportion of the characters can be randomly selected and the direction of the line determined from them; alternatively, when all characters in the target text line have the same direction, any single character can be randomly selected and its direction determined. Both options reduce the computational workload. Calculating the quantity ratio of the text lines in each direction and adjusting the display direction of the preview image based on the direction indicated by the highest quantity ratio avoids the problem that, when the text lines in the target object have inconsistent directions, the adjustment of the image direction has no specific basis and becomes chaotic. Even if the shooting device is rotated or the captured preview image is randomly rotated, the display direction of the preview image is rotated according to the target direction of the preview image: if the rotation angle is not 0, the display direction is rotated by that rotation angle. The text direction in the captured image therefore matches the direction the user needs, which conforms to the user's usage habits and is convenient for the user.
Please refer to FIG. 19, which shows a schematic structural diagram of an image direction adjustment apparatus provided by an exemplary embodiment of the present application. The apparatus 2 includes a camera opening module 21, an image display module 22, a direction determination module 23, and a direction adjustment module 24.
The camera opening module 21 is configured to receive a shooting instruction and, in response to the shooting instruction, turn on the camera;
The image display module 22 is configured to display the preview image collected by the camera for the subject;
The direction determination module 23 is configured to determine the direction of each text line determined in the preview image;
The direction adjustment module 24 is configured to determine the target direction of the preview image based on the directions, adjust the display direction of the preview image based on the target direction, and display the adjusted preview image.
Optionally, as shown in FIG. 20, the apparatus 2 further includes:
a detection prompt display module 25, configured to display prompt information asking whether to enable direction detection for the target object;
a detection enabling module 26, configured to receive a confirmation instruction for the prompt information and, in response to the confirmation instruction, enable the function of detecting the direction of the target object.
Optionally, as shown in FIG. 20, the apparatus 2 further includes:
an information prompt display module 27, configured to display prompt information containing the multiple recognized objects when the preview image includes multiple recognized objects;
a target object determination module 28, configured to receive a selection instruction for a target object in the prompt information and, in response to the selection instruction, determine the target object.
Optionally, as shown in FIG. 21, the direction adjustment module 24 includes:
a ratio calculation unit 241, configured to calculate the quantity ratio of the text lines in each direction and determine the direction indicated by the highest quantity ratio;
an angle output unit 242, configured to calculate and output a rotation angle based on the direction indicated by the highest quantity ratio;
a direction adjustment unit 243, configured to adjust the display direction of the preview image based on the rotation angle and display the adjusted preview image.
Optionally, the direction adjustment unit 243 is specifically configured to:
if the rotation angle is not 0, rotate the display direction of the preview image by the rotation angle to obtain the adjusted display direction of the preview image.
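The conditional rotation can be sketched on a preview image held as a list of pixel rows. The sketch assumes the rotation angle is a multiple of 90 degrees and is applied clockwise; the source states neither restriction, and `rotate_preview` is a hypothetical name.

```python
def rotate_preview(pixels, angle_deg):
    """Rotate a row-major pixel grid clockwise by angle_deg (multiple of 90).

    When the angle is 0 (or a multiple of 360) the grid is returned unchanged,
    mirroring the "if the rotation angle is not 0" condition.
    """
    if angle_deg % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    steps = (angle_deg // 90) % 4
    rotated = pixels
    for _ in range(steps):
        # One clockwise quarter turn: reverse the row order, then transpose.
        rotated = [list(row) for row in zip(*rotated[::-1])]
    return rotated


preview = [
    [1, 2],
    [3, 4],
]
print(rotate_preview(preview, 90))
```

On a real device the rotation would more likely be applied to the display transform of the preview surface than to the pixel data itself.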
In the embodiments of the present application, a binary classification algorithm is used to recognize the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of them as the target object. The Advanced East algorithm then recognizes all text lines. When the target object includes multiple text lines, any number of them, or any proportion of them, can be taken as the at least one text line; compared with determining the direction of every text line, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method improves the accuracy of the determined character directions. Within each traversed target text line, any number or a preset proportion of the characters can be randomly selected and the direction of the line determined from them; alternatively, when all characters in the target text line have the same direction, any single character can be randomly selected and its direction determined. Both options reduce the computational workload. Calculating the quantity ratio of the text lines in each direction and adjusting the display direction of the preview image based on the direction indicated by the highest quantity ratio avoids the problem that, when the text lines in the target object have inconsistent directions, the adjustment of the image direction has no specific basis and becomes chaotic. Even if the shooting device is rotated or the captured preview image is randomly rotated, the display direction of the preview image is rotated according to the target direction of the preview image: if the rotation angle is not 0, the display direction is rotated by that rotation angle. The text direction in the captured image therefore matches the direction the user needs, which conforms to the user's usage habits and is convenient for the user.
It should be noted that, when the image direction adjustment apparatus provided in the above embodiments executes the image direction adjustment method, the division into the functional modules described above is only an example. In practical applications, the functions can be assigned to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the image direction adjustment apparatus and the image direction adjustment method provided by the above embodiments belong to the same concept; for the implementation process, see the method embodiments, which are not repeated here.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
An embodiment of the present application further provides a computer storage medium. The computer storage medium can store a plurality of instructions suitable for being loaded by a processor to execute the method steps of the embodiments shown in FIG. 1 to FIG. 13. For the specific execution process, see the description of the embodiments shown in FIG. 1 to FIG. 13; details are not repeated here.
The present application further provides a computer program product storing at least one instruction, the at least one instruction being loaded by a processor to execute the method steps of the embodiments shown in FIG. 1 to FIG. 13. For the specific execution process, see the description of the embodiments shown in FIG. 1 to FIG. 13; details are not repeated here.
Please refer to FIG. 22, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 22, the mobile terminal 1000 may include at least one processor 1001, at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002.
The communication bus 1002 is used to implement connection and communication between these components.
The user interface 1003 may include a display and a camera; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
The processor 1001 may include one or more processing cores. The processor 1001 uses various interfaces and lines to connect the parts of the electronic device 1000, and performs the functions of the electronic device 1000 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1005 and by calling the data stored in the memory 1005. Optionally, the processor 1001 may be implemented in at least one of the following hardware forms: digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 1001 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU is responsible for rendering and drawing the content to be shown on the display; the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 1001 and may instead be implemented as a separate chip.
其中,存储器1005可以包括随机存储器(Random Access Memory,RAM),也可以包括只读存储器(Read-Only Memory)。可选的,该存储器1005包括非瞬时性计算机可读介质(non-transitory computer-readable storage medium)。存储器1005可用于存储指令、程序、代码、代码集或指令集。存储器1005可包括存储程序区和存储数据区,其中,存储程序区可存储用于实现操作系统的指令、用于至少一个功能的指令(比如触控功能、声音播放功能、图像播放功能等)、用于实现上述各个方法实施例的指令等;存储数据区可存储上面各个方法实施例中涉及到的数据等。存储器1005可选的还可以是至少一个位于远离前述处理器1001的存储装置。如图22所示,作为一种计算机存储介质的存储器1005中可以包括操作系统、网络通信模块、用户接口模块以及图像方向调整应用程序。Wherein, the memory 1005 may include a random access memory (Random Access Memory, RAM), and may also include a read-only memory (Read-Only Memory). Optionally, the memory 1005 includes a non-transitory computer-readable storage medium (non-transitory computer-readable storage medium). The memory 1005 may be used to store instructions, programs, codes, sets of codes or sets of instructions. The memory 1005 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), Instructions and the like for implementing the above method embodiments; the storage data area can store the data and the like involved in the above method embodiments. Optionally, the memory 1005 may also be at least one storage device located away from the aforementioned processor 1001 . As shown in FIG. 22 , the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and an image orientation adjustment application program.
在图22所示的移动终端1000中,用户接口1003主要用于为用户提供输入的接口,获取用户输入的数据;而处理器1001可以用于调用存储器1005中存储的生成图像方向调整应用程序,并具体执行以下操作:In the mobile terminal 1000 shown in FIG. 22 , the user interface 1003 is mainly used to provide the user with an input interface to obtain the data input by the user; and the processor 1001 can be used to call the generated image orientation adjustment application program stored in the memory 1005, And specifically do the following:
获取摄像头针对拍摄对象采集的预览图像;Obtain the preview image collected by the camera for the subject;
识别所述预览图像中的目标物体,获取所述目标物体中的至少一个文本行;identifying a target object in the preview image, and acquiring at least one text line in the target object;
确定所述至少一个文本行中各文本行的方向;determining the orientation of each of the at least one line of text;
基于各所述方向确定所述预览图像的目标方向,基于所述预览图像的目标方向调整所述预览图像的显示方向。A target direction of the preview image is determined based on each of the directions, and a display direction of the preview image is adjusted based on the target direction of the preview image.
In one embodiment, when identifying the target object in the preview image and acquiring at least one text line in the target object, the processor 1001 specifically performs the following operations:
identifying the target object in the preview image using a binary classification algorithm;
identifying all text lines in the target object;
selecting at least one text line from all the text lines.
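The selection step above can be sketched as follows. This is only an illustrative sketch: the function name `select_text_lines`, its `count`, `proportion` and `seed` parameters, and the use of uniform random sampling are assumptions, since the text only states that at least one of the detected text lines is selected.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_text_lines(all_lines: Sequence[T],
                      count: Optional[int] = None,
                      proportion: Optional[float] = None,
                      seed: Optional[int] = None) -> List[T]:
    """Pick a subset of the detected text lines (hypothetical helper).

    Either a fixed `count` or a `proportion` of the lines may be given;
    with neither, all lines are kept. At least one line is always
    returned when any line exists.
    """
    if not all_lines:
        return []
    if count is None:
        if proportion is None:
            count = len(all_lines)
        else:
            count = int(len(all_lines) * proportion)
    # Clamp to the valid range so that at least one line is selected.
    count = max(1, min(count, len(all_lines)))
    return random.Random(seed).sample(list(all_lines), count)
```

Processing only a subset of the lines trades some robustness for a smaller amount of per-line direction analysis.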
In one embodiment, when determining the direction of each text line in the at least one text line, the processor 1001 specifically performs the following operations:
traversing each text line in the at least one text line;
performing character cutting processing on the currently traversed target text line, and determining the direction of the target text line;
repeating until the traversal ends.
In one embodiment, when performing character cutting processing on the currently traversed target text line and determining the direction of the target text line, the processor 1001 specifically performs the following operations:
performing binarization processing on the currently traversed target text line to obtain a binarized image;
processing the binarized image using a horizontal projection method to obtain a first pixel distribution histogram;
processing the binarized image using a vertical projection method to obtain a second pixel distribution histogram;
cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determining the direction of the target text line.
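The binarization and projection steps above can be illustrated with a minimal pure-Python sketch. The fixed threshold of 128, the dark-text-on-light-background convention, and the helper names (`binarize`, `horizontal_projection`, `vertical_projection`, `split_runs`) are assumptions made for illustration; how the direction is then derived from the cut characters is not shown here.

```python
from typing import List, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map grayscale pixels to 1 (text) or 0 (background).

    Assumes dark text on a light background; the threshold is illustrative.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def horizontal_projection(binary: List[List[int]]) -> List[int]:
    """First histogram: number of text pixels in each row."""
    return [sum(row) for row in binary]


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Second histogram: number of text pixels in each column."""
    if not binary:
        return []
    return [sum(col) for col in zip(*binary)]


def split_runs(histogram: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive non-empty bins; end is exclusive.

    Gaps (zero bins) in a projection mark candidate cut positions
    between characters or between lines.
    """
    runs = []
    start = None
    for i, value in enumerate(histogram):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(histogram)))
    return runs
```

For a horizontally written line, the spans found in the column histogram would correspond to individual characters, while the row histogram bounds the line's height.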
In one embodiment, when cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram and determining the direction of the target text line, the processor 1001 specifically performs the following operations:
cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determining the direction of each of the characters;
calculating the proportion of the characters in each direction, and obtaining the target direction of the characters indicated by the highest proportion;
taking the target direction of the characters as the direction of the target text line.
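The proportion-based vote described above can be sketched as follows. Representing directions as angles in degrees, the function name `dominant_direction`, and breaking ties in favour of the direction encountered first are all assumptions for illustration; the text does not specify them.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def dominant_direction(directions: Sequence[Hashable]) -> Optional[Tuple[Hashable, float]]:
    """Return the most frequent direction and its share of all characters.

    Returns None for an empty input. Ties go to the direction seen first
    (an assumption; the source does not define tie-breaking).
    """
    if not directions:
        return None
    counts = Counter(directions)
    total = len(directions)
    ratios = {direction: n / total for direction, n in counts.items()}
    best = max(ratios, key=ratios.get)
    return best, ratios[best]
```

For example, with per-character directions `[0, 0, 90, 0, 180]`, three of the five characters point in direction 0, so that direction would be taken as the direction of the text line.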
In one embodiment, when all characters in the target text line have the same direction, the processor 1001, in cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram and determining the direction of the target text line, specifically performs the following operation:
cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, determining the direction of any one character in the target text line, and taking the direction of that character as the direction of the target text line.
In one embodiment, when determining the target direction of the preview image based on each of the directions, adjusting the display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image, the processor 1001 specifically performs the following operations:
calculating the proportion of the text lines in each direction;
obtaining the direction indicated by the highest proportion, and adjusting the display direction of the preview image based on the direction indicated by the highest proportion.
In one embodiment, when identifying the target object in the preview image, the processor 1001 specifically performs the following operation:
when the preview image includes multiple recognized objects, taking the selected recognized object among the multiple recognized objects as the target object.
Optionally, the processor 1001 further performs the following operations:
receiving a shooting instruction, and turning on the camera in response to the shooting instruction;
displaying the preview image captured by the camera for the photographed object;
determining the direction of each text line identified in the preview image;
determining a target direction of the preview image based on each of the directions, adjusting the display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image.
In one embodiment, the processor 1001 further performs the following operations:
displaying prompt information asking whether to enable direction detection for the target object;
receiving a confirmation instruction for the prompt information, and enabling the function of direction detection for the target object in response to the confirmation instruction.
In one embodiment, after displaying the preview image captured by the camera for the photographed object, the processor 1001 further performs the following operations:
when the preview image includes multiple recognized objects, displaying prompt information containing the multiple recognized objects;
receiving a selection instruction for a target object in the prompt information, and displaying marking information for the target object in response to the selection instruction.
In one embodiment, when determining the target direction of the preview image based on each of the directions, adjusting the display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image, the processor 1001 specifically performs the following operations:
calculating the proportion of the text lines in each direction, and determining the direction indicated by the highest proportion;
calculating and outputting a rotation angle based on the direction indicated by the highest proportion;
adjusting the display direction of the preview image based on the rotation angle, and displaying the adjusted preview image.
In one embodiment, when adjusting the display direction of the preview image based on the rotation angle, the processor 1001 specifically performs the following operation:
if the rotation angle is not 0, rotating the display direction of the preview image by the rotation angle.
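The rotation step can be sketched as follows. This is a sketch under stated assumptions: directions are taken to be one of 0, 90, 180 or 270 degrees measured clockwise from upright, the rotation angle is taken to be the clockwise angle that restores upright text, and the names `rotation_angle` and `adjust_display` are hypothetical. The source states only that a rotation angle is computed from the dominant direction and applied when it is not 0.

```python
from collections import Counter
from typing import Sequence


def rotation_angle(line_directions: Sequence[int]) -> int:
    """Angle (degrees, clockwise) that would bring the dominant direction upright.

    Assumes each direction is 0, 90, 180 or 270 degrees clockwise from upright.
    Returns 0 when no text lines are available.
    """
    if not line_directions:
        return 0
    dominant, _ = Counter(line_directions).most_common(1)[0]
    return (360 - dominant) % 360


def adjust_display(current_display_angle: int, line_directions: Sequence[int]) -> int:
    """Return the new display angle; the display is rotated only if the angle is not 0."""
    angle = rotation_angle(line_directions)
    if angle != 0:
        return (current_display_angle + angle) % 360
    return current_display_angle
```

Under these conventions, text lines that mostly read at 90 degrees would yield a rotation angle of 270 degrees, and text that is already upright leaves the display unchanged.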
In the embodiments of this application, a binary classification algorithm is used to recognize the preview image. If the preview image includes multiple recognized objects, prompt information is output to prompt the user to select one of the recognized objects as the target object. All text lines are then identified using the Advanced EAST algorithm. When the target object includes multiple text lines, any number of text lines, or any proportion of the text lines, in the target object may be obtained as the at least one text line; compared with determining the direction of all text lines, this reduces the computational workload. Cutting the characters by combining the horizontal projection method and the vertical projection method can improve the accuracy of the determined character directions. Any number, or a preset proportion, of the characters in the traversed target text line may be obtained at random and the direction of the target text line determined from those characters; alternatively, when all characters in the target text line have the same direction, any one character in the target text line may be obtained at random and its direction determined. Both approaches reduce the computational workload. The proportion of text lines in each direction is calculated, and the display direction of the preview image is adjusted based on the direction indicated by the highest proportion; this avoids the problem that, when the text lines in the target object have inconsistent directions, the image direction is adjusted without a definite basis and the adjustment becomes chaotic. Even if the shooting device is rotated or the captured preview image is rotated arbitrarily, the display direction of the preview image is rotated according to the target direction of the preview image; specifically, if the rotation angle is not 0, the display direction of the preview image is rotated by the rotation angle. The direction of the text in the captured image thus matches the direction required by the user, which conforms to the user's usage habits and is convenient for the user.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
The above disclosure is only a preferred embodiment of the present application and certainly cannot be used to limit the scope of rights of the present application. Therefore, equivalent changes made according to the claims of the present application still fall within the scope covered by the present application.

Claims (28)

  1. An image direction adjustment method, applied to a mobile terminal, the method comprising:
    acquiring a preview image captured by a camera for a photographed object;
    identifying a target object in the preview image, and acquiring at least one text line in the target object;
    determining the direction of each text line in the at least one text line;
    determining a target direction of the preview image based on each of the directions, and adjusting a display direction of the preview image based on the target direction of the preview image.
  2. The method according to claim 1, wherein the identifying a target object in the preview image and acquiring at least one text line in the target object comprises:
    identifying the target object in the preview image using a binary classification algorithm;
    identifying all text lines in the target object;
    selecting at least one text line from all the text lines.
  3. The method according to claim 1, wherein the determining the direction of each text line in the at least one text line comprises:
    traversing each text line in the at least one text line;
    performing character cutting processing on the currently traversed target text line, and determining the direction of the target text line;
    until the traversal ends.
  4. The method according to claim 3, wherein the performing character cutting processing on the currently traversed target text line and determining the direction of the target text line comprises:
    performing binarization processing on the currently traversed target text line to obtain a binarized image;
    processing the binarized image using a horizontal projection method to obtain a first pixel distribution histogram;
    processing the binarized image using a vertical projection method to obtain a second pixel distribution histogram;
    cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determining the direction of the target text line.
  5. The method according to claim 4, wherein the cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram and determining the direction of the target text line comprises:
    cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determining the direction of each of the characters;
    calculating the proportion of the characters in each direction, and obtaining the target direction of the characters indicated by the highest proportion;
    taking the target direction of the characters as the direction of the target text line.
  6. The method according to claim 4, wherein, when all characters in the target text line have the same direction, the cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram and determining the direction of the target text line comprises:
    cutting the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, determining the direction of any one character in the target text line, and taking the direction of that character as the direction of the target text line.
  7. The method according to claim 1, wherein the determining a target direction of the preview image based on each of the directions and adjusting a display direction of the preview image based on the target direction of the preview image comprises:
    calculating the proportion of the text lines in each direction;
    obtaining the direction indicated by the highest proportion, and adjusting the display direction of the preview image based on the direction indicated by the highest proportion.
  8. The method according to claim 1, wherein the identifying a target object in the preview image comprises:
    when the preview image includes multiple recognized objects, taking the selected recognized object among the multiple recognized objects as the target object.
  9. An image direction adjustment method, the method comprising:
    receiving a shooting instruction, and turning on a camera in response to the shooting instruction;
    displaying a preview image captured by the camera for a photographed object;
    determining the direction of each text line identified in the preview image;
    determining a target direction of the preview image based on each of the directions, adjusting a display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image.
  10. The method according to claim 9, further comprising, after the turning on a camera:
    displaying prompt information asking whether to enable direction detection for a target object;
    receiving a confirmation instruction for the prompt information, and enabling the function of direction detection for the target object in response to the confirmation instruction.
  11. The method according to claim 9, further comprising, after the displaying a preview image captured by the camera for a photographed object:
    when the preview image includes multiple recognized objects, displaying prompt information containing the multiple recognized objects;
    receiving a selection instruction for a target object in the prompt information, and determining the target object in response to the selection instruction.
  12. The method according to claim 9, wherein the determining a target direction of the preview image based on each of the directions, adjusting a display direction of the preview image based on the target direction of the preview image, and displaying the adjusted preview image comprises:
    calculating the proportion of the text lines in each direction, and determining the direction indicated by the highest proportion;
    calculating and outputting a rotation angle based on the direction indicated by the highest proportion;
    adjusting the display direction of the preview image based on the rotation angle, and displaying the adjusted preview image.
  13. The method according to claim 12, wherein the adjusting the display direction of the preview image based on the rotation angle comprises:
    if the rotation angle is not 0, rotating the display direction of the preview image by the rotation angle.
  14. An image direction adjustment apparatus, the apparatus comprising:
    an image acquisition module, configured to acquire a preview image captured by a camera for a photographed object;
    a text line acquisition module, configured to identify a target object in the preview image and acquire at least one text line in the target object;
    a direction determination module, configured to determine the direction of each text line in the at least one text line;
    a direction adjustment module, configured to determine a target direction of the preview image based on each of the directions, and adjust a display direction of the preview image based on the target direction of the preview image.
  15. The apparatus according to claim 14, wherein the text line acquisition module comprises:
    a target object recognition unit, configured to identify the target object in the preview image using a binary classification algorithm;
    a text line recognition unit, configured to identify all text lines in the target object;
    a text line selection unit, configured to select at least one text line from all the text lines.
  16. The apparatus according to claim 14, wherein the direction determination module comprises:
    a text line traversal unit, configured to traverse each text line in the at least one text line;
    a direction determination unit, configured to perform character cutting processing on the currently traversed target text line and determine the direction of the target text line;
    a traversal end unit, configured to continue until the traversal ends.
  17. The apparatus according to claim 16, wherein the direction determination unit comprises:
    an image acquisition subunit, configured to perform binarization processing on the currently traversed target text line to obtain a binarized image;
    a first histogram acquisition subunit, configured to process the binarized image using a horizontal projection method to obtain a first pixel distribution histogram;
    a second histogram acquisition subunit, configured to process the binarized image using a vertical projection method to obtain a second pixel distribution histogram;
    a character cutting subunit, configured to cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determine the direction of the target text line.
  18. The apparatus according to claim 17, wherein the character cutting subunit is specifically configured to:
    cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, and determine the direction of each of the characters;
    calculate the proportion of the characters in each direction, and obtain the target direction of the characters indicated by the highest proportion;
    take the target direction of the characters as the direction of the target text line.
  19. The apparatus according to claim 17, wherein, when all characters in the target text line have the same direction, the character cutting subunit is specifically configured to:
    cut the characters in the target text line based on the first pixel distribution histogram and the second pixel distribution histogram, determine the direction of any one character in the target text line, and take the direction of that character as the direction of the target text line.
  20. The apparatus according to claim 14, wherein the direction adjustment module comprises:
    a ratio calculation unit, configured to calculate the proportion of the text lines in each direction;
    an angle adjustment unit, configured to obtain the direction indicated by the highest proportion, and adjust the display direction of the preview image based on the direction indicated by the highest proportion.
  21. The apparatus according to claim 14, wherein the text line acquisition module is specifically configured to:
    when the preview image includes multiple recognized objects, take the selected recognized object among the multiple recognized objects as the target object.
  22. An image direction adjustment apparatus, the apparatus comprising:
    a camera activation module, configured to receive a shooting instruction and turn on a camera in response to the shooting instruction;
    an image display module, configured to display a preview image captured by the camera for a photographed object;
    a direction determination module, configured to determine the direction of each text line identified in the preview image;
    a direction adjustment module, configured to determine a target direction of the preview image based on each of the directions, adjust a display direction of the preview image based on the target direction of the preview image, and display the adjusted preview image.
  23. The apparatus according to claim 22, wherein the apparatus further comprises:
    a detection prompt display module, configured to display prompt information asking whether to enable direction detection for a target object;
    a detection enabling module, configured to receive a confirmation instruction for the prompt information, and enable the function of direction detection for the target object in response to the confirmation instruction.
  24. The apparatus according to claim 22, wherein the apparatus further comprises:
    an information prompt display module, configured to display prompt information containing multiple recognized objects when the preview image includes the multiple recognized objects;
    a target object determination module, configured to receive a selection instruction for a target object in the prompt information, and determine the target object in response to the selection instruction.
  25. The apparatus according to claim 22, wherein the direction adjustment module comprises:
    a ratio calculation unit, configured to calculate the proportion of the text lines in each direction, and determine the direction indicated by the highest proportion;
    an angle output unit, configured to calculate and output a rotation angle based on the direction indicated by the highest proportion;
    a direction adjustment unit, configured to adjust the display direction of the preview image based on the rotation angle, and display the adjusted preview image.
  26. The apparatus according to claim 25, wherein the angle adjustment unit is specifically configured to:
    if the rotation angle is not 0, rotate the display direction of the preview image by the rotation angle.
  27. A computer storage medium, wherein the computer storage medium stores a plurality of instructions, the instructions being adapted to be loaded by a processor to execute the method steps of any one of claims 1 to 8 or 9 to 13.
  28. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program, the computer program being adapted to be loaded by the processor to execute the method steps of any one of claims 1 to 8 or 9 to 13.
PCT/CN2022/107240 2021-07-30 2022-07-22 Image direction adjustment method and apparatus, and storage medium and electronic device WO2023005813A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110879167.X 2021-07-30
CN202110879167.XA CN115696028A (en) 2021-07-30 2021-07-30 Image direction adjusting method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2023005813A1 (en) 2023-02-02

Family

ID=85060075

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/107240 WO2023005813A1 (en) 2021-07-30 2022-07-22 Image direction adjustment method and apparatus, and storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN115696028A (en)
WO (1) WO2023005813A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010011359A (en) * 2008-06-30 2010-01-14 Sharp Corp Image processing apparatus, image forming apparatus, method of controlling the image processing apparatus, control program, and recording medium
CN101833648A (en) * 2009-03-13 2010-09-15 汉王科技股份有限公司 Method for correcting text image
CN103885611A (en) * 2014-04-22 2014-06-25 锤子科技(北京)有限公司 Method and device for adjusting image
CN107798355A (en) * 2017-11-17 2018-03-13 山西同方知网数字出版技术有限公司 A kind of method automatically analyzed based on file and picture format with judging

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116320732A (en) * 2023-04-18 2023-06-23 广州市宏视电子技术有限公司 Solar camera control method and device, solar camera and storage medium
CN116320732B (en) * 2023-04-18 2023-12-05 广州市宏视电子技术有限公司 Solar camera control method and device, solar camera and storage medium
CN116740740A (en) * 2023-08-11 2023-09-12 浙江太美医疗科技股份有限公司 Method for judging same-line text, method for ordering documents and application thereof
CN116740740B (en) * 2023-08-11 2023-11-21 浙江太美医疗科技股份有限公司 Method for judging same-line text, method for ordering documents and application thereof

Also Published As

Publication number Publication date
CN115696028A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
WO2023005813A1 (en) Image direction adjustment method and apparatus, and storage medium and electronic device
JP7058760B2 (en) Image processing methods and their devices, terminals and computer programs
RU2651240C1 (en) Method and device for processing photos
US11481975B2 (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
TW202113757A (en) Target object matching method and apparatus, electronic device and storage medium
JP2017517980A (en) Image capturing parameter adjustment in preview mode
US8666145B2 (en) System and method for identifying a region of interest in a digital image
WO2020078105A1 (en) Posture detection method, apparatus and device, and storage medium
WO2017054442A1 (en) Image information recognition processing method and device, and computer storage medium
CN107871001B (en) Audio playing method and device, storage medium and electronic equipment
CN110619656B (en) Face detection tracking method and device based on binocular camera and electronic equipment
EP3975046B1 (en) Method and apparatus for detecting occluded image and medium
JP2011095862A (en) Apparatus and method for processing image and program
WO2021179856A1 (en) Content recognition method and apparatus, electronic device, and storage medium
CN111709414A (en) AR device, character recognition method and device thereof, and computer-readable storage medium
WO2022002262A1 (en) Character sequence recognition method and apparatus based on computer vision, and device and medium
CN111279684A (en) Shooting control method and electronic device
CN108093177B (en) Image acquisition method and device, storage medium and electronic equipment
CN111522524B (en) Presentation control method and device based on conference robot, storage medium and terminal
CN108055461B (en) Self-photographing angle recommendation method and device, terminal equipment and storage medium
JP2015191358A (en) Central person determination system, information terminal to be used by central person determination system, central person determination method, central person determination program, and recording medium
CN111010526A (en) Interaction method and device in video communication
CN106650727B (en) Information display method and AR equipment
CN111401242B (en) Credential detection method, apparatus, electronic device and storage medium
CN111507139A (en) Image effect generation method and device and electronic equipment

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE